DNS Conditional Forwarder CNAME Cross-Domain Resolution Failure - Secure Cache Against Pollution
Case Summary: DNS Conditional Forwarder CNAME Cross-Domain Resolution Failure
Product/Service: Windows DNS Server / Conditional Forwarder
Issue Definition
Clients intermittently failed to resolve lingma-api.tongyi.aliyun.com via Windows DNS Server. When the issue occurred, DNS Server returned NOERROR, but the response contained only a CNAME record without the final A record (IP address), causing application timeouts. Resolution worked most of the time and failed occasionally for several minutes at a stretch.
Architecture:
Client --> Windows DNS Server (x3) --> Conditional Forwarder (aliyun.com) --> Alibaba Cloud DNS (x4)
Information Gathered
- Conditional Forwarder configured for aliyun.com, pointing to 4 Alibaba DNS servers (e.g., 10.199.162.112). The 3rd server in the list was unreachable.
- Secure Cache Against Pollution was enabled (default). Confirmed via DNS Manager GUI: the "Secure cache against pollution" checkbox was ticked. Registry value SecureResponses did not exist, meaning the default value (enabled) applied.
- Alibaba Cloud Wireshark capture (captured on Alibaba DNS server side) showed their DNS response contained a complete CNAME chain + A records:
```
- CNAME: lingma-api.tongyi.aliyun.com --> lingma-api.tongyi.aliyun.com.gds.alibabadns.com TTL=190s
- CNAME: lingma-api.tongyi.aliyun.com.gds.alibabadns.com --> ga-bpladtuvv8ekkixpbejij.aliyunga0019.com TTL=80s
- A: ga-bpladtuvv8ekkixpbejij.aliyunga0019.com --> 47.57.143.171 TTL=194s
- A: ga-bpladtuvv8ekkixpbejij.aliyunga0019.com --> 47.57.7.142 TTL=194s
```
The CNAME chain crossed 3 domains: aliyun.com --> alibabadns.com --> aliyunga0019.com
- Client nslookup reproduced the issue, returning only the CNAME and no A record:
> server 10.107.125.71
> lingma-api.tongyi.aliyun.com
Non-authoritative answer:
Name: lingma-api.tongyi.aliyun.com
(No IP address)
- No DNS Policy configured. No conflicting zones. Network connectivity confirmed good.
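The configuration facts listed above can be re-checked from PowerShell. What follows is a minimal sketch, assuming an elevated session on one of the DNS servers with the DnsServer module installed. It only reads settings and sends one test query per forwarder target; the test name aliyun.com is an arbitrary choice.

```powershell
# Read-only checks; run elevated on a DNS server (assumes the DnsServer module).

# 1. Secure Cache Against Pollution: $true means enabled.
(Get-DnsServerCache).EnablePollutionProtection

# 2. Registry override. No output means the SecureResponses value is absent,
#    so the default applies.
$regPath = 'HKLM:\SYSTEM\CurrentControlSet\Services\DNS\Parameters'
Get-ItemProperty -Path $regPath -Name 'SecureResponses' -ErrorAction SilentlyContinue

# 3. Conditional forwarder targets configured for aliyun.com.
$zone = Get-DnsServerZone -Name 'aliyun.com'
$zone.MasterServers

# 4. Probe each target with a real query; an unreachable server ends up in catch.
foreach ($ip in $zone.MasterServers) {
    try {
        Resolve-DnsName -Name 'aliyun.com' -Type A -Server ([string]$ip) `
            -DnsOnly -ErrorAction Stop | Out-Null
        '{0}: answered' -f $ip
    }
    catch {
        '{0}: FAILED ({1})' -f $ip, $_.Exception.Message
    }
}
```

Step 4 is how an unreachable entry, such as the 3rd server in this case, would show up without opening the GUI.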
DNS Debug Log Findings
Debug log was enabled on the DNS Server during reproduction. Key findings:
Finding 1: Client query was served from cache
The nslookup client (10.65.90.12) at 11:48:35 received a response directly from cache; DNS Server did NOT forward to Alibaba at that moment:
206217 | 11:48:35 | Rcv 10.65.90.12 --> Q A lingma-api.tongyi.aliyun.com
206219 | 11:48:35 | Snd 10.65.90.12 <-- R NOERROR (from cache, no forwarding)
This means the cache already contained a CNAME-only entry (no A record).
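To catch the CNAME-only state from a client without reading nslookup output by eye, the answer section can be classified programmatically. The helper below is a sketch, and `Test-CnameOnlyAnswer` is a hypothetical name, not an existing cmdlet. It assumes `Resolve-DnsName` reports the server's answer as received; client-side behavior may differ between Windows versions.

```powershell
# Hypothetical helper: classify one answer from a given DNS server.
function Test-CnameOnlyAnswer {
    param(
        [Parameter(Mandatory)][string]$Name,
        [Parameter(Mandatory)][string]$Server
    )
    # -DnsOnly skips LLMNR/NetBIOS; -NoHostsFile skips the local hosts file.
    $records = @(Resolve-DnsName -Name $Name -Type A -Server $Server `
            -DnsOnly -NoHostsFile -ErrorAction Stop)
    $answers = @($records | Where-Object { $_.Section -eq 'Answer' })
    $a       = @($answers | Where-Object { $_.Type -eq 'A' })
    $cname   = @($answers | Where-Object { $_.Type -eq 'CNAME' })

    [pscustomobject]@{
        Time      = Get-Date -Format 'HH:mm:ss'
        Server    = $Server
        # True when the answer holds a CNAME but no address record.
        CnameOnly = ($cname.Count -gt 0 -and $a.Count -eq 0)
        Addresses = $a.IPAddress -join ', '
    }
}

Test-CnameOnlyAnswer -Name 'lingma-api.tongyi.aliyun.com' -Server '10.107.125.71'
```

Running it in a loop against each of the three DNS servers would show which server's cache currently holds the incomplete entry.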
Finding 2: Earlier forwards to Alibaba all returned NOERROR
Other clients triggered cache refresh earlier. All 3 forwards to Alibaba returned NOERROR:
26145 | 11:47:33 | Snd 10.199.162.112 --> Q lingma-api.tongyi.aliyun.com (forward to Alibaba)
26155 | 11:47:33 | Rcv 10.199.162.112 <-- R NOERROR (Alibaba responded)
26157 | 11:47:33 | Snd 10.64.116.94 <-- R NOERROR (returned to client)
29515 | 11:47:34 | (2nd forward to Alibaba --> NOERROR)
59323 | 11:47:45 | (3rd forward to Alibaba --> NOERROR)
Note: Debug log only records packet metadata (direction, IP, response code, query name). It does NOT record Answer Section content. So we can see NOERROR but cannot see what records were inside the response.
Finding 3: CNAME Chase to general forwarder (KEY FINDING)
At 11:48:06, DNS Server sent a query to the general forwarder (10.111.125.34), NOT the Alibaba Conditional Forwarder, for the CNAME target domain:
118985 | 11:48:06 | Snd 10.111.125.34 --> Q lingma-api.tongyi.aliyun.com.gds.alibabadns.com
119193 | 11:48:06 | Rcv 10.111.125.34 <-- R NOERROR
This same CNAME chase pattern appeared for multiple other domains:
g.alicdn.com --> g.alicdn.com.gds.alibabadns.com
bluedot.is.autonavi.com --> bluedot.is.autonavi.com.gds.alibabadns.com
d-gm.mmstat.com --> d-gm.mmstat.com.gds.alibabadns.com
Finding 4: Event statistics
Over ~2.5 minutes of debug log: 752 client queries all served from cache, only 3 forwards to Alibaba, 1 CNAME chase.
Client-Side Packet Capture: Direct Evidence
A Wireshark capture between the Windows DNS Server (10.107.125.71) and the general forwarder (10.111.125.34) provided direct evidence of the CNAME chase and the intermittency:
Frame 81366 (11:42:41), failure case:
10.107.125.71 --> 10.111.125.34: Q lingma-api.tongyi.aliyun.com.gds.alibabadns.com
10.111.125.34 --> 10.107.125.71: R Answer RRs: 1
Answer:
CNAME: lingma-api.tongyi.aliyun.com.gds.alibabadns.com
--> ga-bp1mt0z4o22gjtuolz4j1.aliyunga0017.com
TTL: 300s
Additional: OPT only (NO A record!)
Frame 2191x (11:47:42), success case:
10.107.125.71 --> 10.111.125.34: Q lingma-api.tongyi.aliyun.com.gds.alibabadns.com
10.111.125.34 --> 10.107.125.71: R CNAME + A 47.57.143.171 (A record present)
This capture directly confirmed:
- CNAME Chase is happening: DNS Server queries the general forwarder for *.gds.alibabadns.com
- Intermittency root cause: the general forwarder (10.111.125.34) returns inconsistent results, sometimes CNAME-only (no A), sometimes CNAME + A
- The general forwarder should never have been involved: this query should have gone to Alibaba Cloud DNS, not the general forwarder
Logical Reasoning
Evidence chain:
- Alibaba Cloud DNS returns complete CNAME + A records (confirmed by Alibaba-side Wireshark)
- Client only receives CNAME without A (confirmed by nslookup)
- DNS Server performs CNAME chase to general forwarder for *.gds.alibabadns.com (confirmed by both debug log AND client-side Wireshark)
- General forwarder returns inconsistent results, sometimes with A, sometimes without (confirmed by client-side Wireshark)
Inference:
DNS Server's Secure Cache Against Pollution (enabled by default) performs bailiwick checks on each record in Alibaba's response:
- CNAME for lingma-api.tongyi.aliyun.com: within aliyun.com scope --> ACCEPTED
- CNAME for *.gds.alibabadns.com: outside aliyun.com scope --> DISCARDED
- A records for *.aliyunga0019.com: outside aliyun.com scope --> DISCARDED
Per Microsoft documentation and the related CERT advisory:
- KB241352: "a Windows-based DNS server can filter out the responses for these non-secure records"
- CERT VU#109475: "the DNS server will cache any records in a response even if those records are outside the namespace delegated to the remote DNS server" (when Secure Cache is disabled; when enabled, such out-of-bailiwick records are discarded)
DNS Server then attempts to resolve the CNAME target independently via the general forwarder, which produces inconsistent results.
Note: The A record being discarded by Secure Cache is an inference. We do not have a Wireshark capture on the DNS Server itself comparing inbound (from Alibaba) vs outbound (to client) packets. However, the evidence chain strongly supports this conclusion: if the A record were cached from Alibaba's response, the CNAME chase to the general forwarder would not occur.
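The inferred bailiwick check can be modeled in a few lines. This is an illustrative sketch of the rule as described in this case, not the Windows DNS Server implementation. `Test-InBailiwick` is a hypothetical helper, and the record list is copied from the Alibaba-side capture.

```powershell
# Illustrative model only; not the Windows DNS Server implementation.
function Test-InBailiwick {
    param([string]$Owner, [string]$Zone)
    $o = $Owner.TrimEnd('.').ToLowerInvariant()
    $z = $Zone.TrimEnd('.').ToLowerInvariant()
    # Match on label boundaries: the zone itself, or a name ending in ".zone".
    ($o -eq $z) -or $o.EndsWith(".$z", [System.StringComparison]::Ordinal)
}

# Owner names and types from the Alibaba-side capture.
$response = @(
    [pscustomobject]@{ Type = 'CNAME'; Owner = 'lingma-api.tongyi.aliyun.com' }
    [pscustomobject]@{ Type = 'CNAME'; Owner = 'lingma-api.tongyi.aliyun.com.gds.alibabadns.com' }
    [pscustomobject]@{ Type = 'A';     Owner = 'ga-bpladtuvv8ekkixpbejij.aliyunga0019.com' }
    [pscustomobject]@{ Type = 'A';     Owner = 'ga-bpladtuvv8ekkixpbejij.aliyunga0019.com' }
)

# Conditional forwarder scope is aliyun.com.
foreach ($rr in $response) {
    $verdict = if (Test-InBailiwick -Owner $rr.Owner -Zone 'aliyun.com') { 'ACCEPTED' } else { 'DISCARDED' }
    '{0,-9} {1,-5} {2}' -f $verdict, $rr.Type, $rr.Owner
}
```

Under this model only the first CNAME survives. The second owner name contains the text aliyun.com but ends in alibabadns.com, which is why the comparison is done on the name's suffix and not as a substring match.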
Root Cause
The issue was caused by two layers of problems:
Layer 1: Secure Cache Against Pollution + Conditional Forwarder scope mismatch:
- Alibaba Cloud DNS returns CNAME chains crossing domain boundaries (aliyun.com --> alibabadns.com --> aliyunga00xx.com)
- Conditional Forwarder only covers aliyun.com
- Secure Cache (default enabled) discards records outside aliyun.com scope, including the final A records
- DNS Server is forced to do CNAME chase via general forwarder instead of using Alibaba's complete response
Layer 2: General forwarder inconsistent resolution:
- The general forwarder (10.111.125.34) resolves *.gds.alibabadns.com inconsistently
- Sometimes returns CNAME + A --> works
- Sometimes returns CNAME only --> fails
- This explains the intermittent nature of the issue
Customer confirmed the general forwarder behavior was their issue.
Resolution
Option 1 (Recommended): Expand Conditional Forwarder coverage
# On all 3 DNS Servers:
Add-DnsServerConditionalForwarderZone -Name "alibabadns.com" -MasterServers "Ali_DNS_IP1","Ali_DNS_IP2","Ali_DNS_IP3","Ali_DNS_IP4"
This ensures *.gds.alibabadns.com queries go to Alibaba Cloud DNS (stable) instead of the general forwarder (inconsistent).
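Applying and verifying the change on all three servers could be scripted along these lines. This is a sketch under stated assumptions: `DNS01` through `DNS03` are hypothetical server names, the `Ali_DNS_IP*` entries are the same placeholders used above, and the session has remote management rights on each server. Clearing the cache drops all cached records on that server, not only the affected name.

```powershell
# Hypothetical names: replace with the real DNS servers and Alibaba DNS IPs.
$dnsServers = 'DNS01', 'DNS02', 'DNS03'
$aliDns     = 'Ali_DNS_IP1', 'Ali_DNS_IP2', 'Ali_DNS_IP3', 'Ali_DNS_IP4'

foreach ($server in $dnsServers) {
    # Skip servers where the zone already exists.
    $existing = Get-DnsServerZone -ComputerName $server -Name 'alibabadns.com' `
        -ErrorAction SilentlyContinue
    if (-not $existing) {
        Add-DnsServerConditionalForwarderZone -ComputerName $server `
            -Name 'alibabadns.com' -MasterServers $aliDns
    }

    # Drop any cached CNAME-only entry, then confirm A records come back.
    Clear-DnsServerCache -ComputerName $server -Force
    Resolve-DnsName -Name 'lingma-api.tongyi.aliyun.com' -Type A -Server $server -DnsOnly |
        Where-Object { $_.Type -eq 'A' }
}
```

The check queries the full client-facing name, so every hop of the CNAME chain is exercised, not just the alibabadns.com hop.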
Option 2: Fix general forwarder resolution
Investigate why the general forwarder (10.111.125.34) returns inconsistent results for *.gds.alibabadns.com.
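One way to start that investigation is to sample the general forwarder directly and record how often the answer lacks an A record. The sketch below assumes the forwarder accepts queries from the machine running it. The sample count and interval are arbitrary choices; the interval should be long enough for the forwarder's own cached entries to expire between probes.

```powershell
# Sample the general forwarder and tally answers with and without A records.
$forwarder = '10.111.125.34'
$name      = 'lingma-api.tongyi.aliyun.com.gds.alibabadns.com'
$samples   = 20    # assumption: number of probes
$interval  = 30    # assumption: seconds between probes

$results = foreach ($i in 1..$samples) {
    $answers = @(Resolve-DnsName -Name $name -Type A -Server $forwarder -DnsOnly `
            -ErrorAction SilentlyContinue |
        Where-Object { $_.Section -eq 'Answer' })

    [pscustomobject]@{
        Time   = Get-Date -Format 'HH:mm:ss'
        HasA   = [bool]($answers | Where-Object { $_.Type -eq 'A' })
        Target = ($answers | Where-Object { $_.Type -eq 'CNAME' }).NameHost -join ' -> '
    }
    Start-Sleep -Seconds $interval
}

$results | Format-Table -AutoSize
$results | Group-Object HasA | Select-Object Name, Count
```

Recording the CNAME target alongside the result helps show whether failures correlate with a particular target name, as the two captured frames point at different targets.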
Workaround for verification
# Temporarily disable Secure Cache (no restart required):
Set-DnsServerCache -PollutionProtection $false
Clear-DnsServerCache
# Test, then restore:
Set-DnsServerCache -PollutionProtection $true
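Because this workaround lowers cache protection, it is safer to wrap it so the original setting is restored even if the test step fails. This is a minimal sketch, assuming it runs elevated on the DNS server at 10.107.125.71 (the server queried in the nslookup reproduction).

```powershell
# Remember the current setting so it is restored even if the test throws.
$original = (Get-DnsServerCache).EnablePollutionProtection
try {
    Set-DnsServerCache -PollutionProtection $false
    Clear-DnsServerCache -Force

    # If the inference holds, A records now arrive with the CNAME chain.
    Resolve-DnsName -Name 'lingma-api.tongyi.aliyun.com' -Type A `
        -Server '10.107.125.71' -DnsOnly
}
finally {
    # Restore the original setting and flush anything cached while unprotected.
    Set-DnsServerCache -PollutionProtection $original
    Clear-DnsServerCache -Force
}
```

If the A records appear only while protection is off, that supports the Secure Cache inference; it is a verification step, not a fix.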