MacOSX 10.6 Rejected DNS responses causing IPv6 failures

Originator: ron
Number: rdar://7333104
Date Originated: 23-Oct-2009 10:54 PM
Status: Duplicate/7327894
Resolved: No
Product: Mac OS X
Product Version: 10.6
Classification: Serious Bug
Reproducible: Yes
 
Summary:
After upgrading to 10.6, users report that they cannot reach certain web sites from IPv6-only networks, or that connections to dual-stack (IPv4 + IPv6) web sites use IPv4 transport instead of IPv6 transport, as was the case in 10.5 and earlier and in all other operating systems.  The general complaint is that 10.6 prefers IPv4 over IPv6, when it is supposed to be the other way around (per RFC 3484).

Sniffing the traffic shows that every name lookup results in two DNS queries, one for the "A" resource record and one for the "AAAA" record, as expected.  When we look up a system that operates both IPv4 and IPv6, we also see a reply to each query: one with the IPv4 addresses ("A" records) and one with the IPv6 addresses ("AAAA" records).  However, whichever reply arrives second is usually rejected, as evidenced by a "UDP Port Unreachable" ICMP response from the Mac OS X 10.6 machine that initiated the query in the first place.

Judging from the mDNSResponder debug output, the first response causes the remaining query to be canceled and its socket closed.  When the second response then arrives, the OS rejects it, so mDNSResponder never receives it.

The result is that the first reply wins.  If the first reply happens to be the "A" record (most often the case), then the "AAAA" answer is rejected, and the eventual client connection is made via IPv4.  If the first reply is the "AAAA", then that one wins and the connection is made over IPv6.

Here is an example when the "A" worked, but the "AAAA" was rejected (lookup for www.kame.net):

21:01:40.583892 192.168.1.2.62499 > 192.168.1.1.domain: 20561+ A? www.kame.net. (30)
21:01:40.594347 192.168.1.2.55014 > 192.168.1.1.domain: 24513+ AAAA? www.kame.net. (30)
21:01:40.739626 192.168.1.1.domain > 192.168.1.2.62499: 20561 1/0/0 www.kame.net. A 203.178.141.194 (46)
21:01:40.904784 192.168.1.1.domain > 192.168.1.2.55014: 24513 1/0/0 www.kame.net. AAAA 2001:200::8002:203:47ff:fea5:3085 (58)
21:01:40.904812 192.168.1.2 > 192.168.1.1: ICMP 192.168.1.2 udp port 55014 unreachable, length 36
    192.168.1.1.domain > 192.168.1.2.55014: [|domain]

Here's an example where the "AAAA" was first, and the "A" was rejected (lookup for www.sixxs.net):

20:32:53.877307 192.168.1.2.64639 > 192.168.1.1.domain: 63083+ A? www.sixxs.net. (31)
20:32:53.886759 192.168.1.2.63909 > 192.168.1.1.domain: 44461+ AAAA? www.sixxs.net. (31)
20:32:54.056428 192.168.1.1.domain > 192.168.1.2.63909: 44461 2/0/0 www.sixxs.net. CNAME www.m.sixxs.net., www.m.sixxs.net. AAAA 2001:838:1:1:210:dcff:fe20:7c7c (79)
20:32:54.167615 192.168.1.1.domain > 192.168.1.2.64639: 63083 3/0/0 www.sixxs.net. CNAME www.m.sixxs.net., www.m.sixxs.net. A 213.204.193.2, www.m.sixxs.net. A 193.109.122.244 (83)
20:32:54.167658 192.168.1.2 > 192.168.1.1: ICMP 192.168.1.2 udp port 64639 unreachable, length 36
    192.168.1.1.domain > 192.168.1.2.64639: [|domain]

In both captures, the second reply is rejected with a UDP Port Unreachable.

Here's the debug output from mDNSResponder for that www.kame.net lookup:

Oct 23 21:01:40 neko mDNSResponder[17]:  43: Error socket 40 created 00000000 00000233
Oct 23 21:01:40 neko mDNSResponder[17]:  43: DNSServiceQueryRecord(www.kame.net., Addr, 5000) START
Oct 23 21:01:40 neko mDNSResponder[17]:  43: Error socket 40 closed  00000000 00000233 (0)
Oct 23 21:01:40 neko mDNSResponder[17]:  43: Error socket 40 created 00000000 00000234
Oct 23 21:01:40 neko mDNSResponder[17]:  43: DNSServiceQueryRecord(www.kame.net., AAAA, 5000) START
Oct 23 21:01:40 neko mDNSResponder[17]:  43: Error socket 40 closed  00000000 00000234 (0)
Oct 23 21:01:40 neko mDNSResponder[17]: -- Sent UDP DNS Query (flags 0100) RCODE: NoErr (0) RD ID: 20561 18 bytes from port 62499 to 192.168.1.1:53 --
Oct 23 21:01:40 neko mDNSResponder[17]:  1 Questions
Oct 23 21:01:40 neko mDNSResponder[17]:  0 www.kame.net. Addr
Oct 23 21:01:40 neko mDNSResponder[17]:  0 Answers
Oct 23 21:01:40 neko mDNSResponder[17]:  0 Authorities
Oct 23 21:01:40 neko mDNSResponder[17]:  0 Additionals
Oct 23 21:01:40 neko mDNSResponder[17]: --------------
Oct 23 21:01:40 neko mDNSResponder[17]: -- Sent UDP DNS Query (flags 0100) RCODE: NoErr (0) RD ID: 24513 18 bytes from port 55014 to 192.168.1.1:53 --
Oct 23 21:01:40 neko mDNSResponder[17]:  1 Questions
Oct 23 21:01:40 neko mDNSResponder[17]:  0 www.kame.net. AAAA
Oct 23 21:01:40 neko mDNSResponder[17]:  0 Answers
Oct 23 21:01:40 neko mDNSResponder[17]:  0 Authorities
Oct 23 21:01:40 neko mDNSResponder[17]:  0 Additionals
Oct 23 21:01:40 neko mDNSResponder[17]: --------------
Oct 23 21:01:40 neko mDNSResponder[17]: -- Received UDP DNS Response (flags 8180) RCODE: NoErr (0) RD RA ID: 20561 34 bytes from 192.168.1.1:53 to 192.168.1.2:62499 --
Oct 23 21:01:40 neko mDNSResponder[17]:  1 Questions
Oct 23 21:01:40 neko mDNSResponder[17]:  0 www.kame.net. Addr
Oct 23 21:01:40 neko mDNSResponder[17]:  1 Answers
Oct 23 21:01:40 neko mDNSResponder[17]:  0 TTL    900    4 www.kame.net. Addr 203.178.141.194
Oct 23 21:01:40 neko mDNSResponder[17]:  0 Authorities
Oct 23 21:01:40 neko mDNSResponder[17]:  0 Additionals
Oct 23 21:01:40 neko mDNSResponder[17]: --------------
Oct 23 21:01:40 neko mDNSResponder[17]:  43: DNSServiceQueryRecord(www.kame.net., Addr) ADD    4 www.kame.net. Addr 203.178.141.194
Oct 23 21:01:40 neko mDNSResponder[17]:  43: Cancel 00000000 00000233
Oct 23 21:01:40 neko mDNSResponder[17]:  43: DNSServiceQueryRecord(www.kame.net., Addr) STOP
Oct 23 21:01:40 neko mDNSResponder[17]:  43: Cancel 00000000 00000234
Oct 23 21:01:40 neko mDNSResponder[17]:  43: DNSServiceQueryRecord(www.kame.net., AAAA) STOP

Observe how after the first answer was received and accepted, all related queries were STOPped/Canceled.  There is no record of ever receiving the second response.

Steps to Reproduce:

Use Safari to browse to an IPv6-enabled web site (http://www.kame.net), from a machine that is dual stack and has reachability to the IPv4 Internet and IPv6 Internet, and observe that it is connecting using IPv4 instead of IPv6.   Use "tcpdump -i en0 -s 1500 icmp or port 53" to observe the DNS behavior:

# tcpdump -i en0 -s 1500 icmp or port 53
listening on en0, link-type EN10MB (Ethernet), capture size 1500 bytes
22:38:45.524982 IP 192.168.1.2.56540 > 192.168.1.1.domain: 3163+ A? www.kame.net. (30)
22:38:45.534432 IP 192.168.1.2.63324 > 192.168.1.1.domain: 41675+ AAAA? www.kame.net. (30)
22:38:45.578476 IP 192.168.1.1.domain > 192.168.1.2.56540: 3163 1/0/0 A 203.178.141.194 (46)
22:38:45.754904 IP 192.168.1.1.domain > 192.168.1.2.63324: 41675 1/0/0 AAAA 2001:200::8002:203:47ff:fea5:3085 (58)
22:38:45.754943 IP 192.168.1.2 > 192.168.1.1: ICMP 192.168.1.2 udp port 63324 unreachable, length 36

See how the "AAAA" reply is rejected.

Expected Results:

The "AAAA" should not be rejected, and the browser should receive the AAAA record, and connect to the web site using IPv6.

Actual Results:

The "AAAA" is rejected, so the browser never knows the IPv6 address, and connects using IPv4.

Regression:

This problem did not occur in 10.5 or any previous version of MacOSX that had IPv6 support.  It only appears after upgrade to 10.6.  It is not fixed in 10.6.1.

Notes:

We have a large research and engineering network that is dual-stack everywhere (all systems, all networks), and we received complaints from all users that upgraded to 10.6.  Since this impacts us so seriously, we've had to tell all Mac users to NOT upgrade to 10.6 until this issue is resolved.  We have not found a workaround.

Comments

I recently ran into a similar issue.

Below are a few tests from a machine running Mac OS X 10.6.3 with IPv6 connectivity via 6to4. The DNS cache was flushed prior to these tests using dscacheutil -flushcache.

$ cat gai.c 
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

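int main(int argc, char **argv) {
  struct addrinfo h, *airoot, *ai;
  int r;

  if(argc < 2) {
    fprintf(stderr, "usage: %s hostname\n", argv[0]);
    return 1;
  }

  memset(&h, 0, sizeof(h));
  h.ai_family = PF_UNSPEC;
  h.ai_socktype = SOCK_STREAM;
  /* getaddrinfo() returns 0 on success and a nonzero EAI_* code on failure */
  if((r = getaddrinfo(argv[1], "http", &h, &airoot)) != 0) {
    fprintf(stderr, "%s\n", gai_strerror(r));
    return 1;
  }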

  for(ai = airoot; ai; ai = ai->ai_next)
    printf("Got record of type %s\n", ai->ai_family == PF_INET6? "IPv6": "IPv4");

  freeaddrinfo(airoot);

  return 0;
}
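When comparing the program output with the tcpdump capture, it can help to see the addresses themselves rather than only the family. A possible variant, offered as a sketch: list_addresses is a hypothetical helper, and it passes no service name so that it does not depend on an "http" entry in the services database.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: prints every address getaddrinfo() returns for host,
 * in the order returned, and gives back the count (-1 on resolver error). */
int list_addresses(const char *host) {
  struct addrinfo hints, *root, *ai;
  char text[128];
  int r, count = 0;

  assert(host != NULL);
  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  if ((r = getaddrinfo(host, NULL, &hints, &root)) != 0) {
    fprintf(stderr, "%s: %s\n", host, gai_strerror(r));
    return -1;
  }
  for (ai = root; ai; ai = ai->ai_next) {
    /* NI_NUMERICHOST formats the address itself, with no reverse lookup. */
    if (getnameinfo(ai->ai_addr, ai->ai_addrlen, text, sizeof(text),
                    NULL, 0, NI_NUMERICHOST) != 0)
      continue;
    printf("%s %s\n", ai->ai_family == AF_INET6 ? "IPv6" : "IPv4", text);
    count++;
  }
  freeaddrinfo(root);
  return count;
}

int main(int argc, char **argv) {
  /* Defaults to a literal address so the program runs without DNS. */
  const char *host = argc > 1 ? argv[1] : "127.0.0.1";
  return list_addresses(host) < 0;
}
```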

DNS and ICMP traffic is logged using: tcpdump -i en0 -nts0 port 53 or icmp

Test against host with A and AAAA

$ ./gai www.kame.net
Got record of type IPv6
Got record of type IPv4

IP w.x.y.z.55040 > 195.67.199.33.53: 64953+ A? www.kame.net. (30)
IP 195.67.199.33.53 > w.x.y.z.55040: 64953 1/2/6 A 203.178.141.194 (241)
IP w.x.y.z.54122 > 195.67.199.33.53: 30937+ AAAA? www.kame.net. (30)
IP 195.67.199.33.53 > w.x.y.z.54122: 30937 1/2/6 AAAA 2001:200::8002:203:47ff:fea5:3085 (253)

Test against CNAME -> AAAA

$ ./gai ipv6.google.com
Got record of type IPv6
Got record of type IPv6
Got record of type IPv6
Got record of type IPv6
Got record of type IPv6
Got record of type IPv6

IP w.x.y.z.59259 > 195.67.199.33.53: 44704+ A? ipv6.google.com. (33)
IP w.x.y.z.57969 > 195.67.199.33.53: 58233+ AAAA? ipv6.google.com. (33)
IP 195.67.199.33.53 > w.x.y.z.57969: 58233 7/4/4 CNAME ipv6.l.google.com., AAAA 2a00:1450:8003::68, AAAA 2a00:1450:8003::6a, AAAA 2a00:1450:8003::69, AAAA 2a00:1450:8003::63, AAAA 2a00:1450:8003::67, AAAA 2a00:1450:8003::93 (358)
IP 195.67.199.33.53 > w.x.y.z.59259: 44704 1/1/0 CNAME ipv6.l.google.com. (104)
IP w.x.y.z > 195.67.199.33: ICMP w.x.y.z udp port 59259 unreachable, length 36
IP w.x.y.z.63900 > 195.67.199.33.53: 17200+ A? ipv6.l.google.com. (35)
IP 195.67.199.33.53 > w.x.y.z.63900: 17200 0/1/0 (85)

Test against CNAME -> A and AAAA

$ ./gai www.sixxs.net
Got record of type IPv4
Got record of type IPv4
Got record of type IPv4
Got record of type IPv4

IP w.x.y.z.56307 > 195.67.199.33.53: 54233+ A? www.sixxs.net. (31)
IP w.x.y.z.55073 > 195.67.199.33.53: 32481+ AAAA? www.sixxs.net. (31)
IP 195.67.199.33.53 > w.x.y.z.56307: 54233 5/3/2 CNAME nginx.sixxs.net., A 94.75.219.73, A 193.109.122.244, A 213.204.193.2, A 213.197.30.67 (235)
IP 195.67.199.33.53 > w.x.y.z.55073: 32481 5/3/4 CNAME nginx.sixxs.net., AAAA 2001:1af8:1:f006::6, AAAA 2001:838:2:1::30:67, AAAA 2001:960:800::2, AAAA 2001:7b8:3:4f:202:b3ff:fe46:bec (327)
IP w.x.y.z > 195.67.199.33: ICMP w.x.y.z udp port 55073 unreachable, length 36

Re-runs of the above test (after dscacheutil -flushcache) are not consistent:

$ ./gai www.sixxs.net
Got record of type IPv6
Got record of type IPv6
Got record of type IPv6
Got record of type IPv6
Got record of type IPv4
Got record of type IPv4
Got record of type IPv4
Got record of type IPv4

IP w.x.y.z.56515 > 195.67.199.33.53: 272+ A? www.sixxs.net. (31)
IP w.x.y.z.64103 > 195.67.199.33.53: 8984+ AAAA? www.sixxs.net. (31)
IP 195.67.199.33.53 > w.x.y.z.64103: 8984 5/3/4 CNAME nginx.sixxs.net., AAAA 2001:960:800::2, AAAA 2001:7b8:3:4f:202:b3ff:fe46:bec, AAAA 2001:1af8:1:f006::6, AAAA 2001:838:2:1::30:67 (327)
IP 195.67.199.33.53 > w.x.y.z.56515: 272 5/3/3 CNAME nginx.sixxs.net., A 94.75.219.73, A 193.109.122.244, A 213.204.193.2, A 213.197.30.67 (251)
IP w.x.y.z > 195.67.199.33: ICMP w.x.y.z udp port 56515 unreachable, length 36
IP w.x.y.z.62291 > 195.67.199.33.53: 3295+ A? nginx.sixxs.net. (33)
IP 195.67.199.33.53 > w.x.y.z.62291: 3295 4/3/4 A 94.75.219.73, A 193.109.122.244, A 213.204.193.2, A 213.197.30.67 (261)

This seems broken.

BTW, below is a video that explains the reasoning behind the multiple queries:

http://www.stuartcheshire.org/IETF72/ (Apple IPv6 Experiences, by Stuart Cheshire)

By noah.williamsson at April 9, 2010, 6:56 p.m.

It's worse than that, I'm afraid. If you have a host with a single AAAA record, then sometimes, if the NXDOMAIN reply gets through before the AAAA reply, mDNSResponder will abort the query altogether, tell the calling application that the hostname doesn't exist, and cache the negative reply.

