DNS insights - UDP vs TCP and EDNS

In this article I describe research I did on DNS (Domain Name System). The particular issue discussed here is when and how DNS uses TCP or UDP as its transport, depending on the packet size. Normally, DNS carries queries and replies over UDP unless there is a zone transfer, incremental or full (IXFR or AXFR, respectively). According to the original DNS specification in RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt), UDP messages are limited to 512 bytes, so DNS also uses TCP for larger replies. However, RFC 6891 (Extension Mechanisms for DNS), published in 2013 and labelled EDNS(0), lets a client advertise that it can accept larger UDP replies. https://tools.ietf.org/html/rfc6891
I came across this issue by accident, but as we will see, there are some poorly documented reasons why a DNS exchange may fail to follow EDNS.
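To make the difference between a plain RFC 1035 query and an EDNS(0) query concrete, here is a minimal sketch that builds both on the wire using only the Python standard library. The function names, the query ID and the 4096-byte buffer size are my own illustrative choices, not something taken from the tests in this article. The only difference between the two queries is the OPT pseudo-record in the additional section, whose CLASS field carries the UDP payload size the client is willing to accept.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name, qtype=16, edns_size=None, query_id=0x1234):
    """Build a DNS query (default QTYPE 16 = TXT).

    If edns_size is given, append an OPT pseudo-record (TYPE 41) that
    advertises that UDP payload size; otherwise build a plain query.
    """
    arcount = 1 if edns_size is not None else 0
    # Header: ID, flags (only RD set), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    if edns_size is None:
        return header + question
    # OPT RR: root name, TYPE=41, CLASS=UDP payload size,
    # TTL=0 (extended RCODE, version 0, no flags), RDLENGTH=0
    opt = b"\x00" + struct.pack("!HHIH", 41, edns_size, 0, 0)
    return header + question + opt

plain = build_query("example.com")
edns = build_query("example.com", edns_size=4096)
print(len(plain), len(edns))
```

The EDNS query is exactly 11 bytes longer than the plain one, which is the size of an empty OPT record.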
For a proper investigation, I worked with the following assumptions:

a) I wanted to check DNS clients on both Windows and Linux
b) I wanted to check different DNS client applications ("dig" and "nslookup")
c) I wanted to check local DNS servers of different flavours (Windows DNS and Linux/UNIX BIND) as well as public DNS servers of both kinds (Linux and Windows)
d) I had to check DNS transactions for TXT and DNSSEC records, as those are the most likely to generate replies > 512 bytes

The number of possible combinations grew as I considered a number of public DNS servers, so I will restrict the discussion to several representative examples - those responsible for the google.com, Microsoft.com, redhat.com and oracle.com domains.
The first test I will describe uses the dig client on Linux to check DNSSEC records from a public DNS server. First, note that dig receives a legitimate reply over UDP.
Running a wireshark session alongside the query confirms that the whole transaction is UDP.
Notice that the size is 231 bytes, so there is no reason to switch over to TCP. I needed another test in which the DNSSEC query returns a reply > 512 bytes, so I set up a little lab and created sufficiently long records. The next test runs a dig DNSSEC query against the local DNS server running on Windows 2016 server.
Checking the wireshark session, we can establish that the reply came over UDP even though the payload was > 512 bytes: 1409 bytes, to be exact.
The next test assessed the DNS reply for a long TXT record from the local DNS server running BIND 9.11 on Linux.
Again wireshark shows that the response (912 bytes) comes via UDP.
And now comes the interesting part - the next test checks a TXT record from a local DNS server running on Windows 2012. But this time, we run the query via nslookup.
Let's see what wireshark shows.
We run into the first surprise - note that the DNS exchange is followed by a TCP three-way handshake to destination port 53 (DNS service). The length is 631 bytes, and it triggered DNS to switch over to TCP. So, why is that - is Windows 2012 DNS not RFC 6891 compatible, or does it have the EDNS feature disabled? We check the configuration and all looks ok - EDNS is on.
Something else must be wrong... Let us run the same check against Windows 2016 - maybe it's only a Windows 2012 DNS error?
What does wireshark say?
The same issue here - the DNS query triggers the switchover to TCP. Note this is an nslookup query coming from a Windows 10 client machine to a Windows 2016 DNS server with a long TXT record. The behaviour is the same as with Windows 2012. Further digging into the wireshark session reveals that the response packet from the DNS server has the "Truncated" flag set.
This makes sense - a DNS server is supposed to return "Truncated" when the reply is too large to fit in 512 bytes. But the problem is that this behaviour follows the old RFC 1035. Look at what the EDNS extension says about truncated packets:
This is not what happened here, even though EDNS was enabled on both the Windows 2012 and Windows 2016 servers. At this stage I decided to run several tests against public DNS servers to check whether my local installations were faulty in any way.
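The "Truncated" flag that wireshark displays is the TC bit in the 16-bit flags field of the DNS header. As a rough sketch of how a client could detect it in a raw reply (the function names and the returned dictionary layout are illustrative assumptions, not part of any tool used in these tests):

```python
import struct

QR_MASK = 0x8000  # set in responses, clear in queries
TC_MASK = 0x0200  # TC (truncation) bit

def parse_header(message: bytes) -> dict:
    """Decode the fixed 12-byte DNS header of a raw message."""
    if len(message) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", message[:12])
    return {
        "id": ident,
        "is_response": bool(flags & QR_MASK),
        "truncated": bool(flags & TC_MASK),
        "rcode": flags & 0x000F,
        "answers": an,
        "additional": ar,
    }

def needs_tcp_retry(message: bytes) -> bool:
    """True if this is a response with the TC bit set."""
    header = parse_header(message)
    return header["is_response"] and header["truncated"]

# Hand-made headers: flags 0x8300 = QR + TC + RD, flags 0x8180 = QR + RD + RA
truncated_reply = bytes.fromhex("000183000001000000000000")
complete_reply = bytes.fromhex("000181800001000100000000")
print(needs_tcp_retry(truncated_reply), needs_tcp_retry(complete_reply))
```

A client that sees the TC bit is expected to repeat the same query over TCP, which matches the handshake to port 53 seen in the capture.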
The next test queries a TXT record on a public DNS server responsible for the Microsoft.com domain (one of many servers).
Note the "Truncated, retrying in TCP mode". Let us check what an EDNS compliance checker says.
It says, essentially, that it's ok; there is just an SOA record that appears where it should not. But this is not our problem. Let us check some redhat.com servers.
This one remains in UDP - no "Truncated", no retrying in TCP mode...
Wireshark also shows that UDP carries the whole transaction (note the size of 1345 bytes). So, BIND (or whatever redhat.com runs) is EDNS compatible. Let us go ahead and check oracle.com.
Again, it looks EDNS compatible. Just to be sure, check wireshark.
Sure enough, 2685 bytes over UDP.
After this, there was one more combination to check - the dig client on Linux querying the local Windows 2012 and 2016 servers, which seemed incompatible with EDNS when queried with nslookup.
A long TXT query against the local Windows DNS, this time with dig, showed no "Truncated" flag.
Confirmed - 662 bytes transferred over UDP. No TCP.
There are a couple of conclusions here:

a) The nslookup client implementation seems to change how UDP replies >512 bytes are negotiated. An nslookup query will switch over to TCP whereas dig will remain in UDP. Also bear in mind that nslookup has no capability to query for DNSSEC records.

Using dig is recommended.

b) Non-compliance with RFC 6891 will not be critical for most internet browsing (where only PTR and A records matter), but it may cause issues with SMTP server checks that use DNSSEC records. If the firewall does not allow DNS via TCP and the DNS server is not EDNS compliant, the response will time out.

c) Firewalls must allow DNS via TCP in case you’re dealing with non-RFC 6891 compatible DNS servers; otherwise timeouts will occur.

d) Internet browsers and other applications do not necessarily use nslookup; they rely on Windows APIs like “gethostbyname” for name resolution. This research did not inspect the behaviour of DNS resolution APIs with regard to processing UDP DNS packets > 512 bytes.
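For points b) and c), it helps to see what the TCP fallback looks like from the client side. DNS over TCP uses the same message format as UDP, but each message is prefixed with a 2-byte length field (RFC 1035, section 4.2.2). The sketch below is a minimal illustration using only the standard library; the function names, the default timeout and the choice to return None on failure are my own assumptions. If a firewall drops TCP port 53, a call like this fails with a timeout, which is the symptom described above.

```python
import socket
import struct

def frame_for_tcp(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte big-endian length."""
    return struct.pack("!H", len(message)) + message

def recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed before full message arrived")
        buf += chunk
    return buf

def query_over_tcp(server: str, message: bytes, port: int = 53, timeout: float = 3.0):
    """Send one DNS message over TCP; return the raw reply, or None on failure.

    None covers refused connections, timeouts (for example a firewall
    silently dropping TCP/53) and connections closed mid-reply.
    """
    try:
        with socket.create_connection((server, port), timeout=timeout) as sock:
            sock.sendall(frame_for_tcp(message))
            (length,) = struct.unpack("!H", recv_exact(sock, 2))
            return recv_exact(sock, length)
    except OSError:
        return None
```

Here `message` is a raw DNS query in wire format, and `server` would be the address of the DNS server under test.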


Interestingly, the “Truncated” flag is actually contained in the DNS reply, meaning it comes from the server. So it was not obvious to me why the choice of client (nslookup vs dig) affects how the server negotiates further transmission. Probably nslookup is simply too old and does not "understand" the newer DNS specification: under RFC 6891, a server may exceed 512 bytes over UDP only when the query itself carries an OPT record advertising a larger buffer, so a query sent without one would be answered under the old RFC 1035 rules.
However, the results are conclusive - queries with nslookup cause the server to switch over to TCP. The only exception to this rule was one of the public Microsoft DNS servers, which truncated the UDP reply >512 bytes and switched over to TCP even though the query came from the dig client. Yet even that Microsoft DNS server apparently indicated that EDNS0 was in use (though it seems it did not implement it properly).
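The behaviour observed in the tests can be summarised as a small decision model of the server side. This is only a sketch of the RFC 1035 / RFC 6891 size rules, not the logic of any particular DNS server: the function names are mine, and the 4096-byte values used for the server maximum and for the size advertised by dig are assumptions (the actual values depend on the software version and configuration).

```python
CLASSIC_UDP_LIMIT = 512  # RFC 1035 limit for a DNS message over UDP

def udp_limit(client_advertised=None, server_max=4096):
    """Largest UDP reply the server may send for a given query.

    client_advertised is the payload size from the query's OPT record,
    or None if the query carried no OPT record at all.
    """
    if client_advertised is None:
        return CLASSIC_UDP_LIMIT  # no EDNS in the query: RFC 1035 rules apply
    # RFC 6891: advertised values below 512 are treated as 512
    return min(max(client_advertised, CLASSIC_UDP_LIMIT), server_max)

def transport_for_reply(reply_size, client_advertised=None, server_max=4096):
    """Return 'udp' if the reply fits, else 'tcp' (truncate and retry)."""
    limit = udp_limit(client_advertised, server_max)
    return "udp" if reply_size <= limit else "tcp"

# Reply sizes taken from the tests in this article
print(transport_for_reply(631))         # query without OPT (assumed nslookup case)
print(transport_for_reply(662, 4096))   # query with OPT (assumed dig case)
```

In this model a server configured with a small `server_max` would truncate even for an EDNS-capable client. That is one possible, unverified reading of the Microsoft.com result.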

A small utility that checks the EDNS compatibility can be found on my GitHub site.
https://github.com/adenosine-phosphatase/edns





