There’s a paper from 2013 outlining a fragmentation attack on DNS that allows an off-path attacker to poison certain DNS results using IP fragmentation. I’ve been thinking about mitigation techniques and I’m interested in hearing what this group thinks.
To paraphrase the attack: Find an authoritative resolver (target A) that fragments its replies. Convince a recursive resolver (target R) to query that authoritative resolver, and at the same time (or shortly before), send target R a fake second fragment. Since the query id and source port are in the first fragment, this bypasses the usual anti-spoofing mechanisms in DNS. Target R reassembles the fake second fragment with the real first fragment and accepts it as a valid response.
I think one mitigation (thanks to Andrew Ayer for the idea) is for target R to set the Requester’s Payload Size in EDNS(0) to a low value. (Edit 2018/11/20: We have since implemented this mitigation) This should cause authoritative resolvers to truncate answers more frequently rather than fragmenting them. The truncated answers in turn would cause a recursive resolver to fall back to TCP. Downsides:
Authoritative resolvers that return large responses but don’t support TCP would stop working. Hopefully there aren’t too many of these, since TCP support and large responses in theory go hand in hand.
An increase in the frequency of TCP fallback would use more resources on the recursive resolver. This is probably worthwhile for the extra security.
There’s the additional question: Since the attack depends on IP-level fragmentation, does TCP actually protect against it? If the authoritative resolver sets the DF (Don’t Fragment) bit on its TCP packets, then yes. But that flag is not universally set, and I don’t have good numbers on how common it is.
However, I think the set of TCP responses that will be fragmented is much smaller. Under UDP, if the recursive resolver sets the Requester’s Payload Size option to 4096, for instance, the authoritative resolver will send datagrams up to 4096 bytes in size. That’s almost certainly larger than the MTU of the authoritative resolver’s outbound interface, so any large DNS answer will be immediately fragmented at the IP layer, since that’s the only way to break up a datagram.
However, TCP has a notion of segments. An authoritative resolver sending a 4096-byte reply via TCP would not fragment the reply. Instead, it would break it up into segments based on the host’s outbound interface MTU. Most likely that’s 1500, the Ethernet MTU. You would only get fragmentation if the reply subsequently passed through a link with a lower MTU. Certainly not impossible, but this greatly reduces the scope of authoritative resolvers available to exploit.
That last fraction of potentially vulnerable authoritative resolvers could in theory be eliminated by blocking fragmented packets at the recursive resolver, at the cost of effectively blocking large responses from those resolvers. Since the authoritative resolvers can also protect themselves by keeping response sizes small, it seems like blocking all fragmented packets would probably be a bad tradeoff.
There’s some interesting quantitative research that could be done here: For each hostname in CT, look up A, AAAA, TXT, and DNS records, once with a resolver that sets Requester’s Payload Size to 4096, and once with one that omits it (and does TCP fallback). Count up the differences in success rates. For good measure, also track the sizes of the responses to create a CDF.