During secondary validation: No valid IP addresses found

Last Thursday some of our Unbound instances were running out of memory, so we deployed a change to the Unbound config to add:

msg-buffer-size: 4096

This is in addition to the edns-buffer-size: 512 setting we already have. Note that the semantics differ: edns-buffer-size controls the maximum response size we tell other DNS servers we are willing to accept over UDP, but doesn't limit the size of TCP responses. msg-buffer-size limits the size of both TCP and UDP responses.

Boulder has its own limit: it won't handle responses from Unbound larger than 4096 bytes. This led us to conclude the msg-buffer-size change would not impact the vast majority of users. However, as we've seen here, there are some domain names where Unbound receives large responses over TCP during a recursive lookup, but the overall result of the query is small and can be processed by Boulder. This category of domain names was broken by the change.
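To make the failure category concrete, here is a rough Python model of the two limits. It is only a sketch of the reasoning above, not how Unbound or Boulder are implemented: the function name, the outcome strings, and the example sizes are made up for illustration, and 65552 is what I understand Unbound's documented default for msg-buffer-size to be (check your version).

```python
BOULDER_MAX_RESPONSE = 4096       # largest Unbound response Boulder will handle (from the post)
DEFAULT_MSG_BUFFER_SIZE = 65552   # assumed Unbound default; verify against your unbound.conf docs


def lookup_outcome(intermediate_sizes, final_size, msg_buffer_size=DEFAULT_MSG_BUFFER_SIZE):
    """Classify a simplified recursive lookup.

    intermediate_sizes: sizes in bytes of the responses Unbound receives from
        upstream servers while recursing (over UDP or TCP).
    final_size: size in bytes of the answer Unbound hands back to Boulder.
    """
    # msg-buffer-size caps every message Unbound handles, regardless of transport,
    # so one oversized intermediate response is enough to fail the whole lookup.
    for size in list(intermediate_sizes) + [final_size]:
        if size > msg_buffer_size:
            return "unbound: SERVFAIL"
    # Boulder's limit only applies to the final answer it receives from Unbound.
    if final_size > BOULDER_MAX_RESPONSE:
        return "boulder: response too large"
    return "ok"
```

With the default cap, a lookup with a 6000-byte intermediate response and a 120-byte final answer comes out as "ok". With msg_buffer_size=4096 the same lookup becomes "unbound: SERVFAIL", even though Boulder never sees anything larger than 120 bytes.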

We're rolling back that change now, which should fix these issues. My apologies for not notifying everyone here about the change earlier, and for the breakage and time people spent debugging. I really appreciate all your help digging into the issue.

The current Boulder LookupHost logic looks up both A and AAAA records in parallel. If either lookup succeeds, LookupHost will return the list of addresses from that lookup.

In my testing with msg-buffer-size: 4096, a query for the A records for log.aeschi.eu returns a SERVFAIL, but a query for AAAA returns NOERROR with a CNAME and no IP addresses:

$ dig AAAA log.aeschi.eu -p 1053 @127.0.0.1

; <<>> DiG 9.16.1-Ubuntu <<>> AAAA log.aeschi.eu -p 1053 @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3583
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;log.aeschi.eu.                 IN      AAAA

;; ANSWER SECTION:
log.aeschi.eu.          0       IN      CNAME   www.aeschi.eu.

;; AUTHORITY SECTION:
aeschi.eu.              0       IN      SOA     ns0.core.aeschi.eu. hostmaster.aeschi.eu. 2020050711 1800 3600 604800 1200

Since that query succeeded, LookupHost would consider this a successful response, with no IP addresses.
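Here is a minimal Python sketch of that behaviour, for anyone who wants to see it in miniature. It is not Boulder's code (Boulder is written in Go): lookup_host, fake_resolve, and DNSFailure are hypothetical names, the resolver just replays the two responses described above, and merging addresses when both lookups succeed is my assumption.

```python
from concurrent.futures import ThreadPoolExecutor


class DNSFailure(Exception):
    """Raised by the stand-in resolver for a failed query such as SERVFAIL."""


def fake_resolve(hostname, rrtype):
    """Stand-in resolver replaying the responses from the post; not a real DNS client."""
    if rrtype == "A":
        raise DNSFailure("SERVFAIL")
    # AAAA: NOERROR carrying only a CNAME, so there are no addresses to return.
    return []


def lookup_host(hostname, resolve):
    """Query A and AAAA in parallel; fail only if both queries fail."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(resolve, hostname, rrtype) for rrtype in ("A", "AAAA")]
    # Leaving the with-block waits for both queries to finish.
    addresses, errors = [], []
    for future in futures:
        try:
            addresses.extend(future.result())
        except DNSFailure as exc:
            errors.append(exc)
    if len(errors) == len(futures):
        raise errors[0]
    # One query succeeded, so this counts as success even if the list is empty.
    return addresses
```

Calling lookup_host("log.aeschi.eu", fake_resolve) returns an empty list rather than raising, which matches the "No valid IP addresses found" error in the title: the SERVFAIL on the A query is masked by the AAAA query that succeeded with nothing useful in it.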
