DNS problem: SERVFAIL looking up A for go.airliquide.com

Howdy!

I’m unable to request an SSL cert for go.airliquide.com due to the following reported error: “DNS problem: SERVFAIL looking up A for go.airliquide.com - the domain’s nameservers may be malfunctioning”

Taking that domain for a spin on unboundtest.com [1] shows that DNS resolution indeed fails due to a “Missing DNSKEY RRset in response to DNSKEY query”, thus we “[c]ould not establish a chain of trust to keys for airliquide.com.”

It’s probably worth mentioning that airliquide.com’s nameservers are configured for DNSSEC.

Out of curiosity, I ran a local instance of the Unbound DNS resolver configured with an edns-buffer-size of 512 bytes, which I understand is similar to the configuration that Let’s Encrypt uses. I was able to reproduce the same failure as reported by unboundtest.com.

However, increasing the edns-buffer-size to 1024 bytes allowed the DNS resolution for go.airliquide.com to complete successfully.

This brings me to my questions for the experts here…

  • Can anyone confirm whether the EDNS buffer size is indeed the root cause of the certificate provisioning failure in this case?
  • Can anyone recommend DNS configuration changes for the airliquide.com domain that would avoid this resolution failure?

[1] https://unboundtest.com/m/A/go.airliquide.com/SW33BBGO

Interesting. My dig command with +bufsize=512 works perfectly (dig +dnssec +trace +bufsize=512 go.airliquide.com). According to the man page of dig, that should reduce the advertised EDNS0 buffer size to 512 bytes.
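
To illustrate what "advertised EDNS0 buffer size" means on the wire, here is a minimal Python sketch that builds a query carrying an OPT pseudo-record. Per RFC 6891, the requester's UDP payload size sits in the OPT record's CLASS field, and the DO bit (set by +dnssec) sits in its TTL field. The function names and the fixed query ID are illustrative assumptions, not something dig or Unbound exposes.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name: str, qtype: int, bufsize: int,
                dnssec_ok: bool = True, query_id: int = 0x1234) -> bytes:
    """Build a DNS query with an EDNS0 OPT record advertising `bufsize` bytes."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    # OPT pseudo-RR: root name, TYPE=41, CLASS=UDP payload size,
    # TTL=ext-rcode/version/flags (DO bit is 0x8000), RDLENGTH=0
    ttl = 0x8000 if dnssec_ok else 0
    opt = b"\x00" + struct.pack("!HHIH", 41, bufsize, ttl, 0)
    return header + question + opt

# An A query (type 1) for go.airliquide.com advertising 512 bytes
packet = build_query("go.airliquide.com", qtype=1, bufsize=512)
```

The value only tells the server the largest UDP response the requester is prepared to receive; a server whose answer does not fit is expected to truncate and set the TC bit rather than send more.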

Also, DNSViz doesn’t see any issue with your hostname: https://dnsviz.net/d/go.airliquide.com/dnssec/

Hi @pd-aray

It looks like you have flaky name servers. There are some checks of your domain, from 2020-03-04 and older: https://check-your-website.server-daten.de/?q=go.airliquide.com

2020-03-08.go.airliquide.com

Checked with Unboundtest, the same error:

Missing DNSKEY RRset in response to DNSKEY query.

But

https://dnssec-analyzer.verisignlabs.com/go.airliquide.com

is happy with the airliquide.com zone.

Looks like these name servers sometimes send the DNSKEY, sometimes not.

Reason? Unknown.

PS: Checked your domain manually: no DNSKEY is sent back. The answer is 97 bytes, and only an SOA is reported.

Curious. Checked all of your three name servers, always the same result.

Same command with my own domain (DNSSEC), 773 bytes.

@Osiris Thanks for the additional tests! For the dig test, I’m not sure if the +bufsize param “suggests” or “forces” the DNS resolver’s eDNS buffer size to be the advertised amount. Also, my hunch is that DNSViz is using a DNS resolver configured with a buffer size >512 bytes if it’s able to successfully resolve a query against the problem domain.

@JuergenAuer That’s a helpful tool you’ve got there! What’s the buffer size of the DNS resolver your web service is using?

My working theory is that Unbound configured with an edns-buffer-size of 512 bytes is being forced into TCP fallback when resolving queries against go.airliquide.com. This is probably because that domain’s nameservers are configured for DNSSEC, resulting in larger responses than the resolver’s UDP buffer can accommodate.
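
For readers unfamiliar with the fallback mechanism in that theory: a server that cannot fit its answer into the advertised UDP size sets the TC (truncated) bit in the response header, and the resolver is expected to repeat the query over TCP. A minimal sketch of that decision in Python follows; the function names are illustrative, and this is not how Unbound implements it internally.

```python
import struct

TC_FLAG = 0x0200  # truncation bit in the 16-bit DNS header flags word

def is_truncated(response: bytes) -> bool:
    """Return True if the DNS response header has the TC bit set."""
    if len(response) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    _query_id, flags = struct.unpack("!HH", response[:4])
    return bool(flags & TC_FLAG)

def next_step(udp_response: bytes) -> str:
    """Decide what a resolver does after receiving a UDP response."""
    return "retry over TCP" if is_truncated(udp_response) else "use UDP answer"

# Hand-built 12-byte headers: QR+AA+TC set, and QR+AA only
truncated = struct.pack("!HHHHHH", 1, 0x8600, 1, 0, 0, 0)
complete = struct.pack("!HHHHHH", 1, 0x8400, 1, 0, 0, 0)
```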

I tested this theory by running a local instance of Unbound with dnstap [1] logging enabled:

• dnstap logs from Unbound configured with a 512 byte eDNS buffer [2]
• dnstap logs from Unbound configured with a 1024 byte eDNS buffer [3]

In the 1024 byte case [3], we can see that the DNS resolver was able to successfully resolve the Client Query (CQ) with a Client Response (CR). We can also see Resolver Responses (RR) that exceed 512 bytes, e.g. 1013, 1023, and 998 bytes.

In the 512 byte case [2], we can see the DNS resolver was not able to successfully resolve the CQ because there is no CR row. We can also see the last Resolver Query (RQ) was for a DNSKEY record from 58.65.12.66 (ns03.airliquide.com). While we see every other RQ/RR in this session fall back from UDP to TCP, it’s conspicuous that the conversation ends with a UDP query to the ns03 nameserver – no TCP fallback!

This leads me to believe this is a problem with the ns03.airliquide.com nameserver being incorrectly configured for handling TCP fallback. Unfortunately, I don’t control these nameservers or this domain; I’m acting on behalf of another party.

Have I made any bad assumptions here? Or am I the victim of any obvious misunderstanding?

I would really appreciate any confirmation or correction from someone who understands DNS better than I do (…which is basically anyone)!

[1] http://dnstap.info/Examples/
[2] https://gist.github.com/pd-aray/3cbb215bb6e48ad16d4b98fd1968a13c#file-go-airliquide-com-512-tap
[3] https://gist.github.com/pd-aray/3cbb215bb6e48ad16d4b98fd1968a13c#file-go-airliquide-com-1024-tap

DNSSEC queries are always sent via TCP. Except in the EDNS512 check, the EDNS maximum size is set to 2048. But I also checked with 4096: same result, only an SOA answer.

They aren’t always. You can fit quite a lot in a small packet, especially when using ECDSA or EdDSA.

Edit: If you meant they’re always TCP with your tool, sorry, OK!
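
To put rough numbers on the packet-size point, here is a small Python sketch estimating the RDATA size of a single RRSIG for a few signature algorithms. The fixed RRSIG fields take 18 bytes, followed by the signer name and the signature itself. The signature sizes are the standard ones for each algorithm; the list of algorithms is an illustrative choice, and nothing here is measured from the airliquide.com zone.

```python
# Signature sizes in bytes: RSA signatures are as long as the key modulus,
# ECDSA P-256 and Ed25519 signatures are 64 bytes each.
SIGNATURE_BYTES = {
    "RSA 1024-bit": 128,
    "RSA 2048-bit": 256,
    "ECDSA P-256 (alg 13)": 64,
    "Ed25519 (alg 15)": 64,
}

# Type covered (2) + algorithm (1) + labels (1) + original TTL (4)
# + expiration (4) + inception (4) + key tag (2)
RRSIG_FIXED_BYTES = 18

def wire_name_length(name: str) -> int:
    """Length of a domain name in uncompressed DNS wire format."""
    labels = name.rstrip(".").split(".")
    return sum(1 + len(label) for label in labels) + 1  # +1 for the root label

def rrsig_rdata_length(signer: str, signature_bytes: int) -> int:
    """Size of one RRSIG's RDATA for the given signer name and signature size."""
    return RRSIG_FIXED_BYTES + wire_name_length(signer) + signature_bytes

for algorithm, size in SIGNATURE_BYTES.items():
    print(algorithm, rrsig_rdata_length("airliquide.com.", size))
```

A signed response carries one RRSIG per RRset, so the per-signature difference adds up quickly in DNSKEY responses that also include the keys themselves.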

@JuergenAuer I’m curious what tools or commands you’re using in your tests that are only reporting an SOA record.

When I run dig against the Unbound resolver configured with a 512 byte buffer, this is the response I get:

[root@01a6c78ea3de bin]# dig @127.0.0.1 -p 5353 +dnssec go.airliquide.com

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-9.P2.el7 <<>> @127.0.0.1 -p 5353 +dnssec go.airliquide.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 23694
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 512
;; QUESTION SECTION:
;go.airliquide.com.		IN	A

;; Query time: 4947 msec
;; SERVER: 127.0.0.1#5353(127.0.0.1)
;; WHEN: Mon Mar 09 23:50:14 UTC 2020
;; MSG SIZE  rcvd: 46

And when I run it against Unbound with a 1024 byte buffer, this is what I see:

[root@01a6c78ea3de bin]# dig @127.0.0.1 -p 5353 +dnssec go.airliquide.com

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-9.P2.el7 <<>> @127.0.0.1 -p 5353 +dnssec go.airliquide.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40146
;; flags: qr rd ra; QUERY: 1, ANSWER: 7, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1024
;; QUESTION SECTION:
;go.airliquide.com.		IN	A

;; ANSWER SECTION:
go.airliquide.com.	0	IN	CNAME	go.pardot.com.
go.airliquide.com.	0	IN	RRSIG	CNAME 8 3 7200 20200316001126 20200309001126 25044 airliquide.com. lfBpRelFwkeoXZ1GZwmGaqIgW4AteGHt0Qc//2GqIf7QTuRrlDm3S88V hoAIFMFVRa2DXnoYaVnGJICrZxTpH078av0oztPKSs2SeY0bO0zP0c/L mRMIhzWnnoj3oto/B4Trb4E3MelBfXT8fy6RA4cfVLGXpOUvwVTt0YxJ uaE=
go.pardot.com.		0	IN	CNAME	pi.pardot.com.
pi.pardot.com.		0	IN	CNAME	pi-ue1.pardot.com.
pi-ue1.pardot.com.	0	IN	CNAME	pi-ue1.t.pardot.com.
pi-ue1.t.pardot.com.	0	IN	CNAME	pi-ue1-lba5.pardot.com.
pi-ue1-lba5.pardot.com.	0	IN	A	35.174.78.146

;; Query time: 2645 msec
;; SERVER: 127.0.0.1#5353(127.0.0.1)
;; WHEN: Mon Mar 09 23:50:33 UTC 2020
;; MSG SIZE  rcvd: 347

Looks like all 3 nameservers return an unsigned NODATA response to TCP DNSKEY queries.

$ dig +dnssec +norecurse +vc airliquide.com dnskey @ns01.airliquide.com

; <<>> DiG 9.16.0 <<>> +dnssec +norecurse +vc airliquide.com dnskey @ns01.airliquide.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46845
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: 5d24d6859ce77615edbd52a05e66da64e77bc285ec1cbf24 (good)
;; QUESTION SECTION:
;airliquide.com.                        IN      DNSKEY

;; AUTHORITY SECTION:
airliquide.com.         7200    IN      SOA     ns01.airliquide.com. postmaster.airliquide.com. 2020030502 25200 3600 604800 21600

;; Query time: 119 msec
;; SERVER: 194.2.192.5#53(194.2.192.5)
;; WHEN: Tue Mar 10 00:08:04 UTC 2020
;; MSG SIZE  rcvd: 123

$ dig +dnssec +norecurse +vc airliquide.com dnskey @ns02.airliquide.com

; <<>> DiG 9.16.0 <<>> +dnssec +norecurse +vc airliquide.com dnskey @ns02.airliquide.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49021
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: cdb290bb0023a6a8e54103045e66da688ac1ff92169838e4 (good)
;; QUESTION SECTION:
;airliquide.com.                        IN      DNSKEY

;; AUTHORITY SECTION:
airliquide.com.         7200    IN      SOA     ns01.airliquide.com. postmaster.airliquide.com. 2020030502 25200 3600 604800 21600

;; Query time: 126 msec
;; SERVER: 90.80.28.9#53(90.80.28.9)
;; WHEN: Tue Mar 10 00:08:08 UTC 2020
;; MSG SIZE  rcvd: 123

$ dig +dnssec +norecurse +vc airliquide.com dnskey @ns03.airliquide.com

; <<>> DiG 9.16.0 <<>> +dnssec +norecurse +vc airliquide.com dnskey @ns03.airliquide.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35459
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
; COOKIE: 040e91d0b8174869e9fede505e66da6cb82f080e11092e17 (good)
;; QUESTION SECTION:
;airliquide.com.                        IN      DNSKEY

;; AUTHORITY SECTION:
airliquide.com.         7200    IN      SOA     ns01.airliquide.com. postmaster.airliquide.com. 2020030502 25200 3600 604800 21600

;; Query time: 246 msec
;; SERVER: 58.65.12.66#53(58.65.12.66)
;; WHEN: Tue Mar 10 00:08:12 UTC 2020
;; MSG SIZE  rcvd: 123

(Note: They don’t work any better if I use +nocookie.)
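
For anyone repeating this check against several nameservers, the pattern to look for is status NOERROR with zero answers and an SOA in the authority section, i.e. a NODATA response. A rough Python sketch that classifies dig's header lines is below; the sample strings are copied from the outputs in this thread, and the function name is made up for illustration. It reads only the header counts, so it assumes the authority section holds an SOA (as it does here) rather than a referral.

```python
import re

def classify_dig_header(output: str) -> str:
    """Classify dig output as an error rcode, 'NODATA', or 'ANSWER'."""
    status = re.search(r"status: (\w+)", output)
    counts = re.search(r"ANSWER: (\d+), AUTHORITY: (\d+)", output)
    if status is None or counts is None:
        raise ValueError("could not find dig header lines")
    rcode = status.group(1)
    answers = int(counts.group(1))
    if rcode != "NOERROR":
        return rcode  # e.g. SERVFAIL or NXDOMAIN
    # NOERROR with an empty answer section; assumed to be NODATA, not a referral
    return "NODATA" if answers == 0 else "ANSWER"

# Header lines from the ns01 DNSKEY query above
NS01_DNSKEY = """\
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46845
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
"""

# Header lines from the 512-byte Unbound test earlier in the thread
UNBOUND_512 = """\
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 23694
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
"""

print(classify_dig_header(NS01_DNSKEY))
print(classify_dig_header(UNBOUND_512))
```

In a correctly signed zone, a NODATA reply to a query with the DO bit set would also carry NSEC or NSEC3 records with signatures, which is why the bare SOA here stands out.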

Awesome, thanks @mnordhoff! That seems pretty incriminating and supports what I was inferring from my dnstap logs – seems like it’s safe to say these nameservers are misconfigured :)

Also, thanks to @JuergenAuer! Turns out I didn’t know quite enough about DNS to put together what you were saying but I appreciate the feedback and help!

Yes, my tool switches to TCP if DNSSEC is used. That may be atypical, but it should always work.

Yep, that’s the result: zero answers, and an SOA instead.

And it’s a name server misconfiguration.
