Reporting Extended DNS Errors in Boulder problems

I was reading the NLnet Labs newsletter (Spring 2022), and it looks like Unbound has added support for RFC 8914 Extended DNS Errors (EDE).

I think it would be a great addition if this extra information could be included in ACME problems where a DNS error has been encountered. Even if it is not included in the problem document's title or detail, maybe it could go in another field that can later be retrieved from, e.g., Certbot's logs?
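To make the idea concrete, here is a rough sketch of what such a problem document could look like. The `urn:ietf:params:acme:error:dns` type comes from RFC 8555; everything else is an assumption for illustration only: the `extendedDNSError` member, its `infoCode`/`extraText` keys, the helper name `problem_with_ede`, and the detail string are all hypothetical and not something Boulder emits today.

```python
import json

# Error type defined by RFC 8555 for DNS problems during validation.
ACME_DNS_ERROR = "urn:ietf:params:acme:error:dns"


def problem_with_ede(detail, info_code, extra_text):
    """Build a problem document carrying EDE data in an extra member.

    HYPOTHETICAL: "extendedDNSError" and its keys are made up for this
    sketch; they are not an existing Boulder or ACME field.
    """
    return {
        "type": ACME_DNS_ERROR,
        "detail": detail,
        "extendedDNSError": {
            "infoCode": info_code,    # 16-bit INFO-CODE from RFC 8914
            "extraText": extra_text,  # free-form EXTRA-TEXT, may be empty
        },
    }


# Illustrative values only; the detail text is not a real Boulder message.
doc = problem_with_ede(
    "DNS problem: SERVFAIL looking up A for dnssec-failed.org",
    9,
    "no SEP matching the DS found for dnssec-failed.org.",
)
print(json.dumps(doc, indent=2))
```

Keeping the data in a separate member would leave the existing detail string untouched, so clients that only read `type` and `detail` would not be affected.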

Looking into and interpreting the myriad causes of SERVFAIL can be tedious for community members, and sometimes it's not possible to reproduce the issue using unboundtest or letsdebug or whatever.

Though I think Let's Encrypt intend to eventually move on from Unbound, it might be something useful to have today.

Cloudflare's 1.1.1.1 supports it (I think); the extra error information looks like this:

; OPT=15: 00 09 6e 6f 20 53 45 50 20 6d 61 74 63 68 69 6e 67 20 74 68 65 20 44 53 20 66 6f 75 6e 64 20 66 6f 72 20 64 6e 73 73 65 63 2d 66 61 69 6c 65 64 2e 6f 72 67 2e ("..no SEP matching the DS found for dnssec-failed.org.")

Though Unbound's messages will be different.
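For anyone who wants to read that raw `OPT=15` line by hand: per RFC 8914, the option payload is a 16-bit INFO-CODE followed by optional UTF-8 EXTRA-TEXT. Below is a minimal sketch of a decoder using only the Python standard library. The function name `decode_ede` is made up for this example, and the name table is only a subset of the RFC 8914 registry.

```python
# Subset of the RFC 8914 INFO-CODE registry, enough for illustration.
EDE_CODES = {
    0: "Other",
    6: "DNSSEC Bogus",
    7: "Signature Expired",
    8: "Signature Not Yet Valid",
    9: "DNSKEY Missing",
    10: "RRSIGs Missing",
    22: "No Reachable Authority",
    23: "Network Error",
}


def decode_ede(hex_bytes):
    """Decode an EDE option payload given as space-separated hex bytes.

    Returns (info_code, code_name, extra_text).
    """
    data = bytes.fromhex(hex_bytes)
    if len(data) < 2:
        raise ValueError("EDE payload needs at least the 2-byte INFO-CODE")
    info_code = int.from_bytes(data[:2], "big")
    extra_text = data[2:].decode("utf-8", errors="replace")
    return info_code, EDE_CODES.get(info_code, "Unknown"), extra_text


payload = (
    "00 09 6e 6f 20 53 45 50 20 6d 61 74 63 68 69 6e 67 20 74 68 65 20 "
    "44 53 20 66 6f 75 6e 64 20 66 6f 72 20 64 6e 73 73 65 63 2d 66 61 "
    "69 6c 65 64 2e 6f 72 67 2e"
)
print(decode_ede(payload))
```

The leading `00 09` is the INFO-CODE (9, DNSKEY Missing), which is why dig shows two dots in front of the quoted text.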

Even if the more verbose error messages were only available on, e.g., unboundtest.com, it would be very helpful! Usually I can't make heads or tails of the Unbound logging :confused:

Yeah this will be a cool feature, also for my homelab DNS server :slight_smile:

However, it looks like that's so bleeding edge right now that you can't get the feature unless you're building Unbound from source right off the master branch. It might take a while until it has stabilized into an actual production-ready release.

We've noted the new Unbound support for Extended DNS Errors as well; it would definitely be helpful to have better errors. It seems unlikely that we're going to deploy a prerelease Unbound though, so there will be some amount of delay until we have it available.

Who runs that site? I know the code is by @jsha – but is that domain/deployment from him personally, ISRG officially, or someone else?

Other Let's Encrypt staff help run unboundtest.com, but it isn't official LE infrastructure.

It should be pretty straightforward to run unboundtest.com off of Unbound's main branch. Generally I keep it moderately close to prod, but this seems like a valuable enough feature that it's worth running a different version.

The main code change would probably be plumbing up miekg/dns to report the EDE info (if that doesn't happen automatically). If someone would like to try this locally (jsha/unboundtest on GitHub, a web service to test DNS resolution against Unbound with a config similar to Let's Encrypt's) and let me know how it goes or send PRs, that would be quite useful!

With Gentoo I can easily compile the latest master of Unbound. But does anyone have a reproducibly failing DNS name lying around anywhere? :stuck_out_tongue:

Hm, dnssec-failed.org was mentioned already I see. rhybar.cz is another one.

Funnily, when I run dig @localhost dnssec-failed.org it resolves fine :question: But Unboundtest does indeed return a SERVFAIL. Weird.

There are some more intended-to-fail cases (specifically for CAA, even), at https://caatestsuite.com/.

Might you need to add +dnssec to the dig command to tell it specifically to validate DNSSEC? Just a wild guess, I just know that dig doesn't always use the same defaults as the system's main resolver.

Tried that, got a working result but now with some DNSSEC-related RR :stuck_out_tongue:

Ah, if you want to hide the records then you want +dnssec +nocrypto. Weirdly, "nocrypto" seems to actually mean "do the crypto but don't tell me about the details".

I don't mind the RR, I just want to see a SERVFAIL, not a NOERROR :rofl:

Try dej.in.ua for a SERVFAIL.

Edit: It still fails consistently just not every time

:rofl:

Have you verified that your local DNS resolver (chain) actually validates DNSSEC? My dig doesn't seem to validate DNSSEC [in default mode], even with +dnssec on (it just seems to set the DO flag, which causes the response to include DNSSEC records such as RRSIGs).

I've been using the test system by https://verteiltesysteme.net/ for years. It has always been failing :stuck_out_tongue: reliably for me. Unlike dnssec-failed.org (which seems to have intentionally placed no keys in the zone while publishing a DS), the test by verteiltesysteme actually produces intentionally broken RRSIGs.

This is a broken response:

# dig sigfail.verteiltesysteme.net @ns1.verteiltesysteme.net +dnssec

; <<>> DiG 9.16.27-Debian <<>> sigfail.verteiltesysteme.net @ns1.verteiltesysteme.net +dnssec +crypto
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6192
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
; COOKIE: f92b46e94822532e010000006283b92c714b51ba7a2fc0ce (good)
;; QUESTION SECTION:
;sigfail.verteiltesysteme.net.  IN      A

;; ANSWER SECTION:
sigfail.verteiltesysteme.net. 60 IN     A       134.91.78.139
sigfail.verteiltesysteme.net. 60 IN     RRSIG   A 5 3 60 20220730020002 20220430020002 30665 verteiltesysteme.net. //This+RRSIG+is+deliberately+broken///For+more+informati on+please+go+to/http+//www+verteiltesysteme+net///////// //////////////////////////////////////////////////////// //8=

;; Query time: 12 msec
;; SERVER: 2001:638:501:8efc::139#53(2001:638:501:8efc::139)
;; WHEN: Tue May 17 17:03:08 CEST 2022
;; MSG SIZE  rcvd: 281

Note that dig doesn't turn the authoritative reply from the NS into a SERVFAIL (as stated above), since it doesn't validate anything itself. But when I query a DNSSEC-validating resolver (Cloudflare's 1.1.1.1, which also has EDE support), I do get one:

# dig sigfail.verteiltesysteme.net @1.1.1.1 +dnssec

; <<>> DiG 9.16.27-Debian <<>> sigfail.verteiltesysteme.net @1.1.1.1 +dnssec
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 31057
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
; EDE: 6 (DNSSEC Bogus): (failed to verify sigfail.verteiltesysteme.net. A: using DNSKEY ids = [30665])
;; QUESTION SECTION:
;sigfail.verteiltesysteme.net.  IN      A

;; Query time: 12 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Tue May 17 17:09:03 CEST 2022
;; MSG SIZE  rcvd: 139

They also offer a positive test, sigok.verteiltesysteme.net:

# dig sigok.verteiltesysteme.net @1.1.1.1 +dnssec

; <<>> DiG 9.16.27-Debian <<>> sigok.verteiltesysteme.net @1.1.1.1 +dnssec
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22293
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
;; QUESTION SECTION:
;sigok.verteiltesysteme.net.    IN      A

;; ANSWER SECTION:
sigok.verteiltesysteme.net. 60  IN      A       134.91.78.139
sigok.verteiltesysteme.net. 60  IN      RRSIG   A 5 3 60 20220730020002 20220430020002 30665 verteiltesysteme.net. Ob8AFd19nUT689fsrds2nC7D+iFK8AfaEquH//9iuZ69Z4zdIzUeglVI PY0ZJMsj0uZM+AddNQ5leaQuWXUcU3lJ9aGLTxyNjLHQTxkPT9tdbRtL qqSIOQcKSlW2mBSpghWnuvKnejL253uFirvB2VWzzVtXoXI+TZxnoELQ TBc=

;; Query time: 184 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Tue May 17 17:10:05 CEST 2022
;; MSG SIZE  rcvd: 251
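
The `flags:` and `status:` lines are what distinguish the three answers above: the authoritative reply is NOERROR without `ad`, the bogus name gives SERVFAIL, and the good name gives NOERROR with `ad` set by the validating resolver. As a minimal sketch, the same reading can be done on the raw 16-bit flags word of a DNS header (bit positions as in RFC 1035 and RFC 4035). The function names and the wording of the verdict strings are made up for this example.

```python
# Bit masks within the 16-bit flags word of the DNS header.
FLAG_BITS = [
    ("qr", 0x8000),
    ("aa", 0x0400),
    ("tc", 0x0200),
    ("rd", 0x0100),
    ("ra", 0x0080),
    ("ad", 0x0020),  # Authenticated Data: the resolver validated the answer
    ("cd", 0x0010),
]
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def describe_header(flags_word):
    """Return (list of set flag names, rcode name) for a header flags word."""
    flags = [name for name, bit in FLAG_BITS if flags_word & bit]
    rcode = flags_word & 0x000F
    return flags, RCODES.get(rcode, "RCODE%d" % rcode)


def validation_verdict(flags_word):
    """Rough reading of a response from a DNSSEC point of view."""
    flags, rcode = describe_header(flags_word)
    if rcode == "SERVFAIL":
        return "resolver failed (a bogus DNSSEC chain is one possible cause)"
    if "ad" in flags:
        return "validated by the resolver"
    return "not validated"


# Flags words matching the three dig outputs, in order.
for word in (0x8500, 0x8182, 0x81A0):
    print(describe_header(word), "->", validation_verdict(word))
```

A missing `ad` bit alone does not mean the data is bad; it only means the server that answered did not vouch for it.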

I assumed dig would validate DNSSEC.. And localhost was running Unbound with the default settings, maybe it doesn't validate DNSSEC by default?

Unfortunately, it doesn't fail with Unboundtest: https://unboundtest.com/m/A/verteiltesysteme.net/EJL7SZWY

Does the extended error always appear? I don't see anything extra here:
https://unboundtest.com/m/CAA/dej.in.ua/T6766DBG

Jacob hasn't enabled EDE as far as I know, I'm testing locally.

Setting ede to yes didn't change anything :stuck_out_tongue:

No idea, I think it probably does with the right settings. But probably you need to set a bunch of flags or something.

It should validate by default, but that depends on modules/configs. IIRC Unbound validates DNSSEC if the "validator" module is enabled in module-config. So you should probably check what's in your config.

According to docs, if this option is unset the default modules are validator and iterator.

Wrong hostname: that's the main website, which is supposed to work (there's a website there to test your browser) :slight_smile:. You're looking for sigfail.verteiltesysteme.net.

Got it working :slight_smile:

The EDE message is carried in the OPT RR, which Unbound only returns when asked. By adding an (almost) empty OPT RR to the request sent to Unbound, Unbound also adds an OPT RR to the answer, including the EDE field if ede: yes is in the config file.

It returns stuff like:

Query results for A sigfail.verteiltesysteme.net

Response:
;; opcode: QUERY, status: SERVFAIL, id: 44628
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;sigfail.verteiltesysteme.net.	IN	 A

;; ADDITIONAL SECTION:

;; OPT PSEUDOSECTION:
; EDNS: version 0; flags: do; udp: 512
; EDE: 6 (DNSSEC Bogus): ()
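
For reference, the mechanism described above looks roughly like this at the wire level. This is a sketch using only the Python standard library, not the actual unboundtest change (that code is Go and uses miekg/dns). The helper names, the query ID, and the 1232-byte UDP size are arbitrary choices for illustration. The OPT layout follows RFC 6891 and the EDE option (code 15) follows RFC 8914.

```python
import struct

OPT_TYPE = 41   # EDNS(0) OPT pseudo-RR type
EDE_OPTION = 15  # Extended DNS Error option code


def encode_name(name):
    """Encode a domain name as length-prefixed labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=1, query_id=0x1234, udp_size=1232, do_bit=True):
    """Build a query whose additional section holds an empty OPT RR."""
    # Header: ID, flags (RD only), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)
    # OPT RR: root name, TYPE=41, CLASS=UDP size, TTL=flags, RDLEN=0
    ttl = 0x00008000 if do_bit else 0  # DO bit lives in the TTL field
    opt = b"\x00" + struct.pack("!HHIH", OPT_TYPE, udp_size, ttl, 0)
    return header + question + opt


def ede_from_opt_rdata(rdata):
    """Return [(info_code, extra_text), ...] found in OPT RDATA."""
    found = []
    pos = 0
    while pos + 4 <= len(rdata):
        code, length = struct.unpack_from("!HH", rdata, pos)
        body = rdata[pos + 4:pos + 4 + length]
        pos += 4 + length
        if code == EDE_OPTION and len(body) >= 2:
            info_code = int.from_bytes(body[:2], "big")
            found.append((info_code, body[2:].decode("utf-8", "replace")))
    return found


query = build_query("sigfail.verteiltesysteme.net")
print(len(query), "byte query, OPT RR:", query[-11:].hex())
# An OPT RDATA holding just "EDE 6, no extra text", as in the output above.
print(ede_from_opt_rdata(struct.pack("!HH", EDE_OPTION, 2) + b"\x00\x06"))
```

The empty extra text matches the `(DNSSEC Bogus): ()` line in the output: the INFO-CODE is mandatory, the text is optional.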

The following list of possible errors is currently available:
