CAA servfail with

Thanks! Between the advanced DNSViz options, the recent suggestions, and dig against Cloudflare, I think I have a couple of tangible things to pass on to the client to help them talk to the DNS provider.

This DNSViz query seems to encapsulate the advanced options petercooperjr was so helpful in spelling out:
At least when I open that URL in a fresh incognito window, it seems to reliably show the 4 NSEC errors. (I think I will provide the client with that URL and a link to this thread for the details.)

Even their apex record seems to have some NSEC issues, at least in the "proving non-existence" area:


I think it is very interesting that the same +dnssec query sent to dns101 returns NOERROR and seems to get an additional AUTHORITY record (does this indicate they aren't propagating something to other DNS servers properly?):

$ dig AAAA +dnssec

; <<>> DiG 9.11.36-RedHat-9.11.36-5.el8_7.2 <<>> AAAA +dnssec
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62754
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 1
;; WARNING: recursion requested but not available

; EDNS: version: 0, flags: do; udp: 1232
;            IN      AAAA

;; AUTHORITY SECTION:
        3600    IN      SOA     DNS101.REGISTER.COM. root.REGISTER.COM. 122111212 10800 3600 604800 3600
        3600    IN      RRSIG   SOA 13 2 7200 20221229000000 20221208000000 21296 E21YZ2VrSnbAb7/SME9R+3vyZZvpOMpsYjcWLlFwH7kVjltUMvb1ibqi TvneAeWzOHwKXIN8ynjL9OjZJT0uiQ==
        3600    IN      NSEC    A NS SOA MX TXT RRSIG NSEC DNSKEY
        3600    IN      RRSIG   NSEC 13 2 3600 20221229000000 20221208000000 21296 NULdu6I0JC60E6WTdYLD40M34qC4y7nFppssBunHIexjvqz6EnTkKPYe BjbWkGNQO27fQ7YAuMlueY7N9AzKBw==

;; Query time: 64 msec
;; WHEN: Wed Dec 21 09:38:41 CST 2022
;; MSG SIZE  rcvd: 364

Or maybe the live signing vs wildcard items mentioned by Nummer378 is the root of the issue here...


No, this is to be expected. When you run

dig AAAA +dnssec

you send the query directly to the responsible nameserver. The nameserver then replies with the data it has. A nameserver is not a [recursive] resolver; it does not validate its own DNSSEC responses (it could, but that would be fairly useless and not helpful). It's like asking a server to tell you whether its own certificates are valid: it can obviously do this check, but if we're going to trust what the server says, we could also stop validating altogether. The client has to perform the validation, and the nameserver is not the client. The [recursive] resolver is.

A resolver like Cloudflare is a [recursive] resolver, meaning that it retrieves the data from the domain's nameservers for you and performs DNSSEC validation. So the SERVFAIL you see is generated by Cloudflare in response to getting invalid data from the nameserver. Internally, Cloudflare got the exact same dataset you see when running the above command.
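
One way to see which kind of server answered is the flags line in dig's header. Below is a rough Python sketch that reads the status and flags from the header lines of the dig output earlier in this thread. The helper name `parse_dig_header` is made up for illustration, and the meaning attached to each flag is my reading of the usual conventions, not something dig itself reports.

```python
import re

def parse_dig_header(text):
    """Extract the response status and header flags from dig's textual output."""
    status = re.search(r"status:\s*(\w+)", text)
    # Only the ";; flags:" header line is matched, not the "; EDNS: ... flags:" line.
    flags = re.search(r";; flags:\s*([a-z ]*);", text)
    flag_set = set(flags.group(1).split()) if flags else set()
    return {
        "status": status.group(1) if status else None,
        "authoritative": "aa" in flag_set,        # answered from the zone's own nameserver
        "recursion_available": "ra" in flag_set,  # server is willing to recurse for us
        "validated": "ad" in flag_set,            # a validating resolver vouches for the data
    }

# Header lines copied from the dig output quoted in this thread.
sample = """;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62754
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 1"""

info = parse_dig_header(sample)
print(info)
```

Here `aa` is set and `ad` is not, which fits an authoritative nameserver handing out its data unvalidated. A validating resolver would typically show `ra` instead, and `ad` only when validation succeeded.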

The authority section is often changed by recursive resolvers (primarily by removing irrelevant data), so it's to be expected that it looks different when asking the resolver versus the nameserver directly. Recursive resolvers also add data where they think it's useful, while nameservers usually just tell you the answer to your "question" in DNS-speak.

(Also, to clarify: dig itself doesn't do any DNSSEC validation, at least not by default. I don't know if there's a switch to turn that on; +dnssec just sets the DNSSEC OK (DO) flag so that DNSSEC records are visible in responses.)
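
To make the DO flag a bit more concrete, here is a minimal Python sketch of where that bit sits in a wire-format query. It only builds the bytes and does not send anything. The function name `build_query`, the query ID, and the placeholder name `www.example.com` are assumptions for illustration; the UDP size of 1232 matches the EDNS line in the dig output above.

```python
import struct

DO_BIT = 0x8000  # "DNSSEC OK" flag carried in the EDNS OPT pseudo-record

def build_query(name, qtype, dnssec_ok=True, query_id=0x1234, udp_size=1232):
    """Build a wire-format DNS query that carries an EDNS0 OPT record."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0,
    # ARCOUNT=1 (the OPT record counts as an additional record).
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    # Question: length-prefixed labels, a zero byte for the root,
    # then QTYPE and QCLASS (IN = 1).
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    # OPT pseudo-record: root name, TYPE=41, CLASS reused as UDP payload size,
    # TTL reused as extended RCODE (8 bits), version (8 bits), flags (16 bits),
    # then RDLENGTH=0.
    edns_flags = DO_BIT if dnssec_ok else 0
    opt = b"\x00" + struct.pack("!HHBBHH", 41, udp_size, 0, 0, edns_flags, 0)
    return header + question + opt

# Hypothetical name; 28 is the record type number for AAAA.
query = build_query("www.example.com", 28)
print(query.hex())
```

The only difference between a query with and without +dnssec in this sketch is that single bit. Nothing in the query asks the server to validate anything.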


Uch, I selected CAA at the "Analyze" in the beginning, but for some reason it didn't give me any error previously. Manually selecting it again shows the errors now :slight_smile:


Thanks for the help everybody. I marked the entry that seemed most helpful as the solution.

Removing DNSSEC was finally the avenue that the client took and was the main element in fixing their issues. They got in touch with me Friday last week to resolve their remaining issues.

The client had some sort of DNS issue 2-3 months ago where their DNS was 'hacked' (it was unclear to me if it was a typical DNS expiration / takeover or a nefarious type getting control of their login). As part of trying to prevent similar issues in the future, they must have turned on DNSSEC from the user interface. However, the implementation doesn't seem compatible with Let's Encrypt, which caused issues on their next certificate renewal. They didn't seem to have any true requirement for DNSSEC and had turned it back off before I started helping on Friday.

They had made some additional A record and other changes that needed to be reverted before I got them on track.

CAA with was buggy.
The client had attempted the CAA workaround as well but mistakenly put the CAA record at the apex and didn't have in the value. Additionally, I found that when trying to add the correct www-level record that still doesn't seem to have proper CAA support. That is, after adding the www CAA record, the UI showed two apex-level records and, worse, neither could be removed (it sounds like it will require a call to their tech support to resolve that issue).
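
For anyone wondering why the apex record did not help: as I understand RFC 8659, a CA looks for CAA records at the requested name first and only walks up toward the apex when that lookup cleanly returns nothing. Presumably the www lookup was the one failing, so the apex record was never reached. A small Python sketch of that order, using `example.com` as a placeholder domain and a made-up helper name:

```python
def caa_lookup_order(fqdn):
    """List the names checked for CAA records, most specific first."""
    labels = fqdn.rstrip(".").split(".")
    # Drop one leading label at a time; the DNS root itself is not queried.
    return [".".join(labels[i:]) for i in range(len(labels))]

# Placeholder name, not the client's domain.
order = caa_lookup_order("www.example.com")
print(order)
```

The first name in the list that returns a CAA record set decides the outcome, and an error such as SERVFAIL on the way stops issuance rather than falling through to the next name.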

Although a CNAME www record is our preferred configuration, we were able to get things working with an A record alongside the broken and now unnecessary CAA record.


Does that mean it has been fixed at this point in time?


Sorry for lack of clarity. I worked around it and moved on, I doubt they changed/fixed anything.

If some technical writer can give me the proper way I should have worded that to indicate it was buggy when I was using without implying it has been fixed maybe I can learn from my mistakes.


Thank you @TeQuilYa, that was my guess; but I like to know things for sure. :slight_smile:


Maybe using something more like:
... was too buggy (for me).
[which implies that you moved away from it - not that it has been fixed]

