SERVFAIL looking up CAA - how to prevent it

Hi all,

Yesterday we ran into an interesting issue while issuing a Let's Encrypt certificate.

We submitted a CSR for signing.
All steps went through, but the finalize step of the issuance process failed with the following error:
Error finalizing order ::While processing CAA for *.device.level1.sub.domain.tld: DNS problem: SERVFAIL looking up CAA for level1.sub.domain.tld - the domain's nameservers may be malfunctioning

The Setup:

  • two layers of DNS servers (PowerDNS 4.4)
    -- 6 DNS servers responsible for sub.domain.tld
    -- 6 DNS servers responsible for level1.sub.domain.tld
  • no CAA records configured

We checked all of the log files, but could not find any "SERVFAIL" entries in the time range between submitting the signing request and receiving the error message.
From the Let's Encrypt documentation page "Certificate Authority Authorization (CAA)" we understood that a misleading answer from a caching resolver between Let's Encrypt and our DNS servers might be the cause.
BUT in the critical time window we do see requests for "CAA" records at the "sub.domain.tld" layer. They were answered with packetcache MISS and, in one case, HIT, but still no SERVFAIL:

Feb 2 14:22:08 hostname pdns_server[29292]: Remote 2600:3000:2710:200::18 wants 'level1.sub.domain.tld|CAA', do = 1, bufsize = 512: packetcache MISS
Feb 2 14:22:08 hostname pdns_server[29292]: Remote 2600:3000:2710:200::18 wants 'sub.domain.tld|CAA', do = 1, bufsize = 512: packetcache MISS
Feb 2 14:22:08 hostname pdns_server[29292]: Remote 66.133.109.36 wants 'device.level1.sub.domain.tld|CAA', do = 1, bufsize = 512: packetcache MISS
Feb 2 14:22:08 hostname pdns_server[29292]: Remote 66.133.109.36 wants 'sub.domain.tld|CAA', do = 1, bufsize = 512: packetcache HIT
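To make it easier to see which resolver asked for which name, log lines like these can be grouped by remote address. The following is only a minimal sketch: the regular expression is inferred from the four sample lines above rather than from PowerDNS documentation, and `summarize_queries` and `SAMPLE` are hypothetical names chosen for illustration.

```python
import re
from collections import defaultdict

# Pattern inferred from the four sample log lines above (an assumption,
# not an official PowerDNS log format definition).
QUERY_RE = re.compile(
    r"Remote (?P<remote>\S+) wants '(?P<qname>[^|']+)\|(?P<qtype>[A-Z0-9]+)'"
    r".*packetcache (?P<cache>HIT|MISS)"
)


def summarize_queries(lines, qtype="CAA"):
    """Group (qname, cache result) pairs by remote address for one query type."""
    by_remote = defaultdict(list)
    for line in lines:
        match = QUERY_RE.search(line)
        if match and match.group("qtype") == qtype:
            by_remote[match.group("remote")].append(
                (match.group("qname"), match.group("cache"))
            )
    return dict(by_remote)


# The four log lines from the post, used as sample input.
SAMPLE = [
    "Feb 2 14:22:08 hostname pdns_server[29292]: Remote 2600:3000:2710:200::18 wants 'level1.sub.domain.tld|CAA', do = 1, bufsize = 512: packetcache MISS",
    "Feb 2 14:22:08 hostname pdns_server[29292]: Remote 2600:3000:2710:200::18 wants 'sub.domain.tld|CAA', do = 1, bufsize = 512: packetcache MISS",
    "Feb 2 14:22:08 hostname pdns_server[29292]: Remote 66.133.109.36 wants 'device.level1.sub.domain.tld|CAA', do = 1, bufsize = 512: packetcache MISS",
    "Feb 2 14:22:08 hostname pdns_server[29292]: Remote 66.133.109.36 wants 'sub.domain.tld|CAA', do = 1, bufsize = 512: packetcache HIT",
]

if __name__ == "__main__":
    for remote, queries in summarize_queries(SAMPLE).items():
        print(remote, queries)
```

Running the same function over the logs of all twelve servers would show whether every name in the chain was actually asked for and answered somewhere. A name that never appears in any log could hint at a query that was lost or timed out before reaching the servers.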

In the end we could simply submit another request (which we did afterwards, and it went fine), but we would like to know how to prevent such an issue.
It would also help to learn who reported the SERVFAIL or any other error, e.g. in the error message itself.

Cheers,
Matthias

Hi @mwiora

Your domain name is required if you want help.

Let's Encrypt uses Unbound. If Unbound takes too long to find an answer, the result may be reported as a SERVFAIL even though no nameserver explicitly answered with SERVFAIL.


You can also try testing with

https://unboundtest.com/

which is deliberately designed to be similar to (though not necessarily identical to) Let's Encrypt's own DNS resolver environment, and gives a lot of technical details about the outcome of a query.
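When testing, it may be worth checking every name in the CAA lookup chain, not only the one named in the error. Under the tree climbing described in RFC 8659, and with no CAA records configured at any level, a CA may end up querying each ancestor of the requested name. The sketch below lists those names. It is a simplification (CNAME handling is ignored), and `caa_lookup_names` is a hypothetical helper, not part of any Let's Encrypt tooling.

```python
def caa_lookup_names(identifier):
    """List the names checked for CAA, from the requested name up to the TLD.

    Simplified sketch of RFC 8659 tree climbing: a leading wildcard label
    is dropped, then one label is removed from the left at each step.
    CNAME handling is not modelled.
    """
    name = identifier.rstrip(".").lower()
    if name.startswith("*."):
        name = name[2:]
    labels = name.split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


if __name__ == "__main__":
    for name in caa_lookup_names("*.device.level1.sub.domain.tld"):
        print(name)
```

Each printed name can then be tested for record type CAA. A slow or failing answer at any level, including a parent zone served by a different set of nameservers, can be enough to fail the whole check.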

