I’m using certbot-auto to register hundreds of SSL certificates for domains that my company owns. However, about 2% of the registrations failed with the error "DNS problem: query timed out looking up CAA for <domain_name>". How can I solve this problem?
One of the affected domains is www.babushop.com.tw, and the error message is:
Failed authorization procedure. www.babushop.com.tw (http-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: DNS problem: query timed out looking up CAA for www.babushop.com.tw
IMPORTANT NOTES:
- The following errors were reported by the server:
Domain: www.babushop.com.tw
Type: connection
Detail: DNS problem: query timed out looking up CAA for
www.babushop.com.tw
To fix these errors, please make sure that your domain name was
entered correctly and the DNS A/AAAA record(s) for that domain
contain(s) the right IP address. Additionally, please check that
your computer has a publicly routable IP address and that no
firewalls are preventing the server from communicating with the
client. If you're using the webroot plugin, you should also verify
that you are serving files from the webroot path you provided.
Saving debug log to /var/log/letsencrypt/letsencrypt.log
The command I use to register is /opt/certbot-auto certonly --keep --no-bootstrap --no-self-upgrade --non-interactive --webroot -w /usr/share/nginx/html -d www.babushop.com.tw.
I don’t understand why some of the domains KEEP failing to register. Are there any connection problems between the Let’s Encrypt servers and those domains?
I originally thought that those domains would register successfully if I ran the registration process several times, but I was wrong.
I also get an error when trying to resolve the CAA record for your domain:
;; AUTHORITY SECTION:
babushop.com.tw. 5 IN NS vdns3.seed.net.tw.
babushop.com.tw. 5 IN NS vdns4.seed.net.tw.
[mkwm:~] $ host -t CAA babushop.com.tw vdns3.seed.net.tw
;; connection timed out; no servers could be reached
[mkwm:~] $ host -t CAA babushop.com.tw vdns3.seed.net.tw
;; connection timed out; no servers could be reached
but other records work properly, so I guess that your DNS provider's servers are incapable of handling queries for unknown record types (note: they don’t have to support CAA records; they only have to respond with NODATA for non-existent records instead of timing out, which presumably happens because they drop queries for unknown record types as invalid). I’ve also tried querying your DNS servers with “artificial” record types and the result was the same: a timeout.
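To reproduce this kind of check without host or dig, here is a minimal Python sketch that builds a raw DNS query for a given record type and reports whether the server answers or times out. It uses only the standard library. The names build_query and probe, the fixed query ID, and the 5-second timeout are illustrative choices, not part of any tool mentioned in this thread, and the sketch only looks at the response code rather than parsing the answer.

```python
import socket
import struct

CAA_TYPE = 257  # record type code assigned to CAA


def build_query(name: str, qtype: int = CAA_TYPE, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS query for `name` with the given record type."""
    # Header: ID, flags (0 = standard query, recursion not requested),
    # one question, no answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    qname += b"\x00"  # root label terminates the name
    question = qname + struct.pack("!HH", qtype, 1)  # class 1 = IN
    return header + question


def probe(server: str, name: str, qtype: int = CAA_TYPE, timeout: float = 5.0) -> str:
    """Send one UDP query to `server` and return the response code or 'timeout'."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name, qtype), (server, 53))
            data, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
    if len(data) < 12:
        return "malformed"
    rcode = data[3] & 0x0F  # low four bits of the second flags byte
    return {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}.get(rcode, f"rcode {rcode}")
```

Calling probe("vdns3.seed.net.tw", "babushop.com.tw") sends a CAA query; passing qtype=1 sends an A query to the same server for comparison. A healthy server with no CAA record should answer NOERROR with an empty answer section rather than stay silent.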
Since September 2017, CAs (not only Let’s Encrypt) are obliged to check CAA records before issuing a certificate. Since a timeout may mean that there is a record that prevents issuance but the CA is unable to retrieve it (for example, because some malicious third party is blocking responses), Let’s Encrypt “fails secure” and prevents certificate issuance.
You have to either resolve this issue with your DNS provider or switch DNS providers.
I've contacted the DNS provider, and they said that they are currently not able to process CAA queries.
I found an article saying:
CAA validation follows CNAMEs, like all other DNS requests. If www.community.example.com is a CNAME to web1.example.net, the CA will first request CAA records for www.community.example.com, then seeing that there is a CNAME for that domain name instead of CAA records, will request CAA records for web1.example.net instead. Note that if a domain name has a CNAME record, it is not allowed to have any other records according to the DNS standards.
So I tried adding a CAA record for dname.91app.io, hoping that Let's Encrypt would query the CAA record for s2454.dname.91app.io, which is where www.babushop.com.tw points to.
However, I still get the same error saying query timed out looking up CAA.
Is this the expected behavior of Let's Encrypt? Can Let's Encrypt be modified so that it queries the CAA for s2454.dname.91app.io instead of www.babushop.com.tw in this case?
That nslookup command didn’t query for a CAA record specifically. Queries for CAA records should return the CNAME, which will be followed, but your DNS returns a SERVFAIL. I believe you’ll need to either switch providers or convince yours to follow the required standards. All public CAs are now required to verify CAA records, so anyone on your DNS service will be unable to issue any publicly-trusted certificates. That sounds like a very unsound business practice for a DNS provider.
In principle, yes. It would be complicated. At this point, Let's Encrypt is not likely to put a large amount of engineering effort into working around DNS compliance issues at a small number of DNS providers.
To be fair(?), a CA can ignore an error like this for unsigned zones. Let's Encrypt no longer will, though.
Let's Encrypt will indeed query for the CAA record at s2454.dname.91app.io, and will receive a CNAME to a couple more domains:
;; ANSWER SECTION:
www.babushop.com.tw. 86399 IN CNAME s2454.dname.91app.io.
s2454.dname.91app.io. 299 IN CNAME proxy.letssl.91app.io.
proxy.letssl.91app.io. 299 IN CNAME proxy-letssl-91app-io-196811564.ap-northeast-1.elb.amazonaws.com.
However, s2454.dname.91app.io is different from dname.91app.io, and Let's Encrypt will not query dname.91app.io (the parent domain). Confusingly, there was a period of a couple weeks when Let's Encrypt would have queried the parents of all CNAMEs, but the CAA spec has been updated, and Let's Encrypt's behavior was updated to match, so it no longer checks parent domains.
Is it possible to add a CAA record for s2454.dname.91app.io? I got an error saying
RRSet of type CAA with DNS name s2454.dname.91app.io. is not permitted because a conflicting RRSet of type CNAME with the same DNS name already exists in zone dname.91app.io.
when trying to add a CAA record for s2454.dname.91app.io. Maybe I'm not adding it correctly.
if a certificate is requested for X.Y.Z the issuer will
search for the relevant CAA record set in the following order:
X.Y.Z
Alias (X.Y.Z)
Y.Z
Alias (Y.Z)
Z
Alias (Z)
Return Empty
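As a small illustration of the order quoted above, here is a Python sketch that lists the names an issuer would consider for a given domain. The function name caa_search_order and the aliases dictionary (standing in for CNAME records) are hypothetical, and the sketch follows only the quoted text; as other replies in this thread point out, the specification has since been amended and the alias step is not a separate DNS query.

```python
from typing import Dict, Iterator, Optional


def caa_search_order(domain: str, aliases: Optional[Dict[str, str]] = None) -> Iterator[str]:
    """Yield names in the order X.Y.Z, Alias(X.Y.Z), Y.Z, Alias(Y.Z), Z, Alias(Z)."""
    aliases = aliases or {}
    labels = domain.rstrip(".").split(".")
    for i in range(len(labels)):
        name = ".".join(labels[i:])
        yield name
        # Alias(name) is only considered when the name actually has a CNAME.
        target = aliases.get(name)
        if target is not None:
            yield target


if __name__ == "__main__":
    cnames = {"www.babushop.com.tw": "s2454.dname.91app.io"}
    for candidate in caa_search_order("www.babushop.com.tw", cnames):
        print(candidate)
```

Under these assumptions the parent of the alias target (dname.91app.io) never appears in the list, because only the parents of the requested name are climbed.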
I think the reason for failing to register an SSL certificate for www.babushop.com.tw is that the issuer (Let's Encrypt) got a timeout error when searching for the CAA record for www.babushop.com.tw, and then stopped before searching for the CAA record for Alias(www.babushop.com.tw), which is s2454.dname.91app.io.
Switching the DNS provider or convincing them to follow the standard takes a lot of effort, so I want to make sure there’s really no other solution before I switch the DNS provider.
Yes and no. A CNAME record by that name exists. You can't have a CNAME record and a CAA record with the same name.
91App's DNS provider (Amazon Route 53) supports an "alias" feature. They should be able to work something out, if they want to. Doing it right would probably require some changes to their current DNS architecture, which they may not be eager to do. (They could probably lower their bills, though!)
Still, it may be a nice feature, but it wouldn't help you get a Let's Encrypt certificate.
For what it's worth, there have been changes to CAA since the RFC was published, to correct errors and make other changes. If you want to read about how the current spec is deployed in practice, you should also read the official errata and the CA/Browser Forum ballots related to it.
Right.
Right. Let's Encrypt isn't behaving incorrectly.
When the CAA specification says an implementation should "search for" an alias, it's not requiring the implementation to make extra DNS queries of different types, it's just specifying how to examine the records it got in response to the DNS query of type CAA that it did make.
Let CAA(X) be the record set returned in response to performing a CAA
record query on the label X
That's how DNS resolution normally works. To give an example (while leaving out many steps), if you want the A (IPv4 address) records for www.example.com., the recursive DNS server makes an A query for www.example.com.. If any such records exist, the authoritative DNS server returns them. If instead a CNAME record exists, the authoritative server returns it. If nothing at all exists, the authoritative server returns "nothing at all exists".
When queried for CAA, the authoritative DNS servers you're using don't return anything. Not CAA records, not the CNAME that exists (which is what they should return), not any sort of negative response, and not even an error, or something invalid.
That's broken. As far as a normal DNS client knows, the servers are down, and it should behave accordingly.
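To make the difference between those outcomes concrete, here is a deliberately simplified Python sketch of how a CA-side check might treat each kind of response to a CAA query. The Outcome enum, the caa_decision function, and its return strings are all hypothetical, and the sketch ignores real-world details such as issuewild and critical flags; it is not Let's Encrypt's actual code.

```python
from enum import Enum, auto
from typing import Iterable


class Outcome(Enum):
    CAA_RECORDS = auto()  # the server returned CAA records
    CNAME = auto()        # the server returned a CNAME for the name
    NODATA = auto()       # the name exists but has no CAA records
    NXDOMAIN = auto()     # the name does not exist
    SERVFAIL = auto()     # the resolver reported a failure
    TIMEOUT = auto()      # no response at all


def caa_decision(outcome: Outcome, issuers: Iterable[str] = (),
                 ca: str = "letsencrypt.org") -> str:
    """Return what the check does next for one queried name."""
    # No usable answer: a restrictive record may exist but be unreachable,
    # so the check fails closed.
    if outcome in (Outcome.TIMEOUT, Outcome.SERVFAIL):
        return "refuse"
    # Records found: issuance is allowed only if this CA is listed.
    if outcome is Outcome.CAA_RECORDS:
        return "allow" if ca in set(issuers) else "refuse"
    # An alias was returned: evaluate the CNAME target instead.
    if outcome is Outcome.CNAME:
        return "follow-alias"
    # A clean negative answer: nothing restricts issuance at this name.
    return "no-restriction-here"
```

In this model, a server that answers with NODATA lets the check move on, while a server that stays silent lands in the "refuse" branch, which matches the error reported at the top of this thread.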
Let's Encrypt could make changes to work around this specific issue. They could make a query of a different type, like A or CNAME, first, and cache it, so the following CAA query would rely on the cached CNAME and only send CAA queries to its destination. They could do that always, or when there's an error. Let's Encrypt could patch the recursive DNS server to do it, or modify its configuration to increase caching and then patch the CA software to do it. In fact, they could simply ignore all sorts of errors and consider them permission to issue, as long as the zone doesn't use DNSSEC.
Those types of changes would require some work, or a lot of work, and maybe increase maintenance burden, risk mistakes and compliance failures, and make the service less robust and secure, all for the sake of working around some (but not all) issues with a small proportion of broken DNS servers affecting a small proportion of (prospective) Let's Encrypt users.
At this time, Let's Encrypt's policy position is not to implement hacks or exceptions to work around CAA bugs in authoritative DNS servers. Some CAs (particularly more expensive and less automated ones) may make other decisions.
I can't speak for Let's Encrypt, but I don't expect that policy to change. In most ways, there's less justification to change it every day. CAA has been universally required for 3 months, and Let's Encrypt has deployed it even longer (with, at times, certain exceptions for broken servers). Most widely used DNS implementations were fixed months or years ago. Any DNS provider still having problems has probably been getting yelled at by angry users for 3-4 months, if not longer. It was always hard to justify broken behavior, and it's even harder now.
Well, I'm sorry to hear that. Sometimes, it is easy to switch DNS providers, or easy to convince them to fix important things they ought to have fixed long ago. Sometimes it's not.
You say it would take a lot of effort, and that may be true, but... I don't mean this to be rude, but it would take a lot of effort on your part, and on the part of your DNS provider. Changing Let's Encrypt would most likely take more effort on the part of Let's Encrypt's engineers, and potentially many other people (if something goes wrong). A global cost-benefit analysis of this decision -- and isn't that a way to cause anxiety about everyday concerns, like what to eat for lunch -- may not have a result you'd like.
I don't know how to conclude this post. (Which I definitely need to do before I say something even more ridiculous.) You should contact your DNS provider, but you may need to switch. I'm sorry.