CAA servfail on canhost?

My domain is:
grm.fleetnova.com

I ran this command:
certbot-auto --apache

It produced this output:
Failed authorization procedure. grm.fleetnova.com (tls-sni-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: DNS problem: SERVFAIL looking up CAA for fleetnova.com

My web server is (include version):
apache 2

The operating system my web server runs on is (include version):
debian 7

My hosting provider, if applicable, is:
canhost / OVH

I can login to a root shell on my machine (yes or no, or I don’t know):
yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel):
not sure

Yeah… All 4 of that domain’s nameservers respond with SERVFAIL to CAA queries.

$ digr fleetnova.com caa @ns12.kookiejar.net

; <<>> DiG 9.10.3-P4-Ubuntu <<>> +norecurse fleetnova.com caa @ns12.kookiejar.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 13314
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;fleetnova.com.                 IN      CAA

;; Query time: 79 msec
;; SERVER: 208.73.59.195#53(208.73.59.195)
;; WHEN: Tue Aug 15 09:59:21 UTC 2017
;; MSG SIZE  rcvd: 31

According to version.bind queries, they all run PowerDNS 2. (And their version.bind responses have class IN, which was fixed in 3.0.0.)

"Served by POWERDNS 2.9.22 $Id: packethandler.cc 1321 2008-12-06 19:44:36Z ahu $"
"Served by POWERDNS 2.9.22.6 $Id: packethandler.cc 2063 2011-03-14 14:26:38Z ahu $"

Folks need to upgrade, or you need to switch DNS providers. :grimacing:
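
If you want to repeat that check without dig, here is a rough Python sketch of the same non-recursive CAA query using only the standard library. The helper names (`build_query`, `rcode_of`, `query_caa`) are made up for this example, and it only reads the response code out of the DNS header, so treat it as an illustration rather than a full resolver.

```python
import socket
import struct

CAA_TYPE = 257  # CAA record type number from RFC 6844
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qtype=CAA_TYPE, query_id=0x1234):
    """Build a DNS query packet with the RD flag clear (like dig +norecurse)."""
    # Header: id, flags=0, one question, no answer/authority/additional records
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class IN


def rcode_of(response):
    """Return the response code name from the low 4 bits of the flags field."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, str(code))


def query_caa(name, server, timeout=3.0):
    """Send one UDP CAA query to `server` and return the rcode name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
    return rcode_of(response)
```

Calling `query_caa("fleetnova.com", "208.73.59.195")` should report `SERVFAIL` for as long as that nameserver behaves the way the dig output above shows.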

Understood. The odd thing is that it has worked before, and I have multiple domains using the same host and DNS. I was able to get a cert for the other domains just yesterday.
Try getice.ca, for example. That one should SERVFAIL too, but I got a cert yesterday with no problem.

Yeah, getice.ca fails in the same way.

Let's Encrypt -- for now -- still has a whitelist of domains for which CAA errors are ignored. That's probably why getice.ca works.

@jsha:

August 15 is a bit late for this, but i think there may be a bug/limitation in the broken CAA whitelist.

  1. User has certificate for grm.fleetnova.com but not fleetnova.com.
  2. User tries to validate grm.fleetnova.com.
  3. CAA query for grm.fleetnova.com fails, but it's on the exception list.
  4. CAA query for fleetnova.com fails, but it's not on the exception list.
  5. CAA query for com succeeds.
  6. Validation fails with CAA error due to fleetnova.com failure.

It's only a hypothesis, but i think this could explain what @rictd is experiencing.

LookupCAA's exception check seems to be an "exact match" check rather than taking into account parent or child domains, so the exception list would need to have been manually generated to include failed domains and their parents (at least when their parents are also broken).
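
To make the hypothesis concrete, here is a small Python model of it. This is not Boulder's actual code: `check_caa`, the `lookup` callback, and the data are all invented for illustration, and the only assumption carried over is that the exception list is consulted with an exact-match test at each step of the climb toward the TLD.

```python
def parents(name):
    """Return name and each parent, e.g. a.b.com -> [a.b.com, b.com, com]."""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def check_caa(name, lookup, exceptions):
    """Climb from name toward the TLD and return (ok, reason).

    lookup(candidate) returns "SERVFAIL" or a list of authorized issuers.
    """
    for candidate in parents(name):
        result = lookup(candidate)
        if result == "SERVFAIL":
            if candidate in exceptions:  # exact match only, parents not implied
                continue
            return False, f"SERVFAIL looking up CAA for {candidate}"
        if result:  # first non-empty CAA set decides, stop climbing
            return "letsencrypt.org" in result, f"CAA found at {candidate}"
    return True, "no CAA records"


# Hypothetical setup: both names SERVFAIL, only the subdomain is excepted.
broken = {"grm.fleetnova.com", "fleetnova.com"}
lookup = lambda n: "SERVFAIL" if n in broken else []

print(check_caa("grm.fleetnova.com", lookup, {"grm.fleetnova.com"}))
print(check_caa("grm.fleetnova.com", lookup, broken))
```

In this model the first call fails on fleetnova.com even though grm.fleetnova.com is excepted, and the second call passes once the parent is on the list too.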

grm.fleetnova.com has past certificates and is in the SERVFAIL exception files you pasted the other day. fleetnova.com does not and is not.

getice.ca and www.getice.ca both have past certificates and are both on the list and @rictd was just able to get a new certificate for them.

https://crt.sh/?q=%fleetnova.com
https://crt.sh/?q=%getice.ca

What do you think? Other than "I wish it was September 8 already." :stuck_out_tongue_winking_eye:

If fleetnova.com is in the internal exception list, i guess i'm way off and maybe something weird is happening. If it isn't, the exception list or code may need to be updated to include parents where necessary.

Edit: Fix "example.com" and a couple errors. I should do editing before hitting submit...

we are rallying our host to upgrade PowerDNS and they said they plan on having CAA support by month end

i think we should be whitelisted in the meantime, since we used to have a cert and now we don’t, and it is affecting our business

Good sleuthing, @mnordhoff. I agree with your assessment: since the SERVFAIL exception is implemented in LookupCAA, it only applies on a per-lookup basis. Most affected sites haven’t hit this issue because they had certs for both subdomains and parent domains, meaning both were in the exception list. The parent domain fleetnova.com wasn’t in the exception list, so it hit this bug.

I think rather than fix the exceptions code, the best temporary fix is to manually add fleetnova.com to the list. Maybe @cpu or @roland can help with that? Note that this will allow @rictd to get a new cert, but only until Sep 8.

@rictd, thanks for following up with your host. Glad they are working on it! From a close reading, it sounds like they just committed to CAA support, not to upgrading their PowerDNS. Just in case they’re not planning to upgrade, you might want to remind them that there are a lot of known security vulnerabilities in the version they are running.

That's good! :smile: To clarify, the critical thing is that they need to upgrade to a version of PowerDNS that fixes some bugs.

They don't need to allow users to create CAA records, develop the associated interface for it, write documentation, etc. Though it's great if they are!

But if they only kick off a software upgrade ASAP, you'll be in good shape, and able to create certificates.

Yeah... you were intended to be whitelisted (for the next few weeks) but there was an issue. :sweat:

One other thought: Per https://letsencrypt.org/docs/caa/, CAA records for subdomains override parent domains. So setting a CAA record authorizing Let’s Encrypt on grm.fleetnova.com would solve your problem. Of course, since your DNS provider doesn’t support CAA, there’s a catch-22. But this can be solved by adding an NS record to grm.fleetnova.com pointing just that domain to another provider. This might be easier and faster than moving all of your DNS, in case adding the base domain to the exceptions list is taking too long.
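
Here is a toy Python illustration of why that works, assuming the CA stops climbing at the first name that returns a CAA record set. The function `relevant_caa`, the `lookup` stub, and the record data are hypothetical stand-ins, not real resolver code.

```python
def relevant_caa(name, lookup):
    """Return (owner, records) for the closest name that has CAA records."""
    labels = name.rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        records = lookup(candidate)
        if records:  # found a CAA set, parents are never consulted
            return candidate, records
    return None, []


# Hypothetical data: grm.fleetnova.com is delegated to a provider that serves CAA.
queried = []


def lookup(name):
    queried.append(name)
    if name == "fleetnova.com":
        raise RuntimeError("SERVFAIL")  # stands in for the broken parent zone
    return {"grm.fleetnova.com": ['0 issue "letsencrypt.org"']}.get(name, [])


print(relevant_caa("grm.fleetnova.com", lookup))
print(queried)  # the broken parent never shows up here
```

Because the subdomain answers with its own CAA record, the lookup for fleetnova.com is never attempted in this model, so its SERVFAIL cannot block issuance.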

@jsha
can you recommend a free alternative (another DNS provider) that is known to be working fine with Let’s Encrypt?

I believe Cloudflare DNS is free, and I know it works with Let’s Encrypt.
