Trouble with Verify error:DNS problem: SERVFAIL looking up CAA

I am trying to generate a certificate to enable HTTPS on our internal Artifactory, which is not visible outside the company. We are using acme.sh to automate the dns-01 challenges with our DNS provider, GoDaddy. It has always worked in the past, but now that I have added a new wildcard, *.jfrog-osp.kaloom.io, for some redundancy, issuance fails.
On our internal DNS server, jfrog-osp.kaloom.io is an A record that points to the IP of the server, and jfrog.kaloom.io is a CNAME pointing to jfrog-osp.kaloom.io. I am not sure this is relevant, though.
I was able to generate the equivalent with *.jfrog-ocp.apps.os-sandbox.kaloom.io for our tests on Red Hat OpenShift, and it worked fine.
I can see the TXT record at our DNS provider, and the script proceeds with validation once I see the record has propagated to 8.8.8.8, so everything looks good up to that point. I cannot understand why the validation then fails.

There is this entry at the beginning of the log showing a pending status for that specific subdomain. I am not sure whether that is normal:

Getting webroot for domain='*.jfrog-osp.kaloom.io'
_w='dns_gd'
_currentRoot='dns_gd'
entry='"type":"dns-01","status":"pending","url":"https://acme-v02.api.letsencrypt.org/acme/chall-v3/67807072910/WNcw7A","token":"4utJ3ybuSwpty64qXSKb6j9GL5AHMWqDLIeC9fekFRE"'

My domain is: kaloom.io

I ran this command:
/root/.acme.sh/acme.sh --issue --force --log --dns dns_gd -d kaloom.io -d *.kaloom.io -d *.artifactory.kaloom.io -d *.jfrog-osp.kaloom.io -d *.jfrog-ocp.kaloom.io -d *.jfrog.kaloom.io

It produced this output:
*.jfrog-osp.kaloom.io:Verify error:DNS problem: SERVFAIL looking up CAA for jfrog-osp.kaloom.io - the domain's nameservers may be malfunctioning

My web server is (include version):
It is an internal artifactory site in my company

The operating system my web server runs on is (include version):
Docker instance of artifactory on Centos VM

My hosting provider, if applicable, is:
GoDaddy

I can login to a root shell on my machine (yes or no, or I don't know):
Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
acme.sh v2.8.6 (acmesh-official/acme.sh on GitHub, a pure Unix shell script implementing the ACME client protocol)

Here is an image of the output just before it fails:

Your DNS server is not responding correctly to a query for the CAA record. The CAA record allows you to specify which Certificate Authorities are allowed to issue for your domain. You don't need to have one, but if you don't then your DNS server needs to correctly say you don't rather than giving an error.

Info on CAA:

A test site showing the same SERVFAIL that Let's Encrypt is getting:

https://unboundtest.com/m/CAA/jfrog-osp.kaloom.io/YDTY4UL4

The DNSViz checker has some more details about the failure:

https://dnsviz.net/d/jfrog-osp.kaloom.io/dnssec/?rr=1&rr=28&rr=257&a=all&ds=all&ta=.&tk=

I think that it's simultaneously saying the name does not exist, while not having a DNSSEC signature that says the name doesn't exist, but I'm far from an expert on DNSSEC. Regardless, this is something your DNS server needs to fix. Depending on who hosts your DNS, you might try disabling DNSSEC and re-enabling it, or adding a CAA record (even one allowing for all CAs) might work around their DNS implementation bug. Or maybe your DNS provider needs to upgrade their DNS server.

Ok thanks for the reply. I have this record in my dns:
Type: CAA, Name: @, Value: 0 issue letsencrypt.org, TTL: 1 Hour

Your reply puzzles me, because all our other certificate requests are working; this is the only one giving me trouble. I have been trying to generate it for two days...

As for DNSSEC, it is enabled and has two entries for kaloom.io, but I have not tried disabling and re-enabling it.

There's a CAA record for kaloom.io that works fine.

https://unboundtest.com/m/CAA/kaloom.io/F7NDTURS

But before checking that, Let's Encrypt first needs to check for a CAA record for the full hostname jfrog-osp.kaloom.io (since it takes precedence if it exists), and that query is the one that's failing (so Let's Encrypt can't tell if it in fact exists).
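
To make that lookup order concrete, here is a rough Python sketch of the CAA "tree climbing" a CA performs, as I understand it from RFC 8659. It is simplified (it ignores CNAME handling), it is not Let's Encrypt's actual code, and the function names and the `lookup` callback are made up for illustration. The point is that the full hostname is queried first, and any answer other than NOERROR or NXDOMAIN stops the whole check before kaloom.io is ever consulted.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical resolver interface: lookup(name) -> (rcode, caa_records)
Lookup = Callable[[str], Tuple[str, List[str]]]


def caa_candidates(hostname: str) -> List[str]:
    """Names checked for CAA, most specific first (wildcard label dropped)."""
    name = hostname.rstrip(".")
    if name.startswith("*."):
        name = name[2:]
    labels = name.split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def find_relevant_caa(hostname: str, lookup: Lookup) -> Tuple[Optional[str], List[str]]:
    """Climb from the hostname toward the root; the first CAA record set wins."""
    for name in caa_candidates(hostname):
        rcode, records = lookup(name)
        # An error such as SERVFAIL means the CA cannot tell whether a
        # CAA record exists at this name, so it has to refuse to issue.
        if rcode not in ("NOERROR", "NXDOMAIN"):
            raise RuntimeError(f"{rcode} looking up CAA for {name}")
        if records:
            return name, records
    return None, []  # no CAA anywhere: any CA may issue
```

With a resolver that answers NXDOMAIN for jfrog-osp.kaloom.io, this falls through to the record on kaloom.io; with one that answers SERVFAIL, it raises before getting that far, which matches the error you are seeing.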

OK, but why does it not fail when I ask for jfrog-ocp.apps.os-sandbox.kaloom.io? There is no CAA record for that one, or for any other domain in my requests. They are all for internal sites, so none of them actually respond from Let's Encrypt's perspective. As I understand it, though, only the TXT record is supposed to prove that you control the domain you are requesting.

The CAA query for jfrog-ocp.apps.os-sandbox.kaloom.io is correctly returning NXDOMAIN (that the domain doesn't exist). Saying it doesn't exist is fine from Let's Encrypt's perspective (and as you say is common for internal sites), but returning an error isn't.
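
If you want to see the response code yourself without a test site, the sketch below sends a bare CAA query over UDP using only the Python standard library and reads the RCODE from the reply header. Treat it as an illustration under assumptions: the helper names are mine, 8.8.8.8 is just an example of a DNSSEC-validating resolver, and it skips things a real client would handle (TCP fallback, checking that the reply ID matches, parsing the answer section).

```python
import secrets
import socket
import struct

CAA_TYPE = 257  # CAA record type number (RFC 8659)
RCODE_NAMES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_caa_query(name: str, query_id: int) -> bytes:
    """Build a minimal DNS query packet asking for the CAA records of name."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", CAA_TYPE, 1)  # class IN


def rcode_of(response: bytes) -> str:
    """Return the response code stored in the low 4 bits of the header flags."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODE_NAMES.get(code, f"RCODE{code}")


def caa_rcode(name: str, resolver: str = "8.8.8.8", timeout: float = 5.0) -> str:
    """Ask resolver for the CAA records of name and return the RCODE name."""
    query = build_caa_query(name, secrets.randbelow(65536))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (resolver, 53))
        response, _ = sock.recvfrom(4096)
    return rcode_of(response)

# Example (needs network access):
#   print(caa_rcode("jfrog-osp.kaloom.io"))
```

NOERROR and NXDOMAIN are both acceptable outcomes for issuance; SERVFAIL is the one that blocks it.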

But I don't know why your DNS server is returning an error for some hostnames and not others. You'd have to ask whoever runs your DNS server.

Ah, OK, this is a good lead. I will take it up with GoDaddy to see if they have anything to say about this.

It seems that disabling and re-enabling DNSSEC did the trick!

No one at GoDaddy could explain why some domains were getting SERVFAIL and others NXDOMAIN. So I went ahead and tried what you suggested first, and it seems to have fixed it.

Thank you, this was not obvious to me!

It's quite something how many DNS providers allow for DNSSEC to be used but then don't seem to have an understanding of how to test, debug, or correct issues with it. Glad to hear that you got it working, though!

A few minutes after my post it stopped working again. I repeated the same procedure and it worked again after a little while. It seems very flaky to me, and I do not know whether I should disable DNSSEC, because I do not know what it does.

I was able to get my certificate for jfrog.kaloom.io and jfrog-osp.kaloom.io, so for now it's fine, but I suppose it will happen again. The result of the CAA query for these two is now NOERROR, even though there is no record matching these domains.
If I try jfrog-ocp.kaloom.io, for which I have not yet requested a certificate, I still get NXDOMAIN.

I suppose that as long as I do not get the SERVFAIL I should be good.

Anyways thanks for your insight it helped a lot!
