SERVFAIL only on staging environment


#1

My domain is:
mijnwordpress.be

I ran this command:
I’m using https://github.com/unixcharles/acme-client. In production we’re using v1 of the spec (without any issues), but we’re in the process of moving to v2. I am testing against the staging environment for the time being.

It produced this output:
Almost half of the attempts to get a certificate for the domain result in a SERVFAIL error for the CAA lookup from Let’s Encrypt. There is no CAA record set on the domain and the nameserver is set to respond with NOERROR (which is also what I get from dig). We are using an older version of PowerDNS on our nameservers (so 4.0.4 is not an option at the moment), but we do not use DNSSEC either, so I think the PowerDNS bug does not apply here.

We also tried with authenticate.be on Cloudflare, and the same error occurred there fairly often.

The operating system my web server runs on is (include version):
Debian 8

My hosting provider, if applicable, is: we are a hosting provider, Openminds.

I can login to a root shell on my machine (yes or no, or I don’t know): yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel): yes (in-house)


#2

I can confirm I see the same behavior for my domain using the dehydrated client.
I have no CAA record, and the authoritative DNS servers return NOERROR (with no answer section).

The prod environment has no issues; I have just requested a cert successfully from there.


#3

@cstamas, could you let us know what your domain is?

@lestaff, this sounds like it could be a real bug in staging of some sort.


#4

It was lists2.iszt.hu.

The command I used to check was:
% dig -t caa lists2.iszt.hu @ns.iszt.hu
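
Because the failures reported in this thread were intermittent, a single dig run can easily miss them. The following is a minimal sketch, not anything from the thread itself, that repeats the same query and counts the response status (NOERROR, SERVFAIL, and so on) reported in dig's header line. The helper names (parse_status, run_dig, tally) and the repeat count of 20 are assumptions made for illustration. It only shows how the authoritative server answers from your own vantage point, so it does not reproduce the lookup path used by the Let’s Encrypt validation servers.

```python
import re
import subprocess
from collections import Counter

# dig prints a header line such as:
# ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12345"
STATUS_RE = re.compile(r"status:\s*([A-Z]+)")


def parse_status(dig_output):
    """Return the response status named in dig's header, or "UNKNOWN"."""
    match = STATUS_RE.search(dig_output)
    return match.group(1) if match else "UNKNOWN"


def run_dig(name, server):
    """Run the same CAA query as above and return dig's raw text output."""
    result = subprocess.run(
        ["dig", "-t", "caa", name, "@" + server],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout


def tally(outputs):
    """Count how often each status appears across several dig outputs."""
    return Counter(parse_status(output) for output in outputs)


if __name__ == "__main__":
    # The repeat count is arbitrary (an assumption for this sketch).
    outputs = [run_dig("lists2.iszt.hu", "ns.iszt.hu") for _ in range(20)]
    for status, count in tally(outputs).most_common():
        print(f"{status}: {count}")
```

If every run reports NOERROR, the authoritative server is probably answering consistently from where you are, and the problem is more likely somewhere between the validating resolver and the nameserver.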

I have just retried forcing a renewal with the staging env and it does work at the moment.


#5

I just tried manually checking both domains’ CAA records from our staging infrastructure, and things seem to be working properly. I’m inclined to chalk the past SERVFAILs up to the usual sporadic Internet traffic glitches. If this keeps happening, please update us. If you’re using the HTTP-01 challenge type, it would be useful to see your traceroutes back to the Let’s Encrypt validation server that appears in your HTTP logs. Thanks!