Invalid Challenge response due to SERVFAIL on CAA request (solved)

We are a hosting company and have been successfully creating SSL certs on thousands of domains for our users for quite some time without issue. Only in the last few days, we’ve been unable to create SSL certs for certain domains for no apparent reason, while we can still create certs for other domains on the same host.

From our Apache logs, we can see that Let’s Encrypt connects to our server for the failing domain and retrieves the challenge file with a 200 response, yet we still get an “invalid” response from the Let’s Encrypt API. We have verified that the challenge file has the correct contents and that we can retrieve it from our server without issue. The challenge repeatedly fails for the same domain while repeatedly succeeding for another domain on the same server, and the headers in the HTTP response for the challenge file look exactly the same for both domains. Here is an example:

1500057740.714744 Private key loaded

1500057741.345338 Let’s Encrypt Directories loaded.

1500057741.483964 Sending registration message

1500057741.621206 Known key used

1500057741.621407 Refetching with location URL

1500057741.947683 TOS already accepted. Skipping

1500057741.948004 Sending authz message for domain

1500057742.140741 Handing challenge for token: iNMFh58TGhf1NAFutE9kNVs_1B4kru0mg51cPbawu5s.psAeiFY6OkcX3SqWcKNnHtXRFF1xCrsLAO_KSsMzGAg

1500057742.508231 Polling for challenge fulfillment

1500057742.508279 Status: pending

1500057744.697098 Status: invalid

From our Apache logs at the time this API request was made:

domain 66.133.109.36 - - [14/Jul/2017:18:42:22 +0000] "GET /.well-known/acme-challenge/iNMFh58TGhf1NAFutE9kNVs_1B4kru0mg51cPbawu5s HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"

When I request the challenge file with my own HTTP client, I get:

wget -S http://domain/.well-known/acme-challenge/iNMFh58TGhf1NAFutE9kNVs_1B4kru0mg51cPbawu5s
--2017-07-14 11:42:40-- http://homegymstrong.com/.well-known/acme-challenge/iNMFh58TGhf1NAFutE9kNVs_1B4kru0mg51cPbawu5s
Resolving domain (domain)… 54.243.187.70
Connecting to domain (domain)|54.243.187.70|:80… connected.
HTTP request sent, awaiting response…
HTTP/1.1 200 OK
Date: Fri, 14 Jul 2017 18:50:20 GMT
Server: Apache/2.2.22 (Debian)
Last-Modified: Fri, 14 Jul 2017 18:42:22 GMT
ETag: "ed8bc-57-5544b68dec405"
Accept-Ranges: bytes
Content-Length: 87
Cache-Control: max-age=0
Expires: Fri, 14 Jul 2017 18:50:20 GMT
Connection: close
Content-Type: text/plain
Length: 87 [text/plain]
Saving to: ‘iNMFh58TGhf1NAFutE9kNVs_1B4kru0mg51cPbawu5s’

#cat iNMFh58TGhf1NAFutE9kNVs_1B4kru0mg51cPbawu5s
iNMFh58TGhf1NAFutE9kNVs_1B4kru0mg51cPbawu5s.psAeiFY6OkcX3SqWcKNnHtXRFF1xCrsLAO_KSsMzGAg

We are using the following Perl modules to communicate with the Let’s Encrypt API. They have worked well for us on all domains until now, and still work for many domains:

use Protocol::ACME;
use Protocol::ACME::Challenge::LocalFile;

What we know is that we write the correct challenge file, Let’s Encrypt requests the correct challenge file from our server, we send back the exact number of bytes of the challenge file’s content, the connection is closed normally, and then the Let’s Encrypt API responds with “invalid”. So everything on our end looks exactly as it normally does when we get a “valid” response from the API, but for some unknown reason Let’s Encrypt is responding with “invalid”.

It looks like we are getting the following error for all of these. It must be due to some new LetsEncrypt requirement, as we’ve not gotten this error until recently:

'DNS problem: SERVFAIL looking up CAA for ’
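For anyone hitting the same wall: the reason behind an “invalid” status is carried in the `error` object of the failed challenge, so it helps to log it rather than only the status. Below is a minimal Python sketch of pulling that detail out of an authorization response. The `SAMPLE` body is hypothetical and only illustrates the shape (field names and the error type follow RFC 8555; the name `example.com` is a placeholder), and whether and how Protocol::ACME exposes this body is not shown here.

```python
import json


def failed_challenge_details(authz_json):
    """Return (challenge type, error detail) pairs for invalid challenges."""
    authz = json.loads(authz_json)
    details = []
    for challenge in authz.get("challenges", []):
        error = challenge.get("error")
        # Only challenges that failed validation carry an error object.
        if challenge.get("status") == "invalid" and error:
            details.append((challenge.get("type"), error.get("detail", "")))
    return details


# Hypothetical authorization body, for illustration only.
SAMPLE = json.dumps({
    "status": "invalid",
    "challenges": [
        {
            "type": "http-01",
            "status": "invalid",
            "error": {
                "type": "urn:ietf:params:acme:error:dns",
                "detail": "DNS problem: SERVFAIL looking up CAA for example.com",
            },
        },
        {"type": "dns-01", "status": "pending"},
    ],
})

for challenge_type, detail in failed_challenge_details(SAMPLE):
    print(challenge_type, "failed:", detail)
```

Logging this detail next to the “Status: invalid” line would have pointed at DNS rather than at the challenge file.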

Hi @nichemarketing,

Let’s Encrypt has always enforced CAA, which is a way to use DNS to indicate which certificate authorities are allowed to issue certificates for a particular domain.

A forum search for this error shows that people have encountered it as early as August 2016.

https://community.letsencrypt.org/search?q="servfail%20looking%20up%20caa"%20order%3Alatest

Maybe some of the previous discussions about this on the forum will be helpful to you. It is a Let’s Encrypt requirement (and has been for a long time) that DNS providers answer a query for CAA records for subject names for which certificates are requested. They can answer that no such record exists (CAA records are not mandatory), but they can’t answer with an error (answering the question successfully is mandatory). Is it possible that you changed DNS providers recently or that your DNS provider or internal DNS infrastructure has changed its software recently?
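To check a nameserver against the requirement described above, one option is to send it a CAA query directly and look at the response code. The following is a rough Python sketch using only the standard library. The function names are made up for this example, it does no retries and no TCP fallback, and it only reads the RCODE rather than parsing any records. An answer of NOERROR (even with no CAA records) is fine; SERVFAIL is the answer that blocks issuance.

```python
import socket
import struct

CAA_TYPE = 257  # RR type number assigned to CAA
IN_CLASS = 1
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_caa_query(name, txid=0x1234):
    """Build a minimal DNS query packet asking for the CAA records of name."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", CAA_TYPE, IN_CLASS)


def rcode_of(response):
    """Return the symbolic RCODE from a raw DNS response."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F  # RCODE is the low 4 bits of the flags word
    return RCODES.get(code, "RCODE%d" % code)


def query_caa_rcode(name, server, timeout=5.0):
    """Send one CAA query over UDP to server and return the RCODE name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_caa_query(name), (server, 53))
        data, _ = sock.recvfrom(4096)
    return rcode_of(data)
```

For example, `query_caa_rcode("example.com", "192.0.2.53")` (both values are placeholders) would be run once against each authoritative nameserver for the domain, since only some of them may be misbehaving.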

Some servers still don’t answer queries about CAA correctly because CAA is a comparatively recently introduced RR type (from 2013), although returning a SERVFAIL error in response to not recognizing the RR type is, I think, not really correct behavior.

The server that has this problem in this example seems to be ns1.mywahosting.com (and also ns2 through ns4); are those your servers or someone else’s?

Thanks for your help. Our server was returning SERVFAIL for CAA requests but it has now been corrected and we are responding with CAA records. Everything is fine now.

There was an issue on one of our DNS servers that was causing a SERVFAIL for CAA requests. We now have CAA records implemented.
