Intermittent Secondary validation: DNS problem: SERVFAIL

Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. crt.sh | example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.

My domain is:
www.tal09.clients.merrehill.co.uk

I ran this command:
certbot certonly --apache --non-interactive --preferred-challenges http --cert-name tal09.clients.merrehill.co.uk -d tal09.clients.merrehill.co.uk,talent-finder-limited.co.uk,c.tal09.clients.merrehill.co.uk,c.talent-finder-limited.co.uk,www.tal09.clients.merrehill.co.uk,www.talent-finder-limited.co.uk

It produced this output:
During secondary validation: DNS problem: SERVFAIL looking up A for www.tal09.clients.merrehill.co.uk - the domain's nameservers may be malfunctioning; no valid AAAA records found for www.tal09.clients.merrehill.co.uk

My web server is (include version):
Apache 2.4.46

The operating system my web server runs on is (include version): Amazon Linux, version 2.

My hosting provider, if applicable, is:
N/A

I can login to a root shell on my machine (yes or no, or I don't know):
Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): 1.11.0

We have been seeing intermittent errors for a number of months when getting certificates. As an example, the request above failed at 2022-07-14 12:59:47 BST and, when run again, succeeded at 2022-07-14 13:05:42 BST, about 6 minutes later.

The failures we see always occur during the secondary validation. Other sub-domains on the same certificate request go through OK. The log indicates that both:
c.tal09.clients.merrehill.co.uk & tal09.clients.merrehill.co.uk
had valid responses during the processing.

These domains are hosted under the wildcard A record of *.clients.merrehill.co.uk.
There are no AAAA records as the solution doesn't use IPv6.
As the issue is intermittent, it is really hard to tell why it fails.
It would be good to understand where the issue lies, be it with Let's Encrypt or our DNS provider, so I can focus attention on getting these certificates issued smoothly.

I appreciate any light the community can shed on this.
Many thanks in advance.
Andy

Your site might not be on IPv6, but your DNS servers claim to be, yet they don't actually reply to queries over UDP on those addresses. That sounds like exactly the kind of thing that Let's Encrypt's DNS querying might sometimes be able to work around, but not reliably.

https://dnsviz.net/d/www.tal09.clients.merrehill.co.uk/dnssec/

merrehill.co.uk zone: The server(s) were not responsive to queries over UDP. (2001:19f0:5001:34c4:7777:7777:7777:7777, 2001:19f0:6801:5b:7777:7777:7777:7777, 2001:19f0:6c01:2cf9:7777:7777:7777:7777, 2001:19f0:7402:4cf:7777:7777:7777:7777)
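
If you want to check this from your own machine rather than relying on dnsviz, something like the following might help. It is only a minimal sketch using the Python standard library: it sends one hand-built DNS query over UDP to a given nameserver address and reports whether anything came back. The function names (`build_query`, `probe_udp`) and the 3 second timeout are my own choices, not anything Let's Encrypt uses, and the machine running it needs working IPv6 connectivity for the IPv6 results to mean anything.

```python
import socket
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (RFC 1035 layout) for `name`.

    qtype 1 = A, 28 = AAAA. Flags are all zero, so this is a plain
    non-recursive query, suitable for sending to an authoritative server.
    """
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Root label terminator, then QTYPE and QCLASS (1 = IN).
    return header + qname + b"\x00" + struct.pack("!HH", qtype, 1)


def probe_udp(server_ip: str, name: str, timeout: float = 3.0) -> str:
    """Send one UDP query to `server_ip` port 53 and describe what happened."""
    family = socket.AF_INET6 if ":" in server_ip else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(build_query(name), (server_ip, 53))
            data, _ = sock.recvfrom(4096)
    except socket.timeout:
        return "no reply (timed out)"
    except OSError as exc:
        # e.g. no IPv6 route from the machine running the probe
        return f"could not query: {exc}"
    if len(data) < 4:
        return "reply too short to parse"
    # The RCODE is the low 4 bits of the second flags byte.
    return f"replied, rcode={data[3] & 0x0F}"
```

Calling `probe_udp("2001:19f0:5001:34c4:7777:7777:7777:7777", "www.tal09.clients.merrehill.co.uk")` for each of the four addresses listed above, and again for the servers' IPv4 addresses, should show whether only the IPv6 side stays silent.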

Cheers Peter, That looks like a great place to dig. Thanks very much for that.
It's much appreciated.

I have spoken to our DNS supplier and they indicate that when there are no AAAA records, the request should fall back to the A record. The Let's Encrypt documentation does state that AAAA records are used in preference to A records, though if a domain is not intended to use AAAA records, those should be removed to prevent issues.

I appreciate that the nameservers are accepting IPv6 traffic, though as there are no AAAA records for the domain, I'm a little confused why the secondary validation is then trying to use them and failing, when it should be falling back to using the A records.

Nameservers can fail for a wide range of reasons: the domain not being listed in their records, the server being unavailable, or other errors. I'm wondering if there is something that either Let's Encrypt or the nameservers are not doing that would make the lookup fall back to the A record.

If anyone in the community could help shed any further light on what may be happening, it would be much appreciated.

Kind regards,
Andy

The issue isn't the lack of AAAA records on your name, it's that there are AAAA records on your DNS servers' names, but those DNS servers aren't actually responding on those IPs.

I agree that any nameserver error should result in falling back to a working IP, but it wouldn't shock me if the extra time taken to do so was causing the intermittent issues you were seeing.
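
To make the timing argument a bit more concrete, here is a rough sketch of how silent addresses can eat into a lookup's time budget. Everything numeric in it is an assumption for illustration only: I don't know the real per-try timeout, overall budget, or server selection order that Let's Encrypt's resolvers use, and real resolvers don't try addresses in a uniformly random order. The helper names are made up too.

```python
from fractions import Fraction
from itertools import permutations


def time_to_answer_ms(order, timeout_ms, rtt_ms):
    """Time until the first responsive address answers.

    `order` is a sequence of booleans tried one after another:
    True = the address answers, False = the address stays silent.
    Returns None if no address in `order` answers at all.
    """
    elapsed = 0
    for responsive in order:
        if responsive:
            return elapsed + rtt_ms
        elapsed += timeout_ms  # a silent address costs a full timeout
    return None


def fraction_over_budget(n_dead, n_live, timeout_ms, rtt_ms, budget_ms):
    """Fraction of try-orders whose lookup exceeds `budget_ms`.

    Assumes (hypothetically) that every ordering of the addresses is
    equally likely, which real resolvers do not guarantee.
    """
    addresses = [False] * n_dead + [True] * n_live
    total = over = 0
    for order in permutations(addresses):
        total += 1
        elapsed = time_to_answer_ms(order, timeout_ms, rtt_ms)
        if elapsed is None or elapsed > budget_ms:
            over += 1
    return Fraction(over, total)
```

With four silent IPv6 addresses, four working IPv4 addresses (the IPv4 count is a guess), a 1000 ms per-try timeout, a 50 ms round trip and a 2500 ms budget, `fraction_over_budget(4, 4, 1000, 50, 2500)` gives 1/14. The exact figure means nothing, but it shows how the same request can mostly succeed and occasionally fail purely on which addresses happen to be tried first.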

That's correct. But when no AAAA record exists, your DNS server should respond with a normal not-found answer. It was responding with a SERVFAIL instead. That should never happen.
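
For anyone unsure what the difference looks like on the wire, the sketch below reads the response code out of a raw DNS response header (layout per RFC 1035). It is only an illustration of the distinction being made here, not how Let's Encrypt's validation code works, and the `classify` name and its wording of the results are my own.

```python
import struct

# Response codes from the DNS header (low 4 bits of the flags field).
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def classify(response: bytes) -> str:
    """Describe a raw DNS response based on its 12-byte header only."""
    if len(response) < 12:
        raise ValueError("a DNS message needs at least a 12-byte header")
    _id, flags, _qdcount, ancount, _nscount, _arcount = struct.unpack(
        "!HHHHHH", response[:12]
    )
    rcode = flags & 0x000F
    if rcode == 0:
        # The name exists. An empty answer section simply means there is
        # no record of the requested type (often called NODATA).
        return "NOERROR with answers" if ancount else "NODATA (NOERROR, empty answer)"
    return RCODES.get(rcode, f"RCODE {rcode}")
```

A name that has an A record but no AAAA record would be expected to give the NODATA result for an AAAA query. SERVFAIL means the lookup itself failed, which is why the validation treats it as an error rather than as "no IPv6 here, use the A record".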

I cannot reproduce this using the unboundtest.com test site right now, but the dnsviz report still shows the problems Peter pointed out. unboundtest uses a similar method to the Let's Encrypt servers to look up CAA, A, AAAA, and TXT records.

Thank you very much for your input on this Peter and Mike.
Your time and knowledge are truly appreciated.
Andy

Hi all,

For completeness, in case anyone else finds and reads this thread:

After discussions with our name server provider, there was little feedback or support on the problem, so we took the decision to move provider and the problem has since stopped.

Many thanks for the friendly support.

Please let your previous provider know why they've lost you as a customer. Hopefully, if enough customers give such feedback, they may learn from it.

I think changing provider is indeed a very good solution when confronted with dysfunctional/unprofessional service providers.
