Acme Challenge fails due to one nameserver out of sync

My domain is: not relevant to the question or troubleshooting

I ran this command: sudo certbot certonly --manual --preferred-challenges=dns --email [redacted] --server https://acme-v02.api.letsencrypt.org/directory --agree-tos -d *.example.com

It produced this output: It generates an _acme-challenge TXT record for me to create.

My web server is (include version): N/A

The operating system my web server runs on is (include version): N/A

My hosting provider, if applicable, is: N/A

I can login to a root shell on my machine (yes or no, or I don't know): yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): cpanel

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): 0.31.0

I am running the certbot in manual mode with dns-challenge. It works as expected and generates a key for me to add as a TXT record in my DNS.

The issue that I'm experiencing is that one of the nameservers is not updating within the TTL. When I check the DNS records using MxToolbox or whatsmydns, ns2 and ns3 return the valid TXT value for the _acme-challenge record, but ns1 does not.
I've run the command 3 times now, and each time ns2 and ns3 show the correct value while ns1 still shows the value from the first try.

I know this is not a certbot issue but the nameservers. My questions are:

Is there a way to retry validation without it generating a new key? This would allow me to wait to retry until I've verified that the nameservers are all in sync.

Is there a way to tell the certbot which nameserver to query?

Is there a timeout for the certbot process? I could run the command, get the new code, and then just wait until I've verified that the nameservers are all synced with the correct code. However, my concern is that if this takes over an hour to complete, will the certbot process time out?

I appreciate any assistance or guidance with this issue.

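Is there a way to retry validation without generating a new key? No.

Is there a way to tell Certbot which nameserver to query? No. Let's Encrypt queries the authoritative nameservers from multiple vantage points at its discretion.

Is there a timeout for the Certbot process? There shouldn't be.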

What plugins are you using to automate the DNS-01 challenge? Some contain an option to sleep for a few seconds. There are often tutorials on how to enable this for different plugins.


This is a DNS sync issue.
You need to fix NS1.
OR
Remove it from use.


You might want to update to a newer Certbot version: Certbot 3.0.1 Release


Thanks for the response! I'm not using any plugins, as I'm trying to run the process manually. There is no web server present on the server that I am running certbot on. The real issue is with the single nameserver, but without access to troubleshoot it I am looking for other options. I think I may be stuck waiting for the slow nameserver to sync up and hoping the script doesn't time out by then.


I will definitely update! Unfortunately this will not solve my issue this time. Thank you for the input.


@rg305's advice is sound and reasonable.

(emphasis is mine)


With --manual Certbot should stop after it displays the TXT record value. You said you apply it and then continue.

But, you just can't continue until all of your NS can reply properly. It sounds like you are proceeding even when you know one of them is wrong.


You need to wait until everything is synced.

Depending on your DNS provider, attempting to query the DNS might load and cache the bad value into their system. Make sure the TTLs are set incredibly low (max 60s, less is preferable) and wait TTL+2 seconds before you try to load a value. TTLs can be meaningless, because they only exist on the Nameserver layer - but some DNS providers have caches on their application/database layer that don't respect the TTL, so you might get a bad value stuck within your DNS provider even though the external world sees a short TTL.

When you are ready to automate for renewals, use a sleep command for whatever timeout is necessary after updating the DNS and before checking it (or triggering the authorization, but you should ideally check the DNS first).

There are a lot of posts here on these topics.
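For anyone automating this later, here is a minimal sketch of the "sleep, then check the DNS first" idea as the polling part of a script passed to certbot's --manual-auth-hook. The nameserver names, INTERVAL, and MAX_TRIES are placeholders I made up, and the step that actually creates the TXT record is left out because it depends on your DNS provider. It assumes dig is installed.

```shell
#!/bin/sh
# Polling logic for a certbot --manual-auth-hook script (sketch only).
# NAMESERVERS, INTERVAL and MAX_TRIES are hypothetical settings; use your own.
NAMESERVERS="${NAMESERVERS:-ns1.example.net ns2.example.net ns3.example.net}"
INTERVAL="${INTERVAL:-30}"     # seconds to sleep between rounds
MAX_TRIES="${MAX_TRIES:-40}"   # give up after this many rounds

# Return 0 if nameserver $2 currently serves TXT value $3 for record $1.
ns_has_value() {
    # +short prints only the record data; TXT data comes back in double quotes.
    dig +short TXT "$1" "@$2" | tr -d '"' | grep -qxF -- "$3"
}

# Poll until every nameserver serves the value, or fail after MAX_TRIES rounds.
wait_for_txt() {
    record=$1; value=$2; tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        pending=""
        for ns in $NAMESERVERS; do
            ns_has_value "$record" "$ns" "$value" || pending="$pending $ns"
        done
        [ -z "$pending" ] && return 0
        echo "waiting on:$pending" >&2
        tries=$((tries + 1))
        sleep "$INTERVAL"
    done
    return 1
}

# (Provider-specific step that creates the TXT record would go here.)

# Certbot exports CERTBOT_DOMAIN and CERTBOT_VALIDATION to auth hooks.
if [ -n "${CERTBOT_VALIDATION:-}" ]; then
    wait_for_txt "_acme-challenge.${CERTBOT_DOMAIN}" "$CERTBOT_VALIDATION"
fi
```

If the hook exits non-zero because a nameserver never caught up, at least you find out before Let's Encrypt does. It only checks the nameservers you list, so it cannot catch caching inside the provider that those servers do not expose.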


The problem was that only one nameserver was getting hung up. I was looking for a solution for that, but the best option was to just remove that nameserver from the DNS. If I hadn't run the mxtoolbox check on the TXT record multiple times I wouldn't have caught it, because the first couple of checks only returned the nameservers that were updating correctly. The only one that LetsEncrypt seemed to hit was the one that was hanging.


A better lookup tool for that purpose is: https://unboundtest.com

Yeah, it is a bit of luck of the draw. LE checks from 5 global points (today) and each can choose differently. Today, at least 4 of 5 must succeed. Problems from points 2-5 should identify as "Secondary" validation problems in the error message.

You could query each one directly too like:

dig TXT _acme-challenge.example.com @NameServer1.com.
dig TXT _acme-challenge.example.com @NameServer2.net.
and so on

Use your actual DNS server names after the @
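If you have more than a couple of nameservers, the same per-server dig query can be wrapped in a loop. This is only a sketch: check_txt_synced is a name I made up, the nameserver names in the usage line are placeholders, and it assumes dig is installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every listed nameserver returns the
# expected TXT value.
# Usage: check_txt_synced RECORD EXPECTED_VALUE NS1 [NS2 ...]
check_txt_synced() {
    name=$1; expected=$2; shift 2
    status=0
    for ns in "$@"; do
        # +short prints only the record data; TXT data comes back in double
        # quotes, so strip them before comparing whole lines.
        if dig +short TXT "$name" "@$ns" | tr -d '"' | grep -qxF -- "$expected"; then
            echo "OK     $ns"
        else
            echo "STALE  $ns"
            status=1
        fi
    done
    return "$status"
}
```

Call it as `check_txt_synced _acme-challenge.example.com "value-from-certbot" ns1.example.net ns2.example.net ns3.example.net` and only press Enter in Certbot once every line says OK.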

