DNS problem: NXDOMAIN looking up TXT for _acme-challenge

Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. crt.sh | example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.

My domain is:
ha.mynym.us

I ran this command:
I am using the Home Assistant Let's Encrypt Add-On. It runs certbot when launched, but I don't have the exact command.

It produced this output:
s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
cont-init: info: running /etc/cont-init.d/file-structure.sh
cont-init: info: /etc/cont-init.d/file-structure.sh exited 0
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service legacy-services: starting
services-up: info: copying legacy longrun lets-encrypt (no readiness notification)
s6-rc: info: service legacy-services successfully started
[16:34:58] INFO: Selected DNS Provider: dns-linode
[16:34:58] INFO: Use propagation seconds: 60
[16:34:58] INFO: Detecting existing certificate type for ha.mynym.us
Saving debug log to /var/log/letsencrypt/letsencrypt.log
[16:35:03] INFO: Existing certificate using 'rsa' key type.
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Renewing an existing certificate for ha.mynym.us
Waiting 120 seconds for DNS changes to propagate
Certbot failed to authenticate some domains (authenticator: dns-linode). The Certificate Authority reported these problems:
Domain: ha.mynym.us
Type: dns
Detail: DNS problem: NXDOMAIN looking up TXT for _acme-challenge.ha.mynym.us - check that a DNS record exists for this domain
Hint: The Certificate Authority failed to verify the DNS TXT records created by --dns-linode. Ensure the above domains are hosted by this DNS provider, or try increasing --dns-linode-propagation-seconds (currently 120 seconds).
Some challenges have failed.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.
s6-rc: info: service legacy-services: stopping
s6-rc: info: service legacy-services successfully stopped
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped

My web server is (include version):
Home Assistant Core 2025.5.3

The operating system my web server runs on is (include version):
Home Assistant OS 15.2

My hosting provider, if applicable, is:
not applicable

I can login to a root shell on my machine (yes or no, or I don't know):
I'm not sure I can access the Let's Encrypt add-on from the shell because it is containerized.

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
no

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
Let's Encrypt Addon 5.4.9

Fortunately, I noticed this renewal failed before the certificate actually expired. There have been 4 failed attempts, with the first 2 days ago and the other 3 today. During the most recent attempt, I was logged into Linode and saw the DNS entry update immediately. I was also able to get it from 8.8.8.8 using nslookup long before the error occurred, and the entry was automatically deleted, which is normal, presumably after the error occurred (there were about 2 minutes between the creation and deletion events logged at Linode). The docs for this add-on indicate there is a configuration item propagation_seconds: 60 that is apparently a default based on some of the INFO output, as I don't have that item in my config, so I am unsure why it is waiting 120 seconds later when renewing. Glancing at the commits for addons/letsencrypt in the github docs link above, I don't see any commits that sound like they would change this behavior from when it was working (it has been working for as long as the domain has had a cert, I never used a method other than the Linode API for this particular domain).

Welcome @mynym

I don't see a "smoking gun" but I can explain some of what you see.

Certbot has a Linode "plugin" for its DNS challenges. Its default propagation wait time is 120 seconds, so that's where that number comes from. Trying a longer wait time is worthwhile: try 300 or 600 seconds as an experiment. I'm not sure how you modify this setting from the HA setup, but see the certbot-dns-linode documentation.

I don't know why HA shows a 60s wait time. The wait that is important is after the Certbot plugin adds the TXT record and before Certbot sends the cert request to Let's Encrypt server. HA isn't involved in that sequence. That's a question for the HA people :slight_smile:

Your DNS config shows a problem at: ha.mynym.us | DNSViz

That said, I don't see how that problem could cause a repeated NXDOMAIN. Still, when we see unexpected behavior from DNS it often helps just to clean up the DNS config.

I'm not sure exactly how you fix that, but sometimes disabling DNSSEC and re-enabling it does the trick. Otherwise you may need to ask Linode about it.

That's not unreasonable, but I'd be interested to know the result of using https://unboundtest.com to check the _acme-challenge... TXT record. That site uses a DNS query method similar to Let's Encrypt's.

I realize none of these are very definitive but at the moment that's all I have. Maybe something above will lead to more info and the root cause.

I just want to point out something here...

The 60 second value is HA's default global propagation config. Ideally it would be passing that value on to Certbot. I looked through the HA source, and it doesn't seem to leverage the propagation seconds (or anything else) for Linode. Linode falls under the "all other dns providers" catchall here - addons/letsencrypt/rootfs/etc/services.d/lets-encrypt/run at e756c7fc2b66795a15eef5b4dd0c3431b3e99e28 · home-assistant/addons · GitHub

Other providers in explicit sections have the seconds sent correctly.

The 120s timeout on Certbot's linode plugin has been around for many years.

TL;DR: this has always been happening, you just didn't realize it.

TBH, I suspected this was the case, but does this:

also mean that adding something like propagation_seconds: 300 won't have an effect here? I'm trying to prioritize between trying different settings there or trying older plugin versions if I can find them in my backups. Going to open a ticket with Linode first given the DNSSEC issue even though it seems unlikely to be relevant (unless unboundtest indicates otherwise, but I figured I'd try it while trying something else).

Changing the HA delay won't help. But, if you can modify the Certbot Linode plugin delay that is worth a try. I posted link for those docs earlier. I just don't know where those are when HA "wraps" Certbot like it does.

I just tried the unbound test after seeing the result with 8.8.8.8 on nslookup before the 120 seconds passed. Logs are here:
https://unboundtest.com/m/TXT/_acme-challenge.mynym.us/MUX547X6

I don't think that successfully found it, either, because when I use the browser find in page function to search the logs for the key that was provided without quotes, it isn't found.

Unfortunately, I don't think I can easily change the delay sent to certbot in this case because, as I understand it, that would require either accessing the container from another addon and editing it, or replacing the addon container with a custom one, and I'm not sure where to begin to find steps for doing either of those things.

I don't have edit yet, but I searched for NXDOMAIN and found that, confirming unboundtest wasn't able to resolve it:

Jun 18 16:06:26 unbound[25615:0] info: response for _acme-challenge.mynym.us. TXT IN
Jun 18 16:06:26 unbound[25615:0] info: reply from <mynym.us.> 2600:14c0:7::2#53
Jun 18 16:06:26 unbound[25615:0] info: incoming scrubbed packet: ;; ->>HEADER<<- opcode: QUERY, rcode: NXDOMAIN, id: 0
;; flags: qr aa ; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
_acme-challenge.mynym.us. IN TXT

;; ANSWER SECTION:

;; AUTHORITY SECTION:
mynym.us. 0 IN SOA ns1.linode.com. domain_admin.mynym.us. 2021000092 14400 14400 1209600 86400

;; ADDITIONAL SECTION:
;; MSG SIZE rcvd: 105

Jun 18 16:06:26 unbound[25615:0] debug: iter_handle processing q with state QUERY RESPONSE STATE
Jun 18 16:06:26 unbound[25615:0] info: query response was NXDOMAIN ANSWER

No, it didn't but based on your first post the query should be for
_acme-challenge.ha.mynym.us

You looked up _acme-challenge.mynym.us, which is missing the ha label.

If the top-most section of unbound's page says NXDOMAIN then it did not find it.

If it does find it, you will see the TXT value clearly in the top ANSWER section.
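
The name mix-up above is easy to make by hand. As a minimal sketch (not something taken from Certbot or the HA add-on), the challenge name can be derived from the certificate name and then queried at each authoritative nameserver directly with dig. The helper names challenge_name and check_authoritative are made up for this illustration, and the zone has to be supplied by hand.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Certbot or the HA add-on.

# Print the DNS-01 challenge name for a certificate name.
# A leading "*." is dropped because a wildcard is validated at its base name.
challenge_name() {
    printf '_acme-challenge.%s\n' "${1#\*.}"
}

# Ask every authoritative nameserver of a zone for the challenge TXT record.
# Needs dig and network access; nothing runs until the function is called.
check_authoritative() {
    zone="$1"
    name="$(challenge_name "$2")"
    for ns in $(dig +short NS "$zone"); do
        answer="$(dig +short +norecurse TXT "$name" "@$ns")"
        printf '%s: %s\n' "$ns" "${answer:-no TXT record returned}"
    done
}
```

Calling check_authoritative mynym.us ha.mynym.us while Certbot is waiting would query _acme-challenge.ha.mynym.us at each nameserver, which sidesteps any caching in public resolvers.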

Yes. You'd have to file an issue against HA to update their code.

Lines 31-35 define propagation seconds

Specific providers use this, for example Cloudflare on lines 54-65. Note how PROPAGATION_SECONDS is used here:

ACME_ARGUMENTS+=("--${DNS_PROVIDER}" "--${DNS_PROVIDER}-credentials" "/data/dnsapikey" "--${DNS_PROVIDER}-propagation-seconds" "${PROPAGATION_SECONDS}")

Linode does not have a specific section, so it just falls through to the default on line 320, which does not use PROPAGATION_SECONDS at all:

        ACME_ARGUMENTS+=("--${DNS_PROVIDER}" "--${DNS_PROVIDER}-credentials" "/data/dnsapikey")
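
For anyone filing that issue, here is a rough sketch of what the catch-all would need to produce. It is not the add-on's actual code and has not been tested inside HA. The function name dns_arguments is hypothetical, and it assumes every provider that reaches the catch-all accepts a propagation-seconds flag, which I have not verified.

```shell
#!/bin/sh
# Hypothetical sketch only; not the add-on's actual code and untested inside HA.

# Print the Certbot DNS arguments for a provider, one per line, including the
# propagation wait that the catch-all branch currently leaves out.
# Assumption: the provider's plugin accepts --<provider>-propagation-seconds.
dns_arguments() {
    provider="$1"
    seconds="$2"
    printf '%s\n' \
        "--${provider}" \
        "--${provider}-credentials" "/data/dnsapikey" \
        "--${provider}-propagation-seconds" "${seconds}"
}

dns_arguments "dns-linode" "300"
```

In the add-on's own script the equivalent would be appending "--${DNS_PROVIDER}-propagation-seconds" "${PROPAGATION_SECONDS}" to the catch-all ACME_ARGUMENTS line, the same way the Cloudflare section does.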

OK, so I found an old backup of add-on version 5.4.4 and tried it. It worked, but this time the record also showed up at unboundtest. Unfortunately, this means I don't know whether the newer add-on versions cause the problem, or whether the unboundtest check would also have failed earlier had I not mistyped the domain, with Linode having fixed something in the meantime. It could also be a sporadic or intermittent issue, as I noticed that 8.8.8.8 and 9.9.9.9 still weren't seeing the record when 1.1.1.1 and unboundtest saw it.
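
The resolver comparison I did by hand could be scripted roughly like this. It is only a sketch: describe_answer and compare_resolvers are names I made up, and an empty dig answer can also mean a timeout or a server failure, not just a missing record.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; dig and network access are only
# needed when compare_resolvers is actually called.

# Turn one resolver's answer into a single status line.
describe_answer() {
    if [ -n "$2" ]; then
        printf '%s: found %s\n' "$1" "$2"
    else
        printf '%s: no TXT record returned\n' "$1"
    fi
}

# Query the same TXT name at each resolver in turn.
compare_resolvers() {
    name="$1"
    shift
    for resolver in "$@"; do
        describe_answer "$resolver" "$(dig +short TXT "$name" "@$resolver")"
    done
}
```

Running compare_resolvers _acme-challenge.ha.mynym.us 8.8.8.8 9.9.9.9 1.1.1.1 during the propagation wait would print one line per resolver, which makes any disagreement between them easy to spot.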

FWIW, Linode hasn't responded to my ticket, but that doesn't mean the issue isn't transient, either.

In theory, I could file a PR, but I don't have an HA dev environment in which to test that (if I did, the previously mentioned roadblocks likely also wouldn't have applied to me). I think I'll wait and see if the same thing happens with any of my other Linode plugin wildcards on other systems first and foremost.

5.4.5

  • Update certbot-dns-directadmin to 1.0.15

Dumb luck I had a 5.4.4 backup. Any chance that could be relevant?