Failed because TXT record not found but it actually exists

My domain is:
scichat.esss.lu.se
I ran this command:
certbot --text --agree-tos --non-interactive certonly --rsa-key-size 4096 -a dns-rfc2136 --cert-name 'scichat.esss.lu.se' -d 'scichat.esss.lu.se' --dns-rfc2136-credentials /etc/letsencrypt/dns-rfc2136.ini --dns-rfc2136-propagation-seconds 120

It produced this output:
Certbot failed to authenticate some domains (authenticator: dns-rfc2136). The Certificate Authority reported these problems:
Domain: scichat.esss.lu.se
Type: unauthorized
Detail: During secondary validation: No TXT record found at _acme-challenge.scichat.esss.lu.se

During the propagation wait, the TXT record was visible when I ran:
dig TXT _acme-challenge.scichat.esss.lu.se @8.8.8.8
dig TXT _acme-challenge.scichat.esss.lu.se @1.1.1.1
;; ANSWER SECTION:
_acme-challenge.scichat.esss.lu.se. 120 IN TXT "f_sfD..."

I recently moved our nameserver to another server and updated the new server's IP in the .ini configuration. Validation has been failing since the migration.

My web server is (include version):
Haproxy 2.4

The operating system my web server runs on is (include version):
Ubuntu 22.04

My hosting provider, if applicable, is:

I can login to a root shell on my machine (yes or no, or I don't know):
yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
no

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
certbot 1.21.0

This part of the error means the primary validation server, located in the US, did successfully fetch the TXT RR, but at least one of the secondary validation servers could not. There are four or five of those (I'm not sure how many there currently are), and most of them are located outside North America.

With challenges other than dns-01, this usually means some kind of firewall is doing geographical blocking, but with the dns-01 challenge it might also be a global propagation issue.

It might also be that some randomised querying is going on. E.g., when walking down from the DNS root to your TXT RR, one validation client might by chance end up at the correct nameserver, while a different validation client might follow a different nameserver route and end up at an incorrectly configured nameserver.

I see you have ns1.ess.eu. and sunic.sunet.se. listed as nameservers. Do both nameservers get the correct update?

Sometimes it's sufficient to just increase --dns-rfc2136-propagation-seconds to e.g. 300 seconds.
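
To answer the "do both nameservers get the update" question directly, a small shell sketch like the one below queries each authoritative nameserver on its own and reports which ones already serve the challenge record. The function name check_txt is made up for illustration, the nameserver list is passed in by hand rather than discovered, and it assumes dig is installed.

```shell
#!/bin/sh
# Sketch only: check_txt is a hypothetical helper, not part of certbot.
# Usage: check_txt NAME SERVER...
# Prints "<server> ok" or "<server> MISSING" for each server and
# returns non-zero if at least one server has no TXT answer.
check_txt() {
    ct_name=$1
    shift
    ct_rc=0
    for ct_ns in "$@"; do
        # +short prints only the record data; +norecurse asks the server
        # for its own data. Lines starting with ';' are dig diagnostics.
        ct_answer=$(dig +short +norecurse TXT "$ct_name" "@$ct_ns" | grep -v '^;' || true)
        if [ -n "$ct_answer" ]; then
            echo "$ct_ns ok"
        else
            echo "$ct_ns MISSING"
            ct_rc=1
        fi
    done
    return "$ct_rc"
}

# Example call (needs network access):
# check_txt _acme-challenge.scichat.esss.lu.se ns1.ess.eu sunic.sunet.se
```

If any server is reported as MISSING while certbot is waiting, a validation server that happens to ask that nameserver will not find the record, however long the wait is.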

Yes, they both got it.
I also checked on https://dnschecker.org/. Most countries got the update.
I can run the command now with a very long propagation time so you can see.

"Most" might not be enough. I believe at least one validation location is hosted in Asia for example.

I don't see any TXT RR, yet.

Now it is running
Requesting a certificate for scichat.esss.lu.se
Waiting 1200 seconds for DNS changes to propagate

sunic.sunet.se. is taking its time to get the value, though.

Right now only ns1.ess.eu has it.

dig +noall +answer TXT _acme-challenge.scichat.esss.lu.se @sunic.sunet.se

dig +noall +answer TXT _acme-challenge.scichat.esss.lu.se @ns1.ess.eu
_acme-challenge.scichat.esss.lu.se. 120 IN TXT  "2NSalHJQe9UNPOAZnh38zqLgFJuPg40K5NMbt032F5Y"

And only a minority (9 out of 29) of the resolvers on dnschecker.org can resolve the TXT RR.

Yes...
What reason could it be?

sunic.sunet.se did not get updated. With "taking its time" I meant it didn't have the TXT RR at that moment. Still doesn't.

That also results in VERY erratic behaviour at dnschecker.org: sometimes it's 9 out of 29, sometimes it's 11/29, and then it goes down to just 9 again.

But it is updated when I query from the command line.

But not from here:

osiris@erazer ~ $ dig @sunic.sunet.se. _acme-challenge.scichat.esss.lu.se TXT

; <<>> DiG 9.16.42 <<>> @sunic.sunet.se. _acme-challenge.scichat.esss.lu.se TXT
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48897
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;_acme-challenge.scichat.esss.lu.se. IN	TXT

;; AUTHORITY SECTION:
esss.lu.se.		10800	IN	SOA	esss.lu.se. registry.esss.se. 2023096441 3600 1800 604800 10800

;; Query time: 31 msec
;; SERVER: 2001:6b0:7::2#53(2001:6b0:7::2)
;; WHEN: Tue Dec 10 15:31:39 CET 2024
;; MSG SIZE  rcvd: 113

osiris@erazer ~ $ 

Please make sure sunic.sunet.se can be used to resolve the TXT RR from outside your network too.

Ah, wait: it's the IPv6 address that's failing. Using IPv4 it does work.

Make sure the IPv6 address is correct. Is 2001:6b0:7::2 the correct IPv6 address for your sunic.sunet.se DNS server? Or is there perhaps some setting in your DNS software that makes it behave differently between IPv4 and IPv6? The reverse DNS pointer of 2001:6b0:7::2 does resolve to sunic.sunet.se, although that doesn't have to mean anything; it could be a relic.

Yes, the IPv6 address is correct. But I only notify the IPv4 address, not the IPv6 one. sunic.sunet.se is the secondary DNS server.

Well, if both IPv6 and IPv4 end up at the exact same DNS server, that shouldn't matter, I think.

That said, the fact is that over IPv6 the nameserver thinks it does not have that TXT RR, while over IPv4 it does have it.

So you mean we should enable IPv6? But it worked before with the old server, which was set up the same way.

Something weird is going on: I query dig txt _acme-challenge.scichat.esss.lu.se @sunic.sunet.se. once and receive no answer section; I query again a second later and get an answer. I query again and again, and I seem to get an answer or no answer at random. Do you have any kind of load balancer behind sunic.sunet.se/192.36.125.2?
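
One way to make that flapping measurable is to repeat the same query a number of times and count how often an answer comes back. A rough sketch, assuming dig is available (sample_answers is a hypothetical helper name, not an existing tool):

```shell
#!/bin/sh
# Sketch only: sample_answers is a hypothetical helper.
# Usage: sample_answers NAME SERVER COUNT
# Sends COUNT identical TXT queries to SERVER and reports how many of
# them came back with at least one record.
sample_answers() {
    sa_hits=0
    sa_i=0
    while [ "$sa_i" -lt "$3" ]; do
        # Drop dig diagnostics (lines starting with ';'), keep record data.
        sa_ans=$(dig +short +norecurse TXT "$1" "@$2" | grep -v '^;' || true)
        if [ -n "$sa_ans" ]; then
            sa_hits=$((sa_hits + 1))
        fi
        sa_i=$((sa_i + 1))
    done
    echo "$sa_hits of $3 queries returned a TXT record"
}

# Example call (needs network access):
# sample_answers _acme-challenge.scichat.esss.lu.se sunic.sunet.se 20
```

A result that is neither 0 nor the full count suggests that several backends with different zone data are answering behind the same address.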

I think you should check your setup and configuration. I don't know enough of your setup to give you detailed advice.

The only other thing I notice is that when I query that nameserver for _acme-challenge.scichat.esss.lu.se. IN A over IPv6, it returns an IP address (from the wildcard *.scichat.esss.lu.se. IN A, which seems to be in place), but over IPv4 it returns NOERROR without an answer.

So there really is something different going on with that server between IPv4 and IPv6, but I don't know what.

@Nekit You can use -4 and -6 to differentiate between IPv4 and IPv6. Using just one protocol, I get consistent results.
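
For example, a short loop can run the same query once per protocol and print the two results side by side. This is only a sketch: compare_families is an invented helper name, and it assumes the machine running it has working IPv4 and IPv6 connectivity.

```shell
#!/bin/sh
# Sketch only: compare_families is a hypothetical helper.
# Usage: compare_families NAME SERVER
# Queries SERVER for the TXT record of NAME once over IPv4 (dig -4)
# and once over IPv6 (dig -6) and prints one result line per protocol.
compare_families() {
    cf_name=$1
    cf_server=$2
    for cf_fam in 4 6; do
        # Drop dig diagnostics (lines starting with ';'), keep record data.
        cf_ans=$(dig "-$cf_fam" +short +norecurse TXT "$cf_name" "@$cf_server" | grep -v '^;' || true)
        echo "IPv$cf_fam: ${cf_ans:-<no answer>}"
    done
}

# Example call (needs network access):
# compare_families _acme-challenge.scichat.esss.lu.se sunic.sunet.se
```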

I query only over IPv4 (I don't have IPv6 at home).

Hmm, weird that you're getting random answers, I couldn't reproduce that here.

@limanzhang Are you ABSOLUTELY sure both IPv4 and IPv6 are the same nameservers? Because when I query them both for their version, I'm getting:

osiris@erazer ~ $ nslookup -q=txt -class=CHAOS version.bind 2001:6b0:7::2
Server:		2001:6b0:7::2
Address:	2001:6b0:7::2#53

Non-authoritative answer:
version.bind	text = "sunic node1"

Authoritative answers can be found from:

osiris@erazer ~ $ 
osiris@erazer ~ $ nslookup -q=txt -class=CHAOS version.bind 192.36.125.2
Server:		192.36.125.2
Address:	192.36.125.2#53

Non-authoritative answer:
version.bind	text = "sunic node2"

Authoritative answers can be found from:

osiris@erazer ~ $ 

Notice "node1" vs "node2".
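
The same comparison can be scripted with dig instead of nslookup. The sketch below asks each given address for version.bind in the CHAOS class and flags a mismatch. node_ids is a made-up helper name, and it assumes the server answers version.bind at all, as it does in the output above; many servers are configured not to.

```shell
#!/bin/sh
# Sketch only: node_ids is a hypothetical helper.
# Usage: node_ids ADDRESS...
# Prints the version.bind string reported by each address and returns
# non-zero if the strings are not all identical.
node_ids() {
    ni_seen=0
    ni_first=
    ni_rc=0
    for ni_addr in "$@"; do
        # CH TXT version.bind is the same query nslookup sends with
        # -class=CHAOS -q=txt; lines starting with ';' are diagnostics.
        ni_id=$(dig +short CH TXT version.bind "@$ni_addr" | grep -v '^;' || true)
        echo "$ni_addr: ${ni_id:-<no answer>}"
        if [ "$ni_seen" -eq 0 ]; then
            ni_first=$ni_id
            ni_seen=1
        elif [ "$ni_id" != "$ni_first" ]; then
            ni_rc=1
        fi
    done
    return "$ni_rc"
}

# Example call (needs network access):
# node_ids 2001:6b0:7::2 192.36.125.2
```

Different strings only show that different backends answered; they do not say by themselves which backend holds stale zone data.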

It works this time. I got the certificate.
I had the same problem with some other zones as well. It normally works when I use a very long propagation time, but sometimes it just doesn't work no matter how long the wait is.

The sunic servers are a cluster with many nodes. And yes, I noticed that the nodes do not synchronize very quickly.