Certbot renewal dns-01 challenge failure, During secondary validation: DNS problem

My domain is: https://dashboard.panorama9.com/

I ran this command:
certbot certonly --dns-rfc2136 --dns-rfc2136-credentials -v

It produced this output:
Certbot failed to authenticate some domains (authenticator: dns-rfc2136). The Certificate Authority reported these problems:
Domain: dashboard.panorama9.com
Type: dns
Detail: During secondary validation: DNS problem: NXDOMAIN looking up TXT for _acme-challenge.dashboard.panorama9.com - check that a DNS record exists for this domain

The version of my client is certbot 1.21.0

We have two BIND name servers, ns1.panorama9.com and ns2.panorama9.com.

While the renewal is in progress, I can confirm with dig that the TXT record is there, whether I query our primary DNS, our secondary DNS, or Google's public DNS:
dig @ns1.panorama9.com _acme-challenge.dashboard.panorama9.com TXT +short

But I still get the output above saying that the secondary validation failed.
Is this related to Multi-Perspective Validation, i.e. did one or more of the validations from other regions fail?

I have used --dns-rfc2136-propagation-seconds to increase the wait to as much as 3 hours, and I still get the same error.

Hi @ariah, and welcome to the LE community forum :)

Probably (see secondary validation):

Thank you @rg305

Right, it says NXDOMAIN, but when I dig the TXT record from any server I get NOERROR alongside the value of the record. Is the certificate authority querying some DNS servers that the record has not yet propagated to? As I mentioned before, I did wait for 3 hours, and I can't think of any way to debug this further.

Do you have the challenge token there at the moment? I'm currently seeing NXDOMAIN as well.

@petercooperjr it does now

dig @ns1.panorama9.com _acme-challenge.dashboard.panorama9.com TXT +short
"qrFzGWwfKPr5E1FrvOlr6n4BoQ5cUFEq6mMHqoVJ8d4"

Not from here

$ dig @ns1.panorama9.com _acme-challenge.dashboard.panorama9.com TXT +norecurse

; <<>> DiG 9.16.22-RH <<>> @ns1.panorama9.com _acme-challenge.dashboard.panorama9.com TXT +norecurse
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 59051
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

And I don't know if this is the root cause, but your DNS delegation isn't right either:

https://dnsviz.net/d/_acme-challenge.dashboard.panorama9.com/servers/

Warnings

  • com to panorama9.com: The following NS name(s) were found in the authoritative NS RRset, but not in the delegation NS RRset (i.e., in the com zone): ns2.panorama9.org, ns1.panorama9.org
  • com to panorama9.com: The following NS name(s) were found in the delegation NS RRset (i.e., in the com zone), but not in the authoritative NS RRset: ns1.panorama9.com, ns2.panorama9.com

That is, the .com zone says your nameservers end in .com:

$ dig -t NS panorama9.com. +norecurse @a.gtld-servers.net.

; <<>> DiG 9.16.22-RH <<>> -t NS panorama9.com. +norecurse @a.gtld-servers.net.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62877
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 3

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;panorama9.com.                 IN      NS

;; AUTHORITY SECTION:
panorama9.com.          172800  IN      NS      ns1.panorama9.com.
panorama9.com.          172800  IN      NS      ns2.panorama9.com.

;; ADDITIONAL SECTION:
ns1.panorama9.com.      172800  IN      A       46.51.187.235
ns2.panorama9.com.      172800  IN      A       79.125.105.180

;; Query time: 0 msec
;; SERVER: 2001:503:a83e::2:30#53(2001:503:a83e::2:30)
;; WHEN: Tue Sep 13 17:19:00 UTC 2022
;; MSG SIZE  rcvd: 110

But, your nameserver says that they actually end in .org:

$ dig -t NS panorama9.com. +norecurse @46.51.187.235

; <<>> DiG 9.16.22-RH <<>> -t NS panorama9.com. +norecurse @46.51.187.235
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34825
;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 3

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;panorama9.com.                 IN      NS

;; ANSWER SECTION:
panorama9.com.          3600    IN      NS      ns1.panorama9.org.
panorama9.com.          3600    IN      NS      ns2.panorama9.org.

;; ADDITIONAL SECTION:
ns1.panorama9.org.      1068    IN      A       46.51.187.235
ns2.panorama9.org.      1780    IN      A       79.125.105.180

;; Query time: 67 msec
;; SERVER: 46.51.187.235#53(46.51.187.235)
;; WHEN: Tue Sep 13 17:19:21 UTC 2022
;; MSG SIZE  rcvd: 123
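
A quick way to spot this kind of mismatch is to diff the two NS sets directly. The following is a minimal shell sketch, not something dnsviz provides; the function names `ns_mismatch` and `check_delegation` are made up for illustration, and it assumes `dig` and `awk` are available.

```sh
#!/bin/sh
# Sketch: compare the NS names in the parent's delegation with the NS names
# the zone's own server returns. Function names are illustrative only.

# ns_mismatch "<delegation NS list>" "<authoritative NS list>"
# Prints every name that appears in only one of the two lists.
ns_mismatch() (
  parent=$(echo $1)   # unquoted on purpose: collapses newlines to spaces
  child=$(echo $2)
  for n in $parent; do
    case " $child " in *" $n "*) ;; *) echo "only in delegation: $n" ;; esac
  done
  for n in $child; do
    case " $parent " in *" $n "*) ;; *) echo "only in zone: $n" ;; esac
  done
)

# check_delegation <zone> <parent-server> <zone-server>
# Needs network access and dig; it is defined here but not run automatically.
check_delegation() {
  parent=$(dig +norecurse +noall +authority NS "$1" @"$2" | awk '$4 == "NS" {print $5}')
  child=$(dig +norecurse +short NS "$1" @"$3")
  ns_mismatch "$parent" "$child"
}
```

With the servers from the output above, the call would be `check_delegation panorama9.com. a.gtld-servers.net. 46.51.187.235`. Given the two dig results shown, it should list the .com names as delegation-only and the .org names as zone-only.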

I can't say either, but AFAIK the DNS servers were validating tokens correctly until recently, and nothing has changed there.
The .org names are the ones registered as NS at the hosting provider, while the .com names are aliases; both point to the same servers.

This is more interesting. Could it be geo-related? I've tried the query from different laptops and I get the token correctly. (Can you please run the query again? It'll be there for a couple of minutes.)

I can't see it either. You can try this site, which uses a method similar to the one the Let's Encrypt servers use:
https://unboundtest.com/

From a server in AWS's us-east-1 region:

[ec2-user@ip-172-31-59-142 ~]$ dig @46.51.187.235 _acme-challenge.dashboard.panorama9.com TXT +norecurse

; <<>> DiG 9.16.22-RH <<>> @46.51.187.235 _acme-challenge.dashboard.panorama9.com TXT +norecurse
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 3256
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_acme-challenge.dashboard.panorama9.com. IN TXT

;; AUTHORITY SECTION:
panorama9.com.          3600    IN      SOA     ns1.panorama9.com. dns.jomax.net. 2016114767 28800 7200 604800 86400

;; Query time: 67 msec
;; SERVER: 46.51.187.235#53(46.51.187.235)
;; WHEN: Tue Sep 13 17:21:18 UTC 2022
;; MSG SIZE  rcvd: 121

[ec2-user@ip-172-31-59-142 ~]$ dig @79.125.105.180 _acme-challenge.dashboard.panorama9.com TXT +norecurse

; <<>> DiG 9.16.22-RH <<>> @79.125.105.180 _acme-challenge.dashboard.panorama9.com TXT +norecurse
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 43328
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_acme-challenge.dashboard.panorama9.com. IN TXT

;; AUTHORITY SECTION:
panorama9.com.          3600    IN      SOA     ns1.panorama9.com. dns.jomax.net. 2016114767 28800 7200 604800 86400

;; Query time: 67 msec
;; SERVER: 79.125.105.180#53(79.125.105.180)
;; WHEN: Tue Sep 13 17:21:30 UTC 2022
;; MSG SIZE  rcvd: 121

From unboundtest, which uses a similar unbound configuration to that which Let's Encrypt uses (though I don't know what region it queries from):

https://unboundtest.com/m/TXT/_acme-challenge.dashboard.panorama9.com/NQOFTN56

Query results for TXT _acme-challenge.dashboard.panorama9.com

Response:
;; opcode: QUERY, status: NXDOMAIN, id: 11351
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;_acme-challenge.dashboard.panorama9.com.	IN	 TXT

;; AUTHORITY SECTION:
panorama9.com.	0	IN	SOA	ns1.panorama9.com. dns.jomax.net. 2016114767 28800 7200 604800 86400

All just saying NXDOMAIN. Are you sure you're updating the DNS servers that the outside world sees, and not some sort of visibly-internal-only server? Though the fact that you see a "secondary validation" message implies that the primary validation did see the record. Hmm…

Yes, 79.125.105.180 and 46.51.187.235 are both BIND servers on AWS, registered as NS. As you said, the primary validation passed, but one or more of the validations from other regions didn't, which may mean the DNS servers those regions queried had not received the new record. That is why I tried waiting anywhere from 10 minutes to a couple of hours, to no avail.

I see

> $ nslookup
> > server ns1.panorama9.com
> Default server: ns1.panorama9.com
> Address: 46.51.187.235#53
> > dashboard.panorama9.com
> Server:         ns1.panorama9.com
> Address:        46.51.187.235#53
> 
> Name:   dashboard.panorama9.com
> Address: 35.225.59.74
> > set q=soa
> > dashboard.panorama9.com
> Server:         ns1.panorama9.com
> Address:        46.51.187.235#53
> 
> *** Can't find dashboard.panorama9.com: No answer
> > panorama9.com
> Server:         ns1.panorama9.com
> Address:        46.51.187.235#53
> 
> panorama9.com
>         origin = ns1.panorama9.com
>         mail addr = dns.jomax.net
>         serial = 2016114818
>         refresh = 28800
>         retry = 7200
>         expire = 604800
>         minimum = 86400
> >

And yet I do see the record when querying 1.1.1.1 (Cloudflare) or 8.8.8.8 (Google), even after their TTL expires, but I get NXDOMAIN via 64.6.64.6 (Verisign) or 208.67.222.222 (OpenDNS) (or when I try to check the authoritative servers myself).

So it does look like your DNS servers are giving different responses to different parts of the Internet, somehow.
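
One way to reproduce this comparison is to ask each resolver the same question and print only the response status. This is a rough shell sketch; `dig_status` and `compare_resolvers` are hypothetical helper names, and it assumes `dig` is installed and the machine has network access.

```sh
#!/bin/sh
# Sketch: ask several resolvers for the same TXT record and print each rcode.

# Reads dig output on stdin and prints the status from the header line
# (for example NOERROR or NXDOMAIN).
dig_status() {
  sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# compare_resolvers <name> <resolver>...
# Needs network access and dig; it is defined here but not run automatically.
compare_resolvers() {
  name=$1
  shift
  for resolver in "$@"; do
    status=$(dig @"$resolver" "$name" TXT | dig_status)
    printf '%-16s %s\n' "$resolver" "${status:-no response}"
  done
}
```

For the resolvers mentioned above, the call would be `compare_resolvers _acme-challenge.dashboard.panorama9.com 1.1.1.1 8.8.8.8 64.6.64.6 208.67.222.222 46.51.187.235 79.125.105.180`. If the statuses differ while the challenge record is in place, the authoritative servers are probably serving different data to different clients.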

And I see 2 different IP Addresses being used around the world

Those are the load balancer IPs for the dashboard, in different regions: EU and US.

So are you saying that your DNS server intentionally gives different answers for different regions for some queries? Because if that's the case, then definitely check that your TXT update is updating the responses for all regions.

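Because Let's Encrypt uses multi-perspective validation (see "Multi-Perspective Validation Improves Domain Validation Security" on the Let's Encrypt site), the record has to be visible from every region it validates from.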

That's most likely it. Let me check on this and get back to you. Thanks for the breakthrough!

Keep in mind that even though the problem seems to have started very recently, the root cause for it may have been implemented up to 60 days ago.

Sorry for the late reply (I just got access to the server).

So yes, that was it. We do have multiple zones: one for US regions and another for the rest. We were always updating only the US one, and it was working fine until recently. (I don't know whether Multi-Perspective Validation was implemented recently, which would explain why validation now comes from multiple regions.)

Using certbot, is it possible to complete the DNS challenge by adding the record to both zones simultaneously? Or must we configure one zone to transfer its data to the other?
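
One possible approach, sketched here under several assumptions rather than confirmed anywhere in this thread, is to swap the dns-rfc2136 plugin for certbot's manual plugin with an auth hook that sends the same update once per zone. The sketch assumes each zone is a BIND view selected by its own TSIG key, that the key files live at the hypothetical paths shown, and that ns1.panorama9.com accepts dynamic updates for both views. `nsupdate_script` and `update_all_views` are made-up names; `CERTBOT_DOMAIN` and `CERTBOT_VALIDATION` are the environment variables certbot passes to manual hooks.

```sh
#!/bin/sh
# Sketch of a certbot --manual-auth-hook that adds the challenge TXT record
# to every zone/view. Key paths and the server name are assumptions.

DNS_SERVER="ns1.panorama9.com"
# Hypothetical TSIG key files, one per view.
VIEW_KEYS="/etc/letsencrypt/keys/us-view.key /etc/letsencrypt/keys/eu-view.key"

# nsupdate_script <server> <domain> <token>
# Prints the nsupdate commands that add the challenge record.
nsupdate_script() {
  printf 'server %s\n' "$1"
  printf 'update add _acme-challenge.%s. 60 TXT "%s"\n' "$2" "$3"
  printf 'send\n'
}

# Sends the same update once per key, i.e. once per view.
update_all_views() {
  for key in $VIEW_KEYS; do
    nsupdate_script "$DNS_SERVER" "$CERTBOT_DOMAIN" "$CERTBOT_VALIDATION" \
      | nsupdate -k "$key" || return 1
  done
}

# Only act when certbot has invoked the hook with a validation token.
if [ -n "${CERTBOT_VALIDATION:-}" ]; then
  update_all_views
fi
```

It would be wired in with something like `certbot certonly --manual --preferred-challenges dns --manual-auth-hook /path/to/hook.sh -d dashboard.panorama9.com`, where the hook path is a placeholder, plus a matching `--manual-cleanup-hook` that issues `update delete` for the same record. The other option raised in the question, having one zone transfer its data to the other, would avoid the hook entirely.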