During secondary validation: No valid IP addresses found

Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. https://crt.sh/?q=example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.

My domain is: log.aeschi.eu

I ran this command: USE_PYTHON_3=1 /root/bin/certbot-auto certonly --verbose --no-bootstrap --renew-by-default --rsa-key-size=4096 --webroot --webroot-path /var/apache/log.aeschi.eu --domain log.aeschi.eu

It produced this output:

Domain: log.aeschi.eu
Type: dns
Detail: No valid IP addresses found for log.aeschi.eu

Occasionally it says: “During secondary validation: No valid IP addresses found for log.aeschi.eu”

nslookup, via 8.8.8.8, 8.8.4.4 and 1.1.1.1 produces a valid IP address once it follows the CNAME, so I’m a bit stuck here.

My web server is (include version): apache 2.4.41

The operating system my web server runs on is (include version): linux 5.4.x

My hosting provider, if applicable, is:

I can login to a root shell on my machine (yes or no, or I don’t know): yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel): no

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you’re using Certbot): 1.4.0

Hi @ajs1k

if you have that error, first step: Read

So the main Let's Encrypt server can check your domain / name server, but some of the secondary validation servers are blocked.

--> You have to open your firewall.

Yep, checking your domain - https://check-your-website.server-daten.de/?q=log.aeschi.eu

You have your own name servers ns0.core.aeschi.eu.

Hi Juergen, thanks for the tips. Meanwhile, the “during secondary validation” message has vanished for the moment; now the LE primary server itself appears unable to find an address.

I’m not blocking any DNS traffic or queries, on any of my nameservers. All of the secondaries are up to date - the serial number is consistent - checking further, there is a CNAME in place on all four of them, as expected.

Would you have any other notions?

Many thanks,
Al.

After quite a lot of investigation I can’t come up with any answers. tcpdumping on port 53 reveals that everything is working fine as far as I can see, and queries are coming in during the certbot transaction and replies are being sent back as expected.

Interestingly, the error is sometimes “No IP address found”, sometimes “During secondary validation: no IP address found”, and occasionally “can’t retrieve CAA record: SERVFAIL”.

I took the opportunity to update all the BIND instances to ISC’s latest version, just in case, and to validate that DNSSEC was doing the right thing. I’m running out of things to try at this point.

@lestaff Can someone maybe look into the traffic or detailed logs on the resolvers?

This is the ~5th time in the last week or so that someone has reported issues under similar circumstances (half of them were with one DNS service, Gransy): Very large responses (because of large DNSSEC signatures and/or non-minimal responses), sometimes CNAME records, TCP.

"No valid IP addresses found" seems particularly strange -- a SERVFAIL or "query timed out" error would be unfortunate but logical if there were resolution issues; "No valid IP addresses" doesn't make sense when the records apparently do exist.

Past threads include:

Thanks @mnordhoff ! I’ll take a look.

Gransy has fixed his error.

There was a long list of name servers without Glue records.

So resolving name server -> ip address needed a lot of time.

If that takes too long, Unbound answers with a SERVFAIL.

Same with CNAME wildcards or a combination of both problems.

Last Thursday some of our Unbound instances were running out of memory, so we deployed a change to the Unbound config to add:

msg-buffer-size: 4096

This is in addition to the edns-buffer-size: 512 setting we already have. Note that the semantics are different: edns-buffer-size controls the response size we tell other DNS servers we are willing to accept over UDP, but doesn't limit the size of TCP responses. msg-buffer-size limits the size of both TCP and UDP responses.

There's also a limit in Boulder. Boulder won't handle responses from Unbound larger than 4096 bytes. This led us to conclude the msg-buffer-size change would not impact the vast majority of users. However, there are, as we've seen here, some domain names where Unbound receives large responses over TCP during a recursive lookup, but the overall result of the query is small and can be processed by Boulder. This category of domain names was broken by the change.
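
To make the interaction between the two limits concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not how Unbound or Boulder are implemented; the function name `lookup_survives` and its parameters are hypothetical, and `None` stands in for "no additional cap beyond the resolver's default".

```python
BOULDER_MAX_RESPONSE = 4096  # limit mentioned above for responses handed to Boulder


def lookup_survives(upstream_tcp_sizes, final_size, msg_buffer_size):
    """Return True if a recursive lookup completes under this toy model.

    upstream_tcp_sizes: sizes (bytes) of responses the resolver receives over TCP
    final_size: size (bytes) of the final answer handed back to the CA software
    msg_buffer_size: cap on any single message, or None for no extra cap
    """
    if msg_buffer_size is not None:
        # Assumption: any intermediate response larger than the buffer
        # makes the whole recursive lookup fail.
        if any(size > msg_buffer_size for size in upstream_tcp_sizes):
            return False
    # The final answer must still fit within the downstream limit.
    return final_size <= BOULDER_MAX_RESPONSE


# A large intermediate TCP response (6000 bytes) with a small final answer:
print(lookup_survives([6000], 300, 4096))  # False: broken by the buffer cap
print(lookup_survives([6000], 300, None))  # True: fine without the cap
```

The second call corresponds to the category described above: the intermediate response is large, but the overall result is small enough for Boulder to process.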

We're rolling back that change now, which should fix these issues. My apologies for not notifying everyone here about the change earlier, and for the breakage and time people spent debugging. I really appreciate all your help digging into the issue.

The current Boulder LookupHost logic looks up both A and AAAA records in parallel. If either lookup succeeds, LookupHost will return the list of addresses from that lookup.

In my testing with msg-buffer-size: 4096, a query for the A records for log.aeschi.eu returns a SERVFAIL, but a query for AAAA returns NOERROR with a CNAME and no IP addresses:

$ dig AAAA log.aeschi.eu -p 1053 @127.0.0.1

; <<>> DiG 9.16.1-Ubuntu <<>> AAAA log.aeschi.eu -p 1053 @127.0.0.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3583
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;log.aeschi.eu.                 IN      AAAA

;; ANSWER SECTION:
log.aeschi.eu.          0       IN      CNAME   www.aeschi.eu.

;; AUTHORITY SECTION:
aeschi.eu.              0       IN      SOA     ns0.core.aeschi.eu. hostmaster.aeschi.eu. 2020050711 1800 3600 604800 1200

Since that query succeeded, LookupHost would consider this a successful response, with no IP addresses.
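
As a rough illustration of why this produces "No valid IP addresses found" rather than a SERVFAIL error, here is a simplified Python model of the logic described above. Boulder itself is written in Go and its real code differs in detail; `DnsResult`, `lookup_host`, `broken_resolver`, and the wording of the SERVFAIL error are hypothetical names and text chosen for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class DnsResult:
    rcode: str  # e.g. "NOERROR" or "SERVFAIL"
    addresses: list = field(default_factory=list)


def lookup_host(hostname, resolve):
    """Query A and AAAA in parallel; treat the lookup as OK if either succeeds."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(resolve, hostname, rtype) for rtype in ("A", "AAAA")]
        results = [f.result() for f in futures]

    ok = [r for r in results if r.rcode == "NOERROR"]
    if not ok:
        # Both lookups failed: report the DNS failure itself.
        raise RuntimeError(f"DNS problem looking up {hostname}: {results[0].rcode}")

    # At least one lookup succeeded, so its (possibly empty) address list is used.
    addresses = [addr for r in ok for addr in r.addresses]
    if not addresses:
        raise RuntimeError(f"No valid IP addresses found for {hostname}")
    return addresses


def broken_resolver(hostname, rtype):
    # Mirrors the behaviour seen above: A fails, AAAA returns only a CNAME.
    if rtype == "A":
        return DnsResult("SERVFAIL")
    return DnsResult("NOERROR", [])


try:
    lookup_host("log.aeschi.eu", broken_resolver)
except RuntimeError as err:
    print(err)  # No valid IP addresses found for log.aeschi.eu
```

In this model the SERVFAIL on the A query is masked by the "successful" but empty AAAA response, which matches the confusing error reported earlier in the thread.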

The rollback of msg-buffer-size is now complete. Please try issuance again and let me know how it goes.

Thanks @jsha! This morning it’s back in working order. :rocket:

I’m going to reenable DNSSEC on the domain and retry later though, as I had suspected that might have been a factor and disabled it yesterday.
