DNS problem: query timed out looking up A record

Good morning everyone,
I'm having trouble renewing my certificate.
The last renewal, in March, worked without a problem.
Now, however, I can't renew it because Let's Encrypt can't resolve the A record, which is correctly resolvable from anywhere in the world. Can you help me find a solution to this problem?
Below is the information requested for support.

My domain is:
zabbix.sistel.it

I ran this command:
certbot renew
certbot renew --dry-run

It produced this output:

Saving debug log to /var/log/letsencrypt/letsencrypt.log


Processing /etc/letsencrypt/renewal/zabbix.sistel.it.conf


Simulating renewal of an existing certificate for zabbix.sistel.it

Certbot failed to authenticate some domains (authenticator: apache). The Certificate Authority reported these problems:
Domain: zabbix.sistel.it
Type: dns
Detail: DNS problem: query timed out looking up A for zabbix.sistel.it; DNS problem: query timed out looking up AAAA for zabbix.sistel.it

Hint: The Certificate Authority failed to verify the temporary Apache configuration changes made by Certbot. Ensure that the listed domains point to this Apache server and that it is accessible from the internet.

Failed to renew certificate zabbix.sistel.it with error: Some challenges have failed.


All simulated renewals failed. The following certificates could not be renewed:
/etc/letsencrypt/live/zabbix.sistel.it/fullchain.pem (failure)


1 renew failure(s), 0 parse failure(s)
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

My web server is (include version):
Apache/2.4.67 (Debian)

The operating system my web server runs on is (include version):
Debian 13

My hosting provider, if applicable, is:
self-hosted

I can login to a root shell on my machine (yes or no, or I don't know):
Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
certbot 4.0.0

I’d separate this from Apache first: the error is still at the DNS lookup stage, before HTTP-01 is reached. Check the authoritative nameservers for sistel.it directly, not only a normal recursive resolver. If one auth NS times out on UDP/TCP 53, or DNSSEC is inconsistent, Let’s Encrypt may hit that while your local resolver still returns a cached A record. Also check whether an AAAA record exists; if the host is not reachable over IPv6, remove or fix it. Once the auth DNS answers consistently from outside, rerun the dry run.
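If it helps, here is a rough sketch of what "check the authoritative nameservers directly" can look like without relying on a recursive resolver. It sends a non-recursive query (similar in spirit to `dig +norecurse`) to each nameserver IP you pass on the command line, once over UDP and once over TCP. The function names `build_query` and `probe`, the 5-second timeout, and the IPv4-only socket are my own assumptions for illustration. It only shows what your vantage point sees, not what Let's Encrypt sees.

```python
import socket
import struct
import sys


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (QTYPE 1 = A, 28 = AAAA)."""
    # Header: ID, flags (0 = standard query, recursion NOT desired),
    # QDCOUNT = 1, ANCOUNT = NSCOUNT = ARCOUNT = 0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN


def _recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes from a TCP socket."""
    buf = b""
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def probe(server_ip: str, name: str, qtype: int = 1,
          tcp: bool = False, timeout: float = 5.0) -> str:
    """Send one query to one nameserver (IPv4 assumed) and summarise the result."""
    query = build_query(name, qtype)
    try:
        if tcp:
            with socket.create_connection((server_ip, 53), timeout=timeout) as s:
                # DNS over TCP prefixes each message with a 2-byte length.
                s.sendall(struct.pack("!H", len(query)) + query)
                (length,) = struct.unpack("!H", _recv_exact(s, 2))
                reply = _recv_exact(s, length)
        else:
            with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
                s.settimeout(timeout)
                s.sendto(query, (server_ip, 53))
                reply, _ = s.recvfrom(4096)
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
    if len(reply) < 12:
        return "short reply"
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", reply[:12])
    return f"rcode={flags & 0xF} aa={bool(flags & 0x0400)} answers={ancount}"


if __name__ == "__main__":
    # Usage (hypothetical file name): python probe.py <ns-ip> [<ns-ip> ...]
    for ns_ip in sys.argv[1:]:
        for qtype, label in ((1, "A"), (28, "AAAA")):
            for tcp in (False, True):
                proto = "tcp" if tcp else "udp"
                print(ns_ip, label, proto, probe(ns_ip, "zabbix.sistel.it", qtype, tcp))
```

A healthy authoritative server should answer both record types over both transports, even if the AAAA answer is empty. A timeout on only one server or one transport is the kind of inconsistency worth chasing. Running it from a host outside your own network is more telling than running it locally.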

Hi, thank you for the reply
Yeah, I think the problem is that Let’s Encrypt can't reach my authoritative nameservers.
However, I don't have any filters on the nameservers whatsoever.
I don't have any AAAA records, and DNSSEC is not enabled.

I agree. Tests from various tools resolve the records fine. Are you sure there is no firewall blocking? Perhaps some kind of "smart" blocking to protect against DDoS attacks or similar, or blocking of only selected sources?

No, nothing that I can see or think of

The DNS servers look to all be within the same subnet, which isn't ideal, but certainly ought to work. I also don't see any immediate issues with connectivity. My best guess is that there's something not quite right with the backbone routing between Let's Encrypt's main datacenters and your datacenter, though those kinds of problems are certainly challenging to debug.

Let's Encrypt does have an open "status" that they're "operating normally, but with reduced redundancy" after some "upstream network event" a week and a half ago. A bit of a longshot, but might be related?

I don't know how much influence you have over your network's peering and connectivity, but if you could maybe run some pings and traceroutes and such to various other networks, especially those networks in the United States, maybe it would uncover something? Sorry, I know that's not very specific or actionable, I'm just grasping at straws.

Hi, thanks for your reply!

Yeah, that's my best bet too.

Yeah, I ran some tests, but it's a shot in the dark without knowing the destination network.
Actually, I was hoping someone from Let's Encrypt could check the connection the other way around.

None of us volunteers have access to LE servers. And, I have never seen LE staff get involved in debugging a network problem affecting a single location.

You could check network logs from any equipment between your DNS servers and the broader net. Confirm that you see no incoming requests during a cert request, but that you do see them when running a DNS query from another location, for example with https://unboundtest.com

Report those results to your ISP. We had a similar case recently where they had their ISP switch the subnets used and it resolved their problem. That is worth a try too. If there was a backbone problem near the LE center there would be many more validation problems which would likely show up on their routine health monitoring / alerting system. That hasn't happened and, so far, you are the only one reporting this here.

All that said, this kind of thing is one reason why your DNS servers should be more broadly distributed 🙂

Another workaround you might want to try in the meantime is to get a certificate from some other CA. There are several out there, and many offer free certificates using the same ACME protocol as Let's Encrypt. Or, if another CA also can't connect, that might provide more insight into what's going wrong.

I have seen staff get involved, though certainly very rarely and only for particularly thorny problems. But I don't think it can hurt to ask. 🙂

@jsha, @mcpherrinm, or someone else: When you get a chance, can you take a look at why Let's Encrypt's validation servers (both prod and staging) can't seem to connect to DNS servers in 185.249.92.0/24 even though all the other tools and systems that we've tried can? Any chance it might be related to the networking issues the datacenters have occasionally been tripping on this month? Thanks.

I was flatly rebuffed the last time I asked, so ... 🙂

Perhaps it was just a bad day.

Thanks a lot, guys. I guess I'll wait a bit to see if someone from LE staff replies.
Meanwhile, I'll do some testing with other CAs, even though I'd rather not switch.

I investigated with the firewall administrator and noticed that a "GrayNoiseFeed" filter was active. Disabling that resolved the problem.
I apologize for the inconvenience and thank you for your time.
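
For anyone who lands here with the same symptom: one way to confirm that a reputation feed is what drops the queries is to compare the source addresses in the firewall's drop log against the feed's entries. The sketch below assumes the feed can be exported as a list of single IPs or CIDR ranges, which may not be true for every product. The helper name `blocked_by` is hypothetical, and the addresses are placeholders from the documentation ranges, not real feed contents or real Let's Encrypt addresses. As far as I know, Let's Encrypt does not publish its validation addresses, so the drop log is the place to get them from.

```python
import ipaddress


def blocked_by(source_ip: str, feed_entries: list[str]) -> list[str]:
    """Return the feed entries (single IPs or CIDR ranges) that cover source_ip."""
    addr = ipaddress.ip_address(source_ip)
    hits = []
    for entry in feed_entries:
        # strict=False tolerates entries with host bits set, e.g. 192.0.2.5/24.
        net = ipaddress.ip_network(entry, strict=False)
        if addr.version == net.version and addr in net:
            hits.append(entry)
    return hits


# Placeholder data only (RFC 5737 documentation ranges).
feed = ["192.0.2.0/24", "198.51.100.7"]
dropped_sources = ["192.0.2.55", "203.0.113.9", "198.51.100.7"]

for ip in dropped_sources:
    print(ip, "->", blocked_by(ip, feed) or "not in feed")
```

If the sources dropped on port 53 during a `certbot renew --dry-run` all match feed entries, that points at the filter rather than at routing.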