Can't Renew Certificates for Chinese Domains

Hello,

We run several hundred domains, with top-level domains from all over the world, using Let's Encrypt certificates on a server located in the US. For some time now, only the certificates for our Chinese (.cn) domains have stopped renewing. I now get an error message like this for several of them:

Invalid response from https://acme-v02.api.letsencrypt.org/acme/authz-v3/348101101807.
Details:
Type: urn:ietf:params:acme:error:dns
Status: 400
Detail: During secondary validation: DNS problem: query timed out looking up A for mydomain.cn; no valid AAAA records found for mydomain.cn
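
For triage across many domains, the fields of an error like the one above can be pulled apart programmatically. The following is only a minimal sketch: the function name `classify_acme_error` and the returned keys are hypothetical, and the string matching assumes the message wording shown here, which is not a stable interface.

```python
import re

def classify_acme_error(error_type: str, detail: str) -> dict:
    """Summarize an ACME error (Type and Detail strings) for quick triage."""
    return {
        # Last URN segment, e.g. "dns" from "urn:ietf:params:acme:error:dns"
        "kind": error_type.rsplit(":", 1)[-1],
        # Assumption: this prefix marks a failure at a remote validation perspective
        "secondary": detail.startswith("During secondary validation"),
        "timed_out": "query timed out" in detail,
        # Record types named in "looking up <TYPE> for <name>" phrases
        "record_types": re.findall(r"looking up (\w+) for", detail),
    }

summary = classify_acme_error(
    "urn:ietf:params:acme:error:dns",
    "During secondary validation: DNS problem: query timed out looking up A "
    "for mydomain.cn; no valid AAAA records found for mydomain.cn",
)
print(summary)
```

A result with `secondary` and `timed_out` both true points at DNS reachability from remote perspectives rather than at the web server.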

This is puzzling, as our DNS settings haven't changed. When I use external tools like MXToolbox, the A record is found without problems. I also asked our DNS provider, and they are not aware of any issues. And no, we don't have any geoblocking enabled anywhere, as suggested in another post about the new remote perspectives.

Can anyone help?

Thanks in advance :slight_smile:
Udo

My domain is: eucerin.cn (and others)

I ran this command: Automatic SSL renewal via Plesk Control Panel

It produced this output: see above

My web server is (include version): Apache 2.4.52 + nginx 1.24.0 managed by Plesk

The operating system my web server runs on is (include version): Ubuntu 22.04

My hosting provider, if applicable, is: Azure

I can login to a root shell on my machine (yes or no, or I don't know): yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): Plesk

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): Unclear, managed via Plesk


Welcome @UdoW

The "secondary" in the message might mean that a firewall is blocking requests from certain countries. Is that possible? Please see the thread below.


Hi Mike,
no, there is no firewall in place (or, to be precise: of course, only the necessary ports are open, but there is no geoblocking or anything).


The queries to the authoritative DNS servers are timing out. Ports related to your HTTP services do not matter at this phase.

I also cannot reproduce the timeout using https://unboundtest.com or with my own test tools. Unboundtest queries the way Let's Encrypt does.

DNSViz reports some minor problems with the DNS config for eucerin.cn. I don't think they would cause this, but they are worth fixing anyway. In short, some of the glue records are wrong.

When we can't reproduce the failure using single queries from tools, a couple of common causes are:

  • DNS servers have rate limiting. LE makes a large number of queries as it walks the DNS tree. These come from various points around the globe so some overly sensitive DDoS firewalls start blocking. This can result in the query timeout.
  • The country's firewall is blocking queries for that domain. Sometimes these are temporary conditions.
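
Since a single query may not reveal intermittent drops, one rough check is to send the same A query several times straight to an authoritative server and count the timeouts. This is a sketch under assumptions, not how Let's Encrypt validates: the names `build_a_query` and `probe` are hypothetical, the packet is a bare-bones query without EDNS, and the reply is not parsed or matched against the query ID. A probe from one location also cannot show what remote perspectives see.

```python
import socket
import struct

def build_a_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS packet asking for the A record of `name`."""
    # Header: id, flags (0 = standard query, recursion not requested),
    # 1 question, 0 answer / authority / additional records
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A, QCLASS 1 = IN
    return header + qname + struct.pack("!HH", 1, 1)

def probe(server_ip: str, name: str, attempts: int = 5, timeout: float = 3.0) -> int:
    """Send the query `attempts` times over UDP; return how many got no reply in time."""
    packet = build_a_query(name)
    timeouts = 0
    for _ in range(attempts):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            try:
                sock.sendto(packet, (server_ip, 53))
                sock.recvfrom(512)  # any datagram counts as "answered" here
            except socket.timeout:
                timeouts += 1
    return timeouts
```

Calling `probe(ip, "eucerin.cn")` with the real IPv4 address of each authoritative name server (none is given in this thread) and seeing a non-zero count would be consistent with rate limiting or filtering on the path, though it does not prove either.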

Thank you @MikeMcQ, this is all very helpful! I would then assume that we are probably talking about the country's firewall here. Interestingly, I just tried again and was able to renew the certificate for eucerin.cn but not for the www version (same issue). I will monitor and also review the DNS config.


With the cn firewall we sometimes see temp failures like this. Just one of the reasons we recommend running auto-renew at least twice per day at random times so you can tolerate such temp failures.
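
As an illustration of "twice per day at random times", the sketch below picks two distinct random minutes of the day and prints them as crontab time fields. The function names are hypothetical, and the renewal command is left out on purpose because it depends on the client or control panel in use.

```python
import random

def pick_renewal_slots(attempts_per_day=2, seed=None):
    """Pick distinct random (hour, minute) slots for renewal attempts."""
    rng = random.Random(seed)  # pass a seed only for reproducible demos
    minutes = sorted(rng.sample(range(24 * 60), attempts_per_day))
    return [(m // 60, m % 60) for m in minutes]

def as_cron_fields(hour, minute):
    """Format one slot as the five time fields of a crontab entry."""
    return f"{minute} {hour} * * *"

for hour, minute in pick_renewal_slots():
    # Append the client's own renewal command after these fields.
    print(as_cron_fields(hour, minute))
```

Picking the slots once per server, rather than using the same fixed hour everywhere, spreads attempts out so that a temporary failure at one time of day is less likely to hit both.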

Let's Encrypt caches successful validations, so it might be easier to get the cert issued now that only the www validation is needed.

From the LE FAQ (which is easy to overlook):

Once you successfully complete the challenges for a domain, the resulting authorization is cached for your account to use again later. Cached authorizations last for 30 days from the time of validation. If the certificate you requested has all of the necessary authorizations cached then validation will not happen again until the relevant cached authorizations expire.
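
The caching rule in that quote can be sketched as a small check of which names would still need a fresh validation. This is only an illustration: the 30-day lifetime is taken from the FAQ text above and may differ in practice, and the function names and the timestamps are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Lifetime as stated in the FAQ quote above (an assumption, not a guaranteed value)
AUTHZ_LIFETIME = timedelta(days=30)

def authorization_is_cached(validated_at, now):
    """True if a validation done at `validated_at` is still within its lifetime."""
    return validated_at <= now < validated_at + AUTHZ_LIFETIME

def names_needing_validation(requested, validated, now):
    """Return the requested names that have no unexpired cached authorization."""
    return [
        name for name in requested
        if name not in validated or not authorization_is_cached(validated[name], now)
    ]

now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)  # illustrative timestamp
validated = {"eucerin.cn": now - timedelta(hours=1)}    # apex validated an hour ago
print(names_needing_validation(["eucerin.cn", "www.eucerin.cn"], validated, now))
# prints ['www.eucerin.cn']
```

This mirrors the situation in the thread: after the apex name validates, a retry for the same certificate only has to pass validation for the www name.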


This also often happens due to routing errors within data networks. There could be some misconfigurations with your ISP, ISRG's ISP, or any of the data networks in use between them. They are usually temporary and resolve within a few hours as automated test systems at these companies detect the errors - or customers realize what is happening and complain.

ISRG currently operates a lot of the validation systems on Amazon AWS networks. If you have access to those, you may be able to use traceroute and other tools to see whether there is a routing error and where it is happening.


As recommended, I set auto-renew to twice a day, and indeed, the certificates for the domains in question have now all been renewed.

Thanks everyone for your help!

