Renew certificate failed due to secondary validation (again)

All symptoms are the same as several months ago (see my earlier thread, "Renew certificate failed due to secondary validation").
Let's Encrypt validation servers could not connect to my web server on the Hetzner (DE) network:

Domain: mydomain.com
  Type:   connection
  Detail: During secondary validation: my.ip.address: Fetching http://mydomain.com/.well-known/acme-challenge/my_token: Timeout during connect (likely firewall problem)

This is not a problem with my server or my firewall. That was confirmed by Let's Encrypt staff.

This contradicts the last statement made by LE staff in your old thread.

  • There is no known connectivity problem between LE's secondary validation servers (AWS) and Hetzner
  • I host several sites on Hetzner myself, without any issue
  • Hetzner is not known to block IP addresses not requested by the customer

I strongly suggest you inspect your own firewall again*. The secondary validation IP addresses change frequently, which is likely why it worked for a while (and will likely start working again in the future, until it breaks again).

Other than that, without more info from your side there is nothing this community can do for you.

*You can try with your iptables rules deleted:

sudo iptables -F

will temporarily clear everything (until reboot/reload/reapply).

Please do not read only the last post in the previous thread.
Of course I completely deleted all firewall rules before I checked the problem and posted here again.
The issue: some of the Let's Encrypt validation servers have connection problems to my Hetzner DE IP subnet.
This was confirmed by Let's Encrypt staff:

I have confirmed we saw errors from validator instances in both AWS's us-east-2 and eu-central-1 regions to your IP address. Sorry, I don't easily have the IP address of the instances and it would take a bit of work to correlate the different logs to find what the external IPs of the instances are.

Hetzner has many IP subnets.
If Let's Encrypt validation works well for your subnet, that does not mean it works well for all Hetzner subnets.

How do you infer that

"I have confirmed we saw errors from validator instances in both AWS's us-east-2 and eu-central-1 regions to your IP address."

means

"there is no error on my side"?

LE staff was simply saying "yes, there are secondary validation failures. We don't know the reason - may be firewalls".

So far, you are the only person reporting issues from Hetzner subnets. Yes, there is a possibility that a single Hetzner subnet is affected - in that case you should talk to Hetzner, not Let's Encrypt.

However, there is a significantly higher chance that it is related to your iptables. Have you tried flushing your iptables?

  1. I have not changed iptables for years.
  2. I have not changed my IP address for years.
  3. Let's Encrypt renewal worked well on my server for years, until May 2022.
  4. I contacted Hetzner support. They confirmed that there are no traffic or routing restrictions for my server.
  5. I have web sites that are accessed by many people every minute, 24 * 7 * 365.
  6. I completely deleted ALL iptables rules to manually check the LE renewal issue again.
    Is this enough for you to believe that some LE validation server can have a routing or connection problem to some IP address or subnet?

It would be great if LE staff ran a traceroute from a validation server to my IP (I can send detailed info by PM).
A traceroute could settle the question of which side (or where on the route) the issue is.
Neither I nor Hetzner can do this from our side.
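
Short of a traceroute from the validators themselves, one check that can be run without LE staff is a plain TCP connect test against port 80 from hosts in other networks (for example, a VM rented in one of the AWS regions mentioned above). A minimal Python sketch using only the standard library; `probe_tcp` is a hypothetical helper name, and a result from one outside host proves nothing certain about the routes the validators actually take:

```python
import socket


def probe_tcp(host: str, port: int = 80, timeout: float = 10.0) -> str:
    """Try to open a TCP connection and report how the attempt ended.

    Returns "connected", "timeout", or "error: <reason>".
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No answer within `timeout` seconds: packets are probably being
        # dropped somewhere (firewall or routing), which is the same
        # symptom as "Timeout during connect".
        return "timeout"
    except OSError as exc:
        # Connection refused, network unreachable, DNS failure, and so on.
        return f"error: {exc}"
```

Calling `probe_tcp("mydomain.com")` from several outside hosts and comparing the results would at least show whether the timeout can be reproduced from anywhere other than the LE validators.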

I see in the access logs many successful validations from LE servers located in the United States (the primary LE validations, I think).
... [US][mysite] LE_IP_address_here - [29/Sep/2022:14:52:31 +0300] "GET /.well-known/acme-challenge/BisUemf-T2enbpst946NXVNbxTiWhvU6DARjRQjnXKY HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-" [-, United States]

But of course I do not see any requests from the secondary LE validation servers in the logs.
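
To make this comparison less manual, the validation requests can be pulled out of the access log and grouped per challenge token. A rough Python sketch, assuming the log format matches the sample line above (client address directly before ` - [timestamp]`); the function name is made up for illustration and the regular expression will need adjusting for other log formats:

```python
import re
from collections import defaultdict

# Matches: <client> - [<time>] "GET /.well-known/acme-challenge/<token> HTTP/x" <status>
# The layout is assumed from the sample log line, not from any standard format.
LINE_RE = re.compile(
    r'(?P<client>\S+) - \[(?P<time>[^\]]+)\] '
    r'"GET /\.well-known/acme-challenge/(?P<token>[A-Za-z0-9_-]+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) '
)
LE_AGENT = "Let's Encrypt validation server"


def validation_hits(lines):
    """Map each challenge token to the set of (client, status) pairs seen."""
    hits = defaultdict(set)
    for line in lines:
        if LE_AGENT not in line:
            continue  # skip requests that do not carry the LE user agent
        match = LINE_RE.search(line)
        if match:
            hits[match["token"]].add((match["client"], match["status"]))
    return dict(hits)
```

Passing an open log file to `validation_hits` gives, for each token, the distinct clients that fetched it. A token with a single client would suggest that only one validator reached the server for that attempt.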

As noted before, there will be (usually) 4 challenges. There is one primary and the others secondary. The --dry-run option for certbot will show them better. Otherwise, you may not see challenges from previously successful domains for your account as they are cached for 30 days.

sudo certbot renew --dry-run --cert-name (name of certbot cert)

As for location ID, it depends what system you use to look them up. For example, my latest --dry-run test showed 2 US locations, 1 Germany, and one marked Cloudflare using the ipstack.com site.

As noted, these IPs are often rotated by the server farms used and may change as often as hour-to-hour.

You have demonstrated some challenges are not reaching your system. That much is clear. There is either some sort of firewall, maybe even a DDoS protection style firewall, or a very odd network routing problem (maybe within Hetzner even).

The volunteers here do not have access to the LE servers (or their logs) to assist. You may want to re-read the last post in your prior thread which explained from an LE Staff member what they do in such cases.

You could try switching to a different Certificate Authority. Or, try fronting your server with a CDN like Cloudflare and use their Origin CA cert in your server.

I ran certbot renew with the --dry-run option -> it completed successfully without any error (all validation requests came from LE US servers).
Then I ran certbot in the regular way -> it ended with the same error as earlier.
What's the difference between these two modes concerning LE validation?
Why is validation successful in dry-run mode but unsuccessful in regular mode?

There are sometimes technical differences between them but you see some challenge requests from production so it is unlikely to be related to any such differences (if any even exist right now). Other experts are better equipped to address that.

But, the IP addresses will often be different from one request to another.

A lot has already been said and I don't think I have anything more to add.

Is there by any chance any sort of geo-location/geo-fencing device inline?
Or any software operating like Fail2ban?

Nothing blocks the HTTP requests on the server.
As I see it, the LE validation servers sometimes have outgoing connection issues.
Moreover, sometimes the LE validation servers even have DNS resolution problems. LE log records such as “could not find A record for domain ...” point to this.

Well, that's as much as I can help with - given the very little information provided.

But see the last post: Renew certificate failed due to secondary validation - #32 by mcpherrinm

Also, are you aware that Let's Encrypt uses multi-perspective validation? See "Multi-Perspective Validation Improves Domain Validation Security - Let's Encrypt".

Since these are Domain Validation (DV) certificates, the Domain Name System (DNS) is used extensively in the validation process, and it is also what allows us to assist here on the Let's Encrypt community.
DNS queries need to give consistent results from any location on the Internet, and all authoritative DNS servers for the domain need to give consistent results as well.
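
As an illustration of what "consistent results" means here, the sketch below compares the A records returned by each authoritative name server and flags any server that disagrees with the majority. It assumes the answers were already collected by other means (for example with `dig @nameserver mydomain.com A`); the function name and any server names or addresses fed into it are hypothetical:

```python
from collections import Counter


def inconsistent_servers(answers):
    """Return the name servers whose A-record set differs from the majority.

    `answers` maps a name server to the set of addresses it returned.
    """
    if not answers:
        return []
    # frozenset makes each answer hashable so identical sets can be counted.
    counts = Counter(frozenset(addrs) for addrs in answers.values())
    majority, _ = counts.most_common(1)[0]
    return sorted(ns for ns, addrs in answers.items() if frozenset(addrs) != majority)
```

An empty result means every queried name server returned the same set of addresses; anything else is a server worth looking at before the next renewal attempt.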

I know how LE validation works.
My web sites have no DNS problems from anywhere in the world.

Then I suggest using https://www.hetzner.com/support-center, as that seems to be where the block has happened, from the perspective of the Let's Encrypt community.
If you would share more DNS information with the Let's Encrypt community, there would be something we can work with to further assist you. Thank you for assisting us in helping YOU!

I did this as I mentioned before.
Hetzner said "there is nothing blocked from our side".

I greatly appreciate the support of the LE community.
But I am afraid the problem cannot be solved without LE staff.

Then please address the Let's Encrypt staff.

Hello @smon, have Hetzner find the block between them and LE. I expect Hetzner has far more resources than LE to handle these types of issues.
