Timeout during connect (likely firewall problem) on 8.43.85.0/24

averi · July 30, 2019, 11:18am

My domain is:

All attempts to renew certificates for services hosted on the 8.43.85.0/24 subnet fail. We can clearly see the connection in the logs, but the validation doesn’t succeed:

Jul 30 09:34:45 wiki apache: 3.14.255.131 - - [30/Jul/2019:09:34:45 +0000] "GET /.well-known/acme-challenge/C7VRofqb87xsTZ6moD6Fhh6ePcKrt6mpopY8zMe3kMo HTTP/1.1" 302 382 "http://wiki.gnome.org/.well-known/acme-challenge/C7VRofqb87xsTZ6moD6Fhh6ePcKrt6mpopY8zMe3kMo" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"

The excerpt is taken from wiki.gnome.org (8.43.85.12). I believe there’s an issue with this specific subnet; it’d be lovely if a Let’s Encrypt engineer could look into it.
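For anyone wanting to confirm the same thing on their own host, here is a minimal sketch that pulls ACME challenge requests out of an Apache access log. It assumes the combined log format (optionally behind a syslog prefix, as in the excerpt above); the names `LINE_RE` and `acme_hits` are made up for illustration.

```python
import re

# Assumption: Apache "combined" log format. search() is used so that a
# syslog prefix such as "Jul 30 09:34:45 wiki apache: " is skipped.
LINE_RE = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def acme_hits(lines):
    """Yield (client, time, status, path) for each ACME challenge request."""
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("path").startswith(CHALLENGE_PREFIX):
            yield (
                match.group("client"),
                match.group("time"),
                int(match.group("status")),
                match.group("path"),
            )
```

Feeding it the lines of an access log (for example `acme_hits(open(path))`, with `path` pointing at your own log file) lists which validation requests actually reached the server and what status they got.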

It produced this output:

response {
  "type": "http-01",
  "status": "invalid",
  "error": {
    "type": "urn:acme:error:connection",
    "detail": "Fetching http://wiki.gnome.org/.well-known/acme-challenge/C7VRofqb87xsTZ6moD6Fhh6ePcKrt6mpopY8zMe3kMo: Timeout during connect (likely firewall problem)",
    "status": 400
  },

My web server is (include version):
httpd-2.4.6-89.el7_6.1.x86_64. There have been no recent changes to the httpd configuration or to the Let’s Encrypt tools. The httpd configuration matches that of hosts on another subnet, which receive validated certs just fine.

The operating system my web server runs on is (include version):
RHEL 7

I can login to a root shell on my machine (yes or no, or I don’t know):
yes

misc · July 30, 2019, 12:29pm

Hi, I am also hosting servers in the same DC, and I can reproduce this on an unrelated server (8.43.85.171, just 1h ago). Both averi and I have checked the network (with limited access), and from what I saw, the HTTP request arrived and the answer was sent, with no errors at the TCP level.

JuergenAuer · July 30, 2019, 1:07pm

Hi @averi

checking the wiki.gnome.org url there is a redirect to a specific subdomain ( https://check-your-website.server-daten.de/?q=wiki.gnome.org ):

Domainname	Http-Status	redirect	Sec.	G
• http://wiki.gnome.org/
8.43.85.12	302	https://wiki.gnome.org/	0.234	A

• https://wiki.gnome.org/
8.43.85.12	200		4.033	A

• http://wiki.gnome.org/.well-known/acme-challenge/check-your-website-dot-server-daten-dot-de
8.43.85.12	302	https://wiki.gnome.org/.well-known/acme-challenge/check-your-website-dot-server-daten-dot-de	0.233	A
Visible Content: Found The document has moved here. Apache/2.4.6 (Red Hat Enterprise Linux) Server at wiki.gnome.org Port 80

• https://wiki.gnome.org/.well-known/acme-challenge/check-your-website-dot-server-daten-dot-de
	302	https://letsencrypt.gnome.org/.well-known/acme-challenge/check-your-website-dot-server-daten-dot-de	3.674	A
Visible Content: Found The document has moved here. Apache/2.4.6 (Red Hat Enterprise Linux) Server at wiki.gnome.org Port 443

• https://letsencrypt.gnome.org/.well-known/acme-challenge/check-your-website-dot-server-daten-dot-de
	404		4.250	A
Visible Content: Not Found The requested URL /.well-known/acme-challenge/check-your-website-dot-server-daten-dot-de was not found on this server. Apache/2.2.15 (Red Hat) Server at letsencrypt.gnome.org Port 80

/.well-known/acme-challenge/random-filename is redirected to https://letsencrypt.gnome.org/.well-known/acme-challenge/random-filename.

It looks more like something has changed on wiki.gnome.org, so the redirect from your domain isn't followed correctly. Why do you redirect to wiki.gnome.org?

averi · July 30, 2019, 1:21pm

The reason behind the redirects is not relevant here, as Let’s Encrypt officially supports up to 10 redirects. Nothing changed on the wiki.gnome.org side, nor on any other host on that subnet. @misc manages a set of systems that sit outside of GNOME and he’s affected by the problem as well.
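To make the redirect chain concrete, here is a rough sketch that walks a challenge URL hop by hop and gives up after 10 redirects, the limit mentioned above. It is only an illustration, not how Let’s Encrypt’s validator is implemented; `http_fetch` and `follow_redirects` are hypothetical helper names, and the fetcher is injectable so the logic can be exercised without network access.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def http_fetch(url, timeout=10):
    """Fetch one URL without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def follow_redirects(url, fetch=http_fetch, max_redirects=10):
    """Return [(url, status), ...] for each hop, following at most max_redirects."""
    chain = []
    for _ in range(max_redirects + 1):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            return chain
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError(f"more than {max_redirects} redirects")
```

Run from a host outside the affected network, `follow_redirects("http://wiki.gnome.org/.well-known/acme-challenge/test")` (the file name `test` is made up) would print each hop and its status, which helps separate a redirect problem from a connect timeout.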

I’m confident the problem lies on Let’s Encrypt’s side.

Phil · July 30, 2019, 2:54pm

Hi @averi and @misc,

We’re taking a look at this right now and will update this thread when we have more information.

Phil · July 30, 2019, 3:30pm

We’ve escalated to the affected datacenter’s upstream network engineers and have updated https://letsencrypt.status.io/.

averi · July 30, 2019, 3:40pm

I don’t see the status page showing any issue right now though and the problem is still there.

Thanks for the prompt action

cpu · July 30, 2019, 6:14pm

Hmm, did you try a hard refresh? I see a partial service disruption notice as active.

averi · July 30, 2019, 7:28pm

I see it now, thanks!

Phil · August 1, 2019, 3:08pm

We’re still waiting on our upstream ISP to resolve this. They are aware of the problem, but the only answer we get about when the issue will be resolved is “soon”.

averi · August 2, 2019, 2:12pm

Is there any chance this can be escalated? We have a set of certificates that will soon need renewal, and there is not enough time to rewrite the automation tools to switch to a verification method other than ACME.

Thanks!

readetaylor · August 3, 2019, 10:09pm

Any update @Phil?

I am also having related issues.

Thanks!

jsha · August 4, 2019, 1:39pm

We’ll keep pushing on it, thanks for checking in. What’s your first certificate expiry date? That can help us communicate the urgency to our upstream.

readetaylor · August 4, 2019, 1:43pm

We are migrating a lot of sites, and part of the migration process is to request a certificate. We are only sporadically successful. Is there an option to point to another place for certificate requests?

jsha · August 4, 2019, 1:48pm

I'm afraid not, sorry.

readetaylor · August 4, 2019, 1:50pm

Sorry to hear that. My requests are coming from AWS us-east-1 region, so I assume many others are impacted by this issue.

JuergenAuer · August 5, 2019, 8:13am

2 posts were split to a new topic: Timeout and incomplete answer checking the validation file

JamesLE · August 4, 2019, 11:49pm

Hi, @averi & @misc,

Our upstream ISP is still investigating this. At this point, with the troubleshooting information we’ve gathered, it looks like the problem’s limited to the small slice of IP space you’re in (8.43.84.0/22). It’s reachable from most of our validation endpoints, but a traceroute from one of them appears to stop within Peak 10’s Raleigh POP, which is the last hop before your /22.
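As a quick sanity check on the address ranges in this thread, the following sketch uses Python’s standard `ipaddress` module to confirm that the 8.43.85.0/24 subnet from the title sits inside the 8.43.84.0/22 block, and that the two hosts reported above fall within it. The helper name `hosts_in` is made up for illustration.

```python
import ipaddress

reported_subnet = ipaddress.ip_network("8.43.85.0/24")  # from the thread title
affected_block = ipaddress.ip_network("8.43.84.0/22")   # the /22 named above


def hosts_in(network, addresses):
    """Return the addresses (as given) that fall inside the network."""
    return [a for a in addresses if ipaddress.ip_address(a) in network]


# subnet_of() requires Python 3.7 or later.
print(reported_subnet.subnet_of(affected_block))
# First and last address covered by the /22.
print(affected_block[0], affected_block[-1])
# wiki.gnome.org, the unrelated server, and the validation client from the log.
print(hosts_in(affected_block, ["8.43.85.12", "8.43.85.171", "3.14.255.131"]))
```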

It’s hard to tell from the outside exactly how your network connectivity is set up, but I’ve spot-checked a few sites with what I think is very similar connectivity, and they are working for us.

We’re going to keep following up. It might speed things up for you to check with your network engineers, and with your immediate upstream ISP, as well. Is it possible there’s some kind of firewall or DDoS protection appliance that’s blocking our validation endpoints? That’s an issue we’ve seen before, with similar symptoms.

JamesLE · August 4, 2019, 11:55pm

Hi, @readetaylor,

Your issue may be unrelated: we’re not aware of any problems affecting validations for sites on AWS.

I assume you’re also seeing a “Timeout during connect (likely firewall problem).” A frequent cause of this is publishing IPv6 AAAA records in DNS, without allowing IPv6 connections through your firewall (or AWS security group).
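A rough way to check for that mismatch is to list which address families DNS publishes for the host and then try a TCP connect to port 80 on each address. The sketch below uses only the standard `socket` module; the function names are hypothetical, and a test run from inside your own network may not reflect what an external validation server sees, so it is best run from an outside machine with working IPv6.

```python
import socket


def group_by_family(addrinfo):
    """Split getaddrinfo() results into IPv4 and IPv6 address lists."""
    grouped = {"IPv4": [], "IPv6": []}
    for family, _type, _proto, _canonname, sockaddr in addrinfo:
        if family == socket.AF_INET:
            key = "IPv4"
        elif family == socket.AF_INET6:
            key = "IPv6"
        else:
            continue
        if sockaddr[0] not in grouped[key]:
            grouped[key].append(sockaddr[0])
    return grouped


def published_addresses(host, port=80):
    """Resolve host and return its addresses grouped by family."""
    return group_by_family(socket.getaddrinfo(host, port, type=socket.SOCK_STREAM))


def can_connect(address, port=80, timeout=5.0):
    """Return True if a TCP connection to address:port succeeds in time."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `published_addresses` returns IPv6 entries for a domain but `can_connect` fails for all of them while IPv4 works, the AAAA record or the IPv6 firewall rules are the likely culprit.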

If that’s not it, could you please start a new thread with full troubleshooting info, including some sample domains?

Thanks!

averi · August 6, 2019, 12:44pm

The problem appears to be fixed. We had our network engineering team double check and it appears asymmetric routing got in the way. An RCA is still being worked out internally. Thanks a lot for your support!
