Connection reset by peer during HTTP ACME challenge for `.co.uk` domains

My domain is: madmask.co.uk

I ran this command: getssl -u (see https://github.com/srvrco/getssl)

It produced this output:

Check all certificates
Registering account
Verify each domain
Verifying madmask.co.uk
madmask.co.uk is already validated
Verifying www.madmask.co.uk
copying challenge token to /home/jakeqz/public_html/madmask.co.uk/.well-known/acme-challenge/gtlC7LRVj1_SGq3U2J31DdvDZxbQHgti77EHDnhHKy0
sending request to ACME server saying we're ready for challenge
checking if challenge is complete
getssl: www.madmask.co.uk:Verify error:    "detail": "92.205.0.87: Fetching http://www.madmask.co.uk/.well-known/acme-challenge/gtlC7LRVj1_SGq3U2J31DdvDZxbQHgti77EHDnhHKy0: Connection reset by peer",

My web server is (include version): Apache 2.4.57, LiteSpeed V8.0.1, Cloudlinux 1.3

The operating system my web server runs on is (include version): Linux sxb1plzcpnl489428.prod.sxb1.secureserver.net 2.6.32-954.3.5.lve1.4.90.el6.x86_64 #1 SMP Tue Feb 21 12:26:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

My hosting provider, if applicable, is: GoDaddy shared hosting with cPanel

I can login to a root shell on my machine (yes or no, or I don't know): no, it's shared hosting, but I have SSH access and can log into a non-root shell

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): cPanel 102.0.32

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): getssl V2.48 (see https://github.com/srvrco/getssl)

Automatic SSL certificate updates had been working fine for a few years until after 26 June 2023.

On 10 July, a twice-weekly cron job to renew SSLs failed for four .co.uk domains, yet succeeded for one .com domain. (On 26 June it succeeded for at least one .co.uk domain.)

The error in all cases was "Connection reset by peer".

The Apache logs show the expected requests and responses via HTTP (which is a 301 redirect to HTTPS) but do not show the follow-up via HTTPS.

I have tried disabling the HTTPS redirect. The Apache logs then show the content as being served with a 200 response and 87 bytes, but the error persists.

e.g.
18.219.241.224 - - [20/Jul/2023:15:10:26 -0700] "GET /.well-known/acme-challenge/gtlC7LRVj1_SGq3U2J31DdvDZxbQHgti77EHDnhHKy0 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" 1 **0/1630**

I can access the URL in a browser without problem.

I have tried with Let's Debug.

For a non-working .co.uk domain, I get

HTTPCheck
Debug
Requests made to the domain
Request to: madmask.co.uk/92.205.0.87, Result: [Address=92.205.0.87,Address Type=IPv4,Server=,HTTP Status=0], Issue: ANotWorking
Trace:
@0ms: Making a request to http://madmask.co.uk/.well-known/acme-challenge/letsdebug-test (using initial IP 92.205.0.87)
@0ms: Dialing 92.205.0.87
@196ms: Experienced error: read tcp 172.104.24.29:57898->92.205.0.87:80: read: connection reset by peer 

For a working .com domain, I get

HTTPCheck
Debug
Requests made to the domain
Request to: spinawoodworking.com/92.205.0.87, Result: [Address=92.205.0.87,Address Type=IPv4,Server=Apache,HTTP Status=301,Number of Redirects=1,Final HTTP Status=404], Issue:
Trace:
@0ms: Making a request to http://spinawoodworking.com/.well-known/acme-challenge/letsdebug-test (using initial IP 92.205.0.87)
@0ms: Dialing 92.205.0.87
@203ms: Server response: HTTP 301 Moved Permanently
@203ms: Received redirect to https://spinawoodworking.com/.well-known/acme-challenge/letsdebug-test
@203ms: Dialing 92.205.0.87
@1433ms: Server response: HTTP 404 Not Found 

(This is expected, since letsdebug-test doesn't exist.)

What on earth is going on? These are on the same server. Why is it only failing with .co.uk domains and not others?

What can I do? I have about 14 days before the current certificates expire. Please help.

Further Let's Debug tests show success with .org (e.g. sunningwellvillagehall.org) domains but failure with .org.uk (e.g. bdgc.org.uk) on the same above server.

It seems the issue is specific to the .uk TLD. But how could that be when the DNS lookup succeeds, and they are all hosted at the same IP address?

Oddly, for the domains that succeed, Let's Debug reports Let's Encrypt failures that never occurred, even though its own test reportedly succeeds.

The only thing I can think of is a crappy firewall in front of your server: when I tried your site the first time it failed, then a browser refresh succeeded, so I think renewal attempts made in quick succession may succeed.


Supplemental side note: the copyright message is a little dated too.

Copyright © 2017–2019 Mad Mask Theatre Company; all rights reserved.

It looks like the site hasn't changed since then, so that's the right date range for the copyright notice.


Maybe random due to bandwidth limiting on account. Not sure what you mean by "failed".

Just tried that, failed twice in succession.

Yes, the site hasn't changed since Covid. Not sure how that's relevant.


Try this instead: https://www.bdgc.org.uk/ and check it with Let's Debug.

Then try https://www.sunningwellvillagehall.org/ and repeat.

The .uk one will fail and the .org one will succeed. They are both on the same IP address on the same (possibly virtual) server.

Yes, the site hasn't changed since Covid. Not sure how that's relevant.

It isn't relevant.

I'd kind of like to use the DNS challenge and sidestep this problem (https://github.com/srvrco/getssl/blob/master/dns_scripts/GoDaddy-README.txt), but GoDaddy being GoDaddy, it looks painful.

P.S. I guess GoDaddy still doesn't expose cPanel AutoSSL?


No, of course not. They want to make money selling SSL certificates to people who don't know better.

DNS validation looked tricky. Also, some domains are registered with other providers. And besides, this was working for several years until a week or so ago.

Why has it suddenly stopped working only for .uk domains, while it still works for other TLDs on the same server at the same IP address?

You should be seeing at least 3 requests for each validation attempt; I'm not sure if you're saying you're only seeing one or if you're saying that your logs are showing your server responding to all of them.
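
One way to check how many validation requests actually reached Apache is to count them in the access log. This is only a minimal sketch, assuming a combined-format access log like the line quoted earlier in the thread; the function names and the log path in the usage comment are hypothetical, and the user-agent text is taken from that quoted log line.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting an Apache combined-format access log.

# Count requests for one challenge token that carry the Let's Encrypt
# validation user-agent. Usage: count_le_requests ACCESS_LOG TOKEN
count_le_requests() {
    log_file=$1
    token=$2
    # -F matches fixed strings; the trailing space stops longer tokens matching.
    # grep -c exits non-zero on a zero count, so "|| true" keeps the status clean.
    grep -F "/.well-known/acme-challenge/$token " "$log_file" \
        | grep -c -F "Let's Encrypt validation server" || true
}

# List the client IP and HTTP status of each of those requests.
# In combined log format, field 1 is the client IP and field 9 the status.
list_le_clients() {
    log_file=$1
    token=$2
    grep -F "/.well-known/acme-challenge/$token " "$log_file" \
        | grep -F "Let's Encrypt validation server" \
        | awk '{ print $1, $9 }'
}

# Example (the log path is an assumption; use wherever your host writes logs):
#   count_le_requests ~/access-logs/madmask.co.uk TOKEN
#   list_le_clients   ~/access-logs/madmask.co.uk TOKEN
```

A count of 1 for a failed attempt, against several for a successful one, would suggest the later requests never reached Apache at all.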

That is pretty weird. I'm with @orangepizza; there's likely some kind of firewall that's resetting the connection instead of letting the request through.

I can reproduce the problem from a test machine in AWS, at least some of the time, for whatever that's worth:

$ curl -v http://madmask.co.uk/.well-known/acme-challenge/test
*   Trying 92.205.0.87:80...
* Connected to madmask.co.uk (92.205.0.87) port 80 (#0)
> GET /.well-known/acme-challenge/test HTTP/1.1
> Host: madmask.co.uk
> User-Agent: curl/8.0.1
> Accept: */*
>
* Recv failure: Connection reset by peer
* Closing connection 0
curl: (56) Recv failure: Connection reset by peer

I have some bad news for you. We have seen these exact symptoms with more than a few GoDaddy-hosted servers in the past couple of months. We have not yet had anyone get it resolved. The only pattern is that each one was a GoDaddy host (or owned by GoDaddy).

The problem is that the first request from an IP gets a "connection reset". Repeating the request straight away succeeds, but if you wait a bit (a couple of minutes) you get another "connection reset" followed by some successes.

It has nothing to do with Let's Encrypt. In the most recent case I was able to reproduce this for the person's home page using a browser, so it is not even curl-related.

See this test sequence and note timestamps

curl -I https://www.bdgc.org.uk
curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.bdgc.org.uk:443

(immediately after the reset failure)
curl -I https://www.bdgc.org.uk
HTTP/2 200
x-powered-by: PHP/7.3.33
date: Fri, 21 Jul 2023 00:32:32 GMT
server: Apache

curl -I https://www.bdgc.org.uk
HTTP/2 200
x-powered-by: PHP/7.3.33
date: Fri, 21 Jul 2023 00:32:34 GMT
server: Apache

curl -I https://www.bdgc.org.uk
HTTP/2 200
x-powered-by: PHP/7.3.33
date: Fri, 21 Jul 2023 00:32:36 GMT
server: Apache

(this was 3 minutes later, probably would happen with less wait)
curl -I https://www.bdgc.org.uk
curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.bdgc.org.uk:443

I do too, except the "connection reset" can happen for a normal person accessing the site. A DNS challenge could get the cert, but the site would still have this access problem.

Still, we don't know exactly who is affected. Maybe it's just people outside the hosting region, which might be OK.

I found the most recent thread where we worked on this. You can see the browser failing to get the home page, then quickly reloading and showing the expected home page (in that case just a landing page for a new site).

@jakeqz TSOHOST is owned by GoDaddy


In the failure case (.uk), logs show the server responding to the first request, then no more (whether the response be a 301 redirect to HTTPS, or the actual data requested with a 200 response).

In the successful case (.com), I see two more requests all from different IP addresses, then a getssl local request, then two more from LE (the first one from the same IP as the first previous).

Did it succeed sometimes? And have you tried with the .org or .com domain?

Guess I will have to phone GoDaddy and waste several days trying to convince them it's a problem their end. Or...

I need to switch hosting providers anyway, since GoDaddy are useless in every department. Any recommendations? Moving elsewhere might actually be less of a headache than dealing with this issue anyway.


I didn't test as thoroughly as @MikeMcQ did, but it did seem to fail once, and then work for a while, yes.

No, I hadn't.

I think the consensus around here tends to be that the only advantage GoDaddy may have is that they can be cheap, if you don't mind working around their systems and attempts to upsell you.

There is a semi-official forum recommendation list of providers that are known to have integration with Let's Encrypt for getting certificates. But of course, there are other factors one might want to consider when choosing a hosting company. At this point, having good free integration with getting a certificate (regardless of CA) is the bare minimum one would expect.


I have managed so far (e.g. using cPanel uapi to install the Let's Encrypt certificates as part of an automated process), but I am stuck on this. If it is a firewall issue, why is it only blocking requests to .uk domains, and why are they getting as far as Apache sending the response out? And why has it suddenly stopped working?

Could you try this with a non-.uk domain, such as https://www.sunningwellvillagehall.org/?

I cannot reproduce your results from within the UK.

However, it seems that access to .uk domain names hosted on a GoDaddy-owned platform suffers this problem when made from outside the UK.

I see the identical symptom with your .org domain as with the .uk (see below). I don't see this as tld related (even before this test). This is most likely some odd network level firewall or faulty routing gear.

I might try later with a UK-based presence. It is possible this network failure only affects clients outside the hosted region (of indeterminate scope). If that's true and your site users are all local, that might not bother you, although search crawlers could potentially be affected too, impacting SEO.

It does impact Let's Encrypt HTTP Challenges as these requests are made from several points around the globe. The test server I've been using is in AWS on the US East Coast.

Did you see the link to the previous thread where I showed the browser getting the failure too? That could be happening to your sites even now. You could use a DNS Challenge to get the cert knowing some people will be affected.

curl -I https://www.sunningwellvillagehall.org
curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.sunningwellvillagehall.org:443

curl -I https://www.sunningwellvillagehall.org
HTTP/2 200
date: Fri, 21 Jul 2023 02:12:16 GMT
server: Apache

curl -I https://www.sunningwellvillagehall.org
HTTP/2 200
date: Fri, 21 Jul 2023 02:12:17 GMT
server: Apache

curl -I https://www.sunningwellvillagehall.org
HTTP/2 200
date: Fri, 21 Jul 2023 02:12:19 GMT
server: Apache

(waited about 3 mins again - see timestamp in next request)
curl -I https://www.sunningwellvillagehall.org
curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection to www.sunningwellvillagehall.org:443

curl -I https://www.sunningwellvillagehall.org
HTTP/2 200
date: Fri, 21 Jul 2023 02:15:18 GMT
server: Apache

Adding on to my previous post I found one thing interesting ...

I now realize that after the initial "reset" failure (with either .org or .uk) I can immediately curl to both .org and .uk successfully. In other words, one reset failure "opens the door" for success to both domains. Wait 3 minutes (probably even less) and I again get a "reset" to either one, followed by success to both.
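
The reset-then-success pattern described above could be probed with something like the following. This is only a sketch: the function names are made up for illustration, and the labels are my own shorthand for curl's documented exit codes (7, 28, 35, 56), not anything GoDaddy or Let's Encrypt reports.

```shell
#!/bin/sh
# Map a curl exit code to a short label (hypothetical helper).
classify_curl_exit() {
    case "$1" in
        0)  echo "ok" ;;
        7)  echo "connect-failed" ;;        # failed to connect to host
        28) echo "timeout" ;;               # operation timed out
        35) echo "tls-handshake-failed" ;;  # e.g. reset during SSL_connect
        56) echo "recv-failed" ;;           # e.g. "Recv failure: Connection reset by peer"
        *)  echo "other($1)" ;;
    esac
}

# Send one HEAD request; print a UTC timestamp, the URL and the outcome.
probe_once() {
    target=$1
    rc=0
    curl -sS -o /dev/null -I --max-time 10 "$target" 2>/dev/null || rc=$?
    printf '%s %s %s\n' "$(date -u +%H:%M:%S)" "$target" "$(classify_curl_exit "$rc")"
}

# Probe every URL given, COUNT times, GAP seconds apart.
# Usage: probe_series COUNT GAP URL...
probe_series() {
    count=$1
    gap=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        for url in "$@"; do
            probe_once "$url"
        done
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$gap"
        fi
    done
}

# Example: three quick rounds against both sites, a 3-minute pause, then again.
#   probe_series 3 2 https://www.bdgc.org.uk https://www.sunningwellvillagehall.org
#   sleep 180
#   probe_series 3 2 https://www.bdgc.org.uk https://www.sunningwellvillagehall.org
```

If the behaviour matches what I saw, the first line of each series should show a failure for whichever site is hit first, and everything after it should show "ok" for both.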


I've asked GoDaddy to look into this. Issue was escalated. They might get back to me.
