Certbot results in connection refused

My domain is: kenpro.com.au

I ran this command:

certbot certonly --apache --dry-run --cert-name kenmcq -d bridgeontheriverchoir.com,thecleftomaniacs.com,djkm.com.au,garysmithmusic.com.au,kenpro.com.au,mcquire.au,mcquire.com.au,surryhillsstudio.com.au,tckell.com,www.bridgeontheriverchoir.com,www.thecleftomaniacs.com,www.djkm.com.au,www.garysmithmusic.com.au,www.kenpro.com.au,www.mcquire.au,www.mcquire.com.au,www.surryhillsstudio.com.au,www.tckell.com

It produced this output:

You are updating certificate kenmcq to include new domain(s):

You are also removing previously included domain(s):

Did you intend to make this change?


(U)pdate certificate/(C)ancel: U
Simulating renewal of an existing certificate for bridgeontheriverchoir.com and 17 more domains

The following is abridged for brevity; the error is the same for all but three domains:

Certbot failed to authenticate some domains (authenticator: apache). The Certificate Authority reported these problems:
Domain: bridgeontheriverchoir.com
Type: connection
Detail: 203.23.36.1: Fetching http://bridgeontheriverchoir.com/.well-known/acme-challenge/jH-9YRYRw9DvO4F8HNjRv11fn4tosffGgAZJwn__IZs: Connection refused

I noticed that the <.well-known> directory does not exist.

I also get this for the other 3 domains:

Domain: tckell.com
Type: connection
Detail: 203.23.36.1: Fetching http://tckell.com/.well-known/acme-challenge/4rQxLGLq_1Tw0uhMz6_Bi_naMN3sjJBY-9F3ClV3FhM: Timeout during connect (likely firewall problem)

There are no firewall issues (80 and 443 are open), and all the existing sites work and are certified. Some of the sites are new additions to this certificate, so they are not yet certified.

Most of these certificates have been in place and regularly renewed by cron for years.

I've noticed that I have TWO certificates for the same domains, but with different certificate names. Only one of them is referenced in the apache configs. I don't know how this happened. I've been trying to tidy up and may have messed something up, but can't see what.
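
One way to see how two certificates overlap is to compare their domain lists as sets. The sketch below is only an illustration: the domain lists are hypothetical placeholders, and in practice they would be copied from the output of `certbot certificates`.

```python
def compare_cert_domains(cert_a, cert_b):
    """Return the domains shared by both certificates and those unique to each."""
    a = {d.strip().lower() for d in cert_a}
    b = {d.strip().lower() for d in cert_b}
    return {
        "shared": sorted(a & b),
        "only_in_first": sorted(a - b),
        "only_in_second": sorted(b - a),
    }


# Hypothetical domain lists, for illustration only.
first = ["example.com", "www.example.com", "example.org"]
second = ["example.com", "www.example.com", "example.net"]

result = compare_cert_domains(first, second)
for key, domains in result.items():
    print(f"{key}: {', '.join(domains) or '(none)'}")
```

If one certificate turns out to have nothing under its "only in" heading and is not referenced in any Apache config, it is a reasonable candidate for cleanup.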

My web server is (include version): Apache 2.4.41 (Ubuntu)

The operating system my web server runs on is (include version): ubuntu 22.04

My hosting provider, if applicable, is: me

I can login to a root shell on my machine (yes or no, or I don't know): yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): NO

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): certbot 4.2.0

Is there a rate limit? I found a comment that there might be a limit of 5 renewals / week.
I've been making a lot of changes and might have exceeded 5.

Nah, it's not a rate limit. Your server is refusing the connection when Let's Encrypt tries to check your site. A common reason for this would be geographic firewall filtering (only allowing certain countries to access your site).

The sites have never been geographically firewalled, so I would be surprised if it was that. The certificates were renewed automatically quite recently.
Thanks for the thought though.

Thanks. Yes, that was just an example, but you are (or were) blocking their request somehow. Even if Apache had denied the request (with a 404, etc.), the connection itself would still have succeeded.
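
To make the distinction concrete, here is a rough Python sketch (standard library only) that fetches a URL and reports whether a connection was made at all. The function name `classify_fetch` is made up for this example; it is not part of Certbot or of Let's Encrypt's validation code, and it only shows what your own test machine sees, not what the Let's Encrypt servers see.

```python
import socket
import urllib.error
import urllib.request


def classify_fetch(url, timeout=5.0):
    """Fetch a URL and describe what happened at the connection level."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"connected, HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered (for example with a 404), so the TCP
        # connection itself succeeded.
        return f"connected, HTTP {exc.code}"
    except urllib.error.URLError as exc:
        reason = exc.reason
        if isinstance(reason, ConnectionRefusedError):
            # Something actively rejected the connection attempt.
            return "connection refused"
        if isinstance(reason, (socket.timeout, TimeoutError)):
            # Nothing answered before the deadline.
            return "timeout"
        return f"other failure: {reason}"
    except (socket.timeout, TimeoutError):
        return "timeout"
```

Calling it with a challenge-style URL on one of your domains from a few different networks would show whether the result depends on where the request comes from.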

Testing that URL works for me, so you should try your request again.

There is a rate limit for failed validations, but it's something like 5 per hour, and the error would be a rate limit message, not a timeout.

Interestingly, if I try to get a cert for that domain from the Let's Encrypt production system, the connection is NOT refused. Of course, it fails with a 404 Not Found, as I can't update your server, but it at least connects.

Every LE Staging attempt fails with "Connection refused", the same as you show in post #1.

This reinforces that there is something on your system that is selectively blocking incoming requests.

Do you get the same "Connection refused" even without --dry-run?

There is a rate limit on failed requests for production but the error message for that is clear. The rate limit with --dry-run (staging) is very tolerant.

Info:
In case I can't reproduce this later I want to show the 404 result for production

sudo certbot certonly --webroot -w /var/www/html -d bridgeontheriverchoir.com
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Requesting a certificate for bridgeontheriverchoir.com

Certbot failed to authenticate some domains (authenticator: webroot). The Certificate Authority reported these problems:
  Domain: bridgeontheriverchoir.com
  Type:   unauthorized
  Detail: 203.23.36.1: Invalid response from http://bridgeontheriverchoir.com/.well-known/acme-challenge/x4HRHUwmX46fLroprazTB7yDTrnAQOtwnSAu3So_GW0: 404
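
The 404 above is expected, because no token file was ever placed on the target server. As a rough illustration of why, the sketch below maps a challenge URL onto the file that a webroot setup would have to serve. It assumes a plain static mapping from URL path to the `-w` directory with no rewrites or aliases, and the token and file contents are hypothetical.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def challenge_file(webroot, challenge_url):
    """Map an HTTP-01 challenge URL to the file a webroot setup would serve."""
    url_path = urlparse(challenge_url).path.lstrip("/")
    return Path(webroot) / url_path


def expected_status(webroot, challenge_url):
    """Return 200 if the token file exists under the webroot, else 404."""
    return 200 if challenge_file(webroot, challenge_url).is_file() else 404


# A throwaway directory stands in for /var/www/html.
with tempfile.TemporaryDirectory() as webroot:
    url = "http://example.com/.well-known/acme-challenge/hypothetical-token"
    before = expected_status(webroot, url)
    token_path = challenge_file(webroot, url)
    token_path.parent.mkdir(parents=True)
    token_path.write_text("hypothetical-token.hypothetical-thumbprint")
    after = expected_status(webroot, url)

print(before, after)
```

This prints `404 200`: the same URL only resolves once the token file exists under the webroot that the web server actually serves.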

I just ran the exact same command with and without --dry-run.

With --dry-run it failed again, with the same problems.
Without --dry-run it succeeded normally:

Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/kenmcq/fullchain.pem
Key is saved at: /etc/letsencrypt/live/kenmcq/privkey.pem
This certificate expires on 2025-11-24.
These files will be updated when the certificate renews.
Certbot has set up a scheduled task to automatically renew this certificate in the background.

Is this a bug? Should I report it?

I don't see how this could be a Let's Encrypt bug. Or even a Certbot bug. The "connection refused" error is something your system actively does to reject the incoming HTTP request. Most often a firewall does that. Let's Encrypt is just reporting what happened when it sent your system an HTTP request.

The other error you showed was a generic "timeout". This is most often a firewall, although various causes are possible. Right now I cannot reproduce the timeout to your tckell domain. Does that still time out for you? See: Let's Debug

You said earlier there is no firewall problem. Is that because you checked and found nothing wrong? Or because you don't know of any firewall? Do you have a router? Is there perhaps a firewall built into that?

This very much seems like a firewall at your premises blocking requests from certain IP addresses (or IP ranges). The LE staging system is a different server group than LE production, and its requests arrive from different IP addresses.
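
As a simplified picture of how a range-based filter could let one server group through and reject another, consider the sketch below. All addresses are hypothetical values from the RFC 5737 documentation ranges; they are not Let's Encrypt's real validation addresses, which are not published.

```python
import ipaddress


def is_blocked(source_ip, blocked_ranges):
    """Return True if source_ip falls inside any of the blocked CIDR ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(r) for r in blocked_ranges)


# Hypothetical values from the RFC 5737 documentation ranges.
blocked = ["198.51.100.0/24"]
production_like = "192.0.2.10"   # stands in for a production validator
staging_like = "198.51.100.25"   # stands in for a staging validator

print(is_blocked(production_like, blocked))
print(is_blocked(staging_like, blocked))
```

With a rule like this in place, the same challenge URL would work from one source and be rejected from the other, which matches the pattern seen here.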

Can you explain more about the comms equipment / software you have?

It's a public-facing IP address. No firewall in the router.
I am running a standard Ubuntu server - nothing clever. It's been running for years and I'm the only one who has access (I hope! :)

I suspect the time-outs aren't important. I noticed different domains time out each run, but always the last two or three.

Everything is now working fine, so it's now a question of why the --dry-run failed.

root@ns:/etc/apache2/sites-enabled# netstat -tunlp
tcp6 0 0 :::80 :::* LISTEN 5589/apache2
tcp6 0 0 :::443 :::* LISTEN 5589/apache2

I have read that a tcp6 listener also accepts tcp4 connections. I'm not an expert.
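
On Linux, a socket bound to the IPv6 wildcard address `::` usually accepts IPv4 connections as well, unless the IPV6_V6ONLY option is set, which would explain why netstat shows only tcp6 lines here. The sketch below is a small standalone check of that behaviour on the local machine. It says nothing about how Apache itself is configured, and it returns None on hosts that cannot create a dual-stack socket.

```python
import socket


def ipv4_reaches_ipv6_listener():
    """Open a dual-stack IPv6 listener and try to reach it over IPv4.

    Returns True or False for the connection attempt, or None if this host
    cannot create a dual-stack socket at all.
    """
    if not socket.has_dualstack_ipv6():
        return None
    try:
        # Bind to the IPv6 wildcard on a free port with IPV6_V6ONLY disabled.
        server = socket.create_server(
            ("::", 0), family=socket.AF_INET6, dualstack_ipv6=True
        )
    except OSError:
        return None
    with server:
        port = server.getsockname()[1]
        try:
            # Connect over plain IPv4 to the IPv6 listener.
            with socket.create_connection(("127.0.0.1", port), timeout=2):
                return True
        except OSError:
            return False


print(ipv4_reaches_ipv6_listener())
```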

root@ns:/etc/apache2/sites-enabled# ufw status
Status: active

Apache                     ALLOW       Anywhere
443                        ALLOW       Anywhere
Apache (v6)                ALLOW       Anywhere (v6)
443 (v6)                   ALLOW       Anywhere (v6)

BTW, thanks for your replies.... much appreciated.

Sure. So, no firewall like fail2ban or similar?

Your Drupal 7 system has some kind of caching. Does it have any component doing firewalling or IP filtering? I didn't check every domain. Some do not use Drupal, but at least bridge and clefto do. This seems a good avenue to inspect thoroughly, especially if it has not been touched for a long time.

So.... I've been tidying up an old mess... that was the point of the exercise.

Somehow I had ended up with two certificates covering mostly the same domain names, although all the apache config files only referenced ONE of the certificates. The other certificate was redundant and I'm not sure how it got there.

I've now revoked/deleted the redundant certificate and guess what? --dry-run now works! It's odd that dry-run failed but live runs didn't fail.

BTW, the Drupal issue was related to a missing 443 configuration. It is now fixed, but it was not related to this problem.

That is very odd, because revoking / deleting a production cert has no bearing on an HTTP Challenge for a cert from Staging.

Certs are just files. Once you receive them there is no interaction between your system and Let's Encrypt. They are static. You can copy them to other systems or do whatever you like. LE doesn't know and is not affected.

It is possible to get a cert from the LE Staging system. It won't be trusted as it's for testing only. But, the --dry-run option doesn't even do that. It doesn't download / retain the cert that LE issued. The --dry-run tests that your system can resolve the challenge.

I think by accident you fixed / changed something involving your local comms or server such that your system no longer rejects inbound HTTP requests from the LE server. Perhaps even something in Drupal got cleaned up when you fixed the "443" problem.
