Certbot standalone Timeout after connect (your server may be slow or overloaded)


My domain is: e-daak.in

I ran this command: certbot certonly -v --preferred-challenge http-01 --standalone -d e-daak.in --pre-hook "service nginx stop" --post-hook "service nginx start"

It produced this output:
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator standalone, Installer None
Running pre-hook command: service nginx stop
Requesting a certificate for e-daak.in
Performing the following challenges:
http-01 challenge for e-daak.in
Waiting for verification...
Challenge failed for domain e-daak.in
http-01 challenge for e-daak.in

Certbot failed to authenticate some domains (authenticator: standalone). The Certificate Authority reported these problems:
Domain: e-daak.in
Type: connection
Detail: During secondary validation: 103.153.253.41: Fetching http://e-daak.in/.well-known/acme-challenge/crTT9K9OqlRZ_6mO7S66leXOrV0kyHdMXKCj4tb7LRg: Timeout after connect (your server may be slow or overloaded)

Hint: The Certificate Authority failed to download the challenge files from the temporary standalone webserver started by Certbot on port 80. Ensure that the listed domains point to this machine and that it can accept inbound connections from the internet.

Cleaning up challenges
Running post-hook command: service nginx start
Some challenges have failed.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

My web server is (include version):
nginx version: nginx/1.22.1 (though I am using standalone authenticator above)

The operating system my web server runs on is (include version):
Debian GNU/Linux 12 (bookworm)

My hosting provider, if applicable, is:
Self-hosted

I can login to a root shell on my machine (yes or no, or I don't know): Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): certbot 2.9.0

Your nginx server responds to HTTP requests on port 80. Is there a reason you want to stop it to use --standalone option?

The --webroot or --nginx plugin would be easier and does not require nginx to be stopped during cert request.

That still might not help your "Secondary validation" failure which is probably a firewall blocking requests from certain regions of the world or certain IP addresses.

But, those are better methods if you have a running server and are easier to debug.


I get the same error when I use --nginx. I have gone through the discussions on this forum, and in many cases, questions were raised about potential problems with the nginx configuration. That is why I presented the results with --standalone, so that nothing with the webserver configuration comes in the way of validation.

I have disabled firewall on the system (ufw disable), but still get this error.

The system is behind a home router, that is correctly forwarding port 80. I do not have any other firewall in between.

Exactly the same? Please always show the error so we don't have to guess. Sometimes small differences matter a lot.

Are you sure your router or even your ISP doesn't block requests from certain regions? Or do they have any kind of "denial of service" (DDoS) protection? I ask because I cannot consistently reach your domain from various parts of the world.

Please show result of this

sudo certbot certonly --dry-run --nginx -d e-daak.in

Yes, but then to test the connection you must run Certbot in debug mode, which is difficult for an inexperienced person to do. The error we see is not related to an nginx config issue but to communications. Something is blocking some requests.
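One rough way to check for intermittent blocking is to fetch the same URL several times from a machine outside the home network and see how often it times out. The sketch below is only an illustration: the function name `probe` and its parameters are made up for this post, and it only shows what one vantage point sees, not what the Let's Encrypt validation servers see.

```python
import time
import urllib.error
import urllib.request


def probe(url, attempts=5, timeout=10.0, opener=urllib.request.urlopen, pause=0.0):
    """Fetch url several times; return a list of (outcome, seconds) tuples.

    outcome is the HTTP status code when a response arrived, or the
    exception class name (for example "TimeoutError") when none did.
    """
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with opener(url, timeout=timeout) as response:
                outcome = response.status
        except urllib.error.HTTPError as err:
            # A 404 still proves the request reached the web server.
            outcome = err.code
        except (urllib.error.URLError, OSError) as err:
            # Timeouts, resets and DNS failures all end up here.
            outcome = type(err).__name__
        results.append((outcome, round(time.monotonic() - start, 2)))
        time.sleep(pause)
    return results
```

For example, `probe("http://e-daak.in/.well-known/acme-challenge/test", attempts=10)` (the `test` path is a placeholder) should return a list of 404s if every request gets through. A mix of 404s and timeout entries would point at something dropping requests before they reach nginx.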


Thanks a lot for your help.

I don't think there is anything in the router that would block requests from any region though one cannot rule out a buggy firmware. Unfortunately, I also do not have a way to find out if the ISP is doing something. They are incompetent and rather uncooperative.

Here is the result of: sudo certbot certonly --dry-run --nginx -d e-daak.in


Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator nginx, Installer nginx
Requesting a certificate for e-daak.in
Performing the following challenges:
http-01 challenge for e-daak.in
Waiting for verification...
Challenge failed for domain e-daak.in
http-01 challenge for e-daak.in

Certbot failed to authenticate some domains (authenticator: nginx). The Certificate Authority reported these problems:
Domain: e-daak.in
Type: connection
Detail: During secondary validation: 103.153.253.41: Fetching http://e-daak.in/.well-known/acme-challenge/NILK0wUtmrD9p4T6Ni7Aj87wbnkSn2MdbbfXTohIZLw: Timeout after connect (your server may be slow or overloaded)

Hint: The Certificate Authority failed to verify the temporary nginx configuration changes made by Certbot. Ensure the listed domains point to this nginx server and that it is accessible from the internet.

Cleaning up challenges
Some challenges have failed.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.


I could try using the debug mode if you can point me to the instructions and if you think it would help in finding out what could be the problem.

I am reasonably experienced with managing linux servers.

Thanks once again


Do you have an nginx access log set up? Do you see any HTTP challenges coming in? We should see anywhere from 3 to 5.

Connections to your domain have been very slow. Is it possible your server is just getting overloaded? Below is a good test site for comms:
https://letsdebug.net


That means the primary validation succeeded.

That may mean that your system uses an IPS that doesn't like something about those requests/frequency/GeoLocation/etc.
HTTP is really not something one should be too concerned with IF you lock it down to (basically) do nothing.

IMHO, there are only two useful things for HTTP:

  • redirecting HTTP requests to HTTPS
    [which can be done by any system - doesn't have to be the same system that serves HTTPS]

  • handle ACME challenge requests
    this can be done many ways:
    a. from a separate system:
    -- proxy the requests to the originating system
    -- handle all the requests [be a centralized ACME certificate management server]
    b. from the same system:
    -- proxy to the local HTTPS service port
    -- handle them via --webroot
    -- handle them via redirecting to a specific local port [that an ACME client can use in --standalone mode]
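To make the "HTTP only does two things" idea concrete, here is a minimal sketch of the routing decision for a locked-down port 80 listener. It is not an nginx config and not what Certbot does internally; the function name `route_http_request` and the webroot path `/var/www/letsencrypt` are assumptions chosen for illustration.

```python
import string
from urllib.parse import urlsplit

ACME_PREFIX = "/.well-known/acme-challenge/"
# ACME tokens are base64url text, so only these characters are accepted.
TOKEN_CHARS = set(string.ascii_letters + string.digits + "-_")


def route_http_request(host, target, webroot="/var/www/letsencrypt"):
    """Decide what a locked-down port-80 listener does with one request.

    Returns ("serve", file_path) for ACME challenge requests, ("reject", None)
    for malformed challenge paths, and ("redirect", https_url) otherwise.
    The webroot default is a hypothetical path.
    """
    path = urlsplit(target).path
    if path.startswith(ACME_PREFIX):
        token = path[len(ACME_PREFIX):]
        # Refuse anything that could escape the webroot (dots, slashes, ...).
        if token and set(token) <= TOKEN_CHARS:
            return ("serve", webroot + ACME_PREFIX + token)
        return ("reject", None)
    # Everything else is sent to HTTPS unchanged.
    return ("redirect", "https://" + host + target)
```

Everything that is not a well-formed challenge request is either redirected or refused, which keeps the plain HTTP surface small.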


[quote="MikeMcQ, post:6, topic:215149, full:true"]
Do you have an nginx access log set up? Do you see any HTTP challenges coming in? We should see anywhere from 3 to 5.

Connections to your domain have been very slow. Is it possible your server is just getting overloaded? Below is a good test site for comms:
https://letsdebug.net
[/quote]

It first complained that the nameservers were inaccessible, kept trying for a few seconds and then gave the same error:


103.153.253.41: Fetching http://e-daak.in/.well-known/acme-challenge/Qt3s19MU02R5lhsGc3Z96dSsXuP7c4hLuTlCUb7JzsM: Timeout after connect (your server may be slow or overloaded)


When I did it another time, it said:


All OK!
OK
No issues were found with e-daak.in. If you are having problems with creating an SSL certificate, please visit the Let's Encrypt Community forums and post a question there

These are the nginx logs:


65.21.146.168 - - [20/Mar/2024:11:32:30 +0530] "GET /.well-known/acme-challenge/letsdebug-test HTTP/1.1" 404 125 "-" "Mozilla/5.0 (compatible; Let's Debug emulating Let's Encrypt validation server; +https://letsdebug.net)"
18.236.192.251 - - [20/Mar/2024:11:32:31 +0530] "GET /.well-known/acme-challenge/Qt3s19MU02R5lhsGc3Z96dSsXuP7c4hLuTlCUb7JzsM HTTP/1.1" 404 125 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
184.105.99.36 - - [20/Mar/2024:11:35:19 +0530] "GET /.well-known/acme-challenge/Qt3s19MU02R5lhsGc3Z96dSsXuP7c4hLuTlCUb7JzsM: HTTP/1.1" 404 125 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15"
65.21.146.168 - - [20/Mar/2024:11:36:07 +0530] "GET /.well-known/acme-challenge/letsdebug-test HTTP/1.1" 404 125 "-" "Mozilla/5.0 (compatible; Let's Debug emulating Let's Encrypt validation server; +https://letsdebug.net)"
66.133.109.36 - - [20/Mar/2024:11:36:09 +0530] "GET /.well-known/acme-challenge/2eJmXuNiqhNn-c3mRWn2Z_t8_KEc4x5mfqAsDS6Yxnc HTTP/1.1" 404 125 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"

This is what the nginx access logs show when I run certbot:

54.169.71.196 - - [20/Mar/2024:11:42:20 +0530] "GET /.well-known/acme-challenge/VUz_5H-IdI_-LD-M_3OGjOnWVHXgwj07EK2ktdbVU_E HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
17.58.58.24 - - [20/Mar/2024:11:42:30 +0530] "GET /.well-known/acme-challenge/crTT9K9OqlRZ_6mO7S66leXOrV0kyHdMXKCj4tb7LRg: HTTP/1.1" 404 125 "-" "AppleNewsBot"
17.58.58.8 - - [20/Mar/2024:11:42:39 +0530] "GET /.well-known/acme-challenge/crTT9K9OqlRZ_6mO7S66leXOrV0kyHdMXKCj4tb7LRg: HTTP/1.1" 404 125 "-" "AppleNewsBot"

That is the only request coming from a Let's Encrypt validation server. The user-agent strings of the other requests look like bots (AppleNewsBot, Macintosh).

That's good and bad. Only one HTTP challenge is reaching you and your server returns the correct response (status 200).

Your earlier post with logs from the Let's Debug tests also showed only one HTTP challenge reaching you for each test (a 404 reply is expected with Let's Debug tests).
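Counting validation requests by eye gets tedious once the log fills with bots, so a small script can do the filtering. This is a sketch that assumes the nginx "combined" log format seen in the lines pasted above. The function name `validation_hits` is made up here, and matching on the user-agent prefix is a heuristic, since anyone can send that header.

```python
import re
from collections import defaultdict

# nginx "combined" log format, as in the log lines quoted in this thread.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
ACME_PREFIX = "/.well-known/acme-challenge/"
LE_AGENT = "Mozilla/5.0 (compatible; Let's Encrypt validation server"


def validation_hits(lines):
    """Group Let's Encrypt validation requests by challenge token.

    Returns {token: [(ip, status), ...]}. Let's Debug probes and bots are
    skipped because their user-agent strings do not start with LE_AGENT.
    """
    hits = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if not match or not match["agent"].startswith(LE_AGENT):
            continue
        if match["path"].startswith(ACME_PREFIX):
            token = match["path"][len(ACME_PREFIX):]
            hits[token].append((match["ip"], int(match["status"])))
    return dict(hits)
```

Fed the access log (on Debian usually `/var/log/nginx/access.log`, but check your own config), each token should list several source IPs when validation is healthy. A token with a single entry matches the symptom described in this thread.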

Interestingly, the IP addresses reaching you are from several different regions in the world that Let's Encrypt checks from.

My best guess is you have a firewall that is too sensitive about blocking repeated requests to protect you from a DDoS (denial of service) attack.

You may just have a firewall blocking certain IP addresses which match the Let's Encrypt primary location in the US. All the IP addresses that reach you come from secondary locations.


How is your system doing on resources?
What shows?:
top


Nothing particularly unusual.

I just managed to renew the certificate. This process seems extremely unreliable. My experience has been that I have to keep trying at different hours, and, when I am lucky, it just goes through.

Has this been a common problem?

No.

Let's Encrypt issues around 4 million certs per day. If your problem was common it would be very obvious to the monitoring systems in place.

This is most likely a local comms issue in your ISP or something in your equipment.


Yes, that makes sense. You have been extremely helpful. Would you have any further advice about how to deal with this? I will keep confronting this situation every few months before renewal of certificates unless I can identify the source and fix it.


Just follow the normal recommendation to start renewing when 30 days remain before the cert expires. You probably already have a cron job or systemd timer set up to run twice a day. You could increase that to 4 times a day and then monitor closely during the last 30 days.

Use certbot certificates to view cert status
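If you want your own monitoring on top of that, the 30-day rule is easy to check in a script. The sketch below is just an illustration with made-up names (`days_remaining`, `should_renew`). It expects the expiry date in the OpenSSL text form, such as the `notAfter` value that Python's `SSLSocket.getpeercert()` returns.

```python
import ssl
import time

RENEW_WINDOW_DAYS = 30  # the renewal window recommended above


def days_remaining(not_after, now=None):
    """Days until a certificate expires.

    not_after uses the OpenSSL text form, e.g. "Jun 18 00:00:00 2024 GMT".
    now is a Unix timestamp; it defaults to the current time.
    """
    if now is None:
        now = time.time()
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def should_renew(not_after, now=None):
    """True once the certificate is inside the renewal window."""
    return days_remaining(not_after, now) <= RENEW_WINDOW_DAYS
```

A cert that is still valid but already inside the window is the signal to start watching the renewal attempts closely.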

You could look at using a DNS challenge instead of the HTTP challenge. Let's Encrypt then queries your DNS records instead of your web server. The DNS method is often harder to set up: your DNS provider must offer an API to allow auto-renewal, and Certbot must support that provider.

There are other ACME clients (like acme.sh or lego) that support a wider variety of DNS providers.
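For context on what the DNS challenge actually asks for: the ACME client must publish a TXT record under `_acme-challenge.<domain>`, and RFC 8555 section 8.4 defines its value as the base64url-encoded SHA-256 digest of the key authorization. Your ACME client computes this for you. The sketch below only shows the arithmetic, and the function name `dns01_record` is made up for this post.

```python
import base64
import hashlib


def dns01_record(domain, key_authorization):
    """Return (name, value) of the TXT record a DNS-01 challenge expects.

    Per RFC 8555 section 8.4 the value is the base64url-encoded SHA-256
    digest of the key authorization, with trailing "=" padding removed.
    """
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    # A wildcard name such as *.example.com validates at the base domain.
    base = domain[2:] if domain.startswith("*.") else domain
    return "_acme-challenge." + base, value
```

Since validation only needs that record to be visible in public DNS, nothing has to reach port 80 on the home connection at all.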

Here is a link with more details on the above:

https://eff-certbot.readthedocs.io/en/latest/using.html#automated-renewals


nothing unusual to you...
care to share?
[at least the top part of top]

top - 23:46:12 up 51 days, 10:16,  1 user,  load average: 0.02, 0.13, 0.38
Tasks: 410 total,   1 running, 409 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  7.7 sy,  0.0 ni, 92.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31993.1 total,   6637.9 free,   7232.7 used,  19055.1 buff/cache
MiB Swap:  15259.0 total,  15259.0 free,      0.0 used.  24760.4 avail Mem

Agreed.
That doesn't seem to be where any of this problem lies.

Back to network comms.

OR

Is using DNS-01 authentication a viable [automated] option?


Completely OFF-TOPIC...
But if you are not going to use the swap file [and you should never have to], then why even have one?
swapoff -a
