Alpine Linux, Nginx - Timeout during connect (likely firewall problem)

My domain is: catona.cloud

I ran this command: sudo certbot --nginx -d catona.cloud -d www.catona.cloud

It produced this output:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Requesting a certificate for catona.cloud and www.catona.cloud

Certbot failed to authenticate some domains (authenticator: nginx). The Certificate Authority reported these problems:
  Domain: www.catona.cloud
  Type:   connection
  Detail: 23.95.47.78: Fetching http://www.catona.cloud/.well-known/acme-challenge/KK-ZDy0DMzT-oLL8gknKeyr3gevb2TwQCuzBHpLXBCw: Timeout during connect (likely firewall problem)

  Domain: catona.cloud
  Type:   connection
  Detail: 23.95.47.78: Fetching http://catona.cloud/.well-known/acme-challenge/Um_n5Xb8FvJr3STbueeD24Ogk_qIfeGGnKhAygHQneU: Timeout during connect (likely firewall problem)

Hint: The Certificate Authority failed to verify the temporary nginx configuration changes made by Certbot. Ensure the listed domains point to this nginx server and that it is accessible from the internet.

Some challenges have failed.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

My web server is (include version): nginx-1.22.1-r0

The operating system my web server runs on is (include version): Alpine x86, up to date as of 08 Jan 2023

I can login to a root shell on my machine (yes or no, or I don't know): Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): 1.32.0

Notes:

I can access a test page with http and I can netcat on 443 and send and receive messages. Nginx says the .well-known URL was accessed successfully:

52.14.131.62 - - [08/Jan/2023:20:40:25 +0000] "GET /.well-known/acme-challenge/KK-ZDy0DMzT-oLL8gknKeyr3gevb2TwQCuzBHpLXBCw HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
35.88.181.98 - - [08/Jan/2023:20:40:25 +0000] "GET /.well-known/acme-challenge/KK-ZDy0DMzT-oLL8gknKeyr3gevb2TwQCuzBHpLXBCw HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"

I can post the nginx config but there isn't much to it, just a host in /etc/nginx/http.d/ and rtmp.

Firewall is awall, there isn't much to it either and I can access 80 and 443 so it shouldn't be the problem.

Those two IP addresses are only the 2 secondary vantage points Let's Encrypt uses to validate the hostname. The primary, which must always be successful, apparently cannot connect to your IP on port 80.

Might have something to do with specific block lists blocking certain IP addresses?

Any idea what could be causing this? I didn't block anything, and I don't have fail2ban or any service like that yet. My machine and the secondary servers can access the IP and port.

I wouldn't know. Might even be your ISP, but that would be kinda weird. Some firewalls have geoblocking in place, maybe something like that.

I tried allowing all inbound traffic in awall and got the same results. There is nothing in the logs except for SYN packets on weird ports.
But I noticed something weird in the nginx logs: the same key was requested again 25 min later with a weird user agent.

52.14.131.62 - - [08/Jan/2023:20:40:25 +0000] "GET /.well-known/acme-challenge/Um_n5Xb8FvJr3STbueeD24Ogk_qIfeGGnKhAygHQneU HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
35.88.181.98 - - [08/Jan/2023:20:40:25 +0000] "GET /.well-known/acme-challenge/Um_n5Xb8FvJr3STbueeD24Ogk_qIfeGGnKhAygHQneU HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
x.x.x.x - - [08/Jan/2023:21:48:22 +0000] "HEAD /.well-known/acme-challenge/Um_n5Xb8FvJr3STbueeD24Ogk_qIfeGGnKhAygHQneU HTTP/1.1" 404 0 "-" "curl/7.84.0" "-"
x.x.x.x - - [08/Jan/2023:22:05:13 +0000] "GET /favicon.ico HTTP/1.1" 404 548 "http://catona.cloud/.well-known/acme-challenge/Um_n5Xb8FvJr3STbueeD24Ogk_qIfeGGnKhAygHQneU" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36 Edg/108.0.1462.76" "-"

I can find some .well-known/ 404s in the logs, but they are not accompanied by delayed attempts. Are these primary servers? 18.218.31.92, 52.14.131.62, 34.221.164.126

Supplemental information

Let's Debug gets an ERROR; results here https://letsdebug.net/catona.cloud/1328338

$ nmap catona.cloud
Starting Nmap 7.91 ( https://nmap.org ) at 2023-01-08 14:18 PST
Nmap scan report for catona.cloud (23.95.47.78)
Host is up (0.060s latency).
rDNS record for 23.95.47.78: mail1.gaysentniel.com
Not shown: 996 filtered ports
PORT     STATE  SERVICE
22/tcp   open   ssh
80/tcp   open   http
443/tcp  closed https
1935/tcp open   rtmp

Nmap done: 1 IP address (1 host up) scanned in 5.58 seconds

That was probably me, seeing if I could connect from my home connection. Sorry for the confusion.

The short of it is that when you do a validation request, you should be seeing at least three connections and only saw two, so your site isn't actually accessible from everywhere on the Internet. Let's Encrypt needs to check from multiple places to help confirm that you actually own the name as seen from everywhere.

I reran the test with the firewall all open:

nmap -sS catona.cloud:

Not shown: 992 closed ports
PORT     STATE    SERVICE
22/tcp   open     ssh
25/tcp   filtered smtp
80/tcp   open     http
443/tcp  open     https
587/tcp  filtered submission
1935/tcp open     rtmp
5060/tcp open     sip
8080/tcp open     http-proxy

certbot still gives the same timeout error. Let's Debug still can't find the .well-known/ path, which is expected because there is nothing there.

/etc/nginx/http.d/catona.cloud.conf

server {
        listen 80;
        root /var/www/catona.cloud/html;
        index index.html;
        server_name catona.cloud www.catona.cloud;
}

/etc/nginx/nginx.conf is pretty much the default on Alpine with an rtmp section added.

Thanks, I removed your IP from the post.

Yet this is what I am seeing from my location

# nmap -sS catona.cloud
Starting Nmap 7.91 ( https://nmap.org ) at 2023-01-08 14:38 PST
sendto in send_ip_packet_sd: sendto(4, packet, 40, 0, 23.95.47.78, 16) => Permission denied
Offending packet: TCP 192.168.1.51:36711 > 23.95.47.78:80 A ttl=38 id=28470 iplen=40  seq=0 win=1024
Nmap scan report for catona.cloud (23.95.47.78)
Host is up (0.063s latency).
rDNS record for 23.95.47.78: mail1.gaysentniel.com
Not shown: 993 closed ports
PORT     STATE    SERVICE
22/tcp   open     ssh
25/tcp   filtered smtp
80/tcp   open     http
135/tcp  filtered msrpc
139/tcp  filtered netbios-ssn
445/tcp  filtered microsoft-ds
1935/tcp open     rtmp

Nmap done: 1 IP address (1 host up) scanned in 2.40 seconds

And this

$ nmap -Pn catona.cloud
Starting Nmap 7.80 ( https://nmap.org ) at 2023-01-08 22:39 UTC
Nmap scan report for catona.cloud (23.95.47.78)
Host is up (0.061s latency).
rDNS record for 23.95.47.78: mail1.gaysentniel.com
Not shown: 993 closed ports
PORT     STATE    SERVICE
22/tcp   open     ssh
25/tcp   filtered smtp
80/tcp   open     http
135/tcp  filtered msrpc
139/tcp  filtered netbios-ssn
445/tcp  filtered microsoft-ds
1935/tcp open     rtmp

Nmap done: 1 IP address (1 host up) scanned in 2.11 seconds
$ curl -Ii http://catona.cloud/.well-known/acme-challenge/sometestfile
HTTP/1.1 404 Not Found
Server: nginx
Date: Sun, 08 Jan 2023 22:42:26 GMT
Content-Type: text/html
Content-Length: 146
Connection: keep-alive

$ curl -Ii http://catona.cloud/.well-known/acme-challenge/sometestfile -A "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
HTTP/1.1 404 Not Found
Server: nginx
Date: Sun, 08 Jan 2023 22:42:40 GMT
Content-Type: text/html
Content-Length: 146
Connection: keep-alive

135/tcp  filtered msrpc
139/tcp  filtered netbios-ssn
445/tcp  filtered microsoft-ds

hmm.....? I'm completely lost now. This is just a generic VPS with recently installed Alpine, not much configuration done and no fail2ban.

This is really weird. I've configured Let's Encrypt many times before and it was always straightforward.
I'm using Racknerd now. Do you know if there have been any problems with them before?

Using this online tool, https://check-host.net/, the results I got for HTTP port 80 from around the world look good.

I put a file at "/var/www/catona.cloud/html/.well-known/acme-challenge/hello.txt" and you can see it live at http://catona.cloud/.well-known/acme-challenge/hello.txt

I really don't know what else to try. I can create a file in the directory, the secondary servers can see the file certbot created, and the firewall is open, even though that shouldn't be the problem.

nmap -sS showed some different results for Bruce5051 but I don't know what to make of it.
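For reading those nmap states: "open" means the connection was accepted, "closed" means the host actively refused it, and "filtered" means no usable answer came back, which is what a validator would experience as a connect timeout. A minimal sketch of the same distinction using a plain TCP connect follows. It is not nmap's SYN scan, and the function name and timeout value are just illustrative.

```python
import errno
import socket


def classify_port(host, port, timeout=5.0):
    """Return 'open', 'closed', or 'filtered' for a TCP connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # handshake completed
    except ConnectionRefusedError:
        return "closed"  # the host answered with a reset
    except socket.timeout:
        return "filtered"  # no answer at all within the timeout
    except OSError as exc:
        # Unreachable errors usually mean something on the path dropped us.
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "filtered"
        raise  # DNS failures and other errors are not classified here
```

For example, `classify_port("catona.cloud", 80)` run from several different networks should return the same answer everywhere. If one location gets "filtered" while others get "open", something between that location and the server is dropping the traffic, and it is not necessarily the firewall on the server itself.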

Yes, I see the file fine with curl.

$ curl http://catona.cloud/.well-known/acme-challenge/hello.txt
hello let's encrypt discourse!

$ curl -Ii http://catona.cloud/.well-known/acme-challenge/hello.txt
HTTP/1.1 200 OK
Server: nginx
Date: Sun, 08 Jan 2023 22:56:08 GMT
Content-Type: text/plain
Content-Length: 31
Last-Modified: Sun, 08 Jan 2023 22:53:56 GMT
Connection: keep-alive
ETag: "63bb4984-1f"
Accept-Ranges: bytes

$ curl -Ii http://catona.cloud/.well-known/acme-challenge/hello.txt  -A "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
HTTP/1.1 200 OK
Server: nginx
Date: Sun, 08 Jan 2023 22:56:23 GMT
Content-Type: text/plain
Content-Length: 31
Last-Modified: Sun, 08 Jan 2023 22:53:56 GMT
Connection: keep-alive
ETag: "63bb4984-1f"
Accept-Ranges: bytes
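The two curl checks above compare the response with and without the Let's Encrypt user agent, to rule out filtering based on the User-Agent header. A rough Python equivalent is sketched below, with hypothetical helper names. It only tests reachability from wherever it is run, so it cannot reproduce what the Let's Encrypt primary server sees.

```python
import urllib.error
import urllib.request

# User agent string as it appears in the nginx log lines in this thread.
LE_USER_AGENT = (
    "Mozilla/5.0 (compatible; Let's Encrypt validation server; "
    "+https://www.letsencrypt.org)"
)


def build_head_request(url, user_agent=None):
    """Build a HEAD request, optionally overriding the User-Agent header."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    return urllib.request.Request(url, headers=headers, method="HEAD")


def head_status(url, user_agent=None, timeout=10.0):
    """Return the HTTP status code for a HEAD request to `url`."""
    req = build_head_request(url, user_agent)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A 4xx/5xx answer still proves the server was reachable.
        return err.code
```

Calling `head_status(url)` and `head_status(url, LE_USER_AGENT)` on the same challenge URL should give the same status code. A difference between the two would point at a rule keyed on the user agent, while a timeout on both points at a network-level block.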

I've been trying since yesterday with many different options. Is it possible that I did something wrong and some config file was left behind? I tried certonly --webroot -w /var/www/catona.cloud/html and a few other options, but nothing too crazy.

I installed with apk add certbot certbot-nginx.

I have wireguard configured on some high port (+40k UDP) and I connect from my pc, it uses 192.168.2.1 so there is no way I tried to access that by accident instead of catona.cloud.

I have no other ideas.

Try a sudo traceroute -T -p 443 acme-v02.api.letsencrypt.org

And a traceroute from me to you

$ sudo traceroute -T -p 443 catona.cloud
traceroute to catona.cloud (23.95.47.78), 30 hops max, 60 byte packets
 1  192.168.1.1 (192.168.1.1)  0.198 ms  0.149 ms  0.197 ms
 2  96.120.60.137 (96.120.60.137)  8.927 ms  8.915 ms  8.896 ms
 3  162.151.125.157 (162.151.125.157)  10.406 ms  10.394 ms  10.357 ms
 4  ae-2-rur02.beaverton.or.bverton.comcast.net (68.85.243.154)  15.522 ms  15.503 ms  15.490 ms
 5  96.216.60.245 (96.216.60.245)  8.761 ms  8.729 ms  8.711 ms
 6  68.85.243.197 (68.85.243.197)  10.208 ms  13.134 ms  13.100 ms
 7  be-36221-cs02.seattle.wa.ibone.comcast.net (68.86.93.53)  21.847 ms be-36231-cs03.seattle.wa.ibone.comcast.net (68.86.93.57)  12.261 ms be-36211-cs01.seattle.wa.ibone.comcast.net (68.86.93.49)  12.185 ms
 8  be-2112-pe12.seattle.wa.ibone.comcast.net (96.110.34.130)  12.161 ms be-2212-pe12.seattle.wa.ibone.comcast.net (96.110.34.134)  12.144 ms  16.227 ms
 9  * * *
10  be2085.ccr21.slc01.atlas.cogentco.com (154.54.2.198)  41.294 ms be2042.ccr32.slc01.atlas.cogentco.com (154.54.89.102)  41.150 ms  41.247 ms
11  be3037.ccr21.den01.atlas.cogentco.com (154.54.41.146)  46.943 ms be3038.ccr22.den01.atlas.cogentco.com (154.54.42.98)  42.063 ms  85.485 ms
12  be3035.ccr21.mci01.atlas.cogentco.com (154.54.5.90)  65.042 ms be3036.ccr22.mci01.atlas.cogentco.com (154.54.31.90)  64.624 ms  60.303 ms
13  be2433.ccr32.dfw01.atlas.cogentco.com (154.54.3.213)  60.277 ms  61.289 ms  61.277 ms
14  be2560.rcr21.b010621-0.dfw01.atlas.cogentco.com (154.54.5.238)  60.642 ms be2561.rcr21.b010621-0.dfw01.atlas.cogentco.com (154.54.6.74)  59.344 ms  59.256 ms
15  38.32.13.10 (38.32.13.10)  60.039 ms 38.32.13.2 (38.32.13.2)  58.546 ms 38.32.13.26 (38.32.13.26)  58.259 ms
16  * * *
17  mail1.gaysentniel.com (23.95.47.78)  63.698 ms  63.672 ms  63.036 ms

No, nothing in certbot could cause this failure. It is a problem with the Let's Encrypt primary server trying to reach your domain on port 80.

Almost always this is some firewall blocking the IP address or request. LE does not publish the IP addresses and they change. (link here)

Sometimes there are geographic based blocks and sometimes there are settings for DDoS protection that don't work well and block the LE Server. There might be something like this with your hosting provider. You could ask them.

You could coordinate a test with your hosting service to see if they see 3 inbound requests to your IP. That way they know whether the request was lost from their outer network edge to your server. Or, whether a request never made it to them.

If you can't find a firewall setting the problem becomes a more involved comms issue. These are very rare.

I'm not able to run traceroute for some reason, I get send: Operation not permitted. As root, yes.
I'm trying to figure this out.

I will open a ticket with the provider.

If you can't find a firewall setting the problem becomes a more involved comms issue. These are very rare.

Not for me, I have the worst luck.

Yes, the -T option on traceroute needs root-level permission; that's why I had the sudo.

Here is what I get from my location.

$ sudo traceroute -T -p 443 acme-v02.api.letsencrypt.org
traceroute to acme-v02.api.letsencrypt.org (172.65.32.248), 30 hops max, 60 byte packets
 1  192.168.1.1 (192.168.1.1)  0.183 ms  0.190 ms  0.130 ms
 2  96.120.60.137 (96.120.60.137)  13.698 ms  13.670 ms  7.234 ms
 3  68.87.217.41 (68.87.217.41)  14.318 ms  14.299 ms  14.287 ms
 4  96.216.60.245 (96.216.60.245)  13.776 ms  13.734 ms  13.600 ms
 5  68.85.243.197 (68.85.243.197)  17.554 ms  17.530 ms  17.510 ms
 6  69.252.236.134 (69.252.236.134)  16.570 ms  15.901 ms  15.868 ms
 7  172.65.32.248 (172.65.32.248)  15.203 ms  8.078 ms  11.659 ms