Virtualmin Letsencrypt unable to connect from EC2

My domain is:

insidernewspaper.com

I ran this command:
I requested the Let's Encrypt certificate manually, using "Request Certificate" on Virtualmin's SSL Certificate page.

It produced this output:
.. request failed : Web-based validation failed :

Saving debug log to /var/log/letsencrypt/letsencrypt.log An unexpected error occurred: requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fb02bd42950>, 'Connection to acme-v02.api.letsencrypt.org timed out. (connect timeout=45)')) Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

My web server is (include version):
EC2 t3.large

The operating system my web server runs on is (include version):
ubuntu-jammy-22.04-amd64-server-20221201

My hosting provider, if applicable, is:
AWS

I can login to a root shell on my machine (yes or no, or I don't know):
Yep

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
Virtualmin, 7.3-1

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
certbot --version certbot 1.21.0

ZVN DEV is a software company, and one of our clients, Word Web Design, was hosting about 150 smaller company websites on their physical server in Oregon. Late last year they found out their server provider was going out of business, and there was no one to help them move their sites elsewhere, so I helped stand this up on AWS. Everything worked well once we worked out the kinks, except that certbot has never automatically renewed the SSL certificates for the sites. We had to renew them manually as they neared their expiration dates.

About two weeks ago the server crashed for some reason and had to be restarted. After that, Let's Encrypt requests started throwing these errors. Now multiple clients' certificates are expiring, with many more to come, and we've been trying to find a fix.

We're happy to pay someone to come help, since we are not DevOps engineers here, and I've exhausted my DevOps contacts with no luck so far.

Things we've tried:

  • Running Certbot manually from terminal
  • Restarting the EC2 again
  • Adjusting security groups, and replacing them on EC2
  • Disconnecting and reconnecting static IP
  • Dug through existing Virtualmin Community Forums and support posts as well as Let's Encrypt
  • Manually generating signed keys with openssl and dropping them into the custom ssl process on virtualmin
  • Curling the Let's Encrypt API endpoint to test the connection:
curl -v https://acme-v02.api.letsencrypt.org 
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 2606:4700:60:0:f53d:5624:85c7:3a2c:443...
*   Trying 172.65.32.248:443...

  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:06 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:07 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:08 --:--:--     0
  • We can ping virtualmin, and we can connect to other sites.

Thank you!
-Kirby

Welcome @Kirby

Maybe we will get lucky. Let's try to narrow the scope of the problem.

Can you show result of these four commands?

curl -4 https://ifconfig.io
curl -6 https://ifconfig.io

curl -I4 https://acme-v02.api.letsencrypt.org/directory
curl -I6 https://acme-v02.api.letsencrypt.org/directory

And, can you provide at least one fully qualified domain name that is failing to renew?

Thanks for responding Mike!

Here are the outputs, with the internal IP obscured:

[root@ip-111-11-11-111 ~]# curl -4 https://ifconfig.io 
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:06 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:07 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:08 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:09 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:10 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:11 --:--:--     0
^C
[root@ip-111-11-11-111 ~]# curl -6 https://ifconfig.io 
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:06 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:07 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:08 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:09 --:--:--     0
^C
[root@ip-111-11-11-111 ~]# curl -I4 https://acme-v02.api.letsencrypt.org/directory 
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:06 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:07 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:08 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:09 --:--:--     0
^C
[root@ip-111-11-11-111 ~]# curl -I6 https://acme-v02.api.letsencrypt.org/directory 
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:02 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:04 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
  0     0    0     0    0     0      0      0 --:--:--  0:00:06 --:--:--     0

This was all run within Virtualmin's terminal.
If I run the same commands over SSH on the EC2 instance, I get no response from any of them.

I don't use Virtualmin and don't know why it is showing the output with progress reporting on.

But did both curl requests (IPv4 and IPv6) to ifconfig.io return anything? ifconfig.io should return your public IP. Your IPs are in public DNS, so they are no secret.

Why can't you connect to anything outbound when you SSH into the EC2 instance? Is the EC2 instance you SSH into the one where wordwebdesign.com and insidernewspaper.com are hosted? I can reach both of those, so inbound IPv4 is working (you don't have IPv6 in your DNS for those).

Is Virtualmin running in that same EC2 instance?

ping uses ICMP. This feels to me like a network config problem for TCP. While ping is sometimes helpful, we need to use curl or other tools for TCP testing.
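
To make the ICMP vs TCP difference concrete, here is a minimal sketch of a TCP-only check. It assumes bash (for its built-in /dev/tcp redirection) and the coreutils `timeout` command are available; the function name `tcp_check` is made up for illustration.

```shell
#!/usr/bin/env bash
# tcp_check HOST PORT [SECONDS]
# Try to open a TCP connection using bash's built-in /dev/tcp redirection.
# Returns 0 if the connection opens, non-zero if it is refused or times out.
tcp_check() {
    local host="$1" port="$2" secs="${3:-5}"
    # The inner bash receives host as $0 and port as $1.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (needs network access, so it is left commented out):
# ping -c 1 -W 5 acme-v02.api.letsencrypt.org && echo "icmp ok"
# tcp_check acme-v02.api.letsencrypt.org 443 && echo "tcp 443 ok" || echo "tcp 443 blocked or timed out"
```

If ping succeeds but `tcp_check` fails for port 443, that would fit the idea that something is dropping TCP specifically rather than all traffic.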

I don't know how this is related but the domain insidernewspaper.com is currently using a self-signed cert so is failing validation.

Yet it was issued a Let's Encrypt cert on Aug 26 which does not expire until Nov 24.

Were you aware of this problem?

See validation failure at this SSL Checker (link here)

You've tried just curl with https://google.com I assume? If outgoing https is blocked it could just be that absolutely no outgoing traffic is allowed.

That's why I asked about https://ifconfig.io - that should confirm outbound to something and confirm the public IP - double bonus :)

One of the displays in the first post showed IPv4 and IPv6 addresses for the acme-v02 endpoint, so one thing I want to know is whether both outbound routes are valid.

Yep - this was an attempt to get them online first, via Virtualmin's self signing and then I tried to get the certificate from openssl, but that didn't work.

Hey Mike,

Yep - there are about 150 domains all hosted on this EC2 instance within virtualmin, including wordwebdesign.com and insidernewspaper.com and many more. Virtualmin is running on this same EC2 instance.

Thanks for the info about ping, that explains why curl was not working while ping was.

It does seem like the issue is with the EC2's outbound connections, but I can share the outbound Security group list with you, since it seems to be fully open.

Running on EC2 terminal:

curl -4 https://ifconfig.io
curl -6 https://ifconfig.io

curl -I4 https://acme-v02.api.letsencrypt.org/directory
curl -I6 https://acme-v02.api.letsencrypt.org/directory

None of them return anything at all, when run one by one.
They just hang without response:

$ curl -4 https://ifconfig.io

Same happens when curling google:

curl -4 http://www.google.com

No response.

I assume this means the problem is the EC2 instance's outbound connectivity.

Yes. Check your EC2 Security Group outbound. The default is usually all ports, all protocols allowed. But, make sure tcp is allowed for ports 80 and 443 if your settings are different. As an aside, it is the inbound rules that are usually restricted with outbound allowed.

Do you have any kind of firewall on the server that would block outbound?

Maybe this will show a clue. Although, the routings from inside an AWS VPC are not always clear to me. Still, if it doesn't get very far it might indicate where it is blocked.

sudo traceroute -T -p 443 acme-v02.api.letsencrypt.org

Outbound connections should be all open with these settings on the instance.

I had to remove `-T` and `-p` since they were unrecognized.  Hopefully this still helps:

sudo traceroute 443 acme-v02.api.letsencrypt.org
traceroute to ca80a1adb12a4fbdac5ffcbc944e9a61.pacloudflare.com (172.65.32.248), 64 hops max
  1   244.5.0.127  8.616ms  4.555ms  4.560ms 
  2   240.0.52.99  0.399ms  0.309ms  0.316ms 
  3   242.9.163.151  1.075ms  1.056ms  1.097ms 
  4   240.0.184.1  0.682ms  0.622ms  0.574ms 
  5   242.3.84.195  1.675ms  2.352ms  1.926ms 
  6   100.100.4.80  0.591ms  0.471ms  1.009ms 
  7   99.83.90.51  1.358ms  1.320ms  1.754ms 
  8   172.70.132.4  0.808ms  1.043ms  1.135ms 
  9   *  *  * 
 10   *  *  

Yeah, we cross-posted. Your outbound group looks the same as mine :)

So, going back to my previous post - any firewall blocking outbound maybe?

And, check traceroute

Tip: use curl -m 5 to timeout after 5 secs (or 10 or whatever)
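
A rough sketch of how that tip could be wrapped up, assuming bash and curl. The helper names `explain_curl_exit` and `probe` are invented for this example; the exit codes are the ones listed in the EXIT CODES section of the curl man page.

```shell
#!/usr/bin/env bash
# explain_curl_exit CODE
# Map a few common curl exit codes to a short hint.
explain_curl_exit() {
    case "$1" in
        0)  echo "ok" ;;
        6)  echo "could not resolve host (DNS)" ;;
        7)  echo "failed to connect (refused or unreachable)" ;;
        28) echo "timed out" ;;
        35) echo "TLS handshake failed" ;;
        *)  echo "other curl error $1" ;;
    esac
}

# probe URL [SECONDS]
# Fetch only the headers, quietly, with a hard overall timeout (-m).
probe() {
    local url="$1" secs="${2:-5}" rc=0
    curl -sS -I -m "$secs" -o /dev/null "$url" 2>/dev/null || rc=$?
    echo "$url: $(explain_curl_exit "$rc")"
}

# Example (needs network access, so it is left commented out):
# for u in http://www.google.com https://ifconfig.io https://acme-v02.api.letsencrypt.org/directory; do
#     probe "$u"
# done
```

A "timed out" result for every URL suggests packets are being dropped somewhere, whereas "failed to connect" would more likely mean something is actively rejecting the connection.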

$ curl -4 http://www.google.com -m 5
curl: (28) Failed to connect to www.google.com port 80 after 4922 ms: Connection timed out

As for firewall, I'm not sure. We didn't have one originally, but how would I check?

No, your traceroute isn't the full-featured one. Without -T, I think it just uses ICMP like ping (-p is needed to mimic the route of HTTPS).
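
A small sketch for checking which kind of traceroute is installed, assuming bash and grep. It relies on the assumption that a traceroute with TCP mode lists a `--tcp` option in its `--help` text; the helper name `has_tcp_mode` is made up for this example.

```shell
#!/usr/bin/env bash
# has_tcp_mode: read traceroute help text on stdin;
# succeed if it lists a --tcp option (assumed marker for TCP SYN tracing).
has_tcp_mode() {
    grep -q -- '--tcp'
}

if ! command -v traceroute >/dev/null 2>&1; then
    echo "traceroute is not installed"
elif traceroute --help 2>&1 | has_tcp_mode; then
    echo "this traceroute has TCP mode; try: sudo traceroute -T -p 443 acme-v02.api.letsencrypt.org"
else
    echo "this traceroute has no TCP mode; a fuller implementation is needed for -T"
fi
```

On Ubuntu, the fuller implementation is, as far as I know, packaged simply as `traceroute`, so installing that package may be enough to get -T.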

Ah that makes sense.

$ sudo traceroute -p 443 acme-v02.api.letsencrypt.org
traceroute to ca80a1adb12a4fbdac5ffcbc944e9a61.pacloudflare.com (172.65.32.248), 64 hops max
  1   *  *  * 
  2   240.0.52.65  0.315ms  0.303ms  0.291ms 
  3   242.7.113.67  7.613ms  6.967ms  1.030ms 
  4   240.0.184.3  1.120ms  1.036ms  1.731ms 
  5   242.3.84.71  1.105ms  1.098ms  1.082ms 
  6   100.100.4.94  0.584ms  0.568ms  0.581ms 
  7   99.83.90.51  13.793ms  14.229ms  10.038ms 
  8   172.70.36.5  1.281ms  1.396ms  1.152ms 
  9   *  * 

Maybe that's still not helpful without the -T

Not so much.

This checks one kind of firewall

sudo ufw status
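
ufw is only one front end. Here is a rough sketch for seeing which other common firewall tools are present, assuming a typical Linux host. The function name `installed_firewall_tools` is made up for this example, and the follow-up commands in the comments need root.

```shell
#!/usr/bin/env bash
# List which common Linux firewall front ends are installed on this host.
# A tool being present does not mean it has active rules;
# it only tells you where to look next.
installed_firewall_tools() {
    local tool
    for tool in ufw firewall-cmd nft iptables; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
    return 0
}

installed_firewall_tools

# Follow-up commands to run by hand as root, depending on what is listed:
#   iptables -S OUTPUT      # outbound rules and default policy (filter table)
#   nft list ruleset        # full nftables ruleset
#   firewall-cmd --state    # whether firewalld is running
```

An OUTPUT policy of DROP, or a rule dropping outbound TCP, would be one possible explanation for ping working while curl times out.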

It doesn't look like ufw exists:

$ sudo ufw status
sudo: ufw: command not found