The operating system my web server runs on is (include version):
Ubuntu Server 18.04
My hosting provider, if applicable, is:
A box in my closet
I can login to a root shell on my machine (yes or no, or I don’t know):
Yes
I’m using a control panel to manage my site (no, or provide the name and version of the control panel):
No
I am having repeated “Timeout” issues when trying to get a certificate issued. As can be seen from the command above, I am trying to issue a certificate for the root domain and the www. subdomain of two domains with an existing CSR and key. At first, all the domains were giving the timeout error. I tried multiple times as I was messing around with firewall and DNS settings and eventually some of the domains started working, though I had not actually made any configuration changes. The last one (“www.matthewtmarley.com”) is still timing out in the production environment, though I was eventually able to get a certificate issued in the staging environment. Now when I rerun the command, I cannot see any requests arriving in nginx at all. I suspect that since the authorization requests come from multiple sources, one of the sources is getting blocked somewhere upstream of me, but all I get is the useless “Timeout” message, so there isn’t anything else I can think of to troubleshoot.
You have two chained CNAMEs on your server (for the www hostnames). Since your www name is a CNAME to the root, which then points on to the other domain, why not CNAME www straight to the other domain?
I changed all the CNAMEs to point directly to the root domain name, but I don’t expect that to make a difference. The DNS server was already following the CNAME chain and returning the IP address in its first response.
My reasoning is that I pointed a domain at your IPv6 address and tried doing an authorization against it.
What happened is that it succeeded on the port 80 request (using my domain), and then timed out on the subsequent redirect to the same server on port 443, but using your domain.
Now, when we do an authorization using just your domain, it fails immediately on the port 80 request, even though it used the exact same IPv6 address as in the former authorization attempt.
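A plain TCP connect test is one way to check ports 80 and 443 separately from some other host. This is a minimal sketch, assuming Python 3; `probe` is a hypothetical helper name, and a result from one vantage point says nothing certain about the path from the Let’s Encrypt validation servers:

```python
import socket

def probe(host, port, timeout=5.0):
    """Try a TCP connection and report 'open', 'timeout', 'refused' or an error string."""
    try:
        # create_connection resolves the host and connects; passing an
        # IPv6 literal as the host forces the attempt over IPv6.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        return "timeout"
    except ConnectionRefusedError:
        return "refused"
    except OSError as exc:
        return f"error: {exc}"
```

Calling `probe("2606:a000:4447:9802:baae:edff:fe73:314a", 80)` and then the same address with port 443 would show whether the two ports behave differently from wherever the script is run.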
This morning it seems to have started working properly. Maybe it was random server connectivity issues. I will keep an eye on it and close this if it does keep working.
It does seem to be working consistently now, but when I issue a certificate, I no longer see any requests arriving in the nginx access log. This concerns me because I am afraid that a previous successful result has been cached somewhere on LE’s side and that once the cache expires, it will go back to failing most of the time, as it did before. However, I don’t think there is anything I can do to test this on my side. Does anyone know how I might go about testing this, or know for sure whether there were server issues that may have caused the original problem on March 24?
I tried testing by adding extra temporary CNAMEs for which I could request certificates. I once again get the repeated timeouts with the new CNAMEs, but this time I was more careful about collecting the logs. Here’s what I see, grouped by response code (301s first, then 200s), then requesting IP address, and finally sorted by requested URL:
It appears that the server at 2600:3000:2710:300::1d has connectivity issues to my server, as you can see from the fact that there is only one 301 response to that IP (there should have been two, one for each domain) and no 200 responses. That is also consistent with the error message I received, which said that one domain timed out on http://test.michaelmarley.com/.well-known/acme-challenge/COjR7vfg5IKWd2mozoYzZhdbogIUwzyPo_I93cBFslQ (the unredirected URL) and the other timed out on https://matthewtmarley.com/.well-known/acme-challenge/bdjJSHuvUrPHHEUpeOKlwvVb_t8qakMXpYBSbXe9798 (the URL from the single 301 redirect sent to 2600:3000:2710:300::1d above). I attempted to diagnose this further using ping and traceroute, but was unable to make any progress since apparently none of the servers respond to pings.
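The grouping described above (status code, then requesting address, then URL) can be reproduced with a short script. This is a minimal sketch, assuming nginx’s default "combined" log format; the function name `group_requests`, the sample lines, the 2001:db8:: documentation addresses, and the TOKEN_A/TOKEN_B placeholders are all made up for illustration and are not taken from the actual log:

```python
import re
from collections import defaultdict

# Matches the start of a "combined" format line:
# client address, two fields, [timestamp], "METHOD URL PROTO", status.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) '
)

def group_requests(lines):
    """Return {status: {client_ip: [sorted URLs]}} for ACME challenge requests."""
    grouped = defaultdict(lambda: defaultdict(list))
    for line in lines:
        m = LINE_RE.match(line)
        if m and "/.well-known/acme-challenge/" in m["url"]:
            grouped[m["status"]][m["ip"]].append(m["url"])
    # Higher status codes first, so 301s are listed before 200s.
    return {
        status: {ip: sorted(urls) for ip, urls in sorted(by_ip.items())}
        for status, by_ip in sorted(grouped.items(), reverse=True)
    }

# Hypothetical sample lines (not real log data).
SAMPLE = [
    '2001:db8::1 - - [01/Jan/2019:00:00:00 +0000] "GET /.well-known/acme-challenge/TOKEN_B HTTP/1.1" 301 178 "-" "validator"',
    '2001:db8::1 - - [01/Jan/2019:00:00:01 +0000] "GET /.well-known/acme-challenge/TOKEN_A HTTP/1.1" 301 178 "-" "validator"',
    '2001:db8::2 - - [01/Jan/2019:00:00:02 +0000] "GET /.well-known/acme-challenge/TOKEN_A HTTP/1.1" 301 178 "-" "validator"',
    '2001:db8::1 - - [01/Jan/2019:00:00:03 +0000] "GET /.well-known/acme-challenge/TOKEN_A HTTP/1.1" 200 87 "-" "validator"',
    '2001:db8::1 - - [01/Jan/2019:00:00:04 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "browser"',
]

for status, by_ip in group_requests(SAMPLE).items():
    for ip, urls in by_ip.items():
        print(status, ip, len(urls), "request(s)")
```

An address that shows up under 301 but never under 200, like 2001:db8::2 in the made-up sample, matches the pattern described for the validation server that appears to time out.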
TL;DR: The server at 2600:3000:2710:300::1d seems to have connectivity issues to my server (at 2606:a000:4447:9802:baae:edff:fe73:314a). Can someone please investigate this? Thanks!
It provides less than half of the complete picture, but an mtr or traceroute6 to the Let’s Encrypt IP should get almost all the way to the server. I don’t know at what point they start getting filtered, but it should be informative enough.
michael@michaelmarley:~$ traceroute6 letsencrypt.org
traceroute to letsencrypt.org (2600:1408:9000:1ba::ce0) from 2606:a000:4447:9802:baae:edff:fe73:314a, port 33434, from port 40513, 30 hops max, 60 bytes packets
1 cpe-2606-A000-4447-9802-0-0-0-1.dyn6.twc.com (2606:a000:4447:9802::1) 0.229 ms 0.209 ms 0.274 ms
2 * * *
3 cpe-2606-A000-0-4-0-0-8-354.dyn6.twc.com (2606:a000:0:4::8:354) 13.771 ms 15.286 ms 17.477 ms
4 cpe-2606-A000-0-4-0-0-2-56.dyn6.twc.com (2606:a000:0:4::2:56) 20.284 ms 16.387 ms 16.364 ms
5 cpe-2606-A000-0-4-0-0-0-4E.dyn6.twc.com (2606:a000:0:4::4e) 23.407 ms 22.490 ms 17.161 ms
6 2001:1998:0:8::14 (2001:1998:0:8::14) 28.598 ms 23.052 ms 16.369 ms
7 * * *
8 g2600-1408-9000-0000-0000-0000-172d-b58c.deploy.static.akamaitechnologies.com (2600:1408:9000::172d:b58c) 19.386 ms 24.622 ms 21.316 ms
I also pinged that same IP for several minutes and got no packet loss, though I’m not sure if this is an accurate test because this IP isn’t even in the same prefix.
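Whether two addresses share a prefix can be checked with Python’s standard `ipaddress` module. This is a minimal sketch; `same_prefix` is a hypothetical helper, and the /32 prefix length is an assumption (a common allocation size), not something confirmed for either network:

```python
import ipaddress

def same_prefix(addr_a, addr_b, prefix_len=32):
    """True if both addresses fall inside the same prefix of the given length."""
    # strict=False lets us build the network from a host address.
    net = ipaddress.ip_network(f"{addr_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(addr_b) in net

pinged = "2600:1408:9000:1ba::ce0"    # letsencrypt.org, from the traceroute above
validator = "2600:3000:2710:300::1d"  # the address that appears to time out
print(same_prefix(pinged, validator))
```

With the assumed /32, the two addresses differ in their second group (1408 versus 3000), so they land in different prefixes and a clean ping to one says little about the path to the other.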
I decided to do a traceroute6 on 2600:3000:2710:300::1d too. It obviously didn’t get all the way, since 2600:3000:2710:300::1d doesn’t respond to pings, but here’s what I got:
michael@michaelmarley:~$ traceroute6 2600:3000:2710:300::1d
traceroute to 2600:3000:2710:300::1d (2600:3000:2710:300::1d) from 2606:a000:4447:9802:baae:edff:fe73:314a, port 33434, from port 38975, 30 hops max, 60 bytes packets
1 cpe-2606-A000-4447-9802-0-0-0-1.dyn6.twc.com (2606:a000:4447:9802::1) 0.280 ms 0.254 ms 0.315 ms
2 * * *
3 cpe-2606-A000-0-4-0-0-8-356.dyn6.twc.com (2606:a000:0:4::8:356) 20.805 ms 17.070 ms 13.203 ms
4 cpe-2606-A000-0-4-0-0-2-58.dyn6.twc.com (2606:a000:0:4::2:58) 15.015 ms 12.721 ms 13.404 ms
5 cpe-2606-A000-0-4-0-0-0-52.dyn6.twc.com (2606:a000:0:4::52) 23.928 ms 22.854 ms 26.523 ms
6 2001:1998:0:8::16 (2001:1998:0:8::16) 26.938 ms 30.790 ms 22.897 ms
7 2001:1998::66:109:6:171 (2001:1998::66:109:6:171) 24.319 ms 18.901 ms 21.583 ms
8 * * 2001:1998:0:8::1ba (2001:1998:0:8::1ba) 4990.888 ms
9 lo-0-v6.ear2.Denver1.Level3.net (2001:1900::3:19e) 55.269 ms 53.537 ms 55.176 ms
10 VIAWEST-INT.edge3.Denver1.Level3.net (2001:1900:2100::373a) 53.161 ms 55.108 ms 58.079 ms
11 2600:3000:2:330::1 (2600:3000:2:330::1) 55.544 ms 57.736 ms 49.415 ms
12 2600:3000:0:2::85 (2600:3000:0:2::85) 58.002 ms 55.659 ms 51.295 ms
13 * * *
14 2600:3000:3:38::2 (2600:3000:3:38::2) 73.983 ms 76.484 ms 79.562 ms
15 * * *
16 2600:3000:2700:1073::4 (2600:3000:2700:1073::4) 77.562 ms 77.829 ms 75.403 ms
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
There also doesn’t seem to be any packet loss while pinging 2600:3000:2700:1073::4.
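Picking the last hop that answered out of a long trace like the one above can be scripted. This is a minimal sketch, assuming traceroute6 output in the layout shown in this thread; `last_responding_hop` is a hypothetical helper, and the sample is hops 14 to 18 copied from the second trace:

```python
import re

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")        # "<hop number> <rest of line>"
ADDR_RE = re.compile(r"\(([0-9a-fA-F:]+)\)")     # IPv6 address in parentheses

def last_responding_hop(trace_lines):
    """Return (hop_number, address) of the last hop that answered, or None."""
    last = None
    for line in trace_lines:
        m = HOP_RE.match(line)
        if not m:
            continue  # header or blank line
        addr = ADDR_RE.search(m.group(2))
        if addr:  # lines made up only of "* * *" have no address
            last = (int(m.group(1)), addr.group(1))
    return last

TRACE = """\
14 2600:3000:3:38::2 (2600:3000:3:38::2) 73.983 ms 76.484 ms 79.562 ms
15 * * *
16 2600:3000:2700:1073::4 (2600:3000:2700:1073::4) 77.562 ms 77.829 ms 75.403 ms
17 * * *
18 * * *
""".splitlines()

print(last_responding_hop(TRACE))
```

The last answering hop only marks where probe replies stop coming back. It does not prove that ordinary traffic is dropped at that point, since routers and hosts often filter traceroute probes while still forwarding everything else.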