The other thing that might be happening: I think WordPress might be redirecting to HTTPS and confusing the ACME challenge requests. I'm looking for a way to either modify .htaccess to redirect ACME queries to a static page, or do it at the vhost level. I saw a post about it last night (which didn't work for me), but I can't find it now.
This is done now. The responses are definitely quicker now that they no longer go through WordPress. I see all three LE servers making the request and getting a 200 response of 292 bytes (should this be a 404 response?).
Interestingly, the LetsDebug test generates a 404 response with 360 bytes.
Well, 3 requests from the Let's Encrypt servers arrived and got a 404 reply. On a successful challenge (today) there will be 4 requests with a 200 response. Let's Encrypt also caches successful challenges so sometimes there may be fewer than that.
Note the 4th request you show was not from Let's Encrypt servers. But, to answer the question ... there is no way to tell the Let's Encrypt servers to lengthen the timeout.
And, why would 404 be expected there?
====
Lastly, I still see timeouts even just to your homepage from random locations.
I don't know letsencrypt-auto (too old for me) but the current certbot apache plug-in for authentication sets up its own temporary challenge folder. That is visible in the detailed log (again, at least in modern versions).
Was the request that shows three 200 responses in your log successful? I don't see a fresh cert in crt.sh, so was that a Let's Debug (staging) test?
And, yes, the response matters. A successful test will be a 200 with the proper data.
One example of random timeouts for a test challenge is here. The "request uri does not exist" is a successful connect with expected 404 response. Note 3 of 5 timed out.
I get random timeouts to your home page on geopeeker and site24x7.
I agree with @rg305's earlier suggestion to try the DNS challenge instead. I see that many people may not be able to reach your home page. But, if that doesn't bother you, then the DNS challenge should give you a cert to carry forward. Best to upgrade to modern certbot or consider a different client like acme.sh, which supports many DNS systems. Do you have a Google Cloud account or just Google Domains (it matters for automation)?
I don't believe there is an API for Google Domains (just Google Cloud DNS), so you would need to switch your DNS to something else. acme.sh has the widest support for DNS APIs, so pick one from their list. Cloudflare is a popular choice.
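If you do go the acme.sh plus Cloudflare route, the issue step would look roughly like the sketch below. The domain, token and account ID are placeholders I made up; `dns_cf` is the name of acme.sh's Cloudflare hook, and as far as I know it reads `CF_Token` and `CF_Account_ID` from the environment. Check the acme.sh DNS API docs for your provider before relying on this.

```shell
# Sketch only: the domain and credentials below are placeholders.
# ACME_SH can be overridden if acme.sh is not on your PATH.
ACME_SH="${ACME_SH:-acme.sh}"

# Credentials for the Cloudflare hook (assumed variable names, see lead-in):
# export CF_Token="your-api-token"
# export CF_Account_ID="your-account-id"

# Issue a certificate using the DNS-01 challenge via the Cloudflare hook.
# $1 = domain name
issue_with_cloudflare_dns() {
    "$ACME_SH" --issue --dns dns_cf -d "$1"
}

# Usage:
#   issue_with_cloudflare_dns example.com
```

With a DNS challenge nothing has to reach your web server, so the intermittent HTTP timeouts would not affect issuance.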
Apart from that you are stuck with HTTP challenge. I don't have any further ideas how to resolve the intermittent timeouts. You may need to discuss with your ISP or a network connectivity expert.
You could also try using a different Certificate Authority. Maybe a different one would not be affected by whatever is going wrong with your connections. Although, I still think visitors to your site are being affected.
Asked and answered previously re: apache plug-in and temp location.
Timeouts mean LE can't reach your server or your server "hangs" responding. The form of redirect is not involved.
To avoid the apache plug-in and keep your challenge tests simpler try this. Mind, this is for modern certbot, not sure what your letsencrypt-auto offers.
The --dry-run option uses the staging system and won't issue a cert, but it always performs the HTTP challenges. Using certonly with webroot avoids the apache plug-in and uses the folder given with -w. Unlike the apache plug-in, webroot does not configure your server, but this is just for connectivity testing in your case.
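For reference, a minimal sketch of the kind of command described here, wrapped in a small function so the values are easy to swap. The domain and webroot path are placeholders (my assumptions, not values from this thread); `certonly`, `--webroot`, `-w`, `-d` and `--dry-run` are standard options in modern certbot.

```shell
# Sketch only: example.com and /var/www/html are placeholder values.
# CERTBOT can be overridden if certbot lives somewhere else.
CERTBOT="${CERTBOT:-certbot}"

# Staging dry run using the webroot authenticator instead of the apache plug-in.
# $1 = folder Apache serves for the domain, $2 = domain name
dry_run_webroot() {
    "$CERTBOT" certonly --webroot -w "$1" -d "$2" --dry-run
}

# Usage (run as root or with sudo, as certbot normally requires):
#   dry_run_webroot /var/www/html example.com
```

The -w folder must be the one Apache actually serves for that domain, otherwise the challenge file will not be found even when connectivity is fine.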
Someone above suggested an updated client. ...but actually, thinking about it... I'm going to wait until after hours tonight and reboot the router. I'm wondering if it's caching some firewall rules that are blocking certain non-US connection requests. Flushing the table might not be working 100%.
If that doesn't work - is it possible to manually renew a cert (maybe via manually creating the DNS TXT record)? That way at least I can defer this issue until a few months from now.
You are now having intermittent connectivity problems from various locations and URLs.
Your recent change to "new subnet" is where I would focus efforts. If there was any new equipment, a different ISP service, ... every component that was involved.
There are only 3 responses. Did the test succeed? Because otherwise the 4th request never got to Apache.
Discourse is the platform this forum runs on. Did you start to paste that URL in a post here 70 seconds after the test started? Because Discourse would (likely) do a safety check of any URL. It could not otherwise know to check that newly generated random URL.
I don't have access to the LE servers nor do any other volunteers. I can see that URL from my test server but site24x7 only sees it from its North American locations. geopeeker did see it from each of its locations. The inconsistent responses still point to a generalized problem and not an LE specific one. Frustrating as that may be.
The Let's Debug site can't. One of its tests uses the LE staging system to try a test.
Of course it will fail with a 404 as the special tokens don't exist on your server. That's considered successful for the Let's Debug site test. The test is ensuring connectivity between LE and your server among other things (that your server behaves sensibly for example).
Conceptually, the ACME client creates the randomized local file and informs the LE Server to look for that specific one. Finding that file and the correct contents is how the LE Server knows you have control of that server (domain). That is, that you have control over its contents. The ACME client (should) delete the file on completion.
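To make that concrete, here is a rough sketch of the file handling an ACME client does for the HTTP challenge. The function names and all values are made up for illustration. The `/.well-known/acme-challenge/` path is the standard location, and a real client computes the file contents itself (the token plus a thumbprint of your account key), so this is only useful for manual connectivity tests.

```shell
# Sketch of the HTTP-01 file handling an ACME client does for you.
# All names and values here are illustrative, not from a real challenge.

# Write the challenge response file under the webroot.
# $1 = webroot, $2 = token (used as the file name), $3 = expected contents
place_challenge() {
    mkdir -p "$1/.well-known/acme-challenge" &&
    printf '%s' "$3" > "$1/.well-known/acme-challenge/$2"
}

# Print the URL the validation servers would request over plain HTTP (port 80).
# $1 = domain, $2 = token
challenge_url() {
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# Remove the file once validation is finished.
# $1 = webroot, $2 = token
remove_challenge() {
    rm -f "$1/.well-known/acme-challenge/$2"
}

# Usage with made-up values:
#   place_challenge /var/www/html test-token "test-token.dummy-thumbprint"
#   curl -sS -m 10 "$(challenge_url example.com test-token)"
#   remove_challenge /var/www/html test-token
```

Fetching that URL from several outside locations is a quick way to see whether the timeouts happen before Apache is ever involved.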
@hs8sj3kl2nds8sn Oh, my guess is that it's not a geo block, although it might be.
The 3 good IPs were two US locations and one in Germany. The two US ones that worked are AWS hosted. The one US location that is missing is on a different hosting / network provider. Mind you, LE may change these at any time (but that's how it was as of a few weeks ago, last I checked).
LE uses various service paths to better ensure no one is tricking them into issuing a cert. A site like https://ipstack.com/ is nice for seeing connection details for IPs. Many such sites exist.
You can't, it is a server farm. IPs change regularly. IIRC an LE staff person recently said IPs were changing every few hours. These are large server facilities. If you block their IPs you might block some bad actors, but you'd block the good ones along with them.
If your firewall is blocking it, there should be a log available in it. You know roughly when the challenges arrive and their URLs. And you may even have some "working" IPs from the Apache logs at that same time to help isolate the problem.
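To pull the "working" IPs out of the Apache side, something like the sketch below may help. It assumes the usual common/combined access log format (client IP in field 1, status code in field 9); the function name and log path are my own placeholders and will differ by setup.

```shell
# Summarize ACME challenge requests per client IP and HTTP status.
# Assumes Apache common/combined log format:
#   field 1 = client IP, field 9 = status code.
# $1 = path to the access log
acme_hits() {
    grep '/.well-known/acme-challenge/' "$1" |
        awk '{ print $1, $9 }' | sort | uniq -c
}

# Usage (log path is an assumption; it varies by distro and vhost config):
#   acme_hits /var/log/apache2/access.log
```

Each output line is a count followed by the IP and status. Any validation request that is missing here for a given test run never reached Apache, which is what you would then look for in the firewall log around the same timestamps.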
So I upgraded certbot and rebooted my firewall, and confirmed that there's now no IP or geo blocking going on at all. Same timeout error.
I recall that for the last couple of months the LE-auto cron has probably been trying to renew certs for domains that have long since been removed from my server (and hence failed).
Is it possible that one of the LE servers has me on a blocklist or something?