Yet another "Timeout" while verifying via HTTP

Are there any other inline devices that might be able to block IPs?
[IPS, DDoS, GEO, etc.]

But you already knew that!

Yup. :slight_smile:
The existing cert has expired, and my auto renew was failing, which is why I started looking into it. You'll get the same cert failure for ngs.tsqmadness.com as well.

Essentially, I need to know what it is checking, and from where. For example: while I am not filtering any IPs (and was told my data center isn't either), it is possible that if these LE servers are hosted in the Caribbean (or some other random location), there is an unknown filter on them. Or they may be pinging the root page and looking for a 200 (which would explain the issue with the home page that @rg305 mentioned, which I have cleared up). Or they may be looking for an existing certificate already on the site (which I have NOT tried to ditch yet, as I wouldn't think the failure of an existing cert would cause a failure to renew).

The root page for tsqmadness.com was corrected, and ngs.tsqmadness.com was already returning 200. I will try to remove the certificate, but then that would mean no https at all.

Thanks! :smiley:

No, but they can be redirected.

Never.

I own my own server in the data-center. And they claim that they aren't blocking ANY IPs.
(My firewall entry for IIS is also not blocking IPs.)

It'd be nice to know potential IPs that these requests could be coming from, in case they are unknowingly blocked somewhere.

That same request has been brought up dozens of times.
They can't be listed for that specific reason.
And they can and should be expected to change without notice.

Okay, then thanks. No, their IPs are not blocked.

Do you see requests in the IIS logs to the challenge folder?
[while you are running le64]

To be crystal clear:
You welcomed any pointers and I gave you two (free of charge and almost immediately).
[Two pointers which I would (and do) implement on my own systems - I also run LE64 on Windows]

But since Let's DEBUG continues to see problems (even after sites return 200), I can only assume that something/somewhere is blocking those particular HTTP requests - even though plenty of other ones are being allowed.

Good luck and cheers from Miami :beers:

No, whether or not your home page works isn't relevant. Let's Encrypt is only checking the file in the .well-known directory.

Basically, that message is just that Let's Encrypt can't get to your IP, and there's not a lot more to go on than that.
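
As a rough way to see what the validator asks for, here is a minimal Python sketch that builds the HTTP-01 challenge URL and fetches it once, separating a timeout from an HTTP error. The helper names `challenge_url` and `probe` and the 10-second timeout are my own assumptions, not anything Let's Encrypt or le64 uses internally, and the token has to come from a real order created by your ACME client.

```python
import socket
import urllib.error
import urllib.request


def challenge_url(domain: str, token: str) -> str:
    """Build the HTTP-01 URL that a validation server requests on port 80."""
    return f"http://{domain}/.well-known/acme-challenge/{token}"


def probe(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> str:
    """Fetch the URL once and classify the outcome as a short string.

    `opener` is injectable only so the function can be exercised without
    network access; by default it is urllib.request.urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404.
        return f"HTTP {exc.code}"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except urllib.error.URLError as exc:
        # urllib wraps connect-phase failures (including timeouts) in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return f"connection error: {exc.reason}"
```

Keep in mind that a 200 from your own machine only shows the file is reachable from where you ran the check; it says nothing about the route from Let's Encrypt's servers to you.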

The only thing I noticed, though I don't know if it's actually going to send you on a wild goose chase, is that the IP resolves to 66.151.242.26 for me, and that IP seems to be from a block that has bad IRR information.

https://bgp.he.net/net/66.151.242.0/24

The announcement of the 66.151.242.0/24 block is marked red with a message of "IRR Invalid - Origin Mismatch". I know only a minimal amount about BGP and how core Internet routing works, but I think that means that the IP block is being announced "wrong" or at least by an entity that hasn't properly proved that it owns the block. So my thinking is that routing to that IP from Let's Encrypt's servers might not be working right.

But if that is the issue (and again, I'm not sure that it's really related to the underlying problem you have), then it's pretty much an issue that only your ISP has the power to fix.

Ooh, I will CERTAINLY look into that. ESPECIALLY after discovering this. After adjusting logging settings, on my last attempt (via le64, not letsdebug), I see THIS:

2021-06-11 20:31:08 srvr 66.151.242.26 GET /.well-known/acme-challenge/fuzQt3sxAaeoxiTF-MRmV25lZCQq6Vau74ScMJyGDb4 - 80 - 3.143.223.150 HTTP/1.1 Mozilla/5.0+(compatible;+Let's+Encrypt+validation+server;++https://www.letsencrypt.org) - 200 0 0 527 270 22
2021-06-11 20:31:08 srvr 66.151.242.26 GET /.well-known/acme-challenge/fuzQt3sxAaeoxiTF-MRmV25lZCQq6Vau74ScMJyGDb4 - 80 - 3.67.34.92 HTTP/1.1 Mozilla/5.0+(compatible;+Let's+Encrypt+validation+server;++https://www.letsencrypt.org) - 200 0 0 527 270 102
2021-06-11 20:31:18 srvr 66.151.242.26 GET /.well-known/acme-challenge/fuzQt3sxAaeoxiTF-MRmV25lZCQq6Vau74ScMJyGDb4 - 80 - 18.236.228.243 HTTP/1.1 Mozilla/5.0+(compatible;+Let's+Encrypt+validation+server;++https://www.letsencrypt.org) - 200 0 0 527 270 51

Requests came in, and IIS returned a 200, the file was served properly. However, LE still reported:

2021/06/11 16:31:19 Domain verification results for 'www.tsqmadness.com': error. Fetching http://www.tsqmadness.com/.well-known/acme-challenge/fuzQt3sxAaeoxiTF-MRmV25lZCQq6Vau74ScMJyGDb4: Timeout during connect (likely firewall problem)

So... I'm at a loss at the moment. :frowning: In the past, I was using DNS validation, which worked fine, but my DNS server apparently has no API to create/remove TXT entries, which is why I started messin' with this. :slight_smile:

Looking at the error message (specifically, it's missing the term "secondary" somewhere), it looks like those three requests in your logs are from the secondary validation vantage points. You can find out more about that here:

It seems the primary datacenter can't connect, while the secondary ones can.

Interesting. This is good to know. I cleared logs, and ran the le64 tool. These are the entries in there. Two keys, one for "tsqmadness.com" and the other for "www.tsqmadness.com":

#Software: Microsoft Internet Information Services 10.0
#Version: 1.0
#Date: 2021-06-11 20:54:42
#Fields: date time s-computername s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs-version cs(User-Agent) cs(Referer) sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken
2021-06-11 20:54:42 srvr 66.151.242.26 HEAD / - 443 - 66.151.242.26 HTTP/1.1 Mozilla/5.0+(compatible;+Crypt::LE+v0.37+agent;+https://Do-Know.com/) - 200 0 0 277 128 130
2021-06-11 20:54:45 srvr 66.151.242.26 GET /.well-known/acme-challenge/hIkM_KzvYHevEEHL-WBYSXozSvBxpDgWkIbrDtHtbz8 - 80 - 3.143.223.150 HTTP/1.1 Mozilla/5.0+(compatible;+Let's+Encrypt+validation+server;++https://www.letsencrypt.org) - 200 0 0 529 270 23
2021-06-11 20:54:45 srvr 66.151.242.26 GET /.well-known/acme-challenge/hIkM_KzvYHevEEHL-WBYSXozSvBxpDgWkIbrDtHtbz8 - 80 - 18.236.228.243 HTTP/1.1 Mozilla/5.0+(compatible;+Let's+Encrypt+validation+server;++https://www.letsencrypt.org) - 200 0 0 529 270 51
2021-06-11 20:54:45 srvr 66.151.242.26 GET /.well-known/acme-challenge/hIkM_KzvYHevEEHL-WBYSXozSvBxpDgWkIbrDtHtbz8 - 80 - 18.196.102.134 HTTP/1.1 Mozilla/5.0+(compatible;+Let's+Encrypt+validation+server;++https://www.letsencrypt.org) - 200 0 0 529 270 106
2021-06-11 20:54:56 srvr 66.151.242.26 GET /.well-known/acme-challenge/L-RRy2gvgIqGpxTWWnQi3XexmMDr14rH6BhokdIYEBI - 80 - 18.222.145.89 HTTP/1.1 Mozilla/5.0+(compatible;+Let's+Encrypt+validation+server;++https://www.letsencrypt.org) - 200 0 0 525 266 22
2021-06-11 20:54:56 srvr 66.151.242.26 GET /.well-known/acme-challenge/L-RRy2gvgIqGpxTWWnQi3XexmMDr14rH6BhokdIYEBI - 80 - 34.219.64.153 HTTP/1.1 Mozilla/5.0+(compatible;+Let's+Encrypt+validation+server;++https://www.letsencrypt.org) - 200 0 0 525 266 51
2021-06-11 20:54:56 srvr 66.151.242.26 GET /.well-known/acme-challenge/L-RRy2gvgIqGpxTWWnQi3XexmMDr14rH6BhokdIYEBI - 80 - 3.67.34.92 HTTP/1.1 Mozilla/5.0+(compatible;+Let's+Encrypt+validation+server;++https://www.letsencrypt.org) - 200 0 0 525 266 106

So I am seeing three hits on both files. Reading the API Announcement, I should be seeing 4: 1 primary and 3 secondary. The initial announcement back in 2017 that that link points to DOES MENTION BGP. So, that may be part of the issue then. :frowning:
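
For anyone wanting to do the same count without eyeballing the log, here is a minimal Python sketch that groups challenge requests by token and collects the distinct client IPs for each. It assumes the IIS W3C format shown above, with a `#Fields:` header that includes `cs-uri-stem` and `c-ip`; the function name `validation_hits` is just something I made up for illustration.

```python
from collections import defaultdict

CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def validation_hits(log_lines):
    """Map each challenge token to the set of client IPs that requested it."""
    fields = []
    hits = defaultdict(set)
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns for the rows that follow it.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        # IIS encodes spaces inside values as '+', so whitespace splitting is safe.
        row = dict(zip(fields, line.split()))
        stem = row.get("cs-uri-stem", "")
        if stem.startswith(CHALLENGE_PREFIX):
            token = stem[len(CHALLENGE_PREFIX):]
            hits[token].add(row.get("c-ip", "?"))
    return hits
```

Passing an open log file to `validation_hits` and printing `len(ips)` per token gives the number of distinct validators that actually reached the server for each challenge. It cannot tell you which of them was the primary; it only shows who got through.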

Hrm. Is there a way to get some additional debug info back from the le64.exe (or the LE servers)?

EDIT:
Well, f*ck me.
I put the -live key back into my command-line. Figuring to check that everything was working before generating actual certificates. Running the command again, I got:

2021/06/11 17:16:22 Domain verification results for 'www.tsqmadness.com': success.
2021/06/11 17:16:22 Challenge file '/.well-known/acme-challenge//Brzx4B8daZKWeEZLFzDo-21g4d9FktjMIVmr_QrKjwE' has been deleted.
2021/06/11 17:16:24 Domain verification results for 'tsqmadness.com': success.
2021/06/11 17:16:24 Challenge file '/.well-known/acme-challenge//AGH3c5xiAcVZ3N4tkv4cA6RLWv7WKD32E39dVOzMU0c' has been deleted.

Despite this, letsdebug still shows an error, so either something is wrong with the sandbox setup, or the le64 executable is using incorrect sandbox servers.

Edit 2:
Interesting, even using the -live servers, ngs.tsqmadness.com is still failing. I'm at a loss. :frowning:

An hour later, I changed nothing and reran the same exact command that I ran on the ngs.tsqmadness.com and it came back fine.

My recommendation would be to provide more debugging info when something like this fails. Specifically, when one server can't reach the site but others can, a more specific error than 'timeout' would be helpful. At the start, I saw no hits, yet folks around the world could pull the file up without issue. Once I was eventually able to pull the logging, I saw successful 200 responses for the file.

Now I'm curious what will happen in three months, when I go to renew.

I don't think anybody really has more debugging information they could give you. All Let's Encrypt's servers know is that they sent packets addressed to your server and didn't get a response. If the primary datacenter can connect but the secondary locations can't, the error message specifically calls out that "secondary validation" failed.

It gets tricky to publish which IPs can and can't connect to your systems, as that information would also be valuable to somebody trying to manipulate which routes Let's Encrypt's networks use. They need to try to ensure that the requester of the certificate controls the domain as seen from everywhere on the Internet.

Nah, I understand about the IPs - security is important.

More info like, "primary server 1 - failed (timeout)" or "secondary server 3 - HTTP Fail (404)". In cases like this, it would have helped me determine that SOME servers were hitting it and some weren't. Granted, I still wouldn't have been able to do anything about it, but it would have helped me pinpoint the issue. (It also would have let me post here with a message saying, 'hey, 1 of the 4 servers can't connect; what does that mean?' instead of saying that the 'firewall issue' is just flat-out wrong.) :smiley:

Interestingly, I just performed an experiment:
I went to create a new certificate. This time, I added in mail.tsqmadness.com. (Port 80 points to the same exact server/instance/AppPool that tsqmadness.com does.) The first time, it failed with the same error. I waited about thirty minutes and tried again. Same error. Waited another hour, and it then passed.

Wondering if it's a delayed DNS lookup or something in one of LE's primary servers/services. Give it enough time, and it works fine.

I do wonder if there's some kind of weird routing issue between Let's Encrypt's main datacenter(s) and yours, where just sometimes it works and sometimes it doesn't (maybe based on which server or router or something got your request).

I think it's really rare for the main request to fail while the secondary succeed; for many people it's the other way around (because they block traffic from AWS or the like, where the secondary datacenters are hosted). I could see giving a message that the secondary succeeded while the primary failed might be helpful. The software that Let's Encrypt uses to issue certificates is open-source and available at https://github.com/letsencrypt/boulder; you could raise an issue for it there if you wanted, or maybe post a thread here in the Feature Requests section about that. I'm not sure if there's some security or operational reason they'd want to not say anything more if the primary validation is what's failing, though.

Maybe an attempt to MitM :scream:

Thanks for that link - wasn't aware it was open-source.

I wouldn't mind working with them to figure out what's going on - as a developer myself, I made sure to try to check everything I could before posting here originally. :slight_smile:

I'll edit to say this: In the past, I've used the DNS method of verification, with no issues. Unfortunately, I am stuck with my current DNS/Registrar for the next 3 years, and they have no API for creating TXT records. I really do not want to have to manually update my DNS records every 3 months for the next 3 years, until I can move to a registrar that has an API (and is effin' cheaper). :smiley:

Why not keep your registrar and just move your DNS to another provider? It's a common mistake to assume they have to be the same.
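
If you do move DNS to a provider with an API, the record the client has to create for DNS-01 is small. Here is a minimal sketch in Python, following RFC 8555 section 8.4: the TXT value is the unpadded base64url SHA-256 digest of the key authorization, placed at `_acme-challenge.<domain>`. The function name `dns01_record` is hypothetical; a real key authorization comes from your ACME client, and how the record gets published depends entirely on the provider's API.

```python
import base64
import hashlib


def dns01_record(domain, key_authorization):
    """Return the (record name, TXT value) pair for a DNS-01 challenge."""
    # Wildcard names are validated at the base domain.
    if domain.startswith("*."):
        domain = domain[2:]
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # base64url without the trailing '=' padding, as the ACME spec requires.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value
```

Clients that support DNS-01 normally compute this for you; the sketch is only meant to show that the API on the DNS side needs nothing more than "create TXT record" and "delete TXT record".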
