Challenge files available to curl; server reports Timeout during connect

Getting a timeout on validation.

During the pause for --debug-challenges, the challenge files are accessible via curl from three different hosts outside the web server, yet certbot-auto reports timeout errors from the server. My nginx access logs show success responses (200) to the queries from the Let’s Encrypt validation servers.

I tried this both with the nginx plugin, which successfully rewrites the config file to answer the challenge directly, and with webroot, which writes the actual files to disk.
TIA.

My domain is: ussheepdawg.org

I ran this command as root:
/root/certbot-auto --webroot --installer nginx --test-cert --debug-challenges
and
/root/certbot-auto --nginx --test-cert --debug-challenges

The first of those produced this output:
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer nginx

Which names would you like to activate HTTPS for?


1: ussheepdawg.org


Select the appropriate numbers separated by commas and/or spaces, or leave input
blank to select all options shown (Enter ‘c’ to cancel):
Obtaining a new certificate
Performing the following challenges:
http-01 challenge for ussheepdawg.org
Input the webroot for ussheepdawg.org: (Enter ‘c’ to cancel): {actual path elided}
Waiting for verification…


Challenges loaded. Press continue to submit to CA. Pass “-v” for more info about
challenges.


Press Enter to Continue
Cleaning up challenges
Failed authorization procedure. ussheepdawg.org (http-01): urn:ietf:params:acme:error:connection :: The server could not connect to the client to verify the domain :: Fetching http://ussheepdawg.org/.well-known/acme-challenge/8RA7jGYfc3GWa9_RcNN2tcvLX7KllaJCJqeglvgDawc: Timeout during connect (likely firewall problem)

IMPORTANT NOTES:

  • The following errors were reported by the server:

    Domain: ussheepdawg.org
    Type: connection
    Detail: Fetching
    http://ussheepdawg.org/.well-known/acme-challenge/8RA7jGYfc3GWa9_RcNN2tcvLX7KllaJCJqeglvgDawc:
    Timeout during connect (likely firewall problem)

    To fix these errors, please make sure that your domain name was
    entered correctly and the DNS A/AAAA record(s) for that domain
    contain(s) the right IP address. Additionally, please check that
    your computer has a publicly routable IP address and that no
    firewalls are preventing the server from communicating with the
    client. If you’re using the webroot plugin, you should also verify
    that you are serving files from the webroot path you provided.

My web server is (include version):
nginx/1.10.1

The operating system my web server runs on is (include version):
Fedora 24

My hosting provider, if applicable, is:
linode

I can login to a root shell on my machine (yes or no, or I don’t know):
yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel):
no

Python 2.7.13
certbot 0.26.1

Hi @pico

It looks like Let’s Encrypt has trouble loading your file:

Fetching http://ussheepdawg.org/.well-known/acme-challenge/i6mIxGhXaybmeaT335W5Lpx9g_40CwZXHEAoPfxUytw: Timeout during connect (likely firewall problem)

Is there a firewall, or do you block IP ranges?

Calling http://ussheepdawg.org/.well-known/acme-challenge/8RA7jGYfc3GWa9_RcNN2tcvLX7KllaJCJqeglvgDawc I get an HTTP status 404 (which is OK), but calling http://ussheepdawg.org/ returns an HTTP status 500.

There is a firewall on the box. Ports 80 and 443 are open to all. The home page is broken at the moment because a previous cert expired. I guess I can fix that and see if all is well, but I don’t expect that to help.

But: note that paused in the middle of the process with debug-challenges, I could curl those challenge files just fine from multiple places.
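For anyone repeating this check from an outside host, here is a minimal sketch. The function names `challenge_url` and `check_challenge` are hypothetical helpers, and the 10-second connect timeout is an arbitrary assumption; the token has to be the one certbot-auto shows while paused at --debug-challenges.

```shell
#!/bin/sh
# Hypothetical helpers, not part of certbot.

# Build the http-01 challenge URL for a domain ($1) and token ($2).
challenge_url() {
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# Fetch the challenge URL and print only the HTTP status code.
# curl prints 000 when no HTTP response was received (for example,
# when the TCP connect times out). The 10 s timeout is an assumption.
check_challenge() {
    curl -sS -o /dev/null --connect-timeout 10 \
        -w '%{http_code}\n' "$(challenge_url "$1" "$2")"
}
```

Running `check_challenge ussheepdawg.org <token>` from several networks shows whether each one can reach port 80. A 200 from your own hosts does not prove that every validation server can connect.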

Edit: cleaned up the home page to properly reply and retried. Same results.

If you run with -v, does the IP address that the CA says it used for your site match the one that you’re expecting?

Is it possible that you have an ISP or hosting provider that blocks some IP ranges, even if you don’t intend to do so yourself on the server?

Yes, the IP address matches.
My hosting provider is Linode, and I use certbot-auto on other VPSs at Linode without trouble.

  "validationRecord": [
    {
      "url": "http://ussheepdawg.org/.well-known/acme-challenge/c52MHP9k0x67uBQBLFuxVWiqBf8jPS1nTaJ0-wO2LHw",
      "hostname": "ussheepdawg.org",
      "port": "80",
      "addressesResolved": [
        "66.228.40.7"
      ],
      "addressUsed": "66.228.40.7"
    }
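A small sketch for pulling that address out of the verbose output so it can be compared with DNS. It assumes the validationRecord JSON is formatted as in the excerpt above; `address_used` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read certbot output or a debug log on stdin and
# print every "addressUsed" value found in it, one per line.
# Assumes the pretty-printed JSON layout shown in the excerpt above.
address_used() {
    sed -n 's/.*"addressUsed": *"\([^"]*\)".*/\1/p'
}
```

For example, `address_used < /var/log/letsencrypt/letsencrypt.log` can be compared by eye with the output of `dig +short A ussheepdawg.org`, assuming the debug log contains the validation record.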

Here’s my nginx log showing successful queries from Let’s Encrypt:

13.58.30.69 - - [04/Aug/2018:20:38:56 +0000] "GET /.well-known/acme-challenge/8RA7jGYfc3GWa9_RcNN2tcvLX7KllaJCJqeglvgDawc HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
52.29.173.72 - - [04/Aug/2018:20:38:56 +0000] "GET /.well-known/acme-challenge/8RA7jGYfc3GWa9_RcNN2tcvLX7KllaJCJqeglvgDawc HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
34.213.106.112 - - [04/Aug/2018:20:38:56 +0000] "GET /.well-known/acme-challenge/8RA7jGYfc3GWa9_RcNN2tcvLX7KllaJCJqeglvgDawc HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"

I could leave the site paused at the --debug-challenges prompt and post the URL if that helps.

Let’s Encrypt’s staging environment currently makes multiple HTTP requests from multiple ISPs. (Currently from different data centers at 2 ISPs, but that is not guaranteed, and it’s sure to change in the future.) In your HTTP log, there isn’t a request from the most important one, so that request does seem to have failed for some reason.

IPv4 routing issues between Let’s Encrypt and Linode in Newark would be surprising, but anything’s possible on the Internet…

fail2ban, perhaps?

I’m not running fail2ban, although Linode thinks it’s a good idea. Could you tell me that most important IP? I’ll search my log.

It may not matter. I searched for 8RA7jGYfc3GW and only have the three above. All the 404’s are dated more than an hour later and appear to be generated by this thread.
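The search can be sketched roughly like this. `challenge_hits` is a hypothetical helper, and it assumes the client address is the first field of each log line, as in the entries above.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct client IPs that requested a
# given challenge token (or token prefix).
#   $1 = token or prefix, e.g. 8RA7jGYfc3GW
#   $2 = path to the nginx access log (location varies by setup)
challenge_hits() {
    # -F treats the pattern as a fixed string, not a regex.
    # Field 1 is assumed to be the client address.
    grep -F "/.well-known/acme-challenge/$1" "$2" | awk '{print $1}' | sort -u
}
```

Each address printed is one source that reached the server. A validation source missing from the list either never connected or was logged elsewhere.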

So if I’m following along correctly here, it looks like this could be a bug on the Let’s Encrypt side (I say that with trepidation because my bugs are always caused by me.)

What’s encouraging is that this is very repeatable.

So is there some place I should be filing a ticket? I’m willing to be a help if I can.

If there is a routing issue, it could be with Let’s Encrypt’s ISP, or Linode, or any other ISP in between. We don’t yet know what’s wrong, though.

Looks really like a routing problem.

This tool works completely with https://letsencrypt.org/, but with http://ussheepdawg.org the Tampa and Toronto checkpoints have problems. I checked other sites (my own, a German newspaper) and saw no problem.

Tested with Firefox and Chrome, by domain and by IP, with the same effect: Tampa and Toronto report "TCP Connection failed".

I can’t reproduce any universal issue with Linode Newark:

https://letsdebug.net/speedtest.newark.linode.com/3343 (dual-stack)
https://letsdebug.net/speedtest.newark.letsencrypt.mattnordhoff.net/3344 (IPv4-only)

(The latter hostname no longer exists.)

That doesn’t rule out a more specific routing issue, but…

I also just did the uptrends test against another linode I have in the Newark data center. No trouble with that one.

Should I just spin up a new linode and move on?

So I opened a ticket with linode to ask about routing. They are unaware of any issues at this time in Newark, but naturally asked for the results of running mtr from the affected hosts:

mtr -rw ussheepdawg.org

Any chance someone could run this from "the most important ip" mentioned upthread?

Final post here to provide some closure in case others come here looking for answers.

Over the weekend I configured a new Linode with Fedora 28 (upgraded from 24) and swapped IP addresses so that it was using the problematic one from this thread. certbot-auto had no trouble getting a certificate. (The uptrends link is also showing all green now.)

I didn’t think of trying this before the upgrade. My apologies to the rare individual who is disappointed.

So for completeness, I went back to the Fedora 24 box, now on the new IP address, and gave it a temporary subdomain. certbot-auto had no trouble obtaining a cert there as well.

So a diagnosis of some sort of routing problem seems to have been correct. My thanks to whoever fixed it.

