Certbot-auto renew failing to work after upgrading from 1.0.0 to 1.2.0

My domain is:
admin.freeflys.com

I ran this command:
apachectl stop
/usr/local/bin/certbot-auto certonly --standalone --renew-by-default --email [email address] --agree-tos --text --preferred-challenges http -d admin.freeflys.com

It produced this output:
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator standalone, Installer None
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for admin.freeflys.com
Waiting for verification…
Challenge failed for domain admin.freeflys.com
http-01 challenge for admin.freeflys.com
Cleaning up challenges
Some challenges have failed.

IMPORTANT NOTES:

  • The following errors were reported by the server:

    Domain: admin.freeflys.com
    Type: connection
    Detail: During secondary validation: Fetching
    http://admin.freeflys.com/.well-known/acme-challenge/ZCcKzrrA4ZCMTdU8p04tnFo_W9SFDKrMnEytlc5KmYA:
    Timeout during connect (likely firewall problem)

    To fix these errors, please make sure that your domain name was
    entered correctly and the DNS A/AAAA record(s) for that domain
    contain(s) the right IP address. Additionally, please check that
    your computer has a publicly routable IP address and that no
    firewalls are preventing the server from communicating with the
    client. If you’re using the webroot plugin, you should also verify
    that you are serving files from the webroot path you provided.

My web server is (include version):
Server version: Apache/2.2.15 (Unix)

The operating system my web server runs on is (include version):
CentOS 6
Linux mws1 2.6.32-754.24.3.el6

My hosting provider, if applicable, is:

I can login to a root shell on my machine (yes or no, or I don’t know):
Yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel):
No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you’re using Certbot):
certbot 1.2.0

I have run the same command for a couple of years now to renew, and this is the first time I have had an error. The last time it was run was with certbot 1.0.0. When I ran it this time, certbot upgraded from 1.0.0 to 1.2.0. Also, it looks like a bunch of python3.6 items were installed automatically during the upgrade.

Please advise. Our SSL cert is expiring shortly.

The thing that jumps out is that you are trying to use an HTTP challenge right after you stopped your HTTP server. This is not going to work; I’m not sure how it ever worked.

Secondarily, unless there is some extraordinary reason that you are doing a certonly run, I would recommend letting certbot renew and install the new certificate for apache automatically.

But that's what --standalone does—it creates its own HTTP listener to satisfy the challenges.
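
A side note on why Apache has to be stopped first: the standalone listener needs port 80 to itself. The sketch below is a minimal Python illustration, not Certbot's own code. `port_is_free` is a hypothetical helper name, it only checks IPv4, and binding to ports below 1024 normally requires root, so a non-root run reports port 80 as unavailable regardless.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now.

    Hypothetical helper for illustration only. SO_REUSEADDR is deliberately
    not set, so an existing listener (for example Apache) makes the bind fail.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError:
        # Address already in use, or permission denied for a low port.
        return False
    finally:
        sock.close()
```

Running `port_is_free(80)` as root after `apachectl stop` would be one way to confirm that nothing else is still holding the port before the standalone authenticator starts.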

@alextra, are you sure that you don't have a firewall somewhere that might block inbound HTTP requests from some parts of the Internet?

I do understand that you changed certbot from version 1.0.0 to 1.2.0 recently.
But I can’t see how that can cause this kind of problem…
Have you also changed anything in the network/firewall/IP arena since the last cert was issued/renewed?

edit: even the default (non-SNI) cert found at that IP seems to be having trouble renewing…
[admin.surveypanelgroup.com Valid until Sun, 08 Mar 2020 15:59:04 UTC (expires in 12 days, 13 hours)]

Also, there is a cert chain issue with your site.
[completely unrelated to this problem - but worth fixing]

@schoen @rg305 I do have iptables set up and a dedicated firewall, which I can review, though nothing has changed on there for at least the last 6 months or so. What’s strange is that I also get this exact same issue/error for p992.trancos.com, which is in a completely different hosting environment and location, with a different firewall, etc. However, it’s running the same CentOS 6 version.

In the same environment, behind the same dedicated firewall, and (similar) IPTABLES, my servers running CentOS 7 upgraded to 1.2.0 and worked as expected.

For what it’s worth, here’s the new message I haven’t seen before when upgrading on CentOS 6:
Upgrading certbot-auto 1.0.0 to 1.2.0…
Replacing certbot-auto…
Bootstrapping dependencies for Legacy RedHat-based OSes that will use Python3… (you can skip this with --no-bootstrap)
…and then we proceed to install a bunch of python dependencies.

And below is the error log:

2020-02-24 23:37:19,670:DEBUG:certbot._internal.main:certbot version: 1.2.0
2020-02-24 23:37:19,675:DEBUG:certbot._internal.main:Arguments: ['--standalone', '--renew-by-default', '--email', '[redacted]', '--agree-tos', '--text', '--preferred-challenges', 'http', '-d', 'admin.freeflys.com']
2020-02-24 23:37:19,676:DEBUG:certbot._internal.main:Discovered plugins: PluginsRegistry(PluginEntryPoint#apache,PluginEntryPoint#manual,PluginEntryPoint#nginx,PluginEntryPoint#null,PluginEntryPoint#standalone,PluginEntryPoint#webroot)
2020-02-24 23:37:19,709:DEBUG:certbot._internal.log:Root logging level set at 20
2020-02-24 23:37:19,710:INFO:certbot._internal.log:Saving debug log to /var/log/letsencrypt/letsencrypt.log
2020-02-24 23:37:19,711:DEBUG:certbot._internal.plugins.selection:Requested authenticator standalone and installer None
2020-02-24 23:37:19,721:DEBUG:certbot._internal.plugins.selection:Single candidate plugin: * standalone
Description: Spin up a temporary webserver
Interfaces: IAuthenticator, IPlugin
Entry point: standalone = certbot._internal.plugins.standalone:Authenticator
Initialized: <certbot._internal.plugins.standalone.Authenticator object at 0x7f3e495fd630>
Prep: True
2020-02-24 23:37:19,723:DEBUG:certbot._internal.plugins.selection:Selected authenticator <certbot._internal.plugins.standalone.Authenticator object at 0x7f3e495fd630> and installer None
2020-02-24 23:37:19,723:INFO:certbot._internal.plugins.selection:Plugins selected: Authenticator standalone, Installer None
2020-02-24 23:37:19,731:DEBUG:certbot._internal.main:Picked account: <Account(RegistrationResource(body=Registration(key=JWKRSA(key=<ComparableRSAKey(<cryptography.hazmat.backends.openssl.rsa._RSAPublicKey object at 0x7f3e4956be48>)>), contact=('mailto:root@admin.surveypanelgroup.com',), agreement='https://letsencrypt.org/documents/LE-SA-v1.0.1-July-27-2015.pdf', status=None, terms_of_service_agreed=None, only_return_existing=None, external_account_binding=None), uri='https://acme-v01.api.letsencrypt.org/acme/reg/937868', new_authzr_uri='https://acme-v01.api.letsencrypt.org/acme/new-authz', terms_of_service='https://letsencrypt.org/documents/LE-SA-v1.0.1-July-27-2015.pdf'), a2fb27f74862605a9da016f550d96eaf, Meta(creation_dt=datetime.datetime(2016, 3, 21, 22, 23, 42, tzinfo=<UTC>), creation_host='mws1.trancos.com'))>
2020-02-24 23:37:19,734:DEBUG:acme.client:Sending GET request to https://acme-v02.api.letsencrypt.org/directory.
2020-02-24 23:37:19,738:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): acme-v02.api.letsencrypt.org:443
2020-02-24 23:37:19,906:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "GET /directory HTTP/1.1" 200 658
2020-02-24 23:37:19,907:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Date: Tue, 25 Feb 2020 07:37:19 GMT
Content-Type: application/json
Content-Length: 658
Connection: keep-alive
Cache-Control: public, max-age=0, no-cache
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
  "0cGFK2-kBzM": "https://community.letsencrypt.org/t/adding-random-entries-to-the-directory/33417",
  "keyChange": "https://acme-v02.api.letsencrypt.org/acme/key-change",
  "meta": {
    "caaIdentities": [
      "letsencrypt.org"
    ],
    "termsOfService": "https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf",
    "website": "https://letsencrypt.org"
  },
  "newAccount": "https://acme-v02.api.letsencrypt.org/acme/new-acct",
  "newNonce": "https://acme-v02.api.letsencrypt.org/acme/new-nonce",
  "newOrder": "https://acme-v02.api.letsencrypt.org/acme/new-order",
  "revokeCert": "https://acme-v02.api.letsencrypt.org/acme/revoke-cert"
}
2020-02-24 23:37:19,919:DEBUG:certbot._internal.renewal:Auto-renewal forced with --force-renewal...
2020-02-24 23:37:19,919:INFO:certbot._internal.main:Renewing an existing certificate
2020-02-24 23:37:20,077:DEBUG:certbot.crypto_util:Generating key (2048 bits): /etc/letsencrypt/keys/0067_key-certbot.pem
2020-02-24 23:37:20,081:DEBUG:certbot.crypto_util:Creating CSR: /etc/letsencrypt/csr/0067_csr-certbot.pem
2020-02-24 23:37:20,082:DEBUG:acme.client:Requesting fresh nonce
2020-02-24 23:37:20,082:DEBUG:acme.client:Sending HEAD request to https://acme-v02.api.letsencrypt.org/acme/new-nonce.
2020-02-24 23:37:20,122:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "HEAD /acme/new-nonce HTTP/1.1" 200 0
2020-02-24 23:37:20,123:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Date: Tue, 25 Feb 2020 07:37:20 GMT
Connection: keep-alive
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: 0001oprDzVa0IYsx7TSBrYqXGwZ6GDLJYRhxkC30NGsnhL

Yes, when I try to run it for that one, it fails too.

The error message that you see is coming from the certificate authority and so it's less likely to be related to your Certbot upgrade.

It's much more likely to be related to this recent change on the CA side:

(the "secondary validation" is an additional attempt to connect to your server, from elsewhere on the Internet, for validation purposes)

It's logically possible that the Certbot update had some connection with this, but it's unlikely because there haven't been major changes in the standalone authenticator in the Certbot code recently. Also, if the firewall were completely uninvolved, I would expect to see something like "connection refused" rather than "timeout during connect" in case of a failure to connect to your server.
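
The difference between "connection refused" and "timeout during connect" can be checked directly. The following is a minimal Python sketch, not part of Certbot; the function name `probe_tcp` and the 5-second default timeout are illustrative assumptions. It only says something about inbound filtering when it is run from a machine outside the firewall.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Classify a single TCP connection attempt (illustrative helper).

    Returns:
      "open"     - the handshake completed, something is listening
      "refused"  - the host actively rejected the connection, so packets
                   reach it but nothing is listening on that port
      "timeout"  - no answer at all within `timeout` seconds, which is
                   typical of a firewall silently dropping packets
      "error: …" - anything else, such as a DNS lookup failure
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except (socket.timeout, TimeoutError):
        return "timeout"
    except OSError as exc:
        return "error: {}".format(exc)
```

For example, calling `probe_tcp("admin.freeflys.com", 80)` from several outside networks while the standalone authenticator is waiting would show whether some of them get "timeout" while others get "open".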

Because of the recent multi-viewpoint validation change, it's possible that firewall rules that didn't prevent certificate issuance before are now preventing it (by blocking some of the secondary validation connections).

@schoen That seems possible. If we haven’t changed anything on our side, and there really were no changes to the standalone authenticator, it makes sense that the secondary validation is the issue. We do block large ranges of foreign IPs, e.g. China, Russia, etc. I didn’t see any references to which IPs we should whitelist (though maybe that’s the point?). Any recommendations on how to move forward, short of disabling the firewall completely while upgrading?

Still, if this is when the big certbot-auto Python 3 upgrade happened, there's some risk of issues with the standard library or other bundled dependencies. But a timeout error is still unlikely, as you said. (A weird networking bug is more likely to get "connection reset by peer" or something.)

For what it's worth, last I looked, Let's Encrypt validated from two countries, neither of which is China or Russia. I'm saying that in the interest of debugging; that's not a guarantee and it is subject to change.

Correct. :grimacing:

If that's the issue, it has nothing to do with your Certbot version, or when you upgrade it. It's just how validation works, regardless of what ACME client you use or what version it is.

You need to allow all IPs to access files in /.well-known/acme-challenge/ when validating. You can also switch to DNS validation (if possible with your setup, and assuming your DNS service is unrestricted).

After doing some exhaustive searching through our firewall rules, it appears that Let’s Encrypt was trying to reach us through some Amazon AWS IP ranges that we had blocked a long, long time ago due to bad actors. After opening up those IP ranges, we were able to successfully renew these certificates.
Therefore, it does look like the recent multi-viewpoint validation change was the issue. I’m not sure whether that was tied into the upgrade from 1.0.0 to 1.2.0 or not, but there we go. Case closed. Thanks for your assistance, all.
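
For anyone repeating this search, a short script can match source addresses from the firewall's drop log against the blocked ranges. This is only a sketch under assumptions: `blocking_rules` is a hypothetical helper, and the addresses in the usage note are reserved documentation ranges used as placeholders, not real Let’s Encrypt or AWS addresses (as noted above, the validation IPs are not published and can change).

```python
import ipaddress


def blocking_rules(source_ip, blocked_cidrs):
    """Return the blocked networks (as strings) that contain source_ip.

    Hypothetical helper: `blocked_cidrs` is whatever list of CIDR ranges
    your firewall currently drops, supplied by you.
    """
    addr = ipaddress.ip_address(source_ip)
    hits = []
    for cidr in blocked_cidrs:
        # strict=False accepts entries like "203.0.113.5/24" with host bits set.
        net = ipaddress.ip_network(cidr, strict=False)
        # Only compare like with like: IPv4 against IPv4, IPv6 against IPv6.
        if addr.version == net.version and addr in net:
            hits.append(str(net))
    return hits
```

With placeholder data, `blocking_rules("203.0.113.7", ["198.51.100.0/24", "203.0.113.0/24"])` reports the second range as the one that would drop the connection. Feeding it the source IPs logged as dropped on port 80 during a failed renewal attempt narrows down which old block entries to review.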

This was the issue. Old IP ranges that we had banned are now being used by Let’s Encrypt to perform additional validations. Thanks @schoen

I’m glad you were able to figure it out!
