When my certificate auto-renews, the challenges fail several times, then succeed

Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. crt.sh | example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.

My domain is:
spaceflight.training

I ran this command:
(auto renew)
It produced this output:
(pages and pages of log output, including failures followed by successes. I do not know which parts are relevant)
My web server is (include version):
Apache/2.4.41 (Ubuntu)
The operating system my web server runs on is (include version):
Ubuntu 20.04
My hosting provider, if applicable, is:
DigitalOcean
I can login to a root shell on my machine (yes or no, or I don't know):
yes
I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
no
The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
certbot 0.40.0

I have used certbot auto-renewal for years. As the log entries show, when the certificate auto-renews and the challenges are attempted, the first few challenges fail, but eventually one succeeds. This has been going on for years. Since it eventually works, I haven't been too concerned, but now I would like to find the root cause, in case one day it does not eventually work.

Example failures grepped out of the log:

2023-05-02 23:54:37,337:WARNING:certbot.auth_handler:Challenge failed for domain spaceflight.training
2023-05-03 07:37:44,426:WARNING:certbot.auth_handler:Challenge failed for domain spaceflight.training
2023-05-03 12:10:48,088:WARNING:certbot.auth_handler:Challenge failed for domain spaceflight.training
2023-05-04 05:03:57,700:WARNING:certbot.auth_handler:Challenge failed for domain spaceflight.training
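
For readers who want to pull the same lines from their own logs, here is a rough sketch. The function name `challenge_failures` is made up for illustration, and the `letsencrypt.log*` glob in the usage comment assumes rotated logs sit next to the main log file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every "Challenge failed" warning from the given
# Certbot log files, oldest first.
challenge_failures() {
    # -h drops the file-name prefix so lines from several files sort by timestamp.
    grep -h 'Challenge failed for domain' "$@" | sort
}

# Usage (log files are normally readable by root only):
#   challenge_failures /var/log/letsencrypt/letsencrypt.log*
```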

But this shows that it eventually worked on 05-04:


Found the following certs:
Certificate Name: spaceflight.training
Domains: spaceflight.training
Expiry Date: 2023-08-02 16:19:44+00:00 (VALID: 87 days)
Certificate Path: /etc/letsencrypt/live/spaceflight.training/fullchain.pem
Private Key Path: /etc/letsencrypt/live/spaceflight.training/privkey.pem


There may be something obvious here, but I find the log output very difficult to understand. What information should I provide for diagnosis?

Bottom line: I seek the cause of the challenge failures followed by success.

Welcome to the community @UncleBilla

First, let's look at your renewal config file. I do see at least one problem (your use of NameCheap URL forward) but let's start here.

Show us this file

/etc/letsencrypt/renewal/spaceflight.training.conf

Thank you for your interest. Here is the contents of the file you requested:

# renew_before_expiry = 30 days
version = 0.40.0
archive_dir = /etc/letsencrypt/archive/spaceflight.training
cert = /etc/letsencrypt/live/spaceflight.training/cert.pem
privkey = /etc/letsencrypt/live/spaceflight.training/privkey.pem
chain = /etc/letsencrypt/live/spaceflight.training/chain.pem
fullchain = /etc/letsencrypt/live/spaceflight.training/fullchain.pem

# Options used in the renewal process
[renewalparams]
account = fa2d47a1fcc745dc25ced87980fbd4fd
authenticator = apache
installer = apache
server = https://acme-v02.api.letsencrypt.org/directory

OK, there are two problems (at least) :)

One is that you have two IP addresses in the DNS for this domain name. That is allowed, but I am pretty sure it is wrong in this case. Some requests to your domain time out and others do not. This is likely because one of the IP addresses is wrong (probably 206.189.217.113).

Two is that you are using the NameCheap URL Forward. Frankly, I'm not sure how you ever got a cert with that enabled, because it interferes with the ACME Challenge URL.

curl -I http://spaceflight.training/.well-known/acme-challenge/ForumTest123
HTTP/1.1 302 Found
Date: Sun, 07 May 2023 14:53:51 GMT
Connection: keep-alive
Location: https://spaceflight.training
X-Served-By: Namecheap URL Forward
Server: namecheap-nginx

Note that the Location: header has dropped the /.well-known/... part of the URI.
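
The same check can be scripted. The sketch below reads response headers on stdin and reports whether a redirect's Location still ends with the requested path. The function name `redirect_keeps_path` is made up for illustration, and `ForumTest123` is only a placeholder token. Let's Encrypt follows redirects during HTTP validation, so a redirect by itself is not the problem; losing the path is.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read HTTP response headers on stdin and check whether
# the first Location header still ends with the path that was requested.
redirect_keeps_path() {
    local path="$1" location
    # Strip carriage returns, then take the value of the first Location header.
    location=$(tr -d '\r' | awk '!found && tolower($1) == "location:" { print $2; found = 1 }')
    if [ -z "$location" ]; then
        echo "no redirect"
        return 0
    fi
    case "$location" in
        *"$path") echo "path preserved: $location" ;;
        *)        echo "path dropped: $location"; return 1 ;;
    esac
}

# Usage against a live site (needs network access):
#   curl -sI "http://spaceflight.training/.well-known/acme-challenge/ForumTest123" \
#     | redirect_keeps_path "/.well-known/acme-challenge/ForumTest123"
```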

You should change the DNS setting so that it contains only your server's IP address.

Once you get those sorted, try the following and let us know the results:

sudo certbot renew --dry-run

Omit sudo if not needed.


Thanks very much for the reply.

I removed the redirects in Namecheap for spaceflight.training.

The test at Let's Debug now passes, and so does the command you asked me to try:

Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/spaceflight.training.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator apache, Installer apache
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for spaceflight.training
Waiting for verification...
Cleaning up challenges

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
new certificate deployed with reload of apache server; fullchain is
/etc/letsencrypt/live/spaceflight.training/fullchain.pem
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates below have not been saved.)

Congratulations, all renewals succeeded. The following certs have been renewed:
  /etc/letsencrypt/live/spaceflight.training/fullchain.pem (success)
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates above have not been saved.)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

It sure looks like you have solved my long-standing problem!

The website is displaying in Chrome now, but not in Firefox. I will need to look into that, flush the cache, etc.

Many thanks.


In your NameCheap config for this domain name there is a section for DNS. You have two A records with different IP addresses there. And, no, not related to any other domain name.

spaceflight.training.   1799    IN      A       206.189.217.113
spaceflight.training.   1799    IN      A       192.64.119.23

From your own server you can run this to check your public IP. It should be the only address in your DNS:

curl -4 ifconfig.io
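
To compare the two automatically, something like the following sketch could work. The function name `stray_a_records` is hypothetical, and the usage line assumes the `dig` utility is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the first argument is the server's public IPv4 address,
# the remaining arguments are the A records published for the domain.
# Prints every record that does not match the server and returns non-zero
# if any were found.
stray_a_records() {
    local server_ip="$1"; shift
    local ip status=0
    for ip in "$@"; do
        if [ "$ip" != "$server_ip" ]; then
            echo "unexpected A record: $ip"
            status=1
        fi
    done
    return "$status"
}

# Usage on the server itself (needs network access and dig):
#   stray_a_records "$(curl -4 -s ifconfig.io)" $(dig +short A spaceflight.training)
```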

Also, your NameCheap DNS settings have a section for URL Forwarding. You should disable that. It has been a long time since I used NameCheap, so I don't remember exactly how that is done. The NameCheap help topics should cover this.


Thank you for that info. I did finally see how to do it in the Namecheap GUI.

I edited my last post - apologies if that's bad form here.

Bottom line, I think you have nailed it, and thanks very much for the help. I'm sorry I waited so long to ask here about it.


Huh. Something very odd still persists. I see you have just the one IP address now in your DNS (although not the one I was guessing).

I cannot reach your domain at all and neither could a Let's Debug test I just ran. I saw your edited post showing success so I am puzzled. And, yes, best to make new posts rather than edit past ones.


Yes, I repeated both the command line test and the Let's Debug test, and they failed for me.

I have an old, non-secure website hosted on the same server. It's called eclectichouston.com

I suspect I am doing something wrong in how all this is configured. Both websites are showing the same IP now in Namecheap. Formerly, 192.64.119.23 was the IP for eclectichouston.com

This is undoubtedly something I have configured wrong at Namecheap. I am looking at that.

Just double-checking, what did this show?

curl -4 https://ifconfig.io

It returns the 206.189.217.113 IP.

Oddly, the DNS-01 check on Let's Debug passes.

Not the same thing. That's testing a possible DNS Challenge but you use HTTP Challenge. Also, it doesn't test all that much as DNS Challenge can be complex.


Do you have a firewall active?

Your eclectichouston.com fails too. Is that just another VirtualHost in your Apache or is that an entirely different virtual server?


Yes, I have an active firewall and I geo-block much of the world. Depending on where Let's Encrypt's IPs are located, that could definitely be causing problems.

Ding ding ding. Let's Encrypt checks from various points around the world with constantly changing IP addresses. The IP addresses are not published.


Thanks. I'll disable that temporarily, and test the renewal.

That will be an acceptable thing for me to do for renewals, actually. If I have to turn off the firewall, let it renew, and then turn it back on, that will work, since this is just a hobby website anyway.

Your help has fixed the long-standing problem of the page timing out, and I deeply appreciate that.


If you want to get fancy, you could look at the Certbot pre-hook and post-hook to automate opening and closing the firewall:
https://eff-certbot.readthedocs.io/en/stable/using.html#renewing-certificates
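
A minimal sketch of that idea follows. The thread does not say which firewall or geo-blocking tool is in use, so the two scripts that lift and restore the block are left as arguments, and the wrapper name `renew_with_hooks` is made up for illustration. `--pre-hook` and `--post-hook` are the Certbot options described at the link above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a renewal with one command that opens the
# firewall beforehand and another that closes it again afterwards.
#   $1 = command that lifts the geo-block (assumed to exist on your system)
#   $2 = command that restores it
renew_with_hooks() {
    local pre="$1" post="$2"
    certbot renew --pre-hook "$pre" --post-hook "$post"
}

# Usage (run as root; both script paths are placeholders):
#   renew_with_hooks /usr/local/bin/open-firewall.sh /usr/local/bin/close-firewall.sh
```

Certbot only runs these hooks when a certificate is actually due for renewal, so the firewall stays closed on the days when nothing needs renewing.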


I will do that. Thanks again.
