Renewing certs with --nginx works only with --debug-challenges after waiting 5 seconds

My domain is: rm.jeremymouzin.com

I ran this command: sudo certbot --dry-run certonly --nginx -d rm.jeremymouzin.com -v

It produced this output:

Plugins selected: Authenticator nginx, Installer nginx
Certificate is due for renewal, auto-renewing...
Simulating renewal of an existing certificate for rm.jeremymouzin.com
Performing the following challenges:
http-01 challenge for rm.jeremymouzin.com
Waiting for verification...
Challenge failed for domain rm.jeremymouzin.com
http-01 challenge for rm.jeremymouzin.com

Certbot failed to authenticate some domains (authenticator: nginx). The Certificate Authority reported these problems:
  Domain: rm.jeremymouzin.com
  Type:   unauthorized
  Detail: 2001:4b98:dc0:43:f816:3eff:fe3d:54fb: Invalid response from https://rm.jeremymouzin.com/.well-known/acme-challenge/dPTXpAKiUWnt9j2PS-anPtHNDvOHrUYNnFGErlTScB8: 404

Hint: The Certificate Authority failed to verify the temporary nginx configuration changes made by Certbot. Ensure the listed domains point to this nginx server and that it is accessible from the internet.

Cleaning up challenges
Some challenges have failed.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

My web server is (include version): nginx/1.22.1

The operating system my web server runs on is (include version): Debian 12.5 bookworm

My hosting provider, if applicable, is: gandi.net

I can login to a root shell on my machine (yes or no, or I don't know): yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): no

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): certbot 2.11.0 (installed via snapd)

The problem

I have a really weird problem: renewing my certificate with the command above fails, BUT if I run it with --dry-run AND --debug-challenges AND wait 5 seconds before pressing ENTER, the dry run works! So I can't even renew manually, because there is no wait before validation (or maybe there is an option for that? Anyway, it should work without having to wait!).

Additional information

Yesterday I read almost all the posts in this forum that were relevant to my issue, so here is some information you might otherwise ask me for later.

In my DNS I have A and AAAA records that point my subdomain to the IP addresses of my VPS, and I have configured nginx for both IPv4 and IPv6 (without forgetting listen [::]:80 for IPv6).

Here is my nginx config:

# Redirection from http:// to https://
server {
 listen 80;
 listen [::]:80;

 server_name rm.jeremymouzin.com;

 return 301 https://$host$request_uri;
}

server {
 listen 443 ssl http2; # managed by Certbot
 listen [::]:443 ssl http2;

 ssl_certificate /etc/letsencrypt/live/rm.jeremymouzin.com/fullchain.pem; # managed by Certbot
 ssl_trusted_certificate /etc/letsencrypt/live/rm.jeremymouzin.com/fullchain.pem; # managed by Certbot
 ssl_certificate_key /etc/letsencrypt/live/rm.jeremymouzin.com/privkey.pem; # managed by Certbot
 include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
 ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot

 root /var/www/rm.jeremymouzin.com;
 server_name rm.jeremymouzin.com;

 # Disable unwanted HTTP methods
 if ($request_method !~ ^(GET|HEAD|POST)$) {
  return 405; 
 }

 # ModSecurity-nginx dynamic module
 modsecurity on;
 modsecurity_rules_file /etc/nginx/modsec_includes.conf;

 location / {
  proxy_pass http://localhost:3334;
  proxy_http_version 1.1;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection 'upgrade';
  proxy_set_header Host $host;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-Proto $scheme;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_cache_bypass $http_upgrade;
 }

 location /security.txt {
  return 301 https://$host/.well-known/security.txt;
 }
}

I already tried removing the modsecurity section and the "# Disable unwanted HTTP methods" block; it still doesn't work.

I have NO geoblocking feature from firewall / whatever as you can see here: https://check-host.net/check-dns?host=rm.jeremymouzin.com&csrf_token=a971351a9e70b672b30369f098efc96a8160e88e

My ports 80 and 443 are open and reachable from the internet:

$ nmap -p80,443 rm.jeremymouzin.com
Starting Nmap 7.92 ( https://nmap.org ) at 2024-06-08 07:09 CEST
Nmap scan report for rm.jeremymouzin.com (46.226.107.169)
Host is up (0.027s latency).
Other addresses for rm.jeremymouzin.com (not scanned): 2001:4b98:dc0:43:f816:3eff:fe3d:54fb
rDNS record for 46.226.107.169: xvm-107-169.dc0.ghst.net

PORT    STATE SERVICE
80/tcp  open  http
443/tcp open  https

Nmap done: 1 IP address (1 host up) scanned in 0.35 seconds

My certbot certificates:

Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Found the following certs:
  Certificate Name: rm.jeremymouzin.com
    Serial Number: 4a8cd248bd22f1363e39d6f053ff6f299cd
    Key Type: ECDSA
    Domains: rm.jeremymouzin.com
    Expiry Date: 2024-06-27 19:38:50+00:00 (VALID: 19 days)
    Certificate Path: /etc/letsencrypt/live/rm.jeremymouzin.com/fullchain.pem
    Private Key Path: /etc/letsencrypt/live/rm.jeremymouzin.com/privkey.pem
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

(I cut out other certificates for other domains that are not relevant here and that don't renew either with the same issue!)

I also checked the temporary nginx configuration that certbot creates. I ran sudo certbot --dry-run certonly --nginx -d rm.jeremymouzin.com -v --debug-challenges and, as soon as it asked me to press Enter, I looked at the subdomain's nginx config file to see what had changed. Here is what it produces:

$ cat /etc/nginx/sites-available/rm.jeremymouzin.com 

# Redirection from http:// to https://
server {rewrite ^(/.well-known/acme-challenge/.*) $1 break; # managed by Certbot


 listen 80;
 listen [::]:80;

 server_name rm.jeremymouzin.com;

 return 301 https://$host$request_uri;
location = /.well-known/acme-challenge/Y5alQPckDhfQdYXS2XrOONBswQ0Q7J1dL1HweJAJbtQ{default_type text/plain;return 200 Y5alQPckDhfQdYXS2XrOONBswQ0Q7J1dL1HweJAJbtQ.IPOnz7gQvri-q1k781JN7bi0RYDIHgcqeoOdhhVkgVM;} # managed by Certbot

}

server {rewrite ^(/.well-known/acme-challenge/.*) $1 break; # managed by Certbot


 listen 443 ssl http2; # managed by Certbot
 listen [::]:443 ssl http2;

 ssl_certificate /etc/letsencrypt/live/rm.jeremymouzin.com/fullchain.pem; # managed by Certbot
 ssl_trusted_certificate /etc/letsencrypt/live/rm.jeremymouzin.com/fullchain.pem; # managed by Certbot
 ssl_certificate_key /etc/letsencrypt/live/rm.jeremymouzin.com/privkey.pem; # managed by Certbot
 include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
 ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot

 root /var/www/rm.jeremymouzin.com;
 server_name rm.jeremymouzin.com;

 # Disable unwanted HTTP methods
 if ($request_method !~ ^(GET|HEAD|POST)$) {
  return 405; 
 }

 # ModSecurity-nginx dynamic module
 modsecurity on;
 modsecurity_rules_file /etc/nginx/modsec_includes.conf;

 location / {
  proxy_pass http://localhost:3334;
  proxy_http_version 1.1;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection 'upgrade';
  proxy_set_header Host $host;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-Proto $scheme;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_cache_bypass $http_upgrade;
 }

 location /security.txt {
  return 301 https://$host/.well-known/security.txt;
 }

location = /.well-known/acme-challenge/Y5alQPckDhfQdYXS2XrOONBswQ0Q7J1dL1HweJAJbtQ{default_type text/plain;return 200 Y5alQPckDhfQdYXS2XrOONBswQ0Q7J1dL1HweJAJbtQ.IPOnz7gQvri-q1k781JN7bi0RYDIHgcqeoOdhhVkgVM;} # managed by Certbot

}

It seems to add what the verification needs (location = /.well-known/acme-challenge/XXXX). In fact, when I open the HTTP URL http://rm.jeremymouzin.com/.well-known/acme-challenge/Y5alQPckDhfQdYXS2XrOONBswQ0Q7J1dL1HweJAJbtQ in my browser, I get the expected output! And I can see in the Network tab that it first redirects to https, as expected. (I also tried removing the https redirection from the nginx config, reloading nginx, and renewing the cert again to see if that would help. It did not.)
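
A browser only tests one address family from one location, while the error above was reported for the IPv6 address. As a rough sketch (not part of certbot), the check can be repeated from a shell over IPv4 and IPv6 separately while certbot is paused at the --debug-challenges prompt. The function name probe_url and the CURL override variable are made up for this example; -k is used on the assumption that the validator does not verify the certificate after the redirect to https.

```shell
#!/bin/sh
# probe_url URL: print the final HTTP status seen over IPv4 and over IPv6.
# -L follows the http -> https redirect; -k skips certificate verification
# (assumption: the CA's validator does not check the certificate either).
# CURL is a hypothetical override hook, handy for testing without a network.
probe_url() {
  url="$1"
  for family in 4 6; do
    code=$(${CURL:-curl} "-$family" -sS -k -L -o /dev/null \
             -w '%{http_code}' "$url" 2>/dev/null) || code="unreachable"
    echo "IPv$family: $code"
  done
}

# Example, with the token shown in the certbot prompt:
# probe_url "http://rm.jeremymouzin.com/.well-known/acme-challenge/<token>"
```

If the two lines differ (for example 200 over IPv4 and 404 over IPv6), the two address families are probably not being answered by the same configuration.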

The weird thing is that when I wait and then hit Enter, the test is successful! As if it needed some time to get the verification done right! I tried with a 3-second delay and it doesn't work. It needs at least 5 seconds between the moment it shows me this prompt:

[...]
Challenges loaded. Press continue to submit to CA.

The following URLs should be accessible from the internet and return the value
mentioned:

URL:
http://rm.jeremymouzin.com/.well-known/acme-challenge/Nuw-Z7DWAecUqGeIIF_OO82jsKsmRhS28zg9P0_Po6U
Expected value:
Nuw-Z7DWAecUqGeIIF_OO82jsKsmRhS28zg9P0_Po6U.IPOnz7gQvri-q1k781JN7bi0RYDIHgcqeoOdhhVkgVM
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Press Enter to Continue

and the moment I hit Enter.

Maybe I issued too many certificates or something? (I didn't use the --dry-run argument while configuring my server for my project, so I issued several certificates for the same subdomain, sorry about that!) See crt.sh | rm.jeremymouzin.com. Could that be the issue?

I can't figure out what makes this delay necessary to make the renewal verification work?! I can't figure out what I did wrong.

Oh, and I also tried creating a /.well-known/acme-challenge/ subdirectory with a test.txt file in it. It's perfectly reachable from my browser. I even tried chmod 777 on acme-challenge (why not); that didn't work either.

Based on the nginx config, the file system should not be involved in the verification at all, since both the location AND the response come directly from the directive certbot adds (location = /.well-known/acme-challenge/Y5alQPckDhfQdYXS2XrOONBswQ0Q7J1dL1HweJAJbtQ{default_type text/plain;return 200 Y5alQPckDhfQdYXS2XrOONBswQ0Q7J1dL1HweJAJbtQ.IPOnz7gQvri-q1k781JN7bi0RYDIHgcqeoOdhhVkgVM;}), but as I'm not an nginx expert I tried anyway. It didn't work.

I'm pretty much out of options, and I hope you will point out something completely stupid that I missed and that will make everything work. Thanks for your help.

Welcome to the community, @jemo. That is quite a thorough first post :)

I think using --nginx-sleep-seconds 5 instead of --debug-challenges should work. After certbot makes its temporary change to the server blocks, it must reload nginx so the new values take effect. This option tells certbot to wait longer for that reload to finish before asking the CA to validate.

But that is extremely unusual. Even very large nginx configs reload within 1 second but sometimes need 2 or 3. How long does it take for a reload to complete when you do one from command line?

From the docs:

--nginx-sleep-seconds NGINX_SLEEP_SECONDS
Number of seconds to wait for nginx configuration changes to apply when reloading. (default: 1)
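
A minimal sketch of how one might search for the smallest delay that works, assuming the --nginx-sleep-seconds option quoted above. The function name find_min_sleep and the CERTBOT override variable are hypothetical, introduced only for this example. Every attempt is a real dry run against the CA's test environment, so the list of delays is best kept short.

```shell
#!/bin/sh
# find_min_sleep DOMAIN DELAY...: run the dry run once per delay, in order,
# and print the first delay for which certbot exits successfully.
# CERTBOT is a hypothetical override hook so the command can be stubbed out.
find_min_sleep() {
  domain="$1"; shift
  for delay in "$@"; do
    if ${CERTBOT:-sudo certbot} --dry-run certonly --nginx \
         --nginx-sleep-seconds "$delay" -d "$domain" >/dev/null 2>&1; then
      echo "$delay"
      return 0
    fi
  done
  return 1   # none of the delays was long enough
}

# Example:
# find_min_sleep rm.jeremymouzin.com 1 5 10 15
```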

Yes, the anti-spam protection didn't like it, but I thought it would be a good idea to summarize everything I had learned for others and to list everything I had already tried, to avoid back-and-forth messages.

I tried with a delay of 10 secs but it didn't work... Then 15 seconds and it DID WORK. Thank you!

Reloading nginx doesn't seem to be slow, though (less than 1s on average):

$ time sudo systemctl reload nginx

real	0m1.223s
user	0m0.011s
sys	0m0.000s
$ time sudo systemctl reload nginx

real	0m0.778s
user	0m0.000s
sys	0m0.017s
$ time sudo systemctl reload nginx

real	0m0.693s
user	0m0.011s
sys	0m0.006s
$ time sudo systemctl reload nginx

real	0m0.763s
user	0m0.017s
sys	0m0.000s

My nginx config is 571 lines long (I have several subdomains). Is that big?

$ sudo nginx -T | wc -l
2024/06/08 10:47:16 [notice] 1094153#1094153: ModSecurity-nginx v1.0.3 (rules loaded inline/local/remote: 0/2364/0)
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
571

So it worked but damn, why the hell does it have to wait so long? Do you have any idea?

Now I must find a way to add this option to the systemd automatic renewal process...

Thanks again for your time and quick support.

I found the issue! I have lots of security-related directives in my nginx.conf file. I commented them all out just to check, and it works without delay now!

I'll find out exactly which one causes this behavior and post it here later today (I have to go).

Your better option would be to use the --webroot method instead of --nginx.

Certbot --webroot places an actual file in a folder nginx already knows about. It does not require nginx to be reloaded. It also makes no permanent changes to your nginx config, so you must set up your HTTPS (port 443) server block yourself. Lastly, you need a deploy hook or some other method to reload nginx after you get a fresh cert, so that nginx picks it up.

webroot is commonly used, especially for larger or unusual nginx configs, and by people who don't want another program modifying their nginx config at all.

To set up webroot, make your port 80 server block look like the one below. I based this on what you showed earlier.

server {
    listen 80;
    listen [::]:80; 
    server_name rm.jeremymouzin.com;

    location /.well-known/acme-challenge/ {
        root /var/certbot;       # make/use any folder you prefer
    }
    location / {
       return 301 https://$host$request_uri;
    }
}

Then, your Certbot command is

sudo certbot certonly --webroot -w /var/certbot -d rm.jeremymouzin.com --deploy-hook "sudo systemctl reload nginx"

Make sure the -w path matches the root value you chose
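
Before running certbot, it can help to confirm that nginx really serves files from that folder. The sketch below is only an illustration: make_probe and probe.txt are made-up names, and the function must run as a user who can write to the webroot. It drops a throwaway file where certbot would place its token and prints the URL that should return it.

```shell
#!/bin/sh
# make_probe WEBROOT DOMAIN: create a throwaway file in the directory where
# certbot --webroot would write its challenge token, then print the URL
# from which nginx should serve it.
make_probe() {
  webroot="$1"; domain="$2"
  dir="$webroot/.well-known/acme-challenge"
  mkdir -p "$dir"
  echo "probe-ok" > "$dir/probe.txt"
  echo "http://$domain/.well-known/acme-challenge/probe.txt"
}

# Example:
# url=$(make_probe /var/certbot rm.jeremymouzin.com)
# curl -4 "$url"; curl -6 "$url"      # both should print: probe-ok
# rm /var/certbot/.well-known/acme-challenge/probe.txt
```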

Thanks a lot for your detailed explanation for using webroot!

But it's worse than I expected. It wasn't commenting out my security-related directives that made it work. It was simply reloading the nginx server.

This is crazy: if I SSH into my VPS and run the renew command right away, it doesn't work. But if I reload nginx just once and run the renew command again, it works perfectly, with NO change to any configuration file and without needing the delay.

Look:

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sat Jun  8 18:26:22 2024 from 77.133.250.153
debian@saas:~$ sudo certbot --dry-run certonly --nginx -d rm.jeremymouzin.com -v
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator nginx, Installer nginx
Certificate is due for renewal, auto-renewing...
Simulating renewal of an existing certificate for rm.jeremymouzin.com
Performing the following challenges:
http-01 challenge for rm.jeremymouzin.com
Waiting for verification...
Challenge failed for domain rm.jeremymouzin.com
http-01 challenge for rm.jeremymouzin.com

Certbot failed to authenticate some domains (authenticator: nginx). The Certificate Authority reported these problems:
  Domain: rm.jeremymouzin.com
  Type:   unauthorized
  Detail: 2001:4b98:dc0:43:f816:3eff:fe3d:54fb: Invalid response from https://rm.jeremymouzin.com/.well-known/acme-challenge/j8IaMrAeYAhCABtuU6_jLkLVIW7P1X14FSz9Mr57uYU: 404

Hint: The Certificate Authority failed to verify the temporary nginx configuration changes made by Certbot. Ensure the listed domains point to this nginx server and that it is accessible from the internet.

Cleaning up challenges
Some challenges have failed.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.
debian@saas:~$ sudo systemctl reload nginx
debian@saas:~$ sudo certbot --dry-run certonly --nginx -d rm.jeremymouzin.com -v
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator nginx, Installer nginx
Certificate is due for renewal, auto-renewing...
Simulating renewal of an existing certificate for rm.jeremymouzin.com
Performing the following challenges:
http-01 challenge for rm.jeremymouzin.com
Waiting for verification...
Cleaning up challenges
The dry run was successful.
debian@saas:~$

And if I now try to renew all my subdomains with --dry-run, they all work normally. What is happening? I don't understand anything! Is there a cache or something with ssh / nginx?! I have no idea why it doesn't work right away.

I'll try to use webroot instead for verification, hoping it will work with systemd automatically...

Is nginx running before you try your first certbot command with --nginx?

Because there is a known bug in certbot with --nginx on systemd systems. If nginx is not running beforehand, Certbot starts nginx itself, but not through systemd. You then cannot control that nginx instance with systemd, which causes other problems.

The work-around, and best practice anyway, is to always have nginx running before using the --nginx option. We don't see this fail on systems once they are up and running. Just sometimes when people are just starting out or if they try to manually correct something and don't pre-start nginx.
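
As a small illustration of that practice, a wrapper can refuse to call certbot unless systemd reports nginx as active. This is a sketch only: run_if_nginx_active and the SYSTEMCTL override variable are hypothetical names, and it assumes the service unit is called nginx.

```shell
#!/bin/sh
# run_if_nginx_active CMD...: run CMD only when systemd reports nginx active.
# SYSTEMCTL is a hypothetical override hook so the check can be stubbed out.
run_if_nginx_active() {
  if ${SYSTEMCTL:-systemctl} is-active --quiet nginx; then
    "$@"
  else
    echo "nginx is not active under systemd; start it first" >&2
    return 1
  fi
}

# Example:
# run_if_nginx_active sudo certbot --dry-run certonly --nginx -d rm.jeremymouzin.com -v
```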

If that's not the problem I recommend switching to webroot. You'll have it up and running in less time than trying to figure out this weird delay problem.

Yes, nginx was always running when I performed my tests. Also, it's version 1.22, which is not the latest, so there could be a bug or something.

Anyway, I switched to the webroot method and it works perfectly, even right after SSHing into the server, while the other method still doesn't work without reloading nginx first.

I'll post how I changed the method to webroot soon so that other users can have a how-to tutorial to do the same if they encounter the same problem.

Thanks again for your time for this weird bug.

For the record, here is what I changed to make my renewal process use the webroot method instead of nginx:

I created a directory for certbot:
mkdir -p /var/www/certbot

In my /etc/nginx/sites-available/rm.jeremymouzin.com configuration file, I changed the http section to this (based on @MikeMcQ's post; I just used the /var/www/certbot directory instead of /var/certbot):

# Redirection from http:// to https://
server {
 listen 80;
 listen [::]:80;

 server_name rm.jeremymouzin.com;

 location /.well-known/acme-challenge/ {
  root /var/www/certbot;
 }
 
 location / {
  return 301 https://$host$request_uri;
 }
}
(...)

Then I used the reconfigure command of certbot like this:

$ sudo certbot reconfigure --cert-name rm.jeremymouzin.com --webroot -w /var/www/certbot --deploy-hook "sudo systemctl reload nginx"
Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
You are attempting to set a --deploy-hook. Would you like Certbot to run deploy
hooks when it performs a dry run with the new settings? This will run all
relevant deploy hooks, including directory hooks, unless --no-directory-hooks is
set. This will use the current active certificate, and not the temporary test
certificate acquired during the dry run.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(R)un deploy hooks/(D)o not run deploy hooks: R
Simulating renewal of an existing certificate for rm.jeremymouzin.com

Successfully updated configuration.
Changes will apply when the certificate renews.

I ran a renew command with --dry-run right after SSHing in, to be sure it works out of the box:

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sat Jun  8 18:53:46 2024 from 77.133.250.153
debian@saas:~$ sudo certbot renew --dry-run --cert-name rm.jeremymouzin.com
Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/rm.jeremymouzin.com.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Simulating renewal of an existing certificate for rm.jeremymouzin.com

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Congratulations, all simulated renewals succeeded: 
  /etc/letsencrypt/live/rm.jeremymouzin.com/fullchain.pem (success)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

And it worked out of the box!
It should also work with systemd automatically later. Here you can see that the certbot script to renew certificates will be triggered in 7 hours:

$ systemctl list-timers
NEXT                        LEFT          LAST                        PASSED       UNIT            >
(...)
Sun 2024-06-09 02:41:00 UTC 7h left       Sat 2024-06-08 14:38:08 UTC 4h 33min ago snap.certbot.renew.service
(...)

10 timers listed.
Pass --all to see loaded but inactive timers, too.

So we will see tomorrow whether it worked as expected and automatically renewed the certificate for my subdomain.
I will edit my post to let you know.

Again, I want to say a massive thank you to Mike! Thanks a lot for your lightning-fast support and help! Have a nice weekend.

You are very welcome. I am confident you now have a reliable solution. Cheers to you too.
