Certbot Nginx Plugin Failing Intermittently - Issues Persist After Updating to 2.11.0

Environment

  • Certbot version: 2.11.0 (upgraded during troubleshooting)
  • Web server: Nginx
  • Application: Django
  • Installation method: certbot --nginx installer

Background
We've been successfully using Let's Encrypt with our platform's custom domain functionality for about 1.5 years without issues. Our setup involves Nginx and Django, with certificates managed via the Certbot Nginx plugin.

Current Issue
Over the past couple of weeks, we've been experiencing intermittent failures with the Certbot Nginx installer. We upgraded to Certbot 2.11.0 in an attempt to resolve them, but the problems persist.

Key observations:

  • Some domains fail consistently while others work perfectly
  • Let's Debug tool confirms correct DNS configuration for failing domains
  • No recent changes to our Nginx configurations
  • Manual verification using webroot method succeeds, but we'd prefer to maintain our current approach

Troubleshooting Steps Taken

  1. Upgraded Certbot to latest version (2.11.0)
  2. Verified DNS configuration using Let's Debug tool
  3. Confirmed Nginx configurations are unchanged
  4. Tested certificate issuance with webroot method (successful)
  5. Verified process works for some domains but fails for others under identical conditions

Questions

  1. Is there a way to prevent Certbot from cleaning up its temporary files/configurations after challenges? This would help us debug the Nginx configuration changes it makes during the ACME challenge.
  2. Has anyone encountered similar intermittent failures with the Nginx plugin lately?

Any guidance on additional debugging approaches or potential solutions would be greatly appreciated.

Please show the error message(s).

Without logs and domain names it's hard to say anything definite, but the most common causes of intermittent failures include geoblocking (blocking a country that you need to allow) and multiple IPs or other load balancing for the same domain (validation goes to the wrong server).
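
A rough way to check the multiple-IP case is to count the distinct addresses a failing domain resolves to. This is only a sketch: the helper name `count_unique_addresses` is made up for illustration, `example.com` is a placeholder, and it assumes `dig` is installed. More than one address does not prove a problem, it only tells you the validation request could land on more than one server.

```shell
# Hypothetical helper: reads `dig +short` output on stdin and prints the
# number of distinct addresses. Lines ending in "." are CNAME targets that
# dig prints along the way, so they are skipped.
count_unique_addresses() {
    grep -v '\.$' | sort -u | grep -c .
}

# Typical use (placeholder domain; needs dig and network access):
#   { dig +short A example.com; dig +short AAAA example.com; } | count_unique_addresses
```

If the count is higher than expected, compare the addresses against the server that actually runs Certbot.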

Do you have a very large number of server blocks in the nginx config? And is the error a "404" (Not Found)?

If so, it sounds like nginx might need more time to reload after Certbot makes its temporary changes to the nginx config to set up the challenge.

You can allow for that by raising the wait from its default of 1s to 3s or 5s with this option:

--nginx-sleep-seconds NGINX_SLEEP_SECONDS
Number of seconds to wait for nginx configuration changes to apply when reloading. (default: 1)
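
As a sketch of how that could look in practice: the two helper functions below are made-up names for illustration, `example.com` is a placeholder, and the server-block count is only approximate because it matches lines that open a `server {` block. The only Certbot options used are `--nginx`, `-d` and the `--nginx-sleep-seconds` option quoted above.

```shell
# Hypothetical helper: counts lines that open a "server {" block in an
# nginx config dump read from stdin. "server host:port;" lines inside
# upstream blocks are not counted because they have no opening brace.
count_server_blocks() {
    grep -cE '^[[:space:]]*server[[:space:]]*\{'
}

# Hypothetical helper: requests a certificate for one domain and gives
# nginx longer to reload. The second argument defaults to 5 seconds.
issue_with_longer_sleep() {
    domain="$1"
    seconds="${2:-5}"
    certbot --nginx -d "$domain" --nginx-sleep-seconds "$seconds"
}

# Typical use (needs nginx and certbot installed, usually as root):
#   nginx -T 2>/dev/null | count_server_blocks
#   issue_with_longer_sleep example.com 5
```

If a longer wait makes the failures go away, slow reloads are the likely cause, and the value can be lowered again until failures reappear.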
