Certificate generation is intermittently failing with a 404 from nginx

My domain is: Multiple domains

I ran this command: sudo certbot certonly --agree-tos --noninteractive --webroot -d domain -d www.domain.com -w --config-dir /sites/ssl/

My web server is (include version): nginx

The operating system my web server runs on is (include version): Centos 7

My hosting provider, if applicable, is: NA

I can login to a root shell on my machine (yes or no, or I don't know): yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): certbot 2.7.4

Info:
My nginx server returns 404 for the validation file generated by certbot.
(I can confirm from the debug logs that the file is indeed created, yet nginx still returns 404.)
It's not an nginx config issue because it happens intermittently: I am able to generate the certificate 9 out of 10 times.
I have run several tests to confirm that there's no delay in nginx. I am running this setup in an NFS environment.

Debug logs:

Certbot failed to authenticate some domains (authenticator: webroot). The Certificate Authority reported these problems:
Domain: www.example.me
Type: unauthorized
Detail: XX.XX.XX.XX: Invalid response from http://www.example.me/.well-known/acme-challenge/TrTUHUyF9J6cijFijgZ4pxzxOBFUawrfrTdINYq9U-fE: 404

Domain: example.me
Type: unauthorized
Detail: During secondary validation: XX.XX.XX.XX: Invalid response from http://example.me/.well-known/acme-challenge/k1EZ0eXfOluefXIGYsfdKb_gCuBdOwMMufdKn8CnUpw: 404

Am I missing something? Is there anything else I can look into to resolve this issue? Any help would be appreciated.

Thanks

Without a domain name we can only guess, but apparently your domain does not (always?) point to that nginx server. If it's intermittent, then perhaps you have two IP address entries or you are load balancing.

You can do some basic tests using https://letsdebug.net

@webprofusion Thanks for replying. Yes, I have multiple nginx servers, but they all share the same config. The issue is not limited to one nginx server. Also, the certificate usually generates on the 2nd or 3rd request after failing.

Thanks. When you use HTTP domain validation, Let's Encrypt will immediately make its HTTP request to your domain at http://<yourdomain>/.well-known/acme-challenge/<challenge response file>

Every single server that can possibly respond to a request for your domain must immediately give the same answer, or you risk failing validation. Currently it seems that's not always happening, as your 404 error shows.
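To check whether every backend really gives the same answer, you could drop a test file into the webroot's .well-known/acme-challenge/ directory and request it from each server directly. Below is a minimal Python sketch of that idea. The function names are hypothetical helpers of mine, not part of certbot, and it assumes each backend is reachable by address on port 80.

```python
import http.client


def challenge_path(token):
    """Build the URL path that HTTP-01 validation requests for a given token."""
    return "/.well-known/acme-challenge/" + token


def fetch_status(address, host, path, timeout=5.0):
    """Return the HTTP status one backend gives for `path`, sending `host` as the Host header."""
    conn = http.client.HTTPConnection(address, 80, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Host": host})
        return conn.getresponse().status
    finally:
        conn.close()


def check_backends(backends, host, token, fetch=fetch_status):
    """Ask every backend for the same challenge file and collect the result per address."""
    path = challenge_path(token)
    results = {}
    for address in backends:
        try:
            results[address] = fetch(address, host, path)
        except (OSError, http.client.HTTPException) as exc:
            # Record connection problems instead of aborting the whole check.
            results[address] = "error: %s" % exc
    return results


def failing(results):
    """Return the addresses that did not answer with HTTP 200, sorted."""
    return sorted(addr for addr, status in results.items() if status != 200)
```

For example, after creating a file named test-token in the challenge directory, calling check_backends(["192.0.2.10", "192.0.2.11"], "example.me", "test-token") and passing the result to failing() lists the servers that cannot see the file yet. The two addresses are placeholders. Running the check in a loop right after creating the file should show whether visibility lags on some servers.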

The validation checks will come from multiple data centers all over the world ("multi-perspective" validation). Let's Encrypt added more validation perspectives a few months back, which makes it more likely for this sort of problem to occur.

The 404 error is a response coming from your domain, and that's the problem you need to fix. The obvious culprit would be your servers failing to synchronize their response quickly enough. If your cert renewal is all happening on a single server then perhaps you could direct/proxy all /.well-known/acme-challenge requests just to that one server.
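As a rough sketch of that proxy idea, the nginx fragment below forwards challenge requests to a single issuing server. The address 192.0.2.10 and the path /var/www/letsencrypt are placeholders I made up, not values from this thread. It also assumes the issuing server can have a slightly different config from the others, because otherwise it would proxy to itself.

```nginx
# On every nginx server EXCEPT the one that runs certbot:
# forward challenge requests to the issuing server (hypothetical address).
location ^~ /.well-known/acme-challenge/ {
    proxy_pass http://192.0.2.10;
    proxy_set_header Host $host;
}

# On the issuing server only: serve challenge files from the local webroot.
# The root must match the path given to certbot with -w (hypothetical path here).
location ^~ /.well-known/acme-challenge/ {
    root /var/www/letsencrypt;
    default_type text/plain;
}
```

With this layout only one machine ever needs to see the challenge file, so delays in sharing the file between servers no longer affect validation.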

Thanks @webprofusion, I think it's related to NFS delay. Let me dig into it further.

If it's intermittent, it may be that certbot just isn't waiting long enough between updating the nginx config to respond to the challenge and asking the CA to check the challenges. I think this is more often reported with more sophisticated nginx setups. Try setting --nginx-sleep-seconds to a higher value (the default is 1) and see if that helps.

Yes, that can happen when using the --nginx plugin. But they are using certonly --webroot, so Certbot does not make any changes to the nginx config. The --nginx-sleep-seconds option would have no effect (and Certbot should reject it as an invalid option for --webroot).

Whoops. Guess I didn't read carefully enough. Thanks for the correction.
