Over 3000 certificates on one server

We run a domain hosting service with over 3400 unique domains on one balancer. I was able to issue all of the certificates by moving them out of the /etc/letsencrypt/ folder after about 500 or so and, once all certificates were issued, copying them all back into the folder. Now I can’t issue any new certificates. Renewing doesn’t work either.
I realize this sounds a little bit like putting a stick in my bike spokes while riding it and complaining about falling over afterwards, but maybe there is a solution I’m not seeing?

I ran this command:
certbot --nginx -d example.com,www.example.com -n --no-redirec

It produced this output:

Domain: example.com
Type: unauthorized
Detail: Invalid response from http://example.com/.well-known/acme-challenge/AdIrNFmTJdqj_BxDTRWBkHasdfasdzy1uH7PmEdXsNlYJY: "
<!doctype html>

<meta http-equiv="X-UA-Compatible" content="I"

Webserver: nginx/1.10.3
Operating System: Ubuntu 16.04
Root Access: Yes

Hi,

By balancer, do you mean load balancer?

If so, do you process validation on the balancer or on the actual server? (The earlier issuance you mentioned could have happened before LE disabled TLS validation.)

I’m not sure about web balancers, but could it be that the validation file was not copied to all servers, so the LE server can’t find the file on one of them?

Thank you

It looks like your certbot command line got truncated. Could you provide the full command?

Was there a reason you copied the certificates away and then back again? Did Certbot get too slow once you had more than 500 certificates issued?

This may wind up being a case where Certbot is not the best client for you. There certainly are hosting providers that use it to issue for a large number of domains, but generally its design doesn’t work well for that. In particular, it likes to scan all previously issued certificates at startup.

We use nginx as a load balancer (reverse proxy) for our web servers. The domains’ A records point to the IP of the balancer.

To your last point: Since I was able to issue certificates before, I don’t think that’s the problem.

The command above is the full command, actually. At least that’s the command we used to issue all of the 3400 certificates.

The reason for copying the certificates away was indeed the speed of certbot. Up to 500 certificates it was reasonably speedy, but afterwards it got slower and slower. One problem was the size of the logfile. The bigger problem, however, is the backup of all certificates that certbot makes after each issued certificate, for rollback purposes.
About the scan of all previous certificates you mention, I wasn’t even aware of that.

Are there other clients I can use, that would fit our environment better?

It looks like at least the last character was accidentally dropped, which is what made me think there might be more missing. But it sounds like that's not the case.

For other clients, check out our list of ACME Client Implementations. I don't have strong opinions on which clients are better for hosting providers (client authors feel free to chime in!). I do suspect that you'll have the best time using a client that provides a library in your favorite programming language rather than just a command line interface. That will allow you to integrate it more closely with other systems, like onboarding new customers automatically, and reporting and retrying errors in a way that makes sense for your system. I'd definitely encourage you to read our Integration Guide.

For your immediate problem: I don't really see a reason why copying just the certificates away and then back again would result in that error. What that looks like to me is that Certbot is putting up a validation file, but either (a) Nginx is not serving it from the correct place, or (b) the domain name is pointed at the wrong server. I think (a) is generally unlikely, since you're using the Nginx plugin, which knows pretty well how to configure responses. I assume your load balancer is running Nginx, and that's also where you're running Certbot? Are there any other config changes you might have made that could result in your problem?
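One read-only check that might help rule the copy step in or out: Certbot expects the files under live/ to be symbolic links into archive/, and a copy that dereferences or breaks those links could leave the directory in a state Certbot does not accept. The sketch below only inspects the layout and changes nothing. It assumes the default /etc/letsencrypt directory mentioned above; the function name `check_live_links` is made up for illustration.

```python
from pathlib import Path


def check_live_links(config_dir="/etc/letsencrypt"):
    """Report .pem entries under live/ that are broken or not symlinks.

    Assumption: the default Certbot layout, where live/<name>/*.pem are
    symlinks pointing into archive/<name>/.
    """
    live = Path(config_dir) / "live"
    report = {"broken": [], "not_symlink": []}
    if not live.is_dir():
        return report
    for lineage in sorted(live.iterdir()):
        if not lineage.is_dir():
            continue  # skip stray files such as a README
        for pem in sorted(lineage.iterdir()):
            if pem.suffix != ".pem":
                continue
            if not pem.is_symlink():
                # A plain file here usually means the copy followed the links.
                report["not_symlink"].append(str(pem))
            elif not pem.exists():
                # exists() follows the link, so False means a dangling target.
                report["broken"].append(str(pem))
    return report
```

Running `check_live_links()` as root and printing both lists should show quickly whether anything looks off. Two empty lists would suggest the copy is not the culprit, at least as far as the link structure goes.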

Thank you for your help.

To your last point: the domain definitely points to the server. I checked that with the host command in Ubuntu, and I have access to our DNS tool, which tells me the same. Since it affects not just one domain but every new domain I try to issue certificates for, as well as all renewals, I don’t see how it is not a bug, either in nginx or in certbot.

Also, we did not change anything regarding certbot. I installed it with the following guide, and then just used it to issue the certificates. (https://www.digitalocean.com/community/tutorials/how-to-secure-nginx-with-let-s-encrypt-on-ubuntu-16-04)

If you move the certificates aside again, does issuing start working again?

That’s a little difficult, since the environment is in production use.

Funny thing however, at the moment I’m using the following command to issue certificates “certbot certonly --installer null --webroot -w /var/www/letsencrypt -d example.com -d www.example.com”. Sometimes it works, sometimes it doesn’t.
But not just for us. A colleague of mine has his own web server and he has the same problem at the moment.

Just weird.
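A rough way to narrow down intermittent webroot failures like this is to drop a probe file into the challenge directory by hand and fetch it over plain HTTP, the same way the validation server would. The following is only a sketch: `probe_webroot` is a hypothetical helper, and the webroot path and hostname in the usage note are the ones from the command above.

```python
import secrets
import urllib.request
from pathlib import Path


def probe_webroot(webroot, base_url, timeout=10):
    """Write a random probe file under <webroot>/.well-known/acme-challenge/
    and return True only if base_url serves exactly that content back."""
    challenge_dir = Path(webroot) / ".well-known" / "acme-challenge"
    challenge_dir.mkdir(parents=True, exist_ok=True)
    name = "probe-" + secrets.token_hex(8)
    expected = secrets.token_hex(16)
    probe = challenge_dir / name
    probe.write_text(expected)
    try:
        url = f"{base_url.rstrip('/')}/.well-known/acme-challenge/{name}"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
        return body.strip() == expected
    except OSError:
        # Covers connection failures, timeouts and HTTP errors such as 404.
        return False
    finally:
        probe.unlink()  # never leave the probe file behind
```

For example, `probe_webroot("/var/www/letsencrypt", "http://example.com")` could be run several times in a row. If it returns a mix of True and False, some requests are probably reaching a server or location block that does not serve this webroot.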

"Sometimes it works, sometimes it doesn't" suggests to me that you might have multiple IP addresses in rotation. Could you share one of the hostnames?

Also, I believe there is a --debug-challenges flag to certbot that might help.
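To check for multiple addresses in rotation, a small sketch like the one below lists every IPv4 and IPv6 address a name resolves to. It uses the local resolver, so it may not match exactly what the validation server sees, and `resolve_all` is just an illustrative name. A forgotten AAAA record is worth looking for in particular, since validation prefers IPv6 when one is present.

```python
import socket


def resolve_all(hostname, port=80):
    """Return the sorted, de-duplicated IP addresses (IPv4 and IPv6)
    that the local resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_all("example.com")` and `resolve_all("www.example.com")` and comparing the results against the balancer's IP should show whether any unexpected address is in the mix.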