I thought I should record my observations about this somewhere since I brought up the topic...
(1) Each OCSP query has some time overhead, so if you have a huge number of certs, certbot renew takes a long time even when no renewals are due. This is especially annoying when running it interactively (for example, while working on a server's certificate setup), but it could also be a problem if it's scheduled to run frequently and one invocation can't finish before the next one starts.
(2) On a large Apache install, apachectl graceful can actually take a long time to finish (I'm guessing it's at least linear in the number of VirtualHosts, but it may also be affected by the number of active connections and other factors). If you're using certbot --apache, you have up to three apachectl graceful invocations per cert request (one after configuring Apache to satisfy the challenge, another after reconfiguring it to remove the challenge, and a third for deployment). With another authenticator, you'll still usually have one apachectl graceful per deployment. So if a certbot renew run happens to result in 15 renewals, you'll have either 45 or 15 Apache reloads, neither of which is great if a single reload takes a long time (or, possibly, a lot of CPU).
I was able to work around the second issue with a deploy-hook that created a file indicating that a deployment reload of the web server was pending; the reload would then happen only once (if needed) after certbot renew finished.
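For anyone who wants to try the same workaround, here is a minimal sketch of the idea in Python. It is not the script I actually used; the marker path, the script name coalesce_reload.py, and the two mode names are all made up for illustration. The only Certbot behavior it relies on is that a deploy-hook runs once per successfully renewed certificate.

```python
#!/usr/bin/env python3
"""Sketch: coalesce many per-certificate deploy hooks into one web server reload."""
import subprocess
import sys
from pathlib import Path

# Hypothetical marker location; any path writable by the user running certbot works.
FLAG = Path("/run/certbot-reload-pending")
# Reload command discussed in this post; adjust for your system.
RELOAD_CMD = ["apachectl", "graceful"]


def mark_pending(flag: Path = FLAG) -> None:
    """Deploy-hook half: record that a reload is needed (cheap and idempotent)."""
    flag.touch(exist_ok=True)


def reload_if_pending(flag: Path = FLAG, reload_cmd=RELOAD_CMD,
                      run=subprocess.run) -> bool:
    """Post-renewal half: reload once if any deploy hook left the marker."""
    if not flag.exists():
        return False
    # Remove the marker before reloading, so a renewal that lands during the
    # reload re-creates it and gets picked up on the next pass.
    flag.unlink()
    run(reload_cmd, check=True)
    return True


# Only act when run as a script with a recognised mode; otherwise do nothing.
if __name__ == "__main__" and sys.argv[1:] in (["mark"], ["reload"]):
    if sys.argv[1] == "mark":
        mark_pending()
    else:
        reload_if_pending()
```

You would pass something like "coalesce_reload.py mark" as the deploy-hook and then run "coalesce_reload.py reload" from the same cron job or wrapper once certbot renew has exited. One trade-off in this sketch: because the marker is removed before the reload, a failed reload is not retried automatically, so you may prefer to remove it afterwards instead.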
Some architectural changes that might help Certbot's scaling (maybe especially with Apache integration):
(1) Parallelize OCSP queries for revocation status, or perform them in a more stochastic or opportunistic way somehow. Maybe turn them off by default when certbot renew is run interactively?
(2) Parallelize challenge satisfaction and challenge cleanup whenever multiple names are or will be requested, even across multiple certificate renewals. That is, the client could obtain all of the challenges from the CA, attempt to satisfy them in a single action (with a single reload), perform a single challenge cleanup, and then deal with the CA's report about which challenges were or were not verified.
(3) Parallelize deployment, whether with a new installer interface that attempts to install an arbitrary number of certificates, or with a new deploy-hook interface that gives a deploy-hook an arbitrary number of certificates to deploy at once.
(4) Experiment with keeping a copy of the notAfter date in the renewal configuration file after a successful renewal and using that instead of parsing it out of the PEM file. (This one might be bad for reliability and not produce that much speedup.)
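To make suggestion (1) a bit more concrete, here is a rough sketch of what running the revocation checks concurrently could look like. This is not Certbot code: check_all and fake_ocsp_check are hypothetical names, and the fake check just sleeps to imitate network latency. A real version would plug in whatever function performs the actual OCSP query for one certificate.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def check_all(cert_paths: Iterable[str],
              check_one: Callable[[str], bool],
              max_workers: int = 8) -> Dict[str, bool]:
    """Run check_one over every path concurrently; return {path: revoked}."""
    paths = list(cert_paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs each path
        # with its own result.
        return dict(zip(paths, pool.map(check_one, paths)))


if __name__ == "__main__":
    import time

    def fake_ocsp_check(path: str) -> bool:
        """Stand-in for a real OCSP query: sleeps to mimic network latency."""
        time.sleep(0.1)
        return path.endswith("revoked.pem")

    certs = [f"cert{i}.pem" for i in range(20)] + ["revoked.pem"]
    start = time.perf_counter()
    results = check_all(certs, fake_ocsp_check)
    elapsed = time.perf_counter() - start
    revoked = [p for p, r in results.items() if r]
    print(f"{len(results)} checks in {elapsed:.2f}s; revoked: {revoked}")
```

Threads are a reasonable fit here because the work is almost entirely waiting on the network. Note that in this sketch an exception from any single check propagates out of check_all, so a real implementation would probably want to catch per-certificate failures and treat them as "status unknown" rather than abort the whole run.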