Slow apache reload with 1,000s of certificates

kingyrockets · July 28, 2022, 3:54am

We host a large number of certificates for separate client sites on a single server. When a new customer joins and we reload the Apache config to activate the certificate, the server is unresponsive for around 30 seconds.

Has anyone else experienced this? The problem has definitely gotten worse as we install more certificates.

My domain is: n/a

I ran this command: systemctl reload httpd

It produced this output: n/a

My web server is (include version):
Apache/2.4.6 (CentOS)

The operating system my web server runs on is (include version): CentOS 7

My hosting provider, if applicable, is:

I can login to a root shell on my machine (yes or no, or I don't know): Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): certbot 1.11.0

webprofusion · July 28, 2022, 4:03am

I can't help with Apache but I'm sure others are running thousands of certificates. What hardware or VM size are you running on?

CentOS 7 appears to have been released 8 years ago, which suggests the environment your sites run in may not be optimal: it predates the explosive growth in SSL/TLS that we saw when Let's Encrypt started up and Google Chrome made https more or less mandatory.

You should also consider whether Apache is currently the best fit for your requirements. Over the past few years it has started to lose ground to things like nginx and the newer Caddy server.

kingyrockets · July 28, 2022, 5:41am

It's a fairly powerful VM: 16 CPUs and 32 GB of RAM.

You're absolutely right, we are stuck on an old CentOS version and locked to an older version of Apache, which could be the problem.

We are looking at moving to HAProxy, which apparently has zero-downtime config reloads. This was on the advice of our hosting provider, but I want to explore all options in case there is a simpler solution here that saves us that infrastructure change.

If I hear of others with 1000s of certificates and super fast reloads that have zero impact on page load, that would definitely give me more confidence to explore the OS and Apache/Nginx updates instead.

webprofusion · July 28, 2022, 5:45am

That does sound good, assuming storage performance isn't a bottleneck (you could run a storage benchmark). Another option is to run some sort of profiler to examine where the process is spending its time (CPU, disk etc.). You may want to try that on a clone of the problematic VM.

webprofusion · July 28, 2022, 5:53am

Another thing to keep in mind: I assume this is a "graceful" restart, so you are also waiting for all current user HTTP requests to complete (and that depends on how fast their internet connection is, not yours). If some sites are prone to bots or serve large downloads, that could be an issue, so the question is: how fast is a non-graceful restart? You could perhaps configure Apache with a shorter timeout, as per "Why httpd graceful restart takes such a long time?" on Server Fault.

kingyrockets · July 28, 2022, 6:11am

Hey, yeah, that is definitely a good question. The weird thing is that the actual graceful restart, which is what "systemctl reload httpd" does, happens fast.

But loading the sites in a browser right afterwards is what takes a long time. It's as though the Apache service is busy reloading configs before it can respond.

I'll look into the timeout however and report back.
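
One way to put numbers on that lag is to poll a site at a fixed interval while the reload runs and note which requests were slow. What follows is only a minimal Python sketch, not something anyone in this thread ran: the URL, the one-second interval and the names fetch_seconds, probe and slow_window are all assumptions made for illustration.

```python
import time
import urllib.request
from typing import Callable, List, Optional, Tuple


def fetch_seconds(url: str, timeout: float = 60.0) -> float:
    """Return the wall-clock seconds taken by one full GET of url."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return time.monotonic() - start


def probe(fetch: Callable[[], float], count: int, interval: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> List[float]:
    """Call fetch() count times, pausing interval seconds between calls.

    A failed request (connection refused, timeout, TLS error, HTTP error)
    is recorded as infinity so it still shows up as "slow".
    """
    samples = []
    for index in range(count):
        try:
            samples.append(fetch())
        except OSError:  # URLError, HTTPError and socket timeouts are OSErrors
            samples.append(float("inf"))
        if index < count - 1:
            sleep(interval)
    return samples


def slow_window(samples: List[float],
                threshold: float) -> Optional[Tuple[int, int]]:
    """Return (first, last) index of samples above threshold, or None."""
    slow = [i for i, seconds in enumerate(samples) if seconds > threshold]
    if not slow:
        return None
    return slow[0], slow[-1]


# Hypothetical usage: start this, then run "systemctl reload httpd"
# in another shell while it is polling.
#
#   url = "https://example.com/"   # placeholder, not a real client site
#   samples = probe(lambda: fetch_seconds(url), count=60)
#   print(slow_window(samples, threshold=1.0))
```

With one sample per second, the gap between the two indices returned by slow_window roughly approximates how many seconds the site stayed slow after the reload.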

kingyrockets · July 28, 2022, 6:33am

I've just been playing around with the timeouts.

I've set:
GracefulShutdownTimeout 1

This actually seems to improve things quite a lot. It's not 100% snappy: there is maybe 5 seconds of lag when loading a site right after a reload.

My main Apache configs now are:
Timeout 60
KeepAlive On
KeepAliveTimeout 5
MaxKeepAliveRequests 0
GracefulShutdownTimeout 1

Might play with things a bit further. Maybe it's the KeepAliveTimeout...

webprofusion · July 28, 2022, 7:42am

It also depends on what type of websites you are serving: a static HTML site should load quickly, whereas a dynamic CMS (WordPress etc.) needs to load the application framework, database modules and so on.

There could still be some other sort of first-time load issue, but you'd have to measure where the bottleneck is (e.g. if CPU, storage, memory and network are all staying low, then the problem is possibly a timeout while waiting for something else).
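
As a rough illustration of that kind of measurement, the sketch below compares two snapshots of the aggregate "cpu" line in /proc/stat and reports what fraction of the interval was spent busy versus waiting on I/O. It is Linux-only and only a sketch: the function names are made up for this example, and it covers just the CPU and storage-wait part of the picture, not memory or network.

```python
from typing import Dict, Tuple


def parse_cpu_line(line: str) -> Tuple[int, int, int]:
    """Parse the aggregate "cpu" line of /proc/stat.

    Returns (total, idle, iowait) in clock ticks.
    """
    # Fields after the label: user nice system idle iowait irq softirq steal
    # Guest time is already counted inside user/nice, so only the first
    # eight values are summed.
    fields = [int(value) for value in line.split()[1:9]]
    return sum(fields), fields[3], fields[4]


def read_cpu(path: str = "/proc/stat") -> Tuple[int, int, int]:
    """Take one snapshot from the first line of /proc/stat."""
    with open(path) as handle:
        return parse_cpu_line(handle.readline())


def usage_between(before: Tuple[int, int, int],
                  after: Tuple[int, int, int]) -> Dict[str, float]:
    """Return busy and iowait as fractions of the ticks between snapshots."""
    total = after[0] - before[0]
    if total <= 0:
        return {"busy": 0.0, "iowait": 0.0}
    idle = after[1] - before[1]
    iowait = after[2] - before[2]
    return {"busy": (total - idle - iowait) / total, "iowait": iowait / total}


# Hypothetical usage around a reload:
#
#   before = read_cpu()
#   ...reload Apache and wait until the sites respond again...
#   after = read_cpu()
#   print(usage_between(before, after))
```

A high iowait fraction would point towards storage, a high busy fraction towards CPU, and low values for both would be consistent with the "waiting for something else" case.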

_az · July 28, 2022, 8:29am

If @kingyrockets is running mod_php, I think this is a likely explanation. In that mode, the PHP interpreter is embedded in each Apache worker, and once the worker exits under a graceful reload, all the speedup you get from the PHP code being cached is lost. So it's basically a worst-case restart for the PHP applications.

Putting haproxy/nginx in front, or moving to PHP-FPM, would likely take care of this.

jvanasco · July 29, 2022, 3:44pm

1000s of domains/certs/hosts on a single Apache sounds like an anti-pattern to me. Maybe Apache has changed in the past 15 years, but that was something to avoid in the past, with all servers. Nginx should be able to handle this as of 2011 or so, but I don't think Apache built that out.

IIRC, a status quo that emerged around 2008 for large configurations like yours was to partition the domains across multiple Apache/Lighttpd/Nginx configs. One webserver sat on port 80/443 to terminate the connection, and then proxied traffic upstream to the relevant webservers (running on higher ports). Sometimes they were partitioned by name, other times by client. Running a stripped-down nginx was very popular for this, because it had an extremely low memory footprint.

IMHO, I would do both. Tossing nginx in front should be a relatively fast trial. In my experience, mod_php can be problematic in a white-label/client situation for a variety of reasons, and moving it to a separate process is best.

Another option is to dynamically load certificates on demand. You can do that with OpenResty, an nginx-based platform with Lua scripting. I open-sourced our implementation:
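
To make the "partitioned by name" idea concrete, here is a small Python sketch that assigns each domain to one of several backend ports using a stable hash of the name. It is an assumption-laden illustration rather than a description of how those setups actually did it: the port numbers and the function names are hypothetical, and a real deployment might just as well split alphabetically or by client.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Sequence


def backend_port(domain: str, ports: Sequence[int]) -> int:
    """Pick a backend port for domain using a stable hash of its name.

    The same (case-insensitive) domain always maps to the same port for a
    given list of ports.
    """
    digest = hashlib.sha256(domain.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ports)
    return ports[index]


def partition(domains: Sequence[str],
              ports: Sequence[int]) -> Dict[int, List[str]]:
    """Group domains by the backend port they are assigned to."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for domain in domains:
        groups[backend_port(domain, ports)].append(domain)
    return dict(groups)


# Hypothetical usage: three backend web servers on high ports.
#
#   ports = [8081, 8082, 8083]
#   for port, names in partition(["a.example", "b.example"], ports).items():
#       print(port, names)
```

One design caveat: with plain modulo hashing, changing the number of backends reassigns most domains, so a fixed table per client may be easier to operate than a hash.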

There may be a way to do this in Apache too, but I am unsure. I think you're still likely to deal with Apache issues from the sheer number of hosts and the mod_php behavior that @_az mentioned.
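
For anyone unfamiliar with the on-demand idea, here is a minimal Python sketch of the general technique using the standard library's SNI callback. To be clear, this is not the OpenResty implementation mentioned above and not an Apache feature. The class and function names are made up, and the certificate path layout (one certbot "live" directory per hostname) is an assumption that will not match every setup.

```python
import re
import ssl
from collections import OrderedDict
from typing import Callable, Optional


class LazyCertStore:
    """LRU cache of per-hostname TLS contexts, filled on first use."""

    def __init__(self, loader: Callable[[str], Optional[ssl.SSLContext]],
                 max_size: int = 1000) -> None:
        self._loader = loader
        self._max_size = max_size
        self._cache = OrderedDict()

    def get(self, hostname: str) -> Optional[ssl.SSLContext]:
        key = hostname.lower()
        if key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        context = self._loader(key)
        if context is not None:
            self._cache[key] = context
            if len(self._cache) > self._max_size:
                self._cache.popitem(last=False)  # evict least recently used
        return context


def load_from_disk(hostname: str) -> Optional[ssl.SSLContext]:
    """Load a certificate for hostname, or return None if unavailable.

    Assumed layout: /etc/letsencrypt/live/<hostname>/{fullchain,privkey}.pem
    """
    # Refuse anything that is not a plain hostname before building a path.
    if not re.fullmatch(r"[a-z0-9.-]+", hostname) or ".." in hostname:
        return None
    base = "/etc/letsencrypt/live/" + hostname
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        context.load_cert_chain(base + "/fullchain.pem", base + "/privkey.pem")
    except OSError:  # missing files and ssl.SSLError are both OSErrors
        return None
    return context


def make_sni_callback(store: LazyCertStore):
    """Build a callback suitable for SSLContext.sni_callback (Python 3.7+)."""
    def on_sni(sock, server_name, default_context):
        if server_name is not None:
            context = store.get(server_name)
            if context is not None:
                sock.context = context  # switch certificates mid-handshake
        return None  # None tells the ssl module to continue the handshake
    return on_sni


# Hypothetical wiring on a listening server's default context:
#
#   default_context.sni_callback = make_sni_callback(LazyCertStore(load_from_disk))
```

The point of the pattern is that adding a certificate means dropping files on disk rather than reloading the server, since each certificate is only read the first time a client asks for that hostname.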

kingyrockets · August 2, 2022, 9:25pm

Thanks everyone for your suggestions and tips.

After some more testing, setting GracefulShutdownTimeout 1 hasn't really made any improvement to the graceful reload.

I'm going to work on some proofs of concept using NGINX and PHP-FPM and see if we can easily create a migration pathway to this setup.

system · September 1, 2022, 9:26pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
