Hi folks: this is a general architecture query, so I've not bothered with "what OS" etc.
I am deploying 3 cloned loadbalancers/reverse proxies to serve an estate of 100-odd webservers, probably with haproxy (TBC). Essentially Blue/Green/Test, with keepalived failover between Blue and Green.
Should I:
1) Deploy acme.sh on each frontend independently (obviously sharing the /.well-known/ data store across all 3, so validation works regardless of which proxy the challenge request comes back to), or
2) Deploy a single central server to run acme.sh and serve up the /.well-known/ path (all 3 proxies will route to this server for /.well-known/).
Option 1 gives me the advantage that I can use the acme.sh hot deploy for haproxy and have no downtime or switching when certs are updated. Disadvantage: 3 times as many queries to Let's Encrypt.
Option 2's pros and cons are the reverse of option 1's.
Doing (1) would suit me better, but I don't want to trip API rate limits or be antisocial... Do 3 servers asking for the same cert (or renewal) count as 3 orders or one? What's the preferred way?
With 100-ish certificates and 3 front-ends, you really can do either approach and everything will be fine.
If you're expecting to continue to grow beyond ~100, or have more front-ends (even as a future possibility), I'd very much encourage you to take the opportunity now to do #2 instead. Your pros/cons are good, but I'd also submit:
You have only one place to make ACME-related changes, reducing the likelihood of stale certs hanging around, failing, and potentially causing your account to someday get paused.
You'd be ready to increase to 4 or 5 loadbalancers/proxies without having to worry about such an increase running afoul of Let's Encrypt's per-registered-domain rate limits.
You would have more control over your own downtime: a centralized system needs fewer successful issuances from Let's Encrypt, and there's only one server whose cert-issuance logs and metrics you need to monitor.
Also, acme.sh is good software, but I'd highly recommend that you make sure to update to a version that includes support for ARI (ACME Renewal Information), whenever such support is added. That's also in the category of controlling your own downtime in the event of a mass-revocation incident.
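To make the centralized picture concrete: option 2 really just amounts to one shared webroot and one renewal loop on the central box. A minimal sketch (the webroot path and domains are made up, and the `echo` is there so you can dry-run it -- remove it to actually issue):

```shell
# Central renewal loop sketch for option 2. WEBROOT is the one directory
# all 3 proxies route /.well-known/acme-challenge/ requests to.
WEBROOT=/srv/acme/webroot   # illustrative shared webroot path
for d in www.example.com api.example.com; do
    # "echo" prints the command instead of running it (dry run)
    echo acme.sh --issue -d "$d" -w "$WEBROOT" --server letsencrypt
done
```

In real use you'd drive the loop from whatever inventory holds your ~100 names, and distribute the resulting certs out to the proxies afterwards.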
How many different certificates/domains are you talking about? That might slightly influence things.
Generally speaking IMHO:
Don't do separate installations. It gets messy and hard to troubleshoot.
If you can use DNS-01 instead of HTTP-01, that's the best option. I like to use my own acme-dns system, then delegate the various challenges from the primary DNS onto it.
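The delegation itself is just one CNAME per name in your primary zone, pointing at the record acme-dns hands you when you register (zone-file syntax; the hostname and UUID-style label below are illustrative):

```
; in the example.com zone: hand the challenge off to the acme-dns host
_acme-challenge.www.example.com.  IN  CNAME  d5e2f3a1-7c4b-4b9e-9f21-0a6c8d3e1b2f.auth.example.org.
```

After that, the ACME client only ever updates TXT records on the acme-dns server; your primary DNS never changes again.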
If you have to use HTTP-01, have ALL the systems redirect the acme-challenge directory to a central system. If you've got blue.example.com and red.example.com, set them both to redirect to acme.example.com or something else that is independent. You might use one IP for multiple purposes, but handle that at the DNS level - configure all your apps to redirect to the auth system.
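In haproxy terms, that redirect is a one-liner per frontend; HTTP-01 validation follows redirects, so this works. A sketch (hostnames are illustrative, and the exact fetch/ACL names are worth checking against your haproxy version's manual):

```haproxy
# On blue/red: bounce every ACME challenge to the central auth host.
frontend fe_http
    bind :80
    http-request redirect location http://acme.example.com%[capture.req.uri] code 302 if { path_beg /.well-known/acme-challenge/ }
```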
Consider running everything offline, from inside your office on a dedicated machine. A lazy way to deploy certs is to rsync to the servers, and those servers run cronjobs to check if there is a new cert and reload. A nicer way is to use something like Fabric to write a script that grabs the certs, deploys them to servers, and restarts the servers. You can even check the certs/keys into source control.
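The lazy rsync-plus-cron receiver side can be as small as a checksum compare -- reload only when the cert actually changed. A sketch (paths and the reload command are assumptions; swap in your own):

```shell
# Cron job sketch for each proxy: install and reload only when the
# synced cert differs from the live one. All names are illustrative.
sync_and_reload() {
    synced=$1   # where rsync drops the new bundle
    live=$2     # what haproxy is currently serving
    reload=$3   # e.g. "systemctl reload haproxy"
    if ! cmp -s "$synced" "$live"; then
        cp "$synced" "$live" && $reload
    fi
}
# e.g. from cron:
# sync_and_reload /srv/certsync/site.pem /etc/haproxy/certs/site.pem \
#     "systemctl reload haproxy"
```

The `cmp -s` guard is what keeps the cron job from reloading haproxy every few minutes for no reason.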
Depending on the number of different certificates AND your exact server code, it may be beneficial to share the same private key across all the certs on a given renewal day. If you have numerous certs, depending on how the server stores the pkey data, you can get a bit more performance out of your system that way.
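Reusing one key is straightforward with openssl: generate the key once, then point every CSR at it (file names and subjects below are made up):

```shell
# One private key, many CSRs: each cert issued from these CSRs shares
# the same keypair. Names are illustrative.
openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:P-256 -out shared.key
openssl req -new -key shared.key -subj "/CN=blue.example.com" -out blue.csr
openssl req -new -key shared.key -subj "/CN=red.example.com"  -out red.csr
```

Note the trade-off: one compromised key now affects every cert that shares it, so only do this where the performance win actually matters to you.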
Six years ago, I would have advocated #1 and even using autossl capabilities - these days, the ecosystem of LE and server tech has evolved to a point where #2 makes a lot of sense and can be easily integrated into most build/deploy and devops practices.
As a quick sidenote, some proxies/gateways that implement auto-ssl can handle storing/loading certificates from the cloud, and I think may have provisions in place to handle dogpile blocking. I still think a centralized system is the best course though.