In my experience it's pretty easy to hit the rate limits if you're doing cloud (re-)deployments.
The best practice, of course, is to save the certificate generated for a cloud host to persistent storage and restore it when the host is wiped and recreated. But when that is not implemented, or not working correctly, the only way you find out is when you are suddenly locked out.
It would be nice if there was a way such a deployment could detect that an unexpected number of requests has already been made for a given domain. Tracking it locally is not an option, as the problem happens exactly when local state is not being preserved correctly.
So I'm thinking of an API that returns information about the progress toward the various rate limits for a given domain. Then we could build a warning into deployments that fires when we have made more requests than expected, with a low default threshold that can be bumped up for deployments where higher usage is expected.
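To make this concrete, here is a rough sketch of the client side. Everything here is hypothetical: the endpoint path, the query parameter, and the response fields are illustrations of the shape I have in mind, not an existing API.

```python
import json
import urllib.request

# Hypothetical endpoint -- nothing like this exists today; the URL and
# the response shape below are purely illustrative.
STATUS_URL = "https://acme.example.org/rate-limit-status?domain={domain}"

def fetch_rate_limit_status(domain: str) -> dict:
    """Fetch the current progress toward rate limits for a domain.

    An imagined response might look like:
    {
      "domain": "example.com",
      "limits": [
        {"name": "certificates-per-registered-domain",
         "used": 42, "max": 50, "window": "168h"}
      ]
    }
    """
    with urllib.request.urlopen(STATUS_URL.format(domain=domain)) as resp:
        return json.load(resp)
```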
This does not have to be perfect. The important things are:
- Be stateless from the point of view of the client
- Be able to detect when we are making more requests than expected for a domain
- Do it in the deployment itself. A cron job that polls crt.sh or an email notification is unlikely to reach the right person fast enough, and it makes it hard to tune warning limits per deployment, which is necessary to ensure the warnings are not ignored.
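The deployment-side check could then be as simple as the sketch below. Again, this is only an illustration: the default threshold of 3 is a placeholder, not a recommendation, and the function consumes the hypothetical response shape from the fetch sketch above.

```python
def warn_if_unexpected_usage(status: dict, warn_threshold: int = 3) -> None:
    """Warn when more requests than expected have been made for a domain.

    The default threshold is deliberately low and meant to be bumped up
    per deployment where higher usage is expected.
    """
    for limit in status["limits"]:
        if limit["used"] >= warn_threshold:
            print(
                f"WARNING: {limit['used']}/{limit['max']} of "
                f"'{limit['name']}' used for {status['domain']} in the "
                f"last {limit['window']} -- check that certificates are "
                f"being persisted across redeploys."
            )

# Usage in a deployment script, building on the fetch sketch above:
# warn_if_unexpected_usage(fetch_rate_limit_status("example.com"))
```

Since the check runs inside the deployment itself, the threshold lives next to the rest of the deployment config, which is exactly what makes it tunable per deployment.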