Which rate limits are you experiencing, or worried about, from this pattern?
The only one I can imagine being affected by these failures is the pending-authorization limit, which can jam up hosting providers/SaaS/PaaS vendors that use multi-SAN certificates, because the aggregate of pending authorizations across orders will often exceed the limit and prevent new orders from being made.
If that is the limit you are worried about, your client can and should clean up after itself by deleting all the still-pending authorizations when a certificate fails. That is the status quo and recommended practice, unless you have other methods in place -- such as iteratively trying to complete the order with the successful and pending validations (e.g. retry the order without the failed domain, and repeat until the order eventually passes).
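The cleanup step above can be sketched roughly as follows. This is an illustration, not a particular client's API: the record shape (dicts with `"status"` and `"url"`) and the `post_signed` helper are assumptions you would map onto your own ACME client, but the deactivation payload itself is per RFC 8555 (a JWS-signed POST of `{"status": "deactivated"}` to the authorization URL).

```python
# Sketch: after a failed order, find the authorizations still pending
# and deactivate them so they stop counting against the pending limit.
# `authorizations` and `post_signed` are hypothetical stand-ins for
# your ACME client's objects -- adapt as needed.

def pending_authz_urls(authorizations):
    """Return URLs of authorizations still in the 'pending' state."""
    return [a["url"] for a in authorizations if a["status"] == "pending"]

def deactivate_pending(authorizations, post_signed):
    """Deactivate each pending authorization on a failed order.

    `post_signed(url, payload)` stands in for your client's JWS-signed
    POST; per RFC 8555 s7.5.2, deactivation is a POST to the
    authorization URL with {"status": "deactivated"}.
    """
    for url in pending_authz_urls(authorizations):
        post_signed(url, {"status": "deactivated"})
```

Deactivated authorizations can never be reused, so only do this once you have given up on the order (or on the failed domain).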
In terms of being worried about Rate Limits in general...
Digging into your Cert history and company, I see the following setup:
- You create a certificate with a CN of `ssl{n}.ipaper.io` and assign <=99 customer domains to the SAN.
- You host multiple subdomains per customer; one certificate (crt.sh | 6181122217) has 30 subdomains.
- The customer's subdomain CNAMEs onto your system.
LetsEncrypt has a rate limit of 50 Certificates per Registered Domain per week. Considering you are aggregating everything under 11 certificates on your domain right now, I would not worry about your domain triggering a rate limit under normal circumstances.
I think you are triggering that rate limit because your usage seems to involve constantly reconfiguring the domains assigned to `ssl{n}.ipaper.io` and re-issuing "that" certificate within a few days of the last issuance. IMHO that is an anti-pattern, and I don't think LetsEncrypt is likely to approve a rate-limit exception when they look at your usage during review.
Personally, I suggest you stop reconfiguring the certificates and instead reconsider how you deploy them in your system(s). Several gateways and webservers will let you dynamically load an SSL certificate. If you are on nginx, the OpenResty fork supports doing this through Lua scripting, and it should not take more than an afternoon to implement. I can point you to resources and code examples if interested.
I would also start migrating your certificate structure to a dedicated per-client one -- something like:
- cn: `ssl{x}.ipaper.io`
- san: `x.client1.com`, `y.client1.com`
This is the strategy that Cloudflare utilized on their systems when they required their own name on a certificate (I don't think they do so any longer). If you follow that convention, you can onboard 50 new clients per week, after validating that DNS for all their domains is correctly set up -- as renewals do not count towards rate limits.
I think you are also running into this issue because one client's DNS changes affect every other client on the same certificate. By isolating clients to their own certificates, that problem will stop.
In the interim, I would have someone on your team write a script to do a pre-flight check before renewal and ensure the DNS is correctly set up for every domain on each certificate before trying to renew.
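A pre-flight script along those lines can be sketched as below. The resolver is injected so you can plug in dnspython, your own tooling, or a stub for testing; `expected_target` (your edge hostname) is an assumption.

```python
# Sketch: before renewing, verify every SAN on the certificate still
# CNAMEs onto your system; only renew when the list of failures is empty.

def preflight(domains, resolve_cname, expected_target):
    """Return the domains whose CNAME no longer points at us.

    `resolve_cname(domain)` should return the CNAME target as a string,
    or None if the lookup fails or no CNAME exists.
    """
    bad = []
    for domain in domains:
        target = resolve_cname(domain)
        if target is None or target.rstrip(".") != expected_target.rstrip("."):
            bad.append(domain)
    return bad
```

For a real resolver you could wrap dnspython, e.g. something like `lambda d: str(dns.resolver.resolve(d, "CNAME")[0].target)` with the lookup errors caught and turned into `None`. Any domain the check flags gets dropped from (or fixed before) the renewal order, so one client's stale DNS no longer fails the whole certificate.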
So my advice in order:
- pre-flight before renewal
- look into dynamic SSL loading
- partition certificates by client
- request a rate-limit extension.
I am confident that LetsEncrypt will approve a rate-limit extension for a SaaS/PaaS vendor that partitions certificates by client and uses each one for the full 90 days (unless an addition/deletion is needed), as that is a very typical thing for them to do.
I am not confident that LetsEncrypt will approve an extension for a SaaS/PaaS vendor that constantly repartitions various domains across certificates and orders a re-issue within a few days, as that is an anti-pattern that ties up their resources.