I'm using ansible and the community.crypto.acme_certificate module.
As far as I can tell this ACME client does not support ARI.
I've read about and understand what ARI is and does.
At this point I'm more concerned about hitting the rate limiter.
xlionjuan said >> I suggest to hold on.
I ask why?
Maiyannah, last night I set up a cron job to request a new short-lived cert every 48 hours, since I believe that should not hit the rate limiter. I'm posting here in hopes that we get some official guidance.
You might also consider having a fallback plan for a different CA when using short-lived certs. An extended outage at LE might cause a short-lived cert to expire leading to outages for you. Just a compatibility problem with your ACME Client and LE could cause the same kind of problem so having another CA on standby is sensible.
Please also review the short-lived profile docs: Profiles - Let's Encrypt. Especially note this: "We recommend this profile for those who fully trust their automation to renew their certificates on time. This profile is not for everyone."
I am not familiar with your ACME Client, but can't it check the cert lifetime and renew only when it reaches a certain threshold? Many clients have such a setting. That allows them to run often (like two or three times a day) and renew only when the cert is close to the threshold.
Checking (or trying) a cert renewal just once a month is not advised. This is especially problematic as cert lifetimes are reduced to 45 days. In that case just one cert request failure will result in an expired cert on your server. For 90-day certs, forcing renewal every 30 days is twice as often as needed and uses LE resources unnecessarily. And even then, only a small number of sequential failures can result in an expired cert.
LE has never recommended that practice.
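To put rough numbers on the monthly-check point, here is a small Python sketch. It assumes the client only tries at fixed intervals counted from issuance and never retries in between; the function name is made up for illustration.

```python
def attempts_before_expiry(lifetime_days: int, interval_days: int) -> int:
    """Count renewal attempts that happen strictly before the cert expires.

    Assumption: attempts happen at interval, 2*interval, ... days after
    issuance, with no extra retries in between.
    """
    return (lifetime_days - 1) // interval_days


print(attempts_before_expiry(45, 30))  # 1: a single failure means an expired cert
print(attempts_before_expiry(90, 30))  # 2: two failures in a row and the cert expires
```

A client that checks daily gets dozens of chances within the same lifetime, which is the argument for running often and renewing only when needed.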
A well-written ACME Client should be able to run once or twice a day. It should check ARI to know when to renew. If ARI fails (or is not supported by your ACME Client), a fallback plan is to renew when less than half of the cert life remains for short-lived certs (6+ days) and when just one-third is left for other certs. The ACME Client can assess the cert on your local system, so there is little cost to checking this frequently. ARI checks do require occasional HTTP requests, but LE's ARI servers are designed for especially high request volumes. ARI uses a Retry-After response header to throttle incoming requests.
Update: Another reason for more frequent checks is to respond to CA revocation events. It is rare, but Let's Encrypt may need to revoke your cert for various reasons. ARI will notify your ACME Client about this so it can renew immediately. But if you aren't checking ARI often, you can end up serving a cert that is no longer valid. When LE still ran OCSP servers, an ACME Client could run frequently and check OCSP for that. But ARI has been around for a fairly long time now and ACME Clients should be supporting it. ARI is supported by a wide variety of CAs and is an RFC.
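The fallback rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `should_renew` and the 10-day cutoff for what counts as "short-lived" are my assumptions, not anything published by Let's Encrypt.

```python
from datetime import datetime, timedelta, timezone

# Assumption: any cert with a total lifetime under 10 days is "short-lived".
SHORT_LIVED_MAX = timedelta(days=10)


def should_renew(not_before: datetime, not_after: datetime, now: datetime) -> bool:
    """Fallback check for when ARI is unavailable.

    Renew once less than half the lifetime remains for short-lived certs,
    or less than one-third for everything else.
    """
    lifetime = not_after - not_before
    remaining = not_after - now
    fraction = 0.5 if lifetime < SHORT_LIVED_MAX else 1 / 3
    return remaining < lifetime * fraction


issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
print(should_renew(issued, expires, issued + timedelta(days=59)))  # False
print(should_renew(issued, expires, issued + timedelta(days=61)))  # True
```

Since this only reads the dates from the local cert, it costs nothing to run it a couple of times a day from cron or a timer.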
Because renewing within the ARI window bypasses the rate limit. If you renew outside it and your ACME client accidentally exceeds the limit without you noticing, your certs will expire.
In early days of ARI that was discussed but I believe the more important element is that the ACME Client use the "replaces" option for the cert ID that it gets from ARI.
It's possible that a cron or timer based ACME Client might renew after the ideal ARI window but still be able to use "replaces" to avoid rate limits. At least that's how I read these docs. A well-behaved and properly configured Client would normally run within the window, at least given how Let's Encrypt sets them now.
The ARI window is in the past for CA revocation events. It may also be set in the past at some point for CA load balancing or planned outages. So I think "replaces" is all that is needed to avoid the rate limit, not the window.
The second return value is a boolean indicating whether the order is exempt from rate limits. If the order is a replacement and the request is made within the suggested renewal window, this value is true. Otherwise, this value is false.
Looking through the code, "within the window" is really what it says: Current time must be after the window start, but also before the window end. So if the ARI window has already passed, you're no longer exempt from rate limits. (Also, if you're renewing way too early, you're not exempt either)
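In rough Python terms, the check as I read it looks something like the sketch below. This is a paraphrase of the described behaviour, not the actual server code; the function name and parameters are made up, and whether the window boundaries are inclusive is an assumption on my part.

```python
from datetime import datetime, timedelta, timezone


def exempt_from_rate_limits(is_replacement: bool,
                            window_start: datetime,
                            window_end: datetime,
                            now: datetime) -> bool:
    """Exempt only if the order replaces an existing cert AND the request
    time falls inside the suggested renewal window.

    Assumption: start is inclusive and end is exclusive.
    """
    return is_replacement and window_start <= now < window_end


now = datetime(2025, 6, 15, tzinfo=timezone.utc)
day = timedelta(days=1)
print(exempt_from_rate_limits(True, now - day, now + day, now))      # True: inside the window
print(exempt_from_rate_limits(True, now + day, now + 2 * day, now))  # False: too early
print(exempt_from_rate_limits(True, now - 2 * day, now - day, now))  # False: window already passed
```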
This brings up an interesting question: for incidents, LE returns ARI windows intentionally set in the past. Apparently renewals against those windows are not exempt from rate limits, which sounds like an oversight?