@jvanasco, your proposal makes some sense to me, but it's way more complicated than the simple logic @mholt and I are proposing, and filled with potential loopholes.
E.g. just because an account has a good reputation at some point doesn't mean it can't become a nuisance at some time in the future, either because it was compromised, because some bad actor games the system by generating some benign-looking activity before going into DDoS mode, or because the client they are using happens to contain some freak infinite-loop bug that's only triggered once in a blue moon.
On top of those concerns, I don't see any benefit of an account / client reputation system over a simple logic that works the same for everyone.
We're talking about a few different things here. The concern and suggestion that you and @mholt share definitely work for ARI renewals and should be implemented. That would solve the renewal issue for subscribers of all sizes.
My concern is really about ARI in general. Setting aside the rate-limiting on Certificate Renewals, large-scale systems are likely to encounter rate limits on their queries, as there are per-second rate limits for simply hitting endpoints, which will affect other ACME operations. There is an existing manual override system for these users; my suggestion is to better automate this, as the ARI system is likely to expand the number of subscribers who need exemptions. Let's call them Enterprise Users. This is still in the single-digit percentages of subscribers, but a rate limit against a single Enterprise User could affect tens of thousands of domains.
In one of the many threads in this topic, someone suggested an endpoint that advertises whether there is a "renewal event" or not. Clients could query that to see if they must quickly query each issued certificate for a new ARI window. Even if that were used to alleviate the traffic from those users, they would still need to quickly query every certificate, and that "potential renewal needed" work is going to be happening concurrently with all their normal account operations.
The more I model this for my own client, the more I see a need to make the override/exemption system work faster.
Ok, now I understand. I thought you were talking about specific instances of clients, not the client library ("User Agent") as a whole. So yeah, my original intent with the topic was to discuss specific rate limit exemption at a specific time for specific authenticated clients.
I, too, would like to see if there's some simple, comprehensive solution that can help clients be aware of when they should renew. I feel like a separate endpoint per certificate is going to be very, very noisy/busy...
Might have been me. I like the idea of a /should-i-be-worried/ (ok, maybe not that...) endpoint that says yes or no, or even returns up to, say, 100 results of certs you should be worried about. That doesn't really address using ARI as a way to guide renewal windows in general, but it does alert to mass renewal events and may be simple for clients to implement. Anyway, maybe for the future. As ARI is optional, much of this is academic; I have no problem deviating from the spec behaviour if necessary.
I share similar sentiments. In addition to the /panic? endpoint, my ideal situation would be something like this:
The ARI response packet is included in the order finalization response.
ARI response packets contain a "last event" and/or "policy change" sequential id.
The /panic? endpoint contains the above ids. The endpoint could just be a normal JSON file served behind a CDN, too, as it could be a global/not-account-specific information packet.
Implementing the above would mean:
Clients immediately have the recommended ARI data, without needing to hit the ARI endpoint.
Clients can poll a global "/panic" endpoint once daily and compare the current event/policy ID to the last ARI information for each certificate they manage.
If the IDs have not changed, there is no need to query the ARI endpoint.
If the IDs have changed, then each certificate's ARI endpoint should be queried.
Clients with small installations or the inability to persist certificate metadata could certainly query every certificate's ARI endpoint daily, but large installations would only need to perform one query for all certificates. I have yet to see any client targeting "large" subscribers that is unable to persist this sort of info, and many smaller clients are able to. The most popular client, Certbot, could manage the ARI information in its existing renewal config files.
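To make the client-side comparison concrete, here is a rough sketch in Python. Everything named in it is an assumption for illustration: the `last_event_id` and `policy_change_id` keys, the `CertRecord` type, and the `certs_to_requery` function are hypothetical and not part of any spec or existing client. It also assumes the global document has already been fetched and parsed into a dict.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping


@dataclass
class CertRecord:
    """Per-certificate metadata a client persists (hypothetical shape)."""
    cert_id: str
    event_id: int   # "last event" id seen in this cert's last ARI data
    policy_id: int  # "policy change" id seen in this cert's last ARI data


def certs_to_requery(panic: Mapping[str, int],
                     records: Iterable[CertRecord]) -> List[str]:
    """Return ids of certificates whose stored ids differ from the global ones.

    `panic` stands in for the parsed global document; its key names are
    assumptions, not a defined format.
    """
    return [
        r.cert_id
        for r in records
        if r.event_id != panic["last_event_id"]
        or r.policy_id != panic["policy_change_id"]
    ]
```

If the returned list is empty, the client makes no ARI requests that day; otherwise it queries the ARI endpoint only for the listed certificates and stores the new ids it gets back.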
We actually had this same idea yesterday too. We need to do some more thinking and fleshing out, but we're considering adding a "renewal token" that would be returned by ARI responses, and could then be presented (once!) in a new-order request during or after the ARI window to bypass all rate limits and mark the previous certificate as replaced (like an ARI POST would).
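As a thought experiment only, the "(once!)" part could look something like the Python sketch below on the server side. This is not how any CA implements it; the `RenewalTokenStore` class and its methods are hypothetical, it uses an in-memory dict instead of real storage, and it deliberately leaves out the check that the token is presented during or after the ARI window.

```python
import secrets
from typing import Dict, Optional


class RenewalTokenStore:
    """Hypothetical single-use token store, for illustration only."""

    def __init__(self) -> None:
        # Maps an outstanding token to the certificate it would replace.
        self._tokens: Dict[str, str] = {}

    def issue(self, cert_id: str) -> str:
        """Create a token to hand back alongside an ARI response."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = cert_id
        return token

    def redeem(self, token: str) -> Optional[str]:
        """Consume a token presented with a new-order request.

        Returns the id of the certificate to mark as replaced, or None if
        the token is unknown or was already used.
        """
        return self._tokens.pop(token, None)
```

Because `redeem` removes the token as it reads it, a second new-order request carrying the same token gets `None` and would fall back to normal rate limiting.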