ARI renewal-info Rate Limit Changes?

Have any rate limits been changed recently?

I develop an ACME client that incorporates ARI. Beginning on April 8th, around 4 PM EST, I consistently run into rate limits when trying to fetch updated ARI information from Let's Encrypt. I use a rate limiter to cap requests to 5 per second (assuming it's working as intended). This code hasn't changed recently and was not a problem before then, but the errors have been happening consistently since.

Trying to figure out what's going wrong.

status: 0; type: urn:ietf:params:acme:error:rateLimited; detail: Service busy; retry later.

I haven't seen any news about that. Nor have we seen other reports like that.

In the interest of gathering more info ...

What is the duration of the retry-after header when you get that?

Are you honoring the ARI Retry-After when you get a fresh response? Last I checked, that Retry-After was 6 hours.

Are you actually hitting your cap rate of 5 per sec?

The current published limit for ARI is 1000/sec so capping at 5 shouldn't get anywhere near limits: https://letsencrypt.org/docs/rate-limits/#overall-requests-limit

PS: Your forum profile still lists the URL for your original client name. Really like the final name btw :slight_smile:

I wrote some test code this evening, and the rate limiter as I had implemented it didn't seem to be working as intended. I switched to a ticker, which properly limits requests to 3 per second and seems to have fixed the issue: httpclient: simplify rate limit · gregtwallace/certwarden-backend@fbfc732 · GitHub
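For readers following along, here is a minimal sketch of what a ticker-based limiter can look like in Go. It is not the code from the linked commit; the function name runLimited and its signature are made up for illustration. The idea is simply that each request waits for the next tick before it starts, so request starts are spaced about one interval apart no matter how many are queued.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLimited (hypothetical helper) starts one job per tick of a ticker, so
// job starts are spaced roughly `interval` apart. It returns the offset from
// the call at which each job was started.
func runLimited(n int, interval time.Duration, job func(i int)) []time.Duration {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	start := time.Now()
	offsets := make([]time.Duration, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		<-ticker.C // wait for the next tick before starting a request
		offsets[i] = time.Since(start)
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			job(i)
		}(i)
	}
	wg.Wait()
	return offsets
}

func main() {
	// About 3 starts per second, the rate mentioned in the post above.
	interval := time.Second / 3
	offsets := runLimited(5, interval, func(i int) {
		// A real client would send its ARI GET request here.
	})
	for i, off := range offsets {
		fmt.Printf("job %d started at +%v\n", i, off.Round(time.Millisecond))
	}
}
```

Because the loop only receives a tick and spawns a goroutine, a slow request does not delay the next one; only the start times are paced.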

However, in my testing code I was using a batch of 10 simultaneous GET requests and consistently 8 or 9 would be rejected.

You can use this to test (note: doesn't work since playground doesn't seem to support internet connections):

(Uncomment the "// default:" line to test without a rate limit.)

(And thanks for the heads up on my profile, I fixed it :slight_smile: and the name was suggested by someone else. I wish I was that clever lol.)

Edit: Tweaked code to default without the rate limit in place and to also log the Retry-After Value: Go Playground - The Go Programming Language

2026/04/13 19:52:00 [21]
2026/04/13 19:52:00 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Service busy; retry later."}
2026/04/13 19:52:00 [28]
2026/04/13 19:52:00 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Service busy; retry later."}
2026/04/13 19:52:00 [7]
2026/04/13 19:52:00 [28]
2026/04/13 19:52:00 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Service busy; retry later."}
2026/04/13 19:52:00 [5]
2026/04/13 19:52:00 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Service busy; retry later."}
2026/04/13 19:52:00 [28]
2026/04/13 19:52:00 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Service busy; retry later."}
2026/04/13 19:52:00 [18]
2026/04/13 19:52:00 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Service busy; retry later."}
2026/04/13 19:52:00 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Service busy; retry later."}
2026/04/13 19:52:00 [14]
2026/04/13 19:52:00 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Service busy; retry later."}
2026/04/13 19:52:00 [24344]
2026/04/13 19:52:00 [25535]
[two proper responses here...]

Was that duplicating the identical full URL? Because that isn't a well-behaved ARI query series. The reply to a successful ARI request includes a retry-after header. You shouldn't retry that URL until that elapses.
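As a rough illustration of honoring that header, the sketch below turns a Retry-After value into a wait duration. HTTP allows the header to carry either a number of seconds or an HTTP date, so both forms are handled. The names retryAfter and exampleNow are invented for this example, and the 21600 second (6 hour) value is just the figure mentioned earlier in the thread, not something the server is guaranteed to send.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// exampleNow is a fixed reference time (an assumption for the demo) so the
// results are reproducible.
var exampleNow = time.Date(2026, time.April, 13, 19, 52, 0, 0, time.UTC)

// retryAfter (hypothetical helper) parses a Retry-After header value, which
// may be a count of seconds or an HTTP date. It returns how long to wait
// from `now` and whether the header was usable at all.
func retryAfter(header string, now time.Time) (time.Duration, bool) {
	header = strings.TrimSpace(header)
	if header == "" {
		return 0, false
	}
	// Form 1: delay in seconds.
	if secs, err := strconv.Atoi(header); err == nil {
		if secs < 0 {
			return 0, false
		}
		return time.Duration(secs) * time.Second, true
	}
	// Form 2: HTTP date.
	if t, err := http.ParseTime(header); err == nil {
		if d := t.Sub(now); d > 0 {
			return d, true
		}
		return 0, true // date already passed: no wait needed
	}
	return 0, false
}

func main() {
	for _, h := range []string{"21600", "Tue, 14 Apr 2026 01:52:00 GMT", ""} {
		d, ok := retryAfter(h, exampleNow)
		fmt.Printf("%q -> wait %v (header usable: %v)\n", h, d, ok)
	}
}
```

A client would store now plus the returned duration per renewal-info URL and skip that URL until the stored time has passed.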

I haven't seen postings about any related changes but it's possible a new check could be blocking faulty client requests.

I haven't had time to try to reproduce this myself.

Yes, that test code was using an identical full URL (for simplicity). The actual ACME client application was seeing the error when sending multiple ARI requests for different certificates at the same time (i.e., 1 per certificate) so that should not be the cause of the production errors.

Basically the application wakes up every couple of hours and queries all ARI end points where the Retry-After date is in the past.

I get the same result using your test program, with some minor changes because of my limited, older Go test setup.

I changed the rateLimit constant to be 1 but I still see every request logged in the same second. Still, 10 requests in 1s should be fine. I still wonder about reusing the same URL but you see this in your production so ...

I also don't see any Retry-After header info in your log or mine. Every rate limit error should include one, and it should be honored. Not sure why it's missing. Perhaps that only holds for Boulder-issued rate limits. I am not certain, but I thought ARI responses were served from a CDN, so they may behave slightly differently.

This is getting outside my expertise, but I thought it worth saying that I see the same results independently. I'm running on an EC2 instance on the US East Coast, fwiw.

I suspect they are hitting the endpoint rate limits. There are specific rate limits associated with certain actions (for example, creating an account, issuing a certificate, performing a challenge, etc.).

These rate limits work per URL or action, meaning it's permissible to retry a different URL before the Retry-After time has elapsed.

However, the endpoint limits work per IP address. That means you have to back off retrying for all certificates when one fails.

If you share IP with many other users, you could hit these rate limits easily.

Sure, but the ARI renewal-info endpoint allows 1000 / sec. My own test was on a server with a unique IP (IPv6 actually). The OP's test script failed 8-9 times out of just 10 requests for me just like in their tests.

We will probably need LE staff to be involved.

I used "parallel" in bash to run curl requests to your acme-renewal URL and only rarely see a 503. But, I only get about 20 req/s on that test server.

So, I asked Claude to make a Go program that sends HTTP requests to a single URL with varying counts and concurrency. With it I easily reproduce your frequent 503s. It also reproduces your finding that low concurrency works better than high concurrency. At a concurrency of 3, I usually get 47 or 48 of 50 working (status 200). But as concurrency increases, the failure rate rises rapidly. When all 50 test requests run concurrently, only 2 of 50 get a 200!

The Go program: parallel_requests.go (2.5 KB)
I hardly know Go at all, so I can't vouch for the very simple code, although the fact that it reproduced what @gtwallace saw seems indicative.

Instructions are:

go run parallel_requests.go -url URL -n 50 -c 20

Where -n is the number of requests and -c the concurrency
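The attached parallel_requests.go is not reproduced in this thread. For anyone who wants to see the general shape of such a test, here is a separate minimal sketch, not the attached program: n GET requests with at most c in flight, tallied by status code. The names fetchAll and startDemoServer are invented for this example, and the demo points at a local stand-in server that always answers 200 so it does not load the real endpoint.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// startDemoServer (hypothetical) starts a local stand-in server that always
// answers 200, so the sketch runs without touching a real ACME endpoint.
func startDemoServer() (url string, stop func()) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	return srv.URL, srv.Close
}

// fetchAll (hypothetical) sends n GET requests to url with at most
// `concurrency` in flight and returns a count of responses per status code.
// Status 0 is used for transport errors.
func fetchAll(url string, n, concurrency int) map[int]int {
	if concurrency < 1 {
		concurrency = 1
	}
	client := &http.Client{Timeout: 10 * time.Second}
	sem := make(chan struct{}, concurrency) // bounds in-flight requests
	counts := make(map[int]int)
	var mu sync.Mutex
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		sem <- struct{}{} // blocks while `concurrency` requests are running
		go func() {
			defer wg.Done()
			defer func() { <-sem }()
			status := 0
			resp, err := client.Get(url)
			if err == nil {
				status = resp.StatusCode
				resp.Body.Close()
			}
			mu.Lock()
			counts[status]++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counts
}

func main() {
	url, stop := startDemoServer()
	defer stop()
	// Mirrors the -n 50 -c 20 invocation above, against the local server.
	fmt.Println(fetchAll(url, 50, 20))
}
```

Against the local server every request should come back 200; the interesting comparison is running the same loop at different concurrency values against an endpoint you are allowed to test.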

Example response from a test (actual URL omitted but from @gtwallace sample program):

go run parallel_requests.go -url https://acme-v02.api.letsencrypt.org/acme/renewal-info/(id.serial) -n 50 -c 20
URL:         https://acme-v02.api.letsencrypt.org/acme/renewal-info/(id.serial)
Requests:    50
Concurrency: 20
---

Completed 50 requests in 454ms

Status code distribution:
  [200] 7 responses
  [503] 43 responses

Latency:
  Fastest: 39ms
  Slowest: 360ms
  Average: 164ms

Throughput: 110.18 req/s

Ooh wait, could it be that the rate limit is implemented as a "must wait between requests" delay? That would mean 2 requests arriving less than 1/1000 sec apart get rejected, on the assumption that "continuing at this rate for a full second would spill over the rate limit", regardless of how few requests there are.

Note that, as I understand it, the "bucket logic" is not implemented for the "per endpoint" rate limits, which are enforced in the load balancer; it is only implemented for the per-action rate limits.
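To make the difference between those two ideas concrete, here is a small simulation. It is purely a sketch of the hypothesis above, not a description of how Let's Encrypt's load balancer actually works, and all function names are invented. One limiter rejects any request that arrives less than a minimum gap after the last accepted one; the other is a token bucket that allows bursts up to its capacity. Both are configured for a nominal 1000 requests per second.

```go
package main

import (
	"fmt"
	"time"
)

// arrivalsEvery (hypothetical) returns n arrival times spaced `gap` apart.
// A gap of 0 models n requests arriving at the same instant.
func arrivalsEvery(n int, gap time.Duration) []time.Duration {
	out := make([]time.Duration, n)
	for i := range out {
		out[i] = time.Duration(i) * gap
	}
	return out
}

// spacingAccepts counts requests accepted by a "must wait between requests"
// limiter: a request passes only if at least minGap has elapsed since the
// last accepted request.
func spacingAccepts(arrivals []time.Duration, minGap time.Duration) int {
	accepted := 0
	var lastAccepted time.Duration
	for _, t := range arrivals {
		if accepted == 0 || t-lastAccepted >= minGap {
			accepted++
			lastAccepted = t
		}
	}
	return accepted
}

// bucketAccepts counts requests accepted by a token bucket that starts full,
// refills at ratePerSec, and holds at most `burst` tokens.
func bucketAccepts(arrivals []time.Duration, ratePerSec float64, burst int) int {
	tokens := float64(burst)
	var last time.Duration
	accepted := 0
	for _, t := range arrivals {
		tokens += ratePerSec * (t - last).Seconds()
		if tokens > float64(burst) {
			tokens = float64(burst)
		}
		last = t
		if tokens >= 1 {
			tokens--
			accepted++
		}
	}
	return accepted
}

func main() {
	burst := arrivalsEvery(10, 0) // 10 simultaneous requests
	fmt.Println("spacing limiter accepted:", spacingAccepts(burst, time.Millisecond))
	fmt.Println("token bucket accepted:  ", bucketAccepts(burst, 1000, 1000))
}
```

Under these assumptions, the spacing limiter lets through only the first of 10 simultaneous requests even though the total is far below 1000 per second, while the bucket accepts all of them.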

I agree it seems like a "too fast" problem even without exceeding the stated limits. But, if so the docs could/should be improved so ACME Client authors can design their code properly.

But, this problem started abruptly last week per first post. So if what you describe is by design then it changed without notice.

In any case, LE staff requested that this problem be handled on the Boulder GitHub repository, so you can follow that thread if you wish: ARI Endpoint Rate Limit Is (Unintentionally?) Low · Issue #8717 · letsencrypt/boulder · GitHub

Thanks for reporting this, and sorry for the inconvenience!

We did make a change on April 8th that unexpectedly lowered the rate limit on the revoke and ARI endpoints. We've just deployed a fix that should bring it back in line with the Overall Requests Limit. Let me know if you still have problems.