Staging environment still has issues

kedar031 · March 1, 2023, 6:09am

My domain is: e2e-fdr-qnx-master0.e2e-fdr.eypgdy.g0.int.cldr.work
I ran this command:
It produced this output: AcmeRateLimitedException: Service busy; retry later.

My web server is (include version): using acme4j client

The operating system my web server runs on is (include version): CentOS

My hosting provider, if applicable, is:

Team, this is still happening in the Staging environment several hours after the outage was reported as resolved. Can someone please check?

	Suppressed: org.shredzone.acme4j.exception.AcmeRateLimitedException: Service busy; retry later.
		at org.shredzone.acme4j.connector.DefaultConnection.throwAcmeException(DefaultConnection.java:545)
		at org.shredzone.acme4j.connector.DefaultConnection.performRequest(DefaultConnection.java:479)
		at org.shredzone.acme4j.connector.DefaultConnection.sendSignedRequest(DefaultConnection.java:407)
		at org.shredzone.acme4j.connector.DefaultConnection.sendSignedPostAsGetRequest(DefaultConnection.java:155)
		at org.shredzone.acme4j.AcmeJsonResource.update(AcmeJsonResource.java:117)

mcpherrinm · March 1, 2023, 6:30am

It seems there’s a new issue, probably unrelated to the previous one. Seems like one of the databases got OOM killed and didn’t come back healthy. We are investigating.

mcpherrinm · March 1, 2023, 6:53am

Should be better now.

kedar031 · March 1, 2023, 7:28am

@mcpherrinm Thanks for the quick help. We are still hitting this issue quite often (one instance: 4378ebe136cb4116.knox-71r.l2ov-m7vs.int.cldr.work).
Will it take some more time for this to be fixed completely?

avizov · March 1, 2023, 7:42am

We're also experiencing the rate limit issue quite often:
cert-manager/challenges "msg"="re-queuing item due to optimistic locking on resource" "error"="[503 urn:ietf:params:acme:error:rateLimited: Service busy; retry later.

JamesLE · March 1, 2023, 7:56am

Sorry about the trouble. I've confirmed that performance is still affected, and have updated our status page. This may take a while for us to fix.

kedar031 · March 2, 2023, 5:15am

Thanks @JamesLE. Any ETA or further update on this would be a great help.

JamesLE · March 2, 2023, 6:42am

We’ve fixed the immediate issue, and the staging environment has returned to its baseline. Unfortunately, that baseline does have a relatively high error rate. We’ll continue working to improve that, but have no ETR.

If you’re regularly unable to issue even one staging certificate, though, do let us know since the error rate should not be that high.
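Since the staging baseline includes some "Service busy; retry later." responses, a client can treat them as transient and retry with increasing delays instead of failing on the first one. Below is a minimal sketch of exponential backoff with jitter. `ServiceBusyError`, `retry_with_backoff`, and the default delays are hypothetical names and values chosen for illustration; they are not part of acme4j, cert-manager, or Certbot, and a real client would catch its own library's exception type instead.

```python
import random
import time


class ServiceBusyError(Exception):
    """Hypothetical stand-in for a client's 'Service busy; retry later.' error."""


def retry_with_backoff(request, max_attempts=6, base_delay=2.0, max_delay=300.0,
                       sleep=time.sleep, jitter=random.random):
    """Call request() until it succeeds or max_attempts calls have failed.

    The defaults are illustrative assumptions, not recommended values.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ServiceBusyError:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Double the delay ceiling after each failure, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Sleep a random fraction of the ceiling so that many clients
            # do not all retry at the same moment.
            sleep(ceiling * jitter())
```

The `sleep` and `jitter` parameters exist only so the delays can be replaced in tests; in normal use the defaults apply and the call looks like `retry_with_backoff(lambda: order_certificate())`, where `order_certificate` is whatever function performs the ACME request.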

PaulChallis · March 2, 2023, 8:19am

We've been having the same issue with receiving either a 503 or "Service busy; retry later" error. Seems like something to do with rate limiting?

Error: urn:ietf:params:acme:error:rateLimited :: There were too many requests of a given type :: Service busy; retry later.

rohit93c · March 2, 2023, 9:05am

I have been trying since yesterday, requesting no more than 3 certificates, but I continue to get the following error:

Service busy; retry later

I even tried requesting a single certificate, but it is still failing with the following error:

acme: error: 0 :: POST :: https://acme-staging-v02.api.letsencrypt.org/acme/new-acct :: urn:ietf:params:acme:error:rateLimited :: Service busy; retry later.

{"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Service busy; retry later."}
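The JSON body above is an ACME problem document, and the reports in this thread show the same `rateLimited` type arriving with a 503 status and the detail "Service busy; retry later." A client could use those fields to tell server overload apart from an account actually exceeding a limit. The following is a rough sketch; `classify_acme_error` and its return labels are hypothetical, and matching on the status code or detail text is a heuristic assumption based on the messages quoted here, not a documented contract.

```python
import json

RATE_LIMITED = "urn:ietf:params:acme:error:rateLimited"


def classify_acme_error(status, body):
    """Return 'service_busy', 'rate_limited', or 'other' for an ACME error response."""
    try:
        problem = json.loads(body)
    except (TypeError, ValueError):
        # Body is missing or not valid JSON, so it is not a problem document.
        return "other"
    if not isinstance(problem, dict) or problem.get("type") != RATE_LIMITED:
        return "other"
    detail = str(problem.get("detail", ""))
    # Assumption: a 503 status or a "Service busy" detail means the server
    # is shedding load rather than enforcing a per-account limit.
    if status == 503 or detail.startswith("Service busy"):
        return "service_busy"
    return "rate_limited"
```

With the body quoted above and a 503 status, this returns `'service_busy'`, which suggests waiting and retrying rather than reducing the number of certificates requested.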

TWR · March 2, 2023, 1:05pm

I've been getting the same issues via Certbot against the staging environment. Unable to even issue one certificate over here. I've tried giving it everything from 5 minutes to 4 hours to resolve. Thank you for opening the support ticket - thought I was doing something egregious.

saurabh.nikam07 · March 2, 2023, 3:12pm

I am facing a similar issue on staging as well. I have not tried production yet, but I plan to next week.
I hope it will not be impacted.

mcpherrinm · March 2, 2023, 3:20pm

Our staging databases are struggling under load. Production is unaffected.

mcpherrinm · March 2, 2023, 6:26pm

We're back to normal as of 18:10 UTC.

Sorry for the repeated problems here. It seems the rate limiter itself had an issue, so we were still rate limiting even once the initial issues had resolved. 503s are back to zero now.

system · April 1, 2023, 6:26pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
