Rate limit for '/acme' reached

Hey there,
We are using a modified Version of LEScript (GitHub - analogic/lescript: Simplified PHP ACME client) for automatic Cert issuance on our SaaS CMS. It has been working without Problems for many years and for thousands and thousands of Domains ...
In the last few Days we noticed a massive increase in Rate-Limit Errors, even though there was not a single change in our Cert issuance process.
A maximum of 1 Cert per Minute is requested from the CMS Server IP.
But we are getting lots of "Rate limit for '/acme' reached" Errors:

As the Rate Limit for "/acme" is 40/Second, I'm pretty sure there is no way we are reaching this Limit.
I already did some debug work and logged every request against the LE Servers, and never got above 20-40 requests against ALL LE Endpoints needed for Cert issuance.

Anyone else experiencing Problems like this? Could this be an indication of some sort of Problem on the LE side? Any ideas on how to further debug this issue?

Thank you, bye from Austria
Andy


Is it possible that you have multiple nodes running this software that are all using the same gateway? In the past, there have been issues like this because cloud systems/admins did not fully deactivate a node, and it was still running and making requests (using up rate limits) without anyone realizing it. If the gateway is shared, they all contribute to the same IP limits; otherwise they overstep the account/domain limits.

There have also been issues with some employees of an organization installing new software and not telling others.

The 40/s rate limit is a combined limit against all the endpoints (acme + directory); it's enforced at ISRG's gateway.

It's possible that Let's Encrypt did change something. See Missing TLD [xn--4dbrk0ce / .ישראל] - #37 by mcpherrinm, where LE noted they were doing some maintenance this week.


Thank you for your answer. Unfortunately it's really just running on this one Server, from a dedicated IP Address. So I can rule out the possibility of multiple nodes / same IP.
And since no one other than me has root access to the production SaaS CMS Server, I can also rule out the possibility that another LE Client is making additional requests from this IP.

As for the 40/s limit - there is 1 request to /directory and x (depending on how many Domains are in the Cert) requests to the /acme Endpoint per minute ... while logging additional debug info I never encountered more than 25 req/s - but I still get the error back randomly ...


Considering that technical info, I feel okay flagging @lestaff.


Hi @futureweb! The screenshot you shared doesn't have detailed error messages. Could you share a collection of error messages, exactly as they are received from the ACME API?

Also, if you could share a chunk of request logs spanning five minutes during which you received errors, that would be helpful.


Let's Encrypt SRE is looking into this. Our initial hypothesis is that it might be related to some ongoing maintenance:

We are currently operating fewer frontend load balancers than usual as we increase our network redundancy. This was supposed to be invisible, but it is possible it is affecting some rate limit calculations: with fewer servers, the number of requests handled by each load balancer server is higher, so it may cause rate limits to kick in earlier, as the first level of limits is handled per-server.

If that is the root cause, it should resolve itself by the end of the day as we complete reconnecting the new networking hardware. I'm sorry for any inconvenience this has caused you! We will continue looking into this and post an update soon.


Some Examples:

Domain: www.flo-popup.com
Time: Thu 25.08.2022 12:01
Error: 429 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}

Domain: www.mountainguide.at
Time: Thu 25.08.2022 10:05
Error: 429 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}

Domain: www.archiv.originalanton.com
Time: Thu 25.08.2022 09:55
Error: 429 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}

Domain: www.der-brixentaler.at
Time: Thu 25.08.2022 09:48
Error: 429 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}

Domain: www.metallveredelunghuber.at
Time: Thu 25.08.2022 09:47
Error: 429 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}

Domain: www.bildhauerwerkstaette.at
Time: Thu 25.08.2022 09:44
Error: 429 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}

Domain: www.voglwirt-anthering.at
Time: Thu 25.08.2022 09:43
Error: 429 {"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}

Today we got 53 Errors, yesterday some hundreds ... in the days/weeks/months/years before, normally only 1-5 per day ... Some Rate Limit Errors were always noticeable, but never that many. There are especially large peaks at night, starting at Midnight (European TimeZone - VIENNA).
I never found a reason why we are seeing those Rate Limit Errors, as we never went above the limits as far as I could see ...

Debug Output of one Cert Renewal on our Dev. Server - we renew at most 1 Cert / Minute (I guess no sensitive Data is included here?!)

www.flo-popup.com

429
{"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}


Logger Cache: 2022-08-25 12:01:02 [info] Getting list of URLs for API
2022-08-25 12:01:03 [info] Requesting new nonce for client communication
2022-08-25 12:01:04 [info] Account already registered. Continuing.
2022-08-25 12:01:04 [info] Sending registration to letsencrypt server
2022-08-25 12:01:04 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/new-acct
2022-08-25 12:01:06 [info] Account: https://acme-v02.api.letsencrypt.org/acme/acct/2188868
2022-08-25 12:01:06 [info] Starting certificate generation process for domains
2022-08-25 12:01:06 [info] Requesting challenge for flo-popup.com.in.futurecms.at, www.flo-popup.com.in.futurecms.at, flo-popup.com.ex.futurecms.at, www.flo-popup.com.ex.futurecms.at, flo-popup.com.ex.ortsinfo.at, www.flo-popup.com.ex.ortsinfo.at, flo-popup.com.dev.futurecms.at, www.flo-popup.com.dev.futurecms.at
2022-08-25 12:01:06 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/new-order
2022-08-25 12:01:07 [info] Order Authorizations: Array (
    [0] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309097
    [1] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309107
    [2] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309117
    [3] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309127
    [4] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309137
    [5] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309147
    [6] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309157
    [7] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309167
)

2022-08-25 12:01:07 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/order/2188868/119401301757
2022-08-25 12:01:08 [info] Order Status - Response: Array (
    [status] => pending
    [expires] => 2022-09-01T10:01:07Z
    [identifiers] => Array
        (
            [0] => Array
                (
                    [type] => dns
                    [value] => flo-popup.com.dev.futurecms.at
                )

            [1] => Array
                (
                    [type] => dns
                    [value] => flo-popup.com.ex.futurecms.at
                )

            [2] => Array
                (
                    [type] => dns
                    [value] => flo-popup.com.ex.ortsinfo.at
                )

            [3] => Array
                (
                    [type] => dns
                    [value] => flo-popup.com.in.futurecms.at
                )

            [4] => Array
                (
                    [type] => dns
                    [value] => www.flo-popup.com.dev.futurecms.at
                )

            [5] => Array
                (
                    [type] => dns
                    [value] => www.flo-popup.com.ex.futurecms.at
                )

            [6] => Array
                (
                    [type] => dns
                    [value] => www.flo-popup.com.ex.ortsinfo.at
                )

            [7] => Array
                (
                    [type] => dns
                    [value] => www.flo-popup.com.in.futurecms.at
                )

        )

    [authorizations] => Array
        (
            [0] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309097
            [1] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309107
            [2] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309117
            [3] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309127
            [4] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309137
            [5] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309147
            [6] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309157
            [7] => https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309167
        )

    [finalize] => https://acme-v02.api.letsencrypt.org/acme/finalize/2188868/119401301757
)

2022-08-25 12:01:08 [info] Auth Nr 1:
2022-08-25 12:01:08 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309097
2022-08-25 12:01:12 [info] Got challenge token for flo-popup.com.dev.futurecms.at
2022-08-25 12:01:12 [info] Token for flo-popup.com.dev.futurecms.at saved at /daten/www/ortsinfo//.well-known/acme-challenge/k3AczDgrsSJmJZDxubhU_7E3EYH56IcUMZv98vKAleY and should be available at http://flo-popup.com.dev.futurecms.at/.well-known/acme-challenge/k3AczDgrsSJmJZDxubhU_7E3EYH56IcUMZv98vKAleY
2022-08-25 12:01:12 [info] Sending request to challenge
2022-08-25 12:01:12 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/chall-v3/145963309097/NtW8SQ
2022-08-25 12:01:13 [info] Auth Nr 2:
2022-08-25 12:01:13 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309107
2022-08-25 12:01:15 [info] Got challenge token for flo-popup.com.ex.futurecms.at
2022-08-25 12:01:15 [info] Token for flo-popup.com.ex.futurecms.at saved at /daten/www/ortsinfo//.well-known/acme-challenge/8jHWiDvIzLY43SPdXPVvKdnlE23aQVqdLzbZ4sKfZNc and should be available at http://flo-popup.com.ex.futurecms.at/.well-known/acme-challenge/8jHWiDvIzLY43SPdXPVvKdnlE23aQVqdLzbZ4sKfZNc
2022-08-25 12:01:15 [info] Sending request to challenge
2022-08-25 12:01:15 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/chall-v3/145963309107/7YvNvg
2022-08-25 12:01:17 [info] Auth Nr 3:
2022-08-25 12:01:17 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309117
2022-08-25 12:01:19 [info] Got challenge token for flo-popup.com.ex.ortsinfo.at
2022-08-25 12:01:19 [info] Token for flo-popup.com.ex.ortsinfo.at saved at /daten/www/ortsinfo//.well-known/acme-challenge/7f-jHjG1zPm-mX42MXvd-2RWr6DwftJ5ncmXBVMKk4k and should be available at http://flo-popup.com.ex.ortsinfo.at/.well-known/acme-challenge/7f-jHjG1zPm-mX42MXvd-2RWr6DwftJ5ncmXBVMKk4k
2022-08-25 12:01:19 [info] Sending request to challenge
2022-08-25 12:01:19 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/chall-v3/145963309117/UPVYYA
2022-08-25 12:01:20 [info] Auth Nr 4:
2022-08-25 12:01:20 [info] Sending signed request to https://acme-v02.api.letsencrypt.org/acme/authz-v3/145963309127
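
To check a claim like "never more than 25 req/s" against a log in the format above, the request lines can be bucketed by their one-second timestamps. This is only a minimal sketch: the function names are made up for illustration, and it counts only lines containing "Sending signed request to", so directory and nonce requests would need their own patterns.

```python
import re
from collections import Counter

# Matches lines such as:
# 2022-08-25 12:01:04 [info] Sending signed request to https://...
REQUEST_LINE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[info\] "
    r"Sending signed request to (\S+)"
)


def requests_per_second(log_text):
    """Count signed requests per one-second timestamp bucket."""
    counts = Counter()
    for line in log_text.splitlines():
        match = REQUEST_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def peak_rate(log_text):
    """Return (timestamp, count) for the busiest second, or None."""
    counts = requests_per_second(log_text)
    if not counts:
        return None
    return counts.most_common(1)[0]
```

Because the log has one-second resolution, this gives requests per calendar second, which can slightly undercount a burst that straddles two seconds.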

@mcpherrinm - thx for the Info - maybe it's related ... will see if it peaks again tonight! :)


Thanks, those error messages confirm it's coming from the frontend load balancers as I suspected.
I am still investigating. Our full capacity will be online in a few hours, so if the maintenance is the cause, it should fix itself by then.


Today I already got ~40-50 of those Errors:

429
{
  "type": "urn:ietf:params:acme:error:rateLimited",
  "detail": "Error creating new order :: too many currently pending authorizations: see https://letsencrypt.org/docs/rate-limits/",
  "status": 429
}

which we normally never get ... and 5 of those:

429
{"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}

Is it possible that unfinished requests are piling up because of all the "rateLimit /acme" Errors and now we finally run against the maxPendingAuth Limit?!?

But it seems that the '/acme' rateLimit Errors are back to what we normally see ... even if I can't figure out why we always see a handful of those "/acme" rateLimits when we are not peaking anywhere near 40 req/s ... :-/

Yes. A common pattern with large integrations is this:

  • client requests an order for 1-100 domains on a renewal certificate
  • while completing authorization challenges, there is some ratelimit error
  • the client aborts that order, and moves to the next certificate in their renewal queue

When this happens, the pending authorizations accumulate and eventually trigger another rate limit that will wedge the account.

The correct way to handle this is:

  • leave the authorizations in place, but do not move on to a different item in your queue. Instead, fix the issues for the currently desired certificate and retry it.
    Or
  • on order failure, clean up the pending authorizations by deactivating them.

Unfortunately, most clients do not handle the pending authorizations correctly. This happens often to large integrations, because they host numerous domains and they may have a client who misconfigured DNS at some point, or removed their domain from the service.
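
The cleanup option could look roughly like the sketch below. In ACME (RFC 8555, section 7.5.2) an authorization is deactivated by POSTing `{"status": "deactivated"}` to its URL. Everything else here is an assumption for illustration: `signed_post` is a hypothetical helper that the client would have to provide, assumed to send a JWS-signed POST and return the decoded JSON body, with a payload of `None` meaning POST-as-GET.

```python
def deactivate_pending_authorizations(authz_urls, signed_post):
    """Deactivate every authorization that is still pending.

    signed_post(url, payload) is a hypothetical helper supplied by the
    caller. It is assumed to send a JWS-signed POST and return the decoded
    JSON response as a dict. A payload of None is assumed to mean
    POST-as-GET (fetch the current authorization object).
    """
    deactivated = []
    for url in authz_urls:
        authz = signed_post(url, None)  # fetch the current state
        if authz.get("status") != "pending":
            continue  # valid, invalid, expired, ...: nothing to clean up
        result = signed_post(url, {"status": "deactivated"})
        if result.get("status") == "deactivated":
            deactivated.append(url)
    return deactivated
```

A client would call this with the authorization URLs of the failed order (the `authorizations` array in the order object) before moving on to the next certificate in its queue.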


It looks like the Code-Base (LEScript) we chose for our SSL management also implemented it that way ... :-/
I'll see if we can improve something here.

Thank you!! :)


Alright - currently I don't see any "Rate limit for '/acme' reached" Errors anymore ... so it seems it was, as you already suspected, related to the maintenance.

For the "too many currently pending authorizations" Error ... I've just added Logic to our SSL Script that invalidates pending authorizations, which should prevent such issues in the future.

Thanks to everyone here for your time & help! :)

bye from sunny Tirol, Austria
Andy


@mcpherrinm - may I ask if it's possible that those Frontend Load Balancer Limits are still calculated wrong? Just as before the Maintenance, I still get a handful of the "Rate limit for '/acme' reached" Errors ... 26 today so far ...
But I'm sure I'm not anywhere near the 40/s Rate Limit ... I'm more in the 1-5 Req/Sec Range ...
For Example - this Order: https://acme-v02.api.letsencrypt.org/acme/order/2188868/121440226827 - failed with Rate Limit ... but only 5 or 6 Requests were being sent to the LE Servers.

429
{"type": "urn:ietf:params:acme:error:rateLimited", "detail": "Rate limit for '/acme' reached"}

Any way you could verify?

thx
Andy


I can take a second look. Can you tell me what IP address(es) were used to make the requests, and timestamps? (I can figure that out from the order, but it would skip a step for me if you could tell me)


@mcpherrinm would be great, thx! :)

Here are 3 Examples:

Source IP always: 83.65.246.198
TimeZone: CEST (UTC+2)

07:18:04 - 07:18:18 - https://acme-v02.api.letsencrypt.org/acme/order/2188868/121441295817
07:13:03 - 07:13:11 - https://acme-v02.api.letsencrypt.org/acme/order/2188868/121440226827
06:37:01 - 06:37:20 - https://acme-v02.api.letsencrypt.org/acme/order/2188868/121432643097


@mcpherrinm - as I just received the Rate Limit Error again, I wanted to ask if you have maybe had the time to check whether the Rate Limit Checks are working as expected, thank you! :)


We have no aggregate reason to believe rate limits are misbehaving, but I haven't had time to review logs for your requests in particular yet.


Short Feedback on the "Rate Limit Errors" - since LE implemented the "429 - Service busy; retry later." Error, I haven't gotten a single "Rate limit for '/acme' reached" back anymore ... only some "Service busy" and that's it ... so it seems the LE Servers falsely gave me back the "Rate limit for '/acme'" Error when in fact the Server was just busy ...
Either way - all working now as expected - thx! :)


Thanks for the follow-up. Note the 429 error code is changing to a 503 soon. This should avoid some confusion.


It's actually a little more subtle; in our configuration as-is, I couldn't keep the /acme rate limit while also applying the new overall load limits without a huge refactor that would have taken too much testing time. To keep things lean, I sacrificed the /acme message at the altar of technical debt.

But still, glad that things are looking better. The 503 change I'm planning for next Monday, which will help out a bit on observability too.
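
For clients, one way to cope with both the current 429 "Service busy" responses and the planned 503 is to treat either status as retryable and back off. The following is a rough sketch, not anything Let's Encrypt prescribes: `send` is a hypothetical callable returning `(status, headers, body)`, `headers` is assumed to be a plain dict, and the `Retry-After` header is honored only if the server happens to send it as a number of seconds.

```python
import time

RETRYABLE_STATUSES = {429, 503}


def call_with_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status.

    send is a hypothetical callable returning (status, headers, body).
    The last response is returned once max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        is_last = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUSES or is_last:
            return status, headers, body
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)  # server-provided hint in seconds
        else:
            delay = base_delay * (2 ** attempt)  # exponential backoff
        sleep(delay)
```

Note that a 429 can also carry a real rate limit such as "too many currently pending authorizations"; blindly retrying will not clear that one, so a client would probably want to inspect the `detail` field before retrying.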
