Renewals hitting rate-limits

Hello,

We manage thousands of VMs and use certbot to install and renew Let’s Encrypt certificates. On every server, crond starts the “certbot renew” command at the same time. Each VM has its own IPv4 address.

Renewal fails randomly on different VMs with the following error: domain.com.conf produced an unexpected error: urn:ietf:params:acme:error:rateLimited :: There were too many requests of a given type :: Error creating new order :: too many new orders recently: see https://letsencrypt.org/docs/rate-limits/. Skipping.

If I run “certbot renew --cert-name domain.com --dry-run” it completes correctly.

Which limit exactly are we hitting here, given that these are renewal requests? Can you suggest a workaround or fix for this issue?

Hi @iko

Your error message already has the answer: “too many new orders recently”.

That's exactly the limit you are hitting.

We manage thousands of VM’s

Fewer than 10,000, so roughly 5000 per 30 days, 500 per 3 days (sorry, can't count :wink:), fewer than 200 per day, roughly 20 per hour. 20 per hour isn't a problem -> change the times at which these jobs run.
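One way to spread the runs out is to give each VM a stable delay derived from its hostname, so the cron entry can stay identical everywhere while each host waits a different amount of time before calling certbot. A minimal Python sketch of that idea; the `stagger_seconds` name and the one-hour window are assumptions for illustration, not something certbot provides:

```python
import hashlib
import socket


def stagger_seconds(hostname: str, window_seconds: int = 3600) -> int:
    """Map a hostname to a stable delay in the range [0, window_seconds)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the hash as an integer and fold it into the window.
    return int.from_bytes(digest[:8], "big") % window_seconds


if __name__ == "__main__":
    # Print this host's delay; a wrapper script could sleep this long before renewing.
    print(stagger_seconds(socket.gethostname()))
```

Because the delay depends only on the hostname, a given VM always lands in the same slot, and thousands of VMs should end up spread fairly evenly across the window.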

The New Orders limit is per ACME account.

Are you sharing one ACME account between many VMs?

How many certificates are managed by the VMs where this error is occurring?

Certbot is installed from the EPEL repo for CentOS 7/8. We didn’t register an ACME account ourselves (we are using the defaults).

There are 77 certificates on one of the affected VMs and 12 on another one that I saw affected today. There must be many more VMs randomly hitting the rate limits.

I read that each ACME account can create up to 300 new orders per 3 hours, and maybe that’s the limit we are hitting. Another thing is that “certbot renew” tries to renew all certificates, regardless of whether they still point to the server’s IP or have been moved somewhere else. That results in renewal attempts which will never succeed but still count against the order rate limit.

Instead of using “certbot renew”, I’ll change the renewal process so that a script checks whether the domain resolves to the correct IP address and whether it has a Let’s Encrypt certificate installed that expires in less than 30 days. Only then will it attempt to renew it. That way, servers will not send renewal requests for certificates which do not require renewal. Additionally, I’ll change the cron job start times to be different for each VM so that we can have up to 2400 orders per day (24 hours / 3 hours x 300).
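The check I have in mind could look roughly like the Python sketch below. The function names, the example IP addresses, and the way the expiry date reaches the function are all assumptions for illustration; in particular, reading `not_after` from the installed certificate is left to the caller. When the check passes, the script would run “certbot renew --cert-name domain.com” for that one certificate.

```python
import socket
from datetime import datetime, timedelta, timezone


def resolve_ipv4(domain: str) -> set:
    """Return the IPv4 addresses the domain currently resolves to (empty on failure)."""
    try:
        infos = socket.getaddrinfo(domain, None, family=socket.AF_INET)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def should_renew(resolved_ips, server_ip, not_after, now=None, threshold_days=30):
    """Renew only if the domain still points at this server and the cert expires soon.

    not_after must be a timezone-aware datetime taken from the installed certificate.
    """
    now = now or datetime.now(timezone.utc)
    points_here = server_ip in resolved_ips
    expires_soon = (not_after - now) < timedelta(days=threshold_days)
    return points_here and expires_soon
```

For example, `should_renew(resolve_ipv4("domain.com"), "203.0.113.10", not_after)` would return False for a domain that has been moved to another server, so no order is created for it.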

To be able to do more than 2400 orders per day, we’ll need multiple ACME accounts. I’ll read more about that.
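To make the arithmetic explicit, here is a small Python sketch. It assumes the figure of 300 new orders per 3 hours per account quoted above is the one that applies, and that orders are spread evenly over the day; the function names are made up for illustration.

```python
import math

ORDERS_PER_WINDOW = 300  # per-account new-order limit as quoted in this thread
WINDOW_HOURS = 3


def max_orders_per_day(accounts: int = 1) -> int:
    """Upper bound on new orders per day if every 3-hour window is used fully."""
    windows_per_day = 24 // WINDOW_HOURS
    return accounts * windows_per_day * ORDERS_PER_WINDOW


def accounts_needed(orders_per_day: int) -> int:
    """Smallest number of ACME accounts that covers the given daily order count."""
    return max(1, math.ceil(orders_per_day / max_orders_per_day(1)))
```

With these numbers, `max_orders_per_day()` gives 2400, and `accounts_needed(5000)` gives 3.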

Ah. So your cron schedule is not the default one (1 run per 12 hours-ish)?

That would account for your problems. Renewal attempts at that interval will certainly cause issues without extra safeguards as you’ve suggested.

Yup. A new renewal process is now in place which makes sure the domain points to the correct IP and checks that the cert expires in less than 30 days before it attempts to renew. It runs twice per day, every 12 hours. Thanks guys.
