Public beta rate limits



I was surprised when I got a “Too many certificates already issued for: eviltrout.com” error today while trying to generate one for a subdomain.

According to what is posted in this topic, I should be allowed to create 5 in a week. I feel like I’ve created a maximum of two and hit this limit. I was testing refreshing certificates to make sure my auto update worked – does that count towards the limit?

If so I’d love the ability to fill out a form and increase my limit, if only temporarily. I want to encrypt more stuff!

According to crt.sh, you created five certificates yesterday. Refreshing certificates is still issuing new certificates.


Looks like you have your 5 all together …

https://crt.sh/?Identity=%eviltrout.com%&iCAID=7395

2016-01-21 19:36:25 GMT (not valid after 2016-04-20) CN=eviltrout.com
2016-01-21 20:03:42 GMT (not valid after 2016-04-20) CN=eviltrout.com
2016-01-21 20:04:25 GMT (not valid after 2016-04-20) CN=eviltrout.com
2016-01-21 20:05:07 GMT (not valid after 2016-04-20) CN=eviltrout.com
2016-01-21 20:17:46 GMT (not valid after 2016-04-20) CN=eviltrout.com

I do sometimes think a block at 3 per 2 hours would at least give people a warning …

I’d love to get a PR together to add multiple tiers to the certificatesPerDomain limit, or to all limits in general. I completely agree: it’d be great to be able to code it as “3 per 2 hours, X per week, Y per 30 days”, for example. It’d go a long way toward fixing this kind of problem, as you say.

I still feel like I need to go build a proper model to come up with the right values for X and Y, and find other “better” (less punitive) rate limit queries, though.
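For illustration, here’s a minimal sketch of what a multi-tier limit could look like, assuming a simple in-memory timestamp log. Boulder’s real implementation is in Go and backed by its database; the class name and tier values here are hypothetical.

```python
import time
from collections import deque

class TieredRateLimiter:
    """Sketch of a multi-tier rate limit: an issuance is allowed only
    if it stays under *every* (max_count, window_seconds) tier."""

    def __init__(self, tiers):
        # tiers: list of (max_count, window_seconds) pairs
        self.tiers = tiers
        self.events = deque()  # timestamps of past issuances

    def allow(self, now=None):
        now = time.time() if now is None else now
        longest = max(window for _, window in self.tiers)
        # drop events older than the longest window; they can't matter anymore
        while self.events and self.events[0] <= now - longest:
            self.events.popleft()
        for max_count, window in self.tiers:
            recent = sum(1 for t in self.events if t > now - window)
            if recent >= max_count:
                return False  # some tier is exhausted; don't record the event
        self.events.append(now)
        return True

# e.g. "3 per 2 hours, 5 per week, 12 per 30 days" (values illustrative)
limiter = TieredRateLimiter([(3, 2 * 3600), (5, 7 * 86400), (12, 30 * 86400)])
```

The short tier smooths bursts while the long tiers cap total volume, which is exactly the “warning before the weekly wall” behavior described above.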


@jcjones wouldn't we need to know the upper limit of what you can (or want to) handle without rate limiting first, to determine a better formula for a PR?

if you kept logs and records of all Letsencrypt authorisation requests that failed due to the 'too many certificates already issued' rate limit, you could tabulate the frequency and times of those failed requests to get a sense of the actual patterns of load. Then come up with a formula that would better satisfy those patterns within the confines of what you can technically handle load-wise.

also you might need to account for data from staging issuance, to get a feel for the request/authorisation rates there too.

maybe narrow the focus of your data tabulation to domains which have >=5 issuance requests per 7-day period, and see how many of those domains have failed requests due to rate limiting over that 7-day period?
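That tabulation could be sketched roughly like this, assuming the issuance log is available as (domain, timestamp) pairs; the function name and log format are hypothetical, not anything Boulder exposes.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def domains_over_limit(issuance_log, limit=5, window_days=7):
    """issuance_log: iterable of (domain, datetime) pairs.
    Returns the set of domains that hit `limit` issuances inside
    any rolling `window_days` window."""
    by_domain = defaultdict(list)
    for domain, ts in issuance_log:
        by_domain[domain].append(ts)

    flagged = set()
    window = timedelta(days=window_days)
    for domain, times in by_domain.items():
        times.sort()
        # slide over each run of `limit` consecutive issuances
        for i in range(len(times) - limit + 1):
            if times[i + limit - 1] - times[i] <= window:
                flagged.add(domain)
                break
    return flagged
```

Cross-referencing the flagged set against the failed-request log would then show how many limited domains were legitimately busy versus stuck in a loop.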

Those are two separate steps IMO; the first one is building the infrastructure to support multi-tiered rate limits, the second one would be to find appropriate configuration values (i.e. how many tiers? what are the individual limits?). Those shouldn't be part of the code.

yeah wasn’t saying as part of the code but just steps to take to get to such a formula :slight_smile:

Agreed; they’re separate steps.

My goal is to design something that produces a smooth-ish growth rate to the maximum load without letting abusers consume undue resources. @jsha has an idea of building a kind of reputation effect into it, which I find really interesting.

Using data from the authorization requests is part of what I’d want as input into the model, though there’s a handful of IP blocks in tight loops out there today producing over 95.2% of the authorizations, and other blocks spamming for new certs in tight loops, just pegging the limits constantly. Some of them @jmhodges figured out how to contact, and some of those have been combative. shrug.

It’s kind of remarkable how many of our users reissue at every opportunity they have. Here’s a chart, where the X axis is how many times an FQDN has been re-issued, and the Y axis is the quantity of certs issued with repeat FQDNs. (Query against CT logs is here, from which I pulled this)

Here it is with a log scale for certs issued:

The raw data; note that it exceeds the total count because of re-counting SAN certs:

  1. would the current system be able to handle those abusive IP blocks' request volume if it wasn't rate limited?
  2. FQDN reissues might be legit, though. For my Centmin Mod LEMP stack, the Nginx vhost generator is set up to create Nginx vhost and optional SSL vhost pairs per FQDN, i.e. set up site forum.domain.com on one vhost, another site on blog.domain.com, another on shop.domain.com, etc., and they'd all be unique vhost sites with their own web roots. As such, a SAN SSL cert wouldn't be an option: the only person who knows how many FQDN sites are set up, and more importantly when they are set up, is the end user who installed my LEMP stack. I can't automate and script for that in an Nginx vhost generator which integrates a Letsencrypt client.

I am already hitting up against the public rate limits; in a few days a few of my Letsencrypt SSL certs will expire before I can renew them, as I'd have to wait 7 days to get past the public rate limits. Of course these sites are solely for testing. But what if they were live sites!

We need more precise mechanisms than banning an IP, certainly. Right now the rate limit’s all that is protecting the database from the bad validations flooding tables. This sort of thing is always a problem for a high-profile service.

FQDN reissues are often legit; I don’t mean to indicate these aren’t. There’s just more of them than I figured there would be. And some of them are not legit, based on log reviews. There’s a couple hosts out there that are sending the same CSRs to /acme/new-cert every minute, and that’s not friendly behavior.

I’m sympathetic about hitting the rate limits as-is; they suck. We totally agree they’re affecting too many legitimate use cases.

There was a Docker/Nginx/LE image that didn’t cache its cert anywhere at all, so if you restarted the container it’d fetch a new cert. If it entered a restart loop, it’d fetch new certs each time. I supplied a patch and it’s better now, but that’s just one that I caught. :confused: Rate limits are critical to cover those sorts of cases, though that would preferably have been caught by a (currently nonexistent) more precise certificatesPerFQDN limit rather than the coarse certificatesPerDomain limit.
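To illustrate the difference between those two limits, here’s a small sketch of the two counting keys. The `registered_domain` helper is deliberately naive (an assumption for illustration); real code would need the Public Suffix List, since e.g. `co.uk` breaks a last-two-labels rule, and a certificatesPerFQDN limit is hypothetical here.

```python
from collections import Counter

def registered_domain(fqdn):
    # naive sketch: take the last two labels as the registered domain;
    # a real implementation must consult the Public Suffix List
    return ".".join(fqdn.split(".")[-2:])

issued = ["forum.eviltrout.com", "blog.eviltrout.com", "forum.eviltrout.com"]

# certificatesPerDomain-style key: every subdomain shares one bucket
per_domain = Counter(registered_domain(f) for f in issued)

# certificatesPerFQDN-style key: each exact name gets its own bucket
per_fqdn = Counter(issued)
```

The coarse key sees three certs against eviltrout.com, while a per-FQDN key would only count two for the looping forum.eviltrout.com name, leaving the other subdomains unaffected.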


yeah, I was guilty too early on, when I accidentally set my cronjob renewal to every second heh

no possible solutions to put in place before requests hit the actual database? i.e. Redis, memcached, Elasticsearch or Sphinx as a layer (or clustered layer) in front of the actual database? maybe also account for actual backend database load at request time, for a dynamic rate limit/flood gate?

maybe a layered approach, like how Nginx does rate limits with an average rate and a burst limit. So maybe you can have a burst limit which denies those massive abusive flood-type requests?
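An average-plus-burst limit of the kind Nginx’s limit_req module implements is usually a token bucket. A minimal sketch, with hypothetical parameters and no claim about how Boulder actually does it:

```python
import time

class TokenBucket:
    """Average rate with a burst allowance: tokens refill at `rate`
    per second up to `burst`; each request spends one token or is
    rejected. A flood burns the burst quickly, then is throttled to
    the average rate."""

    def __init__(self, rate, burst, now=None):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start full: an initial burst is allowed
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # refill according to elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Usage: `TokenBucket(rate=1, burst=3)` admits three back-to-back requests, then settles to one per second.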

Certainly. Keep in mind, we built this thing in a year with like, 3.5 developers. And I had to switch to building out the infrastructure halfway through. :wink:

I think we did what many called impossible, and now that we've proven it's possible and aren't on a shoestring budget, much refinement is to come.


indeed… guess there’s more good things to come…

you could also have a connection/request backlog queue of some kind in front of the caching layer, to essentially serialise the massive flood of requests down to a more manageable rate entering the database from the caching layer

so client would be like

detected 300 requests per second.. rate limiting issuance to 10 requests per second... requests 1-10 processed... requests 11-20 processed... requests 21-30 processed...

email/MTA servers do queuing by batches nicely :slight_smile:
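The batching idea above could be sketched like this, assuming queued requests can simply be drained in fixed-size batches the way an MTA drains its mail queue (function name and batch size are illustrative):

```python
from collections import deque

def drain_in_batches(requests, batch_size=10):
    """Serialise a flood of queued requests into fixed-size batches,
    so the layer behind the queue only ever sees `batch_size`
    requests at a time."""
    queue = deque(requests)
    batches = []
    while queue:
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        batches.append(batch)  # in real life: process batch, then sleep/yield
    return batches
```

A real implementation would process each batch against the backend and pace itself between batches rather than collecting them all up front.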

Oh and if on Amazon EC2 there’s ElastiCache https://aws.amazon.com/elasticache/


I think part of the problem was the lack of an autorenewal mechanism. Once one is introduced, there will be less need for rate limits.

You mean some sort of a “renew-if-needed” command?

Kinda like “--keep-until-expiring”, which will only process a renewal request if the certificate is within 30 days of expiration (configurable)?

Found the --keep-until-expiring parameter after I wrote my comment and managed to put together a simple (but far from perfect) script for myself. I didn’t find much documentation mentioning this parameter, so I didn’t know it had been implemented until recently.
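A renew-if-needed check in the spirit of --keep-until-expiring can be sketched as below. Extracting the certificate’s notAfter date (e.g. with `openssl x509 -enddate` or a crypto library) is left out, and the function name and threshold are illustrative, not the client’s actual code:

```python
from datetime import datetime, timedelta

def should_renew(not_after, threshold_days=30, now=None):
    """Only request a new certificate when the current one is within
    `threshold_days` of its expiry; otherwise keep it and skip the
    issuance (and the rate limit hit) entirely."""
    now = now or datetime.utcnow()
    return not_after - now <= timedelta(days=threshold_days)
```

Running a check like this from a daily cron job makes an every-second or every-restart renewal loop harmless, since almost every run becomes a no-op.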

Of course, some of the accidental renewals might be avoided if the client brought up an “are you sure” prompt when the user is about to apply for a new certificate which already exists (made with the same configuration options as the live one) and was created within the rate limit window. Like the problem where people use certonly and then think the install option will simply resume where certonly left off (and they don’t think the --keep parameter will have any effect).


yeah that confused me too !

I'm a little confused here, after reading the posts I think the Rate Limit is 5 every 3 hours right?
I'm using letsencrypt-win-simple to generate certificates for my IIS server.

Initially I was playing with it and then I accidentally deleted the certificates. I went back to run the script again and it said:

Error creating new cert :: Too many certificates already issued for:

So I waited for 3 hours. I'm trying again now after 4 hours and it's still giving me the same error.

What am I missing here?

EDIT: No matter, I figured it out: 5 certificates in a 7-day rolling period.