Continuing the discussion from Rate Limits for Let's Encrypt:
What distinguishes a renewal from a new certificate? The same CSR?
Based on my understanding of the code:
There’s a limit of 20 certificates per registered domain.
If you hit that limit, but are requesting a certificate that matches the list of domains of a previous certificate, that limit doesn't apply. That's what's meant by renewal in this context. It allows you to renew your existing certificates even if you can't get any other new certificates for a while.
There’s a separate limit of 5 certificates per “domain list”, meaning you can’t get more than 5 certificates with the exact same list of domains in a week. That’s probably to prevent buggy automation from causing too much signing load.
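To make the interaction of these two limits concrete, here's a minimal Python sketch of the checks as described above; the constant names and the function are illustrative only, not Boulder's actual code.

```python
# Illustrative sketch only; the numbers match the limits described above.
LIMIT_PER_DOMAIN = 20    # new certificates per registered domain per week
LIMIT_PER_FQDN_SET = 5   # certificates with the exact same name set per week

def may_issue(names, issued_this_week, all_previous_sets):
    """names: DNS names requested for one certificate.
    issued_this_week: name sets issued under this registered domain in the last 7 days.
    all_previous_sets: every name set previously issued (for the renewal exception)."""
    fqdn_set = frozenset(n.lower() for n in names)

    # The per-FQDN-set limit always applies: at most 5 identical certs per week.
    if sum(1 for s in issued_this_week if s == fqdn_set) >= LIMIT_PER_FQDN_SET:
        return False

    # Renewal exception: an exact match with a previously issued set
    # bypasses the per-registered-domain limit.
    if fqdn_set in all_previous_sets:
        return True

    # Otherwise it counts against the 20-per-week registered-domain limit.
    return len(issued_this_week) < LIMIT_PER_DOMAIN
```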
Exactly: every cert that has the same set of FQDNs as another cert issued at least 60 days ago is considered a renewal of the former cert.
@pfg is correct: The exception to the certificates/domain limit is defined in terms of “FQDN sets.” So if you issue a certificate with the same exact set of FQDNs (DNS names) you qualify for the exception.
@nit: A small correction: We actually removed the “at least 60 days ago” part. Now it doesn’t matter when the previous certificate was issued. We decided this was more straightforward to implement and explain, and by pairing the exception with a new rate limit for exact FQDN sets, we could get the same limiting properties we wanted.
Thanks @jsha, this definitely makes Let's Encrypt more suitable to use. I let most of my Let's Encrypt SSL certs expire because I kept hitting the old rate limits for subdomains. Hopefully this new limit will be better.
Does the order of the FQDNs matter when it comes to this exception? Can I also assume that this exception does not apply if it's the same set as before but with one FQDN removed?
Order doesn't matter.
That's correct.
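To illustrate both answers, here's a tiny sketch (a hypothetical helper, not Boulder's actual implementation) showing that the comparison is set-based, so order is ignored but removing a name produces a different set:

```python
def fqdn_set(names):
    # Normalize to a set: lowercase, strip trailing dots, ignore order.
    return frozenset(n.lower().rstrip(".") for n in names)

a = fqdn_set(["www.example.com", "example.com"])
b = fqdn_set(["example.com", "www.example.com"])  # same names, different order
c = fqdn_set(["example.com"])                     # one name removed

assert a == b   # exact same set, qualifies for the renewal exception
assert a != c   # different set, treated as a new certificate
```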
Will any of the rate limits be raised or completely dropped after leaving Beta status?
See the linked thread: We raised the limits shortly before exiting Beta. We don’t have an additional increase planned soon, but as always we’ll keep an eye on things and adapt as necessary.
Okay, then I have to wait to get certificates for all subdomains and do it week by week.
I know I can put up to 100 FQDNs in one certificate, but I use a separate FQDN for each service (instead of a subdirectory), and I think it's better to keep services separated, even in certificates. That way, someone using one service doesn't learn about the others.
Another question:
Will renewals hit the rate limits, too? If so, that means if I have to renew 30 certs I have to spread them over two weeks, and I'm not able to retrieve new certs in week one. Right?
See the linked thread: We now have an exception to the “certificatesPerName” rate limit for issuance requests that look like renewals: i.e., those that use the exact same set of names as a previously issued certificate.
My problem with the limits is that they encourage key reuse. This has been brought up a couple of times, but I still don't understand the reasoning behind it.
I have hosta.derp.com, hostb.derp.com and hostc.derp.com (Let's assume the limits are set to 2 hosts per domain, to make this simple).
The recommended way of doing this, if I'm not mistaken, is to have one certificate that contains all three hosts, right? Then it doesn't hit the 20 host limit (or 2, in this example).
So I now have a single certificate, with a single key, on all three machines. And because I'm good, I've also enabled https key pinning.
But, uh oh. hostc.derp.com has been compromised. Someone's stolen my certificate and my key! If I wasn't forced to recycle the same key, I would be fine (well, I'd have errors connecting to hostc, because the key has changed).
But, now I have to reissue the cert on all three hosts, and all three are now broken until my pinning expires. (Even if I'd re-generated the certificate three times, with a different key, I'd still have to reissue it because the attacker could impersonate my other two hosts)
This is why I don't understand the logic behind the per-domain rate limiting. @jsha could you possibly expand on the reasoning behind this?
The reason for having rate limits at all is not to limit the number of certificates per domain, but rather to limit the signing load on the HSMs that hold the issuer/OCSP key. That resource is limited and, in comparison to other components, significantly harder (read: more expensive) to scale.
Since registered domains are, in a way, a limited resource as well, they make a good entity to key this kind of rate limit on. (This is not 100% true, because some TLDs offer free domains, and private suffixes tend to be free as well, but it's good enough for this purpose.)
As to your example, I would argue that the majority of use-cases where you have more than, say, 20 distinct services hosted on the same parent domain are something like internal microservices, APIs and so on. In such an environment, I would strongly reconsider whether the Web PKI is the best tool for the job. There are a lot of good reasons to run your own PKI in such an environment, and there's some great tooling available for this purpose (e.g. CloudFlare's cfssl, HashiCorp's Vault, etc.). Some past examples, for example in the payment industry, have shown that the Web PKI is actually quite often the wrong choice here.
Not to say that there aren’t any valid use-cases where you’d need a large number of publicly trusted certificates on the same domain, but it’s not a huge concern.
But that's exactly what it does. In fact, at the moment, the most certificates you can have per domain, ever, is 240 (20 certificates per week, roughly 12 weeks in 90 days, so 240).
That's perfectly valid, and if @jsha wants to confirm that THAT is the limiting factor, then of course that's the end of the discussion. However, to me, that sounds counterintuitive. Signing a certificate isn't computationally expensive.
I have a gut feeling that the validation of the hosts is where the resource contention issue lies. But that should be (relatively) trivial to scale, compared to a secure unit that's actually SIGNING the cert.
That was a deliberately contrived example, but that's not the main problem. The Public Suffix List has stopped taking new submissions, because Let's Encrypt has suddenly and unexpectedly turned it from a relatively unimportant, slow-to-update list of domain names into a critical piece of Internet infrastructure.
If you want to use your domain to provide more services than just 'www.mydomain.com' and 'mail.mydomain.com', and you're not on the (currently un-updateable) Public Suffix List, then you just can't use Let's Encrypt.
There was a router manufacturer who posted here around the middle of last year. He wanted to preload all his devices with a valid SSL certificate. That's impossible. Something like https://uuid-goes-here.s3compatibleservice.com is impossible too.
I could simply and trivially register 1-s3compatibleservice.com, 2-s3compatibleservice.com, 3-s3compatibleservice.com, etc, and it would have exactly the same load on LE as if the per-domain limits were removed.
With the per-domain limits, I have to 'cheat' and put MORE load on the backend machines, because I need to hammer the machines with certificate requests as soon as I'm able to, rather than renewing them when needed.
That's why I don't understand the point behind it.
Edit: I should point out, I'm mainly doing IPv6 work, and IoT stuff, which is why this limit keeps bothering me.
Edit 2: Here's a less contrived example. Google is shutting down some Nest thermostats. Let's say I wanted to open source them so people could keep using them. Each Nest device has a unique name and appears through NAT from where it's located. Let's assume that 5000 of them are revived with the open source code. They'll slowly appear as they're re-flashed with the new firmware, and then they'd want a valid SSL certificate, because you don't want someone MitM'ing your house temperature. But because mynestname.opensourcenestcode.org (or whatever) is limited to 240 hosts, it can't be done.
Again, I agree that there are valid use-cases where you just can't solve a problem with the current rate-limits in place. I'm sure Let's Encrypt would love to provide a solution for that as well. It's just not as simple as "throw away the rate limits". The new solution would have to scale while not degrading the service level for all other users. Give it some time and I'm sure something will be worked out.
Dammit, I had missed that one. Thanks. All I can do is sit here and hope these nice new sponsors allow LE to buy some more HSMs!
In your scenario, it sounds like key pinning is the main problem. There's a post about key pinning here. In short: ideally you should be pinning at the root CA level, with multiple CAs. You should also pin an emergency backup key that you keep offline, somewhere safe. If you're following those practices, a compromise and subsequent revocation and issuance with a new key won't block access to your site.
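For what it's worth, here's a minimal sketch of how you could compute a pin-sha256 value for such a backup key, assuming you have its public key in PEM form and the Python `cryptography` library installed; the file name is just an example.

```python
import base64, hashlib
from cryptography.hazmat.primitives import serialization

# Load the backup public key (whose private half stays offline) and compute
# its HPKP pin: base64(sha256(SubjectPublicKeyInfo in DER form)).
with open("backup-key.pub.pem", "rb") as f:
    pub = serialization.load_pem_public_key(f.read())

spki = pub.public_bytes(
    serialization.Encoding.DER,
    serialization.PublicFormat.SubjectPublicKeyInfo,
)
pin = base64.b64encode(hashlib.sha256(spki).digest()).decode()

# This value goes into a pin-sha256="..." directive of your Public-Key-Pins
# header, alongside pins for the CAs you trust.
print('pin-sha256="%s"' % pin)
```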
I'd also like to point out that you can effectively increase the number of certificates you can issue for subdomains on a weekly basis. If you have 40 subdomains and you want to issue independent certificates for each of them, you can issue the first 20 in week 1, and then next 20 in week 2. From then on, you can renew all 40 at once if you want, because of the renewal exception.
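As a quick sketch of that schedule (assuming the 20-new-certificates-per-week limit and 40 hypothetical subdomains):

```python
subdomains = ["host%02d.example.com" % i for i in range(1, 41)]

WEEKLY_LIMIT = 20  # new certificates per registered domain per week
batches = [subdomains[i:i + WEEKLY_LIMIT]
           for i in range(0, len(subdomains), WEEKLY_LIMIT)]

for week, batch in enumerate(batches, start=1):
    print("week %d: issue certificates for %d new names" % (week, len(batch)))

# After the initial issuance, renewals of the exact same FQDN sets are exempt
# from the per-domain limit, so all 40 can be renewed in the same week.
```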
To answer your question about why rate limits: Yes, HSMs are our main bottleneck. Database space and write volume is a secondary bottleneck, and we're starting to find that write volume to CT logs is going to be a bit of a bottleneck too.
Providing a free service to the entire Internet is hard, since it's very easy for anyone to take advantage of the service. In order to stay free forever and encrypt the whole web, we need to find ways to provide everyone the services they need at the lowest possible cost to ISRG, a non-profit. Rate limits are a part of that.
I have a horrible feeling that I may have a fundamental misunderstanding of the rate limits. Are you saying that once a host is registered, it is then excluded from the rate limits? I was under the impression that the maximum number of hosts you could have per domain was 240 (mentioned above - 20 hosts per week, with roughly 12 weeks in 90 days = 240 hosts), but now I'm thinking that I've misunderstood - are you saying the rate limits are only for new hosts?
If that is correct, then that's perfect. And it would even fit with my non-contrived Nest example, unless more than 20 people brought devices online in a week (at which point they'd have to wait a few days for the older entries to expire from the end of the rate limit queue)
Close. I'm saying that the Certificates/Domain rate limit (the one most people hit) is for new sets of hosts. So for instance if you issue [a.example.com, b.example.com], then [a.example.com, b.example.com] again, that second issuance gets a free pass on the Certificates/Domain limit*. However, if you later issue [a.example.com] that doesn't count as an exact match with the previous certificates, and doesn't get the free pass.
*But can still run afoul of the Certificates/FQDNSet limit, if you issue several times in a week.
That sounds like I am totally wrong about the rate limits then. Which does make sense, because my interpretation just seemed flawed.
So, if I may just clarify, with another contrived example (Sorry!)
Let's say I've decided that Route53 is charging too much at 50c/domain, and I'm going to give away DNS hosting for nothing, but make it up in volume. To do this, I'm going to need a bunch of nameservers, and a bunch of machines.
Because I like scalability, I deploy one new machine per day, with the FQDN of {{random uuid}}.superawesomedns.com, but because I can't write APIs, every host needs to expose an HTTPS service to the end user.
I use the standard letsencrypt client, and http auth, to get a unique key and cert per host. That then has a cronjob that renews the cert whenever required.
Because I'm not hitting the 20 hosts per week, that should be fine? Or am I going to hit another rate limit somewhere?
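For the renewal cron job part, here's a minimal sketch of the "renew whenever required" check, assuming the certificate is on disk in PEM form and the Python `cryptography` library is available; the path and the 30-day threshold are just examples.

```python
from datetime import datetime, timedelta
from cryptography import x509

RENEW_BEFORE = timedelta(days=30)  # renew once fewer than 30 days remain

def needs_renewal(cert_path):
    with open(cert_path, "rb") as f:
        cert = x509.load_pem_x509_certificate(f.read())
    return cert.not_valid_after - datetime.utcnow() < RENEW_BEFORE

if needs_renewal("/etc/letsencrypt/live/example.com/cert.pem"):
    # At this point the cron job would invoke the ACME client to reissue.
    print("time to renew")
```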
Re-reading your original post in the other thread, that's obviously where my disconnect happened.