This was probably already discussed and I missed it. But, this is exciting!
Do we know if the standard ACME way of clamping cert lifetimes (NotBefore / NotAfter) will be utilized? How will revocation (CRLs, namely, since LE is discontinuing OCSP) policies relate to these short lived certs?
Please note that this all looks a bit preliminary, which is likely why Let's Encrypt hasn't announced much yet. But I'm guessing that this is more or less how it's going to work.
PS: To expand on this a bit, the idea is that you have multiple profiles configured at the CA, and the subscriber/ACME client chooses one of those. The CA then issues a different certificate depending on the profile. Let's Encrypt has talked about having two profiles in the past: A legacy/old/current profile, which is probably going to be the current certificate profile. This is going to live on for a while, and will most likely be the default profile for clients that do not specify anything.
Then, there's going to be a "modern" profile that has the 6-day lifetime, has no OCSP or CRL endpoints, maybe also no subject DN and possibly other changes for a modern PKI.
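For anyone wondering what "the client chooses a profile" might look like in practice, here is a minimal sketch. My understanding of the draft ACME profiles extension is that the directory advertises profiles under `meta.profiles` and the client names one in a `profile` field of its newOrder request; treat both field names as my reading of the draft. The profile names ("classic", "modern"), their descriptions, and the helper functions are made up for illustration and are not anything Let's Encrypt has published.

```python
from typing import Optional

# Hypothetical directory "meta" object; profile names and descriptions are invented.
EXAMPLE_META = {
    "profiles": {
        "classic": "Current 90-day certificate profile",
        "modern": "6-day lifetime, no OCSP or CRL endpoints",
    }
}


def choose_profile(meta: dict, preferred: list[str]) -> Optional[str]:
    """Return the first preferred profile the CA advertises, or None."""
    offered = meta.get("profiles", {})
    for name in preferred:
        if name in offered:
            return name
    return None  # Send no profile and let the CA apply its default.


def build_new_order(domains: list[str], profile: Optional[str]) -> dict:
    """Build a newOrder payload, adding "profile" only when one was chosen."""
    payload = {"identifiers": [{"type": "dns", "value": d} for d in domains]}
    if profile is not None:
        payload["profile"] = profile  # Assumed field name, per my reading of the draft.
    return payload


order = build_new_order(
    ["example.com"], choose_profile(EXAMPLE_META, ["modern", "classic"])
)
print(order)
```

A client that sends no `profile` at all would simply get whatever the CA treats as its default, which matches the "legacy profile for clients that do not specify anything" idea above.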
The CA/Browser Forum Baseline Requirements do not mandate revocation for "Short-lived Subscriber Certificates". Running a CRL for just these short-lived certs would add a burden on the systems while not being necessary. If I were Let's Encrypt, I wouldn't do revocation for such certs.
I'm probably asking a naïve question (that's probably answered somewhere), but I wonder what the most driving motivations are for certificate revocations in general? Once a certificate has been abandoned by its legitimate users for whatever reason (e.g. "revocation", "expiration"), the only concern I see is forgery of identity via a compromised private key.
If compromises of private keys are/were addressed out-of-band from certificate revocations, I would think that the biggest issue around certificate revocations would be handled (e.g. look up a public key to see if its private key has been compromised before using it for any purpose).
The biggest reason (to me) for the push towards short-lived certificates is to address the shortcomings of CRLs. Under the Baseline Requirements, a revoked certificate can appear valid to browsers for up to 10 days past revocation. Short-lived certs expire before this window closes.
but I wonder what the most driving motivations are for certificate revocations in general?
In terms of security, I believe the driving factor is misissuance and not compromise.
I recall a discussion where Let's Encrypt engineers stated they were unhappy with the notBefore/notAfter spec from RFC8555, because that gives the CA absolutely no leeway to align timestamps:
The server MUST return an error if it cannot fulfill the request as
specified, and it MUST NOT issue a certificate with contents other
than those requested.
Meaning that if an ACME client submits dates in the notBefore/notAfter fields, the CA is not allowed to:
- backdate the notBefore
- round the lifetime to a sensible number of seconds, if the client specified something weird
- clamp the lifetime
A spec-conforming CA has to instead reject the entire order, which isn't great. Let's Encrypt has had some compliance nightmares around the lifetime of certificates and as such would like to avoid too much variability around the lifetime of a cert. This isn't possible with the current specification, unless you reject all orders that do not exactly match the lifetime the CA expects, or you intentionally violate the specification.
Let's Encrypt would instead like to have the duration CA-controlled, not user-controlled, to avoid potential issues around lifetimes as much as possible. This just doesn't work with user-specified dates. Therefore, it's not that likely that LE will ever implement notBefore/notAfter as per RFC8555.
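To make the difference concrete, here is a rough sketch of the two approaches. Everything in it is hypothetical: the single permitted lifetime (`CA_LIFETIME`), the one-hour backdating, and the function names are invented for illustration and do not describe how Let's Encrypt actually works.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical CA policy: exactly one permitted lifetime.
CA_LIFETIME = timedelta(days=6)


class OrderRejected(Exception):
    """Raised when the CA cannot issue exactly what was requested."""


def strict_validity(not_before: datetime, not_after: datetime) -> tuple[datetime, datetime]:
    """Strict reading of RFC 8555: honour the requested dates exactly or reject."""
    if not_after - not_before != CA_LIFETIME:
        raise OrderRejected("requested lifetime does not match CA policy")
    return not_before, not_after


def ca_controlled_validity(now: datetime) -> tuple[datetime, datetime]:
    """CA-controlled alternative: the client supplies no dates at all."""
    not_before = now - timedelta(hours=1)  # Backdating amount is an arbitrary example.
    return not_before, not_before + CA_LIFETIME


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
try:
    # One second too long: the strict CA may not clamp it, so it must reject.
    strict_validity(now, now + timedelta(days=6, seconds=1))
except OrderRejected as exc:
    print("rejected:", exc)

print(ca_controlled_validity(now))
```

In the first function a request that is off by a single second fails the whole order; in the second the client never gets the chance to ask for something weird.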
Also, the ACME profiles extension allows for much more than just adjusting certificate lifetimes. It allows the client to specify a preference for various other things: The CN field has been deprecated for a long time now, and Let's Encrypt would eventually like to get rid of it, without breaking workflows that depend on it. Profiles are a way for users/clients to indicate readiness for such changes, such that the CA knows if it can safely use newer standards.
Let's Encrypt is already a single-point-of-failure for much of the internet, but at least there is a 30- to 90-day buffer to fix problems. (Or, to fix millions of problems.) But certificates that are good for less than a week? What about an outage that hits at this time of the year? Say, at this very moment as I type this, on a Friday afternoon US Eastern Standard Time, on the weekend between two biggest US holidays of the year?? Or even on a normal Monday morning? Scary as hell.
Does Let's Encrypt think their software and servers have some magic dust that makes them bug free and bulletproof? This change would only partially address the narrow problem of idiots leaking their private keys, at the risk of shutting off many millions of certificates that have not been leaked.
I think people need to start getting backup certificates from other CAs. (Who are the good ones? What are the practicalities of running parallel certificates, from both Let's Encrypt and something like zerossl.com?)
Thanks,
-kb, the Kent who has been on a small team responsible for managing many thousands of Let's Encrypt certificates, and who knows things can go wrong.
Hardly; I think they work pretty hard on keeping things up, though.
I think the main push toward really-short lifetime certificates (and thereby not needing to worry about revocation) isn't as much about client private keys, as being able to deal quickly with a CA that broke a rule (usually accidentally) that means that one shouldn't rely on the certificates. I may be wrong on that, though.
I'd say that anyone that actually cares about production availability should already be using certificates from multiple CAs. One doesn't just need to worry about getting new certificates, but uptime for CRL/OCSP and such as well (at least for the classic longer-than-a-week certs). One of the main things that Let's Encrypt has done for the world is pushing for ACME to be a standard, so now anyone can easily switch to another CA just by pointing their client to another CA endpoint.
Current popular free CAs include BuyPass GO, Google, and ZeroSSL. The developer of Certify the Web has a list comparing some, and the publisher of Posh-ACME has their own list.
The most public case I know of is Wikipedia (and the rest of Wikimedia), which has some public documentation though there might be a better link somewhere. They make sure that all their data centers have all certificates loaded, with different datacenters having a different primary one running so that they know all their certs "work" (since they're all being used for live traffic), and are ready to easily switch to another in case one CA has an OCSP outage or other problem.
Cloudflare has been doing this for several years; they now default to Google Trust (more browser compat than the current LE roots) with a random backup that could be LetsEncrypt or another of their partner CAs.
Many ACME Client Authors are working on that functionality. The problem is in aligning the options and features of free CAs with the needs of the clients. Everyone speaks ACME, but has their own API limits and extensions/behaviors.
The "short lived certs" are not immediately replacing 90 day certs, they are being introduced as an alternative to opt-into.
However, the entire industry is moving towards this - it's not just LetsEncrypt.
Increasingly, ACME clients are likely to support backup CAs and automatic CA fallback. We have this feature in Certify The Web: you just add multiple ACME accounts (and optionally a preferred CA) and let the app decide when to fall back to something else.
It will also do some basic checks to try to use a CA which is compatible with the specific type of certs you need (e.g. a wildcard or multiple SAN cert) based on its own knowledge of the CA's features.
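As a very rough sketch of the general idea (not the actual Certify The Web implementation), fallback with a compatibility check could look something like this. The `CA` record, its capability fields, the CA names, and the `request` callback are all hypothetical stand-ins for whatever a real client would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CA:
    name: str
    supports_wildcard: bool
    max_sans: int


def order_with_fallback(cas: list[CA], domains: list[str],
                        request: Callable[[CA, list[str]], str]) -> str:
    """Try each compatible CA in preference order; return the first issued cert."""
    needs_wildcard = any(d.startswith("*.") for d in domains)
    errors = []
    for ca in cas:
        if needs_wildcard and not ca.supports_wildcard:
            continue  # Skip CAs known not to offer this certificate type.
        if len(domains) > ca.max_sans:
            continue  # Too many names for this CA.
        try:
            return request(ca, domains)
        except Exception as exc:  # Any failure moves on to the next CA.
            errors.append(f"{ca.name}: {exc}")
    raise RuntimeError("no CA could issue: " + "; ".join(errors))


def fake_request(ca: CA, domains: list[str]) -> str:
    """Stand-in for a real ACME order; simulates an outage at the first CA."""
    if ca.name == "primary":
        raise ConnectionError("outage")
    return f"cert from {ca.name}"


cas = [CA("primary", True, 100), CA("no-wildcards", False, 100), CA("backup", True, 100)]
print(order_with_fallback(cas, ["*.example.com"], fake_request))  # cert from backup
```

Here the preferred CA is down and the second one can't do wildcards, so the order ends up at the third. The hard part in real life is keeping that capability table accurate for each CA.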
I'm seeing this misunderstanding quite a lot in the wild as well. Unclear whether the people posting inflammatory headlines are doing it on purpose for clicks or not (probably a bit of both).
"Not immediately", but they say that in a few years they expect to be issuing 100,000,000 certificates per day. So they are expecting, one way or another, that people will be using 6-day certificates.
So my worries stand: If in this plan their infrastructure ever has a few days of outage it would bring down much of the internet. And it doesn't have to be a problem with Let's Encrypt, client sites have problems of their own making, and they would have a very short window in which to fix them.
The fact this isn't right away doesn't change that it seems a bad idea.
Ultimately, the fact is that the web PKI ecosystem as a whole is shifting towards short-lived certificates, not just Let's Encrypt. There's not much we as users can do about it. But I believe the web is better for it and has forced the associated software and hosting ecosystem to improve (often times kicking and screaming) both in agility and robustness. I look forward to that trend continuing.
You are assuming these changes are happening in a vacuum without other changes across the ecosystem. They're not. The push for shortened certs is happening alongside a push for redundancy and backup CAs within a domain.
This push is also happening with a goal of triggering ecosystem-wide failures faster as well - a driving factor of shorter certificate lifetimes is that the current CRL mechanisms can allow a revoked certificate to appear as valid for up to 10 days. LetsEncrypt and other CAs want revoking a leaf, intermediate or root cert to be detected ASAP. So there is an ecosystem-wide push for fast failure, fast recovery, and backups all at the same time.
That's currently the case, with 90-day certs, in the sense that the CA needs to sign updated OCSP responses for each issued certificate every half-week. So the CA is already using their intermediates to sign "this certificate is still okay to use" regularly. (I mean, browsers tend to ignore OCSP failures in general, so a failure may not "break the Internet", but a CA OCSP outage is what convinced Wikipedia to always make sure to have certs from multiple CAs running all the time.) Changing that signature to be for a whole new certificate shouldn't really be all that different in terms of infrastructure needed on the CA side (other than how Certificate Transparency will deal with handling that many more certs, which is a Known Problem still being worked on).
That's definitely something to be mindful of, though hopefully the automation needing to be running "all the time" rather than just ~6 times a year will help shake things out such that problems will reveal themselves within a couple of days after, say, a firewall change is made, rather than weeks or months later when it's no longer top of mind. So administrators should be able to go into the holiday season with less staffing and more confidence that their site will continue to work as it had been, rather than certificate renewals being an "event" that they need to plan for.
Certainly many organizations aren't really ready for this right now, but as others are saying the tooling will be catching up, getting certificates from multiple CAs will be the norm, and systems will adapt.
First, even though we may disagree with @kentborg, good on you for playing your role here - disagreement is how we find flaws in arguments and I hope the good faith I'm seeing in this thread from everyone continues.
I want to throw something out there that may have already been covered but may have been missed - yes, an institution with such critical importance as LE having a technical/security outage/incident would be a significant event. A fair point is raised that when a multi-day outage occurs a part of our CIA triangle is lost (let's assume catastrophic - no DR CAs/RAs are available).
My counter to this concern is that certs aren't renewed and rebound seconds before they expire. LE already recommends renewing certs at the 60-day mark, so if we extend this 2/3 logic to a proposed 7-day period (168 hours), a cert is issued at t=0 hours and then at t=112 hours (4.666... days) it is renewed and (presumably) rebound by the consuming service(s). If there is a 48-hour outage at the 112-hour mark, that still leaves an 8-hour period for certs to be renewed.
Now, maybe that 8-hour period is cutting it close. I assume an outage of this kind is something the CAB/F members have been discussing and maybe CAs will recommend renewing certs at the half life instead of the 2/3 life.
Edit: I used 7 days above, oops - adjust the numbers to 6 days in your head.
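For anyone who would rather not adjust the numbers in their head, here is a small sketch of the arithmetic. The function name and the assumption that the outage starts exactly at the renewal point are mine; the 2/3 rule, the 48-hour outage, and the half-life alternative are the ones from the post above.

```python
from fractions import Fraction


def slack_hours(lifetime_days: int, renew_fraction: Fraction, outage_hours: int) -> Fraction:
    """Hours left before expiry if an outage begins right at the renewal point."""
    lifetime = lifetime_days * 24
    renew_at = lifetime * renew_fraction
    return lifetime - renew_at - outage_hours


for days, frac in [(7, Fraction(2, 3)), (6, Fraction(2, 3)), (6, Fraction(1, 2))]:
    print(f"{days}-day cert, renew at {frac}: {slack_hours(days, frac, 48)} hours of slack")
```

This gives 8 hours for the 7-day case, 0 hours for a 6-day cert renewed at 2/3 of its life, and 24 hours for a 6-day cert renewed at the half-life. So with 6 days the 2/3 rule leaves no margin at all for a 48-hour outage.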
I was guessing that renewals in a 6-day world would happen sooner than 2/3, but that's just me. I'm worried things could go wrong, and it can take time to even know there is a problem, that the problem is a cert problem, that it is an expired cert problem, why it expired, how many others are going to expire in the same bad way, how soon, and how to fix it.
Most certificate problems are not because of any problem with Let's Encrypt, but a "user" problem: some users need funny certificate setups because they terminate for multiple domains, and how those domains are defined drives how the certificates are ordered, and there can be cracks in all that. And problems in that can take time to sort out.
I would expect people to be renewing after just two days, or maybe daily. But what do I know? I would also expect people to think that is scary stuff, and no one here sees it that way.
I bring up the possibility of a catastrophic Let's Encrypt failure because no one could deny that is possible, and surely that should catch someone's attention.
Alas, this proposal addresses some revocation problems, so why worry that it might create other problems?
-kb, the Kent who has been part of maintaining a site with a lot more than just one Let's Encrypt certificate.