I just read the news about Let's Encrypt wanting to stop supporting OCSP and it leaves me puzzled because OCSP is the newer technology that was invented after CRLs had proven to be unfit for the task. Even with the browsers supplying the CRLs in "shards" it is still megabytes of plain list data that needs to be constantly updated. Even more, the load and privacy issues have already been solved by stapling. There even is an extension, OCSPMustStaple, that could be utilized to force everyone to use it. It seems to me like the obvious way forward (aside from DANE, the true PKI fix), but now Let's Encrypt comes around with CRLs again. So what's wrong with stapling?
I think the idea now is to keep CRLs as they are, but to have browsers collect them and build their own compressed filter format,
like Introducing CRLite: All of the Web PKI’s revocations, compressed - Mozilla Security Blog
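To make the "compressed filter" idea concrete, here is a toy sketch of a Bloom filter cascade, the kind of structure the original CRLite design describes. It is not Mozilla's implementation: the `BloomFilter` class, the sizing parameters, and the string serials are assumptions for illustration, and the lookup is only exact for certificates that were known (as revoked or valid) when the cascade was built, with the two sets disjoint.

```python
import hashlib


class BloomFilter:
    """A tiny Bloom filter; size and hash count are illustrative, not tuned."""

    def __init__(self, items, bits_per_item=8, hashes=3, salt=0):
        self.size = max(8, bits_per_item * len(items))
        self.hashes = hashes
        self.salt = salt
        self.bits = bytearray((self.size + 7) // 8)
        for item in items:
            for pos in self._positions(item):
                self.bits[pos // 8] |= 1 << (pos % 8)

    def _positions(self, item):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{self.salt}:{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def build_cascade(revoked, valid):
    """Build filter levels until a level produces no false positives."""
    levels = []
    include, exclude = set(revoked), set(valid)
    while include:
        bf = BloomFilter(include, salt=len(levels))
        levels.append(bf)
        # Items from the other set that this level wrongly matches
        # must be encoded by the next level.
        false_positives = {x for x in exclude if x in bf}
        include, exclude = false_positives, include
    return levels


def is_revoked(levels, serial):
    """Exact only for serials that were in `revoked` or `valid` at build time."""
    for depth, bf in enumerate(levels):
        if serial not in bf:
            # A miss at an even depth means "not revoked"; at an odd depth, "revoked".
            return depth % 2 == 1
    # Matched every level: the answer follows the set encoded by the last level.
    return len(levels) % 2 == 1
```

The point of the cascade is that each level only has to correct the false positives of the one before it, so the levels shrink quickly and the whole thing stays far smaller than the raw list.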
Kind of being discussed here as well: What will happen to Must-Staple
Anyway, you are right on pretty much every point. But LE still feels that OCSP is unnecessary operational complexity. (I don't buy their privacy argument though, because any privacy problem is the fault of the end user clients / relying parties, not the CAs operating OCSP responders.)
Indeed, shutting down OCSP makes the privacy benefits of stapling -- and the possibility of automatic server-side detection of revoked certificates for the purposes of replacing them -- impossible. (And no, ARI is not a complete replacement either, because ARI does not tell you the specific information that a cert is being revoked, does not tell you why it was revoked, and only ... 2, I think? CAs currently implement ARI. My fear is that other CAs will continue to follow LE's lead and discontinue their OCSP services too, if LE can get away with it.)
It's all a distraction though. I think we should just leave revocation alone -- it's broken -- until cert lifetimes are shorter (< 7 days).
It also puts even more power in the hands of browser vendors for no good reason, and I think that a public interest CA like Let's Encrypt should ask their users before implementing such drastic changes.
Well, browser vendors have always been able to (and already do) ship their own CRLs. It's basically how they control which certificates to trust (or distrust, rather). Some will even ignore OCSP staples with a status of "Valid" because they think they know better.
(Did I say revocation was broken? It's broken.)
Some?
Not the same test, but I created a cert with must-staple. I did NOT set up my nginx server to staple. SSL Labs confirmed the staple option on the cert and that it was not actually stapled.
None of these cared. They all showed the website with a normal lock icon anyway:
Chrome on Android
Edge on Windows
Safari on iPad
I don't have any exotic security settings on any of those systems
My nginx server did not issue any warning in its log (with 'info' level error logging). No warning from nginx -t
This post is my personal opinion on the matter, and some of my coworkers will disagree with some points here:
OCSP without stapling doesn’t work because implementations fail open because of reliability problems, and therefore don’t meet the security goals either. The privacy implications of OCSP checks are, as discussed, quite bad.
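As a minimal sketch of what "fail open" means here, assuming a simplified client that sees only three possible outcomes of an OCSP lookup (the `OcspResult` enum and `accept_connection` function are hypothetical names, not any browser's API):

```python
from enum import Enum


class OcspResult(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNAVAILABLE = "unavailable"  # timeout, network error, or malformed reply


def accept_connection(result, hard_fail=False):
    """Return True if the client should proceed with the connection."""
    if result is OcspResult.GOOD:
        return True
    if result is OcspResult.REVOKED:
        return False
    # Responder unreachable: a soft-fail client proceeds anyway.
    return not hard_fail
```

With the default soft-fail behaviour, anyone who can block the OCSP request gets the same result as a "good" response, which is why the check adds so little security in practice.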
OCSP with must-staple would indeed fix all the problems, but it is not deployable. There are far too many webservers with no support for it. And because the TLS stack needs to include the staple, that requires updating the webserver and its TLS library. Adoption is incredibly low, and there’s no path to fix that.
In many ways, the goals of OCSP with must-staple are better served by short-lived certificates, which Let’s Encrypt is working on launching soon. Because automating certificate issuance via ACME is already required to use Let’s Encrypt, fewer software changes are required.
While CRLs didn’t work in the ‘90s, the landscape has significantly changed today. The browsers are all working on efficient and effective revocation systems using CRLs as data sources.
More work is needed here, especially for non-browser applications. The techniques browsers are using should be deployed by OS vendors in more places. Traditional CRL checks are still supported, and Let’s Encrypt will work to keep our CRLs small.
But it is clear to me OCSP is an ineffective technical dead-end, and we are all better served by moving on to figure out what else we can do.
We may keep OCSP running for some time for certificates that have the must-staple extension, to help smooth the transition, but at this time we don’t have a plan for how to actually deprecate OCSP: just an intent, publicized to ensure we can all begin to plan for a future without it.
This would be amazing. Do this and most, if not all, the problems around revocation go away. It would be the biggest step forward in Web security since Let’s Encrypt launched. And it would vastly simplify much code and infrastructure.
In case you didn't know, Let's Encrypt has been working on this for a while. The idea is to let clients select between different "profiles" that have different properties. The details haven't been fully announced, but last I heard the idea is to have a "legacy" profile that's about the same as today's default and a "modern" profile.
Eventually a modern profile could offer < 10 day certificates (with the lifetime shrinking even more over time). These certificates then can rely entirely on passive revocation.
Subscribers are not forced onto the new profile - they can select what they want and can probably still get standard 90 day certs. I guess Let's Encrypt expects the majority of users to stay on the old profile and thus wants to shut down OCSP there as well.
All it needs is a CA/B policy. Once browsers start rejecting connections to non-OCSP-stapled sites, I bet it will take days at most for webservers to support it, or for people to switch to different software. The major servers, Apache, Nginx and IIS, support it out of the box.
Maybe, but the shorter certificate lifetimes are, the more dependent the whole internet becomes on working CAs, while misuse of compromised certificates is barely a problem in the real world. The situation is already bad with PKI as it is now. Let's Encrypt is doing a good job, but I don't understand how no one seems to see the general danger of centralizing the web around the CAs and browser vendors. Sure as hell something will break, either because of technical difficulties or because of a political stance.
If they can do that, they can also work on following the standards regarding must-staple. Most likely, the CRL approach will end up with the browser vendors downloading the CRLs from the CAs and running some sort of OCSP equivalent on their own servers.
The news post says "As soon as the Microsoft Root Program also makes OCSP optional, which we are optimistic will happen within the next six to twelve months, Let’s Encrypt intends to announce a specific and rapid timeline for shutting down our OCSP services. We hope to serve our last OCSP response between three and six months after that announcement."
I could understand the reasons for moving to CRLs to revoke intermediates, but for leaves I still think stapled OCSP is the best option.
Hey, all of you!
There's a lot more than webservers relying on TLS! Let's not make overbroad assumptions.
Not the point here. Whatever, it most likely depends on widely used libraries to serve TLS requests, like OpenSSL or GnuTLS, and public trust stores are mostly used by web browsers, followed by mail servers, but those often support DANE as well.
Not really. Their OCSP stapling support is poor in practice. For instance, nginx always lazy-loads the OCSP response, which means that for the first HTTPS request(s) after a server reload, there is no stapled response available. This breaks must-staple. Additionally, nginx is prone to throwing away "good" OCSP responses if garbage is received from a misbehaving OCSP endpoint. This also breaks OCSP must-staple. This can be prevented to some degree with ssl_stapling_verify, but nginx will not consistently retry getting a new OCSP response right away in that case (leaving you without a stapled response for an unknown amount of time). In general, nginx never persists stapled OCSP responses across reloads, which significantly increases the risk of downtime if the OCSP server goes down while nginx is reloaded.
Apache's mod_ssl stapling is known to have similar issues (it is also susceptible to broken OCSP responses, breaking must-staple). I don't know the IIS OCSP or Apache mod_md implementations well enough to say anything about them.
In general, the OCSP stapling support in web servers is "good enough" for non must-staple usage, but for must-staple you currently cannot reliably use any of the major web servers. nginx allows you to script your own solution, but "just fix it yourself" is not an answer. Anyone who deploys OCSP must staple in actual production knows that this is really painful to do today. It's certainly fixable, sure, but the software isn't there today.
What? Caddy does OCSP stapling properly, automatically, without needing any configuration. Saying that it's "really painful to do today" in production is completely disregarding this and frankly, it's a bit disingenuous. OCSP stapling is a solved problem. Large organizations/companies such as Stripe, Framer, and many smaller ones that still deploy OCSP stapling at the scale of hundreds of thousands of certs take advantage of this to great effect to benefit millions and millions of users.
Please read my whole post before taking my words out of context.
Then please show me the production certs of these large organizations that have must-staple set.
Great! So, Caddy has all of these solved?
- OCSP is fetched proactively, not lazy-loaded or on demand
- OCSP responses are persisted across server restarts
- Invalid OCSP responses are disregarded, but do not invalidate older still-valid cached ones. New OCSP responses are fetched ASAP when the OCSP endpoint is back up
If so, I applaud Caddy for having the best OCSP must-staple ready implementation I've seen to date.
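For reference, the three properties listed above can be sketched roughly as follows. This is not how Caddy or any other server implements it: `StapleCache` and its `fetch` callable are hypothetical, and parsing and signature verification of the OCSP response are assumed to happen inside `fetch`, which raises on anything invalid.

```python
import base64
import json
import os
import time


class StapleCache:
    """Keeps the newest still-valid OCSP response, persisted to disk.

    `fetch` is a hypothetical callable returning (response_bytes, next_update),
    where next_update is a Unix timestamp; it raises on a bad or missing response.
    """

    def __init__(self, path, fetch, clock=time.time):
        self.path = path
        self.fetch = fetch
        self.clock = clock
        self.response = None
        self.next_update = 0.0
        self._load()

    def _load(self):
        # Reuse a response saved by a previous process, so a restart
        # does not leave the server without a staple.
        if os.path.exists(self.path):
            with open(self.path) as f:
                data = json.load(f)
            self.response = base64.b64decode(data["response"])
            self.next_update = data["next_update"]

    def _save(self):
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"response": base64.b64encode(self.response).decode(),
                       "next_update": self.next_update}, f)
        os.replace(tmp, self.path)  # atomic swap, no half-written file

    def refresh(self):
        """Call on a timer, well before expiry. Returns True on success."""
        try:
            response, next_update = self.fetch()
        except Exception:
            return False  # keep the older response; the caller retries soon
        if next_update <= self.clock():
            return False  # an already-expired response is not an improvement
        self.response, self.next_update = response, next_update
        self._save()
        return True

    def current(self):
        """Return the response to staple, or None if nothing valid is cached."""
        if self.response is not None and self.next_update > self.clock():
            return self.response
        return None
```

A server would call `refresh()` from a background timer at startup and then periodically, retrying with a short delay whenever it returns `False`, and would serve whatever `current()` returns in the handshake.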
Sorry, I didn’t mean to take anything out of context. I just didn’t disagree with anything else in your post, it was spot on
Yes, in fact.
We implement Ryan Sleevi’s suggestions here: ocsp-stapling.md · GitHub
I thought Wikipedia used it, though in my poking around just now I don't see it but I may be looking in the wrong place. They use multiple CAs, each cert deployed to all their datacenters and different ones as the active one in different regions, so that they can be confident that they all work, and if one CA has an OCSP outage they can switch all their datacenters to a cert from a working CA. So I know they use stapling since they're trying to make sure their users' privacy is maintained, though maybe they don't set must-staple.
Really it should be normal for any true "production"-scale site to have active certificates from multiple CAs, deployed and ready to switch which is being served at a moment's notice, in order to deal with an OCSP outage, or a CA needing to revoke a cert for compliance reasons, or whatever. But I haven't seen many sites that actually care about their availability enough to do it that way.
I searched censys data and censys doesn't know of a single certificate that uses must-staple and contains the string "wikipedia".
Yes, that's what most people do. Use stapling but don't enforce it, so nothing breaks if you have an occasional odd issue. With must-staple set, losing your OCSP response is just as bad as losing the entire cert (if the client actually validates must-staple; many don't).
While this sounds correct, unfortunately we (at my work) see lots of sites (which supposedly are prioritizing high availability) doing it the other way around. They buy unreasonably expensive certs from CAs, but then to keep costs down they buy as few as possible. The entire topic of "are subscribers ready to have their certs revoked" has been extensively discussed recently on security-policy. (Since many CAs use the argument "our customers can't tolerate a timely forced revocation" for non-compliance with BRs).
But yeah in principle you're correct, in an ideal world everyone would have that. But then again, in an ideal world we would just reduce the lifetime of the certs to the OCSP response lifetime and get the same result without needing OCSP at all.
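The equivalence can be put as a rough bound. This is a simplified model under stated assumptions: clients accept any unexpired certificate, and where must-staple is enforced they also require an unexpired "good" response, which an attacker could have captured just before revocation. The function name and the day counts used with it are illustrative only.

```python
def worst_case_exposure_days(cert_lifetime_days, staple_validity_days=None):
    """Upper bound on how long a compromised certificate stays usable.

    Pass staple_validity_days=None to model a client that performs no
    effective revocation check at all.
    """
    if staple_validity_days is None:
        return cert_lifetime_days
    return min(cert_lifetime_days, staple_validity_days)
```

In this model a long-lived cert with enforced stapling and a cert whose lifetime equals the OCSP response lifetime have the same bound, which is the argument for skipping OCSP and shortening the cert instead.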
Oh, very much agreed. Like those examples on the Mozilla security list say, many companies just don't want to mess with production ever, which makes sense, but I feel like if the norm were to just have two (or three) certs loaded into production at all times, and production just picked whatever one wasn't revoked (or the admins were allowed to switch which cert was active without falling afoul of their org's rules for "touching" the server unnecessarily), there would be more uptime everywhere. Even if you want an expensive OV/EV (or even DV) cert as your primary, getting one or two backup DV certs from free CAs seems like it should be relatively easy to set up (at least compared to the risks it mitigates). And the paid CAs have management software that should be able to just automate everything for the organization (that is, one of the benefits to a paid CA is that the CA should be able to take care of everything so you don't need to think about it). Honestly, even if the CA wants to be the sole provider of certs, having it be normal to give organizations two different active certs from different intermediates run out of different datacenters feels like it should be the bare minimum.
But people don't ask me.
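A minimal sketch of the "production just picks whichever one isn't revoked" idea, assuming the revocation status and expiry of each loaded cert are already known to the server (the `Candidate` type, its fields, and the one-day safety margin are all hypothetical):

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str          # label such as "primary-ov" (hypothetical)
    not_after: float   # expiry as a Unix timestamp
    revoked: bool      # latest known revocation status


def pick_active(candidates, now, min_remaining=86400.0):
    """Return the first candidate, in priority order, that is still usable."""
    for cert in candidates:
        if not cert.revoked and cert.not_after - now >= min_remaining:
            return cert
    return None
```

Keeping the list in priority order means the expensive primary is served whenever it is healthy, and the free backup only takes over when the primary is revoked or about to expire.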
Sure, that too.
For what it's worth, I completely agree with @mcpherrinm's comments above. I don't like the idea of moving away from OCSP, and I don't like the idea of clients that don't have access to bundled-and-compressed revocation lists downloading overly large CRLs just to get the status of a single certificate. But I also don't like the current state of the OCSP ecosystem, and the existence of a couple webservers that implement stapling well does not mean that the rest of the ecosystem is willing or able to move that direction. I have much more hope for a future of compressed CRLs and short-lived certs than I do for a future of stapled OCSP.