I remember a discussion about this idea in the early days of the ACME protocol, but it never made it in. One of our colleagues argued that it would be useful because in some cases there could legitimately be a delay between the CA’s knowledge that a certificate must be revoked and the actual revocation event. I think other colleagues found it hard to imagine that the CA could appropriately delay the actual revocation. But in the current incident, we’ve seen a case where the first colleague’s intuition was exactly right.
Fetching OCSP for stapling may help: https://twitter.com/caddyserver/status/1234874273724084226. That way, when a freshly fetched OCSP response says “revoked”, there is still time to renew the certificate before the previous OCSP response expires, so no downtime!
Would it be onerous for Boulder to provide an API endpoint that checks for a certificate’s “revocation pending” status? Perhaps by the serial number or something else that is small in size and does not require the overhead of authentication. Clients like Certbot would check that endpoint every 3 days (considering that the BR states 5 days).
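A minimal sketch of how a client might consume such an endpoint. Everything here is hypothetical: Boulder exposes no such endpoint, and the URL layout, the `revocationPending` JSON field, and the function names are assumptions made purely for illustration. The network call itself is left out; only the polling schedule and the response handling are shown.

```python
import json
from datetime import datetime, timedelta

# Hypothetical URL layout; no such endpoint exists in Boulder today.
STATUS_URL = "https://ca.example/revocation-pending/{serial}"

# The polling interval suggested in the post above.
CHECK_INTERVAL = timedelta(days=3)


def status_url(serial_hex: str) -> str:
    """Build the (hypothetical) unauthenticated lookup URL for a serial."""
    return STATUS_URL.format(serial=serial_hex.lower())


def check_is_due(last_check: datetime, now: datetime) -> bool:
    """True once at least CHECK_INTERVAL has passed since the last poll."""
    return now - last_check >= CHECK_INTERVAL


def revocation_pending(response_body: str) -> bool:
    """Parse the assumed JSON body, e.g. {"revocationPending": true}.

    A missing field is treated as "not pending".
    """
    return bool(json.loads(response_body).get("revocationPending", False))
```

Keying the lookup on the serial number alone keeps the request small and avoids authentication, as the post proposes.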
We’ve had numerous internal discussions about some sort of generalized ‘renewal hinting’ API that clients could use to learn not just about immediate renewal needs (i.e. because of revocation), but also about generally when the CA thinks the client should renew.
We’re planning on writing up some of our internal notes on this topic and presenting them to client developers and the community to get feedback in the near future.
It would be awesome if this could also get standardized (like the ACME protocol). Who knows, maybe even as an ACME extension?
Unfortunately I’m not allowed to edit the post, but it should be clarified that “this event” was referring to revoking-certain-certificates-on-march-4.
What would be the advantage compared to using OCSP and relying on the validity period of the response?
If the website uses OCSP stapling, together with Certbot 1.3.0, which checks the OCSP response, I think any downtime can be avoided:
- When the certificate is revoked, until the new OCSP “revoked” response is signed, the server can still serve the previous, still-valid OCSP response
- The OCSP “revoked” response is published
- Certbot detects it and renews the certificate
- During that time the server continues to serve the old, still-valid OCSP response
If the server fetches a fresh OCSP response every day and runs Certbot every day, then when the certificate is revoked the server has around six days to renew it (because fresh OCSP responses are signed daily and are valid for 7 days; see “OCSP server update frequency and/or schedule?”).
I feel it’s better for the ecosystem to improve OCSP stapling than to extend the ACME protocol in that case (but both solutions can be complementary!): https://letsencrypt.org/fr/docs/integration-guide/#implement-ocsp-stapling
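The “around six days” figure can be checked with a little arithmetic. This is only a sketch: the 7-day validity and daily refresh come from the post above, while the function and variable names are made up for illustration.

```python
from datetime import datetime, timedelta

OCSP_VALIDITY = timedelta(days=7)   # lifetime of each signed response, per the post
FETCH_INTERVAL = timedelta(days=1)  # how often the server refreshes its staple


def time_left_to_renew(last_good_signed_at: datetime, revoked_at: datetime) -> timedelta:
    """Time between revocation and expiry of the last stapled "good" response."""
    expires_at = last_good_signed_at + OCSP_VALIDITY
    return max(expires_at - revoked_at, timedelta(0))


# Worst case: the certificate is revoked just before the next daily fetch,
# so the stapled "good" response is already one day old.
signed = datetime(2020, 3, 1)
worst_case = time_left_to_renew(signed, signed + FETCH_INTERVAL)
```

With these numbers `worst_case` is six days; if the revocation happens right after a fetch, the window is closer to the full seven.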
@tdelmas, that suggestion is very interesting; but do you know if there are any TLS clients that will ignore OCSP stapling and still make their own independent OCSP query?
I’m not aware of any. More testing could be done using a recently expired certificate with browsers that do check OCSP.
Couldn’t Certbot just make the OCSP request to the CA itself, without concern for OCSP stapling by the website?
In @tdelmas’s idea the web server deliberately serves an older OCSP response in order to get clients to continue accepting an already-revoked certificate until the expiration of the OCSP signature.
Yes, with the new version of Certbot, if you set up a cron job that runs every day (or every hour), you can reduce the downtime, because the certificate will be renewed.
But between the revocation and the renewal, some web browsers could fail to connect if you don’t have OCSP stapling too, because they will fetch the OCSP response themselves and get the latest one, with the “revoked” status.
With OCSP stapling, thanks to the overlap between signed responses, you can completely eliminate any downtime.
OK, I reread everything with that context and now understand.
Let me clarify my suggestion: during the renewal process each day, Certbot makes a request to the OCSP URL for each certificate that isn’t being renewed. Maybe that is checked every 3 days or so. If the certificate is reported as revoked, Certbot renews it.
Extending OCSP with a “ProbablyAboutToBeRevoked” status like the OP suggests would be better and more proactive… but just checking OCSP as part of the Certbot renewal would help solve some things.
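The simpler check described above boils down to one extra condition in the renewal decision. The sketch below is not Certbot’s actual code; the function name and the 30-day threshold are assumptions for illustration, and the OCSP status is taken as an already-fetched string.

```python
from datetime import timedelta

# Assumed renewal threshold: renew once fewer than 30 days of validity remain.
RENEW_BEFORE_EXPIRY = timedelta(days=30)


def should_renew(time_until_expiry: timedelta, ocsp_status: str) -> bool:
    """Renew when close to expiry OR when OCSP reports the cert as revoked.

    ocsp_status is one of the three OCSP certificate statuses:
    "good", "revoked", or "unknown".
    """
    if time_until_expiry <= RENEW_BEFORE_EXPIRY:
        return True
    return ocsp_status == "revoked"
```

For example, `should_renew(timedelta(days=60), "revoked")` is true even though the certificate is far from expiry, while `"unknown"` deliberately does not trigger a renewal here.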
The “ProbablyAboutToBeRevoked” status (from an API or an OCSP extension) will solve some problems, if you can ask for a certificate to be revoked in the future or if the CA decides to mark some certificates “to be revoked”, but not when the revocation must be immediate.
I’m not saying it’s a bad idea; I think it’s worth thinking about more! (And an API is probably easier than an OCSP extension.) But OCSP stapling has some other interesting properties that are worth pushing (privacy, speed, cost).
After reading https://bugzilla.mozilla.org/show_bug.cgi?id=1619179#c7 I’m convinced that your “about to be revoked” status is the right solution:
- OCSP stapling helps in 100% of cases and avoids all downtime, BUT it requires the cooperation of all sysadmins to make sure they enable it.
- A “to be revoked” status (either from an API or an OCSP extension) only solves mass CA-initiated revocation, but that MUST be solved, and it only requires the cooperation of ACME client developers (clients that don’t support it could in future be removed from https://letsencrypt.org/fr/docs/client-options/ and users of incompatible or old clients could be notified by email).
So, to minimize disruption on the web the next time a CA has to revoke certificates, both solutions must be pushed, so that the CA can revoke them before the deadline without affecting too many websites, and thus without driving web users to deactivate revocation checks.
There was a blog post recently about how browsers handle OCSP:
(I can’t vouch for its accuracy, though.)
(And I disagree with the performance-over-security tone.)
All the drawbacks of OCSP cited in that article (speed, fallibility, privacy) are solved by OCSP stapling: no new connection to the OCSP responder is required (so it can’t slow things down, fail, or transmit data).
I thought I remembered the article saying that at least one browser did something weird – like checking the OCSP server anyway for some certificates even when the site staples – but skimming it again I guess I was wrong.
You may have read that in one of the articles that is referenced in Matt Hobbs’s blog post, but according to that it only pertains to EV certificates.
How much load would it put on the DB if we start an extension that checks for “will be revoked”? This isn’t something that can be cached, is it? Since it only makes sense to check when there is a mass revocation event (as a user-requested revocation takes effect right away), might a cacheable page saying that no certificates are marked for revocation make sense to reduce server load?
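One way to picture the cacheable-page idea is a two-step check: clients read a single, cache-friendly flag first and only ask about their own serial when that flag is set. This is a sketch of the idea only; both lookups are hypothetical and are passed in as plain callables, so no real CA API is implied.

```python
from typing import Callable, List


def needs_early_renewal(
    fetch_global_flag: Callable[[], bool],
    fetch_serial_status: Callable[[str], bool],
    serial: str,
) -> bool:
    """Query the per-serial lookup only when the cacheable flag is set."""
    if not fetch_global_flag():
        # Common case: answered from the cached page, no per-serial lookup.
        return False
    return fetch_serial_status(serial)


# Stand-in for the expensive per-serial lookup; records every call it gets.
lookups: List[str] = []


def fake_serial_status(serial: str) -> bool:
    lookups.append(serial)
    return serial == "0a1b"
```

With `fetch_global_flag` returning false, `fake_serial_status` is never called, which is the load reduction the question is after; during a mass revocation event the flag flips and per-serial queries begin.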