Ability for Automated Notification of Revocations

One of the tenets of Let’s Encrypt is automation, and this event exposes a very big spot where automation would make complete sense and could spare a lot of problems: there should probably be a mechanism to (safely) retrieve a “ProbablyAboutToBeRevoked” warning for a certificate, along with the probable revocation time, maybe as an extension to both CRL and OCSP, to OCSP only, or to ACME.

Some software (maybe the ACME client, maybe the server’s OCSP stapling functionality, or some distinct software) would then check daily or more often for this property and handle it automatically (by default I would warn the administrator and give them some time to fix it manually before renewing the certificate automatically, unless too little time is left, and make them aware of every action taken and its result).
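
The default handling described above could be sketched roughly as follows. This is only an illustration: the function name, the two-day grace period, and the twelve-hour safety margin are assumptions made for the example, not part of any existing client or protocol.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical policy values; neither is defined by any existing protocol.
GRACE_PERIOD = timedelta(days=2)     # time left to the administrator for a manual fix
SAFETY_MARGIN = timedelta(hours=12)  # renew at least this long before revocation


def decide_action(now: datetime, warned_at: datetime,
                  revocation_time: Optional[datetime]) -> str:
    """Return "none", "warn_admin", or "renew_now" for one certificate."""
    if revocation_time is None:
        return "none"        # no pending-revocation warning was published
    if revocation_time - now <= SAFETY_MARGIN:
        return "renew_now"   # too little time left to wait for a manual fix
    if now - warned_at >= GRACE_PERIOD:
        return "renew_now"   # the administrator's grace period has elapsed
    return "warn_admin"      # keep warning and report every action taken


now = datetime(2020, 3, 3, tzinfo=timezone.utc)
print(decide_action(now, now, now + timedelta(days=1, hours=20)))
```

Running the check daily or more often, as suggested, keeps the decision cheap, and the administrator can be told about each returned action.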

Making it a CRL and OCSP extension would have the minor benefit of allowing the general public to see it; the browsers and other clients might want to give a minor warning if they detected the situation (the mainstream browsers would probably prefer to not show anything).

CRLs already have the InvalidityDate extension, but it has a different meaning and is documented only for past times, so it would probably be a bad idea to repurpose it for this.

I’ve moved your post to a separate thread and provided a title to facilitate conversation on your idea outside of the larger post about March 04 revocations.

I remember a discussion about this idea in the early days of the ACME protocol, but it never made it in. One of our colleagues argued that it would be useful because in some cases there could legitimately be a delay between the CA’s knowledge that a certificate must be revoked and the actual revocation event. I think other colleagues found it hard to imagine that the CA could appropriately delay the actual revocation. But in the current incident, we’ve seen a case where the first colleague’s intuition was exactly right.

Related: https://github.com/certbot/certbot/issues/1028#issuecomment-593985897 and:

Fetching OCSP for stapling may help: https://twitter.com/caddyserver/status/1234874273724084226 That way, when a fetched OCSP response says “revoked”, there is still time to renew the certificate until the previous OCSP response expires, so no downtime!

Would it be onerous for Boulder to provide an API endpoint that checks for a certificate’s “revocation pending” status? Perhaps by the serial number or something else that is small in size and does not require the overhead of authentication. Clients like Certbot would check that endpoint every 3 days (considering that the BR states 5 days).
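
To make the timing concrete, here is a minimal sketch of the client side. No such endpoint exists today: the JSON shape and the `revocationPending` field are invented for illustration, and the calculation assumes the CA raises the flag at the start of the 5-day window.

```python
import json
from datetime import timedelta

POLL_INTERVAL = timedelta(days=3)        # interval suggested in the post
REVOCATION_DEADLINE = timedelta(days=5)  # deadline the post attributes to the BR


def worst_case_notice(poll: timedelta = POLL_INTERVAL,
                      deadline: timedelta = REVOCATION_DEADLINE) -> timedelta:
    """Time left when the flag is raised just after one poll and seen at the next."""
    return deadline - poll


def parse_status(body: str) -> bool:
    """Parse a hypothetical response like {"serial": "...", "revocationPending": true}."""
    return bool(json.loads(body).get("revocationPending", False))


print(worst_case_notice())
print(parse_status('{"serial": "03ab", "revocationPending": true}'))
```

Under those assumptions a client polling every 3 days would still have at least 2 days to renew before the deadline.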

We’ve had numerous internal discussions about some sort of generalized ‘renewal hinting’ API that clients could use to learn not just about immediate renewal needs (i.e. because of revocation), but also about generally when the CA thinks the client should renew.

We’re planning on writing up some of our internal notes on this topic and presenting them to client developers and the community to get feedback in the near future.

It would be awesome if this could also get standardized (like the ACME protocol). Who knows, maybe even as an ACME extension?

Unfortunately I’m not allowed to edit the post, but it should be clarified that “this event” was referring to revoking-certain-certificates-on-march-4.

What would be the advantage compared to using OCSP and relying on the validity period of the response?

If the website uses OCSP stapling, then with Certbot 1.3.0, which checks the OCSP response, I think any downtime can be avoided:

  • When the certificate is revoked, the server can still serve the previous valid OCSP response until the new “revoked” response is signed
  • The “revoked” OCSP response is published
  • Certbot detects it and renews the certificate
  • During that time, the server continues to provide the old valid OCSP response

If the server fetches a fresh OCSP response every day and runs Certbot every day, then when the certificate is revoked the server has around six days to renew it (because fresh OCSP responses are signed daily and are valid for 7 days: OCSP server update frequency and/or schedule?)
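
The “around six days” figure can be checked with a little arithmetic. The sketch below is a simplified model, not a description of how any responder or server actually schedules things: it assumes the last “good” response held by the server was signed at most one signing interval before it was fetched, and fetched at most one fetch interval before the revocation.

```python
from datetime import timedelta


def renewal_window_bounds(validity: timedelta = timedelta(days=7),
                          signing_interval: timedelta = timedelta(days=1),
                          fetch_interval: timedelta = timedelta(days=1)):
    """Return (worst, best) remaining validity of the last "good" response at revocation time."""
    oldest_age = signing_interval + fetch_interval  # stalest response the server may hold
    return validity - oldest_age, validity


worst, best = renewal_window_bounds()
print(worst, best)
```

With the defaults, the window falls between five and seven days, which is consistent with the estimate of about six days.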

I feel it’s better for the ecosystem to improve OCSP stapling than to extend the ACME protocol in that case (but both solutions can be complementary!): https://letsencrypt.org/fr/docs/integration-guide/#implement-ocsp-stapling

@tdelmas, that suggestion is very interesting; but do you know if there are any TLS clients that will ignore OCSP stapling and still make their own independent OCSP query?

I’m not aware of any, but more testing could be done using a recently expired certificate with browsers that do check OCSP.

Couldn’t certbot just make the OCSP request itself to the CA, without concern for ocsp stapling by the website?

In @tdelmas’s idea the web server deliberately serves an older OCSP response in order to get clients to continue accepting an already-revoked certificate until the expiration of the OCSP signature.

Yes, with the new version of Certbot, if you set up a cron job that runs every day (or every hour), you can reduce the downtime, because the certificate will be renewed.

But between the revocation and the renewal, some web browsers could fail to connect if you don’t have OCSP stapling too, because they will fetch the OCSP themselves and get the last one with the “revoked” status.

With OCSP stapling, thanks to the overlap between signed responses, you can completely eliminate downtime.

ok, i reread everything with that context and now understand.

let me clarify my suggestion: during the renewal process each day, certbot makes a request to the OCSP url for each non-renewing certificate. maybe that is checked every 3 days or so. if the certificate is reported as revoked, certbot renews it.

extending ocsp with a “ProbablyAboutToBeRevoked” like the OP suggests would be better and more proactive… but just checking ocsp as part of the certbot renewal would help solve some things.
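
A rough sketch of that check, independent of how Certbot actually implements it: the function and parameter names here are made up, and the OCSP lookup is passed in as a callable so the example does not depend on any particular library.

```python
from typing import Callable, Iterable, List


def certificates_to_renew(cert_paths: Iterable[str],
                          ocsp_status: Callable[[str], str]) -> List[str]:
    """Return the certificates whose OCSP status is reported as "revoked"."""
    to_renew = []
    for path in cert_paths:
        try:
            status = ocsp_status(path)  # expected values: "good", "revoked", "unknown"
        except Exception:
            continue  # responder unreachable: try again on the next run
        if status == "revoked":
            to_renew.append(path)
    return to_renew


# Usage with a stand-in lookup table instead of a real OCSP query.
fake_statuses = {"a.pem": "good", "b.pem": "revoked"}
print(certificates_to_renew(fake_statuses, fake_statuses.__getitem__))
```

Only certificates that are not already due for renewal would need to go through this check.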

The “ProbablyAboutToBeRevoked” status (from an API or an OCSP extension) will solve some problems, such as when you can ask for a certificate to be revoked in the future or when the CA decides to mark some certificates “to be revoked”, but not when the revocation must be immediate.

I’m not saying it’s a bad idea; I think it’s worth thinking about more! (And an API is probably easier than an OCSP extension.) But OCSP stapling has some other interesting properties that are worth pushing (privacy, speed, costs).

After reading https://bugzilla.mozilla.org/show_bug.cgi?id=1619179#c7 I’m convinced that your “about to be revoked” status is a good solution:

  • OCSP stapling helps in 100% of cases and avoids all downtime, BUT it requires the cooperation of all sysadmins to ensure they enable it.
  • A “to be revoked” status (either from an API or an OCSP extension) only solves mass CA-initiated revocation, but that MUST be solved, and it only requires the cooperation of ACME client developers (clients that don’t support it in the future could be unlisted from https://letsencrypt.org/fr/docs/client-options/ and users of incompatible/old clients could be notified by email)

So in order to minimize the disruption on the web the next time a CA has to revoke some certificates, both solutions must be pushed, so that the CA can revoke them before the deadline without affecting too many websites, which avoids pushing web users to deactivate revocation checks.

There was a blog post recently about how browsers handle OCSP:

(I can’t vouch for its accuracy, though.)

(And I disagree with the performance-over-security tone.)

All the drawbacks of OCSP cited in that article (speed, fallibility, privacy) are solved by OCSP stapling: no new connection to the OCSP responder is required (so it can’t slow things down, fail, or transmit data).

I thought I remembered the article saying that at least one browser did something weird – like checking the OCSP server anyway for some certificates even when the site staples – but skimming it again I guess I was wrong.