Guide to best practices for ACME clients

mholt · March 10, 2020, 5:08pm

Over the last few months, I’ve worked in collaboration* with several experts in our niche field of TLS development+deployment to produce the first codified set of guidelines for automated TLS certificates:

https://docs.https.dev/acme-ops

Over time, the site will continue to grow in scope and fill with more useful content.

I hope it will be of use to any ACME client developers out there!

* Disclaimer: I do not represent any other individuals, who collaborated in unofficial capacities.

tdelmas · March 10, 2020, 5:35pm

Thank you for sharing that work, it’s an impressive piece!

Some quick feedback:

https://docs.https.dev/acme-ops#use-one-name-per-certificate

It may be worth indicating that having multiple names in the same certificate can increase performance, because there is no need to negotiate a new TLS connection if it’s the same IP/server behind all names.

https://docs.https.dev/acme-ops#staple-ocsp-responses-to-certificates

It may be worth stressing that the OCSP response should be renewed well before the last minute (halfway through the validity period is recommended by https://gist.github.com/sleevi/5efe9ef98961ecfb4da8 ), so that in case of revocation, the old stapled response is still valid long enough to leave time to renew without downtime. Related: https://github.com/certbot/certbot/issues/1028 (I have in mind the case of a mass revocation, where a lot of clients renew at the same time and some renewals may fail because of the overload, so there needs to be a safety net of at least a day.)
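To make the "halfway through the validity period" rule concrete, here is a minimal Python sketch. The function names (`staple_refresh_time`, `needs_refresh`) and the 7-day validity window are my own assumptions for illustration; they are not taken from any ACME client, and a real client would read `thisUpdate` and `nextUpdate` from the parsed OCSP response.

```python
from datetime import datetime, timedelta, timezone


def staple_refresh_time(this_update: datetime, next_update: datetime) -> datetime:
    """Return the midpoint of the OCSP response's validity window."""
    if next_update <= this_update:
        raise ValueError("next_update must be later than this_update")
    return this_update + (next_update - this_update) / 2


def needs_refresh(this_update: datetime, next_update: datetime, now: datetime) -> bool:
    """True once the current time has reached the midpoint of the window."""
    return now >= staple_refresh_time(this_update, next_update)


# Hypothetical response: valid for 7 days starting 1 March 2020 (UTC).
this_update = datetime(2020, 3, 1, tzinfo=timezone.utc)
next_update = this_update + timedelta(days=7)

print(staple_refresh_time(this_update, next_update))  # 2020-03-04 12:00:00+00:00
```

With a 7-day response, refreshing at the midpoint leaves 3.5 days of slack, which covers the "at least a day" safety net even if the first few refresh attempts fail.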

https://docs.https.dev/acme-ops#dont-use-muststaple-by-default

Maybe add a line explaining why/when must-staple can be useful with regard to the threat model, as it makes it possible to ensure that revocation is effective?

mholt · March 10, 2020, 5:53pm

Good point, although honestly, in years of practice I have never known of substantial benefits from this -- or heard even one request/complaint about this. For example, a CDN with 100 names on a cert could in theory benefit if the client happens to visit many of those sites but usually the names on the cert are not related, and if they are, the client might only visit a couple of them. So we're talking a time savings of a few ms every few minutes. It definitely can be an optimization, but it's a niche one to be sure.

I like this idea. It's interesting how prophetic both that issue and this doc were in the case of the recent LE bug (by pure chance, of course -- which also extends to single-SAN certs by the way!). In fact, we've since proven this advice by experience, as Caddy/CertMagic were (AFAIK?) the only clients guaranteed to be unaffected by the CAA rechecking bug (first, because their certs are not multi-SAN certs, and second, even if they were affected, Caddy/CertMagic would renew the impacted certificate before the Valid staple expired).

There is at least one caveat, and that is some clients don't honor OCSP staples over their own revocation lists. For example, Safari (I think?) -- actually, Apple as a vendor -- dispatches their own revocation lists to their clients, and I believe they receive priority over a signed and stapled OCSP response, even! But yes, definitely if the server is watching the OCSP status and finds a revocation, doing an immediate replacement should at least minimize that window if there is one at all. I'd be surprised if a periodic check every few hours was that much faster than Apple's propagation speed.

Also a good point -- pending the caveat noted above -- but would you mind opening an issue/PR to suggest it? We can at least discuss it.

Thanks for your feedback!

tdelmas · March 10, 2020, 6:21pm

I'm thinking more about a website, www.example.com, which redirects to example.com and then loads assets from static.example.com and sends requests to api.example.com

If the visitor is on another continent, each TLS negotiation needs at least 100ms, so in that example the website can't display anything in less than 300ms, lost only to TLS negotiation.
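As a back-of-the-envelope check of that figure, here is a small Python sketch. It assumes the three hosts on the critical path before anything is displayed are www.example.com, example.com and static.example.com, that each handshake costs a flat 100ms, and that a shared certificate lets the client reuse a single connection for all of them. The function name and parameters are hypothetical.

```python
def handshake_cost_ms(critical_path_hosts, per_handshake_ms=100, shared_connection=False):
    """Lower-bound time spent on TLS handshakes performed one after another."""
    # Count each hostname once, preserving order of first appearance.
    distinct_hosts = list(dict.fromkeys(critical_path_hosts))
    if not distinct_hosts:
        return 0
    # Assumption: with one multi-name cert on the same server, one handshake suffices.
    handshakes = 1 if shared_connection else len(distinct_hosts)
    return handshakes * per_handshake_ms


hosts = ["www.example.com", "example.com", "static.example.com"]

print(handshake_cost_ms(hosts))                          # 300
print(handshake_cost_ms(hosts, shared_connection=True))  # 100
```

This ignores DNS, TCP setup and the actual transfer time; it only isolates the handshake cost being discussed.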

And, to be fair, your doc did talk about the opposite: "Multi-SAN certificates also have a larger size, so they slow down TLS handshakes.", which is less likely to happen as they will only slow down the request if they increase the number of packets sent.

Good to know, it was @schoen's question in Ability for Automated Notification of Revocations - #10 by schoen !

And from 1619179 - Let's Encrypt: Incomplete revocation for CAA rechecking bug :

we need to develop a protocol to notify Subscribers' systems of imminent certificate revocation, so those Subscribers can automate the process of replacing affected certificates before the deadline. We plan to design this protocol publicly, in collaboration with the PKI community, so that any CA and any Subscriber can implement it. We will also collaborate directly with popular ACME clients to integrate and test such automated replacement.

Sure, done: Explaining why/when must-staple can be useful regarding the threat-model · Issue #2 · https-dev/docs · GitHub

mholt · March 10, 2020, 6:56pm

These redirects happen only once, and are cached by clients.

It doesn't take too many added names before more packets are needed; any CDN that maxes out a cert's SAN capacity will easily add transfer time to the certificate.
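A rough way to see this is to estimate how many TCP segments a certificate needs as names are added. The Python sketch below is only an approximation under stated assumptions: the 1200-byte base certificate size and the 1460-byte segment payload are assumed values, the helper names are hypothetical, and the rest of the chain and the other handshake messages are ignored. The per-name cost reflects the DER encoding of a dNSName entry (one tag byte, the length bytes, then the name itself).

```python
import math


def san_entry_bytes(name: str) -> int:
    """DER size of one dNSName entry: tag byte + length byte(s) + name bytes."""
    name_len = len(name.encode("ascii"))
    length_bytes = 1 if name_len < 128 else 2  # DNS names never exceed 255 bytes
    return 1 + length_bytes + name_len


def segments_needed(names, base_cert_bytes=1200, segment_payload=1460):
    """Estimate TCP segments for one certificate (both defaults are assumptions)."""
    total = base_cert_bytes + sum(san_entry_bytes(n) for n in names)
    return math.ceil(total / segment_payload)


one_name = ["site000.example.com"]
hundred_names = [f"site{i:03d}.example.com" for i in range(100)]

print(segments_needed(one_name))       # 1
print(segments_needed(hundred_names))  # 3
```

Under these assumptions, a hundred 19-character names add about 2100 bytes, which is enough to push the certificate from one segment to three.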

Anyway, thanks for opening the issue!

system · April 9, 2020, 7:00pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
