I like the idea in general, because as @petercooperjr suggested there are many optional features a CA might expose, and I agree the chain selection is one of them. We'd probably have to consider each thing separately and it may derail the effort if we try to encompass everything in one go (or maybe it wouldn't?).
I'm directly interested in knowing the following about a CA :
Up-front chain options for a given key type (and possibly strength). This just helps the user make a preference choice but would still be ignored by the app if the chain wasn't present in the end result; it's a "preference" after all. The UI could also flag if a CA no longer declares support for your currently set preference. My client actually sends diagnostic notifications to users, and this would be a candidate.
Generally supported key types and strengths: i.e. which can I request and my order won't fail.
Whether notAfter is supported and if so, what's the min, max and default lifetime. Some CAs currently blow up the order if you try to specify notAfter, some don't. Incidentally I'd prefer all CAs ignored it rather than error as this currently hinders CA fallback.
Identifier types supported (dns, IP, TnAuthList etc). Challenge types to some extent, but these are often implied from the identifier type you're trying to get (e.g. Authority Token for TnAuthList).
ARI support (I deduce this from the presence of the endpoint though)
Agree that the JSON should be optimised for deserialization to objects (as per JS, dotnet etc), the array in an array thing is mostly seen in python and I think PHP to some extent but it can require special work to parse elsewhere. Likewise I'd use simple names and avoid the _ prefix as it doesn't seem essential.
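As a rough illustration of what "optimised for deserialization to objects" could mean, here is a minimal Python sketch. The `capabilities` sub-object and every key inside it (`keyTypes`, `identifierTypes`, `chains`, `notAfter`) are hypothetical names invented for this example; nothing in ACME defines them. The only existing name used is `renewalInfo`, the directory key for ARI, whose presence is used to deduce ARI support as described above.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CaCapabilities:
    # All of these fields are hypothetical; no ACME document defines them.
    key_types: List[str] = field(default_factory=list)
    identifier_types: List[str] = field(default_factory=list)
    chains: List[str] = field(default_factory=list)
    not_after_supported: Optional[bool] = None  # None means "not declared"
    ari_supported: bool = False


def parse_capabilities(directory: dict) -> CaCapabilities:
    """Map a flat, object-shaped JSON document onto a typed object."""
    caps = directory.get("meta", {}).get("capabilities", {})  # hypothetical key
    not_after = caps.get("notAfter")
    return CaCapabilities(
        key_types=list(caps.get("keyTypes", [])),
        identifier_types=list(caps.get("identifierTypes", [])),
        chains=list(caps.get("chains", [])),
        not_after_supported=(
            None if not_after is None else bool(not_after.get("supported"))
        ),
        # ARI support is deduced from the presence of the endpoint.
        ari_supported="renewalInfo" in directory,
    )


example = json.loads("""
{
  "newOrder": "https://ca.example/acme/new-order",
  "renewalInfo": "https://ca.example/acme/renewal-info",
  "meta": {
    "capabilities": {
      "keyTypes": ["RSA-2048", "ECDSA-P256"],
      "identifierTypes": ["dns", "ip"],
      "notAfter": {"supported": false}
    }
  }
}
""")
caps = parse_capabilities(example)
```

Named keys with plain values map directly onto object fields in most languages, whereas nested positional arrays would need custom parsing.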
I do not know which are currently offering alternate chains,
Alternate Chains are explicitly defined in the ACME spec, so I consider this as addressing a deficiency in the spec that wasn't realized until production use.
We can expect every CA to either support Alternate Chains or switch chains as their Trusted Roots expire. The DST retirement will eventually be repeated by every other CA as long as ACME is in use.
I don't know if this is happening within ACME yet, but Commercial CAs often (a) own multiple roots and brands, allowing a subscriber to select a preferred option at checkout or download; and (b) offer cross-signed roots or intermediates.
Beyond that - I see the situation as this: As client authors we can generate a working draft that addresses our needs and gains ISRG's endorsement - then push it into the standards track, at which point other CAs would share their needs, concerns and edits. The end product would work for all providers, otherwise it would never be accepted. Under current conditions, CAs will decide what they offer to clients and how – likely without the input of client authors.
Realistically, our options are either:
Come together as client authors and try to influence the specs to best work with our needs.
Do nothing and have a CA develop their own spec or divergence - then deal with whatever information/flow they have graciously given us.
There is definitely a lot of room for improvement with the ACME spec. I am focused on advancing a proposal regarding Chains right now, because this is changing a lot with the LetsEncrypt CA in the near future, and the current system and client behaviors can cause operational difficulties.
As LetsEncrypt subscribers, we know that chains will change several times in the near future:
swapping of long and short X1 chains
removal of long X1 chain
introduction of X2 chain
We've also had several changes to the chains in the past:
swapping of long and short X1 chains
introduction of X2 chain to specific accounts
re-signing of the long X1 chain (the expiry date changed)
Across every CA, we know there will be changes to available chains when their roots near expiry.
I work a lot in publishing with high traffic websites. Chain selection is incredibly important, because the chain is directly related to reach and usability. "Trust" is not the only concern; key type and chain length can affect compatibility as well (especially when you're dealing with embedded browsers on mobile devices). While some subscribers are fine if their services restart with a different chain, others will be negatively impacted - losing traffic, revenue and consumer confidence. While I'd love to see many various improvements to the ACME spec, I am less concerned with addressing Chain Selection as a deficiency in the spec and more concerned with avoiding the tangible problems switching a chain can create. If I have N days of validity left on a certificate and my preferred root is not available, I do not want to renew that certificate with a different trust anchor and deploy it – nor do I want to consume a Certificate credit to learn about this change. I want to catch this before renewal, and research my options.
IMHO these are both particularly great things to bring up, but they do not frequently change and submitting an invalid option to the CA will result in a failure with informative docstrings. With the existing ACME RFC and client behaviors, marking a preferred chain will either:
Consume an allocated Certificate "credit" (i.e. duplicate certificate limit with LE, total certificate limit with commercial CAs) (best-case scenario)
Restart web services with an unwanted Certificate chain (worst-case scenario)
+1. Making results specific to an account/key might best be proposed as a potential extension or behavior. IMHO the important thing right now is being able to tell if a human needs to intervene and review/update the configurations because the CA has changed their offerings, and a behavior like this would cover most of those situations.
We're on the same page here for all of that.
I wonder if this could be consolidated into the same endpoint or a different one.
It seems like this would need to be account-specific, as LE shards X1 and X2 access by account. I also wonder if this might be account-specific with commercial CAs, who often vary features across pricing tiers.
I would love to see this, but I don't think it's going to happen as we're in "divergences" territory against 7.1.3 and 7.4 AND trying to avoid conflicts with the spec.
Boulder does not accept the optional notBefore and notAfter fields of a newOrder request payload.
Ignoring the fields should not happen as that's a bigger divergence than not supporting an optional field. I don't think that will likely happen unless the RFC itself is changed. 7.4 states:
The server MUST return an error if it cannot fulfill the request as specified, and it MUST NOT issue a certificate with contents other than those requested.
IIRC, the CAs that do not accept these fields made the decision because doing so would require them to issue matching certificates and they wished to control the validity period.
I think the server could advertise if it accepts the optional params, but I don't think there would be industry support to standardize a way to advertise "we're going to ignore the optional params and issue a certificate in violation of the RFC".
Okay, I'm going to come in swinging (sorry!) by saying that I don't think something like what is proposed here should be standardized. I think the overall idea is a good one, and very well motivated and obviously well intentioned, but there are a few things that make this specific idea not the right direction to go. I'll talk a little bit about why that is, and then suggest the direction we should be going in instead.
Let's Encrypt intends to do this in the very near future.
I think that this line gets at the core of why this proposal is not the right direction. This is confusing two very different concepts: what chains are available, and what profiles are available. The profile affects the actual contents of your end-entity cert: being signed by an ECDSA intermediate means you have a different signature algorithm. The chain just affects the certs above your end-entity, such as whether you present X2-signed-by-X1 or X2-self-signed.
And much of the follow-on conversation here has the same issue, unfortunately. The difference between X1 and X2 is generally just a matter of chains, yes, but the difference between E1 and R3 is not, and should not be confused with merely chain selection.
How would one define a "Type"? Is it always going to be just RSA and ECDSA? What about when Ed25519 finally gets allowed? What if people want other things (like KEM certificates)? What if people want EV certificates (not from LE, but from some other ACME server) -- does that count as a Type? Even if that CA only issues EV from a subset of their intermediates?
If this field is super strictly limited, then it becomes difficult to extend. If it is wide-open, it becomes non-machine-readable.
This list is really the core of the issue here: there are many different criteria that a client might want to know about ahead of time and select. Too many. A smart CA doesn't actually want to let clients select any combination of these attributes, because some combinations are useless, or worse, disallowed. Today, the mechanism for selecting things like this is the CSR (that's how we determine that the client wants OCSP must-staple), but CSRs are bad for this because CAs consistently make mistakes regarding copying values directly from the CSR to the final cert. And we don't really want to balloon the newOrder request with a dozen different fields, which will obviously be a mis-prediction of which fields are actually useful (observe: the notBefore and notAfter fields in the newOrder request today, which we ignore).
So, what should we do instead?
I already have a private draft of what follows, which I intend to share with the IETF ACME WG in the near future. I've been talking through this plan with other folks here and at other ACME CAs for a while now and we think this is the best -- simple, obvious, flexible, minimal -- path forward.
Profile Selection. The Directory's Meta object gets a new sub-field listing the available profiles. Each profile has a name and either a short text description or a URL pointing at a documentation page (we're not sure which is better yet).
HTTP/1.1 200 OK
"default": "The profile that Let's Encrypt has been using for the last 6+ years.",
"short": "The same as 'default', but with a validity period of 10 days.",
"minimal": "A stripped-down profile with a short lifetime, no OCSP, no keyEncipherment KU, no clientAuth EKU, etc."
The client exposes these options and their descriptions to its operator, who selects one (e.g. by putting the profile name in a config file). The client then includes that profile name in a new field of the newOrder request.
That's basically it. There are obviously details (which profile is used if none is indicated?) but this mechanism gets both CAs and clients 90+% of what they need to more formally advertise, negotiate, and configure different issuance options.
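A minimal sketch of the client side of this flow, in Python. It assumes the new meta sub-field is called `profiles` and the new newOrder field is called `profile`; the text above does not fix either name, so treat both as placeholders. No network calls are made; the directory is passed in as an already-parsed dict.

```python
from typing import Dict, List, Optional


def available_profiles(directory: dict) -> Dict[str, str]:
    """Return profile name -> description from the directory's meta object.

    The "profiles" key is an assumed name for the new meta sub-field.
    """
    return dict(directory.get("meta", {}).get("profiles", {}))


def build_new_order(names: List[str], directory: dict,
                    profile: Optional[str] = None) -> dict:
    """Build a newOrder payload, adding the operator's chosen profile if set."""
    payload = {"identifiers": [{"type": "dns", "value": n} for n in names]}
    if profile is not None:
        if profile not in available_profiles(directory):
            raise ValueError(f"profile {profile!r} is not advertised by this CA")
        payload["profile"] = profile  # assumed field name
    return payload


directory = {
    "meta": {
        "profiles": {
            "default": "The profile that Let's Encrypt has been using for the last 6+ years.",
            "short": "The same as 'default', but with a validity period of 10 days.",
        }
    }
}

# The operator put "short" in the client's config file.
order = build_new_order(["example.com"], directory, profile="short")
```

Leaving `profile` unset produces an ordinary newOrder payload, which leaves the "which profile is used if none is indicated?" question to the server.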
I like your idea, but I think there are two different goals someone might have, and these different solutions are targeting one or the other:
I want to configure a system to use a particular CA, configuring all the options I can for it.
I want to not configure a system with anything beyond a list of available ACME directory URLs, and the client should figure out the optimal options based on which CAs are currently available. Such a client might regularly change from one CA to another.
I don't think there's a good solution for this. I've spent a lot of time thinking about it, and fundamentally the issue is that I cannot predict the full set of criteria that a client might want to use to select between CAs: what signature algorithms they use, what trust stores their roots are in, what validity periods they use, how customizable those validity periods are, what extensions they will or will not include, how many names they'll include, what validation methods they use, whether they require External Account Binding, what their pricing structure is, what their rate limits are.... really I could go on forever. Therefore I think the answer is to not try, and to continue to let humans make that selection.
More that you get your organization validated, they give you a directory endpoint & EAB and such to use, and your ACME client just does the domain name validation. The next year (or whenever it expires) you get your organization validation redone, and the ACME client just keeps ticking away.
That is fair. The point of bringing this up for public discourse is trying to find a solution that works for all in the ecosystem.
I don't think your proposal as-is reflects certain needs of Subscribers and Clients in the ecosystem, or the issues/concerns Subscribers and Clients have faced in the past - and will in the future.
As I noted in an above comment, the ACME specs have been largely driven by CAs so far and would benefit from the input of Clients and Subscribers.
100% in agreement with this paragraph.
The example snippet above would not adequately address many operational concerns of Subscribers and Clients.
As a Subscriber or Client, I should be able to determine if a renewal attempt will return a chain that is "substantially different" without ordering/finalizing a Certificate – and potentially restarting services with an unwanted Root/Chain.
Let's talk about "substantially different". The entirety of a chain is largely irrelevant to this discussion. While size is a performance concern that can be handled post-issuance, the primary concern to many are the Trusted Roots. ISRG's current offering – DST, X1, X2 – each have completely different potential audiences amongst worldwide consumers. A secondary concern is the key technology (R3 vs E1).
If the listing of profiles included options that were pinned to specific roots, I think it would adequately address this concern:
"default": "The profile that Let's Encrypt has been using for the last 6+ years.",
"default-X1": "A version of default that will serve X1 as the Root"
"default-X2": "A version of default that will serve X2 as the Root"
In the above edit, we can consider the "default" profile to be the Server's recommendation for full automation, while there are several variations of that profile pinned to different chains. The variations might even be a child object of the profile key.
"description": "The profile that Let's Encrypt has been using for the last 6+ years.",
"X1": "X1 is Root. Alt Chain ordering is best-practices.",
"X2": "X2 is Root. Alt Chain ordering is best-practices.",
Under that framework, a Client would easily be able to determine if a specific profile (i.e. expected root) is available or not. If the profile is no longer available, the client might be configured to alert the Subscriber, then either stop or continue with an unpinned version.
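The "alert, then either stop or continue with an unpinned version" behavior could be sketched roughly as follows. The function name, its parameters, and the idea of passing an `alert` callback are all assumptions made for illustration; the profile names come from the example above.

```python
from typing import Callable, Iterable, Optional


def resolve_profile(pinned: str,
                    unpinned: str,
                    offered: Iterable[str],
                    alert: Callable[[str], None],
                    continue_unpinned: bool) -> Optional[str]:
    """Pick the profile to order with, or None if the client should stop.

    pinned:   the root-pinned profile the Subscriber configured
    unpinned: the CA's generic profile to fall back to
    offered:  profile names currently advertised by the CA
    """
    offered_set = set(offered)
    if pinned in offered_set:
        return pinned
    # The pinned profile disappeared: tell a human before doing anything else.
    alert(f"pinned profile {pinned!r} is no longer advertised")
    if continue_unpinned and unpinned in offered_set:
        return unpinned
    return None


alerts = []
# The CA used to advertise "default-X1" but now only offers these two.
offered_now = ["default", "default-X2"]
strict = resolve_profile("default-X1", "default", offered_now,
                         alerts.append, continue_unpinned=False)
lenient = resolve_profile("default-X1", "default", offered_now,
                          alerts.append, continue_unpinned=True)
```

In the strict configuration nothing is ordered, so no certificate is consumed and no service is restarted until the Subscriber has reviewed the change.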
There could (should) even be variants that advertise the topmost chain ordering. E.g.
"X1-DST": "X1 is Root. Chains are pinned: DST-Cross, X1 (alt)",
"X1-X1": "X1 is Root. Chains are pinned: X1, DST-Cross (alt)",
This would allow clients to immediately determine the most substantial parts of chains have switched, so appropriate action can be taken.
[It would be optional for a profile to declare a feature, absence of a declaration does not strictly mean absence of a feature, but presence of a declaration means the feature is definitely available]
For me the profile details don't have to hang off the directory and make it huge, it just could be a profile endpoint like /profile/default/.
For context, I'm regularly working with multiple CA accounts (e.g., 7 different production CAs, different feature sets, different available cert lifetimes, different key type support etc) and having the system try to juggle between those as failures occur. This is because I'm currently working on the problem of large scale renewal management from a single instance/cluster. Different certs have different feature requirements and just attempting against CAs in a round-robin style until one works is a little bit unsophisticated.
In my comments here I'm not really talking about Let's Encrypt, I'm talking about ACME CAs generally, especially including internal CAs like step-ca and vault.
Regarding chain selection, I just want to be able to present a list of acceptable options for "preferred chain".
Sorry, I'm still slightly confused. It's not clear to me if the goal is to provide capabilities that ACME simply doesn't have today, or to improve the affordances of capabilities it does. I thought the goal was the former, hence my suggestion for the profile-selection system, which has no equivalent in current ACME. But the emphasis on chain selection I'm seeing feels more like the latter.
ACME already has chain selection mechanisms: you finalize an order, download the cert+chain, follow the link-rel headers, download the alternate cert+chains, and pick one you like based on what the user has configured. I definitely acknowledge that this system isn't very good: it's not clear what the options are ahead of time, there are no human-readable names associated with the options (although there could be! no need to have them be numbered like we currently do, they could be named), and the client has to parse the chains in order to make decisions based on their contents.
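For reference, the "follow the link-rel headers" step can be sketched as below. This is a simplified parser for `Link` header values that extracts the targets marked `rel="alternate"`; it is not a complete Web Linking parser, and the URLs are made up. Downloading each alternate and inspecting the certificates, which is where the real chain selection happens, is left out.

```python
import re
from typing import Iterable, List

# One link-value: <target> followed by its parameters, up to the next "<".
_LINK_VALUE = re.compile(r"<([^>]*)>([^<]*)")
# The rel parameter, quoted or unquoted.
_REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";,]*)"?', re.IGNORECASE)


def alternate_urls(link_headers: Iterable[str]) -> List[str]:
    """Collect the targets of all Link header values with rel=alternate."""
    urls = []
    for header in link_headers:
        for target, params in _LINK_VALUE.findall(header):
            match = _REL_PARAM.search(params)
            # A rel parameter may carry several space-separated relation types.
            if match and "alternate" in match.group(1).lower().split():
                urls.append(target)
    return urls


# Link headers as they might appear on a certificate download response.
headers = [
    '<https://ca.example/acme/directory>;rel="index"',
    '<https://ca.example/acme/cert/abc/1>;rel="alternate", '
    '<https://ca.example/acme/cert/abc/2>; rel=alternate',
]
alternates = alternate_urls(headers)
```

Note that none of this can run until an order has been finalized, which is the limitation acknowledged above: the options only become visible after issuance.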
But fundamentally chain selection is disconnected from profiles. Chain selection is a choice that can be made after the fact, and even changed without re-issuing the certificate. I think it would be incredibly bad to have a newOrder request lock someone into the EE <-- E1 <-- X2 chain, when they should be able to seamlessly switch to the EE <-- E1 <-- X2 <-- X1 <-- DST chain if they want to.
Also, there are many criteria that might cause someone to prefer one chain over another: algorithms used (some are stronger than others), total bytes in the chain (to reduce network usage in handshakes), number of signatures in the chain (to reduce validation time), EKUs in the chain (some people believe strongly in single-use hierarchies), etc. And again, we can't predict the full set of criteria that people might choose to care about. The easiest way to let them make that determination is to provide the whole chain, and let them make the determination using whatever heuristics they want. That's what the current chain selection system does: let the client see the whole thing.
Yes, this is the point of my other reply above. We can't predict the full set of features that people will want to condition on, so I don't think I'm interested in specifying a set of feature names to include in descriptions like this. If we try to severely limit the set of things that can be advertised, we'll get it wrong. If we try to include everything, we'll completely reinvent x509. Neither of these is a good outcome.
For the use case of someone setting up a system on one CA, I can see it making sense. (Though there's no good standard for clients to alert their administrators when something changes, since it seems that there are many "zombie" clients out there trying every day to revalidate a domain that they no longer control, without anyone noticing or caring.)
But for @webprofusion's (and others') concern about configuring multiple CAs, it would be nice if all that was needed was a list of CA URLs rather than needing to also configure each CA with which features and profiles it supports. That is, I think you're worrying about one problem, while client authors designed for large integrations worry about a different problem, and they all probably need to get addressed eventually.
Yeah, I totally agree that it would be great for a client to just be configured with a list of potential directory URLs, and for it to handle everything from there. That would be glorious.
But I think the devil is in the details.
First we turn to the question of what features or capabilities the CA advertises. As discussed above, there are a lot of possible aspects of a profile that might need to be advertised. If we're letting clients pick values for those features, then there are combinations of values that are unacceptable, so we need a way to communicate that, too. Even if we're not letting clients pick, and just advertising the features of each profile so they can make an informed decision, then we need to have well-known names for every aspect of a profile. If we try to predict which aspects future CAs and clients will care about, we'll be wrong. If we try to build an extensible system that can flexibly describe every aspect of a certificate... that's just x509, ASN.1, and OIDs.
But let's assume that we figure out how a CA is going to advertise all of the details of its profiles. Then how does a client pick a profile based on that information?
It doesn't seem reasonable for the client to do all of the picking; the whole point is that different users might have different preferences. So the client needs to be configurable: every possible item that a CA might advertise, the client needs to have a configuration field that the user can fill out to indicate whether it wants that item. And of course many profile aspects are not just booleans (like "does it have OCSP Must-Staple") but may be enums (like "what set of RSA key sizes do they support") or numeric ranges (like "what validity periods do they offer"). And for each of those, it not only needs to be able to validate the input (did you put P-256 for an RSA key size you like?), but it may need to accept multiple acceptable inputs (you're fine with just TLSServerAuth, or ServerAuth+ClientAuth), and it may need to let you rank those options (you prefer 90-day certs over 365-day certs).
Honestly, I find it hard to believe that many (or any) client authors would implement such a complex configuration system. It's pretty unclear to me how to make such a system intuitive and usable.
But! Let's assume that you solve that, and your client has a robust system for letting the user configure their exact set of preferences and priorities. A user says that they prefer ECDSA-only over RSA, that they prefer TLSServerAuth-only over ServerAuth+ClientAuth, and that they prefer 10-day certs over 90-day certs. They configure three potential CAs. Each of those three CAs satisfies two of the three expressed preferences. How do you pick which one to use?
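The tie in this scenario is easy to reproduce. The sketch below uses made-up CA names and a made-up attribute vocabulary (`key_type`, `eku`, `lifetime_days`); it only illustrates that counting satisfied preferences cannot separate the three CAs, and that any tie-break requires yet more configuration from the user, here in the form of weights.

```python
from typing import Dict

# The user's three preferences from the scenario above.
preferences = {"key_type": "ECDSA", "eku": "serverAuth", "lifetime_days": 10}

# Three hypothetical CAs, each satisfying exactly two of the three.
offers = {
    "ca-a": {"key_type": "ECDSA", "eku": "serverAuth", "lifetime_days": 90},
    "ca-b": {"key_type": "ECDSA", "eku": "serverAuth+clientAuth", "lifetime_days": 10},
    "ca-c": {"key_type": "RSA", "eku": "serverAuth", "lifetime_days": 10},
}


def score(offer: dict, prefs: dict, weights: Dict[str, int]) -> int:
    """Sum the weights of every preference the offer satisfies."""
    return sum(weights[k] for k, wanted in prefs.items() if offer.get(k) == wanted)


# Unweighted: every preference counts as 1, so all three CAs tie.
equal = {k: 1 for k in preferences}
unweighted = {name: score(offer, preferences, equal) for name, offer in offers.items()}

# Weighted: the user must additionally rank the preferences themselves.
weights = {"key_type": 3, "eku": 2, "lifetime_days": 1}
weighted = {name: score(offer, preferences, weights) for name, offer in offers.items()}
best = max(weighted, key=weighted.get)
```

With equal weights the choice is arbitrary; with weights the winner is entirely a product of numbers the user had to invent, which is the usability problem the paragraph above describes.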
All of this, put together, is why I currently believe that complex capability advertisement, configuration, negotiation, ranking, and selection is not the best path forward. I think that CAs should advertise a few profiles with human-readable descriptions and sane defaults, and human site operators should generally trust the CA to make widely-compatible choices (as they do today!) and only select alternate profiles from the CA's carefully-curated set of options when they really want or need something specific.
Noted. I think our perspectives are different on this but that's OK. The main difficulty I see comes in trying to capture the capabilities in a standard spec and iterating on that as different capabilities are introduced, which RFCs really aren't designed for (they're very much a point in time).
Yes, CA fallback selection based on graceful degradation of features is complex, it also happens to be something my client already attempts to do, which is where my interest comes in. I accept it's not a priority for most.
I think it's fair to say that the concerns of one ACME client are not the same as another's (Certbot doesn't implement ARI, for example), but at the same time designing for the extreme case may not be broadly beneficial.
So, the goal of my request is to enable Clients to help Subscribers make the right decisions as easily as possible. It is entirely inspired by the last phrase above that I highlighted in bold - the client must parse the chains first.
This doesn't just mean the client must parse the chain(s), but the client must go through the entire certificate procurement process to get the chain(s).
I think this is a disconnect we are having. From my perspective, your idea of a Profile is a perfect mechanism in which the Server can advertise what the eventual chain selection options will be before going through the procurement process. The client would still handle Chain Selection according to its own logic, and be able to construct the alternate chains. Advertising this before the Certificate is procured is preferred as it is easier to catch changes and incompatibilities.
100% in agreement with you on this.
Building off your real-world example, consider this scenario which will be a real-world concern:
Today: EE <-- E1 <-- X2 <-- X1 <-- DST [default] EE <-- E1 <-- X2
2024-02-08: EE <-- E1 <-- X2 [default] EE <-- E1 <-- X2 <-- X1 <-- DST
2024-06-06*: EE <-- E1 <-- X2 [default]
2024-09-30*: DST Expires
So in this situation, which has the benefit of prior announcements and planning for all parties involved, we'll first see the default chains switch, then one stops signing new Certificates before retirement. As a Subscriber - this advance notice gives me time to decide which options work or not -- and gives some guidance on how to ensure my preferred chain/root can be selected by the Client. (Sidenote: my original idea was for the server to advertise the full chains, so Clients would be influenced towards a standardized way to configure chain selection, i.e. serving names will influence against thumbprints, etc.) This situation is easy to work with – but what happens when there is an unplanned change?
Depending on your website, Roots (and Key Types) can influence audience reach and site performance quite a bit. I know some properties that have decided to go RSA/DST-X1 until the last possible minute, then they'll jump to ECDSA/X2 - which best maximizes both reach and performance over RSA/X1. I know many other properties that simply don't care, as the differences are negligible – they'll be happy to serve whatever LetsEncrypt currently offers.
Catching these changes in offered roots/chains requires two tasks that are not suitable for automation:
Diligently reading all Announcements and Documentation from CAs/Servers
Obtaining (a metered) Certificate and Chains, then analyzing them
I do not think the second option can be considered a candidate for automation because:
The certificates are metered. ISRG imposes limits on Certificates per Domain and Duplicate Certificates. Commercial CAs impose limits on overall certificates. I firmly believe that needing to consume a resource to determine if it is even usable is an anti-pattern.
The order must be completed and fully downloaded for the chain analysis to happen. This takes an increasingly larger amount of time depending on the number of challenges that must be completed.
Many current clients are written in ways where procuring an unwanted certificate chain is likely to deploy that chain and restart services under it. This can have negative effects.
To better inform clients of what to expect, the relevant information from chains can be stuffed into profile variants:
"default": "The profile that Let's Encrypt has been using for the last 6+ years."
A "default" profile is great, and works for the majority of users. Using this to onboard the "short" Certificates and stripped down features is also a brilliant idea. This works very well for Clients.
From the perspective of a Subscriber - we're seeing LetsEncrypt advertise certain capabilities - while still being silent on one of the bigger concerns that will primarily affect reach and usability. OCSP, lifetime, etc are not going to affect reach and experience; switching to an incompatible root can drop traffic by double digits or more.
If we were able to use "variants" however, we could advertise the most important bits of the Chain(s) which will be offered without needing to go through the procurement process or risk restarting services.
Consider how variants might look during the planned root retirement. I'm just naming the variants by the Root:
The Server would still offer all chains during finalization. The Clients would still be obligated to parse (and hopefully store) all chains during finalization. What changes though, is that clients would immediately see on the Directory there has been a material change to the expected chain offerings.
In practice, I imagine things would work like this:
Most Subscribers just go with the "default" option and allow ISRG to choose the best options.
The Subscribers who require a certain chain will specify a chain as they currently do, but it can be pinned to a variant marker.
Upon renewal, the Client will first notice if the variant marker is no longer offered. If the variant is no longer offered... SIRENS the Client immediately alerts the Subscriber. If the variant is offered, everything proceeds as normal.
"Variants" could even be called "Roots" here.
This doesn't enable full automation or the negotiation of Certificate specifics - it just allows Clients to immediately detect a materially significant change before committing to a metered process that can write files and relaunch services with unintended or incompatible roots.
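A rough sketch of that pre-renewal check, assuming the client stored the set of variant names it saw at the last successful issuance and can read the current set from the Directory. The function name, the message wording, and the variant names (named by Root, as in the discussion above) are illustrative only.

```python
from typing import Iterable, List


def preflight_changes(previous: Iterable[str],
                      current: Iterable[str],
                      pinned: str) -> List[str]:
    """Compare stored and current variant names before starting an order.

    Returns human-readable messages; an empty list means nothing changed
    and the pinned variant is still offered.
    """
    prev, cur = set(previous), set(current)
    messages = []
    for name in sorted(prev - cur):
        messages.append(f"no longer offered: {name}")
    for name in sorted(cur - prev):
        messages.append(f"newly offered: {name}")
    if pinned not in cur:
        messages.append(f"ALERT: pinned variant {pinned} is unavailable; renewal skipped")
    return messages


# Snapshot from the last issuance vs. what the Directory advertises today.
seen_last_time = ["DST", "X1", "X2"]
seen_today = ["X1", "X2"]
report = preflight_changes(seen_last_time, seen_today, pinned="DST")
```

Because the check reads only the Directory, it runs before any metered order is created and before any files are written or services relaunched.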
Me too. This shouldn't require getting a Certificate to discover, and it would be wonderful if the specs (or servers) guided Clients into standardized ways to identify the chain so the selections would better persist across clients when subscribers have to switch.
I think eventually ACME will support, or be replaced by, a spec that allows for all this customization. I don't think now is the time for that.
What I do want to stress though, is the current behavior of ACME and Boulder is to make all of these sane choices and defaults - but fail to identify the material changes to these offerings as they happen. Most changes do not have substantial effects, but changes to root offerings do. A Client should be able to query the server and determine whether the preferred chain / root is still available before procuring a Certificate. Providing that ability would address the concerns that many of us developers have, and is not jumping into the waters of complex configuration or negotiation.
Apologies, I don't have time to write up a full reply at the moment, but I can do this little bit:
Heh, I think this is the disconnect we're having. It's not that I see profiles as a pre-issuance thing, and chain selection as a post-issuance thing. It's that I see profiles as a "impacts the actual contents of the certificate" thing, while chain-selection is a "post-hoc, can be changed even days or weeks after the certificate is issued" thing.