I'm not sure this is the right category, but I couldn't find a better one.
We sell a product that sets up several web services, each serving a bunch of endpoints, some of which even support arbitrary sub-FQDNs. By this I mean that if the endpoint is foo.example.com, then it also serves *.foo.example.com.
Anyways, one of the features we want to provide is the ability to accept any bunch of certificates and private keys, sort them by endpoint, validate them, and push them to the different services. This means we can accept anything from a single file with a cert covering all the endpoints and SANs, its private key, and the whole chain up to the root CA; to a bunch of files containing anything, where we figure out whether they're certs, what kind, where they apply, and so on. We decided to do this because certs are hard enough already, so we want to remove the onus of doing all this from the client.
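To make the "accept anything" step concrete, here is a minimal sketch of the triage pass, assuming PEM input; the function name and data model are illustrative, not our actual code. It only groups PEM blocks by their label; actually parsing the DER payloads (to tell leaf from intermediate, or to match keys to certs) would come afterwards.

```python
import re

# Match any PEM block and capture its label (CERTIFICATE,
# RSA PRIVATE KEY, EC PRIVATE KEY, ...). The backreference makes
# sure BEGIN and END labels agree.
PEM_BLOCK = re.compile(
    r"-----BEGIN (?P<label>[A-Z0-9 ]+)-----.*?-----END (?P=label)-----",
    re.S,
)

def triage_pem(text: str) -> dict[str, list[str]]:
    """Group the PEM blocks found in a blob of text by their label."""
    found: dict[str, list[str]] = {}
    for m in PEM_BLOCK.finditer(text):
        found.setdefault(m.group("label"), []).append(m.group(0))
    return found
```

With this, a "bundle" pasted in as one file, or several files concatenated, reduces to a dict of candidate certs and keys that later stages can inspect.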
The way we're doing this is as follows: For each endpoint, we find a cert that covers it. Then we find the key that matches and the chain back to the top.
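The "find a cert that covers the endpoint" step can be sketched like this. The wildcard rule follows the usual single-label matching (as in RFC 6125), and the cert data model (a dict with a `sans` list) is purely hypothetical:

```python
def covers(endpoint: str, san: str) -> bool:
    """Does a SAN cover an endpoint? A wildcard like
    '*.foo.example.com' matches exactly one extra label on the left;
    it does not match 'foo.example.com' itself, nor deeper labels."""
    if san.startswith("*."):
        base = san[2:]
        head, sep, tail = endpoint.partition(".")
        return bool(sep) and head != "" and tail == base
    return endpoint == san

def cert_for_endpoint(endpoint, certs):
    """Pick the first cert (modeled as {'sans': [...]}) whose SAN
    list covers the endpoint; return None if nothing matches."""
    for cert in certs:
        if any(covers(endpoint, san) for san in cert["sans"]):
            return cert
    return None
```

Since our endpoints also serve `*.foo.example.com`, a usable cert would need both the exact name and the wildcard in its SAN list.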
The issue comes when traversing the long chain (which currently seems not to include the expired DST Root CA X3) and not finding a root (that is, a self-signed cert). From other conversations on this forum, I read that one way to do it is to try the local CA trust store roots first, then fall back to the ones in the chain. This way, a modern OS would provide a self-signed ISRG Root X1, while an old one would have DST Root CA X3 to finish the chain. Then serve the original chain as provided so it doesn't break old clients.
My question is whether this is the expected way for clients (not me) to validate certs. It seems to be, since Firefox on my Ubuntu 20.10 validates the chain even when it's incomplete. I just want to make sure this is OK.
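For illustration, the "trust store first" walk I'm describing can be sketched on toy data. Certs are modeled as plain dicts with subject/issuer names; real code would compare DNs or key identifiers and verify signatures, not just strings:

```python
def build_chain(leaf, pool, trusted_roots):
    """Walk issuer links upward from the leaf. At each step, prefer an
    issuer from the local trust store; fall back to the certs that came
    with the bundle. Stop at a self-signed cert (subject == issuer)."""
    chain = [leaf]
    current = leaf
    while current["subject"] != current["issuer"]:
        issuer = next(
            (c for c in trusted_roots if c["subject"] == current["issuer"]),
            None,
        ) or next(
            (c for c in pool if c["subject"] == current["issuer"]),
            None,
        )
        if issuer is None:
            raise ValueError(f"no issuer found for {current['subject']}")
        chain.append(issuer)
        current = issuer
    return chain
```

With a self-signed ISRG Root X1 in the trust store, the walk ends there and ignores the cross-signed "X1 signed by DST X3" cert in the bundle; without any trusted root, the walk follows the cross-sign and then fails, which is exactly the situation that prompted this question.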
A TLS server presents its client with a set of certificates: a leaf certificate, with the name of the website as a SAN, plus a collection of intermediate certificates the client can use to help build a chain to the roots in its trust store.
The client doesn't need or expect root certificates to come from the server: it already has those in its trust store, and if it didn't, it wouldn't trust them anyway.
So if a client is out of date and only trusts the expired DST Root CA X3 (and doesn't disable it because it's expired), serving an intermediate version of the ISRG X1, signed by DST X3, can help the client build a path from the leaf certificate to a trusted root.
If the client trusts the ISRG X1 root, it will simply ignore that intermediate. It may need the "R3" Let's Encrypt intermediate though, so that it can build a path between the leaf certificate and the X1 root.
If you omit all the intermediates, Firefox, Chrome, and others will actually still work. Firefox ships with all known intermediates, which include R3. Chrome does "AIA fetching" to download them on demand, which may incur some network/latency overhead. Tools like curl may not.
In the end: use the long chain with the "X1 signed by DST X3" intermediate if you need more compatibility (especially for old Android), but you don't need to serve the DST X3 root itself in that chain.
Don't do the "we support multiple inputs" method. It is a terrible idea that will cause you many headaches (been there, done that). Decide on one or two formats, and document them. The two formats I recommend are: 1) specific file-upload fields (cert/chain/fullchain, private key); and 2) a zipfile that contains the cert/chain/fullchain and private key. If you are too lax, your clients will create a mess on their own systems, and bugfixing/troubleshooting/support will be a nightmare. Teach them the proper way to handle things, and consider their systems your third-level emergency backups.
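As a sketch of the stricter zipfile option, here is roughly what the acceptance check could look like. The member names (`fullchain.pem`, `privkey.pem`) are assumptions for illustration; pick whatever names you document:

```python
import io
import zipfile

# Hypothetical required members of the uploaded bundle.
REQUIRED = {"fullchain.pem", "privkey.pem"}

def read_bundle(data: bytes) -> dict[str, bytes]:
    """Reject the upload unless the zip contains the documented files;
    return the contents of the required members."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
        missing = REQUIRED - names
        if missing:
            raise ValueError(f"bundle is missing: {sorted(missing)}")
        return {name: zf.read(name) for name in REQUIRED}
```

Being this strict up front is what trains clients to keep the files organized on their side, too.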
Something that @mcpherrinm may not have made clear enough is that there is no unified implementation for calculating trust paths. Every browser/client/library handles this differently - some need to see the entire path, others will "short circuit" once they encounter a trusted cert. Some use a cache that optimizes this even more; others won't. Some use the OS root store, some use a library/package root store (like Python's certifi), others ship with their own root store, and others do the AIA fetching he mentioned.
Beyond that, I open sourced our internal tool a few years ago that does a chunk of what you're talking about. It's written in Python and MIT Licensed, so you're free to lift code from it - GitHub - aptise/peter_sslers: or how i stopped worrying and learned to love the ssl certificate . PeterSSLers is a combination ACME Client and Certificate Management system. Within the Certificate Manager, we parse imported certs/keys and store them in SQL. An Nginx plugin lets us query the most recent cert for a given domain as-needed from a 3 tiered cache (Nginx worker, Nginx main, Redis), which eventually falls back to an API hit against an internal Python server; full-misses can trigger an on-demand "autocert".
The way we implemented our cert management is that every element in the chain is tracked in the RDBMS, so we can swap out compatible roots/intermediates as needed. I spent a while building all that out in preparation for the DST expiration and was nearly finished - then a bunch of projects and ISRG consolidated around the Android solution, so it was largely unnecessary.
Thing is, most clients don't know anything about certificates. They receive files from the CA with unknown contents, and we don't want them to unzip, convert, and align the stars just to import some certs.
Well, the tool can be put in debug mode, and it's quite verbose about what it tests against what. It's true that the tool hasn't seen much client interaction yet (only one release has it, and clients don't need to use it until they have to renew their certs). But I'll keep your advice in mind.
I don't get what you mean here. Could you explain in another way?
That's quite interesting. Since our own product didn't even validate imported certs before (it just accepted whatever it was given and left the client to figure out why it wasn't working), I think I can define our semantics as 'check against the OS CA trust store, then the chain itself' and 'be happy' (until the next bug pops up). I'll be documenting all this.
I love the name
peter_sslers is a framework designed to help experienced Admins and DevOps persons manage SSL Certificates and deploy them on larger systems
Maybe I could spend a couple of days reading the code, but quite frankly, our tool already works and has lots of tests with real certificates (the ones that raised this issue just became part of the test suite, and the tests are not passing on my branch). But thanks for the offer.
Sorry, I don't mean to suggest that. Let me tie in with your next question...
When something goes wrong - which always happens - your team is likely to be on a phone/email/chat with a client. When you have code that is incredibly lax (i.e. good) at reading and auto-detection, you're more likely to have clients be disorganized in how they store/organize the files locally. When it comes time for customer support, this becomes an issue for your team. I haven't dealt with this on SSL certs, but it's happened on every enterprise and commercial tool I've worked on.
The best defense I've found against this is to be more restrictive on the formats you accept, so the clients become better about how they treat the data and files. In some situations, this has been supporting a very limited set of file formats.
IMHO, in your case, tailoring something to the different CAs would make sense. Once you know the formats that major providers offer, you can do things like ensure the clients upload the zipfile from the CA as it was received; or if they received files - put them in a folder and zip it up. The idea is to train them into organizing things correctly on their side.
I think the three most important bits on this are:
Validating the integrity of the chain
Identifying alternate chains, if any
Determining the trusted roots
Depending on who your audience is, validating against the OS root store can be irrelevant and will often lead to wild differences between two machines. That only tells you "it's working on this particular machine", and that machine's store is often out of date or may be augmented. For example, Apple changes theirs on every major release and sometimes in update patches; corporate servers (especially Windows-based ones) often have custom trust stores deployed. Because of all this, IMHO, it is more reliable to validate against a specific list of "current trusted roots". You can grab Mozilla's distribution directly from them; it's repackaged for Python (certifi) less often, and curl extracts it regularly. I don't think Apple or Microsoft package their trust stores in a machine-readable format, but they do list the contents of each trust store version.
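As a minimal sketch of validating against a pinned bundle rather than whatever the host OS ships, you could build an `ssl.SSLContext` that trusts only the file you ship (the bundle path here is an assumption; point it at the Mozilla/curl `cacert.pem` you distribute with your product):

```python
import ssl

def make_pinned_context(bundle_path: str) -> ssl.SSLContext:
    """Build a client context that trusts ONLY the given CA bundle,
    ignoring the OS trust store entirely."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.verify_mode = ssl.CERT_REQUIRED
    ctx.check_hostname = True
    # Only the roots in this file count; nothing is loaded from the OS.
    ctx.load_verify_locations(cafile=bundle_path)
    return ctx
```

The same context (or an equivalent OpenSSL store) can then be used to verify an imported chain consistently on every machine, regardless of distro or OS patch level.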
The reason I suggest this approach is that when only a subset of users has a certificate issue despite there being (i) a valid cert AND (ii) a verified chain, the underlying cause is almost always that the selected root is not in those users' trust stores. Knowing the above makes it easy for your client to check the compatibility of the active root against the OS/app trust stores of their users.
Ah! Don't worry, there's a reason we call the procedure import. We take the files, extract every potentially useful part, put them in a defined place and format, and then start actually looking at them. This way we can go in, use this as input, and take it from there.
Hmm, this could give us problems if and when we go OS-agnostic. Right now we only run on RedHat, CentOS, and soon Rocky Linux.
You're going to have a different trust store on every version of each OS, and potentially different trust stores across those three distros too. Even though the other two are RedHat-derived and may eventually sync, you're going to run into timing issues as updates are merged downstream and customers update their own systems. I would definitely use Mozilla's trust store for your validation.
You do realize that if a FQDN points to your server, you can just get a certificate via ACME, and you will be happier, your client will have one less thing to do, and nobody will have to fight with .zip files from random CAs, right?
I can't impose processes on clients. I develop the tools they use in their own systems. The certs can come from LE, any other CA around, or even the company's own internal CA, which means I will also have to verify against previously imported private CA roots.
It's more work than I want to take on. Many clients already use a CA, either because it's an internal one or because their systems face the internet and already have sites with certificates. I'm not going to offer to automate one CA just because it's easy, because clients with other CAs will start asking for theirs to be automated too.