Let's Encrypt, ACME challenges, and ExtKeyUsage=ClientAuth

As discussed previously (in the thread extendedKeyUsage "TLS Client Authentication" in TLS server certificates), Let's Encrypt issues certificates with ExtKeyUsage=Server,Client.

What's not clear from said thread or the relevant RFCs (RFC 8555 - Automatic Certificate Management Environment (ACME) and RFC 8737 - Automated Certificate Management Environment (ACME) TLS Application-Layer Protocol Negotiation (ALPN) Challenge Extension) is why the existing ACME challenge types are an insufficient proxy for ExtKeyUsage=clientAuth.

I'm interested in hearing the community's approach to rationalizing the above, beyond historical tradition, and if anyone is actually using it for client auth. :slight_smile:

Consider that there are two types of usable LE/ACME identifiers for client auth: DNS or IP (here we're assuming that wildcard DNS likely doesn't matter from a client auth perspective).

If a certificate from a public CA were to be used for client auth (and we'll use, as some have suggested, SMTP server<->server message protection as an example), it's not clear what the server's validation model for the client certificate would be: would one blindly accept any certificate? Would one attempt to validate identity via the network layer, e.g., matching an IP SAN to the client's origin IP, or doing a (reverse) DNS lookup (from client IP -> hostname, or DNS SAN -> client IP)? &c.

Now consider how this certificate was obtained and what that implies for this example (TLS for a non-HTTPS protocol):

  1. If dns-01 challenges were used, the requesting ACME client could have no direct relationship to the client that would use these credentials, but such is DNS. This does not imply the existence of any other tenant services on this host, but doesn't preclude them.

  2. If tls-alpn-01 or http-01 were used, a direct connection to the server must have been made, but only to an HTTP (for http-01) or bare TLS (for tls-alpn-01) server! This implies the existence of some other tenant, whether temporary (e.g., a standalone ACME client) or persistent (e.g., a long-term nginx instance, perhaps serving unrelated content under the same domain). Admittedly, all three of ports 80, 443, and 465 are privileged to bind to, so if the operator is running distinct web and mail services (which should not have cross-permissions to impersonate each other), they have arguably made a mistake. If they're, e.g., hosting a webmail instance alongside their SMTP server, this might make some sense, but it would still arguably be nice to separate the webmail service's certificate's capabilities from the SMTP service's cert (server-only in the former case and client+server in the latter) to prevent misuse. (See the sketch below for just how little solving http-01 requires.)
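
To make concrete just how little item 2 requires, here's a minimal sketch (in Go, with hypothetical token and key-authorization values) of everything a standalone http-01 responder has to do: bind port 80 and echo one string back. Nothing in it demonstrates any ability to originate outbound connections.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// Hypothetical values: in a real ACME flow the token comes from the CA's
// challenge object, and the key authorization is the token plus "." plus the
// base64url SHA-256 thumbprint of the ACME account key.
const (
	token            = "example-challenge-token"
	keyAuthorization = "example-challenge-token.example-account-key-thumbprint"
)

func main() {
	// Solving http-01 amounts to answering one inbound GET on port 80.
	http.HandleFunc("/.well-known/acme-challenge/"+token,
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprint(w, keyAuthorization)
		})
	log.Fatal(http.ListenAndServe(":80", nil))
}
```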

In any case, presently a certificate issued for, e.g., a parallel web service could be used to impersonate the, e.g., SMTP service, which has legitimate use cases for client authentication.

Regardless, at least from my view, none of these challenge types can truly validate that client usage is allowed: the latter two (tls-alpn-01 and http-01) strictly validate that a server is present and serving the challenges -- which, IMO, does not justify clientAuth usage -- with http-01 even following redirects to potentially non-authoritative systems. And in the former case (dns-01), the service itself isn't contacted at all, so presumably a stronger authentication onus would fall on the relying server to validate the peer's presented certificate.

It seems in general like future support for the End User challenge types (in draft presently: ACME End User Client and Code Signing Certificates) would be the correct approach here, with perhaps a characterization that existing challenge types might not be sufficient to fully justify ExtKeyUsage=clientAuth.

In short, the protocol which is used to (correctly) justify server usage may not be the protocol or service using client auth, and it may not be desirable for it to have client auth permissions.


My interest here, by way of disclosure, is that we've recently added ACME support to HashiCorp Vault, eliding ExtKeyUsage=clientAuth, on the view that the existing challenges do not sufficiently establish that capability. I'm mostly curious to explore whether any thought has been given to ACME's influence on key usage capabilities, as I've been unable to find much literature on the subject.

As an aside, I think there is a use case here for (potentially cross-tenant) microservice authentication where reliance on public CAs is sufficient: when a service is uniquely identified via a 1:1 correspondence to DNS records, and proper restrictions around client auth are used, using ACME challenges for this is OK. But I don't think this is generally an acceptable thing (i.e., for the public internet and all consumers) to do more broadly.

From what I saw in this forum, most of the time LE is used in a client context, it's about the server having a whitelist of SANs it expects from the other side. Like a private CA, but without wanting to deal with deploying one.

2 Likes

IMHO, your criticisms here apply equally to Server Auth and Client Auth.

I played with it a bit, trying to see if it could take the place of a self-managed CA for things like remote database access, and to gauge its utility with SMTP servers. I prefer the self-managed CA.

I think you're starting to flirt with overthinking and overengineering a bit. Stepping back for a moment, the ACME spec and the CA/B Forum have agreed on the various existing challenges being appropriate for Domain Validation. The ports are historically privileged/well-known across most operating systems, and assumed to be trustworthy as being under the control of the domain owner.

I need to stress the "Domain Validation" part. That's all these certs do. They're not validating a user, service, or port within the domain - just that the subscriber controls the domain in the most basic sense possible. The certificates don't contain information about the challenge used, either.

Protocol aside, ACME uses the context of a server to justify complete control of the domain - which implies Client and Server could be used. There isn't a need to justify Client context.

My 2¢ on this topic:

  1. From what I've seen, I think Let's Encrypt/ACME should default to Server-only and require an explicit opt-in for Client. Every time I've seen someone try to use an LE cert for Client Auth here, it was either a misconfiguration (not needed) or an anti-pattern in security/credentialing [that almost definitely violates the Subscriber Agreement and CA/B Forum Baseline Requirements]. There are some services that will use the cert in both a client/server context - though people never seem to have those issues here.

  2. I think the better venue for your concerns may be on the IETF working group lists. More people there have been involved in advancing the specs.

7 Likes

I'm not quite following what you're asking, but I'll try to give some replies anyway. :slight_smile:

The main in-the-wild use case of client TLS certificates that I'm aware of is, as you say, for SMTP-to-SMTP communications. For instance, take Microsoft's Exchange Online Protection, a "cloud" spam filter you can put in front of your "on-premise" mail server. In order to send outbound mail from your server through the service, you have your server send mail to the online protection endpoint, and you configure through the Microsoft control panel which hostname(s) are trusted to be allowed to relay mail through Microsoft's systems. That is, your mail server uses a TLS client certificate to authenticate itself to Microsoft's mail server, and the mail server confirms that the client certificate matches a particular hostname in order to allow relaying the mail further. But certainly other use cases are possible, where a publicly-trusted client certificate can be used to confirm that the origin of a request is in fact a system that owns a particular public domain name.
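
For what it's worth, the client side of that arrangement is just a matter of loading the key pair into the handshake. Here's a rough sketch (hostnames and file paths are hypothetical; a real MTA would do this through its own configuration rather than custom code):

```go
package main

import (
	"crypto/tls"
	"log"
)

func main() {
	// Hypothetical paths: the same certificate and key the MTA already uses
	// for inbound STARTTLS, reused here as a client credential.
	cert, err := tls.LoadX509KeyPair(
		"/etc/mail/tls/mail.example.com.crt",
		"/etc/mail/tls/mail.example.com.key",
	)
	if err != nil {
		log.Fatal(err)
	}

	// Dial the (hypothetical) relay endpoint, offering the certificate if the
	// remote side requests client authentication. The relay can then match
	// the certificate's SANs against the hostnames it permits to relay.
	conn, err := tls.Dial("tcp", "relay.example.net:465", &tls.Config{
		Certificates: []tls.Certificate{cert},
	})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	log.Println("TLS handshake complete; client certificate available to the peer")
}
```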

I'm really not following what you're trying to say about the different challenge methods; the core of them is to authenticate that a private key belongs to the owner of a domain name, and that's really it. It does seem a little weird that being able to reply once on unencrypted port 80 is enough to be able to establish complete ownership over a domain name, but that really is the state of things at the moment. But that's not really related to whether one owns the name for server purposes or for client purposes, it's really just certifying that that key belongs to that name. (And things like CAA records, CAs moving toward multi-perspective validation, the industry moving to shorter certificate lifetimes, and so forth, are all working toward making sure that key-to-name relationship remains intact and is as reliable as we hope it to be.)

I'm not really that familiar with Hashicorp Vault (yet), but I'm not really seeing what the relationship is between a key vault and whether a certificate was issued by ACME? I mean, I could see trusting a particular CA or not trusting that CA, based on what policies the CA had (building a good root trust store is hard), but I don't see how it's related to whether ACME was involved or not, or what that has to do with whether or not a system administrator might want to trust a particular CA for client (or server) authentication? But I think I may just not be fully understanding your use case or concerns.

ACME as a protocol is really completely separate from what certificates the CA might issue using it. The only point of ACME is to automate, as much as possible, a system requesting a certificate, rather than some system administrator needing to copy-paste a CSR into some web portal for it. But it really only covers that part of the certificate issuance process. It's entirely possible to use ACME to issue, say, OV/EV certificates where the verification is done completely out-of-band and the purpose of ACME is just for the server to automatically send a CSR and get the certificate back (DigiCert does), or to issue certificates with whatever attributes set that the CA wants to. So I don't think you'll find literature on issuing via ACME specifically, as I think you're just looking for information on how CAs pick their key usage capabilities in general. I don't see why a CA would use different capabilities whether they were using ACME HTTP-01 or whether they were using a more "classic" 3.2.2.4.2 email to the registered domain contact.

5 Likes

is why the existing ACME challenge types are a[(ed): n in]sufficient proxy for ExtKeyUsage=clientAuth.

IMHO, your criticisms here apply equally to Server Auth and Client Auth.

Not really. In a Server Auth context, ports 80/443 (for http-01 and tls-alpn-01) are privileged, and thus an external inbound connection proves that some privileged user on said machine has authorized said usage in conjunction with solving the challenge.

However, under all three challenges, no outbound connection has been validated here. This is the crux of the issue: how is an inbound connection to a server on the domain sufficient proof that this is authorized for ClientAuth? Shouldn't an outbound connection (with said challenge) be the required proof? What privileged operator has given the requester explicit or implicit permission for ClientAuth from this server? How do we know that the cert requester has access outbound from the domain, to approve the ClientAuth usage?

Said differently, suppose you, as a client, asked for extKeyUsage=emailProtection or extKeyUsage=timeStamping or similar. Sure, it's a domain-validated certificate, but what proof have you given that you have a legitimate ability to do this from this domain? Have you provided, e.g., a recent timestamp with this domain? Sent an outbound email as webmaster@domain or root@domain? No, you've merely proven that you can receive inbound (HTTP/TLS) requests at this domain.

(Here, strictly, CA/BF guidance says that DV certs cannot contain these EKUs, so perhaps they are a bad example, but the point still stands: what has LE done to verify the "MAY" on clientAuth?)

I think you're starting to flirt with overthinking and overengineering a bit.

Perhaps that will be the outcome. But I'd definitely argue that it is an interesting threat-model discussion. Should an inbound (serverAuth) connection be sufficient to validate clientAuth? Is dns-01 an adequate proxy for both use cases?

On the contrary, this is the entirety of the matter. If a client of a public or private CA were to ask for, say, random subject attributes (O/OU/...) over ACME, they'd clearly get rejected, because ACME the protocol has no way of validating authoritative ownership of these attributes.

Similarly (back to this topic), ACME the protocol (presently) has no way of validating that the server is able to make outbound connections; thus, can any CA (relying only on the validations performed by the ACME protocol) fully verify that extKeyUsage=clientAuth is appropriate?

Regardless, perhaps @jvanasco's suggestion that this is a better fit for the IETF working group lists is the right one.

1 Like

I guess I'm just missing the threat model you're thinking of here. Why would some entity be authorized to speak for a domain name for inbound access, but not outbound access? Is there some common hosting setup where the owner of a name for inbound connections doesn't also own the name for outbound connections?

But that's the same for a CA that doesn't use ACME at all, which validates the domain name through an email to the domain contact, or via "Agreed‑Upon Change to Website v2" in the .well-known/pki-validation folder. I think your concern is with how CAs validate names in general, not anything specific to ACME.

But as I said, a CA can validate an organization in some non-ACME way, and then use ACME to have the server receive OV (O=whatever) certificates. DigiCert in fact does so, to my understanding.

3 Likes

This is really the crux of it.

On the one hand, everything in the original post here is correct. The validation methods that Let's Encrypt uses do nothing to prove that the Applicant is capable of operating a TLS Client. But as already mentioned, the DNS-01 method also does nothing to prove that the Applicant is capable of operating a TLS Server. The point is not to test the applicant's capabilities with regards to TLS, the point is to determine whether the Applicant "controls" the dnsName in question.

And, as mentioned above, the same is true for all domain control validation methods allowed by the Baseline Requirements. The issue you're bringing up here is not unique to ACME; it's inherent to the entirety of the CA ecosystem at this time.

All the certificate does is bind that name and that key together. Once that's done, it's up to everyone else to decide what they want to do with that information.
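
And that binding really is all that's visible in the leaf certificate. As a small illustrative sketch (the filename is hypothetical), the SANs and the EKU list are what relying parties get; there's no record of which challenge, if any, was used to validate the name:

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"log"
	"os"
)

func main() {
	// Hypothetical path to a PEM-encoded leaf certificate.
	pemBytes, err := os.ReadFile("cert.pem")
	if err != nil {
		log.Fatal(err)
	}
	block, _ := pem.Decode(pemBytes)
	if block == nil {
		log.Fatal("no PEM block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		log.Fatal(err)
	}

	// Everything a relying party learns: which names are bound to the key,
	// and which extended key usages the CA asserted. Nothing here says how
	// the domain was validated.
	fmt.Println("SANs:", cert.DNSNames, cert.IPAddresses)
	for _, eku := range cert.ExtKeyUsage {
		switch eku {
		case x509.ExtKeyUsageServerAuth:
			fmt.Println("EKU: serverAuth")
		case x509.ExtKeyUsageClientAuth:
			fmt.Println("EKU: clientAuth")
		default:
			fmt.Println("EKU:", eku)
		}
	}
}
```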

Now, you're also not unique in raising concerns similar to this. There's a reason that inclusion of clientAuth is only a "MAY" in the BRs: because the root programs would like CAs to stop including it, but they haven't been able to make it forbidden yet because it would break too many people's use-cases. Let's Encrypt would love to change our profile to stop including clientAuth, but we don't have any good way to know who is using it and who we would break if we did so. We hope to make progress towards this in the near-to-mid-term future (stay tuned!), but no promises.

11 Likes

This actually confuses me a bit. Why would Let's Encrypt want to stop me from using a certificate from them to do mutual TLS between my mail servers? Are there different auditing requirements when allowing clientAuth or something like that?

5 Likes

The core premise is that the ACME client is proving (with an accepted set of constraints) that they have some level of authority to get a certificate for a specific domain-validated name; the debate is what usage should then be included on that cert.

Client auth seems like as valid a use of that name as server auth, but from my point of view it should be up to the ACME client to ask for something, and up to the CA whether they provide it. If the client wants clientAuth and the CA won't provide it, then ideally the order should fail before any ACME authorization is even attempted.

I think the root of this discussion is around the fuzzy topic of machine identity vs. domain names. To nail down certificate usage limits for domain-validated certs, you'd have to solve machine identity (and, while you're at it, define how groups of distributed cooperative machines may be acting as one machine), which has probably been out of scope in CA discussions.

3 Likes

Very simply because the root programs are headed in the direction of decoupling serverAuth hierarchies from clientAuth hierarchies (just like they've already separated out S/MIME hierarchies), and we prefer to be ahead of the curve rather than playing catch-up to new requirements.

7 Likes

Out of curiosity, is this change in CA/BF guidance from a defensive position (let's improve safety by mostly preventing dual use certificates and/or having an explicit chain of trust for client-only certs) or does it look like there will be additional/different validation safeguards around issuing client auth certs?

3 Likes

I'm not aware of any discussions around changing the validation methods for clientAuth.

The interaction models for serverAuth and clientAuth are fundamentally different from each other:

When a client receives a serverAuth certificate, the question it's trying to answer is "is this the correct key for the domain I'm trying to reach?". So the current domain control validation methods make sense, because they prove that the entity which controls the domain wants that key to be associated with it.

When a server receives a clientAuth certificate, the question it's trying to answer is "who does this key that's trying to reach me represent?". There's certainly room for confusion here, since the same key may be associated with many different domain names (either all in the same cert, or spread across many different certs). But at the end of the day, the current domain control validation methods still make sense: they allow the domain controller to state "this key represents this domain". So any client presenting that key in an mTLS handshake can be assumed to be acting on behalf of the domain. The server has no need to do reverse-DNS lookups of the client IP or anything like that -- it just compares the names in the client certificate against an allowlist to see if it is willing to accept connections from that domain.

So despite the inherent differences between the clientAuth and serverAuth identity models, they both come down to the question of "does this key represent this domain?". And that's what the current DCV methods do. So I'm not aware of any intent to change them specifically for the purpose of clientAuth.
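
As a minimal sketch of that server-side model (the allowlist, ports, and file paths are hypothetical), the relying server verifies the chain as usual and then simply compares the SANs in the client certificate against the domains it is willing to accept:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"log"
	"net/http"
)

// Hypothetical allowlist of peer domains permitted to connect.
var allowedPeers = map[string]bool{
	"mail.example.org":    true,
	"partner.example.net": true,
}

func main() {
	server := &http.Server{
		Addr: ":8443",
		TLSConfig: &tls.Config{
			// Require a client certificate and verify its chain (with a nil
			// ClientCAs this falls back to the system roots; a production
			// setup might pin a specific pool).
			ClientAuth: tls.RequireAndVerifyClientCert,
			// After chain verification, accept the peer only if one of its
			// SANs is on the allowlist; no reverse-DNS lookups are needed.
			VerifyConnection: func(cs tls.ConnectionState) error {
				leaf := cs.PeerCertificates[0]
				for _, name := range leaf.DNSNames {
					if allowedPeers[name] {
						return nil
					}
				}
				return fmt.Errorf("client cert names %v not on allowlist", leaf.DNSNames)
			},
		},
		Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "hello, allowlisted peer")
		}),
	}
	// Hypothetical server certificate and key paths.
	log.Fatal(server.ListenAndServeTLS("server.crt", "server.key"))
}
```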

9 Likes

Why are they decoupling the hierarchies then? (Sorry if I'm being dense and missed something.)

1 Like

I think the primary reason is so the root programs can put in requirements without regard to client-auth usage. For example, disallowing CNs on certificates in a serverAuth hierarchy may be simpler, because the "we use CN as the primary identity on our client certs!" argument doesn't hold as much water.

This is just my observation, and definitely not any sort of official opinion. I have spent much of my career working fintech PKI, where there's a lot of gnarly, complicated uses of client certs with the internet PKI, and many CAs who are happy to sell those industries products which aren't necessarily what the browser root programs would like CAs to do.

7 Likes

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.