Note that the change also affected some DANE SMTP users who publish TLSA records matching the ISRG X1 public key (rather than R/E, as recommended). With the cross cert gone, the root CA is no longer in the presented chain, so the chain does not pass DANE-TA(2) checks. More details are in the dane-users post.
The operators in question should know better than to deploy cert chains that are not first verified against their TLSA records, or to fail to monitor correctness shortly after the fact, but some are not as sophisticated as they ideally would be.
It would be good if any future re-announcements of the upcoming change mention the possible impact on DANE. My posts to dane-users, postfix-users and exim-users likely have not reached all the potentially affected users...
I don't recall whether ACME clients have any knobs to include the root in the chain. If such an option exists, perhaps more users could be educated about it, should they prefer to designate the root as the DANE trust anchor rather than the intermediate issuers.
Thanks for the heads-up! I'm a bit confused, so let me check my understanding:
A "2 1 1" TLSA record means "What follows is the SHA-256 (1) hash of the public key (1) of a trust anchor (2) somewhere up the chain from my end-entity certificate", right?
The SHA-256 hash of the public key of ISRG Root X1 is the same, regardless of whether we're talking about the self-signed cert or the cross-sign from DST Root CA X3. So if 0b9fa5a59eed715c26c1020c711b4f6ec42d58b0015e14337a39dad301c5afc3 is the hash of the ISRG Root X1 pubkey, then it should have continued to work regardless of whether the client built a chain to the self-signed root or the cross-signed one.
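To make that concrete, here is a minimal Python sketch of how the "2 1 1" RDATA is derived. It is only an illustration: `tlsa_2_1_1_rdata` is a made-up helper name, and the key bytes are placeholders rather than the real ISRG Root X1 SubjectPublicKeyInfo. The point is that the digest depends only on the key, not on which certificate wraps it.

```python
import hashlib


def tlsa_2_1_1_rdata(spki_der: bytes) -> str:
    """Build '2 1 1 <hex>' RDATA: usage DANE-TA(2), selector SPKI(1), SHA-256(1)."""
    return "2 1 1 " + hashlib.sha256(spki_der).hexdigest()


# Placeholder bytes standing in for a DER-encoded SubjectPublicKeyInfo (NOT a real key).
placeholder_spki = b"placeholder-spki-bytes"

# Two certificates carrying the same key, modelled as plain dicts (hypothetical structure).
self_signed = {"issuer": "ISRG Root X1", "spki": placeholder_spki}
cross_signed = {"issuer": "DST Root CA X3", "spki": placeholder_spki}

# Same key, same digest, whichever certificate it came from.
same = tlsa_2_1_1_rdata(self_signed["spki"]) == tlsa_2_1_1_rdata(cross_signed["spki"])
print(same)
```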
As always, the chain offered by a webserver is a suggestion: TLS clients performing chain-building to verify the end-entity cert may use any intermediates contained in the offered chain to make chain-building to a trust anchor easier, but they may also ignore them. In this particular case, clients which trust ISRG Root X1 should build a chain to ISRG Root X1 regardless of whether the server offered the "EE <- R3 <- X1 <- DST" chain or just the "EE <- R3 <- X1" chain.
What am I missing? How did leaving the cross-sign out of the webserver's chain cause DANE to fail, despite the pubkey of the trust anchor remaining the same?
Your observations are all correct up to the point where the client is building a chain to the X1 root, but with DANE-TA(2) (usage 2), the chain construction uses ONLY the presented certificates, and not any certificates from some hypothetical local trust store. So the X1 root would match if it were included as part of the presented chain, but it does not when simply missing.
[Edit: The cross cert was included in the presented chain, and indeed its identical key matches the TLSA record. The problem with the shorter chain is precisely that it is shorter, and omits the root CA from the certificates sent on the wire. If there were an option for ACME clients ("certbot", "dehydrated", ...) to append the root CA to the chain file, most TLS server applications would use the chain as-is, and DANE-TA(2) would work for the "pinned" root key, once it is actually presented. ]
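A rough Python model of the behaviour described above may help. It is a deliberately simplified sketch, not how Postfix or OpenSSL actually implement it: there are no signature, name, or expiry checks, the `Cert` type and `dane_ta_match` function are hypothetical, and the keys are placeholder bytes. It only shows that the pinned key is looked for among the certificates sent on the wire, never in a local trust store.

```python
import hashlib
from typing import NamedTuple, Optional, Sequence, Set


class Cert(NamedTuple):
    subject: str
    issuer: str
    spki: bytes  # placeholder for the DER SubjectPublicKeyInfo


def spki_sha256(cert: Cert) -> str:
    return hashlib.sha256(cert.spki).hexdigest()


def dane_ta_match(presented: Sequence[Cert], pins: Set[str]) -> Optional[Cert]:
    """Return the first presented issuer cert (above the leaf) whose key digest
    is pinned by a '2 1 1' record, or None. Only presented certs are inputs."""
    for cert in presented[1:]:  # presented[0] is the end-entity certificate
        if spki_sha256(cert) in pins:
            return cert
    return None


ee = Cert("mx1.domain.example", "R3", b"ee-key")
r3 = Cert("R3", "ISRG Root X1", b"r3-key")
x1_cross = Cert("ISRG Root X1", "DST Root CA X3", b"x1-key")  # cross cert
x1_root = Cert("ISRG Root X1", "ISRG Root X1", b"x1-key")     # self-signed root

pins = {hashlib.sha256(b"x1-key").hexdigest()}  # operator pinned only the X1 key

long_chain_ok = dane_ta_match([ee, r3, x1_cross], pins) is not None
short_chain_ok = dane_ta_match([ee, r3], pins) is not None
root_appended_ok = dane_ta_match([ee, r3, x1_root], pins) is not None
print(long_chain_ok, short_chain_ok, root_appended_ok)
```

In this model the long chain matches via the cross cert's key, the short chain has nothing that matches, and appending the root to the chain file restores the match.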
Ah, so DANE-TA(2) doesn't care about a constructed trust chain (despite the fact that RFC 6698 Section 2.1.1 (2) says "any certificate matching the TLSA record [is] considered to be a trust anchor for this certification path validation"), it only cares about what's in the presented chain. I did not realize that.
This seems like a misfeature? In general, webservers want to leave out the root certificate, because they want to send the fewest possible bytes on the wire and the TLS clients likely already have the root certificate in their trust store. But that means that DANE "2 1 1" records will generally contain hashes of intermediate keys, and we already know that intermediate pinning is considered harmful. (And in fact, Let's Encrypt will be making changes to prevent intermediate pinning with our next key ceremony.)
How are DANE users intended to walk the tightrope between not sending extra (large!) certs on the wire, and not pinning intermediates?
No, a trust chain is still constructed, it is just that its potential inputs are the presented certificates, a subset of which become the verified chain.
RFC6698 does not prohibit using locally known certificates as inputs, but doing so would be a disservice to the ecosystem, because not all clients would have the same locally known "missing" certificates, and the same chain would work for some clients and not others, with server operators incorrectly pointing the finger at a client that did nothing wrong. Therefore, the original DANE implementation in Postfix, later polished and added to OpenSSL, deliberately avoids looking for chain certificates in the local trust store when validating DANE-TA(2) TLSA records. All the inputs MUST come from the presented certificates.
This is explained in RFC7671 Section 5.2.2.
DANE is used primarily in SMTP, or in non-interactive API services, with no hint of adoption in browsers. The extra cert is not a problem, just a few extra milliseconds, and only when resumption did not happen.
Please explain in GORY detail, what you mean by:
"And in fact, Let's Encrypt will be making changes to prevent intermediate pinning with our next key ceremony."
Note that in DANE this "pinning" is by the server operator, not random clients. And the server operator is in control of both the chain deployment and publishing of the TLSA record and is capable of ensuring they match.
Potential sabotage of DANE-TA(2) by Let's Encrypt would be very much unappreciated. The recommended DANE-TA(2) records for Let's Encrypt are in fact key hashes of the intermediate CAs.
If you intend to DoS inbound email for ~56 thousand domains, there'd have to be a compelling reason, and lots of warning.
We plan to begin issuing from intermediates unpredictably, so (for example) you wouldn't know if your cert is going to come from R3 or R4 until you have it in hand. Keep an eye out for more about this in the near future; this thread isn't the place to go into details.
Unless one can have multiple DANE-TA(2) records indicating multiple equally-trusted pinned CA pubkeys? It seems like that's what the table at the bottom of your post is indicating, as otherwise there would be no reason for anyone to have pinned R4 or E2 so far.
If Let's Encrypt really wants more agile issuer keys, you could publish in a suitable DNSSEC-signed domain (or under letsencrypt.org after DNSSEC-signing letsencrypt.org) DANE-TA(2) records of as yet unexpired, active or soon to be deployed intermediate issuer CA keys.
_dane.le-signed.example. IN TLSA 2 1 1 <sha256(spki-CA-1)>
_dane.le-signed.example. IN TLSA 2 1 1 <sha256(spki-CA-2)>
Where unexpired means a CA for which at least some issued EE certificate has not yet expired. Then before introducing a new issuer, you could add it to the TLSA RRset in question at least a few TTLs before it mints any EE certs.
Users of Let's Encrypt could just add CNAMEs to their zones:
_25._tcp.mx1.domain.example. IN CNAME _le.domain.example.
_25._tcp.mx2.domain.example. IN CNAME _le.domain.example.
_le.domain.example. IN CNAME _dane.le-signed.example.
More operationally sophisticated users would instead periodically clone the records in question into their own zones, avoiding real-time dependency on LE's DNS.
Yes, my understanding is that it would be typical to have multiple keys listed. This is also typical for end-entity (leaf) key pinning, where you would want a key published and in DNS caches for some time before the mail server actually starts using it, so you would have both the current and the upcoming key listed.
Gotcha, thank you. In that case yes, everything should work fine, as long as folks include all of the intermediates that we put into rotation.
I don't believe we currently intend to DNSSEC-sign letsencrypt.org, so it seems unlikely to me that we will be publishing our own DANE-TA(2) records for others to refer to, but that decision is outside my domain of expertise.
Yes, definitely, the recommended TLSA RRSET for DANE-TA(2) with Let's Encrypt lists all four of R3/R4/E1/E2. Any one of which is then accepted as a valid issuer. So if all you're expecting to do is randomly pick among otherwise well known issuers, that's fine.
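For operators who want to check their own RRset against the issuers in rotation, a small sketch along these lines might do. Everything here is illustrative: `missing_issuer_pins` is a hypothetical helper, and the digests are placeholders derived from the issuer names, not the real SPKI hashes of R3/R4/E1/E2.

```python
import hashlib
from typing import Dict, List, Set

# Placeholder digests keyed by issuer name (NOT the real SPKI SHA-256 values).
rotation: Dict[str, str] = {
    name: hashlib.sha256(name.encode()).hexdigest()
    for name in ("R3", "R4", "E1", "E2")
}


def missing_issuer_pins(published: Set[str], rotation: Dict[str, str]) -> List[str]:
    """Names of issuers in rotation whose key digest is absent from the
    published '2 1 1' TLSA RRset."""
    return sorted(name for name, digest in rotation.items() if digest not in published)


# An MX host pinning only R3 is exposed if its next cert comes from another issuer.
only_r3 = {rotation["R3"]}
print(missing_issuer_pins(only_r3, rotation))

# An MX host pinning all four has nothing missing.
print(missing_issuer_pins(set(rotation.values()), rotation))
```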
Though some attempt to reach the users who publish just one of them would be useful. For example, a post by LE to dane-users, exim-users and postfix-users, and any other fora that may be appropriate, in addition of course to this fine forum, which reaches some of the users.
As you can see, the majority of the users publishing R4/E1/E2 also publish R3, otherwise the numbers would not add up. If we count just the underlying MX hosts, and not served domains, and look at the number of issuer CAs pinned (subset of R3/R4/E1/E2) the counts are:
So, while the majority of domains are using the MX hosts with all 4 CAs pinned (the 340 well managed MX hosts), a majority of MX hosts (hobbyist vanity domains) are pinning just one of the issuers.
Note that the domain where the official TLSA records are published need not be letsencrypt.org, it could be some related domain letsencrypt-issuer-cas.org, or whatever... It "just" needs to be well operated, to ensure many "9"s of uptime.
Alternatively, it could be available only as a source from which to regularly clone the recommended TLSA RRs. They could for example be in the form of TXT records instead.
letsencrypt-issuer-cas.org. IN TXT "2 1 1 <hash1>"
letsencrypt-issuer-cas.org. IN TXT "2 1 1 <hash2>"
Which would make them unsuitable as live CNAME targets, but easy to consume as a configuration source.
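As a sketch of the "configuration source" idea, the following Python turns such TXT strings into TLSA zone lines for one's own zone. It assumes the TXT strings have already been fetched (and DNSSEC-validated) by some other means; `txt_to_tlsa` is a hypothetical helper, and the hash shown is a placeholder.

```python
from typing import List


def txt_to_tlsa(owner: str, txt_strings: List[str]) -> List[str]:
    """Convert TXT payloads of the form '2 1 1 <hex>' into TLSA zone-file lines."""
    lines = []
    for txt in txt_strings:
        fields = txt.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {txt!r}")
        usage, selector, mtype, data = fields
        if not (usage.isdigit() and selector.isdigit() and mtype.isdigit()):
            raise ValueError(f"non-numeric TLSA parameter in {txt!r}")
        bytes.fromhex(data)  # raises ValueError if the digest is not valid hex
        lines.append(f"{owner} IN TLSA {usage} {selector} {mtype} {data.lower()}")
    return lines


placeholder_hash = "ab" * 32  # placeholder, not a real issuer key digest
for line in txt_to_tlsa("_25._tcp.mx1.domain.example.", [f"2 1 1 {placeholder_hash}"]):
    print(line)
```

Rejecting malformed input outright, rather than skipping it, seems the safer default here, since a silently dropped record could leave an issuer unpinned.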
The trust anchor you should bear in mind is the matching DANE TLSA record, not the certificate. Just like in RFC5280, the trust anchor is a root public key, not any self-signed certificate with that key. The language in RFC6698 is overloaded, the matching certificate is also called a trust anchor, but that's just a convenience.
Similarly, in DNSSEC the root zone KSK trust anchor is external to DNS. It is part of the resolver configuration. The actual KSK record in the root zone DNSKEY RRSET is data to be verified, not the trust anchor.
But, since folks are used to thinking of CA certificates (rather than bare keys) as trust anchors, RFC6698 plays along identifying the trust anchor with the corresponding certificate object.
I sent a note to the above lists about the planned-for randomisation of the issuer CA choice. But I still think that a reminder from LE closer to the actual change date would be helpful. Some operators will still only find out the news too late, after they encounter an outage, but we can at least try to contact the rest.
I do have a list of the affected domains, I don't know whether LE would be interested and willing to send email notices to the contacts of the registered ACME accounts. You have the advantage of having an actual relationship with these domains, while the DANE survey is a third-party observer, and WHOIS is increasingly unhelpful of late...
The hash (with matching types 1 or 2, rather than 0) is a compact representation of a trust anchor: you match it against the presented information, and the matched object becomes the root node of the verified chain, with the digest as the "anchor".
This would be analogous to using a DS RRset as your trust anchor in a DNSSEC validator, rather than the list of the corresponding KSKs.
In the case of DANE 3 1 1 with RFC7250 public keys for a "chain", the trust anchor is simply a digest of the expected key, which is the only element of the "certificate chain". That key (Ed25519), could be shorter than its digest (SHA512), but that's neither here nor there.
Is there a point to this digression? I rather suspect not, and will indulge it no more, barring some evidence that this is heading in a productive direction.
The only point is helping me understand how this works. I assumed I knew, I was wrong, but then I checked my setup and I probably did set it up while understanding it a bit more, as I have all four intermediates in 2 1 1.