TLSA record hygiene for Let's Encrypt issuer CAs

Please see:

TLSA record hygiene for Let's Encrypt issuer CAs - dane-users - list.sys4.de

TL;DR:

  • DO publish ALL applicable intermediate CAs when any are published
  • DON'T publish TLSA records matching long-retired LE CAs.

Can you please elaborate in this thread instead of just linkdropping with a few do's/don'ts?

LE has their intermediates published already, including reserved intermediates not yet in use. How is this "DO" new?

How is this practical information for this Community? Who needs to read this and who needs to ignore it?

The details are in the linked post to dane-users, which is a bit long. The impetus is that in fact a non-trivial minority of MTAs with DANE TLSA records violate the requirements.

  • Some publish TLSA records matching (sometimes only) long-retired CAs.
  • Some publish TLSA records with just one of R10–R14 or E5–E9, and then have a problem when cert renewal ends up with a different issuer.
  • Some publish TLSA records matching a root CA that is NOT included in the server's chain file.

All of these are poor practices, often leading to at least intermittent issues with inbound mail delivery.
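
The failure modes listed above can be checked mechanically. The following is a minimal sketch, not something taken from the dane-users post: `audit_issuer_pins` is a hypothetical helper, and the strings are placeholder labels standing in for real TLSA digests. It assumes you already have the set of issuer pins you publish, plus the digests of the current and retired LE issuers that apply to your key type.

```python
def audit_issuer_pins(published: set, current: set, retired: set) -> dict:
    """Compare published issuer-CA TLSA pins against known LE issuer digests.

    All three arguments are sets of digest strings. How they are obtained
    (DNS lookup, downloaded list) is outside the scope of this sketch.
    """
    pins_le_issuer = bool(published & (current | retired))
    return {
        # Pins for retired issuers can never match a newly issued cert.
        "stale": published & retired,
        # Once any issuer is pinned, every issuer a renewal might use must be pinned.
        "missing": (current - published) if pins_le_issuer else set(),
    }


# Placeholder labels, NOT real key digests (hypothetical values).
current = {"digest-R10", "digest-R11", "digest-R12", "digest-R13", "digest-R14"}
retired = {"digest-retired-CA"}

report = audit_issuer_pins({"digest-R10", "digest-retired-CA"}, current, retired)
print("stale:", sorted(report["stale"]))
print("missing:", sorted(report["missing"]))
```

A domain that publishes only pins for its own key gets an empty report, since the "publish ALL" rule applies only when at least one issuer CA is pinned.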

It could be helpful if Let's Encrypt could publish (and maintain) a generic TLSA record with all their current, spare, and upcoming intermediates, which users can then just point a CNAME to:

_25._tcp.my.mailserver IN CNAME _dane.letsencrypt.org.

When LE rotates intermediates, and adjusts their TLSA record accordingly, such CNAME'd TLSA records would automatically remain up-to-date.

The idea of a centrally managed TLSA RRset has some appeal, but it also concentrates operational risk. Any DNS outage at LE becomes an outage for the domains using the suggested CNAME.

Presently, LE resources are not runtime critical: if you can't renew your cert today, you can do it later when LE is back online; meanwhile, your current cert should be good enough.

So I am somewhat sceptical that the idea pans out. What would be more helpful is support in certbot for renewal with a pre-generated key that differs from the current one, so that TLSA records can be published first and key rollover can happen later with a key known in advance.

Perhaps a URL, kept up to date, where users can download the list of intermediate issuer CA key digests would also work well. The domain owner would use it periodically to refresh the TLSA records, and it is not runtime critical: if the download fails, the refresh can be tried later.
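
A rough sketch of what such a periodic refresh could look like on the domain owner's side. No such URL exists today, so the download is passed in as a callable (`fetch` is a hypothetical stand-in), and the cache file name is likewise an assumption. The point is only that a failed download falls back to the last good copy instead of breaking anything.

```python
import json
from pathlib import Path
from typing import Callable, List


def refresh_digests(fetch: Callable[[], List[str]], cache: Path) -> List[str]:
    """Return the freshest known issuer digest list.

    `fetch` is a hypothetical downloader for the (not yet existing) list.
    If it fails, the previously cached list is returned unchanged, so the
    caller can simply try again on the next scheduled run.
    """
    try:
        digests = sorted(set(fetch()))
        if not digests:
            raise ValueError("empty digest list")
    except Exception:
        if cache.exists():
            return json.loads(cache.read_text())
        raise  # no cached copy yet, nothing to fall back on
    # Write to a temporary file, then rename, so a crash never leaves a partial cache.
    tmp = cache.with_suffix(".tmp")
    tmp.write_text(json.dumps(digests))
    tmp.replace(cache)
    return digests
```

The returned list would then feed whatever process regenerates the TLSA RRset.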

FWIW, acme.sh supports this (Le_Next_Domain_Key location).

I'm probably misunderstanding but isn't that what using a custom CSR would provide? The main difference being that you would generate the new key before the request, rather than certbot generating it.

Sadly, with certbot, custom CSRs are not a usable interface for certificate renewal with a pregenerated key. The CSR option leads to a full reconfiguration of the list of domains and associated challenge methods. How do you, non-interactively, combine certbot renewal with an explicit key (with or without a CSR as an intermediate step)?

Ah yes, I was using certbot as an example. I actually don't use certbot very often, but it seems like a simple enough feature for ACME clients to implement: pick up their key for the next renewal from the same location each time, allowing your process to have regenerated it independently in advance.

Are you looking to achieve this on a large scale or just with a few renewals?

Yes, it is not a complex feature in principle, but in practice certbot is a vast maze of a codebase, and I for one could not find a simple way to add it when I looked at it some years back. Someone more familiar with its internal architecture might have better luck.

That said, the intent of this "topic" is to raise awareness of the need to handle TLSA records for LE issuers with some care. Too many naïve MTA operators are winging it, and suffering intermittent or longer-term outages.

I apologize in advance if this is crudely wrong ... I know nothing of TLSA requirements. But, just based on the overall description of "this" and "next" keys wouldn't something like this work with Certbot today?

At every cert renewal issuance:

  • certbot certonly option renewal
  • Use custom script to:
      • Remove oldest TLSA record (is for now-expired cert)
      • Add TLSA record for this cert
      • (now have TLSA records for current and this "next" one)
  • wait X hours, deploy this cert to TLS server

Various ways to "hold back" the deploy of a newly issued cert. Like copying ../live/.. cert to a separate "deploy" directory for the TLS Server.
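
The steps above can be sketched as a small state machine. This is only an illustration under stated assumptions: `StagedRollover` and its method names are hypothetical, the digests are placeholder strings, and the actual DNS update and cert copy are left out. It keeps at most two TLSA records (the live cert's and the newly issued one) and only promotes the new cert once the hold time has passed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class StagedRollover:
    hold_seconds: float                      # the "wait X hours" from the steps above
    tlsa: List[str] = field(default_factory=list)   # digests currently published
    live: Optional[str] = None               # digest of the cert the server presents
    pending: Optional[Tuple[str, float]] = None     # (digest, time issued)

    def on_issue(self, digest: str, now: float) -> None:
        """New cert issued: drop records for older certs, publish the new one."""
        keep = [d for d in self.tlsa if d == self.live and d != digest]
        self.tlsa = keep + [digest]
        self.pending = (digest, now)

    def maybe_deploy(self, now: float) -> bool:
        """Promote the pending cert once its TLSA record has been out long enough."""
        if self.pending is None:
            return False
        digest, issued = self.pending
        if now - issued < self.hold_seconds:
            return False
        self.live, self.pending = digest, None
        return True


r = StagedRollover(hold_seconds=2 * 3600)
r.on_issue("digest-cert-1", now=0)
r.maybe_deploy(now=2 * 3600)                 # first cert goes live after the hold
r.on_issue("digest-cert-2", now=10_000)
print(r.tlsa)                                # records for the live and the "next" cert
print(r.live)                                # still the first cert until the hold expires
```

Note that the record for the live cert is never removed before its replacement is deployed, which is what keeps the renewal non-disruptive.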

PS: Great first post to instruct on best practices and highlighting extent of problems.

There are multiple variations on staging certificate deployment in a way that allows associated TLSA records to be prepublished. The reason for this thread is that a non-trivial minority of MTA operators are NOT taking the necessary steps to make certificate renewal non-disruptive.

I use https://github.com/tlsaware/danebot; another similar toolkit (perhaps more polished) is https://github.com/raforg/danectl. Or one might point the MTA at key and cert files that are not directly managed by certbot and the like, with some other process automating the "promotion" of new certs into the live MTA config. Regardless, something should be done other than sloppily "pinning" a single LE issuer CA, then having a (hopefully brief) outage when renewal causes a new cert with a different issuer to go live, and updating the TLSA records in DNS only afterwards, sometimes only after someone else notices and nags. :frowning:

Yes, I got that. Definitely a problem.

My response was in reply to the above comment of yours. Would that new feature in Certbot be distinctly better than staging certs for deployment?

I saw that acme.sh had quirks when trying to implement that feature, related to the cert request failing and the "next" key getting the rotation out of sync. A staged deployment seems more robust to me regardless of the specific tooling.

This is a sideshow for sure. But one which you introduced so ... :slight_smile:

I do think that the ability to combine planned key renewal, while disabling automatic key rollover with:

[renewalparams]
reuse_key = True

would be helpful. Users would then be able to stay with cert/key files pointing at the standard LE locations, with certbot taking care of key rollover internally and the planned next key generated by some other process (even "never" is OK for many users) that can prepublish the TLSA records before adding the key to the certbot key staging area a few DNS TTLs later.
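
For illustration, the prepublication step for a planned next key might look roughly like this. It is a sketch under assumptions: the function names are hypothetical, `spki_der` must be the DER-encoded SubjectPublicKeyInfo of the next key (extracting it from a key file is not shown), the owner name reuses the example from earlier in this thread, and waiting two TTLs is just one reading of "a few DNS TTLs".

```python
import hashlib
from datetime import datetime, timedelta, timezone


def tlsa_311_rr(owner: str, spki_der: bytes, ttl: int) -> str:
    """Format a "3 1 1" (DANE-EE, SPKI, SHA2-256) TLSA record for a public key."""
    digest = hashlib.sha256(spki_der).hexdigest()
    return f"{owner} {ttl} IN TLSA 3 1 1 {digest}"


def ready_after(published_at: datetime, ttl: int, ttls_to_wait: int = 2) -> datetime:
    """Earliest time to hand the prepublished key to the ACME client.

    Assumes that waiting `ttls_to_wait` full TTLs is enough for caches
    holding the old RRset to expire.
    """
    return published_at + timedelta(seconds=ttl * ttls_to_wait)


# Placeholder bytes, NOT a real public key.
next_key_spki = b"placeholder SPKI bytes"
print(tlsa_311_rr("_25._tcp.my.mailserver.", next_key_spki, 3600))
print(ready_after(datetime(2025, 1, 1, tzinfo=timezone.utc), ttl=3600))
```

Only once `ready_after` has passed would the next key be placed where the renewal picks it up.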

This leaves the downstream process unchanged, which significantly eases adoption. Of course for any of this to matter, users have to know about the issue, which is why I am doing my best to reach out and let them know...

I'm not sure that saving the future key on the server for the TLSA update is a good idea: wouldn't any event that actually forces a key rotation (the server being hit by ransomware, or something similar) likely cause that backup key to be lost too, or to need revoking too? I think a more appropriate way to automate this would be to delay deploying the new certificate: update the TLSA record, wait for propagation, then reconfigure the webserver to use the new certificate.

It makes little sense to optimise for the exceptional case; under normal conditions it is fine to generate the next key a few days early. And the problem at hand is not a lack of correct alternatives, but a lack of easily available alternatives and a lack of awareness. Those in a position to implement more sophisticated approaches are not the intended audience. And with rumours of future ~7-day (IIRC, or comparable) certificate lifetimes, delaying deployment is not a viable strategy.

It is not a rumor; it is a fact. LE will offer an optional 160-hour lifetime cert. This is currently in test rollout, with more general availability later this year. See the shortlived profile: Profiles - Let's Encrypt

Google Trust Services already allows choosing the cert's duration down to very short periods (even just a day or two).

For certs that are not "short lived", the max lifetimes will reduce to just 47 days by 2029.

I personally believe that if you plan to use short-lived certs, consideration should be given to using an alternate CA as a (hot) backup. More generally, short-lived certs seem a poor choice for people who don't currently have a valid TLSA config.
