Re: Enabling ACME CAA Account and Method Binding

Continuing the discussion from Enabling ACME CAA Account and Method Binding:

THANK YOU!

:heart:

12 Likes

Well, I guess my excitement might be a little premature. But I'm still excited that this is actually coming shortly instead of just "sometime".

13 Likes

It will truly bring another elevated dimension to the security of the certificate process.

Hopefully any related errors will be very explicit [and make our troubleshooting a breeze].

9 Likes
  1. I can think of a possible bug that may trigger only in production: will failing the CAA recheck make the authz invalid? Otherwise there would be a nasty deadlock: the CA keeps using the cached authz with the wrong validation method and failing, without allowing the user to request a new challenge. (I think staging doesn't reuse authzs long enough to trigger a CAA recheck.)
    @lestaff From the Boulder code, it doesn't look like failing the validationmethods CAA check makes the authz invalid; only the order that uses it fails. A cached valid authz with the wrong validation method will then block certificates for that domain, because the CA will keep reusing that authz and failing the order, without giving the client a chance to get a different challenge to fix the authz.
    This doesn't look handled in Boulder master. Can anyone with a free domain name test this?

  2. And would someone setting this parameter want to mark it as critical, or would they likely just set a CAA record that lists only CAs for which this extension has meaning?

9 Likes

I agree this is excellent and it unblocks our Certify DNS service for a lot of people now that they can limit issuance to a specific account.

Our system is a supported cloud implementation of the acme-dns protocol for dns challenge delegation, compatible with most acme-dns clients, designed to offer the same benefits of acme-dns without running your own servers for it. The issue has of course been that people had to trust us not to issue certs on their behalf(!) whereas now they can explicitly block that from theoretically being possible.

6 Likes

@webprofusion What an excellent opportunity to spam your commercial service again indeed :slight_smile:


 

@aarongable Is it perhaps a good idea to append these new features to the CAA documentation at Certificate Authority Authorization (CAA) - Let's Encrypt?

6 Likes

Well, the website is open for pull requests: GitHub - letsencrypt/website: Let's Encrypt Website and Documentation

6 Likes

I know, I have made 7 PRs in total on that repo as we speak, of which 2 have been merged, 3 are closed and 2 are still open, for about a year now. So sorry if I'm not inclined to start a new PR for this if that's how they're treating Community input :slight_smile:

10 Likes

I'm not quite following you: Are you saying that Boulder treats differently the case where I am adding or removing letsencrypt.org entirely in my CAA record, and the case where I am doing so with a validationmethod parameter?

You would have a CAA issue entry for each authorized CA, and the additional restrictions just mean that that issue entry won't apply as an authorization unless the restriction is met. So you don't need to set anything critical, just set the parameters you need. For instance, I'm using a record along these lines (though I'm adding some redaction since I don't know how public account ids are supposed to be, even though in theory anyone can get them from DNS):

0 issue "letsencrypt.org; validationmethods=dns-01; accounturi=https://acme-v02.api.letsencrypt.org/acme/acct/00000000"
0 issue "letsencrypt.org; validationmethods=dns-01; accounturi=https://acme-staging-v02.api.letsencrypt.org/acme/acct/0000000"
0 issue "amazonaws.com"
0 issuewild ";"
0 iodef "mailto:redacted@domain.invalid"

So for certificate issuance to happen, one of those records needs to match is all. Though until they implement the change, those additional parameters are just ignored. And even once their code starts using those parameters, I'm not sure that a bug leading to a certificate being issued in violation of them would actually be a "misissuance" until their CP/CPS includes a promise that those restrictions will be followed, but I may be wrong on how that works.
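To make the "one of those records needs to match" rule concrete, here is a rough Go sketch of how a CA could evaluate `issue` values that carry the `validationmethods` and `accounturi` parameters. This is not Boulder's code: the names `parseIssue`, `permits`, and `authorized` are made up for illustration, the parsing is simplified (no character validation, no `issuewild` handling), and the account URI is the redacted placeholder from the record above.

```go
package main

import (
	"fmt"
	"strings"
)

// issueValue is a parsed CAA "issue" property value: an issuer domain
// plus optional name=value parameters.
type issueValue struct {
	issuer string
	params map[string]string
}

// parseIssue splits a value such as
// "letsencrypt.org; validationmethods=dns-01" into its parts.
// Simplified: no syntax validation, duplicate parameters overwrite.
func parseIssue(value string) issueValue {
	parts := strings.Split(value, ";")
	iv := issueValue{
		issuer: strings.ToLower(strings.TrimSpace(parts[0])),
		params: map[string]string{},
	}
	for _, p := range parts[1:] {
		name, val, ok := strings.Cut(strings.TrimSpace(p), "=")
		if ok {
			iv.params[strings.ToLower(strings.TrimSpace(name))] = strings.TrimSpace(val)
		}
	}
	return iv
}

// permits reports whether this single issue value authorizes the given
// CA, validation method, and account URI. A parameter that is absent
// places no restriction.
func (iv issueValue) permits(ca, method, account string) bool {
	if iv.issuer != ca {
		return false
	}
	if uri, ok := iv.params["accounturi"]; ok && uri != account {
		return false
	}
	methods, ok := iv.params["validationmethods"]
	if !ok {
		return true
	}
	for _, m := range strings.Split(methods, ",") {
		if strings.TrimSpace(m) == method {
			return true
		}
	}
	return false
}

// authorized applies the "any one record may match" rule. With no issue
// records at all, CAA places no restriction on issuance.
func authorized(issueValues []string, ca, method, account string) bool {
	if len(issueValues) == 0 {
		return true
	}
	for _, v := range issueValues {
		if parseIssue(v).permits(ca, method, account) {
			return true
		}
	}
	return false
}

func main() {
	acct := "https://acme-v02.api.letsencrypt.org/acme/acct/00000000"
	records := []string{
		"letsencrypt.org; validationmethods=dns-01; accounturi=" + acct,
		"amazonaws.com",
	}
	fmt.Println("dns-01: ", authorized(records, "letsencrypt.org", "dns-01", acct))
	fmt.Println("http-01:", authorized(records, "letsencrypt.org", "http-01", acct))
}
```

Under these assumptions, the dns-01 request from the listed account is allowed, while http-01 from the same account is refused because no `issue` value permits it.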

8 Likes

Not a misissuance, because the failure would be on the blocking side, but consider this scenario:
day 1: user A (with account A) gets a certificate for acme.com and some other domains with the http-01 challenge (this creates an authz in the valid state, validated via http-01)
day 7: the CAA record 0 issue "letsencrypt.org; validationmethods=dns-01" is added
day 10: user A now tries to get another certificate for acme.com:

  1. The CA reuses the valid authz for acme.com that was created on day 1 for this account, so the CA links the order to that authz.
  2. It notices that more than 8 hours have passed, so a CAA recheck is triggered. If it fails (because the authz doesn't satisfy validationmethods), the order fails, but the authz keeps its valid status.
    It looks like this will keep happening until the first authz expires, which can be up to a month later.
7 Likes

I guess it's open to pull request, not pull :sweat_smile:

7 Likes

Thanks! :kissing_heart:

8 Likes

And so this is handled differently by Boulder than the case where on Day 1 the CAA is missing (or includes letsencrypt), but on Day 7 the CAA record is changed to have other CAs but not include letsencrypt at all?

7 Likes

It's handled the same (the valid authz stays valid, but the order fails). With a validationmethods mismatch, though, the client should have been able to recover from it (by completing a new challenge of the right type).

9 Likes

I do have a bunch of throwaway domains I use for this kind of testing, perhaps I can do some experiments over the weekend, if I find some time.

Does one actually have to wait the full 8 hours to trigger this? IIRC, the BRs state 8 hours, but perhaps LE checks before every issuance, so you can trigger this almost immediately?

And in this case the client is unable to fix it, as it is forced to use an authz that is going to fail?

This is production-only, because it has different logic regarding authz reuse, right?

7 Likes

I had thought that Let's Encrypt used the same logic in staging vs. production, but that it was certbot which acted differently when pointed to the staging environment to inactivate old authorizations. I may be wrong, though.

8 Likes
// Per Baseline Requirements, CAA must be checked within 8 hours of
// issuance. CAA is checked when an authorization is validated, so as
// long as that was less than 8 hours ago, we're fine. We recheck if
// that was more than 7 hours ago, to be on the safe side. We can
// check to see if the authorized challenge `AttemptedAt`
// (`Validated`) value from the database is before our caaRecheckTime.
// Set the recheck time to 7 hours ago.
caaRecheckAfter := now.Add(-7 * time.Hour)

It looks like LE uses 7 hours as a safety margin. It's hardcoded, so it should be the same for both prod and staging.
ra.reuseValidAuthz is a config file value, so I have no idea whether staging's config uses it or not, although
it was disabled in staging in 2019.
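For anyone who wants to play with the threshold, here is a small standalone Go sketch of the comparison that comment describes. It is not the actual Boulder function: `needsCAARecheck`, `hoursAgo`, and `exampleNow` are hypothetical names, and the use of a strict `Before` comparison against the 7-hour mark is my assumption.

```go
package main

import (
	"fmt"
	"time"
)

// needsCAARecheck reports whether an authorization validated at
// validatedAt is older than the 7-hour safety margin, in which case
// CAA would be checked again before issuance.
func needsCAARecheck(validatedAt, now time.Time) bool {
	caaRecheckAfter := now.Add(-7 * time.Hour)
	return validatedAt.Before(caaRecheckAfter)
}

// hoursAgo is a helper for the example: the instant h hours before now.
func hoursAgo(now time.Time, h int) time.Time {
	return now.Add(-time.Duration(h) * time.Hour)
}

// exampleNow is an arbitrary fixed "current time" so the output is stable.
var exampleNow = time.Date(2022, 12, 1, 12, 0, 0, 0, time.UTC)

func main() {
	for _, h := range []int{1, 6, 8, 240} {
		recheck := needsCAARecheck(hoursAgo(exampleNow, h), exampleNow)
		fmt.Printf("validated %dh ago: recheck=%v\n", h, recheck)
	}
}
```

With this comparison, an authz validated exactly 7 hours ago is not rechecked, while anything older is.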

10 Likes

Okay, let me make sure I understand the scenario that @orangepizza is suggesting:

  1. Client creates a new Order for example.com
  2. Client successfully completes the HTTP-01 challenge for example.com
  3. Client waits more than 8 hours but less than 30 days
  4. Client sets a CAA record with "validationmethods=dns-01"
  5. Client attempts to Finalize their order
  6. The HTTP-01 validation gets used, CAA gets rechecked disallowing HTTP-01, issuance fails

In this scenario, it's fairly clear to me that the issue is on the client: don't change your CAA records between completing a challenge and finalizing the order. But there is another related scenario (which may be the one orangepizza was originally thinking about) that seems worse:

  1. Client successfully completes a full issuance cycle for example.com using HTTP-01
  2. Client waits more than 8 hours and less than 30 days
  3. Client sets a CAA record with "validationmethods=dns-01"
  4. Client creates a new Order for example.com
  5. Due to authz / validation document re-use, the Order gets created with the already-validated-via-HTTP-01 authorization already attached
  6. Client attempts to finalize their order, CAA gets rechecked, issuance fails

It's fairly difficult for Boulder to avoid this issue. If Boulder wanted to try to be smart, it could do a CAA check at Order creation time, to see if any of the cached valid authorizations it could re-use match the current CAA records. But Order creation is a synchronous operation, and doing a CAA check is very slow, so that's not a good solution. You might imagine that we could synthesize an Authorization object which has the old challenge already validated, but also allows other challenge types to still be validated as well -- but in fact this would be a violation of RFC 8555, which asserts that a given authorization can only have a single validated challenge associated with it. The only foolproof solution here is to either disable authorization reuse entirely, or at least shrink the reuse time to be the same as the CAA cache time (7 hours). I'd love to do this, and I think we probably will do it in the near-to-medium-term future, but it's a fairly large operational change that we can't commit to doing immediately.
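As a toy model of that second scenario, the Go sketch below compares a 30-day reuse window against one shortened to the 7-hour CAA recheck window. Everything here is a simplification for illustration rather than Boulder's implementation: the `authz` struct, `pickAuthz`, `finalize`, and `simulate` are hypothetical, and CAA is reduced to a single permitted method.

```go
package main

import (
	"fmt"
	"time"
)

const (
	caaRecheckWindow = 7 * time.Hour
	thirtyDays       = 30 * 24 * time.Hour
)

// authz is a toy stand-in for a validated ACME authorization.
type authz struct {
	method      string
	validatedAt time.Time
	expires     time.Time
}

// pickAuthz models order creation: the cached authz is reused only if
// it is unexpired and younger than reuseWindow. A nil result means the
// order gets a fresh pending authz, so the client may pick any challenge.
func pickAuthz(cached *authz, now time.Time, reuseWindow time.Duration) *authz {
	if cached == nil || !now.Before(cached.expires) {
		return nil
	}
	if now.Sub(cached.validatedAt) >= reuseWindow {
		return nil
	}
	return cached
}

// finalize models the finalize-time CAA recheck: if the validation is
// older than the recheck window and its method is not the one CAA now
// permits, issuance fails. The authz itself is left untouched.
func finalize(a *authz, now time.Time, caaMethod string) error {
	stale := now.Sub(a.validatedAt) > caaRecheckWindow
	if stale && a.method != caaMethod {
		return fmt.Errorf("CAA recheck failed: %s not permitted", a.method)
	}
	return nil
}

// simulate replays the scenario: http-01 validation on day 1, CAA then
// restricted to dns-01, and a new order on day 10.
func simulate(reuseWindow time.Duration) (reused bool, err error) {
	day1 := time.Date(2022, 12, 1, 0, 0, 0, 0, time.UTC)
	cached := &authz{method: "http-01", validatedAt: day1, expires: day1.Add(thirtyDays)}
	day10 := day1.Add(9 * 24 * time.Hour)

	a := pickAuthz(cached, day10, reuseWindow)
	reused = a != nil
	if !reused {
		// Fresh authz: the client completes dns-01 right away.
		a = &authz{method: "dns-01", validatedAt: day10, expires: day10.Add(thirtyDays)}
	}
	return reused, finalize(a, day10, "dns-01")
}

func main() {
	for _, w := range []time.Duration{thirtyDays, caaRecheckWindow} {
		reused, err := simulate(w)
		fmt.Printf("reuse window %v: reused=%v err=%v\n", w, reused, err)
	}
}
```

In this model the 30-day window reuses the http-01 authz and finalization fails, while the 7-hour window hands out a fresh authz that the client can satisfy with dns-01.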

Thinking about it some more, this scenario would arise specifically when a client
a) wants to re-issue on a shorter-than-usual timeframe;
b) wants to change which validation method they use; and
c) wants to use CAA to enforce their new validation method
all at the same time. This is very good to think about, but seems fairly niche to me -- specifically, I can imagine lots of people seeing this announcement and doing (a) and (c), but also changing the validation method that they use at the same time seems less likely. Also, this scenario can be resolved by deactivating the cached authorization.
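For reference, deactivating an authorization is defined in RFC 8555 (section 7.5.2): the client POSTs a status update to the authorization URL. The Go sketch below only builds the inner JSON payload. Wrapping it in a JWS signed with the account key and sending it is left out, and the `deactivation` type and `deactivationPayload` helper are names invented for this example. Many ACME clients or libraries may already offer this, so check yours before hand-rolling it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deactivation is the request body a client sends to an authorization
// URL to deactivate it, per RFC 8555 section 7.5.2.
type deactivation struct {
	Status string `json:"status"`
}

// deactivationPayload returns the JSON payload. A real client must
// still wrap this in a JWS signed with the account key and POST it to
// the authorization URL; that part is not shown here.
func deactivationPayload() (string, error) {
	b, err := json.Marshal(deactivation{Status: "deactivated"})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	payload, err := deactivationPayload()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(payload)
}
```

Once the cached authorization is deactivated, the next order for that name should receive a fresh pending authorization, and the client can then complete the challenge type that the CAA record permits.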

So unless there are other factors I'm missing, I think I'm not particularly concerned about this failure mode. If people actively start running into this, please report it here and we'll definitely take another look.

8 Likes

I'm not quite sure what you mean by this question. The CAA "critical" flag can only be set at the level of Properties (issue, issuewild, iodef, etc), not Parameters (accounturi, validationmethods, etc).

6 Likes

I think the problem scales with certs issued and can become problematic [at least for a few weeks] to some users.
Take the case of a large cert user [university/integrator/etc.]:
One third of their certs would renew on any given 30-day period.
Of those, how many would be renewing in a way that could be affected by this "catch-22"?
Even if notice is given to all their users...
How would they overcome it?
Would they have to delete their unexpired certs to issue new ones? [Would that even work?]
Without some proven guidance, I think we will start to see a whole lot of "creative failures" in overcoming this "problem".

So...
It is better to be prepared and have some good advice in hand to give those that may face it.
[I'll leave that as an exercise to those that are ready/willing/able to take it on]

3 Likes