I wonder, why were so many certificates scheduled for revocation?
Remember, the CAA rechecking bug ONLY applied to certificates that relied on a previous authorization that was more than 8 hours old at the time of issuance.
This means that ALL certificates issued within 8 hours of all their approved authorizations should be unaffected: even with the bug, which was against policy, issuance would have been allowed without any rechecking, because the original CAA check was still valid.
As far as I know, Let's Encrypt has a log of all authorizations that is stored at least for the certificate’s lifetime, because they are required to show that the authorizations happened if a legal case arises where someone complains about a fraudulently issued certificate.
Thus Let's Encrypt should be able to parse that log and pick only the certificates that have more than 8 hours between authorization and issuance for at least one of the domains, if I'm right. And most Let's Encrypt clients DO issue immediately after authorizing, which means there should not be any delay there.
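The selection described above could be sketched roughly like this. It is only an illustration of the "more than 8 hours between authorization and issuance" filter, not how Let's Encrypt actually picked certificates. The function name, the dictionary layout and the sample timestamps are all made up for the example.

```python
from datetime import datetime, timedelta

# A CAA check may be relied on for at most 8 hours before issuance.
CAA_RECHECK_WINDOW = timedelta(hours=8)


def domains_needing_recheck(issued_at, authorized_at_by_domain):
    """Return the domains whose authorization was older than the window at issuance.

    Hypothetical helper: `authorized_at_by_domain` maps each domain name on
    the certificate to the time its authorization was validated.
    """
    return sorted(
        domain
        for domain, authorized_at in authorized_at_by_domain.items()
        if issued_at - authorized_at > CAA_RECHECK_WINDOW
    )


# Made-up sample: two authorizations reused from two days earlier, one fresh.
issued = datetime(2020, 1, 3, 12, 0, 0)
authorizations = {
    "a.example.com": datetime(2020, 1, 1, 12, 0, 0),
    "b.example.com": datetime(2020, 1, 1, 12, 0, 0),
    "c.example.com": datetime(2020, 1, 3, 11, 59, 0),
}

stale = domains_needing_recheck(issued, authorizations)
print(stale)  # ['a.example.com', 'b.example.com']
```

A certificate would only be a candidate for revocation under this filter if the returned list is not empty.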
Or what is the problem with just picking those certificates?
It seems that Let's Encrypt has picked all certificates that were issued while the bug was present in the system, regardless of when the authorizations happened.
As far as I know, Let's Encrypt did single out only the certificates that the bug applied to. They said that the issue potentially affected 2.6% of active certificates, or about 3 million certificates.
The bug existed for months -- if they had wanted to revoke every certificate issued during that time period, whether it was necessary or not, that would have meant every unexpired certificate up until it was fixed.
I imagine that's true. The bug was probably often triggered when, for example, someone issued a certificate for "a.example.com and b.example.com" one day, and then replaced it with a certificate for "a.example.com, b.example.com and c.example.com" a few days later. The second certificate would probably reuse the authorizations for a.example.com and b.example.com (and involve a new authorization for c.example.com), triggering the bug.
Yeah, but one of my certificates was scheduled for revocation (domain sebbe.eu). And that client is written so that it always redoes all authorizations and then immediately creates a certificate. So it seems that the selection of certificates was based on some other measure than whether they were really affected by the bug.
Also, a lot of Certbot users seemed to have activated the option --force-renewal or --renew-by-default, which could have created certificates affected by the bug.
Does your client specifically deactivate authorizations, or frequently replace its account?
Reusing valid authorizations doesn't require special behavior by the client -- it's how things work by default. (A client that doesn't distinguish between pending and valid authorizations might wastefully try to validate a valid authorization again, but doing so is a no-op which won't return an error.)
You have to go out of your way to avoid reusing authorizations.
It looks like you renew or update your certificate approximately every 30 days. The affected certificate is from 2019-12-10 23:17:56. There is a previous certificate issued 2019-11-10 23:19:34-23:19:35. Assuming the November certificate used fresh authorizations, the December certificate could have reused authorizations that were approximately 29 days, 23 hours, 58 minutes and 22 seconds old.
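The age quoted above can be checked with a few lines of Python. The two timestamps are the ones from this post. The 30-day authorization lifetime is an assumption taken from this discussion, and the variable names are just for illustration.

```python
from datetime import datetime, timedelta

# Issuance times quoted in the post (seconds taken from the first timestamp).
previous_issuance = datetime(2019, 11, 10, 23, 19, 34)
affected_issuance = datetime(2019, 12, 10, 23, 17, 56)

# If the November authorizations were fresh, this is how old they were in December.
authorization_age = affected_issuance - previous_issuance
print(authorization_age)  # 29 days, 23:58:22

# Older than the 8-hour CAA window, so a recheck was required...
needs_caa_recheck = authorization_age > timedelta(hours=8)
# ...but still reusable, assuming valid authorizations last 30 days.
still_reusable = authorization_age < timedelta(days=30)
print(needs_caa_recheck, still_reusable)  # True True
```

So the December certificate would fall just inside the reuse window, by about a minute and a half.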
(Let's Encrypt recommends that certificates be renewed 30 days before they expire, or approximately every 60 days, at random times of the day.)
Aah, that makes sense actually. What my client does is create authorizations, then complete them (writing to the DNS zone file), and then request that they be validated by the LE server. (ACME v1)
So it could have behaved like a no-op when renewing the certificate 1 minute before the authorizations expired, with me believing that it used fresh authorizations when it didn’t. I actually thought that validating a valid authorization again would make it fresh again. (The client doesn’t make any effort to check the pending status of an authorization before it actually executes do_challenge (asking the server to check for validity). Only after that does it check the pending status, since the LE server needs some time to complete the validations before I can attempt an issuance. So I use the pending check as an “in progress indicator” only.)
The reason there are time shenanigans is that it's a cron script set up to run monthly, and the certificate renewal script takes a different amount of time to complete each run, causing the issuance times to differ by up to 1 minute.
The reason I renew monthly instead of every 60 days is that IF any error happens, or if LE is overloaded or whatever, the script will not make any attempt to fix the problem. By doing it monthly, I can have 2 failures and still have minimal disruption (at most 1-3 days, depending on exactly when in the month the monthly cron job runs).