With the recent addition of SCTs [1] in Let’s Encrypt certificates I noted something: The final certificates no longer get logged to Certificate Transparency.
Background: When embedding SCTs in certificates, Let’s Encrypt first issues a so-called pre-certificate that gets submitted to the logs; the logs then return a signed statement (the SCT), which gets included in the final certificate. It seems LE has decided to log only the precertificate now and not the final cert.
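(For anyone less familiar with the mechanics: a quick way to tell the two artifacts apart is by their RFC 6962 extensions. The precertificate carries the critical poison extension, while the final certificate instead embeds the returned SCT list. A minimal Go sketch, not part of any Let’s Encrypt tooling, that checks a PEM file for either:)

```go
// Rough sketch: distinguish a precertificate from a final certificate by
// the RFC 6962 extensions it carries.
package main

import (
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"log"
	"os"
)

const (
	oidCTPoison = "1.3.6.1.4.1.11129.2.4.3" // precertificate poison extension
	oidSCTList  = "1.3.6.1.4.1.11129.2.4.2" // embedded SCT list extension
)

func main() {
	pemBytes, err := os.ReadFile(os.Args[1]) // path to a PEM-encoded certificate
	if err != nil {
		log.Fatal(err)
	}
	block, _ := pem.Decode(pemBytes)
	if block == nil {
		log.Fatal("no PEM block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		log.Fatal(err)
	}
	for _, ext := range cert.Extensions {
		switch ext.Id.String() {
		case oidCTPoison:
			fmt.Println("precertificate (poison extension present)")
		case oidSCTList:
			fmt.Println("final certificate (embedded SCT list present)")
		}
	}
}
```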
I wanted to start a discussion about this. While I don’t think it’s a huge issue, I believe it would be better if both the precert and the final cert were logged. This would give better visibility into what’s going on in the CA ecosystem and may uncover bugs, e.g. in the encoding of SCTs [2]. See also the Twitter discussion that started when I tweeted about this [3].
From what I understand, the argument for not logging is that it overloads the logs with too many certs. But I don’t think this is a very strong argument. As long as certificates are actually used on the public Internet, it is very likely that they will be logged anyway, since for example the Google crawler automatically logs them.
The delayed logging may also be confusing for people using CT monitoring services like Facebook’s or Cert Spotter. People will get one notification when a cert is issued, due to the precertificate, and another one at some random later point when the Google crawler spots their cert. I believe it would be less confusing to get a single notification for both the precert and the cert when the cert is issued.
My feeling is that it’s better to log the final certificate too, even if the pre-certificate is a commitment to issue the final certificate.
I’m just worried about the increased burden on CT logs: since most CAs will probably choose to use pre-certificates for their CT proof, it will double the load.
On the other hand, if Let’s Encrypt doesn’t log the final certificates itself, a bad actor could collect them and submit a lot of certificates in a short amount of time in order to try to disqualify a log.
Submitting the final certificate as soon as possible may protect log operators.
Probably! Though rather than deploying temporary code in Boulder, we most likely would write a script that can run in a non-trusted zone, pulling the logged precerts from CT logs, fetching the final certificates by serial number from the ACME API, and logging those final certificates to CT. If that’s something you’d like to help out with writing, that would be helpful!
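For reference, the final step of such a script, submitting an already-issued chain to a log’s RFC 6962 add-chain endpoint, could look roughly like the Go sketch below. The precert scanning (get-entries) and the ACME fetch by serial are omitted, and the command-line layout is just an assumption, not how Boulder or any existing tool does it.

```go
// Rough sketch of the last step of such a backfill script: submit a final
// certificate chain (leaf first, then intermediates) to a CT log's
// RFC 6962 add-chain endpoint. Scanning get-entries for precert serials and
// fetching the final certs from the ACME API are left out here.
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"encoding/pem"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
)

func addChain(logURL string, derChain [][]byte) error {
	// RFC 6962 add-chain body: {"chain": ["<base64 DER>", ...]}
	req := struct {
		Chain []string `json:"chain"`
	}{}
	for _, der := range derChain {
		req.Chain = append(req.Chain, base64.StdEncoding.EncodeToString(der))
	}
	body, err := json.Marshal(req)
	if err != nil {
		return err
	}
	resp, err := http.Post(logURL+"/ct/v1/add-chain", "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("log returned %d: %s", resp.StatusCode, out)
	}
	fmt.Printf("SCT response: %s\n", out)
	return nil
}

func main() {
	// Usage: go run submit.go <log-url> <chain.pem>
	pemBytes, err := os.ReadFile(os.Args[2])
	if err != nil {
		log.Fatal(err)
	}
	var derChain [][]byte
	for block, rest := pem.Decode(pemBytes); block != nil; block, rest = pem.Decode(rest) {
		if block.Type == "CERTIFICATE" {
			derChain = append(derChain, block.Bytes)
		}
	}
	if err := addChain(os.Args[1], derChain); err != nil {
		log.Fatal(err)
	}
}
```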
I could do that. I was unaware it was possible to fetch an already-issued cert from an ACME server without authentication, though in retrospect it makes complete sense.
(Also, just checking: it’s possible to fetch a v1 certificate using the v2 api like this, correct? I did a quick test and it seems the answer is yes, but I want to avoid gotchas)
Update on this: We enabled logging of final certificates, but the load seemed to be causing availability issues for some of our logs, so we disabled it again. We’ll consider on Monday what our strategy will be for possibly turning it back on again.
I’m not sure there’s a logical strategy save for migration to logs that can handle the load.
The protocol doesn’t presently provide for declining to accept a new certificate entry that chains to an included root, for obvious reasons.
If you’re not logging the final certificates, it must be presumed that the final certificates will be accumulated and logged in bulk by a bad actor in an attempt to break various logs’ maximum-merge-delay promises.
Are there enough sufficiently performant logs accepting LE certificates that will not buckle under the load? If not, this surfaces significant questions about the feasibility of Certificate Transparency moving forward.
It raises questions, but it's not necessarily the death knell for the whole protocol if some deployments had growing pains when input increased 40% overnight.
That potential attack is alarming, though.
You can see when they turned it on and off again in Merkle Town's graphs:
I think the possibility of a coordinated slam of not-previously-logged final certificates should definitely be part of the threat model for CT logs.
If there’s a significant corpus of unlogged certificates accumulating, one must consider that a bad actor might be collecting them in order to slam a log with the whole lot at once. If the log doesn’t have protection against this kind of flood, it is conceivable that its MMD could be violated. Alternatively, if logs do have rate-limiting protections against this sort of attack, the attacker could presumably still cause at least a recurrent certificate-issuance blocking event of some duration by slamming the logs until hitting the rate limit, since CAs generally block issuance until they have enough SCTs to embed in the final certificate.
Ultimately, if a log today cannot sustainably handle at least a little more than 2x the volume of precertificates being submitted, it must be presumed that the log will eventually either rate-throttle or fail to meet the required guarantees. Either of these essentially makes the log unfit for use by the higher-volume CAs.
In any event, the clear anti-DoS mechanism for bulk overload is to go ahead and get the certificates logged up front. If that is unsustainable for a given log, that log’s infrastructure must be augmented or the log replaced.
Is your policy on which CT logs you log to (both for precerts and final certs) available anywhere? E.g. “Google Icarus and Cloudflare by default; if Google Icarus is down, switch to Google Skydiver; if Cloudflare is down, switch to DigiCert; if those are down too, decline to issue” or whatever?
The configuration isn't public (yet), as far as I know, but the code is.
It tries to get SCTs from every configured log. The logs are divided into groups; the configuration seems to define two groups, "Google" and "not Google". It keeps the fastest responses and cancels the remaining requests. If it's unable to get an SCT from at least one log in each group, it fails. If it gets more SCTs than necessary, it seems to discard the extras.
It also tries to log precertificates and (when enabled) final certificates to every log flagged "informational", but it doesn't treat it as an error if those submissions fail.
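To illustrate that behaviour, here is a toy Go sketch; it is not Boulder’s actual code, submitToLog is a hypothetical stand-in for a real add-pre-chain submission, and the grouping/cancellation structure is just my reading of the description above:

```go
// Toy illustration, not Boulder's actual code: race every log within a
// group, keep the first SCT that comes back, cancel the remaining requests,
// and fail if any group produced nothing.
package ctpolicysketch

import (
	"context"
	"errors"
	"fmt"
	"time"
)

type SCT []byte

// submitToLog is a hypothetical helper standing in for a real RFC 6962
// add-pre-chain submission to a single log.
func submitToLog(ctx context.Context, logURL string, precert []byte) (SCT, error) {
	return nil, errors.New("not implemented in this sketch")
}

// getSCTs obtains one SCT per group ("Google", "not Google", ...) or fails.
func getSCTs(ctx context.Context, groups map[string][]string, precert []byte) (map[string]SCT, error) {
	results := make(map[string]SCT)
	for groupName, logs := range groups {
		gctx, cancel := context.WithTimeout(ctx, 30*time.Second)
		type result struct {
			sct SCT
			err error
		}
		ch := make(chan result, len(logs))
		for _, url := range logs {
			go func(url string) {
				sct, err := submitToLog(gctx, url, precert)
				ch <- result{sct, err}
			}(url)
		}
		var winner SCT
		for range logs {
			if r := <-ch; r.err == nil {
				winner = r.sct
				break // fastest success wins; the rest are cancelled below
			}
		}
		cancel() // cancel any submissions still in flight for this group
		if winner == nil {
			return nil, fmt.Errorf("no SCT obtained from group %q", groupName)
		}
		results[groupName] = winner
	}
	return results, nil
}
```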
That’s fascinating, thank you. But I’m curious to know which logs are in each group. The Google list will be fairly boring, although I assume LE took guidance from them on which to use. But given the volume, they would need to have made an arrangement with the operators of the non-Google logs, so I’d like to know who has told them they can cope with it.
That is an interesting question. It would also be interesting to know which log professed that it was having difficulty with the combined load of the precerts and the final certificates.
We don’t currently have an always-up-to-date page for our log submission policy, but that’s a good idea for the medium term. As of right now: for Google, we submit to Icarus and Argon; for non-Google, we submit to Nimbus, Sabre, and Mammoth. Our informational logs are Nessie and our own log instance that we are load testing. We additionally submit final certs to Argon.
Thank you for providing this clarity. However, your message seems to imply you only submit final certs to one log - Argon. Or is there a typo in there, as Argon is mentioned twice?
That’s correct. We submit both precerts and final certs to Argon; to the other logs, we submit only precerts. A couple of weeks ago we were briefly submitting both types of certs to all logs, but we’ve since changed to just the one. We’re aware of concerns about third parties copying final certificates between logs and recreating the same load issues we were hoping to alleviate, but this seems to provide a bit of temporary relief. I’ve been meaning to open a thread on ct-policy about the possibility of logs implementing prioritization or separate rate limits for precerts versus final certs.
Ah, yes - temporary brain fade. It’s actually the precert which needs to comply with the log policy of one Google, one non-Google, not the final cert. Doh.