Sometimes it's necessary to break the rules - and to change them

For the last few days I was a little bit nervous. 3,048,289 Let's Encrypt certificates to revoke. I checked the file: some accounts have thousands of certificates that must be renewed. What happens if a webmaster is on vacation?

Revoked certificates can kill a business. HTTPS is great, and so is HTTPS everywhere. There is no way back to HTTP. But a website that doesn't work is critical. The bug is not good, but there was still a correct cached authorization.

This morning - oh:

The first information came on Tuesday, and now 1,706,505 certificates have been renewed, client-initiated. So Let's Encrypt can revoke these certificates without breaking any website.

By this evening, maybe the next 500,000 to 1,000,000 certificates will have been renewed, so it will be possible to revoke those as well.

So: Thanks! Websites and users are happy. The rule is broken, but the effect is helpful.


Is this rule - revoke all certificates in 5 days - really good?

The certificate authority must:

That takes some time. There were users who, on Tuesday at 16:00, had no mail and seemed not to be affected. At 20:00: oh, a mail - affected.

Users need some time to check their configuration and to renew the certificates.


So my idea: a split rule.

  • The CA has 3 or 4 days to identify the certificates and to send a mass mailing.
  • Users should have 3 days to renew their certificates
  • After 6 or 7 days all affected certificates must be revoked.

So not 5 days, but 6 or 7.

5 Likes

Remember that the Baseline Requirements are not Let's Encrypt's rules, but ones that all the certificate authorities and major browser/OS vendors agreed upon.

As end users we have one perspective, but this affects a much broader set of concerns and use cases.

3 Likes

Yes. But read

https://www.digicert.com/position-on-1-year-certificates/

In August 2019, CA/B Forum Ballot SC22 was introduced by Google to reduce TLS certificate validity periods to one year. CAs reviewed this proposal with their customers and produced thousands of comments from users, which mostly showed opposition, due to the additional work required by IT teams to handle shorter validity periods. The ballot failed in the Forum, which meant certificate maximum lifetimes remained at two years.

The IT teams didn't want that. Now Apple reduces the validity period -> one year.

Let's Encrypt would have been able to revoke all certificates. But then the webmasters and users would have had the problem, not Let's Encrypt.

The communication between a CA and all users needs some time.

  • Find all certificates
  • Send a mail
  • The user must create a new certificate. But some users may have a lot of such certificates.
1 Like

Perhaps the ACME specs should be improved with something another user suggested somewhere on this forum (too lazy to search for the source, sorry!): make some kind of state called “revocation pending” in some kind of OCSP-like manner: when running the renew cronjob, the ACME client not only checks for the OCSP good/revoked state, but also queries the “pending revocation” state: if a revocation is pending, it should try to renew.

Obviously it would take a lot of work and resources on the CA side, all for a bug which shouldn’t have happened in the first place.
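The “revocation pending” check described above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `REVOCATION_PENDING` state does not exist in OCSP or ACME, and `CertStatus`, `Certificate` and `should_renew` are hypothetical names, not part of any real ACME client.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class CertStatus(Enum):
    GOOD = "good"
    REVOCATION_PENDING = "revocation_pending"  # hypothetical state, not in OCSP or ACME today
    REVOKED = "revoked"


@dataclass
class Certificate:
    not_after: datetime   # expiry time of the certificate
    status: CertStatus    # status as reported by the (hypothetical) CA endpoint


def should_renew(cert: Certificate, now: datetime,
                 renew_before: timedelta = timedelta(days=30)) -> bool:
    """Decide whether the renew cron job should request a new certificate."""
    # Renew immediately if the CA has revoked the certificate or announced a revocation.
    if cert.status in (CertStatus.REVOKED, CertStatus.REVOCATION_PENDING):
        return True
    # Otherwise fall back to the usual "renew N days before expiry" rule.
    return cert.not_after - now <= renew_before


now = datetime(2020, 3, 5, tzinfo=timezone.utc)
fresh = Certificate(not_after=now + timedelta(days=80), status=CertStatus.GOOD)
flagged = Certificate(not_after=now + timedelta(days=80), status=CertStatus.REVOCATION_PENDING)
print(should_renew(fresh, now), should_renew(flagged, now))
```

With a check like this in the cron job, a certificate that is still valid for 80 days would be replaced as soon as the CA flags it, without the webmaster having to read a mail first.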

4 Likes

Related: Ability for Automated Notification of Revocations

6 Likes

Not only related, but that was the thread I had in mind! Thanks :slight_smile:

3 Likes

Yep, that's a good idea.

Sometimes webmasters are offline. And in a (German) forum there was exactly such a situation: at 16:00, no mail. Then at 20:00, a mail.

3 Likes

i had a similar idea the other day!

changing the rules, sure. but Let’s Encrypt is currently in the unique position of being the only free CA (WoSign and StartSSL were kicked out, and CAcert was never in the root programs either), so this gives the others unneeded fuel to request throwing out Let’s Encrypt. especially since LE, iirc, takes a neutral stance on content policing and has already been used for malware, phishing and other unethical stuff, this won’t get better.

2 Likes

I don't think there is a way back to only paid certificates. That time is over.

And there (via 2020.02.29 CAA Rechecking Bug - #4 by jsha )

@jsha announces such a new protocol:

Therefore, our conclusion is that we need to develop a protocol to notify Subscribers' systems of imminent certificate revocation, so those Subscribers can automate the process of replacing affected certificates before the deadline. We plan to design this protocol publicly, in collaboration with the PKI community, so that any CA and any Subscriber can implement it. We will also collaborate directly with popular ACME clients to integrate and test such automated replacement.

If such a solution is deployed, it's no longer a problem to revoke a lot of certificates within 5 days.

2 Likes

I would love that, but aside from LE there is iirc no CA offering free certs, and if LE gets thrown out, it's kinda game over.

true, but the question right now is less about the future and more about the current incident, as that is probably the "best" opportunity to get LE knocked out, and I am sure that paid CAs would take every opportunity to stab LE and force them out of the market.

not that I would want that, but realistically speaking.

awesome.

3 Likes

There's also the Norwegian CA Buypass, which offers free certificates with ACME. According to this page there are some restrictions, but it's definitely better than nothing :slight_smile:

3 Likes

The paid CAs know the game is over. Paid certificates are a niche, nothing else. Large sites that used EV certificates earlier now use Let's Encrypt certificates.

@jsha wrote it - 1619179 - Let's Encrypt: Incomplete revocation for CAA rechecking bug

By reviewing previous incident reports and analyzing our current situation, a common root cause of failure to timely revoke is that Subscribers are not able to replace certificates on the BR-mandated timelines (24 hours and 5 days, depending on the issue).

That's the same situation. At first, Let's Encrypt wanted to revoke all certificates. That changed, because it would break too much.

Last year there was another bug (not in Let's Encrypt certificates, but at most of the other CAs): the random serial number generator had an error. Following the rules, millions of certificates had to be revoked -> that did not happen on time.

So this rule is critical (or gets ignored) if the error affects a lot of certificates.

1 Like

well, but what do you think would happen if LE got thrown out of the root stores for... let's say a violation of the BRs?

we would go back to having to pay for even DV certs.

dafaq? how did they get away with that? I mean this is frankly bonkers.


suuure, a CA which blatantly lies about the nature of the certificate is so totally trustworthy.

refresher course on the EV Guidelines:

EV Certificates focus only on the identity of the Subject named in the Certificate, and not on the behavior of the Subject. As such, an EV Certificate is not intended to provide any assurances, or otherwise represent or warrant:
(1) That the Subject named in the EV Certificate is actively engaged in doing business;
(2) That the Subject named in the EV Certificate complies with applicable laws;
(3) That the Subject named in the EV Certificate is trustworthy, honest, or reputable in its business dealings; or
(4) That it is “safe” to do business with the Subject named in the EV Certificate.

3 Likes

Let’s also note that the justification paid CAs usually give for avoiding mass revocation is “it will impact our customers”, whereas Let’s Encrypt’s justification for not revoking all certificates was “it will impact web users, decreasing security if they disable OCSP check”.

4 Likes

A German source: millions of TLS certificates with wrong serial numbers.

https://www.golem.de/news/zertifizierungsstellen-millionen-tls-zertifikate-mit-fehlendem-zufallsbit-1903-139979.html

BR: a serial number must contain at least 64 random bits. But there are serial numbers with only 63 random bits.

DarkMatter was the first to have the problem:

https://groups.google.com/forum/m/#!topic/mozilla.dev.security.policy/nlN_QrDwgaw

They used 64-bit integers to create serial numbers. An integer may be < 0, but serial numbers must be > 0, so only 63 random bits are used.
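A minimal sketch of how the missing bit can come about, assuming the generator draws 64 random bits and then clears the sign bit to keep the value positive. This is an illustration only, not the actual EJBCA code, and both function names are made up for the example.

```python
import secrets

MASK_63 = (1 << 63) - 1  # all bits set except the top (sign) bit of a 64-bit value


def serial_from_signed_64() -> int:
    """Draw 64 random bits, then clear the sign bit so the value is non-negative.

    The top bit is always 0 afterwards, so only 63 bits are random.
    """
    return secrets.randbits(64) & MASK_63


def serial_with_extra_bits(total_bits: int = 72) -> int:
    """Draw more bits than required, so clearing the top bit still leaves
    total_bits - 1 (here 71) random bits, which is more than 64.

    A real implementation would also reject a zero serial number.
    """
    return secrets.randbits(total_bits) & ((1 << (total_bits - 1)) - 1)


print(serial_from_signed_64().bit_length() <= 63)   # always True
print(serial_with_extra_bits().bit_length() <= 71)  # always True
```

The usual way around this is the second variant: generate a few more random bits than the minimum, so that fixing the sign bit no longer drops the serial number below 64 random bits.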

That's a problem of the EJBCA software used to create certificates. Google, Apple, GoDaddy and others had the same problem.

Google + Apple -> only their own certificates are affected.

GoDaddy:

https://groups.google.com/forum/m/#!topic/mozilla.dev.security.policy/S2KNbJSJ-hs

At least 1.8 million certificates are affected.

The issue was introduced with a change in 2016.

Apple: the same thing there:

To minimize impact to our users, we do not expect to revoke all impacted certificates within the 5-day requirement. We expect to provide a timeline for revoking all impacted certificates in a forthcoming update.

Total number of impacted certificates that are still valid (not expired and not revoked) as of 2019-03-07 3:20 PST: ~558,000

558,000 certificates not immediately revoked. The same argument:

minimize impact to our users

2 Likes

PS: The problem shows that a short lifetime is helpful.

Last year the other CAs blocked a Google idea to reduce the lifetime to 13 months.

Now Apple forces that.

Letsencrypt certificates -> no problem, 13 months -> 5 certificates :laughing:

Automation -> shorter lifetime -> software bugs are not so critical.

Result: Not critical bug -> no revocation required.

4 Likes

that is dumb, so they can get away with some lame excuse like that. the excuse of LE is considerably better (along with the side fact that most mobile browsers and Chrome have OCSP disabled anyway, so it makes no difference there).

thanks I know.

one thing LE could try would be to optionally (but maybe even by default) allow even shorter-lived certs, while keeping the 90 days for people who prefer that, so anyone who wants them or doesn't care can get more secure certs that way.

2 Likes

I checked that Apple bug report:

The start:

2019-03-05 8:00 AM - We became aware of a potential issue with our understanding and configuration of the CA Software used for issuing publicly trusted certificates, and began investigating.

Revocation status 2019-03-12:

Revocation Status

More than 355,000 certificates have been revoked since the incident was detected.

23% of total, impacted certificates issued were still valid (not expired and not revoked) as of 5 days from when the incident was identified. Further analysis is being performed to determine an appropriate schedule for revoking these certificates.

2019-03-22:

Over 115,000 additional certificates have been revoked since our last update leaving less than 10% of the total population of impacted certificates remaining (file attached with remaining certificates).

2019-03-30:

We've been working our plan to revoke impacted certificates. Thus far over 500,000 certificates have been revoked since the issue was identified and 54,853 remain (file attached with remaining certificates). Our plan will result in all impacted certificates being revoked.

2019-05-03:

Most certificates have been revoked and less than 1% of the total population of impacted certificates remains. As previously noted, it is expected that all impacted certificates will be revoked by July 15.

Finished:

In a previous update, we committed to addressing all impacted certificates by July 15. As of July 10, all valid, impacted certificates had been revoked.

So the revocation was finished after more than 4 months.

If all Let's Encrypt certificates are renewed after 60 days (the typical configuration), then after 2 months every certificate has been replaced and the old ones can be revoked without any problems.

That is more than 2 months earlier.
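As a quick check of that comparison, here is a small calculation using the dates from the Apple report quoted above (detected 2019-03-05, all revoked as of 2019-07-10) and the 60-day renewal interval. The variable names are just for this example, and the 60 days are the assumption from the post, not a measured value.

```python
from datetime import date

detected = date(2019, 3, 5)    # Apple became aware of the issue
finished = date(2019, 7, 10)   # all valid, impacted certificates revoked

apple_days = (finished - detected).days   # time until revocation was complete
renewal_days = 60                         # assumed renewal interval for Let's Encrypt clients

gap_days = apple_days - renewal_days
print(apple_days, gap_days)  # → 127 67
```

So with automated 60-day renewals, every affected certificate would have been replaced about 67 days, a bit over two months, before the Apple revocation was complete.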

2 Likes
