Let's Encrypt Uptime - Comparing 2019 with 2016/17

DanCvrcek · November 24, 2019, 7:55pm

I analyzed LE uptime at the end of 2017 - looking at its first 16 or so months. As it’s been a while, I was somewhat curious what changed and did another quick analysis of the https://letsencrypt.status.io data to find out.

The exec summary: using only “full disruptions” (including planned restarts of LE services), the uptime is somewhere around 99.92% - compared to 99.86% in 2017.

If we include partial “incidents”, the uptime goes down to 96.4% - while you probably wouldn’t notice some of the partial service disruptions, it’s hard to further categorize those with the available data.
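To make these percentages more tangible, here is a minimal sketch that converts an uptime percentage into the downtime it implies per year. It assumes the figures can be treated as annualized rates, which is a simplification, since the analysis window is not exactly one year. The function name is illustrative and not taken from the original analysis.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Minutes of downtime per year implied by an uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * MINUTES_PER_YEAR


# Uptime figures quoted in the post above.
figures = [
    ("2017, full disruptions", 99.86),
    ("2019, full disruptions", 99.92),
    ("2019, incl. partial incidents", 96.4),
]

for label, pct in figures:
    hours = downtime_minutes_per_year(pct) / 60
    print(f"{label}: {pct}% -> about {hours:.1f} hours of downtime per year")
```

Under that assumption, 99.92% is roughly 7 hours of downtime per year, 99.86% is roughly 12 hours, and 96.4% is roughly 13 days.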

You can see the full text with charts at: https://keychest.net/stories/lets-encrypt-uptime-2-years-on

(please do chip in if you spot any issue with the results!)

My main question - when I sipped a single malt looking at the charts - was what aspirations are there in terms of reliability and if it can ever achieve the uptime of “commercial” CAs.

Osiris · November 24, 2019, 8:34pm

Could you explain to me why you’re mixing linear and logarithmic Y-axis between the different graphs?

rg305 · November 24, 2019, 8:38pm

You're going to have to quantify that in real numbers before anyone even understands there is a difference.
I mean, is there?
Taking into account the frequency with which other CAs renew certs (annually or bi-annually), that means they get 6 to 12 times less use.
And given that they probably service that many less certs... are you really comparing apples to apples?
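One way to arrive at the "6 to 12 times" figure is sketched below. It assumes Let's Encrypt's 90-day certificates are renewed every 60 days, and reads "bi-annually" as every two years. Both readings are assumptions about what was meant, and the function name is illustrative.

```python
DAYS_PER_YEAR = 365


def issuances_per_year(renewal_interval_days: float) -> float:
    """How many times per year one certificate is (re)issued."""
    return DAYS_PER_YEAR / renewal_interval_days


# Assumption: 90-day Let's Encrypt certs renewed every 60 days.
lets_encrypt = issuances_per_year(60)
# Assumption: other CAs renew once a year or once every two years.
annual = issuances_per_year(365)
two_year = issuances_per_year(2 * 365)

print(f"vs annual renewal:   {lets_encrypt / annual:.1f}x more issuance requests")
print(f"vs two-year renewal: {lets_encrypt / two_year:.1f}x more issuance requests")
```

This only counts issuance requests per certificate. It says nothing about the total number of certificates each CA serves, which is the second half of the point above.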

DanCvrcek · November 24, 2019, 8:51pm

You're right in terms of their sales to end-users. In my experience though, the bulk of their revenues comes from enterprise users. This means API integration, custom root CAs, running OCSPs, etc.

But I take your point.

DanCvrcek · November 24, 2019, 9:54pm

Purely practical reasons - I started with linear, but subsequent charts were losing interesting information because of large outliers. I mention the log axis in the captions. I suppose I could have come up with a better solution had I had more time.

_az · November 24, 2019, 10:20pm

What’s the uptime if you exclude this week-long “incident”?

Would be nice to have a dedicated graph for OCSP as well, since as you remark, it’s the most important thing to keep online.

DanCvrcek · November 24, 2019, 10:32pm

97.9% ... there was another partial disruption (timeouts on the API) ... excluding that one as well would end up at 99.7%.

OCSP - I could see only 2 incidents, both short at 5 and 7 minutes (> 99.99%), but it's not clear whether those lengths reflect the real downtime or whether they leave out the time between the first reports and LE starting to look into it.
...
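As a rough check on the OCSP figure, the sketch below computes uptime from a list of incident durations and finds the shortest observation window over which 12 minutes of downtime still exceeds 99.99%. The one-year window and both function names are assumptions for illustration, since the exact analysis window is not stated here.

```python
MINUTES_PER_DAY = 24 * 60


def uptime_percent(downtime_minutes: list[float], window_days: float) -> float:
    """Uptime over a window, given the duration of each incident in minutes."""
    window_minutes = window_days * MINUTES_PER_DAY
    return 100.0 * (1.0 - sum(downtime_minutes) / window_minutes)


def min_window_days(downtime_minutes: list[float], target_percent: float) -> float:
    """Shortest window (in days) over which this downtime still meets the target."""
    allowed_fraction = (100.0 - target_percent) / 100.0
    return sum(downtime_minutes) / allowed_fraction / MINUTES_PER_DAY


ocsp_incidents = [5, 7]  # minutes, as reported in the post above

# Assumption: a one-year window, used only to illustrate the calculation.
print(f"uptime over 365 days: {uptime_percent(ocsp_incidents, 365):.4f}%")
print(f"window needed for 99.99%: {min_window_days(ocsp_incidents, 99.99):.1f} days")
```

So the "> 99.99%" claim holds for any window longer than about 83 days, provided the reported 5 and 7 minutes really are the full downtime, which is exactly the caveat raised above.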

_az · November 24, 2019, 10:37pm

Ah yeah, this one - 11 days.

If I remember right, it started with the move from Akamai to Cloudflare (New CDN for the Production API), where they went from relying on the CDN to terminate SSL, to doing it themselves.

I was originally going to say that your conclusion was a little overblown, but now that I think about it, that migration could have been a bit smoother.

Edit: another interesting thing is that OCSP is still on Akamai. If it had been migrated as well, things could have gone real bad!

DanCvrcek · November 24, 2019, 10:41pm

sometimes it's better to avoid touching things that work. Although, I find this rule somewhat ... inflexible

system · December 24, 2019, 10:54pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
