We are developing some ACME tools, and since we are curious about Let's Encrypt performance, we built online monitoring as a validation tool.
If you find it useful - good. If you think it’s lacking something - even better, especially if you let us know.
We issue and measure 8 certs/minute across 4 locations (production) and 20 certs/minute across 5 locations (staging). Timing is measured for each API call separately. The library also has an OCSP client, but we don't collect data for it yet.
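The per-call timing could be done with a small wrapper around each request in the issuance flow. A minimal sketch (the `timed_call` helper and endpoint names are illustrative, not our library's actual API; the lambdas stand in for real HTTP calls):

```python
import time
from typing import Any, Callable, Tuple

def timed_call(fn: Callable[[], Any]) -> Tuple[Any, float]:
    """Run fn and return (result, elapsed_ms).

    Wrapping every ACME API call separately lets latency be
    attributed per endpoint rather than per issuance.
    """
    start = time.perf_counter()
    result = fn()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

# Hypothetical usage: one measurement per endpoint in an issuance flow.
timings = {}
for endpoint, call in [
    ("newNonce", lambda: "nonce"),  # stand-in for a real HTTP call
    ("newOrder", lambda: "order"),
]:
    _, ms = timed_call(call)
    timings[endpoint] = ms
```

Using `time.perf_counter` rather than wall-clock time avoids jumps from NTP adjustments between the start and end of a call.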
Update 1: June 27, 16:15 UTC - upgraded the monitoring servers, improved caching, and optimized data access to speed up collection; also corrected the visualization of downtimes (we initially assumed that downtime = no API response … how silly of us).
Update 2: We detected the first downtime on the 25th - 11 minutes before it was officially reported.
As far as I can see, the user agreement doesn't cover technical details; those are enforced through rate limits, with which we comply.
Interestingly, we looked into load-testing the LE API, but it seemed too harsh even as just a few bursts a day, and you can only do it against a couple of API endpoints. It may also be that this kind of information is better kept "less public".
We assumed that variations in load would show up in the overall latency. The first weekly report suggests this may be the case, but only time will tell. Latency went up by about 1000 ms on all monitoring stations on Thursday morning. That may simply coincide with maintenance, although the increased latency lasted much longer.
Cool tool! Just to be clear, the timestamp on letsencrypt.status.io doesn't reflect the time our alerts go off; it's the time that we posted the update. You can see in some of our public post-mortems that the incident timeline includes the internal alert notification time, which is different from the public status page.
We have also started producing weekly reports - here's a chart of latency over the last week, extracted from the report. Each data point represents 100-120 transactions. This particular chart doesn't show downtimes; it interpolates over missing data.
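For the curious, the interpolation over missing buckets can be as simple as linear interpolation between the nearest known neighbours. A sketch of that idea (assumed behaviour for the chart, not the monitor's actual code; gaps at the ends are left unfilled):

```python
def interpolate_gaps(series):
    """Replace runs of None with values interpolated linearly
    between the nearest known neighbours on each side.

    Leading/trailing gaps have only one neighbour, so they stay None.
    """
    vals = list(series)
    n = len(vals)
    i = 0
    while i < n:
        if vals[i] is None:
            j = i
            while j < n and vals[j] is None:
                j += 1  # find the end of this gap
            if i > 0 and j < n:  # neighbours exist on both sides
                left, right = vals[i - 1], vals[j]
                span = j - (i - 1)
                for k in range(i, j):
                    frac = (k - (i - 1)) / span
                    vals[k] = left + (right - left) * frac
            i = j
        else:
            i += 1
    return vals
```

For example, a gap between buckets at 100 ms and 130 ms would be filled with 110 ms and 120 ms, which is why downtimes don't appear as holes in the chart.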