Here I reported an “unauthorized” OCSP issue that is not caused by temporary overload or the like. It is 100 percent reproducible. With Must-Staple enabled, this issue is really bad:
Reply to self: Service seems to be re-established.
Thanks for the report! This was related to a recent outage: https://letsencrypt.status.io/pages/incident/55957a99e800baa4470002da/5816da347170c62119001f43. We’ll be writing up a full report soon, but the short version is that our database was overloaded, and we use the same database to serve OCSP queries. We’ll work to improve reliability in the future.
You may be interested to read these gists about the ways in which OCSP stapling is implemented suboptimally in Apache and Nginx: https://gist.github.com/AGWA/1de6c26be5396f7cbce7ee016302d684 and https://gist.github.com/sleevi/5efe9ef98961ecfb4da8. Ideally your web server (looks like Apache) would keep the latest OCSP response around until it could be replaced by a fresher one. If that were the case, any outage shorter than ~3.5 days could be weathered safely, even with a Must-Staple cert. Unfortunately, Apache drops its cached OCSP response after an hour, which means that any OCSP responder outage can cause an outage in your site.
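To make the "keep the latest response until a fresher one replaces it" idea concrete, here is a minimal Python sketch. It is not how Apache or Nginx implement stapling; the `OcspResponse` and `StapleCache` names and the `fetch` callback are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OcspResponse:
    body: bytes
    this_update: float  # seconds since epoch, when the response was produced
    next_update: float  # seconds since epoch, response is invalid from here on


class StapleCache:
    """Hypothetical cache that keeps the last good response until a fresher one arrives."""

    def __init__(self, fetch: Callable[[], Optional[OcspResponse]]):
        # fetch is an assumed callback that queries the responder;
        # it returns None or raises when the responder is unavailable.
        self._fetch = fetch
        self._current: Optional[OcspResponse] = None

    def refresh(self) -> None:
        try:
            fresh = self._fetch()
        except Exception:
            fresh = None  # treat any fetch failure as "no new response"
        # Replace the cached response only when the new one is actually fresher.
        if fresh is not None and (
            self._current is None or fresh.this_update > self._current.this_update
        ):
            self._current = fresh

    def staple(self, now: float) -> Optional[bytes]:
        # Keep serving the cached response for as long as it is still valid.
        if self._current is not None and now < self._current.next_update:
            return self._current.body
        return None
```

The point of the design is that a failed refresh never discards a response that is still within its validity period, so a responder outage only becomes visible once the cached response itself expires.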
My apologies for the downtime!
Thanks for the links to some very interesting and educational reading.
We all learn from our experiences. I have no doubt you follow the “What can we learn from that outage?” line of thinking. I’ve (we’ve) already seen a lot of signs of that.
You’re providing a very professional service with a fantastic support level and you’re labelling all this as “Free Beer”. You’ve got nothing to apologize for. Hats off for your hard and intense work.
P.S. I don’t remember when I got a truly free beer last time.
jsha: I agree with biker, you do a great job, no reason to apologise!
Regarding the suboptimal OCSP handling in mod_ssl, I also filed an Apache bug report a while ago: https://bz.apache.org/bugzilla/show_bug.cgi?id=57121
Better values to set for Apache’s mod_ssl to mitigate the suboptimal handling of OCSP replies:
SSLStaplingReturnResponderErrors off
SSLStaplingResponderTimeout 4
SSLStaplingStandardCacheTimeout 172800
SSLStaplingErrorCacheTimeout 60
These settings refresh valid OCSP responses only every 48 hours and retry sooner after erroneous OCSP replies. This helps with short outages, but it does not help if the OCSP server is still in a generally bad state once the 48 hours are over. In any case, I would currently recommend these settings for every Apache setup with OCSP.
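As a rough illustration of that trade-off, the Python sketch below checks whether an outage fits inside the remaining cache lifetime. It is a simplified model, not a description of mod_ssl internals: it assumes the cached response stays usable until the cache timeout expires and that the response's own validity period is longer than the timeout. The function name `outage_is_covered` is made up for this example.

```python
STANDARD_CACHE_TIMEOUT = 172800  # seconds, the SSLStaplingStandardCacheTimeout value above


def outage_is_covered(cache_age_s: int, outage_s: int,
                      cache_timeout_s: int = STANDARD_CACHE_TIMEOUT) -> bool:
    """Simplified model: the cached response is usable until cache_timeout_s after it was fetched."""
    remaining = cache_timeout_s - cache_age_s
    return outage_s <= remaining


HOUR = 3600
# Outage starts 1 hour after the last successful refresh and lasts 6 hours.
print(outage_is_covered(cache_age_s=1 * HOUR, outage_s=6 * HOUR))   # True
# Outage starts 47 hours after the last successful refresh and lasts 6 hours.
print(outage_is_covered(cache_age_s=47 * HOUR, outage_s=6 * HOUR))  # False
```

In this model, how much protection you get depends on when the outage starts relative to the last successful refresh, which is why the longer cache timeout helps with short outages but is not a complete fix.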
This is a very helpful example configuration, thanks for posting it. I’ll make sure to reference it if I add a page about OCSP stapling to our documentation.