OCSP server sending expired responses + stapling breaks Chrome

There have been several complaints today that the latest Chrome can’t connect to some sites using Let’s Encrypt certificates. I narrowed it down to OCSP: all sites in question get an expired response from the OCSP server (for example kvlt.ee):

$ openssl ocsp -header "HOST" ocsp.int-x3.letsencrypt.org -issuer kvlt.chain.ee -cert kvlt.ee.pem -text -url http://ocsp.int-x3.letsencrypt.org/
...
Response Verify Failure
139881862981264:error:27069076:OCSP routines:OCSP_basic_verify:signer certificate not found:ocsp_vfy.c:92:
kvlt.ee.pem: WARNING: Status times invalid.
139881862981264:error:2707307D:OCSP routines:OCSP_check_validity:status expired:ocsp_cl.c:370:
good
This Update: Dec  5 04:00:00 2016 GMT
Next Update: Dec 12 04:00:00 2016 GMT
  1. For any other browsers (Firefox, Safari) it’s not fatal.
  2. Switching off OCSP stapling fixes the problem for Chrome as well.
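
The "status expired" error in the output above follows directly from the two timestamps: a response is only usable between This Update and Next Update. Here is a minimal Python sketch of that check, assuming the timestamp format printed by openssl above. The function names are made up for illustration, and this is not OpenSSL's actual implementation (real validators usually also tolerate some clock skew).

```python
from datetime import datetime, timezone

# Timestamp format as printed in the openssl output above,
# e.g. "Dec  5 04:00:00 2016 GMT".
OPENSSL_TIME_FORMAT = "%b %d %H:%M:%S %Y GMT"


def parse_openssl_time(text):
    """Parse an openssl-style timestamp into an aware UTC datetime."""
    normalized = " ".join(text.split())  # collapse the padded day ("Dec  5")
    parsed = datetime.strptime(normalized, OPENSSL_TIME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def ocsp_window_status(this_update, next_update, now):
    """Classify `now` relative to the response's validity window."""
    if now < this_update:
        return "not yet valid"
    if now > next_update:
        return "expired"
    return "valid"


# Values taken from the output above; `now` is roughly when the
# first errors were reported in this thread.
this_update = parse_openssl_time("Dec  5 04:00:00 2016 GMT")
next_update = parse_openssl_time("Dec 12 04:00:00 2016 GMT")
now = datetime(2016, 12, 12, 7, 5, tzinfo=timezone.utc)
print(ocsp_window_status(this_update, next_update, now))  # expired
```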

Clearly there is a problem with Let’s Encrypt OCSP responses, but why is it fatal for Chrome only? Is the problem in Chrome? Or in Apache? Or in other browsers?

Environment: Chrome 55.0.2883.87, Apache 2.4.23, OpenSSL 1.0.2j.

We are facing the same problem currently. Waiting for any status updates…

I’m seeing the following errors in Apache error.log today (first was around 7:05 UTC):

[Mon Dec 12 09:05:48.924279 2016] [ssl:error] [pid 14655:tid 2921329712] AH01936: stapling_check_response: response times invalid
[Mon Dec 12 09:05:48.924639 2016] [ssl:error] [pid 14655:tid 2921329712] AH01943: stapling_renew_response: error in retrieved response!

Otherwise my website loads fine in both Firefox 50.0.2 and Chrome 55.0.2883.75. It seems my Apache does not send invalid OCSP responses to clients:

$ openssl s_client -connect ... -status -tlsextdebug|grep OCSP
OCSP response: no response sent

I use Apache 2.4.10-10+deb8u7, OpenSSL 1.0.1t-1+deb8u5 (Debian Jessie).

One of our applications started to warn users about certificate revocation status. I captured the network traffic with Wireshark and found that the OCSP response looked like this:

thisUpdate: 2016-12-05 04:00:00 (UTC)
nextUpdate: 2016-12-12 04:00:00 (UTC)

The nextUpdate value matches the time that the problem appeared. I double-checked the OCSP status of our site with https://www.pkicloud.com/tools.html and the result was the same.

Did I miss something to keep the OCSP updated?

This is likely an issue with OCSP signing being delayed or with the CDN that Let’s Encrypt uses serving stale OCSP responses for some reason. Both of these things are unfortunately out of your direct control.

I imagine once the Operations team becomes aware of this issue, there’ll be an update on https://letsencrypt.status.io/, so you could sign up there to follow any progress.

(I merged both threads referring to this issue.)

Unfortunately https://letsencrypt.status.io/ shows all servers green :frowning:
In the meantime our applications are going down because of this. Are we sure that the operations team is aware of this?

You should set up OCSP stapling with caching in order to survive periods when the OCSP servers do not offer a freshly signed response.

Judging from the above though, the responses are simply not updating; OCSP stapling can’t help there.

What’s not clear yet is whether this was a CDN fault or something broke at Let’s Encrypt and no new OCSP answers were being signed. But either way it’s concerning to have nobody actually on top of the incident for seemingly 3+ hours AND that there wasn’t anything in place to detect the looming catastrophe. Presumably these OCSP answers were antique, though not yet expired, on Saturday, and the problem could have been found and fixed then.

Once upon a time Let’s Encrypt published statistics showing OCSP signing. Those went away. I presumed they had simply gone from public visibility but perhaps instead Let’s Encrypt ceased even to monitor its own systems in this regard and thus got blind-sided. This is especially important because it takes time to sign OCSP responses, so if the process to sign them broke, or the signed ones are lost and must be recreated, that’s going to take many hours.
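
To make the "antique, though not yet expired" point concrete, here is a rough Python sketch of a freshness check that alerts before a response runs out rather than after. The two-day warning threshold and the function name are assumptions for illustration; nothing in this thread says how Let’s Encrypt monitors its responders.

```python
from datetime import datetime, timedelta, timezone


def check_ocsp_freshness(next_update, now, warn_before=timedelta(days=2)):
    """Return a monitoring-style status line based on remaining lifetime.

    `warn_before` is an assumed threshold: warn once less than this much
    validity is left, i.e. a fresh response is overdue.
    """
    remaining = next_update - now
    if remaining <= timedelta(0):
        return "CRITICAL: OCSP response expired"
    if remaining < warn_before:
        return "WARNING: OCSP response expires in %s" % remaining
    return "OK: OCSP response valid for another %s" % remaining


# nextUpdate as reported earlier in this thread.
next_update = datetime(2016, 12, 12, 4, 0, tzinfo=timezone.utc)

# Saturday 10 December, midday: stale but still valid, so a warning fires.
print(check_ocsp_freshness(next_update, datetime(2016, 12, 10, 12, 0, tzinfo=timezone.utc)))
# Monday 12 December, after expiry.
print(check_ocsp_freshness(next_update, datetime(2016, 12, 12, 7, 5, tzinfo=timezone.utc)))
```

With a check like this run periodically against the responder, the stale response would have raised a warning on Saturday, well before clients started failing on Monday.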

My servers still have good signed OCSP responses in cache from 7 December, which will expire on 14 December.
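
The idea of a stapling cache that survives a responder problem can be sketched as follows. This is only an illustration of the logic in Python, with hypothetical class and method names; it is not how Apache or nginx implement their stapling caches. The cache keeps the last good response, refuses to replace it with one that is already expired, and stops stapling once the cached response itself expires.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class OcspResponse:
    this_update: datetime
    next_update: datetime

    def is_valid_at(self, now):
        return self.this_update <= now <= self.next_update


class StaplingCache:
    """Hypothetical cache holding the most recent usable OCSP response."""

    def __init__(self):
        self._cached: Optional[OcspResponse] = None

    def offer(self, candidate, now):
        """Accept `candidate` only if it is valid now and outlives the cached one."""
        if not candidate.is_valid_at(now):
            return False  # e.g. the responder handed out an expired response
        if self._cached is not None and candidate.next_update <= self._cached.next_update:
            return False  # not newer than what we already hold
        self._cached = candidate
        return True

    def response_to_staple(self, now):
        """Return the cached response while it is still valid, else None."""
        if self._cached is not None and self._cached.is_valid_at(now):
            return self._cached
        return None


def utc(day, hour=0):
    return datetime(2016, 12, day, hour, tzinfo=timezone.utc)


cache = StaplingCache()
good = OcspResponse(utc(7, 4), utc(14, 4))   # cached 7 December, expires 14 December
stale = OcspResponse(utc(5, 4), utc(12, 4))  # the expired response from this thread

cache.offer(good, now=utc(7, 5))
print(cache.offer(stale, now=utc(12, 9)))               # False: stale response rejected
print(cache.response_to_staple(now=utc(12, 9)) is good)  # True: keep stapling the good one
print(cache.response_to_staple(now=utc(14, 5)))          # None: the cache has run out too
```

As noted above, this only buys time: if the responder never produces a fresh response, the cached one expires as well.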

We’re facing the same problem; multiple customers are reporting outages.
All are (as far as we’ve been able to ascertain) OCSP errors. A quick SSL Labs check shows an OCSP error.

This is quite serious: all certificates we have running with Let’s Encrypt are now unusable for a majority of our clients.

It is also happening to me today. I checked with the openssl command:

openssl s_client -connect mydomain.com:443 -tls1 -tlsextdebug -status

resulting in:

OCSP response: no response sent

and also SSL labs, resulting in:

“Revocation information OCSP
OCSP: http://ocsp.int-x3.letsencrypt.org/
Revocation status Good (not revoked)
OCSP ERROR: OCSP response expired on Mon Dec 12 01:00:00 PST 2016”

The nginx logs have:

2016/12/12 13:08:48 [error] 12#12: OCSP_check_validity() failed (SSL: error:2707307D:OCSP routines:OCSP_check_validity:status expired) while requesting certificate status, responder: ocsp.int-x3.letsencrypt.org

@pfg Is the operations team aware of the issue? The status page still lists the OCSP servers as green.

Is OCSP stapling supposed to help when we are already suffering from the incident?

I don’t think so :frowning:

Facing the same problems with over 1,000 sites ... that's serious ... any updates on this?

[Mon Dec 12 14:59:08.179312 2016] [ssl:error] [pid 23980] AH01936: stapling_check_response: response times invalid
[Mon Dec 12 14:59:08.179433 2016] [ssl:error] [pid 23980] AH01943: stapling_renew_response: error in retrieved response!

Andreas Schnederle-Wagner

I’ve requested new certificates for all important domains and those seem to work for now… Not sure how feasible that is for everyone.

I wasn’t aware that OCSP was this fragile. I’ve been trying to pitch LetsEncrypt as a reliable alternative to my boss, but this is not helping :slight_smile:

Workaround

We have disabled OCSP stapling. Google Chrome by default ignores OCSP problems, so the majority of the visitors won’t notice the error.
Firefox is a decent browser: it cares about OCSP.

Take it easy, all BIG SSL providers have OCSP outages. I won’t mention names.

Start monitoring your SSL provider now: https://github.com/szepeviktor/debian-server-tools/blob/master/monitoring/ocsp-check.sh

How did you disable OCSP stapling? Live, on living certs?

I've disabled it in our webserver.
https://httpd.apache.org/docs/2.4/ssl/ssl_howto.html#ocspstapling

This forces the HTTP clients to do the OCSP checking.