Hi,
Recently I have been receiving the following error message from Let's Encrypt's OCSP endpoint. The error does not seem specific to the request content (i.e., the certificate), since it does not recur on a second attempt.
Error message:
"An error occurred while processing your request.
Reference #102.3dd91002.1515855601.40f441d3"
Multiple reference numbers were received for other occurrences of this error message. Let me know if you want those numbers too.
The above error occurred for the certificate with serial number 04:d5:ef:e2:6c:0e:d4:4a:93:3d:f6:1d:0a:ca:44:64:a2:b5
Request time: Sat Jan 13 16:00 CET 2018 (GMT+1)
import java.math.BigInteger;
import java.security.Security;
import org.bouncycastle.asn1.DEROctetString;
import org.bouncycastle.asn1.ocsp.OCSPObjectIdentifiers;
import org.bouncycastle.asn1.x509.Extension;
import org.bouncycastle.asn1.x509.Extensions;
import org.bouncycastle.cert.jcajce.JcaX509CertificateHolder;
import org.bouncycastle.cert.ocsp.CertificateID;
import org.bouncycastle.cert.ocsp.OCSPException;
import org.bouncycastle.cert.ocsp.OCSPReq;
import org.bouncycastle.cert.ocsp.OCSPReqBuilder;
import org.bouncycastle.jce.provider.BouncyCastleProvider;
import org.bouncycastle.operator.OperatorException;
import org.bouncycastle.operator.jcajce.JcaDigestCalculatorProviderBuilder;
//issuerCert -> java.security.cert.X509Certificate
//serialNumber -> java.security.cert.X509Certificate leafCertificate.getSerialNumber();
//Add provider BC
Security.addProvider(new BouncyCastleProvider());
// Generate the id for the certificate we are looking for
CertificateID certificateId = new CertificateID(
new JcaDigestCalculatorProviderBuilder().build().get(CertificateID.HASH_SHA1),
new JcaX509CertificateHolder(issuerCert), serialNumber);
// basic request generation with nonce
OCSPReqBuilder gen = new OCSPReqBuilder();
gen.addRequest(certificateId);
BigInteger nonce = BigInteger.valueOf(System.currentTimeMillis());
Extension ext = new Extension(OCSPObjectIdentifiers.id_pkix_ocsp_nonce, false,
    new DEROctetString(nonce.toByteArray()));
gen.setRequestExtensions(new Extensions(new Extension[]{ext}));
// OCSPReqBuilder.build() returns an OCSPReq
OCSPReq ocspRequest = gen.build();
byte[] requestData = ocspRequest.getEncoded();
The above is the snippet that generates the request body, which is written to the output stream when connecting to Let’s Encrypt’s OCSP endpoint.
I doubt the client has anything to do with the error I’ve reported, since the error does not occur for every request I submit; only a small fraction of requests fail.
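Since only a small fraction of requests fail and a second attempt succeeds, a client can paper over the problem by retrying on 5xx responses. The following is a minimal sketch of that idea, not part of the original client: the names OcspRetry, HttpResult and fetchWithRetry are hypothetical, and the actual HTTP call is passed in as a Callable so the retry logic stays independent of how the request is sent.

```java
import java.util.concurrent.Callable;

/** Sketch: retry an OCSP fetch while the responder (or the CDN in front of it) answers with a 5xx. */
final class OcspRetry {

    /** Hypothetical holder for one HTTP attempt: status code plus raw response body. */
    static final class HttpResult {
        final int status;
        final byte[] body;

        HttpResult(int status, byte[] body) {
            this.status = status;
            this.body = body;
        }
    }

    private OcspRetry() {}

    /**
     * Runs the attempt up to maxAttempts times, sleeping between tries and
     * doubling the delay after each failure. Returns the first non-5xx result,
     * or the last 5xx result if every try failed.
     */
    public static HttpResult fetchWithRetry(Callable<HttpResult> attempt,
                                            int maxAttempts,
                                            long initialBackoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoff = initialBackoffMillis;
        HttpResult last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            last = attempt.call();
            if (last.status < 500 || last.status > 599) {
                return last; // success or a non-transient client error
            }
            if (i < maxAttempts) {
                Thread.sleep(backoff);
                backoff *= 2;
            }
        }
        return last;
    }
}
```

In the client, the Callable would wrap whatever code posts requestData to the responder and reads the status and body. If the responder checks the nonce, it is probably safer to build a fresh request for each attempt.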
Let me jump in here, because I think we are seeing the same thing happen and I suspect it's a network issue.
We have an OpenResty instance that generates this log line a few thousand times a day: "[...] failed to get ocsp response: OCSP responder returns bad HTTP status code (http://ocsp.int-x3.letsencrypt.org): 503, context: [...]"
The (vast) majority of requests are OK (HTTP 200), but since normal responses do not get logged, I have no idea what the actual ratio is.
When trying to debug this by hand, it does not take too many tries to get the 503:
tcpdump showed that this was sent by Akamai (example from an earlier capture):
HTTP/1.1 503 Service Unavailable
Server: AkamaiGHost
Mime-Version: 1.0
Content-Type: text/html
Content-Length: 176
Cache-Control: max-age=0
Expires: Mon, 15 Jan 2018 17:00:58 GMT
Date: Mon, 15 Jan 2018 17:00:58 GMT
Connection: close
<HTML><HEAD><TITLE>Error</TITLE></HEAD><BODY>
An error occurred while processing your request.<p>
Reference #102.57d91002.1516035658.c4db99d
</BODY></HTML>
From observation, it took a lot more tries to get a 503 for a machine on a different network. This is why I am currently thinking there might be an issue inside or behind Akamai.
I am not sure what to do next. Debugging a CDN from the outside is no fun.
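To put a number on the failure ratio from a given network, one option is to send the same request many times and tally the status codes. The sketch below assumes a DER-encoded OCSP request is already available as a byte array. The class and method names (OcspStatusTally, tally, postOnce) are made up for illustration, and the 5-second timeouts are arbitrary.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.IntSupplier;

/** Sketch: count how often each HTTP status comes back from repeated OCSP requests. */
final class OcspStatusTally {

    private OcspStatusTally() {}

    /** Calls the probe the given number of times and counts each status code it returns. */
    public static Map<Integer, Integer> tally(IntSupplier probe, int attempts) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int i = 0; i < attempts; i++) {
            counts.merge(probe.getAsInt(), 1, Integer::sum);
        }
        return counts;
    }

    /** POSTs one DER-encoded OCSP request and returns the HTTP status, or -1 on an I/O error. */
    public static int postOnce(String responderUrl, byte[] ocspRequestDer) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(responderUrl).toURL().openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setConnectTimeout(5000); // arbitrary timeouts for the sketch
            conn.setReadTimeout(5000);
            conn.setRequestProperty("Content-Type", "application/ocsp-request");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(ocspRequestDer);
            }
            int status = conn.getResponseCode();
            conn.disconnect();
            return status;
        } catch (IOException e) {
            return -1;
        }
    }
}
```

A call such as tally(() -> postOnce("http://ocsp.int-x3.letsencrypt.org", requestDer), 200), run once from each network, would give comparable counts of 200 versus 503. Here requestDer stands for whatever encoded request is being tested.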
Let’s Encrypt Ops here. We have a ticket open with Akamai and are working to pin down why this is happening. We’ll update with new information and may request a curl or similar if it looks like that would help.
We’re chasing it with Akamai, but progress has been slow so far. At this point, the issue seems to be isolated to some of Akamai’s servers in Europe that are having trouble reaching back to our origin servers.
We’re waiting on the results of some research on their side now, but I’ll update here once we know more.
I have used this workaround previously with full success. However, I would guess that it is not totally reliable, because the underlying IPs may be rotated by Akamai at any time.
I've included the vantage point from a Google Cloud US POP below:
ocsp.int-x3.letsencrypt.org. 123 IN CNAME ocsp.int-x3.letsencrypt.org.edgesuite.net.
ocsp.int-x3.letsencrypt.org.edgesuite.net. 1513 IN CNAME a771.dscq.akamai.net.
a771.dscq.akamai.net. 19 IN A 63.243.228.17
a771.dscq.akamai.net. 19 IN A 63.243.228.10
ocsp.int-x3.letsencrypt.org. 147 IN CNAME ocsp.int-x3.letsencrypt.org.edgesuite.net.
ocsp.int-x3.letsencrypt.org.edgesuite.net. 3563 IN CNAME a771.dscq.akamai.net.
a771.dscq.akamai.net. 19 IN AAAA 2001:5a0:4402::3ff3:e458
a771.dscq.akamai.net. 19 IN AAAA 2001:5a0:4402::3ff3:e45a
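For comparison from other vantage points, the addresses a machine currently resolves can also be listed from Java. This is only a small sketch; the class name EdgeAddresses is made up. Note that the JVM caches lookups (see the networkaddress.cache.ttl security property), so repeated calls within one process may not show a rotation right away.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

/** Sketch: list the IP addresses the local resolver currently returns for a host. */
final class EdgeAddresses {

    private EdgeAddresses() {}

    /** Returns the textual form of every address the resolver gives for the host. */
    public static List<String> resolve(String host) throws UnknownHostException {
        List<String> addresses = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            addresses.add(address.getHostAddress());
        }
        return addresses;
    }
}
```

Calling EdgeAddresses.resolve("ocsp.int-x3.letsencrypt.org") on the affected and unaffected machines shows whether they are being sent to different edge servers.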
Just to add another report of this issue: I'm based in the UK and seeing occasional messages as follows:
[Fri Jan 19 08:52:42.177055 2018] [ssl:error] [pid 12429:tid 139904545642240] [client 66.249.64.157:45470] AH01980: bad response from OCSP server: 503 Service Unavailable
We’re still in communication with Akamai support. We’ve made a bit of headway and they’re still looking into their European regions. We’ll update again when there’s something major to report.
The issue is that this is knocking people’s websites “offline”, at least in Firefox, if they have set up Apache2 with its default OCSP stapling settings (Chrome doesn’t do OCSP iirc, so I don’t know how it’ll react). Refreshing multiple times helps, of course, since the server will keep trying to get a valid response, but the average user will probably give up before that happens.
As more and more virtual hosts on an Apache server drop out of the stapling cache, these sites become inaccessible. The only solution I currently see is to temporarily set SSLUseStapling to off.
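As a configuration fragment, the temporary change would look roughly like the sketch below. The commented-out alternative uses mod_ssl's SSLStaplingReturnResponderErrors directive, which is meant to keep responder failures out of the TLS handshake. I have not verified that it avoids the problem described in this thread, so treat it as something to test rather than a confirmed fix.

```apache
# Temporary measure described above: stop stapling until the responder is healthy again.
SSLUseStapling off

# Untested alternative: keep stapling enabled, but do not staple error responses
# when the OCSP responder cannot be reached or returns a failure.
# SSLUseStapling on
# SSLStaplingReturnResponderErrors off
```

Reload Apache after the change, and set SSLUseStapling back to on once the 503 responses stop.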