ocsp.int-x3.letsencrypt.org could not be resolved

Hi,

For the past 24-48 hours we have been getting a lot of errors like this one on our nginx proxy server:

2017/02/16 03:36:55 [error] 500804#500804: ocsp.int-x3.letsencrypt.org could not be resolved (2: Server failure) while requesting certificate status, responder: ocsp.int-x3.letsencrypt.org, certificate: "/etc/nginx/ssl/dummy-domain.fr.crt"

Any ideas?

“Fix your DNS server” is probably the shortest and, if I may say so, the correct answer :stuck_out_tongue:

The hostname is resolving fine:

osiris@desktop ~ $ dig +trace ocsp.int-x3.letsencrypt.org

; <<>> DiG 9.10.3-P4 <<>> +trace ocsp.int-x3.letsencrypt.org
;; global options: +cmd

(...)

ocsp.int-x3.letsencrypt.org. 600 IN	CNAME	ocsp.int-x3.letsencrypt.org.edgesuite.net.
;; Received 111 bytes from 184.26.161.64#53(a14-64.akam.net) in 19 ms

osiris@desktop ~ $ dig +trace ocsp.int-x3.letsencrypt.org.edgesuite.net

; <<>> DiG 9.10.3-P4 <<>> +trace ocsp.int-x3.letsencrypt.org.edgesuite.net
;; global options: +cmd

(...)

ocsp.int-x3.letsencrypt.org.edgesuite.net. 21600 IN CNAME a771.dscq.akamai.net.
;; Received 101 bytes from 2600:1401:1::40#53(a6-64.akam.net) in 105 ms

osiris@desktop ~ $ dig +trace a771.dscq.akamai.net.

; <<>> DiG 9.10.3-P4 <<>> +trace a771.dscq.akamai.net.
;; global options: +cmd

(...)

a771.dscq.akamai.net.	20	IN	A	82.94.229.49
a771.dscq.akamai.net.	20	IN	A	82.94.229.48
;; Received 70 bytes from 82.94.229.45#53(n7dscq.akamai.net) in 8 ms

osiris@desktop ~ $

Hmmm, although my ISP’s DNSSEC-enabled DNS servers don’t object to the hostname when I query it using dig with the +dnssec flag, DNSViz does give a rather unsettling graph with a lot of errors and warnings:

http://dnsviz.net/d/ocsp.int-x3.letsencrypt.org/dnssec/

:tired_face:

But I’m not sure if that’s the reason why your DNS server can’t find the hostname.

The issue has decreased but not gone away since we removed 127.0.0.1 as our primary resolver…

I notice the occasional resolving error in my daily runs, too. It’s rare but it happens.

20170105-000001.log:  "content" => "Could not connect to 'ocsp.int-x3.letsencrypt.org:80': no address associated with name\n",
20170117-000002.log:  "content" => "Could not connect to 'ocsp.int-x3.letsencrypt.org:80': no address associated with name\n",
20170202-000002.log:  "content" => "Could not connect to 'ocsp.int-x3.letsencrypt.org:80': no address associated with name\n",

Edit: LOL at that dnsviz graph. Good to see that even big CDN providers can mess up DNS.

Is this something LE might want to look into, @jsha? That dnsviz graph is definitely not normal and I assume you have a contractual relationship with your CDN provider. This is not how you should run your DNS and as a customer I would demand proper service.

The DNSViz report falls under:

  • Completely harmless warnings (the inconsistent NS and A records)
  • Probably harmless warnings (the two OPT warnings; frankly, I have no idea what they mean)
  • The error, which is kinda bad, but could only result in NXDOMAIN or empty responses, not SERVFAIL.

They aren’t exactly good but shouldn’t have any negative impact. Or at least this type of negative impact.

Well, the keyword is “shouldn’t”. As demonstrated, they apparently do have an impact.

Ah. You’re right, your “no address associated with name” may very well be the NXDOMAIN error; it’s such a vague error message that it could be other things too.

The original poster’s SERVFAIL issues are probably unrelated to the DNSViz stuff. I think.

I’ll file a ticket with Akamai, though I agree with @mnordhoff that the issues shown on DNSViz probably weren’t the root cause of the original poster’s SERVFAIL results.

I’ve checked our external OCSP monitor, and it’s not showing any DNS problems, nor is Netcraft’s monitor. However, both of those have limited network perspectives, and it’s possible there’s a regional issue with Akamai’s network. @yoyo699, if you can tell us what region your servers are in we might be able to set up local monitors and attempt to reproduce the problem. Also, what resolver are you using now?
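A local monitor of the kind mentioned above could be sketched roughly as follows. This is only an illustration, not a tool anyone in this thread is running: the `probe` function and its parameters are hypothetical names, and it uses whatever resolver the operating system is configured with, via Python's standard `socket.getaddrinfo`.

```python
import socket
import time


def probe(hostname, attempts=5, delay=1.0,
          resolve=socket.getaddrinfo, sleep=time.sleep):
    """Resolve `hostname` several times through the system resolver.

    Returns a list of (attempt_index, ok, detail) tuples, where `detail`
    is a sorted list of addresses on success or the error text on failure.
    `resolve` and `sleep` are injectable so the function can be exercised
    without touching the network.
    """
    results = []
    for i in range(attempts):
        try:
            infos = resolve(hostname, 80, proto=socket.IPPROTO_TCP)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address for both IPv4 and IPv6.
            addrs = sorted({info[4][0] for info in infos})
            results.append((i, True, addrs))
        except socket.gaierror as exc:
            results.append((i, False, str(exc)))
        if i + 1 < attempts:
            sleep(delay)
    return results


# Example call (performs real DNS lookups):
#   for row in probe("ocsp.int-x3.letsencrypt.org", attempts=10, delay=30):
#       print(row)
```

Running something like this from cron in the affected region, and noting which resolver the host uses, would give timestamps and error texts that can be compared across networks.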

Actually, I saw that yesterday and on the 16th too, with Google’s resolvers. Well, not exactly that: in my case the name could not be resolved because of a timeout, so it could have been temporary connectivity issues on the uplink. @yoyo699, what is the time zone for that log? I’m just wondering whether it more or less matches the event I observed.
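To compare timings between sites, it may help to bucket the nginx errors by hour. The sketch below assumes the error-log line format shown in the first post; `failures_per_hour` and the regular expression are my own illustrative names, not part of nginx. Note that nginx normally writes these timestamps in the server's local time, so the time zone still has to be stated separately.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as:
# 2017/02/16 03:36:55 [error] 500804#500804: <host> could not be resolved (2: Server failure) ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[error\] \S+ "
    r"(?P<host>\S+) could not be resolved "
    r"\((?P<code>\d+): (?P<reason>[^)]*)\)"
)


def failures_per_hour(lines):
    """Count resolver failures keyed by (hour, hostname, reason)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a "could not be resolved" error line
        ts = datetime.strptime(match.group("ts"), "%Y/%m/%d %H:%M:%S")
        key = (ts.strftime("%Y-%m-%d %H:00"),
               match.group("host"),
               match.group("reason"))
        counts[key] += 1
    return counts


# Example call (the path is an assumption; adjust to your setup):
#   with open("/var/log/nginx/error.log") as fh:
#       for key, n in sorted(failures_per_hour(fh).items()):
#           print(n, *key)
```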

It looks like resolvers that do strict QNAME minimisation (RFC 7816) may have this problem. I was able to reproduce this behavior on Unbound 1.6 with the “qname-minimisation-strict” flag turned on.
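For anyone unfamiliar with QNAME minimisation, here is a toy model of why strict mode can matter. It is a simplified sketch, not how Unbound is implemented: it adds one label per step, the function names are made up, and `broken_server` encodes a purely hypothetical failure pattern (an NXDOMAIN for an intermediate name) chosen for illustration rather than a measurement of the real servers. The relaxed/strict difference follows the Unbound documentation, which says that strict mode does not fall back to sending the full QNAME.

```python
def qname_steps(fqdn):
    """Names a minimising resolver might ask for, shortest first."""
    labels = fqdn.rstrip(".").split(".")
    return [".".join(labels[i:]) + "."
            for i in range(len(labels) - 1, -1, -1)]


def resolve_minimised(fqdn, rcode_for, strict):
    """Walk the minimised names; `rcode_for(name)` stands in for the
    authoritative servers and returns an rcode string.

    Returns (outcome, last_name_asked, last_rcode).
    """
    steps = qname_steps(fqdn)
    full = steps[-1]
    for name in steps:
        rcode = rcode_for(name)
        if rcode == "NOERROR":
            continue
        if name == full or strict:
            # Strict mode gives up on the first bad intermediate answer.
            return ("failed", name, rcode)
        # Relaxed mode: fall back to asking for the full name directly.
        final = rcode_for(full)
        return ("resolved" if final == "NOERROR" else "failed", full, final)
    return ("resolved", full, "NOERROR")


def broken_server(name):
    """Hypothetical server that mishandles one intermediate name."""
    return "NXDOMAIN" if name == "int-x3.letsencrypt.org." else "NOERROR"


for mode in (True, False):
    print(mode, resolve_minimised("ocsp.int-x3.letsencrypt.org",
                                  broken_server, strict=mode))
```

In this model the strict run stops at the intermediate name, while the relaxed run falls back to the full name and succeeds, which would match the pattern of only some resolvers being affected.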

I suspect there may be more resolvers that run into the same problem for other reasons. If you’ve seen this issue, I’d be very curious to hear what resolver(s) you’re using - either particular software, or particular providers’ resolver IPs.
