Renew cert failure likely because of IPv6


My domain is: modaps.modaps.eosdis.nasa.gov with cname modaps.nascom.nasa.gov

I ran this command: sudo /bin/certbot --webroot --webroot-path=/var/www/html renew --renew-hook "/bin/systemctl reload nginx"

It produced this output:
Attempting to renew cert from /etc/letsencrypt/renewal/modaps.modaps.eosdis.nasa.gov.conf produced an unexpected error: Failed authorization procedure. modaps.modaps.eosdis.nasa.gov (http-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: Could not connect to modaps.modaps.eosdis.nasa.gov, modaps.nascom.nasa.gov (http-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: Could not connect to modaps.nascom.nasa.gov. Skipping.

My web server is (include version): nginx 1.10.2

The operating system my web server runs on is (include version): CentOS 7.3

My hosting provider, if applicable, is:

I can login to a root shell on my machine (yes or no, or I don’t know): Yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel): no. Direct ssh

Hi all, I’ve been trying to debug this for a week now and I’ve run out of ideas of what the problem might be.

I didn’t have trouble originally receiving LE certificates, but I think LE recently switched to preferring IPv6 over IPv4. When trying to renew this (and other domains) I get the above error message.

I have very limited IPv6 testing capabilities from the outside world but internally IPv6 is working and sites like ipv6-test.com seem to suggest our site is accessible over ipv6.

The web path seems to be correct as well: curl http://modaps.modaps.eosdis.nasa.gov/.well-known/acme-challenge/test

Any help would be greatly appreciated.


Hi @ngolpa, could you try with curl -6 (from a machine you’re sure has IPv6 connectivity)? Sometimes web servers are configured to serve a site only in IPv4 even though they are also listening on the IPv6 interface.

Hi @schoen, thank you for the quick reply. Yup I tried that as well:

curl -v http://modaps.modaps.eosdis.nasa.gov/.well-known/acme-challenge/test
* About to connect() to modaps.modaps.eosdis.nasa.gov port 80 (#0)
*   Trying 2001:4d0:241a:40c0::38...
* Connected to modaps.modaps.eosdis.nasa.gov (2001:4d0:241a:40c0::38) port 80 (#0)
> GET /.well-known/acme-challenge/test HTTP/1.1
> User-Agent: curl/7.29.0
> Host: modaps.modaps.eosdis.nasa.gov
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: nginx
< Date: Tue, 30 May 2017 22:46:11 GMT
< Content-Type: text/plain
< Content-Length: 13
< Last-Modified: Tue, 30 May 2017 22:00:59 GMT
< Connection: keep-alive
< ETag: "592deb9b-d"
< Accept-Ranges: bytes
< 
correct path
* Connection #0 to host modaps.modaps.eosdis.nasa.gov left intact

Edit: Formatting fix

I followed up by looking at your DNS settings with an online DNS tester, and it looks like you have a DNS misconfiguration (a lack of delegation or SOA records). The error from the certificate authority was not very descriptive, but I think this kind of DNS problem could cause this failure even if the site works in a browser, so I suggest checking for yourself with a thorough online DNS tester and then seeing if you can fix the SOA issue.

If this is the problem, it would be cool to have an IPv6-capable site where the IPv6 interface was not the cause of the validation failure. :slight_smile:

Hi @ngolpa @schoen,

I can confirm that both domains are reachable using ipv6.

$ curl -ivkL6 http://modaps.nascom.nasa.gov/.well-known/acme-challenge/test
* Hostname was NOT found in DNS cache
*   Trying 2001:4d0:241a:40c0::38...
* Connected to modaps.nascom.nasa.gov (2001:4d0:241a:40c0::38) port 80 (#0)
> GET /.well-known/acme-challenge/test HTTP/1.1
> User-Agent: curl/7.38.0
> Host: modaps.nascom.nasa.gov
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
* Server nginx is not blacklisted
< Server: nginx
Server: nginx
< Date: Tue, 30 May 2017 22:48:37 GMT
Date: Tue, 30 May 2017 22:48:37 GMT
< Content-Type: text/plain
Content-Type: text/plain
< Content-Length: 13
Content-Length: 13
< Last-Modified: Tue, 30 May 2017 22:00:59 GMT
Last-Modified: Tue, 30 May 2017 22:00:59 GMT
< Connection: keep-alive
Connection: keep-alive
< ETag: "592deb9b-d"
ETag: "592deb9b-d"
< Accept-Ranges: bytes
Accept-Ranges: bytes

<
correct path
* Connection #0 to host modaps.nascom.nasa.gov left intact

$ curl -ikvL6 http://modaps.modaps.eosdis.nasa.gov/.well-known/acme-challenge/test
* Hostname was NOT found in DNS cache
*   Trying 2001:4d0:241a:40c0::38...
* Connected to modaps.modaps.eosdis.nasa.gov (2001:4d0:241a:40c0::38) port 80 (#0)
> GET /.well-known/acme-challenge/test HTTP/1.1
> User-Agent: curl/7.38.0
> Host: modaps.modaps.eosdis.nasa.gov
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
* Server nginx is not blacklisted
< Server: nginx
Server: nginx
< Date: Tue, 30 May 2017 22:49:20 GMT
Date: Tue, 30 May 2017 22:49:20 GMT
< Content-Type: text/plain
Content-Type: text/plain
< Content-Length: 13
Content-Length: 13
< Last-Modified: Tue, 30 May 2017 22:00:59 GMT
Last-Modified: Tue, 30 May 2017 22:00:59 GMT
< Connection: keep-alive
Connection: keep-alive
< ETag: "592deb9b-d"
ETag: "592deb9b-d"
< Accept-Ranges: bytes
Accept-Ranges: bytes

<
correct path
* Connection #0 to host modaps.modaps.eosdis.nasa.gov left intact

Using the https versions of the sites I get a 404 error, but that should be no problem since the OP is using the http-01 challenge.

From my side there is no problem reaching the site, so it should validate correctly; maybe Let’s Encrypt’s Boulder has some kind of problem reaching your network.

Maybe @jsha or @roland could shed some light on this issue.

Cheers,
sahsanu

@schoen: My DNS knowledge is not that great so I could be completely wrong, but isn’t the SOA record required at the zone level, not the individual host?
dig modaps.eosdis.nasa.gov

modaps.eosdis.nasa.gov. 206 IN SOA ns1.nasa.gov. dns.nasa.gov. 17700 900 900 1209600 300

vs
dig modaps.modaps.eosdis.nasa.gov

modaps.modaps.eosdis.nasa.gov. 3376 IN A 198.118.194.38

@sahsanu, take another look at the DNS. It’s resolvable but not technically valid. (I think this is another case where the Boulder errors need to be made more specific in order to help with debugging; I’m getting close to trying to make a detailed list of such cases.)

@sahsanu, yes we’ve got /.well-known/ only configured for http not https.

@schoen, sorry I still am not sure exactly what the misconfiguration is in DNS. I was comparing our responses with those from google and from what I can tell we’re responding in a similar fashion:
http://dnscheck.pingdom.com/?domain=www.google.com responds similar to
http://dnscheck.pingdom.com/?domain=modaps.modaps.eosdis.nasa.gov
and
http://dnscheck.pingdom.com/?domain=google.com responds similar to
http://dnscheck.pingdom.com/?domain=modaps.eosdis.nasa.gov

Just for the sake of completeness, this is what the relevant part of the letsencrypt.log file looks like:

{
  "type": "http-01",
  "status": "invalid",
  "error": {
    "type": "urn:acme:error:connection",
    "detail": "Could not connect to modaps.modaps.eosdis.nasa.gov",
    "status": 400
  },
  "uri": "<snip>",
  "token": "<snip>",
  "keyAuthorization": "<snip>",
  "validationRecord": [
    {
      "url": "http://modaps.modaps.eosdis.nasa.gov/.well-known/acme-challenge/<snip>",
      "hostname": "modaps.modaps.eosdis.nasa.gov",
      "port": "80",
      "addressesResolved": [
        "198.118.194.38",
        "2001:4d0:241a:40c0::38"
      ],
      "addressUsed": "2001:4d0:241a:40c0::38",
      "addressesTried": []
    }
  ]
}

and for the cname:

{
  "type": "http-01",
  "status": "invalid",
  "error": {
    "type": "urn:acme:error:connection",
    "detail": "Could not connect to modaps.nascom.nasa.gov",
    "status": 400
  },
  "uri": "<snip>",
  "token": "<snip>",
  "keyAuthorization": "<snip>",
  "validationRecord": [
    {
      "url": "http://modaps.nascom.nasa.gov/.well-known/acme-challenge/<snip>",
      "hostname": "modaps.nascom.nasa.gov",
      "port": "80",
      "addressesResolved": [
        "198.118.194.38",
        "2001:4d0:241a:40c0::38"
      ],
      "addressUsed": "2001:4d0:241a:40c0::38",
      "addressesTried": []
    }
  ]
}

Thank you both for the quick responses.
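
As a rough aid for reading these records, the sketch below pulls out which address the validator actually used and whether it was IPv6. The field names come from the log excerpt above; the function name `summarize_validation` and the trimmed sample are illustrative only, not part of certbot.

```python
import ipaddress
import json

# Trimmed copy of the failing challenge above (only the fields used here).
SAMPLE = """
{
  "type": "http-01",
  "status": "invalid",
  "validationRecord": [
    {
      "hostname": "modaps.modaps.eosdis.nasa.gov",
      "addressesResolved": ["198.118.194.38", "2001:4d0:241a:40c0::38"],
      "addressUsed": "2001:4d0:241a:40c0::38",
      "addressesTried": []
    }
  ]
}
"""

def summarize_validation(challenge_text):
    """Return one (hostname, status, address_used, ip_version) tuple per record."""
    challenge = json.loads(challenge_text)
    rows = []
    for record in challenge.get("validationRecord", []):
        used = record["addressUsed"]
        version = ipaddress.ip_address(used).version  # 4 or 6
        rows.append((record["hostname"], challenge["status"], used, version))
    return rows

for row in summarize_validation(SAMPLE):
    print(row)
```

In both failing records the address used is the IPv6 one, and `addressesTried` is empty, so the IPv4 address was never attempted.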

I guess that may have been a false alarm. Indeed, when I tried it manually I did see an SOA record.

@jsha @cpu, could one of you possibly give us a more specific reason for this connection failure to help figure out whether it’s related to IPv6, DNS, or something else? First guesses that it was related to either of these didn’t immediately pan out.

This appears to be another instance of #2770 (comment). That is, looking at the logs, this appears to be a timeout on the IPv6 address. We have fallback code from IPv6 to IPv4, but it seems that the code does not behave correctly in the presence of a timeout.

However, this does mean there are timeouts reaching your IPv6 address. Quick fixes would be to either remove the AAAA record, or figure out what’s causing the timeouts. We’ll work on a longer-term fix in our fallback code. Thanks for the details!
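
To illustrate the kind of fallback being described, here is a minimal Python sketch that tries IPv6 addresses first and moves on to IPv4 on any connection error, including a timeout. It is not Boulder's code (Boulder is written in Go); `connect_preferring_ipv6` and its `connect` parameter are hypothetical names, and the per-attempt timeout value is an assumption.

```python
import ipaddress
import socket

def connect_preferring_ipv6(addresses, port=80, timeout=5.0,
                            connect=socket.create_connection):
    """Try IPv6 addresses first, then IPv4; return (address, socket)."""
    # sorted() is stable, so IPv6 entries move to the front in original order.
    ordered = sorted(addresses,
                     key=lambda a: ipaddress.ip_address(a).version != 6)
    errors = []
    for addr in ordered:
        try:
            return addr, connect((addr, port), timeout=timeout)
        except OSError as exc:
            # socket.timeout is a subclass of OSError, so a timeout is
            # handled like a refused connection: record it and move on.
            errors.append((addr, exc))
    raise ConnectionError(f"all addresses failed: {errors}")
```

With a short per-attempt timeout, a silently dropped IPv6 path costs a few seconds before the IPv4 address is tried, rather than failing the whole validation.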

Thanks @jsha. I’ll look into both of those suggestions. Just for clarification. Are your logs saying it’s a DNS resolution timeout or is it a timeout connecting to our HTTP server’s IPv6 address after resolving the IP?

Something odd I noticed while debugging this issue is that one of our sites with a nearly identical setup has no problems renewing. For some reason that site seems to prefer IPv4 over IPv6 even though the setup is the same (it has AAAA and A records and it’s served by nginx with the same configuration):

{
  "identifier": {
    "type": "dns",
    "value": "floodmap.modaps.eosdis.nasa.gov"
  },
  "status": "valid",
  "expires": "2017-06-02T01:39:21Z",
  "challenges": [
    {
      "type": "http-01",
      "status": "valid",
      "uri": "https://acme-staging.api.letsencrypt.org/acme/challenge/<snip>",
      "token": "<snip>",
      "keyAuthorization": "<snip>",
      "validationRecord": [
        {
          "url": "http://floodmap.modaps.eosdis.nasa.gov/.well-known/acme-challenge/<snip>",
          "hostname": "floodmap.modaps.eosdis.nasa.gov",
          "port": "80",
          "addressesResolved": [
            "198.118.194.40",
            "2001:4d0:241a:40c0::40"
          ],
          "addressUsed": "198.118.194.40",
          "addressesTried": []
        }
      ]
    },

Any ideas why in that case LE is using IPv4 instead and is it possible to repeat the behavior on our sites that are currently failing to renew?

Thanks.

Timeout connecting to the HTTP server.

Let's Encrypt will recycle old, already validated authorization objects, up to their lifetime. Checking the expiration date on the successful entry, it looks like that is what's happening here. The authorization was validated before Boulder deployed the change to prefer IPv6. I'm afraid in a few days, that authorization will expire, and that domain will have the same problem.
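
The reuse window can be checked from the `expires` field of the authorization shown earlier. A small sketch, assuming the timestamp format seen in that record; `time_until_expiry` is an illustrative helper, and the reference time in the example is chosen arbitrarily for demonstration:

```python
from datetime import datetime, timezone

def time_until_expiry(expires, now):
    """Return the timedelta until an authorization's 'expires' timestamp."""
    expiry = datetime.strptime(expires, "%Y-%m-%dT%H:%M:%SZ")
    expiry = expiry.replace(tzinfo=timezone.utc)
    return expiry - now

# 'expires' value from the floodmap authorization; 'now' is an arbitrary example.
remaining = time_until_expiry("2017-06-02T01:39:21Z",
                              datetime(2017, 5, 31, tzinfo=timezone.utc))
print(remaining)
```

A negative result means the cached authorization is gone and the next renewal will go through a fresh validation.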

FYI, this and other recent threads inspired me to make a little progress towards improving the validation errors. Still eagerly awaiting that list, though!


So I’ve been trying to take your suggestion to try and figure out what’s causing the timeout. I’ve been monitoring packets coming to our server via tcpdump. During the time certbot prints “Waiting for verification…” until it fails I don’t see any IPv6 packets reaching our server.

I contacted our upstream network guys and they say they are not blocking any IPv6 connections to us or shaping traffic in any way.

I also tried timing the connection to our server from the very few IPv6 hosts I have access to outside our shop and I’m seeing results in the few milliseconds range. For example (all values in seconds):

curl -v -6 -w "@curl-format.txt" http://modaps.modaps.eosdis.nasa.gov/.well-known/acme-challenge/test -o /dev/null -s
* About to connect() to modaps.modaps.eosdis.nasa.gov port 80 (#0)
*   Trying 2001:4d0:241a:40c0::38...
* Connected to modaps.modaps.eosdis.nasa.gov (2001:4d0:241a:40c0::38) port 80 (#0)
...
   time_appconnect:  0.000
   time_namelookup:  0.004
      time_connect:  0.005
   time_appconnect:  0.000
  time_pretransfer:  0.005
     time_redirect:  0.000
time_starttransfer:  0.005
                ----------
        time_total:  0.006

But these numbers are all from hosts relatively close to us on the east coast. Any chance you could tell me what connection times you’re seeing from the LE server? The curl-format.txt file’s content (-w option above) is:

   time_appconnect:  %{time_appconnect}\n
   time_namelookup:  %{time_namelookup}\n
      time_connect:  %{time_connect}\n
   time_appconnect:  %{time_appconnect}\n
  time_pretransfer:  %{time_pretransfer}\n
     time_redirect:  %{time_redirect}\n
time_starttransfer:  %{time_starttransfer}\n
                ----------\n
        time_total:  %{time_total}\n

I spun up a DigitalOcean host in their San Francisco zone and tried the same curl command you are using, and got a timeout. If you’d like to try and reproduce, their smallest instances are only $5/mo, or you can PM me an SSH pubkey and I can give you access to my temporary instance for testing.

curl -v -6 -w "@curl-format.txt" http://modaps.modaps.eosdis.nasa.gov/.well-known/acme-challenge/test -o /dev/null -s
*   Trying 2001:4d0:241a:40c0::38...
* connect to 2001:4d0:241a:40c0::38 port 80 failed: Connection timed out
* Failed to connect to modaps.modaps.eosdis.nasa.gov port 80: Connection timed out
* Closing connection 0
   time_appconnect:  0.000
   time_namelookup:  0.028
      time_connect:  0.000
   time_appconnect:  0.000
  time_pretransfer:  0.000
     time_redirect:  0.000
time_starttransfer:  0.000
                ----------
        time_total:  126.368
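
Where curl is not available on an outside test host, roughly the same check can be sketched in Python: resolve the name for a single address family and time one TCP connection attempt per address. This is only an illustration; `probe` and its defaults are assumptions, and it tests the TCP connect only, not the HTTP response.

```python
import socket
import time

def probe(host, port=80, family=socket.AF_INET6, timeout=10.0):
    """Return (address, outcome, seconds) for each resolved address."""
    results = []
    # Restrict resolution to one family, similar in spirit to curl -6 / -4.
    infos = socket.getaddrinfo(host, port, family=family,
                               type=socket.SOCK_STREAM)
    for fam, socktype, proto, _canonname, sockaddr in infos:
        start = time.monotonic()
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            outcome = "connected"
        except socket.timeout:
            outcome = "timed out"
        except OSError as exc:
            outcome = f"failed: {exc}"
        results.append((sockaddr[0], outcome, time.monotonic() - start))
    return results

# Example call (needs outbound IPv6 on the test host):
# probe("modaps.modaps.eosdis.nasa.gov", 80, socket.AF_INET6)
```

Running it from hosts in different networks, once with `socket.AF_INET6` and once with `socket.AF_INET`, helps show whether the timeout is specific to the IPv6 path.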

Thanks. That’s an easy way for me to reproduce the problem.
