IPv6 connections to letsencrypt seem to fail

I am using the Ruby Acme::Client code to enroll dns-01 certificates.
My infrastructure is entirely IPv6 capable (even preferred).
Much of this worked great with staging in unit test cases, and was working rather well two weeks ago.
(I then had other bugs to fix)

My client code prints: “Failed communicate with ACME server, please try again”. Running tcpdump at my border router, I see IPv6 traffic: TCP SYN, SYN-ACK, about half a dozen data packets of TLS setup, and then a disconnect. The connection was to either 2a02:26f0:6e00:19f::3a8e or 2a02:26f0:6e00:193::3a8e. I tried blocking one of these addresses, or both, on the hypothesis that the front-end system at Akamai is not connecting to the backend properly. Unfortunately, it appears that the Acme::Client code does not try all IP addresses.
So I put “184.86.44.170 acme-v02.api.letsencrypt.org” into /etc/hosts to encourage it to use IPv4, and it works reliably at that point. I notice that the IPv4 addresses in DNS are under some round-robin, although the IPv6 addresses seem rather stable.
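For anyone wanting the fallback behaviour that the client appears to lack, here is a minimal Python sketch of the idea. It is not Acme::Client code: `resolve_all`, `try_each_address` and the `attempt` callback are hypothetical names chosen for illustration. Because the failure described above happens after the TCP handshake, the callback is assumed to perform the whole request (connect, TLS, HTTP) against one address, so that a broken TLS setup also triggers the next address.

```python
import socket
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def resolve_all(host: str, port: int) -> List[str]:
    """Return every distinct address (IPv6 and IPv4) the resolver offers, in resolver order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses: List[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def try_each_address(addresses: List[str], attempt: Callable[[str], T]) -> Tuple[str, T]:
    """Run attempt(address) for each address and return (address, result) for the first success."""
    failures = []
    for address in addresses:
        try:
            return address, attempt(address)
        except OSError as exc:
            # OSError covers resets, timeouts and ssl.SSLError (a subclass of OSError).
            failures.append((address, exc))
    raise ConnectionError(f"all {len(failures)} addresses failed: {failures}")
```

A caller would pass something like `try_each_address(resolve_all("acme-v02.api.letsencrypt.org", 443), fetch_directory)`, where `fetch_directory` is an assumed function that makes the full HTTPS request to the given address. This is roughly what the /etc/hosts workaround achieves by hand.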

I would be happy to collect more information; maybe there are some headers from the broken connection I could get to with some effort. Debugging HTTP inside TLS is always a pain from the outside.

How can I report this better?

In case this helps someone else debug their similar situation:

tcpdump -n -p -i any 'tcp port 443 and (ip host 184.86.44.170 or ip host 104.123.101.156 or ip6 host 2a02:26f0:6e00:19f::3a8e or ip6 host 2a02:26f0:6e00:193::3a8e or ip6 host 2600:140a:0:39c::3a8e or ip6 host 2600:140a:0:3a8::3a8e)'

Hi @mcr,

Can you run the following curl with Akamai headers please?

curl -vv -6 -H "Pragma: akamai-x-get-cache-key, akamai-x-cache-on, akamai-x-cache-remote-on, akamai-x-get-true-cache-key, akamai-x-get-extracted-values, akamai-x-check-cacheable, akamai-x-get-request-id, akamai-x-serial-no, akamai-x-get-ssl-client-session-id, akamai-x-feo-trace" https://acme-v02.api.letsencrypt.org/directory

Here’s a working example from a random VPS provider.

$ curl -vv -6 -H "Pragma: akamai-x-get-cache-key, akamai-x-cache-on, akamai-x-cache-remote-on, akamai-x-get-true-cache-key, akamai-x-get-extracted-values, akamai-x-check-cacheable, akamai-x-get-request-id, akamai-x-serial-no, akamai-x-get-ssl-client-session-id, akamai-x-feo-trace" https://acme-v02.api.letsencrypt.org/directory
* About to connect() to acme-v02.api.letsencrypt.org port 443 (#0)
*   Trying 2600:141b:13:29a::3a8e...
* Connected to acme-v02.api.letsencrypt.org (2600:141b:13:29a::3a8e) port 443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
* 	subject: CN=acme-v02.api.letsencrypt.org
* 	start date: May 10 04:39:46 2019 GMT
* 	expire date: Aug 08 04:39:46 2019 GMT
* 	common name: acme-v02.api.letsencrypt.org
* 	issuer: CN=Let's Encrypt Authority X3,O=Let's Encrypt,C=US
> GET /directory HTTP/1.1
> User-Agent: curl/7.29.0
> Host: acme-v02.api.letsencrypt.org
> Accept: */*
> Pragma: akamai-x-get-cache-key, akamai-x-cache-on, akamai-x-cache-remote-on, akamai-x-get-true-cache-key, akamai-x-get-extracted-values, akamai-x-check-cacheable, akamai-x-get-request-id, akamai-x-serial-no, akamai-x-get-ssl-client-session-id, akamai-x-feo-trace
> 
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: application/json
< Content-Length: 658
< X-Frame-Options: DENY
< Strict-Transport-Security: max-age=604800
< X-Akamai-SSL-Client-Sid: vmLKV9bNXbc/Rz2Qr8Aw0A==
< X-Check-Cacheable: NO
< X-Akamai-Request-ID: 917613.6a3e6f0
< Expires: Mon, 24 Jun 2019 14:44:49 GMT
< Cache-Control: max-age=0, no-cache, no-store
< Pragma: no-cache
< Date: Mon, 24 Jun 2019 14:44:49 GMT
< X-Cache: TCP_MISS from a23-215-131-111.deploy.akamaitechnologies.com (AkamaiGHost/9.7.0.2-26040364) (-)
< X-Cache-Key: S/D/14990/432721/000/origin-4j9188P5RUymhQWhT.api.letsencrypt.org/directory
< X-Cache-Key-Extended-Internal-Use-Only: S/D/14990/432721/000/origin-4j9188P5RUymhQWhT.api.letsencrypt.org/directory vcd=10106
< X-True-Cache-Key: /D/000/origin-4j9188P5RUymhQWhT.api.letsencrypt.org/directory vcd=10106
< X-Akamai-Session-Info: name=ANS_PEARL_VERSION; value=0.11.0
< X-Akamai-Session-Info: name=SEC_XFF_ASNUM_MASK_SIZE; value=64
< X-Akamai-Session-Info: name=AKA_PM_SR_ENABLED; value=false
< X-Akamai-Session-Info: name=HEADER_NAMES; value=User-Agent%3aHost%3aAccept%3aPragma; full_location_id=
< X-Akamai-Session-Info: name=ENABLE_SD_POC; value=yes
< X-Akamai-Session-Info: name=AKA_PM_NETSTORAGE_ROOT; value=
< X-Akamai-Session-Info: name=PMUSER_IP_HASH; value=618
< X-Akamai-Session-Info: name=FASTTCP_RENO_FALLBACK_DISABLE_OPTOUT; value=on
< X-Akamai-Session-Info: name=AKA_PM_PREFETCH_ON; value=true
< X-Akamai-Session-Info: name=TAP_GUID; value=
< X-Akamai-Session-Info: name=OVERRIDE_HTTPS_IE_CACHE_BUST; value=all
< X-Akamai-Session-Info: name=AKA_PM_TD_ENABLED; value=false
< X-Akamai-Session-Info: name=AKA_PM_BASEDIR; value=
< X-Akamai-Session-Info: name=TAP_KEY_ID; value=
< X-Akamai-Session-Info: name=SEC_CLIENT_IP_ASNUM_MASK_SIZE; value=128
< X-Akamai-Session-Info: name=AKA_PM_CACHEABLE_OBJECT; value=false
< X-Akamai-Session-Info: name=TCP_OPT_APPLIED; value=medium
< X-Akamai-Session-Info: name=AKA_PM_FWD_URL; value=/directory
< X-Akamai-Session-Info: name=AKA_PM_TD_MAP_PREFIX; value=ch2
< X-Serial: 14990
< X-Akamai-SSL-Client-Sid: 7+JC2zXwm3LQigzGdy1aQw==
< Connection: keep-alive
< X-Cache-Remote: TCP_MISS from a72-247-10-202.deploy.akamaitechnologies.com (AkamaiGHost/9.7.0.3-26197600) (-)
< 
{
  "keyChange": "https://acme-v02.api.letsencrypt.org/acme/key-change",
  "meta": {
    "caaIdentities": [
      "letsencrypt.org"
    ],
    "termsOfService": "https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf",
    "website": "https://letsencrypt.org"
  },
  "n84uhuP07Fo": "https://community.letsencrypt.org/t/adding-random-entries-to-the-directory/33417",
  "newAccount": "https://acme-v02.api.letsencrypt.org/acme/new-acct",
  "newNonce": "https://acme-v02.api.letsencrypt.org/acme/new-nonce",
  "newOrder": "https://acme-v02.api.letsencrypt.org/acme/new-order",
  "revokeCert": "https://acme-v02.api.letsencrypt.org/acme/revoke-cert"
}
* Connection #0 to host acme-v02.api.letsencrypt.org left intact

Let me try the upload function:

test1.txt (4.5 KB)

Not sure what to make of all the headers. I assume it’s useful to you.
I could not trivially run this on my provisioning system as it’s a container without curl. I can build a debug container and run it from the same subnet. This run is from a different /64 (in the same /56) that is two routers deeper, where I happened to have a window open.


Let me paste the entire bit in, as the upload is annoying:

mcr@opi0B:~/shg/shg_reach$ curl -vv -6 -H "Pragma: akamai-x-get-cache-key, akamai-x-cache-on, akamai-x-cache-remote-on, akamai-x-get-true-cache-key, akamai-x-get-extracted-values, akamai-x-check-cacheable, akamai-x-get-request-id, akamai-x-serial-no, akamai-x-get-ssl-client-session-id, akamai-x-feo-trace" https://acme-v02.api.letsencrypt.org/directory
*   Trying 2600:140a:0:39c::3a8e...
* Connected to acme-v02.api.letsencrypt.org (2600:140a:0:39c::3a8e) port 443 (#0)
* found 148 certificates in /etc/ssl/certs/ca-certificates.crt
* found 592 certificates in /etc/ssl/certs
* ALPN, offering http/1.1
* SSL connection using TLS1.2 / ECDHE_RSA_AES_256_GCM_SHA384
*        server certificate verification OK
*        server certificate status verification SKIPPED
*        common name: acme-v02.api.letsencrypt.org (matched)
*        server certificate expiration date OK
*        server certificate activation date OK
*        certificate public key: RSA
*        certificate version: #3
*        subject: CN=acme-v02.api.letsencrypt.org
*        start date: Fri, 10 May 2019 04:39:46 GMT
*        expire date: Thu, 08 Aug 2019 04:39:46 GMT
*        issuer: C=US,O=Let's Encrypt,CN=Let's Encrypt Authority X3
*        compression: NULL
* ALPN, server accepted to use http/1.1
> GET /directory HTTP/1.1
> Host: acme-v02.api.letsencrypt.org
> User-Agent: curl/7.47.0
> Accept: */*
> Pragma: akamai-x-get-cache-key, akamai-x-cache-on, akamai-x-cache-remote-on, akamai-x-get-true-cache-key, akamai-x-get-extracted-values, akamai-x-check-cacheable, akamai-x-get-request-id, akamai-x-serial-no, akamai-x-get-ssl-client-session-id, akamai-x-feo-trace
>
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: application/json
< Content-Length: 658
< X-Frame-Options: DENY
< Strict-Transport-Security: max-age=604800
< X-Akamai-SSL-Client-Sid: uB9awMqk2lZPoPUkynh7WQ==
< X-Check-Cacheable: NO
< X-Akamai-Request-ID: 2a71c4c.3207d7f4
< Expires: Tue, 25 Jun 2019 23:24:15 GMT
< Cache-Control: max-age=0, no-cache, no-store
< Pragma: no-cache
< Date: Tue, 25 Jun 2019 23:24:15 GMT
< X-Cache: TCP_MISS from a184-84-243-181.deploy.akamaitechnologies.com (AkamaiGHost/9.7.0.3-26197600) (-)
< X-Cache-Key: S/D/14990/432721/000/origin-5Pd1BuJgcLKPKwI9E.api.letsencrypt.org/directory
< X-Cache-Key-Extended-Internal-Use-Only: S/D/14990/432721/000/origin-5Pd1BuJgcLKPKwI9E.api.letsencrypt.org/directory vcd=10106
< X-True-Cache-Key: /D/000/origin-5Pd1BuJgcLKPKwI9E.api.letsencrypt.org/directory vcd=10106
< X-Akamai-Session-Info: name=ANS_PEARL_VERSION; value=0.11.0
< X-Akamai-Session-Info: name=SEC_XFF_ASNUM_MASK_SIZE; value=64
< X-Akamai-Session-Info: name=AKA_PM_SR_ENABLED; value=false
< X-Akamai-Session-Info: name=HEADER_NAMES; value=Host%3aUser-Agent%3aAccept%3aPragma; full_location_id=
< X-Akamai-Session-Info: name=ENABLE_SD_POC; value=yes
< X-Akamai-Session-Info: name=AKA_PM_NETSTORAGE_ROOT; value=
< X-Akamai-Session-Info: name=PMUSER_IP_HASH; value=908
< X-Akamai-Session-Info: name=FASTTCP_RENO_FALLBACK_DISABLE_OPTOUT; value=on
< X-Akamai-Session-Info: name=AKA_PM_PREFETCH_ON; value=true
< X-Akamai-Session-Info: name=TAP_GUID; value=
< X-Akamai-Session-Info: name=OVERRIDE_HTTPS_IE_CACHE_BUST; value=all
< X-Akamai-Session-Info: name=AKA_PM_TD_ENABLED; value=false
< X-Akamai-Session-Info: name=AKA_PM_BASEDIR; value=
< X-Akamai-Session-Info: name=TAP_KEY_ID; value=
< X-Akamai-Session-Info: name=SEC_CLIENT_IP_ASNUM_MASK_SIZE; value=64
< X-Akamai-Session-Info: name=AKA_PM_CACHEABLE_OBJECT; value=false
< X-Akamai-Session-Info: name=TCP_OPT_APPLIED; value=medium
< X-Akamai-Session-Info: name=AKA_PM_FWD_URL; value=/directory
< X-Akamai-Session-Info: name=AKA_PM_TD_MAP_PREFIX; value=ch2
< X-Serial: 14990
< X-Akamai-SSL-Client-Sid: 9K6AEQ6WyX4zZMmrg5iCvg==
< Connection: keep-alive
< X-Cache-Remote: TCP_MISS from a184-51-147-21.deploy.akamaitechnologies.com (AkamaiGHost/9.7.0.3-26197600) (-)
<
{
  "ZJExRksgFhA": "https://community.letsencrypt.org/t/adding-random-entries-to-the-directory/33417",
  "keyChange": "https://acme-v02.api.letsencrypt.org/acme/key-change",
  "meta": {
    "caaIdentities": [
      "letsencrypt.org"
    ],
    "termsOfService": "https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf",
    "website": "https://letsencrypt.org"
  },
  "newAccount": "https://acme-v02.api.letsencrypt.org/acme/new-acct",
  "newNonce": "https://acme-v02.api.letsencrypt.org/acme/new-nonce",
  "newOrder": "https://acme-v02.api.letsencrypt.org/acme/new-order",
  "revokeCert": "https://acme-v02.api.letsencrypt.org/acme/revoke-cert"
}
* Connection #0 to host acme-v02.api.letsencrypt.org left intact

Some users have been able to work around these weird Akamai issues by dropping their MTU to 1300 or even 1280; see the thread “Cannot get new certificate, readtimeout error”.

This is especially worth trying if the failure only happens on large payloads (e.g. submission of the CSR doesn’t work, but fetching the directory, as in your post above, does). I’m not sure whether that matches your description of what you see in pcaps.

Can’t hurt much to try, only takes a couple of minutes to verify.
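To make the MTU suggestion concrete, here is a small arithmetic sketch in Python of how much TCP payload fits in one packet at each MTU. It assumes a plain 40-byte IPv6 header (no extension headers) and a 20-byte TCP header (no options); `tcp_mss` is a hypothetical helper name. Real connections usually carry TCP options such as timestamps, so the usable payload is typically a little smaller. 1280 is the minimum MTU that every IPv6 link must support, which is why it is the usual floor.

```python
IPV6_HEADER = 40  # fixed IPv6 header, assuming no extension headers
IPV4_HEADER = 20  # IPv4 header, assuming no options
TCP_HEADER = 20   # TCP header, assuming no options


def tcp_mss(mtu: int, ipv6: bool = True) -> int:
    """Largest TCP payload that fits in a single packet of the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


for mtu in (1500, 1300, 1280):
    print(f"MTU {mtu}: IPv6 MSS {tcp_mss(mtu)}, IPv4 MSS {tcp_mss(mtu, ipv6=False)}")
```

If a hop on the path silently drops packets larger than it can forward, small responses such as the directory still get through while larger TLS records or request bodies stall, which would be consistent with the workaround helping. Whether that is what is happening here is only a hypothesis.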


I don’t get to the CSR stage of things. I can try this.

I think building that debug container would be helpful - since Docker has its own networking quirks, it would be great to see the results from inside a container that is as similar to the one experiencing the problem as possible.

When you get this error, is it always on the first request (i.e. /directory)? Or is it a subsequent request? We’ve seen some clients that don’t deal well with server-initiated close of a long-running connection.
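For illustration, this is roughly how a client can cope with a server-initiated close of a reused connection: drop the stale connection and retry once on a fresh one. It is a hedged Python sketch, not how Acme::Client or any particular library behaves; `ReconnectingClient`, `connect` and `send` are hypothetical names.

```python
from typing import Any, Callable, Optional


class ReconnectingClient:
    """Reuses one connection and retries on a fresh one if the peer closed it."""

    def __init__(self, connect: Callable[[], Any], max_retries: int = 1) -> None:
        self._connect = connect          # assumed factory that opens a new connection
        self._conn: Optional[Any] = None
        self._max_retries = max_retries

    def request(self, send: Callable[[Any], Any]) -> Any:
        attempts = 0
        while True:
            if self._conn is None:
                self._conn = self._connect()
            try:
                return send(self._conn)
            except (ConnectionError, EOFError):
                # The kept-alive connection was closed or reset: discard it.
                self._conn = None
                attempts += 1
                if attempts > self._max_retries:
                    raise
```

Blind retries like this are only clearly safe for idempotent requests such as fetching /directory. A signed ACME POST may need a fresh nonce before it is resent, which this sketch does not handle.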