Random challenge timeouts

Hello,

We are getting random 30-second timeouts with ACME challenges over IPv6.
URL: https://acme-v01.api.letsencrypt.org/acme/new-authz
METHOD: POST

Traceroute to acme-v01.api.letsencrypt.org
Host Loss% Last Avg Best Wrst StDev

  1. 2a02:29ea:a:10f::7 0.0% 0.3 0.4 0.3 0.4 0.1
  2. 2a02:29ea:a:ff::2 0.0% 0.3 0.3 0.3 0.4 0.1
  3. 2a02:29ea:a:e3::1 0.0% 0.2 0.2 0.2 0.3 0.0
  4. 2a02:29ea:a:2a7::1 0.0% 0.3 0.3 0.2 0.3 0.1
  5. 2001:978:2:e::1:1 0.0% 1.3 1.2 1.2 1.3 0.1
  6. te0-0-0-3.rcr11.tll01.atlas.cogentco.com 0.0% 1.1 1.1 1.1 1.2 0.0
  7. be2655.ccr21.sto03.atlas.cogentco.com 0.0% 6.1 6.2 6.1 6.2 0.1
  8. be3376.ccr21.sto01.atlas.cogentco.com 0.0% 6.2 6.2 6.2 6.3 0.1
  9. 2001:978:3::12e 0.0% 7.1 8.1 7.1 9.1 1.4
  10. xe-0-2-0.cr0-stk1.ip6.gtt.net 0.0% 7.6 7.2 6.9 7.6 0.5
  11. akamai-gw.ip6.gtt.net 0.0% 6.0 6.0 6.0 6.0 0.0
  12. g2a02-26f0-0041-069f-0000-0000-0000-3a8e.deploy.static.akamait 0.0% 5.9 5.9 5.9 5.9 0.0

Traceroute to IP where challenge comes from:
Host Loss% Last Avg Best Wrst StDev

  1. 2a02:29ea:a:10f::7 0.0% 0.4 0.4 0.4 0.4 0.0
  2. 2a02:29ea:a:ff::2 0.0% 0.3 0.3 0.3 0.3 0.0
  3. 2a02:29ea:a:e3::1 0.0% 0.3 0.4 0.3 0.6 0.2
  4. 2a02:29ea:a:2a7::1 0.0% 0.3 0.3 0.2 0.3 0.0
  5. r1-eth-3-2-0-TLL-Linx.ee.zonedata.net 0.0% 0.4 0.4 0.4 0.4 0.0
  6. 2001:ad0:ff:37::1 0.0% 0.3 0.3 0.3 0.4 0.1
  7. r9-ae2-0-Sto-TC-SE.linxtelecom.net 0.0% 5.2 5.2 5.2 5.2 0.0
  8. xe-0-7-0-3-4.r01.stocse01.se.bb.gin.ntt.net 0.0% 5.9 5.9 5.9 5.9 0.0
  9. 2001:728:0:4000::46 0.0% 6.9 6.9 6.9 6.9 0.0
  10. lo-0-v6.ear2.Denver1.Level3.net 0.0% 159.3 159.3 159.3 159.3 0.0
  11. VIAWEST-INT.edge3.Denver1.Level3.net 0.0% 169.3 169.3 169.3 169.3 0.0
  12. 2600:3000:2:328::1 0.0% 173.4 173.4 173.4 173.4 0.0
  13. 2600:3000:0:2::7d 0.0% 169.7 169.7 169.7 169.7 0.0
  14. 2600:3000:1:230::2 0.0% 171.2 171.2 171.2 171.2 0.0
  15. 2600:3000:0:2::416 0.0% 191.3 191.3 191.3 191.3 0.0
  16. 2600:3000:3:720::2 0.0% 187.6 187.6 187.6 187.6 0.0
  17. 2600:3000:2700:1073::4 0.0% 190.6 190.6 190.6 190.6 0.0
  18. ???

@JamesLE @Phil @jkarner @ezekiel Can one of you folks take a look at this?

@Silver Is it safe to assume you’ve ruled out IPv6 connectivity problems with other hosts and are certain the problem is specific to the ACME v01 endpoint?

This timeout does not seem like a network issue: https://acme-v01.api.letsencrypt.org responds OK, but real challenge requests randomly time out. We have not seen any timeouts in the last 3 hours, though.

Can you share more information? What domain names are you trying to renew and seeing timeouts for? Do you have detailed logs from these attempts? Are you using HTTP-01 challenge types? Which ACME client?

The timeout happened again 20 minutes ago, at 17:38 (GMT+3). We have a custom ACME client and use HTTP-01. The domain was: files.grafilius.ee

Do you have any logs from these attempts? In the ACME server logs I only see a smattering of GET requests (some returning errors because they are to POST-only endpoints) and no POSTs. That's pretty unusual!

What steps have you taken to rule out your custom ACME client as the problem, or the network connectivity of the server you're running it on?

Can you add an identifying User-Agent to your client? The specification considers that a MUST, and you're currently sending the generic Golang UA, which complicates my log analysis.
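For a Go client, one possible way to do this is to wrap the HTTP transport so every request carries the header. This is a minimal sketch, not the only approach: the `userAgent` value, the `uaTransport` type, and the helper names are hypothetical placeholders, and the local test server exists only to show which header is actually sent.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// userAgent is a made-up identifier; substitute the client's real name,
// version, and a contact URL.
const userAgent = "example-acme-client/0.1 (+https://example.com/contact)"

// uaTransport sets the User-Agent header on every outgoing request.
type uaTransport struct {
	base http.RoundTripper
}

func (t uaTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// Clone first: a RoundTripper should not modify the caller's request.
	r := req.Clone(req.Context())
	r.Header.Set("User-Agent", userAgent)
	return t.base.RoundTrip(r)
}

// newClient returns an HTTP client that identifies itself on every request.
func newClient() *http.Client {
	return &http.Client{Transport: uaTransport{base: http.DefaultTransport}}
}

// sentUserAgent posts to a throwaway local server and reports the
// User-Agent header that server received.
func sentUserAgent() (string, error) {
	var got string
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got = r.UserAgent()
	}))
	defer srv.Close()

	resp, err := newClient().Post(srv.URL, "application/jose+json", strings.NewReader("{}"))
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return got, nil
}

func main() {
	ua, err := sentUserAgent()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("server saw User-Agent:", ua)
}
```

Without something like this, Go's `net/http` sends its default `Go-http-client/1.1` header, which is the generic UA mentioned above.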

Hi, Silver,

IPv6 connectivity problems are possible, although the routing for 2a02:29e8::/32 (containing files.grafilius.ee at 2a02:29e8:770:0:3::24) looks like it was stable from RIPEstat’s perspective.

Let’s Encrypt’s API is reached by your client contacting Akamai, which relays your request onwards. Akamai has much broader connectivity than the Let’s Encrypt environment from which validation connections originate. So, it’s possible that you had good routing to/from Akamai at that time, but unstable routing to/from the Validation Authority.

One odd thing I noticed is that the first IPs in your traceroutes, in 2a02:29ea:a::/48, are not routable. Does your ISP use NAT for outgoing IPv6 connections? That would be unusual, but possible.

Another odd thing I noticed is that your traceroute started in or around Tallinn, Estonia, but you were directed to an Akamai PoP near Denver, Colorado, US. Akamai’s DNS will usually direct you to a much closer PoP. What DNS resolver(s) are you using?

And, are you still having trouble?

Thanks for your patience!
