Timeout over Cogent IPv6


#1

Hello

For the past week, we have been getting timeouts on challenges for new certificates and renewals.

After more investigation, it seems that we have a lot of packet loss from our Cogent IPv6 addresses (for example, with an mtr to 2600:3000:2710:200::1d).

We see no packet loss from this Cogent network to other IPv6 networks.

Once we remove the AAAA entry from the DNS, we don’t have timeouts anymore. But that’s not a solution, as we need IPv6 for other things.
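
A minimal sketch for checking this from your own side: it resolves a hostname, groups the addresses by family, and tries a TCP connection to port 80 on each one. The function names (`resolve_by_family`, `tcp_reachable`, `check_host`) are hypothetical, and the result only reflects reachability from wherever the script runs, not from the Let’s Encrypt validation servers.

```python
import socket


def resolve_by_family(hostname, port=80):
    """Group the addresses a hostname resolves to by address family."""
    families = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    result = {"IPv4": [], "IPv6": []}
    for family, _, _, _, sockaddr in socket.getaddrinfo(
        hostname, port, type=socket.SOCK_STREAM
    ):
        name = families.get(family)
        if name and sockaddr[0] not in result[name]:
            result[name].append(sockaddr[0])
    return result


def tcp_reachable(address, port=80, timeout=5.0):
    """Return True if a TCP connection to address:port completes in time."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False


def check_host(hostname, port=80):
    """Probe every resolved address and report reachability per family."""
    report = {}
    for family, addresses in resolve_by_family(hostname, port).items():
        report[family] = {addr: tcp_reachable(addr, port) for addr in addresses}
    return report
```

Calling `check_host("example.org")` (a placeholder hostname) returns a dictionary keyed by family. An IPv6 address that reports `False` while the IPv4 address reports `True` would be consistent with the behavior described above, although a clean result from one vantage point does not rule out loss on another path.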

Are you aware of a possible peering problem between Let’s Encrypt’s IPv6 network and Cogent’s? We have also opened a ticket about this problem with Cogent.

Thanks for your help


#2

It seems that you have a problem with Cogent. Here is their answer:

The destination prefix is unfortunately flapping heavily via BGP; it seems the origin has some issues on their side.

Because the route is flapping, we receive new routing information for this destination several times a second, which causes the re-routing you see and the reported packet loss.


#3

Hi @philipperauch,

I’ll ask our operations team to take a look at this. Thanks for reporting!


#4

Cogent is one of my network’s upstreams.

I’m a bit baffled by the assertion that the route is flapping. My router’s view of the path to the IP you listed is:

  174 3356 13649
  Origin IGP, metric 0, localpref 90, valid, external
  Community: 174:21000 174:22013 6082:101 6082:9996
  Last update: Fri Dec 15 23:42:37 2017

That route appears constant since December 15th.

ViaWest is LetsEncrypt’s direct upstream (AS13649). The Cogent path is presently Cogent -> Level3 -> ViaWest.

Alternatively, my view to the same target via Hurricane Electric is:

  6939 209 13649
  Origin IGP, metric 0, localpref 110, valid, internal, best
  Community: 6082:202
  Last update: Sat Dec 16 00:03:12 2017

That’s Hurricane Electric -> CenturyLink/Qwest -> ViaWest

That path also has not been updated for a couple of days.
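
The reasoning above can be sketched in a few lines: translate each AS path into the network names given in this post and compute how long a route has been unchanged from its "Last update" line. With a route flapping several times a second, one would expect that age to be seconds rather than days. This is only an illustration; the helper names are hypothetical, and the observation time passed as `now` is an assumption the caller has to supply.

```python
from datetime import datetime

# AS numbers and names exactly as identified in the post above.
AS_NAMES = {
    174: "Cogent",
    3356: "Level3",
    13649: "ViaWest",
    6939: "Hurricane Electric",
    209: "CenturyLink/Qwest",
}


def describe_path(as_path):
    """Turn an AS path such as '174 3356 13649' into a readable chain."""
    asns = [int(token) for token in as_path.split()]
    return " -> ".join(AS_NAMES.get(asn, f"AS{asn}") for asn in asns)


def parse_last_update(text):
    """Parse a timestamp in the 'Fri Dec 15 23:42:37 2017' format."""
    return datetime.strptime(text, "%a %b %d %H:%M:%S %Y")


def stable_for(last_update, now):
    """Return how long the route has been unchanged as of `now`."""
    return now - parse_last_update(last_update)
```

For example, `describe_path("174 3356 13649")` gives the Cogent -> Level3 -> ViaWest chain described above, and `stable_for("Fri Dec 15 23:42:37 2017", now)` gives the route's age for whatever observation time is passed in.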


#5

@philipperauch,

It may be helpful if you can disclose which IPv6 source address you are attempting access from. Are you using Cogent IPv6 space or advertising your own space into Cogent?


#6

@philipperauch,

As an aside – and completely unrelated to LetsEncrypt – if you’re offering an IPv6 service up to the public, being single-homed on Cogent IPv6 is not the best option.

Cogent’s long history of peering wars with Hurricane Electric essentially means that there’s an IPv6 permanent net split. Those single-homed on Cogent or Hurricane Electric can’t reach others on the opposite side of that divide.

In short, if ubiquitous IPv6 reach is essential to you, you should BGP multi-home with both Cogent and Hurricane Electric, or consider a single-homed setup with someone who bridges the divide. In the US, that could be NTT USA, for example (or lots of others).


#7

LE Ops here. Cogent reached out to ViaWest last week. We traced the route from the VA back to the IP they provided, and there didn’t seem to be any major issues until we were several hops into the Cogent network. I’ve pushed ViaWest again today to find out how they resolved the inquiry.

We did not have any issues with the VA reaching the IP Cogent provided for testing, but we weren’t trying to maintain a TCP session. At least for traceroute/mtr, the lossy points were inside Cogent’s network.

I’ll paste those results here for reference:

Here’s the trace from the firewall:

trace to 2a02:29e8:700:0:1::90 (2a02:29e8:700:0:1::90), 30 hops max, 40/8 byte payload/paddata
1 2600:3000:2700:1073::1 (2600:3000:2700:1073::1) 0.474 ms 0.600 ms 0.329 ms
2 2600:3000:3:720::1 (2600:3000:3:720::1) 1.146 ms 0.864 ms 0.632 ms
3 2600:3000:3:38::1 (2600:3000:3:38::1) 0.778 ms 0.749 ms 0.735 ms
4 2600:3002:2:1b0::2 (2600:3002:2:1b0::2) 1.289 ms 1.161 ms 1.093 ms
5 2001:1900:2100::1c91 (ae6-681.edge2.Chicago2.Level3.net) 11.832 ms 11.815 ms 11.835 ms
6 * * *
7 2001:550:3::39 (be3036.ccr41.lax04.atlas.cogentco.com) 29.340 ms 29.290 ms 29.231 ms
8 * * *
9 * 2001:550:0:1000::9a36:2da1 (be2932.ccr32.phx01.atlas.cogentco.com) 40.096 ms *
10 * * *
11 * 2001:550:0:1000::9a36:1ddd (be2927.ccr41.iah01.atlas.cogentco.com) 41.312 ms *
12 * * *
13 * * *
14 * * *
15 2001:550:0:1000::9a36:1eba (be2317.ccr41.lon13.atlas.cogentco.com) 137.086 ms * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 2001:978:2:e::1:2 (r7-eth-1-1-0-TLL-VS.ee.zonedata.net) 197.609 ms 197.527 ms 197.600 ms
27 2a02:29e8:bce:0:7:1:0:2 (r1-eth-3-2-0-TLL-Linx.ee.zonedata.net) 206.619 ms 197.674 ms 197.758 ms
28 2a02:29e8:700:0:1::90 (data.zone.eu) 197.394 ms 197.527 ms 197.455 ms

ICMP outbound is blocked from the VA itself, but I can reach port 80 on that host from the VA, so the following mtr uses TCP to port 80 instead of ICMP echo:

mtr -bw6TP 80 -c 100 2a02:29e8:700:0:1::90
Start: Wed Dec 13 23:51:51 2017
HOST: va Loss% Snt Last Avg Best Wrst StDev
1.|-- gateway (2600:3000:2710:200::1) 0.0% 100 0.2 0.2 0.2 1.2 0.0
2.|-- 2600:3000:2700:1073::1 0.0% 100 0.7 1.4 0.5 66.1 6.5
3.|-- 2600:3000:3:720::1 0.0% 100 1003. 11.3 1.0 1003. 100.2
4.|-- 2600:3000:3:38::1 0.0% 100 1.0 1.0 0.9 1.9 0.0
5.|-- 2600:3002:2:1b0::2 0.0% 100 1.4 1.9 1.1 5.1 0.7
6.|-- ae6-681.edge2.Chicago2.Level3.net (2001:1900:2100::1c91) 0.0% 100 11.9 12.1 11.9 15.6 0.5
7.|-- lo-0-v6.ear2.LosAngeles1.Level3.net (2001:1900::3:19c) 35.0% 100 8249. 7927. 7118. 14307 1534.4
8.|-- Cogent-level3-100G.LosAngeles1.Level3.net (2001:1900:4:3::4ce) 11.0% 100 7044. 649.2 29.8 7044. 1444.9
9.|-- be3271.ccr41.lax01.atlas.cogentco.com (2001:550:0:1000::9a36:2a65) 32.0% 100 3037. 1474. 30.4 7049. 1961.6
10.|-- be2931.ccr31.phx01.atlas.cogentco.com (2001:550:0:1000::9a36:2c55) 41.0% 100 3047. 1382. 40.2 7056. 2174.8
11.|-- be2929.ccr21.elp01.atlas.cogentco.com (2001:550:0:1000::9a36:2a42) 38.0% 100 3034. 2292. 29.7 7053. 2487.6
12.|-- be2928.ccr42.iah01.atlas.cogentco.com (2001:550:0:1000::9a36:1ea1) 15.0% 100 3048. 1185. 41.2 7059. 1810.6
13.|-- be2690.ccr42.atl01.atlas.cogentco.com (2001:550:0:1000::9a36:1c81) 39.0% 100 7067. 1795. 54.7 7073. 2171.5
14.|-- be2113.ccr42.dca01.atlas.cogentco.com (2001:550:0:1000::9a36:18dd) 39.0% 100 3074. 2482. 67.2 7085. 2338.0
15.|-- be2806.ccr41.jfk02.atlas.cogentco.com (2001:550:0:1000::9a36:2869) 35.0% 100 1071. 1718. 69.3 7085. 2243.2
16.|-- be2490.ccr42.lon13.atlas.cogentco.com (2001:550:0:1000::9a36:2a56) 57.0% 100 3144. 2380. 137.4 7289. 2701.9
17.|-- be12488.ccr42.ams03.atlas.cogentco.com (2001:550:0:1000::8275:332a) 74.0% 100 1147. 2272. 142.4 7211. 2632.5
18.|-- be2814.ccr42.fra03.atlas.cogentco.com (2001:550:0:1000::8275:8e) 99.0% 99 3273. 3273. 3273. 3273. 0.0
19.|-- be2959.ccr21.muc03.atlas.cogentco.com (2001:550:0:1000::9a36:2436) 99.0% 99 3163. 3163. 3163. 3163. 0.0
20.|-- ??? 100.0 98 0.0 0.0 0.0 0.0 0.0
21.|-- be2990.ccr22.bts01.atlas.cogentco.com (2001:550:0:1000::9a36:3b5d) 96.9% 98 1173. 1861. 1173. 3166. 1130.3
22.|-- be3045.ccr21.prg01.atlas.cogentco.com (2001:550:0:1000::9a36:3b69) 95.9% 97 1176. 2676. 1172. 7181. 3003.5
23.|-- be3027.ccr41.ham01.atlas.cogentco.com (2001:550:0:1000::8275:1cd) 84.9% 93 7193. 2341. 177.6 7193. 2733.8
24.|-- be2282.ccr22.sto03.atlas.cogentco.com (2001:550:0:1000::9a36:486a) 75.3% 93 218.7 2840. 194.2 7311. 2869.2
25.|-- te0-0-0-0.rcr12.tll01.atlas.cogentco.com (2001:550:0:1000::9a36:2749) 85.6% 90 1204. 1450. 200.0 7217. 1921.5
26.|-- te0-0-2-2.nr11.b021604-0.tll01.atlas.cogentco.com (2001:550:0:1000::9a19:72) 96.7% 90 3207. 1876. 1207. 3207. 1153.2
27.|-- r7-eth-1-1-0-TLL-VS.ee.zonedata.net (2001:978:2:e::1:2) 0.0% 90 192.3 195.9 192.0 202.7 2.7
28.|-- r1-eth-3-2-0-TLL-Linx.ee.zonedata.net (2a02:29e8:bce:0:7:1:0:2) 0.0% 85 192.3 197.5 192.1 207.6 3.7
29.|-- data.zone.eu (2a02:29e8:700:0:1::90) 0.0% 85 227.3 219.2 192.2 1221. 110.2
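
To summarize a report like the one above, here is a minimal sketch that parses the hop lines of `mtr` report output and picks out the first hop at or above a loss threshold, along with the loss measured at the final hop. The regular expression is an assumption based only on the line layout shown here (including lines such as hop 20, where the `%` sign is missing), and the names `Hop`, `parse_mtr_report`, `first_lossy_hop` and `destination_loss` are hypothetical. Keep in mind that loss reported at an intermediate hop does not by itself prove end-to-end loss, so the final-hop figure is worth reading alongside it.

```python
import re
from typing import NamedTuple


class Hop(NamedTuple):
    number: int
    host: str
    loss_pct: float
    sent: int


# hop number, ".|--", host text, loss (optional "%"), sent count,
# then the five timing columns (Last, Avg, Best, Wrst, StDev).
HOP_RE = re.compile(
    r"^\s*(?P<hop>\d+)\.\|--\s+(?P<host>.+?)\s+"
    r"(?P<loss>\d+(?:\.\d+)?)%?\s+(?P<sent>\d+)"
    r"(?:\s+[\d.]+){5}\s*$"
)


def parse_mtr_report(text):
    """Return a Hop for every line that looks like an mtr hop line."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append(
                Hop(
                    int(match["hop"]),
                    match["host"],
                    float(match["loss"]),
                    int(match["sent"]),
                )
            )
    return hops


def first_lossy_hop(hops, threshold=10.0):
    """Return the first hop whose loss is at or above threshold, else None."""
    for hop in hops:
        if hop.loss_pct >= threshold:
            return hop
    return None


def destination_loss(hops):
    """Return the loss percentage at the last hop, or None if there are no hops."""
    return hops[-1].loss_pct if hops else None
```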


#8

I’m not in the USA, but in France, so I’m not affected by the peering war between Cogent and HE. Cogent works very well with all the ISPs and service providers here.


#9

I’m experiencing IPv6 verification issues as well. I’m on an HE.net IPv6 tunnel. IPv6 availability is verified from another site, but it is apparently broken from the LE servers.

From the ACME authz response (only the IPv6 address is unmodified):

{
  "type": "http-01",
  "status": "invalid",
  "error": {
    "type": "urn:acme:error:connection",
    "detail": "Fetching http://example.org/.well-known/acme-challenge/<snip>: Timeout",
    "status": 400
  },
  "uri": "https://acme-v01.api.letsencrypt.org/acme/challenge/<snip>/<snip>",
  "token": "<snip>",
  "keyAuthorization": "<snip>",
  "validationRecord": [
    {
      "url": "http://example.org/.well-known/acme-challenge/<snip>",
      "hostname": "example.org",
      "port": "80",
      "addressesResolved": [
        "192.0.2.242",
        "2001:470:1f09:2d2:137:205:124:242"
      ],
      "addressUsed": "2001:470:1f09:2d2:137:205:124:242",
      "addressesTried": []
    }
  ]
}

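A short sketch of how a response like the one above could be inspected programmatically: it reads the `status`, `error` and `validationRecord` fields shown in the response and reports which IP version the validation attempt actually used. The field names come from the response itself; the function names `summarize_challenge` and `failed_over_ipv6` are hypothetical, and the input is assumed to be an already-decoded challenge object (for example from `json.loads`).

```python
import ipaddress


def summarize_challenge(challenge):
    """Extract status, error type and per-record address details."""
    error = challenge.get("error") or {}
    records = []
    for record in challenge.get("validationRecord", []):
        used = record.get("addressUsed", "")
        resolved = record.get("addressesResolved", [])
        records.append(
            {
                "hostname": record.get("hostname"),
                "address_used": used,
                # 4 or 6, or None when no address was recorded.
                "ip_version": ipaddress.ip_address(used).version if used else None,
                "resolved_versions": sorted(
                    {ipaddress.ip_address(addr).version for addr in resolved}
                ),
            }
        )
    return {
        "status": challenge.get("status"),
        "error_type": error.get("type"),
        "records": records,
    }


def failed_over_ipv6(challenge):
    """True if the challenge is invalid and an IPv6 address was used."""
    summary = summarize_challenge(challenge)
    return summary["status"] == "invalid" and any(
        record["ip_version"] == 6 for record in summary["records"]
    )
```

Applied to the response above, this would show a hostname that resolves to both an IPv4 and an IPv6 address, with the IPv6 address being the one used for the attempt that timed out.
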