Strange (random?) timeouts on challenge

We see random timeouts on challenges. For example, tohuto.com: without www it is OK, but the www subdomain gets a timeout. Everything seems to be the same (HTTP redirects to HTTPS). Can you see any more details about this timeout? It seems to be IPv6 related, but then it makes no sense that it works without “www.”…

Thank you.
Silver Asu

Hi @silver, sorry to hear you're having a random-seeming problem with validation. That's always frustrating!

Looking at the most recent failure for www.tohuto.com our side started the validation to 2a02:29e8:700:0:1::90 at 2017-12-08 14:26:06.955181+00:00. The result was recorded as a timeout at 2017-12-08 14:26:17.522137+00:00 which is approximately 10s afterwards (the configured single dial timeout). The specific error was "Client.Timeout exceeded while awaiting headers" - that generally indicates the connection was successful but that the HTTP server took too long to respond to the validation request.
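
As a quick check of the arithmetic, the gap between the two timestamps can be computed directly. This is only a minimal sketch; the variable names are illustrative and not part of any validation tooling.

```python
from datetime import datetime

# Timestamps quoted above for the failed validation attempt.
started = datetime.fromisoformat("2017-12-08 14:26:06.955181+00:00")
recorded = datetime.fromisoformat("2017-12-08 14:26:17.522137+00:00")

# Difference between the two timestamps, in seconds.
elapsed = (recorded - started).total_seconds()
print(f"{elapsed:.3f}s")
```

The result is a little over 10 seconds, which is consistent with a 10s timeout plus a small delay before the result was recorded.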

What makes you think it's IPv6 related? I thought so initially but the successful validations for tohuto.com without the www. are to the same IPv6 address: 2a02:29e8:700:0:1::90.

It might be useful to try and use tcpdump or tshark to record the validation attempts and your replies. If you can see your webserver replying to the requests in <10s after receiving them I can help debug further.

We still think it's an IPv6 problem. Can you please show a traceroute from your side to 2a02:29e8:770:0:3::26? We see major packet loss over Cogent. This started on 6 December already, and we see mod_reqtimeout responses to your requests. We rerouted traffic from our side over another uplink provider, but we can’t see what your trace towards us looks like.

2600:3000:2710:200::1d - - [06/Dec/2017:11:23:51 +0200] “-” 408 3550 “-” “-” (—)
2600:3000:2710:200::1d - - [06/Dec/2017:11:34:37 +0200] “-” 408 3541 “-” “-” (—)
2600:3000:2710:200::1d - - [06/Dec/2017:11:35:03 +0200] “-” 408 3535 “-” “-” (—)
2600:3000:2710:200::1d - - [06/Dec/2017:11:52:31 +0200] “-” 408 3526 “-” “-” (—)
2600:3000:2710:200::1d - - [06/Dec/2017:11:54:11 +0200] “-” 408 3523 “-” “-” (—)
2600:3000:2710:200::1d - - [06/Dec/2017:12:12:33 +0200] “-” 408 3520 “-” “-” (—)
2600:3000:2710:200::1d - - [06/Dec/2017:12:12:55 +0200] “-” 408 3538 “-” “-” (—)
2600:3000:2710:200::1d - - [06/Dec/2017:19:16:33 +0200] “-” 408 3520 “-” “-” (—)
2600:3000:2710:200::1d - - [07/Dec/2017:03:18:16 +0200] “-” 408 2988 “-” “-” (—)
2600:3000:2710:200::1d - - [07/Dec/2017:03:18:35 +0200] “-” 408 3523 “-” “-” (—)
2600:3000:2710:200::1d - - [07/Dec/2017:03:34:48 +0200] “-” 408 3538 “-” “-” (—)
2600:3000:2710:200::1d - - [07/Dec/2017:03:41:58 +0200] “-” 408 3523 “-” “-” (—)
2600:3000:2710:200::1d - - [07/Dec/2017:03:46:13 +0200] “-” 408 3529 “-” “-” (—)
2600:3000:2710:200::1d - - [07/Dec/2017:03:54:31 +0200] “-” 408 3520 “-” “-” (—)
2600:3000:2710:200::1d - - [07/Dec/2017:03:56:50 +0200] “-” 408 3579 “-” “-” (—)
2600:3000:2710:200::1d - - [07/Dec/2017:04:15:55 +0200] “-” 408 3538 “-” “-” (—)
2600:3000:2710:200::1d - - [07/Dec/2017:04:35:54 +0200] “-” 408 3538 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:02:58:02 +0200] “-” 408 3526 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:06:25:12 +0200] “-” 408 3538 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:06:26:09 +0200] “-” 408 3529 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:06:33:11 +0200] “-” 408 3523 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:06:49:34 +0200] “-” 408 3532 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:06:53:36 +0200] “-” 408 3514 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:07:11:18 +0200] “-” 408 3523 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:07:15:44 +0200] “-” 408 3526 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:07:32:10 +0200] “-” 408 3514 “-” “-” (—)
2600:3000:2710:200::1d - - [08/Dec/2017:07:44:41 +0200] “-” 408 3514 “-” “-” (—)
2600:3000:2710:200::1d - - [09/Dec/2017:02:32:48 +0200] “-” 408 3520 “-” “-” (—)
2600:3000:2710:200::1d - - [09/Dec/2017:02:42:11 +0200] “-” 408 3526 “-” “-” (—)
2600:3000:2710:200::1d - - [09/Dec/2017:03:05:26 +0200] “-” 408 3553 “-” “-” (—)
2600:3000:2710:200::1d - - [09/Dec/2017:03:08:40 +0200] “-” 408 3520 “-” “-” (—)
2600:3000:2710:200::1d - - [09/Dec/2017:10:12:45 +0200] “-” 408 3526 “-” “-” (—)
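
For anyone wanting to summarize a log like this, a rough sketch of counting 408 responses per day follows. It assumes the access-log layout shown above (client, two placeholder fields, bracketed timestamp, quoted request, status); the regex and the function name `count_408_by_day` are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Three lines from the log above, trimmed after the byte count.
# A real run would read the full access log from a file instead.
sample = """\
2600:3000:2710:200::1d - - [06/Dec/2017:11:23:51 +0200] “-” 408 3550
2600:3000:2710:200::1d - - [07/Dec/2017:03:18:16 +0200] “-” 408 2988
2600:3000:2710:200::1d - - [07/Dec/2017:03:18:35 +0200] “-” 408 3523
"""

# Groups: client address, date, status. Accepts straight or curly quotes.
LINE = re.compile(
    r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):[^\]]+\] [“"].*?[”"] (\d{3})\b'
)

def count_408_by_day(text):
    """Count 408 (Request Timeout) responses per calendar day."""
    counts = Counter()
    for line in text.splitlines():
        m = LINE.match(line)
        if m and m.group(3) == "408":
            counts[m.group(2)] += 1
    return counts

print(count_408_by_day(sample))
```

A request logged as “-” with status 408 generally means the server gave up before a complete request line arrived, which would fit with packets being lost on the way in.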

mod_reqtimeout config:
RequestReadTimeout header=15-30,MinRate=500 body=20,MinRate=1000
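
To make the directive easier to read: as described in the Apache mod_reqtimeout documentation, `header=15-30,MinRate=500` allows 15 seconds for the request headers, extended by one second for every 500 bytes received, up to 30 seconds, and `body=20,MinRate=1000` allows 20 seconds for the body, extended by one second per 1000 bytes with no upper limit. The sketch below is a simplified model of that rule, not the module's actual implementation; `read_deadline` is a hypothetical helper name.

```python
def read_deadline(bytes_received, initial, min_rate, maximum=None):
    """Seconds allowed for one request stage under RequestReadTimeout.

    Simplified model: start at `initial` seconds and add one second per
    `min_rate` bytes received, capped at `maximum` when one is given.
    """
    allowed = initial + bytes_received // min_rate
    return allowed if maximum is None else min(allowed, maximum)

# header=15-30,MinRate=500
print(read_deadline(0, 15, 500, 30))      # 15: no bytes received yet
print(read_deadline(20000, 15, 500, 30))  # 30: capped at the maximum
# body=20,MinRate=1000 (no upper limit)
print(read_deadline(5000, 20, 1000))      # 25
```

With no header bytes arriving, the server waits 15 seconds before answering 408, which is longer than the roughly 10 seconds the validation side reported waiting.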

trace (sorry, I had to remove some hops because I can only post 20 “links”):
Host Loss% Last Avg Best Wrst StDev
1. r1-ve-700-0-TLL-Linx.ee.zonedata.net 0.0% 0.3 3.3 0.2 9.8 3.7
2. r7-eth-3-1-0-TLL-VS.ee.zonedata.net 0.0% 0.3 0.7 0.2 7.6 1.6
3. 2001:978:2:e::1:1 0.0% 1.4 1.3 1.1 1.8 0.2
4. te0-0-0-3.rcr12.tll01.atlas.cogentco.com 0.0% 1.4 1.3 1.1 1.8 0.1
5. te0-7-0-16.ccr22.sto03.atlas.cogentco.com 0.0% 6.5 6.4 6.2 7.6 0.3
6. be2282.ccr42.ham01.atlas.cogentco.com 0.0% 25.0 24.8 24.6 25.1 0.1
7. be3029.ccr21.prg01.atlas.cogentco.com 52.6% 32.8 32.7 32.5 33.0 0.1
8. be3027.ccr41.ham01.atlas.cogentco.com 72.2% 37.0 34.2 29.1 38.0 4.6
/—/
21. 2600:3000:3:4f4::2 91.7% 190.1 190.1 190.1 190.1 0.0
22. be3044.ccr21.bts01.atlas.cogentco.com 91.7% 74.8 74.8 74.8 74.8 0.0
23. 2001:550:3::3e 83.3% 75.0 114.7 75.0 154.4 56.2
be3044.ccr21.prg01.atlas.cogentco.com
24. be3044.ccr21.bts01.atlas.cogentco.com 91.7% 79.8 79.8 79.8 79.8 0.0
25. ???
26. be12488.ccr42.lon13.atlas.cogentco.com 91.7% 106.8 106.8 106.8 106.8 0.0
27. be2814.ccr42.ams03.atlas.cogentco.com 90.9% 99.7 99.7 99.7 99.7 0.0
28. ???
29. 2600:3000:0:2::85 80.0% 201.0 201.0 201.0 201.0 0.0
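
A small sketch for pulling the first lossy hop out of a report like this one. It assumes the column layout shown above (hop number, host, loss percentage, then latency figures); the function name `first_lossy_hop` and the 10% threshold are illustrative choices.

```python
import re

# A subset of hops copied from the trace above.
trace = """\
6. be2282.ccr42.ham01.atlas.cogentco.com 0.0% 25.0 24.8 24.6 25.1 0.1
7. be3029.ccr21.prg01.atlas.cogentco.com 52.6% 32.8 32.7 32.5 33.0 0.1
8. be3027.ccr41.ham01.atlas.cogentco.com 72.2% 37.0 34.2 29.1 38.0 4.6
25. ???
"""

# Groups: hop number, host, loss percentage. Hops shown as "???" do not match.
HOP = re.compile(r"^(\d+)\. (\S+) ([\d.]+)%")

def first_lossy_hop(text, threshold=10.0):
    """Return (hop, host, loss %) for the first hop above the threshold."""
    for line in text.splitlines():
        m = HOP.match(line)
        if m and float(m.group(3)) > threshold:
            return int(m.group(1)), m.group(2), float(m.group(3))
    return None

print(first_lossy_hop(trace))
```

Loss at a single intermediate hop can just be ICMP rate limiting on that router, so it matters most when the loss carries through to the final hop, as it does in the trace above.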

Looks like our reroute helped and we do not see timeouts anymore.
But while analyzing the logs, we see that if a domain resolves to both IPv4 and IPv6, you try only IPv6 (no random choice and no fallback to IPv4 on timeout).
Maybe you could implement Happy Eyeballs: https://en.wikipedia.org/wiki/Happy_Eyeballs
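
To illustrate the idea, here is a rough sketch in Python, which supports Happy Eyeballs in `asyncio` from version 3.8 on. It is not how the validation servers work and only shows the technique: addresses are interleaved by family, and a second connection attempt starts if the first has not completed within a short delay. The helper names `interleave_families` and `connect` are hypothetical, and 192.0.2.10 is a made-up IPv4 address from the documentation range.

```python
import asyncio
import ipaddress
from itertools import zip_longest

def interleave_families(addresses):
    """Order addresses IPv6, IPv4, IPv6, ... so neither family is starved."""
    v6 = [a for a in addresses if ipaddress.ip_address(a).version == 6]
    v4 = [a for a in addresses if ipaddress.ip_address(a).version == 4]
    return [a for pair in zip_longest(v6, v4) for a in pair if a is not None]

async def connect(host, port=80):
    # happy_eyeballs_delay starts the next attempt if the current one has
    # not completed within 0.25s; the first connection to succeed is used.
    reader, writer = await asyncio.open_connection(
        host, port, happy_eyeballs_delay=0.25
    )
    writer.close()
    await writer.wait_closed()

# The IPv4 address here is hypothetical; the IPv6 one is from this thread.
print(interleave_families(["192.0.2.10", "2a02:29e8:700:0:1::90"]))
```

Note that Happy Eyeballs races connection setup only. If the connection succeeds over IPv6 but the response then stalls, a fallback would also have to retry the request itself over IPv4.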

Hi

I have noticed the same behavior for the last 4-5 days. Certificate renewal over Cogent IPv6 works only intermittently (most of the time we get timeouts).

Once IPv6 is deactivated, it works.
We don’t have this problem over other IPv6 networks.

Do you have a peering problem with Cogent on IPv6?
