And for what it's worth, it's both primary and secondary validation, and it affects A record lookups from this domain, too.
I did that. Check the 1st message in this thread, pls. It's been updated almost immediately.
When was the last successful challenge?
@petercooperjr I guess it was a couple of months ago, when the automatic renewal happened. Unfortunately, I don't have that log anymore.
@jcjones please, let me know if you need anything from me
It is difficult to correlate the timestamps to validate that claim:
Have you considered adding another DSP?
I've had good luck with:
Here's a debug log of the resolution from a production VA resolver:
Sep 13 17:50:19 info: resolving _acme-challenge.abisoft.spb.ru. TXT IN
Sep 13 17:50:19 info: priming . IN NS
Sep 13 17:50:19 info: response for . NS IN
Sep 13 17:50:19 info: reply from <.> 2001:7fe::53#53
Sep 13 17:50:19 info: query response was ANSWER
Sep 13 17:50:19 info: priming successful for . NS IN
Sep 13 17:50:19 info: response for _acme-challenge.abisoft.spb.ru. TXT IN
Sep 13 17:50:19 info: reply from <.> 198.97.190.53#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: resolving b.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: resolving f.dns.ripn.net. A IN
Sep 13 17:50:19 info: resolving a.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: resolving f.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: resolving b.dns.ripn.net. A IN
Sep 13 17:50:19 info: response for a.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: reply from <.> 192.36.148.17#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for b.dns.ripn.net. A IN
Sep 13 17:50:19 info: reply from <.> 2001:500:2::c#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for f.dns.ripn.net. A IN
Sep 13 17:50:19 info: reply from <.> 198.97.190.53#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for f.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: reply from <.> 2001:500:2::c#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for b.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: reply from <.> 2001:500:2::c#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for b.dns.ripn.net. A IN
Sep 13 17:50:19 info: reply from <net.> 2001:500:856e::30#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for f.dns.ripn.net. A IN
Sep 13 17:50:19 info: reply from <net.> 2001:503:a83e::2:30#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for f.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: reply from <net.> 2001:502:7094::30#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for b.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: reply from <net.> 2001:500:856e::30#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for a.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: reply from <net.> 192.43.172.30#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: response for _acme-challenge.abisoft.spb.ru. TXT IN
Sep 13 17:50:19 info: reply from <ru.> 193.232.128.6#53
Sep 13 17:50:19 info: query response was REFERRAL
Sep 13 17:50:19 info: resolving ns4-geo.nic.ru. AAAA IN
Sep 13 17:50:19 info: resolving ns8-geo.nic.ru. A IN
Sep 13 17:50:19 info: resolving ns3-geo.nic.ru. AAAA IN
Sep 13 17:50:19 info: resolving ns4-geo.nic.ru. A IN
Sep 13 17:50:19 info: resolving ns8-geo.nic.ru. AAAA IN
Sep 13 17:50:19 info: resolving ns3-geo.nic.ru. A IN
Sep 13 17:50:19 info: response for f.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: reply from <ripn.net.> 2001:678:17:0:193:232:128:6#53
Sep 13 17:50:19 info: query response was ANSWER
Sep 13 17:50:19 info: response for f.dns.ripn.net. A IN
Sep 13 17:50:19 info: reply from <ripn.net.> 2001:678:17:0:193:232:128:6#53
Sep 13 17:50:19 info: query response was ANSWER
Sep 13 17:50:19 info: response for b.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: reply from <ripn.net.> 194.85.252.62#53
Sep 13 17:50:19 info: query response was ANSWER
Sep 13 17:50:19 info: response for b.dns.ripn.net. A IN
Sep 13 17:50:19 info: reply from <ripn.net.> 193.232.142.17#53
Sep 13 17:50:19 info: query response was ANSWER
Sep 13 17:50:19 info: response for a.dns.ripn.net. AAAA IN
Sep 13 17:50:19 info: reply from <ripn.net.> 2001:678:15:0:193:232:142:17#53
Sep 13 17:50:19 info: query response was ANSWER
Sep 13 17:50:20 info: response for ns8-geo.nic.ru. AAAA IN
Sep 13 17:50:20 info: reply from <ru.> 2001:678:17:0:193:232:128:6#53
Sep 13 17:50:20 info: query response was REFERRAL
Sep 13 17:50:20 info: response for ns4-geo.nic.ru. A IN
Sep 13 17:50:20 info: reply from <ru.> 194.85.252.62#53
Sep 13 17:50:20 info: query response was REFERRAL
Sep 13 17:50:20 info: response for ns3-geo.nic.ru. AAAA IN
Sep 13 17:50:20 info: reply from <ru.> 193.232.128.6#53
Sep 13 17:50:20 info: query response was REFERRAL
Sep 13 17:50:20 info: response for ns8-geo.nic.ru. A IN
Sep 13 17:50:20 info: reply from <ru.> 2001:678:15:0:193:232:142:17#53
Sep 13 17:50:20 info: query response was REFERRAL
Sep 13 17:50:20 info: response for ns3-geo.nic.ru. A IN
Sep 13 17:50:20 info: reply from <ru.> 2001:678:18:0:194:190:124:17#53
Sep 13 17:50:20 info: query response was REFERRAL
Sep 13 17:50:20 info: response for ns4-geo.nic.ru. AAAA IN
Sep 13 17:50:20 info: reply from <ru.> 194.190.124.17#53
Sep 13 17:50:20 info: query response was REFERRAL
Sep 13 17:50:20 info: response for ns8-geo.nic.ru. AAAA IN
Sep 13 17:50:20 info: reply from <NIC.RU.> 31.177.74.100#53
Sep 13 17:50:20 info: query response was nodata ANSWER
Sep 13 17:50:20 info: response for ns4-geo.nic.ru. A IN
Sep 13 17:50:20 info: reply from <nic.ru.> 31.177.67.100#53
Sep 13 17:50:20 info: query response was ANSWER
Sep 13 17:50:20 info: response for ns3-geo.nic.ru. AAAA IN
Sep 13 17:50:20 info: reply from <nic.ru.> 2a02:2090:e400:7000:31:177:85:186#53
Sep 13 17:50:20 info: query response was nodata ANSWER
Sep 13 17:50:20 info: response for ns8-geo.nic.ru. A IN
Sep 13 17:50:20 info: reply from <nic.ru.> 2a02:2090:ec00:9040:31:177:74:100#53
Sep 13 17:50:20 info: query response was ANSWER
Sep 13 17:50:20 info: response for ns3-geo.nic.ru. A IN
Sep 13 17:50:20 info: reply from <nic.ru.> 2a02:2090:e800:9000:31:177:67:100#53
Sep 13 17:50:20 info: query response was ANSWER
Sep 13 17:50:20 info: response for ns4-geo.nic.ru. AAAA IN
Sep 13 17:50:20 info: reply from <nic.ru.> 31.177.67.100#53
Sep 13 17:50:20 info: query response was nodata ANSWER
Sep 13 17:50:29 info: resolving _acme-challenge.abisoft.spb.ru. TXT IN
Sep 13 17:50:29 info: resolving ns4-geo.nic.ru. AAAA IN
Sep 13 17:50:29 info: resolving ns8-geo.nic.ru. AAAA IN
Sep 13 17:50:29 info: resolving ns3-geo.nic.ru. AAAA IN
Sep 13 17:50:51 info: Capsforid: timeouts, starting fallback
Sep 13 17:51:07 info: control cmd: stats_noreset
Sep 13 17:51:13 info: resolving ns4-geo.nic.ru. AAAA IN
Sep 13 17:51:13 info: resolving ns4-geo.nic.ru. A IN
Sep 13 17:51:14 info: response for ns4-geo.nic.ru. A IN
Sep 13 17:51:14 info: reply from <ru.> 2001:678:17:0:193:232:128:6#53
Sep 13 17:51:14 info: query response was REFERRAL
Sep 13 17:51:14 info: response for ns4-geo.nic.ru. AAAA IN
Sep 13 17:51:14 info: reply from <ru.> 193.232.142.17#53
Sep 13 17:51:14 info: query response was REFERRAL
Sep 13 17:51:14 info: response for ns4-geo.nic.ru. A IN
Sep 13 17:51:14 info: reply from <NIC.RU.> 2a02:2090:e800:9000:31:177:67:100#53
Sep 13 17:51:14 info: query response was ANSWER
Sep 13 17:51:14 info: response for ns4-geo.nic.ru. AAAA IN
Sep 13 17:51:14 info: reply from <NIC.RU.> 31.177.85.186#53
Sep 13 17:51:14 info: query response was nodata ANSWER
Sep 13 17:51:14 info: resolving ns8-geo.nic.ru. AAAA IN
Sep 13 17:51:14 info: resolving ns8-geo.nic.ru. A IN
Sep 13 17:51:14 info: response for ns8-geo.nic.ru. A IN
Sep 13 17:51:14 info: reply from <ru.> 193.232.156.17#53
Sep 13 17:51:14 info: query response was REFERRAL
Sep 13 17:51:14 info: response for ns8-geo.nic.ru. A IN
Sep 13 17:51:14 info: reply from <NIC.RU.> 31.177.85.186#53
Sep 13 17:51:14 info: query response was ANSWER
Sep 13 17:51:15 info: response for ns8-geo.nic.ru. AAAA IN
Sep 13 17:51:15 info: reply from <ru.> 2001:678:18:0:194:190:124:17#53
Sep 13 17:51:15 info: query response was REFERRAL
Sep 13 17:51:15 info: response for ns8-geo.nic.ru. AAAA IN
Sep 13 17:51:15 info: reply from <nic.ru.> 31.177.74.100#53
Sep 13 17:51:15 info: query response was nodata ANSWER
Sep 13 17:51:15 info: resolving ns3-geo.nic.ru. AAAA IN
Sep 13 17:51:15 info: resolving ns3-geo.nic.ru. A IN
Sep 13 17:51:15 info: response for ns3-geo.nic.ru. AAAA IN
Sep 13 17:51:15 info: reply from <ru.> 194.85.252.62#53
Sep 13 17:51:15 info: query response was REFERRAL
Sep 13 17:51:15 info: response for ns3-geo.nic.ru. AAAA IN
Sep 13 17:51:15 info: reply from <nic.ru.> 31.177.67.100#53
Sep 13 17:51:15 info: query response was nodata ANSWER
Sep 13 17:51:15 info: response for ns3-geo.nic.ru. A IN
Sep 13 17:51:15 info: reply from <ru.> 194.190.124.17#53
Sep 13 17:51:15 info: query response was REFERRAL
Sep 13 17:51:15 info: response for ns3-geo.nic.ru. A IN
Sep 13 17:51:15 info: reply from <nic.ru.> 31.177.85.186#53
Sep 13 17:51:15 info: query response was ANSWER
Sep 13 17:51:20 info: resolving _acme-challenge.abisoft.spb.ru. TXT IN
Sep 13 17:51:36 info: resolving _acme-challenge.abisoft.spb.ru. TXT IN
Sep 13 17:51:36 info: priming . IN NS
Sep 13 17:51:36 info: response for . NS IN
Sep 13 17:51:36 info: reply from <.> 2001:500:2f::f#53
Sep 13 17:51:36 info: query response was ANSWER
Sep 13 17:51:36 info: priming successful for . NS IN
Sep 13 17:51:36 info: response for _acme-challenge.abisoft.spb.ru. TXT IN
Sep 13 17:51:36 info: reply from <.> 192.33.4.12#53
Sep 13 17:51:36 info: Capsforid: reply is equal. go to next fallback
Sep 13 17:51:36 info: response for _acme-challenge.abisoft.spb.ru. TXT IN
Sep 13 17:51:36 info: reply from <.> 2001:7fe::53#53
Sep 13 17:51:36 info: Capsforid fallback: getting different replies, failed
Sep 13 17:51:37 info: control cmd: stats_noreset
Sep 13 17:52:07 info: control cmd: stats_noreset
So this appears to be a capsforid problem coming out of the annals of Let's Encrypt validation history.
It's a bit hard for me to understand unbound's logging there, but can you tell if the problem is with the domain's DNS servers, or the ns*-geo.nic.ru. ones earlier in the process? And I would guess that it's not so much the case-randomization itself, but dropped packets in at least one direction, meaning that it's trying without the case-randomization as a fallback even though that isn't actually helping in this case.
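For anyone following along who hasn't run into capsforid (DNS 0x20 encoding) before, here is a minimal Python sketch of the idea: the resolver randomizes the letter case of the query name and expects the reply's question section to echo it back exactly. The function names are made up for illustration, and this is not Unbound's code.

```python
import random

def randomize_case(qname: str, rng: random.Random) -> str:
    """Give each letter of the name a random case (the 0x20 trick)."""
    return "".join(
        (c.upper() if rng.random() < 0.5 else c.lower()) if c.isalpha() else c
        for c in qname
    )

def echo_matches(sent_qname: str, reply_qname: str) -> bool:
    """A reply passes only if the question name comes back byte-for-byte."""
    return sent_qname == reply_qname

rng = random.Random(20)  # fixed seed so the example is repeatable
sent = randomize_case("_acme-challenge.abisoft.spb.ru.", rng)

# A server that echoes the question verbatim passes the check.
print(sent, echo_matches(sent, sent))
# A server that normalizes the question to lowercase fails the check
# whenever at least one letter was sent in uppercase.
print(sent.lower(), echo_matches(sent, sent.lower()))
```

The check is deliberately strict: the names are the same as far as DNS is concerned, but an off-path attacker who cannot see the query has to guess the case pattern as well as the query ID.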
I am inexpert with debugging Unbound. I'm hoping the log will shed some light for folk with more experience than me.
I can do more poking later, but I've got to take care of a few things first.
Name: i.root-servers.net
Address: 2001:7fe::53
Is this a problem within the root servers?
I think not the root servers, but the .ru server.
If I query the root servers for what NS to use first, it seems alright:
$ dig +norecurse _Acme-challenge.Abisoft.Spb.Ru. @a.root-servers.net.
; <<>> DiG 9.16.42-RH <<>> +norecurse _Acme-challenge.Abisoft.Spb.Ru. @a.root-servers.net.
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3280
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 5, ADDITIONAL: 11
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_Acme-challenge.Abisoft.Spb.Ru. IN A
;; AUTHORITY SECTION:
Ru. 172800 IN NS a.dns.ripn.net.
Ru. 172800 IN NS e.dns.ripn.net.
Ru. 172800 IN NS f.dns.ripn.net.
Ru. 172800 IN NS d.dns.ripn.net.
Ru. 172800 IN NS b.dns.ripn.net.
;; ADDITIONAL SECTION:
a.dns.ripn.net. 172800 IN A 193.232.128.6
a.dns.ripn.net. 172800 IN AAAA 2001:678:17:0:193:232:128:6
e.dns.ripn.net. 172800 IN A 193.232.142.17
e.dns.ripn.net. 172800 IN AAAA 2001:678:15:0:193:232:142:17
f.dns.ripn.net. 172800 IN A 193.232.156.17
f.dns.ripn.net. 172800 IN AAAA 2001:678:14:0:193:232:156:17
d.dns.ripn.net. 172800 IN A 194.190.124.17
d.dns.ripn.net. 172800 IN AAAA 2001:678:18:0:194:190:124:17
b.dns.ripn.net. 172800 IN A 194.85.252.62
b.dns.ripn.net. 172800 IN AAAA 2001:678:16:0:194:85:252:62
;; Query time: 0 msec
;; SERVER: 2001:503:ba3e::2:30#53(2001:503:ba3e::2:30)
;; WHEN: Wed Sep 13 18:22:47 UTC 2023
;; MSG SIZE rcvd: 371
And then if I ask one of those ru. servers next,
$ dig +norecurse _Acme-challenge.Abisoft.Spb.Ru. @2001:678:17:0:193:232:128:6
; <<>> DiG 9.16.42-RH <<>> +norecurse _Acme-challenge.Abisoft.Spb.Ru. @2001:678:17:0:193:232:128:6
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20873
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_Acme-challenge.Abisoft.Spb.Ru. IN A
;; AUTHORITY SECTION:
SPB.RU. 345600 IN NS ns8-geo.nic.RU.
SPB.RU. 345600 IN NS ns4-geo.nic.RU.
SPB.RU. 345600 IN NS ns3-geo.nic.RU.
;; Query time: 119 msec
;; SERVER: 2001:678:17:0:193:232:128:6#53(2001:678:17:0:193:232:128:6)
;; WHEN: Wed Sep 13 18:23:22 UTC 2023
;; MSG SIZE rcvd: 135
It replies with an all-uppercase authority section. Which seems weird to me, but we're rapidly getting to the end of my DNS understanding. I would have expected it to be all lowercase, or to (ideally) echo the case of the Spb.Ru in my query.
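To make that comparison less eyeball-dependent, here is a rough Python sketch that pulls the owner names out of the AUTHORITY section of dig's text output and checks whether they keep the case used in the query. The helper names are hypothetical, and the parsing assumes the plain dig layout shown above.

```python
def authority_owner_names(dig_output: str) -> list[str]:
    """Return the owner names listed under ';; AUTHORITY SECTION:'."""
    names, in_section = [], False
    for line in dig_output.splitlines():
        line = line.strip()
        if line.startswith(";; AUTHORITY SECTION:"):
            in_section = True
            continue
        if in_section:
            if not line or line.startswith(";"):
                break  # a blank line or the next ';;' header ends the section
            names.append(line.split()[0])
    return names

def echoes_query_case(qname: str, owner: str) -> bool:
    """True if owner is a label-aligned suffix of qname with identical case."""
    return qname == owner or qname.endswith("." + owner)

sample = """;; QUESTION SECTION:
;_Acme-challenge.Abisoft.Spb.Ru. IN A
;; AUTHORITY SECTION:
SPB.RU. 345600 IN NS ns8-geo.nic.RU.
SPB.RU. 345600 IN NS ns4-geo.nic.RU.
SPB.RU. 345600 IN NS ns3-geo.nic.RU.
;; Query time: 119 msec
"""

owners = authority_owner_names(sample)
for owner in sorted(set(owners)):
    print(owner, echoes_query_case("_Acme-challenge.Abisoft.Spb.Ru.", owner))
# prints: SPB.RU. False
```

Note this only looks at the authority section. The capsforid check itself is about the question name in the reply, so a mismatch here is a hint about how the server treats case, not proof of a failure.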
I see what you mean, and it does seem out of turn:
# dig NS Spb.Ru. @8.8.8.8
; <<>> DiG 9.18.12-0ubuntu0.22.04.2-Ubuntu <<>> NS Spb.Ru. @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17840
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;Spb.Ru. IN NS
;; ANSWER SECTION:
Spb.Ru. 21600 IN NS ns4-geo.nic.Ru.
Spb.Ru. 21600 IN NS ns8-geo.nic.Ru.
Spb.Ru. 21600 IN NS ns3-geo.nic.Ru.
;; Query time: 163 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
;; WHEN: Wed Sep 13 18:28:45 UTC 2023
;; MSG SIZE rcvd: 105
dig NS Spb.Ru. @2001:678:17:0:193:232:128:6
; <<>> DiG 9.18.12-0ubuntu0.22.04.2-Ubuntu <<>> NS Spb.Ru. @2001:678:17:0:193:232:128:6
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5300
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;Spb.Ru. IN NS
;; AUTHORITY SECTION:
SPB.RU. 345600 IN NS ns4-geo.nic.RU.
SPB.RU. 345600 IN NS ns8-geo.nic.RU.
SPB.RU. 345600 IN NS ns3-geo.nic.RU.
;; Query time: 147 msec
;; SERVER: 2001:678:17:0:193:232:128:6#53(2001:678:17:0:193:232:128:6) (UDP)
;; WHEN: Wed Sep 13 18:28:56 UTC 2023
;; MSG SIZE rcvd: 111
Google DNS does keep the upper-lower case.
Even weirder with DNSSEC on:
$ dig +norecurse +dnssec _Acme-challenge.Abisoft.Spb.Ru. @2001:678:17:0:193:232:128:6
; <<>> DiG 9.16.42-RH <<>> +norecurse +dnssec _Acme-challenge.Abisoft.Spb.Ru. @2001:678:17:0:193:232:128:6
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16262
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 5, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;_Acme-challenge.Abisoft.Spb.Ru. IN A
;; AUTHORITY SECTION:
SPB.RU. 345600 IN NS ns8-geo.nic.RU.
SPB.RU. 345600 IN NS ns3-geo.nic.RU.
SPB.RU. 345600 IN NS ns4-geo.nic.RU.
SPB.RU. 345600 IN DS 45820 8 2 826F21EA47462EC6F09A3CEA7B1088E801096FCB492CAEAA7BF2C08C 2204127C
spb.ru. 345600 IN RRSIG DS 8 2 345600 20230930091201 20230821081813 35208 ru. vTzlLovm7/2SgEk3yBchKIfFBOgSmdrl/twcaTsL0j3izJyEAJAzHVkL fuS6RJFy8YT/kqv0CZ6rYxiH7PWYoAlVFEeSnLFIQFjtDG44P50Zzurh +al10qtk2OqQimAstgeKOHF6ePo7B9BKwEJr1x/2URhG9zcMX51xAZiR ONQ=
;; Query time: 129 msec
;; SERVER: 2001:678:17:0:193:232:128:6#53(2001:678:17:0:193:232:128:6)
;; WHEN: Wed Sep 13 18:31:17 UTC 2023
;; MSG SIZE rcvd: 351
The authority NS records are in uppercase, but the RRSIG is all-lowercase. Not sure if that's actually the problem, as again this is stretching my knowledge of DNS, but I feel like something somewhere, possibly unbound, might reject it (though whether doing so would be "correct" or not, I really have no idea).
Even weirder with IPv4 and IPv6 inconsistencies:
Name: a.dns.ripn.net
Addresses: 2001:678:17:0:193:232:128:6
193.232.128.6
dig NS Spb.rU. @2001:678:17:0:193:232:128:6
; <<>> DiG 9.18.12-0ubuntu0.22.04.2-Ubuntu <<>> NS Spb.rU. @2001:678:17:0:193:232:128:6
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46823
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;Spb.rU. IN NS
;; AUTHORITY SECTION:
SPB.RU. 345600 IN NS ns8-geo.nic.RU.
SPB.RU. 345600 IN NS ns4-geo.nic.RU.
SPB.RU. 345600 IN NS ns3-geo.nic.RU.
;; Query time: 147 msec
;; SERVER: 2001:678:17:0:193:232:128:6#53(2001:678:17:0:193:232:128:6) (UDP)
;; WHEN: Wed Sep 13 18:35:19 UTC 2023
;; MSG SIZE rcvd: 111
dig NS Spb.rU. @193.232.128.6
; <<>> DiG 9.18.12-0ubuntu0.22.04.2-Ubuntu <<>> NS Spb.rU. @193.232.128.6
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38708
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;Spb.rU. IN NS
;; AUTHORITY SECTION:
SPB.RU. 345600 IN NS ns8-geo.nic.ru.
SPB.RU. 345600 IN NS ns3-geo.nic.ru.
SPB.RU. 345600 IN NS ns4-geo.nic.ru.
;; Query time: 147 msec
;; SERVER: 193.232.128.6#53(193.232.128.6) (UDP)
;; WHEN: Wed Sep 13 18:35:29 UTC 2023
;; MSG SIZE rcvd: 113
It's not actually v4 vs v6, I don't think; I'm sometimes getting the capital .RU. at the end of the NS record, and sometimes not, in consecutive queries to the same server:
$ dig +norecurse +dnssec +bufsize=512 TXT _Acme-challenge.Abisoft.Spb.Ru. @193.232.128.6
; <<>> DiG 9.16.42-RH <<>> +norecurse +dnssec +bufsize TXT _Acme-challenge.Abisoft.Spb.Ru. @193.232.128.6
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53720
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 5, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;_Acme-challenge.Abisoft.Spb.Ru. IN TXT
;; AUTHORITY SECTION:
SPB.RU. 345600 IN NS ns8-geo.nic.ru.
SPB.RU. 345600 IN NS ns3-geo.nic.ru.
SPB.RU. 345600 IN NS ns4-geo.nic.ru.
SPB.RU. 345600 IN DS 45820 8 2 826F21EA47462EC6F09A3CEA7B1088E801096FCB492CAEAA7BF2C08C 2204127C
spb.ru. 345600 IN RRSIG DS 8 2 345600 20230930091201 20230821081813 35208 ru. vTzlLovm7/2SgEk3yBchKIfFBOgSmdrl/twcaTsL0j3izJyEAJAzHVkL fuS6RJFy8YT/kqv0CZ6rYxiH7PWYoAlVFEeSnLFIQFjtDG44P50Zzurh +al10qtk2OqQimAstgeKOHF6ePo7B9BKwEJr1x/2URhG9zcMX51xAZiR ONQ=
;; Query time: 119 msec
;; SERVER: 193.232.128.6#53(193.232.128.6)
;; WHEN: Wed Sep 13 18:43:55 UTC 2023
;; MSG SIZE rcvd: 351
$ dig +norecurse +dnssec +bufsize=512 TXT _Acme-challenge.Abisoft.Spb.Ru. @193.232.128.6
; <<>> DiG 9.16.42-RH <<>> +norecurse +dnssec +bufsize TXT _Acme-challenge.Abisoft.Spb.Ru. @193.232.128.6
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36711
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 5, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;_Acme-challenge.Abisoft.Spb.Ru. IN TXT
;; AUTHORITY SECTION:
SPB.RU. 345600 IN NS ns4-geo.nic.RU.
SPB.RU. 345600 IN NS ns3-geo.nic.RU.
SPB.RU. 345600 IN NS ns8-geo.nic.RU.
SPB.RU. 345600 IN DS 45820 8 2 826F21EA47462EC6F09A3CEA7B1088E801096FCB492CAEAA7BF2C08C 2204127C
spb.ru. 345600 IN RRSIG DS 8 2 345600 20230930091201 20230821081813 35208 ru. vTzlLovm7/2SgEk3yBchKIfFBOgSmdrl/twcaTsL0j3izJyEAJAzHVkL fuS6RJFy8YT/kqv0CZ6rYxiH7PWYoAlVFEeSnLFIQFjtDG44P50Zzurh +al10qtk2OqQimAstgeKOHF6ePo7B9BKwEJr1x/2URhG9zcMX51xAZiR ONQ=
;; Query time: 129 msec
;; SERVER: 193.232.128.6#53(193.232.128.6)
;; WHEN: Wed Sep 13 18:44:04 UTC 2023
;; MSG SIZE rcvd: 351
I don't know whether any of that's actually relevant to the problem, but it's possible that there's some sort of load balancing going on that gives different non-case-echoing responses, which Unbound is unhappy about.
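As a quick illustration of what "different" means for the two replies above, here is a small Python sketch comparing their NS targets both exactly and case-insensitively. The record lines are copied from the two dig outputs; the helper name is made up, and whether Unbound's own comparison is case-sensitive on this data is not something established in this thread.

```python
def ns_targets(authority_lines):
    """Collect NS target names from 'owner ttl class NS target' lines."""
    targets = set()
    for line in authority_lines:
        fields = line.split()
        if len(fields) == 5 and fields[3] == "NS":
            targets.add(fields[4])
    return targets

# First reply from 193.232.128.6 (18:43:55)
first = [
    "SPB.RU. 345600 IN NS ns8-geo.nic.ru.",
    "SPB.RU. 345600 IN NS ns3-geo.nic.ru.",
    "SPB.RU. 345600 IN NS ns4-geo.nic.ru.",
]
# Second reply from the same address (18:44:04)
second = [
    "SPB.RU. 345600 IN NS ns4-geo.nic.RU.",
    "SPB.RU. 345600 IN NS ns3-geo.nic.RU.",
    "SPB.RU. 345600 IN NS ns8-geo.nic.RU.",
]

exact_same = ns_targets(first) == ns_targets(second)
same_ignoring_case = (
    {t.lower() for t in ns_targets(first)} == {t.lower() for t in ns_targets(second)}
)
print(exact_same, same_ignoring_case)  # prints: False True
```

Using sets ignores record order, so the only remaining difference between the two replies is the case of the trailing .ru. label.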
I just found "Unbound randomly fails to resolve names", which seems to explain what's going on here.
@jcjones is there a chance to turn use-caps-for-id off in the unbound config?
They have that option on for good reason. The main job that a CA does is resolve DNS names to validate that they point to a place that the person requesting a certificate can show belongs to them. That option helps mitigate certain attacks, so having it on makes it harder for someone to get a certificate that they shouldn't. (And Let's Encrypt would much rather prevent one misissuance than allow someone using a possibly-misbehaving DNS server to get a certificate anyway.)
But that does bring up a good point: Can you try another (free) CA? I know that there are several free ACME-based CAs out there now, though I don't know if some of them will avoid .ru names.
Not sure if it's helpful, but it looks to me that there are quite a few validation attempts going on for spb.ru subdomains, and as far as I can see, every one of them fails for DNS issues.
As @petercooperjr said, I'm afraid that's an important mitigation against DNS cache poisoning attacks. Google's public nameservers this year started doing the same thing for the same reasons.
As I understand it, and only in theory -- every CA should be doing something similar, given the state of the DNS RFCs. Of course different CAs use different DNS resolvers which have different implementations of the same kind of mitigation.
Probably the next step here is to send an inquiry to the DNS help at dns.he.net and point them to this thread.
I don't think it's the dns.he.net nameservers that are having any trouble, but the nameservers for spb.ru. Probably somebody needs to find a good contact for them.
@jcjones @petercooperjr am I getting it right that despite the reported timeout error, the real issue is that some DNS servers don't support Capsforid?
It's actually kinda strange that all DNS testers (even letsdebug) show no DNS issues.
The current hypothesis I have is that yes, the nameservers for the .ru zone (<letter>.dns.ripn.net.) aren't supporting case-echoing, and the unbound instance that Let's Encrypt uses to resolve the name isn't falling back to not requiring it for that name. Case echoing is something that doesn't seem to be strictly required by the standard, but most DNS servers do it. Unbound should be able to detect that the DNS server doesn't do that, and fall back to not requiring it, in most cases.
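To make the fallback idea concrete, here is a deliberately simplified Python model of what the log messages above ("starting fallback", "reply is equal. go to next fallback", "getting different replies, failed") suggest. Everything in it is an assumption for illustration: the `Reply` type, the byte-exact comparison, and the function names are made up, and Unbound's real logic compares far more than this.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass(frozen=True)
class Reply:
    qname: str    # question name as the server sent it back
    payload: str  # stand-in for the rest of the message

# A "server" here is just a function; returning None models a timeout.
Server = Callable[[str], Optional[Reply]]

def resolve(plain: str, mixed: str, servers: List[Server]) -> Optional[Reply]:
    # Strict pass: accept the first reply that echoes the mixed-case name exactly.
    for ask in servers:
        reply = ask(mixed)
        if reply is not None and reply.qname == mixed:
            return reply
    # Fallback pass: ask again without case randomization and only accept
    # the result if every reply received is identical (assumed byte-exact).
    replies = [r for r in (ask(plain) for ask in servers) if r is not None]
    if replies and all(r == replies[0] for r in replies):
        return replies[0]
    return None  # corresponds loosely to "getting different replies, failed"

def echoing(name: str) -> Reply:
    return Reply(name, "NS ns3-geo.nic.ru.")

def upper_a(name: str) -> Reply:
    return Reply(name.upper(), "NS ns3-geo.nic.RU.")

def upper_b(name: str) -> Reply:
    return Reply(name.upper(), "NS ns3-geo.nic.ru.")

print(resolve("spb.ru.", "sPb.Ru.", [echoing]) is not None)           # True
print(resolve("spb.ru.", "sPb.Ru.", [upper_a, upper_a]) is not None)  # True
print(resolve("spb.ru.", "sPb.Ru.", [upper_a, upper_b]) is not None)  # False
```

In this toy model, servers that don't echo case are still fine as long as they all answer consistently; it is the combination of no case echo and inconsistent replies that ends in failure, which is the situation hypothesized here.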
I do remember one instance (in this thread) where packets were getting sporadically lost between Let's Encrypt's validation servers and the DNS servers, which caused the Unbound fallback logic to get invoked because it thought the server wasn't handling case-echoing even though it did. I don't think that kind of thing is happening here, but it can be hard to tell if something somewhere might be dropping packets and throwing a big wrench into things.
Most telling to me is that Unboundtest, which is in theory configured the same way as Let's Encrypt's validation servers, seems to be able to resolve things fine. Plus, that this seemed to be working for you up until at least a couple months ago. I think that leaves two main possibilities:
- There was some upgrade (or other configuration change) to Unbound that's been made to the Let's Encrypt validation servers within the last couple months, but has not been made to Unboundtest. @jsha, is there any chance you can take a look at this possibility?
- There's some sort of network routing issue, where Let's Encrypt's validation servers (both primary and secondary!) are hitting some kind of different path than Unboundtest is, going to a different DNS load-balancer or dropping more packets or something along those lines.
And to emphasize, this is all just a hypothesis based on conjecture and circumstantial evidence; it wouldn't shock me if the real problem was actually something else that we haven't discovered yet.
@petercooperjr I really appreciate your detailed answers, thank you for that (and for your efforts to help me)!
I've got in touch with the spb.ru zone owner and they are investigating on their side. I've also tested against the unboundtest service: it went well a couple of times but then started showing the same issue (capsforid timeouts).
What I'd really like to know is the IP address of the nameserver that times out. Knowing that would make it much easier to narrow down the problem.
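One rough way to approach that from a log like the one earlier in the thread is to tally which addresses show up in "reply from" lines and see which expected nameserver addresses never appear. Here is a Python sketch of that; the helper names are hypothetical, the sample lines are copied from the log above, and the `known` list is simply whichever addresses you want to check. An address missing from an info-level log is only a hint, not proof of a timeout.

```python
import re
from collections import Counter

# Matches e.g. "reply from <nic.ru.> 31.177.67.100#53" (IPv4 or IPv6 address)
REPLY_RE = re.compile(r"reply from <(?P<zone>[^>]+)> (?P<addr>\S+)#(?P<port>\d+)")

def replying_servers(log_lines):
    """Count (zone, address) pairs seen in 'reply from' lines, zone lowercased."""
    counts = Counter()
    for line in log_lines:
        m = REPLY_RE.search(line)
        if m:
            counts[(m.group("zone").lower(), m.group("addr"))] += 1
    return counts

def silent_servers(known_addrs, counts, zone):
    """Known addresses for a zone that never show up as a reply source."""
    seen = {addr for (z, addr) in counts if z == zone}
    return sorted(set(known_addrs) - seen)

log = [
    "Sep 13 17:50:20 info: reply from <NIC.RU.> 31.177.74.100#53",
    "Sep 13 17:50:20 info: reply from <nic.ru.> 31.177.67.100#53",
    "Sep 13 17:50:20 info: reply from <nic.ru.> 2a02:2090:e800:9000:31:177:67:100#53",
    "Sep 13 17:50:20 info: query response was ANSWER",
]
known = ["31.177.67.100", "31.177.74.100", "31.177.85.186"]

counts = replying_servers(log)
print(silent_servers(known, counts, "nic.ru."))  # prints: ['31.177.85.186']
```

Run over the full log, the same tally would also show whether any reply is ever attributed to the spb.ru. zone itself.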