A particular set of name servers fails DNS 0x20 only when queried over TCP, and this typically also causes unbound’s capsforid fallback strategy to fail. I am not sure why, but it might be related to the different case these servers return in DNSSEC records.
These servers are in the 194.0.1.0/24, 194.0.2.0/24, 2001:678:4::/48 and 2001:678:5::/48 (anycast) address space, apparently operated by CommunityDNS.
Authoritative name servers for .be, .pl, .gr and perhaps other ccTLDs can be found in this space, so this potentially affects any domain under those TLDs. The probability of a SERVFAIL increases when the delegation path points to name servers that are also under an affected TLD.
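To check quickly whether a given name server address falls into the ranges listed above, something like the following could be used. This is only a minimal sketch: the prefixes are copied from this post, and the function name `in_affected_space` is made up for illustration.

```python
import ipaddress

# Prefixes as listed in this post (apparently CommunityDNS anycast space).
AFFECTED_PREFIXES = [
    ipaddress.ip_network(prefix)
    for prefix in (
        "194.0.1.0/24",
        "194.0.2.0/24",
        "2001:678:4::/48",
        "2001:678:5::/48",
    )
]


def in_affected_space(address: str) -> bool:
    """Return True if the address lies inside one of the listed prefixes."""
    ip = ipaddress.ip_address(address)
    # Compare only against prefixes of the same IP version.
    return any(
        ip.version == net.version and ip in net for net in AFFECTED_PREFIXES
    )
```

For example, `in_affected_space("2001:678:4::a")` is true for the server queried in the example below.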
This is an example query that fails DNS 0x20 (using pydig for convenience):
;; TCP response from ('2001:678:4::a', 53), 644 bytes, in 0.089 sec
;; 0x20-hack qname: <Name: youtU.Be.>
;; rcode=0(NOERROR), id=64443
;; qr=1 opcode=0 aa=0 tc=0 rd=1 ra=0 z=0 ad=0 cd=0
;; question=1, answer=0, authority=8, additional=1
;; Size query=37, response=644, amp1=17.41 amp2=7.13
;; QUESTION SECTION:
youtu.be. IN NS
*** WARNING: Answer didn't match question!
;; AUTHORITY SECTION:
youtu.be. 86400 IN NS ns1.google.com.
youtu.be. 86400 IN NS ns2.google.com.
youtu.be. 86400 IN NS ns3.google.com.
youtu.be. 86400 IN NS ns4.google.com.
ba141snrnoe1rc9mddgrest23g657rir.be. 600 IN NSEC3 1 1 5 1a4e9b6c BA175A6M75ITNTD2DO5RIQLCVM45GSMR NS SOA RRSIG DNSKEY NSEC3PARAM
ba141snrnoe1rc9mddgrest23g657rir.be. 600 IN RRSIG NSEC3 8 2 600 20190126161101 20190117165115 2478 be. EwIccnpBcEtGMPPkaIz1bW2I7FIhEtEZ+D8RL7JRkICXk2nZobgdKcVyTDD2fIth+5ZmLzzCkK5pyX/TpUNzVjvHlI3G5W3+Ui+BhMv3jAY+2qkuwr4/IRqy9spmSfhgi2ZbEJcMc0UojeisP8ERnTsVAGuLRD9qtDXPKIWDkeY=
jq78bsrkbnnvjo7nor8f2i20vl9k8cgo.be. 600 IN NSEC3 1 1 5 1a4e9b6c JQ7RT3IFRO588SF81JDET9H3LLMBCU9K NS DS RRSIG
jq78bsrkbnnvjo7nor8f2i20vl9k8cgo.be. 600 IN RRSIG NSEC3 8 2 600 20190202170011 20190123164453 2478 be. SworU9I5MQUy0hty//rVo//yG916wuZFJyZb1O1/ii/Ueo4EZUZ5lzQ3XQkI6qmZMBMmFINebbAS7gJgVKNmbaVj4vJiZ2eeurnvmGTKXwHu4MYI/OPjoUOnNwo7KokhDCCbbCqRzVe1+BHWRJyZmdppp3awVzLD4ZZ4h5lWQ48=
;; ADDITIONAL SECTION:
;; OPT: edns_version=0, udp_payload=4096, flags=do, ercode=0(NOERROR)
The same query over UDP does not show the issue:
;; UDP response from ('2001:678:4::a', 53, 0, 0), 644 bytes, in 0.044 sec
;; 0x20-hack qname: <Name: yOUtu.bE.>
;; rcode=0(NOERROR), id=38167
;; qr=1 opcode=0 aa=0 tc=0 rd=1 ra=0 z=0 ad=0 cd=0
;; question=1, answer=0, authority=8, additional=1
;; Size query=37, response=644, amp1=17.41 amp2=7.13
;; QUESTION SECTION:
yOUtu.bE. IN NS
;; AUTHORITY SECTION:
yOUtu.bE. 86400 IN NS ns1.google.com.
yOUtu.bE. 86400 IN NS ns2.google.com.
yOUtu.bE. 86400 IN NS ns3.google.com.
yOUtu.bE. 86400 IN NS ns4.google.com.
ba141snrnoe1rc9mddgrest23g657rir.be. 600 IN NSEC3 1 1 5 1a4e9b6c BA175A6M75ITNTD2DO5RIQLCVM45GSMR NS SOA RRSIG DNSKEY NSEC3PARAM
ba141snrnoe1rc9mddgrest23g657rir.be. 600 IN RRSIG NSEC3 8 2 600 20190126161101 20190117165115 2478 be. EwIccnpBcEtGMPPkaIz1bW2I7FIhEtEZ+D8RL7JRkICXk2nZobgdKcVyTDD2fIth+5ZmLzzCkK5pyX/TpUNzVjvHlI3G5W3+Ui+BhMv3jAY+2qkuwr4/IRqy9spmSfhgi2ZbEJcMc0UojeisP8ERnTsVAGuLRD9qtDXPKIWDkeY=
jq78bsrkbnnvjo7nor8f2i20vl9k8cgo.be. 600 IN NSEC3 1 1 5 1a4e9b6c JQ7RT3IFRO588SF81JDET9H3LLMBCU9K NS DS RRSIG
jq78bsrkbnnvjo7nor8f2i20vl9k8cgo.be. 600 IN RRSIG NSEC3 8 2 600 20190202170011 20190123164453 2478 be. SworU9I5MQUy0hty//rVo//yG916wuZFJyZb1O1/ii/Ueo4EZUZ5lzQ3XQkI6qmZMBMmFINebbAS7gJgVKNmbaVj4vJiZ2eeurnvmGTKXwHu4MYI/OPjoUOnNwo7KokhDCCbbCqRzVe1+BHWRJyZmdppp3awVzLD4ZZ4h5lWQ48=
;; ADDITIONAL SECTION:
;; OPT: edns_version=0, udp_payload=4096, flags=do, ercode=0(NOERROR)
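The difference between the two responses is only in how the question name is echoed back. As a rough sketch of the kind of check a 0x20-aware resolver performs (not unbound's actual code, and the function names here are hypothetical), the query name gets random letter case and the echoed name must then match it byte for byte:

```python
import random


def randomize_case(qname: str, rng: random.Random) -> str:
    """Apply random upper/lower case to each character of the query name."""
    return "".join(rng.choice((c.lower(), c.upper())) for c in qname)


def classify_echo(sent: str, echoed: str) -> str:
    """Compare the name sent in the query with the name echoed in the response.

    Names are assumed to be plain ASCII strings in presentation format.
    """
    if sent == echoed:
        return "match"  # case preserved, 0x20 check passes
    if sent.lower() == echoed.lower():
        return "case-mismatch"  # same name, but the case was not preserved
    return "different-name"  # response is for another name entirely
```

With the names from the output above, `classify_echo("youtU.Be.", "youtu.be.")` gives `"case-mismatch"` (the TCP response), while `classify_echo("yOUtu.bE.", "yOUtu.bE.")` gives `"match"` (the UDP response).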
Considering the widespread impact of this problem, I think Let’s Encrypt should perhaps consider getting in touch with CDNS and/or investigating whether an IP blacklist can be implemented in unbound, similar to caps-whitelist but for servers.