LE CAA fails due to NOTIMP, thus validation fails, too

I have searched for, and read most of, the related threads from back in 2017 about verification failures due to "DNS problem: SERVFAIL looking up CAA for XXX - the domain's nameservers may be malfunctioning", caused by the server answering NOTIMP (Not Implemented, aka RCODE=4) to a CAA query.

The reason given was that Unbound converts NOTIMP into SERVFAIL, and LE fails on SERVFAIL. This was hinted to be "known".

Some people even went so far as to call it invalid:

Returning other opcodes, including NOTIMP, for unrecognized qtypes is a violation of RFC 1035, and needs to be fixed.

Unfortunately, this statement was missing a specific reference. I have tried to find this in RFC 1034 and RFC 1035, as well as in their updates, but found nothing saying it would be illegal to reply NOTIMP for a function that is not actually implemented.

Some other people have made snarky remarks about outdated DNS servers and "you had X years to update".

However, the server in my case isn't particularly outdated: it is rbldnsd, a very compact, special-purpose server that deliberately implements a minimal set of RRs, since its main purpose is to answer massive amounts of RBL requests. And it does that pretty well. The minimal MUST RRs are implemented, and others are not. On purpose. According to the RFCs.

So, here I am, with a server with valid replies, not implementing a function which is not compulsory for either DNS or LetsEncrypt, yet it's impossible to have the domain verified since NOTIMP isn't accepted.

I am still open to well-referenced advice on why it would be invalid, and if justified, I'll go and have the server code patched; if there is no such reference, I would really appreciate it if LE would accept a valid reply and handle it as such: same as NXDOMAIN, I suppose.

(All alternatives, like

  • do not run webservers on the domain
  • replace your server
  • hack your dns [like dnsdist]
  • you are < NAMECALLING > go away

are not really helping, since if I wanted to do that I wouldn't have written this entry.)

Thanks!

Hi @grinapo

it's 2020. If your DNS software isn't able to handle CAA queries correctly, throw that software away.

Nobody would use Windows 95 or Windows 2000 to go online. But if you want to do that, that's your risk and your choice.

That's wrong. A CA must check whether there is a valid CAA record or not. That requires a NoError / NoData answer, not a NotImplemented / Servfail.

Please read

Some DNS providers that are unfamiliar with CAA initially reply to problem reports with “We do not support CAA records.” Your DNS provider does not need to specifically support CAA records; it only needs to reply with a NOERROR response for unknown query types (including CAA). Returning other opcodes, including NOTIMP, for unrecognized qtypes is a violation of RFC 1035, and needs to be fixed.
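To make the "NOERROR for unknown query types" point concrete, here is a minimal sketch of what such a reply looks like on the wire, using only the Python standard library. The helper names (`encode_name`, `build_query`, `build_nodata_reply`) are made up for illustration and are not taken from rbldnsd or any other server; the only fixed facts assumed are the RFC 1035 header layout and the CAA type number 257.

```python
import struct

QTYPE_CAA = 257  # CAA record type number
CLASS_IN = 1

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name: str, qtype: int, txid: int = 0x1234) -> bytes:
    """Build a standard query (opcode 0, flags all zero) with one question."""
    header = struct.pack("!HHHHHH", txid, 0x0000, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, CLASS_IN)
    return header + question

def build_nodata_reply(query: bytes) -> bytes:
    """Reply with RCODE 0 (NOERROR) and zero answer records (NODATA)."""
    txid, flags, qdcount = struct.unpack("!HHH", query[:6])
    # Set QR (0x8000) and AA (0x0400); keep the opcode and RD bits (0x7900).
    # The low four bits (RCODE) stay 0, i.e. NOERROR.
    reply_flags = 0x8000 | 0x0400 | (flags & 0x7900)
    header = struct.pack("!HHHHHH", txid, reply_flags, qdcount, 0, 0, 0)
    return header + query[12:]  # echo the question section unchanged

reply = build_nodata_reply(build_query("example.com", QTYPE_CAA))
print(hex(struct.unpack("!H", reply[2:4])[0]))  # flags word of the reply
```

This is only a sketch: a real authoritative server would normally also put the zone's SOA record in the authority section so that resolvers can cache the negative answer.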

I think Let's Encrypt have made a fairly explicit statement already on https://letsencrypt.org/docs/caa:

Your DNS provider does not need to specifically support CAA records; it only needs to reply with a NOERROR response for unknown query types (including CAA). Returning other opcodes, including NOTIMP, for unrecognized qtypes is a violation of RFC 1035, and needs to be fixed.

ISTM that it's well established that NOTIMP is to be used when the server does not support the requested kind of query - meaning the header opcode (RFC 1035 - Domain names - implementation and specification, QUERY/IQUERY/STATUS).

The "Not Implemented" rcode states:

Not Implemented - The name server does not support the requested kind of query.

That text perhaps seems a bit ambiguous (does it mean opcode or qtype?), but the query opcode is defined in the following terms:

A four bit field that specifies kind of query in this message

which to me, seems like it fully removes any ambiguity.
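As a rough illustration of that distinction, the opcode and the rcode are two separate four-bit fields in the second 16-bit word of the DNS header, and neither of them has anything to do with the qtype, which lives in the question section. The sketch below assumes only the RFC 1035 header layout; `split_flags` is a hypothetical helper name.

```python
import struct

OPCODES = {0: "QUERY", 1: "IQUERY", 2: "STATUS"}
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}

def split_flags(header: bytes):
    """Return (is_response, opcode, rcode) from the first 4 header bytes."""
    _txid, flags = struct.unpack("!HH", header[:4])
    is_response = bool(flags & 0x8000)  # QR bit
    opcode = (flags >> 11) & 0xF        # "kind of query"
    rcode = flags & 0xF                 # response code
    return is_response, opcode, rcode

# A reply to a standard query (opcode 0) that carries rcode 4.
header = struct.pack("!HHHHHH", 1, 0x8004, 1, 0, 0, 0)
is_response, opcode, rcode = split_flags(header)
print(is_response, OPCODES[opcode], RCODES[rcode])  # True QUERY NOTIMP
```

So a server that answers a CAA lookup with NOTIMP is, on this reading, claiming that it does not implement opcode 0 (QUERY) at all.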

Reading some of the comments on the dns-operations discussions about ietf-dnsop-no-response-issue and also ietf-dnsop-any-notimp seems to support that interpretation.

Specifically that resolvers are expected to respond with NOERROR or NXDOMAIN for unknown qtypes (as opposed to unknown opcodes).

Hi @_az

check IANA, then it’s simple.

https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml#dns-parameters-5

OpCode  Name                                Reference
0       Query                               [RFC1035]
1       IQuery (Inverse Query, OBSOLETE)    [RFC3425]
2       Status                              [RFC1035]
3       Unassigned
4       Notify                              [RFC1996]
5       Update                              [RFC2136]
6       DNS Stateful Operations (DSO)       [RFC8490]
7-15    Unassigned

Any standard query, CAA included, always uses OpCode 0 = Query.

https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml#dns-parameters-6

RCODE  Name      Description          Reference
0      NoError   No Error             [RFC1035]
1      FormErr   Format Error         [RFC1035]
2      ServFail  Server Failure       [RFC1035]
3      NXDomain  Non-Existent Domain  [RFC1035]
4      NotImp    Not Implemented      [RFC1035]
5      Refused   Query Refused        [RFC1035]

NotImp -> used if one of the OpCodes 0 - 15 isn't implemented.

But a CAA query is not an OpCode: it is one of the Resource Record (RR) TYPEs (1 - 65535), so an unknown query type is not a reason for NotImp.
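Put as code, the reading above might look roughly like the following decision function for a minimal authoritative server. This is a simplified sketch, not how rbldnsd or any particular server is written: `choose_rcode` and the `zone` dictionary layout are hypothetical, and it ignores wildcards, delegations and empty non-terminals.

```python
OPCODE_QUERY = 0
NOERROR, NXDOMAIN, NOTIMP = 0, 3, 4

def choose_rcode(opcode, qname, qtype, zone):
    """Pick (rcode, answers) for a request.

    zone is a hypothetical dict: owner name -> {qtype: [records]}.
    """
    if opcode != OPCODE_QUERY:
        # The kind of query itself is unsupported: this is the NOTIMP case.
        return NOTIMP, []
    rrsets = zone.get(qname.lower().rstrip("."))
    if rrsets is None:
        # The name does not exist at all.
        return NXDOMAIN, []
    # The name exists. An unknown or unsupported qtype simply has no data,
    # so the answer is NOERROR with an empty answer section (NODATA).
    return NOERROR, rrsets.get(qtype, [])

zone = {"example.com": {1: ["192.0.2.1"]}}       # only an A record
print(choose_rcode(0, "example.com", 257, zone)) # CAA lookup -> (0, [])
print(choose_rcode(2, "example.com", 257, zone)) # STATUS opcode -> (4, [])
```

The point of the sketch is that the qtype never influences the choice between NOERROR and NOTIMP; only the opcode does.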

It sounds like @grinapo disagrees with Let’s Encrypt’s reading of RFC 1035. However, as @_az says, Let’s Encrypt has been very explicit and deliberate on this point and this policy is unlikely to change.

There might be ways to change it involving getting IETF and/or the CA/B Forum to clarify the interpretation and donating the necessary technical work for ISRG, but these are all probably less work than the alternatives…
