We have been trying to renew the certificate for this domain for about a week, and every attempt fails with one of these errors:
error code 400 "urn:ietf:params:acme:error:dns":
DNS problem: SERVFAIL looking up CAA for luminateops.com - the domain's nameservers may be malfunctioning"
During secondary validation: DNS problem: SERVFAIL looking up CAA for us-west-2.luminateops.com - the domain's nameservers may be malfunctioning"
During secondary validation: DNS problem: SERVFAIL looking up CAA for luminateops.com - the domain's nameservers may be malfunctioning"
And so on: basically every combination of the domain's parent names and primary/secondary validation errors.
The certificate for this domain had been renewed successfully for several years, and we never had any CAA records.
After getting those errors we tried adding CAA records for every level of the domain name, and we still get the same variety of errors.
As I understand it, the fact that we sometimes fail on the primary validation and sometimes on the secondary validation means we are sometimes able to pass the primary one, so our configuration must be at least partly valid. That makes the failures even harder to explain.
We would appreciate it if any Let's Encrypt staff could check the logs and explain what is wrong with our DNS configuration.
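For context on why the errors name different parent domains: as far as I understand RFC 8659, a CA looks for CAA records by starting at the requested name and climbing toward the root one label at a time, so a SERVFAIL at any level can block issuance. The sketch below only lists the names that would be queried; it does no DNS lookups, and the helper name `caa_lookup_names` is made up for illustration.

```python
from typing import List


def caa_lookup_names(fqdn: str) -> List[str]:
    """List the names checked for CAA, from the requested name up to the TLD.

    Assumption: plain tree climbing as described in RFC 8659; CNAME handling
    and the stop-at-first-CAA-record rule are deliberately left out.
    """
    name = fqdn.strip().rstrip(".").lower()
    # For a wildcard request the climb starts at the name after "*.".
    if name.startswith("*."):
        name = name[2:]
    labels = name.split(".")
    # Drop one leading label at a time: a.b.c -> a.b.c, b.c, c
    return [".".join(labels[i:]) for i in range(len(labels))]


if __name__ == "__main__":
    for candidate in caa_lookup_names(
        "account-api.production.us-west-2.luminateops.com"
    ):
        print(candidate)
```

Every name in that list has to answer cleanly (NOERROR or NXDOMAIN) for the check to pass, which would fit with errors mentioning both us-west-2.luminateops.com and luminateops.com.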
Using https://unboundtest.com/ I can currently see all the CAA RRs you added (except for the CNAME to application01.account-api.prod.us-west1.luminatesec.com, but that hostname responds with a NOERROR, so that should be good too).
I don't know which challenge you were using, but Let's Debug complains about a non-working port 80 (which would only be a problem for the http-01 challenge).
However, I can now see a TXT record at _acme-challenge.account-api.production.us-west-2.luminateops.com using UnboundTest, yet strangely, when I run Let's Debug with the dns-01 challenge, Let's Encrypt reports a SERVFAIL for that same TXT RR lookup. DNSViz does show some DNSSEC errors for _acme-challenge.account-api.production.us-west-2.luminateops.com, though, so it is probably worthwhile to fix those anyway; maybe that fixes the issue altogether. (It is strange that UnboundTest doesn't error out, though. It is usually quite good at mimicking the Let's Encrypt resolver, which is its job.)
NB: DNSViz also warns about a discrepancy between the authoritative NS RR set and the delegation NS RR set, which would also be good to fix.
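To make the NS discrepancy concrete, here is a minimal sketch that compares two NS sets once you have collected them, for example by asking the parent zone's servers and then one of the zone's own nameservers with `dig NS`. The function names and the nameserver hostnames in the usage part are hypothetical placeholders, not the real values for this domain.

```python
from typing import Dict, Iterable, List, Set


def normalize_ns(names: Iterable[str]) -> Set[str]:
    """Lowercase names and drop trailing dots so equal hosts compare equal."""
    return {n.strip().rstrip(".").lower() for n in names if n.strip()}


def compare_ns_sets(
    delegation: Iterable[str], authoritative: Iterable[str]
) -> Dict[str, List[str]]:
    """Report nameservers that appear in only one of the two NS sets."""
    parent_side = normalize_ns(delegation)
    zone_side = normalize_ns(authoritative)
    return {
        "only_in_delegation": sorted(parent_side - zone_side),
        "only_in_authoritative": sorted(zone_side - parent_side),
    }


if __name__ == "__main__":
    # Hypothetical data: what the registrar/parent zone publishes ...
    delegation = ["ns-1.example-dns.net.", "ns-2.example-dns.org."]
    # ... versus what the hosted zone itself claims.
    authoritative = ["NS-1.example-dns.net", "ns-3.example-dns.com"]
    print(compare_ns_sets(delegation, authoritative))
```

If both lists in the result are empty, the delegation and the zone agree; anything listed under `only_in_delegation` is a nameserver the parent still points at that the zone no longer claims.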
I have a hunch that something in Let's Encrypt's configuration recently got a lot stricter about failing when DNS delegation is misconfigured. Or it may just be that, with the additional validation perspectives, it's more likely that one of the Unbound resolvers will fail when trying to make sense of it. But yes, double-checking that your DNS delegations are all correct is definitely the first step in trying to untangle this.
We've been seeing mis-delegated name servers for Route53 regularly here in the past couple of weeks. One poster said Route53 had reassigned theirs without any notice. I'm not sure how this helps fix anything; I just thought some extra color might be helpful.
This was definitely the cause of the issue for me: the certbot failure went away as soon as I fixed the NS records to the correct settings (they had been silently changed by Amazon/Route 53).
Thanks for the heads-up; I had been really struggling to figure this out.