CAA behaviour changed?

Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. https://crt.sh/?q=example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.

My domain is: *.beta-4.radix.equinor.com

I ran this command:

certbot certonly --dry-run --manual --preferred-challenges dns-01 -d *.beta-4.radix.equinor.com

It produced this output:

Failed authorization procedure. beta-4.radix.equinor.com (dns-01): urn:ietf:params:acme:error:dns :: DNS problem: SERVFAIL looking up CAA for beta-4.radix.equinor.com

IMPORTANT NOTES:
- The following errors were reported by the server:

Domain: beta-4.radix.equinor.com
Type: None
Detail: DNS problem: SERVFAIL looking up CAA for beta-4.radix.equinor.com

My hosting provider, if applicable, is: Azure DNS manages DNS

I can login to a root shell on my machine (yes or no, or I don’t know): yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel): no

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you’re using Certbot): certbot 0.28.0

Additional:

However, when running “dig caa” locally I get NXDOMAIN and not SERVFAIL. As far as I can tell from https://letsencrypt.org/docs/caa/ " You can set CAA records on your main domain, or at any depth of subdomain. For instance, if you had www.community.example.com , you could set CAA records for the full name, or for community.example.com , or for example.com . CAs will check each version, from left to right, and stop as soon as they see any CAA record." getting a NXDOMAIN should be fine and allow the certificate to be issued.
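To make the "left to right" climb from the quoted documentation concrete, here is a minimal sketch of the order in which names would be consulted. The function name `caa_lookup_order` is made up for illustration. Starting a wildcard request at its base name matches the error above, which mentions beta-4.radix.equinor.com rather than the wildcard. Continuing up to the top-level label is an assumption based on RFC 8659.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names a CA would consult for CAA,
# most specific first. Sketch only: a real CA also follows CNAMEs and
# stops at the first name that actually has a CAA record set.
caa_lookup_order() {
  local name="${1#\*.}"   # a wildcard request is checked at its base name
  name="${name%.}"        # drop a trailing dot, if any
  while [ -n "$name" ]; do
    printf '%s\n' "$name"
    case "$name" in
      *.*) name="${name#*.}" ;;  # strip the leftmost label and continue
      *)   name="" ;;            # last label reached; stop before the root
    esac
  done
}

caa_lookup_order '*.beta-4.radix.equinor.com'
```

For this domain the list starts at beta-4.radix.equinor.com and ends at com. A CAA record found at any name in the list ends the climb there.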

When adding the CAA record to the subdomain it suddenly works though:

az network dns record-set caa add-record -g common --zone-name radix.equinor.com --record-set-name beta-4 --flags 0 --tag "issue" --value "letsencrypt.org"

sudo certbot certonly --manual --preferred-challenges dns-01 -d *.beta-4.radix.equinor.com

 - Congratulations! Your certificate and chain have been saved at:
   /etc/letsencrypt/live/beta-4.radix.equinor.com/fullchain.pem

This had been working for 7-8 months until we started getting Authorization Failure Rate Limit errors a few days ago and began troubleshooting. We have also been in contact with Microsoft support, who could offer no assistance beyond suggesting that we add a CAA record directly to beta-4.radix.equinor.com.

Hi @StianOvrevage

If the nameserver sends a SERVFAIL, Let's Encrypt can't tell whether a CAA record exists or not.

So the certificate creation is blocked.
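As a rough sketch of that reasoning (not Let's Encrypt's actual code), the decision at each name can be written as a small function. The names `caa_lookup_outcome` and `caa_rcode` are hypothetical, and `caa_rcode` assumes `dig` is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide what a CAA check does with one lookup result.
#   $1 = DNS response code, $2 = CAA records returned (empty if none)
caa_lookup_outcome() {
  local rcode="$1" caa_records="$2"
  case "$rcode" in
    NOERROR)
      if [ -n "$caa_records" ]; then
        echo "evaluate"   # records found: check whether the CA is allowed
      else
        echo "climb"      # name exists but has no CAA: try the parent
      fi ;;
    NXDOMAIN) echo "climb" ;;  # no such name, so no CAA here either
    *)        echo "block" ;;  # SERVFAIL etc.: unknown, issuance must stop
  esac
}

# Hypothetical helper (assumes dig): extract the response code for a CAA query.
# Not called here because it needs network access.
caa_rcode() {
  dig +noall +comments CAA "$1" | sed -n 's/.*status: \([A-Z]*\),.*/\1/p'
}

caa_lookup_outcome SERVFAIL ""
caa_lookup_outcome NXDOMAIN ""
```

Keep in mind that the response code you see locally comes from your own resolver and may differ from what the CA's resolver receives.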

According to a check of your domain at https://check-your-website.server-daten.de/?q=beta-4.radix.equinor.com, your nameserver

 ns1-07.azure-dns.com

is unstable. In one run, all EDNS checks failed:

Nameserver doesn't pass all EDNS-Checks:ns1-07.azure-dns.com: EDNSOP100: no result. EDNSFLAGS: no result. EDNSV1: no result. EDNSV1OP100: no result. EDNSV1FLAGS: no result. EDNSDNSSEC: no result. EDNSV1DNSSEC: no result.

So if a nameserver is unstable, this is a problem.

But you can block the walk (subdomain -> domain) by adding a CAA record on the subdomain.


Maybe there was just a temporary networking issue?

Maybe Let’s Encrypt triggered some sort of rate limiting with Azure?

Since nothing’s going wrong now, it’s hard to guess what exactly went wrong before.


We’re experiencing exactly the same issue today. We created some certificates with no issue on the 24th Jan, but now any attempt to request a certificate is met with the error “SERVFAIL looking up CAA”. We’ve ensured the CAA is set up on the domain.

Again, with Azure DNS as the provider.

I just want to point out that Azure had outages yesterday and today.

Neither of them should cause anything like this – probably – but who knows.

:confused:


Probably completely unrelated…
But maybe worth the mention:

Try with single, or double, quotes around the asterisk string:

certbot certonly --dry-run --manual --preferred-challenges dns-01 -d "*.beta-4.radix.equinor.com"
certbot certonly --dry-run --manual --preferred-challenges dns-01 -d '*.beta-4.radix.equinor.com'

[sometimes some systems don’t always process the asterisk as intended]
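A small demonstration of why the quotes can matter: the shell expands an unquoted asterisk against file names in the current directory before certbot ever sees the argument. The file name below is made up purely to trigger a match.

```shell
#!/usr/bin/env bash
# Create a scratch directory containing a file whose name happens to
# match the wildcard pattern (hypothetical file, for illustration only).
demo_dir="$(mktemp -d)"
touch "$demo_dir/a.beta-4.radix.equinor.com"

# Unquoted: the shell replaces the pattern with the matching file name.
unquoted="$(cd "$demo_dir" && printf '%s' *.beta-4.radix.equinor.com)"

# Quoted: the pattern is passed through literally.
quoted="$(cd "$demo_dir" && printf '%s' '*.beta-4.radix.equinor.com')"

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

rm -rf "$demo_dir"
```

When no file matches, most shells pass the pattern through unchanged, which is why the unquoted form usually works.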

Hi @StianOvrevage,

I see the same behavior on my side. I’m using Azure DNS in West Europe and have a really simple DNS zone setup.
My main goal is to redirect all requests for climatixic.cloud to a Traefik Ingress controller running on Kubernetes. For that, my only zone entry is a wildcard A record for climatixic.cloud pointing to the public IP of the Ingress controller.

Traefik and certbot both report SERVFAIL looking up CAA for blaa.climatixic.cloud. If I add a CAA record for blaa.climatixic.cloud, the verification works, but only for that particular domain.

I checked my URL with https://check-your-website.server-daten.de/?q=blaa.climatixic.cloud once with and once without CAA entry. Here are the results:

CAA entry in place:

A	name "blaa.climatixic.cloud" is subdomain, public suffix is "cloud"
A	good: No asked Authoritative Name Server had a timeout
A	DNS: "Name Error" means: No www-dns-entry defined. This isn't a problem
A	Good: Nameserver supports TCP connections: 1 good Nameserver
A	Good: Nameserver supports Echo Capitalization: 1 good Nameserver
A	Good: Nameserver supports EDNS with max. 512 Byte Udp payload, message is smaller: 1 good Nameserver
A	Good: Nameserver has passed 7 EDNS-Checks (EDNSOP100, EDNSFLAGS, EDNSV1, EDNSV1OP100, EDNSV1FLAGS, EDNSDNSSEC, EDNSV1DNSSEC): 1 good Nameserver
Nameserver doesn't pass all EDNS-Checks:ns1-09.azure-dns.com: EDNSOP100: no result. EDNSFLAGS: no result. EDNSV1: no result. EDNSV1OP100: no result. EDNSV1FLAGS: no result. EDNSDNSSEC: no result. EDNSV1DNSSEC: no result. 
A	good: CAA entries found, creating certificate is limited: letsencrypt.org is allowed to create certificates
A	Duration: 12263 milliseconds, 12.263 seconds

CAA entry NOT in place:

A	name "blaa.climatixic.cloud" is subdomain, public suffix is "cloud"
A	good: No asked Authoritative Name Server had a timeout
A	DNS: "Name Error" means: No www-dns-entry defined. This isn't a problem
A	Good: Nameserver supports TCP connections: 4 good Nameserver
A	Good: Nameserver supports Echo Capitalization: 4 good Nameserver
A	Good: Nameserver supports EDNS with max. 512 Byte Udp payload, message is smaller: 4 good Nameserver
Nameserver doesn't pass all EDNS-Checks:ns1-09.azure-dns.com: EDNSOP100: no result. EDNSFLAGS: no result. EDNSV1: no result. EDNSV1OP100: no result. EDNSV1FLAGS: no result. EDNSDNSSEC: no result. EDNSV1DNSSEC: no result. 
Nameserver doesn't pass all EDNS-Checks:ns1-09.azure-dns.com: EDNSOP100: no result. EDNSFLAGS: no result. EDNSV1: no result. EDNSV1OP100: no result. EDNSV1FLAGS: no result. EDNSDNSSEC: no result. EDNSV1DNSSEC: no result. 
Warning: No CAA entry with issue/issuewild found, every CAA can create a certificate
A	Duration: 9060 milliseconds, 9.060 seconds

BTW, I can verify a wildcard certificate without any issues.

Hopefully we can get it solved soon. Thanks for your help.

I’ve managed to get certificates to issue today, but only if a valid DNS record is in place for the requested name. Our DNS records get created later in the orchestration of our environment deployments, but the certificates are required first to allow the deployments to succeed. If we create placeholder records in Azure DNS for the certificate names we are requesting, then Let’s Encrypt issues the certificates.

Something has changed at either Let’s Encrypt or Microsoft to cause this behaviour.

Pure speculation:

I wonder if Azure is rate-limiting negative responses as an attack mitigation measure. Stringently enough for Let’s Encrypt to get caught by it.

But that’s probably less likely than a routing issue or something.

Hi,
No change today, at least for me.
I opened a support request at Azure. Let’s see what Microsoft has to say. I’ll keep you up to date.

Cheers

Chiming in that we’re seeing this as well. We have been issuing certificates (staging and prod) for wildcard subdomains on 4 domains whose records are managed in Azure DNS (many dozens of certs were issued over a period of about a month). About 2 days ago all domains stopped issuing certs of both kinds with “SERVFAIL looking up CAA”.

We actually didn’t have any CAA records at all on any level of our domains.

If I add a CAA record to the root (@), the domain passes CAA checks using a handful of DNS checking websites but we still get “SERVFAIL looking up CAA” when trying to create certs for subdomains with LE. If I add the record directly to the subdomain (our subdomains are dynamic, this would be super annoying to have to do) the certificates create fine.


Hi everyone, just found the same issue on Github: https://github.com/Azure/AKS/issues/806


Still a problem for us. I have re-opened my Azure ticket and hopefully they will look at it soon.

That's interesting. I haven't been able to figure out what's happening with the other domains in this thread, but westeurope.aksapp.io returns FORMERR in response to seemingly any query. (Even a lowercase A query without EDNS!)

https://unboundtest.com/m/CAA/westeurope.aksapp.io/TWZVUV2E

I'm not sure it's the same issue. :confused:

Some updates from Microsoft Azure Support. I asked when this will be fixed but have had no answer so far. I'll keep you up to date.

@mnordhoff Regarding Mail1, it looks to me like the same behavior. I agree with you that the response for a nonexistent entry differs from the response for an existing TXT entry. Let me give an example.

I have a wildcard A entry for *.climatixic.cloud and a TXT entry for _acme-challenge.test. If I run a CAA Unboundtest for test.climatixic.cloud I get a SERVFAIL. But if I run an Unboundtest for noentry.climatixic.cloud I receive a NOERROR. I'm no expert, but that sounds similar to me :slight_smile:

Mail1

As per discussion with the Azure DNS zones developer team, this behavior is by design. Having dots in the record name attribute will always cause the creation of ENTs (empty non-terminals), and thus wildcard entries will be less specific.
For example, creating _acme-challenge.test.climatixic.cloud implicitly creates the ENT “test”. “test” is then considered a name in the zone, even though it has no data at all, and therefore the wildcard at *.climatixic.cloud will not apply, since the “test” ENT is more specific.
The investigation is still ongoing because the result Azure DNS is sending (Format Error) is not the expected one.
Best Regards

Mail2

I have another update regarding this service request.
The “Format error” was found to be a bug introduced in the latest release, and it will be reverted. The expected result in these situations is “NOERROR” with just the SOA record in the authority section.
Best Regards,

Cheers e-bits


Thanks for the update!

People in the Let's Encrypt community have found bugs in nameservers before, but I think this is the first one with Azure.

If it doesn't sound too glib to say this, nice work!

I was wrong -- westeurope.aksapp.io was affected by the same bug you were.

It might be possible that some people in this thread were affected by different issues, though.

For what it's worth...

The only reason this fails is because of the bug. Like the email says, the authoritative DNS server is supposed to return NOERROR, not FORMERR. Once they fix that, Let's Encrypt will be able to resolve it again.

(Some older nameservers return NXDOMAIN instead, which they shouldn't, but which resolvers accept as long as DNSSEC isn't involved. But I digress.)

But even when things are working correctly, when you create the _acme-challenge.test.climatixic.cloud TXT record, the wildcard A record stops applying to test.climatixic.cloud, so test.climatixic.cloud temporarily stops existing and stops having an A record.

You should create an A record for test.climatixic.cloud so it doesn't temporarily cease to exist when you're renewing the certificate.

Doing so would also work around the bug, but that's beside the point.
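To illustrate the empty non-terminal effect described above, here is a simplified sketch. It only handles names directly below the zone apex, and the helper names and the in-script copy of the zone are made up for the example.

```shell
#!/usr/bin/env bash
# Owner names in a hypothetical copy of the zone, one per line.
zone_names='*.climatixic.cloud
_acme-challenge.test.climatixic.cloud'

# Succeeds if qname exists in the zone, either as an owner name or as an
# empty non-terminal (a name that has no data but does have descendants).
name_exists() {
  local qname="$1" owner
  while IFS= read -r owner; do
    case "$owner" in
      "$qname"|*."$qname") return 0 ;;
    esac
  done <<EOF
$zone_names
EOF
  return 1
}

# The wildcard only answers for names that do not exist in the zone.
wildcard_applies() {
  if name_exists "$1"; then echo "no"; else echo "yes"; fi
}

wildcard_applies test.climatixic.cloud     # no: "test" is an ENT
wildcard_applies noentry.climatixic.cloud  # yes: nothing exists at this name
```

Adding an explicit record for test.climatixic.cloud does not bring the wildcard back. It gives the name data of its own, so it no longer depends on the wildcard.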


Community - Update from Azure DNS. We are fixing Azure DNS behavior to return NOERROR and I will update this thread once we are complete. Rollout to our worldwide fleet is expected to take a week or so.


I'm experiencing this as well. Been a booger to figure out for the past few weeks since it was working before and we introduced changes on our side thinking it was us.

So we are expected to be unable to gen new certs for new sub domains for a week?


We understand your pain and apologize for it! We are working on shortening the rollout so that it completes in the next day or so. We will keep this thread updated as we make progress.


@wacky_man Are you trying to create certs for a domain fully owned by you? Can you file an Azure support ticket with the details of the zone? We will try to unblock you. You can also send the details to azurednsfeedback@microsoft.com and we will look at options to configure records to enable CAA checks to pass.
