Certificate renewal fails: urn:ietf:params:acme:error:caa 403

I am using Plesk on CentOS 7. I have numerous ‘subscriptions’, but two of these give renewal problems, although all subscriptions are supposedly set up identically. Both have a number of ‘domain aliases’, i.e. other domain names set up to 301 redirect to the principal domain. This means the certificates have options of:

  1. Unqualified principal domain name only
  2. Unqualified principal domain name + ‘www’ name
  3. Unqualified principal domain name + wildcard

The issue is explained in detail at: Issue - Let's Encrypt "urn:ietf:params:acme:error:caa" 403 failure | Plesk Forum but there is definitely some odd behaviour:

  1. Not every renewal cycle seems to be an issue … these subscriptions can auto renew without issue, then not
  2. If I deselect all the ‘domain aliases’ and even the wildcard option, renewals have gone through that nevertheless included all the domain aliases and issued a wildcard on the primary domain!
  3. Having deselected all the domain aliases, the LE error below has then reported one of the domain names NOT included in the renewal

So, with this background and noting there are two renewals giving the same error…

My domain is: (1) sprakekingsleyllp.co.uk and also (2) chloefox.org.uk

I ran this command: Renew certificate from within Plesk. Plesk Obsidian, Version 18.0.56 Update #4 (Plesk Obsidian v18.0.56_build1800231106.15 os_CentOS 7)

It produced this output:

Could not issue an SSL/TLS certificate for chloefox.org.uk
Details

Could not issue a Let's Encrypt SSL/TLS certificate for chloefox.org.uk.

Details

Invalid response from https://acme-v02.api.letsencrypt.org/acme/finalize/356300830/227970812736.

Details:

Type: urn:ietf:params:acme:error:caa

Status: 403

Detail: Error finalizing order :: Rechecking CAA for www.chloe-fox.org.uk and 10 more identifiers failed. Refer to sub-problems for more information

Note both subscriptions give the same error, so you can substitute either domain1 or domain2 in the above error report. The only differences are the “number of identifiers” and the domain names quoted in the report. Note the domains do not have, and never have had, CAA records.

My web server is (include version): Apache 2.4.6-99.el7.centos.1

The operating system my web server runs on is (include version): CentOS Linux 7.9.2009 (Core)

My hosting provider, if applicable, is: Heart Internet

I can login to a root shell on my machine (yes or no, or I don't know): Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): Yes (Plesk)

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): n/a … and Certbot doesn’t appear to be on the server

Note https://acme-v02.api.letsencrypt.org/ is returning a 403 for some reason and that there are no CAA restrictions.

I see a dash in that second name.
Is that expected?


Good :eyes: @rg305


Yes, as I mentioned the way these ‘subscriptions’ can work is to have a principal domain and then a number of 'domain aliases’ which all 301 redirect to the site under the principal domain name. It is not an error. If you look at the existing LE cert the SAN list includes:

*.chloefox.org.uk (the principal name)
chloe-fox.co.uk
chloe-fox.me.uk
chloe-fox.org.uk
chloefox.co.uk
chloefox.me.uk
chloefox.org.uk
www.chloe-fox.co.uk
www.chloe-fox.me.uk
www.chloe-fox.org.uk
www.chloefox.co.uk
www.chloefox.me.uk
www.chloefox.org.uk

As chloefox.org.uk is the principal domain, this is the CN for the cert and so this is why you see: "Could not issue an SSL/TLS certificate for chloefox.org.uk"

However, one of the other domains it’s looking to add to the SAN is
www.chloe-fox.org.uk (with a hyphen) which is the one it’s reporting, “and 10 more identifiers”.

So what I report is as reported, and is related to the SAN including variations on the name. It’s not an error

It's a SAN domain, not an error.

These names conflict/overlap:

They are not allowed to exist in the same certificate.
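To make the overlap concrete, here is a minimal Python sketch, assuming the two names meant are the wildcard `*.chloefox.org.uk` and `www.chloefox.org.uk` from the list above. The function name `find_wildcard_overlaps` is made up for illustration; it only checks a list of names for entries that a wildcard in the same list already covers, and is not how Let's Encrypt or Plesk implements the check.

```python
def find_wildcard_overlaps(names):
    """Return (wildcard, covered_name) pairs found in one SAN list."""
    lowered = [n.lower() for n in names]
    wildcards = [n for n in lowered if n.startswith("*.")]
    overlaps = []
    for wildcard in wildcards:
        base = wildcard[2:]  # "*.example.com" -> "example.com"
        for name in lowered:
            if name.startswith("*."):
                continue
            # A wildcard covers exactly one extra label to the left of its base,
            # so split off the first label and compare the remainder.
            head, _, tail = name.partition(".")
            if head and tail == base:
                overlaps.append((wildcard, name))
    return overlaps


# Names taken from the list posted above (shortened).
san = [
    "*.chloefox.org.uk",
    "chloefox.org.uk",
    "www.chloefox.org.uk",
    "www.chloe-fox.org.uk",
]
print(find_wildcard_overlaps(san))
```

The bare `chloefox.org.uk` is not reported, because a wildcard does not cover its own base name.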


Sorry, that's just a transcription error by me from the SAN list of the existing cert. As it's such a pain to copy/paste each name individually, I copied the unqualified names, then duplicated them and added a www. prefix to each rather than copy/pasting again. I inadvertently forgot to remove the final unqualified name for chloefox.org.uk and so added a www. prefix to it as well.

So, this name is not in the request list. It's not in the current SAN either, so I'm sorry, but we are no further forward as to why LE is issuing this 403.

I've attached the current SAN list so you can see what's in the current cert. This is simply an attempt to rotate an existing cert.


Also thinking about it, 403 Forbidden is an unlikely response from LE for an incorrect entry in a SAN list. Something else must be going on.

Retrying with just the single unqualified name, Plesk still reports:

Invalid response from https://acme-v02.api.letsencrypt.org/acme/finalize/356300830/227970812736.

Details:

Type: urn:ietf:params:acme:error:caa

Status: 403

Detail: Error finalizing order :: Rechecking CAA for "chloefox.org.uk" and 11 more identifiers failed. Refer to sub-problems for more information

The "and 11 more identifiers failed" bothers me. Can you advise, specifically, what LE is referring to when it mentions "an identifier"?

My suspicion is there is a principal name which is the cert CN, and an optional list of 5 'alias' domains to add to the SAN list. So, with all options disabled and requesting only the unqualified CN, how can there be '11 additional identifiers'? If the request adds a 'www.' prefix to the CN and also 5x additional domains, each with unqualified and 'www.' names, that adds an additional 11 names. Is Plesk doing something odd here?
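That arithmetic can be checked with a small Python sketch. It assumes Plesk requests every alias both with and without a 'www.' prefix, which is my guess and not confirmed Plesk behaviour; the helper name `expected_identifiers` is made up, and the alias names are the ones from the SAN list posted earlier.

```python
def expected_identifiers(principal, aliases, include_www=True):
    """Build the list of names a request would carry under the stated assumption."""
    names = []
    for base in [principal, *aliases]:
        names.append(base)
        if include_www:
            names.append("www." + base)
    return names


principal = "chloefox.org.uk"
aliases = [
    "chloefox.co.uk",
    "chloefox.me.uk",
    "chloe-fox.co.uk",
    "chloe-fox.me.uk",
    "chloe-fox.org.uk",
]

names = expected_identifiers(principal, aliases)
# 6 base names x 2 = 12, i.e. the one quoted in the error plus 11 others.
print(f"{len(names)} identifiers: one named in the error and {len(names) - 1} more")
```

If that assumption holds, the "11 more" would mean the alias names are still in the order even with the options deselected.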

Should I look to install certbot and try and force this rotation manually?

So, considering the reported response is 403, which is an HTTP Forbidden, I looked at another subscription, again with multiple names in a SAN.

This certificate is good for:

Valid from |Wed, 29 Nov 2023 11:55:59 UTC
Valid until |Tue, 27 Feb 2024 11:55:58 UTC (expires in 2 months and 16 days)

Running a manual refresh now also gives a 403:

Could not issue an SSL/TLS certificate for sprakeandkingsley.com
Details

Could not issue a Let's Encrypt SSL/TLS certificate for sprakeandkingsley.com.

Details

Invalid response from https://acme-v02.api.letsencrypt.org/acme/finalize/356300830/228271926196.

Details:

Type: urn:ietf:params:acme:error:caa

Status: 403

Detail: Error finalizing order :: Rechecking CAA for "sprake.me.uk" and 93 more identifiers failed. Refer to sub-problems for more information

So why is a cert that auto-refreshed (via Plesk) just a week and a half ago now failing with a 403 response? The server is well configured; this is just automated Plesk, not hand-built messed-up attempts. Check the server setup at: SSL Server Test: sprakeandkingsley.com (Powered by Qualys SSL Labs)

All these subscriptions (configurations) have been in place for years and been auto-rotating, with occasional 403 issues. Now I am seeing these 403 responses? Why?

Don't focus on the "403". That is the overall result of the Cert request and shows for nearly every failed request (all?). Why it chose 403 I am not sure.

Let's Encrypt checks the CAA to ensure it is allowed to issue for the domain. It must either get a valid CAA record or a valid "not found" response. In your case something "wrong" is happening with the not found - either timeout or bad data.

Normally there is more info than that for a CAA lookup failure. Are there more detailed logs in Plesk?

I do not see anything wrong with your DNS servers using our normal tools (unboundtest, dnsviz, or an edns validator). So, the most likely reason is your DNS server provider has a too-sensitive setting for DDoS protection.

The LE Servers make numerous requests to the DNS servers and may be getting blocked by such a setting.
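As a rough illustration of why the number of lookups grows with the SAN list, here is a minimal Python sketch of the CAA "tree climbing" search described in RFC 8659: start at the name itself and move up one label at a time until a CAA record set is found. The function names and the `zones_with_caa` argument are made up for this example, and it does not model retries, multiple validation locations, or which DNS server answers each query.

```python
def caa_lookup_chain(fqdn):
    """Names checked for CAA, from the requested name up towards the root."""
    labels = fqdn.rstrip(".").lower().split(".")
    if labels[0] == "*":
        labels = labels[1:]  # a wildcard is checked at its base name
    return [".".join(labels[i:]) for i in range(len(labels))]


def names_queried(fqdn, zones_with_caa=frozenset()):
    """Stop climbing at the first name that has a CAA record set."""
    queried = []
    for name in caa_lookup_chain(fqdn):
        queried.append(name)
        if name in zones_with_caa:
            break
    return queried


# No CAA anywhere: every level is checked.
print(names_queried("www.chloe-fox.org.uk"))
# Hypothetical: a CAA record published at chloe-fox.org.uk ends the search early.
print(names_queried("www.chloe-fox.org.uk", {"chloe-fox.org.uk"}))
```

Multiply the first case by every name in the certificate and the query volume adds up quickly.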

You might be able to work-around this problem by adding a CAA record. Finding a CAA record reduces the number of requests LE issues so might be successful. The proper format for CAA is described here:
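As a rough illustration of that format, here is a small Python sketch that prints a CAA record in standard zone-file form (flags, tag, quoted value). The helper name `caa_record` and the TTL of 3600 are assumptions for the example; `letsencrypt.org` is the issuer name Let's Encrypt documents for CAA. Your DNS provider's control panel may ask for the three fields separately instead of a single line.

```python
ALLOWED_TAGS = {"issue", "issuewild", "iodef"}


def caa_record(owner, tag, value, flags=0, ttl=3600):
    """Format one CAA record as a zone-file line."""
    if tag not in ALLOWED_TAGS:
        raise ValueError(f"unsupported CAA tag: {tag}")
    if not 0 <= flags <= 255:
        raise ValueError("flags must fit in one byte")
    return f'{owner.rstrip(".")}. {ttl} IN CAA {flags} {tag} "{value}"'


# Allow Let's Encrypt to issue ordinary and wildcard certificates.
print(caa_record("chloefox.org.uk", "issue", "letsencrypt.org"))
print(caa_record("chloefox.org.uk", "issuewild", "letsencrypt.org"))
```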


And another experiment: on another subscription, with a single domain name only and a wildcard cert, the renewal went through without issue, so the 403 isn't related to the server making the requests.


Thanks Mike; however, there are no CAA records and never have been.

Thanks for the DNS DDoS suggestion. That's a lead. Note that a single domain just renewed, but then, fewer checks. I wonder if SAN == multiple requests == DDoS block. That's a route to check. Thanks for the idea. Good one :slight_smile:


Another thought on the DNS DDoS mitigation theory...

I presume LE will go to 'whatever' DNS they have configured for their systems and not direct to the authoritative NS for the zone? Hence LE will go to 'whatever' NS, which in turn may refer to others before a request comes to the authoritative NS. Or does LE check the NS RR for a zone and then directly query the authoritative NS?

I'm just trying to work out in my mind how to prove/disprove this theory, say by hosting the zone elsewhere, but then it occurred to me that it's unlikely LE is the client querying the authoritative NS, and so they would need to be refusing referred lookups from other NS ... which you might expect would show up as issues with other services??

Do you follow my logic?

No, the LE servers walk the DNS tree directly using the authoritative servers

See unboundtest.com which mostly mimics Let's Encrypt (except the volumes and locations of LE servers doing the lookups)

Using a different DNS provider is a good test. Cloudflare is commonly used and probably free. It would be a good one to test against anyway.


Many thanks Mike.

Have been doing various tests and finding slow/no name resolution via Google (8.8.8.8), but 'instant' via Cloudflare, AT&T, UUnet etc. - Dig web interface - online dns lookup tool

Also set explicit 'issue' and 'issuewild' RR rather than go with an empty response. So far, still failing, albeit RR are in place and being returned
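For reference, this is how I understand 'issue' and 'issuewild' are meant to be evaluated, as a minimal Python sketch of the RFC 8659 rules: 'issuewild' is ignored for ordinary names, and for wildcard names it replaces 'issue' when present. The function name `caa_permits` and the (tag, value) tuple layout are my own; the sketch ignores the critical flag, unknown tags and parameters such as accounturi, so it is not a full implementation of what LE does.

```python
def caa_permits(records, issuer, wildcard=False):
    """records: (tag, value) pairs from the relevant CAA record set."""
    issue = [value for tag, value in records if tag.lower() == "issue"]
    issuewild = [value for tag, value in records if tag.lower() == "issuewild"]

    # issuewild only applies to wildcard names; when present it overrides issue.
    relevant = issuewild if (wildcard and issuewild) else issue
    if not relevant:
        return True  # nothing in the set restricts this kind of request

    # The issuer name is the part before any ';' parameters; a bare ';' allows no one.
    allowed = {value.split(";", 1)[0].strip().lower() for value in relevant}
    return issuer.lower() in allowed


records = [("issue", "letsencrypt.org"), ("issuewild", "letsencrypt.org")]
print(caa_permits(records, "letsencrypt.org"))
print(caa_permits(records, "letsencrypt.org", wildcard=True))
```

By that logic the records I've set should allow both the plain names and the wildcard, so the failure looks like a lookup problem, not a policy one.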


All looks good via https://unboundtest.com/m/CAA/chloefox.org.uk/VJ652WDW as well (for all domains in the SAN)

This thread is interesting: Error finalizing order Rechecking CAA failed. Refer to sub-problems for more information - #3 by bbinct

Exact same issue, and the resolution is:
"Well, I don’t know what changed, but finally after trying all day, my cert has renewed!
Thanks for all your help, if I figure out what changed, I’ll post it here. I did speak with the person responsible for the DNS, but I don’t know if they changed something. I’m guessing they did…

EDIT: Nothing changed, I think LetsEncrypt was just too busy and timing out all day doing all the new SAA DNS queries for each cert request. It was only an issue because LetsEncrypt gave so many users just 24hr notice that our certs were about to be revoked."

This is like my case ... certs auto-rotate without issue, then periodically I seem to have this CAA issue, then, as in that thread, the problem resolves itself without the cause ever being identified ... which is why I'm here, as I've never managed to get to the bottom of it before, despite certs finally being issued.


sprake.me.uk | DNSViz


Yes, I had already tested with unboundtest.com and two other common tools we use and all were fine. I just noticed @rg305 found an error in one of your domains. I don't know that I checked them all earlier, but that's something to look at.

But, these testing tools are just single requests. They do not mimic the larger volume of queries made by Let's Encrypt validation servers. And, they cannot mimic the worldwide locations of LE servers (all querying at same time).

Which is why I suggested this seemed more like a DDoS setting. It could be something else but it's a fair guess.

You could just see what happens by adding CAA record as I noted earlier.

Other testing tools
https://dnsviz.net/
https://ednscomp.isc.org/ednscomp


sprakeandkingsley.com | DNSViz
