SAN certificate with dns-cloudflare

Hi!

I've run into a strange situation.

I have 29 servers, each with its own FQDN (all under one root domain). I've set up a separate server to issue and renew the certificate for those servers. I issued one SAN certificate containing 58 domain names (2 names per server) with certbot (installed via snap) and the dns-cloudflare plugin. Everything worked fine. All servers are in CF DNS, with two A records for each.

After some time I renewed the certificate, and that went fine as well.

But for the past month (and still now), when I try to renew the cert I get an error about CAA records. The CAA records are in place.

If I renew a few domains (1-10), it's OK. But with all 58, I get the error.

One thing I changed during this time: I enabled the IPv6 stack on the servers and added AAAA records to CF DNS.

One server doesn't support IPv6 at all. But even without it, I get the error.
It seems to me that after some number of attempts CF blocks the requests.

Certbot command is:
certbot certonly --force-renewal --key-type rsa --cert-name cert_work --non-interactive --dns-cloudflare --dns-cloudflare-credentials /opt/certs_dns/credentials $str --agree-tos --dns-cloudflare-propagation-seconds 20 --email {my@email}

The $str variable is "-d {domain1} -d {domain2} ..." and so on.
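
As an aside on how $str might be assembled: certbot's -d option also accepts a comma-separated list, so the names can be joined into a single argument instead of a long string of -d flags. A minimal sketch, assuming a POSIX shell; the function name join_domains and the example hostnames are hypothetical and not taken from the original script:

```shell
#!/bin/sh
# Join hostnames into one comma-separated value for a single "-d" argument.
# join_domains and the hostnames below are illustrative placeholders.
join_domains() {
  list=""
  for name in "$@"; do
    if [ -z "$list" ]; then
      list="$name"
    else
      list="$list,$name"
    fi
  done
  printf '%s\n' "$list"
}

domains=$(join_domains srv1.example.com srv1-alt.example.com srv2.example.com)
# Print the argument that would be handed to certbot.
printf '%s\n' "-d $domains"
```

Keeping the names in one list also makes it easier to add servers later without editing the certbot command itself.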

certbot 2.6.0

Log attached
log.txt (177.0 KB)

Hello @Comandante, welcome to the Let's Encrypt community.

You might want to read this. I don't think Let's Encrypt has adjusted the number, but possibly.

I think the discrepancy might be due to authorization caching (when you reissue certificates covering recently-issued names using the same Let's Encrypt account, the authorizations are not rechecked by the CA, for what I think is about one week).

So you could have some names that issue automatically with no checking, but other names that do need to be checked, and that fail.

I'm not sure what to do about the CAA failure here. For example, the first name that failed this check just worked with unboundtest, and also apparently worked with dig for me locally.
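
For context on what a CAA check involves: under RFC 8659 the CA looks for CAA records at the requested name and, if none are found, at each parent domain in turn, stopping at the first name that has a CAA record set. The sketch below lists the names that could be queried for one hostname. The function name and hostname are hypothetical, and this is not Let's Encrypt's actual implementation:

```shell
#!/bin/sh
# Print the chain of names a CAA lookup may walk for one hostname,
# from the full name up to the top-level domain.
# caa_lookup_chain is an illustrative helper, not part of any tool.
caa_lookup_chain() {
  name="${1%.}"                 # drop a trailing dot, if any
  while [ -n "$name" ]; do
    printf '%s\n' "$name"
    case "$name" in
      *.*) name="${name#*.}" ;; # strip the leftmost label
      *)   name="" ;;           # no labels left to strip
    esac
  done
}

caa_lookup_chain srv1.example.com
```

Each printed name can then be checked by hand, for example with dig CAA followed by the name.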

It could be related to this new Let's Encrypt server-side error:

This error appears to be making CAA records effectively mandatory rather than optional, at least some of the time for some people. (If so, that is not intended behavior for the certificate authority.)

I wonder if the DNS server (or some firewall in front of it) is seeing all the CAA requests coming at once and considering it an "attack" that it's blocking.

That probably isn't related, unless you're talking about adding IPv6 to the DNS servers themselves (as opposed to just the AAAA records it returns).

Only in Staging, and it looks like this is attempting against production.

But remove --force-renewal immediately; it will only cause you pain and will never "fix" a problem.

In staging everything is OK (dry-run). The error occurs only in production.

I know about the limit of 100 FQDNs per cert, but I have only 58.

About --force-renewal: I run my script once a month rather than waiting for certbot's own renewal. My script does some other things after getting the new certificate. And what is the problem with this option?

Well, it allows you to force renewal only one month in, when you don't need to. Why are you running certbot manually in order to renew?

Those things should be put in the deploy hook, and then certbot will do whatever tasks you need it to, and only when the certificate actually needs to be (and is successfully) renewed.
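
A rough sketch of what such a deploy hook could look like. Certbot sets RENEWED_LINEAGE (the live directory of the renewed certificate) when it runs a deploy hook; the host list, the target path, and the choice to only print the copy commands rather than run them are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical deploy hook, e.g. passed to certbot via --deploy-hook.
# It only prints the copy commands it would run; swap printf for the
# real copy/restart commands once the target hosts and paths are known.
print_deploy_commands() {
  lineage="$1"
  shift
  for host in "$@"; do
    printf 'scp %s/fullchain.pem %s/privkey.pem %s:/etc/ssl/cert_work/\n' \
      "$lineage" "$lineage" "$host"
  done
}

# RENEWED_LINEAGE is provided by certbot when the hook runs; the fallback
# path and the host names are placeholders.
print_deploy_commands "${RENEWED_LINEAGE:-/etc/letsencrypt/live/cert_work}" \
  srv1.example.com srv2.example.com
```

Since a deploy hook runs only after a certificate has actually been issued or renewed, the follow-up steps would no longer depend on forcing a renewal.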

Very interesting. I think there's not going to be a lot that people here can do; you'd need to have your DNS provider look into why it's giving SERVFAIL when there are a lot of requests coming in at once.

The situation is this: today I have 29 servers, and tomorrow I'll have 32. Then I have to issue a new cert that includes the new servers, copy the cert to the servers, restart the Docker containers, and so on.
And maybe the day after tomorrow I'll get more servers to add.

So I wrote a script that runs once a month to check everything and renew the cert.

Why not just put fewer domains on each certificate (i.e. more batches of smaller loads)?

I need one cert on all servers. It's easier than dividing the servers into different groups.

Fair enough.

I'll create a ticket with CF and see what they answer.

It really seems that they think this is a DoS.

Even in this case, you still don't need --force-renewal; you could use --expand instead.

The difference in logic between them is that --expand automatically answers "yes" to the question about whether you'd like to replace the old certificate with an expanded one (covering a strictly larger set of names), while --force-renewal also answers "yes" to the question about whether you'd like to re-issue the certificate even though there's no apparent reason to do so (when the certificate is not near expiry and no new names have been requested to be added to it).

Why is it a problem to re-issue the cert? Even if there were no changes in the names, I want to renew it after one month, not three. Where is the problem?

You can certainly do that if you prefer. We've seen that, for most people, relying on a manual schedule like that makes it more likely, rather than less likely, that they'll eventually miss a certificate renewal (because they're on vacation, or sick, or dealing with some other urgent matter at the time that they hoped to deal with the certificate renewal).

And people who habitually use --force-renewal without looking closely at the Certbot output are also likely to hit rate limits in case of certain minor errors.

But it doesn't violate any Let's Encrypt policies or anything to do it that way.

If you merely want the renewal to happen after 30 days without also doing your renewals manually, you can set renew_before_expiry = 60 days instead of the default renew_before_expiry = 30 days in the .conf file in your /etc/letsencrypt/renewal directory. (That time window for renewal attempts is user-configurable.)
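
To make the arithmetic concrete, assuming Let's Encrypt's 90-day certificate lifetime: renewal attempts begin once the remaining lifetime drops below renew_before_expiry. The helper name below is hypothetical:

```shell
#!/bin/sh
# Days after issuance at which renewal attempts would begin, given the
# certificate lifetime and renew_before_expiry (both in days).
# renewal_starts_after is an illustrative helper, not a certbot command.
renewal_starts_after() {
  lifetime_days="$1"
  renew_before_days="$2"
  echo $((lifetime_days - renew_before_days))
}

renewal_starts_after 90 30   # default window: prints 60
renewal_starts_after 90 60   # suggested setting: prints 30
```

With the 60-day window, the certificate would be renewed roughly one month after issuance without any manual run.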

What will you do when you get the 51st server?
I'd start doing whatever you would for that now.

The 51st won't come tomorrow. When I get to at least 45, I'll think about it.

In any case, the question is: it worked 4 months ago, and it's not working now.

I think we recently saw a very similar issue with Akamai here: LE is not issuing cert for unknown reason, where CAA rechecking and large certificates are involved.

It is a little worrying if both Cloudflare and Akamai users are encountering this problem. If finalization triggers a flood of DNS queries that is too heavy (for some subjective measure of "too heavy") and we're seeing it across multiple notable providers, something's up.

It might be possible to prevent the CAA rechecking flood by deactivating existing authorizations every time you issue a certificate. The overall process will take longer, but it means the CAA checks will happen one by one for each authorization, not in rapid fire upon finalization. However, there is no way to force Certbot to do this except with the --dry-run flag, which discards the certificate. The other way to do it is to use a fresh ACME account every time.

If everybody was doing that, Let's Encrypt's systems would have their load increased a LOT. IMO that's not something you should do.
