Over the past few days, I have run several test configurations with certbot (using --break-my-certs). Every time, roughly 2 out of 10 subdomains fail the challenge. Running certbot again then succeeds for the remaining subdomains. What's odd is that the subdomains that fail are different every time.
I've checked the domains with letsdebug as well, and there too I get variable results without having made any changes to my DNS records.
Example error:
IMPORTANT NOTES:
- The following errors were reported by the server:
Domain: sub1.xxxxxxx.com
Type: dns
Detail: During secondary validation: DNS problem: SERVFAIL looking
up CAA for sub1.xxxxxxx.com - the domain's nameservers
may be malfunctioning
Domain: sub2.xxxxx.com
Type: dns
Detail: During secondary validation: DNS problem: SERVFAIL looking
up CAA for sub2.xxxxxxx.com - the domain's
nameservers may be malfunctioning
- Your account credentials have been saved in your Certbot
configuration directory at /etc/letsencrypt. You should make a
secure backup of this folder now. This configuration directory will
also contain certificates and private keys obtained by Certbot so
making regular backups of this folder is ideal.
id xxxxxx
opcode QUERY
rcode NOERROR
flags QR RD RA
;QUESTION
sub.xxxxxx.com. IN A
;ANSWER
sub.xxxxxxx.com. 3599 IN CNAME xxxxxxx.com.
xxxxxxx.com. 3599 IN A 167.xxx.xxx.xxx
;AUTHORITY
;ADDITIONAL
Edit: I just tried again and yet another subdomain failed (but the ones that previously failed were fine). I checked it with the dig tool you linked to and got the same result as above (except for the subdomain and IP, of course). When I re-ran the certbot command, the subdomain that had just failed worked fine.
Since the problem seems to be intermittent, is there a way to get more information about what went wrong when the failure occurred?
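One way to gather more data on an intermittent failure like this is to query the CAA record repeatedly and count the response codes per nameserver. The sketch below is only an illustration: the function names `parse_status` and `probe_caa` are made up for this example, it assumes `dig` is installed, and it only shows what a nameserver answers from your own machine, so it cannot reproduce Let's Encrypt's secondary validation exactly.

```python
import re
import subprocess
from collections import Counter

# Matches the status field in dig's header line, e.g.
# ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4242"
STATUS_RE = re.compile(r"status:\s*([A-Z]+)")


def parse_status(dig_output):
    """Return the response code named in dig's header, or "NO_RESPONSE"."""
    match = STATUS_RE.search(dig_output)
    return match.group(1) if match else "NO_RESPONSE"


def probe_caa(domain, nameserver, attempts=10):
    """Ask one nameserver for the CAA record several times; tally the rcodes."""
    tally = Counter()
    for _ in range(attempts):
        result = subprocess.run(
            ["dig", "+tries=1", "+time=3", "CAA", domain, "@" + nameserver],
            capture_output=True,
            text=True,
        )
        tally[parse_status(result.stdout)] += 1
    return tally


# Example call (performs real DNS queries, so it is left commented out):
# print(probe_caa("notes.mylittlestashbox.com", "ns-cloud-d1.googledomains.com"))
```

Running `probe_caa` once per nameserver in the delegation should show whether one of them answers differently from the others, which would be consistent with failures that move between subdomains from run to run.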
I wasn't paying close attention the first few times, but SERVFAIL has appeared on each of the last several tries.
1. All the domains that are having this unique problem use the exact same set of DNS servers:
Yes, they all use the exact same name servers. Although the Google servers you listed are my "main" name servers, I also had to create A records on the DigitalOcean side to get things to work. That is expected, right? In any case, it's been that way for months, and this was never an issue in the hundreds of tests I did before this past week.
2. The "renewal"/"test" process requires DNS changes to be made:
I am not sure what you mean exactly, but I don't think so. At least I don't touch any of my name server settings when I am running certbot.
3. I use this ACME client, and version.
I am using certbot-nginx from the Ubuntu 20.04 repositories. In /etc/letsencrypt/renewal/mylittlestashbox.com.conf I have server = https://acme-staging-v02.api.letsencrypt.org/directory, is that what you are asking for?
On the most recent run, I was able to create all the certs without error, but this is the file:
Regarding the A records on DigitalOcean, all I know is that I could not reach mylittlestashbox.com or any of its subdomains until I had created the records on the DigitalOcean side. I have deleted them now and it still seems to be working.
Perhaps either the change I just made has not propagated yet, or perhaps I did not wait long enough during the initial setup and falsely assumed that my creation of those records is what caused it to start working.
conf file:
~$ cat /etc/letsencrypt/renewal/mylittlestashbox.com.conf
# renew_before_expiry = 30 days
version = 0.40.0
archive_dir = /etc/letsencrypt/archive/mylittlestashbox.com
cert = /etc/letsencrypt/live/mylittlestashbox.com/cert.pem
privkey = /etc/letsencrypt/live/mylittlestashbox.com/privkey.pem
chain = /etc/letsencrypt/live/mylittlestashbox.com/chain.pem
fullchain = /etc/letsencrypt/live/mylittlestashbox.com/fullchain.pem
# Options used in the renewal process
[renewalparams]
account = xxxxxxxxxxxxxxxxxxxxxxxx
rsa_key_size = 4096
server = https://acme-staging-v02.api.letsencrypt.org/directory
authenticator = nginx
nginx_server_root = /etc/nginx
Reason #2: When the D1 server is queried, it knows nothing of your domain:
nslookup mylittlestashbox.com ns-cloud-d1.googledomains.com
*** UnKnown can't find mylittlestashbox.com: Query refused
meanwhile C1 works just fine:
Ok that makes sense, just wanted to make sure. I have not touched anything on the Google side yet.
BTW I ran certbot again and there was again a failure for just one of my subdomains, whose A record has been deleted from DO but is still untouched on Google.
Challenge failed for domain notes.mylittlestashbox.com
http-01 challenge for notes.mylittlestashbox.com
Cleaning up challenges
Some challenges have failed.
IMPORTANT NOTES:
- The following errors were reported by the server:
Domain: notes.mylittlestashbox.com
Type: dns
Detail: During secondary validation: DNS problem: SERVFAIL looking
up CAA for notes.mylittlestashbox.com - the domain's nameservers
may be malfunctioning
- Your account credentials have been saved in your Certbot
configuration directory at /etc/letsencrypt. You should make a
secure backup of this folder now. This configuration directory will
also contain certificates and private keys obtained by Certbot so
making regular backups of this folder is ideal.
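Because the failing subdomains differ from run to run, it may help to record which ones failed each time and compare. The following is a minimal sketch that pulls the Domain/Type/Detail entries out of certbot's error notes as pasted above. The helper name `parse_failures` is hypothetical, and the parsing assumes the output layout shown in this thread, which may differ in other certbot versions.

```python
import re

FIELD_RE = re.compile(r"(Domain|Type|Detail):\s*(.*)")


def parse_failures(certbot_output):
    """Return a list of {"domain", "type", "detail"} dicts from the error notes."""
    failures = []
    current = None  # the failure entry currently being filled in
    field = None    # the last field seen, used to join wrapped Detail lines
    for raw in certbot_output.splitlines():
        line = raw.strip()
        match = FIELD_RE.match(line)
        if match:
            key, value = match.group(1).lower(), match.group(2)
            if key == "domain":
                current = {"domain": value, "type": "", "detail": ""}
                failures.append(current)
            elif current is not None:
                current[key] = value
            field = key
        elif line.startswith("- "):
            # A new bulleted note starts, so the error entry has ended.
            current, field = None, None
        elif field == "detail" and current is not None and line:
            # Continuation of a Detail message wrapped over several lines.
            current["detail"] += " " + line
    return failures
```

Saving the output of each run and feeding it through `parse_failures` gives a list of failed domains per run, which makes it easier to see whether the failures are truly random or cluster around particular names.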