"secondary validation: DNS problem:" during renewal

My domain is: doublebacon.de

I ran this command:
/usr/bin/certbot -v certonly --dry-run -d doublebacon.de

It produced this output:
I left out the verbose output here.

root@proxy:/etc/letsencrypt# certbot certonly  --dry-run -d doublebacon.de
Saving debug log to /var/log/letsencrypt/letsencrypt.log

How would you like to authenticate with the ACME CA?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1: Obtain certificates using a DNS TXT record (if you are using Cloudflare for
DNS). (dns-cloudflare)
2: Nginx Web Server plugin - Alpha (nginx)
3: Spin up a temporary webserver (standalone)
4: Place files in webroot directory (webroot)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Select the appropriate number [1-4] then [enter] (press 'c' to cancel): 2
Plugins selected: Authenticator nginx, Installer None
Obtaining a new certificate
Performing the following challenges:
http-01 challenge for doublebacon.de
Waiting for verification...
Cleaning up challenges
Failed authorization procedure. doublebacon.de (http-01): urn:ietf:params:acme:error:dns :: During secondary validation: DNS problem: query timed out looking up A for doublebacon.de; no valid AAAA records found for doublebacon.de

IMPORTANT NOTES:
 - The following errors were reported by the server:

   Domain: doublebacon.de
   Type:   None
   Detail: During secondary validation: DNS problem: query timed out
   looking up A for doublebacon.de; no valid AAAA records found for
   doublebacon.de

Output with -v: root@proxy:/etc/letsencrypt# certbot certonly -v --dry-run -d doublebacon.deRo - Pastebin.com

My web server is (include version):
nginx version: nginx/1.14.0 (Ubuntu)

The operating system my web server runs on is (include version):
VERSION="18.04.6 LTS (Bionic Beaver)"

My hosting provider, if applicable, is:

  • N/A - selfhosted

I can login to a root shell on my machine (yes or no, or I don't know):

  • yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):

  • no

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):

root@proxy:/etc/nginx# certbot --version
certbot 0.27.0

Notes / what else I tried:

  • DNS is hosted with Cloudflare.
  • This issue affects all my (sub)domains.
  • I have done lots of googling; all results seem to point to DNS problems, but AFAIK all is well.
  • From the machine certbot is running on:
root@proxy:/etc/nginx# dig doublebacon.de +short @bryce.ns.cloudflare.com
79.194.153.246
  • I have changed the host's DNS resolver (I run Pi-hole internally) and set it to 1.1.1.1 and to one of the IPs bryce.ns.cloudflare.com points to.
  • It used to work. I had some dependency problems after installing the dns-cloudflare plugin, but got that resolved. Maybe that is where the issue lies.
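For anyone repeating the dig check above against more than one resolver, here is a minimal sketch. The function name check_a_record is made up for illustration, and the 5-second timeout is an arbitrary choice. It only shows the view from your own host; secondary validation runs from Let's Encrypt's other vantage points, so a clean result here does not rule out a problem on their side.

```shell
#!/usr/bin/env bash
# Hypothetical helper: query the A record for one domain from several
# resolvers and print one tab-separated line per resolver.
check_a_record() {
    local domain="$1"; shift
    local resolver answer
    for resolver in "$@"; do
        # +time/+tries keep an unresponsive resolver from stalling the loop.
        # dig exits non-zero when no server replied, so check both the
        # exit status and that some answer text came back.
        if answer="$(dig +short +time=5 +tries=1 A "$domain" "@$resolver" 2>/dev/null)" \
                && [ -n "$answer" ]; then
            # Join multiple A records onto one line.
            printf '%s\t%s\n' "$resolver" "$(printf '%s' "$answer" | tr '\n' ' ')"
        else
            printf '%s\t%s\n' "$resolver" "NO ANSWER"
        fi
    done
}
```

Example call: check_a_record doublebacon.de 1.1.1.1 8.8.8.8 bryce.ns.cloudflare.com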

Thanks, I appreciate your help.

Yeah, there are a number of similar reports this morning. (great post by the way)

Let's Encrypt staff has been alerted. I would expect some sort of post from them soon.

Same problem here.... LetsDebug says "all is okay".

So, I kind of solved it. It was the --dry-run that produced the error output. If I leave it out and actually request a new cert, it works without any problem.
However, this is not my understanding of a dry run. I think it should try to perform all actions as far as possible and then exit with a message saying that it would succeed.
I initially had this problem with the DNS challenge and the dns-cloudflare plugin. Why does it not set the TXT record, verify it, and then say "yes, it works, just not issuing a new cert"?

If other people have a similar view on dry-run, I will create a feature request.

@Eldiabolo21 There are a number of DNS secondary validation failures being reported today in the staging system. Let's Encrypt staff has been alerted and we are awaiting a response.

Your understanding of dry-run is correct. There is just an unusual failure happening today.

Thanks for reporting; this is an issue I caused while misunderstanding some remote validation code last night. Fix in progress.

Do I understand your comment correctly, that a dry-run goes against the staging environment?

Yes. That is correct.
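To make the staging/production split concrete, here is a small sketch. The function name acme_directory_for is hypothetical. The two URLs are Let's Encrypt's published ACME v2 directory endpoints, and I am assuming a certbot recent enough to default to them. It shows why a failure during --dry-run can be a staging-only problem.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a request mode to the ACME directory that
# certbot is expected to contact for it.
acme_directory_for() {
    case "$1" in
        dry-run|staging)
            printf '%s\n' "https://acme-staging-v02.api.letsencrypt.org/directory" ;;
        production)
            printf '%s\n' "https://acme-v02.api.letsencrypt.org/directory" ;;
        *)
            printf 'unknown mode: %s\n' "$1" >&2
            return 1 ;;
    esac
}

# Corresponding certbot invocations (shown for reference, not run here):
#   certbot certonly --dry-run -d doublebacon.de   # staging, nothing is saved
#   certbot certonly --staging -d doublebacon.de   # staging, test cert is saved
#   certbot certonly -d doublebacon.de             # production
```

So when --dry-run fails but a real request succeeds, as happened in this thread, the staging environment is the first place to look.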

Thanks everyone for your helpful replies and good luck getting things back up and running.

You might want to update that (if possible).