Wildcard certificate generation failing DNS challenge - but the TXT record is there

[Sorry for all the edits, hit submit too quickly and had to finish typing]

My domain is: alinlung.top

My web server is (include version): Traefik v2.4.8

The operating system my web server runs on is (include version): Debian Buster

I can login to a root shell on my machine (yes or no, or I don't know): yes

I'm using Traefik as a reverse proxy for a few services running on a local home server (each service on its own subdomain: traefik.alinlung.top, netdata.alinlung.top, etc.). I want to set up a wildcard certificate (*.alinlung.top), but the DNS challenge keeps failing.

This is my docker-compose set-up for traefik:

  traefik:
    container_name: traefik
    image: traefik
    command:
      - --log.level=INFO
      - --api.insecure=true
      - --accesslog=true
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --providers.file.filename=/config/traefik.yml

    -SNIP-


      # HTTPS LetsEncrypt Wildcard Config
      - --certificatesresolvers.letsencrypt.acme.email=-SNIP-
      - --certificatesresolvers.letsencrypt.acme.storage=letsencrypt/acme-wildcard.json
      - --certificatesresolvers.letsencrypt.acme.dnschallenge=true      # It needs to be a DNS Challenge (rather than TCP or HTTP) for wildcard certificates to work
      - --certificatesresolvers.letsencrypt.acme.dnschallenge.provider=namesilo
      #- --certificatesresolvers.letsencrypt.acme.dnschallenge.resolvers=1.1.1.1:53,8.8.8.8:53
      - --certificatesresolvers.letsencrypt.acme.dnschallenge.delaybeforecheck=600
      #- --certificatesresolvers.letsencrypt.acme.caserver=https://acme-v02.api.letsencrypt.org/directory
      - --certificatesresolvers.letsencrypt.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
      # Staging server is used for testing (so you don't hit API limits)


      # HTTP / HTTPS Entrypoints
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443

    ports:
      - 80:80       # Web EntryPoint
      - 443:443     # WebSecure EntryPoint
      - 11007:8080  # WebUI
    volumes:
      - *shared-volume
      - *docker-sock
      - ${USERDIR}/docker/traefik/letsencrypt:/letsencrypt
      - ${USERDIR}/docker/traefik/certs:/certs
      - ${USERDIR}/docker/traefik/config:/config
    environment:
      - *PUID
      - *PGID
      - *TZ
      - NAMESILO_API_KEY=${NAMESILO_API_KEY}
    labels:
      - traefik.enable=true
    -SNIP-

      - traefik.http.routers.traefik-wildcard-cert.tls=true
      - traefik.http.routers.traefik-wildcard-cert.tls.certresolver=letsencrypt
      - traefik.http.routers.traefik-wildcard-cert.tls.domains[0].main=*.alinlung.top
      # - traefik.http.routers.traefik-wildcard-cert.tls.domains[0].sans=alinlung.top
      - traefik.http.routers.traefik-wildcard-cert.service=api@internal

    -SNIP-

This is the error I see in Traefik's logs:

time="2021-05-17T22:34:28+03:00" level=error msg="Unable to obtain ACME certificate for domains \"*.alinlung.top\" : unable to generate a certificate for the domains [*.alinlung.top]: error: one or more domains had a problem:\n[*.alinlung.top] time limit exceeded: last error: NS ns1.dnsowl.com. did not return the expected TXT record [fqdn: _acme-challenge.alinlung.top., value: 09wwvh4fBPgEAB1LaVYfkPJkMHJtH5GIzUnXU2hUAlw]: \n" providerName=letsencrypt.acme

But I have Namesilo's DNS management page open on my second screen, and I can see that the TXT record has been created with the correct value. See the screenshot below.

I'm guessing the issue might be that the TXT record isn't propagating in time, but I set a 10-minute delay in the config and it didn't make any difference, so I'm a bit low on ideas.

You might just need to wait longer. I can see the _domainconnect TXT RR, but indeed no _acme-challenge TXT RR.

By the way, you do realise everybody can resolve your hostname to the IP addresses you've blurred, making that quite useless? :stuck_out_tongue:

Yeah, I realised a few minutes after posting that those records are public anyway. Looks like I'll find out if my stuff is locked down well enough :sweat:
Anyway, the _acme-challenge record seems to be deleted automatically when the DNS challenge fails (the second the message is printed in the console, it's gone from the namesilo Records list). By waiting longer, do you mean just setting a longer wait than 10 mins?

Does Traefik have some kind of debug mode where it pauses after it adds the TXT records?

Could be worth a try. And while you're waiting, you could use a tool like dig (or a web-based one such as "Dig (DNS lookup)" if you don't have dig installed) to check whether you can see the TXT record yourself.
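
For anyone who wants that dig check spelled out, here is a minimal sketch. It assumes dig is installed, and query_txt is only an illustrative helper name (it is not part of Traefik or Namesilo). The @server argument makes dig ask that nameserver directly instead of your local resolver, which is closer to what the error message above is complaining about.

```shell
#!/bin/bash
# Hypothetical helper: ask ONE specific nameserver for a TXT record.
# Usage: query_txt <nameserver> <fqdn>
query_txt() {
	local server="$1" name="$2"
	# +short prints only the record data; tr strips the surrounding quotes.
	dig +short -t txt "$name" "@$server" | tr -d '"'
}
```

With the names from the error log, a call would look like `query_txt ns1.dnsowl.com _acme-challenge.alinlung.top`. Empty output means that server is not serving the record (yet).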

I would test the propagation delay by hand.
That means create a new DNS record (start the timer/counter).
Then check each of the six IPv4 and six IPv6 IPs responsible for your DNS zone individually for that newly created record.
Once all twelve are in sync, stop the timer/counter.
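
To get the list of IPs to check, something like the sketch below should work. It assumes dig is available, and list_ns_ips is a made-up helper name. It looks up the NS records for the zone and then resolves each nameserver to its IPv4 and IPv6 addresses.

```shell
#!/bin/bash
# Hypothetical helper: print every A and AAAA address of a zone's nameservers.
# Usage: list_ns_ips <domain>
list_ns_ips() {
	local domain="$1" ns
	for ns in $(dig +short NS "$domain"); do
		dig +short A "$ns"      # IPv4 addresses of this nameserver
		dig +short AAAA "$ns"   # IPv6 addresses of this nameserver
	done
}
```

Running `list_ns_ips alinlung.top` prints one address per line, which you can then query one by one for the new record.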

Does anybody know how Cloudflare handles DNS traffic?

There are 3 name servers, all hosted via Cloudflare, each with 4 IP addresses:

alinlung.top
	• ns1.dnsowl.com / 67m35
		• 162.159.26.136 - Ashburn/Virginia/United States (US) - Cloudflare, Inc.
		• 162.159.27.173 - Ashburn/Virginia/United States (US) - Cloudflare, Inc.
		• 2400:cb00:2049:1::a29f:1a88 - Columbus/North Carolina/United States (US) - CLOUDFLARE
		• 2400:cb00:2049:1::a29f:1bad - Columbus/North Carolina/United States (US) - CLOUDFLARE

(see https://check-your-website.server-daten.de/?q=alinlung.top ).

Are DNS answers cached via Cloudflare?

I don't think there's a way to make it pause, but it does let you try the DNS challenge by hand (it tells you which record to add manually and with what value).

Yeah, I could make a short bash script to do all that and time it. Maybe the DNS propagation is just taking way more than my 10-minute delay.

Here's the Namesilo configuration reference for the ACME library that Traefik uses: "Namesilo :: Let’s Encrypt client and ACME library written in Go".

Set your DNS propagation time limit to 15 minutes or more; it will then poll for the changes up to that limit. Namesilo say it takes at least 5 minutes before they push changes. https://www.namesilo.com/Support/DNS-Troubleshooting

Before validation can complete, all of your nameservers need to be serving the correct TXT record value. This is more complicated if you request a certificate with a wildcard plus the primary domain, e.g. *.alinlung.top plus alinlung.top in the same cert, because these (confusingly) need two different validation values present at the same TXT record name when Let's Encrypt checks the challenge response. If only one of those values is present, only one of the identifiers can be validated and it will take a couple of attempts to pass validation.
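
As a rough illustration of the two-values case: dig +short prints one line per TXT value, so a check has to look for each expected value among the lines rather than compare the whole output to a single string. The helper name all_values_present and the token values below are made up for the example.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if EVERY expected value is among the TXT answers.
# Usage: all_values_present "<dig +short output>" value1 [value2 ...]
all_values_present() {
	local answers="$1" value
	shift
	for value in "$@"; do
		# -F: fixed string, -x: match the whole line, -q: no output.
		# "--" protects values that happen to start with a dash.
		printf '%s\n' "$answers" | tr -d '"' | grep -Fxq -- "$value" || return 1
	done
}
```

For example, `all_values_present "$(dig +short -t txt _acme-challenge.alinlung.top @ns1.dnsowl.com)" "$wildcard_value" "$apex_value"` would only succeed once both records are being served, where the two variables hold the values your ACME client told you to publish.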

If you are able to, consider running your own acme-dns service. It lets you create CNAMEs in your DNS that point to an acme-dns instance, so DNS validation is instant and requires no further changes to your Namesilo DNS entries. See "Joohoi's ACME-DNS :: Let’s Encrypt client and ACME library written in Go".

I did some testing last night; it seems TXT records take between ~8 and ~45 minutes to show up on all 12 IP addresses of the DNS nameservers. I set the delay to 60 minutes (and also set the DNS propagation time to 60 minutes) and was just able to get a wildcard certificate from the staging server. I think I'll skip the SAN entry (alinlung.top without a subdomain) because, frankly, I don't use the naked domain anywhere. Thanks everyone for all the help :relaxed:
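
For reference, here is a small sketch of what those two settings could look like. The exact lines used were not posted, so treat this as an assumption: delaybeforecheck is given in seconds (as with the 600 used earlier for 10 minutes), and NAMESILO_PROPAGATION_TIMEOUT is, as far as I know, the variable name the Namesilo provider reads for the propagation limit, also in seconds. Please check both against the current docs before relying on them.

```shell
#!/bin/bash
# Convert the 60-minute wait into seconds and print the two settings.
wait_minutes=60
wait_seconds=$((wait_minutes * 60))

# Goes under "command:" in the compose file (assumed to take seconds).
printf '%s\n' "- --certificatesresolvers.letsencrypt.acme.dnschallenge.delaybeforecheck=${wait_seconds}"
# Goes under "environment:" (variable name is an assumption, see above).
printf '%s\n' "- NAMESILO_PROPAGATION_TIMEOUT=${wait_seconds}"
```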

I'm gonna give it a few hours for the TXT record TTL to expire and the caching to go away, and then I'll request the certificate again, from the normal (non-staging) server.
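
If you would rather check than guess how long is left, a sketch like the following prints the TTL a resolver reports for the record. It assumes dig is installed, and remaining_ttl is just an illustrative name. In dig's answer section the second column is the TTL in seconds, and when you ask a caching resolver it counts down toward zero.

```shell
#!/bin/bash
# Hypothetical helper: print the TTL (seconds) a resolver reports for a TXT record.
# Usage: remaining_ttl <fqdn> <resolver>
remaining_ttl() {
	local name="$1" resolver="$2"
	# +noall +answer prints only answer lines: name, TTL, class, type, data.
	dig +noall +answer -t txt "$name" "@$resolver" | awk '{print $2}'
}
```

For example, `remaining_ttl _acme-challenge.alinlung.top 1.1.1.1` prints one number per cached TXT value, or nothing if the resolver has no answer for the name.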


If anyone finds this on Google and has similar issues, here's the bash script I used. Just put your list of nameserver IPs at line 7, your domain at line 8 and the TXT value you're expecting at line 14. It's a bit clumsy and heavy-handed (I haven't written bash in ages and it was written at 1am) but it gets the job done, and you can easily compare the starting and ending time to see how long it took.

#!/bin/bash
date  # start time
done=false

while [ "$done" == false ]; do
	ok=true
	for dnsServer in "YOUR_NAMESERVER_IP_1" "YOUR_NAMESERVER_IP_2"; do
		tmp=$(dig -t txt +short _acme-challenge.YOURDOMAIN.YOURTLD "@$dnsServer" | tr -d '"')  # "@" makes dig query this server
		if [ -z "$tmp" ]; then
			ok=false
			echo "Failed at $dnsServer: empty"
			break
		else
			if [ "$tmp" != "YOUR_TXT_KEY_VALUE" ]; then
				ok=false
				echo "Failed at $dnsServer: nonempty $tmp"
				break
			fi
		fi
	done
	if [ "$ok" == true ]; then
		done=true
		echo "Finally done"
		date  # end time
		tput bel; sleep 0.2s; tput bel; sleep 0.2s; tput bel
		exit 0
	fi
	sleep 10s  # wait before polling all servers again
done
