I’m trying to get a wildcard certificate for *.cryptoclimate.io and cryptoclimate.io, as described here, after making the mistake of getting only a certificate for *.cryptoclimate.io.
Due to TTLs or propagation (even though I set the TTL to the minimum 30s on DigitalOcean), it’s been a highly frustrating experience trying to get that wildcard cert over the past hour or so. Every single challenge fails, and I get different (old, like in this post) TXT values from online DNS checker tools vs. running nslookup locally vs. what certbot fails with.
Would it be possible to force a different hostname than _acme-challenge? That should avoid these caching and propagation problems.
The main thing that materially affects how fast you can complete a DNS challenge is how quickly DigitalOcean pushes updates out to its own nameservers.
Last I checked, they are pretty fast about it. I think your problem is elsewhere if things are taking an hour.
Being able to control the _acme-challenge label will not change anything in your situation. Zone updates are not pushed out any quicker on the basis of what the actual change is - it’s opaque.
Regarding TTLs, Let’s Encrypt currently observes TTLs up to a maximum of 60 seconds. That is the worst case delay that can be caused by TTL/caching. You are free to set a TTL of 0 or 1 seconds.
All you need is one wildcard cert. So you had it right in the first place, then messed it up. Make sure that you have 2 TXT records. Don’t delete the first one to create the second one. The TTL should not make any difference at all.
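To see whether both records are actually being served, it helps to ask each authoritative nameserver directly rather than a caching resolver. Here is a minimal sketch, assuming you collect the answers yourself (for example with `dig +short TXT _acme-challenge.cryptoclimate.io @ns1.digitalocean.com`). The function name and the sample token values are made up for illustration.

```python
def missing_txt_values(expected, answers_by_nameserver):
    """Return {nameserver: sorted list of expected values it does not serve}.

    expected: the TXT values certbot asked for (two when requesting
              both *.domain and domain).
    answers_by_nameserver: {nameserver: iterable of TXT strings it returned}.
    An empty result means every nameserver serves every expected value.
    """
    missing = {}
    for ns, answers in answers_by_nameserver.items():
        # dig prints TXT data wrapped in double quotes; strip them before comparing.
        served = {a.strip().strip('"') for a in answers}
        gaps = sorted(set(expected) - served)
        if gaps:
            missing[ns] = gaps
    return missing


# Hypothetical values, for illustration only.
expected = ["token-for-wildcard", "token-for-apex"]
answers = {
    "ns1.digitalocean.com": ['"token-for-wildcard"', '"token-for-apex"'],
    "ns2.digitalocean.com": ['"token-for-wildcard"', '"stale-old-token"'],
}
print(missing_txt_values(expected, answers))
```

If any nameserver shows up in the result, pressing Enter in certbot at that moment will most likely fail, because the validation may hit that server.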
acme-dns is a small RESTful DNS server that works with certbot and other clients. Once you set it up and forward the _acme-challenge record to it, you never have to touch the main DNS servers again. You can control all your Let's Encrypt validations through it.
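Roughly, the setup is a one-time CNAME in the main zone plus one HTTP call per validation. Below is a sketch, assuming an acme-dns instance at the hypothetical URL `https://auth.example.org` and credentials obtained from its `/register` endpoint. The header and field names are as I remember them from the acme-dns README, so check them against the version you run. The function only builds the request and does not send anything.

```python
import json
import urllib.request

# One-time record in the main zone (the subdomain comes from /register):
#   _acme-challenge.cryptoclimate.io.  CNAME  <subdomain>.auth.example.org.


def build_update_request(base_url, username, password, subdomain, txt_value):
    """Build (but do not send) the POST that sets the challenge TXT value.

    base_url, username, password and subdomain are placeholders for the
    values of your own acme-dns instance; nothing here is specific to
    DigitalOcean or certbot.
    """
    body = json.dumps({"subdomain": subdomain, "txt": txt_value}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/update",
        data=body,
        method="POST",
        headers={
            "X-Api-User": username,   # assumed header name, per acme-dns README
            "X-Api-Key": password,    # assumed header name, per acme-dns README
            "Content-Type": "application/json",
        },
    )


# Hypothetical credentials, for illustration only.
req = build_update_request(
    "https://auth.example.org/", "user-id", "secret", "sub-id", "challenge-token"
)
print(req.get_method(), req.full_url)
```

Sending it would be `urllib.request.urlopen(req)`. In practice a certbot or other client plugin does this for you, so the sketch is only meant to show how little is involved.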
A few years ago I had similar problems with Namecheap's DNS. Even with a 60s TTL, it could take 10-30 minutes for things to clear; other times it was fine.
After a lot of experimentation, I eventually concluded they had a multi-level read-through cache installed. (It could have been something else, but it behaved like a multi-level read-through cache.) While the TTL on the record was 60s, their internal cache seemed to have a 5 or 10 minute storage time, which wasn't cleared by an update. When I tried to complete a challenge, Let's Encrypt might talk to a DNS server with the 60s timeout that has a local cache, or trigger an internal distributed cache lookup between 60s and 300s, or trigger a new internal datastore lookup. Every subsequent attempt was the same: either returning one of 2 possible stale values or doing a read-through cache on the new value.
At that point, I just said "fork it" and migrated everything to acme-dns.
DigitalOcean uses Cloudflare DNS Firewall, which has configurable edge caching. So there probably are multiple layers of caching involved. I don’t know if DigitalOcean has published the details.
Then I simply waited an hour from adding the TXT record to pressing Enter to perform each challenge, and that worked. After two hours, I have my wildcard cert. So I think the problem was with DigitalOcean's caching.
Allowing _acme-challenge to be changed is extremely dangerous. Let’s say a malicious actor gets access to set a limited number of TXT records, but _acme-challenge isn’t one of them. They could still get a wildcard cert, and if they disguised it as something technical you might not notice.