TXT record is not found

If a TXT record is not found, Lego waits an additional 10 minutes, and so on.
But Lego does find the TXT record, which is why it asks LE to validate the challenge.
LE, however, somehow doesn't see it.

Possibly DNS propagation delays between different DNS name servers around the world.
Also, Let's Encrypt uses multi-perspective validation: Multi-Perspective Validation Improves Domain Validation Security - Let's Encrypt

Yes, Anycast DNS propagation can be funny. You said that 7 of 15 fail, which means many work. We may be dealing with a timing problem of fairly short duration.

As a test, can you add a 5-10 minute delay before the first lego DNS TXT query? This would give some extra time before lego signals LE to try.

Yes, I'm thinking about such an addition: a delay before we tell Lego to check the challenge.

Hi, in addition to the comments above, make sure you have configured a low TTL for your TXT records using SOFTLAYER_TTL, and also configure a high timeout using SOFTLAYER_PROPAGATION_TIMEOUT:

https://go-acme.github.io/lego/dns/ibmcloud/

If you can configure alternative name resolvers (lego --dns.resolvers, as per the Options page of the lego documentation), then try setting them to ns1.softlayer.com and ns2.softlayer.com directly, or try other public ones like Cloudflare.

Would it be correct to set dns.resolvers to "ns1.softlayer.com:53", "ns2.softlayer.com:53" for all SoftLayer orders? If not, how can I select which DNS resolvers to use for each specific order?

I found that the Lego code always queries every one of the authoritative nameservers for the expected TXT record.
What else can be done to make a successful LE check more likely?

All that really has to happen for Let's Encrypt to validate your domain is for all of your authoritative nameservers to agree. The resolver you use for lego is just so it can test the DNS record for itself before proceeding, so it's up to you what you use there.

But Lego checks that the authoritative nameservers have the TXT record! And LE still doesn't find it.

I've only skimmed this thread, but does it verifiably check them all (and not just one)? If it does and they are returning variable results, then they're not very reliable; I'd extend the propagation timeout to compensate.

This problem sounds somewhat similar to mine, which occurs with the LEGO command line client directly, which presumably uses its own library.
I created an issue with the LEGO client because I mistakenly thought that it was checking the wrong name servers, or checking them in the wrong way, but after some more research this turned out to be false: Possible bad propagation check with dns-01 challenge · Issue #1777 · go-acme/lego · GitHub

In my thread here on this community forum, some people thought that my company's DNS setup was too complicated: Was there some dns lookup failure in recent days? I personally have some doubts about that (not every server needs to work for a correct result; one at each step is enough), but the "DNS server department" has made some changes anyway. In about a week we will need a renewal, and we can see whether it works better now.

But since we seem to have somewhat similar problems, this may be a hint that the real problem is not with you or with me, but somewhere in the range [us, Let's Encrypt].

My solution was to use Lego's dns01.WrapPreCheck, the option that wraps the precheck function.
In the wrapper I added an extra delay after all challenges have been prechecked successfully.
So once the TXT records have been found on all nameservers, I wait an additional 5 minutes (only for SoftLayer).
This solved the customer's issue; all orders now succeed.
