My domain is: composedb.com
We’re using the xenolf/lego project as a library to issue certificates via the DNS challenge, with a modified version of its DNS preCheck logic: once all nameservers have been verified we do an arbitrary sleep of 30s, and we also log the nameservers checked. Most of the time things work as expected on the first attempt, but every now and then we get an NXDOMAIN error back from Let’s Encrypt. An excerpt from a recent failure is below:
2018/07/16 14:51:07 [INFO][930596040.composedb.com] acme: Obtaining bundled SAN certificate
2018/07/16 14:51:08 [INFO][930596040.composedb.com] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz/REDACTED
2018/07/16 14:52:26 [INFO][930596040.composedb.com] acme: Trying to solve DNS-01
2018/07/16 14:52:34 [INFO][930596040.composedb.com] Checking DNS record propagation using [google-public-dns-a.google.com:53 google-public-dns-b.google.com:53]
time="2018-07-16T14:52:34Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:52:39Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:52:44Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:52:49Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:52:54Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:52:59Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:53:04Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:53:10Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:53:10Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d2.googledomains.com.
time="2018-07-16T14:53:15Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:53:15Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d2.googledomains.com.
time="2018-07-16T14:53:15Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d3.googledomains.com.
time="2018-07-16T14:53:20Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:53:20Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d2.googledomains.com.
time="2018-07-16T14:53:20Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d3.googledomains.com.
time="2018-07-16T14:53:25Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d1.googledomains.com.
time="2018-07-16T14:53:25Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d2.googledomains.com.
time="2018-07-16T14:53:25Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d3.googledomains.com.
time="2018-07-16T14:53:25Z" level=info msg="querying for TXT record" fqdn=_acme-challenge.930596040.composedb.com. nameserver=ns-cloud-d4.googledomains.com.
time="2018-07-16T14:54:01Z" level=error msg="acme: Error -> One or more domains had a problem:\n[930596040.composedb.com] acme: Error 400 - urn:ietf:params:acme:error:dns - DNS problem: NXDOMAIN looking up TXT for _acme-challenge.930596040.composedb.com\n"
The 30s sleep has reduced how often we see the error, but it hasn’t eliminated it. I went this route based on the logic in certbot, hoping that the combination of checking the nameservers and sleeping for a bit would be enough. In the case above, the request was retried within ~1 minute and the certificate was obtained without any issue.
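To make the idea concrete, here is a minimal sketch of a stricter pre-check than "all nameservers answered once, then sleep": it only reports success after every authoritative nameserver has returned the expected TXT value for several consecutive polling rounds. This is not lego's or certbot's actual logic. The names `lookupFunc`, `queryNameserver`, `allServe` and `waitForStableTXT` are hypothetical, and the sketch uses only the Go standard library resolver rather than lego's internals. It also cannot guarantee that Boulder's own resolvers will see the record.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// lookupFunc returns the TXT values that one nameserver serves for fqdn.
// It is injectable so the polling logic can be exercised without network access.
type lookupFunc func(nameserver, fqdn string) ([]string, error)

// queryNameserver asks a single nameserver directly by pointing the
// standard library resolver at it (hypothetical helper, not part of lego).
func queryNameserver(nameserver, fqdn string) ([]string, error) {
	r := &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 5 * time.Second}
			return d.DialContext(ctx, network, net.JoinHostPort(nameserver, "53"))
		},
	}
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	return r.LookupTXT(ctx, fqdn)
}

// allServe reports whether every nameserver currently returns want for fqdn.
func allServe(nameservers []string, fqdn, want string, lookup lookupFunc) bool {
	if len(nameservers) == 0 {
		return false
	}
	for _, ns := range nameservers {
		values, err := lookup(ns, fqdn)
		if err != nil {
			return false // NXDOMAIN, timeout, etc. all count as "not ready"
		}
		found := false
		for _, v := range values {
			if v == want {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

// waitForStableTXT polls until all nameservers agree for `required`
// consecutive rounds, giving up after maxRounds rounds.
func waitForStableTXT(nameservers []string, fqdn, want string,
	required, maxRounds int, interval time.Duration, lookup lookupFunc) bool {
	streak := 0
	for round := 0; round < maxRounds; round++ {
		if allServe(nameservers, fqdn, want, lookup) {
			streak++
			if streak >= required {
				return true
			}
		} else {
			streak = 0 // any miss resets the run of good rounds
		}
		time.Sleep(interval)
	}
	return false
}

func main() {
	// Simulated lookup: "ns2" lags behind and fails its first two queries.
	ns2Calls := 0
	simulated := func(ns, fqdn string) ([]string, error) {
		if ns == "ns2" {
			ns2Calls++
			if ns2Calls <= 2 {
				return nil, fmt.Errorf("NXDOMAIN from %s", ns)
			}
		}
		return []string{"expected-token"}, nil
	}
	ok := waitForStableTXT([]string{"ns1", "ns2"}, "_acme-challenge.example.com.",
		"expected-token", 3, 10, time.Millisecond, simulated)
	fmt.Println("stable:", ok)
}
```

In real use, `queryNameserver` would be passed as the lookup function with the zone's authoritative nameservers (for example the four `ns-cloud-d*.googledomains.com.` hosts from the log above) and an interval of a few seconds. Requiring consecutive successes is meant to catch a nameserver that answers correctly once and then briefly serves NXDOMAIN again from another backend, which a single check followed by a fixed sleep would miss.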
Is there anything else that can be done to verify that the TXT record is ready to be checked by Boulder? A previous discussion on a similar topic didn’t give me much to work with, since the resolvers used by Boulder are intentionally kept private and could change. My hope is that there is something additional we can check, which would also improve the lego library.
/cc @cpu