SERVFAIL only on staging

My domain is:
boone.hpisd.org

I ran this command:

certbot certonly --webroot --staging --csr /path/to/my.csr -w /var/www/html -d admin.boone.hpisd.org -d boone.hpisd.org

It produced this output:
Failed authorization procedure. admin.boone.hpisd.org (http-01): urn:ietf:params:acme:error:dns :: During secondary validation: DNS problem: SERVFAIL looking up A for admin.boone.hpisd.org - the domain’s nameservers may be malfunctioning, boone.hpisd.org (http-01): urn:ietf:params:acme:error:dns :: During secondary validation: DNS problem: SERVFAIL looking up A for boone.hpisd.org - the domain’s nameservers may be malfunctioning
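
The message above packs one failure per domain into a single line, and each detail starts with "During secondary validation". As a rough sketch, assuming the message layout shown in this post (the regex and the `split_failures` name are illustrative and not part of Certbot), the line can be split per domain to see which failures came from the secondary checks:

```python
import re

# One per-domain entry in the Certbot message quoted above looks like:
#   "<domain> (<challenge>): <acme error urn> :: <detail>"
# This layout is inferred from the output in this post, not from a documented format.
ENTRY = re.compile(
    r"(?P<domain>[A-Za-z0-9.-]+) \((?P<challenge>[a-z0-9-]+)\): "
    r"(?P<error>urn:ietf:params:acme:error:\w+) :: "
)


def split_failures(message):
    """Return one dict per domain found in a 'Failed authorization' message."""
    matches = list(ENTRY.finditer(message))
    failures = []
    for i, m in enumerate(matches):
        # The detail runs until the next entry starts (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(message)
        detail = message[m.end():end].strip().rstrip(",")
        failures.append({
            "domain": m.group("domain"),
            "challenge": m.group("challenge"),
            "error": m.group("error"),
            # True when the CA reports the failure came from a secondary check.
            "secondary": detail.startswith("During secondary validation"),
            "detail": detail,
        })
    return failures
```

Fed the output above, this yields two entries (one per `-d` domain), both with the `dns` error type and both flagged as secondary-validation failures.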

Note that the letsdebug.net output for these domains passes validation 100% of the time after repeated checks:
https://letsdebug.net/boone.hpisd.org/128304

Simply removing the --staging flag and hitting the LE production environment solves the issue; it appears to occur only in staging.

I wonder if this is a regression related to this previously closed topic:

Hi @lancedolan,

that’s relevant. Read

So the primary Let’s Encrypt servers are able to check your domain, but the secondary servers are blocked.

That’s new; it started this year, so I don’t think it’s related to last year’s topic.

Looks like some name servers you use block these secondary validations.
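
One way to look at this from your side is to ask each of the zone's nameservers directly and compare the response codes. The sketch below is only that: it assumes the `dig` utility is installed, and the function names are illustrative. It shows how the nameservers answer from your own network only, so a block that targets specific source addresses (such as the secondary validators) would not necessarily show up here.

```python
import re
import subprocess


def dig_status(dig_output):
    """Extract the response code (e.g. NOERROR, SERVFAIL) from dig's header line."""
    m = re.search(r"status: ([A-Z]+)", dig_output)
    return m.group(1) if m else None  # None: no reply header (e.g. a timeout)


def authoritative_nameservers(zone):
    """List the NS hostnames for a zone, as returned by the local resolver."""
    result = subprocess.run(
        ["dig", "+short", "NS", zone],
        capture_output=True, text=True, timeout=30,
    )
    return [line.rstrip(".") for line in result.stdout.split()]


def query_nameserver(nameserver, name, rtype="A"):
    """Ask one nameserver directly for a record and return its response code."""
    result = subprocess.run(
        ["dig", f"@{nameserver}", name, rtype, "+norecurse"],
        capture_output=True, text=True, timeout=30,
    )
    return dig_status(result.stdout)


def check_all(zone, name):
    """Map each nameserver of the zone to the response code it gives for name."""
    return {ns: query_nameserver(ns, name) for ns in authoritative_nameservers(zone)}
```

Calling `check_all(zone, "boone.hpisd.org")` with the zone that holds the NS records returns a dictionary of nameserver to status. Running it from several networks gives a rough idea of whether the answers depend on where the query comes from.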

These are worldnic.com hostnames. Earlier this year, network issues between LE and the worldnic.com nameservers were resolved. Those issues had been preventing us and other large hosting providers from generating certs for large numbers of customers (hundreds of domains in our case).

Could it be that whatever solution you reached resolved the issue only in your production environment, while worldnic.com continues to block your staging environment (or rate-limit it, or whatever the original issue was)?


@lancedolan, thank you for this report! I believe our diverse-perspective resolver addresses may have rotated. I have opened communication with worldnic and will update this thread as I get information.


I’ve received confirmation that our current secondary resolver address list has been whitelisted. @lancedolan, is Staging behaving better now?


Sorry for the slow reply. We stopped using the staging environment completely after the previous replies. I’ll try again.

Thanks for the help and attention! Always impressive.
