DNS SERVFAIL errors from Let's Encrypt

Hi, everyone!

We are getting DNS SERVFAIL errors from Let's Encrypt in approximately 19 out of every 20 attempts when using certbot for DNS-01 and HTTP-01 validation. Affected names include _acme-challenge.hrs.isr.umich.edu and _acme-challenge.oami.umich.edu.

We are able to consistently reproduce this using letsdebug.net: approximately 19 of every 20 attempts fail with a DNS SERVFAIL error.

We have not been able to reproduce this using unboundtest.com. Every test we run there succeeds. Tools such as dnschecker.com have also never reported DNS errors when looking up the record. Based on this, we're wondering if the problem is specific to Let's Encrypt's network and DNS setup.

Does anyone have any ideas or advice for troubleshooting this problem?

My domain is:
hrs.isr.umich.edu

I ran this command:
sudo certbot -vvv certonly --manual --preferred-challenges dns -d hrs.isr.umich.edu

It produced this output (see the end of this post for the full output):
{
"identifier": {
"type": "dns",
"value": "hrs.isr.umich.edu"
},
"status": "invalid",
"expires": "2024-08-28T13:30:45Z",
"challenges": [
{
"type": "dns-01",
"url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/393164121136/4zSguA",
"status": "invalid",
"validated": "2024-08-21T13:30:57Z",
"error": {
"type": "urn:ietf:params:acme:error:dns",
"detail": "DNS problem: SERVFAIL looking up TXT for _acme-challenge.hrs.isr.umich.edu - the domain's nameservers may be malfunctioning",
"status": 400
},
"token": "xxx-redacted-xxx"
}
]
}
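
For anyone comparing many attempts, here is a minimal sketch (not part of certbot) that pulls the failed challenges out of an authorization object like the one above. The function name `failed_challenges` and the trimmed `AUTHZ_JSON` sample are only illustrative; the field names are the ones visible in the output.

```python
import json

# Trimmed copy of the authorization object shown above.
AUTHZ_JSON = """
{
  "identifier": {"type": "dns", "value": "hrs.isr.umich.edu"},
  "status": "invalid",
  "challenges": [
    {
      "type": "dns-01",
      "status": "invalid",
      "error": {
        "type": "urn:ietf:params:acme:error:dns",
        "detail": "DNS problem: SERVFAIL looking up TXT for _acme-challenge.hrs.isr.umich.edu - the domain's nameservers may be malfunctioning",
        "status": 400
      }
    }
  ]
}
"""


def failed_challenges(authz):
    """Return (identifier, challenge type, error type, detail) per invalid challenge."""
    name = authz["identifier"]["value"]
    rows = []
    for chall in authz.get("challenges", []):
        # Only challenges that were attempted and rejected carry an "error" object.
        if chall.get("status") == "invalid" and "error" in chall:
            err = chall["error"]
            rows.append((name, chall["type"], err["type"], err["detail"]))
    return rows


FAILURES = failed_challenges(json.loads(AUTHZ_JSON))
for row in FAILURES:
    print(" | ".join(row))
```

Running this over the saved responses from several attempts makes it easier to see whether every failure is the same SERVFAIL on the same name.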

See below for the full output.

My web server is (include version):
Laptop: no web server, using certonly mode
Web hosting provider (pantheon.io): Nginx (version not known)

The operating system my web server runs on is (include version):
Laptop: certbot is running under macOS 14.6.1
Web hosting provider (pantheon.io): CentOS 8

My hosting provider, if applicable, is:
Web hosting provider: pantheon.io
DNS provider: self-hosted (umich.edu)

I can login to a root shell on my machine (yes or no, or I don't know):
Yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
Laptop: no
Web hosting provider: Yes
DNS hosting: no, ISC BIND 9.11.36 on Red Hat Enterprise Linux 8 with back-ported security patches.

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
2.11.0

Full output of the command:

$ sudo certbot -vvv certonly --manual --preferred-challenges dns -d hrs.isr.umich.edu
Root logging level set at 0
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Requested authenticator manual and installer None
Single candidate plugin: * manual
Description: Manual configuration or run your own shell scripts
Interfaces: Authenticator, Plugin
Entry point: EntryPoint(name='manual', value='certbot._internal.plugins.manual:Authenticator', group='certbot.plugins')
Initialized: <certbot._internal.plugins.manual.Authenticator object at 0x107a5f500>
Prep: True
Selected authenticator <certbot._internal.plugins.manual.Authenticator object at 0x107a5f500> and installer None
Plugins selected: Authenticator manual, Installer None
Picked account: <Account(RegistrationResource(body=Registration(key=None, contact=(), agreement=None, status=None, terms_of_service_agreed=None, only_return_existing=None, external_account_binding=None), uri='https://acme-v02.api.letsencrypt.org/acme/acct/1902976136', new_authzr_uri=None, terms_of_service=None), xxx-redacted-xxx, Meta(creation_dt=datetime.datetime(2024, 8, 21, 13, 15, 21, tzinfo=), creation_host='0587670198.vpn.umich.net', register_to_eff='markmont@umich.edu'))>
Sending GET request to https://acme-v02.api.letsencrypt.org/directory.
Starting new HTTPS connection (1): acme-v02.api.letsencrypt.org:443
https://acme-v02.api.letsencrypt.org:443 "GET /directory HTTP/11" 200 746
Received response:
HTTP 200
Server: nginx
Date: Wed, 21 Aug 2024 13:30:45 GMT
Content-Type: application/json
Content-Length: 746
Connection: keep-alive
Cache-Control: public, max-age=0, no-cache
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
"HZK9Rw334ns": "Adding random entries to the directory",
"keyChange": "https://acme-v02.api.letsencrypt.org/acme/key-change",
"meta": {
"caaIdentities": [
"letsencrypt.org"
],
"termsOfService": "https://letsencrypt.org/documents/LE-SA-v1.4-April-3-2024.pdf",
"website": "https://letsencrypt.org"
},
"newAccount": "https://acme-v02.api.letsencrypt.org/acme/new-acct",
"newNonce": "https://acme-v02.api.letsencrypt.org/acme/new-nonce",
"newOrder": "https://acme-v02.api.letsencrypt.org/acme/new-order",
"renewalInfo": "https://acme-v02.api.letsencrypt.org/draft-ietf-acme-ari-03/renewalInfo",
"revokeCert": "https://acme-v02.api.letsencrypt.org/acme/revoke-cert"
}
Notifying user: Requesting a certificate for hrs.isr.umich.edu
Requesting a certificate for hrs.isr.umich.edu
Requesting fresh nonce
Sending HEAD request to https://acme-v02.api.letsencrypt.org/acme/new-nonce.
https://acme-v02.api.letsencrypt.org:443 "HEAD /acme/new-nonce HTTP/11" 200 0
Received response:
HTTP 200
Server: nginx
Date: Wed, 21 Aug 2024 13:30:45 GMT
Connection: keep-alive
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: lpv3ejQgydimIVWkXJrqMiRblF9xvL-iGIkA6Yawg0bxGFSxdLc
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

Storing nonce: lpv3ejQgydimIVWkXJrqMiRblF9xvL-iGIkA6Yawg0bxGFSxdLc
JWS payload:
b'{\n "identifiers": [\n {\n "type": "dns",\n "value": "hrs.isr.umich.edu"\n }\n ]\n}'
Sending POST request to https://acme-v02.api.letsencrypt.org/acme/new-order:
{
"protected": "eyJhbGciOiAiUlMyNTYiLCAia2lkIjogImh0dHBzOi8vYWNtZS12MDIuYXBpLmxldHNlbmNyeXB0Lm9yZy9hY21lL2FjY3QvMTkwMjk3NjEzNiIsICJub25jZSI6ICJscHYzZWpRZ3lkaW1JVldrWEpycU1pUmJsRjl4dkwtaUdJa0E2WWF3ZzBieEdGU3hkTGMiLCAidXJsIjogImh0dHBzOi8vYWNtZS12MDIuYXBpLmxldHNlbmNyeXB0Lm9yZy9hY21lL25ldy1vcmRlciJ9",
"signature": "QGzXIlHb2L2Rkfnn1NHKg7OILmDHF9vj8FCJUUN9Cfwi7lUwcCHGfbKD5NAgr86syP8cVJIMecK-ZAGGn5kd6XTexFv0uXpNLfoqiPICnUxSCtliis5e7lU0jCcIrPBCFeHv0boWXTuYGg2BYwO2PKZJkhpxW65Jl7tONoy926dztbOIkTUef1U5lqXAmpNGb0WCSyvwNDPzfakn61NgbpxYMreW_1yJ3B-MuwoLOTgPi-BfGsu2XpYxCODa-5Si1HfE4bkgnXUC6_V3CK8Jp4J-S6SD64lY4cNkLrx5q155ouyTStsHivXF5QpO7LN9FqMSLraaW-HHSHrULvIeFg",
"payload": "ewogICJpZGVudGlmaWVycyI6IFsKICAgIHsKICAgICAgInR5cGUiOiAiZG5zIiwKICAgICAgInZhbHVlIjogImhycy5pc3IudW1pY2guZWR1IgogICAgfQogIF0KfQ"
}
https://acme-v02.api.letsencrypt.org:443 "POST /acme/new-order HTTP/11" 201 343
Received response:
HTTP 201
Server: nginx
Date: Wed, 21 Aug 2024 13:30:45 GMT
Content-Type: application/json
Content-Length: 343
Connection: keep-alive
Boulder-Requester: 1902976136
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Location: https://acme-v02.api.letsencrypt.org/acme/order/1902976136/298190416626
Replay-Nonce: Xt09NNgJvB8QE2AqN27-WxZcynMwbSYPyFzmEnmrlD5JL8aSRko
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
"status": "pending",
"expires": "2024-08-28T13:30:45Z",
"identifiers": [
{
"type": "dns",
"value": "hrs.isr.umich.edu"
}
],
"authorizations": [
"https://acme-v02.api.letsencrypt.org/acme/authz-v3/393164121136"
],
"finalize": "https://acme-v02.api.letsencrypt.org/acme/finalize/1902976136/298190416626"
}
Storing nonce: Xt09NNgJvB8QE2AqN27-WxZcynMwbSYPyFzmEnmrlD5JL8aSRko
JWS payload:
b''
Sending POST request to https://acme-v02.api.letsencrypt.org/acme/authz-v3/393164121136:
{
"protected": "eyJhbGciOiAiUlMyNTYiLCAia2lkIjogImh0dHBzOi8vYWNtZS12MDIuYXBpLmxldHNlbmNyeXB0Lm9yZy9hY21lL2FjY3QvMTkwMjk3NjEzNiIsICJub25jZSI6ICJYdDA5Tk5nSnZCOFFFMkFxTjI3LVd4WmN5bk13YlNZUHlGem1Fbm1ybEQ1Skw4YVNSa28iLCAidXJsIjogImh0dHBzOi8vYWNtZS12MDIuYXBpLmxldHNlbmNyeXB0Lm9yZy9hY21lL2F1dGh6LXYzLzM5MzE2NDEyMTEzNiJ9",
"signature": "Q0RQ8ND75V7U7CmoAyN-CuAEWWWQhUxOdiQ7CY72xxtP6U3JefAk0VIRDeF5Nf9jiNJLBfuxwLBtDZDzRNCa5PhvOTeHco9heXRZnP6DUhu1bJnq_bz10Mg-qo9fguc-MOcYG1VAvUY17eoOnMHYQA_7uOO0DRlilZOHmWE9aMyYlYuDwX4ejSn_tjQkpbu-6TruWzNFkr7W6VhxOKPKLjECr-Cps1pdJjg5NQQMn_j-ZWbtD5fyIoZ71vc_fQj2aiu8RTOKzmP4VNRdXPonEi-eyyvLzwUedrFjtaOqAuA-ItLE0FfunYBsCeUxAQlJYnxxv4PxOzoHhwVZHj1hBA",
"payload": ""
}
https://acme-v02.api.letsencrypt.org:443 "POST /acme/authz-v3/393164121136 HTTP/11" 200 801
Received response:
HTTP 200
Server: nginx
Date: Wed, 21 Aug 2024 13:30:45 GMT
Content-Type: application/json
Content-Length: 801
Connection: keep-alive
Boulder-Requester: 1902976136
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: Xt09NNgJZjnSPfsqBpFQm0ltoNRBLu3zs-eevWEdJoULAO-APKs
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
"identifier": {
"type": "dns",
"value": "hrs.isr.umich.edu"
},
"status": "pending",
"expires": "2024-08-28T13:30:45Z",
"challenges": [
{
"type": "http-01",
"url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/393164121136/YX8tUw",
"status": "pending",
"token": "xxx-redacted-xxx"
},
{
"type": "dns-01",
"url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/393164121136/4zSguA",
"status": "pending",
"token": "xxx-redacted-xxx"
},
{
"type": "tls-alpn-01",
"url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/393164121136/cxNGEg",
"status": "pending",
"token": "xxx-redacted-xxx"
}
]
}
Storing nonce: Xt09NNgJZjnSPfsqBpFQm0ltoNRBLu3zs-eevWEdJoULAO-APKs
Performing the following challenges:
dns-01 challenge for hrs.isr.umich.edu
Notifying user: Please deploy a DNS TXT record under the name:

_acme-challenge.hrs.isr.umich.edu.

with the following value:

9xiVFuxWlOXEpr4KB4Jo1ZUwCpRp3tYCj3KubTNJYKk

Before continuing, verify the TXT record has been deployed. Depending on the DNS
provider, this may take some time, from a few seconds to multiple minutes. You can
check if it has finished deploying with aid of online tools, such as the Google
Admin Toolbox: Dig (DNS lookup).
Look for one or more bolded line(s) below the line ';ANSWER'. It should show the
value(s) you've just added.


Please deploy a DNS TXT record under the name:

_acme-challenge.hrs.isr.umich.edu.

with the following value:

9xiVFuxWlOXEpr4KB4Jo1ZUwCpRp3tYCj3KubTNJYKk

Before continuing, verify the TXT record has been deployed. Depending on the DNS
provider, this may take some time, from a few seconds to multiple minutes. You can
check if it has finished deploying with aid of online tools, such as the Google
Admin Toolbox: Dig (DNS lookup).
Look for one or more bolded line(s) below the line ';ANSWER'. It should show the
value(s) you've just added.


Press Enter to Continue
JWS payload:
b'{}'
Sending POST request to https://acme-v02.api.letsencrypt.org/acme/chall-v3/393164121136/4zSguA:
{
"protected": "eyJhbGciOiAiUlMyNTYiLCAia2lkIjogImh0dHBzOi8vYWNtZS12MDIuYXBpLmxldHNlbmNyeXB0Lm9yZy9hY21lL2FjY3QvMTkwMjk3NjEzNiIsICJub25jZSI6ICJYdDA5Tk5nSlpqblNQZnNxQnBGUW0wbHRvTlJCTHUzenMtZWV2V0VkSm9VTEFPLUFQS3MiLCAidXJsIjogImh0dHBzOi8vYWNtZS12MDIuYXBpLmxldHNlbmNyeXB0Lm9yZy9hY21lL2NoYWxsLXYzLzM5MzE2NDEyMTEzNi80elNndUEifQ",
"signature": "xViaUKRFl5yYNfGcs5HI-0hbohtKLbHilL9yVIvJNGyEHnSLWwsnOmcQODTg6kI2IcWuv04S9GLYW-ex6IL_vpris6b4bjHGQiacI0WNS0NwpCfz5yJTrrsoCwkWRLRFlk-QEbmNYtvFPsTlUlw2YB5a3qJEcEAKbkdLNx5YO5uFRGObQnnVWkdI1wbrnf6RuQkyApH2UFhnBQaZQSjyAZH6YSDQhJ2IKuVZK1ANrmrKww34pcSqMNW0V6EDcQqgc5C1FzwJSEb5qaOCdSqYT4QwJANzs1ZorI3aOM3CLJJsoOLGPBgoe5_3igDbijg-_vLpFvtFBJg3nSa9s5zfgw",
"payload": "e30"
}
https://acme-v02.api.letsencrypt.org:443 "POST /acme/chall-v3/393164121136/4zSguA HTTP/11" 200 186
Received response:
HTTP 200
Server: nginx
Date: Wed, 21 Aug 2024 13:30:57 GMT
Content-Type: application/json
Content-Length: 186
Connection: keep-alive
Boulder-Requester: 1902976136
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index", <https://acme-v02.api.letsencrypt.org/acme/authz-v3/393164121136>;rel="up"
Location: https://acme-v02.api.letsencrypt.org/acme/chall-v3/393164121136/4zSguA
Replay-Nonce: lpv3ejQgmM0b-xALmanqyNln93uuh0Kqq67aWc7mu5LxIfbOkcg
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
"type": "dns-01",
"url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/393164121136/4zSguA",
"status": "pending",
"token": "_1Zrjx3DIQbNYTOJrT3mLGpz5WJfxbGfL8YaZhwSmmY"
}
Storing nonce: lpv3ejQgmM0b-xALmanqyNln93uuh0Kqq67aWc7mu5LxIfbOkcg
Waiting for verification...
JWS payload:
b''
Sending POST request to https://acme-v02.api.letsencrypt.org/acme/authz-v3/393164121136:
{
"protected": "eyJhbGciOiAiUlMyNTYiLCAia2lkIjogImh0dHBzOi8vYWNtZS12MDIuYXBpLmxldHNlbmNyeXB0Lm9yZy9hY21lL2FjY3QvMTkwMjk3NjEzNiIsICJub25jZSI6ICJscHYzZWpRZ21NMGIteEFMbWFucXlObG45M3V1aDBLcXE2N2FXYzdtdTVMeElmYk9rY2ciLCAidXJsIjogImh0dHBzOi8vYWNtZS12MDIuYXBpLmxldHNlbmNyeXB0Lm9yZy9hY21lL2F1dGh6LXYzLzM5MzE2NDEyMTEzNiJ9",
"signature": "NTBp5ThoV5vKemnI4yN601vnnQqLXufUJFzxhiGB--eY1jXGkb3QUNLTqT87XZsHa3dWTtAmP3ruoc2mv_9zBzDXLlRnMFBxNDxphwj8omdfP78hlMaPqB_XRhh3CEJbPZqJl0suo26XOIxlx1Jx84GGRGT8JSEln5bO3S-bSkPRn8jdXFzqKizx11ODpHIYci_nQpqJJLWmgJNO7ZwphkA8DUPm-544Z98hJTexyDSMTLbkVMHFua8lfqkEKSqhGy008b6hfUWOxLI5yWVlA9yfSD1YBOr1jjONf2v-o1XeFEOLTkq1OvFRbqiOhJxeSXqtjjdYYST1yOpoD6DgXA",
"payload": ""
}
https://acme-v02.api.letsencrypt.org:443 "POST /acme/authz-v3/393164121136 HTTP/11" 200 657
Received response:
HTTP 200
Server: nginx
Date: Wed, 21 Aug 2024 13:30:58 GMT
Content-Type: application/json
Content-Length: 657
Connection: keep-alive
Boulder-Requester: 1902976136
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: lpv3ejQgj_Vc6gXz3N3CY56hOvIMJjq0lUuYQBE7e6NKD7eEnxE
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800

{
"identifier": {
"type": "dns",
"value": "hrs.isr.umich.edu"
},
"status": "invalid",
"expires": "2024-08-28T13:30:45Z",
"challenges": [
{
"type": "dns-01",
"url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/393164121136/4zSguA",
"status": "invalid",
"validated": "2024-08-21T13:30:57Z",
"error": {
"type": "urn:ietf:params:acme:error:dns",
"detail": "DNS problem: SERVFAIL looking up TXT for _acme-challenge.hrs.isr.umich.edu - the domain's nameservers may be malfunctioning",
"status": 400
},
"token": "xxx-redacted-xxx"
}
]
}
Storing nonce: lpv3ejQgj_Vc6gXz3N3CY56hOvIMJjq0lUuYQBE7e6NKD7eEnxE
Challenge failed for domain hrs.isr.umich.edu
dns-01 challenge for hrs.isr.umich.edu
Notifying user:
Certbot failed to authenticate some domains (authenticator: manual). The Certificate Authority reported these problems:
Domain: hrs.isr.umich.edu
Type: dns
Detail: DNS problem: SERVFAIL looking up TXT for _acme-challenge.hrs.isr.umich.edu - the domain's nameservers may be malfunctioning

Hint: The Certificate Authority failed to verify the manually created DNS TXT records. Ensure that you created these in the correct location, or try waiting longer for DNS propagation on the next attempt.

Certbot failed to authenticate some domains (authenticator: manual). The Certificate Authority reported these problems:
Domain: hrs.isr.umich.edu
Type: dns
Detail: DNS problem: SERVFAIL looking up TXT for _acme-challenge.hrs.isr.umich.edu - the domain's nameservers may be malfunctioning

Hint: The Certificate Authority failed to verify the manually created DNS TXT records. Ensure that you created these in the correct location, or try waiting longer for DNS propagation on the next attempt.

Encountered exception:
Traceback (most recent call last):
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/auth_handler.py", line 108, in handle_authorizations
self._poll_authorizations(authzrs, max_retries, max_time_mins, best_effort)
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/auth_handler.py", line 212, in _poll_authorizations
raise errors.AuthorizationError('Some challenges have failed.')
certbot.errors.AuthorizationError: Some challenges have failed.

Calling registered functions
Cleaning up challenges
Exiting abnormally:
Traceback (most recent call last):
File "/opt/homebrew/bin/certbot", line 8, in <module>
sys.exit(main())
^^^^^^
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/main.py", line 19, in main
return internal_main.main(cli_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/main.py", line 1894, in main
return config.func(config, plugins)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/main.py", line 1600, in certonly
lineage = _get_and_save_cert(le_client, config, domains, certname, lineage)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/main.py", line 143, in _get_and_save_cert
lineage = le_client.obtain_and_enroll_certificate(domains, certname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/client.py", line 517, in obtain_and_enroll_certificate
cert, chain, key, _ = self.obtain_certificate(domains)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/client.py", line 428, in obtain_certificate
orderr = self._get_order_and_authorizations(csr.data, self.config.allow_subset_of_names)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/client.py", line 496, in _get_order_and_authorizations
authzr = self.auth_handler.handle_authorizations(orderr, self.config, best_effort)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/auth_handler.py", line 108, in handle_authorizations
self._poll_authorizations(authzrs, max_retries, max_time_mins, best_effort)
File "/opt/homebrew/Cellar/certbot/2.11.0_1/libexec/lib/python3.12/site-packages/certbot/_internal/auth_handler.py", line 212, in _poll_authorizations
raise errors.AuthorizationError('Some challenges have failed.')
certbot.errors.AuthorizationError: Some challenges have failed.
Some challenges have failed.
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.
[0587670198 certs]$


Did you make changes at approximately 18:00 UTC? Let's Debug used to be able to reproduce the problem, but no longer can. Let's Encrypt staging also appears to be able to see your records now. I took some debug logs earlier, but I'm wondering whether you have already resolved this. The serial number of your main zone has changed (my initial debugging showed a serial of 3000131124, and now it's 3000131127), so something has likely changed.

Your DNS setup looks very complicated. I've looked at it a bit and also taken a packet capture from Let's Debug's site. There appear to be various sub-zones involved, some DNSSEC-signed and some unsigned. Some zones apparently also share DNS servers. I have seen some inconsistencies in your setup (like the SOA MNAME not showing up in the NS set for isr.umich.edu, or nameservers that answer authoritatively for isr.umich.edu despite not being listed in the NS set), but I'm not sure if any of these was an actual problem.
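
The two kinds of inconsistency mentioned above can be checked mechanically once the NS set, the SOA MNAME, and the list of servers that answer authoritatively have been collected (for example with dig). This is only a sketch: the function name `zone_inconsistencies` and the example.edu host names are made up, not taken from the real zone, and a finding is not necessarily an error (a hidden primary legitimately has an MNAME outside the NS set).

```python
def _norm(name):
    """Lower-case a host name and drop the trailing root dot for comparison."""
    return name.lower().rstrip(".")


def zone_inconsistencies(ns_set, soa_mname, authoritative_responders):
    """List mismatches between the NS set, the SOA MNAME, and observed responders."""
    ns = {_norm(n) for n in ns_set}
    findings = []
    mname = _norm(soa_mname)
    if mname not in ns:
        findings.append(f"SOA MNAME {mname} is not in the NS set")
    # Servers that set the AA bit for the zone but are not delegated to.
    extra = {_norm(s) for s in authoritative_responders} - ns
    for server in sorted(extra):
        findings.append(f"{server} answers authoritatively but is not in the NS set")
    return findings


# Hypothetical data, for illustration only.
FINDINGS = zone_inconsistencies(
    ns_set=["ns1.example.edu.", "ns2.example.edu."],
    soa_mname="hidden-primary.example.edu.",
    authoritative_responders=["ns1.example.edu", "NS3.example.edu."],
)
for line in FINDINGS:
    print(line)
```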

If the error recurs, I've enabled additional logging on Let's Debug that may be able to diagnose this further.


Thank you, @Nummer378 ! Yes, we were eventually able to reproduce the problem via unboundtest.com, which led us to a solution: We had a very broad set of nameservers that took a lot of lookups to fully resolve. Unbound was stopping before fully resolving our nameserver addresses -- there were no SERVFAIL or NXDOMAIN responses returned from any nameserver. We reworked our DNS to require fewer lookups to fully resolve our name servers, and that seems to have fixed the problem. The change took several hours to make and propagate.
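
To illustrate why a broad nameserver set can fail without any server ever returning an error, here is a toy model. It is not Unbound's actual algorithm: the budget of 16, the dependency maps, and all host names are assumptions for illustration. Each name lists the other names a resolver would have to look up first (for example, nameserver host names without glue), and resolution is abandoned when the total exceeds the budget.

```python
def lookups_needed(target, depends_on):
    """Count the distinct names that must be looked up to answer for `target`."""
    pending, seen = [target], set()
    while pending:
        name = pending.pop()
        if name in seen:
            continue  # already looked up once; a resolver would use its cache
        seen.add(name)
        pending.extend(depends_on.get(name, []))
    return len(seen)


def resolves_within(target, depends_on, budget):
    """True if the resolver finishes before running out of its lookup budget."""
    return lookups_needed(target, depends_on) <= budget


TARGET = "_acme-challenge.host.example.edu"
BUDGET = 16  # hypothetical limit, not a real Unbound setting

# Broad setup: eight nameservers, each needing two more names resolved first.
BROAD = {TARGET: [f"ns{i}.dept.example.edu" for i in range(1, 9)]}
for i in range(1, 9):
    BROAD[f"ns{i}.dept.example.edu"] = [
        f"ns{i}a.other.example.net",
        f"ns{i}b.other.example.net",
    ]

# Reworked setup: three nameservers whose addresses need no further lookups.
FLAT = {TARGET: ["ns1.example.edu", "ns2.example.edu", "ns3.example.edu"]}

for label, graph in (("broad", BROAD), ("flat", FLAT)):
    print(label, lookups_needed(TARGET, graph), resolves_within(TARGET, graph, BUDGET))
```

In this model the broad setup runs out of budget even though every individual query would have succeeded, which matches what we saw: a SERVFAIL from the resolver with no SERVFAIL or NXDOMAIN from any authoritative server.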

Thanks again for your response and your help. And, yes, our DNS is very complicated and has a lot of variation within it.
