DNS challenge almost always times out

My domain is: hub.malvager.net + irc.malvager.net

I ran this command:
certbot-auto certonly --manual --manual-public-ip-logging-ok --rsa-key-size 4096 -n --expand --keep --agree-tos --reuse-key --preferred-challenges=dns --manual-auth-hook /root/LE/auth.sh -d "hub.malvager.net,irc.malvager.net"

It produced this output:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator manual, Installer None
Cert is due for renewal, auto-renewing...
Renewing an existing certificate
Reusing existing private key from /etc/letsencrypt/live/hub.malvager.net/privkey.pem.
Performing the following challenges:
dns-01 challenge for irc.malvager.net
dns-01 challenge for hub.malvager.net
Running manual-auth-hook command: /root/LE/auth.sh
Output from manual-auth-hook command auth.sh:
Start: Mon Aug 5 03:00:04 CEST 2019
End: Mon Aug 5 03:10:45 CEST 2019

Running manual-auth-hook command: /root/LE/auth.sh
Output from manual-auth-hook command auth.sh:
Start: Mon Aug 5 03:10:45 CEST 2019
End: Mon Aug 5 03:20:01 CEST 2019

Waiting for verification...
Cleaning up challenges
An unexpected error occurred:
ReadTimeout: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Read timed out. (read timeout=45)
Please see the logfiles in /var/log/letsencrypt for more details.

My web server is (include version): N/A

The operating system of my server is (include version): Debian 9 (Stretch)

My hosting provider is: N/A (home-hosted VM)

I can login to a root shell on my machine: yes

I'm using a control panel to manage my site: no

The version of my client is: certbot 0.36.0

Additional information:
The server on which certbot runs is also my primary DNS server (based on bind9). All zones are propagated to my domain registrar's nameservers so they can be used as secondary/tertiary, in case of outages. As such I need to make sure all authoritative nameservers have the TXT record before LetsEncrypt can verify the challenges, which is why there's about 10 minutes between the start and end times. auth.sh is a simple Bash script to add the record and check every server involved using a for $srv + while dig loop, so it only proceeds with the next server if the record is found.

I actually managed to get all certificates once but that was already a couple months ago. It also succeeded only after 10+ timeout errors, so I guess I just got lucky there.

I read a bunch of topics on here about Akamai dropping packets over a certain size. But if that's the case then it has been that way for almost 2 years already, in which case it's high time for LetsEncrypt/Akamai to reliably fix the issue.

Also, I have other VMs in the same network that use HTTP challenges and those certs get issued just fine.

Hi @Sahbi

checking your first domain there is no A-record. Ok, it's dns validation, so no A-record is required.

But checking your main domain there is a curious answer ( https://check-your-website.server-daten.de/?q=malvager.net ):

The standard urls are ok. But https://www.malvager.net/.well-known/acme-challenge/check-your-website-dot-server-daten-dot-de has something that starts with:

Visible Content: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> Oh god what have you done BANNED ,sARRAs, ,ARRRRRs, cARRRRRRRT, , ,CTARRRRRRRA ,,,, ,saCARRRRRRRAc ,, ,caas,, ,cscaTRRRRRRRRAsc,,,,, ,caCTRRRRRAa ,ccccaTRRRRRRRRRTc, ,,,,,,,, cRRRRRRRRRAC ,cccccaTARRRRRac,,, ,,,,,, ,,, ,,,, ,CRRRRRRRRRRRRATTT ,ascccsaTRRRac, ,, , , ,, ,,,ARRRRRRRRRRRRRTaac,, ,CaccccaTRRa, ,,,,,,, ,,,,cRRRRRRRRRRRRATacc, cRAccccssc,,,,,,,,,,,,,,,,, ,, ,,,,,,,cARRRRRRRACsc,cc, cRRsccccc,,,,,,,,,,,,,,,,,,,, ,,,,,,,,sARRRRATCsc,,cac ,caCCscc,,ccccc,,,,,,,,,,,,,,,,,,,,,,,,,,cTATCsc,,,,cc, ,CTscc,cccccc,,,,,,,,,,,,,,,,,,,,,,,,,cccccc,, cs, ,aCcc,ccsccc,c,,,,,,,,,,,,,,,,,,,,cccccccc,,, ,sTs ,cscccaRRC,sAAac,,,,,,,,,,,,,ccccccc,ccccccccc,,cac, caacccc,,,sRRRRRRAac,,,,,,,,,cARRRRRRAscccc,cc,cccaac, sATaccc,,,cRRRRRRTsc,,,,,,,,,cRRRRRRRRTaccc,,,,cccCC, cCCsccccc, ,TRRCc, , ,,cCRRTTRRRRRc,ccccc,,ccc, ,caCasscccccc,,casc, ,,cTRRRRRRRAc,ccc,cccccc ,cCATTCCasscccscc, cTAc ,,sARCssc,,,,,c,,,ccccc, cCTATTTTCCaaasccc,,,,,,,,cRRs ,,,, ,cccccccc,,,,ccccsa, ,sATCTAATTCCCaascc,,,,,,,cc,,,,,,,,,,caCCascccc,,,,c,cccc, aRAAATTTTCCCCCsscccc,cccccc,c,,,,,,,,ccccc,,ccccc,,,,,cccc, cCCaCTTTTCTTTCaaasccccccccccccccc,,,,,,cccccccccc,,,,,,,ccc cCTATATATATTCassccssccccccccccccccsCTTTCsccccccccccccccc, ,TAAARRATTTCCaasaassccccccccccccccccccccccsscccccccc, cccsARRAATTTCaaaaaaasaassssssssscccccccccccccccccc, ,sCTAAAATTTTCCTTTTTTTTTTaTTTTTCaasssssssc,,,, caARRRAAATTTTTTTTTTTTATATTRRTCaaCCaaCCs, ,CCscccccsaaaTAATaaCaaaTTTTAARRRRRAAAAAa, ,TACscccccccccaaaasaaaaaCaaac, ,CACscccccccccccsssssssssascc ,aCTCsccccc,cccccccccccccccccccsc, ,,,, ,csTTaccc,,,,cccccccccccc,cccccccc,,, ,,, ,,,,,c,,,,cTTCccc,,,,,,,,,,,,,cccccccccccc,cc, caCCscsssccccccCCscc,,,,,,,,,,,,,,,,,cccaCCc,,,, ,cTRRTCTACsssscsCCscc,,,,,,,,,,,,,,,,,ccaTRRs,,, ,TRRRRRRRRATAATTTCcc,,, ,,,, ,,,,,,,,,,ccsscc,,, ,sCaCCTARAsssCTACacc,,,,,,,,,, , ,,,cccccccc,,, ,CTacc,,,,,, , ,,,, ,caaasccc,,, cAACscc,,,,,,,,,,,,,,, cCCCsccccc, ,AACscc,,,,,,,,,,,,,, ,cATassccccc cTAascc,,,,,,,,,,,,,, ,cTACCaascc ,TTascc,,,,,,,,,,,,,,, ,cACscccc, ,CTascccccc,,,,,,,,,,,,,sTac,c, ,aCaccccc,,,,,,,,,,,,,,,,,,,ccc ,caCssccccc,,,,,,,,,,,,,,,,,,cc, cAAascccccc,,,,,, ,,, ,,,,,,,c, cTCsc,,,,cc,,,,,,,,,,,,,,,,,cc, ,cCCaasCTCsc,,,,,cc,ccccccc,,,,,,,,,cc,, ,ARRRAAACasc,,,,,,ccsCTTCascc,,,,,,,cccccssscscsscaCc, 

Oh god what have you done BANNED ,sARRAs, ,ARRRRRs, cARRRRRRRT,

It's a dancing "Banned" animation.

Is this your script? If it's your hoster, perhaps dns queries or outgoing queries are blocked too.

1 Like

That “banned” page is just my default error page, which is shown for any domain that points to the server but doesn’t have an actual matching in Apache. However, I won’t be using these certs with Apache or HTTP in general, and the first domain not having an A record is actually done deliberately. www.malvager.net also has a different IP as hub.malvager.net, and I don’t even want www. on this particular cert.

What is that URL you’re trying to call though? Is it something that is checked regardless of me using DNS validation?

@lestaff, this looks like a connectivity problem to Akamai—could someone help @Sahbi look into it?

1 Like

@Sahbi this isn’t the DNS challenge timing out, it’s your subsequent HTTPS request to Let’s Encrypt that says to validate the challenge. I’ve seen similar behavior in Certbot before, where waiting a long time for DNS to propagate means that Certbot has a kept-alive connection, but that connection is considered dead by some firewall or NAT appliance in between Certbot and Let’s Encrypt. The result is that the next request from Certbot times out. Ideally the requests module should automatically retry these errors but I don’t think it does. That might be because it’s impossible for requests to know whether the server received the request or not.

Do you have any firewalls or NAT devices between your machine and Let’s Encrypt?

3 Likes

Yeah I'm running an OPNSense firewall, but I set it up to be fairly permissive. I even tried "conservative firewall optimisation": Tries to avoid dropping any legitimate idle connections at the expense of increased memory usage and CPU utilization.

I also ran a netcat server/client test to see if the connection indeed gets killed "too soon". Obviously the VM running certbot is acting as the client, and the server is a VPS hosted with OVH. It seems the connection only gets closed after 30+ minutes, but afaik netcat doesn't even do TCP keepalive so it may not be entirely accurate:

echo "Start: $(date)" ; netcat vps 12321 ; echo "End: $(date)"
Start: Tue Aug 6 00:48:07 CEST 2019
foo
End: Tue Aug 6 01:20:31 CEST 2019
The "foo" was sent by the client to test if it actually has a working connection, the message is indeed properly received by the server.

Is there perhaps some way to either:

  1. Pass a flag such as --preferred-dns-server to tell certbot which (authoritative) nameserver to try first?
  2. Prevent the keep-alive business and only set up the HTTPS connection after all DNS stuff is finished?
1 Like

Certbot isn't the one talking to your authoritative DNS servers.

Certbot connects to the Let's Encrypt API server (https://acme-v02.api.letsencrypt.org), and asks it to perform the DNS validation.

Your timeout is on that HTTPS socket, but it's not clear, from the excerpt you posted, which part of the issuance process resulted in the timeout.

Could you post the full stack trace related to this timeout? You can find it in the log file. If it is on the "finalize" operation (wherein the CSR is HTTP POST'ed to the Let's Encrypt servers), then it sounds like it's probably due to:

I don't disagree with any of that but you may as well try confirm it.

I believe that the "packet size" networking issue still exists separate to @jsha's theory about NAT idleness. My reasoning is that it can often (at least, when these timeouts crop up) be reproduced with a large (~50KB) post to the Akamai servers on a completely fresh socket outside of any NAT environment.

If the timeout is on a POST to an authz URL rather than to the finalization/order URL, then the NAT idleness theory sounds like it is very likely, because it would be the first request after your authentication hook has run.

1 Like

It seems it’s neither of those, the exception is thrown after an attempted POST to acme/challenge. I’ll just put the entire debug log here, in case you’d need more info than just the stack trace:

2019-08-06 03:00:02,766:DEBUG:certbot.main:certbot version: 0.36.0
2019-08-06 03:00:02,766:DEBUG:certbot.main:Arguments: ['--manual', '--manual-public-ip-logging-ok', '--rsa-key-size', '4096', '-n', '--expand', '--keep', '--agree-tos', '--reuse-key', '--preferred-challenges=dns', '--manual-auth-hook', '/root/LE/auth.sh', '-d', 'hub.malvager.net,irc.malvager.net']
2019-08-06 03:00:02,766:DEBUG:certbot.main:Discovered plugins: PluginsRegistry(PluginEntryPoint#apache,PluginEntryPoint#manual,PluginEntryPoint#nginx,PluginEntryPoint#null,PluginEntryPoint#standalone,PluginEntryPoint#webroot)
2019-08-06 03:00:02,777:DEBUG:certbot.log:Root logging level set at 20
2019-08-06 03:00:02,777:INFO:certbot.log:Saving debug log to /var/log/letsencrypt/letsencrypt.log
2019-08-06 03:00:02,778:DEBUG:certbot.plugins.selection:Requested authenticator manual and installer None
2019-08-06 03:00:02,782:DEBUG:certbot.plugins.selection:Single candidate plugin: * manual
Description: Manual configuration or run your own shell scripts
Interfaces: IAuthenticator, IPlugin
Entry point: manual = certbot.plugins.manual:Authenticator
Initialized: <certbot.plugins.manual.Authenticator object at 0xb6004dec>
Prep: True
2019-08-06 03:00:02,782:DEBUG:certbot.plugins.selection:Selected authenticator <certbot.plugins.manual.Authenticator object at 0xb6004dec> and installer None
2019-08-06 03:00:02,782:INFO:certbot.plugins.selection:Plugins selected: Authenticator manual, Installer None
2019-08-06 03:00:02,791:DEBUG:certbot.main:Picked account: <Account(RegistrationResource(body=Registration(status=None, terms_of_service_agreed=None, agreement=None, only_return_existing=None, contact=(), key=None, external_account_binding=None), uri=u'https://acme-v02.api.letsencrypt.org/acme/acct/58346500', new_authzr_uri=None, terms_of_service=None), e2e677eec8a030ca8645028d321e9a35, Meta(creation_host=u'tshb.home.lan', creation_dt=datetime.datetime(2019, 6, 1, 22, 50, 33, tzinfo=<UTC>)))>
2019-08-06 03:00:02,791:DEBUG:acme.client:Sending GET request to https://acme-v02.api.letsencrypt.org/directory.
2019-08-06 03:00:02,792:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): acme-v02.api.letsencrypt.org:443
2019-08-06 03:00:03,026:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "GET /directory HTTP/1.1" 200 658
2019-08-06 03:00:03,027:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Content-Type: application/json
Content-Length: 658
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800
Expires: Tue, 06 Aug 2019 01:00:03 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Tue, 06 Aug 2019 01:00:03 GMT
Connection: keep-alive

{
  "I4MOEsDW6p0": "https://community.letsencrypt.org/t/adding-random-entries-to-the-directory/33417",
  "keyChange": "https://acme-v02.api.letsencrypt.org/acme/key-change",
  "meta": {
    "caaIdentities": [
      "letsencrypt.org"
    ],
    "termsOfService": "https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf",
    "website": "https://letsencrypt.org"
  },
  "newAccount": "https://acme-v02.api.letsencrypt.org/acme/new-acct",
  "newNonce": "https://acme-v02.api.letsencrypt.org/acme/new-nonce",
  "newOrder": "https://acme-v02.api.letsencrypt.org/acme/new-order",
  "revokeCert": "https://acme-v02.api.letsencrypt.org/acme/revoke-cert"
}
2019-08-06 03:00:03,036:DEBUG:certbot.storage:Should renew, less than 30 days before certificate expiry 2019-08-31 02:11:46 UTC.
2019-08-06 03:00:03,036:INFO:certbot.renewal:Cert is due for renewal, auto-renewing...
2019-08-06 03:00:03,036:INFO:certbot.main:Renewing an existing certificate
2019-08-06 03:00:03,036:INFO:certbot.client:Reusing existing private key from /etc/letsencrypt/live/hub.malvager.net/privkey.pem.
2019-08-06 03:00:03,076:DEBUG:certbot.crypto_util:Creating CSR: /etc/letsencrypt/csr/0017_csr-certbot.pem
2019-08-06 03:00:03,077:DEBUG:acme.client:Requesting fresh nonce
2019-08-06 03:00:03,077:DEBUG:acme.client:Sending HEAD request to https://acme-v02.api.letsencrypt.org/acme/new-nonce.
2019-08-06 03:00:03,260:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "HEAD /acme/new-nonce HTTP/1.1" 200 0
2019-08-06 03:00:03,260:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: a-lHzE9nT4QCg16JQzz1ZDi4fxG9_CDqIDoD5MZptn0
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800
Content-Length: 0
Expires: Tue, 06 Aug 2019 01:00:03 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Tue, 06 Aug 2019 01:00:03 GMT
Connection: keep-alive


2019-08-06 03:00:03,260:DEBUG:acme.client:Storing nonce: a-lHzE9nT4QCg16JQzz1ZDi4fxG9_CDqIDoD5MZptn0
2019-08-06 03:00:03,261:DEBUG:acme.client:JWS payload:
{
  "identifiers": [
    {
      "type": "dns", 
      "value": "hub.malvager.net"
    }, 
    {
      "type": "dns", 
      "value": "irc.malvager.net"
    }
  ]
}
2019-08-06 03:00:03,298:DEBUG:acme.client:Sending POST request to https://acme-v02.api.letsencrypt.org/acme/new-order:
{
  "protected": "eyJub25jZSI6ICJhLWxIekU5blQ0UUNnMTZKUXp6MVpEaTRmeEc5X0NEcUlEb0Q1TVpwdG4wIiwgInVybCI6ICJodHRwczovL2FjbWUtdjAyLmFwaS5sZXRzZW5jcnlwdC5vcmcvYWNtZS9uZXctb3JkZXIiLCAia2lkIjogImh0dHBzOi8vYWNtZS12MDIuYXBpLmxldHNlbmNyeXB0Lm9yZy9hY21lL2FjY3QvNTgzNDY1MDAiLCAiYWxnIjogIlJTMjU2In0", 
  "payload": "ewogICJpZGVudGlmaWVycyI6IFsKICAgIHsKICAgICAgInR5cGUiOiAiZG5zIiwgCiAgICAgICJ2YWx1ZSI6ICJodWIubWFsdmFnZXIubmV0IgogICAgfSwgCiAgICB7CiAgICAgICJ0eXBlIjogImRucyIsIAogICAgICAidmFsdWUiOiAiaXJjLm1hbHZhZ2VyLm5ldCIKICAgIH0KICBdCn0", 
  "signature": "e13qM5sTooySx8YVF10M7jIT1hb6AehsaR4k3Hqnsvg96QMtI0zTBOp2h0ZWD7N8BvrbzF65wPqMdyDzb2ZoofB4U_goAvVRTaZ66_XMcIY9xI1zy96386wHHxP53eSYaid2mdImPCaLZjo7Ff4GUJ3g9OrEnRcB6iRUiHyyGy17u0J8CwIUpDQxR1i7vwJN2A0Yi-7cN8qfrWBtcp_vTmU5j-2-87ePrR6FZfB27KTJrB0Dr18UabDIXrSVFDWRpm0a0K5wy-exLevNSLbX-GLt_id8DyiOVLOB5XgzVksP9-BczMET_46gRbSP5YK6zs3HyE4YP6QpzN6wRvveU0E9O0EiHWaDl6vlqclzgwxlcxrVaC_OHvvsTh0GWFvO0P_H9U04TE5ikAhtUULjrUy-uLw8W05j2v3pN_oLaw5xRL09qwre4uCehqxZeRJ2hG5fDhmasegQzVvLrOae9iDkORqurMY2xOb08OzVSyHMOhW7yB3P1bIJtiTJ54h4aeVFSUQ_BqTkH7kA03ch5y9hDw1P9mDZ5uzy297NyhH2iB-pOWDfMKT5J4gIlJeSeRh_fPP_GoScMiOVe8urSuAXTtK-ar37mTO6nAN7cYsnEOLGH2YjoS3x5-euRX0Xg0hckfXIdvHIK6iqlcL-MGJI8Zst0_b2uAFI1d9TPN8"
}
2019-08-06 03:00:03,600:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "POST /acme/new-order HTTP/1.1" 201 532
2019-08-06 03:00:03,602:DEBUG:acme.client:Received response:
HTTP 201
Server: nginx
Content-Type: application/json
Content-Length: 532
Boulder-Requester: 58346500
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Location: https://acme-v02.api.letsencrypt.org/acme/order/58346500/821463968
Replay-Nonce: eop-95AjfBK4Sg6eSF8vJu22ENO04Uwsxwq5y1ZXEFU
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800
Expires: Tue, 06 Aug 2019 01:00:03 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Tue, 06 Aug 2019 01:00:03 GMT
Connection: keep-alive

{
  "status": "pending",
  "expires": "2019-08-09T01:00:07Z",
  "identifiers": [
    {
      "type": "dns",
      "value": "hub.malvager.net"
    },
    {
      "type": "dns",
      "value": "irc.malvager.net"
    }
  ],
  "authorizations": [
    "https://acme-v02.api.letsencrypt.org/acme/authz/hkFhfovc4rmFp2yIEKFQ_y4H8J7a8gj16CFOHNLz-uU",
    "https://acme-v02.api.letsencrypt.org/acme/authz/ZPGotBSSmuwIUSPRDBFdqT7ZA4Z2pZftbMhgix2fQzs"
  ],
  "finalize": "https://acme-v02.api.letsencrypt.org/acme/finalize/58346500/821463968"
}
2019-08-06 03:00:03,602:DEBUG:acme.client:Storing nonce: eop-95AjfBK4Sg6eSF8vJu22ENO04Uwsxwq5y1ZXEFU
2019-08-06 03:00:03,603:DEBUG:acme.client:JWS payload:

2019-08-06 03:00:03,640:DEBUG:acme.client:Sending POST request to https://acme-v02.api.letsencrypt.org/acme/authz/hkFhfovc4rmFp2yIEKFQ_y4H8J7a8gj16CFOHNLz-uU:
{
  "protected": "eyJub25jZSI6ICJlb3AtOTVBamZCSzRTZzZlU0Y4dkp1MjJFTk8wNFV3c3h3cTV5MVpYRUZVIiwgInVybCI6ICJodHRwczovL2FjbWUtdjAyLmFwaS5sZXRzZW5jcnlwdC5vcmcvYWNtZS9hdXRoei9oa0ZoZm92YzRybUZwMnlJRUtGUV95NEg4SjdhOGdqMTZDRk9ITkx6LXVVIiwgImtpZCI6ICJodHRwczovL2FjbWUtdjAyLmFwaS5sZXRzZW5jcnlwdC5vcmcvYWNtZS9hY2N0LzU4MzQ2NTAwIiwgImFsZyI6ICJSUzI1NiJ9", 
  "payload": "", 
  "signature": "L7Il4d9YwGSXGU552_lmJBWXT_mDSl6zUCR_BtJ1EPrsYyWvdi8Z1ULKlHELYgtf0E6LY-Wk7VV_yV6Sggr_7JYKJGGQ4GXlDyCM2T9tMU6lrwZ-BPoaRCghBwaY1m_IJkE-GYM0W9qTZr_9o7h6Rwl76pVsgy7KYjPFjWhKNf9HZJb7bpyCXfHL_r9cM_oBSSo1O1CfPc8c2qwvW-XPzH-GFYMLOoqSayFKDnp48WOLOdh5MrqdZC_QI8teBV2IQuPpW_OOng20g0vWnX6gKyaw8GgzY2rpSmMxu4fNPdAd6GzawrLi75WrBztP2jLr8zROGwto_vc5dKmylvo3XWFd_5gDi6ku50pWkMRK2G5cKG499lXh3heNWrWiAQr5xNQdu-KqgBZLIg0DsYmJOr5dPWb4Fv4rDSZucHPZTDocy18fQS3MnutCe37i1zcKxBa3qNFCPVnIe_PCoJMG-R7IU5F-sKVnuo-G3-BvTC5A_G-ZUT1iELI4Y0Y-zu4-YSpIzGnba2UDEPfCAyqESkbDB1F0HxQ2ByI4qvMXyJkXsRwoM9izGaOkLLsB_a556r8wzCQVY5s1yC0uq0erqRrq5GOY8OEZQnKwXktMxj4-BHFbiV0ui7o97pMSjTOrvo9WxaIJoe28yoFGKiTR9TcOK_uzKGNLDIMRL724OvE"
}
2019-08-06 03:00:03,868:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "POST /acme/authz/hkFhfovc4rmFp2yIEKFQ_y4H8J7a8gj16CFOHNLz-uU HTTP/1.1" 200 911
2019-08-06 03:00:03,870:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Content-Type: application/json
Content-Length: 911
Boulder-Requester: 58346500
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: QRu77_MVhIacrbTHQkbBQ6Pcz7Hg70Ljk2CU2JMncFc
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800
Expires: Tue, 06 Aug 2019 01:00:03 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Tue, 06 Aug 2019 01:00:03 GMT
Connection: keep-alive

{
  "identifier": {
    "type": "dns",
    "value": "irc.malvager.net"
  },
  "status": "pending",
  "expires": "2019-08-09T01:00:07Z",
  "challenges": [
    {
      "type": "http-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/challenge/hkFhfovc4rmFp2yIEKFQ_y4H8J7a8gj16CFOHNLz-uU/19032093625",
      "token": "C1m8d_Y1HP2qXWhiFbP99n8BXjhi_XVLRryJ8KQs3oc"
    },
    {
      "type": "tls-alpn-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/challenge/hkFhfovc4rmFp2yIEKFQ_y4H8J7a8gj16CFOHNLz-uU/19032093658",
      "token": "6KP5eMmKu2SGqAUuPVSPzEY-LjX3nLtajklVGgk3pq4"
    },
    {
      "type": "dns-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/challenge/hkFhfovc4rmFp2yIEKFQ_y4H8J7a8gj16CFOHNLz-uU/19032093671",
      "token": "AQNifDYPDcRNMdmB_f6awoXq5nCYgXTmEoQkyBR49Zk"
    }
  ]
}
2019-08-06 03:00:03,871:DEBUG:acme.client:Storing nonce: QRu77_MVhIacrbTHQkbBQ6Pcz7Hg70Ljk2CU2JMncFc
2019-08-06 03:00:03,872:DEBUG:acme.client:JWS payload:

2019-08-06 03:00:03,939:DEBUG:acme.client:Sending POST request to https://acme-v02.api.letsencrypt.org/acme/authz/ZPGotBSSmuwIUSPRDBFdqT7ZA4Z2pZftbMhgix2fQzs:
{
  "protected": "eyJub25jZSI6ICJRUnU3N19NVmhJYWNyYlRIUWtiQlE2UGN6N0hnNzBMamsyQ1UySk1uY0ZjIiwgInVybCI6ICJodHRwczovL2FjbWUtdjAyLmFwaS5sZXRzZW5jcnlwdC5vcmcvYWNtZS9hdXRoei9aUEdvdEJTU211d0lVU1BSREJGZHFUN1pBNFoycFpmdGJNaGdpeDJmUXpzIiwgImtpZCI6ICJodHRwczovL2FjbWUtdjAyLmFwaS5sZXRzZW5jcnlwdC5vcmcvYWNtZS9hY2N0LzU4MzQ2NTAwIiwgImFsZyI6ICJSUzI1NiJ9", 
  "payload": "", 
  "signature": "hlxNbrb8G578P6Kkc1hmryWGpwwHjbS89ofXTTRdP17uDLcTGt92KcSfLJT3Nj4S9Tcgl8FY7qO1v_DYGXOZnyO-mmty7YfTAT9Bx501ik6lAx1Vb6f7B3bIltipmh68lJBLVz7rm2LIPZxyVGp8HvzLuKNPKVh-ffwxi9jao74CXNAr2-BaakjLgplXGqnUb-IroaBwLyblayjl3k4xExV4riWxabqL20x86foOcelF4wpG2RXV2FqEnZ0MiiJ1MHD8mBawVZS4V1vdWoNAHEgik8pr0D-Fp6kPPmXK2Ghw3LhUNRr3wpoEHfUMYYGtoV7vfJLh1yUeo9twyC1QMQLpK1sodkKYjrFnSfFtoTeiCAkWvdm2rA-TU6J5jg7LBU9Z_MPLkNHCBq2dz1F3aKx4pudSpnGEmv8PeocShig3kWJDiNLHPzYyBc33tVldv-1F_EamovRWW-EROBtmIL6W9Jc5ztvQRApQ7qSqB5fG_3Ph5gBbNnNgcZH2ZxSwEsHp5lk9FFbaZu-8gAjTE-TQJUsQK46dAzBgJzUeyPqGSc3gosJH0Ybs30M5DlZKZFCcWDDvzjQs8zGPlcFOyeqBv_Svy8h-dQIITL_7tm4gRhA7DyUEyXE1Vccy7DwweRO7pQbKdYm6kCpcLSJ5Um9WoWLQbNzWVQaUrYDUR0Q"
}
2019-08-06 03:00:04,281:DEBUG:urllib3.connectionpool:https://acme-v02.api.letsencrypt.org:443 "POST /acme/authz/ZPGotBSSmuwIUSPRDBFdqT7ZA4Z2pZftbMhgix2fQzs HTTP/1.1" 200 911
2019-08-06 03:00:04,283:DEBUG:acme.client:Received response:
HTTP 200
Server: nginx
Content-Type: application/json
Content-Length: 911
Boulder-Requester: 58346500
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: iA3plHsUWwHiNRs5vHv2EOK8DfpZIfteScSaCq2tk7A
X-Frame-Options: DENY
Strict-Transport-Security: max-age=604800
Expires: Tue, 06 Aug 2019 01:00:04 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Tue, 06 Aug 2019 01:00:04 GMT
Connection: keep-alive

{
  "identifier": {
    "type": "dns",
    "value": "hub.malvager.net"
  },
  "status": "pending",
  "expires": "2019-08-09T01:00:07Z",
  "challenges": [
    {
      "type": "tls-alpn-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/challenge/ZPGotBSSmuwIUSPRDBFdqT7ZA4Z2pZftbMhgix2fQzs/19032093423",
      "token": "n1W6MOqjo6yK_-M4xxCj2V3XC2ObgSpvy6szgnRd37s"
    },
    {
      "type": "http-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/challenge/ZPGotBSSmuwIUSPRDBFdqT7ZA4Z2pZftbMhgix2fQzs/19032093429",
      "token": "sMWT14EfTsY9DzuznUvE7x7MpCWBbgTXxcDMxoYuD5U"
    },
    {
      "type": "dns-01",
      "status": "pending",
      "url": "https://acme-v02.api.letsencrypt.org/acme/challenge/ZPGotBSSmuwIUSPRDBFdqT7ZA4Z2pZftbMhgix2fQzs/19032093435",
      "token": "zfuakPe7y8w6GOkdbVSCBrOvVa9Js5T5PH-uRcQ1dM4"
    }
  ]
}
2019-08-06 03:00:04,284:DEBUG:acme.client:Storing nonce: iA3plHsUWwHiNRs5vHv2EOK8DfpZIfteScSaCq2tk7A
2019-08-06 03:00:04,285:INFO:certbot.auth_handler:Performing the following challenges:
2019-08-06 03:00:04,286:INFO:certbot.auth_handler:dns-01 challenge for irc.malvager.net
2019-08-06 03:00:04,286:INFO:certbot.auth_handler:dns-01 challenge for hub.malvager.net
2019-08-06 03:00:04,311:INFO:certbot.hooks:Running manual-auth-hook command: /root/LE/auth.sh
2019-08-06 03:10:16,333:INFO:certbot.hooks:Output from manual-auth-hook command auth.sh:
Start: Tue Aug  6 03:00:04 CEST 2019
End: Tue Aug  6 03:10:16 CEST 2019

2019-08-06 03:10:16,344:INFO:certbot.hooks:Running manual-auth-hook command: /root/LE/auth.sh
2019-08-06 03:14:42,329:INFO:certbot.hooks:Output from manual-auth-hook command auth.sh:
Start: Tue Aug  6 03:10:16 CEST 2019
End: Tue Aug  6 03:14:42 CEST 2019

2019-08-06 03:14:42,335:INFO:certbot.auth_handler:Waiting for verification...
2019-08-06 03:14:42,335:DEBUG:acme.client:JWS payload:
{
  "type": "dns-01", 
  "resource": "challenge"
}
2019-08-06 03:14:42,380:DEBUG:acme.client:Sending POST request to https://acme-v02.api.letsencrypt.org/acme/challenge/hkFhfovc4rmFp2yIEKFQ_y4H8J7a8gj16CFOHNLz-uU/19032093671:
{
  "protected": "eyJub25jZSI6ICJpQTNwbEhzVVd3SGlOUnM1dkh2MkVPSzhEZnBaSWZ0ZVNjU2FDcTJ0azdBIiwgInVybCI6ICJodHRwczovL2FjbWUtdjAyLmFwaS5sZXRzZW5jcnlwdC5vcmcvYWNtZS9jaGFsbGVuZ2UvaGtGaGZvdmM0cm1GcDJ5SUVLRlFfeTRIOEo3YThnajE2Q0ZPSE5Mei11VS8xOTAzMjA5MzY3MSIsICJraWQiOiAiaHR0cHM6Ly9hY21lLXYwMi5hcGkubGV0c2VuY3J5cHQub3JnL2FjbWUvYWNjdC81ODM0NjUwMCIsICJhbGciOiAiUlMyNTYifQ", 
  "payload": "ewogICJ0eXBlIjogImRucy0wMSIsIAogICJyZXNvdXJjZSI6ICJjaGFsbGVuZ2UiCn0", 
  "signature": "AkmMqSXtzyEoXyvFyXtuSTc8905PiVH9oE7Q3oBe1IECOAqfnBV3_n-w-xCgkGTPCSxdaRQrgDJsYGa3Fuc7GhNJla1ybPgLp_BvsvnyzN5wzFqgfSrzPmWCL6Kzb43kmrBKMzR-pDc7CxBVcVg-nIOti5ezjfU1k-fGAXvEYRzm9lENVccVPGxNlVyVm_9NYyN9gaDtOtIrbFpzmxK9Dm-eGpG-MVi9H4JTXmtUe3YTDgnUuu2rFM9gn16XZEEfiguRoJTuTIWmcTW-BL9DDbyfX9wVS6E9RQgkZetfD0XFJ8yk7OHmktQOktX9VdJ2aOLugZECa01tOIf7vcL6LBfyn6HwDKYuKdmTyDBsiEy-wxNnuGxWV5ojhycyEA1ti7K9gh_OeBcOwVeKrq1AweVeI-AlCtIsjQ3SIJ0gJwtTS-YP27HrrWcom9B0BPJow-vy-R-cES7pr1GxMTjrgyZy-KMgxnV3w1t3QN---yvwJLGTnL0QpdI3ppVFtfMYFcWRk2XtKJDjM-8fadONyExmG3M4dIl4Vxfx5WWVl-t1HnQyZHYmtFfrdZsPneRn_8mUiJvEDWsuc3XXUVyXW2eUvp94JMSi_JBWiGAXLbl-rr6WQJhI2PUI5wcYXu_JxCJc35DFEvT0RNfzpQucLB9tOPGaFix0zHFICFC_IFg"
}
2019-08-06 03:15:27,414:DEBUG:certbot.error_handler:Encountered exception:
Traceback (most recent call last):
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/auth_handler.py", line 87, in handle_authorizations
    self.acme.answer_challenge(achall.challb, resp)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 149, in answer_challenge
    response = self._post(challb.uri, response)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 95, in _post
    return self.net.post(*args, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 1179, in post
    return self._post_once(*args, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 1192, in _post_once
    response = self._send_request('POST', url, data=data, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 1095, in _send_request
    response = self.session.request(method, url, *args, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
ReadTimeout: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Read timed out. (read timeout=45)

2019-08-06 03:15:27,414:DEBUG:certbot.error_handler:Calling registered functions
2019-08-06 03:15:27,414:INFO:certbot.auth_handler:Cleaning up challenges
2019-08-06 03:15:27,414:DEBUG:certbot.log:Exiting abnormally:
Traceback (most recent call last):
  File "/opt/eff.org/certbot/venv/bin/letsencrypt", line 11, in <module>
    sys.exit(main())
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 1381, in main
    return config.func(config, plugins)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 1264, in certonly
    lineage = _get_and_save_cert(le_client, config, domains, certname, lineage)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 115, in _get_and_save_cert
    renewal.renew_cert(config, domains, le_client, lineage)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/renewal.py", line 307, in renew_cert
    new_cert, new_chain, new_key, _ = le_client.obtain_certificate(domains, new_key)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/client.py", line 349, in obtain_certificate
    orderr = self._get_order_and_authorizations(csr.data, self.config.allow_subset_of_names)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/client.py", line 385, in _get_order_and_authorizations
    authzr = self.auth_handler.handle_authorizations(orderr, best_effort)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/auth_handler.py", line 87, in handle_authorizations
    self.acme.answer_challenge(achall.challb, resp)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 149, in answer_challenge
    response = self._post(challb.uri, response)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 95, in _post
    return self.net.post(*args, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 1179, in post
    return self._post_once(*args, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 1192, in _post_once
    response = self._send_request('POST', url, data=data, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 1095, in _send_request
    response = self.session.request(method, url, *args, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
ReadTimeout: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Read timed out. (read timeout=45)
2019-08-06 03:15:27,415:ERROR:certbot.log:An unexpected error occurred:

It’a log from last night because it still runs as a daily cron, in hopes I get lucky again (or the issue is resolved) before the certs expire.

1 Like

Yep, this supports the firewall/NAT hypothesis, since the post to the challenge URL is the first request that happens after the manual hook.

Is it possible to run the issuance on a machine that is not behind the firewall? Or tweak the settings further so you have a guaranteed minimum amount of time that connections will be considered live? AFAIK Certbot doesn't do TCP keepalive either.

@bmw do you think it might make sense to add a special retry handler to challenge POSTs in Certbot, since those often happen after a long wait (and since we know duplicate submissions are harmless)?

That seems like a vey nice way to solve the problem to me. Good idea jsha!

I made a post to our GitHub issue about this talking about your suggested change here in more detail.

1 Like

I could try to run certbot on the OVH VPS and update the DNS manually, but since a concrete fix was proposed I might try to modify the code myself. Unless proposed fixes are usually implemented by others in a short timespan (like a week or so)?

1 Like

Yeah, if you are comfortable modifying the code yourself, go for it! I will be curious to know the result.

Instead of propagating the TXT record, just propagate a DNS redirect for _acme_challenge.yourdomain one time and forget. Then, all DNS servers will get once the ACME record, which itself should point to which ever DNS server you like (ie: your bind9). From then, changes to the TXT (ie: on your bind9) will not affect the other DNS servers. This is how easyssl.live handles this scenario and it works well. Hope it helps.

@jsha I took a look at the code but there’s a bit too much abstraction to my tastes. Plus it wouldn’t even run a modified certbot (added a couple logger.info() things), it insisted on using /opt/eff.org instead of the venv generated from the git repo clone. I followed these instructions, FWIW.

@Fabrice2016 I don’t quite follow what you’re trying to say here. Since you speak of “DNS redirects” I’m assuming you mean CNAMEs, which won’t quite work because all my domains have secondary servers which might still be attempted (since they’re still authoritative for the base zone). So let’s say I set a CNAME from _acme_challenge.sub.foo.bar to bind9.sub.foo.bar and serve the latter only from my server, the secondaries will return NXDOMAIN and certbot will fail.

Anyways, I figured out a way that does seem to work reliably. In the base zone (malvager.net) I added an NS record pointing to just my own DNS server for _acme-challenge.irc.malvager.net as well as hub.etc. Then I set up bind9 to serve that as a separate zone and disallow transfers (just in case). Now when I run auth.sh it only has to sleep up to 10 seconds to give bind9 time to load all zones (I have quite a few), the POST then also succeeds. I wasn’t sure if LE follows delegated zones properly and I couldn’t find a definitive answer on here, but it looks like it does. I actually request 6 certs this way and all of them succeeded right away.

@jsha Is there a way to force revalidation of the DNS challenges? I tried --force-renew but it seems to skip the challenge entirely and simply reissues the cert.

Hi Sahbi. Sorry for the confusion, yes I did mean CNAME record not ACME record :slight_smile: If you set a CNAME from _acme_challenge.sub.foo.bar to bind9.sub.foo.bar the secondaries should return the CNAME too not NXDOMAIN, thus everything ends up at bind9.sub.foo.bar. Good new is that it solve the 10 minutes issue (except for the first time propagating the CNAME, but then it wont change anymore). Bad new is that you need to make sure you have another bind9 in case of failure. But would it be necessary? Should be simple at least to monitor that one is alive when you trigger the challenge.

Nah the bind9 server is the same server that runs certbot, if it’s down then it doesn’t even run so that’s no problem.

What I meant in regards to the CNAMEs though, bind9.sub.foo.bar (the “target”) would still be just a simple record under the main zone. The secondaries would indeed return the CNAME but since the target is still under the same main zone (foo.bar), resolvers will just use the NS records from there. Which in my case always include the secondaries, so they might get picked. You need to “delegate” bind9.sub.foo.bar as a subzone to only the primary/master DNS server (using NS records). It would probably work fine if the CNAME target is an entirely different domain.

You are right, if the target was a different domain it would likely work. Sounds like your primary (bind9.sub.foo.bar) would also be the target. What about getting an additional IP and setup a light DNS server to answer challenge only (ie: acme.sub.foo.bar)?
Like: primary bind9.sub.foo.bar on IP01 has CNAME for _acme_challenge > acme.sub.foo.bar
and bind9.sub.foo.bar on IP01 has A acme.sub.foo.bar > IP02 then acme.sub.foo.bar on IP02 answers the TXT .

It’s just two different ways of accomplishing the same thing – N delegations, versus one delegation and N CNAMEs. You could choose one or the other based on personal preference and what’s easier with your stack.

Agree. However, the separate DNS approach would not involves reloading all zones in Bind9 when handling a challenge. Or am I missing something? :slight_smile: