Renewal process crashed my production server

This may not be an issue or bug with LetsEncrypt or Certbot, but I wanted to write up the problem I ran into here in case it could happen to someone else, or to find out whether there is a more correct way I could have implemented my process.

  • I’m using certbot 0.21.1.

  • My Ubuntu server is running Nginx in front of several Node.js applications.

  • I installed several LetsEncrypt SSL certificates and configured them in Nginx.

  • I created a daily certbot renewal job.

All of this has worked great for quite a while now, including the renewal process. (I LOVE LetsEncrypt and certbot!)

But the other day, all my production server’s web sites went down during the daily certificate renewal process.

The problem was that I had a domain name I no longer used; it was for a test version from before this new platform went live. When the renewal for that domain failed, the certbot process left Python holding onto port 443, blocking access to all my web sites.

Is this expected behavior?

Here are some log snips for reference:

Final lines of the certbot renewal log:
All renewal attempts failed. The following certs could not be renewed:
/etc/letsencrypt/live/v2.mydomain.com/fullchain.pem (failure)

Nginx errors:
nginx: [emerg] bind() to 0.0.0.0:443 failed (98: Address already in use)

When I tried to restart Nginx, I got this error:

Job for nginx.service failed because the control process exited with error code. See "systemctl status nginx.service" and "journalctl -xe" for details.

When I checked the process that was using port 443, I saw this:
tcp6 0 0 :::443 :::* LISTEN 3080/python2.7
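For anyone who wants to check the same thing on their own server, commands along these lines will show which process is holding port 443 (the exact output format varies by system, and lsof may need to be installed separately):

# Show listening TCP sockets with the owning PID/program name
netstat -tlnp | grep ':443 '

# Alternative: list processes that have port 443 open
lsof -i :443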

Anyway, I think I solved the issue going forward by using certbot’s fancy “delete” command to remove the certs I no longer use (a great feature!). But it does seem like this problem could strike other folks as well, and the impact is complete failure of all secure web sites on the server. I just wanted to put this scenario here in case anyone finds value in it or has other suggestions for me.
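For reference, the delete command is along these lines (shown with the anonymized domain name from above; substitute your own cert name):

# Remove the unused certificate lineage so the renewal job stops trying to renew it
/root/certbot/certbot-auto delete --cert-name v2.mydomain.com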

Thanks!
-Sean

The best way to avoid this situation in the future would be to switch to using the --webroot or --nginx plugins instead of --standalone mode. These validation plugins don’t require temporarily stopping your nginx server, so this failure mode would not be possible.
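As a rough sketch of what that switch could look like (the domain and webroot path below are placeholders, and "certbot" stands for however you normally invoke it, e.g. certbot-auto):

# Re-issue using the nginx plugin; validation is answered through the running nginx
certbot --nginx -d example.com

# Or use webroot mode, writing challenge files into a directory nginx already serves
certbot certonly --webroot -w /var/www/example -d example.com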

That being said, you’ve definitely uncovered a bug in standalone mode! To figure out what is going on, we would need to see the entire certbot debug log and not just the end. If the forum software won’t let you post it, you can use a service like gist.github.com or pastebin.com and link to it here.

Thanks so much,

I will post the full log file here, followed by my renewal cron job for reference (domain name changed). Let me know if there is a different log you would like to see; this is the output from certbot when the job ran it.

Here is the full auto-renew log for the day that failed (this morning; I had the same failure, for the same reason, on my other server with a different domain two days ago). I have two domains on this server: the one in use now was not yet due for renewal today, but the unused one (prefixed with v2.) that failed was up for renewal today:

Thu Feb 1 03:13:01 EST 2018 *** START ***


Processing /etc/letsencrypt/renewal/v2.mydomain.com.conf


Processing /etc/letsencrypt/renewal/mydomain.com.conf


The following certs are not due for renewal yet:
/etc/letsencrypt/live/mydomain.com/fullchain.pem (skipped)
All renewal attempts failed. The following certs could not be renewed:
/etc/letsencrypt/live/v2.mydomain.com/fullchain.pem (failure)

And here is my daily renewal job:

#!/bin/sh
LOGFILE=/root/jobs/certbot.log
echo "$(date) *** START ***" >> $LOGFILE
/root/certbot/certbot-auto renew --pre-hook "systemctl stop nginx" --post-hook "systemctl start nginx" >> $LOGFILE
echo "$(date) *** COMPLETED ***" >> $LOGFILE

certbot logs detailed information to /var/log/letsencrypt/, but the most recent letsencrypt.log file probably refers to your subsequent successful actions, so you may have to look at the other numbered files to find the log from the run that failed.
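Listing that directory by modification time should make it easy to spot the run from the failure (assuming the default log location):

# Newest first; letsencrypt.log.1, letsencrypt.log.2, ... are the older runs
ls -lt /var/log/letsencrypt/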

I have created a gist of the detailed log file, but is there anything in it that should not be made public? I see “protected”, “payload”, and “signature” properties that look like they might be keys or something.

Let me know what you think - thanks.

These are just cryptographic signatures, not keys, so it is safe to post them, but they are also of little debugging value, so it is fine if you prefer to elide them.
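If you do want to elide them, something along these lines should blank those fields out before publishing (a rough sketch; adjust it if your log formats those fields differently):

# Replace the base64 blobs in those JSON fields with a placeholder
sed -E 's/"(protected|payload|signature)": "[^"]*"/"\1": "<elided>"/g' letsencrypt.log > letsencrypt-public.log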

Ok, thanks! Here is the log:

Hi @schoen & @bmw, we have a bug (maybe plural) for you! :confounded:

First, certbot runs into trouble performing authorization:

2018-02-01 08:13:09,849:DEBUG:acme.client:Received response:
HTTP 400
Server: nginx
Content-Type: application/problem+json
Content-Length: 142
Boulder-Requester: 21609655
Replay-Nonce: tXscKOWilkFpABBem0gPqlwgnzKCmNCojzl9p9Qj70w
Expires: Thu, 01 Feb 2018 08:13:09 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Thu, 01 Feb 2018 08:13:09 GMT
Connection: close

{
  "type": "urn:acme:error:malformed",
  "detail": "Unable to update challenge :: cannot update a finalized authorization",
  "status": 400
}
2018-02-01 08:13:09,849:DEBUG:acme.client:Storing nonce: tXscKOWilkFpABBem0gPqlwgnzKCmNCojzl9p9Qj70w
2018-02-01 08:13:09,850:WARNING:certbot.renewal:Attempting to renew cert (v2.members.dartconnect.com) from /etc/letsencrypt/renewal/v2.members.dartconnect.com.conf produced an unexpected error: urn:acme:error:malformed :: The request message was malformed :: Unable to update challenge :: cannot update a finalized authorization. Skipping.
2018-02-01 08:13:09,854:DEBUG:certbot.renewal:Traceback was:
Traceback (most recent call last):
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/renewal.py", line 425, in handle_renewal_request
    main.renew_cert(lineage_config, plugins, renewal_candidate)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 1065, in renew_cert
    _get_and_save_cert(le_client, config, lineage=lineage)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 113, in _get_and_save_cert
    renewal.renew_cert(config, domains, le_client, lineage)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/renewal.py", line 297, in renew_cert
    new_certr, new_chain, new_key, _ = le_client.obtain_certificate(domains)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/client.py", line 318, in obtain_certificate
    self.config.allow_subset_of_names)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/auth_handler.py", line 81, in get_authorizations
    self._respond(resp, best_effort)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/auth_handler.py", line 134, in _respond
    resp, chall_update)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/auth_handler.py", line 158, in _send_responses
    self.acme.answer_challenge(achall.challb, resp)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 230, in answer_challenge
    response = self.net.post(challb.uri, response)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 709, in post
    return self._post_once(*args, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 722, in _post_once
    return self._check_response(response, content_type=content_type)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 583, in _check_response
    raise messages.Error.from_json(jobj)
Error: urn:acme:error:malformed :: The request message was malformed :: Unable to update challenge :: cannot update a finalized authorization

This seems like a weird error condition for the simple case of a domain that is no longer in use?

Possibly its unexpectedness leads to the next error, which is that the post-hook certbot runs fails:

2018-02-01 08:13:12,422:ERROR:certbot.hooks:Hook command "systemctl start nginx" returned error code 1
2018-02-01 08:13:12,423:ERROR:certbot.hooks:Error output from systemctl:
Job for nginx.service failed because the control process exited with error code. See "systemctl status nginx.service" and "journalctl -xe" for details.

Presumably, nginx cannot bind to the port because certbot has not stopped listening in standalone mode due to the above error.

Finally, certbot claims to be exiting abnormally but apparently doesn’t actually exit, because the rest of the log is filled with random incoming requests to the server until certbot was manually killed:

2018-02-01 08:13:12,423:DEBUG:certbot.log:Exiting abnormally:
Traceback (most recent call last):
  File "/opt/eff.org/certbot/venv/bin/letsencrypt", line 11, in <module>
    sys.exit(main())
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 1240, in main
    return config.func(config, plugins)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 1142, in renew
    renewal.handle_renewal_request(config)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/renewal.py", line 443, in handle_renewal_request
    len(renew_failures), len(parse_failures)))
Error: 1 renew failure(s), 0 parse failure(s)
2018-02-01 08:19:06,385:DEBUG:acme.crypto_util:Performing handshake with ('::ffff:66.249.84.8', 37995, 0, 0)
2018-02-01 08:19:06,386:DEBUG:acme.crypto_util:Server name (members.dartconnect.com) not recognized, dropping SSL
[...and so on...]

Maybe because the standalone server is still running on another thread?

So maybe three bugs?

  1. The weird authorization error. (Probably hard to track down at this point.)

  2. Certbot needs to stop standalone mode on a traceback.

  3. Certbot really needs to exit at all costs when things go south. :stuck_out_tongue_winking_eye:


Hi @svbaker, someone in another thread (linked right above) is also experiencing this issue, and they think it may have been introduced in a recent release.

Can you look in your cron logs and find the date of the last certificate you issued successfully before this problem occurred? From your log I can tell there was one in early November, while certbot 0.19.0 was the current version, but if your other logs show a certificate issued more recently as part of the automatic renewal process, that may narrow it down a bit further.
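If it helps, grepping the cron log for the run markers and any renewal summary lines should surface those dates quickly (a sketch using the log path from the script you posted):

# Show each run's timestamp plus any lines mentioning renewal outcomes
grep -E 'START|COMPLETED|renew|Congratulations' /root/jobs/certbot.log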

Unfortunately, this server only went live in December, so there has not been a renewal of anything other than the v2.members.dartconnect.com cert that failed. (I did manually issue a new cert for members.dartconnect.com, but that was not an automated renewal.)

I do have other Nginx-fronted web applications that had successful certbot renewals last year, if that might help.

@svbaker, thanks for reporting this, and @Patches, thanks for helping us dig into the problem. I’ll look into Certbot not stopping the standalone server.

I’m not sure what’s going on with the malformed error from the CA, though. According to the Certbot log linked above, Certbot gets back a new authz with both the authz and its challenges in a pending status, but when Certbot tries to POST to the challenge, it gets:

{
  "type": "urn:acme:error:malformed",
  "detail": "Unable to update challenge :: cannot update a finalized authorization",
  "status": 400
}

@jsha, any ideas?

