Certbot renew failure notification?

I am trying to figure out how to get certbot renew to run a hook on a failed renewal.

I have reviewed the pre-hook, post-hook, and deploy-hook options, and none of them is specific to failed renewal attempts. In a production application with potentially hundreds or thousands of renewal attempts per day, all I care about are the failures. I don’t need confirmation of a successful renewal, since success is what is supposed to happen.

I’m having a hard time believing that this isn’t something that has been accounted for, as any ops person or developer would likely have the same concern.

Certbot returns a non-zero exit code on failure, if that’s of any use to you.
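For what it's worth, here is a minimal sketch of acting on that exit code from a wrapper script. The names `run_and_report` and `notify` are made up for illustration, and the notifier just prints to stderr; swap in whatever alerting you actually use.

```python
import subprocess
import sys


def run_and_report(cmd, on_failure):
    """Run cmd and call on_failure(returncode, output) if it exits non-zero."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the captured output
        text=True,
    )
    if result.returncode != 0:
        on_failure(result.returncode, result.stdout)
    return result.returncode


def notify(returncode, output):
    # Hypothetical notifier: replace with email, a chat webhook, a metric, etc.
    print(f"renewal command failed with exit code {returncode}", file=sys.stderr)
    print(output, file=sys.stderr)
```

From cron or a systemd timer you would call something like `run_and_report(["certbot", "renew", "--quiet"], notify)`. A successful run stays silent; only a non-zero exit triggers the notifier.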

That’s something, but I don’t feel like I should have to invent my own solution for this.

I will likely leverage CloudWatch for this in the end. But it is a bit staggering that this is not already built in, especially considering what is built in.

If you think it would be a useful feature, feel free to propose it in Certbot’s issue tracker. Or chime in on this earlier feature request, which seems related, though perhaps not exactly identical.

If renewals are being attempted at the generally accepted best practice intervals (1-2x per day starting 30 days from expiration), any given individual failure shouldn’t really be an issue that you need to act on. If the failure is temporary, subsequent attempts will deal with it as soon as the temporary problem is resolved.

If it’s a permanent issue (config issue, account revoked, etc), I would think you already have something that’s monitoring the actual cert expirations that would notify when a given cert has reached a critical threshold of not being renewed (at which point you’d intervene). You also have a built-in monitoring method via the email notifications already.

I guess I’m curious what you’re planning on doing with a renewal failure hook.

I guess I’m curious what you’re planning on doing with a renewal failure hook.

For starters...be notified.

You mention this however:

You also have a built-in monitoring method via the email notifications already.

To which I'm not sure what you're referring.

It sounds like you’re saying there is already a built-in notification system for failures, which I wasn’t aware of. That would make sense, since the apparent lack of one is exactly what I’ve been calling egregious here (though apparently it does exist…?).

Is there a way to test the notification email?

The CA (not Certbot) sends email notifications for certificates that are approaching expiry, have not been renewed, and are attached to ACME accounts with email addresses.

The email looks like this:

Hello,

Your certificate (or certificates) for the names listed below will expire in 20 days (on 21 Oct 18 09:21 +0000). Please make sure to renew your certificate before then, or visitors to your website will encounter errors.

We recommend renewing certificates automatically when they have a third of their
total lifetime left. For Let's Encrypt's current 90-day certificates, that means
renewing 30 days before expiration. See
https://letsencrypt.org/docs/integration-guide/ for details.

example.org

For any questions or support, please visit https://community.letsencrypt.org/. Unfortunately, we can't provide support by email.

If you are receiving this email in error, unsubscribe at http://mandrillapp.com/track/unsub.php?u=...

Regards,
The Let's Encrypt Team

Since it is not sent by Certbot, there's not really any way for you to test it.

But, you should not rely on this email anyway. It cannot determine whether your renewed certificate was actually properly deployed or not.

As such, you should instead rely on website monitoring (like Uptime Robot or Pingdom) to actually monitor your web endpoints. They can email you if the actually deployed certificate approaches expiry.
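If you would rather script that check yourself, a rough sketch using only the Python standard library could look like the following. The function names and the idea of alerting on a day threshold are my own assumptions, not anything Certbot or the monitoring services provide.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left given a 'notAfter' string such as 'Oct 21 09:21:00 2018 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return (expires - now) / 86400


def deployed_cert_not_after(host, port=443, timeout=10):
    """Fetch the 'notAfter' field of the certificate the server actually serves."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `days_until_expiry(deployed_cert_not_after("example.org"))` gives the remaining lifetime of the certificate on the live endpoint, which you could compare against a threshold of your choosing. Note that with the default context an already expired certificate makes the handshake raise `ssl.SSLCertVerificationError`, which is itself worth alerting on.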

Edit: FWIW, I agree that tracking Certbot's exit code is the best option, and then just scoop up the latest /var/log/letsencrypt/letsencrypt.log.
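A small sketch of the "scoop up the log" part, assuming you only want the last few lines for a notification. The helper name `tail` and the 50-line default are arbitrary choices of mine.

```python
from collections import deque


def tail(path, max_lines=50):
    """Return the last max_lines lines of a text file as a list of strings."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent lines while reading.
        return list(deque(handle, maxlen=max_lines))
```

When the renewal command exits non-zero, something like `"".join(tail("/var/log/letsencrypt/letsencrypt.log"))` would give you the text to attach to the alert. Reading that path usually requires root.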

Certbot is not a programmable ACME client; it is largely an interactive one. It is possible to build support tools around it to automate it, but that doesn't mean it's a good idea. There are programmable high-level ACME clients that are better suited for the job.
