Running certbot in an AWS lambda, cross-partition

Hi,

In a company project, we are required to generate certificates (Let's Encrypt ones in particular, for several reasons) on the fly during onboarding of a new customer, and to configure a rotation/renewal mechanism for those certs.
We run in AWS, with the following setup:

  • An EC2 instance running in the Gov partition uses certbot to issue certificates with LE.
  • The certificates' subject is a domain hosted in an AWS Route53 public hosted zone, in the Commercial partition.
  • For the DNS challenge, using the route53 plugin, we use a specific set of AWS secret/access keys to gain cross-partition access from the instance running in the Gov partition to the Route53 public hosted zone in the Commercial partition.
  • That EC2 instance saves those certs in AWS Secrets Manager, in the same partition.
  • Each of these secrets is set up with a Lambda that is triggered to rotate its cert every couple of months.
  • The Lambda also runs in the Gov partition, and has an IAM role that allows it to interact with AWS Secrets Manager to store the new cert details.
  • The Lambda attempts to use a similar approach for cross-partition access, using a static set of secret/access keys.
  • However, the current route53 DNS plugin for certbot relies heavily on the standard AWS env-vars to get its access to Route53, and in AWS Lambda we cannot override those specific env-vars.
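
To illustrate the last point: one way around the reserved env-vars is to read the Commercial-partition keys from differently named variables and hand them to the AWS client explicitly. This is only a minimal sketch of that idea, not the patch itself; the variable names `R53_ACCESS_KEY_ID` and `R53_SECRET_ACCESS_KEY` and the helper `route53_credential_kwargs` are made up for the example.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names, chosen so they do not collide with the
# AWS_* names that Lambda reserves for the function's own role.
ACCESS_KEY_VAR = "R53_ACCESS_KEY_ID"
SECRET_KEY_VAR = "R53_SECRET_ACCESS_KEY"


def route53_credential_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return keyword arguments carrying explicit credentials for Route53.

    Raises RuntimeError if either variable is missing or empty, so a
    misconfigured function fails before any AWS call is attempted.
    """
    env = os.environ if env is None else env
    missing = [name for name in (ACCESS_KEY_VAR, SECRET_KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "aws_access_key_id": env[ACCESS_KEY_VAR],
        "aws_secret_access_key": env[SECRET_KEY_VAR],
    }
```

With boto3 installed, the result could be passed along as `boto3.client("route53", **route53_credential_kwargs())`, since `boto3.client` accepts `aws_access_key_id` and `aws_secret_access_key` as keyword arguments. How the plugin would pick this up is a separate question that the sketch does not answer.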

In an attempt to resolve this quickly on our end, we forked and made a patch:

And we figured it might interest others to pull these changes (or something similar) to the main certbot/certbot repo.

I'm moving your thread to the Client dev category, as this Feature Requests category is meant for Let's Encrypt feature requests, not Certbot (as Certbot is managed by the EFF, not Let's Encrypt/ISRG).

You could also file a PR on the Certbot GitHub repo, although the Certbot dev team is rather small, and if the changes aren't that significant, it may take years for the team to merge the PR, if ever.

A few small comments on your patch, which I do think you should submit as a PR to Certbot. These suggestions all come from the experience of having deployed similar hotfix patches in the past to Certbot (or other) projects, and driving myself mad trying to figure out what happened when something breaks.

1- Move the nested import from line 47 to the top; aside from failing some style standards, leaving it nested will make it harder to detect import issues if boto moves anything around.

2- Drop the fallback to us-east-1; raise an Exception if it's not there. Internally, you can use logging/printing/API-Posting to catch this on your server configurations, so you can update them all before switching to an Exception.

3- If you PR: change the print( to log.critical( or log.info(. Print is fine for internal use.

4- Add a few unit tests to check the setup is configured right, and that it fails predictably. Failing predictably is the more important bit, IMHO.
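
For what it's worth, points 2 to 4 could look roughly like the sketch below. It is not taken from the patch: the variable name `R53_REGION` and the function `resolve_region` are invented for the example, and the real code may read its region from somewhere else entirely.

```python
import logging
import unittest
from typing import Mapping

logger = logging.getLogger(__name__)

# Hypothetical variable name, used only for this illustration.
REGION_VAR = "R53_REGION"


def resolve_region(env: Mapping[str, str]) -> str:
    """Return the configured region, or fail loudly instead of guessing."""
    region = env.get(REGION_VAR, "").strip()
    if not region:
        # Log instead of print, then raise rather than fall back to a default.
        logger.critical("%s is not set; refusing to guess a region", REGION_VAR)
        raise ValueError(f"{REGION_VAR} must be set explicitly")
    return region


class ResolveRegionTest(unittest.TestCase):
    def test_returns_configured_region(self):
        self.assertEqual(resolve_region({REGION_VAR: "us-east-1"}), "us-east-1")

    def test_missing_region_fails_predictably(self):
        with self.assertRaises(ValueError):
            resolve_region({})

    def test_blank_region_fails_predictably(self):
        with self.assertRaises(ValueError):
            resolve_region({REGION_VAR: "   "})
```

Saved as a module, the tests can be run with `python -m unittest`. The two failure cases are the important ones: they pin down that a missing or blank setting raises, rather than silently sending requests to a default region.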

Thanks! I really appreciate the feedback!

Your comments all make a lot of sense.

I actually optimistically hoped that someone would wanna make something from my quick&dirty patch that would qualify as a proper PR (obviously I wouldn't have submitted such change as a PR, I oughta know better :wink: ).

I'm just super busy at work (who isn't, I guess..), but I will try to find a couple of hours to make this change into a proper PR. Just cuz I do want to contribute if I can.

I guess I just also wanted to see if such change seems attractive/desired for anyone else (as it's quite simple TBH and I was surprised it wasn't addressed already).

Cheers!

I totally understand; I've been in your shoes. I just wanted to make sure you don't take on any technical debt with this hotfix, because (from experience) it can be a nightmare to troubleshoot when it breaks.
