Boulder deploy in production

Hi all,

Like others in threads I have seen on this forum, we are considering deploying Boulder in production on our infrastructure to issue certificates for internal services.
We successfully installed a test server by following the README.md, but the same document says:

As-is the docker based Boulder development environment is not suitable for production usage. It uses private key material that is publicly available, exposes debug ports and is brittle to component failure.

but I can't find any relevant information about how to harden a Boulder installation.

We typically use a firewall that blocks all incoming traffic and then configure exceptions. Would blocking all ports except 80, 443 and 4001 be enough?

Thanks in advance,
Marco

Hi @mmaridev,

You can find other threads in the forum that go into this topic in more detail but the short answer is that the Boulder development team strongly advises against using Boulder for an internal PKI. It isn't built for this purpose and we do not have the resources to assist folks using it for this purpose. I recommend you look into using CFSSL directly, or perhaps evaluate a product like Smallstep Certificates.

You can find some light guidance on this topic here: Deployment & Implementation Guide · letsencrypt/boulder Wiki · GitHub

Blocking all ports except 80, 443 and 4001 won't be sufficient.

Dear @cpu,

thank you for the answer.
Sorry to ask, but if the git version isn't suitable for production, how are you deploying acme-v02.api.letsencrypt.org? Is there any other way to securely deploy Boulder or another ACME server?

Thanks,
Best regards,
mmaridev

Hi @mmaridev,

We do deploy the code that you find in git, by way of an RPM generated with the individual Boulder binaries. We do this with the tagged releases pushed to GitHub each week. (See also GitHub - letsencrypt/boulder-release-process: A repo for documenting and demonstrating the Boulder release process.)

What we don't do is use any of the Docker code/containers or the test/ configurations that those containers use. Those are strictly for development. (The test configs are included in the .rpm, but that's only to help our SRE team diff configuration changes over time.)

The infrastructure details, production/staging Boulder config files, and associated configuration management used by our SRE team for the prod/staging APIs are private at this time.

For a production grade Boulder deploy there are many things to consider above and beyond whether or not you can use the Docker containers. Here's a brief list off of the top of my head:

  • You'll need to manage an internal PKI for the gRPC mTLS certificates for each component. Using the test/ PKI in-tree would be a security disaster. Using Boulder to issue the certificates for itself would also introduce some "fun" cyclic dependencies so you probably need a whole separate CA/software suite.
  • You'll have to provision and manage multiple distinct servers with fine-grained network policy to enforce Boulder's security contexts. None of this is provided out of the box.
  • You'll need to replace the SoftHSM CA configuration with a real HSM, configured through Boulder's/CFSSL's PKCS11 infrastructure. SoftHSM isn't production grade. Our HSM configuration is not public.
  • You'll need to stand up a separate production grade recursive DNS resolver capable of handling the validation load. Recall that CAA checking and the associated tree climbing mean this load isn't 1:1 with ACME authorizations. You also can't disable CAA checking, even though it makes little/no sense for an internal environment.
  • You'll need to follow Boulder's development cycle very closely or you run the risk of missing important bug fixes. Unfortunately at the present time we do not publish change logs and we don't make semantically versioned releases. To keep up to date you will have to check each week's release tag and the code diff against the previous week's.
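
To illustrate the tree-climbing point above, here is a minimal sketch of why one authorization can cost several DNS lookups. It is not Boulder's code. The function names are hypothetical, and it assumes the simple climb described in RFC 8659 (query the name, then each parent, stopping at the first CAA record set found), ignoring aliases and caching.

```python
def caa_lookup_names(domain: str) -> list[str]:
    """Return the names a CAA check may query, most specific first.

    Assumption: a plain RFC 8659 style climb with no CNAME handling.
    The climb stops early if a CAA record set is found, so this list
    is the worst case for one identifier.
    """
    labels = domain.rstrip(".").lower().split(".")
    # For a wildcard identifier the lookup starts at the base name.
    if labels and labels[0] == "*":
        labels = labels[1:]
    return [".".join(labels[i:]) for i in range(len(labels))]


def worst_case_caa_queries(identifiers: list[str]) -> int:
    """Upper bound on CAA queries for a set of identifiers (no caching)."""
    return sum(len(caa_lookup_names(name)) for name in identifiers)


if __name__ == "__main__":
    print(caa_lookup_names("db.internal.example.com"))
    print(worst_case_caa_queries(["db.internal.example.com", "example.com"]))
```

In this sketch a single four-label name can trigger up to four CAA queries on top of the validation lookups themselves, which is why the resolver load does not map one-to-one onto authorizations.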

Honestly the list is quite large. We would also be resistant to merging code to Boulder that would make running an internal PKI easier. It would be code and configuration our very small team wouldn't be actively using (i.e. prone to bitrot), and increased complexity is at odds with our goal of providing the most secure experience we can for the Let's Encrypt service in particular.

It's not an impossible task to run Boulder in a production setting (I know of at least one organization trying it) but it is not a small task and I strongly encourage you to consider alternatives. RFC 8555 and ACME in general are specifically tailored to the web PKI and domain validation of external names. If you are in an internal environment then what Boulder adds on top of CFSSL is probably not as useful to you as you might think.

Hope that helps! In a world with infinite resources and more than three developers it would be great for Boulder to support this use case but it isn't practical today.

@cpu thank you so much for this clear explanation.

Since our goal right now is mainly to use the certbot client, which works out of the box on *nix, to automatically produce certificates for some Docker containers, we'll probably investigate other ACME servers compatible with certbot or write our own implementation.

Thank you for the help so far.

Happy to help! I hope I wasn't too pessimistic in my tone.

This use case seems like it would be a great fit for the work Smallstep is doing. They're actively working on an ACME interface and testing compatibility with Certbot. It isn't ready today but maybe it will be a good solution for you down the road.

On the smaller side of things there is also mkcert. Unfortunately it also doesn't support ACME today but it is on the roadmap.

Good luck,
