Issuing certificates for custom domains in a SaaS - is this the correct approach?

Hi! I would like to ask if what I’m doing is the correct approach. I have a Rails web app that lets users add custom domains, so I need to issue certificates for these domains. The app is hosted on Kubernetes, so I am using cert-manager to manage the Let’s Encrypt certificates. The way it works is as follows:

  1. User adds a custom domain
  2. The app creates an ingress (http only) in Kubernetes for that domain
  3. The app fetches a page with a verification code from that domain, to ensure that the domain is pointing to the app; this is repeated every 20 seconds until the domain is verified by the app itself
  4. When the domain is verified by the app, the ingress is “upgraded” to https using cert-manager annotations
  5. cert-manager notices that there is an ingress with a pending certificate, and requests the certificate from Let’s Encrypt
  6. The app periodically does the same verification as before, but this time over https instead of http; when the verification code can be read successfully without SSL errors, it means that the domain is verified and has a valid certificate, so it’s ready for use.
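
To make steps 3 and 6 concrete, the polling can be sketched roughly like this in Python. This is only an illustration, not the actual code (the real app is Rails): the function names, the URL passed in, and the cap on attempts are assumptions, and only the 20-second interval comes from the description above. The same loop covers both steps, because `urllib.request.urlopen` validates the certificate when given an https URL, so a missing or invalid certificate just counts as a failed attempt.

```python
import time
import urllib.request
from typing import Callable, Optional


def http_fetch_code(url: str, timeout: float = 5.0) -> Optional[str]:
    """Fetch the verification page and return its body as text.

    For https URLs the certificate is validated by default, so TLS
    problems surface as exceptions.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def wait_until_verified(
    fetch_code: Callable[[str], Optional[str]],
    url: str,
    expected_code: str,
    interval_seconds: float = 20.0,   # interval from the steps above
    max_attempts: int = 90,           # hypothetical cap, not in the original
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until it serves `expected_code` or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            if fetch_code(url) == expected_code:
                return True
        except Exception:
            # DNS not updated yet, connection refused, TLS error, and so on:
            # treat every failure as "not verified yet" and try again.
            pass
        if attempt < max_attempts:
            sleep(interval_seconds)
    return False
```

Calling `wait_until_verified(http_fetch_code, url, code)` first with an http URL and then, after the ingress upgrade, with the https one would mirror steps 3 and 6. The fetcher and `sleep` are passed in so the loop can be tested without a network.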

This process works very well, and if the DNS for a custom domain is already updated to point to the app (with a CNAME record), the domain is ready for use within 30 seconds to a minute. The reason I have the app do a verification by itself first is to ensure that the domain is already pointing to the app when Let’s Encrypt does its own verification.

I am wondering, however, whether I could run into problems with rate limits or similar once the app is in production and becomes successful. Is this approach correct or should I be doing it differently? How do big companies with lots of custom domains and certificates handle this?

Many thanks in advance to whoever can clarify this for me.

It sounds like you have a good handle on it already. Outsourcing the work to cert-manager is a good idea.

There’s a lot of advice on https://letsencrypt.org/docs/integration-guide/ that will be relevant to you, so read that if you haven’t already.

The rate limits you are most likely to hit seem to be:

For users of the ACME v2 API you can create a maximum of 300 New Orders per account per 3 hours.

and

There is a Failed Validation limit of 5 failures per account, per hostname, per hour

If your app becomes wildly successful, you can get an exemption for the former rate limit.

For the latter, it might be worth checking that your app has some kind of back-off in place to prevent failed authorizations from being retried constantly, for example if the domain has been blacklisted by Let’s Encrypt, or if there’s some kind of connectivity issue with anycast nameservers that you can’t reproduce yourself.
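
As a rough illustration of what such a back-off could look like, here is a minimal Python sketch of an exponential delay schedule. The 15-minute base and 24-hour cap are assumptions picked to stay under the 5 failures per hostname per hour limit quoted above, and the function names are made up for this example. It does not describe how cert-manager actually retries, which you would need to check separately.

```python
from typing import List


def retry_delay_seconds(
    failures: int,
    base: float = 900.0,     # assumed first delay: 15 minutes
    cap: float = 86400.0,    # assumed maximum delay: 24 hours
) -> float:
    """Delay before the next attempt after `failures` consecutive failures."""
    if failures < 1:
        return 0.0
    # Clamp the exponent so very large failure counts cannot overflow.
    exponent = min(failures - 1, 32)
    return min(cap, base * 2 ** exponent)


def attempt_offsets(attempts: int) -> List[float]:
    """Seconds from the first attempt at which each attempt would start,
    assuming every attempt fails."""
    offsets: List[float] = []
    elapsed = 0.0
    for failures in range(attempts):
        elapsed += retry_delay_seconds(failures)
        offsets.append(elapsed)
    return offsets
```

With these defaults, a hostname that keeps failing is attempted at 0, 15 and 45 minutes, then not again until 105 minutes, so only three failures land in the first hour.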

Edit: overall, it seems like most of the above behavior is going to be the responsibility of cert-manager, so it might be worth making a few contrived tests to see that it behaves the way that you would expect it to, in a variety of failure scenarios.


Thanks for your reply! Very useful. I’m just scared that everything works well in the beginning and then boom… problems. I will see if I can find more info about how cert-manager handles this stuff. Thanks again!

