Bulk certificate generation for users' custom domains using lua-resty-auto-ssl

Hi everyone!

My application (link shortener) allows users to connect their own custom domains. We have around 4500 custom domains at the moment.
In order to provide SSL for custom domains, I have deployed lua-resty-auto-ssl. The problem is that all links have high traffic, with a lot of clicks every second. When I point a domain’s DNS at my server running OpenResty with lua-resty-auto-ssl, the DNS change has not even fully propagated before I begin to receive failed-authorization and rate-limit errors.

New users add 5-10 new domains per week, so it won’t be an issue later, but right now I need to find a way to generate certificates for all existing domains. Is it possible that Let’s Encrypt would agree to lift the limits for a while? Or do you have any idea how to solve this?

Thank you in advance :slight_smile:

Server details: AWS EC2, Amazon Linux AMI, NGINX, OpenResty, LuaRocks (2.0.13), lua-resty-auto-ssl.

Let’s Encrypt Account ID: 94683247

Common errors displayed when pointing DNS to the OpenResty server:

  "type": "urn:ietf:params:acme:error:rateLimited",
  "detail": "Error creating new order :: too many failed authorizations recently: see https://letsencrypt.org/docs/rate-limits/",
  "status": 429

  "type": "urn:ietf:params:acme:error:unauthorized",
  "detail": "Invalid response from http://{custom-domain}/.well-known/acme-challenge/x1mJEMkDMXUlz9AVai7PtOvu1MD4DKyJXTcaybzln6Y [{IP}]: 500",
  "status": 403

My guess is that DNS propagation may be the reason: the domain already points to my OpenResty server in the region where someone clicks on the link, but still points to the old server in the region used for validation. I am not sure if I’m right.

Hi @jasnos

The error response says that your server returns an HTTP status 500. That’s something you have to fix.

The failed-authorizations limit should never be hit. It only happens if your configuration or client is buggy.

I would have expected you to hit the limit of 300 new orders per 3 hours instead. But “failed” means validations are failing, so first find the reason and fix it.

Mass-migrating domains to Let’s Encrypt in a short period is really tricky; you’re not the first to run into the problem.

The best advice I can give you is to add a qualifying pre-flight check to your process. This would check that the domain actually points at your service and is capable of fulfilling the ACME challenge.

That way, given your 300 orders/3 hours limit, you can make the most of those, without pointlessly wasting Let’s Encrypt’s resources on domains that have no chance of succeeding.
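To make the arithmetic concrete: at 300 orders per 3 hours, 4500 domains need 15 windows, so roughly 42 hours pass between the first and the last batch. Below is a minimal sketch of such a schedule in Python. The function name `plan_batches` and the example domain names are made up for illustration, and in practice you would probably set `per_window` below 300 to leave headroom for retries and new signups.

```python
from datetime import datetime, timedelta, timezone

ORDERS_PER_WINDOW = 300        # the new-orders limit quoted in this thread
WINDOW = timedelta(hours=3)    # length of the rate-limit window


def plan_batches(domains, start, per_window=ORDERS_PER_WINDOW, window=WINDOW):
    """Split domains into batches and give each batch a start time.

    Returns a list of (start_time, [domains]) tuples, one per window.
    """
    batches = []
    for i in range(0, len(domains), per_window):
        batch_start = start + (i // per_window) * window
        batches.append((batch_start, domains[i:i + per_window]))
    return batches


if __name__ == "__main__":
    # Hypothetical domain list standing in for the 4500 real ones.
    domains = [f"custom-{n}.example" for n in range(4500)]
    plan = plan_batches(domains, datetime(2020, 1, 1, tzinfo=timezone.utc))
    print(len(plan), "batches; last one starts at", plan[-1][0])
```

Only domains that already passed the pre-flight check should be fed into a schedule like this, otherwise the failed validations eat into the budget.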

Looking at the readme, you may be able to check whether the preflight has been completed in the allow_domain hook function. But the actual preflight check may need to happen in an independent process.
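As a rough illustration of the independent pre-flight process, here is a sketch in Python that only marks a domain as ready when every address it resolves to belongs to your own servers. `OUR_IPS`, `passes_preflight` and `split_ready` are hypothetical names, and the IP is a documentation address. Note that this uses the system resolver, which may return cached answers that differ from what Let’s Encrypt sees, so treat a pass as necessary but not sufficient.

```python
import socket

# Hypothetical: public IPs of the OpenResty server(s) terminating SSL.
OUR_IPS = {"203.0.113.10"}


def resolve_ips(domain):
    """Resolve a domain via the system resolver; empty set on failure."""
    try:
        infos = socket.getaddrinfo(domain, 80, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def passes_preflight(domain, our_ips=OUR_IPS, resolver=resolve_ips):
    """True only if the domain resolves and every address is ours."""
    ips = resolver(domain)
    return bool(ips) and ips <= set(our_ips)


def split_ready(domains, **kwargs):
    """Partition domains into (ready, not_ready) lists."""
    ready, not_ready = [], []
    for domain in domains:
        target = ready if passes_preflight(domain, **kwargs) else not_ready
        target.append(domain)
    return ready, not_ready
```

The result could be stored somewhere the allow_domain hook can read it, so that the hook only has to do a cheap lookup.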

I also hope that the lua-resty-auto-ssl implementation has internal locking which prevents concurrent certificate orders for a single domain.

Regarding rate limit exemptions, you can certainly apply for them, but I’m not confident that you’d get one for a short term issue like this. They take a few weeks to be processed.

I forgot to mention that the IP address that gives the 500 is not my server’s IP. It is an IP address of the previous custom domain SSL provider (clearalias_com). That is why I assumed the DNS is not fully propagated. When someone clicks a custom domain’s link, the request reaches my OpenResty server, which tries to generate a certificate, but the validation request from Let’s Encrypt still goes to the old provider’s server, which returns the 500.

Then you should use a solution that queries the authoritative name servers.

That way you get the same result Let’s Encrypt sees.
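One way to do that, sketched in Python on the assumption that the `dig` command-line tool is installed: find the name servers of the closest parent zone, then ask each of them directly with recursion turned off. The helper names (`run_dig`, `find_nameservers`, `authoritative_view`) are made up for this example. It only approximates what Let’s Encrypt’s own resolvers do, but it bypasses local resolver caches.

```python
import subprocess


def parse_answer(text):
    """Parse `dig +noall +answer` output into (owner, type, rdata) tuples."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Answer lines look like: owner. TTL CLASS TYPE rdata...
        if len(fields) >= 5 and not line.startswith(";"):
            owner = fields[0].rstrip(".").lower()
            records.append((owner, fields[3], " ".join(fields[4:])))
    return records


def run_dig(args):
    """Run dig with the given arguments and return the parsed answer section."""
    cmd = ["dig", "+noall", "+answer", *args]
    out = subprocess.run(cmd, capture_output=True, text=True, timeout=15).stdout
    return parse_answer(out)


def find_nameservers(domain, dig=run_dig):
    """Walk up the labels until a parent zone returns its own NS records."""
    labels = domain.rstrip(".").lower().split(".")
    for i in range(len(labels) - 1):      # stop before the bare TLD
        zone = ".".join(labels[i:])
        servers = [rdata.rstrip(".")
                   for owner, rtype, rdata in dig([zone, "NS"])
                   if rtype == "NS" and owner == zone]
        if servers:
            return servers
    return []


def authoritative_view(domain, dig=run_dig):
    """Map each authoritative name server to the A/CNAME records it serves."""
    view = {}
    for ns in find_nameservers(domain, dig):
        records = dig(["+norecurse", f"@{ns}", domain, "A"])
        view[ns] = sorted((rtype, rdata) for _, rtype, rdata in records
                          if rtype in ("A", "CNAME"))
    return view
```

If the name servers disagree with each other, or any of them still returns the old provider’s address, the domain is not ready for a certificate order yet.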

Thank you. I will read more about it and try.

I think the more likely reason is that you are using an autocert system with a high number of domains. IMHO, you are going to constantly run into different ratelimit issues because of this between obtaining and renewing certificates… and issues will increase as you scale the number of nodes that terminate SSL.

I had similar needs/concerns with whitelabel services for custom domains. I ended up building an ACMEv1 system, PeterSSLers, and redesigned it for ACMEv2 over quarantine.

The basic operation is this:

  • An OpenResty plugin loads certificate data from a waterfall cache: nginx-worker, nginx-master, redis, and then an internal Pyramid (Python) application.
  • The Pyramid application is a Certificate Manager and API Client. It has a UX for humans and API for apps and OpenResty. It handles the ACME ordering process and scheduling renewals.
  • The OpenResty plugin can be configured to failover to autocert, which has a locking timeout.
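For readers unfamiliar with the term, a waterfall cache can be sketched roughly as follows in Python. This is not the PeterSSLers code: plain dicts stand in for the nginx worker cache, the nginx master cache and Redis, a callable stands in for the Pyramid application, and copying a hit back into the faster layers is an assumption about how such a cache would typically behave.

```python
class WaterfallCache:
    """Look a key up in ordered layers, fastest first, then fall back to an origin."""

    def __init__(self, layers, origin):
        self.layers = layers    # list of dict-like caches, fastest first
        self.origin = origin    # callable: domain -> certificate data or None

    def get(self, domain):
        for depth, layer in enumerate(self.layers):
            cert = layer.get(domain)
            if cert is not None:
                self._backfill(domain, cert, depth)
                return cert
        # Every cache layer missed: ask the certificate manager.
        cert = self.origin(domain)
        if cert is not None:
            self._backfill(domain, cert, len(self.layers))
        return cert

    def _backfill(self, domain, cert, depth):
        """Copy a hit into every layer that is faster than where it was found."""
        for layer in self.layers[:depth]:
            layer[domain] = cert


if __name__ == "__main__":
    worker, master, redis = {}, {}, {"a.example": "CERT-A"}
    cache = WaterfallCache([worker, master, redis], origin=lambda d: None)
    print(cache.get("a.example"), "a.example" in worker)
```

The point of the layering is that the slow certificate manager is only contacted when every faster layer has missed.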

Some bits are rough, but it does its job and is constantly improved.