Large Integrators' experience with Let's Encrypt

andrei · August 20, 2018, 10:20pm

Hey there,

Our company’s planning on rolling out an integration with Let’s Encrypt, and having LE be our main provider in terms of TLS Certs.

Currently, we expect to need to request a couple of rate limit changes, and the scaffolding certainly seems to be in place to get that sorted in a timely manner. The thing I don’t get is that while the form here ( https://goo.gl/forms/plqRgFVnZbdGhE9n1 ) does have a field to request an increase in the per-domain cert limit, it doesn’t seem to have one for the ACME v2 order limit of 300 per 3 hours. Is that something that’s just hardcoded, or is there a form to request an exemption from that rate limit that I’m not seeing?

The other thing that’s somewhat scary is not having a paid support layer to rely on in case things get hairy. What has other people’s experience with that been like?

Thanks

stevenzhu · August 20, 2018, 10:23pm

Hi,

@lestaff is there a way to request a higher limit on that part?

Sometimes free support is fine... just wondering, what support would be needed for an automated issuance experience?

Thank you

Osiris · August 20, 2018, 10:27pm

You're welcome to discuss your issues here on the forum

Let's Encrypt currently supplies the world with more than 53 million certificates for free. You can imagine that offering support to "paying" "customers" is rather difficult. As you can see, it's very much possible to supply the world with tens of millions of certificates without the "trouble" of having to maintain a support staff to answer possible questions of "paid" customers. That's just not the purpose of Let's Encrypt. That's just not the current "model" of Let's Encrypt.

That said, Let's Encrypt is dead simple easy to manage. It's "just" an API: the ACME protocol. You request a certificate, you provide "proof" of ownership for the hostnames and you'll get a certificate. Need paid help? Pay some IT specialist to do it for you. It's just that simple IMHO

andrei · August 20, 2018, 10:28pm

Hey Steven,

I honestly don’t know at this point. With our current provider, we’ve had just random domains not validate properly because of reasons and those required some sort of magic jiggery-pokery on their support staff’s part to go through.

Admittedly these issues have been few and far between, so that’s mostly why I was wondering what the experience has been with LE for other high-volume users.

Osiris · August 20, 2018, 10:31pm

It's of course very much possible to have issues with the validation of hostnames. But mostly that's because of issues at the "customer" server end of the validation process. E.g., AAAA DNS records without actually working IPv6 addresses, DNSSEC issues, et cetera et cetera. We can't promise you there won't be any issue, but if you plan it right, test it right, you won't run into any.
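A pre-flight check for the first of those failure modes (AAAA records that point at addresses which do not actually answer) might look like the sketch below. It is only an illustration: the function names `resolve_aaaa`, `can_connect` and `unreachable_ipv6` are made up for this example, it uses the local resolver rather than whatever the validation servers see, and it assumes the challenge is served on port 80.

```python
import socket


def resolve_aaaa(host, port=80):
    """Return the IPv6 socket addresses published for `host` (empty if none)."""
    try:
        infos = socket.getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return [info[4] for info in infos]


def can_connect(sockaddr, timeout=5.0):
    """Try a TCP connection to one IPv6 socket address."""
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
        return True
    except OSError:
        # Covers timeouts, refused connections and hosts without IPv6 support.
        return False


def unreachable_ipv6(host, resolve=resolve_aaaa, connect=can_connect):
    """List published IPv6 addresses for `host` that do not accept connections."""
    return [addr[0] for addr in resolve(host) if not connect(addr)]
```

Running `unreachable_ipv6("www.example.com")` before placing an order would return an empty list when every published IPv6 address answers; anything else is worth fixing (or removing from DNS) first. The `resolve` and `connect` parameters exist only so the logic can be exercised without network access.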

JuergenAuer · August 20, 2018, 10:43pm

Hi @andrei

100 per hour is 2400 per day, so does your company have more than 10.000 or 20.000 customers? It shouldn't be a problem to create a list, then work through it step by step.

The API is well documented, so it's simple to create your own client. Typical problems are special configurations or "big holes" (IPv6 defined, but not working). But a hoster should be able to manage its setup so that it stays completely clear of these problems.

So the code should work with one and with 10.000 certificates. 10.000 need a little bit more time. And there is a test system, so a hoster can create a client and test enough.
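To put numbers on "a little bit more time", here is a rough sketch of the arithmetic, assuming the limit of 300 new orders per 3 hours mentioned at the top of the thread and no raised limits. The helper names `hours_to_issue` and `batches` are hypothetical.

```python
ORDERS_PER_WINDOW = 300  # order limit quoted earlier in the thread
WINDOW_HOURS = 3         # length of the rate-limit window, in hours


def hours_to_issue(total_certs, per_window=ORDERS_PER_WINDOW, window_hours=WINDOW_HOURS):
    """Hours needed to place `total_certs` orders at the steady-state rate."""
    return total_certs * window_hours / per_window


def batches(names, per_window=ORDERS_PER_WINDOW):
    """Yield slices of `names`, one slice per rate-limit window."""
    for start in range(0, len(names), per_window):
        yield names[start:start + per_window]
```

With these assumptions, `hours_to_issue(10000)` comes out at 100 hours (a bit over four days) and a list of 10.000 names splits into 34 batches, the last one holding 100 names. Renewals count too, so a real scheduler would spread both initial issuance and renewals across the windows.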

cpu · August 21, 2018, 1:33pm

Yes, I believe the form is just missing this field and we typically follow up with reporters to ask about this rate limit.

@jple Could we update the form to include this field?

I'm obviously not speaking from an unbiased perspective here but I think our community support is pretty great thanks to the many excellent volunteers that offer their time/expertise. In previous jobs I've had a hard time interacting with some unnamed commercial CAs - specifically trying to get access to engineers beyond the first tier customer support. With Let's Encrypt our engineering staff is quite a bit more accessible! Compared to other CAs our software and protocol are both open and that goes a long way with helping folks diagnose their own problems too.

Osiris · August 21, 2018, 2:07pm

A little bit offtopic perhaps, but mostly I'm seeing individuals and small companies opening threads on the community (as far as I know, the only support offered by Let's Encrypt). The "big dogs" out there, offering thousands of certificates per day to heck-do-I-know number of customers: I'm not seeing them here. So I'm a little bit inclined to conclude Let's Encrypt/ACME is that easy, so a well staffed IT-department doesn't need this forum for support.
This theory of course can be biased, as it is possible large companies do run into trouble, but would refuse to open threads here because they are afraid it might lead to negative attention upon their company.

andrei · August 21, 2018, 4:14pm

Yeah, given that I couldn’t find any large companies having massive fits about issues with LE, I would assume that the service is stable and there are no off-the-cuff changes being made to the underlying API / architecture that would send people with existing tooling into a tailspin.

Also, one would assume, given the nature of the service, a 10-20 minute maintenance window / incident wouldn’t be the end of the world since most large integrators would pad their cert issuing timeframes. I do wonder how people relying on LE certs for communication within their infrastructure deal with that, but I assume they’d be doing wildcard certs to simplify things.

I guess what I’m trying to get a better sense of is the size of the unknown unknowns when interacting with Let’s Encrypt. For instance, are there any edge cases where a domain just won’t validate? Are there any DNS servers that just don’t play nice with LE’s verification of CAA records?

andrei · August 21, 2018, 4:18pm

Oh awesome!

Yeah, think the main difference is going from just throwing issues over to the vendor's paid support to setting aside the dedicated resources internally to debug issues as they come up. The thing I don't have a sense of yet is how often these issues come up

schoen · August 21, 2018, 4:53pm

You can definitely have problems like this. For example, LE uses mixed-case DNS queries which mitigate some kinds of DNS attacks; there were several significant DNS appliance and software vendors which initially didn't use the behavior that LE required. It took some effort to get them to update their software.
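As a rough illustration of the mixed-case idea (not Let's Encrypt's actual resolver code): the querier randomizes the case of the letters in the name and only accepts a response that echoes the name back with exactly the same casing. A DNS server that lowercases the name before echoing it fails that check. The function names below are hypothetical, and the sketch assumes plain ASCII (or punycode) names.

```python
import random


def randomize_case(name, rng=None):
    """Return `name` with each letter independently upper- or lower-cased."""
    if rng is None:
        rng = random.SystemRandom()
    return "".join(
        ch.upper() if rng.random() < 0.5 else ch.lower()
        for ch in name
    )


def echo_matches(sent_name, echoed_name):
    """True only if the name in the response matches the query exactly."""
    return sent_name == echoed_name
```

For example, if the query went out as `eXaMpLe.com` and the authoritative server answers for `example.com`, `echo_matches` returns `False` and the response would be discarded, which is the kind of incompatibility described above.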

If you do have names that you're trying to issue for where some of the infrastructure is under the control of your customers or partners, you can expect some challenges (in both senses!) where things break due to software incompatibility or misconfiguration. I think this is rare but a large provider dealing with huge numbers of names will definitely come across it.

If you control the infrastructure yourself (for example, if you run the DNS servers that LE is querying and the web servers—if any—to which challenge connections are made), I think you can get to a high level of reliability. As we've mentioned, although there is no formal commercial support, the LE engineers care about your problems and are very accessible. I've been involved in and witnessed some pretty elaborate debugging efforts where issuance was failing in a particular country or for a particular vendor's products and we generally did a good job of getting to the bottom of why and figuring out how to fix it.

I think the cases that we're not doing a good job with aren't ones that will be relevant to you:

End-users who don't have the right background for the tools that they're trying to use, or who aren't choosing the right tools for their use cases.
Vendors who develop their own Let's Encrypt clients but don't participate in the forum.

Well, maybe also

People who, for brand or client confidentiality reasons, don't want to debug in public.

Osiris · August 21, 2018, 5:18pm

Hey don't sell yourself short, that's like 99,9 % of the content of this forum and you're one of the most active people here!

andrei · August 21, 2018, 11:52pm

Thanks for chiming in on this Seth.

That's pretty much the boat we're in, so the expectation is that there will be a bunch of domains where the problems are just "magic". My assumption is that the code the production boxes run is letsencrypt/boulder on GitHub (an ACME-based certificate authority, written in Go), so that would make it a lot easier to debug any would-be edge cases, right?

Yeah, this one I'm assuming will maybe be an issue at some point, but we can probably anonymize things or use the actual boulder code to figure out where the failures are coming from.

Osiris · August 22, 2018, 9:51am

Correct. You also might be interested in the miniature version of Boulder, called Pebble. Be careful though to check the ACME versions used by the software: the ACME protocol is currently still a draft (although almost an RFC) and the latest draft at time of writing is 14. But Pebble is based on 13. I myself don't know if there are big differences, but if you'd like to use Pebble/Boulder, it would be good to check if the actual public servers of Let's Encrypt use the same version of the draft as you do.

And, as always, use the staging environment of Let's Encrypt for testing purposes.
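One simple way to make staging the default is to have the client pick its ACME directory URL from a setting and fall back to staging. The sketch below assumes the ACME v2 directory URLs Let's Encrypt publishes for staging and production; the names `DIRECTORIES`, `directory_url` and `fetch_directory` are hypothetical.

```python
import json
import urllib.request

DIRECTORIES = {
    "staging": "https://acme-staging-v02.api.letsencrypt.org/directory",
    "production": "https://acme-v02.api.letsencrypt.org/directory",
}


def directory_url(environment="staging"):
    """Return the ACME directory URL, refusing unknown environment names."""
    try:
        return DIRECTORIES[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None


def fetch_directory(environment="staging", timeout=10):
    """Download and decode the directory object (requires network access)."""
    with urllib.request.urlopen(directory_url(environment), timeout=timeout) as resp:
        return json.load(resp)
```

Keeping "production" an explicit opt-in means a misconfigured test run hits staging rather than the production rate limits.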

andrei · August 22, 2018, 4:33pm

That's a good point, just noticed that the status page surfaces maintenance windows with the actual releases being deployed, so that'll definitely help debug whatever issues pop up.

cpu · August 22, 2018, 4:46pm

The differences between 13 and 14 were almost entirely editorial. The only meaningful change was the addition of a new error type for when a certificate is requested to be revoked that has already been revoked. Boulder will be compatible with this in the near future. Pebble has an open PR to implement the error type and we can update the README for draft-14 after that lands.
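On the client side, the new error type mostly matters for retries. A minimal sketch, assuming the server returns a JSON problem document whose `type` is `urn:ietf:params:acme:error:alreadyRevoked` (the identifier the draft defines for this case): treat that response like a success so a retried revocation does not count as a failure. The function name `revocation_succeeded` is hypothetical, and whether to treat the error as success is a design choice, not something the protocol requires.

```python
import json

ALREADY_REVOKED = "urn:ietf:params:acme:error:alreadyRevoked"


def revocation_succeeded(status_code, body):
    """Return True if the certificate is revoked after this response."""
    if status_code == 200:
        return True
    try:
        problem = json.loads(body)
    except ValueError:
        # Body was not JSON, so there is no problem document to inspect.
        return False
    return isinstance(problem, dict) and problem.get("type") == ALREADY_REVOKED
```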

andrei · August 22, 2018, 8:38pm

Speaking of revocation, has BygoneSSL caused any changes in that process at all for Let’s Encrypt? I know that currently having the private key of the certificate is needed in order to request revocation. Is that going to be the case going forward as well?

mnordhoff · August 22, 2018, 8:54pm

That's one method of revocation, but not the only one. If you have the certificate and can validate all of the names, you can revoke it that way.

A CDN-style certificate with multiple domains that you only partly control remains a problem.

Edit: Let's Encrypt's shorter lifecycles, compared to most CAs, also reduce the issue.

andrei · August 22, 2018, 8:58pm

Yeah, those are the ones I was wondering about. Do you have any sense of whether there are any changes planned around that?

jared.m · August 22, 2018, 9:00pm

The word from Let’s Encrypt staff has been that no changes are planned, but if any were made in the future, they would almost assuredly be to shorten the certificate lifespan as opposed to lengthen it.
