Our system uses anycast for DNS (DOH and DOT) services. We'd like to use IP authentication, since our IP addresses are used by clients to bootstrap or directly communicate with our systems.
I see a few problems with the currently documented model for IP address certificates, and I'm wondering if there can be a discussion on the topic.
Cert lifetime is too short. We own our own /24s and /48s, and we are listed as the "owners" of the address space in the RIR. If RIR "ownership" can be validated, then it should be possible to extend the validity period to something more than 6 days; 30 days would be more reasonable for critical services like DNS. The cost of a validation failure is exceptionally high, since any problem with validation will leave tens (or hundreds) of millions of users unable to perform any internet transactions at all. In order to catch and solve problems, the 6-day interval effectively turns into 2 days (imagine a problem during a holiday window, where either LE or our staff has a 3-day "gap", then add a day for iteration). This leaves no room for anything less than perfect management on both sides of the problem equation, which is not a viable path forward.
The validation process being limited to HTTP-01 or TLS-ALPN-01 is a problem. Our anycast prefixes do not operate on port 80 anywhere - that port is closed for various policy reasons. Our current software stack (dnsdist) does not support TLS-ALPN-01, and I don't see that on any horizon. Why not accept HTTPS (443) as a destination for HTTP-method validation checks on IP addresses, without checking cert validity? That seems no worse or better than port 80. I'm not a huge fan of this method, since it requires distributing the validation keys to our anycast array with LE systems connecting to an arbitrary anycast cluster, but it's at least possible to build a solution, whereas port 80 is a non-starter.
Hey, that's actually like the most useful use case for IP address certificates!
Yes, short-lived certificates require a lot of faith in one's automation. I would expect that any real-world production use case of short-lived certificates would be getting certificates from multiple Certificate Authorities, with a system that automatically switches between them as needed. (See, for instance, what Wikimedia does, where they use two CAs, both certs loaded in all their data centers, with different data centers serving a different one by "default" so both certs are known to work well, and the ability to switch all the datacenters to using the other if one of the certs stops working. That's with non-short-lived certs, but I think is the best practice regardless.)
That is close to a case that already works. One needs to start on port 80, with the port-80 HTTP server serving a redirect to HTTPS on 443; the validation then follows the redirect to port 443 without caring about the certificate currently being served there. So one would need a simple redirect-only HTTP server on port 80, which usually isn't much of a security issue.
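For illustration, a minimal sketch of such a redirect-only listener (Python stdlib, not hardened for production; the placeholder address is just an example):

```python
# Redirect-only listener on port 80: every request, including
# /.well-known/acme-challenge/<token>, is bounced to HTTPS on the same host.
from http.server import BaseHTTPRequestHandler, HTTPServer

class RedirectToHTTPS(BaseHTTPRequestHandler):
    def do_GET(self):
        host = self.headers.get("Host", "192.0.2.53")  # placeholder fallback address
        self.send_response(301)
        # Preserve the path so the ACME challenge URL survives the redirect.
        self.send_header("Location", f"https://{host}{self.path}")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet

if __name__ == "__main__":
    # Binding port 80 typically requires elevated privileges.
    HTTPServer(("", 80), RedirectToHTTPS).serve_forever()
```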
That really is the trick, and my understanding is that it's a much harder problem than one would expect. I'd love to be proved wrong, though.
Thanks for participating here! Many of us are out for the US/Canada holiday long weekend, but I have a couple of quick replies. I don't mean these to be discouraging: although they are the "here's why that won't work" type of replies, we're eager to help figure out how you could make things work, despite the constraints.
Many of the constraints are based on the CA/B Forum Baseline Requirements (BRs), which apply to all publicly trusted certificate authorities.
This is a tricky one. Because of how Let's Encrypt works, validation methods need to be 100% automated with 100% integrity: i.e. it should not be possible for a parsing error to permit issuance by mistake. I'm pessimistic about parsing RDAP data with the level of integrity this would require. Let's Encrypt also doesn't have any support for validation by email, which would be needed for the validation method (BR 3.2.2.5.2) that involves checking RIR data.
So, that's one major reason why IP address identifiers are limited to the short-lived certificate profile; it's safer to assume that subscribers only briefly control their IP addresses (e.g. cloud/VPS instances).
Unfortunately, those are the only standardized ACME validation methods for IP addresses (in RFC 8738), and the HTTP-01 method must start on port 80 (RFC 8555, Sec. 8.3).
BR 3.2.2.5.3 does allow IP address validation based on reverse DNS resolution, although this isn't currently implemented for ACME. So, I'm not sure any external constraint prevents this, but there would be a lot of implementation work in order to make this happen.
Are you able to share more details about the policy problems with opening port 80? I can think of a few clever ways to add a shim in front of dnsdist, but they're probably not suitable for production. Adding TLS-ALPN-01 support seems like a heavy lift, but might be the simplest thing to do.
Good to hear, since that's all we do. We've been waiting for LE to offer v4/v6-based certificates for quite some time, so I was pleased to see the announcement a while back, though the details were not exactly what we had hoped for. I'm with Quad9, doing some preliminary investigation on this topic; we've been delivering service with IP-based certificates for 8-ish years now.
This dual-cert model is certainly possible, but it adds significant operational complexity, and whenever operational complexity increases, failure modes increase. I'm sure we can consider this model in the future, but our first intention would be to run with one certificate as we have done in the past. Our current vendor is Digicert, which is to my knowledge the only other possible option for IPv6 IP-based certificates - is my information outdated? The costs for an IP-based cert with additional names from them are non-trivial and rising rapidly. We are a non-profit, and keeping costs low is important, but keeping reliability high is more important. We were hoping to do a 1:1 conversion to LE, but that does not seem viable with a 6-day certificate duration. Digicert's long-duration certificate lets us worry less about cert automation and spend time on the other parts of the automation workflow that demand our attention more (namely, the DNS parts). It is unlikely that we would build a new automation platform to push certificates to hundreds or thousands of edge systems on such a regular basis in the near future. There is a significant difference between pushing certs upon system activation/reboot and pushing them every few days. Is this a solved problem for others? Yes, certainly - it is quite achievable. But is it a problem we can take on right now? Probably not, unless someone wants to give us a grant for that work.
While I understand that it appears obvious that it should be "simple", there are operational realities which make this not quite so trivial. We are not an HTTP content delivery organization - our job is delivery of DNS, and DOT/DOH are just channels. We constrain our network to delivery of DNS only. There are zero port-80 tools operating on our network, except on our main web server, which does a redirect. Opening port 80 in two hundred and sixty plus locations (and changing partner firewall/anti-DDoS models that have been specifically negotiated not to expect port 80), and adding the defenses to manage exhaustion/DDoS/probing/etc. on another port, is a very complex process. We do not run a "web server" at all on these systems, so this would add another parallel protocol across thousands of instances for just this one tiny event: receiving an LE validation request from somewhere, at some point in time, once every 6 days, on exactly one of our anycast nodes. This does not seem like a good use of our limited edge resources or operations staff time at the moment.
Another concept that would perhaps allow a somewhat elegant workaround for validation: would insertion of a validation key in the in-addr space be sufficient? You currently have a trust model for forward zones - why not in-addr.arpa or ip6.arpa? This method isn't validating the RIR ownership data, but anyone who can insert records in the inverse zone is effectively proving "control" over (at minimum) the /24 or /48 in question, in addition to the specific /32 or /128.
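Purely as a hypothetical sketch of what that could look like (no such IP validation method is standardized today; the "_acme-challenge" label is borrowed from the forward-zone dns-01 method just for illustration):

```python
# Where a dns-01-style TXT record could live in the reverse zone for an address.
import dns.reversename  # pip install dnspython

def reverse_challenge_name(ip: str) -> str:
    rev = dns.reversename.from_address(ip)  # 192.0.2.53 -> 53.2.0.192.in-addr.arpa.
    return "_acme-challenge." + rev.to_text()

print(reverse_challenge_name("192.0.2.53"))
# _acme-challenge.53.2.0.192.in-addr.arpa.
print(reverse_challenge_name("2001:db8::35"))  # ends in ...ip6.arpa.
```

Anyone who can publish a TXT record at a name like that has demonstrated control of the reverse zone for that address and, by delegation, the enclosing /24 or /48.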
Understood. I have no issue with short-lived certificates where the "control" justification is weak. However, where it is strong (for example: RIR data matching requester data, or inverse-address control), I was hoping to see longer intervals to provide a more robust defense (or at least more time to resolve issues) when faced with faults.
This actually seems like one of the most obvious methods to start with for IP-based validation, in my view. I believe that DOH and DOT are, or will be, large drivers for IP-based validation, especially now that ADD is starting to be actively deployed. Ownership of the IP space for those servers is very stable - the addresses are being inserted into lots of equipment, and they will rarely, if ever, change. I would suspect this will mostly be IP space that is RIR-allocated, and not dynamic.
What seems to be missing in BR 3.2.2.5.3 is an explicit statement supporting a TXT-record model for validation, though since it says any method in 3.2.2.4 is allowed, perhaps it can be inferred that this is also acceptable. Inverse zones are just like forward zones - any record can be added.
See my reply above in this thread. We operate no web services at all on the anycast DOH/DOT receivers, and adding a web server to our entire fleet just for LE auth is very complex and wasteful. If a query could be sent to 443 (ignoring the cert), then we could probably do a redirect to a central system which held the credentials. I'm not sure why this method (443, but accepting any cert) doesn't have equal standing, though I may be missing a reason why it isn't at exactly the same trust level as port 80.
Oh yes, I wasn't trying to imply that it was easy. Since short-lived certs do require one to be really confident in one's automation, there certainly is a lot of operational complexity. Hopefully the tooling will get to the point where this sort of thing is the default and simple to deploy everywhere, which I think is the eventual goal, but we certainly aren't there yet. Short-lived certs are definitely a bleeding-edge thing, and I totally understand that many production systems would want to take a wait-and-see approach for a bit.
I thought that it was a thing that most of the big players did, though not free. But I haven't looked around much to see exactly what is offered by who.
No, I don't think you'd be able to do that, and I don't think that LE is really trying to directly "compete" with other IP cert providers. I think it's more that LE is exploring providing short-lived certificates, as a possible solution for how broken revocation of certificates is in practice (and to lead the industry in showing how it may be possible), and kind of as a side effect of that they can end up trying to offer IP certificates as well. But again, yes, this is all bleeding-edge stuff and there's a lot to consider and set up before one might want to really use it in production for something mission-critical.
Then yes, I would agree that Let's Encrypt's short-lived certs are probably not the best fit for your organization quite yet. They're not even available to the public yet, since Let's Encrypt wants this to be rolled out slowly, and it's not clear what the implications will be for Certificate Transparency servers or the rest of the ecosystem.
I know this has been talked about in the various committees, but I think the gist of it is that just because someone is delegating reverse DNS to an organization, that doesn't necessarily mean that they want to delegate the ability to create IP certificates for that range. (And there's a similar problem with how one might want to define CAA records for IP certs.) Hopefully someone can give actual links to discussions somewhere; I'm just basing this on vague memories and may be mistaken.
I'm not sure this requires another thread, but I'll mention it here as well:
The ADOT effort in the DNS community is thankfully gaining momentum. There is a long-term hope to have the proposed DELEG model signal encryption for zones with better accuracy and information, but the size of that proposal means it will take some time to move through the approval and eventual implementation process.
There are interim efforts to try to secure recursive-to-authoritative DNS data, of which I know LE is aware given your statements on using authenticated DNS in other areas.
With my Quad9 hat on, I'll say that we are very enthusiastic about DOT as a transport mechanism, even in "opportunistic" mode. There was a promising talk at DNS-OARC last week on providing signaling via SVCB records (https://indico.dns-oarc.net/event/55/contributions/1182/attachments/1126/2365/oarc45-dns-transport-signaling.pdf), and if those records can be DNSSEC-signed then perhaps that offers an alternative to "blind attempts".
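As a hedged sketch of what a resolver-side check could look like (the query name is made up, loosely following the "_dns" label from RFC 9461; the naming scheme the talk proposes may differ):

```python
# Ask for an SVCB record that could signal encrypted transport for a nameserver.
import dns.resolver  # pip install dnspython

answers = dns.resolver.resolve("_dns.ns1.example.net", "SVCB")
for rdata in answers:
    # e.g. '1 ns1.example.net. alpn="dot" port=853' if the operator published
    # a DoT hint; the answer should be DNSSEC-validated before acting on it.
    print(rdata.to_text())
```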
One of the drawbacks of RFC 9539 is that it does not require authentication - this is a bit distressing to us, as it invites interception and injection attacks. It may be the case that in our further testing of opportunistic ADOT we require certificates to be issued for the IP address of the authoritative server. This has not been decided, and we currently do not require IP-based cert credentials, but it may make sense in the near future if LE enables this offering for v4/v6 addresses. Again, these are servers with high stability, typically on non-dynamic infrastructure. 6 days is workable if fully automated, but it is still an eyebrow-raising interval for any organization that is relying on its nameservers for secure delivery of authoritative data to clients in high-risk environments (risk = observation or injection of bogus data for unsigned zones).
I suppose the net summary here is: I hope that the decisions LE takes on this topic don't become an inadvertent wet blanket on the services that need this the most, and on new protocols that, in my opinion, would benefit significantly from a longer validity interval.
Hi @jtodd, now that we're back from our long weekend, I'll try to provide as comprehensive an answer as I can.
This idea absolutely makes sense. Unfortunately, we have no ability today to validate your ownership of the address space, and it is very very unlikely that we will add such ability in the future. We are a very small non-profit organization, so it's critical that we keep our operations as simple as possible. Building whole new feature sets, which affect our audit criteria, for the use of just a small handful of Subscribers is not an efficient use of our limited time and resources. In order to service the entire internet without thinking about special cases, we generally have to build features for the lowest common denominator -- everyone can prove control of a single IP address, almost no one can prove ownership of a whole address space.
This should not be the case. Failure to validate should result in an automatic retry, and the old certificate should not be decommissioned until after validation and issuance of the new cert has succeeded.
Yes, this is why our documentation of the shortlived profile explicitly calls out that it is only suitable for organizations that fully trust their automation and/or have an on-call rotation to handle issues outside of business hours. Our own ARI system suggests that shortlived certs be renewed halfway through their lifetime; your napkin math is similar.
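Napkin math along those lines, just to make the window concrete (the retry cadence below is purely illustrative, not a recommendation):

```python
# How much slack a 6-day certificate leaves if renewal is attempted at the
# ARI-suggested halfway point and then retried on a fixed cadence.
from datetime import timedelta

lifetime = timedelta(days=6)
first_attempt = lifetime / 2           # ~day 3, per the halfway suggestion
slack = lifetime - first_attempt       # ~3 days to notice and fix a failure

retry_every = timedelta(hours=8)       # illustrative cadence only
attempts_left = slack // retry_every   # retries available before expiry

print(f"first attempt at {first_attempt}, {slack} of slack, "
      f"~{attempts_left} retries at {retry_every} intervals")
```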
Because making the request directly to port :443 is explicitly forbidden by the ACME protocol, and the Baseline Requirements require us to adhere to that RFC. You're far from the only one who would like us to be able to skip the first request to port :80, but that's a security-driven policy requirement that we're unlikely to be able to change.
Now for some better news. The ACME Working Group is working on standardizing a new DNS-based validation method called dns-persist-01. The CABF has already approved this method for use in validating dnsNames, and some folks are preparing another ballot to also allow its use for validating IP addresses via the reverse DNS zone. We hope to have funding to implement this method in 2026.
That's awesome, as it is essentially a stateless validation method - set up manually once, but from then onwards, authorized ACME clients can use it independently, without any interaction with the domain owner (or, in this case, the holder of the IP range delegation).