Let's Encrypt needs a backup domain and a better registrar

In light of today’s incident, I propose moving the letsencrypt.org domain to a registrar that caters to high-profile domains, such as Cloudflare Registrar or a similar one.

You can read the benefits here:


https://www.cloudflare.com/es-es/registrar/

You can check the domain security here:

Also, a backup domain should be registered with a different registrar and point to different servers as a failover service, so that a client that fails to reach the .org domain can retry with the backup one.

We need redundancy.
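To make the failover idea concrete, here is a minimal client-side sketch. It is only an illustration under assumptions: both URLs are placeholders (no backup domain exists), and `fetch_with_failover` is a hypothetical helper, not part of any ACME client.

```python
from typing import Callable, Iterable, Tuple

# Both URLs are placeholders for illustration; the backup domain is hypothetical.
PRIMARY = "https://acme.example.org/directory"
BACKUP = "https://acme-backup.example.net/directory"


def fetch_with_failover(
    urls: Iterable[str], fetch: Callable[[str], bytes]
) -> Tuple[str, bytes]:
    """Try each URL in order and return (url, body) for the first one that works."""
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:
            # DNS resolution failures and connection errors are OSError subclasses.
            errors.append(f"{url}: {exc}")
    raise ConnectionError("all endpoints failed: " + "; ".join(errors))
```

With the standard library, `fetch` could be something like `lambda u: urllib.request.urlopen(u, timeout=10).read()`. Passing the fetcher in keeps the retry order separate from the transport, so it can be exercised without touching the network.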

Thanks for your time!

We don’t even know what happened yet. DNS is complicated. Let’s not jump to conclusions before the investigation completes.

Indeed, Let’s Encrypt needs to find a registry that doesn’t put clientHold on its customers’ domains…

Pricey doesn’t mean better. It just means it costs money.
Actually Cloudflare is faster than Akamai and fully standards compliant.

I agree.

However, I think the issue now is why there was a clientHold & registrarHold with eNom.

True, this is the most important thing to figure out now. Only then can informed decisions be made.

Akamai is a founding member and sponsor of ISRG, I imagine that came with a significant discount :stuck_out_tongue: .

client* flags are generally a registrar issue rather than a registry issue.

If the clientHold flag was added by the registry itself without prior notice, the issue is a bit more disturbing...

Anyway, I agree with the OP that Let's Encrypt should have 2 domains managed by 2 different registrars. And these domains should be on 2 different TLDs managed by 2 different registries.

And these domains should be locked at registry level (server* flags).
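As a rough illustration of the client*/server* distinction, the prefix of an EPP status code tells you who set it. The helper names below are made up for this sketch. The status codes themselves (clientHold, serverHold, clientTransferProhibited and so on) are the standard EPP ones.

```python
def status_origin(status: str) -> str:
    """Say who sets an EPP domain status, judging only by its prefix."""
    if status.startswith("client"):
        return "registrar"  # e.g. clientHold, clientTransferProhibited
    if status.startswith("server"):
        return "registry"  # e.g. serverHold, serverUpdateProhibited
    return "other"  # e.g. ok, inactive, pendingDelete


def is_hold(status: str) -> bool:
    """Hold statuses remove the delegation from DNS; the *Prohibited locks do not."""
    return status in {"clientHold", "serverHold"}
```

So a registry-level lock would show up in WHOIS as server* statuses such as serverTransferProhibited, which the registrar cannot lift on its own.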

I’m not sure a fallback domain is worth the extra effort and expense to protect against such a rare circumstance. If the outage was due to a registrar error, it might have helped, but if the outage was due to e.g. a malicious abuse report to ICANN, both domains could have easily been affected at the same time and it would have been a total waste of effort.

An outage of several hours shouldn’t affect anyone using Let’s Encrypt properly, anyway.

The points about registrar locks and other measures to increase the security of the existing domain are definitely valid (though I find it unlikely that Cloudflare is the only registrar that provides such features) and I’m certain the Let’s Encrypt staff had already started considering them before you even raised the thread. :slight_smile:

> An outage of several hours shouldn’t affect anyone using Let’s Encrypt properly, anyway.

It does affect OCSP (if you're not stapling, at least, or if your staple cache is only updated too close to expiry). But a backup domain won't help for that either. (Or can a certificate contain several different OCSP URIs?)

In fairness, it also affects new issuance, which may be problematic for large integrators in particular. That being said, I don't believe a backup domain, and all that entails, are worth the effort, to be honest.

It will definitely affect all current users and certificate holders... (It won't affect Chrome users, though, since Chrome doesn't actively check OCSP.)

Imagine if Comodo or another CA had the same issue...

OCSP is fail-open precisely for this reason. And there's hardly anything that fails faster than a domain not existing, so this was the best possible kind of outage to ask for from an OCSP POV.

I think you can repeat AIA fields but browsers probably won't bother checking more than one.

Dear Customer,

We're sorry, but due to delays at Let's Encrypt, SSL certificate issuance for our customers and many others around the globe has been delayed. Your site was made available with a self-signed certificate so you can test it, but your browser will display security warnings. Also note that your site may not be accessible at all yet due to DNS propagation delays, which could take up to a day.

We will notify you when a certificate is issued and you can access your website without any security warnings. We're sorry for any inconvenience this delay has caused. Have a nice day!

If you had a deadline yesterday, you should have made sure it worked three days ago. If it's not important enough to plan ahead or to give GoDaddy or Comodo $10 for, you can spend that money on a beer instead and wait a few hours... It's not like S3 is down and all your files are gone; you have plenty of other options for certificates, or you can just chill.

They go down too, they just get bought out and change status pages all the time so you can't see the history:

https://forums.cpanel.net/threads/comodo-ocsp-outage.604051/

Not if you are using the Must Staple flag in your cert (which you SHOULD use).

A failsafe is not a good idea. If an attacker can MITM you with a compromised certificate, they can also block OCSP queries to prevent the revocation from working.
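To spell out the difference between fail-open checking and Must Staple, here is a simplified sketch of the client-side decision. It is not how any particular browser implements it; the `Ocsp` enum and `accept_certificate` are names invented for the example, and `None` stands for "no revocation status could be obtained".

```python
from enum import Enum
from typing import Optional


class Ocsp(Enum):
    GOOD = "good"
    REVOKED = "revoked"


def accept_certificate(response: Optional[Ocsp], must_staple: bool = False) -> bool:
    """Decide whether to proceed, given the OCSP status (None means no answer)."""
    if response is Ocsp.REVOKED:
        return False
    if response is None:
        # Fail-open: with no answer, proceed anyway. Must Staple turns this
        # case into a hard failure, so blocking OCSP no longer helps an attacker.
        return not must_staple
    return True
```

The `None` branch is the whole argument: without Must Staple, an attacker who blocks the responder gets the same result as a responder outage.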

ICANN doesn't have any direct control over registries. And if a registry receives instructions to shut down a well-known domain, be sure they will double-check that (with ICANN or whichever authority issued the instruction) before actually disabling the domain.

Let's Encrypt's OCSP responses are valid for 6-7 days from the moment you request them. OCSP stapling implementations will cache the old response while a new one is unavailable, if they even bothered retrieving one at all during the outage window.
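A rough sketch of that caching behaviour, assuming a `fetch` callable that returns the response bytes together with its next-update time as a Unix timestamp. `StapleCache` and its parameters are hypothetical and not taken from any real server.

```python
import time
from typing import Callable, Optional, Tuple


class StapleCache:
    """Keep serving the last good OCSP response until it actually expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[bytes, float]],
        refresh_after: float,
        now: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._refresh_after = refresh_after
        self._now = now
        self._response: Optional[bytes] = None
        self._next_update = 0.0
        self._refresh_at = 0.0

    def get(self) -> Optional[bytes]:
        t = self._now()
        if self._response is None or t >= self._refresh_at:
            try:
                self._response, self._next_update = self._fetch()
                self._refresh_at = t + self._refresh_after
            except OSError:
                pass  # responder unreachable: keep the old response, retry next call
        if self._response is not None and t < self._next_update:
            return self._response
        return None  # nothing valid left to staple
```

Refreshing well before expiry (say daily for a response valid for about a week) is what gives the server several days of headroom during a responder outage.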

I think I had the ICANN UDRP in mind when I wrote that, even though that really can't be involved here. Nonetheless, there are many ways to get your domain shut down that the people you pay for your domain can do little to nothing about: https://www.eff.org/files/2017/08/02/domain_registry_whitepaper.pdf

Until we have more information I cannot assume it's all their registrar's fault, sorry. It's a complicated Internet we're all connected to.

We’ve posted a postmortem for this outage: 2018.07.30 Domain Resolution Interruption

Only good ones -- AFAIK most web servers don't do the caching by default (or do stapling very well/correctly) -- I know Caddy does but that's the only server I can speak of. Caddy can survive OCSP responder outages for about 3-4 days because of its cached stapling implementation.

Yes, we all agree. :slight_smile: This is why a robust server-side implementation of stapling is important. See this GitHub Gist and its comments: ocsp-stapling.md · GitHub

Nginx & Apache will also survive with a correct OCSP stapling setup, which is relatively easy but still needs some configuration.