We run a service (exe.dev) where we request certificates for users' domains. Today we were having trouble getting certs. We discovered (after much debugging) that ns2.exe.dev was unavailable. Users had no trouble getting to their sites because ns1.exe.dev was functioning normally.
However the second name server being out of action meant we could not get certs from LE. This lasted for hours, until we figured out what was broken and fixed ns2.
This is, clearly, our fault. We should have fixed our name server within minutes of it breaking, and we are busy adding monitoring now. But it is also odd behavior for LE. I would have expected your servers, on failing to connect to ns2, to try ns1. It is unfortunate our failure cascaded into your system.
If there is any more information I can get you, happy to try.
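On the monitoring side, a minimal Python sketch of a per-nameserver probe, in case it is useful to others. It queries each authoritative server directly over UDP rather than going through a recursive resolver, since a resolver will quietly fail over to the healthy server and hide the outage. The function names (`build_query`, `nameserver_answers`) are made up for this example, the zone and server names are the ones mentioned in this post, and it only checks that a reply with RCODE 0 comes back, not that the answer is correct.

```python
import socket
import struct

TXID = 0x1234  # arbitrary transaction ID used to match the reply to the query


def build_query(name: str, qtype: int = 2, txid: int = TXID) -> bytes:
    """Build a minimal DNS query packet (default QTYPE 2 = NS, QCLASS 1 = IN)."""
    # Header: ID, flags (0 = standard query, recursion not requested),
    # QDCOUNT = 1, ANCOUNT = NSCOUNT = ARCOUNT = 0.
    header = struct.pack("!HHHHHH", txid, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def nameserver_answers(server: str, zone: str, timeout: float = 3.0) -> bool:
    """Return True if `server` replies to a direct UDP query for `zone` with RCODE 0."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(build_query(zone), (server, 53))
            reply, _ = sock.recvfrom(4096)
    except OSError:  # covers timeouts, unreachable hosts, and lookup failures
        return False
    if len(reply) < 12:  # shorter than a DNS header
        return False
    reply_id, flags = struct.unpack("!HH", reply[:4])
    is_response = bool(flags & 0x8000)  # QR bit set
    rcode = flags & 0x000F              # 0 = NOERROR
    return reply_id == TXID and is_response and rcode == 0


if __name__ == "__main__":
    for ns in ("ns1.exe.dev", "ns2.exe.dev"):
        status = "ok" if nameserver_answers(ns, "exe.xyz") else "NOT ANSWERING"
        print(f"{ns}: {status}")
```

Run from cron or an existing monitoring agent, and alert if any single server stops answering, not only when all of them do.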
Pretty sure it'll allow one of many nameservers to fail, but if you only had 2 nameservers and only one was working it wouldn't have enough to form a multi-perspective opinion about validation.
You currently seem to have a standard AWS Route53 setup with multiple nameservers, not sure if you've just moved to that or not.
Oh that's really interesting! Is there some critical information LE gets out of using multiple root NSs?
We use route53 for exe.dev, but our users host on exe.xyz, using the nameservers ns1.exe.dev / ns2.exe.dev. We could deploy more, but I would love to first understand what LE is getting from that, given we are using TLS-ALPN-01 verification.
The idea is that even if you can spoof one perspective, you (hopefully) can't spoof them all without having genuine control over the domain. Nowadays I think it's a requirement for all CAs.
Think about what happens if someone can take over one of the nameservers but not all of them, so the answers are inconsistent. In that case the only safe thing to do is to not issue any new certificate at all.
My understanding of the "multi-perspective" part is that it's about multiple paths from the Internet to guard against BGP-based attacks (and the web page you posted agrees with that interpretation). I don't see how the number of working nameservers you have affects that at all.
If there's an additional requirement that you need to have a minimum number of nameservers working to issue a LE certificate, that's news to me.
Each perspective selects an authoritative NS at random. So a primary and some of the secondary perspectives may select working NSes and other perspectives may select the broken ones.
Right, we don't target many different authoritative nameservers on purpose, but since we're making requests from 5ish different perspectives, statistically each NS is going to get hit from at least one perspective.
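To put a rough number on that, here is a small Python sketch of the arithmetic. It assumes a deliberately simplified model that is not confirmed anywhere in this thread: each perspective independently picks one authoritative nameserver uniformly at random and does not retry another one on failure. The function name `p_at_least` and the figures (5 perspectives, 2 nameservers, 1 broken) are illustrative only.

```python
from math import comb


def p_at_least(k: int, perspectives: int, p_broken: float) -> float:
    """Probability that at least k perspectives pick a broken nameserver.

    Simplified model (an assumption): each perspective picks independently,
    and p_broken is the fraction of nameservers that are down.
    """
    # Binomial tail: 1 - P(fewer than k perspectives hit a broken NS).
    p_fewer = sum(
        comb(perspectives, i) * p_broken**i * (1 - p_broken) ** (perspectives - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # 2 nameservers, 1 down, 5 perspectives.
    print(p_at_least(1, 5, 0.5))  # 0.96875
    print(p_at_least(2, 5, 0.5))  # 0.8125
```

Under those assumptions, with one of two nameservers down, roughly 97% of validation attempts have at least one perspective land on the dead server. How many failing perspectives are tolerated before issuance is refused is not stated here, so the second figure is only there to show how quickly the odds stay high.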