I think that Let’s Encrypt’s resolvers tend to send a lot of traffic to authoritative nameservers (compared to normal resolvers) because they keep practically zero cache and query multiple record types at once. That could trigger some kind of rate limiting or firewall behavior on the Network Solutions side.
Or it might just be a regular old routing ****up between Viawest and Network Solutions.
Edit: I just tried again for the domain and it worked. Can you retry?
Note that some DNS failures for issuance that previously succeeded could be a result of Let's Encrypt's new multi-perspective validation.
(This isn't a likely explanation for all of the problems that were mentioned on this thread, but it's something that's good to be aware of, especially if you've seen the behavior change very recently.)
We also have a large number of customers using Network Solutions. I’ve been somewhat successful in getting a small portion of these to renew by just retrying, but it’s definitely not keeping up. Maybe 10% are renewing after some time?
Special Request
@jsha Could you share details on exactly what is failing between your system and Network Solutions, so that we can contact them and get corrective action moving? Right now we don’t understand the problem well enough to tell them what to correct.
Reason For Urgency
We have Network Solutions customers who’ve lost SSL (and the ability to take payments and do business) at this very moment, and dozens (if not hundreds) of customers who will be in the same boat within a week or two if we don’t find a solution.
How long has this problem been manifesting for you? I would expect that we’d have 30 days from beginning of the problem before we started to see certificates expire.
We do renew at 30 days, but we remove failing hostnames from their 100-domain SAN cert at 25 days in order to force the renewal to succeed and maintain at least a 25-day window. Usually the failing hostnames belong to customers who are leaving us, so this is never a problem. Some of our NetSol customers were removed from their SAN cert by this process before I noticed. I’ve temporarily changed this “force renewal by stripping bad domains” threshold to 15 days.
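To make the policy above concrete, here is a rough sketch of the decision it describes. This is not our actual code: `SanCert`, `plan_renewal`, and the action names are made up for illustration, and the only values taken from this post are the 30-day renewal point and the 15-day (previously 25-day) strip threshold.

```python
from __future__ import annotations

from dataclasses import dataclass

RENEW_AT_DAYS = 30  # start attempting renewal at this many days left (from the post)
STRIP_AT_DAYS = 15  # strip failing hostnames at this point (temporarily lowered from 25)


@dataclass
class SanCert:
    """Hypothetical stand-in for one multi-domain SAN cert."""
    hostnames: list[str]
    days_left: int


def plan_renewal(cert: SanCert, failing: set[str]) -> tuple[str, list[str]]:
    """Return (action, hostnames to request) for a single cert.

    `failing` is the set of hostnames whose validation currently fails.
    """
    if cert.days_left > RENEW_AT_DAYS:
        return "wait", list(cert.hostnames)
    if not any(h in failing for h in cert.hostnames):
        return "renew", list(cert.hostnames)
    if cert.days_left > STRIP_AT_DAYS:
        # Still inside the safety window: keep retrying with every hostname.
        return "retry", list(cert.hostnames)
    # Threshold reached: drop the failing hostnames so the rest can renew.
    kept = [h for h in cert.hostnames if h not in failing]
    return "renew_stripped", kept
```

With the threshold at 15, a cert with 20 days left and one failing NetSol hostname keeps retrying intact; at 15 days left the failing hostname is dropped and the remaining ones renew.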
As it stands, we have fewer than a dozen NetSol customers who lost their cert, and dozens (maybe hundreds?) more set to lose their SSL in 7 days, when we hit this 15-day threshold.
For a system as large as ours, and as prone to rate limits, I get very nervous about going much further than 15 days. If we let ourselves go down to 0 days and then start chugging through renewals, I’m afraid we’ll get rate limited. Then we’d not only lose the ability to issue certs for new customers, but if we fail to renew 7 days’ worth of certs before hitting the limit, we could also be forced to let live certs expire.
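The worry above is really just arithmetic, roughly sketched below. Every number here is a placeholder: `renewals_per_day` is whatever sustained throughput we can manage without being rate limited, not an actual Let's Encrypt limit, and the function names are hypothetical.

```python
import math


def days_to_clear(backlog: int, renewals_per_day: int) -> int:
    """Whole days needed to work through a renewal backlog at a fixed daily rate."""
    if renewals_per_day <= 0:
        raise ValueError("renewals_per_day must be positive")
    return math.ceil(backlog / renewals_per_day)


def backlog_is_safe(backlog: int, renewals_per_day: int, days_until_first_expiry: int) -> bool:
    """True if the whole backlog can be renewed before the first cert expires."""
    return days_to_clear(backlog, renewals_per_day) <= days_until_first_expiry
```

For example, with made-up figures, a backlog of 700 certs at 50 renewals per day takes 14 days to clear, so starting with only 7 days of margin would not be enough.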