I wanted to let you know that the problem described in my original post is still ongoing. Validation is so unreliable that it generates a constant stream of error logs.
About 70% of all my certbot validations fail.
We ran MTR and traceroute tests from our server to the public internet and didn’t find any packet loss or problems with our network/hardware. I don’t know how to test connections to your systems, since the tool makes the connection. Is there a hostname I can resolve to an IP and test against? I can assure you we’re on premium hosting/bandwidth with low latency. Our host does not filter any of our traffic.
I have to repeat the commands against your system 2 to 5 times in order to get each certificate. I repeat the command once an hour until it works. It is ALWAYS able to renew all the certificates eventually. The issue is only with the reliability of the service.
If this can’t be fixed, I may have to redesign the script so that it only sends me errors after about 5 failures in a row, since I’m seeing a lot of error alerts whenever renewals come up.
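
For anyone in the same situation, here is a rough sketch of what I mean by holding back alerts until about 5 failures. It is only a sketch under assumptions: the state file path and the function names are made up for illustration, and it assumes `certbot renew --quiet` exits non-zero when a renewal fails, which is what I see on my setup.

```python
import json
import subprocess
import sys
from pathlib import Path

# Hypothetical location for the failure counter; any writable path works.
STATE_FILE = Path("/var/tmp/certbot_failures.json")
ALERT_THRESHOLD = 5  # the "about 5 failures" mentioned above


def record_result(state_file: Path, succeeded: bool,
                  threshold: int = ALERT_THRESHOLD) -> bool:
    """Update the consecutive-failure count and return True when an alert is due."""
    try:
        failures = int(json.loads(state_file.read_text())["failures"])
    except (FileNotFoundError, ValueError, KeyError, TypeError):
        failures = 0  # no usable state yet: start counting from zero
    # A success resets the streak; a failure extends it.
    failures = 0 if succeeded else failures + 1
    state_file.write_text(json.dumps({"failures": failures}))
    return failures >= threshold


def main() -> int:
    """Run one renewal attempt; stay silent unless the streak hits the threshold."""
    result = subprocess.run(["certbot", "renew", "--quiet"],
                            capture_output=True, text=True)
    if record_result(STATE_FILE, result.returncode == 0):
        print(result.stderr, file=sys.stderr)  # cron mails this output
        return 1
    return 0
```

To use it from the hourly cron job, end the script with `sys.exit(main())`. Cron then only produces mail once the failure streak reaches the threshold, and a single successful run resets the counter.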
I don’t know how this tool works, but it seems like it should not fail this often, especially if all it has to do is pull a small static file via HTTP.
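
As far as I understand, the HTTP check fetches a file from `/.well-known/acme-challenge/` under the domain. Below is a rough sketch of a probe that I could run myself to see how often such a fetch fails. The function name and parameters are my own invention, not part of certbot, and it only measures the path from wherever the script runs, not from the validation servers themselves.

```python
import http.client
import secrets
import urllib.request
from pathlib import Path


def probe_webroot(webroot: Path, base_url: str,
                  attempts: int = 20, timeout: float = 5.0) -> int:
    """Serve a throwaway file from the challenge directory, fetch it
    repeatedly over HTTP, and return how many fetches failed."""
    challenge_dir = webroot / ".well-known" / "acme-challenge"
    challenge_dir.mkdir(parents=True, exist_ok=True)

    # Random name and body so the probe cannot collide with a real challenge.
    name = "probe-" + secrets.token_hex(8)
    body = secrets.token_hex(16).encode()
    probe = challenge_dir / name
    probe.write_bytes(body)

    url = f"{base_url.rstrip('/')}/.well-known/acme-challenge/{name}"
    failures = 0
    try:
        for _ in range(attempts):
            try:
                with urllib.request.urlopen(url, timeout=timeout) as resp:
                    if resp.read() != body:
                        failures += 1  # reachable, but wrong content
            except (OSError, http.client.HTTPException):
                failures += 1  # timeout, refused connection, HTTP error, ...
    finally:
        probe.unlink()  # always remove the throwaway file
    return failures
```

Calling something like `probe_webroot(Path("/var/www/html"), "http://example.com")` (both values are placeholders) from a machine outside our network would show whether plain HTTP fetches of a static file fail anywhere near as often as the validations do.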
I still use the HTTP webroot plugin as before. Is it possible for me to implement the function of the certbot validation in my application instead of relying on this tool? Maybe there is a timing or configuration problem with it, for example the check running too quickly.