I was wondering how and why bots can know about newly issued host certificates?
I have issued 2 certificates in the last 12 hours for hostnames made from randomly chosen series of letters and numbers. These hostnames cannot be guessed; they are typically 50 characters long. I run my own DNS servers and nobody (except my own servers) can transfer (AXFR) my zones.
What I can't explain is that almost immediately (less than 10 seconds) after issuing a new Let's Encrypt certificate, I see bots connecting to my services.
example:
How do they know my [new] hostnames? Let's Encrypt must publish this information, right? Is there a flag for certbot or acme.sh to keep this information from being publicly advertised? These are private servers on my own hardware, and I actively ban all this parasitic behaviour.
Every publicly trusted certificate has to be published to a Certificate Transparency log, and the logs are available for anyone to monitor. https://www.certificate-transparency.org/
Check out crt.sh if you want to look up the certificates for a domain.
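As a sketch of what those monitors do, crt.sh also exposes its data as JSON at `https://crt.sh/?q=<query>&output=json`; the helper names below are my own, and the `%.` prefix is crt.sh's wildcard for matching subdomains:

```python
# Sketch: query crt.sh's public JSON endpoint for every logged certificate
# name under a domain. Function names here are illustrative, not an official API.
import json
import urllib.parse
import urllib.request

def crtsh_url(domain: str) -> str:
    # Build the query URL; urlencode percent-escapes the "%." wildcard prefix.
    return "https://crt.sh/?" + urllib.parse.urlencode(
        {"q": "%." + domain, "output": "json"})

def logged_names(domain: str) -> set:
    # Fetch the log entries (requires network) and collect all certified names;
    # one entry's name_value can hold several newline-separated hostnames.
    with urllib.request.urlopen(crtsh_url(domain), timeout=30) as resp:
        entries = json.load(resp)
    return {name for e in entries for name in e["name_value"].splitlines()}
```

Calling `logged_names("example.com")` returns every hostname that has appeared in a logged certificate for that domain, which is exactly the feed the bots are watching.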
Certificate Transparency seems like a great goal, but in practice it is abused by scan bots and all sorts of parasitic behaviour (bots actively searching for WordPress flaws from the last decade, and so on...).
If I want a certificate for my own private cloud server, it's to give my family a way to share data without having to import a self-signed CA into their trust stores to match self-signed certificates. Or to avoid explaining to my elderly parents that the security exception they see in their browser is "good" this time...
Certificate Transparency should be able to offer a "hidden" layer like most whois registries do. I trust Let's Encrypt (maybe I'm wrong), but I definitely can't trust bots trying to masquerade as ordinary browsers.
The entire purpose of CT is to let site owners monitor whether a certificate for their domain was issued without their knowledge, so it has to publish the domain names (since the owners aren't the ones who made the certificate) for that to work.
BTW, it's a "MUST" per the CA/B baseline requirements; LE would be distrusted by browsers if it didn't publish to CT logs.
I thought [probably naively] certificates were trusted based on the cert chain. I can't see what CT has to do with browsers. Furthermore, I'm curious how, with most people/organisations switching to DNSSEC, someone could easily issue false certificates in 2020. The argument that without CT you will break the internet sounds a bit confused to me...
As far as I can see, CT mostly serves evil purposes, or is far too heavily abused, to be something beneficial to "normal" people...
If you're referring to bots seeking to exploit common vulnerabilities in web software: you should always be aware of those, whether the hostname is public or private. To me, using this as an argument against CT logs is very weak, as your software needs to be secure no matter what if it's connected to the global world wide web.
If you want a fully private site secured with TLS, you can generate your own private CA and somehow distribute your own root certificate to the clients. Once you've added your own root certificate to the OS (or browser) trust store, your clients won't get any TLS errors.
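A minimal sketch of that private-CA approach with the `openssl` CLI (the hostname `cloud.home.example` and all file names are placeholders):

```shell
# Sketch: create a private root CA, then issue one server certificate with it.
# Hostname and file names are placeholders; adjust key sizes/lifetimes to taste.

# 1. Create the root CA key and a self-signed root certificate (10 years).
openssl req -x509 -newkey rsa:4096 -sha256 -days 3650 -nodes \
  -keyout myca.key -out myca.crt -subj "/CN=My Private Root CA"

# 2. Create the server key and a certificate signing request.
openssl req -newkey rsa:2048 -nodes \
  -keyout server.key -out server.csr -subj "/CN=cloud.home.example"

# 3. Sign the CSR with the CA, adding the SAN entry that browsers require.
printf "subjectAltName=DNS:cloud.home.example\n" > server.ext
openssl x509 -req -in server.csr -CA myca.crt -CAkey myca.key \
  -CAcreateserial -days 825 -sha256 -extfile server.ext -out server.crt

# 4. Check that the server certificate chains to the private root.
openssl verify -CAfile myca.crt server.crt
```

You then distribute `myca.crt` (never `myca.key`) to the family devices' trust stores. Nothing here ever touches a public CA, so nothing appears in the CT logs.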
Another "solution" would be to get a wildcard certificate (e.g. *.example.com). Then the individual subdomains are not visible in the CT logs.
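For example (a sketch, not tested against your setup): Let's Encrypt only issues wildcards via the DNS-01 challenge, so you need to be able to set a TXT record, which you say you can since you run your own DNS:

```shell
# Sketch: request a wildcard certificate covering the apex and all subdomains.
# With certbot, using the manual DNS-01 challenge (you paste the TXT record):
certbot certonly --manual --preferred-challenges dns \
  -d example.com -d "*.example.com"

# Or with acme.sh, using one of its DNS provider plugins
# (dns_cf is the Cloudflare plugin; substitute the one for your DNS setup):
acme.sh --issue --dns dns_cf -d example.com -d "*.example.com"
```

Only `example.com` and `*.example.com` end up in the CT logs; the actual subdomain names stay private.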
But as others have also stated, your security model should never rest on the assumption that nobody knows your DNS names (or IPs). Always assume that kind of information is public and make your services secure with that in mind.
I had the same thing happen to me a while ago... luckily the person running the scan identified themselves in the user agent, was very helpful, pointed me to the CT logs, and explained that he was running a survey of SSL sites.
He explained that his survey was infrequent and low-impact, but offered to exclude my sites from it.
I saw no risk in his survey and thanked him for educating me about the CT logs.
I haven't paid attention to that kind of scan for quite a while since.
The bots are a fact of life. You already know that, since you were banning them before you got your certs. They find your site not because of the certs, but because they simply probe every IP address. What's changed is that since you issued the certs, they now know the names. There is no link between the certificate and the IP address unless you created an IP certificate instead of a domain certificate.
I'm curious as to why your domain name is a random string of 50 characters. A domain name is an easy-to-type name for humans so they don't need to know the IP address. If you are going to type random, hard-to-remember gibberish, you may as well type the IP address.
I had the same problem with my private subdomains; now I use a single certificate for "domain.com *.domain.com" and there is no more public info about my private subdomains.
In this specific case, it is important to know: is the certificate presented as the default certificate when a client connects without SNI? If so, there is no need to assume the name leaked through the CT log (which is always true, but may be irrelevant); the long cryptic name is presented directly in the certificate's SAN section. The next connection, with SNI, may then already use that name as the server name.
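You can check for that leak yourself; as a sketch (203.0.113.10 is a placeholder IP, and `-ext` needs OpenSSL 1.1.1+), connecting by IP literal means no SNI is sent, so the server answers with its default certificate:

```shell
# Sketch: see which certificate a server presents to a client that sends no SNI.
# Connecting to an IP literal (placeholder below) sends no server_name extension,
# so this shows the default certificate -- including every hostname in its SANs.
openssl s_client -connect 203.0.113.10:443 </dev/null 2>/dev/null \
  | openssl x509 -noout -ext subjectAltName
```

If your "secret" 50-character hostname shows up in that output, any bot scanning the IPv4 space gets it from the handshake alone, with no CT log involved.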
For more secure coverage:
The root/apex cert should probably just be covered by: domain.com & www.domain.com (single cert)
Below that, you can have all the private/segregated individual subdomains you want: *.a001.domain.com (one cert)
... *.b001.domain.com (one cert)
... ... *.z999.domain.com (one cert)
CAs need to log information about the server for security anyways. I am not aware that any CA (LE especially) advertises this information, and it most likely is not public at all.