I was reconfiguring a few things last night and decided to change some hostnames, which meant creating new SSL certificates for them. Not long after the certificates were issued (5-15 mins), I started seeing lots of requests for the new hostnames that looked like someone was trying to guess the login details. The requests didn't look automated, but they were coming from multiple IPs around the world.
I've been over the logs and my config several times and I can't see any way that anyone other than me could have known those new hostnames. To narrow things down, I've created several fake hostnames but only partially configured each one. The suspicious traffic only appears on the ones where I have created an SSL certificate through Let's Encrypt. One of them I didn't even load in a browser, but the traffic still started not long after creating a certificate.
Is it possible that someone is scraping the data of newly issued certificates? I create a separate certificate for each hostname (even separate ones for each subdomain), as I know people can see the list of hostnames if multiple are used on the same certificate.
All certificates from Let's Encrypt are placed in publicly accessible logs. This system is known as Certificate Transparency. It is required for certificates to be trusted by Google Chrome, Apple Safari, and other browsers and systems.
Internet scanners can and do identify hosts based on names found in certificates logged to Certificate Transparency.
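If you want to see how visible this is, here's a minimal sketch that queries crt.sh (a public Certificate Transparency search engine) for names logged under a domain. The JSON output is an informal interface rather than a documented API, so treat it as illustrative only:

```
import json
import urllib.request

def ct_logged_names(domain):
    """Fetch hostnames logged to CT for a domain via crt.sh (unofficial JSON output)."""
    # %25 is a URL-encoded '%' wildcard, matching all subdomains
    url = f"https://crt.sh/?q=%25.{domain}&output=json"
    with urllib.request.urlopen(url, timeout=30) as resp:
        entries = json.load(resp)
    names = set()
    for entry in entries:
        # name_value can contain several newline-separated SANs per certificate
        names.update(entry["name_value"].splitlines())
    return sorted(names)

for name in ct_logged_names("example.com"):  # placeholder domain
    print(name)
```

Anyone on the internet can run this kind of query within minutes of issuance, which matches the timing you observed.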
Note that this is not a Let's Encrypt issue: all publicly trusted certificate authorities are required to log their issued certificates to Certificate Transparency logs.
Thank you both for your responses. If this is intended functionality then I'm ok with that. I don't rely on keeping URLs secret and everything on the server is properly secured. I just wasn't expecting anyone to find the URLs so fast and was worried either me or Let's Encrypt had a security hole somewhere.
Expanding on what @JamesLE has said above - this has emerged as a well-known vector for security threats in the past few years. Most Open Source projects and webhosts have implemented changes to combat it, but not all have.
The most typical exploit for a long time was to monitor CT logs for a domain and then immediately try to exploit a fresh WordPress install on that domain in a race condition. A successful attacker would get in before the actual owner could log in, then: i) log in with the default admin credentials, ii) compromise the system, and iii) make everything look normal.
WordPress - and many other Open Source systems - quickly dropped their default installation credentials in favor of a randomized password. Until that change was released, many large-scale webhosts that offered "one click installations" would either password-protect the installation directories or reset the default credentials to a random password.
That being said: If you've installed any software that has a standard default password, you should consider that installation/user-directory as potentially compromised.
As a tip moving forward: people should first install open source web software in directories protected by .htaccess until it is fully set up and secured.
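For Apache, a minimal sketch of that temporary protection might look like the following (the AuthUserFile path is a placeholder; create the file with the htpasswd utility and keep it outside the web root):

```
# .htaccess in the install directory: require HTTP Basic auth
# until the application's own admin credentials are secured.
AuthType Basic
AuthName "Setup in progress"
AuthUserFile /home/user/.htpasswd
Require valid-user
```

Once setup is finished and the default credentials are gone, you can drop those lines.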
There are also entities who somehow sniff DNS queries to find hostnames.
I have wildcard subdomain DNS entries for all my domain names, and if I do a single nslookup for a gibberish subdomain like "sadofiwoifjsdofsaodfi.(mydomain).com", then about 8 hours later, my webserver log will start filling up with "security scans" targeting sadofiwoifjsdofsaodfi.(mydomain).com.
They normally identify as Palo Alto Networks in the user agent, but who knows who they really are.
I can even do nslookups for something like "only-losers-scan-this-hostname.(mydomain).com" or "palo-alto-is-dumb.(mydomain).com" and they'll still go along with it.
I have no idea exactly how they're sniffing DNS queries; it's actually kind of impressive.
There are also entities who watch new domain registrations... once I registered a new domain and had a site up with a contact form pretty quickly, and the contact form got a bunch of spam the first day and then never again.
Security by obscurity is actually a good idea if added as an additional layer of security. I am not saying to depend fully on it, but to use it alongside the other steps taken to secure an application.
I disagree. All your other layers should be strong enough on their own; the additional layer of obscurity doesn't add much, if anything at all, IMO. But we can agree to disagree.
This is generally true, though not usually done via DNS query sniffing - that would require access to your network traffic. There are multiple, more generic, ways to extract data out of a DNS zone:
Classical wordlist-based brute-force attacks: try likely subdomains and see what resolves.
Try to dump the zone via AXFR (zone transfer) queries; see the sketch after this list.
If the zone is DNSSEC protected: DNSSEC-secured zones can be enumerated thanks to the NSEC (non-existence proof) feature. Unless the zone uses live-signing (which is traditionally not the case), NSEC immediately leaks existing subdomains. This can be used to quickly dump the entire zone. NSEC3 uses hashes, which makes this harder, but with a wordlist it's still doable.
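Illustrating the AXFR point, here's a minimal sketch using the dnspython library: it asks each authoritative nameserver for a full zone transfer, which any properly configured server should refuse (example.com is a placeholder domain):

```
import dns.query
import dns.resolver
import dns.zone

def try_axfr(domain):
    """Attempt a zone transfer (AXFR) against each authoritative nameserver."""
    for ns in dns.resolver.resolve(domain, "NS"):
        ns_name = str(ns.target)
        ns_ip = str(dns.resolver.resolve(ns_name, "A")[0])
        try:
            zone = dns.zone.from_xfr(dns.query.xfr(ns_ip, domain, timeout=10))
        except Exception as exc:  # refused or failed transfers surface as exceptions
            print(f"{ns_name}: transfer refused or failed ({exc})")
            continue
        print(f"{ns_name}: transfer ALLOWED, {len(zone.nodes)} names exposed:")
        for name in zone.nodes:
            print("  ", name)

try_axfr("example.com")  # placeholder; only test zones you own
```

Most zones refuse AXFR these days, but it costs an attacker nothing to check.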
Are you actually running a Palo Alto firewall on your (corporate) network? If so, I suspect that it's really logging your DNS. Otherwise I would suspect that something upstream is logging DNS. Because for a full-wildcard zone the traditional extraction methods are not plausible.
I use VPS/cloud servers with a few different providers; I don't really have any visibility into what kind of firewalls they're running.
I do use public DNS servers such as Cloudflare's 1.1.1.1, Google's 8.8.8.8, and OpenDNS. I figured they found a way to sniff that traffic, or maybe they worked out a deal with the server operators to get access to those servers' log files?
That was far enough into the thread that I missed it. I'm generally against wildcard DNS unless it is to send a strong message like: this name sends no mail
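As an aside, that "sends no mail" message can be stated explicitly in the zone itself; a sketch with a null MX (RFC 7505) and a deny-all SPF record, using example.com as a placeholder:

```
; null MX (RFC 7505): this name accepts no mail
*.example.com.  IN  MX   0 .
; SPF: no host is authorized to send mail as this name
*.example.com.  IN  TXT  "v=spf1 -all"
```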
I'm too lazy to create individual DNS entries for all my subdomains, so I just use a wildcard (except for subdomains I need to route somewhere other than my main server).
Security by obscurity isn't a layer of security at all; it's a different level of concern (i.e. privacy, bandwidth conservation, log reduction, etc.). That's why nobody considers moving your SSH port to something other than 22 a security measure: it's purely a convenience, as you have not reduced your attack surface in any meaningful way.
If you are at the point where you need to worry about Certificate Transparency logs, you are very deep into something you shouldn't even be using public certificates for in the first place, and should pivot to building your own PKI.
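If anyone wants to experiment with that, here's a minimal sketch of a private root CA using Python's cryptography package (the CA name and lifetime are illustrative). Certificates issued under a root like this never touch Certificate Transparency, though you then have to distribute the root to every client yourself:

```
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

# Generate a key pair and a self-signed root certificate for a private CA.
key = rsa.generate_private_key(public_exponent=65537, key_size=4096)
name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "My Private Root CA")])
now = datetime.datetime.now(datetime.timezone.utc)

root_cert = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)  # self-signed: issuer == subject
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=3650))  # illustrative 10-year lifetime
    .add_extension(x509.BasicConstraints(ca=True, path_length=None), critical=True)
    .sign(key, hashes.SHA256())
)

# Write out the root cert (to be trusted by your clients) and the private key.
with open("root_ca.pem", "wb") as f:
    f.write(root_cert.public_bytes(serialization.Encoding.PEM))
with open("root_ca.key", "wb") as f:
    f.write(key.private_bytes(
        serialization.Encoding.PEM,
        serialization.PrivateFormat.PKCS8,
        serialization.NoEncryption(),  # in practice, protect the CA key with a passphrase
    ))
```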