I started looking at certificate transparency logs for one of our domains and subdomains (*.shibboleth.net) and came across one domain which is being unexpectedly issued certificates by Let's Encrypt. However, as far as we can tell the machine we think would have been involved in this hasn't been switched on for months and we're more than a little curious as to how the certificates could be issued: one point of CT logs, after all, being detecting mis-issuance.
My question, then, is whether there are any available tools for me to understand the issuance of these certificates? Obviously I don't have any of the usual information, such as account details. Is there even, for example, any logging available to help understand the verification method used or the IP address of the requester?
Not unless you have control of the machine which requested the cert.
Is that the only cert you are concerned with? Because it was issued on Dec 25 and expires in one week. Normally certs are renewed with 30 days remaining, so it looks to me like it was set up in the past but the requesting machine is no longer operating (since Dec 25). Cert history for that apex domain is here.
Although the certificate itself will expire soon, my concerns — which are really about whether the certificate was correctly issued — will not expire.
We have a virtual machine which has the same name as the certificate. However (a) we have pretty good evidence from AWS logs that it was turned off during the whole period in question and (b) we've fired it up and checked its logs. Again, we have no evidence that that machine requested this certificate (or indeed any certificates, ever).
My conclusion is that there's another machine out there, which I can't identify, which requested this certificate. Not being able to identify that machine is a security concern for us, and I'm looking for tools to help me understand what actually happened. If (absent having the requesting system available, which is obviously not possible in this case) there are no such tools then that's disappointing, if I suppose not very surprising.
It isn't required for the host to be actually up to get a certificate for it: the dns-01 challenge could be used from any host on the internet.
If you're afraid that someone is issuing certs without your consent (e.g. leaked Route53 credentials) you can use CAA to "lock" issuance to e.g. just the http-01 challenge or perhaps even to a single account. See Enabling ACME CAA Account and Method Binding for more info.
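To make the CAA suggestion concrete, here is a minimal sketch in Python that assembles the record text, assuming the parameter names from RFC 8657 (validationmethods and accounturi). The helper names are made up for illustration, and the account number 1234567 is a placeholder, not a real account; your ACME client should be able to tell you your own account URI.

```python
def caa_issue_value(ca_domain, validation_methods=(), account_uri=None):
    """Build the value of a CAA "issue" property with optional RFC 8657 bindings."""
    parts = [ca_domain]
    if validation_methods:
        # Restrict which ACME challenge types the CA may use for this domain.
        parts.append("validationmethods=" + ",".join(validation_methods))
    if account_uri:
        # Restrict issuance to one specific ACME account.
        parts.append("accounturi=" + account_uri)
    return "; ".join(parts)


def caa_record(name, value, flags=0, tag="issue"):
    """Format a zone-file style CAA line (presentation only, nothing is published)."""
    return f'{name}. IN CAA {flags} {tag} "{value}"'


# Hypothetical account URI: 1234567 is a placeholder, not a real account.
EXAMPLE_ACCOUNT = "https://acme-v02.api.letsencrypt.org/acme/acct/1234567"

method_only = caa_issue_value("letsencrypt.org", ["http-01"])
method_and_account = caa_issue_value("letsencrypt.org", ["http-01"], EXAMPLE_ACCOUNT)

print(caa_record("shibboleth.net", method_only))
print(caa_record("shibboleth.net", method_and_account))
```

With only http-01 allowed, leaked DNS credentials alone would no longer be enough to pass a dns-01 challenge, although anyone who can edit the zone could also edit the CAA record itself, so treat this as one layer rather than a complete fix.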
I'm puzzled by this comment. Osiris explained the DNS challenge. But, the other common way to get a cert is the HTTP Challenge. For that a webserver (like Apache, nginx, certbot standalone, ...) handles the challenges. The IP address in the DNS dictates where the Let's Encrypt servers send the challenges.
For EC2, if you are not using an Elastic IP, the IP associated with the EC2 instance is lost when you stop the instance and can (and will) be reused by whoever AWS assigns it to. If you leave the DNS A record set to this IP, it will now point to someone else's EC2 instance.
You still have an A record in the DNS for rpexample. When you start that EC2 instance, does it still have the same IP as in the DNS? If it doesn't, that is a possible way for someone else to get a cert by that name. It seems unlikely to me, but I thought I'd mention it just the same.
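If you want to check for that kind of stray record yourself, a rough sketch follows. It compares what the DNS currently returns for a name against the addresses you know you control. The function names are mine, and the list of controlled addresses is something you would have to supply (for example, from your Elastic IP allocations); the 192.0.2.x addresses in the usage note are documentation placeholders.

```python
import socket


def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def stray_addresses(dns_ips, controlled_ips):
    """Addresses published in DNS that are not among the ones you control."""
    return set(dns_ips) - set(controlled_ips)
```

Usage would look something like stray_addresses(resolve_ipv4("rpexample.shibboleth.net"), {"192.0.2.10"}). Any address it returns is one that the DNS points at but that you have not claimed, which is exactly where an http-01 challenge would be sent.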
Was the domain recently on Cloudflare in any way? Cloudflare grabs certificates for all domains on their system, even if the domains aren't actively on their network, so they can instantly turn on SSL protections if requested. Some other cloud providers have similar services.
Most people who find randomly issued certificates for their domains – or domain rate limiting – are either the customer of a cloud service that processes their own certificates, or forgot to turn off a virtual server.
Yes, we do, and no, the EC2 instance does not retain its IP address on startup; it doesn't have an Elastic IP. I don't believe it ever had one, and as I say, as far as I can tell the EC2 instance called rpexample isn't the machine that requested these certificates.
So you're right that in principle someone could have started another instance, been allocated that same IP and got a cert in that name... but I agree with you that this doesn't seem likely.
What do you mean by that? EC2 instances have a symbolic identifier but it is not visible to the public internet. The names are just to help you manage them. They are not involved in cert issuance in any way as that uses the public internet to check who controls that domain name.
EC2 instances do have a public name, and the one associated with your DNS A record is this:
ec2-18-215-124-135.compute-1.amazonaws.com
because your DNS A record points to the IP for that instance.
Having a DNS A record pointing to something not in your control is a security hole.
Further, depending on your EC2 config, your instance may reset to its base AMI when you restart it, and similar things can happen. This may be why you can't find any evidence of it ever getting a cert. Managing "stuff" often requires taking care to use persistent storage, backups, and/or automated config when launching or restarting the instance.
I think the best explanation is that you caused a fresh EC2 instance to start after Dec 25 and it no longer had the cert client to renew the cert. That is more likely than someone abusing your stray DNS A record (which you should fix or remove, by the way).
Let me be more precise, then. We have an EC2 instance whose Name metadata in the AWS console is rpexample.shibboleth.net, the same as the certificate in question.
I am aware that the name metadata is not externally visible. I am aware that such names aren't involved in certificate issuance. I am aware that running EC2 instances which aren't associated with Elastic IPs can be referenced by ec2-*.amazonaws.com DNS names by virtue of those existing in isolation at all times.
Nevertheless, I know this is not the explanation for what we're seeing, as AWS does record the "last time anything happened" time for an instance and as I have mentioned before this demonstrates with high confidence that this instance was not running at the time the certificate was issued (or the previous one, several months earlier, for that matter... it stopped in July, IIRC).
So while I agree that in the absence of evidence, this would be the most likely explanation and that abuse of the stray DNS record is very unlikely (and yes, we'll clean that up) the actual explanation must lie elsewhere. I think it's very likely that another instance (which no longer exists in our account) must have been responsible, but almost by definition that's difficult to confirm. It doesn't sound like there are any tools on the Let's Encrypt side that might help, which was my original question. That's fine.