Revoking certain certificates on March 4

If you do not have the serial numbers of your certs but do have the domains, this PHP script will cross-reference the serial dump by domain. Grepping 1.3GB thousands of times is not the fastest approach, but it was quick to throw together and let me identify a few of our certs to re-order.

    <?php 

	$domain_file = '/home/dave/potential_domains.csv'; 
	$cert_issues = '/home/dave/Downloads/caa-rechecking-incident-affected-serials.txt'; 
	$match_dump_file = '/home/dave/affected_domain_match.csv';

	$counter = 0;
	$match_domains = array();

	# Read the seed domain names 
	if (($handle = fopen($domain_file, "r")) !== FALSE) {
		while (($data = fgetcsv($handle)) !== FALSE) {
			# Clean up the domain name to grep the other file 
			$counter++;
			$domain_name = trim($data[0]);

			# initialise the match state 
			$status = 'NOT_MATCHED'; 

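			# Build the grep command: -F matches the domain literally, -m 1 stops at the first hit 
			$command_line = 'grep -F -m 1 -- ' . escapeshellarg($domain_name) . ' ' . escapeshellarg($cert_issues);

			# exec() returns the last line of output, which is empty when nothing matched
			$buffer = (string) exec($command_line);
			if ($domain_name !== '' && trim($buffer) != '') {
				$status = 'MATCHED'; 
			}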

			# Dump the status 
			echo $counter . " :: " . $status . " :: " .  $command_line .  " --> [" . $buffer .  "]\n";

			# Remember matched domains for the output file 
			if ($status == 'MATCHED') {
				$match_domains[] = $domain_name;
			}
		}

		fclose($handle);
	}

	echo "\n\n Dumping Matched \n\n";
	var_export($match_domains);
	echo "\n\n DONE \n\n";

	$fp = fopen($match_dump_file, 'w');

	foreach ($match_domains as $fields) {
		# One domain per line
		fputs($fp, $fields . "\n");
	}

	fclose($fp);


    ?>
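
The repeated grep over the 1.3GB dump can be avoided by reading the dump once and checking each name against an in-memory set. The following is only a minimal sketch of that idea in Python, not a port of the script above. It assumes that each line of the dump lists its domain names separated by whitespace or square brackets, which may not match the real file layout. The function names and file paths are hypothetical.

```python
import csv
import re

# Assumption: names in the dump are delimited by whitespace, brackets or commas.
TOKEN_SPLIT = re.compile(r"[\s\[\],]+")


def read_domains(csv_path):
    """Return the non-empty values of the first CSV column, lowercased."""
    with open(csv_path, newline="") as handle:
        return {
            row[0].strip().lower()
            for row in csv.reader(handle)
            if row and row[0].strip()
        }


def match_domains(domains, dump_lines):
    """Return the domains that appear as whole tokens in any dump line."""
    wanted = set(domains)
    found = set()
    for line in dump_lines:
        for token in TOKEN_SPLIT.split(line.lower()):
            if token in wanted:
                found.add(token)
    return found


def main(domain_csv, dump_path, out_path):
    """Stream the dump once and write matched domains, one per line."""
    domains = read_domains(domain_csv)
    with open(dump_path, errors="replace") as dump:
        found = match_domains(domains, dump)
    with open(out_path, "w") as out:
        for name in sorted(found):
            out.write(name + "\n")
```

Calling `main('potential_domains.csv', 'caa-rechecking-incident-affected-serials.txt', 'affected_domain_match.csv')` would run it against files in the current directory. Unlike the substring match that grep performs, this only matches whole names, so `example.com` will not match a line that only lists `www.example.com`.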
5 Likes

Thanks @instiller, that script is much appreciated.

2 Likes

As we have a large number of customers, I had no other way but to parse the files and build a lookup tool. If someone finds it useful, it can be found here: https://www.certic.info/tools-letsencryptrenewcheck.php

Unfortunately, it was obvious during the outage late on the night of February 29th that this was about to happen. I asked for more information, but the request was ignored completely.

Now we are facing short notice. Let’s Encrypt is a serious project and probably one of the best projects ever, but it really needs to come up with better support on public networks.

https://twitter.com/cs_networks/status/1233704143224791042

Totally ignored, yet it was clear this was likely to happen. Now we are facing a few hours’ notice, which does the public no good. Let me know if I can be of any help, but PR really needs to get a bit better on this.

1 Like

@yuriks I think you and the letsencrypt team should stop trying to explain why you wasted away the time figuring out which certificates were affected and blaming your late notification on the Baseline Requirements. If you had five days of notice period, you should have informed the community immediately, not after you had compiled a list of affected certificates. letsencrypt - you provide a great service for the web community, but just take it on board that you’ve handled this issue terribly - you need to stop trying to make excuses for it, just apologise, accept our feedback and move on without offering up excuses.

2 Likes

Maybe they could have written a message like “you may need to renew your certificate, we don’t know yet”, but it may have had harmful consequences:

  • Too many people trying to renew without needing to, which could have caused an outage
  • Too many people on the forum asking for details, that they couldn’t give yet, diverting their attention from more urgent things
5 Likes

Unfortunately, at the time we weren’t sure of the scale of the impact and so wouldn’t have been able to give people useful guidance. We were focusing on patching the bug and then posted an explanation of the issue at 2020.02.29 CAA Rechecking Bug.

Thanks for posting your checking tool. I have some questions about it I’m going to send in a private message to not clutter the thread.

7 Likes

@tdelmas

The system should be built so that it can handle all of the certificates being requested simultaneously. Regardless of the likely random timing of requests in normal operation, it is already possible that a large percentage of all certificates could be requested for renewal at the same time.

With respect to your second point, that is why clear, concise communication is more effective and important than verbose explanations and excuses. A perfect example is that the email mentioned a date without a time and/or a timezone; that suggests that LE’s communication is a last minute thought and LE has clearly underestimated how much time people needed to renew their certificates across their many distributions and personal scenarios.

Again, it’s not worth making excuses for this - it’s worth finding the reasons that the correct procedures and processes were not in place for such an event as this, which is why I asked whether the governance documents are available for LE so that the community can perhaps help contribute to better disaster resolution processes and procedures and quite frankly, if LE lacks the number of people that are required to handle such an event, they need to ask the community for more help. LE doesn’t just need technical staff that are capable of handling the bug, like in this circumstance; they need the right people in the organisation to help ensure that events such as these are planned for, well thought out and tested in advance for robustness.

1 Like

We did not receive any email from LetsEncrypt and found out about this on ArsTechnica. A 12am UTC deadline is absolutely unreasonable.

I think it’s very easy, especially as engineers, to respond in such a way: promoting better practices, showing the golden path for a new problem. The fact is that we are here, and all of the shoulda, coulda, woulda… are not helpful in a thread like this. The goal is to help people solve the problem, so that when folks find they are running X hundreds of domains, they can fix it rather than get bogged down in posts about what could have been. Open a new thread and link it to this one. I’m not a member of the LetsEncrypt team in any way, shape or form, but helping the community is a better use of time than supporting past decisions.

4 Likes

In addition to this, I used https://github.com/hannob/lecaa earlier today. I have about 140 domains with LE, and it was a great help. Hopefully it’s correct.

3 Likes

Do we know when the certs will actually be revoked? I was told 3/4/2020 18:00 UTC by my CDN provider, but I don’t see the same confirmation from LE. Does anyone have any details they can share?

I’m not talking about past decisions - I am talking about how LE are handling this event and providing feedback on the answers they are providing in this thread. My comments about how they can be better prepared in the future are not about a ‘past decision’ - they are about the decisions they are making right now in this thread, and my comments here have already led to a clarification that the timezone is UTC. We are not getting bogged down in posts about what could or should have been, and even if you feel that your criticism is appropriate, it is equally as ‘off point’ as mine would be.

The community should feel welcome to comment about whatever they wish - if you feel like creating a new thread that should be focussed on specific technical fixes, go right ahead and do that.

1 Like

@jxman they have mentioned in the edits at the top of this thread that they do not have a locked down specific time, but that they suggest that you should consider your certificates as having been revoked as of 2020-03-04T00:00Z (midnight at the start of the 4th March UTC)

We have not started revocations but stated that 00:00 UTC on 04 March 2020 would be the earliest we would start that process. When we begin the revocations, we will post an update here.

A post was split to a new topic: Replacing certificates with acme.sh

A post was split to a new topic: Renewing Certs with acme.sh

Jillian - Are you able to provide some more details about when you might start revocations? We’re in panic mode over here as we attempt to figure out which of our customers this might affect. Delaying the cert revocation would definitely give us more time to respond.

@jillian, I sincerely believe that if the requirements are open to interpretation in any way, you should at a minimum avoid revoking certificates until 2020-03-04T23:59Z to give people at least another 23.5 hours to fix their certs, if that falls within the interpretation of the regulations to which LE are bound.

In fact, having read the incident report, if the bug was confirmed at 2020-02-29 03:08 UTC, then 5 days should be 2020-03-05 03:08 UTC.
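
The date arithmetic above can be checked with a short sketch. It assumes that "5 days" means exactly 120 hours from the confirmation time quoted in the post. That is this post's reading, not a statement of how the Baseline Requirements count the deadline.

```python
from datetime import datetime, timedelta, timezone

# Timestamp quoted in the post as the time the bug was confirmed.
confirmed = datetime(2020, 2, 29, 3, 8, tzinfo=timezone.utc)

# Assumption: the five-day window is counted as exactly 120 hours.
deadline = confirmed + timedelta(days=5)

# Earliest revocation time announced in this thread.
announced = datetime(2020, 3, 4, 0, 0, tzinfo=timezone.utc)

print(deadline.isoformat())   # 2020-03-05T03:08:00+00:00
print(deadline - announced)   # 1 day, 3:08:00
```

Under that assumption, the announced earliest start is a little over 27 hours before the five-day mark.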

How long does LE expect the revocation process to take?

1 Like

Yeah, tell me about it… What about people using this in a SaaS application!!! This will kill the app and not just ‘visitors will see security warnings’ guh!!

We are still assessing how long it will take to revoke this many certificates with our tools. As soon as we have an estimate, we will update the community forum and other communication channels for when we will start. Thank you for your patience and understanding while we gather this information.

I have updated the following question in our F.A.Q at the top of our page:

5 Likes