Statistik on the Main Page


#1

Here the number of cert issued for 11.01.2015 00:00 is 281.226k
While unexpired stats is 284.5k

How it is possible that there are more certificates valid than signed ?


#2

If “Issued” only counts non-revoked certs, the numbers would match:

 281,225 (Issued)
+  3,304 (Revoked)
________
 284,529 (Unexpired)

Not sure if “Issued” is the right word in that case, but it can be hard to find unambiguous terms for things like this.


#3

Hi, yes this could be an explanation. But to count revoked as not expired get me also an bad feeling.
For me an certificate is expired under different conditions:

  • No longer valid because of time constraint in the certificate.
  • Expired because browser not longer accept them based on revocation.
  • Expired because browser no longer accept them for other reason (MD5 signature this is not the case with LE but there may be other reasons)

So expired mean for me currently valid (or become valid in the future).


#4

I’m all for adjusting the language. Suggestions welcome. :slightly_smiling:

I’m also slowly releasing a little code that crawls CT to produce the same information, just from public sources. Then you can really cross-check that page: https://github.com/jcjones/letsencrypt_statistics


#5

A couple other things:

The two charts are recomputed at different times of day; plotly has taken to heavily throttling our account, so the topmost chart is updated every 4 hours while the others are generally updated once a day. I say generally because if there are any transient updates due to restarts, etc., we hit the rate limit early and often they don’t update. Swapping to hosting the datasets somewhere else is on the long-term to-do list, but the stats page is a work-in-progress anyway.

The charts on https://letsencrypt.org/stats/ are from the Boulder DB, so they will always match CT, they may be ahead by the merge-time.

The Daily Activity chart is the blend of two queries; the Certificates query is a simple:

 SELECT
      DATE(c.issued),
      SUM(scs.status = 'good') as good,
      SUM(scs.status = 'revoked') as bad
  FROM
      `boulder`.certificates AS c
      JOIN
      `boulder`.certificateStatus AS scs ON c.serial = scs.serial
  GROUP BY DATE(c.issued);

which needs a WHERE eventually, but it doesn’t have it today.

The Active Certificates chart uses this query:

SELECT
    DATE(c.issued) AS periodStarting, COUNT(1) AS numIssued
FROM
    `boulder`.certificates AS c
        JOIN
    `boulder`.certificateStatus AS scs ON c.serial = scs.serial
WHERE
    c.issued > DATE_SUB(NOW(), INTERVAL 180 DAY)
GROUP BY DATE(c.issued)
ORDER BY c.issued ASC;

and the result is processed through a for loop based on the cert lifetime to figure how many unexpired certificates there are in total, as that’s not easy to loop through in pure SQL:

for row in cur.fetchall():
  datestamp = row[0]
  numIssued = int(row[1])

  rawdata['date'].append(datestamp)
  rawdata['issuedThatWeek'].append(numIssued)
  thisOffset = len(rawdata['date'])

  unexpired = 0
  for i in range(max(0, thisOffset-self.certLifetime), thisOffset):
    unexpired += rawdata['issuedThatWeek'][i]

  # print("Date: {0}".format(self.certificatesByDayGetDate(datestamp)))
  # print("That Period: {0}".format(numIssued))
  # print("Unexpired: {0}".format(unexpired))

  issued.x.append(self.certificatesByDayGetDate(datestamp))
  issued.y.append(unexpired)

Right now the code for this needs a cleanup to be useful to anyone else, and modularized to pull out LE-specific stuff before it can be published. If you’re really interested, I can do that in the coming days.


#6

When i read the query and the problem i have instantly some points:

  • Each certificate can only have exactly one Status, one status is only valid for one certificate
    => Why not store the Status in the certificate table, this would eliminate the join.
  • If there is an index on issued,status than count(1) for an selected range would only count index range.

This should speedup the generation.