How to continuously create/renew certificates without hitting limits?

So here is a rough proof of concept for clustered Let's Encrypt with Traefik in Docker Swarm. Without a shared file system. Without spending €3000.

DO NOT USE THIS FOR WEBSITES WITH SLAs; there are many ways it can break. For example, if the container is re-scheduled to a different node, certbot has to request all certificates again. This takes time and may hit the Let's Encrypt rate limits, leaving services unavailable.

Alternatives
It's probably a lot safer to use a shared file system if you can and are willing to set it up.

Workflow

  1. Run a single instance of the certbot container on a Traefik node
  2. Start a web server in the container to answer the own challenge and serve the dynamic config
  3. Loop: Fetch domains from Traefik API
  4. Loop: Generate own challenge to see if domain is reachable
  5. Loop: Run certbot certonly --non-interactive --keep-until-expiring ...
  6. Loop: Create dynamic config file with certs for Traefik
  7. Loop: Sleep 15 seconds
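
Step 3 depends on how the router rules are written. Here is a minimal sketch of the host extraction using only grep, tr and sort, so it does not need pcre-tools. The function name `extract_hosts` is made up for this example. It assumes the rules appear in the API response as plain text such as Host(`example.com`), and it also handles several back-quoted names inside one Host(...) matcher, which some Traefik versions accept.

```shell
# Hypothetical helper: reads the router JSON on stdin and
# prints every host name found in a Host(...) matcher, one per line.
extract_hosts() {
  # 1st grep: isolate each Host(`...`) matcher (PathPrefix etc. are ignored)
  # 2nd grep: isolate each back-quoted name inside those matchers
  # tr:       strip the back-quotes
  # sort -u:  sort and drop duplicates
  grep -oE 'Host\(`[^)]*`\)' |
    grep -oE '`[^`]+`' |
    tr -d '`' |
    sort -u
}
```

It could replace the pcregrep pipeline like this: `DOMAINS=$(curl --silent --max-time 5 http://user:pass@traefik:8080/api/http/routers | extract_hosts)`.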

Own challenge
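The own challenge exists because routing may be misconfigured, and we don't want to contact Let's Encrypt every 15 seconds for the same non-working domain; that would hit the rate limits very quickly. Before certbot runs, the container places a file under /.well-known/acme-challenge/ and fetches it through the domain itself.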

Potential To-Dos
Currently the container serves every certificate it finds on disk. Some may no longer be needed, and some may be expired. It would make sense to filter them against the domain list from Traefik, but that list may be empty (network trouble), and you never want to generate an empty dynamic config file. The list may also contain non-working domains that have no cert files. Food for thought.
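
One possible way to approach this, as a rough sketch and not something the proof of concept does: select only the directories for domains Traefik reported, skip any that lack a cert or key file, and fall back to everything on disk when the domain list is empty. The function name `wanted_cert_dirs` and its arguments are made up for this example. It does not check for expiry.

```shell
# Hypothetical helper. Usage: wanted_cert_dirs LIVE_DIR [DOMAIN...]
# Prints the certificate directories that should go into the config.
wanted_cert_dirs() {
  live=$1
  shift
  # Empty domain list (e.g. Traefik API unreachable): fall back to every
  # directory on disk instead of ending up with an empty result.
  if [ $# -eq 0 ]; then
    for dir in "$live"/*/; do
      [ -d "$dir" ] && set -- "$@" "$(basename "$dir")"
    done
  fi
  for name in "$@"; do
    # Keep only names for which both the chain and the key exist.
    if [ -f "$live/$name/fullchain.pem" ] && [ -f "$live/$name/privkey.pem" ]; then
      printf '%s\n' "$live/$name"
    fi
  done
}
```

In the script, the `find` in the generation loop could then become `$(wanted_cert_dirs /etc/letsencrypt/live $DOMAINS)`.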

Traefik configuration
Traefik can use an HTTP provider in the static config to poll the dynamic configuration at a fixed interval:

providers:
  http:
    endpoint: "http://traefik_certbot/traefik-certbot.yml"
    pollInterval: 15s
    pollTimeout: 5s

Traefik dynamic configuration
The certbot container will return the dynamic config with certificates inline:

tls:
  options:
    default:
      minVersion: VersionTLS12
  certificates:
    # CERT FILE /etc/letsencrypt/live/example.com
    - certFile: |-
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      keyFile: |-
        -----BEGIN PRIVATE KEY-----
        ...
        -----END PRIVATE KEY-----
    # CERT FILE /etc/letsencrypt/live/www.example.com
    - certFile: |-
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----
      keyFile: |-
        -----BEGIN PRIVATE KEY-----
        ...
        -----END PRIVATE KEY-----

Proof-of-concept shell code

# run within certbot container
apk add curl pcre-tools

WEBROOT=/webroot
echo WEBROOT $WEBROOT
mkdir -p $WEBROOT/.well-known/acme-challenge

echo START WEBSERVER
python -m http.server 80 --directory $WEBROOT &
/bin/sleep 2

while true; do

  echo FETCH DOMAINS
  DOMAINS=$( \
    curl --silent --max-time 5 http://user:pass@traefik:8080/api/http/routers | \
    pcregrep -o '(?<=Host\(`).*?(?=`\))' | sort | uniq \
  )
  echo FETCH DONE

  for NAME in $DOMAINS; do
    echo DOMAIN $NAME

    FILE=/.well-known/acme-challenge/traefik-certbot-$EPOCHREALTIME
    touch $WEBROOT$FILE
    # --fail makes curl return non-zero on HTTP errors such as 404
    curl --silent --fail --max-time 5 http://$NAME$FILE > /dev/null
    ERR=$?
    rm $WEBROOT$FILE

    if [ $ERR -eq 0 ]; then
      echo DOMAIN CHALLENGE OK, RUN CERTBOT
      certbot certonly \
        --webroot -w $WEBROOT \
        --non-interactive \
        --agree-tos \
        --no-eff-email \
        --keep-until-expiring \
        -m email@example.com \
        --quiet \
        --cert-name $NAME \
        -d $NAME
      if [ $? -eq 0 ]; then
        echo CERTBOT OK $NAME
      else
        echo CERTBOT FAILED $NAME
      fi
    else
      echo DOMAIN CHALLENGE FAILED http://$NAME$FILE
    fi

  done

  echo TRAEFIK TLS FILE GENERATION
  FILE=$WEBROOT/traefik-certbot.yml
  printf "tls:\n  options:\n    default:\n      minVersion: VersionTLS12\n  certificates:\n" > $FILE
  for NAME in $(find /etc/letsencrypt/live/ -maxdepth 1 -mindepth 1 -type d -print) ; do
    printf "TRAEFIK TLS FILE ADD $NAME\n"
    printf "    # CERT FILE $NAME\n" >> $FILE
    printf "    - certFile: |-\n" >> $FILE
    sed -e 's/^/        /' $NAME/fullchain.pem >> $FILE
    printf "      keyFile: |-\n" >> $FILE
    sed -e 's/^/        /' $NAME/privkey.pem >> $FILE
  done

  #echo TRAEFIK TLS FILE CONTENT
  #cat $WEBROOT/traefik-certbot.yml

  echo SLEEP
  /bin/sleep 15
done
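
One weak spot in the loop above: the config file is truncated and then appended to in place, so Traefik might poll it while it is half written. A minimal sketch of a safer publish step follows, assuming the config is generated to stdout first. The function name `publish_config` is made up for this example. It writes to a temporary file in the same directory, refuses to publish a config without any certificate entry, and then renames the file into place.

```shell
# Hypothetical helper. Usage: generate_config | publish_config TARGET
# Returns non-zero and leaves TARGET untouched if stdin has no certFile entry.
publish_config() {
  target=$1
  # Temp file next to the target, so the final mv is a rename on one file system.
  tmp=$(mktemp "$target.XXXXXX") || return 1
  cat > "$tmp"
  # Keep the previous config if the new one contains no certificate.
  if ! grep -q 'certFile:' "$tmp"; then
    rm -f "$tmp"
    return 1
  fi
  mv -f "$tmp" "$target"
}
```

The generation part of the loop would then print to stdout inside a `{ ...; }` group and pipe into `publish_config $WEBROOT/traefik-certbot.yml`.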