Update Script Failing... Wait, no come back!

You can make symlinks in /etc/pki that point to /etc/letsencrypt/live, but if you change the contents of /etc/letsencrypt/live itself, Certbot will break. It uses the symlink targets there to keep track of renewed certificate versions. They're not just cosmetic or for convenience but are actually used to store data.
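As a rough sketch of that first option, the links can live entirely outside /etc/letsencrypt so that nothing under live/ is modified. The helper name link_live_certs, the destination directory, and the link naming scheme are assumptions made for illustration; none of them come from Certbot.

```shell
#!/bin/sh
# Sketch only: expose Certbot-managed files under another directory via
# symlinks, leaving /etc/letsencrypt/live itself untouched.
# link_live_certs is a hypothetical helper, not part of Certbot.
link_live_certs() {
    live_dir=$1   # e.g. /etc/letsencrypt/live/quantum-equities.com
    dest_dir=$2   # e.g. /etc/pki/tls/certs (assumed destination)
    name=$3       # prefix used for the link names (assumed scheme)

    mkdir -p "$dest_dir" || return 1
    for f in cert chain fullchain privkey; do
        # Skip any file that is not present in the live directory.
        [ -e "$live_dir/$f.pem" ] || continue
        # -f replaces an existing link of the same name.
        ln -sf "$live_dir/$f.pem" "$dest_dir/${name}_$f.pem" || return 1
    done
}

# Example call, commented out so the sketch has no side effects:
# link_live_certs /etc/letsencrypt/live/quantum-equities.com \
#     /etc/pki/tls/certs quantum-equities.com
```

Because the links point at the names in live/ rather than at files in archive/, they should keep resolving to the current certificate after a renewal.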

Ok I tried that but now:
# certbot certonly -d quantum-equities.com,www.quantum-equities.com,mail.quantum-equities.com
…
Failed authorization procedure. mail.quantum-equities.com (http-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: Fetching http://mail.quantum-equities.com/.well-known/acme-challenge/-p1Ndn7gp-61EXGdukO2JzFhhXxtgxmdY8v0X21XRx0: Connection refused
…

mail.quantum-equities.com is at a different IP, but for certbot's benefit I have it defined as a virtual server in the Apache server. Looks like I've failed.

Is it on a different server? In that case you would ordinarily need to run Certbot on that server. Certbot needs to prove that you control each name for which you're requesting a certificate. (There are potential workarounds for this if you can't run Certbot on the other server or if you particularly need the names to be covered by the same certificate instead of separate certificates.)

Mail is running on another server and I can run certbot on the other server, but this would mean requesting yet more certs from LE which may unnecessarily run me over my limit.

I'm trying to reuse my certs for all domain aliases using SANs. This used to work.

Did you previously have mail hosted on the same server?

No, mail's always been on a different server. (6 months) When I got the certs I copied them over to the mail server. It's all just SSL I guess.

I don't know how you could have gotten certs for the mail subdomain if it was always pointed at a different server! You weren't using --manual, or something, but rather --apache?

No I got them manually with:
# certbot certonly -d quantum-equities.com,www.quantum-equities.com,mail.quantum-equities.com

This doesn't work anymore. (option 2)

… and hoped to renew them using a script which in part calls:
certbot certonly --csr /etc/letsencrypt/csr-quantum-equities.com.csr --fullchain-path /etc/pki/tls/chains/quantum-equities.com_fullchain-${DATE}.pem --chain-path /etc/pki/tls/chains/quantum-equities.com_chain-${DATE}.pem --cert-path /etc/pki/tls/certs/quantum-equities.com_cert-${DATE}.pem --apache > ${OUT} 2>&1

What can I do to make a working system?

Commenting out all SSL parameters in Apache vhosts.conf, it fails with a nebulous error:

-- Subject: Unit httpd.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit httpd.service has begun starting up.
May 22 12:27:16 quantum.darkmatter.org systemd[1]: httpd.service: main process exited, code=exited, status=1/FAILURE
May 22 12:27:16 quantum.darkmatter.org kill[568]: kill: cannot find process ""
May 22 12:27:16 quantum.darkmatter.org systemd[1]: httpd.service: control process exited, code=exited status=1
May 22 12:27:16 quantum.darkmatter.org systemd[1]: Failed to start The Apache HTTP Server.
-- Subject: Unit httpd.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit httpd.service has failed.
-- 
-- The result is failed.
May 22 12:27:16 quantum.darkmatter.org systemd[1]: Unit httpd.service entered failed state.
May 22 12:27:16 quantum.darkmatter.org systemd[1]: httpd.service failed.
May 22 12:27:16 quantum.darkmatter.org polkitd[738]: Unregistered Authentication Agent for unix-process:561:8811655 (system bus name :1.142, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)

I have a website catastrophe on 3 sites, and tomorrow will also lose all email. And the only option I can see is removing all encryption to try to get things running. Embarrassing.

I can't understand why stripping down /etc/letsencrypt to nothing except options, and manually trying to fetch certs, does not work anymore. This leaves me with no options.

I don't know the reason for that systemd error; maybe it's explained more clearly in Apache's own logs, probably in /var/log/apache2?

I still don't understand your responses to some earlier questions in this thread.

  • Why use --csr? It's not related to whether you use --apache or not (or to whether or not Apache can start).

In particular, this is not correct at all. It can't be running for --standalone, which is totally independent of --csr.

Also, is it possible that you're confusing --manual and --standalone? You referred to getting certificates "manually" but I didn't understand which authenticator you were using.

I expect that this should never have worked with --standalone if you always had mail.quantum-equities.com on a separate server. The certificate authority would have tried to connect to port 80 of mail.quantum-equities.com, and it shouldn't have anything that could satisfy that CA's challenges on that machine if that wasn't the same machine where you were running Certbot.

To summarize a bit:

When I use --standalone, I'm referring to the standalone authenticator plugin, which can also be selected from the interactive menu if you didn't specify the --standalone option on the Certbot command line.

--csr is separate from --standalone. The effect of --csr is to use a certificate signing request (CSR) file to specify the public key and domain names that the certificate should cover. Otherwise, Certbot will create a new public key and try to identify the domain names from various methods (including -d options on the command line, or reading them out of your Apache configuration if you used --apache, or interactively prompting you).

If you use --csr, the resulting certificate isn't saved in /etc/letsencrypt at all and can't be renewed with certbot renew.

If you don't use --csr, the resulting certificate is saved in /etc/letsencrypt and can be renewed with certbot renew.

Only --standalone requires you not to have a running web server (such as Apache) at the moment that you obtain the certificate. By contrast, --webroot requires that you do have a running web server, and --apache requires that you have a working copy of Apache installed. Each of these authentication options can be used with or without --csr (as --csr can be thought of as specifying what certificate to get, and the authenticator options like --standalone can be thought of as specifying how to get it).

If you use certonly, Certbot doesn't attempt to install the certificate after it's obtained. Otherwise, if you specified an installer plugin, Certbot will also attempt to install the certificate.

All of these plugins currently prove control of your domain names using a method that requires you to receive inbound connections from the certificate authority on port 80. Therefore, they normally can't obtain certificates for names that are pointed exclusively at other servers, unless you have an HTTP redirect in place that causes the CA's connection to the other server to be redirected back to the server where you're running Certbot. There are other Certbot options (including various uses of --manual and more recently some of the DNS plugins) that are able to get certificates for other servers in some circumstances.

Certbot does not expect you to make an HTTPS virtual host in your Apache configuration before obtaining a certificate. This is not necessary. However, you can't have a broken or incomplete HTTPS virtual host. If it's present at all, it must already work properly. If you (for example) delete or rename certificates that your Apache configuration is pointing at, it will break because Apache doesn't know where the certificates have gone. No Certbot commands can "uninstall" a Certbot-installed certificate from a web server configuration, although there are revert commands that will roll back your web server configuration to a backup version if a Certbot installer has changed the web server configuration.

I definitely agree with @_az's observation that

Each authenticator has different prerequisites and behaviors.

However you proceed, you'll probably need to fix Apache so it can run; the relevant error messages might be in /var/log/apache2 rather than in the systemd output. If, after fixing Apache so it can start, you do find that a particular Certbot command fails, please post that command and its output here so that we can help you look into why that command has failed.

--csr and --apache are different issues. I've learned here that --apache is the auth method, although I don't fully understand how it works. webroot didn't work at all when httpd couldn't start, so it has to be --apache.

I am trying to use --csr because with quarterly expirations I'd have to regen httpd Public-Key-Pins and DNS DANE every renewal, and that's untenable.

This process has been so painful and I still don't have anything like a renew script that works, so I've just resorted to manually getting the certs separately for the web and mail servers. I have a million other things I should have already done by now but my web and mail were catastrophically down. The certbot system seems an anachronistic tack-on to Linux. Maybe that's so it works on all platforms, I don't know. But the certs belong in /etc/pki/tls. None of my business.

This will probably be different in the next release of Certbot, which should support a --reuse-key option to use the same subject public key again for renewals.

That's going to help a lot of people. Congrats on the feature :tada: !

At least for HPKP, you can pin a public key hash for an intermediate cert, avoiding this problem. Not sure about DANE though.

Then copy them there (using the -hook options for certbot, ideally), or symlink them there from /etc/letsencrypt/live.
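A minimal sketch of the copy approach, assuming the script is registered as a deploy hook (for example with --deploy-hook). Certbot sets RENEWED_LINEAGE to the live directory of the renewed certificate when it runs such a hook. The helper name copy_renewed_certs, the DEST_DIR variable, and the output file naming are assumptions for illustration. The private key is deliberately left out here; copying it would need extra care with file permissions.

```shell
#!/bin/sh
# Sketch only: copy renewed certificate files out of /etc/letsencrypt/live.
# copy_renewed_certs is a hypothetical helper, not part of Certbot.
copy_renewed_certs() {
    lineage=$1    # e.g. /etc/letsencrypt/live/mail.quantum-equities.com
    dest_dir=$2   # e.g. /etc/pki/tls/certs (assumed destination)
    name=$(basename "$lineage")

    mkdir -p "$dest_dir" || return 1
    for f in cert chain fullchain; do
        # cp -L follows the symlink in live/ and copies the real file.
        cp -L "$lineage/$f.pem" "$dest_dir/${name}_$f.pem" || return 1
    done
}

# Only act when Certbot has invoked this script as a hook.
if [ -n "${RENEWED_LINEAGE:-}" ]; then
    copy_renewed_certs "$RENEWED_LINEAGE" "${DEST_DIR:-/etc/pki/tls/certs}"
fi
```

Copies go stale until the hook runs again, which is why the hook is the natural place for this; the symlink option avoids that at the cost of depending on /etc/letsencrypt/live at read time.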

Dooming you in a different way if your CA(s) change intermediates unexpectedly.

(Chrome is phasing out HPKP because it's so dangerous.)

It looks like this is your problem. Most likely the CSR you are providing is unprocessable by OpenSSL in some way. Can you share your CSR here? It will include only public information (public key, hostnames).

Note that Let's Encrypt supports a maximum RSA key size of 4096 bits.

I've wiped it all on the web server and started over.

Now the mail server is whacked. The updated certs are not going into archive, but into live/{domain}. And the live symlink is not changed so my cert is dead.

Here is the sum-total of my update script:
certbot certonly --standalone --agree-tos --email colony@proton.com --renew-by-default --domains mail.quantum-equities.com --csr /etc/letsencrypt/csr-mail.quantum-equities.com.csr --keep

if [ ! -f ${FULLCHAIN} -o ! -f ${CHAIN} -o ! -f ${CERT} ]; then
    cat /tmp/certbot-QE.out | mail -s "TLS Cert Update Fail for mail.quantum-equities.com" postmaster@quantum-equities.com
fi

The use of --csr completely disables the use of /etc/letsencrypt/renewal, /etc/letsencrypt/live, and /etc/letsencrypt/archive. It's an entire alternative to enrolling your certificate for management and automated renewal by Certbot. If you don't specify another location, the new certificate files will be saved in the current directory.
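To illustrate, here is a sketch of how an update script might check the output of a --csr run: give Certbot explicit output paths and then test exactly those paths, rather than looking in live/ or archive/. The helper name csr_outputs_ok, the output directory, and the dated file names are assumptions; the --cert-path, --chain-path and --fullchain-path options are the ones already used earlier in this thread.

```shell
#!/bin/sh
# Sketch only: verify that every expected output file exists and is non-empty.
# csr_outputs_ok is a hypothetical helper, not part of Certbot.
csr_outputs_ok() {
    for f in "$@"; do
        # -s is true only if the file exists and has a size greater than zero.
        [ -s "$f" ] || { echo "missing or empty: $f" >&2; return 1; }
    done
}

# Example use in an update script (paths and names are assumptions),
# commented out so the sketch has no side effects:
# OUT_DIR=/etc/pki/tls/certs
# DATE=$(date +%Y%m%d)
# CERT=$OUT_DIR/mail.quantum-equities.com_cert-$DATE.pem
# CHAIN=$OUT_DIR/mail.quantum-equities.com_chain-$DATE.pem
# FULLCHAIN=$OUT_DIR/mail.quantum-equities.com_fullchain-$DATE.pem
# certbot certonly --standalone \
#     --csr /etc/letsencrypt/csr-mail.quantum-equities.com.csr \
#     --cert-path "$CERT" --chain-path "$CHAIN" --fullchain-path "$FULLCHAIN"
# csr_outputs_ok "$CERT" "$CHAIN" "$FULLCHAIN" || echo "TLS cert update failed"
```

Quoting the paths and testing with -s also avoids the case where an unset variable makes the check pass or fail for the wrong reason.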

The --reuse-key feature officially landed today, but hasn't yet been included in a public release. However, any subsequent release of Certbot should support it (if installed via certbot-auto, pip, or git clone; it will take much longer for this feature to reach OS package manager versions). That is, Certbot versions 0.25.0 or later should include this option.
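A small sketch of how a script could guard on that version before relying on the option. The helper name supports_reuse_key is hypothetical, and the parsing assumes that certbot --version prints the version number as the last word of its output.

```shell
#!/bin/sh
# Sketch only: return success if a version string is 0.25.0 or later.
# supports_reuse_key is a hypothetical helper, not part of Certbot.
supports_reuse_key() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # second component
    [ "$major" -gt 0 ] || [ "$minor" -ge 25 ]
}

# Example use, commented out so the sketch has no side effects:
# version=$(certbot --version 2>&1 | awk '{print $NF}')
# if supports_reuse_key "$version"; then
#     certbot certonly --standalone --reuse-key -d mail.quantum-equities.com
# fi
```

Without --csr the certificate is managed under /etc/letsencrypt, so certbot renew applies to it while the subject public key stays the same across renewals.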
