As an aside, I think the Certbot docs really need to document a single successful path to use DNS validation - at the moment it is a bit hard to read between the lines/base it off the Cloudflare example.
Hopefully assisted-dns-01 lands some day and saves all this pain anyway.
Yeah, it would probably be fine. If there’s one lesson I learned as an ACME client author, it’s that people sometimes have really ****ed up systems and it’s worth doing things in the most defensive way.
If there is a downside to installing things into the Certbot venv that I don’t know about, I would definitely consider changing it to use system Python. Notably, the venv getting wiped out shouldn’t matter since the hook itself ensures the presence of dns-lexicon.
Nice post @_az. We're planning on adding a way for users to optionally install the DNS plugins with certbot-auto (see #4767), but unfortunately this isn't going to happen in the next month or two. Luckily, we've been working with the maintainers of most of our distro packages and it is expected that our DNS plugins will be available in major distros such as Debian stretch backports, our Ubuntu PPA, Fedora, and EPEL 7 before the ACMEv2 endpoint goes live.
The one problem with the post is certbot-auto deletes and recreates its virtual environment every time it upgrades to a new version. We've found this to be much more reliable than trying to upgrade packages as can be seen in #4047 where someone tried to change this behavior. This means that after certbot-auto upgrades, dns-lexicon will no longer be installed.
A number of our plugins use dns-lexicon internally so distros like the ones I mentioned above should start having an OS package you can rely on. Alternatively, I'd recommend creating a 2nd virtual environment especially for dns-lexicon.
I agree and I made #5564 to track the issue. Feel free to make issues on that repo or @ name me on this forum if there's something we can change to be more useful for people.
I had realised this mistake and re-published the article within an hour of posting it - the installation of dns-lexicon is now performed within the venv at hook runtime, so the venv wipeout should not be an issue (tested).
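For anyone wondering what "ensure the presence of dns-lexicon at hook runtime" can look like, here is a rough Python sketch of the idea. It is not the hook from the article; the function name `ensure_package` and the `installer` parameter are made up for illustration. The only facts relied on are that the dns-lexicon distribution is imported as `lexicon` and that `python -m pip install` installs into the environment of the interpreter that runs it.

```python
import importlib.util
import subprocess
import sys


def ensure_package(module_name, pip_name=None, installer=None):
    """Install a package with the running interpreter's pip if it is missing.

    Hypothetical helper: returns True if an install was attempted and
    False if the module was already importable.
    """
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(module_name) is not None:
        return False
    # sys.executable is the interpreter running this hook, so pip installs
    # into that interpreter's environment (the venv, if the hook runs there).
    cmd = [sys.executable, "-m", "pip", "install", pip_name or module_name]
    (installer or subprocess.check_call)(cmd)
    return True


# In a hook you would call something like:
#     ensure_package("lexicon", "dns-lexicon")
# before using lexicon, so a freshly recreated venv gets repopulated.
```

Because the check runs on every hook invocation, it costs almost nothing when the package is already there and repairs the venv after an upgrade wipes it.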
Thank you for posting the links. Many issues to think about. Sad to see that the AWS SDK is a monster in every language.
Oh nice! Yeah that should handle the problem. I think you probably already had things this way when I posted my comment and I just didn't realize it. If so, sorry for the noise!
Yeah definitely a lot to think about. We're also considering alternatives to certbot-auto entirely such as building our own OS packages or binaries. Hoping to spend a significant amount of time getting our packaging in better shape shortly after the transition to ACMEv2.
Just to flag it for people here and avoid confusion in the future: one unfortunate limitation we hit with Lexicon while preparing Certbot’s 0.22.0 release is that, for many of its providers, it does not currently support creating multiple TXT records on the same domain, or deleting a single TXT record from a domain that contains several.
The original post here should still work great for many people, and this is not a problem right now. However, once Let’s Encrypt’s ACMEv2 endpoint goes live, you may not be able to obtain a single certificate covering both a wildcard and its base domain until this is resolved. I opened an issue to track this at https://github.com/AnalogJ/lexicon/issues/182.
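To illustrate why the wildcard plus base domain case needs two TXT records on one name, here is a small Python sketch. It follows the dns-01 rules as I understand them from the ACME spec: the record lives at `_acme-challenge.<domain>`, the `*.` prefix is dropped for wildcards, and the value is the unpadded base64url SHA-256 of the key authorization. The function names and the key authorization strings are placeholders, not real ACME data.

```python
import base64
import hashlib
from collections import defaultdict


def validation_name(identifier):
    """DNS name that holds the dns-01 TXT record for an identifier."""
    # A wildcard is validated at the same name as its base domain.
    base = identifier[2:] if identifier.startswith("*.") else identifier
    return "_acme-challenge." + base


def txt_value(key_authorization):
    """Unpadded base64url SHA-256 digest of the key authorization."""
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def records_needed(key_authorizations):
    """Group TXT values by record name; input maps identifier -> key authorization."""
    records = defaultdict(set)
    for identifier, key_auth in key_authorizations.items():
        records[validation_name(identifier)].add(txt_value(key_auth))
    return dict(records)


# Placeholder key authorizations; real ones come from the ACME server.
example = records_needed({
    "example.com": "token-one.account-thumbprint",
    "*.example.com": "token-two.account-thumbprint",
})
```

Here `example` ends up with a single name, `_acme-challenge.example.com`, carrying two different values, which is exactly the situation a provider that overwrites or cannot hold multiple TXT records will trip over.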
I just realized that there is a bit of a security red flag with this approach, as it requires the upstream DNS system’s API key to be stored in plaintext on the server. People should approach this cautiously and understand the dangers if their server is compromised. I’m leaning towards handling this locally, then syncing certs onto our cluster.
The APIs of most DNS providers don’t partition ACL permissions based on record types, so gaining access to this key would allow hackers to update the A and MX records of a domain.
If the DNS is handled by the registrar, the API access may include full control over a domain, including WHOIS info and transfer control. (This is the default case for Namecheap’s API, though you can create a separate account for API access and delegate only DNS permissions to that account for your domains.)
IMHO, unless your DNS API credentials can be locked to TXT records, I don’t think this is a safe approach. I think safer ideas would be to handle this as a manual update on the server (the API token could be GPG encrypted on the server and decrypted as needed) or run locally and then sync keys to the server.
We had a node compromised a few years ago because the then-default Redis configuration had a known exploit. I’ve since been very wary of how API keys are stored and what they allow.
I agree, I’ve raised that concern on previous occasions 1 & 2.
Apart from acme-dns, jsha also suggested assisted-dns-01, a kind of “official acme-dns”, on the IETF ACME list, but I don’t think it was received so well.
It is possible to use the documented approach safely, though. You can run certbot in certonly mode on a server that has no internet-accessible services. Pre-distribute the private key of the certificate to downstream users of the certificate, and then just copy the certificate down on a regular automated schedule via ssh/rsync/whatever else. The private key never has to touch the network.
I would guess though, that this is way too much work/thinking for the majority of users.
Edit: I would also add:
As long as you keep the API secrets readable only by root and you don’t run your software at a high privilege level, an attacker needs a fairly complex compromise to reach them.
This problem exists across pretty much all ACME clients that support the DNS challenge.
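On the point about keeping the API secrets readable only by root: a hook could refuse to run if the credentials file is more open than that. Below is a minimal Python sketch for POSIX systems. The helper names are hypothetical and the file path is whatever you use for your own credentials.

```python
import os
import stat


def is_private_to_owner(path):
    """True if neither group nor others have any permission bits on path."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def is_root_only(path):
    """True if path is owned by uid 0 and closed to group and others."""
    return os.stat(path).st_uid == 0 and is_private_to_owner(path)


# Hypothetical use at the top of a hook:
#     if not is_root_only("/path/to/your/dns-credentials"):
#         raise SystemExit("credentials file is too permissive")
```

This only guards against accidental loose permissions (for example, a file created as 0644). It does nothing if the attacker already has root, which is why scoping the API key itself still matters.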