I'm developing a server management app that connects to a server and among other things it installs certbot and generates wildcard certificates.
When requesting a certificate from the command line, certbot displays the TXT records that need to be added to the DNS and waits for the user to press Enter to continue with the verification process. This is fine when working manually from the command line.
Because of the way my app works, I can't simulate pressing the "Enter" key or anything similar, so I had to improvise something like this:
1. Start the certificate request process and wait until certbot provides the TXT records, get said records from the response and kill certbot.
2. Display the TXT values to the user in the GUI.
3. Once they've updated the records, they press a "Verify" button which runs the command from step 1 again and generates the certificates.
Most of the time this works as expected: the certificate is generated and all is OK. Sometimes, though, the TXT records received at step 1 become invalid when running the command again at step 3. Instead of finishing the certificate generation process, certbot claims the old records are no longer valid and returns a new set of TXT records. If I add the new TXT records to the domain's DNS, it again comes back with an invalid response and provides yet another set of TXT records.
What exactly am I doing wrong, and is there a recommended way of achieving what I need: get the TXT records with one command and later finish the process with another command, while keeping the TXT records the same between the two steps?
The command I'm currently using in steps 1 and 3 is this:
/usr/bin/certbot certonly --manual --force-renewal --preferred-challenges=dns --email my@email.com --agree-tos -d mydomain.com -d *.mydomain.com --manual-public-ip-logging-ok
Each time you run the certbot command, you'll get new TXT records unless Let's Encrypt has already cached a successful authorization for the domain. If you're going to use certbot for this, you need to do it all in one run, rather than running one command to start and another command later.
If you want to script certbot's manual mode, you're probably best off writing scripts to have it call rather than trying to parse its output directly.
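To make the hook idea concrete, here is a minimal sketch of a script that could be passed to certbot via `--manual-auth-hook`. Certbot sets `CERTBOT_DOMAIN` and `CERTBOT_VALIDATION` in the hook's environment; everything else here is an assumption made up for illustration: the `MYAPP_STATE_DIR` variable, the `<token>.json` file the GUI would read, and the `<token>.confirmed` marker file the "Verify" button would create. The hook blocks until the marker appears, so certbot stays in a single run and the TXT value does not change.

```python
#!/usr/bin/env python3
"""Sketch of a certbot --manual-auth-hook script (file layout is hypothetical)."""
import json
import os
import sys
import time
from pathlib import Path

# Hypothetical location shared between this hook and the GUI app.
STATE_DIR = Path(os.environ.get("MYAPP_STATE_DIR", "/tmp/myapp-acme"))


def challenge_from_env(env):
    """Build the TXT record certbot expects from the hook environment."""
    return {
        "name": "_acme-challenge." + env["CERTBOT_DOMAIN"],
        "value": env["CERTBOT_VALIDATION"],
    }


def publish_and_wait(env, state_dir, timeout=3600.0, interval=2.0, sleep=time.sleep):
    """Write the record for the GUI, then block until the GUI confirms it.

    Returns True once the marker file exists, False if the timeout passes.
    """
    record = challenge_from_env(env)
    state_dir = Path(state_dir)
    state_dir.mkdir(parents=True, exist_ok=True)
    token = record["value"]  # assumed safe as a file name (base64url text)
    (state_dir / (token + ".json")).write_text(json.dumps(record))
    marker = state_dir / (token + ".confirmed")
    waited = 0.0
    while not marker.exists():
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
    return True


# Only act when run as a script by certbot, which sets CERTBOT_VALIDATION.
if __name__ == "__main__" and "CERTBOT_VALIDATION" in os.environ:
    sys.exit(0 if publish_and_wait(os.environ, STATE_DIR) else 1)
```

As far as I know, certbot calls the hook once per requested name, one after another, so with `mydomain.com` and `*.mydomain.com` the GUI would see the two TXT values in sequence rather than together. How certbot reacts to a non-zero exit from the hook may depend on the version, so treat the timeout branch as something to test.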
But really, certbot is designed to be driven by users rather than by automation, and you'd be better off using an ACME library for whatever language you're writing your program in.
I think certbot can also reuse pending authz, i.e., authz not in the invalid state. So until you trigger the authz to try to validate the TXT record, you should be able to reuse it.
Otherwise I agree with your post: certbot probably isn't the best solution, but if you want to keep using it, the hooks are probably a better method indeed.
When responding to DNS challenges you add or update the TXT record, but it takes an indeterminate amount of time for that record to be copied to all of the nameservers for your domain (mostly within 1 minute, but in some cases considerably longer).
So you might update the TXT record and it all looks fine, then Let's Encrypt checks it and it happens to use the other nameserver and gets a stale response (or no result). That challenge is then marked as failed and won't be attempted again in future runs (it will be replaced with a different value next time).
You can poll all of your nameservers individually using dig etc to ensure the TXT record has indeed been updated before proceeding, or you can simply wait long enough.
Note that this method does not account for global anycast setups: your local nameserver might return the TXT record and everything looks fine, but on the other side of the globe a different nameserver with the same IP address might not have received the update yet.
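The per-nameserver polling described above could look roughly like this. It is a sketch that shells out to `dig +short TXT <name> @<nameserver>` and assumes `dig` is installed; the function names, retry counts and delays are made up for illustration, and the nameserver list has to be supplied by the caller (for example from an NS lookup).

```python
import subprocess
import time


def txt_values(name, nameserver, run=subprocess.run):
    """Ask one nameserver for the TXT records of `name` using dig."""
    result = run(
        ["dig", "+short", "TXT", name, "@" + nameserver],
        capture_output=True, text=True, timeout=15,
    )
    # dig +short prints one record per line, wrapped in double quotes.
    return {
        line.strip().strip('"')
        for line in result.stdout.splitlines()
        if line.strip()
    }


def all_nameservers_have(name, expected, nameservers, lookup=txt_values):
    """True only if every listed nameserver already returns the value."""
    return all(expected in lookup(name, ns) for ns in nameservers)


def wait_for_propagation(name, expected, nameservers, attempts=30, delay=10.0,
                         lookup=txt_values, sleep=time.sleep):
    """Poll until all nameservers agree, or give up after `attempts` tries."""
    for attempt in range(attempts):
        if all_nameservers_have(name, expected, nameservers, lookup):
            return True
        if attempt + 1 < attempts:
            sleep(delay)
    return False
```

A caller would pass something like `wait_for_propagation("_acme-challenge.mydomain.com", value, ["ns1.example.net", "ns2.example.net"])`, where the nameserver names are placeholders. Because the check is a set membership test, it also works when the base domain and the wildcard each need their own TXT value on the same name. It still only sees the nameserver instances reachable from where it runs, so the anycast caveat applies.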