Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. crt.sh | example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.
My domain is: letsencrypttest.agd.gov.au
I ran this command:
acme.sh -d letsencrypttest.agd.gov.au --server letsencrypt --dns --yes-I-know-dns-manual-mode-enough-go-ahead-please --issue
I had the TXT records created manually through a support ticket (purely for debugging purposes; the automated version uses a subdomain delegated to Azure DNS), then ran:
acme.sh -d letsencrypttest.agd.gov.au --server letsencrypt --dns --yes-I-know-dns-manual-mode-enough-go-ahead-please --renew
It produced this output:
[Mon Oct 23 14:15:35 AEDT 2023] The domain 'letsencrypttest.agd.gov.au' seems to have a ECC cert already, lets use ecc cert.
[Mon Oct 23 14:15:35 AEDT 2023] Renew: 'letsencrypttest.agd.gov.au'
[Mon Oct 23 14:15:35 AEDT 2023] Renew to Le_API=https://acme-v02.api.letsencrypt.org/directory
[Mon Oct 23 14:15:36 AEDT 2023] Using CA: https://acme-v02.api.letsencrypt.org/directory
[Mon Oct 23 14:15:36 AEDT 2023] Single domain='letsencrypttest.agd.gov.au'
[Mon Oct 23 14:15:36 AEDT 2023] Getting domain auth token for each domain
[Mon Oct 23 14:15:36 AEDT 2023] Verifying: letsencrypttest.agd.gov.au
[Mon Oct 23 14:15:38 AEDT 2023] Pending, The CA is processing your order, please just wait. (1/30)
[Mon Oct 23 14:15:43 AEDT 2023] Invalid status, letsencrypttest.agd.gov.au:Verify error detail:DNS problem: SERVFAIL looking up CAA for agd.gov.au - the domain's nameservers may be malfunctioning
[Mon Oct 23 14:15:43 AEDT 2023] Please add '--debug' or '--log' to check more details.
[Mon Oct 23 14:15:43 AEDT 2023] See: https://github.com/acmesh-official/acme.sh/wiki/How-to-debug-acme.sh
[Mon Oct 23 14:15:45 AEDT 2023] The dns manual mode can not renew automatically, you must issue it again manually. You'd better use the other modes instead.
My web server is (include version): N/A (I'm using DNS verification)
The operating system my web server runs on is (include version): Ubuntu 20.04
My hosting provider, if applicable, is: DNS is hosted by Telstra
I can login to a root shell on my machine (yes or no, or I don't know): yes
I'm using a control panel to manage my site (no, or provide the name and version of the control panel): no
The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
# acme.sh version
https://github.com/acmesh-official/acme.sh
v3.0.7
Something weird is going on between the Let's Encrypt servers and Telstra's DNS servers. With the Let's Encrypt staging servers, validation sometimes works and sometimes fails; with the production servers it fails every time. This used to work, but our scheduled renewals started failing on the 23rd of September and it has been broken since then.
The SERVFAIL sometimes happens on the TXT record and sometimes on the CAA record. We're not using DNSSEC, so it's unrelated to that.
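One way to narrow this down from our side would be to query each authoritative nameserver directly for the names involved. The sketch below is only that, a sketch: it assumes `dig` is installed, and the function names `caa_candidates` and `check_auth_servers` are made up for illustration (they are not part of acme.sh). The first function lists the names a CA walks through when it looks for CAA records, which is why the error mentions agd.gov.au and not the name on the certificate.

```shell
#!/bin/sh
# Print the names checked for CAA when climbing from the requested name
# towards the root (RFC 8659), one per line. A real lookup stops at the
# first name that returns a CAA record set.
caa_candidates() {
    name=${1%.}                       # drop a trailing dot if present
    while [ -n "$name" ]; do
        printf '%s\n' "$name"
        case $name in
            *.*) name=${name#*.} ;;   # strip the leftmost label
            *)   name='' ;;           # last label reached
        esac
    done
}

# Query every authoritative nameserver of a zone for one name and type,
# and print the response status (NOERROR, SERVFAIL, ...) per server.
# Hypothetical helper; needs network access and dig.
check_auth_servers() {
    zone=$1 qname=$2 qtype=$3
    for ns in $(dig +short NS "$zone"); do
        status=$(dig +norecurse +time=5 +tries=1 "@$ns" "$qname" "$qtype" \
            | sed -n 's/.*status: \([A-Z]*\).*/\1/p')
        printf '%s %s %s -> %s\n' "$ns" "$qname" "$qtype" "${status:-no reply}"
    done
}
```

For example, `check_auth_servers agd.gov.au agd.gov.au CAA` and `check_auth_servers agd.gov.au _acme-challenge.letsencrypttest.agd.gov.au TXT` would show whether any single nameserver answers differently from the others. This only tests the path from the machine running it, so a clean result would not rule out a problem on the path from Let's Encrypt.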
While debugging I decided to try Google Trust Services and it worked fine, so it seems to be a transient issue affecting only the network path between Let's Encrypt production and Telstra. I raised a support request with Telstra; they looked through their DNS logs and didn't find any errors during the window. They saw the requests from Let's Encrypt, but no errors were logged.
I tried using letsdebug.net and it showed the failure.
When I ran it again, it succeeded.
Is it possible to get more detail on why the Let's Encrypt DNS resolvers think it is a SERVFAIL? Is it a connectivity problem? Why does staging sometimes work and production never? Why does it work with Google Trust Services instead of Let's Encrypt when the DNS provider is the same?
I've been trying to diagnose this for a few weeks now and it seems a really curly problem. At the moment it seems like it's a case of "The Internet says No".

