Certificate request fails (DNS-01: zone not found) with tailscale enabled with identical search domain

This is a bit of a weird one perhaps.

System Setup
We have a series of internal servers, all of which have tailscale installed for easy access by team members and for the occasional traffic routing between them when it makes sense (though most of the time we use direct connections).

Tailscale is currently set up with a custom DNS domain (internal.joinmastodon.org), both because it makes web addresses easier to remember for services that have web frontends, and because we can carry the domain around if we ever switch to something other than tailscale (there is a DNS server sitting in the tailscale network resolving these names).

Certbot usage / issue
We would like to use certbot for certificates for various machines (web frontends, DB connections, etc). Since these are internal servers with no external access, we use the certbot-dns-multi plugin to do DNS-01 challenges in Exoscale, where our external DNS entries are. This has worked well for servers that don't have tailscale installed.

However, for the ones that do, it results in the following error:

Cleanup of some-server.internal.joinmastodon.org failed: exoscale: zone "internal.joinmastodon.org" not found
exoscale: zone "internal.joinmastodon.org" not found

(the DNS zone in exoscale is joinmastodon.org, not internal.joinmastodon.org)

I suspect this happens because internal.joinmastodon.org is a search domain on the machine making the request, both in resolv.conf and in resolvectl: turning tailscale off makes the request work just fine. Turning it off would be an acceptable solution for servers that don't rely on tailscale for traffic, but for the ones that do, it would mean an interruption in service whenever the cert needs to be renewed.
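
To illustrate the suspicion, here is a minimal sketch of how a DNS client might pick the zone for a record. It assumes the client walks up the labels of the name and takes the first one that answers with an SOA record; I have not confirmed that this is exactly what happens inside the plugin. The two "views" are hypothetical stand-ins for what a public resolver and the tailscale resolver would answer.

```python
from typing import Callable, Optional


def find_zone(fqdn: str, has_soa: Callable[[str], bool]) -> Optional[str]:
    """Return the first suffix of fqdn for which has_soa() reports an SOA record."""
    labels = fqdn.rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if has_soa(candidate):
            return candidate
    return None


# Hypothetical resolver views (assumptions for illustration only):
# names for which each resolver would return an SOA record.
public_view = {"joinmastodon.org"}
tailscale_view = {"internal.joinmastodon.org", "joinmastodon.org"}

name = "_acme-challenge.some-server.internal.joinmastodon.org"
zone_public = find_zone(name, lambda n: n in public_view)
zone_tailscale = find_zone(name, lambda n: n in tailscale_view)

print(zone_public)     # joinmastodon.org
print(zone_tailscale)  # internal.joinmastodon.org
```

If that assumption holds, the second result would explain why the Exoscale API is asked for a zone called internal.joinmastodon.org, which does not exist there.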

The question is, is there a way to work around this? Say, by explicitly telling certbot which DNS zone to use?
Thank you all in advance for your input. :green_heart:


My domain is: internal.joinmastodon.org

I ran this command:

certbot certonly \
    -a dns-multi \
    --dns-multi-credentials=/etc/letsencrypt/dns-multi.ini \
    -d "some-server.internal.joinmastodon.org"

It produced this output:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Simulating a certificate request for some-server.internal.joinmastodon.org
Cleanup of some-server.internal.joinmastodon.org failed: exoscale: zone "internal.joinmastodon.org" not found
exoscale: zone "internal.joinmastodon.org" not found
Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

My web server is (include version):
(Technically not a web server, machine in this case is a DB server):
PostgreSQL 17.2

The operating system my web server runs on is (include version):
Ubuntu 24.04

My hosting provider, if applicable, is:
Hetzner

I can login to a root shell on my machine (yes or no, or I don't know):
yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
no

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
certbot 2.9.0

Are you only getting that error for the cleanup phase, or is there an error when trying to set up the DNS records too? I assume it's both, but I need to clarify.

Assuming the problem is what I think (the server Certbot is on does not resolve the DNS correctly due to some split horizon issues), a workaround that should work is using CNAME delegation - just as if you were running acme-dns. You can refer to the acme-dns docs for more info (joohoi/acme-dns on GitHub).

Basically:

  • you create a CNAME record on your primary DNS pointing _acme-challenge.some-server.internal.joinmastodon.org to a name in some other DNS zone. This could be under another TLD, or a dedicated delegated subdomain under your zone (e.g. acme-auth.joinmastodon.org)

  • you edit your DNS plugin script to manipulate that CNAME target instead of the primary DNS.

This should get around the split horizon issues, because the CA (Let's Encrypt) will just look up the public DNS record and follow the CNAME to the target record, and the target record would be in a zone that doesn't have the split horizon issue.
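
As a rough sketch of the lookup described above: the validation name is _acme-challenge plus the domain, and the TXT record has to live wherever the CNAME chain ends. The target name some-server.acme-auth.joinmastodon.org and the helper names are hypothetical, chosen only for illustration; a real resolver does this for you.

```python
from typing import Dict


def challenge_name(domain: str) -> str:
    """Build the DNS-01 validation record name for a domain."""
    return "_acme-challenge." + domain.rstrip(".")


def resolve_txt_owner(name: str, cnames: Dict[str, str], max_hops: int = 8) -> str:
    """Follow CNAMEs and return the name where the TXT record must live."""
    for _ in range(max_hops):
        if name not in cnames:
            return name
        name = cnames[name]
    raise RuntimeError("CNAME chain too long or looping")


# Hypothetical public DNS data: one CNAME delegating the challenge name.
public_cnames = {
    "_acme-challenge.some-server.internal.joinmastodon.org":
        "some-server.acme-auth.joinmastodon.org",
}

target = resolve_txt_owner(
    challenge_name("some-server.internal.joinmastodon.org"), public_cnames
)
print(target)  # some-server.acme-auth.joinmastodon.org
```

In this picture the DNS plugin only ever needs to write the TXT record at the target name, so the zone it has to find is the delegated one rather than internal.joinmastodon.org.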

This can be done with any DNS server/service. Doing this with acme-dns is the easiest (IMHO), as it automates everything and the script already exists. The standard acme-dns integration generates random UUIDs for the CNAME targets, but you could UPDATE the sqlite3 tables to swap in deterministic domain names. I have a script that does that.

Assuming my understanding of this issue is correct, there aren't any Certbot flags that can handle this; the problem is happening within the lego Go library, which is integrated by the Python package certbot-dns-multi. There may be a solution within lego, but unless it works through environment variables, certbot-dns-multi would likely need to be extended.
