Secondary SERVFAIL DNS error running certbot

Yes, again, this occurs even with all protections disabled.

Where were those protections disabled?
Which IP?

  • 38.143.59.199
  • 38.143.59.203
  • BOTH

Both, or either, or none. Any configuration of these fails.

You understand what you are doing/have done.
I don't clearly understand what you are doing/have done.

Please be MORE clear.

Where is "the protection"?
What is "the protection"?
How was it disabled?

And why only use one single DNS server?

I tested with the firewall and the previously mentioned DDoS protection disabled on all servers of the network, with various combinations of them on and off, and obviously with all of them on.

The firewall is csf, if it matters.

Can you get the route table from the DNS server?
Can you get the route table from the firewall in front of the DNS server?

# ip route show
default via 38.143.59.193 dev eth0 onlink
38.143.59.192/27 dev eth0 proto kernel scope link src 38.143.59.203
# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         38.143.59.193   0.0.0.0         UG        0 0          0 eth0
38.143.59.192   0.0.0.0         255.255.255.224 U         0 0          0 eth0
# traceroute 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
 1  38.143.59.193 (38.143.59.193)  0.295 ms  0.453 ms  0.225 ms
 2  38.143.59.254 (38.143.59.254)  0.425 ms 198-98-100-13.beanfield.net (198.98.100.13)  0.990 ms  0.975 ms
 3  * po313.lsr02.800SquareVictoriaSt01.YUL.beanfield.com (199.167.154.181)  0.641 ms 198-98-100-13.beanfield.net (198.98.100.13)  0.959 ms
 4  * po313.lsr02.800SquareVictoriaSt01.YUL.beanfield.com (199.167.154.181)  0.659 ms  0.717 ms
 5  lo0-1.bdr01.1250ReneLevesqueBl01.YUL.beanfield.com (72.15.48.50)  5.837 ms  5.823 ms  5.810 ms
 6  lo0-1.bdr01.1250ReneLevesqueBl01.YUL.beanfield.com (72.15.48.50)  5.797 ms  5.487 ms 66-207-192-242.beanfield.net (66.207.192.242)  0.478 ms
 7  192.178.86.251 (192.178.86.251)  0.815 ms 142.251.51.179 (142.251.51.179)  0.760 ms 192.178.86.87 (192.178.86.87)  2.412 ms
 8  142.250.58.131 (142.250.58.131)  1.352 ms 172.253.77.117 (172.253.77.117)  1.514 ms 142.250.238.147 (142.250.238.147)  1.506 ms
 9  dns.google (8.8.8.8)  0.708 ms  0.697 ms 142.250.237.11 (142.250.237.11)  1.479 ms

Not much to see there...
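
For what it's worth, the two routing tables above do agree with each other. A minimal Python sketch, using only the standard ipaddress module and the addresses copied from the output above, checks that the /27 corresponds to the 255.255.255.224 netmask and that both the gateway and the DNS server's source address are on-link. The summarize helper is a name made up for this illustration.

```python
import ipaddress

# Values copied from the `ip route show` / `netstat -rn` output above.
subnet = ipaddress.ip_network("38.143.59.192/27")
gateway = ipaddress.ip_address("38.143.59.193")
dns_server = ipaddress.ip_address("38.143.59.203")

def summarize(net, *hosts):
    """Return the netmask, usable host range, and which hosts are on-link."""
    return {
        "netmask": str(net.netmask),
        "first": str(net.network_address + 1),
        "last": str(net.broadcast_address - 1),
        "on_link": {str(h): h in net for h in hosts},
    }

summary = summarize(subnet, gateway, dns_server)
print(summary)
```

This only confirms the local addressing is self-consistent; it says nothing about what filtering happens upstream of the gateway.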

It's all one physical machine in the end; I suppose I could create two VMs, but that seems redundant.

I'm inclined to think it's something with the DNS, since highlandarrow.com, served on the same DNS server, resolves properly. But the zone is fairly elementary in both cases:

; Zone file for highlandarrow.com

; CORE records!
$TTL 14400
highlandarrow.com.	86400	IN	SOA	ns1.highlandarrow.com. notices.highlandarrow.com. 2025103007 3600 120 28800 86400
highlandarrow.com.	86400	IN	NS	ns1.highlandarrow.com.
highlandarrow.com.	86400	IN	NS	ns2.highlandarrow.com.
highlandarrow.com.	342	IN	A	38.143.59.194
ns1			342	IN	A	38.143.59.203
ns2			342	IN	A	38.143.59.203

; SUBDOMAIN stuff!
montreal		342	IN	A	38.143.59.194
www			342	IN	A	38.143.59.203
test			342	IN	A	38.143.59.203
cdn			324	IN	A	38.143.59.203
mail			342	IN	A	38.143.59.204
xmpp			342	IN	A	38.143.59.206
doom			342	IN	A	38.143.59.207
git			342	IN	A	38.143.59.208
chat			342	IN	A	38.143.59.209

; MAIL stuff!
highlandarrow.com.	14400	IN	MX	1 mail.highlandarrow.com.
_autodiscover._tcp	342	IN	SRV	1 1 443 mail.highlandarrow.com.
autodiscover		342	IN	CNAME	mail.highlandarrow.com.
autoconfig		342	IN	CNAME	mail.highlandarrow.com.
_dmarc			14400	IN	TXT	"v=DMARC1; p=reject; rua=mailto:admin@highlandarrow.com; adkim=s;"
highlandarrow.com.	14400	IN	TXT	"v=spf1 ip4:38.143.59.204 +a +mx ~all"
dkim._domainkey		IN	TXT	; key redacted
_25._tcp.mail		IN	TLSA	3 1 1 f89e49a651726f1df576ad4cd558a8d7b90ad2dc1d22bb4b6d362ed368037407

; Zone file for toud.pw

; CORE records!
$TTL 14400
toud.pw.	86400	IN	SOA	ns1.highlandarrow.com. notices.highlandarrow.com. 2026011204 3600 1800 1209600 86400
toud.pw.	86400	IN	NS	ns1.highlandarrow.com.
toud.pw.	86400	IN	NS	ns2.highlandarrow.com.
toud.pw.	342	IN	A	38.143.59.194

; SUBDOMAIN stuff!
www	342	IN	A	38.143.59.204
wiki	342	IN	A	38.143.59.199
ezra	342	IN	A	96.43.131.11
chat	342	IN	A	104.192.171.220
azalin	342	IN	A	38.143.59.195
nwsync	342	IN	A	38.143.59.197
forums	342	IN	A	38.143.59.202
git	342	IN	A	38.143.59.200

; CDN stuff!
vault	342	IN	CNAME	cl-gl1bd79a70.gcdn.co.
cdn	342	IN	CNAME	cl-gl1bd79a70.gcdn.co.

; CERT stuff!
_9962AEAFB699F2354D7071E38DF65A5D.git 342	IN 	CNAME	94A13E2464093192826F02E977EA443C.E47B4635C73BD50E738F88BAD192B177.83d2db4d36e39c0.comodoca.com

; MAIL stuff!
toud.pw.		14400	IN	MX	0 toud.pw.
toud.pw.		14400	IN	TXT	"v=spf1 ip4:38.143.59.204 +a +mx ~all"

Domain keys redacted.

The CNAME is from testing with other providers.

The records look fine (and bind9 has no complaints), but does anything look unusual to anyone?
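
One thing that can be checked mechanically is whether any hostname target is missing its trailing dot. In a zone file, a name without a trailing dot is taken as relative to the zone origin, so it gets the origin appended. Below is a rough Python sketch of such a check, run against three lines copied from the toud.pw zone above. The relative_targets function is a made-up helper, not part of bind9 or any library, and it assumes every record spells out the class as IN, as the zones above do.

```python
# Record types whose final field is a hostname.
HOSTNAME_TYPES = {"CNAME", "NS", "MX", "SRV"}

def relative_targets(zone_text):
    """Return (owner, rtype, target) for hostname targets with no trailing dot.

    Assumes every record spells out the class as IN, like the zones above.
    """
    found = []
    for raw in zone_text.splitlines():
        # Crude comment strip; it also truncates TXT data containing ';',
        # which is harmless here because TXT records are skipped anyway.
        fields = raw.split(";", 1)[0].split()
        if "IN" not in fields:
            continue
        type_index = fields.index("IN") + 1
        if type_index >= len(fields) - 1:
            continue  # no rdata after the type
        rtype, target = fields[type_index].upper(), fields[-1]
        if rtype in HOSTNAME_TYPES and not target.endswith("."):
            found.append((fields[0], rtype, target))
    return found

# Three records copied from the toud.pw zone above.
sample = """
vault 342 IN CNAME cl-gl1bd79a70.gcdn.co.
_9962AEAFB699F2354D7071E38DF65A5D.git 342 IN CNAME 94A13E2464093192826F02E977EA443C.E47B4635C73BD50E738F88BAD192B177.83d2db4d36e39c0.comodoca.com
toud.pw. 14400 IN MX 0 toud.pw.
"""

flagged = relative_targets(sample)
for owner, rtype, target in flagged:
    print(f"{owner} {rtype} target has no trailing dot: {target}")
```

On the sample, only the validation CNAME is flagged. That may well be unrelated to the SERVFAIL, but it is the sort of thing bind9 loads without complaint.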

Redundancy is necessary [for DNS] - LOL

I'd start by routing using a secondary IP [to the same DNS server].
If that doesn't fix the issue, then move the secondary IP to its own VM.
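
To make the redundancy point concrete: in the highlandarrow.com zone as posted, ns1 and ns2 both carry the same A record. A small Python sketch of that check follows. The shared_addresses function is a hypothetical helper, and the addresses are the ones from the zone file above; the glue held at the registry may differ and is not checked here.

```python
from collections import defaultdict

# A records copied from the highlandarrow.com zone above.
nameservers = {
    "ns1.highlandarrow.com.": "38.143.59.203",
    "ns2.highlandarrow.com.": "38.143.59.203",
}

def shared_addresses(ns_to_ip):
    """Map each IP to the nameserver names using it, keeping only shared IPs."""
    by_ip = defaultdict(list)
    for name, ip in sorted(ns_to_ip.items()):
        by_ip[ip].append(name)
    return {ip: names for ip, names in by_ip.items() if len(names) > 1}

shared = shared_addresses(nameservers)
print(shared)
```

A non-empty result means two NS names resolve to one address, so a resolver that fails against that address has nowhere else to try.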

You might also sort out your EDNS compliance. There are a number of reported problems. This is perhaps a long shot, since unboundtest.com resolves okay, but it could still be a factor.

That's already happening, which you can see in the records above.

The subdomain is incorporated as part of the main record and recursive queries are disabled to prevent attacks, so this is expected behaviour.

The apex domain passes just fine in both cases:

From the Zonemaster test result:

I suggest resolving these Warnings and Errors.

None of the suggestions really address how it would resolve perfectly fine for one zone on the same nameserver, but not another.

I hear your frustration @Maiyannahl; the community forum's members often go above and beyond assisting in getting a Let's Encrypt certificate issued and deployed. And DNS is often a component that comes into play. I suggest looking to other forums as well to assist in configuring your DNS to be solid and stable; we will still be here.

I don't mean to be unpleasant to anyone here; it's just vexing (though I will confess some mild frustration at having to repeat myself at one point).

Why would, all other things being equal, one domain served up by the same DNS server and verified to be correct work properly for issuance of the domain, but another, also verified, also on the same DNS server, return SERVFAIL?

That is, ultimately, the question I'm trying to figure out.
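
One way to narrow this down is to ask the authoritative server directly for both names and compare the response codes, bypassing any recursive resolver. A SERVFAIL reported by a CA typically comes from its recursive resolver failing to get a usable answer, so the authoritative side can look fine in isolation. The sketch below uses only the Python standard library. The function names are made up for this example, and the server IP and names passed in are whatever you want to test.

```python
import socket
import struct

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}

def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal non-recursive DNS query (qtype 1 = A, 28 = AAAA)."""
    # Flags 0x0000: standard query with the RD (recursion desired) bit clear.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN

def rcode_of(response):
    """Return the RCODE name taken from the low 4 bits of the header flags."""
    flags = struct.unpack("!H", response[2:4])[0]
    return RCODES.get(flags & 0x000F, str(flags & 0x000F))

def ask(server, name, qtype=1, timeout=3.0):
    """Send one UDP query to `server` and return the RCODE name or 'TIMEOUT'."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, 53))
        try:
            response, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "TIMEOUT"
    return rcode_of(response)

def compare(server, names, qtypes=(1, 28)):
    """Return {(name, qtype): rcode} for every combination."""
    return {(n, q): ask(server, n, q) for n in names for q in qtypes}
```

For example, compare("38.143.59.203", ["cdn.highlandarrow.com", "toud.pw"]) queries A and AAAA for both names; substitute the exact name that fails. If both zones answer identically from inside and outside your network, the difference is more likely upstream, for example in the delegation or DNSSEC chain for the TLD.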

Renewing an existing certificate for cdn.highlandarrow.com and 2 more domains

Successfully received certificate.
Certificate is saved at: /etc/letsencrypt/live/cdn.highlandarrow.com/fullchain.pem
Key is saved at:         /etc/letsencrypt/live/cdn.highlandarrow.com/privkey.pem
This certificate expires on 2026-04-12.
These files will be updated when the certificate renews.
Certbot has set up a scheduled task to automatically renew this certificate in the background.

Deploying certificate
Successfully deployed certificate for cdn.highlandarrow.com to /etc/nginx/sites-enabled/cdn.highlandarrow.com
Successfully deployed certificate for test.highlandarrow.com to /etc/nginx/sites-enabled/test.highlandarrow.com
Successfully deployed certificate for www.highlandarrow.com to /etc/nginx/sites-enabled/www.highlandarrow.com
Your existing certificate has been successfully renewed, and the new certificate has been installed.

It does explicitly say that it fails to find an A record for the domain (and AAAA, but the server isn't on IPv6), but I'm wondering if that's caused by some error with the target web server (a GitLab instance) instead? It's the only difference between the two: the successful highlandarrow.com renewal was against a machine with a very bare nginx installation, whereas the failing toud.pw domain is being provisioned through GitLab's ACME implementation.

Actually, there is one thing I see that is not equal: their Top-Level Domain (TLD).
With DNSSEC, the two TLDs can have different issues due to the signing of the DNS RRs.
