Detail: DNS problem: SERVFAIL looking up A for thedomain.xyz

Hi, I have done nslookup, dig, and checked whatsmydns.net. Across the board it shows as being attached to my server’s IP. I run a script that works flawlessly for other domains, but fails out for this one.

Can’t figure it out. Can anyone help me to find more information, please. Which logs can I check to get this resolved? If I try to resolve the domain in my browser, it jumps to https. My script first sets up a nginx config file for port 80, then issues a cert, then adjusts the nginx config to listen on port 443, and port 80 only to redirect to https.

This isn’t making sense to me! The configuration file has no port 443 in it, and yet I’m being redirected. If I look in /etc/letsencrypt/live, there’s no folder for this specific domain. I see no reason why a bleed over would be happening from another nginx vhost config file when all others are specific to their respective domain.

Any ideas? Does this issue sound related, or are they separate?

I use this command:

/opt/letsencrypt/letsencrypt-auto certonly -a webroot --webroot-path=/var/www/users/thedomain/public -d thedomain.xyz

If I understand correctly, there are two things going on:

  • Let’s Encrypt reports that your DNS server responds with a SERVFAIL for your domain:
    This could be due to mixed-case queries that Let’s Encrypt uses to make spoofing attacks harder. Some DNS servers have trouble with that. Try the following dig command, while replacing example.com with a mixed-case version of your domain:
    dig ExAmPlE.CoM A
  • The second problem is that your domain always switches to https:// in your browser. This sounds like HSTS in action. Either your domain is on the HSTS preload list, which is a list of domains included by some browsers that will always be force-switched to https://, or you have sent a HSTS header at some point and your browser is remembering that setting. This can be verified by trying to connect with a clean browser that has never visited your site before (simply clearing the browser cache won’t work!).
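
In case you want to script the mixed-case check, here is a rough Python sketch. The names `randomize_case` and `dig_args` are made up for illustration, and this is not how Let's Encrypt implements it. It only builds test queries like the dig command in the first point.

```python
import random
from typing import List, Optional


def randomize_case(name: str, seed: Optional[int] = None) -> str:
    """Return name with each letter randomly upper- or lower-cased.

    Intended for plain ASCII hostnames; pass a seed for repeatable output.
    """
    rng = random.Random(seed)
    return "".join(
        ch.upper() if rng.random() < 0.5 else ch.lower() for ch in name
    )


def dig_args(name: str, seed: Optional[int] = None) -> List[str]:
    """Build the argument list for a mixed-case A lookup with dig."""
    return ["dig", randomize_case(name, seed), "A"]
```

The list from `dig_args("thedomain.xyz")` could be handed to `subprocess.run` on a machine that has dig installed. A correct DNS server should answer regardless of the case used.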

Hope that helps!

So the dig command against my domain with mixed case returns my IP address, so that’s not the issue. I just ran the script again and it shows the same error.

The second problem is an issue with HSTS, at least that’s what is reported in my browser. I guess it is on this HSTS preload list. How do I resolve this?

If I can get past issue two by resolving issue one and having a cert, then that’s cool, but so far nothing is working.

Confirmed, it is on this list. Do certificates from letsencrypt automatically add you to this list, or how does that work? I’m curious. Is there any way this issue could be related to having a prior certificate from letsencrypt elsewhere? This individual mentioned having one issued for the domain before pointing it our way.

I’m guessing a solution for issue #2 in the future would be:

server {
    listen 443 ssl default_server;
    server_name _;
    ssl_certificate dummy.name.cert;
    ssl_certificate_key dummy.name.key;  # placeholder name; nginx needs the key as well
    return 403;
}

In a default nginx vhost with a snakeoil cert for it. That way, when someone is attempting (or forced) to hit a domain attached to my server via HTTPS when a cert has yet to be issued, they will be shown a 403 instead of someone else’s site.

To test it you need to do

dig ExaMpLe.com @your-primary-nameserver.com

(for your domain, and using your primary nameserver). The question section that is returned should keep the mixed case you sent:

;; QUESTION SECTION:
;ExaMpLe.com. IN A

and make sure it is not

;; QUESTION SECTION:
;example.com. IN A
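
If you would rather not eyeball the output, the comparison above can be sketched in Python. This is only an illustration: `question_name` and `case_preserved` are hypothetical helper names, and the parsing assumes dig's usual ";; QUESTION SECTION:" layout as shown above.

```python
from typing import Optional


def question_name(dig_output: str) -> Optional[str]:
    """Return the name echoed in dig's QUESTION SECTION, or None if absent."""
    lines = dig_output.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == ";; QUESTION SECTION:":
            for nxt in lines[i + 1:]:
                nxt = nxt.strip()
                if not nxt:
                    continue
                if nxt.startswith(";"):
                    parts = nxt[1:].split()
                    if parts:
                        return parts[0]
                break
    return None


def case_preserved(queried: str, dig_output: str) -> bool:
    """True if the server echoed the name with exactly the case we sent."""
    name = question_name(dig_output)
    if name is None:
        return False
    # Ignore the trailing root dot, but compare case-sensitively.
    return name.rstrip(".") == queried.rstrip(".")
```

Feed it the text that dig prints together with the exact mixed-case name you queried. A `False` result means the nameserver changed the case or the question section was missing.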

`root@beta:/etc/nginx/sites-enabled# dig MyDomAIn.xYz @dns1.registrar-servers.com
; <<>> DiG 9.10.3-P4-Ubuntu <<>> MyDomAIn.xYz @dns1.registrar-servers.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21843
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 5, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;MyDomAIn.xYz. IN A

;; ANSWER SECTION:
MyDomAIn.xYz. 1799 IN A 76.213.77.43

;; AUTHORITY SECTION:
MyDomAIn.xYz. 1800 IN NS dns2.registrar-servers.com.
MyDomAIn.xYz. 1800 IN NS dns3.registrar-servers.com.
MyDomAIn.xYz. 1800 IN NS dns1.registrar-servers.com.
MyDomAIn.xYz. 1800 IN NS dns4.registrar-servers.com.
MyDomAIn.xYz. 1800 IN NS dns5.registrar-servers.com.

;; Query time: 48 msec
;; SERVER: 216.87.155.33#53(216.87.155.33)
;; WHEN: Mon May 09 10:57:28 CDT 2016
;; MSG SIZE rcvd: 173
`

OK, that looks good on the DNS side ....

If you set the HSTS time in your config to "0", and restart nginx, then visit the site in your browser, this should reset your browser so it doesn't force https for the future on your domain. Of course, you need to set up the https config for that.

No, it doesn't get automatically added to the list. If it was set in the headers on your site though, and you visited it, then your browser will remember that - hence needing to send again with a time of zero.

No, being added to the HSTS preload list is a manual process. The server needs to be using a valid cert and sending a HSTS header with the preload token, then you need to request inclusion and wait for it to be reviewed. The only way for it to be on there is if you went through the process or a previous owner of the domain did.
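
To make the header part concrete, here is a small Python sketch that reads a Strict-Transport-Security value and reports what it asks the browser to do. The function names are hypothetical and it assumes a well-formed header (a non-numeric max-age would raise an error), so treat it as an illustration only.

```python
from typing import Dict, Optional, Union


def parse_hsts(value: str) -> Dict[str, Union[Optional[int], bool]]:
    """Parse a Strict-Transport-Security header value into its directives."""
    policy: Dict[str, Union[Optional[int], bool]] = {
        "max_age": None,
        "include_subdomains": False,
        "preload": False,
    }
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, val = part.partition("=")
        key = key.strip().lower()  # directive names are case-insensitive
        if key == "max-age":
            policy["max_age"] = int(val.strip().strip('"'))
        elif key == "includesubdomains":
            policy["include_subdomains"] = True
        elif key == "preload":
            policy["preload"] = True
    return policy


def clears_policy(value: str) -> bool:
    """True if the header tells the browser to forget HSTS (max-age of 0)."""
    return parse_hsts(value)["max_age"] == 0
```

A value such as `max-age=0` clears what the browser remembered from an earlier header. It does not remove a domain from the preload list, which is handled separately.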

As for your "solution", they won't actually get the 403 since the invalid cert error cannot be overridden for HSTS domains.

Great, so no solutions for either problem then? Everything looks fine and yet there’s nowhere to go?

If you set the HSTS time in your config to "0", and restart nginx, then visit the site in your browser, this should reset your browser so it doesn't force https for the future on your domain. Of course, you need to set up the https config for that.

There’s no HSTS time for this vhost. It’s not even listening on port 443. Before it gets issued a certificate it listens only on port 80. After the cert is issued the vhost config is rewritten to include the HSTS.

That’s why I think it’s a bleed-over. It keeps looking for a default certificate to hook into. I don’t know why this is happening. That’s why I think my solution will work.

If you're on the HSTS preload list, that means someone has added your domain through this form and that your server set the HSTS header (Strict-Transport-Security) with the preload flag set at some point (i.e., when the HSTS preload list check was performed by Google). You should be able to search for your domain on the site I linked in my previous post and verify whether this is the case. This is not something that Let's Encrypt will do automatically or anything like that.

It's worth pointing out that the HSTS issue is not related to your issues with Let's Encrypt. Let's Encrypt doesn't use the HSTS preload list and doesn't check or care about HSTS headers for verification purposes. Your issue seems to be with DNS.

I can't think of anything else that you could check with regard to DNS. With those checks all coming back without any issues, this looks like a problem I haven't seen before. Hopefully someone else will think of something.

Okay, so basically it had a certificate elsewhere, and that server must have sent strict headers with a time that is still valid today. Now that the domain has been transferred to my server and I have not overridden that time, it’s still in effect.

That makes sense to me.

Yeah, I’ve issued close to 50 certificates in the last couple of weeks for this new service, and this is the only one so far that’s having issues. A 2% fail rate is acceptable while we’re beginning beta trials, but it needs to be solved in, let’s say, the next 10 days. I’ll keep asking around elsewhere and hope someone has some information that can help out.

Thank you for your efforts.

Correct. You can overwrite that time, though, by providing a new header with a new time (or 0 to effectively remove it).

I’ll go ahead and do this, and run the script again. If issue 2 solves issue 1 I’ll be a happy camper. If not, then I’ll still feel a touch better knowing I’m further along today than yesterday.

Thanks for confirming.

Something else that might be problematic and could lead to this error would be a broken DNSSEC configuration.

This tool has a DNSSEC check included - if you’re seeing any errors, that’s probably the reason for this issue.

Thanks for that!

Inconsistent security for MyDoMAin.xYz - DS found at parent, but no DNSKEY found at child. The parent has a secure delegation to the child (indicated by DS RRset at the parent), but the child has no DNSKEY records. This is probably due to a previously signed zone that became unsigned without requesting the parent to remove the secure delegation

So I’m trying to wrap my head around this. Where would the customer need to go to remove the current DNSSEC setup or, if that’s not what needs to be done, to have it reissued properly? I realize this is outside the scope of letsencrypt. If you tell me to go elsewhere for the information, I’ll understand.

To enable DNSSEC, your domain name registrar adds a DS (Delegation Signer) record. Removing that will disable DNSSEC. If you do want to use DNSSEC, you’ll need to sign your zone on your NS servers (the instructions for that would be specific to your DNS server or provider - if it’s supported, there’s probably documentation available).
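
As a rough way to reason about the two pieces involved (the DS record at the parent and the DNSKEY records in your own zone), here is a small Python sketch. `dnssec_state` is a made-up helper, and it only looks at whether records exist; it does not validate any signatures.

```python
from typing import Sequence


def dnssec_state(ds_records: Sequence[str], dnskey_records: Sequence[str]) -> str:
    """Classify a delegation from the presence of DS and DNSKEY records."""
    has_ds = bool(ds_records)
    has_dnskey = bool(dnskey_records)
    if has_ds and has_dnskey:
        # Both sides exist; the signatures still need real validation.
        return "signed"
    if has_ds:
        # The case from the error above: the parent expects a signed zone.
        return "broken: DS at parent, no DNSKEY at child"
    if has_dnskey:
        return "keys published, but no DS at parent"
    return "unsigned"
```

The two lists could come from something like `dig +short DS thedomain.xyz` and `dig +short DNSKEY thedomain.xyz`, split into lines. Removing the DS record at the registrar moves the domain from the "broken" case to "unsigned".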

I don’t think he’s operating a mail-server, so disabling the record will probably be the easiest way to resolve this issue.
