DNS-01 challenge not working for wildcard cert


#1

I configured BIND (named.conf) per the instructions at https://certbot-dns-rfc2136.readthedocs.io/en/latest/

I am running CentOS 7 on a 4-core VPS with full DNS control, and the SOA of the domain is my VPS.

I am having trouble creating the required TXT record

I just would like to copy a zone file and edit it

EDIT / SOLUTION

This issue may never have a solution under the certbot-dns-rfc2136 authenticator, because our DNS is a non-authoritative server to Google’s public servers, we have no “Google account” and do not intend to get one, and we do not want to run our DNS in authoritative mode due to the load it could experience. Certbot's manual mode with --manual-auth-hook and our own authenticator and cleanup scripts, built on a custom RESTful API or nsupdate, will have to be used.
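For anyone going the --manual-auth-hook route with nsupdate, here is a minimal sketch of what the authenticator side might look like. It only builds the nsupdate commands from the CERTBOT_DOMAIN and CERTBOT_VALIDATION environment variables that certbot passes to hooks. The server address, the TTL of 60, and the function name are assumptions for illustration, not anything from the plugin or this setup.

```python
import os


def build_nsupdate_script(domain, validation, server="127.0.0.1", ttl=60):
    """Return nsupdate commands that ADD one TXT value at the challenge name.

    server and ttl are hypothetical defaults; adjust them for your setup.
    """
    # Defensive: strip a wildcard label so "*.example.com" and "example.com"
    # both map to _acme-challenge.example.com.
    base = domain[2:] if domain.startswith("*.") else domain
    name = "_acme-challenge." + base.rstrip(".") + "."
    lines = [
        f"server {server}",
        # "update add" leaves existing TXT values in place, so the wildcard
        # and base-name challenges can sit at the same name together.
        f'update add {name} {ttl} IN TXT "{validation}"',
        "send",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # certbot sets these two variables when it runs a manual auth hook.
    domain = os.environ.get("CERTBOT_DOMAIN")
    validation = os.environ.get("CERTBOT_VALIDATION")
    if domain and validation:
        print(build_nsupdate_script(domain, validation), end="")
```

The printed text could be piped into nsupdate with a TSIG key file via its -k option. A cleanup hook would do the same with "update delete" in place of "update add".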


#2

What trouble exactly? Do you have trouble following and/or understanding the guide you linked to? (Perhaps it needs to be more clear for example.) Or do you get some sort of error you don’t understand and/or manage to fix?

What exactly have you tried from the guide you’ve linked? Everything? How? What did you implement exactly on your server? Or did you get lost somewhere through the guide? Why? Do you get errors? Is something unclear?

As you can see, many, MANY questions remain. Unfortunately, my crystal ball fell on the floor again yesterday and is broken again, so I need you to give us more information.


#3

Thank you for replying, but it worked - via the manual method

I honestly do not know what I did differently other than go slowly. I put an entry like the one below manually in the xxxxxxxx.com zone file and restarted BIND after putting in the “xxxxxxxxxx” value that certbot sent. It had failed so many times that, after 8 hours of this, this was my last attempt before getting ready to quit, and it worked.

It should be noted I got some strange reports from the command

host -t txt _acme-challenge.xxxxxxxx.com

At first it was saying _acme-challenge.xxxxxxxx.com was an alias of xxxxxxxx.com; now it is reporting the correct TXT entries I put in manually. This MUST have been a DNS propagation issue.

_acme-challenge.xxxxxxxx.com. 86400 IN TXT "xxxx" as an entry in the zone file worked - manually.

Here the "xxxx" is the TXT value certbot asks you to publish. After restarting BIND, certbot asks for a second, different value, which I put in before restarting BIND again; that WAS failing with the log entries further below (dates and times removed).

When I waited quite a while between entries and restarting BIND - while scratching my head . . . and checking everything I could to make sure the entries were actually showing via the host command above . . . it worked . . . otherwise I was getting errors like this below

968:INFO:certbot.auth_handler:Performing the following challenges:
968:INFO:certbot.auth_handler:dns-01 challenge for xxxxxxxx.com
968:INFO:certbot.auth_handler:dns-01 challenge for xxxxxxxx.com
888:DEBUG:certbot.error_handler:Encountered exception:
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/certbot/auth_handler.py", line 75, in handle_authorizations
resp = self._solve_challenges(aauthzrs)

That is done by running certbot manually with

certbot certonly --manual -i apache -d "*.xxxxxxxx.com" -d xxxxxxxx.com --agree-tos --no-bootstrap --manual-public-ip-logging-ok --preferred-challenges dns-01 --server https://acme-v02.api.letsencrypt.org/directory

and it resulted in this in the terminal window

Domain: xxxxxxxx.com
Type: unauthorized
Detail: Incorrect TXT record
“st1N5t_i5LMJ1atBOk7IFpFf1UQ9KTLFSwmJ8cpwMJw” found at
_acme-challenge.xxxxxxxx.com

Domain: xxxxxxxx.com
Type: unauthorized
Detail: Incorrect TXT record
“st1N5t_i5LMJ1atBOk7IFpFf1UQ9KTLFSwmJ8cpwMJw” found at
_acme-challenge.xxxxxxxx.com
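For context on why certbot hands out two different values: under RFC 8555 each dns-01 TXT value is the unpadded base64url SHA-256 digest of "token.thumbprint", so the wildcard and the base name each get their own value, and both have to be visible at _acme-challenge at the same time. A minimal sketch of that derivation, plus a helper that lists which expected values are still missing from what the DNS returns. The helper name and the sample inputs are made up for illustration.

```python
import base64
import hashlib


def dns01_txt_value(token, account_thumbprint):
    """Compute the dns-01 TXT value from a challenge token and key thumbprint."""
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # base64url without the trailing "=" padding
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def missing_values(expected, published):
    """Return the expected TXT values that are not among the published ones."""
    return sorted(set(expected) - set(published))


if __name__ == "__main__":
    # Placeholder inputs, not real challenge data.
    value = dns01_txt_value("example-token", "example-thumbprint")
    print(len(value), value)
    print(missing_values([value, "second-value"], [value, "stale-old-value"]))
```

If only stale values are published, every expected value shows up as missing, which would match an "Incorrect TXT record ... found" response from the server.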

IF I RAN the dns-rfc2136 method - which I believe is supposed to be able to directly write the challenges to the zone file . . .

certbot certonly --dns-rfc2136 --dns-rfc2136-credentials /credentials.ini -i apache -d "*.xxxxxxxx.com" -d xxxxxxxx.com --server https://acme-v02.api.letsencrypt.org/directory

I got a SERVFAIL and this in the log

377:DEBUG:certbot_dns_rfc2136.dns_rfc2136:No authoritative SOA record found for _acme-challenge.xxxxxxxx.com
380:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Received authoritative SOA response for xxxxxxxx.com

I was getting an authoritative response from xxxxxxxx.com, but it appears to be saying it was looking for the same thing from _acme-challenge.xxxxxxxx.com

But I could not restart BIND with the _acme-challenge.xxxxxxxx.com zone enabled in named.conf, and this is the zone file below

_acme-challenge.xxxxxxxx.com. 86400 IN SOA ns1.yyyyyyy.com. no-reply.main.yyyyyyyy.com. (
2016122000 ;Serial Number
3600 ;refresh
7200 ;retry
2419200 ;expire
86400 ;minimum
)
_acme-challenge.xxxxxxxx.com. 86400 IN TXT "r-2_LNtcFxp59czkubzPWvSnIklxC7kbRsLaiMEXcyo"
_acme-challenge.xxxxxxxx.com. 86400 IN TXT "shxSAJw1EsyGXSZljTq-30-guYP3lpDs0JsRmQNj--E"
_acme-challenge.xxxxxxxx.com. 86400 IN TXT "jpNLi1Vtw5QWL4KKv-6xRu7FG8cErsdy4TF6QY2Q2KE"
_acme-challenge.xxxxxxxx.com. 86400 IN NS ns1.yyyyyyy.com.
_acme-challenge.xxxxxxxx.com. 86400 IN NS ns2.yyyyyyy.com.
_acme-challenge.xxxxxxxx.com. 14400 IN A 104.251.217.147
localhost 14400 IN A 127.0.0.1

All I was getting in the named log was this

391 general: error: zone xxxxxxxx.com/IN/internal: loading from master file /var/named/xxxxxxxx.com.db failed: permission denied . . .

That happened when I tried to restart BIND with _acme-challenge.xxxxxxxx.com in the named.conf file as a zone. I would not guess this is what the certbot challenge intends: putting a fake domain zone in named.

My understanding is if using the dns-rfc2136 plugin - certbot is supposed to be able to write the challenge, but I have to change

zone "xxxxxxxx.com." IN {
to
zone "xxxxxxxx.com." {

from the instructions or BIND will not restart

It seems to still work and gets tokens etc . . .

A page at https://www.netgate.com/docs/pfsense/dns/rfc2136-dynamic-dns.html states . . .

“Then [zone command in named.conf] creates the initial zone file. Be aware that BIND will rewrite this zone file, which is why a subdomain is used in the example. BIND will also need read/write access to this file and the directory in which it resides so that it may rewrite the zone and its journal.”

Which concerned me as the zone file is full of subdomain entries already and I would not want them deleted.

The JNL files with the current date ARE being written to /var/named/dynamic, and the command above DID NOT have an “IN” keyword either, as I had to remove it to get BIND to restart.

Since BIND will not restart, it does not create the initial zone file, and I tried it with a variant of the zone file name not already in the /var/named directory.

The last thing is my zone files are of the format xxxxxxxx.com.db

I will still have to face getting the dns-rfc2136 plugin working so the cron jobs can keep this wildcard cert current - as I have the other domains working

The _acme-challenge.xxxxxxxx.com zone just “DISAPPEARED” by itself from the zone views, and after I got the cert to come down - I disabled the zone file for it.


#4

You might not be aware that if a given zone is using dynamic updates, then you cannot edit it manually anymore (that’s what the rewrite part is all about). That’s why I never went the nsupdate way: too complicated.

I am using dns-01 myself as well but I generate the zone file myself from a set of files with a specific LE one that gets modified for the acme challenge.

dns-01 as an auth method for LE is not for the faint of heart, you might want to use another one :slight_smile:


#5

Only dns challenges can be used for wildcard certificates at Let’s Encrypt.


#6

Makes sense as you do not have a unique CN to check.


#7

@unioncos Is there a particular reason you’re trying to host your own DNS instead of just using one of the many DNS providers available? Running and keeping secure a raw BIND server is a lot of extra work just to get your website up and running and can jeopardize the security of your website if you do it wrong. There are plenty of DNS providers that you could use for free that also have plugins for use with certbot.

Cloudflare is probably the most popular. DigitalOcean is free if you’re only using their DNS service. LuaDNS is also free for your first 3 domains.

Taking DNS off your VPS is not only making it more secure, it costs you less because you’re not paying for the resources that BIND instance consumes. It also makes it way easier to change VPS providers later because you can just bring up a copy of your site at the new place, and repoint a couple A records in DNS rather than updating NS records and potentially glue records in DNS with your registrar.


#8

I have multiple domains with subdomains and need the ability to create and control them, as well as domains all using a central Drupal 7 codebase. No registrar's DNS will work, and I run into the headache of dealing with registrars who know nothing about Drupal and multisite.


#9

I’m not talking about Registrar DNS. In fact, I’d specifically avoid that since their APIs, if available, tend to be poor. I’m talking about dedicated DNS hosting from a provider with a REST API. Create/Update/Delete records with curl commands (or PHP or whatever). It might not be free depending on the number of domains and/or traffic volume. But it’s a lot easier to deal with than raw BIND zone files especially if you don’t have a lot of BIND experience.


#10

Here is my issue now . . . when it comes time to renew the manual method I used is not going to work with my cron command as Osiris pointed out here => How to renew wildcard cert with cert-bot auto and the link therein to https://certbot.eff.org/docs/using.html#pre-and-post-validation-hooks

I would like to use the dns-rfc2136 method - which I believe is supposed to be able to directly write the challenges to the zone file . . .

certbot certonly --dns-rfc2136 --dns-rfc2136-credentials /credentials.ini -i apache -d "*.xxxxxxxx.com" -d xxxxxxxx.com --server https://acme-v02.api.letsencrypt.org/directory

Certainly certbot CANNOT renew a cert created with the --manual flag so I have to fix the dns-rfc2136 plugin or write scripts for the pre-and-post-validation-hooks

According to Keltounet

" You might not [be] aware that if a given zone is using dynamic updates, then you can not edit it manually anymore "

. . . and if that is true then there is a problem, because I cannot have that

If I run . . .

[root@main ~]# certbot certonly --dry-run --dns-rfc2136 --dns-rfc2136-propagation-seconds 120 --dns-rfc2136-credentials /credentials.ini -i apache -d "*.xxxxxxxx.com" -d xxxxxxxx.com --server https://acme-v02.api.letsencrypt.org/directory

the server returns this in the terminal window

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator dns-rfc2136, Installer apache
Starting new HTTPS connection (1): acme-staging-v02.api.letsencrypt.org
Cert not due for renewal, but simulating renewal for dry run
Renewing an existing certificate
Performing the following challenges:
dns-01 challenge for xxxxxxxx.com
dns-01 challenge for xxxxxxxx.com
Cleaning up challenges
Received response from server: SERVFAIL
[root@main ~]#

so the log has this STILL as the offending entries

460:DEBUG:certbot_dns_rfc2136.dns_rfc2136:No authoritative SOA record found for _acme-challenge.xxxxxxxx.com
466:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Received authoritative SOA response for xxxxxxxx.com

This is something in my setup . . . so does " _acme-challenge.xxxxxxxx.com " actually need its own SOA zone ??
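On the SOA question: the two DEBUG lines look like the plugin walking up the name one label at a time until some zone answers authoritatively, so "No authoritative SOA record found for _acme-challenge..." followed by "Received authoritative SOA response for xxxxxxxx.com" would be the normal path rather than an error. Here is a small sketch of that kind of walk. It is an illustration of the idea, not the plugin's actual code, and the is_authoritative callback stands in for a real SOA query.

```python
def candidate_zones(record_name):
    """List possible zone names, most specific first."""
    labels = record_name.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def find_zone(record_name, is_authoritative):
    """Return the first candidate for which is_authoritative(name) is true."""
    for name in candidate_zones(record_name):
        if is_authoritative(name):
            return name
        # A miss here corresponds to a "No authoritative SOA record found" line.
    return None


if __name__ == "__main__":
    # Pretend only example.com is a zone on the server.
    zones = {"example.com"}
    print(candidate_zones("_acme-challenge.example.com"))
    print(find_zone("_acme-challenge.example.com", lambda n: n in zones))
```

Under that reading, the TXT record gets added to the parent zone, and a separate _acme-challenge zone with its own SOA should not be needed.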


#11

What does your bind say in its logs? Also, you’ll need to make sure you increase the verbosity of the logging for proper debugging.


#12

hmmm . . . getting closer /var/log/named.log has these two entries for that run today (date and times removed)

931 general: warning: managed-keys-zone/internal: No DNSKEY RRSIGs found for ‘.’: success
476 general: error: /var/named/xxxxxxxxx.com.db.jnl: create: permission denied

so I checked the /etc/named.conf file options section . . .
but maybe I am looking in the wrong place . . .

options {
    listen-on-v6 port 53 {
        any;
    };
    // Hide bind version
    version "unknown";
    directory "/var/named";
    dump-file "/var/named/data/cache_dump.db";
    statistics-file "/var/named/data/named_stats.txt";
    /* memstatistics-file "data/named_mem_stats.txt"; */
    allow-transfer { "none"; };
    allow-query { any; };
    recursion no;
    dnssec-enable yes;
    dnssec-validation yes;
    dnssec-lookaside auto;
    /* Path to ISC DLV key */
    bindkeys-file "/etc/named.iscdlv.key";
    managed-keys-directory "/var/named/dynamic";
    pid-file "/run/named/named.pid";
    session-keyfile "/run/named/session.key";
};

Is this a named permissions issue ??


#13

It could be the dynamic update fails b/c of the permission error.

Check under which user bind runs and check if it can create files under /var/named/. If not, change the owner, group and/or permissions of the directory so bind can write to it.
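A rough way to script that check, as a sketch only: it looks at the classic owner/group/other mode bits and ignores ACLs and SELinux, both of which can still block BIND on CentOS. The user name "named" and the function names are assumptions for illustration.

```python
import os
import stat


def dir_allows_create(mode, owner_uid, owner_gid, uid, gids):
    """True if the mode bits let this user create files in a directory.

    Creating a file needs both write and search (execute) permission.
    """
    if uid == 0:
        return True  # root bypasses the mode bits
    if uid == owner_uid:
        need = stat.S_IWUSR | stat.S_IXUSR
    elif owner_gid in gids:
        need = stat.S_IWGRP | stat.S_IXGRP
    else:
        need = stat.S_IWOTH | stat.S_IXOTH
    return (mode & need) == need


def user_can_create_in(path, username="named"):
    """Apply the mode-bit check to a real directory and a real account."""
    import pwd  # Unix only

    user = pwd.getpwnam(username)
    gids = set(os.getgrouplist(username, user.pw_gid))
    st = os.stat(path)
    return dir_allows_create(
        stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid, user.pw_uid, gids
    )
```

For example, user_can_create_in("/var/named") returning False would be consistent with a "create: permission denied" error for the .jnl file.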


#14

bind runs as user named

adding a group permission to the /var/named directory let it run, and it no longer returns a SERVFAIL

BUT currently it continues to find the OLD removed TXT records and fails

In the log I now find this . . . (dates and times removed and xxxxxxxx.com is a replacement here for the actual domain name)

350:INFO:certbot.auth_handler:Performing the following challenges:
350:INFO:certbot.auth_handler:dns-01 challenge for xxxxxxxx.com
351:INFO:certbot.auth_handler:dns-01 challenge for xxxxxxxx.com
362:DEBUG:certbot_dns_rfc2136.dns_rfc2136:No authoritative SOA record found for _acme-challenge.xxxxxxxx.com
365:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Received authoritative SOA response for xxxxxxxx.com
369:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Successfully added TXT record
375:DEBUG:certbot_dns_rfc2136.dns_rfc2136:No authoritative SOA record found for _acme-challenge.xxxxxxxx.com
377:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Received authoritative SOA response for xxxxxxxx.com
381:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Successfully added TXT record
383:INFO:certbot.plugins.dns_common:Waiting 120 seconds for DNS changes to propagate
483:INFO:certbot.auth_handler:Waiting for verification…
484:DEBUG:acme.client:JWS payload:
. . . more

and then certbot reports this to the terminal window . . .

  • The following errors were reported by the server:

    Domain: xxxxxxxx.com
    Type: unauthorized
    Detail: Incorrect TXT record
    “y3N3RuF_iYOyL5yRn0v-rvCxCeooyZ42KxfwaBK39aw” (and 1 more) found at
    _acme-challenge.xxxxxxxx.com

    Domain: xxxxxxxx.com
    Type: unauthorized
    Detail: Incorrect TXT record
    “kH14fF8CAO6sZHj_l8u7u5ZVaeVKgDf3XXN7ptGky-o” (and 1 more) found at
    _acme-challenge.xxxxxxxx.com

which are the old TXT values I used to manually get the certs

It also puts a jnl file in /var/named for the domain

I still have the question about the SOA statement for " _acme-challenge.xxxxxxxx.com "

and

is this a TTL issue of some kind maybe ??


#15

The first thing I think about when I hear of issues with DNS TXT records is EDNS0, which allows UDP DNS packets larger than 512 bytes, with TCP as the fallback when a response still does not fit. Large packets require adequate MTU settings and/or appropriate do-not-fragment settings.

Make sure your network is modern to support this capability.

That is all I can really add here. Good luck!


#16

This is a journal file issue

After running "rndc -V sync -clean", which also deletes the journal files, the domain zone file I was working with was changed frighteningly. I am glad I had a backup sitting in the directory.

This is what Keltounet was referring to in his post above

So after removing the jnl file the dry-run then returned

  • The following errors were reported by the server:

    Domain: xxxxxxx.com
    Type: unauthorized
    Detail: No TXT record found at _acme-challenge.xxxxxxx.com

    Domain: xxxxxxx.com
    Type: unauthorized
    Detail: No TXT record found at _acme-challenge.xxxxxxx.com

even though the log reported a TXT record was successfully added - actually two were added


#17

Can you confirm from localhost (on the server BIND is running on) that the TXT records are added in the 120 second window certbot uses for the change to propagate?

dig @localhost _acme-challenge.example.com TXT
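If it helps to repeat that query a few times during the waiting window, here is a small sketch that wraps dig +short and separates TXT values from anything else in the answer. A non-TXT line, such as a CNAME target, would mean the challenge name is an alias rather than holding the TXT records itself. The function names are made up, and this assumes dig is installed on the server.

```python
import shlex
import subprocess


def classify_short_answer(output):
    """Split `dig +short` output into (txt_values, other_lines)."""
    txt_values, other = [], []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith('"'):
            # A TXT record may be printed as several quoted strings; join them.
            txt_values.append("".join(shlex.split(line)))
        else:
            # Typically a CNAME target such as "example.com."
            other.append(line)
    return txt_values, other


def query_txt(name, server="localhost"):
    """Run dig against one server and classify what comes back."""
    result = subprocess.run(
        ["dig", "+short", f"@{server}", name, "TXT"],
        capture_output=True,
        text=True,
        check=False,
    )
    return classify_short_answer(result.stdout)
```

Calling query_txt("_acme-challenge.example.com") every 20 seconds or so while certbot waits would show whether both challenge values ever appear.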


#18

I had just gotten around to that direction because I got a

25.010 general: warning: managed-keys-zone/localhost_resolver: Unable to fetch DNSKEY set ‘dlv.isc.org’: SERVFAIL in the /var/log/named.log file

I am trying to track that down. However, given what the jnl file issue does, a better path may be the solution at How to renew wildcard cert with cert-bot auto: remove the jnl file and stay out of dynamic updating, because the jnl file matter may lock my other domain modification program out of updating the zone file.

Here is the DIG output

[root@main ~]# dig @localhost _acme-challenge.xxxxxxxx.com TXT
; <<>> DiG 9.9.4-RedHat-9.9.4-61.el7 <<>> @localhost _acme-challenge.xxxxxxxx.com TXT
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41155
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;_acme-challenge.xxxxxxxx.com. IN TXT

;; ANSWER SECTION:
_acme-challenge.xxxxxxxx.com. 14400 IN CNAME xxxxxxxx.com.

;; AUTHORITY SECTION:
xxxxxxxx.com. 10800 IN SOA ns1.yyyyyyyy.com. no-reply.main.yyyyyyyyy.com. 2016122004 3600 7200 2419200 86400

;; Query time: 5145 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Aug 09 14:48:01 CDT 2018
;; MSG SIZE rcvd: 140

Do you or anyone know if there are sample authenticator.sh and cleanup.sh files for the RFC 2136 DNS plugin or for any other plugin for BIND ??


#19

I looked into using the authenticator hooks method and it appears to be worse than the plugin after all

I fixed the " fetch DNSKEY " issue as it was a permission matter also cured by adding the group permission

following a hint from https://github.com/certbot/certbot/issues/5663

I recreated a zone file for " _acme-challenge.xxxxxxxx.com " and I added " check-names ignore; " to the zone config

when I ran it this time I get the following relevant entries in the log

531:INFO:certbot.auth_handler:dns-01 challenge for xxxxxxxx.com
555:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Received authoritative SOA response for _acme-challenge.xxxxxxxx.com
567:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Successfully added TXT record
578:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Received authoritative SOA response for _acme-challenge.xxxxxxxx.com
584:DEBUG:certbot_dns_rfc2136.dns_rfc2136:Successfully added TXT record

Initially I put in the zone file this kind of txt entry

_acme-challenge.xxxxxxxx.com. 86400 IN TXT ""

In reviewing the zone file I see it has been modified as thus:

$TTL 14400 ; 4 hours
A xxx.xxx.xxx.xxx
$ORIGIN _acme-challenge.xxxxxxxx.com.
localhost A 127.0.0.1

and the jnl file is now created for _acme-challenge.xxxxxxxx.com

but it still fails as this

Failed authorization procedure. xxxxxxxx.com (dns-01): urn:ietf:params:acme:error:unauthorized :: The client lacks sufficient authorization :: No TXT record found at _acme-challenge.xxxxxxxx.com, xxxxxxxx.com (dns-01): urn:ietf:params:acme:error:unauthorized :: The client lacks sufficient authorization :: No TXT record found at _acme-challenge.xxxxxxxx.com

Everything on the DNS appears to be working now and named is not throwing errors into its log, so does anyone have a clue what the issue is now ??

As the github post above shows - the documentation is sparse

Subsequent dry-run efforts kept bringing back the old value for the TXT record even if the jnl file is removed and BIND is restarted

A subsequent run produced

$TTL 120 ; 2 minutes
TXT ""
$ORIGIN _acme-challenge.xxxxxxxx.com.
$TTL 14400 ; 4 hours
localhost A 127.0.0.1

which was reported as an error: incorrect TXT ""

removing that above and the jnl file and a BIND restart - still again brings back a NO TXT record found

It is beginning to look like this is a bug


#20

I still haven’t seen any dig results during the waiting period.