Switching to DNS challenge - challenge timeout, implementation best practices?

Hi folks,
I have a couple of certs, and their number tends to grow. I don't know where the limit is, but it doesn't look nice that way. I would prefer to have a single cert that covers all the (virtual) hosts under a certain domain (like *.example.org).

I learned that this is possible, but that it requires a DNS challenge (dns-01).

Currently I am using certbot with "webroot" enabled, and this seems to be a global option - all-or-nothing, one cannot manage some certs via "webroot" and others via DNS.
This appears to be a problem, because I have some certs where I can access the DNS and others where I cannot.

Then, looking for the way to do the DNS challenge, I find a couple of certbot plugins, but they seem to cover commercial DNS providers (or whatever that is, I don't understand most of it).

An exception is the "dns-rfc2136" plugin. This should do dynamic updates as documented here. Dynamic updates are problematic; they normally do not increment the serial, and they will have race-conditions with the regular updates of the zonefiles, specifically in hierarchical master/slave configurations (e.g. "hidden primary"). Correctly handling a zone with mixed static/dynamic content appears to be a PITA.

What I specifically did not find is some means to interact with the given backend of the DNS system, whatever that might be: a DLZ database, a DNSSEC signing system, or simply some zonefiles to be edited with unix tools (sed&awk).

So I currently assume this has to be done with a self-written hook and "manual" mode. The important question here is: How much time do I have?

In my case, I have a DNSSEC signer that lives in a vault, right beside certbot (which also lives in the vault). An interaction would be very easy to implement: write an include for the zonefile, kick rebuild, and - wait.
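For illustration, the "write an include" step could look roughly like the sketch below. Certbot passes the name and the token to a manual auth hook in the `CERTBOT_DOMAIN` and `CERTBOT_VALIDATION` environment variables; everything else here (function names, the include path, the TTL) is made up for the example, and the "kick rebuild" step is left out because it depends entirely on the local signer.

```python
import os
from pathlib import Path


def challenge_record(domain: str, validation: str, ttl: int = 60) -> str:
    """Return one zonefile line carrying the dns-01 token for `domain`."""
    # A wildcard is validated at the base name; strip "*." in case it is present.
    name = domain[2:] if domain.startswith("*.") else domain
    return f'_acme-challenge.{name}. {ttl} IN TXT "{validation}"\n'


def write_challenge(include_file: Path, environ=os.environ) -> str:
    """Append the record for the current challenge to the include file."""
    record = challenge_record(environ["CERTBOT_DOMAIN"],
                              environ["CERTBOT_VALIDATION"])
    # Append rather than overwrite: a cert for example.org plus *.example.org
    # needs two TXT records at the same name.
    with include_file.open("a") as fh:
        fh.write(record)
    return record
```

A hook script would call `write_challenge(Path("/path/to/acme-challenge.inc"))`, trigger the signer, and then wait before returning. A matching cleanup hook would remove the lines again.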

This will then take a while. The signer has to do its job, then it must send out the fresh zonefile to the backend primary, then it will get distributed to the slaves. Then comes the matter of DNS caching, which is quite interesting and complicated. :wink:

So, on the safe side, one might want to wait an hour or more before proceeding - which is just fine because we usually have a full month before expire, and it runs automated anyway. But how long is it allowed to take?
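Instead of a fixed sleep, the hook could poll until the record is actually visible and give up after a deadline. This is only a sketch of the waiting logic: `check` is a placeholder for whatever test fits the setup (for example, asking every authoritative nameserver for the TXT record), and the `sleep`/`clock` parameters exist only so the loop can be tested without really waiting.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float, interval: float,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Call check() every `interval` seconds until it succeeds or time runs out."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        # Stop if another full interval would take us past the deadline.
        if clock() + interval > deadline:
            return False
        sleep(interval)
```

With `timeout=3600` and `interval=60` this waits at most an hour. The hook would exit non-zero when `wait_until` returns `False`, so that the challenge is not attempted against a record that is not there yet.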

I didn't find information about allowable delays. Instead, I found a fancy discussion about the definition of the word "propagation". (This is something new; I normally know this kind of discussion only about the definition of the word "bug".) Obviously, that doesn't help me here.

Any ideas? Any cool comments? Anybody running something similar? Or, anybody knowing the actual allowable delay?

Quite a lot. At least an hour. Maybe a day.

1 Like

If you really want to use a wildcard, then yes, the dns-01 challenge is required.

However, you can add up to 100 hostnames in a single certificate, so maybe a wildcard isn't necessary?

This is not true. Certbot can perfectly well use challenge A for cert X, challenge B for cert Y and challenge C for cert Z.
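As a rough illustration, the method is chosen per invocation, so two certs can be requested with different authenticators. The sketch below only builds the argument lists; the flags are regular certbot options, while the cert names, webroot and hook paths are invented for the example.

```python
from typing import List


def webroot_cmd(cert_name: str, webroot: str, domains: List[str]) -> List[str]:
    """certbot invocation for a cert validated via http-01 from a webroot."""
    args = ["certbot", "certonly", "--cert-name", cert_name,
            "--webroot", "-w", webroot]
    for domain in domains:
        args += ["-d", domain]
    return args


def dns_manual_cmd(cert_name: str, auth_hook: str, cleanup_hook: str,
                   domains: List[str]) -> List[str]:
    """certbot invocation for a cert validated via dns-01 with own hooks."""
    args = ["certbot", "certonly", "--cert-name", cert_name,
            "--manual", "--preferred-challenges", "dns",
            "--manual-auth-hook", auth_hook,
            "--manual-cleanup-hook", cleanup_hook]
    for domain in domains:
        args += ["-d", domain]
    return args
```

Each list can be passed to `subprocess.run(args, check=True)`. Adding `--dry-run` to either list should exercise the same path against the staging environment first.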

Alternatively, you could delegate the _acme-challenge subdomain to a specific host running acme-dns and use the acme-dns-certbot hook.

4 Likes

Maybe - but then if I install some application with its own virtual server, i.e. a new hostname, I still have to fetch a fresh cert with the added hostname. With the wildcard it should work instantly. And, as it appears, the wildcards seem to work with web browsers.
(I'm not sure if they work well with SMTP.)

Hm, I didn't yet figure out how to configure this. My config.ini has preferred-challenges = http, and the renew-invocation then just checks all the existing certs for renewal.

What I did find in the meantime, however, is the --config option to certbot - so it is possible to have multiple cli.ini files with different behaviour (and different working-dirs) carrying independent sets of certs.

I definitely don't want yet another DNS server - I had a hard time designing all the views to get internal and external zones to work seamlessly together with full validation, and to allow the necessary IP flows.

But, this is interesting in another regard: when _acme-challenge is a delegation point, it can be a separate zone file. And this one can be dynamically maintained on the actual public-facing server(s). It doesn't need visibility in the LAN, it doesn't (yet) need DNSSEC - so then the dns-rfc2136 plugin becomes feasible, and there is no delay...

Okay, stuff to think about. Thank You! :slight_smile:

2 Likes

Don't use cli.ini or the --config option. Most, if not all, useful settings entered on the command line are stored in the renewal configuration files.
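To see which method a given cert will use on renewal, one can look at its file under `/etc/letsencrypt/renewal/`. A small sketch for reading the `[renewalparams]` section follows. It assumes the usual layout (a few top-level keys followed by `[renewalparams]`), and it uses the stdlib `configparser` with a dummy section header as a shortcut, which is not how certbot itself reads the file.

```python
import configparser
from typing import Dict


def renewal_params(conf_text: str) -> Dict[str, str]:
    """Return the [renewalparams] settings of one renewal configuration file."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # The file starts with keys outside any section; give them a dummy one.
    parser.read_string("[top]\n" + conf_text)
    if not parser.has_section("renewalparams"):
        return {}
    return dict(parser["renewalparams"])
```

For a cert created with the manual plugin, `renewal_params(text)["authenticator"]` should be `manual`, and for one created with webroot it should be `webroot`, independent of what cli.ini says.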

Personally, I hardly have anything in my cli.ini any more.

I can understand that and I don't know how complicated adding acme-dns would be, but note that it's only for answering the ACME token, nothing more, nothing less.

3 Likes

Have you considered acme-dns (joohoi/acme-dns on GitHub)? It is a limited DNS server with a RESTful HTTP API to handle ACME DNS challenges easily and securely.

It is explained in full here: A Technical Deep Dive: Securing the Automation of ACME DNS Challenge Validation | Electronic Frontier Foundation

Using it could solve many of your concerns. You basically delegate authorization of any FQDNs to a dedicated subdomain that is running in a DNS namespace that is controlled by the acme-dns server.
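In practice the delegation is a single CNAME per validated name in the main zone. A sketch of how that record could be generated, with the target name being a made-up example of what an acme-dns registration hands back:

```python
def challenge_name(domain: str) -> str:
    """Name at which the dns-01 TXT record for `domain` is looked up."""
    # For a wildcard the challenge lives at the base name.
    base = domain[2:] if domain.startswith("*.") else domain
    return f"_acme-challenge.{base.rstrip('.')}."


def delegation_cname(domain: str, target: str, ttl: int = 3600) -> str:
    """Zonefile line pointing the challenge name at the delegated subdomain."""
    return f"{challenge_name(domain)} {ttl} IN CNAME {target.rstrip('.')}."
```

For `*.example.org` and a hypothetical target `abc123.auth.example.net` this gives `_acme-challenge.example.org. 3600 IN CNAME abc123.auth.example.net.`, which is static and never needs to change again; only the TXT record behind the target is updated at renewal time.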

1 Like

I found the precise engineering answer. It's in RFC 8555, section 7.4:
The order object returned by the server represents a promise that if the client fulfills the server's requirements before the "expires" time, then the server will be willing to finalize the order upon request and issue the requested certificate.

That "order object" with the expires time is written into the certbot log, and it is actually - a week. :wink:
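The remaining time can be computed directly from the order object. A small sketch, assuming the order JSON has been copied out of the log; the sample value used in the comment below is invented, not taken from a real log.

```python
import json
from datetime import datetime, timedelta


def time_left(order_json: str, now: datetime) -> timedelta:
    """Time remaining until the ACME order's "expires" timestamp."""
    order = json.loads(order_json)
    # "expires" is an RFC 3339 timestamp; older Python versions do not
    # accept the trailing "Z", so spell out the UTC offset.
    expires = datetime.fromisoformat(order["expires"].replace("Z", "+00:00"))
    return expires - now
```

Called with `datetime.now(timezone.utc)`, an order such as `{"status": "pending", "expires": "2024-03-08T12:00:00Z"}` examined on 2024-03-01 at 12:00 UTC yields seven days. A hook that wants to be careful could compare this against its own worst-case propagation delay.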

3 Likes

So, here we are. Conclusion report: I did it.
What can I say - as soon as you do it right, it works:
https://flowm.daemon.contact/
(Don't worry if You don't understand what this does; it's a firewall autoconfigurator)

Obviously one should put the verification record into the right zonefile, and not forget the '.' at the end (where appropriate).
(This will get more interesting when we get deeper zoning with hosts like a.b.c.example.com.)
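Both pitfalls can be handled mechanically. The sketch below picks the most specific locally served zone for a challenge name and writes the record fully qualified with the trailing dot. The zone list and function names are assumptions for the example; a real setup would take the zones from the nameserver configuration.

```python
from typing import Iterable, Optional, Tuple


def owning_zone(name: str, zones: Iterable[str]) -> Optional[str]:
    """Return the longest zone that is a suffix of `name`, or None."""
    labels = name.rstrip(".").lower().split(".")
    known = {zone.rstrip(".").lower() for zone in zones}
    # Try the full name first, then drop one leading label at a time.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in known:
            return suffix
    return None


def txt_line(domain: str, validation: str, zones: Iterable[str],
             ttl: int = 60) -> Tuple[str, str]:
    """Return (zone, record line) for the dns-01 record of `domain`."""
    fqdn = f"_acme-challenge.{domain.rstrip('.')}"
    zone = owning_zone(fqdn, zones)
    if zone is None:
        raise ValueError(f"no local zone serves {fqdn}")
    # The trailing dot keeps the zone origin from being appended again.
    return zone, f'{fqdn}. {ttl} IN TXT "{validation}"'
```

With zones `example.com` and `c.example.com`, the record for `a.b.c.example.com` lands in `c.example.com`, while the one for `www.example.com` lands in `example.com`.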

So, as things are, I decided to go straight ahead: just provide an include snippet for the zonefile, and let my DNSSEC signer detect the updated include and then run the automation as it is. That now takes at most 22 minutes if the NOTIFYs are working (otherwise an hour longer, so I will allow two hours for safety).

Another thing worth mentioning: there are two hosts that were validated with http-01 and should now be switched to dns-01, because they should have a genuine own cert, but are not (or should not be) web accessible.
But I found I cannot switch: the last validation appears to be valid for a month, and during that time there is no new challenge, and certbot will just report a successful renewal and keep the http-01 method.

@Osiris I can now see why You advise against the cli.ini. As soon as one tries to do something a little more complex, this thing becomes a creepshow. It can only be used for options that never change and are valid in all circumstances - maybe the key length or such.

1 Like

You can use --dry-run to test. And technically it's possible to deactivate an already valid authorization, but Certbot doesn't have this feature.

Correct. While it's possible to add any option to cli.ini, I would highly advise against it and indeed only use it for options you would never change between certs.

3 Likes

Yes, I was thinking for a moment that certbot might no longer be the best tool here. It brings a lot of user-friendliness and failsafe protection, which is mostly wasted now that I'm doing all the scripting and automation myself. There should be something around that is a bit more low-level, maybe even a Ruby library or such. But I'm too lazy to search now and I have had enough scripting this month. Anyway, it was a great helping hand for the beginning, and it works reliably, so let's just be patient.

1 Like

If you are writing in Ruby, I think acme-client (unixcharles/acme-client on GitHub, a Ruby client for Let's Encrypt's ACME protocol) is the most popular Ruby library for ACME.

3 Likes
