Wildcard DNS challenge fails due to duplicate TXT record?

Hi, I'm having issues verifying DNS records using certbot:v1.18.0 docker image. I had this working about a year ago so I'm not sure what's going wrong.

I'm running certonly with the domain parameters as such:

certbot certonly -nd 'example.com' -d '*.example.com' \
    --manual --preferred-challenges dns \
    --manual-auth-hook /vimexx/auth \
    --manual-cleanup-hook /vimexx/clean \
    --agree-tos --manual-public-ip-logging-ok --no-eff-email \
    --dry-run --debug \
    --email 'user@gmail.com'

Logs below. What seems to be happening is my auth-hook script is called twice with different TXT record contents, the second overwriting the first, then the TXT record is compared against the contents of the first write. The last time I worked with this, I seem to remember there was only one auth request done for both example.com and *.example.com at the same time.

What am I missing? Did something change? As far as I know I just need to create 1 TXT record named _acme-challenge.example.com.

certbot_1  | Account registered.
certbot_1  | Simulating a certificate request for example.com and *.example.com
certbot_1  | Hook '--manual-auth-hook' for example.com ran with output:
certbot_1  |  Authenticating example.com with 7nGb4ZVr-_pV75l0iMZvlrttlbLF12TJxPOAmHAbC_U
certbot_1  |  Setting example.com to 7nGb4ZVr-_pV75l0iMZvlrttlbLF12TJxPOAmHAbC_U
certbot_1  | Hook '--manual-auth-hook' for example.com ran with output:
certbot_1  |  Authenticating example.com with u3Y0IQGRe7N-65oucbSLNCXvW5nxOSjoCItDOiGzGXU
certbot_1  |  Replacing example.com with u3Y0IQGRe7N-65oucbSLNCXvW5nxOSjoCItDOiGzGXU
certbot_1  |
certbot_1  | Certbot failed to authenticate some domains (authenticator: manual). The Certificate Authority reported these problems:
certbot_1  |   Domain: example.com
certbot_1  |   Type:   unauthorized
certbot_1  |   Detail: Incorrect TXT record "u3Y0IQGRe7N-65oucbSLNCXvW5nxOSjoCItDOiGzGXU" found at _acme-challenge.example.com
certbot_1  |
certbot_1  | Hint: The Certificate Authority failed to verify the DNS TXT records created by the --manual-auth-hook. Ensure that this hook is functioning correctly and that it waits a sufficient duration of time for DNS propagation. Refer to "certbot --help manual" and the Certbot User Guide.
certbot_1  |
certbot_1  | Saving debug log to /var/log/letsencrypt/letsencrypt.log
certbot_1  | Exiting abnormally:
certbot_1  | Traceback (most recent call last):
certbot_1  |   File "/usr/local/bin/certbot", line 33, in <module>
certbot_1  |     sys.exit(load_entry_point('certbot', 'console_scripts', 'certbot')())
certbot_1  |   File "/opt/certbot/src/certbot/certbot/main.py", line 15, in main
certbot_1  |     return internal_main.main(cli_args)
certbot_1  |   File "/opt/certbot/src/certbot/certbot/_internal/main.py", line 1566, in main
certbot_1  |     return config.func(config, plugins)
certbot_1  |   File "/opt/certbot/src/certbot/certbot/_internal/main.py", line 1426, in certonly
certbot_1  |     lineage = _get_and_save_cert(le_client, config, domains, certname, lineage)
certbot_1  |   File "/opt/certbot/src/certbot/certbot/_internal/main.py", line 128, in _get_and_save_cert
certbot_1  |     lineage = le_client.obtain_and_enroll_certificate(domains, certname)
certbot_1  |   File "/opt/certbot/src/certbot/certbot/_internal/client.py", line 456, in obtain_and_enroll_certificate
certbot_1  |     cert, chain, key, _ = self.obtain_certificate(domains)
certbot_1  |   File "/opt/certbot/src/certbot/certbot/_internal/client.py", line 386, in obtain_certificate
certbot_1  |     orderr = self._get_order_and_authorizations(csr.data, self.config.allow_subset_of_names)
certbot_1  |   File "/opt/certbot/src/certbot/certbot/_internal/client.py", line 436, in _get_order_and_authorizations
certbot_1  |     authzr = self.auth_handler.handle_authorizations(orderr, self.config, best_effort)
certbot_1  |   File "/opt/certbot/src/certbot/certbot/_internal/auth_handler.py", line 90, in handle_authorizations
certbot_1  |     self._poll_authorizations(authzrs, max_retries, best_effort)
certbot_1  |   File "/opt/certbot/src/certbot/certbot/_internal/auth_handler.py", line 178, in _poll_authorizations
certbot_1  |     raise errors.AuthorizationError('Some challenges have failed.')
certbot_1  | certbot.errors.AuthorizationError: Some challenges have failed.
certbot_1  | Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.
myservices_certbot_1 exited with code 1

Nothing has changed, but you do need two TXT records to support both *.example.com and example.com in one certificate. The auth hook needs to add them, and the cleanup hook needs to remove them, and you need to have both TXT records in your DNS server at once in-between. (Unless there's some way to tell certbot to run one validation before running the next, but I don't think certbot handles that. Some other client might, though.)

Perhaps last time you were testing this, you had a cached authorization for one of the names? Once you complete a challenge Let's Encrypt's server usually doesn't ask you to validate the name again for a few weeks on the same ACME account.
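
To make the "both records at once" point concrete, here is a rough sketch in Python of what the two hooks need to do to the record set. It is not your actual script: `TxtRecordSet` is a made-up in-memory stand-in for a DNS provider API, and the function names are only for illustration. The only parts that come from certbot are the `CERTBOT_DOMAIN` and `CERTBOT_VALIDATION` environment variables.

```python
import os


class TxtRecordSet:
    """Hypothetical in-memory stand-in for a DNS provider's TXT records."""

    def __init__(self):
        self.records = {}  # record name -> list of TXT values

    def add(self, name, value):
        # Append instead of overwriting, so earlier challenge values survive.
        values = self.records.setdefault(name, [])
        if value not in values:
            values.append(value)

    def remove(self, name, value):
        # Remove only the value that this challenge created.
        values = self.records.get(name, [])
        if value in values:
            values.remove(value)
        if not values:
            self.records.pop(name, None)


def challenge_name(domain):
    # As the log in the OP shows, the hook is called for "example.com" for
    # both example.com and *.example.com, so both map to the same name.
    return "_acme-challenge." + domain


def auth_hook(dns, env=os.environ):
    dns.add(challenge_name(env["CERTBOT_DOMAIN"]), env["CERTBOT_VALIDATION"])


def cleanup_hook(dns, env=os.environ):
    dns.remove(challenge_name(env["CERTBOT_DOMAIN"]), env["CERTBOT_VALIDATION"])
```

The key difference from the "Setting" / "Replacing" behaviour in the log is that the second auth call leaves the first value in place, and each cleanup call removes only its own value.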

3 Likes

Hi @freem1nd, and welcome to the LE community forum :slight_smile:

Has this ever worked?
How much of a delay do you use to allow the new TXT records to synchronize?

1 Like

Hi, thanks for replying! I'm still analyzing this, it seems there's something going wrong at the API side of my provider but I'm unsure.

To be clear, I need to create 2 TXT records, both named _acme-challenge.example.com but with different contents? That's what I'm trying and I'm getting a result: ok back from the API, however the 2nd record never actually shows up in subsequent read requests or in the provider's admin panel.

2 Likes

Hi, thanks for the welcome :blush:

This has worked before, even with 0 delay. Now, I tested with several delays but even with a 5 minute delay between each auth/cleanup request, the 2nd record never shows. Beginning to suspect an issue with my provider's API.

1 Like

Handling multiple TXT records at once is something that, for some reason, some DNS provider APIs don't handle well, despite it being standard and commonly used. You might need to add both values in one modify-DNS-record call or the like, which can be tricky to integrate with certbot but I suspect is possible. But yes, probably the first thing to do is to reach out to your DNS provider. Especially if the call to add a second value used to work and now doesn't, they might have broken something on their side.
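
One way the "both values in one call" idea might be wired into an auth hook, as a sketch only: queue each value in a small state file and push everything when the last hook invocation runs. This assumes certbot sets `CERTBOT_REMAINING_CHALLENGES` for manual hooks (the number of challenges still to come after the current one). `set_txt_values` and the state file path are hypothetical; I don't know what your provider's API actually offers.

```python
import json
import os
from pathlib import Path


def collect_and_submit(env, state_file, set_txt_values):
    """Queue this challenge's value; push everything once the last hook runs.

    set_txt_values(name, values) is a hypothetical provider call that
    writes all TXT values for a name in a single request.
    """
    state_path = Path(state_file)
    pending = json.loads(state_path.read_text()) if state_path.exists() else {}

    name = "_acme-challenge." + env["CERTBOT_DOMAIN"]
    pending.setdefault(name, []).append(env["CERTBOT_VALIDATION"])

    # CERTBOT_REMAINING_CHALLENGES counts the challenges still to come
    # after this one, so "0" marks the final auth hook invocation.
    # If the variable is missing, this sketch submits immediately.
    if env.get("CERTBOT_REMAINING_CHALLENGES", "0") == "0":
        for record_name, values in pending.items():
            set_txt_values(record_name, values)
        state_path.unlink(missing_ok=True)
        return True

    state_path.write_text(json.dumps(pending))
    return False
```

Any propagation delay would then only be needed after the final call, since nothing is sent to the provider before that.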

Another option, if your DNS provider just doesn't work well for automation, is to use something like acme-dns where you delegate the acme challenge TXT record to a dedicated server that's purely designed for handling the ACME DNS challenges.

2 Likes

This actually doesn't make sense to me.

It's been a while, but last I used certbot for this stuff, it looped over the challenges once, not with separate setup, auth, and cleanup phases. I.e., each challenge should be handled independently and follow this flow:

  • auth hook
  • LetsEncrypt Validation
  • cleanup hook

IIRC, the challenge data is passed into the auth hooks via environment vars.

Personally, I would add some debugging lines to these files:

--manual-auth-hook /vimexx/auth \
--manual-cleanup-hook /vimexx/clean \

Perhaps there is a debugging option/loglevel built in to those scripts, to trace what they are setting and testing with a bit more granularity.
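
As a sketch of the kind of debugging line I mean (Python here, adapt to whatever the scripts are written in): the helper name is made up, and the variables are the ones I believe certbot exports to manual hooks. Anything it doesn't set will show up as `<unset>`.

```python
def describe_challenge(env, phase):
    """Build one log line describing what certbot handed to the hook."""
    fields = [
        "CERTBOT_DOMAIN",
        "CERTBOT_VALIDATION",
        "CERTBOT_REMAINING_CHALLENGES",
        "CERTBOT_ALL_DOMAINS",
    ]
    details = " ".join(f"{key}={env.get(key, '<unset>')}" for key in fields)
    return f"[{phase}] {details}"
```

Calling something like `print(describe_challenge(os.environ, "auth"))` at the top of the auth script, and the same with `"clean"` in the cleanup script, should make the exact order of calls and values visible in certbot's hook output.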

I don't think there need to be 2 simultaneous TXT records, but there can be.

Years ago I used namecheap for some domains, and while they had one of the better APIs -- because you could delegate ONLY dns to an api token -- their DNS system appeared to use a read-through cache against the backing datastore, while their API and Admin Panel only affected the backing datastore (it was not a write-through cache). Every time you issued a DNS query, their systems would cache the data for 5 minutes. The way around it, until I switched to acme-dns, was to use a 301 second delay. Insane, I know.

Another option you can try is to obtain a first cert for example.com, a second cert for *.example.com, and then a third cert for both. If you do that within a short time period, I believe letsencrypt will re-use the successful validations and not issue a challenge. (I could be wrong on this part, they might be consumed on a successful cert, but I think that strategy will work.)

2 Likes

Hmm. I haven't actually written any manual auth hooks for certbot, so I'll defer to your experience. It might not be the problem that I feared it was. I remember people having trouble here in the past with a DNS API only supporting one record at a time, but maybe it was a different client.

3 Likes

I think here's the thread I was thinking of:

But that was an issue with doing it all manually, but it sounded like authorizing through a script was supposed to handle things sequentially correctly. So forget everything I said. :slight_smile:

2 Likes

@petercooperjr

Some clients will do all the setups first, and acme-dns accommodates for this design pattern by allowing two records per domain.

I have no idea what certbot currently does, but the very old version I use for DNS auths does loop each challenge independently.

I am fairly certain that your understanding of the ACME protocol has increased significantly since that post, but on the off-chance it has not, the RFC essentially provides for this:

  1. when you create a valid order, you will receive a list of URLs for each authorization
  2. when you visit each authorization url, you will receive options to complete available challenges

So, when an order is made, the Server has essentially created all the authorizations and challenges - but the client does not know any of that information aside from a listing of URLs that will each contain a single authorization object.

While the order is pending, a client can inspect the URLs whenever they want - so they can decide to load all the authorization objects first and defer acting on them to any later point in time (before their expiry), or they can decide to instantly loop the authorization urls and act upon each authorization object's challenge within that loop.
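
A toy simulation of those two client strategies, in Python, in case it helps. This is not ACME client code and not how certbot is implemented: `OverwritingDns` is an invented provider that keeps only the latest TXT value per name, and "validation" is reduced to checking whether the expected value is currently published.

```python
def run_batched(challenges, dns):
    """Set up every challenge first, then validate them all, then clean up."""
    for name, value in challenges:
        dns.publish(name, value)
    results = [dns.has(name, value) for name, value in challenges]
    for name, value in challenges:
        dns.retract(name, value)
    return results


def run_sequential(challenges, dns):
    """Publish, validate, and clean up one challenge at a time."""
    results = []
    for name, value in challenges:
        dns.publish(name, value)
        results.append(dns.has(name, value))
        dns.retract(name, value)
    return results


class OverwritingDns:
    """Toy provider that keeps only the most recent TXT value per name."""

    def __init__(self):
        self.records = {}

    def publish(self, name, value):
        self.records[name] = value

    def has(self, name, value):
        return self.records.get(name) == value

    def retract(self, name, value):
        if self.records.get(name) == value:
            del self.records[name]
```

With two challenges for the same `_acme-challenge` name, the batched run loses the first value to the overwrite and so fails the first check, while the sequential run passes both. A provider that stores both values would pass either way.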

I have spent TOO MUCH TIME reading the RFC and working out edge cases :wink:

2 Likes

Oh yes, I knew that clients could handle challenges and authorize them sequentially in whatever order they want. I just didn't know whether certbot (or other clients) actually did them that way. And in the case of non-scripted manual mode, it looks like it doesn't have an easy way to tell you one record for a name, validate it, and then tell you the other record for that same name. But that's irrelevant to what we're discussing here, of scripted manual mode, so never mind. :slight_smile:

1 Like

Thanks for replying! I wrote those scripts myself and they worked before. I did add some logging to them, which you can already see being output in the OP. I have since added more, just dumping the JSON before sending/processing it, which is how I confirmed that 1: The first record is saved properly (and immediately visible in admin panel, zero delay) and 2: Upon trying to send the 2nd record, I receive a successful result from the API but it's not actually stored.

The order of operations that I get with the certbot command in OP is:

Auth
Auth
Failed validation
Clean
Clean

I'll try the 300+ second delay to be sure, but given that the 1st record is immediately available I don't think it's a caching issue.

Support ticket at provider is awaiting a response... I'll wait for that before I'll try your last suggestion because that is just, ugh.

2 Likes

I would make sure to add to your debugging a line for "Successful validation". The one bit of concern I have is that you shouldn't be going through an auth on a previously auth'd domain - although you are dealing with dry-run, which may not be affected by that implementation detail in Boulder.

Other than that, I am fairly certain you are dealing with an architecture change with your DNS provider.

As an immediate stopgap measure, I would try to issue two separate certificates and then get the third combined cert you actually want. I'd also start looking into running your own acme-dns or finding another dns provider, because this may not be something your current provider can address or wants to address.

2 Likes
