Seeking technical clarification on certbot certificate creation using DNS-01 and HTTP-01 challenge

TrainingByCoding · July 18, 2023, 3:17pm

Dear Let's Encrypt Community,

I have encountered an interesting issue while testing certificate installation and renewal on a whitelisted domain (acme-challenge-test.domain_name.com) that allows traffic only from a specific range of IP addresses. During this testing, the http-01 challenge failed with the following error.

Command used: certbot certonly --standalone -d acme-challenge-test.domain_name.com --http-01-port=8888 --debug-challenges -v

Saving debug log to /var/log/letsencrypt/letsencrypt.log

...

Waiting for verification...

Challenge failed for domain acme-challenge-test.domain_name.com

http-01 challenge for acme-challenge-test.domain_name.com

Certbot failed to authenticate some domains (authenticator: standalone). The Certificate Authority reported these problems:

Domain: acme-challenge-test.domain_name.com

Type: connection

Detail: xxx.xxx.xxx.xxx: Fetching http://acme-challenge-test.domain_name.com/.well-known/acme-challenge/xUNTJyXuqVJM7CYhFWrWrrN4sKZPUqVfqlfadsfdfdsf: Error getting validation data

Hint: The Certificate Authority failed to download the challenge files from the temporary standalone webserver started by Certbot on port 8888. Ensure that the listed domains point to this machine and that it can accept inbound connections from the internet.

Cleaning up challenges

Some challenges have failed.

Ask for help or search for solutions at https://community.letsencrypt.org. See the logfile /var/log/letsencrypt/letsencrypt.log or re-run Certbot with -v for more details.

To work around this, I created the certificate with the DNS challenge instead, which completed successfully. Here is the command I used:

certbot -d acme-challenge-test.domain_name.com --manual --preferred-challenges dns certonly --debug-challenges -v

Interestingly, after the DNS challenge succeeded, I gave the http-01 challenge another shot and this time it worked without any issues (same command as above).

I am seeking your expertise to clarify the following:

Question 1: Why did the http-01 challenge succeed on the second attempt? Could it be possible that the authentication was cached, and the second time it renewed the certificate without re-authentication?

Question 2: If caching is involved, where is this saved? And is there a way to clear the authentication cache to replicate the initial failure and investigate it further?

I tried "resolvectl statistics" and "resolvectl flush-caches" to clear the DNS cache (Ubuntu 22.04), but the behaviour persisted. I also looked for cache-related files in the certbot directories but couldn't find any.

Question 3: What could be the reason behind the successful completion of the http-01 challenge after the DNS challenge? Is there a connection between these two challenges, especially considering that the http-01 challenge failed before the DNS challenge?

I would appreciate any insights or explanations that can help me better understand this behavior.

Thank you for your assistance!

Best regards,

ChandrGupt

trainingbycoding@gmail.com

petercooperjr · July 18, 2023, 3:24pm

As you saw, restricting access to specific IP ranges won't let you get a certificate through http-01, because Let's Encrypt needs to verify that you own the name as seen from everywhere on the Internet, so they check from many places.
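
As a rough illustration of why an IP allowlist conflicts with http-01, here is a minimal sketch using Python's ipaddress module. The allowlisted CIDR and the validator source addresses are made up (they come from the reserved documentation ranges, not from Let's Encrypt), and the function names are hypothetical.

```python
import ipaddress
from typing import List

# Hypothetical allowlist: only this range may reach the web server.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]

# Hypothetical validator source addresses (documentation ranges, not real ones).
VALIDATOR_IPS = ["192.0.2.10", "198.51.100.7", "203.0.113.99"]


def is_allowed(source_ip: str) -> bool:
    """Return True if the allowlist would let this source address through."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


def blocked_validators(ips: List[str]) -> List[str]:
    """List the validation sources that the allowlist would drop."""
    return [ip for ip in ips if not is_allowed(ip)]


if __name__ == "__main__":
    print(blocked_validators(VALIDATOR_IPS))
```

Any validation request arriving from an address outside the allowlist never reaches the challenge file, which is consistent with the "connection" error reported above.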

Note that --http-01-port only changes the port that the standalone server listens on; the validation will still happen over port 80. That option is designed for unusual cases where some NAT device maps incoming port 80 to a different port on the server, and it isn't useful nearly as often as people try to use it.
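
A small sketch of the point about port 80, assuming only the URL shape shown in the error message above. The helper names and the example.com domain are hypothetical; the URL carries no port, so the request goes to the default HTTP port no matter where the standalone server is bound locally.

```python
from urllib.parse import urlsplit


def http01_validation_url(domain: str, token: str) -> str:
    """Build the URL fetched for an http-01 challenge (no port component)."""
    return f"http://{domain}/.well-known/acme-challenge/{token}"


def effective_port(url: str) -> int:
    """Return the explicit port of a URL, or 80 when an http URL names none."""
    port = urlsplit(url).port
    return port if port is not None else 80


if __name__ == "__main__":
    url = http01_validation_url("example.com", "example-token")
    print(url, effective_port(url))
```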

Yes, Let's Encrypt saves successful validations for 30 days, though they're considering reducing it. During that time, you can get a certificate for the name without needing to re-authorize.

In Let's Encrypt's database, the "authorization" object for your name is marked as successful, and has an expiration of when it will no longer work and your ACME account would need to validate again.
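
To make the idea of a reusable authorization concrete, here is a toy model. The class, field names, and helper are assumptions for illustration and do not reflect Let's Encrypt's actual database schema; the 30-day window is the figure quoted above and may change.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

REUSE_WINDOW = timedelta(days=30)  # figure quoted in this thread; may change


@dataclass
class Authorization:
    identifier: str    # the domain name the authorization covers
    status: str        # e.g. "pending", "valid", "deactivated"
    expires: datetime  # after this moment the authorization cannot be reused

    def reusable(self, now: datetime) -> bool:
        """A new order can skip validation only for a valid, unexpired authz."""
        return self.status == "valid" and now < self.expires


def validated(identifier: str, now: datetime) -> Authorization:
    """Create the authorization recorded after a successful challenge."""
    return Authorization(identifier, "valid", now + REUSE_WINDOW)


if __name__ == "__main__":
    start = datetime(2023, 7, 18, tzinfo=timezone.utc)
    authz = validated("example.com", start)
    print(authz.reusable(start + timedelta(days=1)))
```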

In theory, yes, your ACME client can explicitly invalidate the authorization. I don't think certbot exposes the functionality directly, but when you do --dry-run to test against staging, it should invalidate all existing authorizations, so the challenges are actually tested. If you're trying to do testing, then definitely use the staging environment, as that's what it's for.

The connection is just that they're for the same domain name.

TrainingByCoding · July 18, 2023, 4:08pm

Thank you very much Peter, for the detailed and super prompt response. This clarifies most of my queries.

I tried running certbot with --dry-run and with --staging too, but it kept using the cached auth object. I couldn't find a way to invalidate or bypass it and make the http-01 challenge fail. I used this command:
sudo certbot certonly --standalone -d acme-challenge-test.domain_name.com --http-01-port=8888 --staging --dry-run --debug-challenges -v

Anyways, I was doing this out of technical curiosity and to understand the internal working better. If it's difficult to replicate it, I'll probably leave it at that. But, if anything comes up that can help me invalidate / ignore the cache for dry-run in staging, would love to try it out.

aarongable · July 18, 2023, 4:24pm

To provide just a little bit more context here: The ACME protocol specifically supports "authorization deactivation", which prevents an authorization from being re-used for a future order. Some ACME clients (such as acme.sh) expose this functionality directly, allowing the user to run a command which causes the client to make the appropriate authorization deactivation requests. Certbot, with its emphasis on full automation, does not.
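
For reference, the deactivation request described in RFC 8555 (section 7.5.2) carries a very small JSON body. The sketch below only builds that unsigned body and applies it to a toy in-memory authorization; a real client would wrap the body in a signed JWS and POST it to the authorization URL. The function names are hypothetical and not part of any ACME client.

```python
import json

# Body of an authorization deactivation request, per RFC 8555 section 7.5.2.
DEACTIVATE = {"status": "deactivated"}


def build_deactivation_body() -> str:
    """Serialize the unsigned payload; a real client signs this as a JWS."""
    return json.dumps(DEACTIVATE)


def apply_to_authz(authz: dict, body: str) -> dict:
    """Toy server-side handler: return a copy of authz with the new status."""
    update = json.loads(body)
    if update.get("status") != "deactivated":
        raise ValueError("only deactivation is modeled in this sketch")
    return {**authz, "status": "deactivated"}


if __name__ == "__main__":
    authz = {"identifier": "example.com", "status": "valid"}
    print(apply_to_authz(authz, build_deactivation_body()))
```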

petercooperjr · July 18, 2023, 4:27pm

Thanks, but I had thought that certbot's --dry-run did the invalidation to ensure that it was testing the current ability to complete authorizations. I know only enough about certbot to be dangerous though (I use it sometimes for testing something but it's not my "daily driver" client), so maybe someone else knows better how to ensure that it's not reusing cached authorizations. I might suggest trying --dry-run without also specifying --staging, but again it's not something I've tried.

Osiris · July 18, 2023, 5:26pm

It does, but Certbot doesn't expose any method for the user to deactivate valid authz. It's only used internally by --dry-run. See:

https://github.com/certbot/certbot/blob/c31d3a2cfd1d55b9027e0932cd3dc373bc43b514/certbot/certbot/_internal/auth_handler.py#L120


      
            # Keep validated authorizations only. If there is none, no certificate can be issued.
            authzrs_validated = [authzr for authzr in authzrs
                                 if authzr.body.status == messages.STATUS_VALID]
            if not authzrs_validated:
                raise errors.AuthorizationError('All challenges have failed.')

            return authzrs_validated

        raise errors.Error("An unexpected error occurred while handling the authorizations.")

    def deactivate_valid_authorizations(self, orderr: messages.OrderResource) -> Tuple[List, List]:
        """
        Deactivate all `valid` authorizations in the order, so that they cannot be re-used
        in subsequent orders.
        :param messages.OrderResource orderr: must have authorizations filled in
        :returns: tuple of list of successfully deactivated authorizations, and
                  list of unsuccessfully deactivated authorizations.
        :rtype: tuple
        """
        if not self.acme:
            raise errors.Error("No ACME client defined, cannot deactivate valid authorizations.")

You can see it being used (and only being used) at:

https://github.com/certbot/certbot/blob/c31d3a2cfd1d55b9027e0932cd3dc373bc43b514/certbot/certbot/_internal/client.py#L487-L488


      
    if orderr and self.config.dry_run:
        deactivated, failed = self.auth_handler.deactivate_valid_authorizations(orderr)

If one were to remove the "and self.config.dry_run" part, Certbot would always deactivate any valid authz. Hack hack hack...

TrainingByCoding · July 18, 2023, 5:40pm

Thank you very much Aaron, this clarifies.
We started with certbot and have stayed with it, as it's the default recommended option and it has served all our purposes to date. This is probably the first instance where we see a need to try another ACME client. Will try out acme.sh for this particular use case and see how it works out. Thanks again for your suggestion.

Osiris · July 18, 2023, 5:44pm

Why's that exactly? IMO there isn't any reason to want to deactivate valid authz on the production environment. What use case would you have for that?

TrainingByCoding · July 18, 2023, 5:44pm

Yes, I tried both options: with --dry-run and --staging together, and with --dry-run alone. In both cases the observation was the same: the auth cache wasn't invalidated or ignored.
Thanks again for all your inputs.

Osiris · July 18, 2023, 5:46pm

I highly doubt that a valid cached authz was not deactivated when using --dry-run. Do you have the log to support that? I would be very interested to see it.

Unless perhaps your Certbot version predates this feature? That would mean your version is older than 0.40.0, though, as that's the version where deactivation for --dry-run was introduced, almost 4 years ago now. The most recent version is currently 2.6.0... Not sure when authz reuse was introduced though; I can't find it in the changelog.

TrainingByCoding · July 18, 2023, 5:52pm

Fair point Osiris, I agree and maybe my earlier point needs rephrasing.
In production, there isn't any valid use case for this. However, given the scenario described above, I wanted to dig deeper and establish that http-01 only works after dns-01 because of the cached auth object, and that it fails once that object is invalidated.

TrainingByCoding · July 18, 2023, 5:59pm

Yes, I just checked on that, after seeing your (--dry-run codebase) response above.
The current certbot version is 1.21.0, it should then have the deactivation feature for --dry-run.
However, let me still check after upgrading to the latest version and see if the behavior changes. Will update shortly.

Osiris · July 18, 2023, 6:01pm

Version 1.21.0 should be fine with deactivating valid authz when using --dry-run. I'd be interested to see a log where --dry-run didn't deactivate any valid authz.

TrainingByCoding · July 18, 2023, 6:58pm

letsencrypt.txt (33.0 KB)
Here you go, the log file excerpt containing the --dry-run output.
I can see "Recreating order after authz deactivations" (line 212) however, the dry-run completed successfully.

petercooperjr · July 18, 2023, 7:05pm

It looks to me like the dry-run completed successfully because the standalone web server is in fact responding to the challenges and making a new valid authorization. What makes you think there's a problem?

TrainingByCoding · July 18, 2023, 7:21pm

To reiterate, http-01 succeeds only after dns-01 does, because it finds a cached auth object. However, if --dry-run with http-01 actually ignores the existing cached authz and creates a new one, then it should have failed, because the http-01 challenge originally fails for this domain: it is a whitelisted domain that only allows traffic from designated CIDRs.

Bruce5051 · July 18, 2023, 7:23pm

Let's Encrypt uses multi-perspective validation; see "Multi-Perspective Validation Improves Domain Validation Security" (Let's Encrypt).

petercooperjr · July 18, 2023, 7:27pm

The log you posted showed the http-01 challenges being responded to by the standalone web server, though.

2023-07-19 00:02:04,299:DEBUG:acme.standalone:::ffff:127.0.0.1 - - Incoming request
2023-07-19 00:02:04,299:DEBUG:acme.standalone:::ffff:127.0.0.1 - - Serving HTTP01 with token 'M4-KxboYi3rIIt5kKNdjP2ec9Ef0sN-XIRwObOsxJDk'
2023-07-19 00:02:04,299:DEBUG:acme.standalone:::ffff:127.0.0.1 - - "GET /.well-known/acme-challenge/M4-KxboYi3rIIt5kKNdjP2ec9Ef0sN-XIRwObOsxJDk HTTP/1.1" 200 -
2023-07-19 00:02:04,398:DEBUG:acme.standalone:::ffff:127.0.0.1 - - Incoming request
2023-07-19 00:02:04,399:DEBUG:acme.standalone:::ffff:127.0.0.1 - - Serving HTTP01 with token 'M4-KxboYi3rIIt5kKNdjP2ec9Ef0sN-XIRwObOsxJDk'
2023-07-19 00:02:04,399:DEBUG:acme.standalone:::ffff:127.0.0.1 - - "GET /.well-known/acme-challenge/M4-KxboYi3rIIt5kKNdjP2ec9Ef0sN-XIRwObOsxJDk HTTP/1.1" 200 -
2023-07-19 00:02:04,405:DEBUG:acme.standalone:::ffff:127.0.0.1 - - Incoming request
2023-07-19 00:02:04,406:DEBUG:acme.standalone:::ffff:127.0.0.1 - - Serving HTTP01 with token 'M4-KxboYi3rIIt5kKNdjP2ec9Ef0sN-XIRwObOsxJDk'
2023-07-19 00:02:04,406:DEBUG:acme.standalone:::ffff:127.0.0.1 - - "GET /.well-known/acme-challenge/M4-KxboYi3rIIt5kKNdjP2ec9Ef0sN-XIRwObOsxJDk HTTP/1.1" 200 -

So perhaps you intend for your firewall to be blocking the traffic, but it doesn't look like it is.
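
A quick way to check a Certbot debug log for this is to count the "Serving HTTP01" lines per token. The sketch below assumes the log line format shown in the excerpt above; the helper name and the embedded sample are for illustration only.

```python
import re
from collections import Counter

# Matches the standalone server's debug line and captures the token.
SERVING = re.compile(r"Serving HTTP01 with token '([^']+)'")

SAMPLE_LOG = """\
2023-07-19 00:02:04,299:DEBUG:acme.standalone:::ffff:127.0.0.1 - - Serving HTTP01 with token 'M4-KxboYi3rIIt5kKNdjP2ec9Ef0sN-XIRwObOsxJDk'
2023-07-19 00:02:04,399:DEBUG:acme.standalone:::ffff:127.0.0.1 - - Serving HTTP01 with token 'M4-KxboYi3rIIt5kKNdjP2ec9Ef0sN-XIRwObOsxJDk'
2023-07-19 00:02:04,406:DEBUG:acme.standalone:::ffff:127.0.0.1 - - Serving HTTP01 with token 'M4-KxboYi3rIIt5kKNdjP2ec9Ef0sN-XIRwObOsxJDk'
"""


def tokens_served(log_text: str) -> Counter:
    """Count how many times each challenge token was served."""
    return Counter(SERVING.findall(log_text))


if __name__ == "__main__":
    for token, count in tokens_served(SAMPLE_LOG).items():
        print(token, count)
```

If the firewall were really blocking the validation requests, a count like this would come back empty for the run in question.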

Osiris · July 18, 2023, 7:47pm

I concur with @petercooperjr. Let's walk through the log together, shall we? (The relevant parts that is.)

00:02:00,026: Certbot requests the ACME servers directory
00:02:00,027: Certbot retrieves the directory
00:02:00,851: Certbot requests a new order for acme-challenge-test.domain_name.com
00:02:01,125: Certbot retrieves an order already in the "ready" state with an authz "7356118554"
00:02:01,127: Certbot requests the authz "7356118554"
00:02:01,383: Certbot retrieves the authz "7356118554" already in the "valid" state
00:02:01,385: Certbot makes a POST to authz "7356118554" with content "status": "deactivated", thereby deactivating the already valid authz
00:02:01,645: Log notes "Recreating order after authz deactivations"
00:02:01,648: Certbot requests a new order
00:02:01,923: Certbot retrieves a new order with status "pending" containing a new authz "7356423174"
00:02:01,925: Certbot requests the authz "7356423174"
00:02:02,183: Certbot retrieves authz "7356423174" in the "pending" state, containing three challenges, all also in the "pending" state
00:02:02,184: Certbot fires up the standalone authenticator
00:02:03,794: Certbot makes a POST to the http-01 challenge
00:02:04,299 to 00:02:04,406: Certbot serves the token three times (three incoming requests)
00:02:05,056: Certbot checks the authz by sending an empty POST to the authz URI
00:02:05,311: Certbot retrieves the now valid authz containing the now valid http-01 challenge
00:02:05,410: Certbot sends the CSR to the finalize URI of the order, triggering the ACME server to generate the certificate
00:02:05,677: Certbot retrieves the order in the "processing" state as a response
00:02:06,680: Certbot polls the order
00:02:06,938: Certbot retrieves the order poll and gets the order with "valid" state and containing a certificate URI
00:02:06,940: Certbot requests the certificate
00:02:07,198: Certbot retrieves the certificate

Aaaaaand done.

TrainingByCoding · July 18, 2023, 8:27pm

Thank you @petercooperjr and @Osiris for your prompt and precise assistance on this topic.

Sorry, my bad with the last testing iteration: due to an oversight, a required ACL was wrongly configured for the domain under test. Having fixed that, I can now confirm that --dry-run works as expected: the certificate renewal with the http-01 challenge fails because --dry-run deactivates the valid cached authz object.

Summarizing the findings here for anyone who might refer to this thread in the future:

http-01 fails for a whitelisted domain that is accessible only from specific CIDRs.
The dns-01 challenge validates successfully for the same domain.
Retrying http-01 on this domain now succeeds, because the valid authz object is cached in Let's Encrypt's database for a set period (currently 30 days, though they are considering reducing it).
Retrying the http-01 challenge with --dry-run fails, as expected, because --dry-run deactivates the valid cached authz object.
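
The four findings can be replayed with a toy model. This is only a sketch: ToyCA and replay_thread are hypothetical names, the model ignores expiry and the separation between staging and production, and it is not how Let's Encrypt or Certbot are implemented.

```python
from typing import List


class ToyCA:
    """Toy stand-in for a CA that caches valid authorizations per domain."""

    def __init__(self, http_reachable: bool, dns_record_ok: bool) -> None:
        self.http_reachable = http_reachable
        self.dns_record_ok = dns_record_ok
        self.valid_authz = set()  # domains holding a cached valid authorization

    def order(self, domain: str, challenge: str, dry_run: bool = False) -> str:
        if dry_run:
            # Mimic --dry-run: deactivate any valid authz before validating.
            self.valid_authz.discard(domain)
        if domain in self.valid_authz:
            return "issued (reused authz)"
        ok = self.http_reachable if challenge == "http-01" else self.dns_record_ok
        if not ok:
            return "failed"
        self.valid_authz.add(domain)
        return "issued (fresh validation)"


def replay_thread() -> List[str]:
    """Run the four steps from the summary against an allowlisted host."""
    ca = ToyCA(http_reachable=False, dns_record_ok=True)
    domain = "acme-challenge-test.example"
    return [
        ca.order(domain, "http-01"),
        ca.order(domain, "dns-01"),
        ca.order(domain, "http-01"),
        ca.order(domain, "http-01", dry_run=True),
    ]


if __name__ == "__main__":
    for step, outcome in enumerate(replay_thread(), start=1):
        print(step, outcome)
```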

Your insights were invaluable, thank you very much for your support.
