Hi,
Lately we have been getting a lot of errors when we ask LE to validate the DNS challenge. I will describe the flow:
We provide a service for ordering certificates from LE using a DNS challenge. When a user asks to order a certificate, we do the following:
- We perform a dry run challenge using a fake TXT challenge value.
- We validate the dry run challenge to make sure the user really owns the domain.
- We start the order process against LE and get the real DNS challenge.
- We send the domain validation request to the user's DNS challenge handler API.
- Once the user responds with 200 OK, we check that the challenge is satisfied by querying for the TXT record. Only if the record is found do we send LE a request to validate the challenge.
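To make the last two steps concrete, here is a rough Python sketch of the idea. It is not our actual code: `wait_for_txt`, `validate_if_ready`, `lookup_txt` and `notify_ca` are hypothetical names, and the DNS lookup and the call to the CA are passed in as plain functions so the sketch stays self-contained. The `_acme-challenge.` prefix is the record name LE reports in its error message.

```python
import time
from typing import Callable, Iterable


def acme_challenge_name(domain: str) -> str:
    """Return the record name that carries the DNS-01 TXT value."""
    return "_acme-challenge." + domain.rstrip(".")


def wait_for_txt(
    domain: str,
    expected: str,
    lookup_txt: Callable[[str], Iterable[str]],  # hypothetical resolver hook
    attempts: int = 5,
    delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the expected TXT value is visible, or give up."""
    name = acme_challenge_name(domain)
    for attempt in range(attempts):
        if expected in list(lookup_txt(name)):
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before the next lookup
    return False


def validate_if_ready(
    domain: str,
    expected: str,
    lookup_txt: Callable[[str], Iterable[str]],
    notify_ca: Callable[[str], None],  # hypothetical "ask the CA to validate" hook
    **poll_options,
) -> bool:
    """Ask the CA to validate only after we have seen the record ourselves."""
    if not wait_for_txt(domain, expected, lookup_txt, **poll_options):
        return False
    notify_ca(domain)
    return True
```

Note that the sketch only proves the record is visible to whatever `lookup_txt` queries; it says nothing about which resolver or name server that is.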
Lately, many of these requests to LE to validate the challenge fail. These are the logs:
Oct 29 09:48:26 orders-manager-96c7888cf-d7557 orders-manager DEBUG DNS TXT query for domain '29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com' found record with value 2_X2dAOlRER35PxiwduncpXq0yQgPV8_llrpkPeY2Co was found, certificate 'crn:v1:staging:public:cloudcerts:us-south:a/56f52a905e4c4d8614b507ef330225e0:914705eb-2e1a-46e3-ab70-a7c01bef46d0:certificate:53aeb30b3cfb60dd1a40642cdf52e2ef'. The challenge was satisfied
Oct 29 09:48:26 orders-manager-96c7888cf-d7557 orders-manager DEBUG Domain validation for'29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com' finished successfully.
Oct 29 09:48:27 orders-manager-96c7888cf-d7557 orders-manager DEBUG Going to call the CA to validate the challenge for domain '29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com after a delay of '40000' ms
Oct 29 09:49:07 orders-manager-96c7888cf-d7557 orders-manager DEBUG The CA accepted the request to validate the challenge for domain '29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com'. Response headers are {"server":"nginx","date":"Thu, 29 Oct 2020 07:49:07 GMT","content-type":"application/json","content-length":"184","connection":"close","boulder-requester":"51223980","cache-control":"public, max-age=0, no-cache","link":"<https://acme-v02.api.letsencrypt.org/directory>;rel=\"index\", <https://acme-v02.api.letsencrypt.org/acme/authz-v3/8215809301>;rel=\"up\"","location":"https://acme-v02.api.letsencrypt.org/acme/chall-v3/8215809301/_F6PXA","replay-nonce":"0103GruBFvMzhOE8PsJhUTFH_75z-1Z-B9W8qZpxdIeDd6A","x-frame-options":"DENY","strict-transport-security":"max-age=604800"}. Response body is {"type":"dns-01","status":"pending","url":"https://acme-v02.api.letsencrypt.org/acme/chall-v3/8215809301/_F6PXA","token":"q4kZhTvRFHOFvgm98m2EKG7SinF4kemRraIK56AjCRQ"}
Oct 29 09:49:08 orders-manager-96c7888cf-d7557 orders-manager ERROR Couldn't order certificate for domains '["29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com"]'. Reason is: Certificate Manager was not able to process your request. Domain validation failed, check your DNS configuration.
Oct 29 09:49:08 orders-manager-96c7888cf-d7557 orders-manager DEBUG Polling domain '29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com' challenge validation status. Attempt number 1. Total polling delay 1 seconds
Oct 29 09:49:08 orders-manager-96c7888cf-d7557 orders-manager DEBUG Polled domain '29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com' challenge validation status from 'https://acme-v02.api.letsencrypt.org/acme/chall-v3/8215809301/_F6PXA'. Status is: 200. Response body is '{"type":"dns-01","status":"invalid","error":{"type":"urn:ietf:params:acme:error:dns","detail":"DNS problem: NXDOMAIN looking up TXT for _acme-challenge.29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com - check that a DNS record exists for this domain","status":400},"url":"https://acme-v02.api.letsencrypt.org/acme/chall-v3/8215809301/_F6PXA","token":"q4kZhTvRFHOFvgm98m2EKG7SinF4kemRraIK56AjCRQ"}'
Oct 29 09:49:08 orders-manager-96c7888cf-d7557 orders-manager ERROR Polled domain '29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com' challenge validation - status is 'invalid'. response body: '{"type":"dns-01","status":"invalid","error":{"type":"urn:ietf:params:acme:error:dns","detail":"DNS problem: NXDOMAIN looking up TXT for _acme-challenge.29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com - check that a DNS record exists for this domain","status":400},"url":"https://acme-v02.api.letsencrypt.org/acme/chall-v3/8215809301/_F6PXA","token":"q4kZhTvRFHOFvgm98m2EKG7SinF4kemRraIK56AjCRQ"}'
Oct 29 09:49:08 orders-manager-96c7888cf-d7557 orders-manager ERROR Domain '29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com' challenge validation polling failed. Reason is: 'Certificate Manager was not able to process your request. Domain validation failed, check your DNS configuration.'
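For anyone reading the polled response, the status and error fields can be pulled out of the body as below. This is only a minimal Python sketch; `challenge_outcome` is a hypothetical helper, and the body in the usage example is copied from the polling log line above.

```python
import json
from typing import Optional, Tuple


def challenge_outcome(body: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (status, error type, error detail) from a challenge response body."""
    doc = json.loads(body)
    error = doc.get("error") or {}  # pending or valid challenges carry no error
    return doc.get("status", ""), error.get("type"), error.get("detail")


# Body copied from the polling log line above.
body = (
    '{"type":"dns-01","status":"invalid","error":{"type":"urn:ietf:params:acme:error:dns",'
    '"detail":"DNS problem: NXDOMAIN looking up TXT for '
    '_acme-challenge.29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com'
    ' - check that a DNS record exists for this domain","status":400},'
    '"url":"https://acme-v02.api.letsencrypt.org/acme/chall-v3/8215809301/_F6PXA",'
    '"token":"q4kZhTvRFHOFvgm98m2EKG7SinF4kemRraIK56AjCRQ"}'
)

status, error_type, detail = challenge_outcome(body)
is_nxdomain = detail is not None and "NXDOMAIN" in detail
print(status, error_type, is_nxdomain)
```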
In the logs, we can see that our server finds the TXT record. After finding it, we wait 40 seconds and then post the request to LE to validate the challenge.
Even though we find the TXT record, LE responds with the error "NXDOMAIN looking up TXT for _acme-challenge.29102012.preprod.e2e.certificate-manager.test.cloud.ibm.com - check that a DNS record exists for this domain".
So I wonder: what are we doing wrong, given that we find the TXT record but LE doesn't? What should be done to prevent these errors?
Thanks for the help!