SSL certificate is not getting issued for a particular domain

Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. crt.sh | example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.

This post is being made by SFMC Account Engagement support on behalf of a customer - RealPage, Inc.

My domain is: realpage.com (root domain)
The SSL certificate is to be issued for the subdomain success.realpage.com

I ran this command: Subdomain is added to Salesforce Marketing Cloud Account Engagement application as a tracker domain. When we try to issue the domain a certificate, we get the following error:
{"type":"urn:ietf:params:acme:error:dns","detail":"DNS problem: query timed out looking up CAA for realpage.com","instance":"","code":400}
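For anyone reading the error programmatically: the body above is a JSON problem document, and the short error kind is the last segment of the `type` URN. A minimal Python sketch of pulling out the relevant fields follows. The helper name `summarize_acme_error` is hypothetical and is not part of any Salesforce or Let's Encrypt tooling; the input is the error text quoted above.

```python
import json


def summarize_acme_error(body: str) -> dict:
    """Split an ACME problem document into its short error kind, detail and code."""
    problem = json.loads(body)
    # ACME error types are URNs such as urn:ietf:params:acme:error:dns;
    # keep only the last segment ("dns") as the short kind.
    kind = problem.get("type", "").rsplit(":", 1)[-1]
    return {
        "kind": kind,
        "detail": problem.get("detail", ""),
        "code": problem.get("code"),
    }


# The error body exactly as reported in this post.
error_body = (
    '{"type":"urn:ietf:params:acme:error:dns",'
    '"detail":"DNS problem: query timed out looking up CAA for realpage.com",'
    '"instance":"","code":400}'
)

summary = summarize_acme_error(error_body)
print(summary["kind"], "-", summary["detail"])
```

The `dns` kind together with the detail text indicates that the failure happened during the CAA lookup for realpage.com, before any certificate could be issued.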

It produced this output: SSL error, certificate is not getting issued. The root domain does not have a CAA record, so the certificate should be issued. However, since the query timed out, no CAA check can be completed on the root domain, which results in the SSL error.

My web server is (include version): Not applicable (customer provided answer)

The operating system my web server runs on is (include version): Not applicable (customer provided answer)

My hosting provider, if applicable, is: Not applicable (customer provided answer)

I can login to a root shell on my machine (yes or no, or I don't know): Yes (customer provided answer)

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): No (customer provided answer)

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): Not answered

Hi @Aritra, and welcome to the LE community forum :slight_smile:

Beyond the CAA problem, there is also an unexpected redirect:

curl -Ii http://success.realpage.com/.well-known/acme-challenge/TEst_File-1234
HTTP/1.1 302 Found
Date: Wed, 13 Sep 2023 19:25:58 GMT
Content-Type: text/html; charset=UTF-8
Connection: keep-alive
set-cookie: pardot=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0
location: http://www.realpage.com   <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
cache-control: max-age=63072000
expires: Fri, 12 Sep 2025 19:25:58 GMT
vary: User-Agent
Server: PardotServer
X-Pardot-Route: e8229a0ff18ebffc83a98010d2521dd5

See:
success.realpage.com | DNSViz

Thank you for sharing the insights. One more question: the customer is asking whether Let's Encrypt has an IP address for them to whitelist. If yes, could you provide the IP? If not, are there any alternative solutions?

Hello! I'm working with Aritra on this case for Account Engagement (formerly Pardot)!

I believe the redirect you mentioned is expected, based on how we manage what we refer to as our "tracker domains". I'll admit, the DNSViz link you posted is a bit above my head but, from what we're seeing in our system, their DNS is properly configured. I don't suspect any issues with their CNAME setup as is.

Ultimately, while we're trying to issue the cert for success.realpage.com, it seems that the timeout is happening when LE is trying to query the root realpage.com for a CAA.

The DNS server needs to be globally reachable in order for Let's Encrypt to confirm that you actually control the domain name. Let's Encrypt checks from multiple vantage points, with regularly changing IPs, to ensure that the name is controlled the same way from everywhere on the internet.

Well, it doesn't look properly configured from DNSViz (or from Let's Encrypt's vantage points). The DNS server doesn't need to have a CAA record, but it needs to be able to properly say that it doesn't have one, rather than giving an error (including DNSSEC errors) or not replying. See CAA Errors in the documentation.

Thanks Peter!

So, this is where things start to get sticky...we're aware that there is SOMETHING in the DNS which is causing the timeout. I indeed reviewed the documentation you posted, and we supplied our customer (who owns and controls the domain) with this specific paragraph:

Sometimes CAA queries time out. That is, the authoritative name server never replies with an answer at all, even after multiple retries. Most commonly this happens when your nameserver has a misconfigured firewall in front of it that drops DNS queries with unknown qtypes. File a support ticket with your DNS provider and ask them if they have such a firewall configured.

Unfortunately, they won't make any adjustments unless we can figure out exactly what adjustments need to be made. Would you be able to tell from here what realpage.com needs to adjust in their DNS configuration to allow LE to see there is no CAA record?

That seems like a good place to start:

This is another place to improve:
They should use DNSSEC [GoDaddy supports it - and they have (a possibly outdated) signed zone]:

I noticed those errors...and I compared them to another customer who has a functioning SSL cert, and they definitely don't have those errors...so I'm with you that this is where our customer wants to start.

However, I'm not exactly sure what these errors mean, nor why they would cause LE to time out when querying for a CAA when I can get a response via a direct dig command or by using xnnd:

If the issue were due to something globally in the DNS, wouldn't all services time out when querying?

I think the LE failure is more related to a (very) "bad" CAA response than "no response".

A "quick-fix" might be for them to unsign the zone [removing all DNSSEC].

Gotcha.

So...at the end of the day, you'd feel confident saying that the solution would lie within the DNS system for realpage.com, and not anything wrong with LE or Salesforce? (as the intermediary...the customer seems to feel we are the ones at fault here).

It does look like it to me, yes. Or, at least, clearing up the DNSSEC errors would remove some of the noise and make it easier to diagnose what's really going on. It may be possible to work around it by having a CAA record at the success.realpage.com level, but since that chain of CNAMEs ends at an AWS name, that might be tricky for you to do in practice.
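To illustrate why a CAA record at the subdomain level could sidestep the lookup at realpage.com: under the CAA specification (RFC 8659), the CA checks the requested name first and then each parent in turn, stopping at the first name that returns a non-empty CAA set. The sketch below only lists the names in the order they would be checked. It makes no DNS queries and does not model the CNAME chain mentioned above, and the function name `caa_lookup_order` is hypothetical.

```python
from typing import List


def caa_lookup_order(fqdn: str) -> List[str]:
    """Return the names checked for CAA, from the full name up to the TLD."""
    # Normalise: drop a trailing root dot and lowercase the name.
    labels = fqdn.rstrip(".").lower().split(".")
    # Strip one leading label at a time to climb toward the root.
    return [".".join(labels[i:]) for i in range(len(labels))]


order = caa_lookup_order("success.realpage.com")
print(order)
```

If the first name in that list answered with its own CAA record, the lookup would stop there and realpage.com would not need to be queried for this certificate.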

Another workaround could be to try another CA. (There are several free ones using ACME besides Let's Encrypt now.) I wouldn't necessarily expect it to work any better, but it might give you some different diagnostics, and could help convince people that it was really on the DNS side of things.

Another data point, though it may be a red herring: when I use Unboundtest (a service set up similarly to how Let's Encrypt resolves names) to try to get the CAA record, it does get the NOERROR empty CAA response, but (1) it seems to take longer than I would expect, and (2) it has some entries reading error: read (in tcp s): Connection reset by peer. Now, the output is even harder to parse through than DNSViz's output (which I know is saying something), but I think it may be that the DNS server (or some firewall in front of it) is seeing a bunch of requests coming in at once and blocking at least some of them. Let's Encrypt does check from multiple places all at once, which to overly aggressive firewalls sometimes looks like some sort of attack to be blocked. That might not be what's happening here, but it may be something else to look into.

That's the wrong domain.

The issue is likely that we're blocking one or a few of the source IP addresses. We do block some specific offenders, but without knowing the source IPs or even regions, we can't prevent them from being blocked. For all we know, these tests could be coming out of Russia.
There are no CAA records, and multiple online sources have no issues checking it, so this is definitely something unique to LE.
https://www.entrust.com/resources/certificate-solutions/tools/caa-lookup
https://www.nslookup.io/domains/realpage.com/dns-records/caa/

It's not unheard of to block IP addresses from accessing a DNS server and the expectation shouldn't be to leave a DNS server completely open to the entire DNS. There are multiple ways of protecting a DNS server from DDoS, but without a whitelist this could pop up at any time.

As long as you are still publishing invalid DNSSEC, nothing else is relevant. I would focus on the elephant in the room before getting distracted by some flies.

I'm sorry, but that's not correct. DNSSEC only matters if the client is enforcing it. In this case LE would be the client, and they would receive a response with AD=0 instead of a timeout.
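To check what a given resolver actually returned, the AD bit and the response code can both be read from the 16-bit flags field of the DNS message header (RFC 1035, with the AD and CD bits defined in RFC 4035). A minimal sketch follows. The function name `decode_dns_flags` is hypothetical, and the flag values in the usage lines are illustrative, not captured from realpage.com.

```python
def decode_dns_flags(flags: int) -> dict:
    """Extract the AD bit, CD bit and response code from DNS header flags."""
    # Only a few common response codes are named; others are shown as numbers.
    rcodes = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}
    rcode = flags & 0x000F  # low four bits hold the RCODE
    return {
        "ad": bool(flags & 0x0020),  # Authenticated Data
        "cd": bool(flags & 0x0010),  # Checking Disabled
        "rcode": rcodes.get(rcode, str(rcode)),
    }


# Illustrative values: a response with AD set, and a SERVFAIL response.
validated = decode_dns_flags(0x81A0)
failed = decode_dns_flags(0x8182)
print(validated)
print(failed)
```

A timeout is a different case from both of these: no header comes back at all, so there are no flags to decode.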

It's a red herring. The signature has already been regenerated, but the CAA request still times out. LE needs to publish its source IPs.

Has the cert request been retried now that DNSSEC is fixed? HTTP cert requests for success.realpage.com are now passing Let's Debug tests, including the Let's Encrypt staging test. That is, there are no more DNS issues, at least in staging.

That's simply not going to happen. See the Let's Encrypt FAQ.
