Help: http-01 challenge fails with urn:acme:error:connection


#1

My domain is: theorganicstore.com.ar

I ran this command: We are using an ACME client in a PHP script that requests certificates for several domains. I’m using theorganicstore.com.ar as an example, but the issue happens with other domains too.

It produced this output: The response we’re getting when trying to solve the challenges is:

{  
   "identifier":{  
      "type":"dns",
      "value":"theorganicstore.com.ar"
   },
   "status":"invalid",
   "expires":"2019-01-22T20:16:49Z",
   "challenges":[  
      {  
         "type":"tls-sni-01",
          [...]
      },
      {  
         "type":"dns-01",
          [...]
      },
      {  
         "type":"http-01",         // This is the challenge we're aiming to solve
         "status":"invalid",
         "error":{  
            "type":"urn:acme:error:connection",
            "detail":"Fetching http://theorganicstore.com.ar/.well-known/acme-challenge/4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88: Timeout during connect (likely firewall problem)",
            "status":400
         },
         "uri":"https://acme-v01.api.letsencrypt.org/acme/challenge/myCaqTZqqxR4GPQjkg3d4vN0GMx1lWPc5rpnx8c7Odk/11502104564",
         "token":"4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88",
         "validationRecord":[  
            {  
               "url":"http://theorganicstore.com.ar/.well-known/acme-challenge/4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88",
               "hostname":"theorganicstore.com.ar",
               "port":"80",
               "addressesResolved":[  
                  "52.200.197.31"
               ],
               "addressUsed":"52.200.197.31"
            }
         ]
      },
      {  
         "type":"tls-alpn-01",
          [...]
      }
   ],
   "combinations":[  
      [2], [1], [3], [0]
   ]
}

My web server is: nginx/1.14.0, hosted on AWS.
The operating system my web server runs on is: Ubuntu 14.04.4 LTS
I can login to a root shell on my machine: yes

The important part of the output is this:

Fetching http://theorganicstore.com.ar/.well-known/acme-challenge/4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88: Timeout during connect (likely firewall problem)
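For anyone who wants to pull that detail out of the authorization object programmatically, here is a minimal Python sketch. The function name `failed_challenges` is hypothetical, and the sample is a trimmed copy of the response above that keeps only the http-01 challenge (the elided challenges are left out rather than guessed at).

```python
import json


def failed_challenges(authz):
    """Summarise every challenge in an authorization object whose status is 'invalid'."""
    failures = []
    for challenge in authz.get("challenges", []):
        if challenge.get("status") != "invalid":
            continue
        error = challenge.get("error", {})
        records = challenge.get("validationRecord", [])
        failures.append({
            "type": challenge.get("type"),
            "error_type": error.get("type"),
            "detail": error.get("detail"),
            # The address the validation server actually connected to, if recorded.
            "address_used": records[0].get("addressUsed") if records else None,
        })
    return failures


# Trimmed copy of the response above (http-01 challenge only).
SAMPLE = json.loads("""
{
  "identifier": {"type": "dns", "value": "theorganicstore.com.ar"},
  "status": "invalid",
  "challenges": [
    {
      "type": "http-01",
      "status": "invalid",
      "error": {
        "type": "urn:acme:error:connection",
        "detail": "Fetching http://theorganicstore.com.ar/.well-known/acme-challenge/4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88: Timeout during connect (likely firewall problem)",
        "status": 400
      },
      "validationRecord": [
        {"hostname": "theorganicstore.com.ar", "port": "80", "addressUsed": "52.200.197.31"}
      ]
    }
  ]
}
""")

for failure in failed_challenges(SAMPLE):
    print(failure["type"], failure["error_type"], failure["address_used"])
    print(failure["detail"])
```

Logging `address_used` next to the error on every failed attempt makes it easier to see whether the timeouts always involve the same IP.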

We’re trying to understand why this happens. This is what we’ve observed during our research:

  • Accessing http://theorganicstore.com.ar/.well-known/acme-challenge/4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88 works fine, and returns the expected answer.
  • The DNS is properly configured. There is an “A” record pointing to 52.200.197.31, and no IPv6 (AAAA) record.
  • The request never reaches our nginx logs (this is the strangest part), and we don’t have anything between our DNS and nginx. Our servers are in AWS.
  • The error is not permanent. After retrying once or twice, the certificate is issued without problems.
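Since the error is a timeout at connect time and only happens intermittently, one way to look for it is to repeat plain TCP connects to port 80 and record how each attempt goes. This is only a rough sketch: the function name `probe_connect` and its defaults are made up for illustration, and it measures the path from wherever it runs, not from Let's Encrypt's validation servers, so it is most useful when run from several outside networks.

```python
import socket
import time


def probe_connect(host, port=80, attempts=5, timeout=5.0, pause=1.0):
    """Try a TCP connect several times.

    Returns a list of (ok, info) tuples: info is the connect time in seconds
    on success, or a description of the error on failure.
    """
    results = []
    for i in range(attempts):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results.append((True, time.monotonic() - start))
        except OSError as exc:  # covers timeouts, refusals and DNS failures
            results.append((False, repr(exc)))
        if i < attempts - 1:
            time.sleep(pause)
    return results


# Example call (performs real network connections):
#   for ok, info in probe_connect("theorganicstore.com.ar"):
#       print("connected" if ok else "FAILED", info)
```

A failure here that never shows up in the nginx access log would be consistent with the packet being dropped before it reaches the web server.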

I would really appreciate some help to know more about the request made by LE, the response received, and why it fails randomly :slight_smile:

Thanks in advance!


#2

Hi @agustin-tiendanube

There are three things; one of them is critical.

X Nameserver Timeout checking EDNS512: e.dns.ar

One nameserver of com.ar doesn’t support EDNS512. This is not good, but not too big a problem.

The second is normally critical, but not now: com.ar has an invalid DNSSEC configuration.

Fatal error: DNSKEY 1474 signs DNSKEY RRset, but no confirming DS RR in the parent zone found. No chain of trust created.

Rechecked with another tool ( https://dnssec-analyzer.verisignlabs.com/com.ar ), same result.

But the third problem is relevant ( https://check-your-website.server-daten.de/?q=theorganicstore.com.ar ):

The redirect http -> https is OK. But then the connection is closed, which is bad.

Perhaps a DDoS detection mechanism.


#3

The file is there and “accessible” from the Internet (good news):

wget http://theorganicstore.com.ar/.well-known/acme-challenge/4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88
--2019-01-15 20:52:11--  http://theorganicstore.com.ar/.well-known/acme-challenge/4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88
Resolving theorganicstore.com.ar (theorganicstore.com.ar)... 52.200.197.31
Connecting to theorganicstore.com.ar (theorganicstore.com.ar)|52.200.197.31|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://theorganicstore.com.ar/.well-known/acme-challenge/4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88 [following]
--2019-01-15 20:52:11--  https://theorganicstore.com.ar/.well-known/acme-challenge/4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88
Connecting to theorganicstore.com.ar (theorganicstore.com.ar)|52.200.197.31|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88’

#more 4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88
4YL3podIpvjMBxCt4b3rNunDpVyOopwWaCAZ_QBZN88.q87g1C0ug-YSOf8yMGGRIGGwfK_uWGwRW-OU1Svzdzc


#4

The active nameservers do not match your NS records.

nslookup -q=ns theorganicstore.com.ar a.dns.ar
theorganicstore.com.ar nameserver = ns2.sitelutions.com
theorganicstore.com.ar nameserver = ns1.sitelutions.com
theorganicstore.com.ar nameserver = ns-1984.awsdns-56.co.uk
theorganicstore.com.ar nameserver = ns-466.awsdns-58.com

nslookup -q=ns theorganicstore.com.ar ns1.sitelutions.com
theorganicstore.com.ar nameserver = ns2.sitelutions.com
theorganicstore.com.ar nameserver = ns4.sitelutions.com
theorganicstore.com.ar nameserver = ns5.sitelutions.com
theorganicstore.com.ar nameserver = ns3.sitelutions.com
theorganicstore.com.ar nameserver = ns1.sitelutions.com

nslookup -q=ns theorganicstore.com.ar ns-466.awsdns-58.com
theorganicstore.com.ar nameserver = ns-1141.awsdns-14.org
theorganicstore.com.ar nameserver = ns-1984.awsdns-56.co.uk
theorganicstore.com.ar nameserver = ns-466.awsdns-58.com
theorganicstore.com.ar nameserver = ns-930.awsdns-52.net
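To make the mismatch explicit, the three answers can be compared as sets. The Python sketch below just hard-codes the names from the nslookup output above (it does not query DNS itself); the helper name `compare_ns_sets` is made up for this example.

```python
# NS sets copied from the nslookup output above, keyed by the server that was queried.
NS_ANSWERS = {
    "a.dns.ar (parent)": {
        "ns1.sitelutions.com", "ns2.sitelutions.com",
        "ns-1984.awsdns-56.co.uk", "ns-466.awsdns-58.com",
    },
    "ns1.sitelutions.com": {
        "ns1.sitelutions.com", "ns2.sitelutions.com", "ns3.sitelutions.com",
        "ns4.sitelutions.com", "ns5.sitelutions.com",
    },
    "ns-466.awsdns-58.com": {
        "ns-1141.awsdns-14.org", "ns-1984.awsdns-56.co.uk",
        "ns-466.awsdns-58.com", "ns-930.awsdns-52.net",
    },
}


def compare_ns_sets(answers):
    """For each queried server, list the names it omits relative to the union of all answers."""
    union = set().union(*answers.values())
    return {server: sorted(union - names) for server, names in answers.items()}


for server, missing in compare_ns_sets(NS_ANSWERS).items():
    print(f"{server} omits: {', '.join(missing) or 'nothing'}")
```

If the delegation and both providers agreed, every server would omit nothing.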


#5

Hi Juerguen! I’ve been considering DDoS protection as a possibility, mainly because it would explain why the request never reaches the nginx server. However, we didn’t set up any such tool, and the volume of requests is pretty low for that to happen. Anyway, I’ll research that possibility further.

Thanks for your help, and for pointing me to those tools, which I didn’t know about :slight_smile:


#6

Hi rg305! Yes, we keep two DNS providers for each domain, to avoid SPoFs. Do you think that might be generating issues?

Thanks for your help!


#7

They have different SOA records so it is difficult to say if both are actually synchronized properly.
And they don’t agree on which servers are authoritative.

I can’t be sure if that is causing any problem with the current situation.

I’m thinking Geo-location blocking maybe?


#8

Any chance you have some of Let’s Encrypt’s IPs blocked in a firewall?


#9

@rg305 I’ll include geo-location blocking in my questions to the AWS team, because our team hasn’t set up anything like that.

@mnordhoff I’ve checked that, and we do not :confused: If anything like that is happening, it might be on the AWS side. I’ll double-check that in the AWS console. Thank you for your help!


#10

We’re still experiencing this issue. Our current workaround is to retry with a backoff strategy, so that we avoid hitting rate limits.
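For reference, the general shape of such a retry loop is sketched below in Python (the actual client is a PHP script, so this is an illustration of the idea, not our code). `request_certificate`, `ChallengeFailed`, and the delay values are all hypothetical; the delays in particular are not taken from any Let's Encrypt guidance.

```python
import random
import time


class ChallengeFailed(Exception):
    """Raised by the (hypothetical) issuance callable when validation fails."""


def issue_with_backoff(request_certificate, max_attempts=4, base_delay=30.0,
                       max_delay=600.0, sleep=time.sleep):
    """Call request_certificate() until it succeeds or max_attempts is reached.

    The wait doubles after each failure (30s, 60s, 120s, ... capped at max_delay),
    with up to 10% random jitter so several domains do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return request_certificate()
        except ChallengeFailed:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last failure
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Capping the number of attempts matters as much as the delay, since every failed validation counts against the rate limits.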

We’ve checked DDoS attack protection, and other kinds of filters, and couldn’t find any step in the middle that might be closing connections before our nginx servers.

Just as a reminder, the error message is:
Timeout during connect (likely firewall problem)

Thanks for all the help o/


#11

It’s the same picture. Checking https + /.well-known/acme-challenge with the tool gives a

ConnectionClosed - The request was aborted: The connection was closed unexpectedly.

while loading the site manually gives

La página no existe (“The page does not exist”) + 404


#12

Seems like it allows/denies based on client header information?
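One way to test that idea is to request the same URL with different User-Agent headers and compare the outcomes. The Python sketch below does that; the header values are assumptions chosen for illustration (in particular, the "validator-like" string is what I believe the Let's Encrypt validation server sends, but treat it as unverified), and `build_request` and `probe` are hypothetical helper names. Note that a timeout during connect happens before any header is sent, so this can only explain the closed connections, not the original error.

```python
import urllib.error
import urllib.request

# Assumed User-Agent values, for illustration only.
USER_AGENTS = {
    "browser-like": "Mozilla/5.0 (X11; Linux x86_64)",
    "wget-like": "Wget/1.19",
    "validator-like": "Mozilla/5.0 (compatible; Let's Encrypt validation server; "
                      "+https://www.letsencrypt.org)",
}


def build_request(url, user_agent):
    """Build a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe(url, timeout=10.0):
    """Fetch url once per User-Agent; return {label: HTTP status or error text}."""
    outcomes = {}
    for label, user_agent in USER_AGENTS.items():
        try:
            with urllib.request.urlopen(build_request(url, user_agent),
                                        timeout=timeout) as response:
                outcomes[label] = response.status
        except urllib.error.HTTPError as exc:  # server answered with 4xx/5xx
            outcomes[label] = exc.code
        except OSError as exc:  # URLError, timeouts, connection resets
            outcomes[label] = repr(exc)
    return outcomes


# Example call (performs real network requests):
#   print(probe("https://theorganicstore.com.ar/.well-known/acme-challenge/test"))
```

If the outcomes differ only by header, something in front of nginx is filtering on it.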