Sudden urn:ietf:params:acme:error:unauthorized Errors Occurring

Hi all! My company has a platform where customers can purchase custom domains for their shop. We utilize a "Cert Service" with the AcmePHP client to request certificates. This process is triggered at the time of domain purchase or from a cronjob to update expiring certs.

About a week and a half ago, we noticed an uptick in failed authorizations (see example output below). Our service is not in active development and has been working properly since the ACME v1 deprecation. I'm curious if anyone else has experienced this type of issue before.

My domain is: woollastudio.com but we have many others that are also getting an invalid authorization.

I ran this command: Automated task that utilizes AcmePHP to order, authorize, and issue certs.

It produced this output: Invalid response from https://woollastudio.com:443/.well-known/acme-challenge/dgSHTK5GET4_5S3VY9k_bBbPGs3mBbh4HNycLH_dxTs: 400
Note: our cronjob is still active and may result in this link becoming invalid. I can always provide an updated acme-challenge URL as needed.

My web server is (include version): Apache 2.4.6

The operating system my web server runs on is (include version): CentOS 7

I can login to a root shell on my machine (yes or no, or I don't know): yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): no

1 Like

Welcome to the community @jsabada

Hmmm. I get a proper http 200 OK response for that URL. The headers and data look reasonable.

But I do see some odd stuff. Maybe this will further the diagnosis.

Your server is currently sending a cert from Digicert that looks related to Etsy. Is that intended?

And, an attempt at a faulty URL also gets http 200 OK (should get 404 Not Found):

curl -iLk http://woollastudio.com/.well-known/acme-challenge/ForumTest

HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Referrer-Policy: no-referrer
Location: https://woollastudio.com:443/.well-known/acme-challenge/ForumTest
Content-Length: 262
Date: Wed, 04 May 2022 18:02:32 GMT

HTTP/1.1 200 OK
Date: Wed, 04 May 2022 18:02:32 GMT
Server: Apache
Expires: Thu, 19 Nov 1981 08:52:00 GMT
(many other headers removed for brevity)

Token in url does not match pending challenge token

Note the "Token in url ..." is the data returned from your server for what should be a 404 Not Found.

Neither of these explains the http 400 in your error message. Is there more detailed debug info available from that ACME client?

4 Likes

This probably doesn't make a difference one way or another, since HTTP-01 challenge validation ignores certificates.

Are you running any security software that automatically blocks certain IPs based on activity patterns, or third-party rule lists? E.g. fail2ban?

If you grep for one of the failed challenge URLs in your Apache access logs and error logs, what do you see?
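
As a rough illustration of that log search, here is a minimal Python sketch that pulls ACME challenge requests and their status codes out of access log lines. It assumes Apache's common or combined log format; the function name `find_challenge_hits` is made up for this example, and any sample log lines you feed it are your own.

```python
import re

# Matches the quoted request line plus the status code that follows it, e.g.
#   "GET /.well-known/acme-challenge/<token> HTTP/1.1" 400
# Assumption: common/combined log format, where the status follows the request.
CHALLENGE_RE = re.compile(
    r'"[A-Z]+ /\.well-known/acme-challenge/(?P<token>[A-Za-z0-9_-]+)'
    r'[^"]*" (?P<status>\d{3})'
)


def find_challenge_hits(lines, token=None):
    """Return (token, status) pairs for challenge requests found in log lines.

    If `token` is given, only requests for that exact token are returned.
    """
    hits = []
    for line in lines:
        match = CHALLENGE_RE.search(line)
        if match and (token is None or match.group("token") == token):
            hits.append((match.group("token"), int(match.group("status"))))
    return hits
```

If the token never shows up in the output, the request probably did not reach Apache at all; if it shows up with a 400, Apache (or something behind it) produced that response.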

Does it work on staging?

3 Likes

Yes, but, they do acquire a Let's Encrypt cert under that domain name and one is even valid for another 10 days. It is odd they don't send it out. I should have been more explanatory with my last sentence so I am glad you brought this up.

You are probably right in thinking it is more likely a firewall problem. But the collection of odd things I saw pointed at their "custom domain platform" as a possible culprit.

4 Likes

Q1. Has the redirection changed recently?
Q2. Do you really need to redirect to HTTPS (and add :443 to the URL)?
Q3. Since it is Apache, have you verified there is no name:port overlap?
[apachectl -t -D DUMP_VHOSTS]

2 Likes

We've been digging into the security rules and haven't noticed anything that correlates with the number of errors we've been encountering, but we are still digging deeper.

Here is an example of one of the logs we're seeing:

Error with challenge for www.woollastudio.com:type="http-01" status="invalid" error="array ( 'type' => 'urn:ietf:params:acme:error:unauthorized', 'detail' => '130.211.40.170: Invalid response from https://www.woollastudio.com:443/.well-known/acme-challenge/OxHCbZVEmUuFUIhgN4zkrq4H8vS48Dwka56EN984dXw: 400', 'status' => 403,\n)" url="https://acme-v02.api.letsencrypt.org/acme/chall-v3/105068686346/6Pwp8Q" token="OxHCbZVEmUuFUIhgN4zkrq4H8vS48Dwka56EN984dXw" validationRecord="array ( 0 => array ( 'url' => 'http://www.woollastudio.com/.well-known/acme-challenge/OxHCbZVEmUuFUIhgN4zkrq4H8vS48Dwka56EN984dXw', 'hostname' => 'www.woollastudio.com', 'port' => '80', 'addressesResolved' => array ( 0 => '130.211.40.170', ), 'addressUsed' => '130.211.40.170', ), 1 => array ( 'url' => 'https://www.woollastudio.com:443/.well-known/acme-challenge/OxHCbZVEmUuFUIhgN4zkrq4H8vS48Dwka56EN984dXw', 'hostname' => 'www.woollastudio.com', 'port' => '443', 'addressesResolved' => array ( 0 => '130.211.40.170', ), 'addressUsed' => '130.211.40.170', ),\n)" validated="2022-05-04T15:43:57Z"
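
To make the redirect chain in an entry like this easier to read, the `'url'` values of the `validationRecord` can be pulled out in order. This is only a sketch that assumes the log keeps the `var_export`-style layout shown above; `redirect_chain` is a hypothetical helper, not part of AcmePHP.

```python
import re

# Each hop in the validationRecord is exported as: 'url' => '<value>'
URL_RE = re.compile(r"'url' => '([^']+)'")


def redirect_chain(log_entry):
    """Return the URLs fetched during validation, in the order they appear."""
    # Only look at the text after validationRecord= so that other fields
    # of the log entry (such as the challenge URL) are ignored.
    _, _, record = log_entry.partition("validationRecord=")
    return URL_RE.findall(record)
```

For the entry above this would give the `http://` URL on port 80 first, followed by the `https://...:443/` URL it was redirected to.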

Unfortunately, this service has been in the "Keep the Lights On" mode for quite some time and we do not have a staging instance.

3 Likes

If the cert issuance fails, the default wildcard cert is apparently used. I can only assume the original devs thought the default cert was better than no cert at all.

2 Likes

What I meant regarding staging was: if you try to validate against Let's Encrypt's staging environment, does it work?

Thanks for the logs! These look like ACME client logs. I'm also particularly interested in logs (both error and access) from your Apache instance.

Being slightly pedantic: If it were a traditional firewall, we'd probably see a "Timeout during connect" problem. I suspect this is at the web server level (Apache); though products that operate on the web server level are often called "Web Application Firewalls (WAF)".

4 Likes

Well, it's fine for Let's Encrypt but many browsers won't be happy if they access that domain name

2 Likes

A1. Redirection has not changed recently.
A2. It is our security policy to always redirect to HTTPS.
A3. There doesn't appear to be any overlap. This particular server is only used for this Cert Service to make the calls to Let's Encrypt.

1 Like

Have you been able to inspect your Apache access and error logs? It is an important clue whether we see the ACME challenge requests in them. For example, if they don't appear there, then something in front of that server is blocking the request.

Also, a Let's Debug test just now revealed a wrongly formatted CAA record. Is that new? The wrong format will prevent LE from issuing a cert to that domain (I think so anyway). Do you have this CAA record on the other domains?

2 Likes

Then there really seems to be even less reason to redirect the HTTP challenge requests to HTTPS.
It only delays what could have been dealt with then.

Further, you overlooked the "and add :443 to the URL".
https://any.site/
is technically equal to:
https://any.site:443/
but they may not be handled exactly equally (especially by systems outside your control).
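
To illustrate the point about the two forms being equal but not necessarily handled equally, here is a small Python sketch: a plain string comparison treats them as different, while stripping the scheme's default port makes them match. `strip_default_port` is a made-up helper name, and the sketch deliberately ignores userinfo in URLs.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def strip_default_port(url):
    """Return the URL with an explicit default port (80/443) removed."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literal: urlsplit drops the brackets, so put them back.
        host = f"[{host}]"
    # Keep the port only when it differs from the scheme's default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(parts.scheme):
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


a = "https://any.site/"
b = "https://any.site:443/"
print(a == b)                                          # naive comparison
print(strip_default_port(a) == strip_default_port(b))  # normalized comparison
```

Any code that compares or routes on the raw string rather than the normalized form could behave differently for the two URLs.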

3 Likes

I checked with curl -v and the client's headers don't mention the port number on which the connection took place (and this information also isn't expressed inside of the TLS protocol). I think it would be challenging to construct a situation in which https://example.com/ and https://example.com:443/ URLs produce detectably different behavior on the wire. (Maybe if a client is using an application-layer proxy and the user-agent passes the complete URL to that proxy. A reverse proxy or WAF on the server side will apparently not be able to detect this distinction, though.)

3 Likes

If I were to make a program (like a proxy or browser) where HTTPS defaults to use some other port...
Then any use of HTTPS forced to port 443 would fail.
Unless :443 was explicitly stripped off or ignored.
Which begs the question: Why have it there at all?

2 Likes

The port number is a TCP issue (and UDP, with http/3).

The webserver might know about the TCP details, but the webapp?

TLS, http, they're above in the stack.

2 Likes

Thank you all for the messages so far. We are working to get additional logging in the critical path to see what else we can glean from the failing process.

2 Likes

Don't forget your faulty CAA record. Let's Debug reports it as invalid which would block cert issuance. You have extra \" characters surrounding the value.

Does anyone know for sure those extra chars would block issuance? I am not confident enough in my Boulder code-reading to say for sure. Could that just be a bug in Let's Debug? tag @Osiris

4 Likes

The domain name in the value may only contain alphanumeric characters and hyphens, with dots between the labels (per RFC 8659). So quotes are not allowed.

It looks like Boulder only checks if the domain part of the value only contains these chars:

So yes, I would recommend removing those quotes from the value.

It's rather confusing that every example you can find on the internet puts quotes around the domain of the value, but you shouldn't do that manually.
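
For anyone who wants to sanity-check a CAA value before publishing it, here is a minimal Python sketch of the issuer domain name grammar as I read it in RFC 8659 (labels of letters and digits with inner hyphens, joined by dots). It is not Boulder's actual check, and `issuer_domain_is_valid` is a name invented for this example.

```python
import re

# label = (ALPHA / DIGIT) *( *("-") (ALPHA / DIGIT) )
LABEL = r"[A-Za-z0-9](?:-*[A-Za-z0-9])*"
# issuer-domain-name = label *("." label)
ISSUER_DOMAIN_RE = re.compile(rf"{LABEL}(?:\.{LABEL})*")


def issuer_domain_is_valid(value):
    """Check the domain part of a CAA issue/issuewild value.

    Anything after the first ';' (parameters) is ignored. An empty domain,
    which the RFC allows as "no CA may issue", is reported as False here
    because it does not name a CA.
    """
    domain = value.split(";", 1)[0].strip(" \t")
    return ISSUER_DOMAIN_RE.fullmatch(domain) is not None


print(issuer_domain_is_valid("letsencrypt.org"))    # bare domain
print(issuer_domain_is_valid('"letsencrypt.org"'))  # literal quotes in the value
```

The second call is the case from this thread: quote characters that end up inside the stored value itself, rather than just being zone-file syntax around it.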

5 Likes

You're correct. The :443 was causing an issue on our end. I'm curious where the port is being added in. Our codebase doesn't appear to be adding it anywhere. Could something have changed in the requests Let's Encrypt is making for the acme-challenges?
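
One way an explicit :443 can trip up an application, offered purely as an assumption since nothing in this thread establishes the exact failure or whether the validation requests carried the port in their Host header, is comparing a host string that includes the port against a stored domain name verbatim. A hedged Python sketch of a more tolerant comparison follows; `host_without_port` is a hypothetical helper.

```python
def host_without_port(host_header):
    """Return the lowercase hostname from a Host-style value, dropping any port."""
    host = host_header.strip().lower()
    if host.startswith("["):
        # IPv6 literal such as "[::1]:443": keep everything up to the bracket.
        end = host.find("]")
        return host[: end + 1] if end != -1 else host
    # Hostname or IPv4 address: the port, if any, follows the first colon.
    return host.split(":", 1)[0]


known_domains = {"www.example.com"}  # placeholder for the platform's domain list
print("www.example.com:443" in known_domains)                     # verbatim lookup
print(host_without_port("www.example.com:443") in known_domains)  # tolerant lookup
```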

3 Likes

Ahh yes. That was a mistake on my end as I was attempting different solutions in my initial investigation. I fixed that in the DNS record and the cert was issued properly.

3 Likes