Reissue of certificate fails

Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. crt.sh | example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.

My domain is:
http://portal.applebywestward.co.uk/

I ran this command:
Auto-renew and manual reissue (both in Plesk) fail

It produced this output:

My web server is (include version):
CPU Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz (2 core(s))
Version Plesk Obsidian v18.0.44_build1800220614.18 os_Ubuntu 20.04
OS Ubuntu 20.04.4 LTS
System Uptime: 203 day(s) 22:30

The operating system my web server runs on is (include version):
As above screenshot: Ubuntu 20.04.4 LTS

My hosting provider, if applicable, is:
Inside an Azure secure network managed by the parent company.

I can login to a root shell on my machine (yes or no, or I don't know):
Yes, from within a Citrix secure network

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):
Plesk: PLSK.08021520.0011

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):
I'm not sure what this is?

Additional notes:
We have had firewall issues in the past since these are controlled by the parent company. Perhaps this is an issue still?

When the challenge file is created, I can access it from a browser; however, both the reissue and the auto-renew from within Plesk fail.

The website and certificate were working fine until certificate auto-renew (in Plesk) was attempted.

Any help will be gratefully received.

Is your firewall from the Palo Alto brand? See Palo Alto firewall users with failing HTTP-01 challenges: enable "acme-protocol"

@MikeMcQ Wanna do some Palo Alto debugging? You had some nice tricks I believe to test this.

I have often applied them but I don't think I invented them (9peppe maybe?). Anyway, I don't see a "reset by peer" error anymore like they show in the first post. Maybe they fixed their firewall in the meantime?

Right now I see a failure in their redirects. An HTTP challenge request ultimately gets redirected to a WordPress login page. I don't know Plesk / WordPress well enough to suggest fixes, other than to show the failure below (various headers omitted for brevity):

curl -ikL http://portal.applebywestward.co.uk/.well-known/acme-challenge/SampleToken
HTTP/1.1 301 Moved Permanently
Server: nginx
Date: Wed, 22 Jun 2022 16:36:53 GMT
X-Redirect-By: WordPress
Location: https://portal.applebywestward.co.uk/

HTTP/2 302
server: nginx
x-redirect-by: WordPress
location: https://portal.applebywestward.co.uk/wp-login.php?redirect_to=https%3A%2F%2Fportal.applebywestward.co.uk%2F&wppb_referer_url=https%3A%2F%2Fportal.applebywestward.co.uk%2F

HTTP/2 200
server: nginx
set-cookie: wordpress_test_cookie=WP%20Cookie%20check; path=/; secure
vary: Accept-Encoding
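
For anyone wanting to repeat this check, here is a rough sketch of a helper that boils the header dump down to one line per hop. The function name `summarize_hops` is made up for this example; it only filters text that curl has already printed, and it assumes the header layout shown above.

```shell
# Hypothetical helper: reads raw response headers on stdin (for example from
# `curl -sIkL URL`) and prints one line per hop with the status code, the
# Location target and the X-Redirect-By value when those headers are present.
summarize_hops() {
  tr -d '\r' | awk '
    function emit(   out) {
      if (!seen) return
      out = status
      if (loc != "") out = out " -> " loc
      if (by != "")  out = out " (redirected by " by ")"
      print out
    }
    /^HTTP\//                       { emit(); status = $2; loc = ""; by = ""; seen = 1; next }
    tolower($1) == "location:"      { loc = $2 }
    tolower($1) == "x-redirect-by:" { by = $2 }
    END                             { emit() }
  '
}

# Usage (makes a real network request):
#   curl -sIkL "http://portal.applebywestward.co.uk/.well-known/acme-challenge/SampleToken" | summarize_hops
```

A challenge URL that is served correctly should show a single hop with a 200 (or a 404 for a made-up token), not a chain that ends at wp-login.php.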

Thank you, guys. Are you saying that a force redirect to the WP login page could be causing this? Not sure if I want to turn off that feature every time the certificate needs updating but I will give this a go. Thank you once again :slight_smile:

Perhaps you could make an exception for the path /.well-known/acme-challenge/ somehow.

Thank you. I have turned off the 'Private website' feature (which forces people to log in to view the site), but I'm getting the same problem:

When I copy and paste (into a browser) the 'problem' url that cannot be fetched, it loads ok:
http://portal.applebywestward.co.uk/.well-known/acme-challenge/mkNrtdjN7WnZQuHwKDJUJ6h8M4E7KQPTTanYFHke1jg

Oh, right. I can now reproduce the "reset by peer". Use the same curl requests I used earlier, but add the same user-agent string that the Let's Encrypt servers use. It will fail with reset by peer. We have seen this user-agent variation more frequently lately. It is very similar to an earlier problem with Palo Alto Networks firewalls, but we have not yet had that confirmed.

Can you check with your network support? If the firewall is Palo Alto brand, note that they made a change that added an Application Rule blocking "acme challenge". Maybe yours is like this, even with a slightly different symptom.

The request below fails, but it works once the -A (user-agent) option is removed:

curl -I http://portal.applebywestward.co.uk/.well-known/acme-challenge/ForumTest123  -A "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"

curl: (56) Recv failure: Connection reset by peer

Note the very first line of the very first reply:

Is your firewall from the Palo Alto brand?

Has that question been answered?

No. And so far, no one whose challenge fails ONLY because of the user-agent string on the acme-challenge request has said their firewall was Palo Alto Networks. It is very coincidental, but it is distinctly different from the earlier patterns. It would be nice to get this confirmed. I think this is the 3rd or 4th user-agent case I've worked on in just the past couple of weeks.

Ah, fantastic. I do not know the firewall brand. I will have to ask. What you are saying all makes sense :slight_smile:

Ok, I will check the firewall brand. In any case, it sounds like the acme-challenge path and/or the specific user-agent is being blocked.

Actually, it is the combination of the acme challenge URL AND the user-agent from the Let's Encrypt server (or even the Let's Debug test site) that gets blocked.

The acme challenge URL or that user-agent by themselves are not blocked.
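
To make that distinction easy to re-check after each firewall change, here is a rough sketch using three probes. The function names `probe` and `diagnose` are invented for this example and the 15 second timeout is arbitrary; the user-agent string is the one quoted earlier in this thread.

```shell
# User-agent string quoted earlier in this thread.
LE_UA="Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"

# Hypothetical helper: send one HEAD request and print curl's exit code.
# 0 means some HTTP response came back (even a 404); 56 is the receive
# failure that curl reports as "Connection reset by peer".
probe() {  # usage: probe URL [USER_AGENT]
  if [ -n "${2:-}" ]; then
    curl -sI -o /dev/null --max-time 15 -A "$2" "$1"
  else
    curl -sI -o /dev/null --max-time 15 "$1"
  fi
  echo "$?"
}

# Hypothetical helper: interpret the three exit codes.
diagnose() {  # usage: diagnose ACME_DEFAULT_UA_RC OTHER_PATH_LE_UA_RC ACME_LE_UA_RC
  if [ "$1" -eq 0 ] && [ "$2" -eq 0 ] && [ "$3" -eq 0 ]; then
    echo "no blocking observed"
  elif [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
    echo "blocked only when path and user-agent are combined (curl exit $3)"
  else
    echo "inconclusive: exits $1 $2 $3"
  fi
}

# Usage (makes real network requests):
#   base="http://portal.applebywestward.co.uk"
#   a=$(probe "$base/.well-known/acme-challenge/ForumTest123")
#   b=$(probe "$base/" "$LE_UA")
#   c=$(probe "$base/.well-known/acme-challenge/ForumTest123" "$LE_UA")
#   diagnose "$a" "$b" "$c"
```

Once the firewall rule is right, all three probes should return 0 and the last line should read "no blocking observed".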

It would be great if you find the cause and let us know any specifics (brand, setting, ...). It will make it easier for us to help others with same symptoms.

I've also been speaking to one of my IT colleagues, and he seems to think the parent company DOES use a Palo Alto firewall. I will do my best to have this confirmed tomorrow. Thank you once again.

I will do my best to relay any findings back in here. I know that sometimes the fixes are not recorded so I will do my best to let you all know :slight_smile:

I can confirm that the firewall is Palo Alto! We have a Teams meeting with the parent company tech team later this morning.

Parent company have apparently added a rule. However.....
(screenshot of the resulting error)

Hmmm. First, I still see the same problems with acme challenge requests being sent to your server (inbound) as I showed in post #8. Similarly, the Let's Debug test site still fails. Whatever rule they changed did not fix this problem.

Further, the problem you now show is one connecting outbound from your server to the Let's Encrypt server. The error can be tested using this command:

curl -I https://acme-v02.api.letsencrypt.org/directory

That can fail for any number of reasons and was not happening earlier. A request to this URL happens very early in the acme request sequence and would not even reach the point where the inbound request would fail with reset by peer. Your "could not resolve host" error points to a DNS lookup problem.

You should also try something like this to ensure your system can make any outbound requests at all:

curl -I https://google.com

If you can connect to Google (you should see an HTTP 301) but not to acme-v02, then your network group may now have a block on those outbound requests. But if you can't reach Google either, you have a more fundamental networking issue. In either case, another talk with your network group is needed.
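
If it helps when reporting back to the network group, the same decision can be written down as a small sketch. The function names `curl_rc_name` and `explain_outbound` are made up for this example; the exit-code meanings are curl's documented ones, and only a handful of common codes are listed.

```shell
# Hypothetical helper: map a few common curl exit codes to a short description.
curl_rc_name() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "could not resolve host (DNS)" ;;
    7)  echo "could not connect" ;;
    28) echo "timed out" ;;
    35) echo "TLS handshake failed" ;;
    56) echo "receive failure (e.g. reset by peer)" ;;
    *)  echo "curl exit $1" ;;
  esac
}

# Hypothetical helper: compare the result for a general site with the result
# for the Let's Encrypt API endpoint.
explain_outbound() {  # usage: explain_outbound GOOGLE_RC ACME_RC
  if [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
    echo "outbound OK"
  elif [ "$1" -eq 0 ]; then
    echo "only acme-v02 fails: $(curl_rc_name "$2")"
  else
    echo "general outbound failure: $(curl_rc_name "$1")"
  fi
}

# Usage (makes real network requests):
#   curl -sI -o /dev/null --max-time 15 https://google.com; g=$?
#   curl -sI -o /dev/null --max-time 15 https://acme-v02.api.letsencrypt.org/directory; a=$?
#   explain_outbound "$g" "$a"
```

A result of "only acme-v02 fails" points at a destination-specific block, while "general outbound failure" points at DNS or routing on the server itself.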

Thank you for this. I've also run some curl tests myself and sent them to the parent company network team. I'm seeing DENY in the results.

The Let's Debug is useful information too. Again, I'll pass this on.

"Could not resolve host" suggests a DNS issue.
