Must run SlickStack install twice to generate Let's Encrypt certs

On new servers I can run ss-install and setup completes fine. The only problem is that the Let's Encrypt certs are not issued, failing with an "unauthorized" error.

After running ss-install again the errors disappeared and Let's Encrypt certs generated fine. Why??

The webroot method is used, and /.well-known/ is set to allow all requests.

## well-known ##
location ^~ /.well-known/ {
    allow all;
    auth_basic off;
    default_type "text/plain";
    try_files $uri =404;
}

Cloudflare is active during installation. Is this a problem with IPv6 or something? But then why do the certs generate fine after running the install again?

@zocuf Welcome to the community

I moved your post to the Help category. Had you posted there first you would have been shown the questions below. Please answer as best you can

=============================

Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. crt.sh | example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.

My domain is:

I ran this command:

It produced this output:

My web server is (include version):

The operating system my web server runs on is (include version):

My hosting provider, if applicable, is:

I can login to a root shell on my machine (yes or no, or I don't know):

I'm using a control panel to manage my site (no, or provide the name and version of the control panel):

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot):

Hello,

This appears to be because Certbot runs webroot verification tests over HTTP by default. Because SlickStack is HTTPS-only with HSTS enabled by default, the verification fails initially.

Here's what I commented on GitHub:

This is strange, because unless the Certbot team has carelessly written their error messages (unlikely) then it means the verification tests for both example.com and www.example.com are trying to load from http://example.com which doesn't make very much sense to me... I would expect separate domain verifications.

And because SlickStack runs over HTTPS by default and has HSTS enabled by default, the HTTP verification is going to fail which means we need a way to tell Certbot to run the tests over https://... instead I think.

@jessuppi Note that Certbot does not do the validation, the ACME server does. And the validation server does not care about HSTS. When using the http-01 challenge (in this case, via the webroot plugin), it always initiates the request using HTTP and follows redirects to HTTPS, if applicable, using the HTTP Location header.

Thanks for the clarification @Osiris

So it's not an HSTS problem, but do you know why the verification error message uses the same non-www version of the domain in both responses? I would expect it to spit out 2 distinct error messages (e.g. one for www and one for non-www version of the domain(s) that failed verification).

Certbot failed to authenticate some domains (authenticator: webroot). The Certificate Authority reported these problems:
  Domain: example.com
  Type:   unauthorized
  Detail: ... Invalid response from http://example.com/.well-known/acme-challenge/5vnlI6sdSN5ixd0467ij9wZgoaWr2NiS3dsmdmj54k4: 404

  Domain: www.example.com
  Type:   unauthorized
  Detail: ... Invalid response from http://example.com/.well-known/acme-challenge/1SgtlSd0B60jZWGy2LEUlHZ4jgBIhjouVeqH65OS44Q: 404

In other words, why doesn't it respond with:

Invalid response from http://www.example.com ... on the 2nd error message here?

Is that because it has already followed the 301 redirects (to the non-www version of the website)? If that's the case, I'm even more confused about why it's failing verification.

If it's already followed redirects, the error message should read: https://example.com because that is the final destination of all the redirects on this server.

The URI in the error message is the ultimate URI used which returns the error. Let's Encrypt follows HTTP redirects, so if the www subdomain gets redirected to the apex domain which ultimately fails, the only hostname in the error message would be the apex domain due to the redirect.

Apparently that's not the case for the path /.well-known/acme-challenge/.

Anyway, I'm not familiar with "SlickStack", but it might be due to how they have the webserver configured. Without details about "SlickStack" and its configuration, we're just guessing here.

Please answer the questionnaire posted by @MikeMcQ above.

If the verification script follows all 301 redirects, the ultimate URL should still contain https://, unless there's a bug or known ambiguity in the error messages.

I'm the lead dev for the SlickStack project, so I can confirm this happens on any new LEMP installs, regardless of the cloud provider, and it's on Ubuntu 22.04. I guess there is something in our Nginx configuration (or Cloudflare) that is conflicting with how Certbot runs, I'm just not sure what. Many of our users have also been having problems with OpenSSL and kernel errors on Ubuntu 22.04 and must reboot before their website frontend loads properly, so that could be related.

However, the 2nd time our SlickStack installer runs, Let's Encrypt always issues the certs properly (even without a reboot), which I assume means that our configuration works fine, but maybe the first time there's something not loading or redirecting properly in regard to HTTPS or something.

The fact that successive attempts work fine leads me to believe this is related to network/SSL caching and why I thought HSTS could be involved.

Unless the webserver did not redirect the challenge to HTTPS. We regularly advise users not to redirect the challenge to HTTPS at all if their webserver configuration doesn't work properly with the http-01 challenge once it is redirected to HTTPS.
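
For illustration, here is a minimal sketch of a port 80 server block that answers the challenge path directly over HTTP and redirects everything else to HTTPS. This is not SlickStack's actual configuration: the server names and the /var/www/html webroot are assumptions and would need to match whatever path is passed to Certbot's webroot plugin.

```nginx
# Hypothetical port 80 block; server names and paths are assumptions.
server {
    listen 80;
    listen [::]:80;
    server_name example.com www.example.com;

    # Serve http-01 challenge files over plain HTTP, with no redirect.
    location ^~ /.well-known/acme-challenge/ {
        root /var/www/html;          # assumed webroot; must match certbot's -w / --webroot-path
        default_type "text/plain";
        try_files $uri =404;
    }

    # All other requests still go to HTTPS.
    location / {
        return 301 https://example.com$request_uri;
    }
}
```

With a block like this, a failing challenge can no longer be blamed on the HTTPS side of the configuration, which narrows down where a 404 comes from.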

Most likely a bug in the webserver configuration. The code in Boulder is pretty simple. I trust Boulder's error messages completely regarding the URI presented.

Does SlickStack set up a port 443 listener with a self-signed cert before invoking certbot for the first time? Or are there only port 80 listen server blocks?

And after the nginx conf is adjusted the first time, is nginx restarted before invoking certbot? Adding or removing listen sockets needs a restart rather than just a reload.

The flow for Let's Encrypt and certbot webroot is straightforward. Certbot (the ACME client) places challenge data in the webroot folder and requests a cert from the LE ACME Server.
Multiple LE Servers send HTTP challenge requests (*) to the domain per the IP in DNS. They follow any redirects to http(80) or https(443). The Servers expect to see the proper challenge data returned by that request. If satisfied, the ACME Client is able to retrieve the cert from the LE Server. (obviously, some technical flows omitted)

If you had the port 443 listen block set up beforehand, a simple reload of nginx picks up a new cert.

It sounds like a default nginx server block could be handling the requests until the second time around. But, I know nothing about SS so it's just a vibe based on the trouble report. Hope this helps

(*) A URL like: http://(domain)/.well-known/acme-challenge/SampleTokenName
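
To make the default-server guess above concrete, here is a hypothetical sketch of a leftover catch-all block. The root path is an assumption, and nothing here is confirmed to be present on a SlickStack install.

```nginx
# Hypothetical leftover default site; path is an assumption.
server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name _;

    # A different directory from the one certbot writes challenge files into.
    root /var/www/default;

    location / {
        try_files $uri $uri/ =404;
    }
}
```

If the site's own port 80 block were not active yet on the first run, requests for http://example.com/.well-known/acme-challenge/... would land in a block like this one and return 404, because the token file lives under a different root. Running nginx -T just before certbot shows which server blocks are actually loaded.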

Yes this is exactly how it works -- first, SlickStack generates a self-signed OpenSSL cert, installs Nginx configuration that points to that self-signed cert and restarts Nginx. Then, the Certbot script runs and attempts (and fails) to generate the Let's Encrypt certs, and the rest of our installer finishes.

On the 2nd attempt, it always works fine... I have carefully reviewed permissions and tested all different kinds of Nginx location rules for the /.well-known/ and so forth, but there's nothing in our installer that I can find that should be running earlier in the sequence or anything.

In our default Nginx server block, we have a "catch-all" that 301 redirects all HTTP requests and non-matching domain names to the HTTPS block.

The error messages make me think something is not redirecting the first attempt correctly...

Hmm.

What is the actual command you use for the nginx restart?

Can you run nginx -T just before certbot the first time? I'd be happy to take a look.

What you describe sounds fair but there are two symptoms. There is nothing inherently wrong with your www domain redirecting to the apex. Apart from you not knowing why :)

The bigger problem is why the 404. Probably one thing explains them both.
