Certbot renew fails even when the challenge HTTP request is working

I'm trying to renew the certificate for a domain we manage, but the challenge is failing even though I've verified that our server returns HTTP status 200 for the challenge URL. I tested this with a mock challenge file and by checking our server logs:

34.221.255.206 - - [18/Jan/2022:16:37:19 +0000]  "GET /.well-known/acme-challenge/29YzYLF7PuG9_RvhgCzYRkBcdyJYza4DVNPa7ZS-ihI HTTP/1.1" 200 87 "http://status.upstreampay.com/.well-known/acme-challenge/29YzYLF7PuG9_RvhgCzYRkBcdyJYza4DVNPa7ZS-ihI" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
3.142.122.14 - - [18/Jan/2022:16:37:19 +0000]  "GET /.well-known/acme-challenge/29YzYLF7PuG9_RvhgCzYRkBcdyJYza4DVNPa7ZS-ihI HTTP/1.1" 200 87 "http://status.upstreampay.com/.well-known/acme-challenge/29YzYLF7PuG9_RvhgCzYRkBcdyJYza4DVNPa7ZS-ihI" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
64.78.149.164 - - [18/Jan/2022:16:37:19 +0000]  "GET /.well-known/acme-challenge/29YzYLF7PuG9_RvhgCzYRkBcdyJYza4DVNPa7ZS-ihI HTTP/1.1" 200 87 "http://status.upstreampay.com/.well-known/acme-challenge/29YzYLF7PuG9_RvhgCzYRkBcdyJYza4DVNPa7ZS-ihI" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"

My domain is: status.upstreampay.com

I ran this command: certbot certonly --webroot -w /var/www -m letsencrypt@statuspal.io --agree-tos -d status.upstreampay.com

It produced this output:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer None
Cert is due for renewal, auto-renewing...
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for status.upstreampay.com
Using the webroot path /var/www for all unmatched domains.
Waiting for verification...
Challenge failed for domain status.upstreampay.com
http-01 challenge for status.upstreampay.com
Cleaning up challenges
Some challenges have failed.

IMPORTANT NOTES:
 - The following errors were reported by the server:

   Domain: status.upstreampay.com
   Type:   connection
   Detail: Fetching
   https://status.upstreampay.com/.well-known/acme-challenge/-AuMfLJNEnvC99s_RSgBAaHDyv8bYO-Cv1MAh0rvoWQ:
   Error getting validation data

   To fix these errors, please make sure that your domain name was
   entered correctly and the DNS A/AAAA record(s) for that domain
   contain(s) the right IP address. Additionally, please check that
   your computer has a publicly routable IP address and that no
   firewalls are preventing the server from communicating with the
   client. If you're using the webroot plugin, you should also verify
   that you are serving files from the webroot path you provided.

My web server is (include version): nginx/1.20.1

The operating system my web server runs on is (include version): Ubuntu 20.04 LTS

My hosting provider, if applicable, is: DigitalOcean

I can login to a root shell on my machine (yes or no, or I don't know): yes

I'm using a control panel to manage my site (no, or provide the name and version of the control panel): no

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you're using Certbot): 1.0.0

I see the HTTP challenges have been redirected to HTTPS.
Can we have a look at the HTTPS vhost config (or both)?

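FYI, this isn't a "renew":
certbot certonly --webroot -w /var/www -m letsencrypt@statuspal.io --agree-tos -d status.upstreampay.com

This is:
certbot renew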

Thanks @rg305, but it looks like certbot is smart enough to start the renewal process on its own, right?

...
Cert is due for renewal, auto-renewing...
Renewing an existing certificate
...

I'll take it into account nevertheless. Going forward we'll use certbot renew instead.

And this is the full NGINX configuration, for both HTTPS and HTTP:

server {
    server_name status.upstreampay.com; # managed by Certbot

    listen [::]:443 ssl;
    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/status.upstreampay.com/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/status.upstreampay.com/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot

    location /.well-known/ {
        alias /var/www/.well-known/;
    }

    location / {
        proxy_pass http://statushq/status_pages/upstreampay$request_uri;
    }
}
server {
    if ($host = status.upstreampay.com) {
        return 301 https://$host$request_uri;
    } # managed by Certbot


    listen 80;
    listen [::]:80  ;
    server_name status.upstreampay.com;
    return 404; # managed by Certbot
}

If certbot follows redirects this should work, as seen with this test file:

http://status.upstreampay.com/.well-known/test/test-1

Where is that file locally?

Try placing a file in the expected challenge location:
http://status.upstreampay.com/.well-known/acme-challenge/test-2

That file is under /var/www/.well-known/test/test-1.

Now added file /var/www/.well-known/acme-challenge/test-2, with content Test OK.

You can test it at http://status.upstreampay.com/.well-known/acme-challenge/test-2.

Try changing:

    location /.well-known/ {
        alias /var/www/.well-known/;
    }

To:

    location /.well-known/acme-challenge/ {
        alias /var/www/.well-known/acme-challenge/;
    }

If that fails, then try placing that modified location block within the HTTP vhost config.

Thanks Rudy!

Can you explain why that change would make a difference? Judging by my test, the challenge should clearly work, unless the HTTP-to-HTTPS redirect is not being followed. That seems to be the case, since adding that location block to the HTTP vhost is what fixed the issue for me.

The question is, why did this start to fail now? It has been working like this for years. Did Let's Encrypt change something in how they perform their challenges? I was hoping to get an answer clarifying this.

LE does follow redirects (as do our browsers and curl), but where certbot decides to place the challenge file doesn't seem to match where the server looks for it.

But I have verified that the challenge file is indeed being placed in the right place.

And that LE is indeed able to fetch it via HTTP:

3.120.130.29 - - [24/Jan/2022:16:12:51 +0000]  "GET /.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA HTTP/1.1" 301 169 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
3.120.130.29 - - [24/Jan/2022:16:12:51 +0000]  "GET /.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA HTTP/1.1" 200 87 "http://status.databalance.eu/.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
34.221.255.206 - - [24/Jan/2022:16:12:52 +0000]  "GET /.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA HTTP/1.1" 301 169 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
64.78.149.164 - - [24/Jan/2022:16:12:52 +0000]  "GET /.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA HTTP/1.1" 301 169 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
3.19.56.43 - - [24/Jan/2022:16:12:52 +0000]  "GET /.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA HTTP/1.1" 301 169 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
34.221.255.206 - - [24/Jan/2022:16:12:52 +0000]  "GET /.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA HTTP/1.1" 200 87 "http://status.databalance.eu/.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"
64.78.149.164 - - [24/Jan/2022:16:12:53 +0000]  "GET /.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA HTTP/1.1" 200 87 "http://status.databalance.eu/.well-known/acme-challenge/BhNPKPsO0kfJYttiyo6nN32fYWm_z6HGsetwMiYmyhA" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)"

As you can see, it's hitting the URL over HTTP and being redirected to the HTTPS vhost (that's the only case where we 301 redirect, according to the NGINX configs I've provided).

So something else must be off :frowning:
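
One way to read a log excerpt like the one above is to group the status codes by validation client and look for any client that was redirected (301) but has no matching 200. The following is only a sketch: it assumes nginx's default combined log format, uses a shortened placeholder token (TOKEN), and the helper names are made up for illustration. A client showing up here is a lead to investigate, not proof of the cause, since the excerpt may be incomplete or the HTTPS vhost may log elsewhere.

```python
import re
from collections import defaultdict

# Matches the start of an nginx "combined" access log line (assumed format).
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\]\s+'
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)

CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def statuses_by_client(lines):
    """Collect the status codes each client IP received for challenge requests."""
    seen = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("path").startswith(CHALLENGE_PREFIX):
            seen[match.group("ip")].append(int(match.group("status")))
    return dict(seen)


def redirected_without_fetch(seen):
    """Return clients that got a 301 but never a 200 in the given lines."""
    return sorted(ip for ip, codes in seen.items()
                  if 301 in codes and 200 not in codes)


# Lines modelled on the excerpt above, with the token and user agent shortened.
SAMPLE = [
    '3.120.130.29 - - [24/Jan/2022:16:12:51 +0000]  "GET /.well-known/acme-challenge/TOKEN HTTP/1.1" 301 169 "-" "ua"',
    '3.120.130.29 - - [24/Jan/2022:16:12:51 +0000]  "GET /.well-known/acme-challenge/TOKEN HTTP/1.1" 200 87 "-" "ua"',
    '34.221.255.206 - - [24/Jan/2022:16:12:52 +0000]  "GET /.well-known/acme-challenge/TOKEN HTTP/1.1" 301 169 "-" "ua"',
    '3.19.56.43 - - [24/Jan/2022:16:12:52 +0000]  "GET /.well-known/acme-challenge/TOKEN HTTP/1.1" 301 169 "-" "ua"',
    '34.221.255.206 - - [24/Jan/2022:16:12:52 +0000]  "GET /.well-known/acme-challenge/TOKEN HTTP/1.1" 200 87 "-" "ua"',
]

if __name__ == "__main__":
    summary = statuses_by_client(SAMPLE)
    for ip, codes in summary.items():
        print(ip, codes)
    print("redirected but never fetched:", redirected_without_fetch(summary))
```

To use it on a real server, the lines would come from the access log of both vhosts rather than from SAMPLE.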

For the moment the only way I found to fix this is to return the challenge directly in the HTTP vhost:

server {
    listen 80;
    listen [::]:80  ;
    server_name domain.com;

    location /.well-known/ {
        alias /var/www/.well-known/;
    }

    location / {
        return 301 https://domain.com$request_uri;
    }
}

Not super happy with it, but at least it works. If somebody could point to a better solution, and/or explain why LE challenge fails when redirection is involved, that'd be greatly appreciated.

Not sure I understand your desire to use HTTPS for ACME challenge requests.
It's a cleartext conversation of an encrypted nature. Nothing is compromised.

It's not my desire. It's more that this has been working for years, it's the NGINX config provided by certbot, and I believe it still works on our other server. It just stopped working on this instance for some reason, and I wanted to understand why.

Also, we'd have to update the configs of all our existing sites, which are more than 200, so again I'd love to understand why, or even better, get it to work again :slight_smile:

I agree, but I do not see the cause either. What are the other error messages related to "error getting validation data"? There is usually some more info, maybe a clue. Update: look in /var/log/letsencrypt/letsencrypt.log

Also, I still see your HTTP requests redirecting to HTTPS. You said adding the location / alias to your HTTP server "fixed" this, but I do not see any change:

curl -I http://status.upstreampay.com/.well-known/acme-challenge/Forum123 

HTTP/1.1 301 Moved Permanently
Server: nginx/1.20.1
Date: Mon, 24 Jan 2022 19:58:42 GMT
Content-Type: text/html
Content-Length: 169
Connection: keep-alive
Location: https://status.upstreampay.com/.well-known/acme-challenge/Forum123

I'd update that (Certbot 1.0.0) and see if it helps.

Agree with Rudy that upgrading is worthwhile.

In addition to my comment about the redirection, I also want to note that using certonly --webroot is different from using renew.

The renew command relies on values in the /etc/letsencrypt/renewal config file to construct the certbot parameters for that domain. Using certonly --webroot does not.

Your nginx conf has "managed by Certbot" comments, and those come from using the --nginx plug-in. That plug-in makes temporary changes directly in the nginx conf to satisfy the HTTP challenge without creating challenge files. If that is what is in your /renewal/ conf for this domain, it could behave differently than certonly --webroot.

When using the renew command, the /var/log/letsencrypt/letsencrypt.log file shows the temporary nginx settings, which helps when odd problems result.
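
To see which plug-in a renewal would use, it can help to read the authenticator value out of the renewal conf. The sketch below is an illustration under assumptions: the sample text is a hypothetical, shortened renewal file for example.com (not taken from this thread), the function name is made up, and Python's configparser is used as a stand-in that does not handle nested sections the way Certbot's own reader would.

```python
import configparser

# Hypothetical, shortened renewal conf; real files live under
# /etc/letsencrypt/renewal/ and contain more keys.
SAMPLE_RENEWAL_CONF = """\
# renew_before_expiry = 30 days
version = 1.0.0
archive_dir = /etc/letsencrypt/archive/example.com
cert = /etc/letsencrypt/live/example.com/cert.pem

[renewalparams]
authenticator = nginx
installer = nginx
server = https://acme-v02.api.letsencrypt.org/directory
"""


def renewal_params(text):
    """Return the [renewalparams] section of a renewal conf as a dict."""
    parser = configparser.ConfigParser(interpolation=None)
    # The file starts with keys outside any section, which configparser
    # rejects, so a dummy section header is prepended.
    parser.read_string("[top]\n" + text)
    return dict(parser["renewalparams"])


if __name__ == "__main__":
    params = renewal_params(SAMPLE_RENEWAL_CONF)
    print("authenticator:", params.get("authenticator"))
    print("installer:", params.get("installer"))
```

If the authenticator printed for the real file is nginx rather than webroot, then certbot renew would not be exercising the same path as the certonly --webroot command above.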

The LE challenge is speaking http (unencrypted) and expects the reply to be http. Your redirect sends a reply via https (encrypted) which causes the challenge to fail. It's like the challenge is asked in English and the reply is returned in, say, Japanese. Oops! In other words, a plain text challenge is sent and a plain text reply is expected, not an encrypted reply. Hope this helps explain it.

I'd go farther and use

    location /.well-known/acme-challenge/ {
        root /var/www;
    }

I wouldn't do that!
Something more unique and a lot less able to reach anything else (including other sites) would be preferred.

root and alias work differently in nginx: root appends the full request URI to the given directory, while alias replaces the part of the URI matched by the location.

The advantage of using root inside a location /.well-known/acme-challenge/ block is that you can use the same path in certbot --webroot -w /path/to/dir and root /path/to/dir (which in this case is practically equivalent to alias /path/to/dir/.well-known/acme-challenge/, with the trailing slash).
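
The difference can be sketched in a few lines. This is a simplified model of how nginx maps a request URI to a file for a prefix location, not nginx's actual implementation, and the function names are made up for illustration.

```python
def resolve_root(root: str, uri: str) -> str:
    """root: the full request URI is appended to the root directory."""
    return root.rstrip("/") + uri


def resolve_alias(location: str, alias: str, uri: str) -> str:
    """alias: the location prefix in the URI is replaced by the alias path."""
    if not uri.startswith(location):
        raise ValueError("URI does not match the location prefix")
    return alias + uri[len(location):]


if __name__ == "__main__":
    uri = "/.well-known/acme-challenge/test-2"
    location = "/.well-known/acme-challenge/"

    # Same directory as certbot --webroot -w /var/www
    print(resolve_root("/var/www", uri))

    # Equivalent alias form; note the trailing slash on the alias
    print(resolve_alias(location, "/var/www/.well-known/acme-challenge/", uri))

    # Without the trailing slash the path segments run together
    print(resolve_alias(location, "/var/www/.well-known/acme-challenge", uri))
```

The first two calls resolve to the same file, which is why the root form lets the nginx directive and the -w argument share one path.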