Unable to remove obsolete TLS-SNI-01 from server


#1

Having problems disabling the obsolete TLS-SNI-01 on my webserver - couldn’t find any details on this specific error. I’m not an advanced Linux user so don’t want to poke around too much on a live system without advice. This server was set up a while ago so I’m rusty.

Note that I have Nginx running two domains, but the errors for the two domains are not exactly alike.

My domain is:
www.cosmicdan.com
www.wafassociation.org

I ran this command:
[Everything in the guide]
Specific error occurs during the renewal dry-run.

It produced this output:

root@localhost:/etc/letsencrypt/renewal# certbot renew --dry-run
Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/www.cosmicdan.com.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator nginx, Installer nginx
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for www.cosmicdan.com
Waiting for verification...
Cleaning up challenges
Attempting to renew cert (www.cosmicdan.com) from /etc/letsencrypt/renewal/www.cosmicdan.com.conf produced an unexpected error: Failed authorization procedure. www.cosmicdan.com (http-01): urn:ietf:params:acme:error:unauthorized :: The client lacks sufficient authorization :: Error reading HTTP response body: invalid byte in chunk length. Skipping.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/www.wafassociation.org.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator nginx, Installer nginx
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for www.wafassociation.org
Waiting for verification...
Cleaning up challenges
Attempting to renew cert (www.wafassociation.org) from /etc/letsencrypt/renewal/www.wafassociation.org.conf produced an unexpected error: Failed authorization procedure. www.wafassociation.org (http-01): urn:ietf:params:acme:error:unauthorized :: The client lacks sufficient authorization :: Invalid response from http://www.wafassociation.org/.well-known/acme-challenge/II7gCnFAN3EK9eR2RmHisyPY6Jn6fFTGKzH3NY-7bE4: "<!DOCTYPE html><html lang=\"en\" data-adblockkey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBANnylWw2vLY4hUn9w06zQKbhKBfvjFUCsdFlb6TdQhxb9RXWX". Skipping.
All renewal attempts failed. The following certs could not be renewed:
  /etc/letsencrypt/live/www.cosmicdan.com/fullchain.pem (failure)
  /etc/letsencrypt/live/www.wafassociation.org/fullchain.pem (failure)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates below have not been saved.)

All renewal attempts failed. The following certs could not be renewed:
  /etc/letsencrypt/live/www.cosmicdan.com/fullchain.pem (failure)
  /etc/letsencrypt/live/www.wafassociation.org/fullchain.pem (failure)
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates above have not been saved.)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
2 renew failure(s), 0 parse failure(s)

IMPORTANT NOTES:
 - The following errors were reported by the server:

   Domain: www.cosmicdan.com
   Type:   unauthorized
   Detail: Error reading HTTP response body: invalid byte in chunk
   length

   To fix these errors, please make sure that your domain name was
   entered correctly and the DNS A/AAAA record(s) for that domain
   contain(s) the right IP address.
 - The following errors were reported by the server:

   Domain: www.wafassociation.org
   Type:   unauthorized
   Detail: Invalid response from
   http://www.wafassociation.org/.well-known/acme-challenge/II7gCnFAN3EK9eR2RmHisyPY6Jn6fFTGKzH3NY-7bE4:
   "<!DOCTYPE html><html lang=\"en\"
   data-adblockkey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBANnylWw2vLY4hUn9w06zQKbhKBfvjFUCsdFlb6TdQhxb9RXWX"

   To fix these errors, please make sure that your domain name was
   entered correctly and the DNS A/AAAA record(s) for that domain
   contain(s) the right IP address.

My web server is (include version):
nginx/1.10.3 (Ubuntu)

The operating system my web server runs on is (include version):
Ubuntu 16.04.5 LTS

My hosting provider, if applicable, is:
Linode

I can login to a root shell on my machine (yes or no, or I don’t know):
Yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel):
No

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you’re using Certbot):
0.28.0 (seems the latest available for Ubuntu 16.04)

Any advice and troubleshooting steps appreciated - I don’t even know where to start. Thanks!

P.S. Was about to attach the debug log but I’m not sure if there is any sensitive information in there.


#2

Some additional details/background:

I have some pretty strong security features enabled in the Nginx config, as well as forced redirection from HTTP to HTTPS. Could this cause the problem (even though renewal has worked fine up until now)? I could share Nginx config but again, I’m concerned about sharing sensitive information so I’m not sure.


#4

Your two domains also appear to be hosted on completely different servers/networks.

It seems like www.wafassociation.org currently points to a domain parking page rather than the server where you’re running Certbot. Can you check its DNS settings?


#5

Oh dear, something is indeed wrong with wafassociation.org - I will have to contact my colleague to see if she’s done anything. Shouldn’t cosmicdan.com still work regardless though?

There’s nothing special about this nginx install as far as I can remember, but I do have a detailed site config for extra security - which worked fine up until now.

root@localhost:/etc/nginx/sites-available# ss -tlnp | grep -E ":(443|80)"
LISTEN     0      128          *:80                       *:*                   users:(("nginx",pid=9720,fd=12),("nginx",pid=9719,fd=12),("nginx",pid=8896,fd=12))
LISTEN     0      128          *:443                      *:*                   users:(("nginx",pid=9720,fd=10),("nginx",pid=9719,fd=10),("nginx",pid=8896,fd=10))
LISTEN     0      128         :::80                      :::*                   users:(("nginx",pid=9720,fd=13),("nginx",pid=9719,fd=13),("nginx",pid=8896,fd=13))
LISTEN     0      128         :::443                     :::*                   users:(("nginx",pid=9720,fd=11),("nginx",pid=9719,fd=11),("nginx",pid=8896,fd=11))

CosmicDan.com is a low traffic site, so it’s not 100% out of the question for me to poke around nginx site config - I’ve just no idea what to try.


#6

Yes, that domain should work.

I’m not sure why chunked encoding is even involved in this HTTP response.
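For context, a chunked body is a sequence of hexadecimal size lines, each followed by that many bytes of data. The Python sketch below is a minimal decoder written for illustration only; it is not the code the Let's Encrypt validator runs, and `decode_chunked` and the sample input are made up. It shows how any non-hex text sitting where a size line is expected produces an error of the "invalid byte in chunk length" kind.

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body; raise ValueError on a malformed size line."""
    out = bytearray()
    pos = 0
    while True:
        # Each chunk starts with "<hex size>[;extensions]\r\n".
        end = data.index(b"\r\n", pos)
        size_field = data[pos:end].split(b";", 1)[0].strip()
        try:
            size = int(size_field, 16)
        except ValueError:
            raise ValueError(f"invalid chunk length: {size_field!r}") from None
        pos = end + 2
        if size == 0:
            # A zero-size chunk marks the end of the body.
            return bytes(out)
        out += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF


print(decode_chunked(b"5\r\nhello\r\n0\r\n\r\n"))  # b'hello'
```

If the bytes after the headers do not start with a valid size line, for example because stray header text ended up in the body, the decoder fails on the very first line.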

Would you be able to run with --debug-challenges and post the URL it shows here on the forum, without resuming Certbot?

--debug-challenges    After setting up challenges, wait for user input
                      before submitting to CA (default: False)

Edit: The URL will take this form in the output of Certbot:

location = /.well-known/acme-challenge/A8YAdrLOS0AVEokgMWWgaEEzWOGOi3vaYULc69Jd54A{default_type text/plain;return 200 A8YAdrLOS0AVEokgMWWgaEEzWOGOi3vaYULc69Jd54A.mIHyn6_G0b0ij0wchQL9BGmu74wP5ZTXF3pGAgppaMk;} # managed by Certbot

#7

[NB: the other domain is resolved now, as expected they both have the same error]

Couldn’t see a location line so I ran with -v too (certbot renew --dry-run --debug-challenges -v); here’s the full output (apologies if this is a chore to read, I see it dumps all my nginx config too so hopefully something will catch your eye). Text is too long to post here and I’m not allowed to upload files, so here you go: https://www.cosmicdan.com/certbot_debug_output.txt

UPDATE: Spotting this in nginx logs:
2019/01/27 03:33:07 [error] 12525#12525: *48 access forbidden by rule, client: 123.243.135.43, server: www.cosmicdan.com, request: "GET /.well-known/acme-challenge/VoQWNQBOHP3JviMBgNZxIIDowqQa0pRILDfZLDZREEg HTTP/1.1", host: "www.cosmicdan.com", referrer: "http://www.cosmicdan.com/.well-known/acme-challenge/VoQWNQBOHP3JviMBgNZxIIDowqQa0pRILDfZLDZREEg"

So indeed, it is my security… but I don’t know why it suddenly broke or which specific rule is blocking it.


#8

Thanks! The log does contain the URL (starts with location = /.well-known).

Unfortunately I wanted to see that URL while Certbot was still paused during --debug-challenges, so I could visit it to see what’s wrong on the response level. But now the URL has been removed from the nginx config so it’s not possible.

I can suggest one experiment to try to work around this issue, and that’s to add this to the port 80 server block for your domain:

chunked_transfer_encoding off;

Regarding your access forbidden error, that will probably require diving into your custom .confs …


#9

Oh I understand. It did pause, but only for about 30 seconds; then it just continued on its own :\ Maybe a weird thing with my SSH client.

I’ll try that out, please check the update I made on the last post if you haven’t already (seems something in my config is denying letsencrypt server? I don’t know why or how though).


#10

That request was from me for a non-existent URL just to check your response headers - but the live requests for the real challenge URL shouldn’t be getting denied like that.

In any case, we need to get past the transfer-encoding error.

One further workaround you can try is sticking this inside each server{} block:

location ~ ^/\.well-known/acme-challenge/([-_a-zA-Z0-9]+)$ {
  default_type text/plain;
  return 200 "$1.P_3FvhH6--cVeEqEQmIfFTAuIKsxVOu2dH-z1KlugqA";
}

It’s hardcoded to your Let’s Encrypt account, but it would allow us to check whether nginx is doing something strange when it sends the response (without the complication of --debug-challenges).

You might also need to move:

return 301 https://www.cosmicdan.com$request_uri;

into a location block:

location / {
  return 301 https://www.cosmicdan.com$request_uri;
}

(so the server-level 301 doesn’t fire before the challenge location gets a chance to match).
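To make the intent of that location block concrete, here is a rough Python sketch of what it computes: the token is captured from the request path and the account thumbprint is appended after a dot. `expected_body` is a made-up helper name, and the thumbprint is the one from the suggested config, so it only applies to that one account.

```python
import re
from typing import Optional

# Thumbprint copied from the suggested config; it is specific to one ACME account.
THUMBPRINT = "P_3FvhH6--cVeEqEQmIfFTAuIKsxVOu2dH-z1KlugqA"

# The pattern from the nginx location, written as a Python regex.
CHALLENGE_PATH = re.compile(r"^/\.well-known/acme-challenge/([-_a-zA-Z0-9]+)$")


def expected_body(path: str) -> Optional[str]:
    """Return the body the location would send, or None if the path does not match."""
    match = CHALLENGE_PATH.match(path)
    if match is None:
        return None  # in nginx this request would fall through to "location /"
    return f"{match.group(1)}.{THUMBPRINT}"


print(expected_body("/.well-known/acme-challenge/some-token"))
print(expected_body("/index.html"))
```

Any path that does not match falls through to the catch-all location, which is where the HTTPS redirect would apply.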


#11

Before I try your edit with the location block, I did disable chunking and got a new error:

Attempting to renew cert (www.cosmicdan.com) from /etc/letsencrypt/renewal/www.cosmicdan.com.conf produced an unexpected error: Failed authorization procedure. www.cosmicdan.com (http-01): urn:ietf:params:acme:error:unauthorized :: The client lacks sufficient authorization :: Invalid response from http://www.cosmicdan.com/.well-known/acme-challenge/pJIgJSGsghcEq0b37VcFcnNgeSywhMK8aOERI3M9YWc: "Content-Encoding: gzip\r\n\r\n\x1f\ufffd\b\x00\x00\x00\x00\x00\x00\x03+\ufffd\ufffdL\ufffd\nv/N\ufffdHv-4H26\x0fKvK\ufffd\ufffdKO\r\ufffd,\ufffd\ufffd\ufffd\ufffdH\ufffdw\r\ufffd4\ufffd\ufffd\ufffd\fO\ufffd\v\ufffd7v+\ufffd\ufffd0\ufffd\ufffdM\x0eKu-t\r\ufffd\ufffdLs\vq,\ufffd\ufffd.\ufffd\b\ufffd/5J\ufffd\u042d2\ufffd\ufffd)M/t\x04\x00\u007f\ufffd\ufffd". Skipping.

OK, so I also added gzip off; in the same server block and now we’ve got this:

Attempting to renew cert (www.cosmicdan.com) from /etc/letsencrypt/renewal/www.cosmicdan.com.conf produced an unexpected error: Failed authorization procedure. www.cosmicdan.com (http-01): urn:ietf:params:acme:error:unauthorized :: The client lacks sufficient authorization :: The key authorization file from the server did not match this challenge [l805tNnOssTEwuIbuiqgV3tQxpvfhNZ-v2Sp0qC-InE.P_3FvhH6--cVeEqEQmIfFTAuIKsxVOu2dH-z1KlugqA] != [l805tNnOssTEwuIbuiqgV3tQxpvfhNZ-v2Sp0qC-InE.P_3FvhH6--cVeEqEQmIfFTAuIKsxVOu2dH-z1Klug]. Skipping.

OK, I added the location block you shared (to common-other_include which is included by all https blocks, and manually in the http/80 blocks) and the error is the same as the second one above (did not match).


#12

That very last error was almost the correct behavior. I think you may have cut off the final qA on the config I suggested?
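The two strings from that "did not match" error can be compared mechanically. The short Python sketch below copies both values from the error message; `missing_suffix` is just an illustrative helper name.

```python
from typing import Optional

# Both values are copied from the "did not match" error earlier in the thread.
EXPECTED = "l805tNnOssTEwuIbuiqgV3tQxpvfhNZ-v2Sp0qC-InE.P_3FvhH6--cVeEqEQmIfFTAuIKsxVOu2dH-z1KlugqA"
RECEIVED = "l805tNnOssTEwuIbuiqgV3tQxpvfhNZ-v2Sp0qC-InE.P_3FvhH6--cVeEqEQmIfFTAuIKsxVOu2dH-z1Klug"


def missing_suffix(expected: str, received: str) -> Optional[str]:
    """If received is a truncated copy of expected, return the part that was cut off."""
    if expected.startswith(received):
        return expected[len(received):]
    return None  # the strings differ in some other way


print(missing_suffix(EXPECTED, RECEIVED))  # qA
```

So the response is a clean prefix of the expected key authorization, short by exactly two characters at the end.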

But something really weird is definitely going on with your server. If I absolutely had to guess, I would blame the multi-line CSP header that you are sending. Something may be majorly mishandling that header, but it’s hard to tell with the extraordinarily unique symptoms you’re getting.


#13

Yeah, the config was copied correctly. Indeed it’s very strange, the final two characters ‘qA’ are being truncated. I seem to be really good at getting strange errors on Linux machines, it’s basically the only reason I have stuck with Windows on my daily machines haha!

Alright, last resort: process of elimination time then. I’ll go through and disable the various security features and hopefully find the cause.


#14

Correct sir, this is the problem! With it commented out, the dry run succeeded.

Thank you very much for your time, especially since I could have figured this out on my own if I wasn’t so afraid to break things more haha. Now I need to figure out how to get the CSP working, but that’s my problem :slight_smile:

Thanks again!


#15

That’s kinda wild. I think the nginx project would probably consider that a bug, since it is literally causing the response body to be corrupted.

You might try updating to a later version of nginx, since it might already be fixed. For what it’s worth, I tried the same CSP config on nginx 1.15.5 and it did not cause the same issue.


#16

Ah very good to know. I suppose I will go the whole nine yards and do a full server backup and update to 18.04 too, while I still only have ~1 visitor per day.

Thank you to all Let’s Encrypt staff and community :smiley:


#17

Follow-up for posterity:

Upgrading to the latest development version of nginx did not solve the problem. The final solution was to use a less aggressive CSP, as per this article’s example.

I don’t know how to exclude Let’s Encrypt servers from the CSP, if possible at all - but that could be a proper solution. We’re a little out of scope of this forum now but if anybody knows how I’d do this kind of thing I’d appreciate it.


#18

Can’t you just remove the newlines?

add_header content-security-policy "default-src 'self' https://*.$http_host https://*.google-analytics.com https://*.googleapis.com https://*.gstatic.com https://*.gravatar.com https://*.w.org data: 'unsafe-inline' 'unsafe-eval'; img-src *;" always;

My guess at the mechanics of the issue is that the newlines in the header make the client lose track of where the response body starts, so the body encoding (both chunked and gzip) ends up being applied in the wrong place. This can happen because HTTP/1 is a text-based request-response protocol, where line breaks are what separate the headers from the body.

Let’s Encrypt does not care about CSP one way or the other, it’s a totally opaque header to it. But it does care if the HTTP protocol gets corrupted.

So joining the lines together should perhaps eliminate the problem.
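One way to picture this guess in code: the Python sketch below is a hypothetical reconstruction, not the poster's server. It assumes that the quoted CSP value ended with a newline just before the closing quote (the thread never shows the original multi-line config), that the server copies the value into the response verbatim, and that the client accepts a bare LF as a line ending. `build_response`, `read_body` and the `default-src 'self';` policy are made up for illustration.

```python
# Key authorization copied from the error message earlier in the thread.
KEY_AUTH = b"l805tNnOssTEwuIbuiqgV3tQxpvfhNZ-v2Sp0qC-InE.P_3FvhH6--cVeEqEQmIfFTAuIKsxVOu2dH-z1KlugqA"


def build_response(csp_value: str, body: bytes) -> bytes:
    """Serialize a response, copying the header value verbatim (assumed server behaviour)."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        f"Content-Security-Policy: {csp_value}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def read_body(raw: bytes) -> bytes:
    """Lenient client: lines end at LF, an empty line ends the headers,
    then exactly Content-Length bytes are read as the body."""
    pos = 0
    length = 0
    while True:
        end = raw.index(b"\n", pos)
        line = raw[pos:end].rstrip(b"\r")
        pos = end + 1
        if not line:
            break  # blank line: the client believes the headers are over
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return raw[pos:pos + length]


clean = read_body(build_response("default-src 'self';", KEY_AUTH))
broken = read_body(build_response("default-src 'self';\n", KEY_AUTH))

print(clean == KEY_AUTH)  # True
print(broken.strip())     # ends in ...z1Klug, the final qA is lost
```

Under those assumptions the client sees a blank line one line too early, so the body it reads starts two bytes too soon and stops two bytes short. That would be consistent with the missing `qA` reported earlier, and joining the policy onto a single line removes the stray newline altogether.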


#19

Oh wow, you’re right - that’s the true solution. I didn’t even consider that nginx would be so… “literal”, to put it kindly. Well it is in double quotes, so I suppose this isn’t a bug in nginx but PEBCAK.

Thanks again!