Issue with renewals after TLS-SNI-01


#1

I’m having trouble with renewals for some domains after removing TLS-SNI-01 by following the instructions in “How to stop using TLS-SNI-01 with Certbot”.

My domain is: best.mobiledispatch.me
(among others, but I’m assuming fixing this one will fix all, as they all respond the same way)

I ran this command (as root): ./certbot-auto renew --dry-run

It produced this output:

Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/best.mobiledispatch.me.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator apache, Installer apache
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for best.mobiledispatch.me
Waiting for verification...
Cleaning up challenges
Attempting to renew cert (best.mobiledispatch.me) from /etc/letsencrypt/renewal/best.mobiledispatch.me.conf produced an unexpected error: Failed authorization procedure. best.mobiledispatch.me (http-01): urn:ietf:params:acme:error:serverInternal :: The server experienced an internal error :: Remote PerformValidation RPCs failed. Skipping.
All renewal attempts failed. The following certs could not be renewed:
  /etc/letsencrypt/live/best.mobiledispatch.me/fullchain.pem (failure)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates below have not been saved.)

All renewal attempts failed. The following certs could not be renewed:
  /etc/letsencrypt/live/best.mobiledispatch.me/fullchain.pem (failure)
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates above have not been saved.)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1 renew failure(s), 0 parse failure(s)

IMPORTANT NOTES:
 - The following errors were reported by the server:

   Domain: best.mobiledispatch.me
   Type:   serverInternal
   Detail: Remote PerformValidation RPCs failed

   Unfortunately, an error on the ACME server prevented you from
   completing authorization. Please try again later.

My web server is: Apache 2.2.15

The operating system my web server runs on is (include version): CentOS release 6.6 (Final)

My hosting provider, if applicable, is: MediaTemple

I can login to a root shell on my machine: yes

I’m using a control panel to manage my site: no

The version of my client is: certbot 0.31.0 (certbot-auto)


The issue does not appear to be IPv6 related:

curl -i -6 -m 10 best.mobiledispatch.me
HTTP/1.1 301 Moved Permanently
Date: Tue, 19 Feb 2019 20:35:46 GMT
Server: Apache
Location: ImagingDB/top/mobile/index.php?variant=best
Content-Length: 0
Connection: close
Content-Type: text/html; charset=UTF-8

That’s an expected response based on our redirects.


#2

Hi @bta

This is an internal problem on the Let’s Encrypt server.

Try again later.

PS: You don’t have an IPv6 address, so that isn’t the problem. And your /.well-known/acme-challenge returns HTTP status 404, which is what it should do when no challenge file is present.


#3

I would very much like to believe it’s as simple as trying again later, but I’ve already done that. This is several attempts over several weeks, and these same domains (one of which is listed above) are reliably failing each time.

I can assume you’re certain of your answer, but I may return in a panic on March 29 or earlier. :grinning:


#4

That’s bad. The error is an internal Let’s Encrypt error:

urn:ietf:params:acme:error:serverInternal :: The server experienced an internal error :: Remote PerformValidation RPCs failed. Skipping.

Your nameservers are buggy.

me

X Fatal error: Nameserver doesn’t support EDNS with max. 512 Byte Udp payload or sends more then 512 Bytes: ns1.mediatemple.net
X Fatal error: Nameserver doesn’t support EDNS with max. 512 Byte Udp payload or sends more then 512 Bytes: ns2.mediatemple.net

There are problems with EDNS512. That is worth checking, but if it were the cause, Let’s Encrypt would show a specific error message.

Other EDNS checks also fail, but that shouldn’t be a problem. Perhaps Let’s Encrypt has updated unboundtest, so that DNS Flag Day (01.02.2019) is now relevant.

The TCP answers are also slow.

But Unboundtest

https://unboundtest.com/m/CAA/best.mobiledispatch.me/XD7LVXKY

doesn’t see any error. So I don’t think it’s really a nameserver problem.


#5

Let’s Debug doesn’t report any issues either. I’d expect that to fail if there was a DNS problem.

We might need @lestaff to take a look, if this is really an issue on the Let’s Encrypt servers (which it does look like it might be).

Have any actual renewal attempts failed, or only dry-runs?


#6

Do you have a highly restrictive firewall? Staging validates from a different set of IPs (which will change over time).


#7

No. In fact, these servers are accessed by and are in constant contact with dozens or a hundred tablets connected via cell signal throughout the day.

When doing renewals, does certbot have to restart Apache? Does it try to do so gracefully? I could consider the possibility that it can’t graceful restart with hundreds of tablets keeping connections open in the particular method they do for our messaging system.

I haven’t had actual renewals fail on these yet, but they haven’t been up for renewal yet since this change was implemented.


#8

From what I can tell, all of our HTTP validation attempts for this domain result in either a 404 for the challenge file, or a timeout. While the timeouts may be difficult to diagnose, the 404s should be pretty easy to track down and fix in the interaction between Certbot and Apache.

Is certbot logging any errors regarding attempts to write the challenge file in /.well-known/acme-challenge/?
Is Apache logging anything regarding the 404s which might point you to a solution?
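
A minimal sketch for pulling those entries out of the Apache error log. The function names are made up for illustration, and the log path you pass in is an assumption, since it depends on how the vhost is configured.

```shell
# Print error-log lines that mention the ACME challenge directory.
# $1: path to an Apache error log (location varies per vhost; assumption).
challenge_errors() {
    grep -F '.well-known' "$1"
}

# Print each client IP that requested the challenge directory, once.
# Relies on the "[client a.b.c.d]" field of the Apache 2.2 error-log format.
challenge_clients() {
    challenge_errors "$1" | sed -n 's/.*\[client \([^]]*\)\].*/\1/p' | sort -u
}
```

For example, `challenge_clients /path/to/error_log` lists each validating client once, which makes it easy to see whether the validation requests are reaching this vhost at all.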


#9

I don’t see anything in the certbot log regarding inability to write a file. The log file says that the server reported errors, and I assume if it were a local issue it wouldn’t ask the server to do anything. The certbot log is too long to post here.

The Apache error log has some lines like this, though…

[Tue Feb 19 17:59:27 2019] [error] [client 172.104.24.29] File does not exist: /var/www/vhosts/best.mobiledispatch.me/httpdocs/.well-known
[Tue Feb 19 17:59:27 2019] [error] [client 13.58.30.69] File does not exist: /var/www/vhosts/best.mobiledispatch.me/httpdocs/.well-known
[Tue Feb 19 17:59:28 2019] [error] [client 52.29.173.72] File does not exist: /var/www/vhosts/best.mobiledispatch.me/httpdocs/.well-known
[Tue Feb 19 17:59:28 2019] [error] [client 66.133.109.36] File does not exist: /var/www/vhosts/best.mobiledispatch.me/httpdocs/.well-known

That would appear to be an incomplete path?

Thanks for the help so far. I’ll be checking up on this tomorrow.


#10

Looks that way.
What does the location or rewrite statement look like that handles the challenge requests?


#11

So I was able to fix the issue on best.mobiledispatch.me. It was an erroneously commented-out line hidden deep within the apache configuration files.

After fixing that, I looked for the same line in the other domains having issues, but to no avail.

mtvl.mobiledispatch.me, dlds.mobiledispatch.me, and pens.mobiledispatch.me have the line not commented out, but I’m still having difficulty with renewals on them.

Worse, yum upgrade certbot and yum info certbot are, at the moment, both telling me there’s no such package, and I’m pretty sure that’s how I got it on these systems to begin with, so… the plot thickens.


#12

Would it be at all possible for someone to tell me where certbot might put an HTTP challenge so that I can try to figure out what might be going wrong?


#13

The http-01 challenge file must be in

Webroot/.well-known/acme-challenge

so Letsencrypt can check that file via

http://best.mobiledispatch.me/.well-known/acme-challenge/long-random-filename

Use --debug-challenges to check the file:

--debug-challenges  After setting up challenges, wait for user input before submitting to CA (default: False)

So you should see the file in this subdirectory.
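
To make that path mapping concrete, here is a rough sketch that drops a throwaway file where a challenge file would live and prints the URL it should be served from. The function names are hypothetical, and the webroot you pass in must be the vhost’s actual document root.

```shell
# Create a throwaway file where an http-01 challenge file would live.
# $1: webroot (document root of the vhost), $2: file name.
# Prints the full path of the file it created.
place_test_file() {
    webroot=$1
    name=$2
    dir="$webroot/.well-known/acme-challenge"
    mkdir -p "$dir" || return 1
    printf 'test\n' > "$dir/$name" || return 1
    printf '%s\n' "$dir/$name"
}

# Print the URL at which that file should be reachable over plain HTTP.
# $1: domain name, $2: file name.
challenge_url() {
    printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}
```

After `place_test_file /var/www/vhosts/best.mobiledispatch.me/httpdocs 1234`, running `curl -i "$(challenge_url best.mobiledispatch.me 1234)"` should return the file contents rather than a 404. If it doesn’t, a redirect or rewrite rule is probably intercepting the request before it reaches the webroot.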


#14

certbot and certbot-auto are two different things.
certbot-auto is not installed via yum, so it wouldn’t know anything about that one either.


#15

Yes, right you are. I forgot for a moment which environment I was working in.

--debug-challenges seems to pause it, but it finishes without additional input. I’m not seeing the .well-known directory (even with -a, obviously) at the web root. In fact, on dlds.mobiledispatch.me, I’m not even seeing an entry in the Apache error log. I’ve removed all banned IPs, in case that was causing it, to no avail. I can manually navigate to dlds.mobiledispatch.me/.well-known/acme-challenge in a web browser, which of course gives me a 404 and an entry in the Apache error log.


#16

Can you add a test file in that folder, so that it can be accessed at:
http://dlds.mobiledispatch.me/.well-known/acme-challenge/1234


#17

Done. I had to create both folders, but that link now resolves to a file.

The dry run still fails.

./certbot-auto renew --dry-run
Saving debug log to /var/log/letsencrypt/letsencrypt.log

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Processing /etc/letsencrypt/renewal/dlds.mobiledispatch.me.conf
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator apache, Installer apache
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for dlds.mobiledispatch.me
Waiting for verification...
Cleaning up challenges
Attempting to renew cert (dlds.mobiledispatch.me) from /etc/letsencrypt/renewal/dlds.mobiledispatch.me.conf produced an unexpected error: Failed authorization procedure. dlds.mobiledispatch.me (http-01): urn:ietf:params:acme:error:serverInternal :: The server experienced an internal error :: Remote PerformValidation RPCs failed. Skipping.
All renewal attempts failed. The following certs could not be renewed:
  /etc/letsencrypt/live/dlds.mobiledispatch.me/fullchain.pem (failure)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates below have not been saved.)

All renewal attempts failed. The following certs could not be renewed:
  /etc/letsencrypt/live/dlds.mobiledispatch.me/fullchain.pem (failure)
** DRY RUN: simulating 'certbot renew' close to cert expiry
**          (The test certificates above have not been saved.)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1 renew failure(s), 0 parse failure(s)

IMPORTANT NOTES:
 - The following errors were reported by the server:

   Domain: dlds.mobiledispatch.me
   Type:   serverInternal
   Detail: Remote PerformValidation RPCs failed

   Unfortunately, an error on the ACME server prevented you from
   completing authorization. Please try again later.

The log file is too large to post. Let me know if a section of it would help.


#18

The --dry-run isn’t using --webroot; it’s using the apache authenticator.
If you know the webroot, you can use that instead.


#19

Sorry, I’m not quite sure what you mean by that.


#20

I meant you should try certbot with --webroot:
./certbot-auto renew --webroot -w /path/to/site/root --dry-run
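
Before switching, it can help to confirm which authenticator the renewal configuration currently names. A small sketch, assuming the renewal file stores it as an `authenticator = ...` line (the usual layout of Certbot renewal files, but check yours). The function name is hypothetical.

```shell
# Print the authenticator named in a Certbot renewal configuration file.
# $1: path to a file under /etc/letsencrypt/renewal/
# Assumes a line of the form "authenticator = <name>".
renewal_authenticator() {
    sed -n 's/^authenticator *= *//p' "$1"
}
```

Given the “Plugins selected: Authenticator apache” line in the output above, `renewal_authenticator /etc/letsencrypt/renewal/dlds.mobiledispatch.me.conf` would be expected to print `apache`.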