Certbot-auto unable to complete TLS-SNI-01 challenge on Ubuntu 14.04

I am running Ubuntu 14.04LTS. I have grabbed certbot-auto from the EFF website since certbot isn’t in 14.04.

I’ve tried running it as

certbot-auto --apache -d my.domain

and it consistently times out during the TLS-SNI-01 challenge with the following message:

Failed authorization procedure. my.domain (tls-sni-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: Failed to connect to ww.xx.yy.zz:443 for TLS-SNI-01 challenge

I have verified that inbound TLS is not being blocked by running openssl’s s_server command and successfully connecting to the s_server “server” from external locations.

Looking at /var/log/apache2/error.log I do see a suspicious warning about (I assume) the docroot that certbot-auto is temporarily trying to set up not being found:

[Tue Dec 06 00:34:02.306032 2016] [mpm_prefork:notice] [pid 19521] AH00171: Graceful restart requested, doing restart
[Tue Dec 06 00:34:03.001014 2016] [mpm_prefork:notice] [pid 19521] AH00163: Apache/2.4.7 (Ubuntu) configured -- resuming normal operations
[Tue Dec 06 00:34:03.001094 2016] [core:notice] [pid 19521] AH00094: Command line: '/usr/sbin/apache2'
[Tue Dec 06 00:34:14.596461 2016] [mpm_prefork:notice] [pid 19521] AH00171: Graceful restart requested, doing restart
AH00112: Warning: DocumentRoot [/var/lib/letsencrypt/tls_sni_01_page/] does not exist
[Tue Dec 06 00:34:14.661372 2016] [ssl:warn] [pid 19521] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Tue Dec 06 00:34:15.000674 2016] [mpm_prefork:notice] [pid 19521] AH00163: Apache/2.4.7 (Ubuntu) OpenSSL/1.0.1f configured -- resuming normal operations
[Tue Dec 06 00:34:15.001293 2016] [core:notice] [pid 19521] AH00094: Command line: '/usr/sbin/apache2'

In /var/log/letsencrypt/letsencrypt.log I see the following, which looks totally reasonable:

2016-12-06 05:34:13,247:INFO:certbot.auth_handler:Performing the following challenges:
2016-12-06 05:34:13,268:INFO:certbot.auth_handler:tls-sni-01 challenge for mydomain.com
2016-12-06 05:34:13,359:INFO:certbot_apache.configurator:Enabled Apache socache_shmcb module
2016-12-06 05:34:13,520:INFO:certbot_apache.configurator:Enabled Apache ssl module
2016-12-06 05:34:14,345:DEBUG:certbot_apache.tls_sni_01:Adding Include /etc/apache2/le_tls_sni_01_cert_challenge.conf to /files/etc/apache2/apache2.conf
2016-12-06 05:34:14,351:DEBUG:certbot_apache.tls_sni_01:writing a config file with text:
 <IfModule mod_ssl.c>
<VirtualHost *>
    ServerName b103a17811594b35d3ade8eb763c8c42.74b423d383109b22eaecca3ea14cd1f7.acme.invalid
    UseCanonicalName on
    SSLStrictSNIVHostCheck on

    LimitRequestBody 1048576

    Include /etc/letsencrypt/options-ssl-apache.conf
    SSLCertificateFile /var/lib/letsencrypt/mRH5i2aIrWMqjsZn3jMXvdNnYV5Ejd0GBtcDoIJqQ4U.crt
    SSLCertificateKeyFile /var/lib/letsencrypt/mRH5i2aIrWMqjsZn3jMXvdNnYV5Ejd0GBtcDoIJqQ4U.pem

    DocumentRoot /var/lib/letsencrypt/tls_sni_01_page/
</VirtualHost>

</IfModule>

If I watch /var/lib/letsencrypt while certbot-auto runs I don’t ever see a tls_sni_01_page being created. Here’s the sequence I see – a temp directory and some temp certs being made and then deleted when the challenge fails, but never tls_sni_01_page

/var/lib/letsencrypt$ ls
backups
/var/lib/letsencrypt$ ls
backups
/var/lib/letsencrypt$ ls
backups
/var/lib/letsencrypt$ ls
backups  temp_checkpoint
/var/lib/letsencrypt$ ls
backups  temp_checkpoint
/var/lib/letsencrypt$ ls
backups  e1pvy9TgUvOmBxC4dlkwG5Ly2mTY0DU63qhUFpBik2Y.crt  e1pvy9TgUvOmBxC4dlkwG5Ly2mTY0DU63qhUFpBik2Y.pem  temp_checkpoint

I’ve also changed /etc/letsencrypt/options-ssl-apache.conf to set the logging level to DEBUG and to have that temporary virtual host log to its own error and access logs. I see output but no problems in the error log, but don’t see anything at all in the access log.

I’ve tried running tcpdump -n port 443 and I do see some sort of connection attempt from the remote machine:

09:14:44.388328 IP 66.133.109.36.48890 > w.x.y.z.443: Flags [S], seq 1627589669, win 29200, options [mss 1460,sackOK,TS val 2834688174 ecr 0,nop,wscale 7], length 0
09:14:44.388435 IP w.x.y.z.443 > 66.133.109.36.48890: Flags [S.], seq 773250860, ack 1627589670, win 28960, options [mss 1460,sackOK,TS val 424229197 ecr 2834688174,nop,wscale 7], length 0
09:14:44.451190 IP 66.133.109.36.48890 > w.x.y.z.443: Flags [.], ack 1, win 229, options [nop,nop,TS val 2834688237 ecr 424229197], length 0
09:14:44.451546 IP 66.133.109.36.48890 > w.x.y.z.443: Flags [P.], seq 1:220, ack 1, win 229, options [nop,nop,TS val 2834688237 ecr 424229197], length 219
09:14:44.451619 IP w.x.y.z.443 > 66.133.109.36.48890: Flags [.], ack 220, win 235, options [nop,nop,TS val 424229216 ecr 2834688237], length 0
09:14:44.454461 IP w.x.y.z.443 > 66.133.109.36.48890: Flags [P.], seq 1:393, ack 220, win 235, options [nop,nop,TS val 424229217 ecr 2834688237], length 392
09:14:44.454626 IP w.x.y.z.443 > 66.133.109.36.48890: Flags [F.], seq 393, ack 220, win 235, options [nop,nop,TS val 424229217 ecr 2834688237], length 0
09:14:44.517433 IP 66.133.109.36.48890 > w.x.y.z.443: Flags [.], ack 393, win 237, options [nop,nop,TS val 2834688303 ecr 424229217], length 0
09:14:44.517481 IP 66.133.109.36.48890 > w.x.y.z.443: Flags [P.], seq 220:227, ack 394, win 237, options [nop,nop,TS val 2834688303 ecr 424229217], length 7
09:14:44.517514 IP 66.133.109.36.48890 > w.x.y.z.443: Flags [F.], seq 227, ack 394, win 237, options [nop,nop,TS val 2834688303 ecr 424229217], length 0
09:14:44.517544 IP w.x.y.z.443 > 66.133.109.36.48890: Flags [.], ack 228, win 235, options [nop,nop,TS val 424229235 ecr 2834688303], length 0

certbot-auto does seem to be correctly configuring Apache for SSL. From the Apache error log while certbot-auto is doing its thing:

[Wed Dec 07 13:36:03.503262 2016] [ssl:info] [pid 8921] AH02200: Loading certificate & private key of SSL-aware server '57297267e797e92815e8341926997fe1.85fb57a32e6086501ea25e41889cdcd3.acme.invalid:443'
[Wed Dec 07 13:36:03.504539 2016] [ssl:debug] [pid 8921] ssl_engine_pphrase.c(506): AH02249: unencrypted RSA private key - pass phrase not required
[Wed Dec 07 13:36:03.504802 2016] [ssl:info] [pid 8921] AH01914: Configuring server 57297267e797e92815e8341926997fe1.85fb57a32e6086501ea25e41889cdcd3.acme.invalid:443 for SSL protocol
[Wed Dec 07 13:36:03.505062 2016] [ssl:debug] [pid 8921] ssl_engine_init.c(328): AH01893: Configuring TLS extension handling
[Wed Dec 07 13:36:03.505080 2016] [ssl:debug] [pid 8921] ssl_engine_init.c(838): AH02232: Configuring RSA server certificate
[Wed Dec 07 13:36:03.505128 2016] [ssl:warn] [pid 8921] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Wed Dec 07 13:36:03.505233 2016] [ssl:debug] [pid 8921] ssl_util_ssl.c(407): AH02412: [57297267e797e92815e8341926997fe1.85fb57a32e6086501ea25e41889cdcd3.acme.invalid:443] Cert matches for name '57297267e797e92815e8341926997fe1.85fb57a32e6086501ea25e41889cdcd3.acme.invalid' [subject: CN=dummy / issuer: CN=dummy / serial: B5D5A13512E3E11CCFCE36A317ED7743 / notbefore: Dec 7 18:36:02 2016 GMT / notafter: Dec 14 18:36:02 2016 GMT]
[Wed Dec 07 13:36:03.505247 2016] [ssl:debug] [pid 8921] ssl_engine_init.c(893): AH02236: Configuring RSA server private key

Any idea what the heck is going on? It appears other people with Ubuntu 14.04LTS and apache have this working fine.

@bmw @pde @erica Do you folks have any idea on this one? It sounds like the Boulder VA is connecting to the machine. Is there anything obvious in the certbot traces shared?

I’m having the same issue on a Trisquel 7 (Ubuntu 14.04 based) server. A different Trisquel 7 server does not have this problem. I’ve tried using the git revision used by the working server, but to no avail. I’m not sure what’s different between these two hosts. I’m able to get a response over 443 with netcat using a remote server, so the port appears to be open on both servers.

@sudoman @QuantumMechanic could one of you provide the domain names you’re trying to issue for? It feels like a Certbot issue but I can check the Boulder-side logs to make sure there isn’t anything there.

jabserver2p.fsf.org is working with certbot, jabserver3p.fsf.org is not. I’ve tried the default and staging server on the latter.

Interesting! Thanks for providing that. In this case Boulder's validation authority is returning this error which might help the Certbot folks diagnose:

err=[tls.RecordHeaderError{Msg:"oversized record received with length 20527", RecordHeader:[5]uint8{0x48, 0x54, 0x54, 0x50, 0x2f}}]

And to confirm - you’re using Certbot with the Apache plugin to handle the TLS-SNI-01 challenge like @QuantumMechanic ?

Yes, I am using --apache with TLS-SNI-01.

The VA error about the Oversized Record is typically indicative of speaking HTTPS to an HTTP only server.

I think you folks should open a certbot issue with as much of the information from this thread as you can.

There must be something wiggly happening with Cerbot’s Apache SNI config.

With Apache running, and not Certbot, can you connect to it via HTTPS on 443? I.e. curl --insecure https://my.domain? One guess I have is that you may have Apache listening on 443, but serving HTTP on that port instead of HTTPS.

here’s what I get from a remote host when apache is running on the affected machine without a virtual host listening on port 443:

$ curl --insecure https://my.domain
curl: (35) error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol

I recall that I enabled the ssl module on this host earlier. I’ll keep looking into this from my end.

I’ve seen similar issues (mostly on CentOS, but the same logic applies on Ubuntu) where the problem was that mod_ssl was either completely disabled, or that the configuration that enabled mod_ssl came after the <VirtualHost> responsible for handling the tls-sni-01 challenge. Since it’s wrapped in a <IfModule mod_ssl.c>, apache would just silently ignore that vhost definition (as mod_ssl isn’t enabled yet when the configuration parser reaches that point) and serve HTTP on port 443.

I’m using rlcarr.com. I just ran again (with the same failure) at 13:36 EST by my clock.

I see the <IfModule mod_ssl.c> at the beginning of the configuration printed to the terminal by letsencrypt. the ssl module is enabled in apache on both hosts, and the contents of the ssl.load and ssl.conf files in /etc/apache2/mods-enabled are the same across both hosts.

As for the output by curl: perhaps that is because apache is listening on port 443 (according to lsof) when the ssl module is loaded even though there is no virtual host set to use port 443, and no certificate is configured, due to the configuration in /etc/apache2/ports.conf? The configuration of that file is the same on both hosts, but that may explain the output of curl.

With Apache running and not Certbot I cannot connect via HTTPS (or HTTP for that matter) on 443. Which I’d expect since my ports.conf only has Listen 80.

Doesn’t sound like it’s an instance of this issue in that case, though there could be other reasons for apache to serve HTTP on port 443 during validation, I suppose (do you happen to have any vhost definitions with a wildcard port?). To rule out the mod_ssl issue, you can probably check if you have a log line like the one posted by OP in your apache error log:

[Tue Dec 06 00:34:14.661372 2016] [ssl:warn] [pid 19521] AH01906: RSA server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)

If that warning exists after running the client, it would indicate that mod_ssl is indeed enabled when apache reaches the validation vhost and that we’re looking at some other problem. (The warning itself is harmless and part of normal operation, but it allows us to debug this.)

@QuantumMechanic: FWIW, the DocumentRoot warning is normal as well. tls-sni-01 doesn’t work based on files/paths (unlike http-01). The DocumentRoot is just there as a dummy value. Mentioning this just as a pointer for further debugging; I currently don’t have any ideas on what might be going on here.

This is consistent with the theory that Apache is trying to serve HTTP on 443. You can confirm with another curl command:

curl http://my.domain:443/

I'm not sure what the problem is in your Apache config. If you're willing to share the full Apache config, we can probably help figure out what the problem is, and make Certbot smarter about detecting it in the future.

It looks as though http was indeed being served over that port. I've created a backup of the apache2 config directory in case we don't figure out what the issue is here and/or you would like to take a look.

I installed the ssl-cert package so i could obtain a default "snakeoil" certificate. I then enabled the default-ssl.conf virtual host. This time I got a different error:

Failed authorization procedure. my.domain (tls-sni-01): urn:acme:error:unauthorized :: The client lacks sufficient authorization :: Incorrect validation certificate for TLS-SNI-01 challenge. Requested 1...4.0...e.acme.invalid from 111.111.111.111:443. Received certificate containing ''

IMPORTANT NOTES:
 - The following errors were reported by the server:

   Domain: my.domain
   Type:   unauthorized
   Detail: Incorrect validation certificate for TLS-SNI-01 challenge.
   Requested
   1...4.0...e.acme.invalid
   from 111.111.111.111:443. Received certificate containing ''

   To fix these errors, please make sure that your domain name was
   entered correctly and the DNS A record(s) for that domain
   contain(s) the right IP address.

It seems as though apache is not getting configured by the certbot client.

I don't see this message in the logs before or after the above change nor in the logs of the working server.

I have an additional certificate on the working server. I do not recall, but it is possible that the other server worked because the valid ssl certificate was installed and configured on that host before running and switching to letsencrypt.

@pfg - I’ve updated my original post with an extract from the apache error.log while Certbot is running. I do see the line about RSA server certificate is a CA certificate... and overall it looks like mod_ssl is starting up OK.