cURL error 35 - random failures to connect to LE

Please fill out the fields below so we can help you better. Note: you must provide your domain name to get help. Domain names for issued certificates are all made public in Certificate Transparency logs (e.g. https://crt.sh/?q=example.com), so withholding your domain name here does not increase secrecy, but only makes it harder for us to provide help.

My domain is: numerous, but including e.g. annemiller.uk (renewal attempted at 2019-10-03 06:21 BST)

I ran this command: “dehydrated -c”

It produced this output: “ERROR: Problem connecting to server (post for https://acme-v02.api.letsencrypt.org/acme/chall-v3/610799063/GogFSA; curl returned with 35)”

My web server is (include version): various, mostly Apache 2.4 and Apache 2.2

The operating system my web server runs on is (include version): various, but including Debian Stretch, Raspbian Stretch, SLES and Centos

My hosting provider, if applicable, is: various

I can login to a root shell on my machine (yes or no, or I don’t know): yes

I’m using a control panel to manage my site (no, or provide the name and version of the control panel): no

The version of my client is (e.g. output of certbot --version or certbot-auto --version if you’re using Certbot): recently updated https://github.com/lukas2511/dehydrated (I’ve been using this on lots of different servers since LetsEncrypt started)

About a week ago, I started getting errors from the client, which were to do with the case of headers coming from LE having changed, so a grep needed changing to “grep -i” to check them case-insensitively. This problem had already been fixed in the client so I updated everything. However, it looks to me like the reason it started failing is because there has been a major change at the LE end, perhaps switching to HTTP/2 (and the case sensitivity of headers was simply a minor symptom of that)?
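To illustrate the header-case issue: HTTP/2 transmits header names in lowercase, so a case-sensitive match on `Replay-Nonce` stops finding anything. Below is a minimal Python sketch of a case-insensitive lookup. `find_header` and the sample nonce value are hypothetical, made up for illustration; this is not dehydrated's actual code, which is a shell script using grep.

```python
def find_header(raw_headers: str, name: str):
    """Return the value of the first header matching `name`, ignoring case."""
    wanted = name.lower()
    for line in raw_headers.splitlines():
        key, sep, value = line.partition(":")
        # Status lines such as "HTTP/2 200" contain no colon and are skipped.
        if sep and key.strip().lower() == wanted:
            return value.strip()
    return None


# The same header as it might appear in an HTTP/1.1 and an HTTP/2 response.
# The nonce value is a made-up placeholder.
http1 = "HTTP/1.1 200 OK\nReplay-Nonce: abc123\n"
http2 = "HTTP/2 200\nreplay-nonce: abc123\n"

print(find_header(http1, "Replay-Nonce"))
print(find_header(http2, "Replay-Nonce"))
```

Both calls find the header, whereas a case-sensitive search for `Replay-Nonce` would miss the HTTP/2 form.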

Since then I have sometimes been getting curl error 35 (according to https://curl.haxx.se/libcurl/c/libcurl-errors.html error 35 is: “CURLE_SSL_CONNECT_ERROR (35) A problem occurred somewhere in the SSL/TLS handshake. You really want the error buffer and read the message there as it pinpoints the problem slightly more. Could be certificates (file formats, paths, permissions), passwords, and others.”)

It fails randomly on several different servers running different web servers and operating systems, different LE verification types (DNS and HTTP), some with IPv4 only, some with IPv6 only and some with both, and at different times in the process (mostly when there is nothing actually to renew: I assume there is an account and/or license check at the start; but also sometimes part way through the process). If I rerun it manually when I get to see the error, it works. In the example above it got 5 challenge tokens successfully, checked the DNS tokens on two of them and failed to connect on the third. It seems completely random, but I’m guessing it is failing to connect on maybe 1 in 30 attempts. I’ve had at least one failure on 3 of the last 7 days. This morning I got 3 failures from 3 different servers out of 19 servers/protocol combinations in all.
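As a rough sanity check on those numbers, here is a small Python sketch of the binomial arithmetic. It assumes every run fails independently with the guessed rate of 1 in 30; both the rate and the independence are assumptions, not measurements, and `prob_at_least` is a helper name invented for this example.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k failures in n independent runs, each failing with probability p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p = 1 / 30  # guessed per-run failure rate
n = 19      # server/protocol combinations run each morning

print(f"P(at least 1 failure)  = {prob_at_least(1, n, p):.3f}")
print(f"P(at least 3 failures) = {prob_at_least(3, n, p):.3f}")
```

If the second figure comes out small, three failures in a single morning would suggest that failures cluster in time (for example under load) rather than being spread evenly.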

Because of the diversity, this looks to me like a problem at LE end, or possibly a general problem in curl when interacting with the new configuration of the web server at LE. I don’t think it is the dehydrated client, as that is just using cURL to do its communication with LE, and it randomly fails in only maybe 3% of cases at different points in the process.

Any ideas?


FWIW, there are two threads about different random curl errors:


I don’t know if the issue you’re experiencing is the same, though.

@jillian: Ping?

Yeah, this sounds pretty much like the same problem we are dealing with in the two threads mentioned above.
It’s also already reflected on the LE status page: https://letsencrypt.status.io/

Hopefully they will be able to fix it shortly, now that @jillian is getting close to the root cause … :wink:
See her post here: Curl: TCP connection reset by peer

Hard to tell whether they’re the same underlying cause - the error messages are different.

Am I right that LE has recently changed to HTTP/2? If so it would not be surprising for there to be a whole slew of new connection issues.

They made a big change to the API endpoints – they switched from one CDN that was terminating TLS to a different CDN that isn’t. As part of that, HTTP/2 also got enabled.

It’s the kind of change that won’t reveal any problems in theory but inevitably does in practice. :grimacing:

@fas - Same Error as @403 mentioned in this Post: cURL error to /directory endpoint

`2019-10-01 17:00:04 sid-215 LE[205834]: ERROR: Problem connecting to server (get for https://acme-v02.api.letsencrypt.org/directory; curl returned with 35)`

Error Code 35

you are so damn right about this … hehe :wink:


Good catch, @futureweb! I didn’t realize this precise error had been reported.


OK. It certainly looks from the dates and nature of the error that the problem arises from the CDN change, possibly HTTP/2 but maybe something more subtle.

I guess a curl probe every few seconds would show up the error sooner or later. Might it be load related? Most of my accesses happen around 06:00 UK time, as they are in cron.daily and that seems to be the default run time on most Debian systems, so that’s roughly midnight in the US.
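A probe along those lines could be sketched as follows. This is only an illustration under some assumptions: it uses Python's `ssl` module rather than curl, so it tests the TLS handshake (where curl error 35 arises) but not curl itself, and the function names, attempt count and interval are all made up for the example.

```python
import socket
import ssl
import time
from datetime import datetime, timezone

HOST = "acme-v02.api.letsencrypt.org"  # API host from the error messages above


def tls_handshake(host: str = HOST, port: int = 443, timeout: float = 10.0) -> None:
    """Open a TCP connection and complete a TLS handshake; raise OSError on failure."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host):
            pass  # the handshake completes inside wrap_socket


def probe(check, attempts: int, interval: float, sleep=time.sleep):
    """Call `check` repeatedly; return (UTC timestamp, error text) for each failure."""
    failures = []
    for i in range(attempts):
        try:
            check()
        except OSError as exc:  # ssl.SSLError and socket timeouts are OSError subclasses
            failures.append((datetime.now(timezone.utc).isoformat(), repr(exc)))
        if i < attempts - 1:
            sleep(interval)
    return failures


# Example (needs network access, so it is left commented out):
# for stamp, error in probe(tls_handshake, attempts=1000, interval=5.0):
#     print(stamp, error)
```

Recording the timestamps in UTC makes it easier to compare failures seen from servers in different time zones.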


@fas thanks for providing a detailed report! You’re the first user to suggest it’s not just /directory and I will update our status page accordingly.

You’re on the same track as us thinking it’s load related and I’ve been working on finding the bottleneck.

If you have a few more timestamps of when you’ve seen this problem and which IPs you are connecting to, I’d appreciate that detail.


@jillian - as for times/IPs, I can give some more information too. The times shown are CEST (MESZ, UTC+2).

All those Requests were fired from 83.65.246.198


[image: list of failure times, not preserved]


Here you are: All times in BST (UTC+1). All but the second failed before doing anything substantive, so that’s probably directory, but the second one was definitely at responding to challenge. I guess it’s much more likely to fail on directory if that’s what all attempts do first: it only actually renews once in 60 days/attempts for any one cert, though #2 and #5 are on a server with quite a few certs for different domains (so again, I’d expect to see it there more often if it is random). If I get any more over the next few days I’ll add them here.

2019-10-03 06:25:18 81.103.31.58 (ironically, my server which collects logs for all the failed attempts!)
2019-10-03 06:21:01 2a00:1098:0:86:1000:48:0:2 when renewing tst.annemiller.uk at “responding to challenge”
2019-10-03 02:01:02 2001:630:212:8:21e:67ff:fe1b:a1fc
2019-10-02 02:01:01 2001:630:212:8:21e:67ff:fe1b:a060
2019-09-29 06:16:02 2a00:1098:0:86:1000:48:0:2
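Since the reports in this thread mix BST (UTC+1) and MESZ/CEST (UTC+2), converting everything to UTC makes the timestamps easier to line up against server-side logs. A minimal Python sketch follows; it uses fixed offsets rather than a time zone database, and `to_utc` is a helper name invented for this example.

```python
from datetime import datetime, timedelta, timezone

BST = timezone(timedelta(hours=1), "BST")    # UTC+1, as stated above
CEST = timezone(timedelta(hours=2), "CEST")  # UTC+2 (MESZ)


def to_utc(stamp: str, tz: timezone) -> str:
    """Interpret 'YYYY-MM-DD HH:MM:SS' as local time in `tz` and return it in UTC."""
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")


for stamp in ("2019-10-03 06:25:18", "2019-10-03 06:21:01", "2019-10-03 02:01:02"):
    print(to_utc(stamp, BST))
```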


Three more errors this morning:
2019-10-05 02:01 BST 2001:630:212:8:21e:67ff:fe1b:a060
2019-10-05 03:14 BST 2a00:d680:20:50::e0a (first time on this one, running an older CentOS, different server provider, entirely different network)

I also got a different error on 2019-10-05 06:25 BST from 81.103.31.58: an HTTP 500 from the LE API. This is probably unrelated, but I mention it just in case. When I retried it manually, it failed differently again with NO REPLAY NONCE, but a third attempt worked:

HTTP/1.1 500 Internal Server Error
Server: nginx
Date: Sat, 05 Oct 2019 02:14:11 GMT
Content-Type: application/problem+json
Content-Length: 173
Connection: keep-alive
Cache-Control: public, max-age=0, no-cache
Link: <https://acme-v02.api.letsencrypt.org/directory>;rel="index"
Replay-Nonce: 0001ErpEUN-W_VKxCYDAXMhv59BNOPzQ95OmoP9_GbUUAt4

{
  "type": "urn:ietf:params:acme:error:serverInternal",
  "detail": "Error retrieving account \"https://acme-v02.api.letsencrypt.org/acme/acct/2159887\"",
  "status": 500
}


We started seeing these errors Saturday night. Here is our most recent log entry:

2019-10-08 00:00:04,335:DEBUG:certbot.main:certbot version: 0.39.0
2019-10-08 00:00:04,335:DEBUG:certbot.main:Arguments: ['-c', '/etc/letsencrypt/domain.com.ini']
2019-10-08 00:00:04,335:DEBUG:certbot.main:Discovered plugins: PluginsRegistry(PluginEntryPoint#apache,PluginEntryPoint#manual,PluginEntryPoint#nginx,PluginEntryPoint#null,PluginEntryPoint#standalone,PluginEntryPoint#webroot)
2019-10-08 00:00:04,353:DEBUG:certbot.log:Root logging level set at 20
2019-10-08 00:00:04,353:INFO:certbot.log:Saving debug log to /var/log/letsencrypt/letsencrypt.log
2019-10-08 00:00:04,356:DEBUG:certbot.plugins.selection:Requested authenticator standalone and installer None
2019-10-08 00:00:04,361:DEBUG:certbot.plugins.selection:Single candidate plugin: * standalone
Description: Spin up a temporary webserver
Interfaces: IAuthenticator, IPlugin
Entry point: standalone = certbot.plugins.standalone:Authenticator
Initialized: <certbot.plugins.standalone.Authenticator object at 0x7f7c1968e5d0>
Prep: True
2019-10-08 00:00:04,361:DEBUG:certbot.plugins.selection:Selected authenticator <certbot.plugins.standalone.Authenticator object at 0x7f7c1968e5d0> and installer None
2019-10-08 00:00:04,361:INFO:certbot.plugins.selection:Plugins selected: Authenticator standalone, Installer None
2019-10-08 00:00:04,371:DEBUG:certbot.main:Picked account: <Account(RegistrationResource(body=Registration(status=None, terms_of_service_agreed=None, agreement=None, only_return_existing=None, contact=(), key=None, external_account_binding=None), uri=u'https://acme-v02.api.letsencrypt.org/acme/acct/61555815', new_authzr_uri=None, terms_of_service=None), 133f2d4d78116d7e0029aaa31f2c0abc, Meta(creation_host=u'localhost', creation_dt=datetime.datetime(2019, 7, 19, 5, 29, 12, tzinfo=)))>
2019-10-08 00:00:04,372:DEBUG:acme.client:Sending GET request to https://acme-v02.api.letsencrypt.org/directory.
2019-10-08 00:00:04,374:DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): acme-v02.api.letsencrypt.org:443
2019-10-08 00:00:19,832:DEBUG:certbot.log:Exiting abnormally:
Traceback (most recent call last):
  File "/opt/eff.org/certbot/venv/bin/letsencrypt", line 11, in <module>
    sys.exit(main())
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 1378, in main
    return config.func(config, plugins)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 1249, in certonly
    le_client = _init_le_client(config, auth, installer)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/main.py", line 614, in _init_le_client
    return client.Client(config, acc, authenticator, installer, acme=acme)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/client.py", line 261, in __init__
    acme = acme_from_config_key(config, self.account.key, self.account.regr)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/certbot/client.py", line 46, in acme_from_config_key
    return acme_client.BackwardsCompatibleClientV2(net, key, config.server)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 828, in __init__
    directory = messages.Directory.from_json(net.get(server).json())
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 1161, in get
    self._send_request('GET', url, **kwargs), content_type=content_type)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/acme/client.py", line 1110, in _send_request
    response = self.session.request(method, url, *args, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/opt/eff.org/certbot/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
SSLError: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by SSLError(SSLError("bad handshake: SysCallError(104, 'ECONNRESET')",),))
2019-10-08 00:00:19,839:ERROR:certbot.log:An unexpected error occurred:


How often does it happen outside the kind of peak load times when cron runs?

After several days when I’ve had three or four servers failing every night, last night there were none. Have you fixed it, or was it just luck? Do you still want me to post times and IPs of failed attempts?

I’ll second that. No issues last night for us either.

@fas @zachr @mnordhoff - have a look here: Curl: TCP connection reset by peer - there’s an explanation of what LE has done to ease the situation … :wink:


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.