Curl: TCP connection reset by peer

futureweb · September 30, 2019, 2:24pm

Hey there,
for a few Days now we have been getting random “Curl: TCP connection reset by peer” Errors when trying to renew Certs. We have our own PHP Client integrated into our CMS (based on https://github.com/analogic/lescript).
A second attempt a few Seconds later succeeds without Problems …
Requests are sent from Servers located in Austria (e.g. from 83.65.246.198).
As there are several thousand Certs hosted on this Server there are lots of renewals per Day … 99.9% succeed … but some trigger this Error … not a big Deal as we automatically try again later … but just wondering …
No Network Issues observed on our End / our Datacenter.
So could there be an Issue on the LE side? Anything known? Are others getting such Errors too?
thx, bye from Austria
Andreas Schnederle-Wagner
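
The "automatically try again later" behaviour described in this post can be sketched roughly as below. This is only an illustration, not the poster's client (that one is PHP, based on lescript). The function name `retry_on_reset`, the attempt count and the delays are all assumptions. It catches Python's built-in `ConnectionResetError` only; a real client would need to catch whatever its HTTP library raises instead.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_reset(
    request: Callable[[], T],
    attempts: int = 3,             # assumed value, not from the thread
    base_delay: float = 2.0,       # assumed value, not from the thread
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(); retry with exponential backoff if the peer resets."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except ConnectionResetError:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.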

JuergenAuer · September 30, 2019, 2:36pm

Hi @futureweb

there was a change:

Let's Encrypt now supports HTTP/2.

futureweb · September 30, 2019, 2:59pm

Hey @JuergenAuer,
thx for the Information - seems like it’s related to this … first saw this Error on 25.09 at 02:22, since then it has happened 14 Times across some dozens/hundreds of renewals …
Happens just randomly …

Can I do anything to help debug this Issue?

mnordhoff · September 30, 2019, 3:00pm

FWIW, someone else had a similar report a few days ago:

https://community.letsencrypt.org/t/curl-error-to-directory-endpoint/103027

futureweb · October 1, 2019, 7:49am

Some more Information regarding this Issue:

Using ACME V1 Endpoint: https://acme-v01.api.letsencrypt.org
Host OS: CentOS Linux release 7.6.1810 (Core)
IPv4 - no IPv6
#curl -V
curl 7.29.0 (x86_64-redhat-linux-gnu) libcurl/7.29.0 NSS/3.36 zlib/1.2.7 libidn/1.28 libssh2/1.4.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS GSS-Negotiate IDN IPv6 Largefile NTLM NTLM_WB SSL libz unix-sockets
Also tested a manually compiled current Version of curl … it triggered the same Error.

Is it possible that LE’s new CDN can’t keep up with all the incoming requests and is starting to deny new connections?
Is there someone on the LE Infrastructure Side who could have a look into this? (@cpu ? ;-))

thx, bye from sunny Austria
Andreas

JuergenAuer · October 1, 2019, 7:56am

Isn't your curl too old? That's from

7.29.0 Feb 6 2013

https://curl.haxx.se/docs/releases.html

futureweb · October 1, 2019, 8:14am

@JuergenAuer - we are using CentOS 7, which is a Binary “Clone” of RHEL 7. Those Enterprise Linuxes tend to stay on old but well-tested Versions for Compatibility/Stability reasons, but all important Security Fixes get backported to those old Versions (see: https://access.redhat.com/solutions/64838)
So that’s “normal” in this case
Newer Versions are coming with CentOS 8, which was released just a few Days ago …

I also tried it on a manually compiled Version 7.65, which threw the same Error - so I’m more or less ruling out curl as the Source of the Problem

cpu · October 1, 2019, 1:43pm

I will ask someone on the SRE team to investigate.

futureweb · October 1, 2019, 1:55pm

thx a lot ... hope they can help track it down!

jillian · October 1, 2019, 4:55pm

Hi @futureweb -

Thanks for bringing this to our attention. I’ve started investigating this internally and will provide any updates here.

Is there any more detail you can provide from the error messages? Perhaps try running the command with curl -vvv to see more details.

unixcharles · October 2, 2019, 5:30pm

I’m seeing about 1k ECONNRESET type errors per hour. We’re provisioning around 25k certificates per day. We are still getting a lot of successful requests.

Last one was at Oct 2nd, 17:26:10 UTC. We’re requesting from Google Cloud.

jillian · October 3, 2019, 2:24am

Just a quick update. I’ve been reviewing logs and data and I’m getting close to a root cause. Thanks for your patience while we continue investigating.

futureweb · October 3, 2019, 8:56am

@jillian - I was out of the Office yesterday, but as you are getting close to the root cause I guess you don’t need any more verbose Logs?
If we can do anything further to help you hunt down this nasty thing just drop me a line!

zertrin · October 5, 2019, 4:50pm

The problem doesn’t seem to only happen with curl: I’ve been having errors from my daily certbot cron for the last 3 days:

(timezone in the log below is UTC)

2019-10-05 00:45:09,509:DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): acme-v02.api.letsencrypt.org
2019-10-05 00:45:25,065:DEBUG:certbot.log:Exiting abnormally:
An unexpected error occurred:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 417, in wrap_socket
    cnx.do_handshake()
  File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1426, in do_handshake
    self._raise_ssl_error(self._ssl, result)
  File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1166, in _raise_ssl_error
    raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (104, 'ECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 594, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 350, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 837, in _validate_conn
    conn.connect()
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 323, in connect
    ssl_context=context)
  File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 324, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 424, in wrap_socket
    raise ssl.SSLError('bad handshake: %r' % e)
ssl.SSLError: ("bad handshake: SysCallError(104, 'ECONNRESET')",)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 423, in send
    timeout=timeout
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 624, in urlopen
    raise SSLError(e)
requests.packages.urllib3.exceptions.SSLError: ("bad handshake: SysCallError(104, 'ECONNRESET')",)

During handling of the above exception, another exception occurred:

requests.exceptions.SSLError: ("bad handshake: SysCallError(104, 'ECONNRESET')",)
Please see the logfiles in /var/log/letsencrypt for more details.
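
In the traceback above, the reset is buried under several wrapper exceptions, each raised while handling the previous one. A client that wants to treat this case as retryable could walk that chain, roughly as sketched below. The helper name `is_connection_reset` is hypothetical and not part of certbot, requests or urllib3. It relies on `SysCallError` carrying the errno as its first argument, as the traceback shows. It compares against `errno.ECONNRESET`, which is 104 on Linux but differs on other platforms.

```python
import errno
from typing import Optional


def is_connection_reset(exc: Optional[BaseException]) -> bool:
    """Return True if exc, or any exception chained beneath it, is ECONNRESET."""
    seen = set()  # guards against cycles in the exception chain
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        if isinstance(exc, ConnectionResetError):
            return True
        # OSError subclasses expose the code as .errno.
        if getattr(exc, "errno", None) == errno.ECONNRESET:
            return True
        # Some wrappers only carry it as the first positional argument.
        if exc.args and exc.args[0] == errno.ECONNRESET:
            return True
        # Prefer an explicit cause, else the implicit "during handling" context.
        exc = exc.__cause__ or exc.__context__
    return False
```

This only works while the wrappers keep the original exception chained. A wrapper that drops the chain and keeps just the message string, like the one in the renewal summary line, would need string matching instead.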

Same errors happened at the following timestamps (still UTC):

2019-10-03 00:45:08,350:DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): acme-v02.api.letsencrypt.org
2019-10-03 00:45:23,950:DEBUG:certbot.log:Exiting abnormally:
<same error as above>
2019-10-04 00:45:09,660:DEBUG:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): acme-v02.api.letsencrypt.org
2019-10-04 00:45:25,413:DEBUG:certbot.log:Exiting abnormally
<same error as above>

futureweb · October 7, 2019, 7:43am

@jillian - got some more Errors over the Weekend - here is the Info:
Source IP: 83.65.246.198, Timezone: CEST (UTC+2)

Date		Time		v1 Endpoint		Error
06.10.2019	14:35		/acme/new-authz		TCP connection reset by peer
07.10.2019	00:19		/directory		TCP connection reset by peer
07.10.2019	03:19		/acme/new-authz		TCP connection reset by peer

Date		Time		v2 Endpoint		Error
07.10.2019	10:44		/acme/new-acct		TCP connection reset by peer
07.10.2019	13:35		/acme/new-nonce		TCP connection reset by peer
07.10.2019	16:24		/directory		TCP connection reset by peer
07.10.2019	23:35		/acme/new-acct		TCP connection reset by peer
08.10.2019	03:15		/acme/new-nonce		TCP connection reset by peer
08.10.2019	04:57		/acme/chall-v3/681374162/Utss7g		TCP connection reset by peer
08.10.2019	08:36		/acme/authz-v3/683535874		TCP connection reset by peer
08.10.2019	08:57		/acme/chall-v3/683734935/dB4U8A		TCP connection reset by peer
08.10.2019	09:17		/acme/new-acct		TCP connection reset by peer
08.10.2019	12:23		/acme/authz-v3/685829208		TCP connection reset by peer
08.10.2019	13:01		/acme/new-nonce		TCP connection reset by peer
08.10.2019	13:10		/directory		TCP connection reset by peer
08.10.2019	14:07		/acme/new-nonce		TCP connection reset by peer
08.10.2019	14:40		/directory		TCP connection reset by peer
08.10.2019	15:13		/acme/authz-v3/687506321		TCP connection reset by peer
08.10.2019	15:16		/acme/new-nonce		TCP connection reset by peer
08.10.2019	17:08		/acme/new-order		TCP connection reset by peer
08.10.2019	17:15		/acme/new-nonce		TCP connection reset by peer
08.10.2019	18:09		/acme/chall-v3/689037489/Ij7g3g		TCP connection reset by peer
08.10.2019	19:35		/acme/new-order		TCP connection reset by peer
08.10.2019	23:05		/acme/new-nonce		TCP connection reset by peer
09.10.2019	00:25		/directory		TCP connection reset by peer
09.10.2019	00:32		/acme/new-acct		TCP connection reset by peer
09.10.2019	00:34		/acme/chall-v3/692785848/7JJlFw		TCP connection reset by peer
09.10.2019	03:01		/directory		TCP connection reset by peer
09.10.2019	04:50		/acme/new-order		TCP connection reset by peer
09.10.2019	06:33		/acme/chall-v3/696489572/WgutoA		TCP connection reset by peer
09.10.2019	08:55		/acme/new-acct		TCP connection reset by peer
09.10.2019	09:00		/directory		TCP connection reset by peer
09.10.2019	09:40		/acme/authz-v3/698269473		TCP connection reset by peer
09.10.2019	12:20		/acme/new-acct		TCP connection reset by peer
09.10.2019	12:54		/acme/new-nonce		TCP connection reset by peer
09.10.2019	14:56		/acme/chall-v3/701278204/O1uALw		TCP connection reset by peer
09.10.2019	15:30		/acme/new-order		TCP connection reset by peer
09.10.2019	19:40		/acme/new-nonce		TCP connection reset by peer

I’ll update this Posting as new Errors occur …

awwsu · October 8, 2019, 5:30pm

FWIW, I am also seeing this error when using an Ansible playbook to update. In our case, we have a proxy server involved, so the logs would be… messy. I’ll see what I can get into a useful format.

brablc · October 9, 2019, 9:11am

We have been getting the following error in some 1-2% of renewals for a couple of days. Not sure if this is the same error:

Attempting to renew cert (www.somedomain.sk) from /etc/letsencrypt/renewal/www.somedomain.sk.conf produced an unexpected error: HTTPSConnectionPool(host='acme-v02.api.letsencrypt.org', port=443): Max retries exceeded with url: /directory (Caused by SSLError(SSLError("bad handshake: SysCallError(104, 'ECONNRESET')",),)). Skipping. All renewal attempts failed. The following certs could not be renewed: /etc/letsencrypt/live/www.somedomain.sk/fullchain.pem (failure)

futureweb · October 9, 2019, 11:07am

Just had a quick look at what Percentage of the Requests fail …

On 08.10.2019 we had 70 Requests in total, 17 of them failed … so about 1/4 of all Requests fail … more than I initially thought … :-/
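
The "about 1/4" estimate can be checked with the two numbers given in this post (17 failures out of 70 requests). A minimal sketch; the function name `failure_rate` is made up for illustration:

```python
def failure_rate(failed: int, total: int) -> float:
    """Fraction of requests that failed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return failed / total


rate = failure_rate(17, 70)  # numbers from the post for 08.10.2019
print(f"{rate:.1%}")         # prints 24.3%
```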

jillian · October 9, 2019, 5:14pm

Thank you everyone for all the details and information! It’s really appreciated and helpful for us to know more about the scope. We’re still investigating the root cause as a high-priority issue.

andygabby · October 10, 2019, 1:07am

Hi folks!

Thanks for all the help troubleshooting and patience!

We have found a couple places that were causing at least some errors and fixed them.

Added additional frontend proxy capacity.
Lowered our frontend proxy keepalive timeout to be less than our firewall session timeout.

Let us know if you are still seeing problems.
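
One plausible reading of the second fix: if a proxy keeps an idle connection open longer than the firewall remembers the session, there is a window in which the proxy still considers the connection usable while the firewall has already dropped its state. Traffic on it in that window can then be reset. The sketch below only illustrates that relationship. The function name and the timeout values are hypothetical; the thread does not state the real settings.

```python
def stale_window_seconds(proxy_keepalive_s: float, firewall_session_s: float) -> float:
    """Seconds an idle connection outlives the firewall's session entry.

    A result of 0 means the proxy closes idle connections before the
    firewall forgets them, which is the state the fix aims for.
    """
    return max(0.0, proxy_keepalive_s - firewall_session_s)


# Hypothetical values, for illustration only.
print(stale_window_seconds(120, 60))  # keepalive longer than firewall timeout
print(stale_window_seconds(30, 60))   # keepalive shorter: no stale window
```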
