Too many connections to memcached in TIME_WAIT state - tcp

I'm having trouble with connections to memcached.
I assume there are no free local ports during busy hours.
netstat -n | grep "127.0.0.1" | grep TIME_WAIT | wc -l
This command gives me 36-50k connections; it is probably even more at busy hours.
How can I extend the port range, or is there another way to fix this?

We have fixed it.
So if you have many connections in TIME_WAIT state (more than 10-20k), I recommend making some changes to the TCP/IP settings:
Modify net.ipv4.tcp_fin_timeout. We use 20s, and I think we could go to 15s or 10s because connections between our servers are really fast.
Extend the port range. Modify net.ipv4.ip_local_port_range and set it to "1024 65535".
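A minimal sketch of the two changes, applied at runtime with sysctl (the values are the ones we settled on; adjust to taste, and note that sysctl -w does not survive a reboot):
# Shorten the FIN timeout (we use 20s; 15s or 10s would probably also work on a fast LAN)
sysctl -w net.ipv4.tcp_fin_timeout=20
# Widen the ephemeral port range so far more local ports are available
sysctl -w net.ipv4.ip_local_port_range="1024 65535"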

Related

Overcome single TCP connection speed limit

I am a bit stuck and need some hints, please.
My ISP seems to throttle single connections to 400 kbps, no matter whether I use a VPN/SSH tunnel/proxy or connect directly. For file transfers it's not a problem: I can use lftp with multiple connections, which gets me to 1 Gbps up or down.
I have a few VPSes and a dedicated server which I would like to use as a tunnel to overcome this limitation. I have already tried WireGuard/OpenVPN/Cloak/Shadowsocks. Is there anything else that creates multiple TCP or UDP connections to get things going?
Thanks,
Dennis

Unable to reduce TIME_WAIT

I'm attempting to reduce the amount of time a connection is in the TIME_WAIT state by setting tcp_fin_timeout detailed here:
root:~# sysctl -w net.ipv4.tcp_fin_timeout=30
net.ipv4.tcp_fin_timeout = 30
However, this setting does not appear to affect anything. When I look at the netstat of the machine, the connections still wait the default 60s:
root:~# watch netstat -nato
tcp 0 0 127.0.0.1:34185 127.0.0.1:11209 TIME_WAIT timewait (59.14/0/0)
tcp 0 0 127.0.0.1:34190 127.0.0.1:11209 TIME_WAIT timewait (59.14/0/0)
Is there something I'm missing? The machine is running Ubuntu 14.04.1.
Your link is an urban myth. The actual function of net.ipv4.tcp_fin_timeout is as follows:
This specifies how many seconds to wait for a final FIN packet before the socket is forcibly closed. This is strictly a violation of the TCP specification, but required to prevent denial-of-service attacks. In Linux 2.2, the default value was 180.
This doesn't have anything to do with TIME_WAIT. It establishes a timeout for an orphaned socket in FIN_WAIT_2, after which the connection is reset (which bypasses TIME_WAIT altogether). This is a DoS measure, as stated, and should never arise in a correctly written client-server application. You don't want to set it so low that ordinary connections are reset: you will lose data. You don't want to fiddle with it at all, actually.
The correct way to reduce TIME_WAIT states is given here.
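For completeness: on Linux the TIME_WAIT duration itself is not a sysctl at all; it is the compile-time constant TCP_TIMEWAIT_LEN (60*HZ) in include/net/tcp.h, which is why the 60 seconds you see in netstat never moves. A quick check (the /usr/src path is only an example location for the kernel sources):
sysctl net.ipv4.tcp_fin_timeout                         # orphaned FIN_WAIT_2 timeout only
grep TCP_TIMEWAIT_LEN /usr/src/linux/include/net/tcp.h  # the hard-coded 60-second TIME_WAIT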

What could cause so many TIME_WAIT connections to be open?

So, I have application A on one server which sends 710 HTTP POST messages per second to application B on another server, which is listening on a single port. The connections are not keep-alive; they are closed.
After a few minutes, application A reports that it can't open new connections to application B.
I am running netstat continuously on both machines, and I see that a huge number of TIME_WAIT connections are open on each. Virtually all of the connections shown are in TIME_WAIT. From reading online, it seems that this is the state a connection stays in for 30 seconds (30 seconds on our machines, according to the /proc/sys/net/ipv4/tcp_fin_timeout value) after each side closes the connection.
I have a script running on each machine that's continuously doing:
netstat -na | grep 5774 | wc -l
and:
netstat -na | grep 5774 | grep "TIME_WAIT" | wc -l
The value of each, on each machine, seems to get to around 28,000 before application A reports that it can't open new connections to application B.
I've read that this file: /proc/sys/net/ipv4/ip_local_port_range provides the total number of connections that can be open at once:
$ cat /proc/sys/net/ipv4/ip_local_port_range
32768 61000
61000 - 32768 = 28232, which is right in line with the approximately 28,000 TIME_WAITs I am seeing.
My question is how is it possible to have so many connections in TIME_WAIT.
It seems that at 710 connections per second being closed, I should see approximately 710 * 30 seconds = 21300 of these at a given time. I suppose that just because there are 710 being opened per second doesn't mean that there are 710 being closed per second...
The only other thing I can think of is a slow OS getting around to closing the connections.
TCP's TIME_WAIT indicates that the local endpoint (this side) has closed the connection. The connection is being kept around so that any delayed packets can be matched to the connection and handled appropriately. The connections will be removed when they time out, within four minutes.
Assuming that all of those connections were valid, then everything is working correctly. You can eliminate the TIME_WAIT state by having the remote end close the connection or you can modify system parameters to increase recycling (though it can be dangerous to do so).
Vincent Bernat has an excellent article on TIME_WAIT and how to deal with it:
The Linux kernel documentation is not very helpful about what net.ipv4.tcp_tw_recycle does:
Enable fast recycling TIME-WAIT sockets. Default value is 0. It should
not be changed without advice/request of technical experts.
Its sibling, net.ipv4.tcp_tw_reuse is a little bit more documented but the language is about the same:
Allow to reuse TIME-WAIT sockets for new connections when it is safe
from protocol viewpoint. Default value is 0. It should not be changed
without advice/request of technical experts.
The mere result of this lack of documentation is that we find numerous tuning guides advising to set both these settings to 1 to reduce the number of entries in the TIME-WAIT state. However, as stated by tcp(7) manual page, the net.ipv4.tcp_tw_recycle option is quite problematic for public-facing servers as it won’t handle connections from two different computers behind the same NAT device, which is a problem hard to detect and waiting to bite you:
Enable fast recycling of TIME-WAIT sockets. Enabling this option is
not recommended since this causes problems when working with NAT
(Network Address Translation).
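If you do decide to tune, the article's conclusion boils down to this: net.ipv4.tcp_tw_reuse is the comparatively safe knob (it only applies to outgoing connections and relies on TCP timestamps), while net.ipv4.tcp_tw_recycle is the NAT-breaking one and was removed from the kernel entirely in Linux 4.12. A hedged sketch:
# Comparatively safe: reuse TIME_WAIT sockets for new outgoing connections
sysctl -w net.ipv4.tcp_tw_reuse=1
# Leave tcp_tw_recycle alone; on kernels >= 4.12 it no longer exists at all
sysctl net.ipv4.tcp_tw_recycle 2>/dev/null || echo "tcp_tw_recycle has been removed on this kernel"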

nginx connection limit

We had 2 nginx servers running perfectly at 1000 reqs/second total in front of 3 php5-fpm servers using TCP connections. We thought that one nginx server would be sufficient and redirected all of our traffic to it. But the server could not serve more than 750 reqs/sec. It has gigabit Ethernet and the total traffic on it does not exceed 100 Mbit/s (Debian 6.0).
We could not find any reason, and after googling we found that it might be related to TCP issues. But it did not seem very likely that we should have to change anything at this number of connections and this bandwidth (around 70 Mbit/s). Later we redirected half of our traffic back to the other nginx server and again reached 1000 reqs/second.
We have been looking at the nginx error and access logs. Is there any tool or file that could help us find the cause of the problem?
Most Linux distributions have 28232 ephemeral ports available by default. A server needs one ephemeral port for each connection in order to free up the primary port (e.g. HTTP server port 80) for new connections.
So it would seem that if the server is handling 1000 requests/sec for content generated by php5-fpm over TCP, you are allocating 2000 ports/sec. This is not really the case; it is more likely 5% PHP and 95% static (no port allocation), and IIRC nginx<->php-fpm keeps ports open for subsequent requests. There are lots of factors that can affect these numbers, but for argument's sake, let's say 1000 port allocations/sec.
On the surface this does not seem like a problem, but by default ports are not immediately released and made available for new connections. There are various reasons for this behavior, and I highly recommend a thorough understanding of TCP before arbitrarily making the changes detailed here (or anywhere else).
Primarily, a connection state called TIME_WAIT ("the socket is waiting after close to handle packets still in the network", netstat man page) is what keeps ports from being released for reuse. On recent (all?) Linux kernels TIME_WAIT is hard-coded to 60 seconds, and according to RFC 793 a connection may stay in TIME_WAIT for up to four minutes!
This means at least 1000 ports will be in use for at least 60 seconds. In the real world, you need to account for transit time, keep-alive requests (multiple requests using the same connection), and service ports (between nginx and the backend server). Let's arbitrarily knock it down to 750 ports/sec.
In ~37 seconds all your available ports will be used up (28232 / 750 = 37). That's a problem, because it takes 60 seconds to release a port!
To see all the ports in use, run apache bench or something similar that can generate the number of requests per second you are tuning for. Then run:
root:~# netstat -n -t -o | grep timewait
You'll get output like (but many, many more lines):
tcp 0 0 127.0.0.1:40649 127.1.0.2:80 TIME_WAIT timewait (57.58/0/0)
tcp 0 0 127.1.0.1:9000 127.0.0.1:50153 TIME_WAIT timewait (57.37/0/0)
tcp 0 0 127.0.0.1:40666 127.1.0.2:80 TIME_WAIT timewait (57.69/0/0)
tcp 0 0 127.0.0.1:40650 127.1.0.2:80 TIME_WAIT timewait (57.58/0/0)
tcp 0 0 127.0.0.1:40662 127.1.0.2:80 TIME_WAIT timewait (57.69/0/0)
tcp 0 0 127.0.0.1:40663 127.1.0.2:80 TIME_WAIT timewait (57.69/0/0)
tcp 0 0 127.0.0.1:40661 127.1.0.2:80 TIME_WAIT timewait (57.61/0/0)
For a running total of allocated ports:
root:~# netstat -n -t -o | wc -l
If you're receiving failed requests, the number will be at/close to 28232.
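As an aside (not part of the original answer), on anything with iproute2 installed, ss gives the same counts more cheaply than parsing netstat:
ss -s                               # summary counters, including the timewait total
ss -tn state time-wait | wc -l      # explicit TIME_WAIT count (minus one header line)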
How to solve the problem?
Increase the number of ephemeral ports from 28232 to 63976.
sysctl -w net.ipv4.ip_local_port_range="1024 65000"
Allow linux to reuse TIME_WAIT ports before the timeout expires.
sysctl -w net.ipv4.tcp_tw_reuse="1"
Add additional IP addresses (each extra source IP gives you another full set of ephemeral ports).
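Note that sysctl -w only lasts until the next reboot. If the new values work for you, the usual way to persist them is to put the same keys in /etc/sysctl.conf and reload with sysctl -p, for example:
# /etc/sysctl.conf -- apply with: sysctl -p
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_tw_reuse = 1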

What is the cost of many TIME_WAIT on the server side?

Let's assume there is a client that makes a lot of short-living connections to a server.
If the client closes the connection, there will be many ports in TIME_WAIT state on the client side. Once the client runs out of local ports, it becomes impossible to make new connection attempts quickly.
If the server closes the connection, I will see many TIME_WAITs on the server side. However, does this do any harm? The client (or other clients) can keep making connection attempts, since it never runs out of local ports, and the number of sockets in TIME_WAIT state will increase on the server side. What happens eventually? Does something bad happen? (Slowdown, crash, dropped connections, etc.)
Please note that my question is not "What is the purpose of TIME_WAIT?" but "What happens if there are so many TIME_WAIT states on the server?" I already know what happens when a connection is closed in TCP/IP and why the TIME_WAIT state is required. I'm not trying to troubleshoot it; I just want to know what the potential issues are.
To put it simply, let's say netstat -nat | grep :8080 | grep TIME_WAIT | wc -l prints 100000. What would happen? Does the OS's network stack slow down? A "Too many open files" error? Or just nothing to worry about?
Each socket in TIME_WAIT consumes some memory in the kernel, usually somewhat less than an ESTABLISHED socket yet still significant. A sufficiently large number could exhaust kernel memory, or at least degrade performance because that memory could be used for other purposes. TIME_WAIT sockets do not hold open file descriptors (assuming they have been closed properly), so you should not need to worry about a "too many open files" error.
The socket also ties up that particular src/dst IP address and port pair, so it cannot be reused for the duration of the TIME_WAIT interval. (This is the intended purpose of the TIME_WAIT state.) Tying up the port is not usually an issue unless you need to reconnect with the same port pair. Most often one side will use an ephemeral port, with only one side anchored to a well-known port. However, a very large number of TIME_WAIT sockets can exhaust the ephemeral port space if you are repeatedly and frequently connecting between the same two IP addresses. Note this only affects that particular IP address pair, and will not affect establishment of connections with other hosts.
Each connection is identified by a tuple (server IP, server port, client IP, client port). Crucially, the TIME_WAIT connections (whether they are on the server side or on the client side) each occupy one of these tuples.
With the TIME_WAITs on the client side, it's easy to see why you can't make any more connections - you have no more local ports. However, the same issue applies on the server side - once it has 64k connections in TIME_WAIT state for a single client, it can't accept any more connections from that client, because it has no way to tell the difference between the old connection and the new connection - both connections are identified by the same tuple. The server should just send back RSTs to new connection attempts from that client in this case.
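To see how close you are getting to that per-client limit, it helps to count TIME_WAIT sockets per remote address rather than in total. A quick sketch with ss (port 8080 is taken from the example above; IPv4 output assumed):
ss -tn state time-wait '( sport = :8080 )' \
  | awk 'NR > 1 { sub(/:[^:]*$/, "", $NF); print $NF }' \
  | sort | uniq -c | sort -rn | head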
Findings so far:
Even if the server closes the socket with a system call, its file descriptor will not be released if the socket enters the TIME_WAIT state. The file descriptor will be released later, when the TIME_WAIT state is gone (i.e. after 2*MSL seconds). Therefore, too many TIME_WAITs may possibly lead to a 'too many open files' error in the server process.
I believe the OS TCP/IP stack is implemented with appropriate data structures (e.g. a hash table), so the total number of TIME_WAITs should not affect the performance of the stack itself. Only the process (server) which owns the sockets in TIME_WAIT state will suffer.
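Whether descriptors are actually held after close() is easy to check on a given system by comparing the server process's descriptor count with the TIME_WAIT count (a sketch; <pid> and port 8080 are placeholders):
ls /proc/<pid>/fd | wc -l                              # descriptors currently held by the server process
ss -tn state time-wait '( sport = :8080 )' | wc -l     # TIME_WAIT sockets, minus one header line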
If you have a lot of connections from many different client IPs to the server IPs you might run into limitations of the connection tracking table.
Check:
sysctl net.ipv4.netfilter.ip_conntrack_count
sysctl net.ipv4.netfilter.ip_conntrack_max
Over all src ip/port and dest ip/port tuples you can only have net.ipv4.netfilter.ip_conntrack_max entries in the tracking table. If this limit is hit, you will see the message "nf_conntrack: table full, dropping packet." in your logs, and the server will not accept new incoming connections until there is space in the tracking table again.
This limitation might hit you long before the ephemeral ports run out.
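On newer kernels the same counters live under net.netfilter.* instead of net.ipv4.netfilter.*, and the ceiling can be raised at runtime if the tracking table really is the bottleneck (the value below is only an example; size it to your memory):
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max   # current usage vs. limit
sysctl -w net.netfilter.nf_conntrack_max=262144                          # example value only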
In my scenario I ran a script which schedules files repeatedly; my product does some computations and sends the response to the client, i.e. the client makes a repetitive HTTP call to get the response for each file. When around 150 files are scheduled, the socket ports on my server go into TIME_WAIT state and an exception is thrown in the client which opens an HTTP connection, i.e.:
Error : [Errno 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted
The result was that my application hung. I do not know whether the threads went into a wait state or what happened, but I needed to kill all processes or restart my application to make it work again.
I tried reducing the wait time to 30 seconds, since it is 240 seconds by default, but it did not work.
So basically the overall impact was critical, as it made my application non-responsive.
It looks like the server can simply run out of ports to assign for incoming connections (for the duration of the existing TIME_WAITs) - an opening for a DoS attack.

Resources