Consider the following scenario: there are client-a and server-b. server-b has TCP keepalive disabled, and it has no application logic to check whether a TCP connection is still open. client-a establishes a TCP connection to server-b, and no data is transferred between them afterward. In such a case, will the server ever close the idle connection, or will it keep the TCP connection open forever?
After reading "Longest Open TCP Connection?", I guess that such a TCP connection will stay open forever. Is my guess correct?
There is no limit in the TCP connection itself. Client and server could in theory stay connected for years without exchanging any data and without any packet flow. Problems are usually caused by middleboxes like NAT routers or firewalls, which keep state and expire it after some inactivity. Any new packets sent within the connection can no longer be delivered then, because no associated state exists anymore in the middlebox.
Related
Can you describe the main process in terms of TCP connection states?
In fact, I'm more concerned about whether connections that have already been established can be closed after the client receives a proper reply from the server... That's part of a graceful shutdown, I think.
The accepted connections are totally independent of the listening socket, so the server can stop listening and the accepted sockets can still be used as if nothing happened. This means that each accepted socket has its own TCP connection state (per the TCP state diagram).
Often, though, servers stop listening when they are shut down, so they close all sockets at that time.
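A minimal sketch demonstrating this (the loopback address and port are arbitrary): closing the listening socket has no effect on a connection that has already been accepted.

    import socket

    # Listen on an arbitrary loopback port (hypothetical example values).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 5000))
    listener.listen()

    client = socket.create_connection(("127.0.0.1", 5000))
    conn, addr = listener.accept()

    # Stop listening: no *new* connections can be accepted from now on...
    listener.close()

    # ...but the already-accepted connection keeps working as if nothing happened.
    client.sendall(b"still alive")
    print(conn.recv(1024))  # b'still alive'

    conn.close()
    client.close()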
Architecture:
We have a bunch of IoT devices connected via an AWS Network Load Balancer (NLB) to our backend servers.
This is a bidirectional channel (not request/response style; messages are passed from either party to the other).
Objective:
How can we keep connections (on both sides of the NLB) alive during inactivity?
Description:
Frequently, clients go into inactive mode and do not send (or receive) anything to (or from) the servers. If this state lasts longer than 350 seconds (the connection idle timeout value of NLBs), the LB silently kills the connection. This is bad, because we see a lot of RST packets everywhere.
Questions:
I'm aware of the SO_KEEPALIVE feature and can enable it on our backend servers (a sketch of what I mean follows these questions). This keeps the connection between the backend servers and the NLB alive. But what about the clients? Do NLBs forward TCP keep-alive packets to the other party? (Here it says they do not.) If they do not, how do we keep the client connections open? (At the moment, I'm thinking of sending an empty message to keep the connection alive.)
Is this behavior specific to AWS NLBs, or do load balancers generally work this way?
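For reference, this is roughly how I plan to enable SO_KEEPALIVE on the backend servers (a sketch using Linux-specific socket options; the timer values are my own assumptions, chosen so the probes fire well before the 350-second idle timeout):

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Linux-specific knobs; the values are assumptions chosen to beat
    # the NLB's 350-second idle timeout.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 300)  # idle seconds before first probe
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 30)  # seconds between probes
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)     # failed probes before dropping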
AWS docs say that the NLB TCP listener has the ability to keep a connection alive with TCP keep-alive packets: link
For TCP listeners, clients or targets can use TCP keepalive packets to reset the idle timeout.
Based on my tests, the client receives the TCP keep-alive packets sent by the server and correctly responds to them.
The server doesn't interrupt the connection, which means it receives the client's responses.
This implies that the NLB TCP listener actually forwards keep-alive packets.
Based on the same docs, the NLB TLS listener shouldn't react the same way to TCP keep-alive packets:
TCP keepalive packets are not supported for TLS listeners.
But actual test results surprised me: Wireshark showed keep-alive packets arriving at a client connected through a TLS listener.
My previous test results from 2 months ago don't correspond to what I'm experiencing now, so I suspect the behaviour may have changed.
(Previously, the server kept the connection open even after the client became unavailable in an unexpected manner.)
Not an answer, just to document what I found/did:
NLBs do not forward keep-alive packets, meaning you have to enable them on both the server and the clients.
The NLB's idle timeout cannot be changed; it's 350 seconds.
I couldn't find any way to forge an empty TCP packet to fool the LB into forwarding it to the other side.
In the end, we implemented the keep-alive feature at the application layer, by periodically sending an empty message to clients (a sketch is below).
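For reference, the application-layer keepalive looks roughly like this (a sketch; the 300-second period is an assumption chosen to stay under the NLB's 350-second idle timeout, and the one-byte payload stands in for whatever "empty" message your protocol defines):

    import socket
    import threading
    import time

    HEARTBEAT_PERIOD = 300  # seconds; assumption, must be < the NLB's 350s idle timeout

    def keep_alive(sock: socket.socket) -> None:
        """Periodically push a tiny message so the LB sees traffic on both hops."""
        while True:
            time.sleep(HEARTBEAT_PERIOD)
            try:
                sock.sendall(b"\x00")  # placeholder "empty" heartbeat message
            except OSError:
                return  # connection already gone; stop the heartbeat

    # Usage: run the heartbeat in the background next to the normal I/O logic, e.g.
    # threading.Thread(target=keep_alive, args=(sock,), daemon=True).start()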
Consider a scenario where there is one server and multiple clients, and each client creates TCP connections to interact with the server. There are three usages of TCP keepalive:
1. Server-side keepalive: The server sends TCP keepalive to make sure that the client is alive. If the client is dead, the server closes the TCP connection to the client.
2. Client-side keepalive: Clients send TCP keepalive to prevent the server from closing the TCP connection to the client.
3. Both-side keepalive: Both server and clients send TCP keepalive as described in 1 and 2.
Which of the above usages of TCP keepalive are typical?
Actually, both server and client peers may use TCP keepalive. It is useful to ensure that the operating system will eventually release any resources associated with dead connections. Note that if a connection between two hosts gets lost because of some issue with a router between them, then both hosts have to independently detect that the connection is dead and clean up for themselves.
Now, each host maintains a timer on each connection indicating when it last received a packet associated with that connection. A host will send a keepalive packet when that timer goes over a certain threshold, which is defined locally (that is, hosts do not exchange information about their own keepalive configuration). So the host with the lowest keepalive time will take the initiative and send a keepalive packet to the other host. If the packet indeed goes through, the other host (the one with the higher keepalive time) will respond to that packet and reset its own timer; therefore, the host with the higher keepalive time will never need to send a keepalive packet itself, unless the connection has indeed been lost.
Arguably, servers are generally more aggressive about keepalive than client machines (that is, they will more often be configured with a lower keepalive time), because hanging connections often have undesirable effects on server software (for example, the software may accept only a limited number of concurrent connections, or the server may fork a new process instance for each connection).
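For concreteness, the locally defined thresholds mentioned above are operating-system configuration. On Linux, the system-wide defaults can be inspected like this (a sketch assuming the standard Linux procfs paths):

    # Linux system-wide TCP keepalive defaults:
    #   tcp_keepalive_time   - idle seconds before the first probe (default 7200)
    #   tcp_keepalive_intvl  - seconds between probes (default 75)
    #   tcp_keepalive_probes - unanswered probes before the connection is declared dead (default 9)
    for name in ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes"):
        with open(f"/proc/sys/net/ipv4/{name}") as f:
            print(name, "=", f.read().strip())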
Server-side keepalive: The server sends TCP keepalive to make sure that the client is alive. If the client is dead, the server closes the TCP connection to the client.
If the client is dead, the server gets a 'connection reset' error, after which it should close the connection.
Client-side keepalive: Clients send TCP keepalive to prevent the server from closing the TCP connection to the client.
No. The client sends keepalive so that if the server is dead, the client will get a 'connection reset' error, after which it should close the connection.
Both-side keepalive
Both sides are capable of getting a 'connection reset' due to keepalive failure, as above.
Which of the above usages is typical?
Any of them, or none. If a peer is sending regularly it doesn't really need keepalive as well. It is therefore often of more use to a server than a client.
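For illustration, this is roughly how a keepalive failure surfaces to a blocking reader (a sketch; the peer address is hypothetical). Once the probes are exhausted, the kernel declares the connection dead and the blocked read fails:

    import socket

    # Hypothetical peer; any long-lived TCP connection behaves the same way.
    sock = socket.create_connection(("192.0.2.10", 9000))
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    try:
        data = sock.recv(4096)            # blocks while the connection idles
        if not data:
            print("peer closed cleanly")  # orderly FIN from the peer
    except OSError as exc:
        # Keepalive probes went unanswered: typically ETIMEDOUT, or
        # ECONNRESET if the peer's host answers a probe with a RST.
        print("connection lost:", exc)
    finally:
        sock.close()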
I have around 20 clients communicating with a central server in the same LAN. The clients can make transactions simultaneously with the server, and the server forwards each transaction to an external appliance on the network. Sometimes it works, and sometimes my application shows a "time out" message on a client's screen (seemingly at random).
I mirrored all traffic and found TCP Retransmission packets after TCP Reset packets for the first TCP sequence. I immediately thought about packet loss, but all my cables/NICs are fine, and I do not see DUP ACKs in the capture.
It seems that RST packets may have different meanings.
What causes those TCP Reset?
Where should I focus my investigation: the network or the application design?
I would appreciate any help. Thanks in advance.
Judging by the capture, I assume your central server is 137.56.64.31. What's happening is that the clients initiate a connection to the server with a SYN packet and the server responds with a RST. This is typical when the server has no application listening on that particular port, e.g. the webserver application isn't running and a client tries to connect to port 80.
The clients are all connecting to different ports on the server, which is unusual for a central server, but not unheard of. The destination ports the clients are connecting to are: 11007, 11012, 11014, 11108, and 11115. Is that normal for the application? If not, the clients should be connecting to whatever port the application server is listening on.
The reason for the retransmits is that instead of giving up on the connection upon receiving a RST from the server, the client tries to initiate the connection again, so Wireshark flags it as a retransmission.
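You can reproduce this locally: connecting to a port with no listener makes the OS answer the SYN with a RST, which surfaces as "connection refused" (a minimal sketch; it assumes nothing on your machine is listening on port 11007):

    import socket

    try:
        # No listener on this port, so the SYN is answered with a RST.
        socket.create_connection(("127.0.0.1", 11007), timeout=5)
    except ConnectionRefusedError:
        print("RST received: no application is listening on that port")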
I have a client and a server component. The server may be installed behind a firewall or load balancer. Many sites/forums suggest using the TCP keep-alive feature to avoid connection termination due to inactivity.
The question is whether the keep-alive message from the client will actually reach the server.
I tried to simulate the deployment using the tcptrace utility and found that the keep-alive messages did not reach the server, yet the client was still getting ACKs for its keep-alive messages.
I am not sure whether all LBs/FWs work in the same manner.
Is keep-alive a good option to avoid connection termination due to inactivity on a socket when a firewall or load balancer is involved?
The answer is, of course: "it depends".
Many firewalls and load balancers maintain separate frontend and backend TCP connections, e.g.:
client <-- TCP --> firewall/balancer <-- TCP --> server
For situations like this, using TCP keepalive will not work as you'd expect. Why not? TCP keepalive works for that TCP session only, and the keepalive probe packets are more like "administrative overhead" packets than data-bearing packets. This means that a) using TCP keepalive on the client end only keeps the TCP connection to the firewall/balancer alive, and b) the firewall/balancer does not "forward" those keepalive probe packets across to the backend connection.
So is using TCP keepalive useful? Yes. There are other types of proxies which work at lower layers in the OSI stack, and which do forward those packets; using TCP keepalive is good for keeping your idle connection alive through those types of network intermediaries.
If your client/server application uses a long-lived, possibly idle TCP connection through firewalls/balancers, the best way to ensure that that connection is not torn down (sometimes politely, e.g. with a RST packet sent by the firewall/balancer, sometimes silently) is to use a "ping" or "heartbeat" message at the application layer. (Think of this as an "application keepalive".) This is just some kind of message that is sent e.g. from the client to the server. A simple and effective technique is to have the client periodically send some bytes to the server, which the server echoes back to the client. The client knows which bytes it sent, and when it receives those same bytes back from the server, it knows that everything in the network path is still working as expected.
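A minimal sketch of that echo technique (hypothetical framing: the client sends a short random token and expects the server to echo it back verbatim):

    import os
    import socket

    def ping(sock: socket.socket, timeout: float = 10.0) -> bool:
        """Send a random token and verify the server echoes it back unchanged."""
        token = os.urandom(8)  # client remembers exactly what it sent
        sock.settimeout(timeout)
        try:
            sock.sendall(token)
            echoed = b""
            while len(echoed) < len(token):
                chunk = sock.recv(len(token) - len(echoed))
                if not chunk:
                    return False  # server closed the connection
                echoed += chunk
            return echoed == token  # the whole path works iff the echo matches
        except OSError:
            return False  # timeout or reset: the path is broken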
Hope this helps!