Relation between HTTP Keep Alive duration and TCP timeout duration - http

I am trying to understand the relation between TCP/IP and HTTP timeout values. Are these two timeout values different or the same? Most web servers allow users to set the HTTP Keep-Alive timeout value through some configuration. How is this value used by web servers? Is it just set on the underlying TCP/IP socket, i.e. are the HTTP Keep-Alive timeout and the TCP/IP keep-alive timeout the same, or are they treated differently?
My understanding is (maybe incorrect):
The Web server uses the default timeout on the underlying TCP socket (i.e. indefinite) regardless of the configured HTTP Keep Alive timeout and creates a Worker thread that counts down the specified HTTP timeout interval. When the Worker thread hits zero, it closes the connection.
EDIT:
My question is about the relation or difference between the two timeout durations, i.e. what happens when the HTTP keep-alive timeout and the timeout on the socket (SO_TIMEOUT) that the web server uses are different? Should I even worry about whether the two are the same?

An open TCP socket does not require any communication whatsoever between the two parties (let's call them Alice and Bob) unless actual data is being sent. If Alice has received acknowledgments for all the data she's sent to Bob, there's no way she can distinguish among the following cases:
Bob has been unplugged, or is otherwise inaccessible to Alice.
Bob has been rebooted, or otherwise forgotten about the open TCP socket he'd established with Alice.
Bob is connected to Alice, and knows he has an open connection, but doesn't have anything he wants to say.
If Alice hasn't heard from Bob in a while and wants to distinguish among the above conditions, she can resend her last byte of data, wrapped in a suitable TCP segment to be recognizable as a retransmission, essentially pretending she hasn't heard the acknowledgment. If Bob is unplugged, she'll hear nothing back, even if she repeatedly sends the packet over a period of many seconds. If Bob has rebooted or forgotten the connection, he will immediately respond saying the connection is invalid. If Bob is happy with the connection and simply has nothing to say, he'll respond with an acknowledgment of the retransmission.
The Timeout indicates how long Alice is willing to wait for a response when she sends a packet which demands a reply. The Keepalive time indicates how much time she should allow to lapse before she retransmits her last bit of data and demands an acknowledgment. If Bob goes missing, the sum of the Keepalive and Timeout values will indicate the worst-case time between Alice receiving her last bit of data and her deciding that Bob is dead.
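As a rough illustration of the two values described above, here is a minimal Python sketch that enables TCP keepalive on a socket and estimates the worst-case detection time. The function names and default numbers are made up for this example, and the three tuning options are the Linux names, so the sketch skips any that the platform does not expose. On Linux the "timeout" part corresponds to the probe interval multiplied by the probe count, which is an assumption about how the answer's terms map onto the socket options.

```python
import socket


def enable_tcp_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive and tune its timings where the platform allows."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These option names are Linux-specific; other platforms may lack some.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)


def worst_case_detection_s(idle_s, interval_s, probes):
    """Idle time before the first probe plus the time spent on unanswered probes."""
    return idle_s + interval_s * probes
```

With the usual Linux defaults (7200 s idle, 75 s interval, 9 probes) worst_case_detection_s returns 7875 seconds, a little over two hours.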

They're two separate mechanisms; the name is a coincidence.
HTTP keep-alive (also known as persistent connections) is keeping the TCP socket open so that another request can be made without setting up a new connection.
TCP keep-alive is a periodic check to make sure that the connection is still up and functioning. It's often used to ensure that a NAT box (e.g., a DSL router) doesn't "forget" the mapping between an internal and external IP/port.
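To make the HTTP side concrete, here is a small Python sketch using only the standard library. It starts a throwaway local server (the Handler class and the function name are made up for this example), sends two requests through one http.client connection, and records the local port each time. If the connection is persistent, both requests leave from the same local port, meaning the same TCP socket was reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 makes connections persistent by default

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def two_requests_one_connection():
    """Return the client-side port used for each of two sequential requests."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        ports = []
        for _ in range(2):
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body so the socket can be reused
            ports.append(conn.sock.getsockname()[1])
        conn.close()
        return ports
    finally:
        server.shutdown()
        server.server_close()
```

No TCP keepalive probes are involved here; the socket simply stays open between requests.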

KeepAliveTimeout Directive
Description: Amount of time the server will wait for subsequent requests on a persistent connection
Syntax: KeepAliveTimeout seconds
Default: KeepAliveTimeout 15
Context: server config, virtual host
Status: Core
Module: core
The number of seconds Apache will wait for a subsequent request before closing the connection. Once a request has been received, the timeout value specified by the Timeout directive applies.
Setting KeepAliveTimeout to a high value may cause performance problems in heavily loaded servers. The higher the timeout, the more server processes will be kept occupied waiting on connections with idle clients.
In a name-based virtual host context, the value of the first defined virtual host (the default host) in a set of NameVirtualHost will be used. The other values will be ignored.
TimeOut Directive
Description: Amount of time the server will wait for certain events before failing a request
Syntax: TimeOut seconds
Default: TimeOut 300
Context: server config, virtual host
Status: Core
Module: core
The TimeOut directive currently defines the amount of time Apache will wait for three things:
The total amount of time it takes to receive a GET request.
The amount of time between receipt of TCP packets on a POST or PUT request.
The amount of time between ACKs on transmissions of TCP packets in responses.
We plan on making these separately configurable at some point down the road. The timer used to default to 1200 before 1.2, but has been lowered to 300 which is still far more than necessary in most situations. It is not set any lower by default because there may still be odd places in the code where the timer is not reset when a packet is sent.
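The directives above describe an application-level timer, not a TCP setting. One way to picture it is the following Python sketch of a server loop that waits for the next request with a read timeout and closes the connection when the client stays idle. This is not how Apache is implemented; serve_connection and its return values are hypothetical, and for simplicity the same timeout is applied to every read, whereas Apache switches to the separate TimeOut value once a request has started.

```python
import socket


def serve_connection(conn, keep_alive_timeout_s):
    """Answer requests on one connection until the client goes idle or closes."""
    served = 0
    conn.settimeout(keep_alive_timeout_s)  # bounds each wait for more request data
    buf = b""
    try:
        while True:
            # Read until a full request header block has arrived.
            while b"\r\n\r\n" not in buf:
                chunk = conn.recv(4096)
                if not chunk:
                    return "client_closed", served
                buf += chunk
            _, _, buf = buf.partition(b"\r\n\r\n")
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
            served += 1
    except socket.timeout:
        return "idle_timeout", served
    finally:
        conn.close()  # the server, not the TCP stack, decides to close
```

The socket itself has no idle limit here; the countdown lives entirely in the server code, which is why the HTTP keep-alive timeout and TCP keepalive settings can be chosen independently.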

Related

Data cost of keeping a tcp connection open

Let's suppose 2 computers:
The first is running a netcat server on a tcp port.
The second is running a netcat client, connected to the previous netcat server.
(netcat is an example, you can imagine a basic c program with socket)
We can send data between the 2 computers.
Let's imagine nobody sends data for several days.
Is there a timeout in the TCP stack?
Does netcat (or the operating system) send some packets to keep the connection open?
What I want to know is how much data is sent if there is no top-level activity.
Thanks
Is there a timeout in the TCP stack?
There are many different timeouts in the TCP stack, depending on the current connection state and on how the connection was configured (e.g. with keepalive or not). An idle connection timeout (which is what you refer to) does not seem to be defined. With keepalive enabled, the default idle time before the first probe is ~2 hours. That being said, pretty much every firewall in the world will set up some timeout. Based on this reddit thread, 15 minutes looks like a reasonable assumption, maybe even 1 hour. But multiple days? I doubt the connection will stay alive in any network (except your own).
Does netcat (or the operating system) send some packets to keep the connection open?
No. You will have to do it yourself by sending data. With the keepalive option for TCP, the OS will do it for you (note: keepalive is disabled by default), but this only works between direct peers, i.e. it may fail when proxies are involved. Sending data is definitely a better approach.
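A minimal Python sketch of the "send data yourself" approach might look like the following. The start_heartbeat helper, the one-byte newline payload, and the interval are all assumptions made for this example; the peer's protocol has to tolerate (and ignore) whatever heartbeat payload is chosen.

```python
import socket
import threading


def start_heartbeat(sock, interval_s, payload=b"\n"):
    """Send `payload` every `interval_s` seconds until the returned event is set."""
    stop = threading.Event()

    def run():
        # Event.wait returns False on timeout and True once stop has been set.
        while not stop.wait(interval_s):
            try:
                sock.sendall(payload)
            except OSError:
                break  # the connection is gone; nothing left to keep alive

    threading.Thread(target=run, daemon=True).start()
    return stop
```

Calling stop.set() on the returned event ends the heartbeat. The interval would normally be chosen below the shortest idle timeout expected on the path, such as the firewall timeouts mentioned above.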

TCP connection: After a while, server cannot send packets to client. Client can though

I think it relates just to the TCP layer, but I describe my setup in the following paragraph:
On Google Compute Engine I set up an HTTP and WebSocket server (Python, geventwebsocket+gevent.WSGIServer). At home I have my computer (ESP8266) that connects to it using websockets.
I use websockets because I need bidirectional communication (a couple of messages a day, it goes like this: a message from server, a response from client.) The connection itself is initiated by the client, as it's behind a NAT.
The problem is that a few seconds after the last packet exchange, messages from the server no longer arrive at the client. However, the client can still send packets to the server even minutes later (and possibly much longer). Interestingly, when it does, the (probably retransmitted) packets from the server finally arrive.
I verified with Wireshark that the packets are indeed sent from the server (and retransmitted if not ACKed), and I log all network communication on the client, so the problem probably isn't the application software. I get no exceptions in the applications. The connections are open.
I tested how long after connection initiation or the last delivered packet the server can still send packets: it is between 6 and 20 seconds, varying between tests. In each test the server sends out packets with a fixed delay between them.
In a test (a couple of packets) with a single set delay, usually either all packets arrive or none (if one doesn't arrive, the next won't either).
I suspect that might be because of the NAT. But then the one solution I see would be to periodically (every 6 seconds or less) send out keep alive packets (Pings and Pongs in websocket, or the TCP's keepalive) from the client. But that doesn't seem elegant, as there should be only a few data messages in a day.
And the similar thing happens when ssh'ing from my desktop to the server: after a couple seconds of inactivity at my and server side, the server stops sending anything (tested e.g. with watch -n20 date. Sometimes it just freezes and doesn't update until I press a key = send a packet from client. But the update is not instant in case of the ssh, it takes a couple of seconds after the keypress to see new stuff. Edit: of course that must be due to the retransmission timer algorithm)
So I studied what is the purpose of TCP keep-alive packets etc. and the thing is that routers and NAT's forget the connections or mappings or whatever in some time/keep only the newest. (So I guess in the case of client->server the mappings just recreate as the destination ip is public and is the actual server. And in the opposite direction it is not possible, so it doesn't work.)
But didn't think it can be as bad as in 6 seconds. The websockets almost reduce to polling (although with a possibly smaller lag).
It seems that the router's NAT mechanism may be causing the problem. Maybe you can use a small tool that speaks NAT-PMP or UPnP to open a port and map it to your local client. The mapping will last long enough for you to do bidirectional communication.

Why are underlying TCP connections released so late?

As you see above, the TCP connection is released very slowly.
I'm wondering why this happens and whether it affects my program (HTTP layer)?
These are persistent connections, as defined by HTTP/1.1. When a client makes several requests to the server, the requests can share one underlying TCP connection.
In your case the request was performed and the system waited for a while, expecting another request. After 30 seconds of inactivity it considers the connection idle and closes it (sends a TCP FIN).
About the impact on the system: some resources are consumed for TCP connection handling. This may be an issue on huge servers handling millions of requests, but I don't think that is your case.

What is an idle http connection?

I am working with http connection and using a MultiThreadedHttpConnectionManager and httpClient.
For my purposes I am closing all idle connections after 1 ms with the following method: closeIdleConnections(1).
I am wondering what is considered an "idle connection" in HTTP? It seems that a connection waiting for an answer is not idle.
Regards,
HTTP (1.1) specifies that connections should remain open until explicitly closed, by either party. Beyond that the specification provides only one example for a policy, suggesting using a timeout value beyond which an inactive (idle) connection should be closed. A connection kept open until the next HTTP request reduces latency and TCP connection establishment overhead. However, an idle open TCP connection consumes a socket and buffer space memory.
Excerpt from RFC 7230:
6.5. Failures and Timeouts
Servers will usually have some time-out value beyond which they will no longer maintain an inactive connection. Proxy servers might make this a higher value since it is likely that the client will be making more connections through the same server. The use of persistent connections places no requirements on the length (or existence) of this time-out for either the client or the server.
When a client or server wishes to time-out it SHOULD issue a graceful close on the transport connection. Clients and servers SHOULD both constantly watch for the other side of the transport close, and respond to it as appropriate. If a client or server does not detect the other side's close promptly it could cause unnecessary resource drain on the network.
A client, server, or proxy MAY close the transport connection at any time. For example, a client might have started to send a new request at the same time that the server has decided to close the "idle" connection. From the server's point of view, the connection is being closed while it was idle, but from the client's point of view, a request is in progress.
From studying the source code of the HttpClient MultiThreadedHttpConnectionManager implementation: a connection is simply considered idle when the time it has spent in the pool exceeds idleTime. The idleTime is passed to closeIdleConnections(idleTime) as an argument.
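The rule described above can be sketched in a few lines. This is a Python illustration of the idea only, not the HttpClient source (which is Java); the IdleTrackingPool class and its method names are hypothetical, and the clock is injectable so the behaviour can be checked without sleeping.

```python
import time


class IdleTrackingPool:
    """Remembers when each connection was released and evicts stale ones."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._released_at = {}  # connection -> time it was returned to the pool

    def release(self, conn):
        self._released_at[conn] = self._clock()

    def acquire(self, conn):
        # A checked-out connection is in use, so it is no longer tracked as idle.
        self._released_at.pop(conn, None)

    def close_idle_connections(self, idle_time_s):
        """Close every pooled connection released at least idle_time_s ago."""
        cutoff = self._clock() - idle_time_s
        stale = [c for c, t in self._released_at.items() if t <= cutoff]
        for conn in stale:
            del self._released_at[conn]
            conn.close()
        return stale
```

Under this model a connection that is checked out and waiting for a response is never in the idle table, which is consistent with the observation in the question.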

How many times will TCP retransmit

In the case of a half open connection where the server crashes (no FIN or RESET sent to client), and the client attempts to send some data on this broken connection, each TCP segment will go un-ACKED. TCP will attempt to retransmit packets after some timeout. How many times will TCP attempt to retransmit before giving up and what happens in this case? How does it inform the operating system that the host is unreachable? Where is this specified in the TCP RFC?
If the server program crashes, the kernel will clean up all open sockets appropriately. (Well, appropriate from a TCP point of view; it might violate the application layer protocol, but applications should be prepared for this event.)
If the server kernel crashes and does not come back up, the number and timing of retries depends on whether the socket was already connected:
tcp_retries1 (integer; default: 3; since Linux 2.2)
The number of times TCP will attempt to
retransmit a packet on an established connection
normally, without the extra effort of getting
the network layers involved. Once we exceed
this number of retransmits, we first have the
network layer update the route if possible
before each new retransmit. The default is the
RFC specified minimum of 3.
tcp_retries2 (integer; default: 15; since Linux 2.2)
The maximum number of times a TCP packet is
retransmitted in established state before giving
up. The default value is 15, which corresponds
to a duration of approximately between 13 to 30
minutes, depending on the retransmission
timeout. The RFC 1122 specified minimum limit
of 100 seconds is typically deemed too short.
(From tcp(7).)
If the server kernel crashes and does come back up, it won't know about any of the sockets, and will RST those follow-on packets, enabling failure much faster.
If any single-point-of-failure routers along the way crash, if they come back up quickly enough, the connection may continue working. This would require that firewalls and routers be stateless, or if they are stateful, have rulesets that allow preexisting connections to continue running. (Potentially unsafe, different firewall admins have different policies about this.)
The failures are returned to the program through errno: ECONNRESET when the peer answers with a RST, and typically ETIMEDOUT when the retransmissions are exhausted (at least for send(2)).
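To see where a figure in the "13 to 30 minutes" range quoted above can come from, here is a small Python calculation. It assumes a simplified model: the retransmission timeout starts at 200 ms, doubles after every unanswered attempt, and is capped at 120 s. Those bounds and the function name are assumptions for this sketch; on a real connection the starting value depends on the measured round-trip time.

```python
def retransmission_give_up_s(retries=15, rto_min_s=0.2, rto_max_s=120.0):
    """Total wait when every retransmission goes unanswered.

    Assumes the RTO starts at rto_min_s, doubles after each attempt,
    and is capped at rto_max_s. The final term is the wait after the
    last retransmission before the connection is abandoned.
    """
    return sum(min(rto_min_s * 2 ** i, rto_max_s) for i in range(retries + 1))
```

With these assumed values, retransmission_give_up_s() comes to about 924.6 seconds, roughly 15 minutes; a larger starting timeout pushes the total toward the upper end of the quoted range.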

Resources