Why are underlying TCP connections are released so late? - http

As you see above, the tcp connection release so slow.
I'm wondering how it happened and if it affect my program (http layer)?

This is persistent connections that defined by HTTP/1.1. When client makes requests to the server several requests can share one underlying TCP connection.
In your case request was performed and system waits for a while expecting other request. After 30 seconds inactivity it considers connection as idle and close it (sends TCP FIN).
About impact to the system: some resources are consumed for TCP connection handling. This may be an issue on huge servers handling millions requests but I don't think that this is your case.

Related

TCP connection: After a while, server cannot send packets to client. Client can though

I think it relates just to the TCP layer, but I describe my setup in the following paragraph:
On google compute engine I set up a http and websocket server (python, geventwebsocket+gevent.WSGIServer). At home I have my computer (esp8266) that connects to it using websockets.
I use websockets because I need bidirectional communication (a couple of messages a day, it goes like this: a message from server, a response from client.) The connection itself is initiated by the client, as it's behind a NAT.
The problem is that a couple of seconds from the last packet exchange, the messages from server don't arrive to the client. However, the client can send packets to the server even minutes after (and possibly much longer). And interestingly then, the probably retransmitted packets from server finally arrive.
I examined the packets are indeed sent from server with wireshark (and retrasmitted, if not ack'ed) and log every network communication on the client, so the problem probably isn't the application software. I get no exceptions in the applications. The connections are open.
I tested the time server can sent packets after the connection initiation/last delivered packet generally and it's between 6 and 20 seconds, varying between tests. In the test server sends out packets with a set, fixed, delay between them.
In a test (couple of packets) with the single set delay usually either all packets arrive, or none (yeah if one doesn't arrive, the next won't).
I suspect that might be because of the NAT. But then the one solution I see would be to periodically (every 6 seconds or less) send out keep alive packets (Pings and Pongs in websocket, or the TCP's keepalive) from the client. But that doesn't seem elegant, as there should be only a few data messages in a day.
And the similar thing happens when ssh'ing from my desktop to the server: after a couple seconds of inactivity at my and server side, the server stops sending anything (tested e.g. with watch -n20 date. Sometimes it just freezes and doesn't update until I press a key = send a packet from client. But the update is not instant in case of the ssh, it takes a couple of seconds after the keypress to see new stuff. Edit: of course that must be due to the retransmission timer algorithm)
So I studied what is the purpose of TCP keep-alive packets etc. and the thing is that routers and NAT's forget the connections or mappings or whatever in some time/keep only the newest. (So I guess in the case of client->server the mappings just recreate as the destination ip is public and is the actual server. And in the opposite direction it is not possible, so it doesn't work.)
But didn't think it can be as bad as in 6 seconds. The websockets almost reduce to polling (although with a possibly smaller lag).
It seems that the router's NAT mechanism may cause the problem. Maybe you can usee some little tools like NAT-PMP or Upnp to open a port and mapping to your local client. This will last long enough for you to do bidirectional communication.

TCP connection existing on server/client during TIME_WAIT state

A simple question:
When the connection reaches the TIME_WAIT state on the client, does the connection still exist on the server?
Thanks!
The connection reaches the TIME_WAIT state when both peers have closed their sockets. It doesn't exist for any practical purpose except timing out the TIME_WAIT state.
The presence of many TIME-WAIT TCBs can increases the demultiplexing time for active connections.
The design of TCP places the TIME-WAIT TCB at the endpoint that closes the connection; this decision conflicts with the semantics of many application protocols. The File Transfer Protocol (FTP) and HTTP both interpret the closing of the transport connection as an end-of-transaction marker. In each case, the application protocol requires that servers close the transport connection, and the transport protocol requires that servers incur a memory cost if they do. Protocols that use other methods of marking end-of-transaction, e.g., SUN RPC over TCP, can have the clients close connections at the expense of a more complex application protocol.
As networks become faster and support more users, the connection rates at busy servers are likely to increase, resulting in more TIME-WAIT loading.
Intuitively, the first endpoint to close a connection closes it actively, and the second passively; HTTP and FTP servers generally close connections actively.

Should my IoT device poll with HTTP or listen with TCP?

I'm creating an IoT Device + Server system using .NET Micro Framework and ASP.NET WebAPI (Probably in Azure).
The IoT device needs to be able to frequently update the server with stats whilst also being able to receive occasional incoming commands from the server that would change its behaviour. In this sense, the device needs to act as both client and server itself.
My concern is getting the best balance between the security of the device and the load on the server. Furthermore, there must be a relatively low amount of latency between the server needing to issue a command and the device carrying it out; of the order of a few seconds.
As I see it my options are:
Upon connection to the internet, the device establishes a persistent TCP connection to the server which is then used for both polling and receiving commands.
The device listens on a port (e.g. HttpListener) for incoming commands whilst updating the server via frequent HTTP requests.
The device only ever polls the server with HTTP requests. The server uses the response to give the device commands.
The 2nd option seems to be the least secure as the device would have open incoming ports. The 1st option looks the most difficult to reliably implement as it would require low level socket programming. The 3rd option seems easy and secure but due to the latency requirements the device would need to poll every few seconds. This impacts the scalability of the system.
So at what frequency does HTTP polling create more overhead than just constantly keeping a TCP connection open? 5s? 3s? 1s? Or am I overstating the overhead of keeping a TCP connection open in ASP.NET? Or is there a completely different way that this can be implemented?
Thanks.
So at what frequency does HTTP polling create more overhead than just constantly keeping a TCP connection open? 5s? 3s? 1s?
There is nothing to do to keep a TCP connection open. The only thing you might need to do is to use TCP keep-alive (which have nothing to do with HTTP keep-alive!) in case you want to keep the connection idle (i.e no data to send) for a long time.
with HTTP your overhead already starts with the first request, since your data need to be encapsulated into a HTTP message. This overhead can be comparable small if the message is large or it can easily be much larger than the message itself for small messages. Also, HTTP server close the TCP connection after some idle time so you might need to re-establish the TCP connection for the next data exchange which is again overhead and latency.
HTTP has the advantage to pass through most firewalls and proxies, while plain TCP does not. You also get encryption kind of free with HTTPS, i.e. there are established standards for direct encrypted connection and for tunneling through a proxy.
WebSockets is something in between: you do a HTTP request and then upgrade HTTP to WebSocket. The initial overhead is thus as large as for HTTP but for the next messages the overhead is not that much higher than TCP. And you can do also WebSockets with HTTPS (i.e. wss:// instead of ws://). It passes through most simple firewalls and proxies, but more deeper inspection firewalls might still have trouble with it.
Setting up a TCP listener will be a problem if you have your IoT device behind some NAT router, i.e. the usual setup inside private or SoHo networks. To reach the device one would need to open a tunnel at the router from outside into the network, either by administrating the router by hand or with UPnP (which is often switched off for security reasons). So you would introduce too much problems for the average user.
Which means that the thing which the fewest problems for the customer is probably HTTP polling. But this is also the one with the highest overhead. Still mostly compatible are WebSockets which have less overhead and more problems but even less overhead can be reached with simple TCP to the server. TCP listener instead would cause too much trouble.
As for resources on the server side: each HTTP polling request might use new TCP connection but you can also reuse an existing one. In this case you could decide between more overhead and latency one the client side (new TCP connection for each request) which needs few resources on the server side and less overhead and latency on the client side which needs more resources on the server side (multiple HTTP requests per TCP connection). With WebSockets and plain TCP connection you always need more server side resources, unless your client will automatically re-establish the connection on loss of connectivity.
These days you should use a IOT Specific communication protocol over TLS 2.0 for secure light weight connections. For example AWS uses MQTT http://mqtt.org/ and Azure uses AMQP https://www.amqp.org/
The idea is you get a broker you can connect to securely then you use a messaging protocol with a topic to route messages to the proper devices. Also IBM has been using MQTT for a long time and routers now typically come with port 8883 open which is MQTT over TLS.
Good Luck!
Simply use SignalR to connect client and server. It provides you minimal latency without polling. The API is very simple to use.
Physically, this runs over WebSockets which are scalable to a large number of concurrent connections. If you don't have a need for more than 100k per Windows server this would not be a concern.

What is an idle http connection?

I am working with http connection and using a MultiThreadedHttpConnectionManager and httpClient.
For my purpose I am closing all the idle connection after 1ms with the following method : closeIdleConnections(1).
I am wondering what is considered as an " idle connection" in http ? It seems that waiting for an answer is not an idle connection.
Regards,
HTTP (1.1) specifies that connections should remain open until explicitly closed, by either party. Beyond that the specification provides only one example for a policy, suggesting using a timeout value beyond which an inactive (idle) connection should be closed. A connection kept open until the next HTTP request reduces latency and TCP connection establishment overhead. However, an idle open TCP connection consumes a socket and buffer space memory.
Excerpt from RFC 7230:
6.5. Failures and Timeouts
Servers will usually have some time-out value beyond which they will no longer maintain an inactive connection. Proxy servers might make this a higher value since it is likely that the client will be making more connections through the same server. The use of persistent connections places no requirements on the length (or existence) of this time-out for either the client or the server.
When a client or server wishes to time-out it SHOULD issue a graceful close on the transport connection. Clients and servers SHOULD both constantly watch for the other side of the transport close, and respond to it as appropriate. If a client or server does not detect the other side's close promptly it could cause unnecessary resource drain on the network.
A client, server, or proxy MAY close the transport connection at any time. For example, a client might have started to send a new request at the same time that the server has decided to close the "idle" connection. From the server's point of view, the connection is being closed while it was idle, but from the client's point of view, a request is in progress.
By studying the source code, in the HttpClient MultiThreadedHttpConnectionManager implementation, connection is simply considered idle when the connection in the pool's age is more than the idleTime. The idleTime is passed to the method closeIdleConnections(idleTime) as an argument.

In normal operation, how long will the TCP connection be kept open with websocket

Websocket is often done via a protocol upgrade from http - but even then - it is a layer on top (directly) of TCP.
Given that - how long can I expect a single TCP connection to be used underlying the websocket? how often will that connection have to be replaced (all without the client/server necessarily being aware).
As Joakim Erdfelt commented, there is no way to know the exact timeout value of a TCP connection, because it is a value which can be configured individually on each system which takes part in the connection.
But note that TCP connections are usually only dropped when there was no communication for an extended period of time. You can prevent this from happening by sending small, meaningless keep-alive messages at regular intervals.
In my current application I am doing this by pinging the client every 10 seconds. This has the advantage that I also get a good statistic of network latency.

Resources