Why do we need a half-close socket? - networking

According to this blog, it seems half open connection is what we want to avoid.
So why does Java still provides the facility to make a socket half close?

According to this blog, it seems half open connection is what we want to avoid.
This author of the blog explicitly notes that he does not talk about deliberately half-closed connections but about half-open connections which are caused by intermediate devices like routers which drop the connection state after some timeout.
So why does Java still provides the facility to make a socket half close?
Because there are useful? Half-close just means that no more data will be send on the socket but it will still be able to receive data. This kind of behavior is actually useful for various situations where the client sends only a request and receives a response because it can be used to indicate the end of the request to the peer.

Related

What are the general rules for getting the 104 "Connection reset by peer" error?

Are there any general rules on when a website sends out a TCP reset, triggering the Connection reset by peer error?
Like
too many open connections
too high bandwidth use
connected for too long
…?
I'm pretty certain that there is no law governing this and that different websites/web developers have different tastes, but I would be interested if there are some general rule sets (from websites or textbooks on the subject or what you have been taught in school/at work) that are mostly followed.
Reason why I'm asking, of course, is that I want to get around being blocked…
I'm downloading some government data that is freely available, but is lacking an API or something, so the two official ways to get it are either clicking around in some web-GIS a few thousand times or going along the Kafkaesque path of explaining various levels of clerks the concepts of databases, csv files, zip files and that you can't (and won't need to, if they'd just did what you try to explain them) just drive to their agency with a "giant" harddrive, so I'm trying to just go the most resource saving way for everyone involved…
A website is not "sending" a "Connection reset by peer" error. This error is generated by the OS kernel on the client site if it gets a TCP reset for an active connection. There are many reasons this TCP reset might be sent. A TCP reset might be sent by design from some kind of load limit, for example to limit the number of connections from the same IP address within a specific time as a form of DOS protection, to restrict data scraping or to enforce some kind of fair use. There is no general rule or even law for this kind of explicit limits.
A TCP reset might also be caused by the application being overloaded, application crashing, system running out of resources ... .
And a TCP reset will happen if the client writes to a connection which the server already considers as closed. This can happen for example with HTTP keep alive: the server might close the connection on inactivity at any time after the HTTP response was sent. If the client sends a new request on the same connection at the same time the server closes the connection, the server will reject the new request (since the connection is closed on the server end) and will send a TCP RST, causing a connection reset by peer at the client. The client needs to properly handle this situation by creating a new connection and sending the request again (provided that the request was not state changing, i.e. is idempotent).

TCP as connection protocol questions

I'm not sure if this is the correct place to ask, so forgive me if it isn't.
I'm writing computer monitoring software that needs to connect to a server. The server may send out relatively urgent messages, such as sound or cancel an alarm, and the client may send out data about the computer, such as screenshots. The data that the client sends isn't too critical on timing, but shouldn't be more than a two minutes late.
It is essential to the software that portforwarding need not be set up, and it is assumed that the internet connection will be done through a wireless router that has NAT almost all the time.
My idea is to have a TCP connection initiated from the client, and use that to transfer data. Ideally, I would have no data being sent when it is not needed, but I believe this to be impossible. Would sending the equivalent of a ping every now and again keep the connection alive, and what sort of bandwidth would it use if this program was running all the time on the computer? In addition, would it be possible to reduce the header size for these keep-alives?
Before I start designing the communication and programming, is this plan for connection flawed? Are there better alternatives?
Thanks!
1) You do not need to send 'ping' data to keep the connection alive, the TCP stack does this automatically; one reason for sending 'ping' data would be to detect a connection close on the client side - typically you only find out something has gone wrong when you try and read/write from the socket. There may be a way to change various time-outs so you can detect this condition faster.
2) In general while TCP provides a stream-oriented error free channel, it makes no guarantees about timeliness, if you are using it on the internet it is even more unpredictable.
3) For applications such as this (I hope you are making it for ethical purposes) - I would tend to use TCP, since you don't want a situation where the client receives a packet to raise an alarm but misses that one that turns it off again.

Detecting HTTP close using inet

In my mochiweb application, I am using a long held HTTP request. I wanted to detect when the connection with the user died, and I figured out how to do that by doing:
Socket = Req:get(socket),
inet:setopts(Socket, [{active, once}]),
receive
{tcp_closed, Socket} ->
% handle clean up
Data ->
% do something
end.
This works when: user closes his tab/browser or refreshes the page. However, when the internet connection dies suddenly (say wifi signal lost all of a sudden), or when the browser crashes abnormally, I am not able to detect a tcp close.
Am I missing something, or is there any other way to achieve this?
There is a TCP keepalive protocol and it can be enabled with inet:setopts/2 under the option {keepalive, Boolean}.
I would suggest that you don't use it. The keep-alive timeout and max-retries tends to be system wide, and it is optional after all. Using timeouts on the protocol level is better.
The HTTP protocol has the status code Request Timeout which you can send to the client if it seems dead.
Check out the after clause in receive blocks that you can use to timeout waiting for data, or use the timer module, or use erlang:start_timer/3. They all have different performance characteristics and resource costs.
There isn't a default "keep alive" (but can be enabled if supported) protocol over TCP: in case there is a connection fault when no data is exchanged, this translates to a "silent failure". You would need to account for this type of failure by yourself e.g. implement some form of connection probing.
How does this affect HTTP? HTTP is a stateless protocol - this means that every request is independent of every other. The "keep alive" functionality of HTTP doesn’t change that i.e. "silent failure" can still occur.
Only when data is exchanged can this condition be detected (or when TCP Keep Alive is enabled).
I would suggest sending the application level keep alive messages over HTTP chunked-encoding. Have your client/server smart enough to understand the keep alive messages and ignore them if they arrive on time or close and re-establish the connection again.

Using NetConnection and URLStream to send/recieve data at high frequency

I'm writing a Comet-like app using Flex on the client and my own hand-written server.
I need to be able to send short bursts of data from the client at quite a high frequency (e.g. of the order of 10ms between sends).
I also need the server to push short bursts of data at a similarly high frequency.
I'm using NetConnection.call() to send the data to the server, and URLStream (with chunked encoding) to push the data from the server to the client.
What I've found is that the data isn't being sent/received as soon as it's available. For example, in IE, it seems the data is sent every 200ms rather than as soon as NetConnection.call() is called. Similarly, URLStream isn't making the data available as soon as the server is sending it.
Judging by the difference in behaviour between the browsers, it seems as though the Flash Player (version 10) is relying on the host browser to do all the comms. Can anyone confirm this? Update: This is very likely as only the host browser would know about the proxy settings that might be set.
I've tried using the Socket class and there's no problem with speed there: it works perfectly. However, I'd like to be able to use HTTP-based (port 80) connections so that my app can run in heavily fire-walled environments (I tried using a Socket over port 80, but that has its problems).
Incidentally, all development/testing has been done on an internal LAN, so bandwidth/latency is not an issue.
Update: The data being sent/received is in small packets and doesn't need to be in any particular format. For example, I might need to send a short array of Numbers, and this could either be encoded in AMF (e.g. via NetConnection.call()) or could be put into GET parameters (e.g. using sendToURL()). The main point of my question is really to see whether anyone else has experienced the same problem in calling NetConnection/URLStream frequently, and whether there is a workaround (it's also possible that the fault lies with my server code of course, rather than Flash).
Thanks.
Turns out the problem had nothing to do with Flash/Flex or any of the host browsers. The problem was in my server code (written in C++ on Linux), and without access to my source code the cause is hard to find (so I couldn't have hoped for an answer from this forum).
Still - thank you everyone who chipped in.
It was only after looking carefully at the output shown in Wireshark that I noticed the problem, which was twofold:
Nagle's algorithm
I was sending replies in multiple packets by calling write() multiple times (e.g. once for the HTTP response header, and again for the HTTP response body). The server's TCP/IP stack was waiting for an ACK for the first packet before sending the second, but because of Nagle's algorithm the client was waiting 200ms before sending back the ACK to the first packet, so the server took at least 200ms to send the full HTTP response.
The solution is to use send() with the flag MSG_MORE until all the logically connected blocks are written. I could also have used writev() or setsockopt() with TCP_CORK, but it suited my existing code better to use send().
Chunk-encoded streams
I'm using a never-ending HTTP response with chunk encoding to push data back to the client. Naggle's algorithm needs to be turned off here because even if each chunk is written as one packet (using MSG_MORE), the client OS TCP/IP stack will still wait up to 200ms before sending back an ACK, and the server can't push a subsequent chunk until it gets that ACK.
The solution here is to ask the server not to wait for an ACK for each sent packet before sending the next packet, and this is done by calling setsockopt() with the TCP_NODELAY flag.
The above solutions only work on Linux and aren't POSIX-compliant (I think), but that isn't a problem for me.
I'm almost 100% sure the player relies on the browser for such communications. Can't find an official page stating so atm, but check this out for example:
Applications hosting the Flash Player
ActiveX control or Flash Player
plug-in can use the
EnforceLocalSecurity and
DisableLocalSecurity API calls to
control security settings.
Which I think somehow implies the idea. Also, I've suffered some network related bugs on FF/IE only which again points out to the player using each browser for networking (otherwise there wouldn't be such differences).
And regarding your latency problem, I think that if speed is critical, your best bet is sockets. You have some work to do, but seems possible, check out the docs again:
This error occurs in SWF content.
Dispatched if a call to
Socket.connect() attempts to connect
either to a server outside the
caller's security sandbox or to a port
lower than 1024. You can work around
either problem by using a cross-domain
policy file on the server.
HTH,
Juan

Is there a HTTP header field / hack to tell the browser NOT to pipeline its requests?

I am implementing a minimalistic web server application on a Microcontroller. When I have several images (or CSS/JS) on the web page, the browser creates several connections and fetches them. But the Microcontroller can not catch up with this. Is there a way to tell the browser to stop pipelining and fetch them one by one ?
Note :: "Connection: close" is already in place.
I think Connection:close is exactly the wrong message. When the browser creates multiple connections, it precisely does not pipeline its requests - so ISTM that you want the browser to pipeline, instead of creating parallel connections.
So one step towards that would be to use HTTP 1.1, and keep the connection open. The browser would then reuse the TCP connection for further requests. This should allow the microcontroller to catch up.
Now, the browser might still try to create additional, parallel connections. The best reaction to that is to not accept any of these connections. So limit the number of parallel connections that you are serving (independent of client), and only read new requests when you are done reading the previous ones. In doing so, prefer to read from established connections over accepting new connections.
If you have access to the TCP stack of the controller, you might be able to tell what host a connection comes from, so you can accept connections from other browsers while limiting the number of connections from the same browser (something that you cannot do in the regular socket API).
"Pipelining" is something else; it means that the user agent sends additional requests on the same connection although the first one didn't complete yet (see http://greenbytes.de/tech/webdav/rfc2616.html#pipelining).
"Connection: close" doesn't seem to be relevant; that being said: is there a reason why you don't want the connection reused?
With respect to your question: no, I don't think you can prevent clients from doing that. Did you try limiting the maximum number of open connections on your server?
Same problem... However, Firefox loads my site very fast unlike Opera. I have not invented anything better than rejecting connections at an initial stage: SYN. I'm just answering with RST flag. But probably it doesn't suit Opera.
My device supports only two simultaneous connections.

Resources