What are the general rules for getting the 104 "Connection reset by peer" error? - http

Are there any general rules on when a website sends out a TCP reset, triggering the Connection reset by peer error?
Like
too many open connections
too high bandwidth use
connected for too long
…?
I'm pretty certain that there is no law governing this and that different websites/web developers have different tastes, but I would be interested if there are some general rule sets (from websites or textbooks on the subject or what you have been taught in school/at work) that are mostly followed.
Reason why I'm asking, of course, is that I want to get around being blocked…
I'm downloading some government data that is freely available, but is lacking an API or something, so the two official ways to get it are either clicking around in some web-GIS a few thousand times or going along the Kafkaesque path of explaining various levels of clerks the concepts of databases, csv files, zip files and that you can't (and won't need to, if they'd just did what you try to explain them) just drive to their agency with a "giant" harddrive, so I'm trying to just go the most resource saving way for everyone involved…

A website is not "sending" a "Connection reset by peer" error. This error is generated by the OS kernel on the client site if it gets a TCP reset for an active connection. There are many reasons this TCP reset might be sent. A TCP reset might be sent by design from some kind of load limit, for example to limit the number of connections from the same IP address within a specific time as a form of DOS protection, to restrict data scraping or to enforce some kind of fair use. There is no general rule or even law for this kind of explicit limits.
A TCP reset might also be caused by the application being overloaded, application crashing, system running out of resources ... .
And a TCP reset will happen if the client writes to a connection which the server already considers as closed. This can happen for example with HTTP keep alive: the server might close the connection on inactivity at any time after the HTTP response was sent. If the client sends a new request on the same connection at the same time the server closes the connection, the server will reject the new request (since the connection is closed on the server end) and will send a TCP RST, causing a connection reset by peer at the client. The client needs to properly handle this situation by creating a new connection and sending the request again (provided that the request was not state changing, i.e. is idempotent).

Related

How to limit HTTPS to one TCP connection?

I'm using uIP along with mbed TLS to run a simple web server on a microcontroller, and host an HTTPS page.
The problem is: my chip only has enough RAM to handle one TLS connection at a time, but Firefox (and Chrome) tries to open multiple connections at once to load the images on the page. If I tell uIP to abort or close additional connections, Firefox assumes an error and gives up loading the rest of the page.
I can tell uIP to limit the total connections to 1, and in that case it just drops new SYN packets if there is already a connection. This actually works, as Firefox will wait and try again until the page is fully loaded. I can't use this a solution however, since I do need to allow more than 1 TCP connection total in order to handle other types of connections (I can serve a regular HTTP web page at the same time, for example). If I could tell uIP to limit connections on a specific port to 1 at a time, that may solve the problem, but I don't think uIP has that capability. I also don't see a way to force uIP to drop certain packets.
I've looked all over the web, but I can't find any information on running a web server using just one TCP connection at a time.
Does anyone have any ideas?
Thanks!
Marlon
Just ignore the SSL connection until you are ready to process it. Browsers should tolerate this.

Why do we need a half-close socket?

According to this blog, it seems half open connection is what we want to avoid.
So why does Java still provides the facility to make a socket half close?
According to this blog, it seems half open connection is what we want to avoid.
This author of the blog explicitly notes that he does not talk about deliberately half-closed connections but about half-open connections which are caused by intermediate devices like routers which drop the connection state after some timeout.
So why does Java still provides the facility to make a socket half close?
Because there are useful? Half-close just means that no more data will be send on the socket but it will still be able to receive data. This kind of behavior is actually useful for various situations where the client sends only a request and receives a response because it can be used to indicate the end of the request to the peer.

TCP as connection protocol questions

I'm not sure if this is the correct place to ask, so forgive me if it isn't.
I'm writing computer monitoring software that needs to connect to a server. The server may send out relatively urgent messages, such as sound or cancel an alarm, and the client may send out data about the computer, such as screenshots. The data that the client sends isn't too critical on timing, but shouldn't be more than a two minutes late.
It is essential to the software that portforwarding need not be set up, and it is assumed that the internet connection will be done through a wireless router that has NAT almost all the time.
My idea is to have a TCP connection initiated from the client, and use that to transfer data. Ideally, I would have no data being sent when it is not needed, but I believe this to be impossible. Would sending the equivalent of a ping every now and again keep the connection alive, and what sort of bandwidth would it use if this program was running all the time on the computer? In addition, would it be possible to reduce the header size for these keep-alives?
Before I start designing the communication and programming, is this plan for connection flawed? Are there better alternatives?
Thanks!
1) You do not need to send 'ping' data to keep the connection alive, the TCP stack does this automatically; one reason for sending 'ping' data would be to detect a connection close on the client side - typically you only find out something has gone wrong when you try and read/write from the socket. There may be a way to change various time-outs so you can detect this condition faster.
2) In general while TCP provides a stream-oriented error free channel, it makes no guarantees about timeliness, if you are using it on the internet it is even more unpredictable.
3) For applications such as this (I hope you are making it for ethical purposes) - I would tend to use TCP, since you don't want a situation where the client receives a packet to raise an alarm but misses that one that turns it off again.

TcpListener stops accepting or accepts broken connections

We currently experience a problem with a self-written server application running on Windows (occurs on different versions). The server listens at a TCP port, accepts connections, exchanges some data and then closes the connections again. There are about 100 clients that connect from time to time.
Sometimes the server stops to work: Log files show that connections are still accepted, but that at the first read attempt a socket error (10054 - Connection reset by peer) occurs. I don't think it is a client issue because it suddenly stops working for all clients.
Now we found out, that the same problem occurs with our old server software, that is even written in another programming language. So it doesn't seem to be an error in our program - I think it has to be some kind of OS / firewall issue? Of course, firewalls have been deactivated, which didn't solve the issue yet.
Any ideas where to look into? Wireshark logs will follow soon..
Excerpt from the log (Timestamp, Thread Id, message)
11:37:56.137 T#3960 Connection from 10.21.13.3
11:37:56.138 T#3960 Client Exception: Socket Error # 10054
Connection reset by peer.
11:37:56.138 T#3960 ClientDisconnected
11:38:00.294 T#4144 Connection from 10.21.13.3
You can see that the exception occurs almost at the same time as the connection is accepted, in this case the client reconnects after a few seconds.
A "stateful" firewall or NAT keeps track of connections, and ought to send RSTs for connectiosn it doesn't know about. If the firewall loses track of connections for some reason, then you'll probably see random connections being reset.
Our router at work does this — it forgets about connections when the PPP connection dies, which is remarkably unhelpful when it rains and the DSL restart takes a bit too long. However, instead of resetting connections, it just drops packets (even more unhelpful!).
Sounds like a firewall or routing issue - maybe stale connections get disconnected after a timeout period. Are you using a ping/keepalive inside your protocol.
Otherwise you may ask Wireshark to see what is going on.
First, thanks for many hints - I'm afraid the problem was a completely different one which you couldn't possibly solve by reading my question.
The server application uses log4net, configured with a log file an ImmediateFlush = true. If every log statement is directly written into the file and multiple socket connections occur this slows down the whole application.
The server needed about a minute to really accept the connection. This was far more than the timeout on clientside. So in the log there was only shown "accepted" followed by "disconnected" - even the log was delayed!
Sorry for the inconvenience...
Have you tried changing the backlog and then see how much time or how many clients are served before this problem occurs
You don't say what Windows versions you're using for the server, but you should be aware that the Windows TCP/IP stack behaves differently in server and client OSes. There are limits on how many simultaneous incoming connections a client OS will allow, and they are significantly lower than you might expect.
What do the logs look like from the client side?
Since the error is stating that the client is dropping the connection; if you see the same error on the client side then it is a firewall or proxy that is dropping the connection (both side seeing the opposite side dropping the connection is indicative of a proxy/firewall).
If the error is not present on the client side; then I would say that your client side is where you will see the actual error.

Chat Server - persistent TCP or new Connection for each poll

Whats the best practice for scalable servers which need to maintain a list of active users?
Should I open a persistent TCP Connection for each client on which the server sends update messages?
This could lead in many open connection and propably no traffic for many seconds. Is this a problem in TCP?
Or would it be better to let the Client poll updates periodically (with a new tcp connection each)?
How do Chat Servers or large Online Games handle this?
Personally I'd go for a single persistent TCP connection per client to avoid a) the additional work in creating and destroying connections and the additional latency involved in all the TCP packets involved and b) to avoid creating lots of sockets in TIME_WAIT on either the clients or the server. There's simply no good reason to create and destroy the connections.
Depending on your platform there may be various tricks to deal with the various platform specific problems that you might get when you have lots of connections open, and by lots I mean 10s of thousands. For example, on Windows, using overlapped I/O and I/O completion ports would be a good design for lots of connections and if your connections are generally idle most of the time then you might find that using the 'zero byte read' trick would allow you to handle more connections on lesser hardware; but it's something you can add once you know you have a problem due to the amount of buffer space that you have waiting for reads which only complete infrequently.
I wouldn't have the clients polling the server. It's inefficient. Have the server publish data to the clients as and when there is data available. This would allow the server to control the workload somewhat by letting it decide how often to send the data to the clients - it could either send every time new data became available for a client or send after it had batched up some data and waited a short while, etc. If the server is pushing the data then the server (the weak point, the place that might get overwhelmed by client demand) has more control over the work that it will need to do.
If you have each client polling then a) you're generating more network noise as each client sends a message to ask the server if it has anything that it should send it and b) you're generating more work for the server as it needs to respond to the polls. The server knows when there's data for the client, let it be responsible for telling the clients.

Resources