TCP Connection slows down after 10.000 packets

TCP Connection slows down after 10.000 packets - qt

I am using the QT implementation of the TCP stack to control a robot. We exchange short msgs (<200Byte) and have a round trip time of about 8ms. After maybe 10.000 Packets in each direction, the connection slows down and i have to wait about 1 sec for the answer of my packet. If I restart my program, and reconnect, I again get the 8ms RTT.
For me it sounds like some kind of buffer is filling up but I havn't worked with TCP much, so maybe some one could give me a hint.

The problem is in the code that you're not showing. Likely the slot that gets executed on readyRead() is not emptying the buffer.
It is acceptable for the buffer not to be completely empty, say when you're reading complete lines/packets.
It is not acceptable for the buffer size to be constantly growing.
At the end of your slot reading slot, check if bytesAvailable() is non-zero. It can only be non-zero in case #1. Even then, you should be able to place an upper bound on it - say some small multiple of packet size or maximum line length. If the bound is ever exceeded, you've got a bug in your code.

It is just a wild guess, but a common catch by using qt sockets is that you need to delete the socket object by yourself ( for example with "deleteLater()") on error and disconnection.
Example code:
connect(socket, SIGNAL(disconnected()), socket, SLOT(deleteLater()));
The event loop will then remove the socket the next time it is able to do it.
The QTcpSockets or AbstractSockets don't delete themselfs on close() or on leaving the scope (because then the Signal/Slots won't work).

Related

Will the write() system call block further operation till read() is involved, or vice versa?

Written as part of a TCP/IP client-server:
Server:
write(nfds,data1,sizeof(data1));
usleep(1000);
write(nfds,data2,sizeof(data2));
Client:
read(fds,s,sizeof(s));
printf("%s",s);
read(fds,s,sizeof(s));
printf("%s",s);
Without usleep(1000) between the two calls to write(), the client prints data1 twice. Why is this?
Background:
I am doing a Client-Server program where the server has to send two consecutive pieces of information after their acquisition, via the network (socket); nfds is the file descriptor we get from accept().
In the client side, we receive these information via read; here fds is the file descriptor obtained via socket().
My issue is that when I am NOT using the usleep(1000) between the write() functions, the client just prints the info represented by data1 twice, instead of printing data1 and then data2. When I put in the usleep() it's fine. Exactly WHY is this happening? Is write() blocking the operation till the buffer is read or is read() blocking the operation till info is written into the buffer? Or am I completely off the page?

You are making several false assumptions. There is nothing in TCP that guarantees that one send equals one receive. There is a lot of buffering, at both ends, and there are deliberate delays in sending to as to coalesce packets (the Nagle algorithm). When you call read(), or recv() and friends, you need to store the result into a variable and examine it for each of the following cases:
-1: an error: examine/log/print errno, or strerror(), or call perror(), and in most cases close the socket and exit the reading loop.
0: end of stream; the owner has closed the connection; close the socket and exit the reading loop.
a positive value but less than you expected: keep reading and accumulate the data until you have everything you need.
a positive value that is more than you expected: process the data you expected, and save the rest for next time.
exactly what you expected: process the data, discard it all, and repeat. This isn the easy case, and it is rare, but it is the only case you are currently programming for.
Don't add sleeps into networking code. It doesn't solve problems, it only delays them.

Handling messages over TCP

I'm trying to send and receive messages over TCP using a size of each message appended before the it starts.
Say, First three bytes will be the length and later will the message:
As a small example:
005Hello003Hey002Hi
I'll be using this method to do large messages, but because the buffer size will be a constant integer say, 200 Bytes. So, there is a chance that a complete message may not be received e.g. instead of 005Hello I get 005He nor a complete length may be received e.g. I get 2 bytes of length in message.
So, to get over this problem, I'll need to wait for next message and append it to the incomplete message etc.
My question is: Am I the only one having these difficulties to appending messages to each other, appending lengths etc.. to make them complete Or is this really usually how we need to handle the individual messages on TCP? Or, if there is a better way?

What you're seeing is 100% normal TCP behavior. It is completely expected that you'll loop receiving bytes until you get a "message" (whatever that means in your context). It's part of the work of going from a low-level TCP byte stream to a higher-level concept like "message".
And "usr" is right above. There are higher level abstractions that you may have available. If they're appropriate, use them to avoid reinventing the wheel.

So, there is a chance that a complete message may not be received e.g.
instead of 005Hello I get 005He nor a complete length may be received
e.g. I get 2 bytes of length in message.
Yes. TCP gives you at least one byte per read, that's all.
Or is this really usually how we need to handle the individual messages on TCP? Or, if there is a better way?
Try using higher-level primitives. For example, BinaryReader allows you to read exactly N bytes (it will internally loop). StreamReader lets you forget this peculiarity of TCP as well.
Even better is using even more higher-level abstractions such as HTTP (request/response pattern - very common), protobuf as a serialization format or web services which automate pretty much all transport layer concerns.
Don't do TCP if you can avoid it.

So, to get over this problem, I'll need to wait for next message and append it to the incomplete message etc.
Yep, this is how things are done at the socket level code. For each socket you would like to allocate a buffer of at least the same size as kernel socket receive buffer, so that you can read the entire kernel buffer in one read/recv/resvmsg call. Reading from the socket in a loop may starve other sockets in your application (this is why they changed epoll to be level-triggered by default, because the default edge-triggered forced application writers to read in a loop).
The first incomplete message is always kept in the beginning of the buffer, reading the socket continues at the next free byte in the buffer, so that it automatically appends to the incomplete message.
Once reading is done, normally a higher level callback is called with the pointers to all read data in the buffer. That callback should consume all complete messages in the buffer and return how many bytes it has consumed (may be 0 if there is only an incomplete message). The buffer management code should memmove the remaining unconsumed bytes (if any) to the beginning of the buffer. Alternatively, a ring-buffer can be used to avoid moving those unconsumed bytes, but in this case the higher level code should be able to cope with ring-buffer iterators, which it may be not ready to. Hence keeping the buffer linear may be the most convenient option.

LWIP: How exactly does the TCP_INTERVAL relate to the reception of ACK Messages?

I am trying to implement a data transfer from an embedded board to a PC. For this, I need to use low latency communication and I am bound to use Ethernet with TCP/IP.
Furthermore, I'm using the lwip stack.
First of all, I disabled nagle algorithm, because I have to send small packets of data (10 KB) and I want them to be sent as soon as possible, without waiting for intermediate ACKS.
The Wireshark Log shows me that this is working quite fine (the whole data is being sent to the PC in about 1msec).
After that, the PC takes about 200msec to send the last ACK (because the last Segment is not maximum size).
The problem is now, that on the embedded processor, it takes a very long time, until the lwip gives my application the message, that all of the data has been ACKED.
When I decrease the TCP_INTERVAL (to let's say 5), it speeds up greatly.
I am wondering, why lwip behaves like this? I would think that the Periodic-TCP-Tasks (which are being called according to the TCP_INTERVAL) have nothing to do with the Handling of the received frames (which is really another call in the main).
I hope I could state my problem somehow understandable, if not I would appreciate feedback, so I can improve my question!
Thanks!
EDIT:
After more debugging, I found out that the process of sending data results in the following function calls:
My main calls tcp_write(...)
tcp_tmr() is called multiple times (through the LwIP_Periodic_Handle() function). This happens seven times. During the eigth call:
tcp_output() is called. During this call, all segments which were added during the last tcp_write() call are sent by calling tcp_output_segment().
So now it is clear that if I reduce the TCP_INTERVAL, of course the data gets sent sooner, because the tcp_tmr() function is called more quickly.
but my question is still: Is this the normal behaviour? It seems a bit odd, that lwIP is waiting such a long time before actually sending the data.

Since Youre doing this My main calls tcp_write(...)
use tcp_output() immediately after tcp_write
or else use tcp_write() in tcp_recv callback

Pcap Dropping Packets

// Open the ethernet adapter
handle = pcap_open_live("eth0", 65356, 1, 0, errbuf);
// Make sure it opens correctly
if(handle == NULL)
{
printf("Couldn't open device : %s\n", errbuf);
exit(1);
}
// Compile filter
if(pcap_compile(handle, &bpf, "udp", 0, PCAP_NETMASK_UNKNOWN))
{
printf("pcap_compile(): %s\n", pcap_geterr(handle));
exit(1);
}
// Set Filter
if(pcap_setfilter(handle, &bpf) < 0)
{
printf("pcap_setfilter(): %s\n", pcap_geterr(handle));
exit(1);
}
// Set signals
signal(SIGINT, bailout);
signal(SIGTERM, bailout);
signal(SIGQUIT, bailout);
// Setup callback to process the packet
pcap_loop(handle, -1, process_packet, NULL);
The process_packet function gets rid of header and does a bit of processing on the data. However when it takes too long, i think it is dropping packets.
How can i use pcap to listen for udp packets and be able to do some processing on the data without losing packets?

Well, you don't have infinite storage so, if you continuously run slower than the packets arrive, you will lose data at some point.
If course, if you have a decent amount of storage and, on average, you don't run behind (for example, you may run slow during bursts buth there are quiet times where you can catch up), that would alleviate the problem.
Some network sniffers do this, simply writing the raw data to a file for later analysis.
It's a trick you too can use though not necessarily with a file. It's possible to use a massive in-memory structure like a circular buffer where one thread (the capture thread) writes raw data and another thread (analysis) reads and interprets. And, because each thread only handles one end of the buffer, you can even architect it without locks (or with very short locks).
That also makes it easy to detect if you've run out of buffer and raise an error of some sort rather than just losing data at your application level.
Of course, this all hinges on your "simple and quick as possible" capture thread being able to keep up with the traffic.
Clarifying what I mean, modify your process_packet function so that it does nothing but write the raw packet to a massive circular buffer (detecting overflow and acting accordingly). That should make it as fast as possible, avoiding pcap itself dropping packets.
Then, have an analysis thread that takes stuff off the queue and does the work formerly done in process_packet (the "gets rid of header and does a bit of processing on the data" bit).
Another possible solution is to bump up the pcap internal buffer size. As per the man page:
Packets that arrive for a capture are stored in a buffer, so that they do not have to be read by the application as soon as they arrive.
On some platforms, the buffer's size can be set; a size that's too small could mean that, if too many packets are being captured and the snapshot length doesn't limit the amount of data that's buffered, packets could be dropped if the buffer fills up before the application can read packets from it, while a size that's too large could use more non-pageable operating system memory than is necessary to prevent packets from being dropped.
The buffer size is set with pcap_set_buffer_size().
The only other possibility that springs to mind is to ensure that the processing you do on each packet is as optimised as it can be.
The splitting of processing into collection and analysis should alleviate a problem of not keeping up but it still relies on quiet time to catch up. If your network traffic is consistently more than your analysis can handle, all you're doing is delaying the problem. Optimising the analysis may be the only way to guarantee you'll never lose data.

Receiving image through winsocket

i have a proxy server running on my local machine used to cache images while surfing. I set up my browser with a proxy to 127.0.0.1, receive the HTTP requests, take the data and send it back to the browser. It works fine for everything except large images. When I receive the image info, it only displays half the image (ex.: the top half of the google logo) heres my code:
char buffer[1024] = "";
string ret("");
while(true)
{
valeurRetour = recv(socketClient_, buffer, sizeof(buffer), 0);
if(valeurRetour <= 0) break;
string t;
t.assign(buffer,valeurRetour);
ret += t;
longueur += valeurRetour;
}
closesocket(socketClient_);
valeurRetour = send(socketServeur_, ret.c_str(),longueur, 0);
the socketClient_ is non-blocking. Any idea how to fix this problem?

You're not making fine enough distinctions among the possible return values of recv.
There are two levels here.
The first is, you're lumping 0 and -1 together. 0 means the remote peer closed its sending half of the connection, so your code does the right thing here, closing its socket down, too. -1 means something happened besides data being received. It could be a permanent error, a temporary error, or just a notification from the stack that something happened besides data being received. Your code lumps all such possibilities together, and on top of that treats them the same as when the remote peer closes the connection.
The second level is that not all reasons for getting -1 from recv are "errors" in the sense that the socket is no longer useful. I think if you start checking for -1 and then calling WSAGetLastError to find out why you got -1, you'll get WSAEWOULDBLOCK, which is normal since you have a non-blocking socket. It means the recv call cannot return data because it would have to block your program's execution thread to do so, and you told Winsock you wanted non-blocking calls.
A naive fix is to not break out of the loop on WSAEWOULDBLOCK but that just means you burn CPU time calling recv again and again until it returns data. That goes against the whole point of non-blocking sockets, which is that they let your program do other things while the network is busy. You're supposed to use functions like select, WSAAsyncSelect or WSAEventSelect to be notified when a call to the API function is likely to succeed again. Until then, you don't call it.
You might want to visit The Winsock Programmer's FAQ. (Disclaimer: I'm its maintainer.)

Have you analyzed the transaction at the HTTP level i.e. checked Headers?
Are you accounting for things like Chunked transfers?
I do not have a definite answer in part because of the lack of details given here.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex