Non-blocking TCP write(2) succeeds but the request is not sent - tcp

I am seeing that a small set of messages written to a non-blocking TCP socket using write(2) are not seen on the source interface and are also not received by the destination.
What could be the problem? Is there any way the application can detect this and retry?
while (len > 0) {
    res = write(c->sock_fd, tcp_buf, len);
    if (res < 0) {
        switch (errno) {
        case EAGAIN:
        case EINTR:
            <handle case>
            break;
        default:
            <close connection>
        }
    }
    else {
        len -= res;
    }
}

Non-blocking write(2) means that, whatever the difficulties, the call will return. The proper way to detect what happened is to inspect the function's return value.
If it returns -1, check errno. A value of EAGAIN means the write did not happen and you have to do it again.
It could also return a short write (i.e. a value less than the size of the buffer you passed it), in which case you'll probably want to retry the missing part.
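A minimal sketch of such a retry loop (names are illustrative; it assumes a non-blocking, connected socket and waits for writability with poll() instead of spinning on EAGAIN):

#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hand all `len` bytes to the kernel, waiting for writability on EAGAIN.
 * Returns 0 on success, -1 on a fatal error (caller should close the socket). */
static int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t res = write(fd, buf, len);
        if (res < 0) {
            if (errno == EINTR)
                continue;                       /* interrupted: just retry */
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                struct pollfd pfd = { .fd = fd, .events = POLLOUT };
                if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                    return -1;                  /* wait until the socket is writable */
                continue;
            }
            return -1;                          /* real error */
        }
        buf += res;                             /* short write: skip what was accepted */
        len -= res;
    }
    return 0;
}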
If this is happening on short-lived sockets, also read The ultimate SO_LINGER page, or: why is my tcp not reliable. It explains a particular problem with the closing part of a transmission.
when we naively use TCP to just send the data we need to transmit, it often fails to do what we want - with the final kilobytes or sometimes megabytes of data transmitted never arriving.
and the conclusion is:
The best advice is to send length information, and to have the remote program actively acknowledge that all data was received.
It also describes a hack for Linux.

write() returns the number of bytes written; this might be less than the number of bytes you passed in, and can even be 0. Make sure you check this and retransmit whatever was not written (due to not enough buffer space or whatever).

You want to read up on the TCP_NODELAY option and the nature of the TCP send buffer.
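For reference, a minimal sketch of setting TCP_NODELAY (which disables Nagle's algorithm, so small writes are sent immediately rather than coalesced with later ones); the function name is made up:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <sys/socket.h>

/* Disable Nagle's algorithm on an already-connected TCP socket. */
static void disable_nagle(int sock_fd)
{
    int one = 1;
    if (setsockopt(sock_fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) < 0)
        perror("setsockopt(TCP_NODELAY)");
}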

Related

Nonblocking MPI and rendezvous protocol

I don't completely understand how MPI's nonblocking communication and rendezvous protocol are supposed to interact.
Firstly, consider this pseudocode, which can block when the rendezvous protocol is used (assume we have 2 processes):
if (rank == 0) {
    MPI_Send(big_message, destination=1)
    MPI_Recv(source=1)
} else {
    MPI_Send(big_message, destination=0)
    MPI_Recv(source=0)
}
This can obviously block when the message is too big to fit in the internal buffer, as the MPI_Send calls in both processes would wait for a matching receive to be posted.
On my system, I have found the following modification to work:
if (rank == 0) {
    MPI_Isend(big_message, destination=1, &request)
    MPI_Recv(source=1)
    MPI_Wait(request)
} else {
    MPI_Isend(big_message, destination=0, &request)
    MPI_Recv(source=0)
    MPI_Wait(request)
}
We use nonblocking communication for sending the message. Would my solution be correct on every implementation of MPI? I have read that implementations are not mandated to initiate any form of communication when MPI_Isend is called, and can perform it upon calling MPI_Wait. Would such an implementation break my code? My understanding is that in such circumstances MPI_Isend is basically a no-op and, for my code, both processes would wait in MPI_Recv for a send which never comes.
If my pseudocode is non-portable is there a way of using nonblocking communication to fix it?
Saying that all communication can happen at the wait call is a simplistic view, and probably only correct if all communication is non-blocking. Taken strictly it would mean that your code would deadlock on the blocking receives because the sends would happen after them. That does not happen.
For your case, section 3.7.4 of the standard says that "[...] a call to MPI_Wait that completes a send will eventually return if a matching receive has been started [... some notes...]". So, yes, your code is correct.
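For what it's worth, here is a runnable sketch of that pattern in C (exactly two ranks are assumed; the message size and tag are arbitrary):

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int N = 1 << 22;                 /* "big" message: 4M ints */
    int *sendbuf = malloc(N * sizeof(int));
    int *recvbuf = malloc(N * sizeof(int));
    int peer = (rank == 0) ? 1 : 0;        /* assumes exactly 2 ranks */

    MPI_Request request;
    /* Start the send without waiting for it to complete... */
    MPI_Isend(sendbuf, N, MPI_INT, peer, 0, MPI_COMM_WORLD, &request);
    /* ...post the matching receive, which lets the peer's send complete... */
    MPI_Recv(recvbuf, N, MPI_INT, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    /* ...then wait for our own send to finish before reusing sendbuf. */
    MPI_Wait(&request, MPI_STATUS_IGNORE);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}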

Is gen_tcp:send/2 blocking?

Is gen_tcp:send() asynchronous? Assume I send some byte array using gen_tcp:send/2. When will the process continue:
a) immediately,
b) when the data arrives in the target's receive buffer, or
c) when the target reads the data from the buffer?
Thank you in advance.
gen_tcp:send/2 is synchronous. It means that the call returns only after the given packet has really been sent. Usually that happens immediately; however, if the TCP window is full, gen_tcp:send/2 blocks until the data is sent. So the call can theoretically block indefinitely (for example, when the receiver does not read data from the socket on its side).
Fortunately, there are socket options to avoid such a situation: {send_timeout, Integer} and {send_timeout_close, Boolean}, which can be set with inet:setopts/2. The first one specifies the longest time to wait for a send operation.
When that limit is exceeded, the send operation returns {error, timeout}. The default value of the option is infinity (which is why the call can block forever). Unfortunately, it is unknown how much of the data was sent if {error, timeout} was returned; in that case it is better to close the socket. If the second option, {send_timeout_close, Boolean}, is set to true, the socket is closed automatically when {error, timeout} occurs.
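A minimal sketch of setting those options and handling the timeout (the 5-second value is just an example):

%% Give up on a send that makes no progress for 5 seconds, and let the
%% socket be closed automatically when that happens.
ok = inet:setopts(Socket, [{send_timeout, 5000}, {send_timeout_close, true}]),
case gen_tcp:send(Socket, Data) of
    ok               -> ok;
    {error, timeout} -> {error, timeout};   %% socket already closed by send_timeout_close
    {error, Reason}  -> gen_tcp:close(Socket), {error, Reason}
end.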

Relationship with recvfrom, sleep

I don't know exactly what is happening in my case.
I was testing UPnP on Linux, just using recvfrom.
I got fewer HTTP responses than I expected (this time, I expected 3).
So I put sleep(1) inside the while() loop, and it works!
My question is: why?
What I know is that recvfrom returns one packet per call into the buffer — is there a relationship with this?
You can use the recvfrom() function for both connection-oriented and connectionless sockets. If you are using it on a connectionless socket and a message is too long to fit in the supplied buffer, the excess bytes are discarded. To avoid this kind of situation you can set the MSG_WAITALL flag, which "requests that the function block until the full amount of data requested can be returned. The function may return a smaller amount of data if a signal is caught, if the connection is terminated, if MSG_PEEK was specified, or if an error is pending for the socket."
If you are using recvfrom() on a stream-based socket such as SOCK_STREAM, message boundaries are ignored. In this case, data is returned to the user as soon as it becomes available, and no data is discarded.
In your case, instead of using sleep() you can set the MSG_WAITALL flag, which will block on the socket until the full amount of data requested can be returned. There is no relationship between the recvfrom() and sleep() functions.
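Whichever flag you use, each recvfrom() call on a UDP (SOCK_DGRAM) socket returns at most one datagram, so collecting several responses means calling it in a loop. A minimal sketch (the SO_RCVTIMEO receive timeout is an illustrative alternative to sleep(), not something from the answer above; the buffer size and function name are made up):

#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Receive up to `expected` UDP responses on `sock`, one datagram per
 * recvfrom() call, giving up after 2 seconds without a packet. */
static int collect_responses(int sock, int expected)
{
    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    char buf[2048];
    int got = 0;
    while (got < expected) {
        ssize_t n = recvfrom(sock, buf, sizeof(buf), 0, NULL, NULL);
        if (n < 0)
            break;                  /* timeout or error: stop waiting */
        got++;                      /* exactly one datagram per successful call */
    }
    return got;
}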

Pcap Dropping Packets

// Open the ethernet adapter
handle = pcap_open_live("eth0", 65356, 1, 0, errbuf);

// Make sure it opens correctly
if (handle == NULL)
{
    printf("Couldn't open device : %s\n", errbuf);
    exit(1);
}

// Compile filter
if (pcap_compile(handle, &bpf, "udp", 0, PCAP_NETMASK_UNKNOWN))
{
    printf("pcap_compile(): %s\n", pcap_geterr(handle));
    exit(1);
}

// Set Filter
if (pcap_setfilter(handle, &bpf) < 0)
{
    printf("pcap_setfilter(): %s\n", pcap_geterr(handle));
    exit(1);
}

// Set signals
signal(SIGINT, bailout);
signal(SIGTERM, bailout);
signal(SIGQUIT, bailout);

// Setup callback to process the packet
pcap_loop(handle, -1, process_packet, NULL);
The process_packet function strips the header and does a bit of processing on the data. However, when it takes too long, I think it is dropping packets.
How can I use pcap to listen for UDP packets and do some processing on the data without losing packets?
Well, you don't have infinite storage so, if you continuously run slower than the packets arrive, you will lose data at some point.
Of course, if you have a decent amount of storage and, on average, you don't run behind (for example, you may run slow during bursts but there are quiet times when you can catch up), that would alleviate the problem.
Some network sniffers do this, simply writing the raw data to a file for later analysis.
It's a trick you can use too, though not necessarily with a file. It's possible to use a large in-memory structure like a circular buffer, where one thread (the capture thread) writes raw data and another thread (the analysis thread) reads and interprets it. And, because each thread only handles one end of the buffer, you can even architect it without locks (or with very short locks).
That also makes it easy to detect if you've run out of buffer and raise an error of some sort rather than just losing data at your application level.
Of course, this all hinges on your "simple and quick as possible" capture thread being able to keep up with the traffic.
Clarifying what I mean, modify your process_packet function so that it does nothing but write the raw packet to a massive circular buffer (detecting overflow and acting accordingly). That should make it as fast as possible, avoiding pcap itself dropping packets.
Then, have an analysis thread that takes stuff off the queue and does the work formerly done in process_packet (the "gets rid of header and does a bit of processing on the data" bit).
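A minimal sketch of that split, assuming one fixed-size slot per packet and a plain mutex/condition-variable handoff (a lock-free ring is also possible, as noted above; the slot count, snapshot length, and names are illustrative). The analysis thread would be started with pthread_create() before entering pcap_loop().

#include <pcap.h>
#include <pthread.h>
#include <string.h>

#define SLOTS    4096
#define SNAPLEN  65536

struct slot { struct pcap_pkthdr hdr; unsigned char data[SNAPLEN]; };

static struct slot ring[SLOTS];
static int head, tail;                      /* producer writes head, consumer reads tail */
static unsigned long dropped;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

/* pcap callback: do nothing but copy the packet into the ring. */
static void process_packet(unsigned char *user, const struct pcap_pkthdr *h,
                           const unsigned char *bytes)
{
    (void)user;
    pthread_mutex_lock(&lock);
    int next = (head + 1) % SLOTS;
    if (next == tail) {
        dropped++;                          /* ring full: count it instead of blocking */
    } else {
        unsigned int n = h->caplen < SNAPLEN ? h->caplen : SNAPLEN;
        ring[head].hdr = *h;
        memcpy(ring[head].data, bytes, n);
        head = next;
        pthread_cond_signal(&nonempty);
    }
    pthread_mutex_unlock(&lock);
}

/* Analysis thread: pop packets and do the (slow) work formerly in the callback. */
static void *analyze(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (tail == head)
            pthread_cond_wait(&nonempty, &lock);
        struct pcap_pkthdr hdr = ring[tail].hdr;
        unsigned char data[SNAPLEN];
        memcpy(data, ring[tail].data, hdr.caplen);   /* copy out, then release the lock */
        tail = (tail + 1) % SLOTS;
        pthread_mutex_unlock(&lock);

        /* ... strip headers and process hdr / data here ... */
    }
    return NULL;
}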
Another possible solution is to bump up the pcap internal buffer size. As per the man page:
Packets that arrive for a capture are stored in a buffer, so that they do not have to be read by the application as soon as they arrive.
On some platforms, the buffer's size can be set; a size that's too small could mean that, if too many packets are being captured and the snapshot length doesn't limit the amount of data that's buffered, packets could be dropped if the buffer fills up before the application can read packets from it, while a size that's too large could use more non-pageable operating system memory than is necessary to prevent packets from being dropped.
The buffer size is set with pcap_set_buffer_size().
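Note that pcap_set_buffer_size() must be called before the handle is activated, so the pcap_open_live() call above would be replaced with the pcap_create()/pcap_activate() sequence; a sketch (the 16 MB figure is just an example):

// Create the handle, tune it, then activate it.
handle = pcap_create("eth0", errbuf);
if (handle == NULL)
{
    printf("pcap_create(): %s\n", errbuf);
    exit(1);
}
pcap_set_snaplen(handle, 65536);
pcap_set_promisc(handle, 1);
pcap_set_timeout(handle, 1000);                   // read timeout in milliseconds
pcap_set_buffer_size(handle, 16 * 1024 * 1024);   // larger kernel capture buffer
if (pcap_activate(handle) < 0)
{
    printf("pcap_activate(): %s\n", pcap_geterr(handle));
    exit(1);
}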
The only other possibility that springs to mind is to ensure that the processing you do on each packet is as optimised as it can be.
The splitting of processing into collection and analysis should alleviate a problem of not keeping up but it still relies on quiet time to catch up. If your network traffic is consistently more than your analysis can handle, all you're doing is delaying the problem. Optimising the analysis may be the only way to guarantee you'll never lose data.

Receiving image through winsocket

I have a proxy server running on my local machine, used to cache images while surfing. I set up my browser with a proxy to 127.0.0.1, receive the HTTP requests, take the data and send it back to the browser. It works fine for everything except large images. When I receive the image data, it only displays half the image (e.g. the top half of the Google logo). Here's my code:
char buffer[1024] = "";
string ret("");
while (true)
{
    valeurRetour = recv(socketClient_, buffer, sizeof(buffer), 0);
    if (valeurRetour <= 0) break;
    string t;
    t.assign(buffer, valeurRetour);
    ret += t;
    longueur += valeurRetour;
}
closesocket(socketClient_);
valeurRetour = send(socketServeur_, ret.c_str(), longueur, 0);
The socketClient_ socket is non-blocking. Any idea how to fix this problem?
You're not making fine enough distinctions among the possible return values of recv.
There are two levels here.
The first is, you're lumping 0 and -1 together. 0 means the remote peer closed its sending half of the connection, so your code does the right thing here, closing its socket down, too. -1 means something happened besides data being received. It could be a permanent error, a temporary error, or just a notification from the stack that something happened besides data being received. Your code lumps all such possibilities together, and on top of that treats them the same as when the remote peer closes the connection.
The second level is that not all reasons for getting -1 from recv are "errors" in the sense that the socket is no longer useful. I think if you start checking for -1 and then calling WSAGetLastError to find out why you got -1, you'll get WSAEWOULDBLOCK, which is normal since you have a non-blocking socket. It means the recv call cannot return data because it would have to block your program's execution thread to do so, and you told Winsock you wanted non-blocking calls.
A naive fix is to not break out of the loop on WSAEWOULDBLOCK but that just means you burn CPU time calling recv again and again until it returns data. That goes against the whole point of non-blocking sockets, which is that they let your program do other things while the network is busy. You're supposed to use functions like select, WSAAsyncSelect or WSAEventSelect to be notified when a call to the API function is likely to succeed again. Until then, you don't call it.
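A minimal sketch of that pattern using select (a drop-in variant of the loop from the question, assuming socketClient_ is the non-blocking socket; error handling is trimmed):

// Read until the peer closes its end, waiting for readability instead of
// spinning when recv() reports WSAEWOULDBLOCK.
std::string ret;
char buffer[1024];
for (;;)
{
    int n = recv(socketClient_, buffer, sizeof(buffer), 0);
    if (n > 0)
    {
        ret.append(buffer, n);              // got data
        continue;
    }
    if (n == 0)
        break;                              // peer closed its sending half: done
    if (WSAGetLastError() == WSAEWOULDBLOCK)
    {
        // No data right now: wait until the socket becomes readable.
        fd_set readset;
        FD_ZERO(&readset);
        FD_SET(socketClient_, &readset);
        if (select(0, &readset, NULL, NULL, NULL) <= 0)
            break;                          // select failed: treat as fatal
        continue;
    }
    break;                                  // real error: give up on this socket
}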
You might want to visit The Winsock Programmer's FAQ. (Disclaimer: I'm its maintainer.)
Have you analyzed the transaction at the HTTP level, i.e. checked the headers?
Are you accounting for things like chunked transfers?
I don't have a definite answer, in part because of the lack of detail given here.

Resources