I have a question about the TCP socket send buffer. As I understand it, a successful "write" syscall only means that the data has been copied into the send buffer; it returns once all the data is in the buffer and says nothing about whether the data has actually been sent.
So my question is: if the last part of an application request doesn't fill the send buffer, when does the kernel send that data out? I haven't seen any system call that lets the user process control when the buffered data is sent.
I have some basic questions on TCP protocol
Situation: Machine_A calls send(sockfd) to send data to Machine_B. send() call succeeds.
Question: When the send() call returns, does it mean the data has already reached Machine_B, or has it just been accepted by the operating system?
Situation: Machine_A calls send(sockfd) to send data to Machine_B, but Application_B on Machine_B is not reading from the socket fast enough. Application_A is writing 10 MB/s but Application_B is reading only 1 KB/s.
Question:
When does the send() call succeed on Machine_A in this case?
Does it succeed the moment the data is submitted to OS_A on Machine_A or does it wait until there is an acknowledgement from OS_B?
Does OS_B require Application_B to pull the packets before it is acknowledged to OS_A?
send only cares about putting data into the local socket buffer, i.e. it will not wait for an ACK from the recipient's machine, let alone wait until the data has been processed by the recipient application (which happens even later). If you need this kind of information, you need an application-level acknowledgement. Moreover, while TCP sends an ACK, other protocols such as UDP do not send one at all.
send will only fail if it cannot put data into the socket buffer, for example because there is no socket buffer (the socket is closed) or because the socket buffer is already full and send was called in non-blocking mode. If the socket buffer is full and send is called in blocking mode, it simply blocks until there is space in the socket buffer again.
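To see this behaviour for yourself, here is a minimal sketch in Python (my choice of language; the function name fill_send_buffer is made up for illustration). It opens a TCP connection over loopback, never reads on the receiving side, and keeps calling send on a non-blocking socket until the kernel refuses more data. The exact byte count depends on the buffer sizes configured on your system.

```python
import socket


def fill_send_buffer():
    """Send on a non-blocking TCP socket until the kernel refuses more data.

    Returns the number of bytes the kernel accepted even though the
    receiving application never called recv().
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(listener.getsockname())
    receiver, _ = listener.accept()  # accepted, but never read from

    sender.setblocking(False)
    chunk = b"x" * 65536
    accepted = 0
    try:
        while True:
            # send() returns once the bytes are copied into the send buffer.
            accepted += sender.send(chunk)
    except BlockingIOError:
        # Buffers on both sides are full; a blocking socket would wait here.
        pass
    finally:
        sender.close()
        receiver.close()
        listener.close()
    return accepted


if __name__ == "__main__":
    print("bytes accepted by the kernel with no reader:", fill_send_buffer())
```

Every send call before the BlockingIOError succeeded without the peer application reading a single byte, which is the point made above: success only means the data was buffered locally.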
Here's my understanding of the incoming data flow in TCP/IP:
The kernel reads data from the network interface into its buffer.
The kernel copies data from its buffer to the TCP socket buffer, where the sliding window operates.
The program that is blocked in read() wakes up and copies data from the socket buffer.
I'm a little confused about where the sliding window lives. Is it the same thing as the socket buffer?
Linux does not handle TCP's sliding window as a separate buffer, but rather as several indices indicating how much has already been received and read. The Linux kernel's packet handling can be described in many ways and broken into smaller parts as you go deeper, but the general flow is as follows:
The kernel prepares to receive data over a network interface: it prepares SKB (socket buffer) data structures and maps them to the interface's Rx DMA buffer ring.
When packets arrive, they fill these preconfigured buffers, and the kernel is notified of the packets' arrival in interrupt context. In this context, the buffers are moved to a receive queue so that the network stack can handle them outside interrupt context.
The network stack retrieves these packets and handles them accordingly, eventually reaching the TCP layer (if they are indeed TCP packets), which in turn handles the window.
See struct tcp_sock member u32 rcv_wnd which is then used in tp->rcvq_space.space as the per-connection space left in window.
The buffer is added to the socket receive queue and is read as stream data in tcp_recvmsg().
The important thing to remember here is that copying is the worst thing for performance. Therefore, the kernel avoids copies unless they are absolutely necessary and passes pointers instead.
When we send a packet and retransmission starts, do we return from the send function or not?
In my case, my application takes a lock, waits for send to return, and then releases the lock.
But in my scenario it never returned. I want to know whether send really returns when retransmission is in progress.
The send function transfers data into the socket send buffer, blocking while there isn't enough room.
Data is removed from the socket send buffer when acknowledged.
Retransmission starts when data that has been sent to the peer hasn't been acknowledged within the appropriate timeout interval.
The interactions between retransmission and the send() function consist basically of this: if data hasn't been acknowledged, it is still in the send buffer, which may cause the send() function to block.
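If the worry is holding a lock while send blocks indefinitely, one option is to bound the wait. The sketch below is in Python and only illustrates the idea; send_with_timeout and demo_stalled_peer are hypothetical names, and the 0.2 second timeout is arbitrary. It simulates a peer that never drains its buffer, so the send buffer fills and the next send times out instead of blocking forever. Note that in Python the timeout applies to each individual send call, not to the whole transfer.

```python
import socket


def send_with_timeout(sock, data, timeout_s):
    """Send data, giving up if any single send() waits longer than timeout_s.

    Returns the number of bytes the kernel accepted into the send buffer.
    """
    sock.settimeout(timeout_s)
    view = memoryview(data)
    sent = 0
    try:
        while sent < len(view):
            sent += sock.send(view[sent:])
    except socket.timeout:
        # No room appeared in the send buffer within timeout_s.
        pass
    return sent


def demo_stalled_peer():
    """Simulate a peer that never reads; return (timed_out, total_bytes)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(listener.getsockname())
    receiver, _ = listener.accept()  # never calls recv()

    chunk = bytes(1 << 20)  # 1 MiB of zero bytes
    total = 0
    timed_out = False
    try:
        for _ in range(1024):  # safety cap on the amount of data
            n = send_with_timeout(sender, chunk, 0.2)
            total += n
            if n < len(chunk):
                timed_out = True  # buffer stayed full: the caller gets control back
                break
    finally:
        sender.close()
        receiver.close()
        listener.close()
    return timed_out, total


if __name__ == "__main__":
    print(demo_stalled_peer())
```

With a bounded wait, the caller can release the lock, log the stall, and decide whether to retry or drop the connection.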
How can I identify TCP pushback when using IOCP? I.e. how can I find out that the receiver is not receiving, that tx/rx buffers on both sides of connection are full and that the sender should cease to send more data?
With any async TCP send operation the way to determine the rate that the peer is receiving data is to monitor the rate of send completions on the sender.
I've written about this in depth here. In summary, when the receiver's buffers fill, TCP flow control kicks in and the TCP window shrinks, so the sender cannot send, which causes the sender's TCP buffers to fill. Async send requests then cannot complete. If you track the number of send requests that are still pending, you can spot this situation and throttle the sender.
When this happens, the send won't complete.
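The bookkeeping for this can be very small. Below is a language-neutral sketch written in Python rather than against the IOCP API; the class SendThrottle, its method names, and the max_pending limit are all made up for illustration. The idea is to increment a counter when an async send is issued, decrement it in the completion handler, and stop issuing new sends while the counter is at the limit.

```python
class SendThrottle:
    """Counts async sends that were issued but have not completed yet."""

    def __init__(self, max_pending):
        self.max_pending = max_pending  # hypothetical tuning knob
        self.pending = 0

    def can_send(self):
        # False means completions have stalled and the sender should pause.
        return self.pending < self.max_pending

    def on_issued(self):
        # Call right after posting an async send.
        self.pending += 1

    def on_completed(self):
        # Call from the send completion handler.
        self.pending -= 1


def simulate_stalled_receiver(max_pending=4):
    """Issue sends while no completions arrive; return how many were issued."""
    throttle = SendThrottle(max_pending)
    issued = 0
    while throttle.can_send():
        throttle.on_issued()  # the peer is stalled, so nothing ever completes
        issued += 1
    return issued, throttle


if __name__ == "__main__":
    issued, throttle = simulate_stalled_receiver()
    print("sends issued before throttling:", issued)
    throttle.on_completed()  # one completion arrives
    print("can send again:", throttle.can_send())
```

In a real IOCP program the counter would typically need to be updated atomically, since completions arrive on other threads.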
I'm not sure how to tell in TCP when the sender has finished sending me the information.
For example, if A needs to send 200 bytes to B, how will B know that A has finished sending and that the information is ready to be passed to the application layer?
There's the FIN flag, but as far as I know it only signals that you're going to close the connection.
Thanks.
TCP has no obligation to tell the receiver when the sender has finished sending data on the connection. It can't because it has no understanding of the higher level protocol data it's transporting.
However, it will tell you when the connection closes (you'll then see a FIN, FIN-ACK sequence in a network trace). From a programming perspective, assuming you're using C, the function recv will return 0 when the peer closes the connection.
You can define a protocol, such as first sending a 4-byte int in a specified byte order that indicates how many bytes follow in the message body.
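As a rough illustration of such a length-prefixed protocol, here is a minimal sketch in Python. The helper names send_message, recv_exact and recv_message are my own, and network byte order (big-endian) for the 4-byte length is an assumption; any byte order works as long as both sides agree.

```python
import socket
import struct


def send_message(sock, payload):
    """Send a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes; recv may return fewer bytes per call."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_message(sock):
    """Read one length-prefixed message and return its body."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


if __name__ == "__main__":
    a, b = socket.socketpair()  # connected stream sockets, for demonstration
    send_message(a, b"x" * 200)
    send_message(a, b"second message")
    print(len(recv_message(b)))
    print(recv_message(b))
    a.close()
    b.close()
```

The receiver knows a message is complete once it has read the number of bytes announced in the header, regardless of how TCP split the data into segments.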
If you're on Unix/Linux, select and poll can signal that the other end has finished the transfer (did a half or full close). Once you've read all the data, read returns 0 (end of file) on a connection the peer has closed in an orderly way; it returns an error such as ECONNRESET only if the connection was reset.
If you do multiple transfers on one connection and want to signal the end of a "package" you have to build that into your protocol.
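For the single-transfer case, the half close mentioned above can be sketched as follows in Python (read_until_eof and demo_half_close are illustrative names, and socket.socketpair stands in for a real TCP connection). The sender shuts down its write side after the last byte, and the receiver reads until recv returns an empty result, which is Python's equivalent of recv returning 0 in C.

```python
import socket


def read_until_eof(sock):
    """Read until the peer closes its sending side; return all bytes read."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if chunk == b"":
            break  # end of stream: the peer did a half or full close
        chunks.append(chunk)
    return b"".join(chunks)


def demo_half_close():
    a, b = socket.socketpair()  # connected stream sockets, for demonstration
    a.sendall(b"A" * 200)
    a.shutdown(socket.SHUT_WR)  # "I am done sending", but a can still receive
    data = read_until_eof(b)
    b.sendall(b"ok")  # the reverse direction is still open
    reply = a.recv(16)
    a.close()
    b.close()
    return data, reply


if __name__ == "__main__":
    data, reply = demo_half_close()
    print(len(data), reply)
```

This only works once per connection, which is why multiple transfers on one connection need an explicit end-of-message marker or length in the protocol.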