TCP socket continuously returns EAGAIN - tcp

In my testing, I found that when I send packets of 1000-5000 bytes from my sender, they get assembled/bundled at the receiver into sizes of 8000-14000 bytes. I checked the Wireshark capture to confirm this.
I have 2 questions:
1) Who bundles these packets in between? The receiver uses select() to detect data and then calls the recvmsg API.
2) As the packet lengths at the receiver increased, I implemented partial reception so that recvmsg also returns partial data. In this case, after some time, the recvmsg call returns EAGAIN with 0 bytes.
The connection with the peer is still up, because the peer is still sending packets. Why is the recvmsg call returning an error with EAGAIN?
Please help !

I found that when I send packets of 1000-5000 bytes from my sender, they get assembled/bundled at the receiver into sizes of 8000-14000 bytes
1) Who bundles these packets in between? The receiver uses select() to detect data and then calls the recvmsg API.
If those are Ethernet packets of size 9000 bytes then you must be using jumbo frames.
TCP is a stream, there is no concept of a message. Data sent in multiple send() calls may be received in one recv() call and vice versa.
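The stream behaviour can be observed with a small experiment. This is only a sketch, assuming a loopback TCP connection on 127.0.0.1; the helper names loopback_pair and send_then_drain are made up for illustration. Python's socket module is used because its send/recv calls map closely onto the C API.

```python
import socket

def loopback_pair():
    """Return a connected (client, server) pair of TCP sockets on 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    return client, server

def send_then_drain(chunks):
    """Send each chunk with its own call, then read until all bytes arrive.

    Returns the list of sizes that the individual recv() calls returned.
    """
    client, server = loopback_pair()
    try:
        for chunk in chunks:
            client.sendall(chunk)
        expected = sum(len(c) for c in chunks)
        sizes = []
        while sum(sizes) < expected:
            data = server.recv(65536)
            if not data:                 # peer closed early
                break
            sizes.append(len(data))
        return sizes
    finally:
        client.close()
        server.close()

if __name__ == "__main__":
    # Three sends of 1000, 3000 and 5000 bytes; the read sizes need not match.
    print(send_then_drain([b"a" * 1000, b"b" * 3000, b"c" * 5000]))
```

The only thing the receiver can rely on is the total byte count and the byte order; how the bytes are split across recv() calls can differ from run to run.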
2) As the packet lengths at the receiver increased, I implemented partial reception so that recvmsg also returns partial data. In this case, after some time, the recvmsg call returns EAGAIN with 0 bytes. The connection with the peer is still up, because the peer is still sending packets. Why is the recvmsg call returning an error with EAGAIN?
It means you are using non-blocking sockets and no more data is currently available in the socket receive buffer: recvmsg returns -1 with errno set to EAGAIN, which is not a connection failure. In this case you should use select/poll/epoll to wait on one or more sockets until more data is available for reading.
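A minimal sketch of that pattern follows. It assumes socket.socketpair() as a stand-in for a connected stream socket, and the helper names try_recv and recv_when_ready are hypothetical. In Python, EAGAIN surfaces as BlockingIOError.

```python
import select
import socket

def try_recv(sock, nbytes=4096):
    """Non-blocking read: bytes on success, b"" on EOF, None if nothing is queued."""
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        # Equivalent of recv() returning -1 with errno EAGAIN / EWOULDBLOCK.
        return None

def recv_when_ready(sock, timeout=1.0, nbytes=4096):
    """Wait with select() until the socket is readable, then read once."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None                      # timed out, nothing arrived
    return try_recv(sock, nbytes)

if __name__ == "__main__":
    a, b = socket.socketpair()
    b.setblocking(False)
    print(try_recv(b))                   # None: buffer empty, connection still up
    a.sendall(b"hello")
    print(recv_when_ready(b))
    a.close()
    b.close()
```

Treating the "no data yet" case as a normal outcome, separate from EOF (empty bytes) and real errors, keeps the receive loop from tearing down a healthy connection.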

Related

Identifying last packet in a message sent by TCP

Say we have sender A sending a message to receiver B using TCP. Say the message to be sent from A to B is split into three packets of length 500 bytes, 500 bytes and 50 bytes, to be sent in that order. How does A indicate to B that the packet of length 50 bytes is the last part of the message? I understand that an ACK from B to A, sent for every other packet received by B, uses the sequence number to indicate how much data B has received since its last ACK. I read that FIN is used to terminate the connection between the sender and receiver. However, I can't find a description of how the last packet, of a message split into several packets, is indicated. I'm thinking the packets have to be reassembled, in order, before the message is passed to the receiving application. I think that as one of TCP's actions is to split the message into packets, there must be some way for the sender to flag that the last packet of a message has been sent.
I think that as one of TCP's actions is to split the message into packets
No, TCP takes a stream of data and segments it into PDUs called segments. It is IP that uses the TCP segments as the payload of IP packets, which are in turn the payload of the data-link protocol's frames, e.g. Ethernet frames.
However, I can't find a description of how the last packet, of a message split into several packets, is indicated.
Something like that is up to a higher protocol, e.g. HTTP. I think you are looking at TCP the wrong way. A TCP connection is like a bidirectional pipe; whatever you put in one end comes out the other end. TCP has no idea of the data structure, it just sends whatever it gets from the application or application-layer protocol. When an application or application-layer protocol is through using the connection, it tells TCP to tear it down.
The receiving TCP simply receives data and reorders it, asking for lost or missing segments. It passes properly ordered data up to the application or application-layer protocol, having no idea of the data structure because it is just a data stream to TCP.
Also, remember that both ends of a TCP connection are peers that can send and receive, and either end can send a segment with FIN that tells the other end that it is done sending, but the end sending the FIN is obligated to continue to receive until the other end also sends a FIN to say it is done sending. Either side could also kill the connection with a RST segment.
there must be some way of the sender flagging the last packet of a message has been sent.
Probably, but that is not the job of TCP; that is up to the application or application-layer protocol. When the application layer is done, it tells TCP to close, and that starts the FIN process. TCP has no idea what the last part of a message is because it knows nothing about the data. It keeps the pipe open until it is told to close it.
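One common way for an application-layer protocol to mark message boundaries is a length prefix. The sketch below is an illustration under assumptions, not something TCP or the answer prescribes: the 4-byte big-endian header and the names send_message, recv_exact and recv_message are invented for the example.

```python
import socket
import struct

HEADER = struct.Struct("!I")   # assumed framing: 4-byte big-endian payload length

def send_message(sock, payload):
    """Send one framed message: length header followed by the payload."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed in the middle of a message")
        buf += chunk
    return bytes(buf)

def recv_message(sock):
    """Read one framed message and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)

if __name__ == "__main__":
    a, b = socket.socketpair()           # stand-in for a TCP connection
    send_message(a, b"x" * 1050)         # e.g. the 500 + 500 + 50 byte message
    print(len(recv_message(b)))
    a.close()
    b.close()
```

With this framing the receiver knows a message is complete once it has read the announced number of bytes, regardless of how TCP segmented the stream.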

Basic TCP protocol questions -- What happens on send() and recv()

I have some basic questions on TCP protocol
Situation: Machine_A calls send(sockfd) to send data to Machine_B. send() call succeeds.
Question: When the send() call returns, does it mean the data has already reached Machine_B? Or has it just been accepted by the operating system?
Situation: Machine_A calls send(sockfd) to send data to Machine_B. But the application_B on Machine_B has not been reading from the socket fast enough. Application_A is writing 10MB/s but Application_B is just reading 1KB/sec.
Question:
When does the send() call succeed on Machine_A in this case?
Does it succeed the moment the data is submitted to OS_A on Machine_A or does it wait until there is an acknowledgement from OS_B?
Does OS_B require Application_B to pull the packets before it is acknowledged to OS_A?
send only cares about putting data into the local socket buffer, i.e. it will not wait for an ACK from the recipient's machine, let alone wait until the data has been processed by the recipient application (which happens even later). If you need this kind of information, you need some application-level acknowledgement. Moreover, while an ACK gets sent by TCP, it would not be sent by other protocols like UDP anyway.
send will only fail if it cannot put data in the socket buffer, maybe because there is no socket buffer (socket closed) or because the socket buffer is already full and send was called in non-blocking mode. If the socket buffer is full and send is called in blocking mode, it will just block until there is space in the socket buffer again.
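The buffer-full case can be reproduced with a peer that never reads. This is a sketch assuming socket.socketpair() as a stand-in for a connected stream socket; fill_send_buffer and its parameters are hypothetical names.

```python
import socket

def fill_send_buffer(sock, chunk=b"x" * 65536, limit=1 << 30):
    """Send on a non-blocking socket until the kernel refuses more data.

    Returns the number of bytes the local OS accepted.
    """
    sock.setblocking(False)
    accepted = 0
    while accepted < limit:
        try:
            accepted += sock.send(chunk)   # may accept only part of the chunk
        except BlockingIOError:
            # Buffers are full: the C call would return -1 with EAGAIN.
            break
    return accepted

if __name__ == "__main__":
    a, b = socket.socketpair()
    # b never calls recv(), yet every send() up to this point "succeeded".
    print(fill_send_buffer(a), "bytes accepted before send would block")
    a.close()
    b.close()
```

Every byte counted here was reported as sent even though the receiving application read nothing, which is why success from send() says nothing about delivery to the peer application.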

TCP send function retransmission logic?

When we send a packet and retransmission starts, do we return from the send function or not?
In my case, my application takes a lock, waits for send to return, and then releases the lock.
But in my scenario send never returned. I want to know whether send really returns when retransmission is in progress.
The send function transfers data into the socket send buffer, blocking while there isn't enough room.
Data is removed from the socket send buffer when acknowledged.
Retransmission starts when data that has been sent to the peer hasn't been acknowledged within the appropriate timeout interval.
The interactions between retransmission and the send() function consist basically of this: if data hasn't been acknowledged, it is still in the send buffer, which may cause the send() function to block.
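If the application holds a lock around send(), one possible way to bound the wait is a send timeout. The following is only a sketch of that idea, not something the answer above proposes: send_lock and send_with_deadline are hypothetical names, and socket.socketpair() stands in for the TCP connection.

```python
import socket
import threading

send_lock = threading.Lock()   # hypothetical application lock, as in the question

def send_with_deadline(sock, data, timeout):
    """Try to send all of data; return False if the send buffer stays full too long."""
    with send_lock:                      # released even when the send gives up
        sock.settimeout(timeout)
        try:
            sock.sendall(data)
            return True
        except socket.timeout:
            # Unacknowledged or unread data kept the buffer full.
            return False

if __name__ == "__main__":
    a, b = socket.socketpair()
    print(send_with_deadline(a, b"hi", 0.2))
    # b never reads, so a very large payload cannot fit and the call gives up.
    print(send_with_deadline(a, b"x" * (64 * 1024 * 1024), 0.2))
    a.close()
    b.close()
```

After a timeout it is unknown how much of the payload was queued, so a real protocol would usually treat the connection as unusable at that point rather than resend on the same socket.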

recv() and recvfrom() read from kernel buffer in terms of packets (segments) or in terms of bytes?

I am building a UDP or TCP server that uses recv() or recvfrom() to receive packets from clients.
It seems to me that the mechanism is this: the kernel receives packets from the network, strips the IP/TCP/UDP headers, and puts the data payload in the kernel buffer; recv() or recvfrom() then reads the data from the kernel buffer.
So this means there are only bytes in the buffer, and the bytes are not divided into parts that each correspond to the payload of one UDP datagram or TCP segment.
I would like each call of recv() or recvfrom() to receive exactly one TCP segment or UDP datagram (note that one TCP or UDP packet may span several IP packets due to IP fragmentation).
Is that possible or not?
If so, how?
Thanks!
I hope each call of recv() or recvfrom() only receives one TCP segment
No. It may return anything from one byte to the length you supplied (or, in non-blocking mode, fail with EAGAIN when no data is queued; a return of zero means the peer has closed), and the data may cross TCP segment boundaries, not that you have any way of telling where a TCP segment boundary is in the first place. You have to regard a TCP connection as a byte stream, nothing more.
or UDP datagram
Yes.
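For UDP the boundary preservation is easy to check. A minimal sketch, assuming loopback delivery on 127.0.0.1 (where loss and reordering are unlikely but still not excluded by UDP itself); udp_roundtrip is a hypothetical helper name.

```python
import socket

def udp_roundtrip(payloads):
    """Send each payload as one datagram; return what each recvfrom() call yields."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    receiver.settimeout(2.0)             # UDP may drop data; do not wait forever
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for payload in payloads:
            sender.sendto(payload, receiver.getsockname())
        received = []
        for _ in payloads:
            data, _addr = receiver.recvfrom(65535)   # one call, one datagram
            received.append(data)
        return received
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    print([len(d) for d in udp_roundtrip([b"a" * 10, b"b" * 200, b"c" * 3])])
```

Each recvfrom() returns at most one datagram; if the supplied buffer is smaller than the datagram, the excess is discarded rather than left for the next call.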

How does TCP/IP report errors?

How does TCP/IP report errors when packet delivery fails permanently? All Socket.write() APIs I've seen simply pass bytes to the underlying TCP/IP output buffer and transfer the data asynchronously. How then is TCP/IP supposed to notify the developer if packet delivery fails permanently (i.e. the destination host is no longer reachable)?
Any protocol that requires the sender to wait for confirmation from the remote end will get an error message. But what happens for protocols where a sender doesn't have to read any bytes from the destination? Does TCP/IP just fail silently? Perhaps Socket.close() will return an error? Does the TCP/IP specification say anything about this?
TCP/IP is a reliable byte stream protocol. All your bytes will get to the receiver or you'll get an error indication.
The error indication will come in the form of a closed socket. Regardless of the communication pattern (who does the sending), if the bytes can't be delivered, the socket will close.
So the question is, how do you see the socket close? If you're never reading, you'd eventually get an error trying to write to the closed socket (with ECONNRESET errno, I think).
If you have a need to sleep or wait for input on another file handle, you might want to do your waiting in a select() call where you include the socket in the list of sources you're waiting on (even if you never expect to receive anything). If the select() indicates that the socket is ready for a read call, you may get a -1 return (with ECONNRESET, I think). An EOF would indicate an orderly close (the other side did a shutdown() or close()).
How to distinguish this error close from a clean close (other program exiting, for example)? The errno values may be enough to distinguish error from orderly close.
If you want an unambiguous indication of a problem, you'll probably need to build some sort of application level protocol above the socket layer. For example, a short "ack" message sent by the receiver back to the sender. Then the violation of that higher level application protocol (sender didn't see an ack) would be a confirmation that it was an error close vs a clean close.
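The select()-based check described in this answer might look roughly like the sketch below. It assumes socket.socketpair() as a stand-in for the TCP connection; poll_peer and its status strings are invented for illustration.

```python
import select
import socket

def poll_peer(sock, timeout=1.0):
    """Return (status, data) with status 'idle', 'data', 'closed' or 'error'."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return "idle", b""
    try:
        data = sock.recv(4096)
    except OSError:
        # Covers ECONNRESET and similar: the C call would return -1.
        return "error", b""
    if data == b"":
        return "closed", b""             # EOF: orderly shutdown() or close()
    return "data", data

if __name__ == "__main__":
    a, b = socket.socketpair()
    print(poll_peer(b, 0.05))            # ('idle', b'')
    a.close()
    print(poll_peer(b))                  # ('closed', b'')
    b.close()
```

This only distinguishes what the local socket layer can see; confirming that the peer application processed the data still needs the application-level ack mentioned above.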
The sockets API has no way of informing the writer exactly how many bytes have been received as acknowledged by the peer. There are no guarantees made by the presence of a successful shutdown or close either.
The TCP/IP specification says nothing about the application interface (which is nearly always the sockets API).
SCTP is an alternative to TCP which attempts to address these shortcomings, among others.
In C, if you write to a socket with send(), you get back the number of bytes that were accepted for sending. If this does not match the number of bytes you meant to send, then you have a problem. Also, when you write to a socket whose connection has failed, the process receives SIGPIPE, whose default action is to terminate it. Before you start socket handling, you need to have a signal handler in place (or ignore the signal) so that you are alerted to SIGPIPE instead of being killed by it.
If you are reading from a socket, you really should wrap it with an alarm so you can timeout. Like "alarm(timeout_val); recv(); alarm(0)". Check the return code of recv, and if it's 0, that indicates that the connection has been closed. A negative return result indicates a read failure and you need to check errno.
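The same three outcomes (data, connection closed, timeout) can be sketched without alarm() by using a socket timeout. This is an assumption-laden illustration: recv_with_timeout is a hypothetical helper, socket.socketpair() stands in for the connection, and settimeout() replaces the alarm() calls from the answer.

```python
import socket

def recv_with_timeout(sock, nbytes, timeout):
    """Return (status, data) where status is 'data', 'closed' or 'timeout'."""
    sock.settimeout(timeout)             # plays the role of alarm(timeout_val)
    try:
        data = sock.recv(nbytes)
    except socket.timeout:
        return "timeout", b""
    finally:
        sock.settimeout(None)            # like alarm(0): back to plain blocking
    if data == b"":
        return "closed", b""             # recv() returned 0 in C terms
    return "data", data

if __name__ == "__main__":
    a, b = socket.socketpair()
    print(recv_with_timeout(b, 100, 0.1))   # ('timeout', b'')
    a.close()
    print(recv_with_timeout(b, 100, 0.5))   # ('closed', b'')
    b.close()
```

Other failures, the negative return in C, surface here as OSError exceptions and are left to the caller.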
TCP is built upon the IP protocol, which is the centerpiece of the Internet, providing much of the interoperability that drives routing, which is what determines how to get packets from their source to their destination. The IP protocol specifies that error messages should be sent back to the sender via the Internet Control Message Protocol (ICMP) when a packet fails to reach its destination. Some of these reasons include the Time To Live (TTL) field being decremented to zero, often meaning that the packet got stuck in a routing loop, or the packet getting dropped due to switch contention causing buffer overruns. As others have said, it is the responsibility of the socket API that is being used to relay these errors at the IP layer up to the application interacting with the network at the TCP layer.
TCP/IP packets are either raw, UDP, or TCP. TCP requires each byte to be acked, and it will retransmit bytes that are not acked in time. Raw and UDP are connectionless (aka best effort), so any lost packets (barring some ICMP cases, but many of these get filtered for security) are silently dropped. Upper-layer protocols can add reliability, as is done with some raw OSPF packets.

Resources