How to know when you finish receiving a TCP stream? - networking

I'm not sure how to tell in TCP when the sender finished sending me the information.
For example if A needs to send 200 bytes to B, how will B know that A finished sending, and that the information is ready to be transferred to the application layer?
There's the FIN flag, but as far as I know it's only there to signal that you're going to close the connection.
Thanks.

TCP has no obligation to tell the receiver when the sender has finished sending data on the connection. It can't because it has no understanding of the higher level protocol data it's transporting.
However, it will tell you when the connection closes (you'll see a FIN, FIN-ACK sequence in a network trace). From a programming perspective, assuming you're using C, the recv function will return 0 when the peer closes the connection.
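For example, a minimal C sketch (buffer size and processing are placeholders) of detecting the peer's close from recv:

#include <stdio.h>
#include <sys/socket.h>

/* Read until the peer closes its side of the connection. */
void read_until_close(int sock)
{
    char buf[4096];
    ssize_t n;

    while ((n = recv(sock, buf, sizeof(buf), 0)) > 0) {
        /* process n bytes from buf ... */
    }

    if (n == 0)
        printf("peer closed the connection (we saw its FIN)\n");
    else
        perror("recv");   /* error, e.g. ECONNRESET */
}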

You define a protocol such as first sending a 4 byte int in a specified byte order which indicates how many bytes will follow in the message body.
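A rough sketch of that scheme in C (recv_all, send_message and recv_message are made-up helper names; short sends and detailed error handling are skipped for brevity):

#include <arpa/inet.h>   /* htonl, ntohl */
#include <stdint.h>
#include <sys/socket.h>

/* Read exactly len bytes, handling short reads. */
static int recv_all(int sock, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(sock, p, len, 0);
        if (n <= 0)
            return -1;            /* error or peer closed early */
        p   += n;
        len -= n;
    }
    return 0;
}

/* Sender: 4-byte length in network byte order, then the body. */
int send_message(int sock, const void *body, uint32_t body_len)
{
    uint32_t hdr = htonl(body_len);
    if (send(sock, &hdr, sizeof(hdr), 0) != sizeof(hdr))
        return -1;
    return send(sock, body, body_len, 0) == (ssize_t)body_len ? 0 : -1;
}

/* Receiver: read the header first, then exactly that many bytes. */
int recv_message(int sock, void *body, uint32_t max_len, uint32_t *body_len)
{
    uint32_t hdr;
    if (recv_all(sock, &hdr, sizeof(hdr)) < 0)
        return -1;
    *body_len = ntohl(hdr);
    if (*body_len > max_len)
        return -1;                /* message too large for our buffer */
    return recv_all(sock, body, *body_len);
}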

If you're on unix/linux, select and poll can signal that the other end has finished transferring (did a half or full close). Once you've read all the data on a closed connection, read will return 0 (end of file), or an error if the connection was reset.
If you do multiple transfers on one connection and want to signal the end of a "package" you have to build that into your protocol.

Related

Identifying last packet in a message sent by TCP

Say we have sender A sending a message to receiver B using TCP. Say the message to be sent from A to B is split into three packets of length 500 bytes, 500 bytes and 50 bytes, to be sent in that order. How does A indicate to B that the packet of length 50 bytes is the last part of the message? I can understand that an ACK from B to A, sent every other packet received by B, indicates using the sequence number how much data has been received by B since the last ACK was sent by B. I read that FIN is used to terminate the connection between the sender and receiver. However, I can't find a description of how the last packet, of a message split into several packets, is indicated. I'm thinking the packets have to be reassembled, in order, before the message is sent to the receiving application. I think that as one of TCP's actions is to split the message into packets, there must be some way of the sender flagging that the last packet of a message has been sent.
I think that as one of TCP's actions is to split the message into packets
No, TCP takes a stream of data and divides it into PDUs called segments. It is IP that uses the TCP segments as the payload of IP packets, which are in turn the payload of data-link protocol frames, e.g. ethernet frames.
However, I can't find a description of how the last packet, of a message split into several packets, is indicated.
Something like that is up to a higher protocol, e.g. HTTP. I think you are looking at TCP the wrong way. A TCP connection is like a bidirectional pipe; whatever you put in one end comes out the other end. TCP has no idea of the data structure, it just sends whatever it gets from the application or application-layer protocol. When an application or application-layer protocol is through using the connection, it tells TCP to tear it down.
The receiving TCP simply receives data and reorders it, asking for lost or missing segments. It passes properly ordered data up to the application or application-layer protocol, having no idea of the data structure because it is just a data stream to TCP.
Also, remember that both ends of a TCP connection are peers that can send and receive, and either end can send a segment with FIN that tells the other end that it is done sending, but the end sending the FIN is obligated to continue to receive until the other end also sends a FIN to say it is done sending. Either side could also kill the connection with a RST segment.
there must be some way of the sender flagging that the last packet of a message has been sent.
Probably, but that is not the job of TCP; that is up to the application or application-layer protocol. When the application layer is done, it tells TCP to close, and that starts the FIN process. TCP has no idea what the last part of a message is because it knows nothing about the data. It keeps the pipe open until it is told to close it.
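As a concrete illustration of the FIN behaviour described above, here is a minimal C sketch in which one side finishes sending with shutdown() but keeps reading until the peer closes its side too (short sends are ignored for brevity):

#include <sys/socket.h>
#include <unistd.h>

/* Send a request, then half-close our sending side. shutdown(SHUT_WR)
 * makes TCP send a FIN, telling the peer "I'm done sending", while we
 * can still receive the reply until the peer closes its side as well. */
void request_and_read_reply(int sock, const char *req, size_t req_len)
{
    char buf[4096];
    ssize_t n;

    send(sock, req, req_len, 0);
    shutdown(sock, SHUT_WR);                 /* our FIN goes out here */

    while ((n = recv(sock, buf, sizeof(buf), 0)) > 0) {
        /* consume the reply ... */
    }
    close(sock);                             /* connection fully torn down */
}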

How does TCP PSH work?

PSH is a way to send data via TCP. Besides that, I can find very little info on how to implement it properly.
Here is what interests me:
Let's say the server window is 8000 bytes, and I send 2 requests of 150 and 600 bytes. Do I get some sort of confirmation that the data has been received? Can I somehow trigger a confirmation?
I've seen some ACK packets which do not contain PSH but do contain some sort of payload data (Wireshark marks it as "TCP segment data"). Is this data passed on to the user, and if it is, why do we need the PSH flag?
TCP PSH generally doesn't 'work' at all. Berkeley-derived TCP implementations completely ignore it.
Source: W.R. Stevens, TCP/IP Illustrated, vol I: 20.5 PUSH Flag.
@Arsen: Answering the second part of your question, "why do we need the PSH flag?"
The PSH flag in the TCP header informs the receiving host that the data should be pushed up to the receiving application immediately.
We use the PSH flag to exchange a timestamp value between two servers.
I am assuming that if we set the PSH flag, the packet won't wait in the receive buffer; it will be delivered directly to the receiver.
The data doesn't sit waiting in the receive buffer anyhow.
TCP apps must go out of their way to have the TCP layer bulk up a few packets and deliver full data buffers.
In fact it's somewhat frustrating to see applications allocate 64 KB buffers to receive data and then get a gazillion 1480/1472-byte messages.
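To illustrate, a small C sketch: even with a large buffer, recv() returns as soon as some data has arrived, so you typically see segment-sized chunks regardless of the PSH flag:

#include <stdio.h>
#include <sys/socket.h>

/* Even with a 64 KB buffer, recv() returns as soon as *some* data is
 * available, so each call typically yields roughly one segment's worth
 * (~1460 bytes on ethernet), PSH flag or not. */
void show_chunk_sizes(int sock)
{
    char buf[65536];
    ssize_t n;

    while ((n = recv(sock, buf, sizeof(buf), 0)) > 0)
        printf("recv returned %zd bytes\n", n);
}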

When does a Java socket send an ack?

My question is: when does a socket on the receiver side send an ACK? When the application reads the socket data, or when the underlying layers get the data and put it in the buffer?
I ask because I want the applications on both sides to know whether the other side took the packet or not.
It's up to the operating system's TCP stack when this happens. Since TCP provides a stream to the application, there's no guaranteed 1:1 correlation between the application's reads/writes, the packets sent on the wire, and the TCP ACKs.
If you need to be assured the other side has received/processed your data, you need to build that into your application protocol - e.g. send a reply stating the data was received.
TCP ACKs acknowledge the TCP segments at the transport layer, not the application layer. Only your application can signal explicitly that it has also processed the data from the buffers.
TCP/IP (and therefore Java sockets) will guarantee that you either successfully send the data OR get an error (an exception in the case of Java) eventually.
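The application-level reply suggested above is language-agnostic; as a rough sketch in C (the helper name and the "OK" reply convention are made up for illustration), it might look like this:

#include <string.h>
#include <sys/socket.h>

/* Made-up convention: the receiving application replies "OK\n" once it has
 * actually processed the record. TCP's own ACKs only mean the peer's kernel
 * buffered the bytes; this reply is what tells us the application got them. */
int send_with_app_ack(int sock, const void *data, size_t len)
{
    char reply[3];

    if (send(sock, data, len, 0) != (ssize_t)len)
        return -1;
    if (recv(sock, reply, sizeof(reply), MSG_WAITALL) != sizeof(reply))
        return -1;
    return memcmp(reply, "OK\n", 3) == 0 ? 0 : -1;
}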

Disconnect and Reconnect a connected datagram socket

I am trying to create an iterative server based on datagram sockets (UDP).
It calls connect() to the first client, whose address it gets from the first recvfrom() call (yes, I know this is not a real connection).
After having served this client, I disconnect the UDP socket (calling connect() with AF_UNSPEC).
Then I call recvfrom() to get the first packet from the next client.
Now the problem is that the call to recvfrom() in the second iteration of the loop returns 0. My clients never send empty packets, so what could be going on?
This is what I am doing (pseudocode):
s = socket(PF_INET, SOCK_DGRAM, 0)
bind(s)
for (;;)
{
    recvfrom(s, header, &client_address)   // get first packet from client
    connect(s, client_address)             // connect to this client
    serve_client(s)
    connect(s, AF_UNSPEC)                  // disconnect, ready to serve next client
}
EDIT: I found the bug: my client was accidentally sending an empty packet.
Now my problem is how to make the client wait to be served instead of sending a request into nowhere (the server is connected to another client and isn't serving any other client yet).
connect() is really completely unnecessary on SOCK_DGRAM.
Calling connect does not stop you receiving packets from other hosts, nor does it stop you sending them. Just don't bother, it's not really helpful.
CORRECTION: yes, apparently it does stop you receiving packets from other hosts. But doing this in a server is a bit silly because any other clients would be locked out while you were connect()ed to one. Also, you'll still need to catch "chaff" packets which float around. There are probably some race conditions associated with connect() on a DGRAM socket - what happens if you call connect() and packets from other hosts are already in the buffer?
Also, 0 is a valid return value from recvfrom(), as empty (no data) packets are valid and can exist (indeed, people often use them). So you can't check whether something has succeeded that way.
In all likelihood, a zero byte packet was in the queue already.
Your protocol should be engineered to minimise the chance of an errant datagram being misinterpreted; for this reason I'd suggest you don't use empty datagrams, and use a magic number instead.
UDP applications MUST be capable of recognising "chaff" packets and dropping them; they will turn up sooner or later.
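For instance, a minimal C sketch of the magic-number check suggested above (PROTO_MAGIC and the helper name are made up for illustration):

#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define PROTO_MAGIC 0x5CA1AB1Eu   /* made-up magic number for this protocol */

/* Drop "chaff": ignore any datagram that is too short or doesn't start
 * with our magic number, instead of trying to interpret it. */
int datagram_is_valid(const char *buf, size_t len)
{
    uint32_t magic;

    if (len < sizeof(magic))
        return 0;
    memcpy(&magic, buf, sizeof(magic));
    return ntohl(magic) == PROTO_MAGIC;
}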
man connect:
...
If the initiating socket is not connection-mode, then connect()
shall set the socket’s peer address, and no connection is made.
For SOCK_DGRAM sockets, the peer address identifies where all
datagrams are sent on subsequent send() functions, and limits
the remote sender for subsequent recv() functions. If address
is a null address for the protocol, the socket’s peer address
shall be reset.
...
Just a correction in case anyone stumbles across this like I did. To disconnect, connect() needs to be called with the sa_family member of the sockaddr set to AF_UNSPEC, not just passed AF_UNSPEC.
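A minimal C sketch of that disconnect (the helper name is made up):

#include <string.h>
#include <sys/socket.h>

/* Dissolve the association on a "connected" UDP socket: zero a sockaddr,
 * set its sa_family to AF_UNSPEC, and pass that to connect(). */
int udp_disconnect(int sock)
{
    struct sockaddr addr;

    memset(&addr, 0, sizeof(addr));
    addr.sa_family = AF_UNSPEC;
    return connect(sock, &addr, sizeof(addr));
}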

How does TCP/IP report errors?

How does TCP/IP report errors when packet delivery fails permanently? All Socket.write() APIs I've seen simply pass bytes to the underlying TCP/IP output buffer and transfer the data asynchronously. How then is TCP/IP supposed to notify the developer if packet delivery fails permanently (i.e. the destination host is no longer reachable)?
Any protocol that requires the sender to wait for confirmation from the remote end will get an error message. But what happens for protocols where a sender doesn't have to read any bytes from the destination? Does TCP/IP just fail silently? Perhaps Socket.close() will return an error? Does the TCP/IP specification say anything about this?
TCP/IP is a reliable byte stream protocol. All your bytes will get to the receiver or you'll get an error indication.
The error indication will come in the form of a closed socket. Regardless of what the communication pattern (who does the sending), if the bytes can't be delivered, the socket will close.
So the question is, how do you see the socket close? If you're never reading, you'd eventually get an error trying to write to the closed socket (with ECONNRESET errno, I think).
If you need to sleep or wait for input on another file handle, you might want to do your waiting in a select() call where you include the socket in the list of sources you're waiting on (even if you never expect to receive anything). If the select() indicates that the socket is ready for a read call, you may get a -1 return (with ECONNRESET, I think). An EOF would indicate an orderly close (the other side did a shutdown() or close()).
How do you distinguish this error close from a clean close (the other program exiting, for example)? The errno values may be enough to distinguish an error from an orderly close.
If you want an unambiguous indication of a problem, you'll probably need to build some sort of application level protocol above the socket layer. For example, a short "ack" message sent by the receiver back to the sender. Then the violation of that higher level application protocol (sender didn't see an ack) would be a confirmation that it was an error close vs a clean close.
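A minimal C sketch of the select()-then-read check described above (using MSG_PEEK so the check doesn't consume data):

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Wait until the socket becomes readable, then peek at it: 0 from recv()
 * means an orderly close (EOF), -1 means an error close (e.g. ECONNRESET). */
void wait_and_check_close(int sock)
{
    fd_set rfds;
    char byte;
    ssize_t n;

    FD_ZERO(&rfds);
    FD_SET(sock, &rfds);

    if (select(sock + 1, &rfds, NULL, NULL, NULL) > 0 && FD_ISSET(sock, &rfds)) {
        n = recv(sock, &byte, 1, MSG_PEEK);
        if (n == 0)
            printf("orderly close by peer\n");
        else if (n < 0)
            printf("error close: %s\n", strerror(errno));
        /* n == 1: ordinary data is waiting to be read */
    }
}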
The sockets API has no way of informing the writer exactly how many bytes have been received as acknowledged by the peer. There are no guarantees made by the presence of a successful shutdown or close either.
The TCP/IP specification says nothing about the application interface (which is nearly always the sockets API).
SCTP is an alternative to TCP which attempts to address these shortcomings, among others.
In C, when you write to a socket with send(), you get back the number of bytes that were actually sent. If this does not match the number of bytes you meant to send, then you have a problem. Also, when you write to a failed socket, you get SIGPIPE. Before you start socket handling, you need to have a signal handler in place that will alert you when SIGPIPE is raised.
If you are reading from a socket, you really should wrap it with an alarm so you can time out. Like "alarm(timeout_val); recv(); alarm(0)". Check the return code of recv, and if it's 0, that indicates that the connection has been closed. A negative return result indicates a read failure and you need to check errno.
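A rough C sketch of that pattern (the helper name is made up; here SIGPIPE is ignored so send() returns EPIPE instead of killing the process, which is a common alternative to a full handler, and SO_RCVTIMEO would be another way to get a read timeout):

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static void on_alarm(int sig) { (void)sig; }   /* only purpose: interrupt recv() */

void read_with_timeout(int sock, unsigned timeout_secs)
{
    struct sigaction sa;
    char buf[4096];
    ssize_t n;

    signal(SIGPIPE, SIG_IGN);          /* send() to a dead peer returns EPIPE instead of raising SIGPIPE */

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;          /* no SA_RESTART, so the alarm interrupts recv() */
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    alarm(timeout_secs);
    n = recv(sock, buf, sizeof(buf), 0);
    alarm(0);

    if (n == 0)
        printf("connection closed by peer\n");
    else if (n < 0 && errno == EINTR)
        printf("recv timed out\n");
    else if (n < 0)
        perror("recv");
}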
TCP is built upon the IP protocol, which is the centerpiece of the Internet, providing much of the interoperability that drives routing, which is what determines how to get packets from their source to their destination. The IP protocol specifies that error messages should be sent back to the sender via the Internet Control Message Protocol (ICMP) when a packet fails to reach its destination. Some of the reasons include the Time To Live (TTL) field being decremented to zero, often meaning that the packet got stuck in a routing loop, or the packet getting dropped due to switch contention causing buffer overruns. As others have said, it is the responsibility of the Socket API that is being used to relay these errors at the IP layer up to the application interacting with the network at the TCP layer.
TCP/IP packets are either raw, UDP, or TCP. TCP requires each byte to be ACKed, and it will retransmit bytes that are not ACKed in time. Raw and UDP are connectionless (aka best effort), so any lost packets (barring some ICMP cases, but many of these get filtered for security) are silently dropped. Upper-layer protocols can add reliability, as is done with some raw OSPF packets.

Resources