Frame-based and stream-based protocols - networking

I am reading the WebSocket specification, where it's written:
The WebSocket Protocol is designed on the principle that there should be minimal framing (the only framing that exists is to make the protocol frame-based instead of stream-based and to support a distinction between Unicode text and binary frames).
I don't understand what frame-based and stream-based mean. Could you please provide examples of each, with a basic explanation of the difference?

A short overview:
Frame-Based: If I told you that I'm sending you 8 bytes and I send you 6 bytes, you would wait for the next 2 bytes and then say "this is a message".
Stream-Based: I send you 6 bytes. I send you 2 bytes. Is it one message? Two messages? Six messages? How do you know where one message ends and another begins?
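To make the stream-based case concrete, here is a minimal sketch in Python using plain sockets (the names and the localhost setup are illustrative, not from any of the posts below). Two separate send calls on one side may arrive as a single recv on the other, because TCP preserves only byte order, not write boundaries:

```python
import socket
import threading

def run_server(listener):
    conn, _ = listener.accept()
    conn.sendall(b"6bytes")  # first "message": 6 bytes
    conn.sendall(b"2b")      # second "message": 2 bytes
    conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
threading.Thread(target=run_server, args=(listener,)).start()

client = socket.socket()
client.connect(listener.getsockname())
data = client.recv(1024)
# TCP gives no boundaries: depending on timing this may print
# b'6bytes', b'6bytes2b', or even a shorter prefix.
print(data)
```

A frame-based protocol removes that ambiguity by carrying a length (or a delimiter) with every message, as the answers below illustrate.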

Related

What indicates the end of a TCP stream?

I'm attempting to make a low-level packet reader, sort of like Wireshark, for learning purposes and fun.
Going into HTTP parsing, I quickly found that there are these TCP streams, and that the data doesn't all come in one packet.
EDIT: I've been told that what I've been calling a stream is not the right term. David Schwartz referred to it as a Query and EJP as a Sequence of Segments.
So I did a little reading on TCP, and I've read everywhere about the sequence and acknowledgment numbers, how they increase and wrap around, etc.
Connection establishment and termination I already had knowledge of.
To be honest, I've never been too good at interpretation or at using Google, but I've been at this for a bit, and I just can't seem to understand how the server/client ends a TCP stream.
I'm asking this because, after the totality of the HTTP request has been sent, the server immediately replies with the page. I've been going back and forth through the packets in Wireshark and I can't seem to find what delimits this stream (nothing indicating length in the first packet, nor any difference between the penultimate and last packets).
Is there no actual delimiter in TCP, and do you just get the total length from the HTTP headers' Content-Length?
As seen here, the first 3 packets are connection establishment.
The 3 after those are the sending of the HTTP request (the first being an ACK (the selected one), and the latter 2 being PSH & ACK).
The two after those are ACKs from the server, followed by the entirety of the response.
The last one is my ACK and the connection is never closed for the time I "recorded".
Thank you.
The end of a TCP stream is marked by a segment with the FIN bit set.
However, that isn't the answer to your real question, which is about the end of an HTTP request or response. If there is a body, that end is determined by:
The Content-Length, if present, or
The final chunk in a chunked transfer, if used, or
The FIN.
Otherwise it is the blank line after the headers, as pointed out by @Barmar below.
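As a rough illustration of those cases, here is a minimal Python sketch (read_http_response is a hypothetical helper; it handles the Content-Length and FIN cases, but not chunked transfer encoding):

```python
import socket

def read_http_response(sock: socket.socket):
    # Read until the blank line that ends the headers.
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed before the headers ended")
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")

    # Look for Content-Length among the header lines.
    length = None
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value)

    if length is not None:
        # The body ends after Content-Length bytes.
        while len(body) < length:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("peer closed mid-body")
            body += chunk
    else:
        # No length known: the body ends at the FIN, which
        # recv() reports as an empty read.
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            body += chunk
    return head, body
```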

Is websocket a stream-based or package-based protocol?

Imagine that I have a server and a client talking via WebSocket, each sending the other chunks of data. Different chunks may have different lengths.
Am I guaranteed that if the server sends a chunk in one call, the client will receive it in one message callback, and vice versa? I.e., does WebSocket have an embedded 'packing' ability, so that I don't have to care whether my data is split among several callbacks during transmission?
Theoretically, the WebSocket protocol presents a message-based protocol. However, bear in mind that...
WebSocket messages consist of one or more frames.
A frame can be either a complete frame or a fragmented frame.
Messages themselves do not have any length indication built into the protocol, only frames do.
Frames can have a payload length of up to 9,223,372,036,854,775,807 bytes (because the protocol allows for a 63-bit length indicator).
The primary purpose of fragmentation is to allow sending a message that is of unknown size when the message is started without having to buffer that message.
So...
A single WebSocket "message" could consist of an unlimited number of 9,223,372,036,854,775,807-byte fragments.
This may make it difficult for an implementation to always deliver complete messages to you via its API...
So whilst, in the general case, the answer to your question is that the WebSocket protocol is a message-based protocol and you don't have to frame your messages manually, the API that you're using may either have message size limits in place (to allow it to guarantee delivery of messages as a single chunk) or may present a streaming interface to allow for unlimited-sized messages.
I ranted about this back during the standardisation process here.
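To make the frame structure above concrete, here is a minimal sketch of a frame-header decoder following RFC 6455 (parse_frame_header is a hypothetical name, and the masking that client-to-server frames require is ignored):

```python
import struct

def parse_frame_header(data: bytes):
    # Decode an unmasked WebSocket frame header (RFC 6455).
    # Returns (fin, opcode, payload_length, header_size).
    first, second = data[0], data[1]
    fin = bool(first & 0x80)   # is this the final fragment of the message?
    opcode = first & 0x0F      # 0x0 continuation, 0x1 text, 0x2 binary, ...
    length = second & 0x7F     # 7-bit length, or an escape value
    header = 2
    if length == 126:          # the next 2 bytes hold a 16-bit length
        (length,) = struct.unpack(">H", data[2:4])
        header = 4
    elif length == 127:        # the next 8 bytes hold a 63-bit length
        (length,) = struct.unpack(">Q", data[2:10])
        header = 10
    return fin, opcode, length, header
```

A message ends at the first frame whose FIN bit is set; the fragments before it arrive as continuation frames.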
WebSocket is a message-based protocol, so if you send a chunk of data as the payload of a WebSocket message, the peer will receive one separate WebSocket message with exactly that chunk of data as payload.

TCP Binary protocol implementation

I'm writing a simple binary protocol over TCP/IP: It is composed as below
[DATASIZE - 2 Byte Fixed][DATA]
With this solution, even if I read multiple messages at once, I can divide each one by its size. But I have one doubt: is it possible to receive only a "portion" of a single message?
The only messages that you can receive over a TCP streaming service are single bytes. Anything more complex may be received in single bytes or any multiple thereof. So, yes, you can receive 'only a "portion" of a single message' in a single receive call.
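So the receiver has to loop until a whole message has been collected. A minimal Python sketch, assuming the 2-byte DATASIZE is big-endian (the question doesn't say; recv_exact and recv_message are hypothetical helpers):

```python
import socket
import struct

def recv_exact(sock: socket.socket, n: int) -> bytes:
    # A single recv() may return anything from 1 to n bytes,
    # so keep reading until exactly n bytes have arrived.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        buf += chunk
    return buf

def recv_message(sock: socket.socket) -> bytes:
    # [DATASIZE - 2 byte fixed][DATA]
    (size,) = struct.unpack(">H", recv_exact(sock, 2))
    return recv_exact(sock, size)
```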

Packets sometimes get concatenated

I'm trying to make a simple server/application in Erlang.
My server initialize a socket with gen_tcp:listen(Port, [list, {active, false}, {keepalive, true}, {nodelay, true}]) and the clients connect with gen_tcp:connect(Server, Port, [list, {active, true}, {keepalive, true}, {nodelay, true}]).
Messages received from the server are tested by guards such as {tcp, _, [115, 58 | Data]}.
Problem is, packets sometimes get concatenated when sent or received and thus cause unexpected behaviors as the guards consider the next packet as part of the variable.
Is there a way to make sure every packet is sent as a single message to the receiving process?
Plain TCP is a streaming protocol with no concept of packet boundaries (like Alnitak said).
Usually, you send messages in either UDP (which has limited per-packet size and can be received out of order) or TCP using a framed protocol.
Framed meaning you prefix each message with a size header (usually 4 bytes) that indicates how long the message is.
In erlang, you can add {packet,4} to your socket options to get framed packet behavior on top of TCP.
Assuming both sides (client/server) use {packet,4}, you will only get whole messages.
Note: you won't see the size header; Erlang will remove it from the message you see, so your example match at the top should still work just fine.
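For a peer that isn't written in Erlang, the same wire format is easy to produce by hand. A minimal Python sketch of what {packet,4} puts on the wire, a 4-byte big-endian length prefix (frame and unframe are hypothetical names):

```python
import struct

def frame(payload: bytes) -> bytes:
    # Prepend the 4-byte big-endian length header.
    return struct.pack(">I", len(payload)) + payload

def unframe(buf: bytes):
    # Return (message, rest) once a whole message is buffered,
    # or (None, buf) if more bytes are still needed.
    if len(buf) < 4:
        return None, buf
    (size,) = struct.unpack(">I", buf[:4])
    if len(buf) < 4 + size:
        return None, buf
    return buf[4:4 + size], buf[4 + size:]
```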
You're probably seeing the effects of Nagle's algorithm, which is designed to increase throughput by coalescing small packets into a single larger packet.
You need the Erlang equivalent of enabling the TCP_NODELAY socket option on the sending socket.
EDIT: Ah, I see you already set that. Hmm. TCP doesn't actually expose packet boundaries to the application layer; by definition it's a stream protocol.
If packet boundaries are important you should consider using UDP instead, or make sure that each packet you send is delimited in some manner. For example, in the TCP version of DNS each message is prefixed by a 2 byte length header, which tells the other end how much data to expect in the next chunk.
You need to implement a delimiter for your packets.
One solution is to use a special character such as ; or something similar.
The other solution is to send the size of the packet first.
PacketSizeInBytes:Body
Then read the provided amount of bytes from your message. When you're at the end, you've got your whole packet.
Nobody mentions that TCP may also split your message into multiple pieces (split your packet into two messages).
So the second solution is the best of all, but a little harder. The first one is still good, but it limits your ability to send packets containing the special character, though it is the easiest to implement. Of course, there are workarounds for all of this. I hope this helps.
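A minimal sketch of the delimiter approach in Python (DelimitedReader is a hypothetical name; it assumes the delimiter never appears inside a message, or that the sender escapes it):

```python
DELIM = b";"

class DelimitedReader:
    def __init__(self):
        self.buf = b""

    def feed(self, data: bytes):
        # Feed whatever recv() returned; yield each complete message.
        self.buf += data
        while DELIM in self.buf:
            msg, _, self.buf = self.buf.partition(DELIM)
            yield msg
```

Usage would be something like: for msg in reader.feed(sock.recv(4096)): handle(msg). The size-prefix approach avoids the special-character limitation entirely.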

Nagle-Like Problem

So I have this real-time game, with a C++ server (with Nagle disabled, using the SFML library) and a client using AsyncSocket, which also disables Nagle. I'm sending 30 packets every second. There is no problem sending from the client to the server, but when sending from the server to the clients, some of the packets are merged. For example, if I'm sending "a" and "b" in completely different packets, the client reads it as "ab". It happens only once in a while, but it causes a real problem in the game.
So what should I do? How can I solve this? Maybe it's something in the server? Maybe OS settings?
To be clear: I am NOT using Nagle, but I still have this problem. I disabled it in both the client and the server.
For example, if I'm sending "a" and "b" in completely different packets, the client reads it as "ab". It happens only once in a while, but it causes a real problem in the game.
I think you have lost sight of the fundamental nature of TCP: it is a stream protocol, not a packet protocol. TCP neither respects nor preserves the sender's data boundaries. To put it another way, TCP is free to combine (or split!) the "packets" you send, and present them to the receiver any way it wants. The only restriction that TCP honors is this: if a byte is delivered, it will be delivered in the same order in which it was sent. (And nothing about Nagle changes this.)
So, if you invoke send (or write) on the server twice, sending these six bytes:
"packet" 1: A B C
"packet" 2: D E F
Your client side might recv (or read) any of these sequences of bytes:
ABC / DEF
ABCDEF
AB / CD / EF
If your application requires knowledge of the boundaries between the sender's writes, then it is your responsibility to preserve and transmit that information.
As others have said, there are many ways to go about that. You could, for example, send a newline after each quantum of information. This is (in part) how HTTP, FTP, and SMTP work.
You could send the packet length along with the data. The generalized form for this is called TLV, for "Type, Length, Value": send a fixed-length type field, a fixed-length length field, and then an arbitrary-length value. This way you know when you have read the entire value and are ready for the next TLV (see the sketch after this answer).
You could arrange that every packet you send is identical in length.
I suppose there are other solutions, and I suppose that you can think of them on your own. But first you have to realize this: TCP can and will merge or break your application packets. You can rely upon the order of the bytes' delivery, but nothing else.
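As a concrete example of the TLV scheme mentioned above, here is a minimal Python sketch assuming a 1-byte type field and a 4-byte big-endian length field (both sizes are arbitrary choices for illustration, not part of any standard):

```python
import struct

def encode_tlv(msg_type: int, value: bytes) -> bytes:
    # 1-byte type, 4-byte big-endian length, then the value.
    return struct.pack(">BI", msg_type, len(value)) + value

def decode_tlv(buf: bytes):
    # Return (type, value, rest) for one complete record, or None
    # if the buffer does not yet hold a whole TLV.
    if len(buf) < 5:
        return None
    msg_type, length = struct.unpack(">BI", buf[:5])
    if len(buf) < 5 + length:
        return None
    return msg_type, buf[5:5 + length], buf[5 + length:]
```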
You have to disable Nagle in both peers. You might want to find a different protocol that's record-based, such as SCTP.
EDIT2
Since you are asking for a protocol here's how I would do it:
Define a header for the message. Let's say I would pick a 32-bit header.
Header:
MSG Length: 16b
Version: 8b
Type: 8b
Then the real message comes in, having MSG Length bytes.
So now that I have a format, how would I handle things ?
Server
When I write a message, I prepend the control information (the length is the most important, really) and send the whole thing. Having NODELAY enabled or not makes no difference.
Client
I continuously receive stuff from the server, right? So I have to do some sort of read loop.
Read bytes from the server. Any amount can arrive. Keep reading until you've got at least 4 bytes.
Once you have these 4 bytes, interpret them as the header and extract the MSG Length.
Keep reading until you've got at least MSG Length bytes. Now you've got your message and can process it.
This works regardless of TCP options (such as NODELAY), MTU restrictions, etc.
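A minimal Python sketch of that scheme, assuming big-endian fields (the answer doesn't specify a byte order, and send_message/recv_message are hypothetical names):

```python
import socket
import struct

HEADER = struct.Struct(">HBB")  # MSG Length: 16b, Version: 8b, Type: 8b

def send_message(sock: socket.socket, version: int, msg_type: int,
                 payload: bytes):
    # Prepend the control information and send the whole thing.
    sock.sendall(HEADER.pack(len(payload), version, msg_type) + payload)

def recv_exact(sock: socket.socket, n: int) -> bytes:
    # Keep reading until exactly n bytes have arrived; any amount
    # can show up per recv() call.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed mid-message")
        buf += chunk
    return buf

def recv_message(sock: socket.socket):
    # Read the 4 header bytes, then MSG Length bytes of body.
    length, version, msg_type = HEADER.unpack(recv_exact(sock, 4))
    return version, msg_type, recv_exact(sock, length)
```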
