Packet data structure? - networking

I'm designing a game server and I have never done anything like this before. I was just wondering what a good structure for a packet would be data-wise? I am using TCP if it matters. Here's an example, and what I was considering using as of now:
(each value in brackets is a byte)
[Packet length][Action ID][Number of Parameters]
[Parameter 1 data length as int][Parameter 1 data type][Parameter 1 data (multi byte)]
[Parameter 2 data length as int][Parameter 2 data type][Parameter 2 data (multi byte)]
[Parameter n data length as int][Parameter n data type][Parameter n data (multi byte)]
Like I said, I really have never done anything like this before so what I have above could be complete bull, which is why I'm asking ;). Also, is passing the total packet length even necessary?

Passing the total packet length is a good idea. It might cost two more bytes, but you can peek and wait until the socket has a full packet ready to read before receiving it. That makes the receiving code simpler.
Overall, I agree with brazzy: a language-supplied serialization mechanism is preferable to any self-made one.
Other than that (I think you are using a C-ish language without serialization), I would put the packet ID as the first data on the packet data structure. IMHO that's some sort of convention because the first data member of a struct is always at position 0 and any struct can be downcast to that, identifying otherwise anonymous data.
Your compiler may or may not produce packed structures. If it does, you can allocate a buffer, read the packet in, and then cast the buffer to the right structure depending on the first data member. If you are out of luck and it does not produce packed structures, be sure to have a deserialization method for each struct that constructs it from the raw receive buffer instead of casting in place.
Endianness is a factor, particularly in C-like languages. Make sure that packets always use the same endianness, or that you can identify a different endianness from a signature or similar marker. An odd thing that's very cool: C# and .NET seem to always hold data in little-endian convention when you access it in the way discussed in this post. I found that out when porting such an application to Mono on a Sun machine. Cool, but if you have that setup you should use the serialization facilities of C# anyway.
Other than that, your setup looks very okay!
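A minimal sketch of the "wait for a full packet" idea in Python, assuming a 3-byte header (1-byte action ID first, then a 2-byte big-endian body length). The header layout and the names recv_exact, send_packet and recv_packet are illustrative and not from the original post. Rather than peeking, it loops until the whole packet has arrived, which has the same effect.

```python
import socket
import struct

# Assumed header: 1-byte action ID, then 2-byte body length, network byte order.
HEADER = struct.Struct("!BH")

def recv_exact(sock, n):
    """Read exactly n bytes, looping because TCP may deliver partial chunks."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed mid-packet")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def send_packet(sock, action_id, body):
    sock.sendall(HEADER.pack(action_id, len(body)) + body)

def recv_packet(sock):
    action_id, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return action_id, recv_exact(sock, length)

# Demo over a local socket pair: two packets sent back to back stay separate.
a, b = socket.socketpair()
send_packet(a, 7, b"hello")
send_packet(a, 8, b"")
first = recv_packet(b)
second = recv_packet(b)
a.close()
b.close()
```

Because the length travels in the header, the receiver never has to guess where one packet ends and the next begins, even when TCP merges or splits the writes.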

Start by considering a much simpler basic wrapper: Tag, Length, Value (TLV). Your basic packet will look then like this:
[Tag] [Length] [Value]
Tag is a packet identifier (like your action ID).
Length is the packet length. You may need this to tell whether you have the full packet. It will also let you figure out how long the value portion is.
Value contains the actual data. The format of this can be anything.
In your case above, the value data contains a further series of TLV structures (parameter type, length, value). You don't actually need to send the number of parameters, as you can work it from the data length and walking the data.
As others have said, I would put the packet ID (Tag) first. Unless you have cross-platform concerns, I would consider wrapping your application's serialised object in a TLV and sending it across the wire like that. If you make a mistake or want to change later, you can always create a new tag with a different structure.
See Wikipedia for more details on TLV.
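As a rough illustration of nested TLV, here is a Python sketch assuming a 1-byte tag and a 2-byte big-endian length. Those field widths and the tag constants are assumptions made for the example. Note that the parameter count is never sent: the reader walks the value until the bytes run out.

```python
import struct

TL = struct.Struct("!BH")  # assumed widths: 1-byte tag, 2-byte length

def tlv_encode(tag, value):
    return TL.pack(tag, len(value)) + value

def tlv_walk(data):
    """Yield (tag, value) pairs until the buffer is exhausted."""
    offset = 0
    while offset < len(data):
        tag, length = TL.unpack_from(data, offset)
        offset += TL.size
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV")
        yield tag, value
        offset += length

# Hypothetical tags for one action and two parameter types.
ACTION_MOVE, PARAM_INT, PARAM_STR = 1, 10, 11

params = tlv_encode(PARAM_INT, struct.pack("!i", 42)) + tlv_encode(PARAM_STR, b"north")
packet = tlv_encode(ACTION_MOVE, params)

# The outer packet holds exactly one TLV; its value is a series of parameter TLVs.
(outer_tag, outer_value), = tlv_walk(packet)
decoded = list(tlv_walk(outer_value))
```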

To avoid reinventing the wheel, any serialization protocol will work for on-the-wire data (e.g. XML, JSON), and you might consider looking at BEEP for the basic protocol framework.
BEEP is summed up well in its FAQ document as 'kind of a "best hits" album of the tricks used by experienced application protocol designers since the early 80's.'

There's no reason to make it that complicated. I see that you have an action ID, so I suppose there would be a fixed number of actions.
For each action, you would define a data structure that holds that action's values. To send it over the wire, you allocate sum(sizeof(struct.i)) bytes, that is, the combined size of every element in your structure. So your packet would look like this:
[action ID][item 1 (sizeof(item 1) bytes)][item 2 (sizeof(item 2) bytes)]...[item n (sizeof(item n) bytes)]
The idea is that both sides of the connection already know the size and type of each variable, so you don't need to send that information.
For strings, you can just throw 'em in in a null terminated form, and then when you 'know' to look for a string based on your packet type, start reading and looking for a null.
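A small Python sketch of the fixed-layout idea, assuming two made-up actions: "move" (ID 3) with two floats and a 16-bit speed, and "chat" (ID 4) with a null-terminated UTF-8 string. The action IDs, field choices and function names are hypothetical.

```python
import struct

# Hypothetical action 3 = "move": action ID, x, y as 32-bit floats, speed as uint16.
MOVE_ID = 3
MOVE = struct.Struct("!BffH")  # "!" = network byte order, no padding between fields

def encode_move(x, y, speed):
    return MOVE.pack(MOVE_ID, x, y, speed)

def decode_move(data):
    _action_id, x, y, speed = MOVE.unpack(data)
    return x, y, speed

# Hypothetical action 4 = "chat": action ID followed by a null-terminated string.
CHAT_ID = 4

def encode_chat(text):
    return bytes([CHAT_ID]) + text.encode("utf-8") + b"\x00"

def decode_chat(data):
    end = data.index(b"\x00", 1)  # scan for the terminator after the ID byte
    return data[1:end].decode("utf-8")

move_bytes = encode_move(1.5, -2.0, 300)
chat_bytes = encode_chat("hi there")
```

The move packet is 11 bytes (1 + 4 + 4 + 2) with no per-field type or length information, because both ends share the layout.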
--
Another option would be to use '\r\n' to delineate your variables. That would require some overhead, and you would have to use text, rather than binary values, for numbers. But that way you could just use readline to read each variable. Your packets would look like this:
[action ID]
[item 1 (as text)]
...
[item n (as text)]
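A sketch of the line-based variant in Python, assuming ASCII text and that the receiver knows how many items each action carries. The function names are illustrative, and io.BytesIO stands in for the socket here.

```python
import io

def encode_lines(action_id, items):
    """Render the action ID and each item as text, one per CRLF-terminated line."""
    lines = [str(action_id)] + [str(item) for item in items]
    return "".join(line + "\r\n" for line in lines).encode("ascii")

def decode_lines(stream, item_count):
    """Read the action ID line, then item_count further lines."""
    def next_line():
        return stream.readline().rstrip(b"\r\n").decode("ascii")
    action_id = int(next_line())
    items = [next_line() for _ in range(item_count)]
    return action_id, items

wire = encode_lines(3, [1.5, -2, "north"])
decoded = decode_lines(io.BytesIO(wire), 3)
```

On a real connection, sock.makefile("rb") gives a file-like object with the same readline method. Every value comes back as text, so the receiver still converts numbers itself.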
--
Finally, simply serializing objects and passing them down the wire is a good way to do this too, with the least amount of code to write. Remember that you don't want to prematurely optimize, and that includes network traffic as well. If it turns out you need to squeeze out a little bit more performance later on you can go back and figure out a more efficient mechanism.
And check out Google's protocol buffers, which are supposedly an extremely fast way to serialize data in a platform-neutral way, kind of like a compact binary XML. There's also JSON, which is another platform-neutral encoding. Using protocol buffers or JSON would mean you wouldn't have to worry about how to specifically encode the messages.

Do you want the server to support multiple clients written in different languages? If not, it's probably not necessary to specify the structure exactly; instead use whatever facility for serializing data your language offers, simply to reduce the potential for errors.
If you do need the structure to be portable, the above looks OK, though you should specify stuff like endianness and text encoding as well in that case.

Related

Is this an advantage of MPI_PACK over derived datatype?

Suppose a process is going to send a number of arrays of different sizes but of the same type to another process in a single communication, so that the receiver builds the same arrays in its memory. Prior to the communication the receiver doesn't know the number of arrays and their sizes. So it seems to me that though the task can be done quite easily with MPI_Pack and MPI_Unpack, it cannot be done by creating a new datatype because the receiver doesn't know enough. Can this be regarded as an advantage of MPI_PACK over derived datatypes?
There is a passage in the official MPI document that may be referring to this:
The pack/unpack routines are provided for compatibility with previous libraries. Also, they provide some functionality that is not otherwise available in MPI. For instance, a message can be received in several parts, where the receive operation done on a later part may depend on the content of a former part.
You are absolutely right. The way I phrase it is that with MPI_Pack I can make "self-documenting messages". First you store an integer that says how many elements are coming up, then you pack those elements. The receiver inspects that first int, then unpacks the elements. The only catch is that the receiver needs to know an upper bound on the number of bytes in the pack buffer, but you can get that from a separate message or an MPI_Probe.
There is of course the matter that unpacking a packed message is way slower than straight copying out of a buffer.
Another advantage to packing is that it makes heterogeneous data much easier to handle. The MPI_Type_struct is rather a bother.
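To make the "self-documenting message" layout concrete, here is a sketch in plain Python using the standard struct module. It is an analogy only and does not call MPI: pack_arrays and unpack_arrays are hypothetical stand-ins for a sequence of MPI_Pack and MPI_Unpack calls, and the layout (array count, then each array's length followed by its doubles) is an assumption.

```python
import struct

def pack_arrays(arrays):
    """Pack [count][len_0][values_0]...[len_k][values_k] into one buffer."""
    out = [struct.pack("=i", len(arrays))]
    for arr in arrays:
        out.append(struct.pack("=i", len(arr)))
        out.append(struct.pack(f"={len(arr)}d", *arr))
    return b"".join(out)

def unpack_arrays(buf):
    """Each read depends on a value unpacked earlier from the same buffer."""
    offset = 0
    (count,) = struct.unpack_from("=i", buf, offset)
    offset += 4
    arrays = []
    for _ in range(count):
        (n,) = struct.unpack_from("=i", buf, offset)
        offset += 4
        arrays.append(list(struct.unpack_from(f"={n}d", buf, offset)))
        offset += 8 * n
    return arrays

sent = [[1.0, 2.0, 3.0], [4.5], []]
received = unpack_arrays(pack_arrays(sent))
```

The receiver needs no prior knowledge of how many arrays there are or how long each one is, which is the property a fixed derived datatype cannot provide.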

What Are The Reasons For Bit Shifting A Float Before Sending It Via A Network

I work with Unity and C#. When making multiplayer games, I've been told that for values like positions, which are floats, I should apply a bit shift operator to them before sending and reverse the operation on receipt. I have been told this not only allows for larger number values but also maintains floating-point precision that might otherwise be lost. However, I do not wish to run this operation every time I receive a packet unless I have to, though the bottlenecks seem to be in the actual parsing of the received bytes, especially without message framing and when converting from string to byte array. (But that's another story!)
My questions are:
Are these valid reasons to perform the operation? Are they accurate statements?
If not, should I be running bit shift ops on my floats?
If I should, what are the real reasons to do it?
Any additional information would be most appreciated.
One of the resources I'm referring to:
The main reason for converting to and from network byte order is to combat problems caused by endianness: it ensures that each byte of a multi-byte value (long, int, but also float) is read and written in a way that gives the same result regardless of architecture. In theory you can ignore this issue if you are sure you are exchanging data between systems that use the same endianness, but that's a rather bad idea from the very beginning: you are simply creating unneeded technical debt and keeping unjustified exceptions in the code ("It all works, BUT on the same endianness only. What can go wrong?").
Depending on your app architecture, you can convert the packet payload once when you receive it and then use that converted version in the rest of the code. Also note that you need to encode the data again prior to sending it out.
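A short Python sketch of what the conversion does, assuming positions are sent as three 32-bit floats. The function names are illustrative, and Python's struct module is used in place of the C# code from the question. It only fixes the order of the bytes; it does not change the range or precision of the float.

```python
import struct

def encode_position(x, y, z):
    """Pack three 32-bit floats in network (big-endian) byte order."""
    return struct.pack("!3f", x, y, z)

def decode_position(data):
    return struct.unpack("!3f", data)

wire = encode_position(1.0, -0.5, 256.25)
position = decode_position(wire)

# The same value 1.0 laid out in each byte order: identical bits, reversed bytes.
big_endian = struct.pack(">f", 1.0)
little_endian = struct.pack("<f", 1.0)
```

Both ends agree on "!" (network order), so the result is the same whichever byte order the hosts use natively.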

How to handle passing different types of serialized messages on a network

I'm currently sitting with the problem of passing messages that might contain different data over a network. I have created a prototype of my game, and now I'm busy implementing networking for my game.
I want to send different types of messages, as I think it would be silly to constantly send all the information every network-tick and I would rather send different messages that contain different data. What would be the best way to distinguish what message is received on the receiving side?
Currently I have a system where I prepend a string which distinguishes a certain type of message. My message is then sent through my own message parser class where it determines the type, and deserializes it to the correct type.
What I would like to know is if there is a better way of doing this? It seems like it should be a fairly common problem and so there must be a more trivial solution, unless I'm already doing it the trivial way.
Thanks!
I have carefully read your question again, and now I do not understand what your problem is. You say: "Currently I have a system where I prepend a string which distinguishes a certain type of message. My message is then sent through my own message parser class where it determines the type, and deserializes it to the correct type."
That looks OK. You may reduce the size of your message with my answer below the horizontal line, but the principle stays identical.
This is the right way for asynchronous communication. If your communication is synchronous, you know that when you send message A you will receive answer B, so you do not have to prepend a string that distinguishes the message, but you must take care not to send another message before receiving the answer to the previous one.
So if you know how the answer is formatted, you do not need any identification bytes: for example, you know that the first four bytes are an integer, followed by an eight-byte float, and so on.
Use boost::serialization: typically you save your structures, even ones with pointers, into a plain byte buffer, send that buffer over your network, and deserialize it on the other side.
This example shows how Boost.Serialization can be used with asio to encode and decode structures for transmission over a socket.
Even though it uses boost::asio, you could easily extract only the serialization part.
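The prepend-a-type-marker approach from the question can be sketched language-neutrally as follows. This Python example assumes a one-byte tag in place of the string prefix and JSON for the payload; the message classes, the registry and the tag values are all hypothetical, and it is not Boost code.

```python
import json

DECODERS = {}  # tag byte -> message class

def message(tag):
    """Class decorator that registers a message type under a one-byte tag."""
    def register(cls):
        cls.TAG = tag
        DECODERS[tag] = cls
        return cls
    return register

@message(1)
class PlayerMoved:
    def __init__(self, player, x, y):
        self.player, self.x, self.y = player, x, y

@message(2)
class ChatLine:
    def __init__(self, player, text):
        self.player, self.text = player, text

def encode(msg):
    # Tag byte first, then the instance fields as JSON.
    return bytes([msg.TAG]) + json.dumps(vars(msg)).encode("utf-8")

def decode(data):
    cls = DECODERS[data[0]]  # the first byte selects the type
    return cls(**json.loads(data[1:].decode("utf-8")))

received = decode(encode(ChatLine("ann", "hello")))
moved_wire = encode(PlayerMoved("bob", 1, 2))
```

Adding a new message type only means registering another class; the parser itself does not change.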

Serialized Data: How to Check if byte array is Qt or Boost

I receive raw data blocks without header information about the serialized data's origin. The only information I have is that it is one of the following: a serialized QByteArray or a Boost archive. Is there any way to check for a signature or similar?
Thanks!
TL;DR: No.
The other answers are rather dangerous. What you wish to do simply cannot be done without adding some information to the serialized data that describes the type of serialization used. Remember that neither Qt's nor Boost's serialization is designed to be robust against what amounts to a malicious data stream.
Qt's serialization of a QByteArray is simply a 32-bit byte count followed by the data. There is no type information or anything like that. The Boost archive contains a bit more information, but still, there are absolutely no guarantees that it will fail gracefully on what amounts to a random stream of bytes. It may fail by exhausting the memory, for example.
Try to deserialize using one, then the other, and assume the data is valid if the operation succeeds. This is not foolproof, but you can assume that the probability of some raw data being valid for both is small. Even then you have to try to deserialize in a specific order. Put the most probable first.

Math - big number from couple of numbers export-able

Let's say I have some numbers, like
5,10,7,8,9,6,2,4,8,5,3,9,78,5,6
I need to send this to another computer, but in the fewest possible bytes. I know that there is a way to do that; I just forgot what it's called and how it works. Generally it involves doing some math with those numbers to get one big number, from which I can then recover the original numbers. Thanks in advance.
EDIT
OK, so I need to send this text over UDP, but in as few bits as possible. I'm sending some options, like firstcolor-secondcolor; let's say I have 15 colors. Every color is just a number from 1 to 199, but maybe there is a better way to send this data? Thanks.
No one can say which compression scheme is the best for you. We don't have any information about the numbers. But as a first try, you could just write them into a file and use gzip compression on it. Or bzip2, or 7zip.
And only if all these don't help, you should think about doing the compression yourself.
You also didn't tell us your operating systems (source computer, destination computer) and from where you get the data.
[Update, based on the edit in the question:] So basically you want to send some numbers in the range of 1 to 199. This is pretty close to what a single byte can hold.
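If it is OK to use 8 bits per number (meaning you waste about 0.4 bits per number), this is trivial, but the details depend heavily on the programming language. Here is how it might look in Java syntax:
import java.net.DatagramPacket;
import java.nio.ByteBuffer;

// Assumes udpSocket is a connected java.net.DatagramSocket.
ByteBuffer buf = ByteBuffer.allocate(4);  // one byte per number
buf.put((byte) 1);
buf.put((byte) 199);  // stored as -57; recover with (b & 0xFF) on receipt
buf.put((byte) 78);
buf.put((byte) 7);
udpSocket.send(new DatagramPacket(buf.array(), buf.position()));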
Get a compression library (like zlib, for example) and feed your numbers in (as an array of integers, for example). This is compressing your data. That same library should allow you to reverse the process and decompress the data at the other end to get your values back out.
If you want to improve your algorithmic knowledge and your requirements are simple and non-critical I'd recommend having a go at writing your own compression/decompression code. If not, grab some code off the shelf - there are loads of good libraries around.
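For comparison, a small Python sketch that sends the numbers from the question as one byte each and also runs them through zlib. It assumes every value fits in 0..255, which holds for the stated range of 1 to 199.

```python
import zlib

numbers = [5, 10, 7, 8, 9, 6, 2, 4, 8, 5, 3, 9, 78, 5, 6]

raw = bytes(numbers)                # one byte per value; requires 0 <= value <= 255
compressed = zlib.compress(raw, 9)  # level 9 = best compression
restored = list(zlib.decompress(compressed))

sizes = (len(raw), len(compressed))
```

For an input this short, the compressed form is likely to be larger than the 15 raw bytes, because the zlib header and checksum add fixed overhead. Compression tends to pay off only on longer or more repetitive data.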
