what is overhead, payload, and header [closed] - networking

Can someone please explain what overhead, payload, header, and packet are? As far as I know, a packet is the whole unit of data that is transmitted. It consists of the actual data, which I think is the payload, and the source/destination information of the packet is in the header. So a packet consists of a header and a payload. So what is this overhead? Is overhead a part of the header? I found this on the internet: "Packet overhead includes all the extra bytes of information that are stored in the packet header".
The header already contains source/destination info. What are the extra bytes of information that this packet overhead refers to? I'm confused.

As you said, the packet has the "payload", which is the data it needs to transfer (usually the user's data). The "header" contains various things depending on the protocol you are using: UDP, for example, keeps it simple with little more than the source and destination ports, while TCP also carries things like a sequence number to ensure ordered delivery, a number of flags to confirm that the packet actually arrived at its destination, and a checksum so the receiver can tell whether the data was corrupted along the way.
Now, the "overhead" is the additional data you need in order to deliver your payload. In the cases above it is the header, because you have to add one to every payload you send over the network. TCP has a bigger overhead than UDP because it attaches more data to your payload, but in exchange you are guaranteed that your data reaches its destination, in the order you sent it and uncorrupted. UDP does not have these features, so it cannot give that guarantee.
Sometimes you will read or hear discussions about which protocol to use depending on the data you want to send. For example, say you have a game and you want to update a player's position every time they move; the payload itself will contain this:
int playerID;
float posX;
float posY;
The payload's size is 12 bytes. Let's say we send it using TCP; the whole packet will then look like this:
-------------
TCP_HEADER
-------------
int playerID;
float posX;
float posY;
Now the whole packet's size is payload + TCP_HEADER, which is 12 bytes + (20 to 60 bytes), so you have 20 to 60 bytes of overhead for your data. You can read about TCP's header here. Notice that the overhead is even bigger than the data itself!
Now you need to decide whether this is the protocol you want to use for your game. If you don't need the features TCP offers, you are better off using UDP, because it has a smaller overhead and therefore less data to send.
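To make that arithmetic concrete, here is a minimal C sketch of the calculation (the struct layout, the absence of padding, and the header sizes are assumptions for illustration; the IP header adds roughly another 20 bytes in either case):

#include <stdio.h>
#include <stdint.h>

/* Hypothetical position-update payload from the example above. */
struct position_update {
    int32_t player_id;
    float   pos_x;
    float   pos_y;
};

int main(void) {
    size_t payload     = sizeof(struct position_update); /* 12 bytes, assuming no padding   */
    size_t udp_hdr     = 8;                               /* fixed UDP header                */
    size_t tcp_hdr_min = 20, tcp_hdr_max = 60;            /* TCP header without/with options */

    printf("payload:   %zu bytes\n", payload);
    printf("UDP total: %zu bytes (%zu bytes overhead)\n", payload + udp_hdr, udp_hdr);
    printf("TCP total: %zu..%zu bytes (%zu..%zu bytes overhead)\n",
           payload + tcp_hdr_min, payload + tcp_hdr_max, tcp_hdr_min, tcp_hdr_max);
    return 0;
}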

You are correct that a packet generally consists of a header and then the payload. The overhead of a packet type is the amount of extra bandwidth required to transmit the payload. The packet header is extra information placed in front of the payload to ensure it gets to its destination.
The overhead varies because you can choose a different type of packet (or packet protocol) to transmit the data. Different packet protocols give you different features. The two key packet protocols in use today are TCP and UDP.
One can say UDP has lower overhead than TCP because its packets have a smaller header and therefore take less bandwidth to send the payload (the data).
The reasons for this are a deep subject, but suffice it to say that TCP provides many very useful features that UDP does not, such as guaranteed delivery of packets and corruption detection. Both are very useful protocols and are chosen based on which features an application needs (speed or reliability).

Related

Many small UDP datagrams vs fewer, larger ones

I have a system that sends "many" (hundreds of) UDP datagrams in bursts every once in a while (say, 10 times a minute). According to nload, this averages about 222 kbit/s. The content of these datagrams is JSON. I've considered altering the system so that it waits some time (500 ms?) and combines many of the JSON objects into one datagram before sending. But I'm not sure it's worth the effort (considering bandwidth, protocol, and frequency of sending). Would the new approach provide any real benefits over the current one?
The short answer is that it's up to you to decide that.
The long version is that it depends on your use case. Since we don't know what you're building, it's hard to say what's more important - latency? Throughput? Reliability? Something else? Let's analyze some pros and cons. Here's what I came up with:
Pros to sending larger packets:
Fewer messages means fewer system calls and less I/O against the network. That means fewer blocked/waiting threads and less time spent on interrupts.
Fewer, larger packets mean less overhead for each individual packet (stuff like the IP/UDP headers that are sent with every packet). Therefore a higher data rate is (theoretically) achievable, although keep in mind that all of these headers (L2 + IP + UDP) typically add up to no more than 60-70 bytes per packet, since the UDP header itself is only 8 bytes long.
Since UDP doesn't guarantee ordering, larger packets with more time between them will reduce any existing reordering.
Cons to sending larger packets:
Re-writing existing code, and making it (slightly) more complicated.
UDP is unreliable, so a loss of a single (large) packet would be more significant compared to the loss of a small packet.
Latency - some data will have to wait 500ms to be sent. That means that a delay is added between the sender and the receiver.
Fragmentation - if one of the packets you create crosses the MTU boundary (typically 1450-1500 bytes including the IP+UDP header, which is normally 28 bytes long), the IP layer would need to fragment the packet into several smaller ones. IP fragmentation is considered bad for a multitude of reasons.
Processing of larger packets might take longer.
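To make the trade-off concrete, here is a rough C sketch of the batching approach from the question; the 500 ms flush interval, the 1400-byte threshold, and the newline delimiter are assumptions for illustration, not a tested implementation:

#include <string.h>
#include <sys/socket.h>
#include <time.h>

#define FLUSH_INTERVAL_MS 500   /* assumed latency bound                    */
#define MAX_BATCH 1400          /* stay below a typical MTU-sized datagram  */

struct batcher {
    int    sock;                /* connected UDP socket                     */
    char   buf[MAX_BATCH];
    size_t used;
    struct timespec first_msg;  /* arrival time of oldest buffered message  */
};

static void flush_batch(struct batcher *b) {
    if (b->used > 0) {
        send(b->sock, b->buf, b->used, 0);   /* one datagram per batch */
        b->used = 0;
    }
}

/* Queue one JSON object, newline-delimited so the receiver can split the batch.
 * Assumes a single JSON object always fits in MAX_BATCH. */
static void enqueue(struct batcher *b, const char *json, size_t len) {
    if (b->used + len + 1 > MAX_BATCH)       /* would overflow: send what we have */
        flush_batch(b);
    if (b->used == 0)
        clock_gettime(CLOCK_MONOTONIC, &b->first_msg);
    memcpy(b->buf + b->used, json, len);
    b->buf[b->used + len] = '\n';
    b->used += len + 1;
}

/* Call periodically (e.g. from the main loop) to enforce the latency bound. */
static void maybe_flush(struct batcher *b) {
    struct timespec now;
    long elapsed_ms;
    if (b->used == 0)
        return;
    clock_gettime(CLOCK_MONOTONIC, &now);
    elapsed_ms = (now.tv_sec  - b->first_msg.tv_sec) * 1000
               + (now.tv_nsec - b->first_msg.tv_nsec) / 1000000;
    if (elapsed_ms >= FLUSH_INTERVAL_MS)
        flush_batch(b);
}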

TCP Urgent Pointer, buffer management, and the "Send" call

My question relates to TCP Segment-creation on the sending device.
My understanding of TCP is that it will buffer "non-urgent" bytes of traffic until either some kind of internal timeout is reached...or the MSS is reached. Then the segment is finished and transmitted onto the wire.
My question: If TCP has been buffering "normal/non-urgent" bytes, and then receives a string of "urgent" bytes from the upper-layer process will it:
Terminate buffering of "non-urgent" bytes, send the non-urgent segment, and start creation of a new TCP segment, beginning with the "urgent" bytes...or...
Continue building the currently-buffered, partial-segment, placing the urgent bytes somewhere in the middle of the segment after the normal bytes.
RFC 1122 (section 4.2.2.4) indicates that the Urgent Pointer points to the LAST BYTE of urgent data in a segment (inferring that non-urgent data could follow the urgent data within the same segment). It does not clarify if a segment must BEGIN with urgent data...or, if the urgent data might be "in the middle".
This question concerns a possible TCP segment with the "urgent" bit set but NOT the "push" bit. My understanding of RFC 793 is that they are mutually exclusive of each other (although typically set together).
Thanks!
My understanding of TCP is that it will buffer "non-urgent" bytes of traffic until either some kind of internal timeout is reached
If the Nagle algorithm is enabled, which it is by default. Otherwise it will just send the data immediately, subject to windowing etc.
...or the MSS is reached. Then the segment is finished and transmitted onto the wire.
Not really. It will transmit as and when it can, subject only to the Nagle algorithm, windowing, congestion control, etc.
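As an aside, that buffering behaviour is controllable from the application; a minimal POSIX sketch of turning the Nagle algorithm off looks like this (whether you should is a separate question):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable the Nagle algorithm on an existing TCP socket so small writes
 * go out immediately instead of being coalesced. Returns 0 on success. */
int disable_nagle(int sock) {
    int one = 1;
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}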
My question: If TCP has been buffering "normal/non-urgent" bytes, and then receives a string of "urgent" bytes from the upper-layer process will it:
Terminate buffering of "non-urgent" bytes, send the non-urgent segment, and start creation of a new TCP segment, beginning with the "urgent" bytes...or...
Continue building the currently-buffered, partial-segment, placing the urgent bytes somewhere in the middle of the segment after the normal bytes.
Neither. See above. From the point of view of actually sending, there is nothing 'urgent' about urgent data.
RFC 1122 (section 4.2.2.4) indicates that the Urgent Pointer points to the LAST BYTE of urgent data in a segment (inferring that non-urgent data could follow the urgent data within the same segment).
Correct, except that you mean 'implying'.
It does not clarify if a segment must BEGIN with urgent data...or, if the urgent data might be "in the middle".
It doesn't require that the segment must begin with urgent data, so it needn't.
This question concerns a possible TCP segment with the "urgent" bit set but NOT the "push" bit.
Why? The PUSH bit is basically meaningless to modern TCP implementations, and ignored, and there is no way you can detect segment boundaries, so why do you care?
My understanding of RFC 793 is that they are mutually exclusive of each other (although typically set together).
Why? Please explain.
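For completeness: applications normally reach the urgent mechanism through the sockets out-of-band interface rather than by influencing segmentation directly. A minimal POSIX sketch (socket setup omitted; in practice only a single byte of OOB data is delivered this way):

#include <sys/socket.h>

/* Sender side: mark one byte as urgent/out-of-band. The kernel decides how
 * it is segmented; the application has no control over whether it lands at
 * the start, middle, or end of a segment. */
void send_urgent(int sock) {
    char marker = '!';
    send(sock, &marker, 1, MSG_OOB);
}

/* Receiver side: pull the out-of-band byte separately from the normal
 * stream (assuming SO_OOBINLINE is not set on the socket). */
char recv_urgent(int sock) {
    char marker = 0;
    recv(sock, &marker, 1, MSG_OOB);
    return marker;
}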

Is it possible to send one bit from one computer to another computer through socket?

I am working on a project in which I have to send data in bits.
Is it possible to send 1 bit from one computer to another through the internet? Most people have told me that the minimum internet packet size is 64 bytes, so if I send 1 bit from one computer to another, it still costs 64 bytes of bandwidth.
A TCP or UDP packet consists of a header and data. You could carry your single bit in one byte of the data section, but you would need the header as well. Without the header it would be impossible to send the packet: the header contains all the information required to get the packet where it is supposed to go and to make sure it arrives safely.
I will take "internet packet" to mean Ethernet frame based on the value you give.
An Ethernet frame has a minimum total size of 64 bytes, including both header and payload; this is to ensure that the time to transmit a single frame is greater than the round-trip time between nodes.
This requirement is a feature of any network that uses CSMA/CD (specifically the CD, collision detection, part): it allows a transmitting node to detect a collision while it is still sending the frame.
Whilst Ethernet can be asked to send a frame smaller than 64 bytes, "padding" will be added to ensure the frame is at least 64 bytes.
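Practically, the smallest unit you can hand to a socket is one byte, so if your application really works in bits, the usual approach is to pack several of them into each byte before sending. A small C sketch, with made-up flag semantics:

#include <stdint.h>
#include <sys/socket.h>

/* Pack up to 8 boolean flags into a single byte so one send() carries
 * 8 application-level "bits" instead of 1. */
uint8_t pack_bits(const int flags[8]) {
    uint8_t b = 0;
    for (int i = 0; i < 8; i++)
        if (flags[i])
            b |= (uint8_t)(1u << i);
    return b;
}

void send_flags(int sock, const int flags[8]) {
    uint8_t b = pack_bits(flags);
    /* Even this single byte is wrapped in TCP/IP headers and, on Ethernet,
     * padded up to the 64-byte minimum frame discussed above. */
    send(sock, &b, 1, 0);
}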

How to manage multi-packet sends with gsocket?

I have a question regarding TCP/IP socket networking. Basically: are there any parts of TCP/IP that I can leverage to help manage multi-packet sends? For example, I want to send a 100 MB binary file, which would take something like 70-80 thousand TCP segments. Meanwhile I have a relatively fast polling receive on the other side. Would my receive have to accept each packet individually and "stitch" the data together packet by packet, looking for some size to be reached (it can look at the opcode and determine the size), or is there some way to tell TCP "hey, I'm sending 100 MB here, let them know when it is finished"?
I am using GLib's low-level socket library (GSocket).
When using a binary encoding like, say, protocol buffers, you would wrap the actual payload by prepending a header that includes the information necessary to decode the payload on the other end.
Say, 8 bytes where the first 4 signify the type of the encoded message and the second 4 indicate the length of the entire message.
On the receiving side you then read this header, which is part of the payload from TCP's point of view, to determine the message type and length. This lets you combine multiple messages in one send or split messages across packets and reliably recombine them.
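A rough C sketch of that 8-byte header, assuming a 4-byte type and a 4-byte length in network byte order (the exact layout is up to you, and the socket calls themselves are omitted):

#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl / ntohl */

/* Hypothetical framing header, as suggested above:
 * 4 bytes message type + 4 bytes body length, both big-endian. */

/* Sender: write the header followed by the body into 'out'
 * (which must hold at least 8 + body_len bytes). */
size_t frame_message(uint32_t type, const void *body, uint32_t body_len, uint8_t *out) {
    uint32_t t = htonl(type), l = htonl(body_len);
    memcpy(out, &t, 4);
    memcpy(out + 4, &l, 4);
    memcpy(out + 8, body, body_len);
    return 8 + body_len;
}

/* Receiver: once at least 8 bytes have been accumulated, parse the header
 * to learn how many more bytes make up this message. Returns 1 when the
 * header is complete, 0 if more data is needed. */
int parse_header(const uint8_t *buf, size_t have, uint32_t *type, uint32_t *body_len) {
    uint32_t t, l;
    if (have < 8)
        return 0;
    memcpy(&t, buf, 4);
    memcpy(&l, buf + 4, 4);
    *type = ntohl(t);
    *body_len = ntohl(l);
    return 1;   /* now wait until *body_len more bytes have arrived */
}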

Is there a good way to frame a protocol so data corruption can be detected in every case?

Background: I've spent a while working with a variety of device interfaces and have seen a lot of protocols, many serial and UDP in which data integrity is handled at the application protocol level. I've been seeking to improve my receive routine handling of protocols in general, and considering the "ideal" design of a protocol.
My question is: is there any protocol framing scheme out there that can definitively identify corrupt data in all cases? For example, consider the standard framing scheme of many protocols:
Field: Length in bytes
<SOH>: 1
<other framing information>: arbitrary, but fixed for a given protocol
<length>: 1 or 2
<data payload etc.>: based on length field (above)
<checksum/CRC>: 1 or 2
<ETX>: 1
For the vast majority of cases, this works fine. When you receive some data, you search for the SOH (or whatever your start byte sequence is), move forward a fixed number of bytes to your length field, and then move that number of bytes (plus or minus some fixed offset) to the end of the packet to your CRC, and if that checks out you know you have a valid packet. If you don't have enough bytes in your input buffer to find an SOH or to have a CRC based on the length field, then you wait until you receive enough to check the CRC. Disregarding CRC collisions (not much we can do about that), this guarantees that your packet is well formed and uncorrupted.
However, if the length field itself is corrupt and has a high value (which I'm running into), then you can't check the (corrupt) packet's CRC until you fill up your input buffer with enough bytes to meet the corrupt length field's requirement.
So is there a deterministic way to get around this, either in the receive handler or in the protocol design itself? I can set a maximum packet length or a timeout to flush my receive buffer in the receive handler, which should solve the problem on a practical level, but I'm still wondering if there's a "pure" theoretical solution that works for the general case and doesn't require setting implementation-specific maximum lengths or timeouts.
Thanks!
The reason why all protocols I know of, including those handling "streaming" data, chop the data stream into smaller transmission units, each with its own checks on board, is exactly to avoid the problems you describe. Probably the fundamental flaw in your protocol design is that the blocks are too big.
The accepted answer to this SO question contains a good explanation and a link to a very interesting (but rather math-heavy) paper on this subject.
So in short, you should stick to smaller transmission units, not only for practical programming-related reasons but also because of the role the message length plays in determining the error-detection strength offered by your CRC.
One way would be to encode the length field so that corruption of the length itself is easily detected, saving you from reading in a large buffer just to check the CRC.
For example, the XModem protocol embeds an 8-bit packet number followed by its one's complement.
It could mean doubling the size of your length field, but it's an option.
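A minimal C sketch of that idea, assuming a one-byte length field transmitted together with its one's complement (field widths and the framing around it are up to the protocol):

#include <stdint.h>

/* XModem-style self-checking length field: the length byte travels with its
 * one's complement, so a corrupted length can be rejected immediately instead
 * of forcing the receiver to buffer a huge (bogus) payload before the CRC fails. */

/* Sender: write the length and its complement into the frame header. */
void encode_length(uint8_t len, uint8_t out[2]) {
    out[0] = len;
    out[1] = (uint8_t)~len;
}

/* Receiver: returns 1 and stores the length if the pair is consistent,
 * 0 if the length field itself was corrupted. */
int decode_length(const uint8_t in[2], uint8_t *len) {
    if ((uint8_t)(in[0] ^ in[1]) != 0xFF)
        return 0;    /* complement mismatch: discard and resynchronize */
    *len = in[0];
    return 1;
}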
