"Why does the TCP header have a header length field while the UDP header does not?" Is this a valid question? - tcp

My question is the inverse of this one asked on Stack Overflow:
InverseQuestion
...but I don't think I get an unequivocal answer to what I'm seeking from that either. My question was posed on a mid-term I just took and was worth A LOT of points. I argued that it was not a legitimate question because the UDP header DOES have a length field (that speaks to the header and data), as shown in the screenshot I'm embedding. I could list dozens of references that have a similar diagram and explanation. The instructor simply marked it wrong with no explanation. We have been going back and forth since then and I can't get an answer as to why every UDP header diagram on the internet shows a length field if there is no length! Can someone help me understand--if it's true there is no header length--in plain English? Am I misinterpreting all these similar diagrams? Thanks.
UDP Diagram
https://www.computernetworkingnotes.com/ccna-study-guide/segmentation-explained-with-tcp-and-udp-header.html
https://www.lifewire.com/tcp-headers-and-udp-headers-explained-817970

"Why does the TCP header have a header length field while the UDP header does not?" might be a valid question.
The UDP header contains a Length field that covers header + data.
The TCP header contains the header length only (the Data Offset field), counted in 32-bit words.
The IP header contains the total length of the IP packet.
Important:
The UDP header is a fixed 8 bytes, so a separate header length field would only ever hold a constant.
The TCP header can vary in size because of options, so the receiver needs the header length to find where the data starts.
If you're looking for the reason why the UDP length includes the data and the TCP one doesn't, you can check the RFC specification for each protocol. There might not be any deeper reason, though; don't forget these protocols were defined decades ago.
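The difference between the two fields can be sketched in a few lines of Python. This is only an illustration: the function names and the hand-built sample segments are made up for the example, and the field positions follow the standard UDP and TCP header layouts.

```python
import struct

def udp_lengths(segment):
    """Return (header_len, payload_len) for a raw UDP datagram."""
    # UDP header: source port, destination port, length, checksum (2 bytes each).
    _src, _dst, length, _checksum = struct.unpack("!HHHH", segment[:8])
    # The header is always 8 bytes; the Length field covers header + data.
    return 8, length - 8

def tcp_header_length(segment):
    """Return the TCP header length in bytes, taken from the Data Offset field."""
    # Byte 12 holds Data Offset in its upper 4 bits, counted in 32-bit words.
    data_offset_words = segment[12] >> 4
    return data_offset_words * 4

# Hypothetical UDP datagram carrying 4 bytes of data.
udp = struct.pack("!HHHH", 5000, 53, 8 + 4, 0) + b"ping"
# Hypothetical TCP segment: 20 fixed bytes plus 12 bytes of NOP options (offset = 8 words).
tcp = struct.pack("!HHIIBBHHH", 5000, 80, 0, 0, 8 << 4, 0x10, 65535, 0, 0) + b"\x01" * 12

print(udp_lengths(udp))
print(tcp_header_length(tcp))
```

UDP never needs the second function because its header size is a constant, which is the point of the answer above.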

Related

Is the IP header checksum a foolproof method of error detection?

While going through the IP header checksum, i.e. the 1's complement of the 1's complement sum of the 16-bit words, I can't help wondering how this method can detect errors or alterations in the data. For example, computer A sends a packet with data (12 and 7) and computer B receives the packet with the data altered (13 and 6). At the receiver the checksum still matches even though the data has changed. Could you please help me understand whether I am missing something here?
Thank you.
Is the IP header checksum a foolproof method of error detection?
No.
The IP header checksum's purpose is to enable detection of a damaged IP header. It does not protect against manipulation or damage to the data field (which often has its own checksum).
For protection against manipulation a cryptographic method is required.
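The collision described in the question is easy to reproduce. The following is a minimal sketch of a 16-bit one's complement checksum (the function name is made up for the example) applied to the two word pairs from the question:

```python
def ones_complement_checksum(words):
    """One's complement of the one's complement sum of 16-bit words."""
    total = 0
    for word in words:
        total += word
        # Fold any carry out of the low 16 bits back in (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

original = [12, 7]
altered = [13, 6]

same = ones_complement_checksum(original) == ones_complement_checksum(altered)
print(hex(ones_complement_checksum(original)), same)
```

Both pairs sum to 19, so they produce the same checksum. Any set of changes that leaves the sum unchanged goes undetected, which is why the checksum only catches accidental damage and not deliberate manipulation.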

Understanding tshark output

I am trying to understand the output of network data captured by tshark using the following command
sudo tshark -i any 'tcp port 80' -V -c 800 -R 'http contains <filter_argument>' > <desired_file_location>
Accordingly, I get some packets in output each starting with a line something like this:
Frame 5: 1843 bytes on wire (14744 bits), 1843 bytes captured (14744 bits) on interface 0
I have some basic questions regarding a packet:
Is a frame and a packet the same thing (used interchangeably)?
Does a packet logically represent 1 request (in my case HTTP request)? If not, can a request span across multiple packets, or can a packet contain multiple requests? A more basic question will be what does a packet represent?
I see a lot of information being captured in the request. Is there a way to use tshark to capture just the HTTP headers and the HTTP request body? Basically, my motive for this whole exercise is to capture all these requests so I can replay them later.
Any pointers in order to answer these doubts will be really helpful.
You've asked several questions. Here are some answers.
Are frames and packets the same things?
No. Technically, when you are looking at network data and that data includes the Layer 2 frame header, you are looking at a frame. The IP packet inside of that frame is just data from Layer 2's point of view. When you look at the IP datagram (or strip off the frame header), you are now looking at a packet.
Ultimately, I tell people that you should know the difference and try to use the terms properly, but in practice it's not an extremely important distinction.
Does a packet represent a single request?
This really depends. With HTTP 1.0 and 1.1, you could look at it this way, although a request can span multiple packets if, for example, the client has a significant amount of POST data to send. It is better to think of a single "connection" or "session" as a single request/response. (This is not strictly true with HTTP 1.1, which can reuse a connection for several requests, but it is generally true.)
With HTTP/2, this is by design not true. A single connection or session is used to handle multiple data streams (requests/responses).
How can I get at the request headers?
This is far too lengthy for me to answer here. The simplest thing to do, most likely, is to fire up Wireshark, go into the filter bar and type "http." As soon as you hit the dot, you will see a list of all of the different sub-elements that you can look at. You can use these in tshark with the '-Y' option, and you can additionally specify the columns that you would like to display (so you can effectively add and remove columns).
An alternative way to see this information is to use the filter expression button to bring up the protocols selector. If you scroll down to HTTP, you can select it and then see all of the fields that are available.
When looking through these, realize that some of the fields are at the top level rather than within request or response. For example, content-length appears as http.content_length rather than http.request.content_length. This is because content-length is a field common to both requests and responses.
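As a rough illustration of the '-Y' and column approach, here is a small Python sketch that assembles a tshark command line. The helper name and the choice of fields are assumptions made for the example; '-f', '-Y', '-T fields' and '-e' are standard tshark options, and the field names are the ones Wireshark lists under "http.".

```python
def build_tshark_command(interface, display_filter, fields):
    """Build a tshark argument list that prints only the chosen fields."""
    command = [
        "tshark",
        "-i", interface,        # capture interface
        "-f", "tcp port 80",    # capture filter
        "-Y", display_filter,   # display filter
        "-T", "fields",         # print selected fields instead of full decodes
    ]
    for field in fields:
        command += ["-e", field]
    return command

command = build_tshark_command(
    "any",
    "http.request",
    ["http.request.method", "http.host", "http.request.uri"],
)
print(" ".join(command))
```

The list can be passed to subprocess.run(command) to start the capture, which needs tshark installed and capture privileges.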

what is overhead, payload, and header [closed]

Can someone please explain to me what overhead, payload, header and packet are? As far as I know, a packet is the whole unit of data that is to be transmitted. This packet consists of the actual data, which I think is the payload, and the source/destination information of the packet is in the header. So a packet consists of a header and a payload. So what is this overhead? Is overhead a part of the header? I got this from the internet: "Packet overhead includes all the extra bytes of information that are stored in the packet header".
The header already contains source/destination info. What are the extra bytes of information that this packet overhead has? I'm confused.
The packet, like you said, has the "payload", which is the data itself that needs to be transferred (usually the user's data). The "header" contains various things depending on the protocol you are using. For example, the UDP header contains just simple things like the source and destination ports, a length and a checksum (the IP addresses live in the IP header underneath). TCP, on the other hand, contains more things: the sequence number to ensure ordered delivery, a number of flags and an acknowledgement number to make sure the data was actually received at its destination, and a checksum to make sure the data was not corrupted on the way.
Now, the "overhead" part is the additional data that you need in order to send your payload. In the cases I talked about above it's the header, because you need to add it to every payload that you want to send over the internet. TCP has a bigger overhead than UDP because it needs to add more data to your payload, but in return your data is delivered to its destination in the order you sent it, with lost packets retransmitted. UDP does not have these features, so it can't guarantee that.
Sometimes you will read or hear discussions about which protocol to use for the data you want to send. For example, let's say you have a game and you want to update the player's position every time he moves. The payload itself will contain this:
int playerID;
float posX;
float posY;
The payload's size is 12 bytes (assuming a 4-byte int and 4-byte floats). Let's say we send it using TCP; the whole packet will then look like this:
-------------
TCP_HEADER
-------------
int playerID;
float posX;
float posY;
Now the whole packet's size is payload + TCP_HEADER, which is 12 bytes + (20 to 60 bytes, depending on options), so you have 20 to 60 bytes of overhead for your data. You can read about TCP's header here. Notice that the overhead is even bigger than the data itself!
Now you need to decide whether this is the protocol you want to use for your game. If you don't need the features TCP offers, you are better off using UDP because it has a smaller overhead and therefore less data to send.
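The arithmetic can be checked with a short Python sketch. It assumes the standard header sizes (8 bytes for UDP, 20 to 60 bytes for TCP), packs the three fields as 4 bytes each, and ignores the IP and link-layer headers that both protocols pay for equally. The constant and function names are made up for the example.

```python
import struct

# int playerID, float posX, float posY: 4 bytes each, little-endian, no padding.
PAYLOAD_FORMAT = "<iff"
UDP_HEADER = 8
TCP_HEADER_MIN, TCP_HEADER_MAX = 20, 60

payload = struct.pack(PAYLOAD_FORMAT, 7, 1.5, -2.0)

def overhead_ratio(header_size, payload_size):
    """Fraction of the transport-layer segment that is header rather than data."""
    return header_size / (header_size + payload_size)

for name, header in [("UDP", UDP_HEADER),
                     ("TCP min", TCP_HEADER_MIN),
                     ("TCP max", TCP_HEADER_MAX)]:
    ratio = overhead_ratio(header, len(payload))
    print(f"{name}: {header} + {len(payload)} bytes, {ratio:.0%} overhead")
```

For a payload this small the header dominates in every case; the gap between UDP and TCP narrows as the payload grows.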
You are correct that a packet generally consists of a header followed by the payload. The overhead of a packet type is the amount of bandwidth that is spent on anything other than the payload in order to transmit it. The packet header is extra information put on top of the payload to ensure it gets to its destination.
The overhead is variable because you can choose a different type of packet (or packet protocol) to transmit the data. Different protocols give you different features. The two key transport protocols in use today are TCP and UDP.
One can say UDP has a lower overhead than TCP because its packets have a smaller header and therefore take less bandwidth to send the payload (the data).
The reasons for this are a deep subject, but suffice it to say that TCP provides many very useful features that UDP does not, such as guaranteed, in-order delivery with retransmission of lost packets. Both are very useful protocols and are chosen based on what an application needs (speed or reliability).

Dissector for TCP Option

I am new to writing dissectors in Lua and I have two quick questions. I have a packet whose TCP options are MSS, SACK, Timestamps, NOP, Window Scale, and Unknown. I am trying to dissect the Unknown section of the TCP options field. I am aware that I will have to use a chained dissector.
The first question: when using a chained dissector to parse the TCP options, do I have to parse all the options from the beginning? For example, will I need to parse MSS, SACK, ... and then finally parse the Unknown section, or is there a direct way for me to jump to the Unknown section?
The second question: I have seen the code for many custom protocol dissectors, and if I need to dissect a protocol that follows (for example) TCP, then I will have to include the following:
-- load the tcp.port table
tcp_table = DissectorTable.get("tcp.port")
-- register our protocol to handle tcp port
tcp_table:add(port,myproto_tcp_proto)
My question is, is there any way for me to jump to the middle of the protocol? For example, in my case I want to parse the TCP options. Can I directly call tcp.options so that the parser starts dissecting from where the options start?
A TCP option is a "uint8_t type; uint8_t len; uint8_t* data" structure.
I usually give the commonly used ones a name, for example getSack() or getMss().
For the others, keep them in an array (with a maximum size of, say, 20).
For your second question, you mean you don't care about the fixed TCP header, right? If so, just move your pointer 20 bytes further to reach the TCP options.
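Because each option carries its own length, there is no way to know where the unknown option starts without walking the options that precede it. The loop below is a Python sketch of that walk, not a Wireshark Lua dissector. The function name and sample bytes are made up for the example, and kind 253 stands in for the unknown option. Note that End of Option List (kind 0) and NOP (kind 1) are single bytes with no length field.

```python
def parse_tcp_options(options):
    """Walk the TCP options area and return a list of (kind, data) tuples."""
    parsed = []
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:      # End of Option List: single byte, stop parsing
            break
        if kind == 1:      # NOP: single byte, no length field
            parsed.append((kind, b""))
            i += 1
            continue
        if i + 1 >= len(options):
            break          # truncated option: no length byte present
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break          # malformed length: stop rather than loop forever
        parsed.append((kind, options[i + 2:i + length]))
        i += length
    return parsed

# MSS (kind 2), NOP, window scale (kind 3), then a stand-in unknown option (kind 253).
sample = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7, 253, 4, 0xAB, 0xCD])
options = parse_tcp_options(sample)
unknown = [data for kind, data in options if kind == 253]
print(options)
print(unknown)
```

The bytes passed in would be the slice of the TCP header from offset 20 up to the header length given by the Data Offset field.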

Split CRLF between TCP payloads

I'm currently writing a low-level HTTP parser and have run into the following issue:
I am receiving HTTP data on a packet-by-packet basis, i.e. TCP payloads one at a time. When parsing this data, I am using the HTTP protocol standards of searching for CRLF to delineate header lines, chunk data (in the case of chunked-encoding), and the dual CRLF to delineate header from body.
My question is: do I need to worry about the possibility of CRLF being split between two TCP packet payloads? For example, the HTTP header will finish with CRLFCRLF. Is it possible that two subsequent TCP packets will have CR, and then LFCRLF?
I am assuming that yes; this is a case to worry about, since the application (HTTP) and TCP layers are rather independent of each other.
Any insight into this would be highly appreciated, thank you!
Yes, it is possible for the CRLF to be split across different TCP packets. Just think about the possibility that a header line, including its CRLF, is exactly one byte longer than the TCP maximum segment size (MSS). In that case, the first segment has room for the CR but not for the LF.
So no matter how tricky your code will get, it must be able to handle this case of splitting.
What language are you working in? Does it not have some form of buffered read functionality for the socket, so you don't have this issue?
The short answer to your question is yes, theoretically you do have to worry about it, because it is possible the packets would arrive like that. It is very unlikely, because most HTTP endpoints will tend to send the header in one packet and the body in subsequent packets. This is less by convention and more by the nature of the way most socket-based programs/languages work.
One thing to bear in mind is that while the protocol standards are quite clear about the CRLF separation, many people who implement HTTP (clients in particular, but to some degree servers as well) don't know or care what they are doing and will not obey the rules. They will tend to separate lines with LF only, particularly the blank line between the head and the body. I have seen more code with this problem than I could easily count. While this is technically a protocol violation, most servers and clients accept this behaviour and work around it, so you will need to as well.
If you can't use some kind of buffered read functionality, there is some good news. All you need to do is read a packet at a time into memory and append the data to the previous packet(s). Every time you have read a packet, scan your data for a double CRLF sequence. If you don't find it, read the next packet, and so on until you find the end of the head. Memory usage will be relatively small, because the head of a request shouldn't normally be more than 5-6KB, which, given a typical Ethernet MTU of 1500 bytes (roughly 1460 bytes of TCP payload per packet), means you shouldn't need to hold more than 4 or 5 packets in memory to cope with it.
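A minimal Python sketch of this accumulate-and-scan approach follows. The class name is made up for the example; it handles only the strict CRLF CRLF terminator and has no size limit, so real code would also need a cap on the buffer and tolerance for bare LF.

```python
class HeaderAccumulator:
    """Collects TCP payloads until the blank line that ends the HTTP head is seen."""

    def __init__(self):
        self.buffer = bytearray()

    def feed(self, payload):
        """Append one payload; return (head, rest) once the head is complete, else None."""
        self.buffer += payload
        # Searching the whole buffer finds a terminator split across payloads.
        end = self.buffer.find(b"\r\n\r\n")
        if end == -1:
            return None
        head = bytes(self.buffer[:end])
        rest = bytes(self.buffer[end + 4:])
        return head, rest

acc = HeaderAccumulator()
# The terminator is split as CR LF CR | LF between the two payloads.
first = acc.feed(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r")
second = acc.feed(b"\nbody")
print(first)
print(second)
```

Because the search runs over the accumulated buffer rather than the latest payload, it does not matter where the packet boundary falls inside the terminator.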

Resources