How does wireshark interpret physical packets?
As far as I know, all packets look to be the same, so how does it decode them to pass to next higher protocol?
When it's used to capture live traffic it knows the type of the interface and therefore the L2 encapsulation of packets, and when it reads a pcap file, the file has a field in the header indicating network type.
There are probably a number of different mechanisms. You can download the dissectors and study the source to find out the various methods.
I wrote a dissector for a network sniffer and ported it to Ethereal and then Wireshark (or maybe someone else ported it; I don't remember). But the basic logic is that the dissector gets added to the list of possible dissectors. Wireshark calls a dissector and it decodes the packet if it can. If not, it calls the next one in the chain.
In the code I wrote, I simply analyzed the packet (UDP in my situation) to determine if it fit the profile of the desired packet using checksums and known data in the packet. If it decided it was the packet I was interested in I just extracted the various pieces of interesting data from the packet. The function tvb_get_ptr returns a pointer to the start of the data.
Related
When I use TCP I need destination port (to be able to "talk" to other process on the other host) and source port (because TCP is connection oriented so I'll send data back to source like ack, seq and more).
On the other side, UDP which is connectionless needs also source port.
Why is it? (I don't need to send back data)
Probably, two reasons.
First, receivers often need to reply and it is useful to provision a standard tool for that.
Secondly, you may have multiple interfaces (network cards) and using source address, you decide which of them must be used to emit the packet.
You don't need to but there's still the possibility to send a response back (that is very useful actually) however as stated in the RCF 768
Source Port is an optional field, when meaningful, it indicates the port
of the sending process, and may be assumed to be the port to which a
reply should be addressed in the absence of any other information. If
not used, a value of zero is inserted.
https://www.rfc-editor.org/rfc/rfc768
I would like to add to the answers here. Apart from simply knowing what to reply to, the source port can belong to the list of well-known port numbers. These ports specify what kind of data is encapsulated in the UDP (or TCP!) packet.
For example, the source port 530 indicates that the packet contains a Remote Procedure Call, and 520 indicates a Routing Information Protocol packet.
Recently I dug a bit deeper into the matter of network protocols and the OSI model, when noticed, that incoming TCP datagrams (correct me if this is the wrong term) are splitted into several parts, when they exceed a certain size - in this case it's propably my router's MTU. I captured those datagrams using SharpPcap in order to extract some information i am looking for, if you are wondering where I got this information from.
Anyway I was wondering if the reassembly of fragmented packets shouldn't be task of the IP layer, since it definitely provides information to accomplish this (id, fragmentation flags, fragment offset). Furthermore I read, that the TCP layer is to be interpreted as a stream-based protocol. But this actually implies, that it's up to the TCP layer to care about filling the application's buffer the right way, so that the initial piece of information is reconstructed and may be flushed "up" all further layers.
Before I made this observation I actually thought, that the TCP layer should care about reassembling those datagrams, but none of the mentioned layers does...
This leads to the following question(s):
Why are the TCP datagrams I receive not reassembled and what layer SHOULD actually take care about this?
The ip layer handles fragmentation and reassembly, http://en.wikipedia.org/wiki/IP_fragmentation.
When you use a tool like SharpPcap that uses winpcap/airpcap/libpcap, you are receiving the raw datagrams from the device you are capturing on. For many adapters this is an ethernet datagram that then contains an ip frame etc.
This is in contrast to data received after processing by the networking stack, where the reassembly is performed.
So, its expected that you won't get reassembled datagrams from SharpPcap (or many other capture libraries) because the data is being captured at the adapter level, not inside of our as an output of the networking stack that is performing reassembly.
You can perform the reassembly after capture either yourself or using a library that provides this functionality. You could also add such a component to Packet.Net (the packet processing library that SharpPcap is using) to provide this reassembly.
I know the protocol doesn't support this but is it common for clients that require some level of reliability to build into their application a method for requesting retransmission of a packet if found to be corrupt?
It is common for clients to implement reliability on top of UDP if they need reliability (or sometimes just some reliability) but not any of the other things that TCP offers, for example strict in-order delivery, and if they want, at the same time, low latency (or multicast, where it works).
In general, you will only want to use reliable UDP if there are urgent reasons (very low latency and high speed needed, e.g. for a twitch game). In every "normal" case, simply using TCP will serve you equally well or better.
Also note that it is easy to implement your own stack on top of UDP that performs worse than TCP.
See enet for an example of a library that implements reliability (and some other features) on top of UDP (Raknet or HawkNL would be other examples).
You might want to look at the answers to this question: What do you use when you need reliable UDP?
Of course. You can build a reliable protocol (like TCP) on top of UDP.
Example
Imagine you are building a fileserver:
* read the file using blocks of 1024 bytes
* construct an UDP packet with payload: 4 bytes for the "position" in the file, 4 bytes for the "length" of the contents of the packet.
The receiver now receives UDP packets. If he gets following packets:
* 0-1024: DATA
* 1024-2048: DATA
* 3072-4096: DATA
it realises a packet got missing, and asks the sending application to resend the part between 2048 and 3072.
This is a very basic example to explain your application code needs to deal with the packet construction and payload interpretation. Don't use it, it does not take edge cases (last packet, checksums for corrupted packets, ...) into account.
Short answer: No.
Long answer: UDP doesn't care for packet loss. If an UDP packet arrives and has a bad checksum, it is simply dropped. Neither the sender of the packet is informed about this, nor the recipient is informed. It is only dropped, that's all that happens. Also UDP packets are not numbered, so if you send four packets and only three arrive at the recipient, the recipient cannot know that there used to be four and one is missing. Last but not least, packets are not acknowledged, so the sender will never know if a packet it was sending out ever made it to the recipient or not.
Contrary to this, TCP breaks data into segments, each segment is numbered, so the recipient can know if a segment is missing. Also all segments must be acknowledged to the sender. If the sender receives no acknowledgment for a segment after a certain period of time, it will assume the segment was lost and send it again.
Of course, you can add an own frame header on top of every data fragment you sent over UDP, that way your application code can number the sent fragments and you can implement an acknowledgement-resent strategy in your code but the question is: Will this really be better than what TCP is already offering for free? Even if it would be equally good, save yourself the time and just use TCP. A lot of people already thought they can do better than TCP and usually they realize in the end, actually they cannot. TCP has its weaknesses but in general it is pretty good at what it does.
im now developing a project using winpcap..as i have known packets being sniffed are usually fragmented packets.
how to reassemble this TCP segements?..any ideas, suggestion or tutorials available?..
this i assume to be the only way i can view the HTTP header...
thanks!..
tcp is a byte stream protocol.
the sequence of bytes sent by your http application is encapsulated in tcp data segments and the byte stream is recreated before the data is delivered to the application on the other side.
since you are accessing the tcp datasegments using winpcap, you need to go to the data portion of the segment. the header of tcp has a fixed length of 20 bytes + an optional part which you need to determine using the winpcap api.
the length of data part in the tcp segment is determined by subtracting the tcp header length (obtained from a field in the tcp segment) and the ip header length (from a field in the ip datagram that encapsulates the tcp segment) from the total length (obtained from another field in the ip datagram).
so now you have the total segment length and the length of the data part within the segment. so you know offset where the http request data starts.
the offset is
total length-length of data part
or
length of ip-header + length of tcp header
i have not used winpcap. so you will have to find out how to get these fields using the api.
also ip datagrams may be further fragmented but i am expecting that you are provided only reassembled datagrams using this api. you are good to go!
There is no such thing as a TCP fragment. The IP protocol has fragments. TCP is a stream protocol. You can assemble the stream to its intended order by following the sequence numbers of both sides. Every TCP Packet goes to the IP level and can be fragmented there. You can assemble each packet by collecting all of the fragments and following the fragment offset from the header.
All of the information you need is in the headers. The wikipedia articles are quite useful in explaining what each field is
http://en.wikipedia.org/wiki/TCP_header#Packet_structure
http://en.wikipedia.org/wiki/IPv4#Header
PcapPlusPlus offers this capability out-of-the-box for all major OS's (including Windows). Please check out the TcpReassembly example to see a working code and the API documentation to understand how to use the TCP reassembly feature
Depending on the whose traffic you're attempting to passively reassemble, you may run into some TCP obfuscation techniques designed to confuse people trying to do exactly what you're trying to do. Check out this paper on different operating system reassembly behaviors.
libtins provides classes to perform TCP stream reassembly in a very high level way, so you don't have to worry about TCP internals to do so.
In practice, what is the most appropriate term for the communications transmitted over a network in higher level protocols (those above TCP/IP, for example)? Specifically, I am referring to small, binary units of data.
I have seen both "message" and "packet" referred to in various client/server libraries, but I was interested in the community's consensus.
These are definitely messages. A "packet" is a layer-3 (in ISO terminology) protocol unit, such as an IP packet; and a "datagram" is a layer-1 or layer-2 unit, such as the several Ethernet datagrams that might make up the fragments of an IP packet.
So a message might be split across several packets, particularly if you're using a streaming protocol such as TCP, and a packet might be split across several datagrams.
Just my take. It probably depends on what level you are working at. When I think of the entire transmission (all headers, data, etc) I would call that a Message. A packet, especially in TCP/IP, is just a part of a message. Multiple packets are pushed across the network comprising an entire message.
I think packet refers to the chunks of data transferred on a lower layer like Ethernet and message is used for higher level information exchange.
imo they basically mean the same...
edit:
There's also another terminology called frame, which is defined in RFC 1122 as "the unit of transmission in a link layer protocol, and consists of a link-layer header followed by a packet." [wikipedia]
msgs is packet in Network Layer
it is segement in TCP protocol(Transmission Layer)
it is msgs in HTTP or FTP(Application Layer)