how to reassemble tcp segment?


I'm developing a project using winpcap. As far as I know, the packets being sniffed are usually fragmented.
How do I reassemble these TCP segments? Any ideas, suggestions or tutorials available?
I assume this is the only way I can view the HTTP header.
Thanks!

tcp is a byte stream protocol.
the sequence of bytes sent by your http application is encapsulated in tcp data segments and the byte stream is recreated before the data is delivered to the application on the other side.
since you are accessing the tcp data segments using winpcap, you need to go to the data portion of the segment. the tcp header has a fixed part of 20 bytes plus an optional part; the total header length is given by the data offset field in the tcp header itself (counted in 32-bit words).
the length of data part in the tcp segment is determined by subtracting the tcp header length (obtained from a field in the tcp segment) and the ip header length (from a field in the ip datagram that encapsulates the tcp segment) from the total length (obtained from another field in the ip datagram).
so now you have the total length and the length of the data part within the segment, so you know the offset where the http request data starts.
the offset, measured from the start of the ip datagram, is
total length - length of data part
or
length of ip header + length of tcp header
i have not used winpcap. so you will have to find out how to get these fields using the api.
also ip datagrams may be further fragmented. a capture library like winpcap normally hands you packets as they appeared on the wire, so do not count on fragments being reassembled for you; check the fragment fields in the ip header.
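
The offset arithmetic above can be sketched in a few lines. This is only an illustration in Python, not winpcap code: the function name `tcp_payload` and the hand-built sample packet are made up, and it assumes you have already skipped the link-layer header (for example 14 bytes on Ethernet) so the buffer starts at the IPv4 header.

```python
import struct

def tcp_payload(ip_packet: bytes) -> bytes:
    """Return the TCP payload of a raw IPv4 packet (no link-layer header)."""
    # IPv4: the low nibble of byte 0 is the header length in 32-bit words.
    ip_header_len = (ip_packet[0] & 0x0F) * 4
    # IPv4: bytes 2-3 hold the total length of the datagram.
    total_len = struct.unpack("!H", ip_packet[2:4])[0]
    # TCP: the high nibble of byte 12 is the header length in 32-bit words.
    tcp_header_len = (ip_packet[ip_header_len + 12] >> 4) * 4
    offset = ip_header_len + tcp_header_len
    # Slice up to total_len so trailing link-layer padding is not included.
    return ip_packet[offset:total_len]

# Hand-built sample: 20-byte IPv4 header, 20-byte TCP header, 4 bytes of data.
payload = b"GET "
ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(payload), 0, 0,
                        64, 6, 0, bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
tcp_header = struct.pack("!HHIIBBHHH", 12345, 80, 1, 0, 5 << 4, 0x18, 65535, 0, 0)
sample = ip_header + tcp_header + payload
print(tcp_payload(sample))
```

The length of the data part is then simply `total_len - offset`, which is the subtraction described above.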

There is no such thing as a TCP fragment. The IP protocol has fragments. TCP is a stream protocol. You can assemble the stream to its intended order by following the sequence numbers of both sides. Every TCP Packet goes to the IP level and can be fragmented there. You can assemble each packet by collecting all of the fragments and following the fragment offset from the header.
All of the information you need is in the headers. The Wikipedia articles are quite useful in explaining what each field is:
http://en.wikipedia.org/wiki/TCP_header#Packet_structure
http://en.wikipedia.org/wiki/IPv4#Header
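
Following the sequence numbers might look roughly like this for one direction of a connection. It is a minimal sketch, not production code: `reassemble` is a hypothetical helper, `initial_seq` is assumed to be the sequence number of the first payload byte (the ISN plus one after the SYN), and sequence-number wraparound is ignored.

```python
def reassemble(segments, initial_seq):
    """Join (seq, payload) pairs from one direction into contiguous bytes.

    Stops at the first gap. Sequence-number wraparound is ignored for brevity.
    """
    stream = bytearray()
    next_seq = initial_seq
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if seq > next_seq:
            break  # gap: a segment is missing from the capture
        skip = next_seq - seq  # bytes already seen (retransmission or overlap)
        if skip < len(payload):
            stream += payload[skip:]
            next_seq = seq + len(payload)
    return bytes(stream)

# Out-of-order capture with one retransmitted segment.
segments = [(1005, b"index"), (1000, b"GET /"), (1000, b"GET /"), (1010, b".html")]
print(reassemble(segments, initial_seq=1000))
```

You would keep one such list per direction, keyed by the address and port pairs of the connection.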

PcapPlusPlus offers this capability out of the box for all major operating systems (including Windows). Please check out the TcpReassembly example to see working code, and the API documentation to understand how to use the TCP reassembly feature.

Depending on whose traffic you're attempting to passively reassemble, you may run into some TCP obfuscation techniques designed to confuse people trying to do exactly what you're trying to do. Check out this paper on different operating system reassembly behaviors.

libtins provides classes to perform TCP stream reassembly in a very high level way, so you don't have to worry about TCP internals to do so.

Related

UDP numbered segments?

My firewall textbook says: "UDP breaks a message into numbered segments so that it can be transmitted."
My understanding was UDP had no sequence or other numbering scheme? That data was broken into packets and sent out with no ordered reconstruction on the other end, at least on this level. Am I missing something?

The book is just wrong here. The relevant section says:
User Datagram Protocol (UDP)—This protocol is similar to TCP in that it handles the addressing of a message. UDP breaks a message into numbered segments so that it can be transmitted. It then reassembles the message when it reaches the destination computer.
UDP does not include any mechanism to segment or reassemble messages; each message is sent as a single UDP datagram. If you look at the UDP "packet" (technically datagram) structure on page 108, there's no segment number or anything like that.
Mind you, segmentation can happen at other layers, either above or below UDP:
IP packets can be fragmented if they're too big for a network link's MTU (maximum transmission unit). This can happen to IP packets that contain UDP, TCP, or whatever. This is actually relevant for firewalls because creative fragmentation can sometimes be used to bypass packet filtering rules.
Some protocols that run on top of UDP also use something like numbered segments. For example, TFTP (trivial file transfer protocol) breaks files into "blocks", and transmits a block number in the header for each block. (And the receiver responds acknowledging the block number it's received -- it's like a drastically simplified version of TCP.) But this is part of the TFTP protocol, not part of UDP.
QUIC is another example of a protocol that runs over UDP and supports segmentation (and multiple streams, and...), and each packet contains a packet number. But again it's part of the QUIC protocol, not UDP.
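
To make the TFTP case concrete, here is a rough sketch of how the numbering lives in the TFTP header rather than in UDP. The opcode (3 for DATA) and the 512-byte block size come from RFC 1350; the helper name `tftp_data_packets` is made up for illustration, and each returned packet would be sent as the payload of one UDP datagram.

```python
import struct

BLOCK_SIZE = 512   # TFTP data block size (RFC 1350)
OP_DATA = 3        # TFTP opcode for a DATA packet

def tftp_data_packets(content: bytes):
    """Split content into TFTP DATA packets, each meant for one UDP datagram."""
    packets = []
    # The "+ 1" covers the final short (possibly empty) block that ends the transfer.
    block_count = len(content) // BLOCK_SIZE + 1
    for block in range(1, block_count + 1):
        chunk = content[(block - 1) * BLOCK_SIZE : block * BLOCK_SIZE]
        # 2-byte opcode and 2-byte block number, both big-endian, then the data.
        packets.append(struct.pack("!HH", OP_DATA, block) + chunk)
    return packets

packets = tftp_data_packets(b"x" * 1200)
print([(struct.unpack("!HH", p[:4]), len(p) - 4) for p in packets])
```

UDP itself never sees the block number; to UDP each of these is just an opaque datagram.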

How can I tell if the current TCP segment is part of a large PDU

I'm writing a small application using libpcap, where I parse/analyze a TCP-based application protocol. I ran into a situation where the application attempts to send a really large amount of data, say 64K, and the TCP layer cuts it into a number of smaller segments.
Now, my question is: how do I tell that the TCP payload of a packet read from pcap is actually a chunk of a larger payload, so that a number of segments need to be reassembled to get at the original large payload?
The TCP header has a sequence field, but I don't fully understand how it can answer my question.
Also, the IP header has a total_length field, but it has nothing to do with TCP segmentation; it indicates the IP payload size of the current packet.
I'd appreciate some hints. Thanks.

TCP can't help you here because it neither knows nor cares what PDUs are. You need to implement whatever protocol defines what a "large PDU" is. For example, if this is HTTP over TCP, implementing the HTTP protocol will tell you if the segment is part of a large PDU.
Because my question was - how do I tell that I have a small segment that has to be reassembled in a large packet.
That's what a message protocol is for. If, for example, the message protocol says that a PDU is a "series of characters not containing a newline character terminated by a newline character", then if you don't have a newline character, you know it's part of a larger PDU.
The concept of PDUs applies to message protocols, so if you're talking about PDUs, you must have a message protocol. The message protocol will tell you when you have an entire PDU. That's its purpose.
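
For the newline-terminated example above, the check could be sketched like this. It assumes the TCP payloads have already been put in stream order; `extract_pdus` is a hypothetical name, and the three payloads are invented to show PDUs that straddle segment boundaries.

```python
def extract_pdus(buffer: bytearray):
    """Remove and return every complete newline-terminated PDU in buffer."""
    pdus = []
    while True:
        end = buffer.find(b"\n")
        if end == -1:
            return pdus  # what is left is part of a PDU that is still arriving
        pdus.append(bytes(buffer[:end]))
        del buffer[:end + 1]

buffer = bytearray()
for payload in (b"HEL", b"LO\nWOR", b"LD\nBY"):
    buffer += payload          # append the next in-order TCP payload
    print(extract_pdus(buffer))
```

Nothing in the TCP header says that the first payload is incomplete; only the message protocol's own rule (here, "ends with a newline") does.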

how to know whether the TCP ends or not?

TCP is a stream used to communicate, and it has varying length. So in the application, how can I know whether the TCP stream has ended or not?
In the transfom layer, The TCP packet header doesn't have a length field and its length is varying, how can the TCP layer know where is the end.

You design a protocol that runs on top of TCP that includes a length field (or a message terminator). You can look at existing protocols layered on top of TCP (such as DNS, HTTP, IRC, SMTP, SMB, and so on) to see how they do this.
To avoid pain, it helps to have a thorough understanding of several different protocols layered on top of TCP before attempting to design your own. There's a lot of subtle details you can easily get wrong.

If you look at this post, it gives a good answer.
Depending on what you are communicating with, or how you are communicating there will need to be some sort of character sequence that you look for to know that an individual message or transmission is done if you plan to leave the socket open.

"In the transfom layer"
I assume you mean 'transport layer'?
"The TCP packet header doesn't have a length field and its length is varying"
The IP header has a total length field. Another one in the TCP header would be redundant.
"how can the TCP layer know where is the end."
From the total length field in the IP header, less the IP and TCP header sizes.

When a party of a TCP connection receives a FIN or RST signal it knows that the other side has stopped sending. At an API level you can call shutdown to give that signal. The other side will then get a zero length read and knows that nothing more will be coming.
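
A small sketch of that behaviour, using Python's socket module. Note the assumption: `socket.socketpair()` is used here only as a stand-in for a real TCP connection so the example runs in one process; the shutdown and zero-length read work the same way on a connected TCP socket.

```python
import socket

# socketpair() gives two connected stream sockets in one process.
sender, receiver = socket.socketpair()
sender.sendall(b"last bytes")
sender.shutdown(socket.SHUT_WR)  # signal that nothing more will be sent

received = bytearray()
while True:
    chunk = receiver.recv(4096)
    if not chunk:  # zero-length read: the peer has stopped sending
        break
    received += chunk

sender.close()
receiver.close()
print(bytes(received))
```

This only tells you that the whole stream has ended, not where individual messages inside it end.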

tcp: recomposing data at the end

How does TCP know which is the last packet of a large file (that was segmented by TCP) when the connection is kept established (like FTP, or sending an mp3 on Yahoo Messenger)?
I mean, how does it know which packet carries data of one.mp3 and which packet carries data of another.mp3?
Anyone ?
Thank you

There are at least 2 possible approaches.
Declare upfront how much data you're going to send. Something like a packet that declares "Sending a message that's 4008 bytes long".
The second approach is to use a terminating sequence (nastier to process)
So the receiver:
Tries to read the declared amount or
Scans for the terminating sequence
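
The first approach (declaring the length upfront) could be sketched as below. The 4-byte big-endian length prefix and the names `frame` and `unframe` are assumptions made for this example, not part of any particular protocol.

```python
import struct

def frame(message: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(message)) + message

def unframe(buffer: bytearray):
    """Remove and return every complete length-prefixed message in buffer."""
    messages = []
    while len(buffer) >= 4:
        (length,) = struct.unpack("!I", buffer[:4])
        if len(buffer) < 4 + length:
            break  # the rest of this message has not arrived yet
        messages.append(bytes(buffer[4:4 + length]))
        del buffer[:4 + length]
    return messages

stream = frame(b"one.mp3 data") + frame(b"another.mp3 data")
buffer = bytearray(stream[:10])      # first read ends mid-message
print(unframe(buffer))
buffer += stream[10:]                # the remaining bytes arrive
print(unframe(buffer))
```

The receiver never cares how TCP split the bytes; it just keeps reading until it has the declared amount.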

TCP is a stream protocol, and fragmentation should be transparent to a TCP application. It operates on streams of data, never packets. A stream is assembled in its intended order using the sequence numbers. The sequence of bytes sent by the application is encapsulated in TCP segments. The stream is recreated on the receiver side before the data is delivered to the application.
The IP protocol can do fragmentation.
Each TCP segment goes to the IP layer and may be fragmented there. The segment is reassembled by collecting all of its fragments; the fragment offset field from the IP header is used to put each one in the right place.
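
The IP-level reassembly can be sketched as follows. This is a simplified illustration: `reassemble_fragments` is a hypothetical helper, each fragment is assumed to be already parsed into (fragment offset, more-fragments flag, data), all fragments are assumed to belong to the same datagram, and duplicate or overlapping fragments are not handled.

```python
def reassemble_fragments(fragments):
    """Rebuild an IP payload from (fragment_offset, more_fragments, data) tuples.

    fragment_offset is in 8-byte units, as in the IPv4 header.
    Returns None while a fragment is still missing.
    """
    payload = bytearray()
    last_seen = False
    for offset, more_fragments, data in sorted(fragments, key=lambda f: f[0]):
        if offset * 8 != len(payload):
            return None  # hole before this fragment (overlaps are not handled)
        payload += data
        last_seen = not more_fragments
    # Complete only if the final fragment (more_fragments clear) has arrived.
    return bytes(payload) if last_seen else None

fragments = [(2, False, b"tail"), (0, True, b"A" * 8), (1, True, b"B" * 8)]
print(reassemble_fragments(fragments))
```

The rebuilt payload is the whole TCP segment (header plus data), which is then handled exactly like an unfragmented one.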

sending multiple tcp packets in an ip packet

is it possible to send multiple tcp or udp packets in a single ip packet? are there any specifications in the protocol that do not allow this?
if it is allowed by the protocol but is generally not done by tcp/udp implementations, could you point me to the relevant portion in the linux source code that proves this?
are there any implementations of tcp/udp on some os that do send multiple packets in a single ip packet (if it is allowed)?

It is not possible.
The TCP segment header does not describe the segment's length. The length of the TCP payload is derived from the length of the IP packet(s) minus the length of the IP and TCP headers. So only one TCP segment per IP packet.
Conversely, however, a single TCP segment can be fragmented over several IP packets by IP fragmentation.

Tcp doesn't send packets: it is a continuous stream. You send messages.
Udp, being packet based, will only send one packet at a time.
The protocol itself does not allow it. It won't break, it just won't happen.
The suggestion to use tunneling is valid, but so is the warning.

You might want to try tunneling tcp over tcp, although it's generally considered a bad idea. Depending on your needs, your mileage may vary.

You may want to take a look at the Stream Control Transmission Protocol (SCTP), which allows multiple data streams across a single connection (an SCTP association). Note that it is a separate transport protocol, not something layered on TCP.
EDIT - I wasn't aware that the TCP header doesn't have its own length field, so there would be no way of doing this without writing a custom TCP equivalent that contains this info. SCTP may still be of use though, so I'll leave that link.

TCP is a public specification, why not just read it?
RFC4614 is the roadmap document, RFC793 is TCP itself, and RFC1122 contains some errata and shows how it fits together with the rest of the (IPv4) universe.
But in short, because the TCP header (RFC793 section 3.1) does not have a length field, TCP data extends from the end of the header padding to the end of the IP packet. There is nowhere to put another data segment in the packet.

You cannot pack several TCP segments into one IP packet; that is a restriction of the specification, as mentioned above. TCP is the closest API that is application-oriented. Or do you want to send raw IP messages yourself? Just tell us what problem you want to solve. Think about how you organize the delivery of messages from one application to another, or mention that you want to hook into the TCP/IP stack. What I can suggest:
Consider packing whatever you like into a UDP packet. I am not sure how easy it is to route the "unpacked" TCP packets on the remote side.
Consider using PPTP or a similar tunnelling protocol.
