Differentiating http and http2 packets - http

I'm working with packets one by one and need to be able to edit both HTTP and HTTP/2 contents.
The question is: is there a way to distinguish the two on a single-packet basis?
Edit: For some additional info, the point is to read and edit large pcap files, so I'm trying to work with as little memory as possible.

On a per-packet basis, no. A single TCP packet could represent any arbitrary part of the stream. You need to capture (at least) the first part of the stream to work out whether it's HTTP or HTTP/2 (or anything else).
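As a rough illustration of what "the first part of the stream" can tell you: a cleartext HTTP/2 connection opened with prior knowledge begins with a fixed 24-byte client preface, while an HTTP/1.x request begins with a request line. The sketch below is in Python; the function name `classify_stream_start` and its return labels are made up for this example. It only looks at the client's first bytes of unencrypted traffic. With TLS the protocol is negotiated inside the handshake, and an HTTP/1.1 connection that later upgrades to h2c would be reported as HTTP/1.x here.

```python
# Fixed client connection preface that opens a cleartext HTTP/2 connection.
H2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

# Common HTTP/1.x request methods (not an exhaustive list).
HTTP1_METHODS = (b"GET", b"HEAD", b"POST", b"PUT", b"DELETE",
                 b"CONNECT", b"OPTIONS", b"TRACE", b"PATCH")


def classify_stream_start(data: bytes) -> str:
    """Guess the protocol from the first client-to-server bytes of a TCP stream."""
    if data.startswith(H2_PREFACE):
        return "http2"
    # The preface may itself be split across segments.
    if len(data) < len(H2_PREFACE) and H2_PREFACE.startswith(data):
        return "need-more-data"
    first_line = data.split(b"\r\n", 1)[0]
    parts = first_line.split(b" ")
    if (len(parts) == 3 and parts[0] in HTTP1_METHODS
            and parts[2].startswith(b"HTTP/1.")):
        return "http1"
    return "unknown"
```

To keep memory low on a large pcap, only the first few dozen payload bytes per connection need to be retained until the function returns something other than "need-more-data".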

In a browser, you can use Chrome DevTools > Network and enable the Protocol column to see which protocol was used for each transfer.

Related

Figuring out what kind of payload is carried by a packet

I'm working with Scapy to parse a set of .pcap files. I would like to understand what kind of payload those packets are carrying. For example, I have a pcap file with a lot of UDP packets whose payloads all start with the same bytes; I don't know what kind of encoding was used, and the first values keep repeating in other packets. Is there any program or Python library that could figure out, or try to guess, what kind of encoding was used (for example, whether it is an RTP payload, an MPEG one, and so on)?
UPDATE
I was able to use nDPI on those pcap files and it gave me satisfying results for all the flows except for a set of them that it was not able to recognize. I'm going to share with you the first part of the hex representation of the data:
f1d00404d1002d7c484830320000020080073804610d00007b09040000000000010f000000000000000000000000000000000000000000000000000121e002a22e537fcccb815afafce2361b
The first part, f1d004, does not change between previous and successive packets. I have already tried to decode them as different protocols using Wireshark's "Decode As" feature: RTP, RTCP, RTSP, JSON, MPEG. In case it is useful, this capture comes from a camera, which is why I tried those protocols.

How to make netcat send HTTP headers over multiple TCP segments?

I am trying to simulate "HTTP headers spanning multiple TCP segments" as mentioned here - http://wiki.wireshark.org/HTTP_Preferences.
How can this be done using netcat? Are there any examples you might be able to point me to to get me started?
Netcat isn't really the right tool for this job, but an easy way to make the headers span segments is just to make them long enough. Eventually, they won't fit in a single segment.
The packet size may be 1500 octets (normal Ethernet) or more than 9000 octets (Ethernet with jumbo frames). You'll want an actual network; packet processing on localhost is often optimized.
(For the proper tools for this, you probably want to ask on Server Fault or Security.SE, as they're normally used for firewall testing.)
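To make "long enough" concrete, here is a small Python sketch that builds a request whose headers cannot fit in one segment. It assumes IPv4 and TCP headers of 20 bytes each with no options, which gives a maximum segment size of 1460 bytes on a 1500-octet MTU. The function name `build_long_request` and the `X-Padding` header are invented for this example.

```python
def build_long_request(host: str, mtu: int = 1500,
                       ip_header: int = 20, tcp_header: int = 20) -> bytes:
    """Build a GET request whose header block is larger than one TCP segment."""
    # Largest TCP payload that fits in one packet under the stated assumptions.
    mss = mtu - ip_header - tcp_header
    # A single padding header longer than the MSS forces the headers to span segments.
    padding = "a" * (mss + 1)
    request = (
        "GET / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"X-Padding: {padding}\r\n"
        "\r\n"
    )
    return request.encode("ascii")
```

The resulting bytes can be written to a file and fed to whatever client you use to send them. Some servers reject very long header lines, so the padding may need to be split over several headers.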

TCP/IP programming, data in more than one packet

I am writing an application in C, using libpcap. My program listens for new packets and parses them
according to a grammar. The payload actually is XML.
Sometimes one packet is not enough for an XML file, so the XML buffer is split across separate packets.
I want to add logic to handle these cases. However, I don't know in advance that a packet does not contain the whole data. How do I know that a packet has more data that will be sent next? How do I recognize that a new packet contains the rest of the data?
Do I have to use the TH_FIN flag? Could you please explain it to me?
There's nothing in TCP that defines messages; that's up to the higher layers to define if they need to. TCP is just a stream.
If this is raw XML over a TCP stream, you actually need to parse the XML: you'll know you have a whole XML document when you've received the end of the document element.
If it's XML packaged over HTTP, you might be able to parse out the Content-Length: header, which should contain the length of the body.
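A minimal Python sketch of the Content-Length approach follows. The function name `http_message_length` is made up for this example, and it deliberately ignores chunked transfer encoding and malformed headers, so treat it as an illustration rather than a complete HTTP parser.

```python
def http_message_length(buf: bytes):
    """Return the total length (headers + body) of the first HTTP message
    in buf, or None if the header block has not fully arrived yet."""
    head_end = buf.find(b"\r\n\r\n")
    if head_end < 0:
        return None  # blank line that ends the headers not seen yet
    # Skip the request/status line, then look for Content-Length.
    header_lines = buf[:head_end].split(b"\r\n")[1:]
    body_len = 0  # assumption: no Content-Length means no body
    for line in header_lines:
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"content-length":
            body_len = int(value.strip())
            break
    return head_end + 4 + body_len
```

The message is complete once the reassembled buffer holds at least that many bytes.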
Note that reassembling a TCP stream from captured packets is a very hard problem with a lot of corner cases, e.g. you'd need to handle retransmissions, out-of-sequence TCP segments, and many more. http://libnids.sourceforge.net/ might help you.
As Anon says, use a higher-level stream library.
But even then you need to know the chunk size before you start handling it, as you will read from the stream in blocks of n bytes.
Thus you want to first send the number of bytes in binary, then send that many bytes, and repeat. That way, when you receive the chunks via select/read, you know when you have all of chunk one and can pass it to the processor.
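The length-prefix scheme described above can be sketched in Python as follows. The 4-byte big-endian length field and the names `frame` and `FrameReader` are choices made for this example; the answer does not specify them.

```python
import struct
from typing import List


def frame(payload: bytes) -> bytes:
    """Prefix payload with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(payload)) + payload


class FrameReader:
    """Collects arbitrary chunks read from a stream and yields whole frames."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        self._buf += chunk
        complete = []
        while len(self._buf) >= 4:
            (size,) = struct.unpack_from("!I", self._buf, 0)
            if len(self._buf) < 4 + size:
                break  # body of this frame has not fully arrived yet
            complete.append(bytes(self._buf[4:4 + size]))
            del self._buf[:4 + size]
        return complete
```

Each call to `feed` takes whatever a single read returned, so the chunk boundaries do not need to line up with the frame boundaries.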
If you're using TCP, use a TCP library that gives you the data as a stream instead of trying to handle the packets yourself.
Stream is good. Another option is to store the incoming data in a buffer (e.g. a char*) and search for the application's message-framing characters or, in the case of XML, the root end tag. Once you've found a complete XML message at the front of the buffer, pull it out and process it.
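A minimal Python sketch of that buffer-and-search idea is shown below (the question is about C, but the logic carries over). The class name `XmlMessageBuffer` is invented here, and the sketch assumes the root element name is known, is never nested inside itself, and that its end tag never appears inside comments or CDATA.

```python
class XmlMessageBuffer:
    """Accumulates stream data and extracts documents ending in </root_tag>."""

    def __init__(self, root_tag: str) -> None:
        self._end = f"</{root_tag}>".encode("ascii")
        self._buf = bytearray()

    def feed(self, chunk: bytes):
        self._buf += chunk
        messages = []
        while True:
            idx = self._buf.find(self._end)
            if idx < 0:
                break  # no complete document at the front of the buffer yet
            cut = idx + len(self._end)
            messages.append(bytes(self._buf[:cut]))
            del self._buf[:cut]
        return messages
```

Anything left in the buffer after a call is the start of the next document and is kept for the next chunk.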
The XMPP instant messaging protocol, used by Jabber, has a means of moving XML chunks over a TCP stream. I don't know exactly how it is done myself, but RFC 3920 (since superseded by RFC 6120) is the protocol definition. You should be able to work it out from that.

Working with persistent HTTP connections

We are trying to implement a proxy proof of concept but have encountered an interesting question: Since a single HTTP connection can, and indeed should, make multiple requests, and the HTTP transactions are sent via multiple packets due to TCP's magic, is it possible for a HTTP request to begin in the middle of a packet?
Bear in mind that this is not a theoretical question regarding possible optimization of the browser, but whether it actually happens in real life. It would be even better if someone could point me to a written reference on whether or not this is possible and if so how often it can occur.
Clarification update: We know that if we work in the HTTP layer alone we would not need to bother with this question, however we're trying to figure out if some advanced technique could be applied by working on the TCP layer first.
Assuming that you are talking about IP packets: yes, it is possible for an HTTP request to start in the middle of an IP packet.
When you are using persistent HTTP connections, that is, using the same TCP connection for several HTTP requests, it is entirely possible for a request boundary to fall in the middle of an IP packet.
There is also TCP between IP and HTTP. TCP adds its own header, so the payload of an IP packet starts with the TCP header, and the rest of the packet may consist of the HTTP request.
An HTTP request may also span several IP packets (in the case of file uploads, transmission errors with subsequent retransmissions, etc.).
However, I wonder why you are interested in packets if you are working at HTTP level. TCP should hide the IP packet details.
First of all, TCP is a stream based protocol and has no concept of packets. HTTP itself might have some kind of message or record delimiter, but TCP doesn't.
This page might be helpful: Structure of HTTP Transactions
From your question it sounds like you think that each read from a TCP socket is a "packet" of data. In reality, each read simply reads as many bytes as are in the buffer up to the maximum that you requested, without any concept of records or packets.
So for instance, let's say you read 2048 bytes from the socket: you could have the tail end of one response, followed by the beginning of a second response halfway through the data you read, and only get the remainder of the second response on your next read from the socket.
If you're here in Jerusalem or nearby, maybe I could help you out.
Unless you are implementing your own TCP stack, you should not need to worry about the packets, but rather about the API that TCP provides; with POSIX interfaces that would be recv() or read(). So I treat the question as "Can more than one HTTP request arrive in a single read(), and can an HTTP request be split across multiple read() calls?" -- The answer to both would be "yes, it is possible".
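Both cases can be seen in a small Python sketch that treats each read() result as an arbitrary slice of the byte stream. The class name `RequestSplitter` is made up for this example, and it assumes requests without a body (such as plain GET requests), so a request ends at the first blank line.

```python
class RequestSplitter:
    """Splits a TCP byte stream into bodiless HTTP/1.x requests."""

    def __init__(self) -> None:
        self._buf = b""

    def feed(self, data: bytes):
        """Add the bytes returned by one read() and return any complete requests."""
        self._buf += data
        requests = []
        while b"\r\n\r\n" in self._buf:
            head, self._buf = self._buf.split(b"\r\n\r\n", 1)
            requests.append(head + b"\r\n\r\n")
        return requests
```

Feeding it a first read that ends twelve bytes into a second request returns only the first request; the second one comes out once the next read supplies the rest.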
An example of where this can happen is HTTP pipelining. This is not frequent in real life (ironically, at least some browsers disable it by default because of "buggy proxies" :-), but when it happens it can be a bit of a problem for users to diagnose, especially if they have no access to the proxy.
One very notable place where it does happen by default is apt-get on Debian-derived Linux systems. Just install a Debian or Ubuntu server and try to use it through your proxy. You can do that by editing the /etc/apt/apt.conf.d/proxy file and placing the following there:
Acquire::http::Proxy "http://your.proxy.address:8080";
It depends on which abstraction layer's "packet" you are talking about: there are many layers underneath HTTP.
HTTP --> TCP (byte stream) --> IP (packet) --> (possibly something else) Ethernet (frame) --> (possibly) some other transport
If you are talking about the IP layer, then yes, the HTTP data would start later on in the packet. Note that TCP presents a "byte stream interface" to its client layer; hence there is no concept of a packet here.
I think I understand where you are trying to go with this question.
If you don't use persistent HTTP connections, the HTTP GET request header is always the very first thing sent over the TCP connection, so we can be sure that the start of the HTTP GET request header does "not start in the middle of some TCP packet". But keep in mind that there may be one or more TCP packets without any user data, e.g. only a SYN, which may precede the TCP packet with the start of the HTTP GET request header. Also keep in mind that the HTTP GET request header may not be contained in a single TCP packet.
If you do use persistent HTTP connections, the HTTP GET request header for request number N+1 can start in the middle of a TCP packet, namely right after the end of request number N.
If you are asking these questions you are possibly "doing it wrong". As several other responders have already pointed out, in the vast majority of cases you should probably just be a TCP client and deal with a TCP stream of data and let the TCP code worry about the TCP packets. (Unless, of course, you are working on some special hardware which is looking at individual IP packets as they fly by and try to do some processing at the HTTP layer.)

How to analyse a HTTP dump?

I have a file that apparently contains some sort of dump of a keep-alive HTTP conversation, i.e. multiple GET requests and responses including headers, containing an HTML page and some images. However, there is some binary junk in between - maybe it's a dump on the TCP or even IP level (I'm not sure how to determine what it is).
Basically, I need to extract the files that were transferred. Are there any free tools I could use for this?
Use Wireshark.
Look into the file format for its dumps and convert your dump to it. It's very simple. It's called the pcap file format. Then you can open it in Wireshark with no problem, and it should be able to recognize the contents. Wireshark supports dozens if not hundreds of communication formats at various OSI layers (including TCP/IP/HTTP) and is great for this kind of debugging.
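For a sense of how simple the classic pcap format is, here is a Python sketch that builds its 24-byte file header and a 16-byte per-packet record header. It assumes the dump can be cut into raw Ethernet frames (link type 1); if it holds something else, the link type would need to change. The function names are made up for this example.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4   # classic pcap, microsecond timestamps
LINKTYPE_ETHERNET = 1     # assumption: the dump contains Ethernet frames


def pcap_global_header(snaplen: int = 65535,
                       linktype: int = LINKTYPE_ETHERNET) -> bytes:
    """File header: magic, version 2.4, timezone offset, sigfigs, snaplen, link type."""
    return struct.pack("<IHHiIII", PCAP_MAGIC, 2, 4, 0, 0, snaplen, linktype)


def pcap_record(frame: bytes, ts_sec: int = 0, ts_usec: int = 0) -> bytes:
    """Per-packet header (timestamp, captured length, original length) plus data."""
    return struct.pack("<IIII", ts_sec, ts_usec, len(frame), len(frame)) + frame
```

A file made of one global header followed by one record per frame is something Wireshark can open.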
Wireshark will analyze on the packet level. If you want to analyze on the protocol level, I recommend Fiddler: http://www.fiddlertool.com/fiddler/
It will show you the headers sent, the responses, and will decrypt HTTPS sessions as well. And a ton more.
The Net tab in the Firebug plugin for Firefox might be of use.

Resources