Wireshark Traffic Analysis for File type - networking

So I am wondering how we can use Wireshark to see whether a user has downloaded a txt file over the internet.
I tried this while running Wireshark:
https://code.google.com/p/androidnetworktester/downloads/detail?name=1mb.txt
I followed the HTTP stream and can see the URL and a bunch of other data, but I can't find the contents of 1mb.txt anywhere in the packet bodies of the PCAP. Just curious: if we are doing forensic work, how can we prove that the person really downloaded this file using this Wireshark information? Is it because the site uses SSL that all the text in the PCAP looks like scattered random bytes?
Thanks a bunch

if we are doing forensics works, how can we prove the person really downloaded this using this wireshark information
You can't really prove it from the packet capture unless you are able to decrypt the content. In most cases this is not possible, but if you have access to the private key of the site (you usually don't, because it is private) and RSA key exchange was used, then you can decrypt the traffic after capture.
What you can get from the packet capture is the target host of the request, but not the exact URL or the content. But if the amount of data in the capture roughly matches the length of the content (there is some overhead in transport) and you know that this is the only file of this size on the server, then you have at least an indicator that the user might have downloaded this file. That is probably not enough to count as real proof.
For more evidence, you might then have a look at the history of the browser.
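To make the size heuristic above concrete, here is a minimal sketch in Python. The function name, the 10% overhead allowance and the assumption that 1mb.txt is exactly 1 MiB are all mine, not something taken from the capture; it also assumes the server did not compress the file, since compression would make the transfer smaller than the file.

```python
def size_matches(captured_bytes: int, file_size: int, max_overhead: float = 0.10) -> bool:
    """Return True if the captured volume is plausible for a file of file_size bytes.

    Assumption: encryption and headers only add bytes, so the capture must be
    at least as large as the file and at most max_overhead (a guessed value)
    larger. This is an indicator only, never proof.
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    return file_size <= captured_bytes <= file_size * (1 + max_overhead)


# Hypothetical numbers: a 1 MiB file and three observed transfer volumes.
FILE_SIZE = 1024 * 1024
for observed in (500_000, 1_100_000, 2_000_000):
    print(observed, size_matches(observed, FILE_SIZE))
```

Even a match only tells you that something of about that size was transferred from that host.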

Related

Figuring out what kind of payload is carried by a packet

I'm working with Scapy to parse a set of .pcap files, and I would like to understand what kind of payload the packets are carrying. For example, I have a pcap file with a lot of UDP packets whose payloads all start with the same bytes; I don't know what kind of encoding was used, and the first values keep repeating in other packets. Is there any program or Python library that could figure out, or at least try to guess, what kind of encoding was used (for example, whether it is an RTP payload, an MPEG one, and so on)?
UPDATE
I was able to use nDPI on those pcap files and it gave me satisfying results for all the flows except for a set of them that it was not able to recognize. I'm going to share with you the first part of the hex representation of the data:
f1d00404d1002d7c484830320000020080073804610d00007b09040000000000010f000000000000000000000000000000000000000000000000000121e002a22e537fcccb815afafce2361b
The first part, f1d004, does not change between previous and successive packets. I have already tried to decode the packets as different protocols using Wireshark's "Decode As" feature: RTP, RTCP, RTSP, JSON and MPEG. In case it is useful, this capture comes from a camera, which is why I tried those protocols.
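One way to check how long the constant part really is across a whole flow is to compute the longest common prefix of the payloads. This is only a sketch: the helper name is made up, and apart from the first payload (the start of the hex string above) the sample payloads are hypothetical. With Scapy you would collect the real payload bytes from the UDP packets instead.

```python
def common_prefix(payloads):
    """Return the longest byte prefix shared by every payload in the list."""
    if not payloads:
        return b""
    shortest = min(payloads, key=len)
    for i, byte in enumerate(shortest):
        # Stop at the first position where any payload differs.
        if any(p[i] != byte for p in payloads):
            return shortest[:i]
    return shortest


samples = [
    bytes.fromhex("f1d00404d1002d7c"),  # start of the payload quoted above
    bytes.fromhex("f1d004aa0102"),      # hypothetical
    bytes.fromhex("f1d004bb"),          # hypothetical
]
print(common_prefix(samples).hex())
```

A stable prefix like this is a reasonable candidate for a protocol magic value or header, which you can then search for in vendor documentation.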

Need to create a packet with a specific number in either the protocol header or payload

Unfortunately I'm not too familiar with Wireshark and in our recent homework we are supposed to create a pcap file which includes a specific number. In order to create that pcap file we are supposed to use the search function of Wireshark to find by string in packet bytes and export the result with the specified number in either the protocol header or the payload. How am I supposed to go about this?
Well, this was way easier than I thought. All I needed to do was create a connection to an FTP server, listen to that connection in Wireshark, and then transfer a text file with the number in it (or named after it) over plain FTP in ASCII mode.
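If you want to double-check the exported file outside Wireshark, a byte search over the capture is easy to sketch in Python. This assumes the classic pcap format with microsecond timestamps (not pcapng, and not the nanosecond variant); the function name is my own. Like a per-packet search, it will not find a string that is split across two packets.

```python
import struct


def find_in_pcap(data: bytes, needle: bytes):
    """Return the 1-based numbers of packets in a classic pcap that contain needle."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written big-endian
    else:
        raise ValueError("not a classic (microsecond) pcap file")

    offset = 24  # skip the 24-byte global header
    hits = []
    number = 0
    while offset + 16 <= len(data):
        # Per-packet header: ts_sec, ts_usec, captured length, original length.
        _sec, _usec, incl_len, _orig_len = struct.unpack_from(endian + "IIII", data, offset)
        offset += 16
        packet = data[offset:offset + incl_len]
        offset += incl_len
        number += 1
        if needle in packet:
            hits.append(number)
    return hits
```

Usage would be something like `find_in_pcap(open("homework.pcap", "rb").read(), b"424242")`, where both the file name and the number are placeholders.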

What does HTTP download exactly mean?

I often hear people say download with HTTP. What does it really mean technically?
HTTP stands for Hyper Text Transfer Protocol. So to understand it literally, it is meant for text transferring. And I used some sniffer tool to monitor the wire traffic. What gets transferred are all ASCII characters. So I guess we have to convert whatever we want to download into characters before transferring it via HTTP. Using HTTP URL encoding? Or some binary-to-text encoding scheme such as base64? But that requires some decoding on the client side.
I always think it is TCP that can transfer whatever data, so I am guessing "HTTP download" is a misused term. It arises because we view a web page via HTTP, find a downloadable link on that page, and then click it to download. In fact, the browser opens a TCP connection to download it. Nothing about HTTP.
Anyone could shed some light?
The complete answer to "What does HTTP download exactly mean?" is in the RFC 2616 specification, which you can read here: https://www.rfc-editor.org/rfc/rfc2616
Of course that's a long (but very detailed) document.
I won't replicate or summarize its content here.
In the body of your question you are more specific:
So to understand it literally, it is meant for text transferring.
I think the word "TEXT" is misleading you.
And
have to convert whatever we want to download into characters before transferring it via HTTP
is false. You don't necessarily have to.
A file, for example a JPEG image, may be sent over the wire without any kind of encoding. See for example this: When a web server returns a JPEG image (mime type image/jpeg), how is that encoded?
Note that a compression or encoding may optionally be applied (the most common case is GZIP for textual content like HTML, text, scripts...), but that depends on how the client and the server agree on how the data is to be transferred. That "agreement" is made with the "Accept-Encoding" header in the request and the "Content-Encoding" header in the response.
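As a rough illustration of that agreement, here is a Python sketch of what a server and client might do. The function names are invented for this example, and the Accept-Encoding parsing is deliberately simplified (it ignores q-values such as "gzip;q=0").

```python
import gzip


def encode_body(body: bytes, accept_encoding: str):
    """Return (headers, payload), gzip-compressing only if the client offered gzip."""
    # Simplified parsing: take each comma-separated token and drop any ";q=..." part.
    offered = [token.split(";")[0].strip().lower() for token in accept_encoding.split(",")]
    if "gzip" in offered:
        payload = gzip.compress(body)
        return {"Content-Encoding": "gzip", "Content-Length": str(len(payload))}, payload
    # No agreement on gzip: send the bytes exactly as they are.
    return {"Content-Length": str(len(body))}, body


def decode_body(headers: dict, payload: bytes) -> bytes:
    """Client side: undo the encoding announced in Content-Encoding, if any."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(payload)
    return payload


page = b"<html>" + b"a" * 1000 + b"</html>"
headers, payload = encode_body(page, "gzip, deflate")
print(headers, len(payload) < len(page), decode_body(headers, payload) == page)
```

Either way the payload on the wire is plain bytes; the headers only tell the client how to interpret them.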
I understand the name is misleading you, but if you read Hyper Text Transfer Protocol as a Transfer Protocol with Hypertext capabilities, then it changes a bit.
When HTTP was developed there were already lots of protocols (for example, the IP protocol, which is how data is widely transmitted between servers on the internet), but there were no protocols that allowed for easy navigation between documents.
HTTP is a protocol that allows for transferring of information AND for hyper text (i.e. links) embedded within text documents. These links don't necessarily have to point to other text documents, so you can basically transmit any information using HTTP (the sender and the receiver agree on the type of document being sent using something called the mime type).
So the name still makes sense, even if you can send things other than text files.
HTTP stands for Hyper Text Transfer Protocol. So to understand it literally, it is meant for text transferring.
Yes, text transferring. Not necessarily plain text, but all text. It doesn't mean that your text has to be readable by a person, just the computer.
And I used some sniffer tool to monitor the wire traffic. What gets transferred are all ASCII characters.
Your sniffer tool knows that you're a person, so it won't just present you with 0s and 1s. It converts whatever it gets to ASCII characters to make it readable to you. All communication over the wire is binary. The ASCII representation is just there for your sake.
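A small Python sketch of what a sniffer's display does with raw bytes: hex on the left, and on the right an ASCII rendering where anything non-printable becomes a dot. The function name is made up, and real tools format this differently in detail.

```python
def hexdump_line(chunk: bytes) -> str:
    """Render bytes the way a sniffer pane does: hex values, then printable ASCII."""
    hex_part = " ".join(f"{b:02x}" for b in chunk)
    # Bytes outside the printable ASCII range are shown as '.'.
    ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return f"{hex_part}  {ascii_part}"


print(hexdump_line(b"GET /\r\n"))          # 47 45 54 20 2f 0d 0a  GET /..
print(hexdump_line(b"\xff\xd8\xff\xe0"))   # ff d8 ff e0  ....
```

The second line is the start of a JPEG file: the bytes are transferred as they are, and only the display turns them into dots.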
So I guess we have to convert whatever we want to download into characters before transferring it via HTTP
No, not at all. Again, it's text – not necessarily plain text.
I always think it is TCP that can transfer whatever data, [...]
Here you're right. TCP does transfer all data, but in a completely different layer. To understand this, let's look at the OSI model:
When you send anything over the network, your data goes through all the different layers. First, the application layer. Here we have HTTP and several others. Everything you send over HTTP goes through the layers, down through presentation and all the way to the physical layer.
So when you say that TCP transfers the data, then you're right (HTTP could work over other transport protocols such as UDP, but that is rarely seen), but TCP transfers all your data whether you download a file from a webserver, copy a shared folder on your local network between computers or send an email.
HTTP can transfer "binary" data just fine. There is no need to convert anything.
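To illustrate, here is a Python sketch that builds an HTTP response by hand and takes it apart again. Only the header section is text; the body is appended as raw bytes with no base64 or URL encoding. The helper names are hypothetical, and a real server and client would handle many more details (chunked transfer, multiple headers, and so on).

```python
def build_response(body: bytes, content_type: str) -> bytes:
    """Build a minimal HTTP/1.1 response: ASCII headers followed by the raw body."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def split_response(raw: bytes):
    """Split a response at the first blank line into (header text, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head.decode("ascii"), body


# The first bytes of a typical JPEG file, clearly not text.
jpeg_start = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]) + b"JFIF\x00"
raw = build_response(jpeg_start, "image/jpeg")
head, body = split_response(raw)
print(head)
print(body == jpeg_start)
```

The Content-Length header is what tells the receiver how many body bytes to read, so the body never needs to be escaped.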
HTTP is the protocol used to transfer your data. In your case any file you are downloading.
You can either do that (opening another type of connection) or you can send your data as raw text. What you'll send is just what you would see when opening the file in a text editor. Your browser just decides to save the file in your Downloads folder (or wherever you want it) because it sees that the file type is not supported (.rar, .zip).
If you look at the OSI model, HTTP is a protocol that lives in the application layer. So when you hear that someone uses "HTTP to transfer data", they are referring to the application layer protocol. An alternative would be FTP or NFS, for example.
The browser does indeed open a TCP connection when HTTP is used. TCP lives in the transport layer and provides a reliable connection on top of IP.
The HTTP protocol provides different verbs that can be used to retrieve and send data; GET and POST are the most common ones. Look up REST.

How would I trace a packet I sent out? (Get IP of wherever it is going)

I was wondering, is it possible to trace a packet that you send out of your own computer? The idea here would be to build something to protect your data. The packet sent out containing your password and other vital information is open to rerouting by a hacker. I want to know if it is possible (and if so, how I might go about approaching this) to trace the intermediate and/or final destinations of a certain packet, and then have them sent back to my computer for verification.
I would appreciate any help you guys could give on this matter.
There is no mechanism to do what you want. The packet itself might reach its "destination" just fine, only to be further re-directed elsewhere. Consider it like mailing an envelope -- whoever receives it is free to photocopy the contents a thousand times and send the copies to newspapers, tabloids, and telephone poles all over the world.
Same story with data -- once it leaves your computer, you have to trust the remote endpoint to not do anything harmful with it.
TLS and openPGP can go a long way to preventing third-parties from reading or modifying your data while it is in transit, but they cannot make sure the remote peer only handles your data with care.
No, not with ISO OSI, and there are other approaches to protecting your data.

How to analyse a HTTP dump?

I have a file that apparently contains some sort of dump of a keep-alive HTTP conversation, i.e. multiple GET requests and responses including headers, containing an HTML page and some images. However, there is some binary junk in between - maybe it's a dump on the TCP or even IP level (I'm not sure how to determine what it is).
Basically, I need to extract the files that were transferred. Are there any free tools I could use for this?
Use Wireshark.
Look into the file format for its dumps and convert your dump to it. It's very simple. It's called the pcap file format. Then you can open it in Wireshark with no problem, and it should be able to recognize the contents. Wireshark supports dozens, if not hundreds, of protocols at various OSI layers (including IP, TCP and HTTP) and is great for this kind of debugging.
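If you manage to split your dump into individual packets, wrapping them in the classic pcap format takes only a few lines. This Python sketch makes several assumptions: you already have each packet as a bytes object, all timestamps are written as zero, and the function name is my own. The link type must match what the bytes actually start with: 1 for Ethernet frames, 101 for raw IP packets.

```python
import struct

LINKTYPE_ETHERNET = 1   # packets start with an Ethernet header
LINKTYPE_RAW = 101      # packets start directly with an IP header


def to_pcap(packets, linktype=LINKTYPE_ETHERNET, snaplen=65535) -> bytes:
    """Wrap raw packets in a classic little-endian pcap file (timestamps set to 0)."""
    # Global header: magic, version 2.4, timezone offset, sigfigs, snaplen, link type.
    out = [struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, snaplen, linktype)]
    for packet in packets:
        # Record header: ts_sec, ts_usec, captured length, original length.
        out.append(struct.pack("<IIII", 0, 0, len(packet), len(packet)))
        out.append(packet)
    return b"".join(out)
```

Writing the result to a file, for example with `open("dump.pcap", "wb").write(to_pcap(packets, LINKTYPE_RAW))`, should give you something Wireshark can open, provided the link type guess is right.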
Wireshark will analyze on the packet level. If you want to analyze on the protocol level, I recommend Fiddler: http://www.fiddlertool.com/fiddler/
It will show you the headers sent, the responses, and will decrypt HTTPS sessions as well. And a ton more.
The Net tab in the Firebug plugin for Firefox might be of use.

Resources