Using tcpdump to watch which websites are accessed on my network - networking

I've just got my hands on a Raspberry Pi and I've set it up to act as the DNS and DHCP server on my home network. This means that all network requests go through it before they are released into the wild... Which offers me a great opportunity to use tcpdump and see what is happening on my network!
I am playing around with the tcpdump arguments to create the perfect network spy. The idea is to capture HTTP GET requests.
This is what I have so far and it's pretty good:
tcpdump -i eth0 'tcp[((tcp[12:1] & 0xf0)>> 2):4] = 0x47455420' -A
The -i eth0 tells it which interface to listen to
The bit in quotes is a nifty bit of hex matching to detect a GET request
The -A means "print the ASCII contents of this packet"
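To make the hex matching less magical, here is a minimal Python sketch of what the filter expression appears to compute for each TCP segment. The names starts_with_get and fake_segment are hypothetical helpers invented for illustration; they are not part of tcpdump.

```python
import struct

GET_MAGIC = 0x47455420  # ASCII "GET " read as a big-endian 32-bit integer


def starts_with_get(tcp_segment: bytes) -> bool:
    """Mirror tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x47455420 for one segment."""
    if len(tcp_segment) < 20:
        return False  # shorter than the minimum TCP header
    # The high nibble of byte 12 is the data offset in 32-bit words.
    # (byte & 0xf0) >> 4 gives words and words * 4 gives bytes, which
    # collapses into a single right shift by 2.
    header_len = (tcp_segment[12] & 0xF0) >> 2
    payload = tcp_segment[header_len:header_len + 4]
    if len(payload) < 4:
        return False  # no payload, e.g. a bare ACK
    return struct.unpack("!I", payload)[0] == GET_MAGIC


def fake_segment(payload: bytes, options: bytes = b"") -> bytes:
    """Build a made-up TCP segment (all other header fields zero) for the demo."""
    header_len = 20 + len(options)
    assert header_len % 4 == 0, "TCP header length must be a multiple of 4"
    header = bytearray(20)
    header[12] = (header_len // 4) << 4
    return bytes(header) + options + payload


print(starts_with_get(fake_segment(b"GET /index.html HTTP/1.1\r\n")))
print(starts_with_get(fake_segment(b"POST /form HTTP/1.1\r\n", options=b"\x01" * 12)))
```

The point of the shift is that the payload does not start at a fixed position: TCP options make the header longer, so the filter has to read the header length first and only then compare the four bytes that follow it.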
This fires every time anything on my network sends a GET request, which is great. My question, finally, is how can I filter out boring requests like images, JavaScript, favicons etc?
Is this even possible with tcpdump or do I need to move onto something more comprehensive like tshark?
Thanks for any help!
DISCLAIMER: Currently the only person on my network is me... This is not malicious, it's a technical challenge!

Grep is your friend :-) tcpdump ... | grep -vE "^GET +(/.*\.js|/favicon\.ico|.*\.png|.*\.jpg|.*\.gif|...) +HTTP" will hide things like GET /blah/blah/blah.js HTTP/1.0, GET /favicon.ico HTTP/1.0, GET /blah/blah/blah.png HTTP/1.0, etc.
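If the grep pattern gets unwieldy, the same idea can be sketched in a few lines of Python. The extension list (including css, which is not in the pattern above) and the names is_interesting and filter_lines are assumptions made for this example.

```python
import re

# Extensions treated as "boring"; this list is an assumption, extend as needed.
BORING_EXTENSIONS = ("js", "css", "png", "jpg", "gif", "ico")

# A GET request line whose path ends in a boring extension, optionally
# followed by a query string, then the HTTP version.
BORING_REQUEST = re.compile(
    r"GET\s+\S+\.(?:" + "|".join(BORING_EXTENSIONS) + r")(?:\?\S*)?\s+HTTP/\d\.\d"
)


def is_interesting(line: str) -> bool:
    """True for every line that is not a GET for a boring resource."""
    return BORING_REQUEST.search(line) is None


def filter_lines(lines):
    """Yield only the interesting lines from any iterable of text lines."""
    for line in lines:
        if is_interesting(line):
            yield line


sample = [
    "GET /index.html HTTP/1.1",
    "GET /favicon.ico HTTP/1.1",
    "GET /static/app.js?v=3 HTTP/1.1",
    "Host: example.com",
]
for line in filter_lines(sample):
    print(line)
```

To use it on live traffic, filter_lines could be fed sys.stdin instead of the sample list, with tcpdump run with -l so that its output is line-buffered when piped.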

Related

How to write http layer sniffer

I want to write an application layer sniffer (SMTP/ftp/http).
Based on my searches, the first (and perhaps hardest!) step is to reassemble the TCP streams of the sniffed connections.
What I need is something like the "follow TCP stream" option of Wireshark, but I need a tool which does it on a live interface and automatically. As far as I know, Tshark can extract TCP stream data from saved pcap files automatically (link) but not from live interfaces. Can Tshark do it on live interfaces?
As far as I know, TCPflow can do exactly what I want; however, it cannot handle IP defragmentation and SSL connections (I want to analyse the SSL content in the case I have the server private key).
Finally, I also tried the Bro network monitor. Although it provides the list of TCP connections (conn.log), I was not able to get the contents of the TCP connections.
Any suggestion about mentioned tools or any other useful tool is welcome.
Thanks in advance, Dan.
The Perl Net::Inspect library might help you. It also comes with a tcpudpflow tool which can write TCP and UDP flows into separate files, similar to tcpflow. It works on pcap files or can do live captures. The library handles IP fragmentation. It also comes with an httpflow tool to extract HTTP requests and responses (including decompression, chunked encoding, ...). It does not currently handle SSL.
As the author of this library I don't think that extracting TCP flows is the hardest part: the HTTP parser (excluding decompression, including chunked mode) is nearly twice as big as IP and TCP combined.
This example works for reassembling application data of a single protocol:
tshark -Y "tcp.dstport == 80" -T fields -d tcp.port==80,echo -e echo.data
It captures live http data, reassembles it, and outputs it as raw hex.
I can add a small script to parse the hex into ascii if you like.
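The hex-to-ASCII step could look something like the following Python sketch. It assumes tshark prints one hex string per line, possibly colon-separated depending on the version; the function name hex_to_text is made up for this example.

```python
def hex_to_text(hex_line: str) -> str:
    """Decode one line of hex output into printable ASCII.

    Colons are stripped in case the hex is printed as 47:45:54:20.
    Non-printable bytes are shown as '.', similar to tcpdump -A.
    """
    cleaned = hex_line.strip().replace(":", "")
    raw = bytes.fromhex(cleaned)
    return "".join(
        chr(b) if 32 <= b < 127 or b in (9, 10, 13) else "."
        for b in raw
    )


# "GET / HTTP/1.1" followed by CRLF, encoded as hex
print(hex_to_text("474554202f20485454502f312e310d0a"), end="")
```

In a pipeline, each line read from sys.stdin would be passed through hex_to_text and written to stdout.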
"I want to analyse the SSL content in the case I have the server private key"
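TL;DR: With modern cipher suites this can't be done with a capturing tool alone.
Why not: With an ephemeral Diffie-Hellman key exchange (DHE/ECDHE, i.e. forward secrecy), each SSL session generates a new secret session key that is never sent over the wire, and you can't decrypt the session without this key. Having the private server key is not enough. The idea behind this is that if someone captures your SSL traffic, saves it, and then a year later "finds" the private server key, they still won't be able to decrypt your traffic. Only when the session uses plain RSA key exchange can a tool such as Wireshark decrypt a capture given the server's private key.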

Method to track lost packets source in FreeBSD

I have a FreeBSD host (some sort of HTTP proxy) with spikes in the number of retransmitted packets. Is there any way to track where the host is losing them (per incoming connection)?
I usually capture a bunch of them with tcpdump or similar, and then post-process them elsewhere. In your case that should not be hard, as you just need the headers.
Something like tcpdump (without -s, or with a -s snapshot length under 200 bytes) would do on the target machine.
Then compress the file and move it to a desktop machine to work on it. I'd start with something like Wireshark (simply use the filters).
Beyond that - simple grep-ing/wc-counting or a small perl script may be called for. To save you re-inventing histograms; consider http://snippets.aktagon.com/snippets/62-How-to-generate-a-histogram-with-Perl or do a quick google.
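As a rough illustration of that kind of post-processing, here is a minimal Python sketch. It assumes the capture was dumped as text with something like tcpdump -n -S -r capture.pcap tcp, and it treats a segment as a retransmission when the same flow repeats the same sequence range. That is a simplification (it ignores SYN retransmits and partial overlaps), and the function names are invented for the example.

```python
import re
from collections import Counter

# Matches data-carrying TCP lines from `tcpdump -n -S`, for example:
# 12:00:00.000001 IP 10.0.0.5.51514 > 10.0.0.1.8080: Flags [P.], seq 1000:1100, ack 1, win 229, length 100
SEGMENT = re.compile(r"IP6? (\S+) > (\S+): Flags \[[^\]]*\], seq (\d+):(\d+)")


def count_retransmissions(lines):
    """Count repeated (flow, sequence range) pairs, keyed by flow."""
    seen = set()
    retransmits = Counter()
    for line in lines:
        match = SEGMENT.search(line)
        if match is None:
            continue  # not a TCP line with a seq range (pure ACK, SYN, ...)
        src, dst, start, end = match.groups()
        key = (src, dst, start, end)
        if key in seen:
            retransmits[f"{src} > {dst}"] += 1
        else:
            seen.add(key)
    return retransmits


def print_histogram(retransmits):
    """Print one bar per flow, most retransmissions first."""
    for flow, count in retransmits.most_common():
        print(f"{flow:<45} {'#' * count} ({count})")


sample = [
    "12:00:00.000001 IP 10.0.0.5.51514 > 10.0.0.1.8080: Flags [P.], seq 1000:1100, ack 1, win 229, length 100",
    "12:00:00.200001 IP 10.0.0.5.51514 > 10.0.0.1.8080: Flags [P.], seq 1000:1100, ack 1, win 229, length 100",
    "12:00:00.300001 IP 10.0.0.7.40000 > 10.0.0.1.8080: Flags [.], seq 1:1449, ack 1, win 229, length 1448",
]
print_histogram(count_retransmissions(sample))
```

Keying the counter on the full flow (address and port on both sides) keeps the result per connection, which is what the question asks for.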

Simple Http alternative for Wireshark

In web development, I usually use Firebug. But now I have to use Wireshark to monitor Http requests sent by an Android simulator. Wireshark is a fantastic tool, however it is too fat for what I'm doing, and quite painful to copy/paste the request.
So I'm looking for a simpler alternative on Linux Ubuntu.
Wireshark is mostly bloated due to the GUI front-end; however it has a text-version called tshark that uses substantially less memory... the syntax is very similar to tcpdump...
To capture packets sent to and from a webserver on 192.168.12.14, use this...
tshark -n -i eth0 tcp and host 192.168.12.14 and port 80
You may also consider using ngrep http://ngrep.sourceforge.net/usage.html#http

How to debug Websockets?

I want to monitor the websocket traffic (like to see what version of the protocol the client/server is using) for debugging purposes. How would I go about doing this? Wireshark seems too low level for such a task. Suggestions?
Wireshark sounds like what you want actually. There is very little framing or structure to WebSockets after the handshake (so you want low-level) and even if there was, wireshark would soon (or already) have the ability to parse it and show you the structure.
Personally, I often capture with tcpdump and then parse the data later using Wireshark. This is especially nice when you may not be able to run Wireshark on the device where you want to capture the data (e.g. a headless server). For example:
sudo tcpdump -w /tmp/capture_data -s 8192 port 8000
Alternatively, if you have control over the WebSockets server (or proxy), you could always print out the sent and received data. Note that since websocket frames (in the early drafts of the protocol) start with '\x00', you will want to avoid printing that byte, since in many languages '\x00' means the end of the string.
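One way to print frame data without tripping over '\x00' or other binary bytes is to escape everything that is not printable ASCII. A minimal Python sketch follows; printable and log_frame are hypothetical helper names, not part of any WebSockets library.

```python
def printable(data: bytes) -> str:
    """Render bytes for a log line, escaping every non-printable byte.

    The backslash itself (byte 92) is escaped too, so the output is unambiguous.
    """
    return "".join(
        chr(b) if 32 <= b < 127 and b != 92 else f"\\x{b:02x}"
        for b in data
    )


def log_frame(direction: str, data: bytes) -> None:
    """Print one frame with its direction and length."""
    print(f"{direction} ({len(data)} bytes): {printable(data)}")


# A made-up frame in the old draft style: 0x00, text, 0xff
log_frame("recv", b"\x00hello\xff")
```

Calling log_frame from the server's send and receive paths gives a readable trace without needing a packet capture.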
If you're looking for the actual data sent and received, recent versions of Chrome Canary and Chromium now have a WebSocket message frame inspection feature.
You find details in this thread.
I think you should use Wireshark
Steps
Open wireshark
Go to capture and follow the path below: capture > interfaces > start capture on the appropriate device.
Write the rule in the filter box: tcp.dstport == your_websocket_port
Hit apply
For simple things Wireshark is too complex; I only wanted to check whether the connection can be established or not. The Chrome plugin "Simple WebSocket Client" (link: https://chrome.google.com/webstore/detail/simple-websocket-client/pfdhoblngboilpfeibdedpjgfnlcodoo?hl=en) works like a charm.

A Question regarding wget

When I type wget http://yahoo.com:80 in a Unix shell, can someone explain to me what exactly happens from entering the command to reaching the Yahoo server? Thank you very much in advance.
The RFCs provide all the details you need and are not tied to a tool or OS.
In your case wget uses HTTP, which is based on TCP, which in turn uses IP; below that it depends on what you use, but most of the time you will encounter Ethernet frames.
In order to understand what happens, I urge you to install Wireshark and have a look at the dissected frames; you will get an overview of what data belongs to which network layer. That is the easiest way to visualize and learn what happens. Besides this, if you really like (irony) funny documents (/irony), have a look at the corresponding RFCs, for example RFC 2616 for HTTP; for the others, have a look at the external links at the bottom of the Wikipedia articles.
The program uses DNS to resolve the host name to an IP. The classic API call is gethostbyname although newer programs should use getaddrinfo to be IPv6 compatible.
Since you specify the port, the program can skip looking up the default port for http. But if you hadn't, it would try a getservbyname to look up the default port (then again, wget may just embed port 80).
The program uses the network API to connect to the remote host. This is done with socket and connect.
The program writes an HTTP request to the connection with a call to write.
The program reads the http response with one or more calls to read.
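The steps above can be sketched in Python, whose socket module wraps the same C calls. This is an illustration of the sequence, not what wget literally does: the request headers are a bare minimum chosen for the example, and build_request and fetch are made-up names.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP GET request (the bytes written to the socket)."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch(host: str, port: int = 80, path: str = "/") -> bytes:
    """Resolve, connect, send the request and read the whole response."""
    # DNS lookup: getaddrinfo handles both IPv4 and IPv6 results.
    family, socktype, proto, _canonname, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    # socket + connect: this is where the TCP handshake happens.
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(10)
        sock.connect(addr)
        # write: send the HTTP request over the connection.
        sock.sendall(build_request(host, path))
        # read: keep reading until the server closes the connection.
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Example (needs network access, so it is left commented out):
# print(fetch("yahoo.com", 80)[:200])
print(build_request("yahoo.com").decode("ascii"), end="")
```

Running the commented-out fetch call while Wireshark is capturing shows each step on the wire: the DNS query, the TCP three-way handshake, the request and the response.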

Resources