A specific set of processes happens between the moment a user hits www.google.com and the moment the page appears in the browser. Can anybody tell me everything that happens during this process? Also, how is a mobile browser different from a desktop browser?
This really depends on what browsers you're comparing. For example, Safari Mobile and Safari for Mac are quite similar to one another, so much so that you often see the same page on both. However, IE for Pocket PCs is quite different from IE8, and pages would render somewhat differently in those two.
Usually, site operators check the User-Agent string that all browsers send, to see which browser it is. Then it's up to the site operator to decide whether to serve a mobile site or a regular site.
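As an illustration, a hedged sketch of what that server-side check might look like in Python (the keyword list is made up for the example; real sites usually rely on a maintained device-detection library):

```python
# A minimal sketch of User-Agent sniffing. The keyword list is made up for
# the example; real sites usually rely on a maintained detection library.
MOBILE_HINTS = ("Mobile", "Android", "iPhone", "iPad", "Windows Phone")

def is_mobile(user_agent: str) -> bool:
    return any(hint in user_agent for hint in MOBILE_HINTS)

ua = "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15"
print("serve mobile site" if is_mobile(ua) else "serve regular site")
```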
PPK has a great list of all browser quirks and features, at quirksmode.org. It's a must-read for mobile development.
1. Name resolution. www.google.com is resolved to an IP address through the Domain Name System (DNS).
2. HTTP request. The browser sends a GET request to the server.
3. HTTP response. The server sends back an HTTP response.
4. Parse. The client parses the resulting document and resolves referenced assets (CSS, images, etc.).
5. HTTP requests. For each referenced asset, the browser sends another request to the server.
6. HTTP responses. For each referenced asset, the server responds.
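To make the first three steps concrete, here is a minimal sketch using only the Python standard library (port 80 and plain HTTP are simplifying assumptions; a real browser would use HTTPS):

```python
# A minimal sketch of steps 1-3, using only the Python standard library.
# Port 80 / plain HTTP is a simplifying assumption; real browsers use HTTPS.
import socket
import http.client

# 1. Name resolution: ask DNS for the IP address behind the hostname.
host = "www.google.com"
ip = socket.gethostbyname(host)
print(f"{host} resolved to {ip}")

# 2. HTTP request: open a TCP connection and send a GET for the root document.
conn = http.client.HTTPConnection(host, 80)
conn.request("GET", "/")

# 3. HTTP response: read the status line, headers, and body.
resp = conn.getresponse()
print(resp.status, resp.reason)
print(f"received {len(resp.read())} bytes of HTML")
conn.close()
```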
In this respect (how HTTP is requested), mobile is no different from desktop.
The same stuff happens: mobile browsers render HTML documents just like your PC browser does.
Of course, they might have less memory, different rendering engines, run on a very small screen, and so on. But in the end it is just another HTTP request to google.com.
Depending on the network or connection type, there might be one other difference: the operator gateway/proxy. Some operators filter or proxy all communication to the net.
Also, internet traffic from an operator's customers is usually routed through a couple of public IPs.
It may seem to be a trivial question, but still, I have some confusion over it.
On almost every site I have read that HTTP persistent (keep-alive) connections are better than non-persistent ones.
Question: So why do non-persistent connections even exist?
Some say that persistent connections have a disadvantage when the server is serving many clients, as other users are deprived of a connection.
Question: All the popular websites serve millions of clients; does that mean they don't use persistent mode?
As per my understanding, I would think search engines may not be using persistent connections.
Can someone please enlighten me on this topic?
Another doubt I have is regarding HTTP requests. I have read that if a page contains links to several objects, then the web browser makes that many requests to fetch all of them (this is why persistent connections are used). My doubt is: why aren't all the objects embedded in the page and sent as one object? If the argument is that this makes the page heavy and isn't bandwidth-friendly, then the browser opens parallel connections anyway to fetch multiple objects, which puts the same load on the network.
OK, I understand that this cannot be done for something like image search, but if a page contains only a few objects, can we embed them into the page and send them together?
These may seem like foolish questions, but I can't help it. I have a doubt and I need to clear it up, and you can help.
Thanks
The original HTTP specification always used non-persistent connections; HTTP/1.1 added persistence because it is more efficient for web pages that embed a lot of external objects (which were rare when HTTP/1.0 was written).
However, even though HTTP/1.1 allows persistent connections, there are implementations that don't support them, or which still only support HTTP/1.0. In HTTP/1.0, the Connection: keep-alive header must be sent to enable the feature; in HTTP/1.1, connections are persistent by default, and Connection: close is sent to disable that.
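Here's a small sketch of that behavior with Python's http.client, assuming a server that honors keep-alive: under HTTP/1.1, two requests can reuse one TCP connection, and Connection: close ends it:

```python
# A sketch assuming the server honors keep-alive. http.client speaks
# HTTP/1.1, where connections are persistent by default.
import http.client

conn = http.client.HTTPConnection("example.com", 80)

# First request: no Connection header needed, the socket stays open.
conn.request("GET", "/")
resp = conn.getresponse()
resp.read()  # drain the body so the same connection can be reused
print("first response:", resp.status)

# Second request reuses the same TCP connection; Connection: close asks
# the server to tear it down once this response has been sent.
conn.request("GET", "/", headers={"Connection": "close"})
resp = conn.getresponse()
resp.read()
print("second response:", resp.status)
conn.close()
```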
It is possible to include media directly in the HTML by base64-encoding the data and including it in a data: URL. This is not usually done because it slows down rendering: with a standard HTML page, the browser can start rendering the structure of the page without waiting for the (rather large) inline data: links to download.
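For what it's worth, here is a minimal sketch of building such a data: URL in Python (logo.png is a hypothetical local file):

```python
# A minimal sketch of inlining an image with a data: URL.
# "logo.png" is a hypothetical local file standing in for any small asset.
import base64

with open("logo.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("ascii")

# The image bytes now live inside the HTML itself, no extra request needed.
img_tag = f'<img src="data:image/png;base64,{encoded}" alt="logo">'
print(img_tag[:80] + "...")
```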
As you say, most of the web pages hosted on the internet do not handle only a small amount of data, and nobody can estimate that in advance. The HTTP server should be generic, and it needs a mechanism to avoid excessive overhead from the many requests a page's dependencies generate. The argument that the non-persistent method avoids a single client blocking ports for a long time while the server has to serve many other clients is not really true: persistent connections actually reduce the load on a server by cutting down the number of TCP connections it has to open and tear down.
Hope this article on HTTP persistent connections will help you understand.
I've been trying to look for an explanation online but I can't seem to find one.
If you go to a site like youtube.com on Chrome and hover over the blue bar corresponding to the file name "http://www.youtube.com/", you'll see four different things:
- Blocking
- Sending
- Waiting
- Receiving
While viewing a different site's page in the network tab, I see
- DNS Lookup
- Connecting
- Sending
- Waiting
- Receiving
It takes a long time to do all these things, even though the page is so simple. What makes the browser display a different set of timing phases for a page load, and what can I do to optimize? In general, where can I find more comprehensive info on the Network tool?
DNS lookup usually happens when you connect to a site for the first time and your browser doesn't have its IP address yet. In this case you may see a small tooltip in the bottom-left corner of the page with the text "Resolve www.blablabla.com...".
It can be pretty long if the DNS server is slow.
Connecting is the time after the browser has sent the packet to establish the TCP connection, while it waits for an answer.
It can be long if the web server is slow.
Blocking is the time when the browser needs to request a resource but has already used up its limit of simultaneous connections to the same server (browsers cap the number of parallel connections per host); the browser puts these requests into a queue. It can happen if the server is slow.
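If you want a feel for those phases yourself, here is a rough sketch that measures them with Python's standard library (wall-clock timing against a stand-in host; this is not how Chrome's tooling works internally):

```python
# A rough approximation of the phases Chrome reports, measured with
# wall-clock timing (not how Chrome's tooling works internally).
import socket
import time
import http.client

host = "example.com"  # stand-in host

t0 = time.monotonic()
ip = socket.gethostbyname(host)                     # DNS Lookup
t1 = time.monotonic()

conn = http.client.HTTPConnection(ip, 80)
conn.connect()                                      # Connecting (TCP handshake)
t2 = time.monotonic()

conn.request("GET", "/", headers={"Host": host})    # Sending
t3 = time.monotonic()

resp = conn.getresponse()                           # Waiting (time to first byte)
t4 = time.monotonic()

resp.read()                                         # Receiving (rest of the body)
t5 = time.monotonic()

phases = [("DNS Lookup", t1 - t0), ("Connecting", t2 - t1),
          ("Sending", t3 - t2), ("Waiting", t4 - t3),
          ("Receiving", t5 - t4)]
for name, dt in phases:
    print(f"{name:<10} {dt * 1000:7.1f} ms")
conn.close()
```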
The question is pretty straightforward. I want to know if there are ways of discovering the HTTP requests my browser sends while I navigate. For instance, what happens when I click on a certain link which sends a PUT request? I mean, I wish I could determine the exact HTTP request that my browser sends to that website. Further, I want to later reproduce that request in curl. Basically, I want to inspect the requests my browser sends so I can automate that task later through the curl command (the command, not the library).
Thanks in advance!
Fernando.
Fiddler does exactly what you want. It sets up a proxy that can monitor HTTP communication from your browser.
http://www.fiddler2.com/fiddler2/
You would want the Firebug extension for Firefox. It can show a lot of what is happening, and you can add more options by installing more extensions.
Alternatively, you can use Wireshark to capture the traffic to and from your computer.
Then you can use filters to save the relevant packets (pcap is the common format for storing packets).
Later, you can replay the packets using tools like tcpreplay.
You could try it out with BackTrack Linux (live CD/USB).
And nowadays there should be some newer tools for Windows as well. :)
EO2 and JohnnyC are correct. Fiddler, Wireshark, Firebug (a Firefox add-on), etc. are what you are looking for. You can use them free of charge.
Wireshark will capture all incoming and outgoing traffic on your box. You can listen on any port, filter data, etc.
Firebug will capture outgoing and incoming data streams and the raw data (XML, JSON, images, etc.) for each request.
Fiddler is great for tracking web data in a separate application if you do not use Firefox.
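Once you've captured a request in one of those tools, replaying it is just a matter of reusing the same method, URL, headers, and body. A hedged sketch in Python (every concrete value here, path, cookie, and body included, is a made-up stand-in for whatever your capture shows):

```python
# Replaying a captured browser request, assuming you copied the method,
# URL, headers, and body out of Fiddler/Wireshark/Firebug.
# Every concrete value below is a made-up example.
import http.client

conn = http.client.HTTPConnection("example.com", 80)
conn.request(
    "PUT",
    "/api/item/42",                      # hypothetical path from the capture
    body='{"name": "test"}',             # hypothetical body from the capture
    headers={
        "User-Agent": "Mozilla/5.0",     # copy the browser's real UA string
        "Content-Type": "application/json",
        "Cookie": "session=abc123",      # hypothetical session cookie
    },
)
resp = conn.getresponse()
print(resp.status, resp.read()[:200])
conn.close()
```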
I want to know: when a browser sends a request, does the server send back the contents explicitly? And how would I confirm it?
There are several toolbars for Firefox that show exactly what is coming and going when making an HTTP request.
For Firefox I use the following plugins:
Firebug
Web Developer
You could also install a utility called Wireshark. It will "sniff" all the network traffic on your computer and show you at the packet level how it all works.
Browser plugins such as Firebug (for Firefox) let you see exactly what the server is returning; that's quite instructive and recommended! You'll see a bunch of headers followed by the response body in any of several formats (it could be chunked, etc.).
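If you'd rather confirm it without any plugin at all, a rough sketch: speak HTTP over a raw socket and print the exact bytes the server sends back (example.com is just a stand-in host):

```python
# Confirm what the server sends back by speaking HTTP over a raw socket
# and printing the exact bytes of the response. example.com is a stand-in.
import socket

sock = socket.create_connection(("example.com", 80))
sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

response = b""
while True:
    chunk = sock.recv(4096)
    if not chunk:  # server closed the connection: response is complete
        break
    response += chunk
sock.close()

# Headers come first, then a blank line, then the HTML body.
print(response.decode("utf-8", errors="replace")[:500])
```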
In a Windows environment you can use Fiddler.
Fiddler includes a fair amount of documentation and is easy to use.
I have a firewall implementation and I want to log all the websites visited on the machine. So when the user enters an address in the browser (any browser) or clicks a link, I want to be able to log the visited address.
The problem is that I want to log only the visited address and NOT the other resources requested by the page (ads, iframes, Google stats, and so on). Is there a method to do this by looking at the HTTP or TCP headers? Or any other method?
Thank you.
A possible method would be to use "transparent proxying": have the firewall automatically redirect all outbound HTTP connections to a proxy. You'll find the desired information in the proxy's log.
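If you go the proxy route, a rough heuristic for filtering the log down to top-level page visits might look like the sketch below. It assumes each log entry carries the request headers, and it leans on the Sec-Fetch-Dest header that modern browsers send; the Accept fallback for older browsers is weaker and will misclassify some requests:

```python
# A rough heuristic for keeping only top-level page visits in a proxy log,
# assuming each log entry carries the request headers. Modern browsers send
# Sec-Fetch-Dest: document only for top-level navigations; the Accept
# fallback for older browsers is weaker.
def is_page_visit(headers: dict) -> bool:
    dest = headers.get("Sec-Fetch-Dest", "").lower()
    if dest:
        return dest == "document"
    return headers.get("Accept", "").startswith("text/html")

# A navigation vs. an image fetched by the page:
print(is_page_visit({"Sec-Fetch-Dest": "document"}))           # True
print(is_page_visit({"Accept": "image/avif,image/webp,*/*"}))  # False
```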
A somewhat easier method I found was to use Microsoft® Active Accessibility® and to read the URL from the browser's address bar. But this is tricky in other ways: you have to take into consideration multiple browsers' UI layouts (at least for the most popular ones) and also the differences between versions of the same browser. Some browsers or browser versions have limited support for MSAA and don't expose all the controls (e.g., Opera 10.50-10.51, although this was fixed in 10.52).