how does HTTP "session" reconstruction work? - http

I have found this tool online: http://www.unleashnetworks.com/products/unsniff.html
How does this work? Are they assuming that all HTTP traffic for a session occurs in the same TCP session, and then just clumping all that data together? Is that a safe assumption?
I was under the impression that when I load a page, multiple TCP sessions could be running for that single page load (images, videos, flash, whatever).
This seems to get complicated when I think about having two browser tabs open that are loading pages at the same time... how could I differentiate one HTTP "session" from another? That's especially true if they are hitting the same page, right?
For that matter, how does the browser know which incoming data belongs to which tab? Does it keep track of which TCP sessions belong to an individual tab?
Edit:
When HTTP session is mentioned above, I am referring to all of the related HTTP transactions that it takes to, say, load a page.
By TCP session, I am literally referring to a connection's packet lifetime, from SYN to FIN.

Although it might not be visible, the HTTP session tracker is passed from the client to the server as a parameter or as a cookie (header).
You might want to read about HTTP session tokens.
A session token is a unique identifier that is generated and sent from a server to a client to identify the current interaction session. The client usually stores and sends the token as an HTTP cookie and/or sends it as a parameter in GET or POST queries. The reason to use session tokens is that the client only has to handle the identifier—all session data is stored on the server (usually in a database, to which the client does not have direct access) linked to that identifier. Examples of the names that some programming languages use when naming their HTTP cookie include JSESSIONID (JSP), PHPSESSID (PHP), and ASPSESSIONID (ASP).
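The token lifecycle described above can be sketched in a few lines. This is a minimal Python illustration with hypothetical names (real frameworks such as PHP or JSP generate their own cookie names like PHPSESSID or JSESSIONID):

```python
import secrets

# Hypothetical in-memory session store: token -> server-side session data.
# The client never sees this; it only holds the token.
SESSIONS = {}

def create_session(user_id):
    """Generate an unguessable token; all real state stays on the server."""
    token = secrets.token_hex(16)            # 128 bits of randomness
    SESSIONS[token] = {"user_id": user_id}
    return token

def set_cookie_header(token):
    """The Set-Cookie header a server would emit (PHPSESSID/JSESSIONID-style)."""
    return f"Set-Cookie: SESSIONID={token}; HttpOnly; Path=/"

def lookup(token):
    """On later requests, resolve the client's token back to its session."""
    return SESSIONS.get(token)
```

On each subsequent request the browser sends the cookie back, and the server uses only the token to find the stored data.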

I am not familiar with the "Unsniff" app you link to, but I have used a few packet sniffers before (my favorite is Wireshark). Usually you can differentiate sessions based on what host they are connected to. So, for instance, if you have 2 tabs open and one is opened to www.google.com and the other is www.facebook.com, the packet sniffer should be able to tell you which session is pointed at which host (or at least give you an IP address, which you can then use to find the host. see: reverse lookup).
Most of the time, multiple HTTP sessions will be open to one host. This is the case when you're loading a site's various resources (CSS files, images, JavaScript, etc.). Each of these resources will show up as a separate HTTP session (unless, of course, the connection is persistent... but your sniffer should be able to separate them anyway). In this case, you (or the sniffer) will need to determine what was downloaded by looking at the actual data within the HTTP packets.
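Under the hood, a sniffer's session reconstruction boils down to grouping packets by their TCP 4-tuple (source IP/port, destination IP/port). A minimal sketch, assuming packets have already been decoded into dicts (the field names are a made-up layout for illustration):

```python
from collections import defaultdict

def flow_key(pkt):
    """Normalize a packet's endpoints so both directions of a TCP
    connection map onto the same key (sorted endpoint pair)."""
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    return tuple(sorted([a, b]))

def group_sessions(packets):
    """Clump captured packets into per-TCP-session buckets, the way a
    sniffer's 'follow stream' feature does (greatly simplified)."""
    sessions = defaultdict(list)
    for pkt in packets:
        sessions[flow_key(pkt)].append(pkt)
    return sessions
```

Two tabs fetching from different hosts end up in different buckets automatically, because each opens its own connection and therefore has a distinct 4-tuple.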

Related

Distinguish between users in HTTP

I have a question about distinguishing user in HTTP.
I opened Wireshark and a web browser and started wandering through pages on some well-known website. I noticed that my PC opened several TCP connections, and on each connection there are several HTTP requests/responses.
My main goal is to identify a user wandering through my website (for example).
At first I thought of finding the matching response for each request, but after a lot of reading I found it is not trivial, especially if there are several TCP connections for one user.
I also thought of identifying a user by source port/IP, but each user opens several TCP connections, and there is more than one user (some of them might be behind NAT).
So my question is: how can I identify/isolate all the HTTP requests/responses from one user, given:
there is more than one user
some of them might be behind NAT
each user opens several TCP connections for the "main" HTTP request
cookies might change during a session
Is there some sniffer or library that already has the ability to distinguish between users (this HTTP request/response belongs to that user)?
This process is supposed to run on the fly.
Thanks a lot.
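One common heuristic for this is to key captured requests on the session cookie when one is present (it survives multiple TCP connections and NAT), and fall back to the client address otherwise. A sketch in Python; the request-dict layout and the cookie names are assumptions for illustration:

```python
import re

# Common session cookie names; real sites may use anything.
SESSION_RE = re.compile(r"(?:PHPSESSID|JSESSIONID|SESSIONID)=([^;]+)")

def user_key(request):
    """Pick the best available identifier for a captured HTTP request:
    prefer the session cookie, else fall back to client ip:port."""
    cookie = request.get("headers", {}).get("Cookie", "")
    m = SESSION_RE.search(cookie)
    if m:
        return ("cookie", m.group(1))
    return ("addr", request["src_ip"], request["src_port"])

def group_by_user(requests):
    """Bucket requests so each bucket holds one user's traffic."""
    buckets = {}
    for r in requests:
        buckets.setdefault(user_key(r), []).append(r)
    return buckets
```

Note this inherits the question's caveat: if the cookie changes mid-session, the simple version above will split one user into two buckets unless you also track the Set-Cookie responses that rotate it.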

ASP MVC: Can I drop a client connection programmatically?

I have an ASP.NET Web API application running behind a load balancer. Some clients keep a busy HTTP connection alive for too long, creating unnecessary affinity and causing high load on some server instances. To fix that, I wish to gracefully close a connection that is making too many requests in a short period of time (thus forcing the client to reconnect and pick a different server instance), while at the same time keeping low-traffic connections alive indefinitely. Hence I cannot use a static configuration.
Is there some API that I can call to flag a request to "answer this then close the connection" ? Or can I simply add the Connection: close HTTP header that ASP.NET will see and close the connection for me?
It looks like a good solution for your situation would be the built-in IIS functionality called Dynamic IP Restrictions: "To provide this protection, the module temporarily blocks IP addresses of HTTP clients that make an unusually high number of concurrent requests or that make a large number of requests over a small period of time."
It is supported by Azure Web Apps:
https://azure.microsoft.com/en-us/blog/confirming-dynamic-ip-address-restrictions-in-windows-azure-web-sites/
I am not 100% sure this would work in your situation, but in the past I have had to block people coming from specific IP addresses geographically, as well as people coming from common proxies. I created an authorization attribute class following:
http://www.asp.net/web-api/overview/security/authentication-filters
It would dump the person out based on their IP address by returning HttpStatusCode.BadRequest. On every request you would have to check a list of bad IPs in the database and go from there. Maybe you can handle the rest client-side, because they are going to get a ton of errors.
Write an action filter that returns a 302 Found response for the 'blocked' IP address. I would hope, the client would close the current connection and try again on the new location (which could just be the same URL as the original request).
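The "answer this, then close the connection" idea is independent of ASP.NET. Here is a language-agnostic sketch in Python of the per-client decision logic; the window and limit values are hypothetical, and a real filter would plug this into the framework's response pipeline:

```python
import time
from collections import defaultdict, deque

WINDOW = 10.0   # seconds of history to consider (hypothetical)
LIMIT = 5       # requests per window before we shed the connection

_recent = defaultdict(deque)   # client ip -> timestamps of recent requests

def response_headers(client_ip, now=None):
    """Decide per request whether to answer normally or to answer and
    then drop the keep-alive connection, forcing the client back
    through the load balancer to pick a fresh instance."""
    now = time.monotonic() if now is None else now
    q = _recent[client_ip]
    q.append(now)
    while q and now - q[0] > WINDOW:   # slide the window
        q.popleft()
    if len(q) > LIMIT:
        # Too chatty: finish this response, then close the connection.
        return {"Connection": "close"}
    return {"Connection": "keep-alive"}
```

Low-traffic clients never trip the limit and keep their persistent connection; a burst above the limit gets its response plus `Connection: close`, which is exactly the graceful behavior the question asks for.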

How does a http client associate an http response with a request (with Netty) or in general?

Is an HTTP endpoint supposed to respond to requests from a particular client in the order that they are received?
What if that doesn't make sense, as with requests handled by a cluster behind a proxy, or requests handled with NIO where one request finishes faster than another?
Is there a standard way of associating a unique ID with each HTTP request in order to match it with its response? How is this handled in clients like HttpComponents HttpClient or curl?
The question comes down to the following case:
Suppose I am downloading a file from a server and the request has not finished. Is the client capable of completing other requests on the same keep-alive connection?
Whenever a TCP connection is opened, the connection is recognized by the source and destination ports and IP addresses. So if I connect to www.google.com on destination port 80 (default for HTTP), I need a free source port which the OS will generate.
The reply of the web server is then sent to the source port (and IP). This is also how NAT works, remembering which source port belongs to which internal IP address (and vice versa for incoming connections).
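You can watch the OS pick an ephemeral source port yourself. This Python sketch opens a TCP connection over loopback and confirms that the server sees the peer as exactly the client's (IP, port) pair, which is what the reply is addressed to:

```python
import socket

def show_source_port():
    """Open a TCP connection over loopback and report the ephemeral
    source port the OS generated for the client side."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))      # port 0: let the OS choose
    server.listen(1)
    dst = server.getsockname()

    client = socket.socket()
    client.connect(dst)                # OS assigns a free source port here
    src_ip, src_port = client.getsockname()

    conn, peer = server.accept()
    # The server addresses its reply to exactly the client's (ip, port).
    assert peer == (src_ip, src_port)

    conn.close(); client.close(); server.close()
    return src_port
```

NAT devices maintain precisely this mapping: which external source port corresponds to which internal (IP, port) pair.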
As for your edit: no, a single HTTP connection can execute only one command (GET/POST/etc.) at a time. If you send another command while you are still retrieving data from a previously issued command, the results may vary per client and server implementation. I guess that Apache, for example, will transmit the result of the second request after the data of the first request is sent.
In response to your edit - no. It is not. A single persistent HTTP connection can only be used for one request at once, or it would get very confusing. Because HTTP does not define any form of request/response tracking mechanism, it simply would not be possible.
It should be noted that there are other protocols which use a similar message format (conforming to RFC 822) and which do allow for this, using mechanisms such as SIP's CSeq header. It would be possible to implement something similar in a custom HTTP app, but HTTP itself defines no standard mechanism for it, so nothing can be done that could be assumed to work everywhere. It would also present a problem with the response to the second message: do you wait for the first response to finish before sending the second, or try to pause the first response while you send the second? How would you communicate this in a way that guarantees messages won't become corrupted?
Note also that SIP (usually) operates over UDP, which does not guarantee packet ordering, making the cSeq system more of a necessity.
If you want to send a request to a server while another transaction is still in progress, you will need to create a new connection to the server, and hence a new TCP stream.
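That "one connection per in-flight request" rule is easy to demonstrate. This Python sketch spins up a throwaway local echo server and gives each concurrent request its own TCP stream; the server and handler here are illustration scaffolding, not anything from the question:

```python
import http.client
import http.server
import threading

class EchoHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = self.path.encode()          # echo the requested path back
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass                               # keep the demo output quiet

def fetch_concurrently(paths):
    """Fetch several resources at once, one TCP connection per in-flight
    request, since a plain HTTP/1.1 connection carries one at a time."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address

    results = {}

    def worker(path):
        conn = http.client.HTTPConnection(host, port)  # its own TCP stream
        conn.request("GET", path)
        results[path] = conn.getresponse().read().decode()
        conn.close()

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    server.shutdown()
    return results
```

This mirrors what browsers do: a small pool of parallel connections per host, each serving one request at a time.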
Facebook did some research into this while they were building their CDN, and they concluded that you can efficiently have 2 or 3 open HTTP streams at any one time, but any more than that reduces overall transfer time because of the extra packet overhead cost. I would link to the blog entry if I could find the link...

Effect of TCP RST on page loading time for Javascript script src tag

I want to deprecate (turn off/not send HTTP responses) for some old HTML & JS code that my clients have installed on their pages. Not all clients can update all of their webpages prior to when we deprecate, but I have the OK to deprecate.
Simple example of what the code can look like:
Customer domain, customer.com, has HTML & JS on their pages:
<script src="http://mycompany.com/?customer=customer.com&..."></script>
We are considering configuring our switches to send a TCP RST response on incoming deprecated requests to http://mycompany.com/..., so my question is, are there any side-effects (stall page loading, for example) with the approach of configuring our switches to respond with a TCP RST on the incoming TCP connection? Obviously, I want the least (ie no) impact on a customer's site.
I have to think that RST is a fairly harsh mechanism to not reply to a single request. This request might be one of a hundred resources required to render one of your client pages, and if you tear down the connection, that connection cannot be re-used to request further resources. (See section 19.7.1 of the HTTP/1.1 RFC: "Persistent connections are the default for HTTP/1.1 messages; we introduce a new keyword (Connection: close) for declaring non-persistence.")
Each new connection will require a new three-way handshake to set up, which might add half a second per failed request to one of the two connections the client is using to retrieve resources from your servers. What is the average latency between your servers and your customers? Multiply that by three to get the time for a new three-way handshake.
If you fail the requests at the HTTP protocol level instead (301? 302? 404? 410?) you can return a failure in the existing HTTP connection and save three-round-trips to generate a new connection (which might also be for a resource that you're no longer interested in serving).
Plus, a 410 ought to indicate that the browser shouldn't bother requesting the resource again (though I have no idea which browsers follow this advice). An RST-ed resource will probably be retried every single time it is requested.
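To see the difference in practice, this Python sketch fails requests at the HTTP level with 410 Gone while the underlying TCP connection stays open and is reused for the next request, exactly the behavior an RST would destroy (the handler and paths are made up for the demo):

```python
import http.client
import http.server
import threading

class GoneHandler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # persistent connections by default

    def do_GET(self):
        # Fail at the HTTP level: 410 Gone, but keep the TCP connection
        # open so the client can reuse it for its next resource.
        self.send_response(410)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass

def two_requests_one_connection():
    """Issue two requests over a single persistent connection; both get
    an HTTP-level failure with no TCP teardown in between."""
    server = http.server.HTTPServer(("127.0.0.1", 0), GoneHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()

    conn = http.client.HTTPConnection(*server.server_address)
    statuses = []
    for path in ("/old-widget.js", "/other-old-widget.js"):
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()                 # drain body so the socket is reusable
        statuses.append(resp.status)
    conn.close()
    server.shutdown()
    return statuses
```

With an RST instead, the second request would first pay for a fresh three-way handshake.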

Is data sent via HTTP POST when the Server does not exist?

I work for a large-ish advertising company. We've created a very lightweight clone of the PayPal IPN so we can offer CC Processing services for our top advertisers.
Like the PP IPN, it's a simple RESTful interface.
I deliberately instructed our admin guys to configure the vhost for this web app to only respond to requests on port 443.
This particular question is beyond my HTTP Protocol knowledge:
This may vary from browser to browser, but when a user submits a form whose ACTION is, say, http://www.somesite.com, and the browser cannot resolve that site, does the POST payload ever get sent over the wire?
I know this is a bit esoteric and it's more of an implementation question than something that exists in the HTTP RFC (as far as I could tell). Any takers?
Before sending any data the browser needs to open a TCP connection to the target site. Since this connection to the target site cannot be opened in the first place, no data can be sent.
Update (Thanks for the hint in the comments):
Use HTTP requests like POST to avoid sending the full data over the wire when it could reach a proxy before the existence of the target has been checked. With a proxy, the TCP connection (to the proxy itself) is always established successfully, and the HTTP request headers are sent to it. A POST request carries the additional data in its request body, which should be sent only if the request headers produce no error. Nevertheless, proxy implementations differ, and I cannot guarantee that there is no proxy which returns an error for a nonexistent target site. But in such a case I don't know any way you could avoid sending the complete data over the wire...
