Say my client is posting an HTTP request whose body is 1 TB.
When my server's HTTP handler is first invoked, how much of the request body is already in my server's TCP socket receive queue?
Edited:
Actually, what I want to know is related to the details of HTTP.
Say the whole HTTP request is very large (large headers and a large body):
Basically, I can send the whole (very large) HTTP request in one TCP request, or I can split it into several smaller slices and send them in several TCP requests. Which way does the HTTP protocol implement?
At what point, from the HTTP protocol's point of view, can the HTTP server start working?
For example:
When the whole path in the URL (the part after the host:port) is available in the server's memory? Or,
When all the HTTP headers are available in the server's memory? Or,
When the whole request body is available in the server's memory?
Just a quick question, and probably a stupid one.
But usually, when a client connects to an HTTP server, the server sends it the headers and the HTML, correct?
I'm packet sniffing a real-time chat and attempting to reverse engineer a plain-text protocol, and it's connected to an HTTP server. That is why I ask, for verification.
Basically, this is correct. However, you have to differentiate between, for example, GET and POST requests.
While POST requests normally have a "real" body carrying the information they deliver to the server, the body of a GET request is empty most of the time.
For responses, your claim is correct. The headers are sent to tell the client how big the response is, which MIME type is used, etc.
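As a rough illustration of the header/body split described above, here is a minimal Python sketch that separates a raw HTTP/1.x response into its status line, headers, and body. The bytes in `RAW_RESPONSE` are made up for the example and are not taken from any real capture; `split_response` is a hypothetical helper name.

```python
# Hypothetical raw response bytes, as they might appear in a packet capture.
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>hello</p>\n"
)


def split_response(raw: bytes):
    """Split a raw HTTP/1.x response into status line, headers, and body."""
    # The header block ends at the first empty line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


status_line, headers, body = split_response(RAW_RESPONSE)
print(status_line)
print(headers["content-type"], headers["content-length"])
print(body)
```

This sketch ignores chunked transfer encoding and repeated header names, both of which a real parser would need to handle.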
I'm currently trying to automatically reconstruct an HTTP browsing session using only a pcap (basically, this means matching each HTTP reply to the next HTTP requests). Most of the time it works fine, but sometimes a certain URL, u, is present in the data of multiple HTTP replies.
For example, if u1 and u2 both contain u in their reply data and the request to u happens after the request to u2, how can I decide whether the request to u was caused by u1 or by u2? Note that no request to u was made between u1 and u2.
Are there any fields in any network layer that I can use to make this match?
Thanks!
HTTP runs on top of TCP, which is connection-oriented. You have access to the IP header of the connection used for the HTTP request (client IP/port -> server IP/Port).
HTTP is a command/response protocol: there is one response for each request.
So, simply look for an HTTP response immediately following the HTTP request on the same TCP connection (server IP/Port -> client IP/Port).
HTTP is stateless, so the connection may be closed between requests without affecting the overall browsing model. Closing the connection is the required behavior in HTTP 0.9, the default behavior in HTTP 1.0, and not the default behavior in HTTP 1.1+. It is therefore possible for an HTTP response to trigger subsequent requests on new connections, and you need to be ready to handle that. The Connection header in the HTTP request tells you whether the client is asking for the connection to remain open. The Connection header in the HTTP response tells you whether the server is actually closing the connection after sending the response. But even if the server leaves the connection open, there is no guarantee that the client will reuse the same connection for later requests to the same server (though it likely will, unless a timeout elapses between requests).
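The matching rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the message dictionaries (with `src`, `dst`, `kind`, and `line` keys) are a hypothetical format standing in for whatever your pcap parser produces, and `pair_messages` is a made-up helper name.

```python
from collections import defaultdict, deque


def pair_messages(messages):
    """Pair each HTTP response with the oldest unanswered request
    on the same TCP connection.

    `messages` is an iterable of dicts in capture order, each with:
      src, dst : (ip, port) tuples
      kind     : "request" or "response"
      line     : the request line or status line (for display only)
    """
    pending = defaultdict(deque)  # (client, server) -> unanswered requests
    pairs = []
    for msg in messages:
        if msg["kind"] == "request":
            pending[(msg["src"], msg["dst"])].append(msg)
        else:
            # A response travels server -> client, so flip the key.
            queue = pending[(msg["dst"], msg["src"])]
            if queue:
                pairs.append((queue.popleft(), msg))
    return pairs


# Hypothetical capture: two connections from the same client to one server.
client_a = ("10.0.0.2", 50001)
client_b = ("10.0.0.2", 50002)
server = ("93.184.216.34", 80)
capture = [
    {"src": client_a, "dst": server, "kind": "request", "line": "GET /u1 HTTP/1.1"},
    {"src": client_b, "dst": server, "kind": "request", "line": "GET /u2 HTTP/1.1"},
    {"src": server, "dst": client_b, "kind": "response", "line": "HTTP/1.1 200 OK"},
    {"src": server, "dst": client_a, "kind": "response", "line": "HTTP/1.1 404 Not Found"},
]
pairs = pair_messages(capture)
for request, response in pairs:
    print(request["line"], "->", response["line"])
```

Note that this only pairs a request with its own response. It does not by itself tell you which earlier page caused a later request.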
Can someone explain the difference between an HTTP request (and its handling) and socket requests on port 80? As I understand it, an HTTP server listens on port 80, and when someone sends an HTTP request to this port, the server handles it. So if we open a socket to port 80 and write an HTTP-formatted message to it, does that mean we are sending a normal HTTP request? According to Fiddler, it does not. What is the difference at the packet level, or at any level below the presentation level, between an HTTP request and HTTP-formatted data written to a socket? Thanks.
First of all, port 80 is the default port for HTTP, but it is not required. You can have HTTP servers listening on other ports as well.
Regarding the difference between "regular" HTTP requests and the ones you make yourself over a socket - there is no difference. The "regular" HTTP requests you are referring to (made by a web browser for example) are also implemented over sockets, just like you would do it manually yourself. And the same goes for the server. The implementation of the HTTP server listens for incoming socket connections and parses the data that passes there just like you would.
As long as you send valid HTTP over your socket (according to the RFC), there should be no difference at the packet level (if the lower network stack is identical).
Keep in mind that the socket layer is just the layer the HTTP data always passes over. It doesn't really matter who put the data there, it just comes out from the other side the same way it was put in.
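To make this concrete, here is a small Python sketch that writes a hand-built HTTP request to a plain TCP socket and reads back the response. It assumes a throwaway local server on 127.0.0.1 (started here with the standard library's `http.server`) rather than any real site; `HelloHandler` and `fetch_with_raw_socket` are names invented for this example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny test server that answers every GET with the text 'hello'."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def fetch_with_raw_socket(host, port, path="/"):
    """Send a hand-written HTTP request over a plain TCP socket."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
raw = fetch_with_raw_socket("127.0.0.1", server.server_address[1])
server.shutdown()
server.server_close()
print(raw.decode("iso-8859-1"))
```

The server cannot tell that the request was typed by hand rather than generated by a browser. It only sees the bytes that arrive on the connection.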
Please note that you have some degree of freedom when implementing HTTP yourself. There are many optional fields, and the order of the headers doesn't matter. So it is possible that two different HTTP implementations will differ at the packet level but behave basically the same.
The best way to actually see what's going on at the packet level is to use a network sniffer such as Wireshark or Packetyzer. A sniffer records the packets on the network and shows you their content. So if you record several HTTP implementations (from various browsers) along with your own socket implementation, you can make the changes required to make them identical at the packet level.
My question is related to the HTTP streaming method for realizing HTTP server push:
The "HTTP streaming" mechanism keeps a request open indefinitely. It
never terminates the request or closes the connection, even after the
server pushes data to the client. This mechanism significantly
reduces the network latency because the client and the server do not
need to open and close the connection.
The HTTP streaming mechanism is based on the capability of the server
to send several pieces of information on the same response, without
terminating the request or the connection. This result can be
achieved by both HTTP/1.1 and HTTP/1.0 servers.
The HTTP protocol allows for intermediaries
(proxies, transparent proxies, gateways, etc.) to be involved in
the transmission of a response from server to the client. There
is no requirement for an intermediary to immediately forward a
partial response and it is legal for it to buffer the entire
response before sending any data to the client (e.g., caching
transparent proxies). HTTP streaming will not work with such
intermediaries.
Do I avoid the described problems with proxy servers if I use HTTPS?
HTTPS doesn't use HTTP proxies, as that would void its security. An HTTPS connection can be routed through an HTTP proxy, or a plain HTTP redirector, by using the HTTP CONNECT command, which establishes a transparent tunnel to the destination host. This tunnel is completely opaque to the proxy: the proxy cannot learn what is being transferred, i.e. what has been encrypted by SSL. It can attempt to modify the data flow, but the SSL layer will detect the modification and send an alert and/or close the connection.
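For illustration, the CONNECT exchange mentioned above looks roughly like the following Python sketch. It only builds the request bytes and interprets the proxy's status line. `build_connect_request` and `tunnel_established` are hypothetical helper names, `www.example.com` is a placeholder host, and no real proxy is contacted.

```python
def build_connect_request(host: str, port: int = 443) -> bytes:
    """Build the CONNECT request a client sends to an HTTP proxy."""
    target = f"{host}:{port}"
    return (
        f"CONNECT {target} HTTP/1.1\r\n"
        f"Host: {target}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(proxy_reply: bytes) -> bool:
    """Return True if the proxy answered with a 2xx status.

    After a 2xx reply, the proxy just relays bytes in both directions,
    and the client starts the TLS handshake through the tunnel.
    """
    status_line = proxy_reply.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    return len(parts) >= 2 and parts[1].startswith(b"2")


request = build_connect_request("www.example.com")
print(request.decode("ascii"))
print(tunnel_established(b"HTTP/1.1 200 Connection established\r\n\r\n"))
print(tunnel_established(b"HTTP/1.1 407 Proxy Authentication Required\r\n\r\n"))
```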
Update: for your task you can try to use one of the NULL cipher suites (if the server allows it) to reduce the number of operations, e.g. performing no encryption, using anonymous key exchange, etc. (this will not affect the proxy's inability to alter your data).
How an HTTP request works (if I have made a mistake, please say so):
The user types http://www.website.com into the browser.
The server sends back an HTML page with links to image, CSS, and JS files.
The browser reads the HTML and, wherever an image/CSS/JS file is included, sends an HTTP request to get that file.
When the browser sends a request to get a file, does it wait for the response, or does it send the request for the next file?
Thanks
Most browsers will have an internal queue of requests which are handled as follows:
Request the first item. If a fresh copy is in the cache, this will mean a request to the cache. If there is a stale copy with validation information (Last-Modified and/or ETag), this will be a conditional request (the server or proxy may return a 304, indicating that the stale copy is actually still fresh). Otherwise, it will be an unconditional request.
As rendering of the entity returned requires other entities, these will be put into a queue of needed requests.
Requests in the queue that have already been in that same queue (e.g. if a page uses the same image more than once) will have the same entity immediately used (hence if a URI returns a random image, but you use it more than once in the same page, you will get the same image used).
Requests will be processed immediately, so in the case of a web page, images, CSS, etc. will begin downloading before the HTML has finished rendering or, indeed, finished downloading.
Requests to the same domain with the same protocol (HTTP or HTTPS) will typically reuse a connection that is already open (and may be pipelined on it), rather than opening a new one.
Requests are throttled in two ways: A maximum number of simultaneous requests to the same domain, and a total maximum number of simultaneous requests.
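The de-duplication and throttling rules above can be modeled with a toy scheduler. This Python sketch is a deliberate simplification and not how any particular browser works: it groups requests into "rounds" instead of starting a new request as soon as a slot frees up, and the limits `max_per_host` and `max_total` are arbitrary example values.

```python
from collections import deque
from urllib.parse import urlsplit


def schedule(urls, max_per_host=2, max_total=3):
    """Group URLs into rounds of requests that may be in flight together.

    Duplicate URLs are requested only once. Each round respects a
    per-host limit and a total limit on simultaneous requests.
    """
    seen = set()
    queue = deque()
    for url in urls:
        if url not in seen:  # the same entity is reused, not re-requested
            seen.add(url)
            queue.append(url)

    rounds = []
    while queue:
        in_flight = []
        per_host = {}
        deferred = deque()
        while queue:
            url = queue.popleft()
            host = urlsplit(url).netloc
            if len(in_flight) < max_total and per_host.get(host, 0) < max_per_host:
                in_flight.append(url)
                per_host[host] = per_host.get(host, 0) + 1
            else:
                deferred.append(url)  # wait for a later round
        rounds.append(in_flight)
        queue = deferred
    return rounds


# Hypothetical resources referenced by one page.
resources = [
    "http://a.example/1.png",
    "http://a.example/2.png",
    "http://a.example/3.png",
    "http://a.example/1.png",  # used twice on the page
    "http://b.example/x.css",
]
rounds = schedule(resources)
for number, batch in enumerate(rounds, start=1):
    print(number, batch)
```

With the example limits, the third request to `a.example` has to wait for the second round even though the total limit has not been reached, while the request to `b.example` goes out immediately.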
The browser usually opens more than one socket to the target server, and thus receives content on more than one socket at the same time. This can be combined with HTTP pipelining (what you are asking about), where the browser sends multiple requests on the same socket without waiting for each of their responses.
From Wikipedia page:
HTTP pipelining is a technique in
which multiple HTTP requests are
written out to a single socket without
waiting for the corresponding
responses. Pipelining is only
supported in HTTP/1.1, not in 1.0.
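A minimal Python sketch of pipelining, assuming a throwaway local HTTP/1.1 server built on the standard library's `http.server`: two requests are written to one socket in a single send, and only then are the responses read. `EchoPathHandler` is a name invented for this example, and real servers or proxies may handle pipelined requests differently.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    """Test server that replies with the requested path as the body."""

    protocol_version = "HTTP/1.1"  # enables persistent connections

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), EchoPathHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

# Two requests, back to back. The second asks the server to close the
# connection afterwards so the client can simply read until EOF.
pipelined = (
    f"GET /first HTTP/1.1\r\nHost: {host}:{port}\r\n\r\n"
    f"GET /second HTTP/1.1\r\nHost: {host}:{port}\r\nConnection: close\r\n\r\n"
).encode("ascii")

with socket.create_connection((host, port), timeout=5) as sock:
    sock.sendall(pipelined)  # both requests are sent before reading anything
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
pipelined_raw = b"".join(chunks)

server.shutdown()
server.server_close()
print(pipelined_raw.decode("iso-8859-1"))
```

The responses come back on the same socket in the order the requests were sent, which is what lets the client match them up.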