When does the browser execute the CONNECT method? - http

In http we have a method called CONNECT. I am trying to understand when it is used by the browser.
Is it used prior to every request or prior to every https request only?
Further, as we know, HTTP keeps a persistent connection open to the server. This connection is closed after a certain period of inactivity (a timeout). Does this influence when the browser uses the CONNECT method? That is, is CONNECT used once prior to each persistent connection, or prior to each request within the persistent connection?

Related

HTTP server connection management

I've been playing around with an http server I'm creating using boost::asio (hence the c++ tag).
With HTTP/1.1 the default is to keep the client's connection open for more than one request/response.
My question is:
How long should I keep a client connection open? Should I use a deadline_timer which closes the connection after some arbitrary amount of time?
Or, should I just wait for the underlying socket receive timeout to expire? At which point my receive handler will be invoked with an EOF error prompting me to remove the client connection from my list of connections.
Furthermore, if this is specified in an RFC document, which one?

Websocket - Should client send ping frames?

As the title suggests, should Ping frames only be sent from the server, or is it better to have both endpoints send them? As mentioned in the WebSocket RFC:
NOTE: A Ping frame may serve either as a keepalive...
So by having one endpoint send ping requests, the connection should stay open, right?
The second part of above line is this:
or as a means to verify that the remote endpoint is still responsive.
I'm new to the concept of websockets but if the connection closes from the server won't the client be notified?
Consider the case where the server just goes away, maybe it crashes. Who or what will notify the client of this? Or say a network link close to the server is down for so long that by the time it comes back up, the server has totally forgotten about this client. Who or what would tell the client?
There are three possibilities:
The client does not need to detect loss of the connection. In this case, there's nothing special you need to do.
The client has some way to detect loss of the connection already. For example, if the connection is idle for some period of time, the client could send an application-level query and timeout if it gets no response or if the query fails.
The client needs to detect loss of the connection but has no existing way to do this. In this case, using pings makes sense.
In a typical query/response protocol, the client usually doesn't need to ping the server because there's some query it can send that has the same effect. Unless the protocol layered above websocket supports some way for the server to query the client, the server often has only two choices: use pings to detect lost connections or disconnect idle clients.
Both variants can be implemented. For example, if the server sends pings to the client, the client can detect that the server has gone away by keeping a deadline timer that is reset every time a ping is received. If the timer reaches its deadline, the server is presumed to be disconnected.
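The deadline-timer idea above can be sketched roughly as follows. This is only an illustration, not tied to any particular WebSocket library: the class name PingWatchdog, the 30-second timeout and the injected fake clock are all assumptions made for the example.

```python
import time


class PingWatchdog:
    """Remembers when the last ping arrived and reports a missed deadline."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_ping = clock()

    def on_ping(self):
        # Call this whenever a Ping frame arrives: it restarts the deadline.
        self._last_ping = self._clock()

    def expired(self):
        # True once no ping has been seen for longer than timeout_s.
        return self._clock() - self._last_ping > self.timeout_s


# A fake clock (hypothetical, for the demo only) keeps the example deterministic.
now = [0.0]
watchdog = PingWatchdog(timeout_s=30.0, clock=lambda: now[0])

now[0] = 20.0
watchdog.on_ping()                     # ping at t=20 moves the deadline to t=50
now[0] = 45.0
alive_at_45 = not watchdog.expired()   # 25 s since the last ping
now[0] = 51.0
dead_at_51 = watchdog.expired()        # 31 s since the last ping
```

In a real client the expired() check would run from a periodic timer, and an expiry would lead to closing the socket and reconnecting.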

What is the relationship between HTTP connection and a request?

While configuring nginx, I found two modules: ngx_http_limit_conn_module and ngx_http_limit_req_module.
One limits connections per defined key, and the other limits requests.
My question is: what is the relationship (and the difference) between an HTTP connection and a request?
It seems that multiple HTTP requests can share one HTTP connection. What is the principle behind this?
Basically, connections are established in order to make requests over them. So, for instance, an endpoint may accept 5 connections per hour for a given key, such as an IP address. That does not mean only 5 requests can be made; many more are possible as long as the connection is not closed after each request (since HTTP/1.1 it is kept alive by default).
E.g. an endpoint accepts 5 connections and 10 requests from a given IP address. If a new connection is established for every request, only 5 requests can be made overall. If the connection is kept alive, a single client may make all 10 requests. If there are 5 clients and each establishes a connection and keeps it alive, each client can make approximately 2 requests; however, one client can make all of the requests if it is fast enough.
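The arithmetic in this example can be written down as a small sketch. It models only the simplified counting used in this answer (a budget of connections and a budget of requests); it is not how the nginx modules are actually implemented, and the function name max_requests and its parameters are made up for illustration.

```python
def max_requests(conn_limit, req_limit, reqs_per_conn=None):
    """Upper bound on requests that fit under both limits.

    reqs_per_conn=None models keep-alive with no cap on requests per connection.
    """
    if conn_limit <= 0:
        return 0
    if reqs_per_conn is None:
        return req_limit
    return min(req_limit, conn_limit * reqs_per_conn)


one_per_connection = max_requests(5, 10, reqs_per_conn=1)  # new connection per request
with_keep_alive = max_requests(5, 10)                      # connections are reused
two_per_connection = max_requests(5, 10, reqs_per_conn=2)  # 5 clients, 2 requests each
```

With one request per connection the connection limit is what bites; with keep-alive the request limit becomes the only constraint.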
HTTP connections - the client and the server introduce themselves.
HTTP requests - the client asks the server for something.
Making a connection with the server involves the TCP handshake; it is basically creating a socket connection with the server. To make an HTTP request you must already have established a connection with the server. Once the connection is established, you can make multiple requests over it (HTTP/1.0 by default allows one request per connection; HTTP/1.1 keeps the connection alive by default). Most web pages need multiple resources from the server (for example, 100 photos to show on the screen). It is less of a burden on the server if we keep the connection open and request those 100 images over it, with no need to go through connection establishment 100 times. That is why HTTP/1.1 made keep-alive the default.
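To see several requests share one connection, here is a minimal sketch using only the Python standard library. It starts a throwaway local server and records the client's source port for each request; the same port on every request means the same TCP connection was reused. The handler name, the paths /a, /b, /c and the use of the source port as a stand-in for "connection identity" are assumptions made for the demo.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # lets http.server keep the connection alive
    seen_ports = []                 # client source port observed for each request

    def do_GET(self):
        Handler.seen_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: let the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
for path in ("/a", "/b", "/c"):
    conn.request("GET", path)
    conn.getresponse().read()   # drain the body before reusing the connection
conn.close()
server.shutdown()
server.server_close()

requests_made = len(Handler.seen_ports)
connections_used = len(set(Handler.seen_ports))
print(requests_made, "requests over", connections_used, "connection(s)")
```

Opening a fresh HTTPConnection inside the loop instead would show a different source port for each request, which is the one-connection-per-request pattern.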
A request is a functional execution: "Do something for me, and return the result back to me" - which is made by the client over a channel that the server is listening on, the "connection". Think of it as making a phone call to a restaurant. When the restaurant picks up the phone, you have an established "connection" - and now can place multiple requests over the same connection. The restaurant can handle multiple, simultaneous customer calls, if it has multiple phone lines open to receive the calls. This is your "connection pool" - at any point in time, you can only have as many simultaneous open connections (max) as the size of your connection pool. The number of requests however will vary. Some client may make 3 requests, and hang up, while other client may make 10 requests before hanging up.
The size of your connection pool determines concurrency - how many simultaneous clients can you talk to at any point in time? The length of those conversations will be use case specific.
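The phone-line analogy can be sketched with a semaphore standing in for the connection pool. This is a toy model under stated assumptions: POOL_SIZE, the per-client request counts and the sleep that stands in for request handling are all invented for the example.

```python
import threading
import time

POOL_SIZE = 2                        # hypothetical number of "phone lines"
pool = threading.BoundedSemaphore(POOL_SIZE)
lock = threading.Lock()
active = 0                           # connections open right now
peak = 0                             # most connections ever open at once
served = 0                           # total requests handled


def client(num_requests):
    global active, peak, served
    with pool:                       # wait for a free line, i.e. a connection slot
        with lock:
            active += 1
            peak = max(peak, active)
        for _ in range(num_requests):
            time.sleep(0.01)         # stand-in for handling one request
            with lock:
                served += 1
        with lock:
            active -= 1              # hang up, freeing the line


threads = [threading.Thread(target=client, args=(n,)) for n in (3, 10, 1, 5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("served", served, "requests, peak concurrency", peak)
```

The pool size caps how many clients are connected at once, while the number of requests each one makes over its connection is independent of that cap.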

Proxy's Response to Asynchronous Close Events

Let's say you have an HTTP/1.1 proxy sitting between a client and a server. If connections are persistent, there is the possibility that the server will close the connection, but the client will send a request before being notified of the closure. What is the proxy's correct response to this? Does it send an HTTP error to the client or does it try to reconnect to the server?
The proxy should mimic the behaviour of the server, and close the connection - irrespective of whether there is a request in flight.
Automatically reconnecting can create unwanted side effects. The client would assume that it still has the same persistent connection and can, for example, skip authentication headers, cookies etc.
The other alternative - returning a 5xx error would also be wrong, since the client can also make incorrect assumptions about server state.
Mimicking the server's behaviour is the safest and most consistent option.

HTTP and Sessions

I just went through the specification of http 1.1 at http://www.w3.org/Protocols/rfc2616/rfc2616.html and came across a section about connections http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8 that says
" A significant difference between HTTP/1.1 and earlier versions of HTTP is that persistent connections are the default behavior of any HTTP connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server.
Persistent connections provide a mechanism by which a client and a server can signal the close of a TCP connection. This signaling takes place using the Connection header field (section 14.10). Once a close has been signaled, the client MUST NOT send any more requests on that connection. "
Then I also went through a section on http state management at https://www.rfc-editor.org/rfc/rfc2965 that says in its section 2 that
"Currently, HTTP servers respond to each client request without relating that request to previous or subsequent requests;"
A section about the need to have persistent connections in the RFC 2616 also said that prior to persistent connections every time a client wished to fetch a url it had to establish a new TCP connection for each and every new request.
Now my question is: if we have persistent connections in HTTP/1.1 then, as mentioned above, a client does not need to make a new connection for every new request. It can send multiple requests over the same connection. So if the server knows that every subsequent request is coming over the same connection, would it not be obvious that the request is from the same client? Would this not suffice to maintain state, and would it not be enough for the server to understand that the request came from the same client? In that case, why is a separate state management mechanism required at all?
Basically, yes, it would make sense, but HTTP persistent connections are used to eliminate the administrative TCP/IP overhead of connection handling (e.g. connect/disconnect/reconnect, etc.). They are not meant to say anything about the state of the data moving across the connection, which is what you're talking about.
No. For instance, there might be an intermediary (such as a proxy or a reverse proxy) in the request path that aggregates requests from multiple clients onto a single TCP connection to the server.
See http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-21.html#intermediaries.

Resources