As we all know, HTTP is a stateless protocol, which means it closes the connection after every transaction.
But that isn't enough for me to understand it. What confuses me is: does the TCP connection also end?
Since HTTP runs over TCP, it talks to other nodes through a TCP pipe. So does "stateless" mean that the TCP connection also ends?
And will it then make another TCP connection, using another TCP three-way handshake?
What makes the protocol stateless is that the server is not required to track state over multiple requests, not that it cannot do so if it wants to. This simplifies the contract between client and server, and in many cases minimizes the amount of data that needs to be transferred. The expression "the Internet is a stateless development environment" is often used. This simply means that the HTTP that is the backbone of the Web is unable to retain a memory of the identity of each client that connects to a Web site, and therefore treats each request for a Web page as a unique and independent connection, with no relationship whatsoever to the connections that preceded it.
As we all know, HTTP is a stateless protocol, which means it closes the connection after every transaction.
No, we don't all know that it means that at all: no, it doesn't mean that at all; and no, it doesn't do that at all.
What confuses me is: does the TCP connection also end?
The TCP connection ends when the HTTP connection ends.
Since HTTP runs over TCP, it talks to other nodes through a TCP pipe.
Correct.
So does "stateless" mean that the TCP connection also ends?
No. HTTP and TCP connections can be persistent, and they are by default since HTTP/1.1 (previously available via the so-called keep-alive feature). This has nothing to do with statelessness.
So will it make another TCP connection, using another TCP three-way handshake?
Yes, whenever required, but that isn't as often as you seem to think. You are conflating two different aspects of HTTP.
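To make the persistence point concrete, here is a minimal sketch using Python's standard library (example.com is just a placeholder host) that sends two requests over one HTTP/1.1 connection. The second request reuses the already-established TCP connection, so there is no second three-way handshake, as long as the server keeps the connection open.

```python
# Minimal sketch: two requests over one persistent HTTP/1.1 connection,
# using only the standard library. example.com is a placeholder host.
import http.client

conn = http.client.HTTPConnection("example.com")  # one TCP connection

conn.request("GET", "/")          # first request
resp1 = conn.getresponse()
resp1.read()                      # body must be consumed before reusing the connection
print(resp1.status, resp1.getheader("Connection"))

conn.request("GET", "/")          # second request reuses the same TCP connection,
resp2 = conn.getresponse()        # no new three-way handshake
resp2.read()
print(resp2.status)

conn.close()                      # this is when the TCP connection actually ends
```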
Related
I know that an HTTP request first makes a three-way handshake to establish the connection, followed by the request and response.
If a handshake is required for each subsequent request, the connection is called non-persistent.
The server can also choose to keep the connection alive, so that a handshake is not required until a timeout expires. This is called a persistent connection, and it saves the time the three-way handshake would otherwise take for each request.
My colleague mentions that HTTP supports both persistent and non-persistent connections. My understanding is that TCP makes the connection, so persistence is controlled by the TCP layer. Am I right?
Maybe not. HTTP is a higher layer than TCP. In HTTP 1.0 the connection is closed once the data has been transferred, while in HTTP 1.1 the connection is not closed by default; it is kept alive for further requests instead. This is controlled by the application layer, in other words, by HTTP itself.
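To show that this is controlled at the HTTP layer, here is a minimal sketch over a raw socket (example.com is a placeholder host; each call opens its own connection just to show the header exchange). The Connection header is what tells the server whether to keep the TCP connection open after the response.

```python
# Minimal sketch (placeholder host example.com): the Connection header is how
# HTTP itself signals whether the underlying TCP connection should stay open.
import socket

def fetch(connection_header: str) -> bytes:
    request = (
        "GET / HTTP/1.1\r\n"
        "Host: example.com\r\n"
        f"Connection: {connection_header}\r\n"
        "\r\n"
    ).encode()
    with socket.create_connection(("example.com", 80)) as sock:
        sock.sendall(request)
        # Read only the start of the response; enough to see the headers.
        return sock.recv(4096)

# With "close" the server ends the TCP connection after this response;
# with "keep-alive" (the HTTP/1.1 default) it leaves it open for more requests.
print(fetch("close").split(b"\r\n\r\n")[0].decode())
print(fetch("keep-alive").split(b"\r\n\r\n")[0].decode())
```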
One way HTTP can support a persistent connection is called Server-Sent Events.
An alternative to HTTP for persistent connections is WebSocket. WebSocket is a computer communications protocol providing full-duplex communication channels over a single TCP connection. WebSocket enables streams of messages on top of TCP and is distinct from HTTP.
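As a rough illustration of the Server-Sent Events idea mentioned above, here is a minimal sketch using only Python's standard library (the port and the number of events are arbitrary): the server sends a single, long-lived text/event-stream response and keeps appending events to it.

```python
# Minimal Server-Sent Events sketch: one long-lived HTTP response to which the
# server keeps appending "data:" events. Port and event count are arbitrary.
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

class SSEHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.send_header("Cache-Control", "no-cache")
        self.end_headers()
        # The response never really "finishes"; each event is just more body bytes.
        for i in range(5):
            self.wfile.write(f"data: tick {i}\n\n".encode())
            self.wfile.flush()
            time.sleep(1)

HTTPServer(("localhost", 8000), SSEHandler).serve_forever()
```

A browser would consume this stream with the EventSource API; the point here is only that the server keeps one response open rather than answering many separate requests.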
I've read some articles comparing the differences between WebSocket and other push methods like long polling. All the conclusions tend to be that WebSocket is better than HTTP, with lower latency for bidirectional server-client communication.
But if server push is not a must, for example if a client game program just makes a few queries to the server for some information, is it still better to use WebSocket than HTTP? More specifically, I have two doubts here:
1. In a single request-response procedure, which is more efficient? (In the above case I would establish a WebSocket connection for each query.)
2. Will the server capacity (the total number of clients the server can serve) be affected by an unnecessary long-lived connection if I keep a WebSocket connection open during the whole life cycle of the client?
Added question:
3. Suppose there is only one TCP connection between the server and the client; will the stability of the connection degrade as time goes on?
The basic thing behind both WebSocket and HTTP is the socket. In HTTP, a connection is opened for the request and closed after the response. The WebSocket concept is two-way (full-duplex) communication rather than a request-response cycle.
Answers to your questions:
1. Either you can use an HTTP server, or you can create a request-response design using WebSocket.
2. That's obvious. Each connection is a socket object, so server capacity will be affected if connections are not managed.
3. WebSocket uses a ping-pong mechanism to make sure that the client or the server is alive. For every ping request from one end, the other end is expected to reply with a pong response. This mechanism helps detect failures and hence maintain stability.
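For point 3, here is a minimal sketch of the ping-pong mechanism, assuming the third-party Python websockets package and a placeholder echo endpoint (the URL is hypothetical):

```python
# Minimal sketch of the WebSocket ping/pong mechanism, assuming the third-party
# "websockets" package; wss://echo.example.com is a placeholder endpoint.
import asyncio
import websockets

async def main():
    async with websockets.connect("wss://echo.example.com") as ws:
        await ws.send("hello")            # full-duplex: send at any time
        print(await ws.recv())            # ...and receive at any time

        pong_waiter = await ws.ping()     # send a ping frame
        await pong_waiter                 # resolves when the pong comes back,
        print("peer is alive")            # confirming the connection is healthy

asyncio.run(main())
```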
I am trying to understand what HTTP pipelining and HTTP keep-alive connections are, and trying to relate these two topics to the Server-Sent Events technology.
As far as I understand,
An HTTP keep-alive connection is the default way of using TCP in HTTP 1.1: the TCP connection, once established, is used for sending several HTTP requests one after another.
HTTP pipelining is the capability of the client to send requests to the server while responses to previous requests have not yet been received, using the same TCP connection. It is generally not used as a default in browsers.
My questions:
1) If it is possible to send several requests to the server one after another over one TCP connection, how can the client distinguish between the responses? I guess the client relies on the server sending responses in FIFO order?
2) Why shouldn't non-idempotent requests, such as POST requests, be pipelined (according to Wikipedia)?
3) What about the limitations of the web server: is the number of possible open TCP connections limited? If so, then if some number of clients hold keep-alive connections, other clients cannot establish connections, and that could become a problem, right?
4) Server-Sent Events use a keep-alive connection but, as far as I understand, SSE does not use pipelining. Instead it either manages to deliver several responses to one request, or it sends another request when the next response with an event arrives. Which guess is correct?
5) Does one TCP connection mean one socket? Does one socket mean one TCP connection? Does closing/opening a socket mean closing/opening a TCP connection?
1) Yes, FIFO. TCP/IP guarantees that data is delivered in order, so responses can't arrive in a different order (if the server/proxy is buggy and sends responses in the wrong order, then you're totally screwed).
2) I don't recall any reason in the HTTP spec. It may just be caution, because pipelining is poorly implemented in some proxies/servers.
3) The HTTP spec suggests 2 connections per server; browsers have settled on 6-8 connections per server, but there is no fixed limit. Running out of connections is a real problem for Apache, and for high-load situations it's recommended to disable KeepAlive in Apache and use a proxy (e.g. HAProxy) that can cheaply provide keep-alive functionality to clients.
The benefit of a proxy is that one proxy can distribute connections to several servers (helps scaling), or can modify the traffic (e.g. gzip compress everything even if server-side-software didn't).
4) SSE doesn't rely on Keep-Alive, and it's not using multiple responses. It's a single response that takes forever to "download", so pipelining and keep-alive are irrelevant for SSE. The TCP/IP connection cannot return any more responses while an SSE response is being sent.
SSE will keep the server busy as long as the connection is open (so typically all the time, for every user). That's why it's best to use SSE with Node.js/Tornado, which can handle hundreds of thousands of connections, rather than PHP/Apache, which is designed for a few connections at a time.
5) Sockets are the programming interface for TCP/IP connections. Generally yes, one socket is one connection.
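To make the FIFO point in 1) concrete, here is a minimal sketch of pipelining over a raw socket (example.com is a placeholder host, and many servers/proxies handle pipelining poorly, which is exactly why browsers keep it off by default):

```python
# Minimal pipelining sketch: two requests written back to back on one TCP
# connection, responses read in FIFO order. example.com is a placeholder host.
import socket

def request(path: str) -> bytes:
    return f"GET {path} HTTP/1.1\r\nHost: example.com\r\n\r\n".encode()

with socket.create_connection(("example.com", 80), timeout=5) as sock:
    # Send the second request without waiting for the first response.
    sock.sendall(request("/") + request("/"))

    # Responses arrive on the same connection, strictly in the order the
    # requests were sent; that ordering is the only way to match them up.
    try:
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            print(chunk.decode(errors="replace"), end="")
    except socket.timeout:
        pass  # no more data within the timeout; good enough for a sketch
```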
I just went through the specification of HTTP 1.1 at http://www.w3.org/Protocols/rfc2616/rfc2616.html and came across a section about connections, http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8, that says
" A significant difference between HTTP/1.1 and earlier versions of HTTP is that persistent connections are the default behavior of any HTTP connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server.
Persistent connections provide a mechanism by which a client and a server can signal the close of a TCP connection. This signaling takes place using the Connection header field (section 14.10). Once a close has been signaled, the client MUST NOT send any more requests on that connection. "
Then I also went through a section on HTTP state management at https://www.rfc-editor.org/rfc/rfc2965, which says in its section 2 that
"Currently, HTTP servers respond to each client request without relating that request to previous or subsequent requests;"
A section about the need for persistent connections in RFC 2616 also said that, prior to persistent connections, every time a client wished to fetch a URL it had to establish a new TCP connection for each new request.
Now my question is: if we have persistent connections in HTTP/1.1, then as mentioned above a client does not need to make a new connection for every request; it can send multiple requests over the same connection. So if the server knows that every subsequent request is coming over the same connection, wouldn't it be obvious that the requests are from the same client? Wouldn't that be enough to maintain state, and enough for the server to understand that a request came from the same client? In that case, why is a separate state-management mechanism required at all?
Basically, yes, it would make sense, but HTTP persistent connections are used to eliminate the administrative TCP/IP overhead of connection handling (connect/disconnect/reconnect, etc.). They are not meant to say anything about the state of the data moving across the connection, which is what you're talking about.
No. For instance, there might be an intermediary (such as a proxy or a reverse proxy) in the request path that aggregates requests from multiple TCP connections.
See http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-21.html#intermediaries.
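To see why the connection itself is not enough, here is a minimal sketch (example.com is a placeholder host; a real server would issue its own Set-Cookie value) in which the client stays "the same client" across two completely separate TCP connections only because it presents the cookie again:

```python
# Minimal sketch: HTTP state lives in headers (here, cookies), not in the TCP
# connection. example.com is a placeholder; a real server would set the cookie.
import http.client

# First connection: the server identifies the client via a Set-Cookie header.
conn1 = http.client.HTTPConnection("example.com")
conn1.request("GET", "/")
session = conn1.getresponse().getheader("Set-Cookie")  # e.g. "sid=abc123; ..."
conn1.close()  # the TCP connection is gone

# Second, completely new TCP connection: the client is recognized only because
# it sends the cookie again, not because of the connection it arrives on.
conn2 = http.client.HTTPConnection("example.com")
headers = {"Cookie": session.split(";")[0]} if session else {}
conn2.request("GET", "/", headers=headers)
print(conn2.getresponse().status)
conn2.close()
```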
From this article on Wikipedia:
"Keepalive messages were not officially supported in HTTP 1.0. In HTTP 1.1 all connections are considered persistent, unless declared otherwise."
1. Does this mean that using this mechanism I can actually simulate a TCP socket connection?
2. Using this, can I make a server "push" data to a client?
3. Are all HTTP connections, even the one I am using to connect to Stack Overflow, "HTTP persistent"?
4. Does the COMET technology of server push use this mechanism of HTTP persistent connection to push data to clients?
Does this mean that using this mechanism I can actually simulate a TCP socket connection?
Not really, sockets have MANY more features and flexibility.
Using this can I make a Server "push" data to a client?
Not directly, it's still a request/response protocol; the persistent connection just means the client can use the same underlying socket to send multiple requests and receive the respective responses.
Are all HTTP connections, even the one I am using to connect to Stack Overflow, "HTTP persistent"?
Unless your browser (or a peculiar server) says otherwise, yes.
Does the COMET technology of server push use this mechanism of HTTP persistent connection to push data to clients?
Kinda (for streaming, at least), but with a lot of whipped cream on top. There are other Comet implementation approaches, such as hidden iframes and AJAX long polling, that may not require persistent connections (which give some firewalls &c the fits anyway;-).
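For comparison, here is what the AJAX long-polling variant looks like from the client side, as a minimal sketch using only the standard library (the /events URL is hypothetical): each request is held open by the server until there is something to report, and the client immediately re-polls.

```python
# Minimal sketch of AJAX-style long polling from the client side, using only
# the standard library. The http://example.com/events URL is hypothetical.
import json
import urllib.request

def poll_forever(url: str) -> None:
    while True:
        # The server holds this request open until it has something to say
        # (or until it times out), then the client immediately re-polls.
        with urllib.request.urlopen(url, timeout=60) as resp:
            event = json.loads(resp.read().decode())
            print("got event:", event)

poll_forever("http://example.com/events")
```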
Actually, the HTTP server can "push" data to a connected HTTP client without the client requesting it. See "HTTP server push" at http://en.wikipedia.org/wiki/Push_technology. However, it does not seem to be commonly implemented.