stateless protocol and stateful protocol - http

How to understand stateless protocol and stateful protocol? HTTP is a stateless protocol and FTP is a stateful protocol. For the web applications requiring a lot of interactions, the underlying protocol should be stateful ones. Is my understanding right?

HTTP is a stateless protocol, in other word the server will forget everything related to client/browser state. Although web applications have made it virtually look like stateful.
A stateless protocol can be forced to behave as if it were stateful. This can be accomplished if the server sends the state to the client, and if the client to sends it back again to the server, every time.
There are three ways this may be accomplished in HTTP:
a) One is cookies, in which case the state is sent and returned in HTTP headers.
b) The second is URL extension, in which case the state is sent as part of the URL as request.
c) The third is "hidden form fields", in which the state is sent to the client as part of the response, and returned to the server as part of a form's hidden data
SCALABILITY AND HIGH AVAILABILITY
One of the major reasons why HTTP scales so well is its Statelessness. Stateless protocol eases the replication concerns, as the state itself doesn't need to be stored on the server.
Stateful protocols are logically heavy to implement in Internet reliably. Stateless servers are also easily scalable, while for stateful servers scalablity is problematic. Stateless request can be sent to any node, at any time, while with Stateful this is not a case.
HTTP as Stateless protocol increases availability for stateless web applications, which otherwise would be difficult or impossible to implement. If there is connection lost, there is no state that is lost, simple request resend will resolve the problem. Stateless requests are also cacheable.
see more here

Since you're asking about a Web application, the protocol will always be stateless -- the protocol for the Web is http (or https), and that's all she wrote.
I think what you're thinking of is providing a state mechanism in your Web application itself. The typical approach to this is that you create a unique identifier for the user's session in your Web application (a sessionID of one form or another is the common practice) which is handed back and forth between browser and server. That's typically done in a cookie, though it can be done, with a bit more hassle for you depending on your platform/framework, on the URL as well.
Your server-side code stores stateful information (again, typically called the user's session) however it wants to using the sessionID to look it up. The http traffic simply hands back the sessionID. As long as that identifier is there, each http transaction is completely independent of all others, hence the protocol traffic itself is stateless.

HTTP is a stateless protocol. All the web-based applications are also stateless.
When a request is send to the server, a connection is established between client and server. The server receives the request, processes the request and sends back the response and then, the connection will be closed.
If another request will be sent, after that, it will be treated as a new request and a new connection will be established.
In order to make HTTP stateful, we use session management techniques.
So that, it uses the data coming from previous request while processing present request i.e, it uses the same connection for a series of client server interactions.
The session management techniques are:
hidden form field
cookie
session
URL-rewriting

Anything that forgets whatever it did in past is stateless, such as http
Anything that can keep the history is statefull, such as database
Http is a stateless protocol, that's why it forgets the user information.
We make http as statefull protocol using jsonWebToken(JWT) i.e. on each request going to server, server will first verify the user using JWT.

Your question is spot on, and yes, it would be great if your web transactions with your bank were done over a stateful connection. Alas, HTTP is stateless due to a quirky bug in FTP and a 12 socket limit in the partial socket table in BSD of 1989. Marcus Ranum explained it all here.
So HTTP throws away the state it inherits from TCP and has to recreate state at the application layer in the form of cookies. Crappy internet security is the result.
The Seif project proposes to fix all that using "secure JSON over TCP". DNS and certificate authorities are not required. The protocol and seifnode.js are finished and on github with an MIT license.

HTTP doesn't 'inherit' from TCP, but rather uses it for a transport. HTTP uses TCP for a stateful connection, but then disconnects. Later it will connect again, if needed. So, while you browse through a web site you create many different connections. Each one of those connections is stateful, but the conversation as a whole is not, since you are dropping the connection with every conversation.
From this link

Basically yes, but you have no choice but use HTTP which is where websites are served in. So you have to deal with compromises to make HTTP stateful, aka session management. Possibilities are basically passing on a session id through each call in the URL so you know when you're talking to someone you've talked about before, or via cookies, which achieve the same goal without cluttering the url. However, most modern web development languages take care of that for you; if you google for the language of your choice + "session management" you should get some ideas of how it's done.

Related

Kubernetes sticky session / load balance by header value

I'm working on a project where I want to use Kubernetes and Docker. The microservice I'm about to implement must create a permanent HTTP/2 connection to another service (provided by others and I can't modify anything in that service) pro user / client in order to send asynchronous and cloud initiated messages to that user. Also, each following request from that client must use the same connection.
Obviously that is a challenge in terms of scalability, because every request from a client must be routed to the same instance of my microservice, which created the permanent connection to the other service. What makes things worse is the fact that my clients can change the IPs and that they can't use cookies. But what they can do is to send a custom header value which identifies them.
I thought about HAProxy and nginx, but can't find an option in either of them to load balance requests by a header value. Is there really no way to do that? How would you approach that issue? Any ideas?
Thanks!

Comparison between HTTP and RPC

RPC protocol uses TCP as an underlying protocol and HTTP again uses TCP as an underlying protocol. So why is HTTP is widely accepted?
Why does SOAP use HTTP as an underlying protocol - why not RPC?
Remote Procedure Calls (RPC) is not a protocol, it's a principle that is also used in SOAP.
SOAP is an application protocol that uses HTTP for transport (so it won't have to think about encoding, message boundaries and so on). One of the reasons to use SOAP over HTTP is that for HTTP you usually don't need firewall rules and that the HTTP infrastructure is mature and commonly rolled out.
RPC does not require HTTP. Basically, RPC describes any mechanism that is suitable to invoke some piece of code remotely. The transport mechanism used to perform the RPC could be SOAP over HTTP. It could also be a REST call returning some JSON data over HTTP.
SOAP can also be used via Mails, and AFAIK (not sure here) the BizTalk Server should support this scenario. But even something exotical like trying SOAP over Avian Carriers can also be considered an RPC, although the latency of the latter may not be sufficient for real-world applications.
Think of an RPC as sending somehow some kind of message to a destination, in order to initiate a specific action and (optionally) getting some information back after the action has been completed. What prticular technology you choose to transmit these messages does not really matter.

what is the benefit of using http hijacker

Go http pkg provide a Hijacker interface, can anyone tell when should I use it.
I check the comment, after a Hijack call lets the caller take over the connection, the HTTP server library will not do anything else with the connection.
I understand it as it's used to support both http request and common tcp interactive within one port. Is it right? Does it has any other benefits.
It means that you take over the control of TCP connection.
TCP is a generic transport protocol, whereas HTTP is an application protocol on top of TCP. The OSI seven layer model describes TCP as layer 4 and HTTP is layer 7.
If you need to implement a different application protocol, this is one use-case for hijacking.
Or if you need to do something specialised with HTTP, like preventing keep-alive connections, that is another use-case.
An example for an alternative web application protocol is Google's SPDY. It's also a good reason why you might hijack an existing HTTP connection, rather than create a TCP connection directly. For SPDY, a browser would first make an HTTP request that included 'accept' headers indicating that it is also able to understand SPDY. So now you could hijack the connection and implement SPDY instead of HTTP.

HTTP and Sessions

I just went through the specification of http 1.1 at http://www.w3.org/Protocols/rfc2616/rfc2616.html and came across a section about connections http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8 that says
" A significant difference between HTTP/1.1 and earlier versions of HTTP is that persistent connections are the default behavior of any HTTP connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server.
Persistent connections provide a mechanism by which a client and a server can signal the close of a TCP connection. This signaling takes place using the Connection header field (section 14.10). Once a close has been signaled, the client MUST NOT send any more requests on that connection. "
Then I also went through a section on http state management at https://www.rfc-editor.org/rfc/rfc2965 that says in its section 2 that
"Currently, HTTP servers respond to each client request without relating that request to previous or subsequent requests;"
A section about the need to have persistent connections in the RFC 2616 also said that prior to persistent connections every time a client wished to fetch a url it had to establish a new TCP connection for each and every new request.
Now my question is, if we have persistent connections in http/1.1 then as mentioned above a client does not need to make a new connection for every new request. It can send multiple requests over the same connection. So if the server knows that every subsequent request is coming over the same connection, would it not be obvious that the request is from the same client? And hence would this just not suffice to maintain the state and would this just nit be enough for the server to understand that the request was from the same client ? In this case then why is a separate state management mechanism required at all ?
Basically, yes, it would make sense, but HTTP persistent connections are used to eliminate administrative TCP/IP overhead of connection handling (e.g. connect/disconnect/reconnect, etc.). It is not meant to say anything about the state of the data moving across the connection, which is what you're talking about.
No. For instance, there might an intermediate (such as a proxy or a reverse proxy) in the request path that aggregates requests from multiple TCP connections.
See http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-21.html#intermediaries.

HTTP client acting as a pseudo-server

Let's say I am going to deploy a server application that's likely to be placed behind a NAT/firewall and I don't want to ask users to tweak their NAT port mapping. In other words, connections to the server are impossible, but my app is a server application by nature, i.e. it sends back objects per URI.
Now, I'm thinking about initiating connections from the server periodically to see what requests are there to be responded to. I'm going to use HTTP via port 80 as something that would likely be working through NAT/firewall from virtually anywhere.
The question is, are there any standard considerations and common practices of implementing a client that can act as a server at the application level, specifically using HTTP? Any special HTTP headers? Design patterns?
E.g. I am thinking about the following scheme:
The client (which is my logical server) sends a dummy HTTP request to the server
The server responds back with non-standard headers X-Request-URI:, X-Host:, X-If-Modified-Since: etc, in other words, request headers wrapped into X-xxx as they are not standard in this situation; also requests to keep the connection alive
The client responds with a POST request that sends the requested object; again, uses wrapped headers (e.g. X-Status:, etc)
Unless there is a more "standard" way of doing something like this, do you think my approach is plausible?
Edit: an interesting discussion took place on reddit here
I've done something similar. This is very common. Client initiate the connection to the Server and keep the connection ALIVE. If the session is shut-down, client would re-initiate. When the session is up, Server can push anything to the client since it's client initiated.

Resources