Is TCP protocol stateless? - http

HTTP,the protocol residing over TCP protocol is stateless and also the IP protocol is stateless
But how can we conclude that TCP is stateless or not?

You can't assume that any stacked protocol is stateful or stateless just looking at the other protocols on the stack. Stateful protocols can be built on top of stateless protocols and stateless protocols can be built on top of stateful protocols. One of the points of a layered network model is that the kind of relationship you're looking for (statefulness of any given protocol in function of the protocols it's used in conjunction with) does not exist.
The TCP protocol is a stateful protocol because of what it is, not because it is used over IP or because HTTP is built on top of it. TCP maintains state in the form of a window size (endpoints tell each other how much data they're ready to receive) and packet order (endpoints must confirm to each other when they receive a packet from the other). This state (how much bytes the other guy can receive, and whether or not he did receive the last packet) allows TCP to be reliable even over inherently non-reliable protocols. Therefore, TCP is a stateful protocol because it needs state to be useful.
I would also like to point out that while HTTP and HTTPS (which is just HTTP over SSL/TLS, really) are essentially stateless (each request is a valid standalone request per the protocol), applications built on top of HTTP and HTTPS aren't necessarily stateless. For instance, a website can require you to visit a login page before sending a message. Even though the request where the client sends a message is a valid standalone request, the application will not accept it unless the client authenticated herself before. This means that the application implements state over HTTP.
On a side note, the statefulness of HTTP can be somewhat confusing, as several applications (on a clearly different OSI layer) will leak their state to HTTP. For instance, if a user tries to view a blog post that doesn't exist, the blog application might send back a response with the 404 status code, even though the file handling the blog post search itself was found.

tl;dr TCP is stateful.
While Zneak points out that you can use any communication for stateful purposes, the ACTUAL question being asked is whether the protocol itself is stateful.
Wikipedia:
In computing, a stateless protocol is a communications protocol that
treats each request as an independent transaction that is unrelated to
any previous request so that the communication consists of independent
pairs of requests and responses. A stateless protocol does not require the server to retain
session information or status about each communications partner for
the duration of multiple requests. In contrast, a protocol which
requires keeping of the internal state on the server is known as a
stateful protocol.
TCP's "request" (unit of communication) is a TCP packet.
TCP a stateful protocol since parties must remember what state the other is in, and what bytes the other has. Hence the TCP state diagram.
In contrast, UDP is a stateless protocol. Neither endpoint retains any notion of state. (Though as always, the encapsulated information could be used for stateful purposes.)

Here is a nice explanation :
Consider the phone service to be TCP and consider your relationship with distant family members to be HTTP. You will contact them with the phone service. Each call to them would be a stateful TCP connection. However, you don't constantly stay on the phone with them, as you will disconnect and call them back again at a later time. You would certainly expect them to remember what you talked about on the last call. HTTP in itself does not do that, but it is rather a function of the web server that maintains the state of the overall converstation.

To properly answer the question, we need the concept of a stateless protocol used to manage external stateful resources. Section 2.4 of http://laurel.datsi.fi.upm.es/_media/docencia/asignaturas/ws-modelingresources.pdf is about a service that implements such a protocol:
A Service that acts upon stateful resources may be described
“stateless” if it delegates responsibility for the management of the
state to another component such as a database or file system. ... A
consequence of statelessness is that any dynamic state needed for a
given message-exchange execution must be:
provided explicitly within the request message, whether directly by-value or indirectly by-reference, and/or
maintained implicitly within other system components with which the Web service can interact.
So, the http protocol is stateless, if we consider that the files that are served, the database that is accessed, etc. are separated from the implementation of the protocol itself. A service (which implements a protocol) that is stateless in relation with both sides taken together might not appear stateless on each side, because the other side can carry a state.

Related

HTTP is stateless while TCP is stateful?

I was wondering how HTTP is stateless while its built over TCP which is stateful ?
I'm still begineer backend engineer and I dont have solid understanding of this topics.
I tried to search for explanations but I'm not sure if this question has been asked before.
There are transport layer (TCP) states and application layer (HTTP) states.
When talking about TCP being stateful one is talking about transport layer states. TCP is stateful because a transport layer state consisting of current sequence numbers etc is needed to provide the reliability guarantees of TCP, i.e. ordering of packets, removing of duplicates, acknowledgements and retransmission. Thus a state spanning over multiple "units" (packets) is needed.
In HTTP this unit is the HTTP message, i.e. the HTTP request from the client and the HTTP response from the server. When talking about HTTP being stateless it means that there is no state inside the HTTP protocol needed which spans multiple such messages: a response strictly follows a request and there is no state covering multiple requests or responses - all requests are independent from each other from the perspective of HTTP.
Within web applications itself though some state usually is needed, like for a user session. These states are implemented on top of HTTP, usually with cookies shared between the requests. These states are then independent from a specific HTTP request and also independent from the underlying TCP connection.

Understanding the application layer in the osi model

I'm aware there are a lot of protocols at the application layer,
The question is more about when it is ok to not follow any of them,
Lets say i have a client and a server and the client app should send some data to that server, for instance, some statistics about a person using the app,
Now, for a good programming practice, is it ok to just open a tcp socket and send the data as is without the overhead of following a protocol or am i breaking the osi model and should i follow one of the protocols at the application layer?
Am i reinventing the wheel here or is it a practical solution?
There's always an application layer protocol. If your concept is to transmit some statistics to a server ''on a certain TCP or UDP port as a plain, decimal number'' then that is your (implicit) application protocol. The protocol enables the server to receive data and assigns a meaning to the number.
The OSI model is a model, not a law. In your application layer protocol you can do whatever you want.
However, it may be useful to anticipate future expansion of the service, so that you could e.g. transmit value_a:data\0value_b:data in one stream/datagram without having to keep client and server versions in perfect sync (with the server not expecting all values and just ignoring unknown values). Of course, you can also use a different server port each time instead - your choice.

how does ASP.net "HttpResponse.IsClientConnected" work?

if HTTP is connection-less, how does ASP.net response property, HttpResponse.IsClientConnected detect client is connected or not?
HTTP is not "connection-less" - you still need a connection to receive data from the server; more correctly, HTTP is stateless. Applications running on-top of HTTP will most likely actually be stateful, but HTTP itself is not.
"Connectionless" can also refer to a system using UDP as the transport instead of TCP. HTTP primarily runs over TCP and pretty much every real webserver expects, and returns, TCP messages instead of UDP. You might see HTTP-like traffic in UDP-based protocols like UPnP, but because you want your webpage to be delivered reliably, TCP will always be used instead of UDP.
As for IsClientConnected, when you access that property it calls into the current HttpWorkerRequest which is an abstract class implemented by the current host environment.
IIS7+ implements it such that if it previously received a TCP disconnect message (that sets a field) the method would now return false.
The ISAPI implementation (IIS 6) instead calls into a function within IIS that informs the caller if the TCP client on the current request/response context is still connected, though presumably it works on the same basis: when the webserver receives a TCP timeout, disconnect or connection-reset message it sets a flag and lets execution continue instead of terminating the response-generator thread.
Here's the relevant source code:
HttpResponse.IsClientConnected: http://referencesource.microsoft.com/#System.Web/HttpResponse.cs,80335a4fb70ac25f
IIS7WorkerRequest.IsClientConnected: http://referencesource.microsoft.com/#System.Web/Hosting/IIS7WorkerRequest.cs,1aed87249b1e3ac9
ISAPIWorkerRequest.IsClientConnected: http://referencesource.microsoft.com/#System.Web/Hosting/ISAPIWorkerRequest.cs,f3e25666672e90e8
It all starts with an HTTP request. Inside it, you can, for example, spawn worker threads, that can outlive the request itself. Here is where IsClientConnected comes in handy, so that the worker thread knows that the client has already received the response and disconnected or not.

What's the behavioral difference between HTTP Keep-Alive and Websockets?

I've been working with websockets lately in detail. Created my own server and there's a public demo. I don't have such detailed experience or knowledge re: http. (Although since websocket requests are upgraded http requests, I have some.)
On my end, the server reports details of each hit. Among them are a bunch of http keep-alive requests. My server doesn't handle them because they're not websocket requests. But it got my curiosity up.
The whole big thing about websockets is that the connection stays alive. Then you can pass messages in both directions (simultaneously even). I've read that the Keep-Alive HTTP connection is a relatively new development (I don't know how many years in people time, just that it's only included in the latest standard - 1.1 - is that actually old now?)
I guess I can assume that there's a behavioral difference between the two or there would have been no reason for a websocket standard? What's the difference?
A Keep Alive HTTP header since HTTP 1.0, which is used to indicate a HTTP client would like to maintain a persistent connection with HTTP server. The main objects is to eliminate the needs for opening TCP connection for each HTTP request. However, while there is a persistent connection open, the protocol for communication between client and server is still following the basic HTTP request/response pattern. In other word, server side can't push data to client.
WebSocket is completely different mechanism, which is used to setup a persistent, full-duplex connection. With this full-duplex connection, server side can push data to client and client should be expected to process data from server side at any time.
Quoting corresponding entries on Wikipedia for reference:
1) http://en.wikipedia.org/wiki/HTTP_persistent_connection
2) http://en.wikipedia.org/wiki/WebSocket
You should read up on COMET, a design pattern which shows the limits of HTTP Keep-Alive. Keep-Alive is over 12 years old now, so it's not a new feature of HTTP. The problem is that it's not sufficient; the client and server cannot communicate in a truly asynchronous manner. The client must always use a "hanging" request in order to get a message back from the server; the server may not just send a message to the client at any time it wants.
HTTP vs Websockets
REST (HTTP)
Resources benefit from caching when the representation of a resource changes rarely or multiple clients are expected to retrieve the resource.
HTTP methods have well-known idempotency and safety properties. A request is “idempotent” if it can be issued multiple times without resulting in unique outcomes.
The HTTP design allows for responses to describe errors with the request, with the resource, or to provide nuanced status information to differentiate between success scenarios.
Have request and response functionality.
HTTP v1.1 may allow multiple requests to reuse a single connection, there will generally be small timeout periods intended to control resource consumption.
You might be using HTTP incorrectly if…
Your design relies on a client polling the service often, without the user taking action.
Your design requires frequent service calls to send small messages.
The client needs to quickly react to a change to a resource, and it cannot predict when the change will occur.
The resulting design is cost-prohibitive. Ask yourself: Is a WebSocket solution substantially less effort to design, implement, test, and operate?
WebSockets
WebSocket design does not allow explicit or transparent proxies to cache messages, which can degrade client performance.
WebSocket protocol offers support only for error scenarios affecting the establishment of the connection. Once the connection is established and messages are exchanged, any additional error scenarios must be addressed in the messaging layer design, but WebSockets allow for a higher amount of efficiency compared to REST because they do not require the HTTP request/response overhead for each message sent and received.
When a client needs to react quickly to a change (especially one it cannot predict), a WebSocket may be best.
This makes the protocol well suited to “fire and forget” messaging scenarios and poorly suited for transactional requirements.
WebSockets were designed specifically for long-lived connection scenarios, they avoid the overhead of establishing connections and sending HTTP request/response headers, resulting in a significant performance boost
You might be using WebSockets incorrectly if..
The connection is used only for a very small number of events, or a very small amount of time, and the client does not - need to quickly react to the events.
Your feature requires multiple WebSockets to be open to the same service at once.
Your feature opens a WebSocket, sends messages, then closes it—then repeats the process later.
You’re re-implementing a request/response pattern within the messaging layer.
The resulting design is cost-prohibitive. Ask yourself: Is a HTTP solution substantially less effort to design, implement, test, and operate?
Ref: https://blogs.windows.com/buildingapps/2016/03/14/when-to-use-a-http-call-instead-of-a-websocket-or-http-2-0/

Discussion: Chat server via node.js: HTTP or TCP?

I was considering doing a chat server using node.js/socket.io. Should I make it a tcp server or a http server? I'd imagine tcp server would be more efficient, but can you send other stuff to it like file attachments etc? If tcp is more efficient, how much more so? Also, just wondering how many concurrent connections can one node.js server handle? Is it more work to do TCP or HTTP?
You are talking about 2 totally different approaches here - TCP is a transport layer protocol and HTTP is an application layer protocol. HTTP (usually) operates over TCP, so whichever option you choose, it will still be operating over TCP.
The efficiency question is sort of a moot point, because you are talking about different OSI layers. If you went for raw TCP sockets, your solution would probably be more efficient - in bandwidth at least - since HTTP contains a whole bunch of extra data (the headers) that would likely be irrelevant to your purposes (depending on the scale of the chat program). What you are talking about developing there is your own application layer protocol.
You can send anything you like over TCP - after all HTTP can send attachments, and that operates over TCP. FTP also operates over TCP, and that is designed purely for transferring "attachments". In order to do this, you would need to write your protocol so that it was able to tell the remote party that the following data was a file, then send the file data, then tell the remote party that the transfer is complete. Implementations of this are many and varied (the HTTP approach is completely different from the FTP approach) and your options are pretty much infinite.
I don't know for sure about the node.js connection limit, but I can say with a fair amount of confidence that it is limited by the operating system. This might help you get to grips with the answer to that question.
It is debatable whether it is more work to do it with TCP or HTTP - it's a lot of work to do it in both. I would probably lean more toward the TCP option being your best bet. While TCP would require you to design a protocol rather than/as well as an application, HTTP is not particularly suited to live, 2-way applications like chat servers. There are many implementations of chat over HTTP that use AJAX, but I can tell you from painful experience that they are a complete pain in the rear-end.
I would say that you should only be looking at HTTP if you are intending the endpoint (i.e. the client) to be a browser. If you are going to write a desktop app for the endpoint, a direct TCP link would definitely be the way to go. The main reason for this is that HTTP works in a request-response manner, where the client sends a request to the server, and the server responds. Over TCP you can open a single TCP stream, that can be used for bi-directional communication. This means that the server can push an event to the client instantly, while over HTTP you have to wait for the client to send a request, so you can respond with an event. If you were intending to use a browser as the client, it will make the whole file transfer thing much more tricky (the sending at least).
There are ways to implement this over HTTP using long-polling and server push (read this) but it can be a real pain to implement.
If you are going to implement this on a LAN (or possibly even over the internet) it is worth considering UDP over TCP - in a chat application it is not usually absolutely mission critical that messages arrive in the right order, and even if it was, users would probably not be able to type faster than the variations in network latency (probably <100ms). Then for file transfers you could either negotiate a seperate TCP socket for the data exchange (like FTP), or implement some kind of UDP ACK system (like TFTP).
I feel there is a lot more to say on this subject but right now I can't put it into words - I may extend this answer at some point.
Chat servers are the Hello World program in node. Use http.
As far as the question of how many concurrent connections can it handle, that all depends on your system. Set up a simple chat server and then try benchmarking it.
Also, check out http://search.npmjs.org/ and search for chat for a few pointers.

Resources