Is an HTTP request 'atomic'?

I understand an HTTP request will result in a response with a code and optional body.
If we call the originator of the request the 'client' and the recipient of the request the 'server', then the sequence is:
1. Client sends request
2. Server receives request
3. Server sends response
4. Client receives response
Is it possible for the server to complete step 3 but for step 4 not to happen (due to a dropped connection, an application error, etc.)?
In other words: is it possible for the Server to 'believe' the client should have received the response, but the client for some reason has not?

Networks are inherently unreliable. You can only know for sure that a message arrived if the other party has acknowledged it; without an acknowledgement you can never be sure that it did not arrive.
Worse, with HTTP, the only acknowledgement of the request is the response, and there is no acknowledgement of the response. That means:
The client knows the server has processed the request if it got the response. If it does not, it does not know whether the request was processed.
The server never knows whether the client got the answer.
The TCP stack does normally acknowledge the answer when closing the socket, but that information is not propagated to the application layer. It would not be useful there anyway: the stack can acknowledge receipt and the application might still fail to process the message because it crashes (or the power fails, or something else goes wrong). From the perspective of the application it does not matter whether the failure was in the TCP stack or above it; either way the message was not processed.
The easiest way to handle this is to use idempotent operations. If the server gets the same request again, it causes no additional side effects and the response is the same. That way the client, if it times out waiting for the response, simply sends the request again, and it will eventually (unless the connection was torn out never to be fixed again) get a response and the request will be completed.
If all else fails, you need to record the executed requests and eliminate the duplicates in the server, because no network protocol can do that for you. It can eliminate many (as TCP does), but not all.
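As a minimal sketch of "record the executed requests and eliminate the duplicates", assuming the client attaches its own unique request ID to every request (the class name, the `operation` callback and the in-memory dictionary are all made up for illustration; a real server would persist the record together with the side effect):

```python
import threading


class DeduplicatingHandler:
    """Runs each request at most once, keyed by a client-supplied request ID."""

    def __init__(self, operation):
        self._operation = operation   # hypothetical side-effecting work
        self._responses = {}          # request_id -> recorded response
        self._lock = threading.Lock()

    def handle(self, request_id, payload):
        with self._lock:
            if request_id in self._responses:
                # Duplicate request: replay the recorded response,
                # do not run the operation again.
                return self._responses[request_id]
            response = self._operation(payload)
            self._responses[request_id] = response
            return response
```

A client that timed out can then resend the request with the same ID and get the original response back without the operation being executed twice.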

There is a specific section on that point in the HTTP specification, RFC 7230 section 6.6, Tear-down:
(...)
If a server performs an immediate close of a TCP connection, there is
a significant risk that the client will not be able to read the last
HTTP response.
(...)
To avoid the TCP reset problem, servers typically close a connection
in stages. First, the server performs a half-close by closing only
the write side of the read/write connection. The server then
continues to read from the connection until it receives a
corresponding close by the client, or until the server is reasonably
certain that its own TCP stack has received the client's
acknowledgement of the packet(s) containing the server's last
response. Finally, the server fully closes the connection.
So yes, the "server sends response" step is quite a complex thing.
Check for example the Lingering close section on this Apache 2.4 document, or the complex FIN_WAIT/FIN_WAIT2 pages for Apache 2.0.
So, a good HTTP server should keep the socket open long enough to be reasonably certain that everything is OK on the client side. But if you really need an acknowledgement in a web application, you should use a callback (image callback, ajax callback) asserting that the response was fully loaded in the client browser (so, another HTTP request). That means it's not atomic, as you said, or at least not transactional in the way you could expect from a relational database. You need to add another request from the client, one that you may never get (because the server crashed before receiving the acknowledgement), etc.
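The staged close quoted from the RFC can be sketched roughly like this with plain Python sockets. This is only an illustration of the half-close-then-drain idea, not how Apache or any other server actually implements it; `lingering_close` and its parameters are names invented here.

```python
import socket


def lingering_close(conn, timeout=2.0, bufsize=4096):
    """Half-close, drain until the peer closes, then fully close.

    Returns True if the peer closed its side, False if we gave up.
    The timeout applies to each recv() call, not to the whole drain.
    """
    peer_closed = False
    try:
        # Stage 1: close only the write side; the peer sees end-of-stream
        # after it has read everything we sent.
        conn.shutdown(socket.SHUT_WR)
        conn.settimeout(timeout)
        # Stage 2: keep reading (and discarding) until the peer closes.
        while True:
            chunk = conn.recv(bufsize)
            if not chunk:
                peer_closed = True
                break
    except OSError:
        # Timeout or connection error: stop waiting.
        pass
    finally:
        # Stage 3: full close.
        conn.close()
    return peer_closed
```

Even when this returns True, the server only knows that the client's TCP stack closed the connection, not that the client application processed the response.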

Related

Can HTTP request fail half way?

I am talking about only one case here.
client sent a request to server -> server received it and returned a response -> unfortunately the response dropped.
I have only one question about this.
Is this case even possible? If it's possible then what should the response code be, or will client simply see it as read timeout?
As I want to sync status between client/server and want 100% accuracy no matter how poor the network is, the answer to this question can greatly affect the client's 'retry on failure' strategy.
Any comment is appreciated.
Yes, the situation you have described is possible and occurs regularly. It is called "packet loss". Since the packet is lost, the response never reaches the client, so no response code could possibly be received. Web browsers will display this as "Error connecting to server" or similar.
HTTP requests and responses are generally carried inside TCP segments. If the sender of a segment (here, the server sending the HTTP response) does not receive an acknowledgement within the expected time window, TCP retransmits that segment. A segment will only be retransmitted a certain number of times before a timeout error occurs and the connection is considered broken or dead. (The number of attempts before the TCP timeout can be configured on both the client and server sides.)
Is this case even possible?
Yes. It's easy to see why if you picture a physical cable between the client and the server. If I send a request down the cable to the server, and then, before the server has a chance to respond, unplug the cable, the server will receive the request, but the client will never "hear" the response.
If it's possible then what should the response code be, or will client simply see it as read timeout?
It will be a timeout. If we go back to our physical cable example, the client is sitting waiting for a response that will never come. Hopefully, it will eventually give up.
It depends on exactly what tool or library you're using how this is wrapped up, however - it might give you a specific error code for "timeout" or "network error"; it might wrap it up as some internal 5xx status code; it might raise an exception inside your code; etc.
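For example, with Python's standard `http.client` a lost response shows up as an exception rather than a status code. A small sketch (the function name and the outcome labels are made up for illustration):

```python
import http.client
import socket


def request_outcome(host, port, path="/", timeout=1.0):
    """Classify one GET as ('response', status), ('timeout', None)
    or ('network_error', None)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return ("response", response.status)
    except socket.timeout:
        # No (complete) response arrived in time. The server may or may
        # not have processed the request.
        return ("timeout", None)
    except (OSError, http.client.HTTPException):
        # Connection refused, reset, malformed response, etc.
        return ("network_error", None)
    finally:
        conn.close()
```

Note that in the timeout case the client has no way of telling "the request never arrived" apart from "the response got lost".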

Will the server raise an exception if its HTTP response can't get to the client?

We have an application which creates an order and sends it to the server via HTTP POST.
1. Client sends the order as an HTTP request
2. Server processes it
3. Server sends the response
4. Server does some further operation on this order
5. The client receives the response and processes it.
I've been asked what happens if, in step 3, the response gets lost on the way and never reaches the client. The client will then try to re-send the same order, which introduces a duplicate order problem. How should this be tackled?
I came up with the idea that the client generates a unique ID and sends it to the server, so that when the client sends the order a second time, the server knows that it's a duplicate order and only returns the previous response.
But then I remembered that HTTP is built upon TCP, which has a three-way handshake for the data connection. This means:
From the client perspective, if the client doesn't receive any response from the server, the connection will be maintained until timeout, then an exception will be thrown to let the client know.
My questions are:
From the server perspective, after it sends the response, how could it determine the response has reached the client?
There should be a three-way handshake for connection termination at the transport layer to ensure that the connection is only closed after the client has received the messages, right? So if the message gets lost on the way, the server should raise an exception, am I right?
If this is the case, could the problem simply be solved by ensuring the server only does step 4 if there is no exception in step 3? Is there any other solution for this problem if my whole idea above is wrong?
Thanks
The whole idea is wrong. You need to look up idempotence. Basically every transaction needs to be idempotent, which means that applying it twice or more has no more effect than applying it once. This is generally implemented via unique transaction sequence numbers which are recorded at the server when the transaction has been completed.
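On the client side, the important part is that a retry reuses the same transaction number rather than generating a new one. A rough sketch, where `send` stands in for whatever function actually performs the HTTP POST (it is a hypothetical callable that raises `TimeoutError` when no response arrives):

```python
import uuid


def submit_with_retry(send, order, max_attempts=3):
    """Send an order, retrying on timeout with the SAME transaction ID."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Generated once per order, not once per attempt, so the server can
    # recognise a retry as a duplicate of a completed transaction.
    transaction_id = str(uuid.uuid4())
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(transaction_id, order)
        except TimeoutError as exc:
            last_error = exc
    raise last_error
```

If the first attempt was processed but its response was lost, the second attempt carries the same ID, so a server that records completed transaction IDs can return the stored result instead of creating a second order.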

How does a http client associate an http response with a request (with Netty) or in general?

Is an HTTP endpoint supposed to respond to requests from a particular client in the order in which they are received?
What if that doesn't make sense, for example when requests are handled by a cluster behind a proxy, or handled with NIO where one request finishes faster than another?
Is there a standard way of associating a unique ID with each HTTP request so that it can be matched with its response? How is this handled in clients like HttpComponents HttpClient or curl?
The question comes down to the following case:
Suppose, I am downloading a file from a server and the request is not finished. Is a client capable of completing other requests on the same keep-alive connection?
Whenever a TCP connection is opened, the connection is recognized by the source and destination ports and IP addresses. So if I connect to www.google.com on destination port 80 (default for HTTP), I need a free source port which the OS will generate.
The reply of the web server is then sent to the source port (and IP). This is also how NAT works, remembering which source port belongs to which internal IP address (and vice versa for incoming connections).
As for your edit: no, a single HTTP connection can execute only one command (GET/POST/etc.) at a time. If you send another command while you are retrieving data from a previously issued command, the results may vary per client and server implementation. I guess that Apache, for example, will transmit the result of the second request after the data of the first request is sent.
I won't re-write CodeCaster's answer because it is very well worded.
In response to your edit - no. It is not. A single persistent HTTP connection can only be used for one request at once, or it would get very confusing. Because HTTP does not define any form of request/response tracking mechanism, it simply would not be possible.
It should be noted that there are other protocols which use a similar message format (conforming to RFC 822) and which do allow for this (using mechanisms such as SIP's CSeq header). It would be possible to implement this in a custom HTTP app, but HTTP does not define any standard mechanism for doing so, and therefore nothing can be done that could be assumed to work everywhere. It would also present a problem with the response for the second message: do you wait for the first response to finish before sending the second response, or try to pause the first response while you send the second? How will you communicate this in a way that guarantees messages won't become corrupted?
Note also that SIP (usually) operates over UDP, which does not guarantee packet ordering, making the CSeq system more of a necessity.
If you want to send a request to a server while another transaction is still in progress, you will need to create a new connection to the server, and hence a new TCP stream.
Facebook did some research into this while they were building their CDN, and they concluded that you can efficiently have 2 or 3 open HTTP streams at any one time, but any more than that hurts overall transfer time because of the extra packet overhead cost. I would link to the blog entry if I could find the link...
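To illustrate the "one request at a time" pattern described in these answers: on a single keep-alive connection there is no request ID, and the client simply reads each response completely before sending the next request, so responses are matched to requests by order. The sketch below uses Python's standard `http.client` and `http.server` against a throwaway local server; `EchoPathHandler` and `fetch_in_order` are names made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # allow keep-alive connections

    def do_GET(self):
        body = self.path.encode("ascii")  # echo the requested path back
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_in_order(paths):
    """Fetch several paths sequentially over ONE connection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPathHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        bodies = []
        for path in paths:
            conn.request("GET", path)
            # The next response on this connection belongs to the request
            # just sent; it is read fully before the next request goes out.
            response = conn.getresponse()
            bodies.append(response.read().decode("ascii"))
        return bodies
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

To have a second request in flight while the first is still downloading, this client would need a second `HTTPConnection`, which matches the advice above about opening a new TCP stream.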

HTTP server detecting a broken network connection from a HTTP client

I have a web application in which, after making an HTTP request to the server, the client quits (or the network connection is broken) before the response has been completely received by the client.
In this scenario the server side of the application needs to do some cleanup work. Is there a way built into the HTTP protocol to detect this condition? How does the server know whether the client is still waiting for the response or has quit?
Thanks
Vijay Kumar
No, there is nothing built in to the protocol to do this (after all, you can't tell whether the response has been received by the client itself yet, or just a downstream proxy).
Just have your client make a second request to acknowledge that it has received and stored the original response. If you don't see a timely acknowledgement, run the cleanup.
However, make sure that you understand the implications of the Two Generals' Problem.
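A minimal sketch of that acknowledge-or-clean-up bookkeeping on the server side, assuming each response carries an ID that the client echoes back in its second request (the class, the method names and the injectable clock are all invented for illustration):

```python
import time


class AckTracker:
    """Tracks responses awaiting a client acknowledgement."""

    def __init__(self, cleanup, ack_timeout=30.0, clock=time.monotonic):
        self._cleanup = cleanup          # hypothetical cleanup callback
        self._ack_timeout = ack_timeout  # seconds to wait for an ack
        self._clock = clock
        self._deadlines = {}             # response_id -> deadline

    def response_sent(self, response_id):
        self._deadlines[response_id] = self._clock() + self._ack_timeout

    def acknowledge(self, response_id):
        """Called when the client's second request arrives.
        Returns False if the ID is unknown or already cleaned up."""
        return self._deadlines.pop(response_id, None) is not None

    def expire(self):
        """Run cleanup for every response whose ack is overdue."""
        now = self._clock()
        overdue = [rid for rid, deadline in self._deadlines.items()
                   if deadline <= now]
        for rid in overdue:
            del self._deadlines[rid]
            self._cleanup(rid)
        return overdue
```

Because of the Two Generals' Problem, a missing acknowledgement does not prove the client missed the response (the acknowledgement itself may have been lost), so the cleanup has to be safe to run in that case too.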
You might have a network problem... usually, when you send an HTTP request to the server, you first send the headers and then the content of the POST (if it is a POST method). Likewise, the server responds with the headers and the document body. The first line of the response is the status line. Usually, status 200 is the success status; if you get that, then there should be no problem getting the rest of the document. Check this for details on the HTTP response status codes: http://www.w3.org/Protocols/rfc2616/rfc2616-sec6.html
Later edit:
Sorry, I misread your question. Basically, you don't have a trigger for when the user disconnects. If you use OOP, you could use the destructor of a class to clean up whatever it is you need to clean up.

Detecting missing responses to long running HTTP (SOAP) requests

I need a way to detect a missing response to a long running HTTP POST request. This problem arises when the network infrastructure (firewalls, proxies, unplugged cables, etc.) drops the response packets. The server may detect this failure, but the client cannot send additional bytes after the POST to probe the state of the TCP connection. The failure may be limited to a single TCP connection. For example I may be able to subsequently open a new TCP connection to the server.
I'm looking for a solution that still uses HTTP POST and does not change the duration of the server side processing.
Some solutions that I can think of are:
Provide a side channel interface to retrieve request & response history. If the history lists the response as having been sent (presumably resulting in a TCP error) but I have not received it within a reasonable time, I can generate a local error.
Use an X header to request that the server deliver "spurious" 100 Continue provisional responses on a regular interval. If I fail to see an expected 100 Continue or a non-provisional response I can generate a local error.
Is there a state of the art solution for this problem?
It sounds to me like you are using SOAP for something that would be much better done using a stateful connection or a server-side push technology.
