Should POST data have a re-try on timeout condition or not? - http

When POSTing data - either using AJAX, or from a mobile device, or what have you - there is often a "retry" condition, so that should something like a timeout occur, the data is POSTed again.
Is this actually a good idea?
POST requests are not idempotent, so if you:
1. make a POST to the server,
2. the server receives the request,
3. takes time to execute, and
4. then sends the data back,
and the timeout is hit sometime after step 3, then the retry will re-submit a request that the server may already have accepted and processed.
The question, then, is: should a retry (when calling from the client side) be set for POST data, or should the server be designed to always handle POST data appropriately (with tokens and so on), or am I missing something?
Update, as per the questions: this is for a mobile app. As it happens, during testing it was noticed that with too short a timeout, the app would retry. Meanwhile, the back-end server had in fact accepted and processed the initial request, and got very upset when the new (otherwise identical) re-request came in.

Nonces are a (partial) solution to this. The server generates a nonce and gives it to the client. The client sends the POST including the nonce; the server checks whether the nonce is valid and unused and, if so, acts on the POST and invalidates the nonce; if not, it reports back that the nonce has already been used and discards the data. This is also very useful for avoiding the 'double post' problem of users clicking a submit button twice.
However, this moves the problem from the client to a different one on the server. If you invalidate the nonce before the action, the action might still fail or hang; if you invalidate it afterwards, the nonce is still valid for requests that arrive during processing. So, a possible scenario on the server, on receiving a request, becomes:
1. Lock the nonce.
2. Do the action.
3. On any processing error preventing completion of the action, roll back and release the lock on the nonce.
4. On no errors, invalidate / remove the nonce.
Semaphores on the server side are most helpful with this; most back-end languages have libraries for them.
So, implementing all these (a rough sketch follows after this list):
- It is safe to retry: if the action has already been performed, it won't be done again.
- A reply that the nonce has already been used can be understood as a confirmation that the original POST has been acted upon.
- If you need the result of the action when a second request shows the first one came through, a short-lived cache is needed on the server side.
- It's up to you to set a sane limit on subsequent tries (what if the 2nd fails? or the 3rd?).
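As an illustration only, here is a minimal TypeScript sketch of that nonce-locking flow, using an in-memory store and Express-style handlers. The `/nonce` and `/form` routes and the `performAction` helper are hypothetical; a real deployment would use a shared store (Redis, a database row with a unique constraint, etc.) so the lock works across server processes.

```typescript
import express from "express";
import { randomUUID } from "crypto";

type NonceState = "issued" | "locked" | "used";
const nonces = new Map<string, NonceState>(); // in-memory; use a shared store in production

const app = express();
app.use(express.json());

// Step 0: the client asks for a nonce before showing the form.
app.get("/nonce", (_req, res) => {
  const nonce = randomUUID();
  nonces.set(nonce, "issued");
  res.json({ nonce });
});

// The POST the client may retry after a timeout.
app.post("/form", async (req, res) => {
  const { nonce, payload } = req.body;
  const state = nonces.get(nonce);

  if (state === undefined) return res.status(400).json({ error: "unknown nonce" });
  if (state === "used")   return res.status(409).json({ status: "already processed" }); // retry after success
  if (state === "locked") return res.status(409).json({ status: "still processing" });  // retry during processing

  nonces.set(nonce, "locked");          // 1. lock the nonce
  try {
    await performAction(payload);       // 2. do the action (hypothetical)
    nonces.set(nonce, "used");          // 4. no errors: invalidate the nonce
    res.status(201).json({ status: "created" });
  } catch {
    nonces.set(nonce, "issued");        // 3. error: roll back, release the lock
    res.status(500).json({ error: "processing failed, safe to retry" });
  }
});

async function performAction(payload: unknown): Promise<void> {
  // placeholder for the real work (DB insert, etc.)
}

app.listen(3000);
```

With this in place, a retried POST carrying an already-used nonce gets a 409 back, which the client can treat as confirmation that the original request went through.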

The issue with automatic retries is that the server needs to know whether the prior POST was successfully processed, to avoid unintended consequences. For example, if each POST inserts a DB record, but the last POST timed out after the insert, the automatic re-POST will create a duplicate DB record, which is probably not what you want.
In a robust setup you may be able to detect that there was a timeout and roll back any POSTed updates. That would make an automatic re-POST safe. But many (most?) web systems aren't that elaborate, so you'll typically need human intervention to decide whether the POST should happen again.
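One hedged way to make the re-POST recognisable is for the client to generate its own idempotency key and reuse it on every retry, so a server that tracks keys can skip the duplicate insert. A minimal TypeScript sketch using `fetch`; the `Idempotency-Key` header name and the `/records` endpoint are assumptions, not something the answer above prescribes, and they only help if the server actually checks the key.

```typescript
// Sketch only: retry the same POST with the same idempotency key,
// so a server that tracks keys can de-duplicate the insert.
async function postWithRetry(payload: unknown, maxAttempts = 3): Promise<Response> {
  const key = crypto.randomUUID();           // one key for all attempts of this logical POST
  let lastError: unknown;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fetch("/records", {       // hypothetical endpoint
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Idempotency-Key": key,            // assumed server-side support
        },
        body: JSON.stringify(payload),
        signal: AbortSignal.timeout(10_000), // 10 s client-side timeout
      });
    } catch (err) {
      lastError = err;                       // timeout or network error: retry with the SAME key
    }
  }
  throw lastError;
}
```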

If you can't avoid the long wait until the server responds, a better practice would be to return an immediate response (200 OK with the message "processing") and have the client, later on, send a new request that checks whether the action was performed.
AJAX was not designed to be used in such a way ("heavy" actions).
By the way, the default HTTP timeout is 7200 secs so I don't think you'll reach it easily - but regardless, you should avoid having the user wait for long periods of time.
Providing more information about the process (like what exactly you're trying to do) would help in suggesting ways to avoid such obstacles.

If your requests are failing often enough to instigate something like this, you have major problems that have nothing to do with implementing a retry condition. I have never seen a web app that needed this type of functionality. Programming your app to automatically beat an overloaded server with the F5 hammer is not the solution to anything.
If this ajax request is triggered from a button click, disable that button until it returns, successful or not. If it failed, let the user click it again.
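For completeness, a tiny sketch of that button-disabling suggestion in TypeScript with `fetch`; the `#form` / `#submit` ids and the `/submit` URL are made up for the example.

```typescript
// Disable the button while the request is in flight; re-enable only on failure
// so the user can decide whether to try again.
const form = document.querySelector<HTMLFormElement>("#form")!;
const button = document.querySelector<HTMLButtonElement>("#submit")!;

button.addEventListener("click", async () => {
  button.disabled = true;
  try {
    const res = await fetch("/submit", { method: "POST", body: new FormData(form) });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    // success: leave the button disabled (or navigate away)
  } catch {
    button.disabled = false; // failed: let the user click it again
  }
});
```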

Related

Database rollback on API response failure

A customer of ours is very insistent that they expect this "from any API" (meaning they don't want to pay for the changes). I seem to have trouble finding clear information on this, though.
Say we have an API that creates an appointment for a calendar. Server-side everything was successful, data is committed to the database. API tries to send the HTTP 201 (Created) response, but something goes wrong there. Client ignores the response, or connection dropped, ...
They want our API to undo the database changes in that particular situation.
The question is not how to do this, but rather if this is something most APIs do? Is this standard behavior? Or something similar like refusing duplicate create requests?
The difficult part, of course, is actually knowing that the API failed to send the response; and as far as the crux of the question is concerned, this is not a behavior that is usually implemented. If the user willingly inputs the data, you can go ahead and store it. If the response doesn't return properly due to a timeout (you are not responsible for the user "ignoring" the response), then the client-side code can refresh on failure and load fresh data, and the user can delete the inputted data themselves (given you provide an endpoint for that).
Depending on the database, it is possible to make all database changes of an API reversible. For example, with SQL you use transactions with commit, rollback and savepoints. There is most likely a similar mechanism available for NoSQL databases.
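As a hedged illustration of the transaction approach, here is a TypeScript sketch using the `pg` driver; the `appointments` table and column names are assumptions made for the example.

```typescript
import { Pool } from "pg";

const pool = new Pool(); // connection settings taken from the environment

// Create the appointment inside a transaction: commit only once the work
// succeeded, roll back on any error before the commit.
async function createAppointment(title: string, startsAt: Date): Promise<number> {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    const result = await client.query(
      "INSERT INTO appointments (title, starts_at) VALUES ($1, $2) RETURNING id",
      [title, startsAt]
    );
    await client.query("COMMIT");
    return result.rows[0].id;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}
```

Note that a transaction only protects the work up to the commit; undoing a committed change because the 201 response was never delivered would require a compensating action afterwards, which is part of why that is not standard behavior.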

HTTP Response sent before async call returns

I have yet to understand the behavior of a web server thread if I make an async call to, say, a database, and immediately return a response (say, OK) to the client without even waiting for the async call to return. First of all, is it a good approach? What happens to the thread that made the async call if it is reused to serve another request and the previous async call then returns to that particular thread? Or does the web server hold this thread waiting until the async call it made returns? Then the issue would be that many hanging threads would be left open while the web server remained available to take more requests. I am looking for an answer.
It depends on the way your HTTP servers works. But you should be very cautious.
Let's say you have a main event loop taking care of incoming HTTP connections, and workers threads which manage the HTTP communications.
A worker thread should be considered ready to accept a new HTTP request only when it is effectively, completely ready for that.
In terms of pure HTTP, the most important thing is to avoid sending a response before having received the whole request. It seems simple, and it's usually the case. But if the request has a body, which may be a chunked body, it could take time to receive the whole message.
You should never send a response before that, unless it's something like a 400 Bad Request response followed by a real TCP/IP connection close. If you fail to do so and you have a message-length parsing issue, the fact that you sent a response before the end of the request may lead to security problems. It could be used to exploit differences in the parsing of messages between your server and any other HTTP agent in front of your server (SSL terminator, reverse proxy, etc.), in some sort of HTTP request smuggling issue. For this agent, if you sent a response, it means you had the whole message, and it can send the next message, where you will in fact think it is just another part of the body.
Now if you have the whole message, you can decide to send an early response and detach an asynchronous task to really do the work. But this means:
- you have to assume that no more output should be generated: you will not try to send any output to the request issuer, and you should consider that the communication is now closed;
- the worker thread should not receive new requests to manage, and this is the hard part. If this thread is marked as available for a new request, it may also be killed by the thread manager (in Nginx or Apache you have request counters associated with workers, and they are killed after reaching a limit, to create fresh ones); it may also receive a graceful reload command (which is usually a kill), etc.
So you start to enter a zone where you need to know the internals of the HTTP server, which may or may not be managed by you, and where changes may appear sooner or later. And you start to do very strange things, which usually leads to strange issues that are hard to reproduce.
Usually the best way to handle asynchronous tasks, while still being able to understand what happens, is to use a messaging system. Put a list of tasks in a queue, and have a parallel asynchronous worker process do things with these tasks. Track the status of these tasks if you need it.
The same applies to the client: after receiving a very fast HTTP answer, it may need to perform some AJAX polling for the task status. And you will maybe only have to check the status of the task in the queue to send a response.
You will get more control over the whole thing.
Personally, I really dislike having detached threads, coming from strange code, performing heavy tasks without any way of outputting a status or reporting errors, and maybe preventing nice application stop calls (still waiting for strange threads to join) that do not imply a killall.
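A minimal TypeScript sketch of the queue-plus-status pattern described above. The in-memory task map and the `/tasks` routes are assumptions; in this sketch the "worker" is just an async function in the same process, whereas the suggestion above is a separate worker process fed from a real queue (RabbitMQ, a Redis-backed queue, etc.).

```typescript
import express from "express";
import { randomUUID } from "crypto";

type Task = { id: string; status: "queued" | "running" | "done" | "failed"; result?: unknown };
const tasks = new Map<string, Task>();

const app = express();
app.use(express.json());

// Accept the work, answer immediately, and let a worker pick it up.
app.post("/tasks", (req, res) => {
  const task: Task = { id: randomUUID(), status: "queued" };
  tasks.set(task.id, task);
  void runTask(task, req.body);                // fire-and-forget stand-in for a queue consumer
  res.status(202).json({ id: task.id, status: task.status });
});

// The client polls this until the task is done or failed.
app.get("/tasks/:id", (req, res) => {
  const task = tasks.get(req.params.id);
  if (!task) return res.status(404).end();
  res.json(task);
});

async function runTask(task: Task, payload: unknown): Promise<void> {
  task.status = "running";
  try {
    task.result = await doHeavyWork(payload);  // hypothetical long-running job
    task.status = "done";
  } catch {
    task.status = "failed";
  }
}

async function doHeavyWork(payload: unknown): Promise<unknown> {
  return payload;                              // placeholder
}

app.listen(3000);
```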
It depends whether this asynchronous operation performs something which the client should be notified about.
If you return 200 OK (i.e. successfully completed) and later the asynchronous operation fails then the client will not know about the error.
You of course have some options, like sending some kind of push notification over a websocket, or sending another request that returns the actual result, and things like that. So basically it depends on your needs...

Long HTTP POST Response Time

I have a web app that has some fairly hefty data processing on the backend. A current example workflow is:
1. User POSTs a form
2. Server receives the form, starts processing
3. 2-4 minutes pass
4. The server responds
The reason I'm asking this is that initially a web proxy on the user side was killing idle POSTs after 2 minutes. The more I think about it, the more this seems like a reasonable default.
This leaves the question: should I increase the timeout and not fix the problem? Or is this bad practice? It is currently at 2-4 minutes but could easily get longer. Should the application be responding with something rather than just leaving the connection open? If so, other than completely redesigning the UI into an asynchronous submit/check-back-later flow, what options are there?
Generally, if I were to submit a form and it took that long, I would think something had gone wrong and attempt to submit again. I think that you should collect the data and give the user some sort of success message, then create another page that allows them to check on the status of the processing (if the user needs to get the results of that processing).

Notifying the user after a long Ajax task when they might be on a different page

I have an Ajax request to a web service that typically takes 30-60 seconds to complete. In some cases it could take as long as a few minutes. During this time the user can continue working on other tasks, which means they will probably be on a different page when the task finishes.
Is there a way to tell that the original request has been completed? The only thing that comes to mind is to:
wrap the web service with a web service of my own
use my web service to set a flag somewhere
check for that flag in subsequent page requests
Any better ways to do it? I am using jQuery and ASP.Net, if it matters.
You could add another method to your web service that allows you to check the status of a previous request. Then you can use ajax to poll the web service every 30 seconds or so. You can store the request id or whatever in Session so your ajax call knows what request ID to poll no matter what page you're on.
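A possible sketch of that polling in TypeScript; the `/requests/:id/status` endpoint and the 30-second interval are assumptions following the answer above, and the request id would come from wherever you stored it (e.g. Session).

```typescript
// Poll the status endpoint every 30 seconds until the long-running request finishes.
function pollRequestStatus(requestId: string, onDone: (result: unknown) => void): void {
  const timer = setInterval(async () => {
    try {
      const res = await fetch(`/requests/${requestId}/status`); // hypothetical endpoint
      const body = await res.json();
      if (body.status === "done" || body.status === "failed") {
        clearInterval(timer);
        onDone(body); // e.g. show a notification badge on whatever page is open
      }
    } catch {
      // network hiccup: keep polling
    }
  }, 30_000);
}
```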
I would say you'd have to poll once in a while to see whether the request has ended and show a notification, like this site does with badges, for example.
First, make your request return immediately with something like "Started processing...". Then use a different request to poll for the result. It is not good for either the server or the client's browser to have long-open HTTP sessions. Moreover, the user should be informed and educated that they are starting a request that could take some time to complete.
To display the result you could have a "notification area" in all of your web pages. Alternatively you could have a dedicated page for this and instruct the user to navigate there. As others have suggested, you could use polling to get the result.
You could use frames on your site, and perform all your long AJAX requests in an invisible frame. Frames add a certain level of pain to development, but might be the answer to your problems.
The only other way I could think of doing it is to actually load the other pages via an AJAX request, such that there are no real page reloads - this would mean that the AJAX requests aren't interrupted, but may cause issues with breaking browser functionality (back/forward, bookmarking, etc).
Since HTTP is stateless (you can't set a trigger/event on the server to update the client), the viable strategy is to set up a status function that you can call intermittently, using a JavaScript timer, to check whether your code has finished executing. When it finishes, you can update your view.

Changing credentials on client-side for Basic Authentication on Flex

I want to let the user automatically re-login in my Flex app, which uses Basic Authentication
By the way, I have noted this StackOverflow question, which is relevant, but does not address the question of logging out client-side.
For example, after user A logs in, user B comes to the browser, goes to the login screen (perhaps in a new tab) and logs in.
This should mean that I send user B's credentials in the HTTP headers, and that since these are different from user A's, the server notes the fact and creates a new and separate session.
However, Flex's HTTP proxy catches the header and actually ignores these new credentials.
Flex does offer a way to tell the server to logout, and the Flex login code could invoke this every time before sending credentials, but that seems like an ugly workaround. I want to be able to do this client-side. I could also use a non-standard header for Basic Authentication (since I control the server-side Authentication as well), but that also seems like an ugly workaround.
Is there some way to simply end the session on client-side from Flex code? This is possible from JavaScript, for example.
And is there a way to directly work with cookies at client-side, as I can in JavaScript?
I understand that some of the limitations may be caused by security considerations, but all my communication is to the "home" server, so it should be possible to avoid the restrictions.
You're sort of asking a couple of different questions here.
You can't actually end a basic-auth "session" manually per se (at least not to the best of my knowledge); at best, you can authenticate against a kind of variable basic-auth realm, which may or may not work for you, but otherwise, you're sort of stuck with the first-authenticated session for the duration of the browser instance. Generally not the best way to go, unless you're pretty sure the user owns the machine, or can be depended on to close the browser after each session.
That leaves at least two other options, then. The first is to send in your credentials with an URLRequest object (the post you cited, which I wrote, shows how to do that), and to have your HTTP response hand back something indicating the credentials were accepted -- e.g., a GUID, maybe, generated and stored in some session table (in the database sense) on the server, perhaps. Then on successive HTTP requests, you might send along that GUID in an HTTP header, or as a value in each GET or POST request (similarly to the way Facebook handles their API clients, for instance), check the timeliness of that value on the server, and if all's well, carry on. To "log out," then, you'd simply send in a request to invalidate that GUID, perform the necessary cleanup on the server and inside your Flex app, and all should be fine: the next user can sit down, log in, authenticate, and the process continues.
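A rough server-side sketch of that GUID-token scheme in TypeScript (Express). The route names, the `X-Session-Token` header, and the in-memory session table are all assumptions made for illustration; the original answer targets a Flex client, which would attach the token via URLRequest headers or request parameters.

```typescript
import express from "express";
import { randomUUID } from "crypto";

const sessions = new Map<string, { user: string; issuedAt: number }>(); // the "session table"
const app = express();
app.use(express.json());

// Log in: check credentials, hand back a GUID the client attaches to later requests.
app.post("/login", (req, res) => {
  const { username, password } = req.body;
  if (!checkCredentials(username, password)) return res.status(401).end();
  const token = randomUUID();
  sessions.set(token, { user: username, issuedAt: Date.now() });
  res.json({ token });
});

// Any protected call: validate the GUID from a header before doing work.
app.get("/calendar", (req, res) => {
  const token = req.header("X-Session-Token") ?? "";
  const session = sessions.get(token);
  if (!session) return res.status(401).end();
  res.json({ user: session.user, events: [] });
});

// Log out: invalidate the GUID so the next user can authenticate cleanly.
app.post("/logout", (req, res) => {
  sessions.delete(req.header("X-Session-Token") ?? "");
  res.status(204).end();
});

function checkCredentials(user: string, pass: string): boolean {
  return Boolean(user && pass); // placeholder only
}

app.listen(3000);
```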
Another way would be to work with cookies directly. The cookie mechanisms are actually handled mostly for you in Flex, though, since everything gets passed back and forth by the browser on your behalf. For example, if you send in a URLRequest with a username and password, and the server responds with a cookie of any kind, each request you make thereafter will package and send the same cookie, so in most cases, all you need to do is parse the initial response from the server (to set the state of your Flex app), assume the continued presence of the cookie, and when it's time to log out, send a URLRequest to log out, kill the cookie on the server, on status=200 do your Flex-app cleanup, and so on. Accessing the cookie values directly isn't the easiest thing in the world, though; you can use ExternalInterface as a proxy to JavaScript (examples of this online and here on SO, I'm sure), and get at them that way, but there's a good chance you don't even have to do that.
Hopefully that helps. Good luck!
Note also this post, which details some of the incredible distortion that Flex adds to HTTP Requests.

Resources