Why do we need a queue when using webhooks?

Can anyone clarify the purpose of using a queue?
My understanding is that a webhook is just a URL: you send a POST request to that URL and then do some work based on the body/data of the request. So why do I need to queue the data, store it in a database, and then loop through the database again to perform the work?

The short answer is, you don't have to use a queue. A webhook is just an HTTP request (typically POST) notifying your application of some type of event. You might want to consider a queue because of a few typical issues you could run into.
One of these is response time back to the webhook requester (the source). Many sources want a response (HTTP status 200) as quickly as possible so they can dequeue the request from their own webhook system. If processing the webhook takes some time, the source will typically advise you to use a queue so that the lengthier processing runs asynchronously, after the 200 response has been sent.
Another possible reason is removing duplicate requests. Webhooks give no guarantee that you will receive only a single request per event. A queue can be used to de-duplicate these requests.
I would recommend you stick with a simple request handler if possible, then evolve a more sophisticated handler if you run into issues. Consider queues as a potential design approach if you run into issues like those above.
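The two reasons above (fast acknowledgement and de-duplication) can be sketched roughly as follows. This is a minimal in-memory illustration, not a production design: the "id" field on the event and the names handle_webhook and process_event are assumptions made for the example, and a real source may identify events differently.

```python
import queue
import threading

# Hypothetical names throughout: the "id" field, handle_webhook and process_event
# are illustrative assumptions, not part of any particular webhook provider's API.
events = queue.Queue()
seen_ids = set()
seen_lock = threading.Lock()
processed = []


def handle_webhook(event):
    """Acknowledge quickly: record the event id, enqueue, and return 200."""
    with seen_lock:
        if event["id"] in seen_ids:
            return 200  # duplicate delivery: acknowledge but do not enqueue again
        seen_ids.add(event["id"])
    events.put(event)
    return 200


def process_event(event):
    """Stand-in for the slow work that should not delay the HTTP response."""
    processed.append(event["id"])


def worker():
    """Drain the queue in the background until a None sentinel arrives."""
    while True:
        event = events.get()
        if event is None:
            break
        process_event(event)


t = threading.Thread(target=worker)
t.start()
# The third delivery repeats the first event id and is dropped before queuing.
statuses = [handle_webhook({"id": i}) for i in ("evt_1", "evt_2", "evt_1")]
events.put(None)  # tell the worker to stop once the queue is drained
t.join()
```

An in-memory queue and set are lost on restart, which is one reason the data is often stored in a database or a durable message queue instead.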

You need some way to prevent a conflict if the webhook is invoked multiple times very close together.
It doesn't necessarily have to be a queue, though. If the webhook performs database queries and updates, you can use a transaction to ensure that this is atomic for each invocation.
In this respect, it's little different from any other web utility. You should do something similar in scripts that process web forms.
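As a rough sketch of the transaction approach, assuming SQLite and a hypothetical schema (an events table keyed by event id and a balances table), each invocation either applies all of its changes or none of them:

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, total INTEGER)")
conn.execute("INSERT INTO balances VALUES ('acct_1', 0)")
conn.commit()


def apply_event(event_id, account, amount):
    """Apply one webhook event atomically; return False for a duplicate."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO events (id) VALUES (?)", (event_id,))
            conn.execute(
                "UPDATE balances SET total = total + ? WHERE account = ?",
                (amount, account),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # primary-key violation: this event was already applied


# The same event delivered twice changes the balance only once.
results = [apply_event("evt_1", "acct_1", 10), apply_event("evt_1", "acct_1", 10)]
total = conn.execute(
    "SELECT total FROM balances WHERE account = 'acct_1'"
).fetchone()[0]
```

Here the primary key on the event id does the de-duplication and the transaction keeps the two statements together.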

Related

Handling queries alongside subscription queries using Axon Server

I'm currently using Axon Framework alongside Axon Server for my event-based microservices application. Recently I needed to wait for a saga to fully complete and return its execution result (success or error) to the client. I solved it by creating a subscription query before dispatching the command that triggers the saga; the subscription query blocks, waits for updates dispatched from the saga, and returns the result to the client.
That worked a treat for reporting saga completion status to the client, but now I've stumbled upon another, seemingly connected problem. Every time a client queries our system's API, we check that the client's account exists, and we do that by dispatching the corresponding query before we perform any business logic. Since I introduced the subscription query, when the client receives the response about saga completion status, it immediately sends us a query for an updated list of certain entities, but the query that checks for the account's existence fails with org.axonframework.queryhandling.NoHandlerForQueryException: No handler for query: ... This is returned by Axon Server upon sending, despite the fact that a handler is definitely registered for it and it handled exactly the same query during the client's previous request. This started to happen after I added the inner subscription query mechanism to the equation.
The error disappears if we repeat the exact same query a bit later or put a delay of a couple hundred milliseconds between the calls, but that's certainly not a solution: what if our clients start to send loads of requests simultaneously, what will happen to the account-checking query? Are we unable to process some types of query while the subscription is not closed? I close the subscription in doFinally of the Mono returned from SubscriptionQueryResult, but is there a chance that it doesn't actually get closed in Axon Server by the time the next query arrives? Or, which I think is closer to the truth, do I need to somehow tune the query handling capacity of Axon Server? The documentation is rather concise on this topic, IMHO, especially concerning queries, as opposed to events/commands.

From the sound of it, you've hit a bug, @Ivan. I assume you are on Axon Server SE? It wouldn't hurt to construct an issue there describing the problem at hand.
As far as I know, the Subscription Query does not impact the capability to send other queries. So, it is not like a Query Handler suddenly unregisters once a subscription query is active. Or, it definitely shouldn't.
By the way, would you be able to share the versions of Axon Framework and Axon Server you are using? That helps with deducing the issue but also helps others that might hit the same problem.

What is the best practice for handling asynchronous API calls that take time?

Suppose I have an API to create a cloud instance asynchronously. After I make the API call it just returns a success response, but the cloud instance has not been initialized yet. It takes 1-2 minutes to create the cloud instance, and after that its information (e.g. IP, hostname, OS) is saved to the DB, which means I have to wait 1-2 minutes before I can fetch the data again to show the cloud information. At first I tried making a loading component, but the problem is that I don't know when the cloud instance is initialized (each instance takes a different amount of time to create). I'm considering using WebSockets or cron, or should I redesign my API? Has anyone designed an asynchronous system before, and how did you handle such a case?

If the API that you call gives you no information on when it's done with its asynchronous processing, it seems to me that you'll have to check at intervals until you find that the resource is ready; i.e. to poll it.
This seems to me to roughly fit the description and intent of the Polling Consumer pattern. In general, for asynchronous systems design, I can't recommend Enterprise Integration Patterns enough.

As others noted, you can either have a notification channel using WebSockets or poll the backend. Personally I'd probably go with the latter for this case and would actually create several APIs: one for initiating the work, which returns a URL with a "job id" in it, and others where the status of the job can be polled.
RESTfully, that would look something like:
POST /instances to initiate a job
GET /instances to see all the instances that are running/created/stopped
GET /instances/<id> to see the status of a given instance (initiating, failed, running, or whatever)

WebSockets would work, but might be an overkill for this use case. I would probably display a status of 'creating' or something similar after receiving the success response from the API call, and then start polling the API to see if the creation process has finished.
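A minimal sketch of the polling side, assuming the status values mentioned above ("initiating", "running", "failed"). The fetch_status callable is a placeholder: in a real client it would issue GET /instances/<id> and return the status field, and here it is simulated so the example runs on its own.

```python
import time


def poll_until_ready(fetch_status, interval=0.01, timeout=1.0):
    """Call fetch_status() at fixed intervals until it reports a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in ("running", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        time.sleep(interval)


# Simulated backend: reports "initiating" twice, then "running".
_responses = iter(["initiating", "initiating", "running"])
final_status = poll_until_ready(lambda: next(_responses))
```

For a job that takes 1-2 minutes, an interval of a few seconds and a timeout of several minutes would be more realistic than the tiny values used here.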

HTTP Response sent before async call returns

I have yet to understand the behavior of a web server thread if I make an async call to, say, a database and immediately return a response (say, OK) to the client without waiting for the async call to return. First of all, is it a good approach? What happens to the thread that made the async call if it is reused to serve another request and then the previous async call returns to that particular thread? Or does the web server hold this thread, waiting until the async call it made returns? Then the issue would be that many hanging threads stay open and the web server is not available to take more requests. I am looking for an answer.

It depends on the way your HTTP server works. But you should be very cautious.
Let's say you have a main event loop taking care of incoming HTTP connections, and worker threads which manage the HTTP communications.
A worker thread should be considered ready to accept a new HTTP request only when it is effectively and completely ready for that.
In terms of pure HTTP, the most important thing is to avoid sending a response before having received the whole request. It seems simple, and it's usually the case. But if the request has a body, which may be a chunked body, it could take time to receive the whole message.
You should never send a response before that, unless it's something like a 400 Bad Request response followed by actually closing the TCP/IP connection. If you fail to do so and you have a message-length parsing issue, the fact that you sent a response before the end of the request may lead to security problems. It could be used to exploit differences in message parsing between your server and any other HTTP agent in front of your server (SSL terminator, reverse proxy, etc.), in some sort of HTTP smuggling issue. For that agent, the fact that you responded means you had the whole message, so it can send the next message, which you will in fact treat as just another part of the body.
Now, if you have the whole message, you can decide to send an early response and detach an asynchronous task to do the real work. But this means:
a) You have to assume that no more output will be generated. You will not try to send any output to the request issuer, and you should consider the communication closed.
b) The worker thread should not receive new requests to manage, and this is the hard part. If this thread is marked as available for a new request, it may also be killed by the thread manager (in Nginx or Apache there are request counters associated with workers, and workers are killed after reaching a limit so that fresh ones can be created). It may also receive a graceful reload command (usually it's a kill), etc.
So you start to enter a zone where you need to know the internals of the HTTP server, which may or may not be managed by you, and where changes may appear sooner or later. And you start to do very strange things, which usually lead to strange issues that are hard to reproduce.
Usually the best way to handle asynchronous tasks, while still being able to understand what happens, is to use a messaging system. Put a list of tasks in a queue, and have a parallel asynchronous worker process do the work for these tasks. Track the status of these tasks if you need it.
The same applies to the client: after receiving a very fast HTTP answer, it may need to perform some Ajax polling for the task status. And you may only have to check the status of the task in the queue to send a response.
You will get more control over the whole thing.
Personally, I really dislike having detached threads, coming from strange code, performing heavy tasks without any way of outputting a status or reporting errors, and possibly preventing a clean application shutdown (which is left waiting for strange threads to join) unless you resort to a killall.
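The queue-plus-worker arrangement with status tracking described above can be sketched as follows. This is an in-process illustration using a thread and an in-memory dictionary; the names submit, run_worker and the status strings are assumptions for the example, and a real deployment would more likely use a separate worker process and a message broker or database.

```python
import queue
import threading
import uuid

tasks = queue.Queue()
status = {}  # task id -> "queued" | "running" | "done" | "failed: <reason>"
status_lock = threading.Lock()


def set_status(task_id, value):
    with status_lock:
        status[task_id] = value


def submit(func, *args):
    """Queue a task and return an id the client can later poll for status."""
    task_id = str(uuid.uuid4())
    set_status(task_id, "queued")
    tasks.put((task_id, func, args))
    return task_id


def run_worker():
    """Process tasks one by one, recording success or failure for each."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: stop cleanly
            break
        task_id, func, args = item
        set_status(task_id, "running")
        try:
            func(*args)
            set_status(task_id, "done")
        except Exception as exc:
            set_status(task_id, f"failed: {exc}")  # errors are reported, not lost


worker = threading.Thread(target=run_worker)
worker.start()
ok_id = submit(sum, [1, 2, 3])
bad_id = submit(int, "not a number")  # raises ValueError inside the worker
tasks.put(None)
worker.join()  # clean shutdown: the worker is joined, not killed
```

The HTTP handler would call submit and return immediately, and a separate status endpoint would read the status entry for a given task id.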

It depends whether this asynchronous operation performs something which the client should be notified about.
If you return 200 OK (i.e. successfully completed) and later the asynchronous operation fails then the client will not know about the error.
You do, of course, have some options, such as sending a push notification over a WebSocket or having the client send another request that returns the actual result. So it basically depends on your needs.

Continue execution asynchronously after return statement

Is there any way to achieve this?
My actual requirement is this: I want to return success from my REST service as soon as I get the data and perform a basic action on it, and then I want to continue with some more operations on the data.
I thought of two approaches:
Threading - Currently I don't know how I would do this with threading.
Bulk update - I would schedule a task that does all this processing after maybe an hour or so.
But I am not sure how I should start implementing this.
Any help?

In the context of HTTP we have to remember that once we finish the response, the conversation is over. There is really no way to keep the request alive after you have given the client a response. You would need to implement some worker process that runs outside of HTTP that processes things "queued up" or "triggered" by an HTTP request.

In Mate, Sending two or more requests to the server simultaneously?

I'm using Mate's RemoteObjectInvoker to call methods in my FluorineFX-based API. However, all requests seem to be sent to the server sequentially. That is, if I dispatch a group of messages at the same time, the second one isn't sent until the first returns. Is there any way to change this behavior? I don't want my app to be unresponsive while a long request is processing.

This thread will help you to understand what happens (it talks about BlazeDS/LiveCycle, but I assume that Fluorine uses the same approach). In a few words, what happens is:
a) Flash Player groups all your calls into one HTTP POST.
b) The server (BlazeDS, Fluorine, etc.) receives the request and starts to execute the methods serially, one after another.
Solutions
a) Have one HTTP POST per method, instead of one HTTP POST containing all the AMF messages. For that you can use HTTPChannel instead of AMFChannels (internally it uses flash.net.URLLoader instead of flash.net.NetConnection). You will be limited to the maximum number of parallel connections defined by your browser.
b) Have only one HTTP POST but implement a clever solution on the server (it will cost you a lot of development time). Basically, you can write your own parallel processor and use message consumers/publishers to send the results of your methods to the client.
c) There is a workaround similar to a) at https://bugs.adobe.com/jira/browse/BLZ-184 - create your RemoteObject by hand and append a random id to the end of the endpoint.
