I am working on prototyping a new web service for my company and we are considering Apache Camel as our integration framework. Here is a quick run-down of the high-level architecture:
- IBM WebSphere MQ as the queuing solution
1) we receive an HTTP request
2) asynchronously persist this request
3a) do some processing on the request
3b) send to another tier for further processing
4) asynchronously update the request record in the DB
5) respond to the caller
What I want to do is:
When an HTTP request comes in, put it on a queue to be processed and wait n seconds. If the web handler doesn't get a response within n seconds, reply to the caller with a custom message.
Once the request is on the processing queue, a Camel route listening on that queue picks it up. When it pulls a message from the queue, it puts a copy of the request on a different queue to be persisted asynchronously, does some processing on the request, then sends it to another queue for further processing and waits for a response. It then puts the result back on the persist queue so the record is asynchronously updated.
Then it responds to the web listener, and the web listener responds to the web caller.
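For what it's worth, a minimal Camel sketch of that flow could look like the following (endpoint names, bean names, and timeouts here are made up; it assumes Camel's Jetty and JMS components, with the JMS component pointed at WebSphere MQ):

import org.apache.camel.ExchangeTimedOutException;
import org.apache.camel.builder.RouteBuilder;

public class GatewayRoute extends RouteBuilder {
    @Override
    public void configure() {
        // if no reply arrives within requestTimeout, answer the caller with a custom message
        onException(ExchangeTimedOutException.class)
            .handled(true)
            .setBody(constant("Request accepted, still processing"));

        // web listener: InOut over JMS, wait up to 5 seconds for the reply
        from("jetty:http://0.0.0.0:8080/service")
            .to("jms:queue:process?exchangePattern=InOut&requestTimeout=5000");

        // processing tier
        from("jms:queue:process")
            .wireTap("jms:queue:persist")       // fire-and-forget copy for async persistence
            .to("bean:requestProcessor")        // step 3a: do some processing
            .to("jms:queue:tier2?exchangePattern=InOut&requestTimeout=5000") // step 3b
            .wireTap("jms:queue:persist");      // step 4: async DB update
        // the final message body flows back to the web listener as the reply (step 5)
    }
}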
I am reading everything I can about Apache Camel and there is a lot of information out there. I might be on a little bit of information overload, and any help on the following concerns would be greatly appreciated:
1)
If the web listeners use an InOut exchange (with the first processing tier) without a replyTo queue defined, Camel will create a temporary queue for the response. What happens if the request times out? I understand I can set a requestTimeout on the exchange and, if it times out, catch that exception and set a custom message. But will that temporary queue be cleaned up? Or will temporary queues build up over time as requests time out?
2)
When it comes to scaling the processing tiers (adding more instances of the same routes on different machines), is it customary that, if the instance that picks up the response (using a fixed reply-to queue) is different from the instance that picked up the request, all the information about the original request travels inside the message, so there is no need to share data across instances (unless, of course, there is data that is genuinely shared, like aggregates)?
Any other tips and tricks when building a system like this would be very helpful.
Thanks!
I would say this solution is too complicated, and there are too many areas that will be hard both to maintain and to reason about. There are too many steps mixing async and sync communication.
Why not simplify the solution to the following steps:
Receive the HTTP request synchronously
Put a message on MQ with a reply-to header
The message is picked up and sent to the backend
If a reply is not received within a given time, the transaction is terminated
The reply-to queue is removed
The requestor is notified
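A rough sketch of those steps in plain JMS (the WebSphere MQ connection setup is omitted; the queue name and timeout handling are made up):

import javax.jms.*;

class MqRequestReply {
    String requestReply(Connection connection, String body, long timeoutMs) throws JMSException {
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        try {
            Queue requestQueue = session.createQueue("REQUEST.QUEUE");
            TemporaryQueue replyQueue = session.createTemporaryQueue();

            TextMessage msg = session.createTextMessage(body);
            msg.setJMSReplyTo(replyQueue);                    // reply-to header
            session.createProducer(requestQueue).send(msg);

            MessageConsumer consumer = session.createConsumer(replyQueue);
            Message reply = consumer.receive(timeoutMs);      // wait for the given time

            consumer.close();
            replyQueue.delete();                              // the reply-to queue is removed

            if (reply == null) {
                return "request timed out";                   // requestor is notified
            }
            return ((TextMessage) reply).getText();
        } finally {
            session.close();
        }
    }
}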
Related
I am stuck on a problem while using Kafka in a microservice architecture. I am not able to understand how a microservice handling HTTP requests will be able to send a response to the user. I want to take data from the HTTP request and publish it to a topic named A; another validator service will validate it and publish it to a topic named B. I want to send the processed data, consumed from topic B, back in the HTTP response.
In my experience, one option is to respond immediately with 202 Accepted, or embed a blocking validator library into your API and properly return a 400 Bad Request.
Future GET calls then read eventually consistent data that might come back from any consumer. E.g. if the consumer process is a Kafka Connect sink that writes to some database, or a Kafka Streams/KSQL table, future API queries will return data from that. Your original client may need to make periodic HTTP calls until that data is available.
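To make the 202-then-poll option concrete, a hedged sketch with the kafka-clients producer (topic "A" is from the question; the requestId key, the 202 response, and the /status route are my assumptions):

import java.util.Properties;
import java.util.UUID;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class StatusApi {
    private final KafkaProducer<String, String> producer;

    public StatusApi() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        this.producer = new KafkaProducer<>(props);
    }

    // inside the HTTP handler: publish and acknowledge immediately
    public String accept(String requestBody) {
        String requestId = UUID.randomUUID().toString();
        producer.send(new ProducerRecord<>("A", requestId, requestBody));
        // respond 202 Accepted with requestId; the client later calls
        // GET /status/{requestId}, which reads whatever the topic-B consumer stored
        return requestId;
    }
}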
For the last week I've been researching a lot on the microservice architecture pattern and its requirements and constraints.
The majority of resources suggest using event buses/message brokers (asynchronous communication) to communicate between microservices rather than using REST API endpoints.
Synchronous calling would result in a higher response time and may cause cascading failures if a particular microservice in the chain fails.
Question:
Let's say the user requests a particular functionality or page on a website/mobile app which then needs to fetch data from multiple microservices and use their respective functionalities to provide the desired outcome. But to achieve that outcome (the response to the client), ALL the services need to do their work before the backend sends the response back to the client (website/mobile app).
But if we use asynchronous service requests - meaning the calling service doesn't wait for a response and would send its own response back to the client without getting the data from the asynchronously called service - the outcome might be incomplete if an asynchronously called service doesn't respond in time (service unavailable or network issues). That would mean the backend sends an incomplete response back to the client, which is not acceptable.
How can I deal with this issue or did I get the concept wrong?
I'm thankful for every answer
If it's absolutely essential that a request gets a full response (i.e. that the request is synchronous), that's a strong argument in favor of the service stitching together synchronous requests and responses (and potentially needing to handle rollback in cases of partial success etc.).
Many requests don't fall into that pattern, though. For instance, a response might well be interpretable as "we've received your request and the operation will be performed. You can track the progress of your operation by using this request ID"; such an approach fits well with asynchronous messaging.
Let's imagine I have a REST API with an endpoint /api/status. When this endpoint is accessed, the API sends a message to a message queue requesting the status of some other service.
Then, in reply, the service sends a message with its status to a queue on which the REST API listens. So it's a single message to request the status and a single reply message.
My question is: Is there a design pattern for converting the asynchronous nature of this approach to a synchronous one in the API? In other words: is there a pattern that the GetStatus(...) method in the pseudo code below can implement to synchronize getting the status with communication over multiple message queues or even pub/sub systems?
var statusRequestMsg = "get_status";
var statusResponseMsg = GetStatus(statusRequestMsg);
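For illustration only (this is one common shape, not a named pattern I know of; sendToRequestQueue and onReply are hypothetical hooks into the MQ client): block the caller on a future keyed by a correlation id, and let the reply-queue listener complete it.

import java.util.UUID;
import java.util.concurrent.*;

class StatusClient {
    private final ConcurrentMap<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    String getStatus(String request) throws Exception {
        String correlationId = UUID.randomUUID().toString();
        CompletableFuture<String> reply = new CompletableFuture<>();
        pending.put(correlationId, reply);
        sendToRequestQueue(correlationId, request);   // publish "get_status" to the request queue
        try {
            return reply.get(5, TimeUnit.SECONDS);    // synchronous wait, with a timeout
        } finally {
            pending.remove(correlationId);
        }
    }

    // invoked by the listener on the reply queue
    void onReply(String correlationId, String body) {
        CompletableFuture<String> f = pending.remove(correlationId);
        if (f != null) f.complete(body);
    }

    private void sendToRequestQueue(String correlationId, String body) { /* MQ publish, omitted */ }
}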
I know how to solve this in code, but I was curious if there is a design pattern that introduces a common approach.
I googled a lot in search of one, but the only thing I found was a very technical explanation of an approach in this article:
A Communication Model to Integrate the Request-Response and the Publish-Subscribe Paradigms into Ubiquitous Systems
Please note that I understand this is not the perfect API design and that there are better ways to implement the example. I've created the above example to help illustrate my question. Also, I understand that some AMQP implementations (like RabbitMQ) provide a way to synchronize MQ communication into a request/response style.
Thanks in advance.
Microsoft calls it Async Request Reply pattern and uses a solution that polls over HTTP:
https://learn.microsoft.com/en-us/azure/architecture/patterns/async-request-reply
I imagine it should be possible to avoid polling by subscribing to updates for a key. For example, it's possible to subscribe to updates to a single key in Redis with keyspace notifications (the page mentions two caveats: that "all the events delivered during the time the client [is] disconnected are lost" and that "events' notifications are not broadcasted to all nodes").
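With Jedis that might look like this (assuming Redis is configured with notify-keyspace-events including "K$", and that the worker writes the result to a key like result:<id>; both are my assumptions):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class ResultWatcher {
    public static void main(String[] args) {
        Jedis subscriber = new Jedis("localhost", 6379);
        subscriber.psubscribe(new JedisPubSub() {
            @Override
            public void onPMessage(String pattern, String channel, String message) {
                // channel looks like __keyspace@0__:result:<id>; message is the event name, e.g. "set"
                if ("set".equals(message)) {
                    String key = channel.substring(channel.indexOf(':') + 1);  // strip "__keyspace@0__"
                    try (Jedis reader = new Jedis("localhost", 6379)) {
                        String value = reader.get(key);
                        // hand value to whatever is waiting on this id
                    }
                }
            }
        }, "__keyspace@0__:result:*");  // psubscribe blocks; run it on a dedicated thread
    }
}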
Have you considered something like this:
Request comes in
Create a correlation id
Send correlation id to other service as part of message sent via queue
Begin polling for that id in some data store (say Redis)
Time elapses...
Send correlation id back to originating service along with result of request in a message sent via queue
Worker reading queue sets value of correlation id in data store to result of asynchronous request
Polling discovers result and returns it as the response to the request
Would that work? A rough sketch of the polling side follows.
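Something like this with Jedis (the queue-send helper is hypothetical; key naming and timeouts are made up):

import java.util.UUID;
import redis.clients.jedis.Jedis;

class PollingHandler {
    String handleRequest(String payload) throws InterruptedException {
        String correlationId = UUID.randomUUID().toString();
        sendViaQueue(correlationId, payload);                   // send id + payload via the queue

        try (Jedis redis = new Jedis("localhost", 6379)) {
            long deadline = System.currentTimeMillis() + 10_000;    // overall timeout
            while (System.currentTimeMillis() < deadline) {
                String result = redis.get("result:" + correlationId);  // poll the data store
                if (result != null) {
                    return result;                              // result discovered; return it
                }
                Thread.sleep(100);                              // polling interval
            }
        }
        return "timed out";
    }

    void sendViaQueue(String correlationId, String payload) { /* MQ publish, omitted */ }
}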
Do I need to do anything to make all requests asynchronous or are they automatically handled that way?
I ran some tests and it appears that each request comes in on its own thread, but I figure better to ask as I might have tested wrong.
Update: (I have a bad habit of not explaining fully - sorry.) Here's my concern. A client browser makes a REST request to my server at http://data.domain.com/employee_database/?query=state:Colorado. That comes in to the appropriate method in the controller. That method queries the database and returns an object, which is then turned into a JSON structure and returned to the calling app.
Now let's say 10,000 clients all make a similar query to the same server. So I have 10,000 requests coming in at once. Will my controller method be called simultaneously in 10,000 distinct threads? Or must the first request return before the second request is called?
I'm not asking about the code in my handler method having asynchronous components. For my case the request becomes a single SQL query so the code has nothing that can be handled asynchronously. And until I get the requested data, I can't return from the method.
No, REST is not async by default; the requests are handled synchronously. However, your web server (IIS) has a max-threads setting controlling how many requests can be worked on at the same time, and it maintains a queue of received requests. So a request goes into the queue and, if a thread is available, gets executed; otherwise the request waits in the IIS queue until a thread is available.
I think you should be using async IO for operations such as the database calls in your case. Yes, in Web API every request gets its own thread, but threads can run out if there are many concurrent requests. Threads also use memory, so if your API gets hit by too many requests it may put pressure on your system.
The benefit of using async over sync is that you use your system resources wisely. Instead of blocking the thread while it waits for the database call to complete, as the sync implementation does, async frees the thread to handle more requests or assigns it to whatever process needs a thread. Once the IO (database) call completes, another thread will pick up from there and continue with the implementation. Async will also make your API run faster if your IO operations take longer to complete.
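The answers here are about ASP.NET, but the principle is framework-agnostic. The same idea in Java servlet terms, as a sketch (note that supplyAsync over a blocking JDBC call would just move the blocking to another pool thread - the async-over-sync trap mentioned below - so a truly non-blocking driver is needed for the full benefit):

import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import javax.servlet.AsyncContext;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.*;

@WebServlet(urlPatterns = "/employees", asyncSupported = true)
public class EmployeeServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
        AsyncContext ctx = req.startAsync();          // release the container thread
        queryDatabaseAsync(req.getParameter("query"))
            .thenAccept(json -> {
                try {
                    ctx.getResponse().setContentType("application/json");
                    ctx.getResponse().getWriter().write(json);
                } catch (IOException e) {
                    ((HttpServletResponse) ctx.getResponse()).setStatus(500);
                }
                ctx.complete();                       // finish the response on a worker thread
            });
    }

    // stand-in for a genuinely non-blocking database call
    private CompletableFuture<String> queryDatabaseAsync(String query) {
        return CompletableFuture.supplyAsync(() -> "{\"query\":\"" + query + "\"}");
    }
}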
To be honest, your question is not very clear. If you are making an HTTP GET using HttpClient, say with the GetAsync method, the request is fired and you can do whatever you want on your thread until you get the response back. So that request is asynchronous.
If you are asking about the server side which handles this request (assuming it is ASP.NET Web API), then asynchronous or not is up to how you implemented your Web API. If your action method does three things, say 1, 2, and 3, one after the other synchronously in blocking mode, the same thread is going to service the whole request.
On the other hand, say #2 above is a call to a web service, i.e. an HTTP call. If you use HttpClient and make that call asynchronously, using the async keyword, you can get into a situation where one request is serviced by more than one thread: when you hit await inside the action method, execution returns and the thread servicing your request is free to service some other request; ultimately, when the response is available, the same or some other thread continues from where it left off. A long answer, perhaps, but difficult to explain just by typing. Hope you get some clarity.
UPDATE:
Your action method will execute in parallel on 10,000 threads (ideally). I say ideally because a CLR thread pool with 10,000 threads is not typical and probably impractical as well. There are physical limits as well as limits imposed by the framework, but I guess the answer to your question is that the requests will be serviced in parallel. The correct term here is 'parallel', not 'async'.
Whether it is sync or async is your choice. You choose by the way you write your action: if you return a Task and also use async IO under the hood, it is async; in other cases it is synchronous.
Don't feel tempted to slap async on your action and use Task.Run. That is async-over-sync (a known anti-pattern). It must be truly async all the way down to the OS kernel.
No framework can make sync IO automatically async, so it cannot happen under the hood. Async IO is callback-based which is a severe change in programming model.
This does not answer what you should do of course. That would be a new question.
I'm new to web development so I'm not sure what's the best option for the problem that I'm having.
Basically I have a web application that calls a web service for processing some data.
This process may take a long time (hours) and I would like to know if there is an easy way to send some status information to the client from time to time.
Right now, the client makes the request from the browser and it just waits there until it finishes.
How can I send some information from the web service? I would like to send a percentage and some additional text specifying what is being done.
Thanks
WCF service operations can be marked as one-way (IsOneWay = true on the OperationContract) so that they don't return a response.
Or you could have the service kick off the process in an async manner and then just return to the client whether the process has or hasn't kicked off.
Then, the client can poll another method as the other user has suggested.
If your process takes hours you definitely can't use a sync service, because you'll hit your execution timeout, or rather the connection timeout for the client.
Maybe you can poll another method for status?
If I were you, I would make the original request asynchronous: instead of waiting for the response, it just "starts" the task and returns immediately. Then I would have a separate method on your web service that the app can poll periodically to get the status of the job. Once it completes, the app can display the data like the original request was doing.
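A toy sketch of that shape (in-memory job registry; a real service would persist status somewhere shared so any instance can answer the poll):

import java.util.UUID;
import java.util.concurrent.*;

class JobService {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final ConcurrentMap<String, String> status = new ConcurrentHashMap<>();

    // "start" endpoint: kick off the long task and return immediately
    String start() {
        String jobId = UUID.randomUUID().toString();
        status.put(jobId, "0%");
        pool.submit(() -> {
            for (int pct = 10; pct <= 100; pct += 10) {
                doSomeWork();                    // one slice of the hours-long process
                status.put(jobId, pct + "%");    // progress the client can poll for
            }
        });
        return jobId;
    }

    // "status" endpoint: the client polls this periodically
    String poll(String jobId) {
        return status.getOrDefault(jobId, "unknown job");
    }

    private void doSomeWork() { /* long processing, elided */ }
}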
If you want to do it synchronously, you can turn off Response.Buffer and write directly to the response.