Which is better in this case - sync or async web service? - asynchronous

I'm setting up a web service in Axis2 whose job will be to take a bunch of XML and put it on a queue to be processed later. I understand it's possible to set up a client to invoke a synchronous web service asynchronously by using an "invokeNonBlocking" operation on the "Call" instance. (ref http://onjava.com/pub/a/onjava/2005/07/27/axis2.html?page=4)
So, my question is: is there any advantage to using an asynchronous web service in this case? It seems redundant because 1) the client isn't blocked and 2) the service has to accept the XML and write it to the queue regardless of whether it's synchronous or asynchronous.

In my opinion, asynchronous is the appropriate way to go. A couple of things to consider:
Do you have multiple clients accessing this service at any given moment?
How often is this process occurring?
It does take a little more effort to implement the async methods, but I guarantee that in the end you will be much happier with the result. For one, you don't have to manage threading. Your primary concern might just be the volatility of the data in the queue (i.e. race/deadlock conditions).

A "sync call" seems appropriate, I agree.
If the request from the client isn't time consuming, then I don't see the advantage either in making the call asynchronous. From what I understand of the situation in question here, the web-service will perform its "processing" against the request some time in the future.
If, on the contrary, the request had required a time-consuming process, then an async call would have been appropriate.
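To make the "enqueue now, process later" argument concrete, here is a minimal Java sketch of a synchronous operation that does nothing but hand the XML to a queue and return. The class and method names (XmlIntakeService, submit, takeNextOrNull) are hypothetical, and the in-memory BlockingQueue is only an assumption standing in for whatever queue the real service writes to; nothing here is Axis2-specific.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for the service class; names are illustrative only.
class XmlIntakeService {
    // In-memory queue used purely for illustration; a real deployment would
    // more likely hand off to JMS or another durable queue.
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>(1000);

    // Synchronous operation: enqueue and return immediately.
    // offer() returns false instead of blocking when the queue is full.
    boolean submit(String xml) {
        if (xml == null || xml.isEmpty()) {
            return false;
        }
        return pending.offer(xml);
    }

    // Called later by whatever does the actual processing.
    String takeNextOrNull() {
        return pending.poll();
    }

    int backlog() {
        return pending.size();
    }

    public static void main(String[] args) {
        XmlIntakeService service = new XmlIntakeService();
        System.out.println(service.submit("<order id=\"1\"/>"));
        System.out.println(service.backlog());
    }
}
```

Because submit only does a bounded, non-blocking enqueue, the time the caller spends waiting is small whether or not the service itself is asynchronous, which is the point made above.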

After ruminating some more about it, I'm thinking that the service should be asynchronous. The reason is that it would put the task of writing the data to the queue into a separate thread, thus lessening the chances of a timeout. It makes the process more complicated, but if I can avoid a timeout, then it's got to be done.

Related

How bad is it to run an entire HTTP action method in separate thread using Task::Run()?

I'm writing web services in C++/CLI (not my choice) using Microsoft's Web API. A lot of functions in Web API are async, but because I'm using C++/CLI, I don't get the async/await support of C# or VB. So the fallback position is to use ContinueWith() to schedule a continuation delegate for reading the async task's result safely.
However, because C++/CLI also doesn't support inline anonymous delegates or managed lambdas, every delegate continuation must be written as a separate function somewhere. That quickly turns into spaghetti with the number of async functions in Web API.
So, to avoid the deadlock issues of Task<T>::Result, I've been trying this:
[HttpGet, Route( "get/some/dto" )]
Task< SomeDTO ^ > ^ MyActionMethod()
{
    return Task::Run( gcnew Func< SomeDTO ^ >( this, &MyController::MyActionMethod2 ) );
}

SomeDTO ^ MyActionMethod2()
{
    // execute code and use any task->Result calls I need without deadlocking
}
Okay, so I know this isn't great, but how bad is it? I don't yet understand enough of the guts of Web API or ASP.NET to comprehend the performance or scaling ramifications this will have.
Also, what other consequences may this have that aren't necessarily related to performance? For example, exceptions get wrapped in an extra AggregateException, which represents additional complexity and work for handling exceptions.
Your memory usage will increase with your application's parallelism. For every concurrent call to MyActionMethod you will need a separate thread with its own stack. That will cost you about 1 MB of RAM for each concurrent call. If MyActionMethod runs long enough so that 10000 instances run at once, you're looking at 10 GB of RAM. There is also CPU overhead in setting up each thread.
If concurrency is low, dropping async support won't be a problem. In that case, don't bother with Task::Run. Just change MyActionMethod to return SomeDTO^ (no Task wrapper).
Another potential concern is that you lose easy use of cancellation tokens. However, for Web API it's usually fine to just let an exception propagate back to Web API, which ends up cancelling the synchronous call anyway.
Finally, if you were planning on performing any operation within your action method in parallel, you'll still need to use ContinueWith to accomplish that. Going non-async by default means you'll always perform one operation at a time. Fortunately, it's often just fine to do so.
"Okay, so I know this isn't great, but how bad is it?"
It's difficult to answer this without load-testing your specific scenario. But you can walk through the known semantics (taken largely from my blog).
First, when a request comes in, ASP.NET executes your handler on a thread pool thread within that request context. Your request handler calls Task.Run, which takes another thread from the thread pool and executes the actual request logic on it. The handler then returns the task returned from Task.Run; this releases the original request thread back to the thread pool.
Then, the Task.Run delegate will block on any asynchronous parts. So, this pattern has the scaling disadvantages of a regular synchronous handler, plus an extra thread context switch. Also, it uses a thread from the ASP.NET thread pool, which is not necessarily a bad thing, but in some scenarios it may throw off the ASP.NET thread pool heuristics.
"Also, what other consequences may this have that aren't necessarily related to performance? For example, exceptions get wrapped in an extra AggregateException, which represents additional complexity and work for handling exceptions."
Yes, the exceptions from any .Result or Wait() calls will be wrapped in AggregateException. You may be able to avoid this by calling .GetAwaiter().GetResult() instead.
Another important consideration is that the code executing within the Task.Run is executing without a request context. So, ambient data like HttpContext.Current, current culture, thread principal, etc. are not going to be set correctly. You'll have to capture any important data before calling Task.Run and pass it down manually.
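The "capture it before you hand off" advice can be illustrated with a small Java analogy. This is not Web API code: a ThreadLocal stands in for ambient data such as HttpContext.Current, an ExecutorService stands in for Task.Run, and the names ContextCapture, CURRENT_USER and runOffloaded are made up for the sketch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ContextCapture {
    // Stand-in for ambient per-request data such as the current user.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs work on a pool thread and reports what it could see there:
    // element 0 is the ambient value on the pool thread,
    // element 1 is the value captured on the calling thread.
    static String[] runOffloaded(String user) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CURRENT_USER.set(user);
        try {
            // Capture the ambient value on the request thread...
            final String captured = CURRENT_USER.get();
            Future<String[]> result = pool.submit(() -> new String[] {
                // ...because the pool thread's own ambient slot is empty.
                String.valueOf(CURRENT_USER.get()),
                captured
            });
            return result.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            CURRENT_USER.remove();
            pool.shutdown();
        }
    }
}
```

Calling ContextCapture.runOffloaded("alice") yields "null" for the ambient lookup on the pool thread and "alice" for the captured value, which mirrors why data has to be read on the request thread and passed down explicitly.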

Web API 2 - are all REST requests asynchronous?

Do I need to do anything to make all requests asynchronous or are they automatically handled that way?
I ran some tests and it appears that each request comes in on its own thread, but I figure better to ask as I might have tested wrong.
Update: (I have a bad habit of not explaining fully - sorry) Here's my concern. A client browser makes a REST request to my server of http://data.domain/com/employee_database/?query=state:Colorado. That comes in to the appropriate method in the controller. That method queries the database and returns an object which is then turned into a JSON structure and returned to the calling app.
Now let's say 10,000 clients all make a similar query to the same server. So I have 10,000 requests coming in at once. Will my controller method be called simultaneously in 10,000 distinct threads? Or must the first request return before the second request is called?
I'm not asking about the code in my handler method having asynchronous components. For my case the request becomes a single SQL query so the code has nothing that can be handled asynchronously. And until I get the requested data, I can't return from the method.
No, REST is not async by default; requests are handled synchronously. However, your web server (IIS) has a maximum-threads setting that limits how many requests can run at the same time, and it maintains a queue of the requests it has received. A request goes into the queue: if a thread is available it gets executed, otherwise it waits in the IIS queue until a thread becomes available.
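The queueing behaviour described here can be sketched with a fixed-size thread pool in Java. This is only an analogy for the IIS/ASP.NET thread limit, not a model of it: BoundedPoolDemo and peakConcurrency are hypothetical names, and the 20 ms sleep is an assumed stand-in for a blocking SQL query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedPoolDemo {
    // Submits `requests` blocking tasks to a pool of `threads` workers and
    // returns the highest number that were ever running at the same time.
    static int peakConcurrency(int threads, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < requests; i++) {
                futures.add(pool.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(20); // stand-in for a blocking SQL query
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every queued request to finish
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }
}
```

With peakConcurrency(2, 6), all six tasks complete, but never more than two run at once; the other four simply wait in the pool's queue until a worker is free.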
I think you should be using async IO operations, such as async database calls, in your case. Yes, in Web API every request gets its own thread, but threads can run out if there are many concurrent requests. Threads also use memory, so if your API gets hit by too many requests it may put pressure on your system.
The benefit of async over sync is that you use your system resources wisely. Instead of blocking the thread while it waits for the database call to complete, as the sync implementation does, the async implementation frees the thread to handle more requests or do whatever other work needs a thread. Once the IO (database) call completes, another thread picks up from there and continues. Async will not make an individual request faster, but it can let your API serve more requests when your IO operations take a long time to complete.
To be honest, your question is not very clear. If you are making an HTTP GET using HttpClient, say with the GetAsync method, the request is fired and you can do whatever you want on your thread until you get the response back. So this request is asynchronous.
If you are asking about the server side that handles this request (assuming it is ASP.NET Web API), then whether it is asynchronous depends on how you implemented your web API. If your action method does three things, say 1, 2, and 3, one after the other synchronously in blocking mode, the same thread services the whole request. On the other hand, say #2 is an HTTP call to a web service. If you use HttpClient and make that call asynchronously, one request can end up being serviced by more than one thread. For that to happen, you must make the HTTP call from your action method asynchronously and mark the method with the async keyword. In that case, when you call await inside the action method, the action method returns and the thread servicing your request is free to service some other request; when the response is available, the same or some other thread continues from where execution left off.
A long, boring answer, perhaps, but this is difficult to explain just in words. Hope you get some clarity.
UPDATE:
Your action method will execute in parallel in 10,000 threads (ideally). I say ideally because a CLR thread pool with 10,000 threads is not typical and is probably impractical. There are physical limits as well as limits imposed by the framework, but I guess the answer to your question is that the requests will be serviced in parallel. The correct term here is 'parallel', not 'async'.
Whether it is sync or async is your choice. You choose by the way you write your action. If you return a Task, and also use async IO under the hood, it is async. In other cases it is synchronous.
Don't feel tempted to slap async on your action and use Task.Run. That is async-over-sync (a known anti-pattern). It must be truly async all the way down to the OS kernel.
No framework can make sync IO automatically async, so it cannot happen under the hood. Async IO is callback-based which is a severe change in programming model.
This does not answer what you should do of course. That would be a new question.

Invoking timed tasks in asynchronous JAX-RS requests

I've joined a project that uses JAX-RS (and originally there was quite a bit of Spring-based Controller code in there too, but all URL handlers use JAX-RS now). Now we want to be able to fill a queue of tasks that should be run with a small delay between each of them. The delay can be specified in ms. I've avoided Thread.sleep, as I've heard you should not manage threads manually in Java EE. Before I came in, a busy-wait loop had already been implemented.
I would like to switch this to an asynchronous background task. I could of course let the client poll the server with the given delay, and just have an AsyncResponse that can be resumed. But can the same AsyncResponse be resumed/suspended multiple times? The resource does have state, so it would be possible to drop the asynchrony completely and just do client polling to handle all of it.
A lot of example code for showing off asynchronous tasks use Thread.sleep. How bad is it to do this in a background task on an ExecutorService or something similar?
The point of the delay is to simulate human interaction, and post a long list of JMS messages to a queue but ensure that two listeners don't pick up and handle messages that depend on one another.
Is it easier/better to handle this on the client side rather than the server side? Writing some JavaScript that handles all the polling would be quite simple, so if this seems like a bad idea for handling on the server side, it's not that big a deal.
The tool is only going to be used by a single user, as it's a developer testing tool. Therefore we went for solving this on the client side, pushing the messages onto the queue through AJAX calls. This works fine for our purposes, but if anyone has a solution that might help someone else, feel free to drop a new answer.
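For anyone who does want to handle the spacing on the server, one possible approach is to let a scheduler apply the delays instead of sleeping or busy-waiting. The sketch below uses the plain java.util.concurrent ScheduledExecutorService; inside a Java EE 7+ container the managed equivalent would be ManagedScheduledExecutorService, which is not shown. DelayedSender and sendSpaced are hypothetical names, and adding to a list is an assumed stand-in for posting a JMS message.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DelayedSender {
    // Schedules one "send" per message, spaced delayMs apart, and returns
    // the messages in the order they were actually sent.
    static List<String> sendSpaced(List<String> messages, long delayMs) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        List<String> sent = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(messages.size());
        try {
            for (int i = 0; i < messages.size(); i++) {
                final String message = messages.get(i);
                // Message i fires after i * delayMs; no thread sleeps or spins.
                scheduler.schedule(() -> {
                    sent.add(message); // stand-in for posting to the JMS queue
                    done.countDown();
                }, i * delayMs, TimeUnit.MILLISECONDS);
            }
            done.await(); // block only so the sketch can return the result
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdown();
        }
        return sent;
    }
}
```

In a real resource method you would not wait on the latch; the request could return (or resume an AsyncResponse) while the scheduler keeps working through the list in the background.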

Can two parallel WCF requests get handled by the same thread when ConcurrencyMode = Multiple?

I have a WCF service with ServiceBehavior(InstanceContextMode = InstanceContextMode.Single, ConcurrencyMode = ConcurrencyMode.Multiple). I want to use a ThreadStatic variable to store data.
I'm worried that two parallel requests, for the same or different operation contracts, could get handled by the same thread on the server side, because if that happens my ThreadStatic variable will get overwritten (i.e. something like the thread changing between HttpHandlers and HttpModules in ASP.NET).
I made a spike service with the same ServiceBehavior and maxConcurrentCalls="2". A WCF client then called the service with 50 parallel requests, and the problem did not occur. However, this is not 100% proof.
Thanks in advance!
Irrespective of the ConcurrencyMode, a ThreadStatic value will persist when your request terminates and the thread is returned to the thread pool. The same thread can be reused for a subsequent request, which will therefore be able to see your ThreadStatic value.
Obviously this won't be true for two concurrent requests, because by definition they will be executed on different threads.
From comments:
Also by definition MSDN says: 'The service instance is multi-threaded. No synchronization guarantees are made. Because other threads can change your service object at any time, you must handle synchronization and state consistency at all times.' So it is not so obvious:)
This means that a single instance of your service class can be accessed concurrently by multiple requests. So you would need to handle synchronization for any accesses to instance members of the service class.
However ThreadStatic members are by definition only used by one thread (and hence one request) at a time, so don't need synchronization.
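The point about a value surviving on a pooled thread after the request ends can be shown with a Java analogy, where ThreadLocal plays the role of ThreadStatic and a one-thread ExecutorService plays the role of the thread pool. This is not WCF code; ThreadLocalLeak, SLOT and secondRequestSees are names invented for the sketch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalLeak {
    // Plays the role of a [ThreadStatic] field.
    static final ThreadLocal<String> SLOT = new ThreadLocal<>();

    // Runs two "requests" one after the other on a one-thread pool and
    // returns what the second request finds in the thread-local slot.
    static String secondRequestSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // First request: stores data, optionally clears it before finishing.
            pool.submit(() -> {
                SLOT.set("request-1 data");
                if (cleanUp) {
                    SLOT.remove();
                }
            }).get();
            // Second request: reuses the same pooled thread.
            Callable<String> read = () -> SLOT.get();
            return pool.submit(read).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without cleanup the second request sees "request-1 data" left behind by the first; with cleanup it sees null. The two requests never run at the same time on that thread, so no synchronization is needed, but stale data is a real risk unless each request resets the slot.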
The direct answer to your question is Joe's answer.
However you mention in the comments you are using an ambient design pattern. That pattern is already implemented in WCF as the OperationContext and is specifically designed to be extensible. I highly recommend using OperationContext over any custom thread storage.
See Where to store data for current WCF call? Is ThreadStatic safe?
I wanted to add to Joe's answer here because I would recommend that you use some sort of correlation for your requests if you're needing to store state. The threading model will become very convoluted and unreliable in production.
Further, now imagine you have two IIS servers hosting this service and a hardware or software load balancer forward facing so that you can consume it. To ensure that the correct state is gathered you'll need correlation because you never know which server the service will be started on. In the post below I mocked up a simplified version of how that might work. One thing to keep in mind is that the SessionState would need to be kept in a shared location to all instances of the service, an AppFabric Cache server for example.
Global Variable between two WCF Methods

Asynchronous calls in ASP.NET

In this sample, two threads are created: a worker thread created by BeginInvoke and an I/O completion thread created by the SendAsync method.
But another author, in his UnsafeQueueNativeOverlapped example, does not recommend this.
I want to use SendAsync (or another ...Async method) in an ASP.NET page, and I want to use PageAsyncTask.
However, its BeginEventHandler requires an IAsyncResult to be returned, which SendAsync does not return.
AFAIK, the event-based async pattern is the most recommended way, so how can we call SendAsync or any ...Async method without creating two threads and hurting performance?
Actually, using BeginInvoke and EndInvoke on delegates, or ThreadPool.QueueUserWorkItem, will not make any difference in your application, because they use the same thread pool that ASP.NET uses through IIS.
So you have only two solutions for making async calls. First, you can write your own threading classes (but be careful).
Second, use PageAsyncTask (recommended); it is much safer and is designed to work well with ASP.NET.
It's not so much about hurting performance as about how and when to use async tasks.
If your process really takes a long time to finish (IIS waits until all operations finish, even the async ones, before it starts rendering), then you should go with an async solution; otherwise going async will cost you performance rather than help.
Note:
There is a difference between AddOnPreRenderCompleteAsync and RegisterAsyncTask.
In their implementation they look the same, but with the second one:
You have access to the current HTTP context, impersonation, culture, profile data, etc.
You can run many tasks in parallel.
You have a timeout event, and you can set the timeout in the page attribute.
You can call RegisterAsyncTask several times in one request to register several async operations.
