How bad is it to run an entire HTTP action method in separate thread using Task::Run()? - asp.net

I'm writing web services in C++/CLI (not my choice) using Microsoft's Web API. A lot of functions in Web API are async, but because I'm using C++/CLI, I don't get the async/await support of C# or VB. So the fallback position is to use ContinueWith() to schedule a continuation delegate for reading the async task's result safely.
However, because C++/CLI also doesn't support inline anonymous delegates or managed lambdas, every delegate continuation must be written as a separate function somewhere. That quickly turns into spaghetti with the number of async functions in Web API.
So, to avoid the deadlock issues of Task<T>::Result, I've been trying this:
[HttpGet, Route( "get/some/dto" )]
Task< SomeDTO ^ > ^ MyActionMethod()
{
return Task::Run( gcnew Func< SomeDTO ^ >( this, &MyController::MyActionMethod2 ) );
}
SomeDTO ^ MyActionMethod2()
{
// execute code and use any task->Result calls I need without deadlocking
}
Okay, so I know this isn't great, but how bad is it? I don't yet understand enough of the guts of Web API or ASP.NET to comprehend the performance or scaling ramifications this will have.
Also, what other consequences may this have that aren't necessarily related to performance? For example, exceptions get wrapped in an extra AggregateException, which represents additional complexity and work for handling exceptions.

Your memory usage will increase with your application's parallelism. For every concurrent call to MyActionMethod you will need a separate thread with its own stack. That will cost you about 1 MB of RAM for each concurrent call. If MyActionMethod runs long enough so that 10000 instances run at once, you're looking at 10 GB of RAM. There is also CPU overhead in setting up each thread.
If concurrency is low, dropping async support won't be a problem. In that case, don't bother with Task::Run. Just change MyActionMethod to return SomeDTO^ (no Task wrapper).
Another potential concern is that lose easy use of cancellation tokens. However, for Web API it's usually fine to just let an exception propagate back to Web API, which ends up cancelling the synchronous call anyway.
Finally, if you were planning on performing any operation within your action method in parallel, you'll still need to use ContinueWith to accomplish that. Going non-async by default means you'll always perform one operation at a time. Fortunately, it's often just fine to do so.

Okay, so I know this isn't great, but how bad is it?
It's difficult to answer this without load-testing your specific scenario. But you can walk through the known semantics (taken largely from my blog).
First, when a request comes in, ASP.NET executes your handler on a thread pool thread within that request context. Your request handler calls Task.Run, which takes another thread from the thread pool and executes the actual request logic on it. The handler then returns the task returned from Task.Run; this releases the original request thread back to the thread pool.
Then, the Task.Run delegate will block on any asynchronous parts. So, this pattern has the scaling disadvantages of a regular synchronous handler, plus an extra thread context switch. Also, it uses a thread from the ASP.NET thread pool, which is not necessarily a bad thing, but in some scenarios it may throw off the ASP.NET thread pool heuristics.
Also, what other consequences may this have that aren't necessarily related to performance? For example, exceptions get wrapped in an extra AggregateException, which represents additional complexity and work for handling exceptions.
Yes, the exceptions from any .Result or Wait() calls will be wrapped in AggregateException. You may be able to avoid this by calling .GetAwaiter().GetResult() instead.
Another important consideration is that the code executing within the Task.Run is executing without a request context. So, ambient data like HttpContext.Current, current culture, thread principal, etc. are not going to be set correctly. You'll have to capture any important data before calling Task.Run and pass it down manually.

Related

Web API 2 - are all REST requests asynchronous?

Do I need to do anything to make all requests asynchronous or are they automatically handled that way?
I ran some tests and it appears that each request comes in on its own thread, but I figure better to ask as I might have tested wrong.
Update: (I have a bad habit of not explaining fully - sorry) Here's my concern. A client browser makes a REST request to my server of http://data.domain/com/employee_database/?query=state:Colorado. That comes in to the appropriate method in the controller. That method queries the database and returns an object which is then turned into a JSON structure and returned to the calling app.
Now let's say 10,000 clients all make a similar query to the same server. So I have 10,000 requests coming in at once. Will my controller method be called simultaneously in 10,000 distinct threads? Or must the first request return before the second request is called?
I'm not asking about the code in my handler method having asynchronous components. For my case the request becomes a single SQL query so the code has nothing that can be handled asynchronously. And until I get the requested data, I can't return from the method.
No REST is not async by default. the request are handled synchronously. However, your web server (IIS) has a number of max threads setting which can work at the same time, and it maintains a queue of the request received. So, the request goes in the queue and if a thread is available it gets executed else, the request waits in the IIS queue till a thread is available
I think you should be using async IO/operations such as database calls in your case. Yes in Web Api, every request has its own thread, but threads can run out if there are many consecutive requests. Also threads use memory so if your api gets hit by too many request it may put pressure on your system.
The benefit of using async over sync is that you use your system resources wisely. Instead of blocking the thread while it is waiting for the database call to complete in sync implementation, the async will free the thread to handle more requests or assign it what ever process needs a thread. Once IO (database) call completes, another thread will take it from there and continue with the implementation. Async will also make your api run faster if your IO operations take longer to complete.
To be honest, your question is not very clear. If you are making an HTTP GET using HttpClient, say the GetAsync method, request is fired and you can do whatever you want in your thread until the time you get the response back. So, this request is asynchronous. If you are asking about the server side, which handles this request (assuming it is ASP.NET Web API), then asynchronous or not is up to how you implemented your web API. If your action method, does three things, say 1, 2, and 3 one after the other synchronously in blocking mode, the same thread is going to the service the request. On the other hand, say #2 above is a call to a web service and it is an HTTP call. Now, if you use HttpClient and you make an asynchronous call, you can get into a situation where one request is serviced by more than one thread. For that to happen, you should have made the HTTP call from your action method asynchronously and used async keyword. In that case, when you call await inside the action method, your action method execution returns and the thread servicing your request is free to service some other request and ultimately when the response is available, the same or some other thread will continue from where it was left off previously. Long boring answer, perhaps but difficult to explain just through words by typing, I guess. Hope you get some clarity.
UPDATE:
Your action method will execute in parallel in 10,000 threads (ideally). Why I'm saying ideally is because a CLR thread pool having 10,000 threads is not typical and probably impractical as well. There are physical limits as well as limits imposed by the framework as well but I guess the answer to your question is that the requests will be serviced in parallel. The correct term here will be 'parallel' but not 'async'.
Whether it is sync or async is your choice. You choose by the way to write your action. If you return a Task, and also use async IO under the hood, it is async. In other cases it is synchronous.
Don't feel tempted to slap async on your action and use Task.Run. That is async-over-sync (a known anti-pattern). It must be truly async all the way down to the OS kernel.
No framework can make sync IO automatically async, so it cannot happen under the hood. Async IO is callback-based which is a severe change in programming model.
This does not answer what you should do of course. That would be a new question.

Starting and Forgetting an Async task in MVC Action

I have a standard, non-async action like:
[HttpPost]
public JsonResult StartGeneratePdf(int id)
{
PdfGenerator.Current.GenerateAsync(id);
return Json(null);
}
The idea being that I know this PDF generation could take a long time, so I just start the task and return, not caring about the result of the async operation.
In a default ASP.Net MVC 4 app this gives me this nice exception:
System.InvalidOperationException: An asynchronous operation cannot be started at this time. Asynchronous operations may only be started within an asynchronous handler or module or during certain events in the Page lifecycle. If this exception occurred while executing a Page, ensure that the Page is marked <%# Page Async="true" %>.
Which is all kinds of irrelevant to my scenario. Looking into it I can set a flag to false to prevent this Exception:
<appSettings>
<!-- Allows throwaway async operations from MVC Controller actions -->
<add key="aspnet:AllowAsyncDuringSyncStages" value="true" />
</appSettings>
https://stackoverflow.com/a/15230973/176877
http://msdn.microsoft.com/en-us/library/hh975440.aspx
But the question is, is there any harm by kicking off this Async operation and forgetting about it from a synchronous MVC Controller Action? Everything I can find recommends making the Controller Async, but that isn't what I'm looking for - there would be no point since it should always return immediately.
Relax, as Microsoft itself says (http://msdn.microsoft.com/en-us/library/system.web.httpcontext.allowasyncduringsyncstages.aspx):
This behavior is meant as a safety net to let you know early on if
you're writing async code that doesn't fit expected patterns and might
have negative side effects.
Just remember a few simple rules:
Never await inside (async or not) void events (as they return immediately). Some WebForms Page events support simple awaits inside them - but RegisterAsyncTask is still the highly preferred approach.
Don't await on async void methods (as they return immediately).
Don't wait synchronously in the GUI or Request thread (.Wait(), .Result(), .WaitAll(), WaitAny()) on async methods that don't have .ConfigureAwait(false) on root await inside them, or their root Task is not started with .Run(), or don't have the TaskScheduler.Default explicitly specified (as the GUI or Request will thus deadlock).
Use .ConfigureAwait(false) or Task.Run or explicitly specify TaskScheduler.Default for every background process, and in every library method, that does not need to continue on the synchronization context - think of it as the "calling thread", but know that it is not one (and not always on the same one), and may not even exist anymore (if the Request already ended). This alone avoids most common async/await errors, and also increases performance as well.
Microsoft just assumed you forgot to wait on your task...
UPDATE: As Stephen clearly (pun not intended) stated in his answer, there is an inherit but hidden danger with all forms of fire-and-forget when working with application pools, not solely specific to just async/await, but Tasks, ThreadPool, and all other such methods as well - they are not guaranteed to finish once the request ends (app pool may recycle at any time for a number of reasons).
You may care about that or not (if it's not business-critical as in the OP's particular case), but you should always be aware of it.
The InvalidOperationException is not a warning. AllowAsyncDuringSyncStages is a dangerous setting and one that I would personally never use.
The correct solution is to store the request to a persistent queue (e.g., an Azure queue) and have a separate application (e.g., an Azure worker role) processing that queue. This is much more work, but it is the correct way to do it. I mean "correct" in the sense that IIS/ASP.NET recycling your application won't mess up your processing.
If you absolutely want to keep your processing in-memory (and, as a corollary, you're OK with occasionally "losing" reqeusts), then at least register the work with ASP.NET. I have source code on my blog that you can drop in your solution to do this. But please don't just grab the code; please read the entire post so it's clear why this is still not the best solution. :)
The answer turns out to be a bit more complicated:
If what you're doing, as in my example, is just setting up a long-running async task and returning, you don't need to do more than what I stated in my question.
But, there is a risk: If someone expanded this Action later where it made sense for the Action to be async, then the fire and forget async method inside it is going to randomly succeed or fail. It goes like this:
The fire and forget method finishes.
Because it was fired from inside an async Task, it will attempt to rejoin that Task's context ("marshal") as it returns.
If the async Controller Action has completed and the Controller instance has since been garbage collected, that Task context will now be null.
Whether it is in fact null will vary, because of the above timings - sometimes it is, sometimes it isn't. That means a developer can test and find everything working correctly, push to Production, and it explodes. Worse, the error this causes is:
A NullReferenceException - very vague.
Thrown inside .Net Framework code you can't even step into inside of Visual Studio - usually System.Web.dll.
Not captured by any try/catch because the part of the Task Parallel Library that lets you marshal back into existing try/catch contexts is the part that's failing.
So, you'll get a mystery error where things just don't occur - Exceptions are being thrown but you're likely not privy to them. Not good.
The clean way to prevent this is:
[HttpPost]
public JsonResult StartGeneratePdf(int id)
{
#pragma warning disable 4014 // Fire and forget.
Task.Run(async () =>
{
await PdfGenerator.Current.GenerateAsync(id);
}).ConfigureAwait(false);
return Json(null);
}
So, here we have a synchronous Controller with no issues - but to ensure it still won't even if we change it to async later, we explicitly start a new Task via Run, which by default puts the Task on the main ThreadPool. If we awaited it, it would attempt to tie it back to this context, which we don't want - so we don't await it, and that gets us a nuisance warning. We disable the warning with the pragma warning disable.

Asp.net SynchronizationContext locks HttpApplication for async continuations?

This comment by Stephen Cleary says this:
AspNetSynchronizationContext is the strangest implementation. It treats Post as synchronous rather than asynchronous and uses a lock to execute its delegates one at a time.
Similarly, the article that he wrote on synchronization contexts and linked to in that comment suggests:
Conceptually, the context of AspNetSynchronizationContext is complex. During the lifetime of an asynchronous page, the context starts with just one thread from the ASP.NET thread pool. After the asynchronous requests have started, the context doesn’t include any threads. As the asynchronous requests complete, the thread pool threads executing their completion routines enter the context. These may be the same threads that initiated the requests but more likely would be whatever threads happen to be free at the time the operations complete.
If multiple operations complete at once for the same application, AspNetSynchronizationContext will ensure that they execute one at a time. They may execute on any thread, but that thread will have the identity and culture of the original page.
Digging in reflector seems to validate this as it takes a lock on the HttpApplication while invoking any callback.
Locking the app object seems like scary stuff. So my first question: Does that mean that today, all asynchronous completions for the entire app execute one at a time, even ones that originated from separate requests on separate threads with separate HttpContexts? Wouldn't this be a huge bottleneck for any apps that make 100% use of async pages (or async controllers in MVC)? If not, why not? What am I missing?
Also, in .NET 4.5, it looks like there's a new AspNetSynchronizationContext, and the old one is renamed LegacyAspNetSynchronizationContext and only used if the new app setting UseTaskFriendlySynchronizationContext is not set. So question #2: Does the new implementation change this behavior? Otherwise, I imagine with the new async/await support marshaling completions through the synchronization context, this kind of bottleneck would be noticed much more frequently going forward.
The answer to this forum post (linked from SO answer here) suggests that something fundamentally changed here, but I want to be clear on what that is and what behaviors have improved, since we have a .NET 4 MVC 3 app which is pretty much 100% async action methods making web service calls.
Let me answer your first question. In your assumption you didn't consider the fact that separate ASP.NET requests are processed by different HttpApplication objects. HttpApplication objects are stored in pool. Once you request a page, an application object is retrieved from pool and belongs to the request till its completion. So, my answer to your question:
all asynchronous completions for the entire app execute one at a time, even ones that originated from separate requests on separate threads with separate HttpContexts
is: No, they don't
Separate requests are processed by separate HttpApplication objects, locked HttpApplication will affect only single request.
Synchronization context is a powerful thing that helps developers to synchronize access to shared (in scope of request) resources. That is why all callbacks are executed under lock. Synchronization context is a heart of event-based synchronization pattern.

asynchronous calls in asp.net

in this sample, two threads created; a worker thread created by BeginInvoke and an I/O completion thread created by SendAsync method.
but another author in his UnsafeQueueNativeOverlapped example, don't recommend this.
i want to use SendAsync or ...Async in an asp.net page and i want to use PageAsyncTask.
however, its BeginEventHandler requires AsyncResult to be returned which SendAsync does not return.
afaik, event based async pattern is the most recommended way so how could we call SendAsync or any ...Async methods without creating two threads and hurting the performance?
Actually if you used the beginIvoke and endInvoke for delegates or ThreadPool.WorkerItem it will not make any difference in your application because they are using the same thread that asp.net uses throw the iis
so now you have only 2 solution to make async calls first u will write your own threading classes (but be careful )
second use the PageAsyncTasks(recommended) this one much more safe and it's designed to work perfectly with asp.net
it's not about hurting the performance as much as it's about how and when to use asnyc tasks
if your process really take much time until it finish (because IIS will wait until all processes finish even the asnyc ones then start rendering) then you have to go to async solution instead it will make a draw back in performance
Note:
there is a difference between AddOnPreRenderCompleteAsync and RegisterAsyncTask
in there implementation they look the same but in the second one
you have access to the current http
context ,impersonation, culture and
profile data etc
you can run many tasks in parallel
you have timeout event and you can
determine timeout in the page
attribute
you can call RegisterAsyncTask
several times in one request to
register several async operations

Which is better in this case - sync or async web service?

I'm setting up a web service in Axis2 whose job it will be to take a bunch of XML and put it on to a queue to be processed later. I understand its possible to set up a client to invoke a synchronous web service asynchronously by creating a using an "invokeNonBlocking" operation on the "Call" instance. (ref http://onjava.com/pub/a/onjava/2005/07/27/axis2.html?page=4)
So, my question is, is there any advantage to using an asynchronous web service in this case? It seems redundant because 1) the client isn't blocked and 2) the service has to accept and write the xml to queue regardless if it's synchronous or asynchronous
In my opinion, asynchronous is the appropriate way to go. A couple of things to consider:
Do you have multiple clients accessing this service at any given moment?
How often is this process occurring?
It does take a little more effort to implement the async methods. But I guarantee, in the end you will be much happier with the result. For one, you don't have to manage threading. Your primary concern might just be the volatility of the data in the que (i.e. race/deadlock conditions).
A "sync call" seems appropriate, I agree.
If the request from the client isn't time consuming, then I don't see the advantage either in making the call asynchronous. From what I understand of the situation in question here, the web-service will perform its "processing" against the request some time in the future.
If, on the contrary, the request had required a time consuming process, then an async call would haven been appropriate.
After ruminating some more about it, I'm thinking that the service should be asynchronous. The reason is that it would put the task of writing the data to the queue into a separate thread, thus lessening the chances of a timeout. It makes the process more complicated, but if I can avoid a timeout, then it's got to be done.

Resources