I have an application that is working well in production, but I wonder whether I could have implemented the concurrency better.
ASP.NET .NET 4, C#
Basically, it generates n SQL statements on the fly (approximately 50 at the moment), runs them concurrently, and writes the data to .csv files.
EDIT: First I create a thread to do all the work on so the page request can return. Then on that thread...
For each of the SQL statements I create a new Task using the TPL and execute it using a datareader and write the data to disk. When the last file is created I write some summary data to a summary file and zip it all up and give it to the user.
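For readers following along, the approach described above might look roughly like the sketch below. It is not the production code: ExportSketch, ExportOne and ExportAll are hypothetical names, and the query is replaced by a fake result set so the example runs without a database. It uses only .NET 4 TPL calls (Task.Factory.StartNew and Task.WaitAll).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class ExportSketch
{
    // Hypothetical stand-in for "run one SQL statement and stream the rows to a .csv file".
    // A real version would open a connection, call ExecuteReader and write each row.
    public static int ExportOne(string sql, string path)
    {
        string[] rows = { "id,name", "1,alpha", "2,beta" }; // fake result set
        File.WriteAllLines(path, rows);
        return rows.Length - 1; // data rows written, header excluded
    }

    // Starts one task per statement, waits for all of them, then writes a summary file.
    public static string ExportAll(IList<string> statements, string outputDir)
    {
        Directory.CreateDirectory(outputDir);

        Task<int>[] tasks = statements
            .Select((sql, i) => Task.Factory.StartNew(
                () => ExportOne(sql, Path.Combine(outputDir, "query" + i + ".csv"))))
            .ToArray();

        // Blocks until every export has finished; failures surface as AggregateException.
        Task.WaitAll(tasks);

        string summaryPath = Path.Combine(outputDir, "summary.txt");
        File.WriteAllText(summaryPath,
            "files=" + tasks.Length + ";rows=" + tasks.Sum(t => t.Result));
        return summaryPath;
    }
}
```

Zipping the output directory is left out; the summary step only runs once Task.WaitAll has returned, which is the "when the last file is created" condition.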
Should I have used Threads or Asynchronous Delegates instead?
I haven't posted code as I am really just wondering if my overall approach (i.e. TPL) is the best option in this situation.
Please don't lecture me about creating dynamic SQL; it is totally necessary due to the technicalities of the database I am reading from and is not relevant to the question. (It's the back end of a proprietary system with 7 thousand+ tables.)
Should I have used Threads or Asynchronous Delegates instead?
Apparently, your background thread operation spans across the boundaries of a single HTTP request. In this case, it doesn't really matter what API you use to run such operation: Task.Run, Delegate.BeginInvoke, ThreadPool.QueueUserWorkItem, new Thread or anything else.
You shouldn't be running a lengthy background thread operation, whose lifetime spans multiple HTTP requests, inside the ASP.NET address space. While it's relatively easy to implement, this approach may cause problems with IIS maintainability, scalability and security. Create a WCF service for that and call it from your ASP.NET page:
How to: Host a WCF Service in a Managed Windows Service.
If we start a new thread in ASP.NET from the thread that is serving the HTTP request, and the new thread throws an unhandled exception, the worker process will crash immediately. Even if we use a WCF service and call it from ASP.NET, the ASP.NET thread is going to wait for the result. So it is better to use a queuing mechanism: requests go into a queue and are processed later, at a rate that matches the available processing capacity. Of course, once we introduce a queue we need to think about queue failure, requeueing and so on, but it's worth it if the application is big and needs to scale.
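As a rough illustration of both points (queue the work, and never let an exception escape the worker thread), here is a minimal in-process sketch built on BlockingCollection. WorkQueueSketch and its members are hypothetical names. Being in-process, this queue is lost if the worker process recycles, so a durable queue would be needed for the failure and requeue concerns mentioned above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public sealed class WorkQueueSketch : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly List<Exception> _errors = new List<Exception>();
    private readonly Thread _worker;

    public WorkQueueSketch()
    {
        _worker = new Thread(Consume) { IsBackground = true };
        _worker.Start();
    }

    // Called by request threads: returns immediately, the work runs later on the worker.
    public void Enqueue(Action work)
    {
        _queue.Add(work);
    }

    // Snapshot of the exceptions caught so far.
    public IList<Exception> Errors
    {
        get { lock (_errors) { return new List<Exception>(_errors); } }
    }

    private void Consume()
    {
        foreach (Action work in _queue.GetConsumingEnumerable())
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                // Record the failure instead of letting it terminate the process.
                lock (_errors) { _errors.Add(ex); }
            }
        }
    }

    // Stops accepting work, lets the worker drain the queue, then waits for it to exit.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}
```

A single worker thread keeps the processing rate bounded regardless of how many requests enqueue work; more consumers could be added if capacity allows.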
I've joined a project that uses Jax-RS (and originally there was quite a bit of Spring-based Controller code in there too, but all URL handlers use Jax-RS now). Now we want to be able to fill in a queue of tasks that should be run with a small delay between each of them. The delay can be specified in ms. I've avoided Thread.sleep, as I've heard you should not manage threads manually in Java EE. Before I came in there was already a busy wait loop implemented.
I would like to switch this to an asynchronous background task. I could of course let the client poll the server with the given delay, and just have an AsyncResponse that can be resumed. But can the same AsyncResponse be resumed/suspended multiple times? The resource does have state, so it would be possible to drop the asynchrony completely and just do client polling to handle all of it.
A lot of example code that shows off asynchronous tasks uses Thread.sleep. How bad is it to do this in a background task on an ExecutorService or something similar?
The point of the delay is to simulate human interaction, and post a long list of JMS messages to a queue but ensure that two listeners don't pick up and handle messages that depend on one another.
Is it easier/better to handle this on the client side rather than the server side? Writing some JavaScript that handles all the polling would be quite simple, so if this seems like a bad idea for handling on the server side, it's not that big a deal.
The tool is only going to be used by a single user, as it's a developer testing tool. Therefore we went for solving this on the client side, pushing the messages onto the queue through AJAX calls. This works fine for our purposes, but if anyone has a solution that might help someone else, feel free to drop a new answer.
Speaking of server resources (in general) and background processes: would it be better to use a separate executable and a Windows scheduled task, or to use the Timer class and share the same resources as your application?
There are a few pros and cons to both methods, but what I'm wondering is this: Would making use of shared resources (thread pools and the like) be better than separate resources? Sure the process would be taking resources from the app, but isn't it technically already doing that either way?
You have given too little context to really understand the whole picture. How does the timer trigger the activity at a certain time if the application is closed or nobody is connected (logged on)? The problem is much the same for an ASP.NET application as for a Windows client, because IIS takes the application down when nobody has been connected for a while.
In my opinion a Windows scheduled task is much better, because you decouple the work from the IIS application pool and application lifecycle. You also get a cleaner separation and can be sure that the call will be executed and the activity started at the scheduled time.
Duplicate
This is a close duplicate of Dealing with a longer running process in WCF. Please consider posting your answer to that one instead of this.
Original Question
I'm implementing the business layer of an application that must run some background processes at scheduled times. The business layer is made up of several WCF services all running under the same web application.
The idea is defining a set of 'tasks' that must be run at different times (eg. every 5 minutes, everyday at 23:00, etc). That wouldn't be hard to implement as a windows service, but the problem is, the tasks need access to data caches that are living in the services, so this 'scheduler' must run under the IIS context in order to access that data.
What I'm doing currently is using a custom ServiceHostFactory in one of the WCF services which spawns a child thread and returns. The child thread sleeps and wakes up every X minutes to see if there are scheduled tasks and executes them.
But I'm worried about IIS randomly killing my thread when it recycles the application pool or after some inactive time (eg. no activity on any of the WCF services, which listen for requests from the presentation layer). The thread must run uninterrupted regardless of activity on the services. Is this really possible?
I have found an article by someone doing the same thing, but his solution seems to be pinging the server from the child thread itself regularly. Hopefully there is a better solution.
I have at some point implemented a Windows Service that would load a web page on a regular basis. The purpose was that the site was hosting a Workflow Foundation runtime, and we wanted to ensure that the web application was brought back up after IIS recycled the application pool. Perhaps the same approach can be used in this case: have a service (or, even simpler, a Scheduled Task in Windows) run every x minutes and load a page that will check for tasks.
Would it be possible to run a Windows Service, or to place applications in the Windows Scheduler, to execute methods in the WCF services at certain times? Maybe use a BackgroundWorker inside the WCF service. Another option would be for the WCF service to spawn other applications to do the business logic, passing them the appropriate data, or pointers to the data in memory (unsafe).
Recently, the book on threading for Winforms application (Concurrent programming on Windows by Joe Duffy) was released. This book, focused on winforms, is 1000 pages.
What gotchas are there in ASP.NET threading? I'm sure there are plenty of gotchas to be aware of when implementing threading in ASP.NET. What should I be aware of?
Thanks
Since each HTTP request received by IIS is processed separately, on its own thread anyway, the only issues you should have are when you kick off some long-running process from within the scope of a single HTTP request. In that case, I would put such code into a separate referenced assembly, coded like a middle-tier component, with no dependence on or coupling to the ASP.NET model at all, and handle whatever concurrency issues arise within that assembly separately, without worrying about the ASP.NET model at all.
Jeff Richter over at Wintellect has a library called PowerThreading. It is very useful if you are developing applications on .NET. => Power Threading Library
Check for his presentations online at various events.
Usually you are encouraged to use the thread pool in .NET because of the many benefits of having things managed on your behalf, but NOT in ASP.NET.
Since ASP.NET is already multi-threaded, it uses the thread pool to serve requests that are mapped to the ASP.NET ISAPI filter, and since the thread pool is limited in size, by using it you are basically taking away threads that are set aside for handling requests.
In small, low-traffic websites, this is not an issue, but in larger, high-traffic websites you end up competing for and consuming threads that the ASP.net process relies on.
If you want to use threading, it is fine to do something like....
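ThreadStart threadStarter = DoWork; // DoWork is a placeholder for your own void method
Thread thread = new Thread(threadStarter);
thread.IsBackground = true; // a background thread does not keep the worker process alive
thread.Start();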
but with a warning: be sure that IsBackground is set to true, because otherwise the thread runs in the foreground and will likely prevent the IIS worker process from recycling or restarting.
First, are you talking about asynchronous ASP.NET? Or using the ThreadPool/spinning up your own threads?
If you aren't talking about asynchronous ASP.NET, the main question to answer is: what work would you be doing in the other threads and would the work be specific to a request/response cycle, or is it more about processing global tasks in the background?
EDIT
If you need to handle concurrent operations (a better term than multi-threaded IMO) for a given request/response cycle, then use the asynchronous features of ASP.NET. These provide an abstraction over IIS's support for concurrency, allowing the server to process other requests while the current request is waiting for work to complete.
For background processing of global tasks, I would not use ASP.NET at all. You should assume that IIS will recycle your AppPool at a random point in time. You also should not assume that IIS will run your AppPool on any sort of schedule. Any important background processing should be done outside of IIS, either as a scheduled task or a Windows Service. The approach I usually take is to have a Windows Service and a shared work-queue where the web-site can post work items. The queue can be a database table, a reliable message-based queue (MSMQ, etc), files on the file system, etc.
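To make the shared work-queue idea concrete, here is a minimal sketch of the "files on the file system" variant. The names FileWorkQueue, Post and TryTake are hypothetical, and the sketch assumes a single consumer (the Windows Service) and a directory that both processes can access. If the service stops between reading and deleting a file, that item will be processed again on restart.

```csharp
using System;
using System.IO;
using System.Linq;

public static class FileWorkQueue
{
    // Web site side: publish a work item as a file in the shared directory.
    public static string Post(string queueDir, string payload)
    {
        Directory.CreateDirectory(queueDir);
        // Timestamp prefix gives an approximate oldest-first order; the GUID avoids name clashes.
        string name = DateTime.UtcNow.Ticks.ToString("D20") + "_" + Guid.NewGuid().ToString("N");
        string tmp = Path.Combine(queueDir, name + ".tmp");
        string final = Path.Combine(queueDir, name + ".work");
        File.WriteAllText(tmp, payload);
        File.Move(tmp, final); // rename so the reader never sees a half-written file
        return final;
    }

    // Service side: take the next item, or return null when the queue is empty.
    public static string TryTake(string queueDir)
    {
        if (!Directory.Exists(queueDir)) return null;

        string next = Directory.GetFiles(queueDir, "*.work")
            .OrderBy(p => p, StringComparer.Ordinal)
            .FirstOrDefault();
        if (next == null) return null;

        string payload = File.ReadAllText(next);
        File.Delete(next);
        return payload;
    }
}
```

The web site would call Post from a request handler, and the service would call TryTake in its polling loop. Because the queue lives outside the worker process, an AppPool recycle does not lose pending work.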
The immediate thing that comes to mind is, why would you "implement threading" in ASP.NET.
You do need to be conscious all the time that ASP.NET is multi-threaded, since many requests can be processed simultaneously, each in its own thread. So, for example, use of static fields needs to take threading into account.
However, it's rare that you would want to spin up a new thread in code yourself.
The usual WinForms issues with threading in the UI are not present in ASP.NET. There is no window-based message pump to worry about.
It is possible to create asynchronous pages in ASP.NET. These will perform all steps up to a certain point. These steps will include asynchronously fetching data, for instance. When all the asynchronous tasks have completed, the remainder of the page lifecycle will execute. In the meantime, a worker thread was not tied up waiting for database I/O to complete.
In this model, all extra threads are executing while the request, the page instance, and all the controls still exist. When starting your own threads, you have to be careful: by the time the thread executes, the request, page instance, and controls may already have been disposed.
Also, as usual, be certain that multiple threads will actually improve performance. Often, additional threads will make things worse.
The gotchas are pretty much the same as in any multithreaded application.
The classes involved in processing a request (Page, Controls, HttpContext.Current, ...) are specific to that request so don't need any special handling.
Similarly for any classes you instantiate as local variables or fields within these classes, and for access to Session.
But, as usual, you need to synchronize access to shared resources such as:
Static (C#) / Shared(VB.NET) references.
Singletons
External resources such as the file system
... etc...
I've seen threading bugs too often in ASP.NET apps, e.g. a singleton being used by multiple concurrent requests without synchronization, resulting in user A seeing user B's data.
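As a small illustration of synchronizing access to a static resource, the sketch below guards a shared dictionary with a lock. HitCounter is a hypothetical class; the same pattern applies to singletons touched by concurrent requests.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class HitCounter
{
    // One lock object protects every read and write of the shared dictionary.
    private static readonly object Gate = new object();
    private static readonly Dictionary<string, int> Hits = new Dictionary<string, int>();

    public static void Record(string page)
    {
        lock (Gate)
        {
            int current;
            Hits.TryGetValue(page, out current); // current stays 0 when the key is missing
            Hits[page] = current + 1;
        }
    }

    public static int Get(string page)
    {
        lock (Gate)
        {
            int current;
            return Hits.TryGetValue(page, out current) ? current : 0;
        }
    }
}
```

Without the lock, concurrent calls to Record could lose updates or corrupt the dictionary, since Dictionary is not safe for concurrent writes.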