.Net MVC Threading Long running process - asp.net

I have a long running process in my MVC application(C#).
Its building data for many reports.
Some of the clients could take several minutes or longer to calculate. Is running the process in a separate thread the best option?
Is there another way to allow the process to run, while allowing the user to still use the rest of the site?
If threading is the best solution, any good sites or stackoverflow threads to look at on how to do this?

When I've had cases like those, I usually would build a service to asynchronously process requests, and return a handle that I could use to check on its status in a database. IMHO, splitting it off as a thread in the web application seems like you'd be trying to shove a square peg into a round hole.

I've used two methods to solve this. If the work is guaranteed to not be TOO long running, I've kicked off a thread to do the work and return immediately to the user. When we couldn't make that guarantee, we used a queue (we happened to use MSMQ) for executing long running tasks. This processing was done on a different server apart from IIS. A benefit of this is that we built in a wait and retry on failure mechanism. So besides it handling long running tasks, we also used it for anything that might fail in a way that was inconvenient to handle in our MVC app. The main example of this is sending an email. Rather than do that in the MVC app we would just toss an email task on the queue. We used the Command Pattern for the task objects placed on the queue. Once we had that mechanism in place, we stopped using the technique of spawning a thread from our MVC code.

Related

Common Ways of handling long running process in ASP.NET

We have a long running data transfer process that is just an asp.net page that is called and run. It can take up to a couple hours to complete. It seems to work all right but I was just wondering what are some of the more popular ways to handle a long process like this. Do you create an application and run it through windows scheduler, or a web service or custom handler?
In a project for long running tasks in web-application, i made a windows service.
whenever the user has to do the time-consuming task, the IIS would give the task to the service which will return a token(a temporary name for the task) and in the background the service would do the task. At anytime, the user would see the status of his/her task which would be either pending in queue, processing, or completed. The service would do a fixed number of jobs in parallel, and would keep a queue for the next-incoming tasks.
A windows service is the typical solution. You do not want to use a web service or a custom handler as both of those will lie prey to the app pool recycling, which will kill your process.
Windows Workflow Foundation
What I find the most appealing about WF, is that workflows can be designed without much complexity to be persisted in SQL Server, so that if the server reboots in the middle of a process, the workflow can resume.
I use two types of processes depending on the needs of my BAs. For transfer processes that are run on demand and can be scheduled regularly, I typically write a WinForms (this is a personal preference) application that accepts command line parameters so I can schedule the job with params or run it on demand through an interactive window. I've written enough of them over the last few years that I have my own basic generic shell that I use to create new applications of this nature. For processes that must detect events (files appearing in folders, receiving CyberMation calls, or detecting SNMP traps), I prefer to use Windows Services so that they are always available. It's a little trickier simply because you have to be much more cautious of memory usage, leaks, recycling, security, etc. For me, the windows application tends to run faster on long duration jobs than they do when through an IIS process. I don't know if this is because it's attached to an IIS thread or if its memory/security is more limited. I've never investigated it.
I do know that .Net applications provide a lot of flexibility and management over resources, and with some standards and practice, they can be banged out fairly quickly and produce very positive results.

ASP.NET Multithreadthing and timing out long-running operations

I have a web service which has to make a call to a database operation if the operation takes more than 10 seconds then I need to exit from the operation and rollback. My understanding is that this can only be done by spawning a thread to carry out the lengthy process and then terminate the thread if it takes longer than certain amount of time. Can you suggest how this can be done or point me to examples where a similar situation is being handled. Is threading within asp.net and desktop application be considered the same, can I use threading concepts of a desktop application and apply it to web service or is there any difference to consider.
Thanks
I can tell you right now this could be a serious headache. Terminating threads without their cooperation can create all sorts of undefined and unexpected behavior.
But if you really want to do this, starting a background thread in ASP.NET is just like a .NET desktop application. What John means about threading not being the same as a desktop application only applies to how the ASP.NET engine handles requests internally. You don't really need to worry about that aspect (unless you are blocking on every request).

Long-running ASP.NET tasks

I know there's a bunch of APIs out there that do this, but I also know that the hosting environment (being ASP.NET) puts restrictions on what you can reliably do in a separate thread.
I could be completely wrong, so please correct me if I am, this is however what I think I know.
A request typically timeouts after 120 seconds (this is configurable) but eventually the ASP.NET runtime will kill a request that's taking too long to complete.
The hosting environment, typically IIS, employs process recycling and can at any point decide to recycle your app. When this happens all threads are aborted and the app restarts. I'm however not sure how aggressive it is, it would be kind of stupid to assume that it would abort a normal ongoing HTTP request but I would expect it to abort a thread because it doesn't know anything about the unit of work of a thread.
If you had to create a programming model that easily and reliably and theoretically put a long running task, that would have to run for days, how would you accomplish this from within an ASP.NET application?
The following are my thoughts on the issue:
I've been thinking a long the line of hosting a WCF service in a win32 service. And talk to the service through WCF. This is however not very practical, because the only reason I would choose to do so, is to send tasks (units of work) from several different web apps. I'd then eventually ask the service for status updates and act accordingly. My biggest concern with this is that it would NOT be a particular great experience if I had to deploy every task to the service for it to be able to execute some instructions. There's also this issue of input, how would I feed this service with data if I had a large data set and needed to chew through it?
What I typically do right now is this
SELECT TOP 10 *
FROM WorkItem WITH (ROWLOCK, UPDLOCK, READPAST)
WHERE WorkCompleted IS NULL
It allows me to use a SQL Server database as a work queue and periodically poll the database with this query for work. If the work item completed with success, I mark it as done and proceed until there's nothing more to do. What I don't like is that I could theoretically be interrupted at any point and if I'm in-between success and marking it as done, I could end up processing the same work item twice. I might be a bit paranoid and this might be all fine but as I understand it there's no guarantee that that won't happen...
I know there's been similar questions on SO before but non really answers with a definitive answer. This is a really common thing, yet the ASP.NET hosting environment is ill equipped to handle long-running work.
Please share your thoughts.
Have a look at NServiceBus
NServiceBus is an open source
communications framework for .NET with
build in support for publish/subscribe
and long-running processes.
It is a technology build upon MSMQ, which means that your messages don't get lost since they are persisted to disk. Nevertheless the Framework has an impressive performance and an intuitive API.
John,
I agree that ASP.NET is not suitable for Async tasks as you have described them, nor should it be. It is designed as a web hosting platform, not a back of house processor.
We have had similar situations in the past and we have used a solution similar to what you have described. In summary, keep your WCF service under ASP.NET, use a "Queue" table with a Windows Service as the "QueueProcessor". The client should poll to see if work is done (or use messaging to notify the client).
We used a table that contained the process and it's information (eg InvoicingRun). On that table was a status (Pending, Running, Completed, Failed). The client would submit a new InvoicingRun with a status of Pending. A Windows service (the processor) would poll the database to get any runs that in the pending stage (you could also use SQL Notification so you don't need to poll. If a pending run was found, it would move it to running, do the processing and then move it to completed/failed.
In the case where the process failed fatally (eg DB down, process killed), the run would be left in a running state, and human intervention was required. If the process failed in an non-fatal state (exception, error), the process would be moved to failed, and you can choose to retry or have human intervantion.
If there were multiple processors, the first one to move it to a running state got that job. You can use this method to prevent the job being run twice. Alternate is to do the select then update to running under a transaction. Make sure either of these outside a transaction larger transaction. Sample (rough) SQL:
UPDATE InvoicingRun
SET Status = 2 -- Running
WHERE ID = 1
AND Status = 1 -- Pending
IF ##RowCount = 0
SELECT Cast(0 as bit)
ELSE
SELECT Cast(1 as bit)
Rob
Use a simple background tasks / jobs framework like Hangfire and apply these best practice principals to the design of the rest of your solution:
Keep all actions as small as possible; to achieve this, you should-
Divide long running jobs into batches and queue them (in a Hangfire queue or on a bus of another sort)
Make sure your small jobs (batched parts of long jobs) are idempotent (have all the context they need to run in any order). This way you don't have to use a quete which maintains a sequence; because then you can
Parallelise the execution of jobs in your queue depending on how many nodes you have in your web server farm. You can even control how much load this subjects your farm to (as a trade off to servicing web requests). This ensures that you complete the whole job (all batches) as fast and as efficiently as possible, while not compromising your cluster from servicing web clients.
Have thought about the use the Workflow Foundation instead of your custom implementation? It also allows you to persist states. Tasks could be defined as workflows in this case.
Just some thoughts...
Michael

BackgroundWorker From ASP.Net Application

We have an ASP.Net application that provides administrators to work with and perform operations on large sets of records. For example, we have a "Polish Data" task that an administrator can perform to clean up data for a record (e.g. reformat phone numbers, social security numbers, etc.) When performed on a small number of records, the task completes relatively quickly. However, when a user performs the task on a larger set of records, the task may take several minutes or longer to complete. So, we want to implement these kinds of tasks using some kind of asynchronous pattern. For example, we want to be able to launch the task, and then use AJAX polling to provide a progress bar and status information.
I have been looking into using the BackgroundWorker class, but I have read some things online that make me pause. I would love to get some additional advice on this.
For example, I understand that the BackgroundWorker will actually use the thread pool from the current application. In my case, the application is an ASP.Net web site. I have read that this can be a problem because when the application recycles, the background workers will be terminated. Some of the jobs I mentioned above may take 3 minutes, but others may take a few hours.
Also, we may have several hundred administrators all performing similar operations during the day. Will the ASP.Net application thread pool be able to handle all of these background jobs efficiently while still performing it's normal request processing?
So, I am trying to determine if using the BackgroundWorker class and approach is right for our needs. Should I be looking at an alternative approach?
Thanks and sorry for such a long post!
Kevin
In your case it actually sounds like the solution you will be looking for is multifaceted (and not a simple in and done project).
Since you said that some processes can last for hours that is absolutely not something for ASP.NET to own. This should be ran inside a windows service and managed with native windows threading.
You will need to implement some type of work queue in your service and a way to communicate with the queue. One way is to expose a WCF service for all actions your service will govern. Another would be to have service poll a database table and pick up work from the table.
To be able express the status of the process you will want the ASP.NET application to be able to have some reference to the processID for example the WCF service returns a guid identifier. Then you have a method that when you give it the processID it will return the status of the process. You can then implement the polling of that service call using AJAX and display any type of modal you wish.
Another thing to remember is that you need to design your processes to have knowledge of where it is and where it will be when it is finished so it can track the state it's in. For example, BatchJobA is run and will have 1000 records to process. The service needs to know what record it's on or what the current % of competition is for it to be able to return information to the UI. For sql queries that take a very long time to execute this can be very problematic to accurately gauge where it is unless you do alot of pre and post processing of temp tables that you can in the middle of it read the status of the temp tables to understand where it is.
Based on what you are saying I think that BackgroundWorker is not a good choice.
Furthermore keeping this functionality as a part of your main app can be problematic, specifically because you do not want the submitted processing to be interrupted if the main app recycles. You can play with asynch processing but it still will be a part of the main app AppDomain - all of it will die if the app recycles.
I would suggest buidling a separate app implementing this functionality. In a similar situation I separated background processing to a Windows service and hosted a web service in it as a means of communication
You might consider a slightly different approach.
For example, have a command and control table in which you send commands like "REFORMAT PHONE NUMBERS" or whatever.
Then have a windows service monitoring that table. Whenever a record shows up, run the command.
This eliminates any sort of worry about a background thread. Further you have a bit more flexibility with regards to what's in the queue, order of operations including priority, etc. Finally, you would have a definitive list of what is running or needs to run.
As an option, instead of a windows service you might just use a SQL job to execute every so often to watch your control table and perform the requested action.

Multithreading in asp.net

What kind of multi-threading issues do you have to be careful for in asp.net?
It's risky to spawn threads from the code-behind of an ASP.NET page, because the worker process will get recycled occasionally and your thread will die.
If you need to kick off long-running processes as a result of user actions on web pages, your best bet is to drop a message off in MSMQ and have a separate background service monitoring the queue. The service could take as long as it wants to accomplish the task, and the web page would be finished with its work almost immediately. You could accomplish the same thing with an asynch call to a web method, but don't rely on getting the response when the web method is finished working. From code-behind, it needs to be a quick fire-and-forget.
One thing to watch out for at things that expire (I think httpContext does), if you are using it for operations that are "fire and forget" remember that all of a sudden if the asp.net cleanup code runs before your operation is done, you won't be able to access certain information.
If this is for a web service, you should definitely consider thread pooling. Too many threads will bring your application to a grinding halt because they will eventually start competing for CPU time.
Is this for file or network IO? If so, you should also consider using asynchronous IO. It can be a bit more of a pain to program, but you don't have to worry about spawning off too many threads at once.
Programmatic Caching is one area which immediately comes to my mind. It is a great feature which needs to be used carefully. Since it is shared across requests, you have to put locks around it before updating it.
Another place I would check is any code accessing filesystem like writing to log files. If one request has a read-write lock on a file, other concurrent requests will error out if not handled properly.
Isn't there a Limit of 25 Total Threads in the IIS Configuration? At least in IIS 6 i believe. If you exceed that limit, interesting things (read: loooooooong response times) may happen.
Depending on what you need, as far as multi threading is concerned, have you thought of spawning requests from the client. It's safe to spawn requests using AJAX, and then act on the results in a callback. Or use a service as a backgrounding mechanism, which runs every X minutes and processes in the background that way.

Resources