Can I use threads to carry out long-running jobs on IIS?

In an ASP.NET application, the user clicks a button on the web page, which instantiates an object on the server through the event handler and calls a method on the object.
The method goes off to an external system to do work, and this could take a while. So what I would like to do is run that method call in another thread so I can return control to the user with "Your request has been submitted".
I am reasonably happy to do this as fire-and-forget, though it would be even nicer if the user could keep polling the object for status.
What I don't know is if IIS allows my thread to keep running, even if the user session expires.
Imagine the user fires the event, and we instantiate the object on the server and fire the method in a new thread. The user is happy with the "Your request has been submitted" message and closes his browser. Eventually, this user's session will time out on IIS, but the thread may still be running, doing work. Will IIS allow the thread to keep running, or will it kill it and dispose of the object once the user session expires?
EDIT: From the answers and comments, I understand that the best way to do this is to move the long-running processing outside of IIS. Apart from everything else, this deals with the AppDomain recycling problem. In practice, I need to get version 1 off the ground in limited time, and it has to work inside an existing framework, so I would like to avoid the service layer, hence the desire to just fire off the thread inside IIS. In practice, "long-running" here will only be a few minutes and the concurrency on the website will be low, so it should be okay. But the next version definitely will need splitting into a separate service layer.

You can accomplish what you want, but it is typically a bad idea. Several ASP.NET blog and CMS engines take this approach, because they want to be installable on a shared hosting system and not take a dependency on a Windows service that needs to be installed. Typically they kick off a long-running thread in Global.asax when the app starts, and have that thread process queued-up tasks.
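For illustration, a minimal sketch of that Global.asax pattern (the WorkQueue and Enqueue names are hypothetical, and a real engine would persist queued items rather than hold them in memory):

using System;
using System.Collections.Concurrent;
using System.Threading;

public class Global : System.Web.HttpApplication
{
    // Hypothetical in-memory queue; real engines persist items so they
    // survive AppDomain recycles.
    private static readonly BlockingCollection<Action> WorkQueue =
        new BlockingCollection<Action>();

    protected void Application_Start(object sender, EventArgs e)
    {
        // Background thread: it will not keep the process alive on shutdown.
        var worker = new Thread(() =>
        {
            foreach (Action job in WorkQueue.GetConsumingEnumerable())
            {
                try { job(); }
                catch { /* log it; an escaping exception kills w3wp */ }
            }
        });
        worker.IsBackground = true;
        worker.Start();
    }

    // Pages call this to hand off long-running work.
    public static void Enqueue(Action job)
    {
        WorkQueue.Add(job);
    }
}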
In addition to reducing resources available to IIS/ASP.NET to process requests, you also have issues with the thread being killed when the AppDomain is recycled, and then you have to deal with persistence of the task while it is in-flight, as well as starting the work back up when the AppDomain comes back up.
Keep in mind that in many cases the AppDomain is recycled automatically at a default interval, as well as if you update the web.config, etc.
If you can handle the persistence and transactional aspects of your thread being killed at any time, then you can get around the AppDomain recycling by having some external process make a request to your site at some interval, so that if the site is recycled you are guaranteed to have it start back up again automatically within X minutes.
Again, this is typically a bad idea.
EDIT: Here are some examples of this technique in action:
Community Server: Using Windows Services vs. Background Thread to Run Code at Scheduled Intervals
Creating a Background Thread When Website First Starts
EDIT (from the far distant future) - These days I would use Hangfire.

I disagree with the accepted answer.
Using a background thread (or a task, started with Task.Factory.StartNew) is fine in ASP.NET. As with all hosting environments, you may want to understand and cooperate with the facilities governing shutdown.
In ASP.NET, you can register work needing to stop gracefully on shutdown using the HostingEnvironment.RegisterObject method. See this article and the comments for a discussion.
(As Gerard points out in his comment, there's now also HostingEnvironment.QueueBackgroundWorkItem that calls down to RegisterObject to register a scheduler for the background item to work on. Overall the new method is nicer since it's task-based.)
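As a sketch, usage looks roughly like this (requires .NET 4.5.2 or later; GenerateReportAsync is a hypothetical stand-in for the actual work):

using System;
using System.Threading;
using System.Threading.Tasks;
using System.Web.Hosting;

public static class ReportKickoff
{
    public static void Start()
    {
        // ASP.NET tracks this work item and delays AppDomain shutdown
        // (up to roughly 90 seconds) so it can finish or observe the token.
        HostingEnvironment.QueueBackgroundWorkItem(ct => GenerateReportAsync(ct));
    }

    // Hypothetical long-running job; honor the token so shutdown stays graceful.
    private static Task GenerateReportAsync(CancellationToken ct)
    {
        return Task.Delay(TimeSpan.FromMinutes(2), ct);
    }
}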
As for the general theme that you often hear of it being a bad idea, consider the alternative of deploying a Windows service (or another kind of extra-process application):
No more trivial deployment with web deploy
Not deployable purely on Azure Websites
Depending on the nature of the background task, the processes will likely have to communicate. That means either some form of IPC or the service will have to access a common database.
Note also that some advanced scenarios might even need the background thread to be running in the same address space as the requests. I see the fact that ASP.NET can do this as a great advantage that has become possible through .NET.

You wouldn't want to use a thread from the IIS thread pool for this task because it would leave that thread unable to process future requests. You could look into Asynchronous Pages in ASP.NET 2.0, but that really wouldn't be the right answer, either. Instead, what it sounds like you would benefit from is looking into Microsoft Message Queuing (MSMQ). Essentially, you would add the task details to the queue, and another background process (possibly a Windows service) would be in charge of carrying out that task. The bottom line is that the background process is completely isolated from IIS.
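As a sketch, the web side of the hand-off could look like this, assuming MSMQ is installed and using System.Messaging; the queue path and label are hypothetical:

using System.Messaging;

public static class TaskQueue
{
    // Hypothetical local private queue.
    private const string QueuePath = @".\private$\longRunningTasks";

    public static void EnqueueTask(string taskDetails)
    {
        if (!MessageQueue.Exists(QueuePath))
            MessageQueue.Create(QueuePath);

        using (var queue = new MessageQueue(QueuePath))
        {
            // The background service dequeues this and does the real work,
            // completely outside the IIS process.
            queue.Send(taskDetails, "long-running task");
        }
    }
}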

I would suggest using Hangfire for such requirements. It's a nice fire-and-forget engine that runs in the background, supports different architectures, and is reliable because it is backed by persistent storage.
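A minimal sketch of how that looks, assuming the Hangfire and Hangfire.SqlServer packages; the connection string name is hypothetical:

using System;
using Hangfire;

// One-time setup, e.g. in Application_Start: jobs are persisted in
// SQL Server and processed by an in-process background server.
GlobalConfiguration.Configuration.UseSqlServerStorage("HangfireDb");
var server = new BackgroundJobServer();

// Fire-and-forget: the job survives recycles and is retried on failure.
BackgroundJob.Enqueue(() => Console.WriteLine("Request processed"));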

There is a good thread and sample code here: http://forums.asp.net/t/1534903.aspx?PageIndex=2
I've even toyed with the idea of calling a keep-alive page on my website from the thread to help keep the app pool alive. Keep in mind if you are using this method that you need really good recovery handling, because the application could recycle at any time. As many have mentioned, this is not the right approach if you have access to other service options, but for shared hosting this may be one of your only options.
Making a request to your own site while the thread is processing may help keep the app pool alive if your process runs a long time.
using System;
using System.IO;
using System.Net;

string tempStr = GetUrlPageSource("http://www.mysite.com/keepalive.aspx");

public static string GetUrlPageSource(string url)
{
    try
    {
        Uri uri = new Uri(url);
        if (uri.Scheme != Uri.UriSchemeHttp)
            return "";

        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(uri);
        req.CookieContainer = new CookieContainer();
        req.Timeout = 60000;   // 60-second request timeout
        req.UserAgent = "MyAgent";
        req.KeepAlive = false; // we do not want a persistent connection

        // using blocks guarantee the response and streams are disposed
        // even if reading fails partway through.
        using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
        using (Stream stream = resp.GetResponseStream())
        using (StreamReader sr = new StreamReader(stream))
        {
            return sr.ReadToEnd();
        }
    }
    catch
    {
        return "";
    }
}

We started down this path, and it actually worked OK when our app was on one server. When we wanted to scale out to multiple machines (or use multiple w3wp processes in a web garden), we had to re-evaluate and look at how to manage a work queue, error handling, retries, and the tricky problem of correctly locking to ensure only one server picks up the next item.
... we realized we are not in the business of writing background processing engines, so we looked for existing solutions and ended up using the awesome OSS project Hangfire.
Sergey Odinokov has created a real gem which is really easy to get started with, and which allows you to swap out the backend for how work is persisted and queued. Hangfire uses background threads, but persists the jobs, handles retries, and gives you visibility into the work queue. So Hangfire jobs are robust and survive all the vagaries of AppDomains being recycled, etc.
Its basic setup uses SQL Server as the storage, but you can swap in Redis or MSMQ when it's time to scale up. It also has an excellent UI for visualizing all the jobs and their status, plus allowing you to re-queue jobs.
My point is that while it's entirely possible to do what you want in a background thread, there is a lot of work involved in making it scalable and robust. It's fine for simple workloads, but when things get more complex I much prefer to use a purpose-built library rather than go through this effort.
For some more perspective on the options available, check out Scott Hanselman's blog, which covers a few options for handling background jobs in ASP.NET. (He gave Hangfire a glowing review.)
Also, as referenced by John, it's worth reading Phil Haack's blog on why the approach is problematic and how to gracefully stop work on the thread when the AppDomain is unloaded.

Can you create a Windows service to do that task, and then use .NET Remoting from the web server to call the Windows service to do the action? If so, that is what I would do.
This would eliminate the need to rely on IIS and limit some of its processing power.
If not, then I would force the user to sit there while the process runs. That way you ensure it is completed and not killed by IIS.

There does seem to be one supported way of hosting long-running work in IIS. Workflow Services seem designed for this, especially in conjunction with Windows Server AppFabric. The design allows for application pool recycling by supporting automatic persistence and resumption of the long-running work.

You may run tasks in the background, and they will complete even after the request ends. Don't let an uncaught exception be thrown, though. Normally you want to let your exceptions propagate, but if an exception is thrown on a new thread it will crash the IIS worker process (w3wp.exe), because you are no longer in the request's context. That will then also kill any other background tasks you have running, in addition to in-process memory-backed sessions if you are using them. This is hard to diagnose, which is why the practice is discouraged.
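A minimal sketch of the defensive wrapper; DoBackgroundWork and Log are hypothetical:

using System;
using System.Threading;

ThreadPool.QueueUserWorkItem(_ =>
{
    try
    {
        DoBackgroundWork(); // hypothetical long-running task
    }
    catch (Exception ex)
    {
        // Swallow and log: letting this escape would tear down w3wp.exe.
        Log(ex);
    }
});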

Just create a surrogate process to run the async tasks; it doesn't have to be a Windows service (although that is the more optimal approach in most cases). MSMQ is overkill.

Related

ASP.net running Thread class Foreground-Threads

I have a quick question regarding extra foreground threads running in an ASP.NET web application.
I let some things, like cleanup actions after a certain event, run on an extra Thread-class thread. The user is not interested in those operations, so they should just run (more or less in the background) and have nothing to do with serving requests.
What happens when the application crashes because of an uncaught exception? Will the extra thread run to completion, or does IIS cancel it immediately?
What happens when the application pool recycles?
I know that it would be a better approach to use separate programs for jobs like this (a Windows service, for example), but I "inherited" this code from former employees.
Any work being done or queued inside ASP.NET can be lost at any time. It is not necessary to answer your questions in detail because they are moot. There are many reasons the whole process can just go away.
Windows Services also can go away at any time (crash, blue screen, ...).
If you need reliable execution you need to persist the work. Some kind of message queue (could be a database table) is appropriate.
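A minimal sketch of the database-table variant: the web app inserts a row, and a separate worker polls for pending rows. Table and column names here are hypothetical:

using System.Data.SqlClient;

public static void PersistWorkItem(string connectionString, string payload)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        "INSERT INTO WorkQueue (Payload, Status, CreatedUtc) " +
        "VALUES (@payload, 'Pending', GETUTCDATE())", conn))
    {
        cmd.Parameters.AddWithValue("@payload", payload);
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}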

Using 'Lock' in web applications

A few months ago I was interviewing for a job inside the company I am currently in. I don't have a strong web development background, but one of the questions he posed to me was how I could improve this block of code.
I don't remember the code block perfectly, but to sum it up it was a web hit counter, and he used lock on the hit counter.
lock (HitCounter)
{
    // Bla...
}
However, after some discussion he said: lock is good, but never use it in web applications!
What is the basis behind his statement? Why shouldn't I use lock in web applications?
There is no special reason why locks should not be used in web applications. However, they should be used carefully as they are a mechanism to serialize multi-threaded access which can cause blocking if lock blocks are contended. This is not just a concern for web applications though.
What is always worth remembering is that on modern hardware an uncontended lock takes 20 nanoseconds to flip. With this in mind, the usual practice of trying to make code inside of lock blocks as minimal as possible should be followed. If you have minimal code within a block, the overhead is quite small and potential for contention low.
To say that locks should never be used is a bit of a blanket statement really. It really depends on what your requirements are e.g. a thread-safe in-memory cache to be shared between requests will potentially result in less request blocking than on-demand fetching from a database.
Finally, BCL and ASP.Net Framework types certainly use locks internally, so you're indirectly using them anyway.
The application domain might be recycled.
This might result in the old appdomain still finishing serving some requests and the new appdomain also serving new requests.
Static variables are not shared between them, so locking on a static global would not grant exclusivity in this case.
First of all, you never want to lock an object that you actually use in any application. You want to create a lock object and lock that:
private readonly object _hitCounterLock = new object();

lock (_hitCounterLock)
{
    // blah
}
As for the web portion of the question: when you lock, you block every thread that attempts to access the object (which on the web could be hundreds or thousands of users). They will all be waiting until each thread ahead of them unlocks.
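For a plain hit counter specifically, Interlocked sidesteps the lock (and the blocking) entirely; a minimal sketch:

using System.Threading;

public static class HitCounter
{
    private static long _count;

    // Atomic increment: no lock is taken, so no request ever waits on another.
    public static long RecordHit()
    {
        return Interlocked.Increment(ref _count);
    }
}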
Late :), but for future readers of this, an additional point:
If the application is run on a web farm, the ASP.NET instances running on multiple machines will not share the lock object.
So this can only work if:
1. no web farm has to be supported, AND
2. ASP.NET is configured (non-default) NOT to use parallel instances during recycle until old requests are served (as mentioned by Andras above).
This code will create a bottleneck for your application, since all incoming requests will have to wait at this point before the previous one leaves the lock.
lock is only intended for multithreaded scenarios where multiple threads require access to the same shared variable; the lock is exclusively acquired by the requesting thread, and all pending threads block and wait until it is released.
In web applications, user requests are isolated, so there is no need for locking by default.
A couple of reasons...
If you're trying to lock a database read/write operation, there's a really high risk of a race condition happening anyway, because the database isn't owned by the process doing the lock, so it could be read from/written to by another process, perhaps even a hypothetical future version of IIS that runs multiple processes per application.
Locks are typically used in client applications for non-UI threads, i.e. background/worker threads. Web applications don't have as much use for multithreaded processing unless you're trying to take advantage of multiple cores (in which case locks on request-associated objects would be acceptable), because each request can be assumed to run on its own thread, and the server can't respond until it has processed the entire output (or at least a sequential chunk) anyway.

Considerations for ASP.NET application with long running synchronous requests

Under Windows Server 2008 64-bit, IIS 7.0, and .NET 4.0, I have an ASP.NET application (using the ASP.NET thread pool with synchronous request processing) that is long-running (> 30 minutes). The web application has no pages; its main purpose is reading huge files (> 1 GB) in chunks (~5 MB) and transferring them to the clients. Code:
int bytesRead;
// sourceStream is the opened file stream being transferred.
while ((bytesRead = sourceStream.Read(buffer, 0, buffer.Length)) > 0)
{
    // Write only the bytes actually read, not the whole buffer.
    Response.OutputStream.Write(buffer, 0, bytesRead);
    Response.Flush();
}
A single-producer/single-consumer pattern is implemented, so for each request there are two threads. I don't use the Task library here, but please let me know if it has an advantage over traditional thread creation in this scenario. An HTTP handler (.ashx) is used instead of an (.aspx) page. Under stress testing, CPU utilization is not a problem, but with a single worker process, new connections encounter timeouts after 210 concurrent clients. This is solved by web gardening, since I don't use session state. I'm not sure if there's any big issue I've missed, but please let me know what other considerations you think should be taken into account.
For example, maybe IIS closes long-running TCP connections due to a connection timeout; since normal ASP.NET pages are processed in less than 5 minutes, I should perhaps increase the value.
I appreciate your ideas.
Personally, I would be looking at a different mechanism for this type of processing. HTTP requests/web applications are NOT designed for this type of thing, and stability is going to be VERY hard to achieve; you have a number of risks that could cause major issues as you work with this type of model.
I would move that processing off to a backend process so that you are OUTSIDE of the ASP.NET runtime; that way you have more control over start/shutdown, etc.
First, Never. NEVER. NEVER! do any processing that takes more than a few seconds in a thread pool thread. There are a limited number of them, and they're used by the system for many things. This is asking for trouble.
Second, while the handler is a good idea, you're a little vague on what you mean by "generate on the fly" Do you mean you are encrypting a file on the fly and this encryption can take 30 minutes? Or do you mean you're pulling data from a database and assembling a file? Or that the download takes 30 minutes to download?
Edit:
As I said, don't use a thread pool for anything long running. Create your own thread, or if you're using .NET 4 use a Task and specify it as long running.
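A sketch of the .NET 4 version (StreamFileToClient is a hypothetical stand-in for the transfer loop):

using System.Threading;
using System.Threading.Tasks;

// LongRunning hints the scheduler to use a dedicated thread rather than
// borrowing (and starving) a thread-pool thread.
Task.Factory.StartNew(
    () => StreamFileToClient(),  // hypothetical transfer loop
    CancellationToken.None,
    TaskCreationOptions.LongRunning,
    TaskScheduler.Default);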
Long-running processes should not be implemented this way. Pass this off to a service that you set up.
If you do want to have a page hang for a client, consider interfacing from AJAX to something that does not block on IO threads, like node.js.
Push notifications to many clients is not something ASP.NET can handle due to thread usage, hence my suggestion of node.js. If your load is low, you have other options.
Use web gardening for more stability of your application.
Turn off caching, since you don't have .aspx pages.
It's hard to advise more without performance analysis. Use the VS built-in profiler to find the bottlenecks.
The Web 1.0 way of dealing with long running processes is to spawn them off on the server and return immediately. Have the spawned off service update a database with progress and pages on the site can query for progress.
The most common usage of this technique is getting a package delivery. You can't hold the HTTP connection open until my package shows up, so it just gives you a way to query for progress. The background process deals with orchestrating all of the steps it takes for getting the item, wrapping it up, getting it onto a UPS truck, etc. All along the way, each step is recorded in the database. Conceptually, it's the same.
Edit based on Question Edit: Just return a result page immediately, and generate the binary on the server in a spawned thread or process. Use Ajax to check to see if the file is ready and when it is, provide a link to it.

What are some best practices for managing background threads in IIS?

I have written an HttpModule that spawns a background thread. I'm using the thread like a Scheduled Task that runs in-process, which is really handy.
What are best practices for keeping track of this thread? I've never done this before, and I'm a little confused about some aspects of it:
How do I know if the thread is still running? I see it do its job, but is there another way to know if it's still alive? I downloaded ProcMon, but w3wp.exe spawns a boatload of threads, so I had no idea which was my thread. I named it, but that didn't help.
How do I "catch" the thread if it dies? Is there some kind of Dispose method where I can have it write to the EventLog or something if it fails? A "dying declaration" or something?
How do I actively stop the thread? If I want it to stop running this background process, how do I kill it without having to bounce IIS?
Is there anyway to start it again, independently of the HttpModule? (I'm guessing the answer to this is no...)
Edit: Just to clarify, the intention is that my thread never goes away. It runs a function, then goes to sleep for a couple minutes, then wakes up and runs the function again. It's not like it's doing one task then ending.
When Jeff made Stack Overflow, he had a similar issue.
His solution was to use cache expiration. You put something in the cache, and when it expires an event is fired on a non-user-facing thread. In the handler for the expiration event, you stick some code in to re-add the item to the cache and do whatever housekeeping work needs to be done for your application.
Using this technique, your sub-questions are easily answered:
1. You check that the item is still in the cache.
2. If the item is not in the cache, re-add it.
3. Remove the cache item from the cache.
4. Add the item back to the cache.
You could make a small management page to configure these options.
This gives you a nice way to roughly time housekeeping processes in your web application. It doesn't require a separate Windows service, which is a big win.
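A minimal sketch of the cache-expiration trick; the key, interval, and DoHousekeeping are hypothetical:

using System;
using System.Web;
using System.Web.Caching;

public static class CacheTimer
{
    private const string Key = "HousekeepingTimer"; // hypothetical key

    public static void Start(TimeSpan interval)
    {
        HttpRuntime.Cache.Insert(
            Key,
            interval,                      // placeholder value, reused below
            null,
            DateTime.UtcNow.Add(interval), // fires roughly every interval
            Cache.NoSlidingExpiration,
            CacheItemPriority.NotRemovable,
            OnExpired);                    // runs on a non-request thread
    }

    private static void OnExpired(string key, object value,
                                  CacheItemRemovedReason reason)
    {
        try { DoHousekeeping(); }          // hypothetical housekeeping work
        catch { /* log; an escaping exception kills the process */ }
        Start((TimeSpan)value);            // re-add the item for the next run
    }

    private static void DoHousekeeping() { }
}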
From my experience, you can get this to work "good enough", but not perfectly. I would recommend implementing your repeating tasks in a Windows service. Depending on what the tasks do, the Windows service might not even have to talk to the web application or vice versa, e.g. if both work with the same database. Otherwise you could still use e.g. WCF for communication.
The big advantage is: the Windows service will start with the OS; you can easily configure, start, and stop it using the control panel; you have built-in monitoring via the Windows Event Log; and you can update the background service and the web application independently.
If this is not an option, e.g. because you are in a shared hosting environment, I would recommend the following:
Start your background thread in Application_start (Global.asax), and store the thread reference in a static variable.
Wrap every method called on your background thread with try/catch, because since .NET 2.0, every unhandled exception on a background thread will shut down the application. (It will be restarted on the next request, but it slows down the next request, kills all current sessions and caches, and of course no timer will be active until the next request.)
On every request (implemented as an HttpModule, or in Global.asax again), check the Thread instance in the static variable (is it still != null, is the thread alive and running, etc.). If not, call the restart code. Use locking in the restart path to make sure the thread will not be created twice at the same time.
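A minimal sketch of those three steps, assuming Global.asax; DoRepeatingTask is a hypothetical stand-in for the scheduled work:

using System;
using System.Threading;
using System.Web;

public class Global : HttpApplication
{
    private static Thread _worker;
    private static readonly object RestartLock = new object();

    protected void Application_Start(object sender, EventArgs e)
    {
        EnsureWorkerRunning();
    }

    // Also call this from Application_BeginRequest (or an HttpModule)
    // so any request can revive the thread after a recycle.
    public static void EnsureWorkerRunning()
    {
        if (_worker != null && _worker.IsAlive) return;
        lock (RestartLock)
        {
            if (_worker != null && _worker.IsAlive) return;
            _worker = new Thread(WorkLoop) { IsBackground = true };
            _worker.Start();
        }
    }

    private static void WorkLoop()
    {
        while (true)
        {
            try { DoRepeatingTask(); }  // hypothetical task
            catch { /* log; an unhandled exception shuts down the app */ }
            Thread.Sleep(TimeSpan.FromMinutes(2));
        }
    }

    private static void DoRepeatingTask() { }
}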
Even then you can't be sure that your background thread is always running if you don't have regular traffic around the clock. Also keep in mind that in a shared hosting environment it is very common to shut down application pools if there is no activity for a few hours. You could try to mitigate this by setting up a scheduled task on a client machine on your own network that makes a lightweight HTTP request to your application every few minutes, just to ensure that your application is always running.

Multithreading in asp.net

What kind of multi-threading issues do you have to be careful for in asp.net?
It's risky to spawn threads from the code-behind of an ASP.NET page, because the worker process will get recycled occasionally and your thread will die.
If you need to kick off long-running processes as a result of user actions on web pages, your best bet is to drop a message off in MSMQ and have a separate background service monitoring the queue. The service could take as long as it wants to accomplish the task, and the web page would be finished with its work almost immediately. You could accomplish the same thing with an async call to a web method, but don't rely on getting the response when the web method finishes its work. From code-behind, it needs to be a quick fire-and-forget.
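The service side of that hand-off might look roughly like this (the queue path and HandleTask are hypothetical):

using System;
using System.Messaging;
using System.Threading;

public static class QueueWorker
{
    public static void ProcessQueue(CancellationToken ct)
    {
        using (var queue = new MessageQueue(@".\private$\longRunningTasks"))
        {
            queue.Formatter = new XmlMessageFormatter(new[] { typeof(string) });
            while (!ct.IsCancellationRequested)
            {
                try
                {
                    // Block for up to 5 seconds waiting for a message.
                    Message msg = queue.Receive(TimeSpan.FromSeconds(5));
                    HandleTask((string)msg.Body); // hypothetical: do the work
                }
                catch (MessageQueueException ex)
                {
                    if (ex.MessageQueueErrorCode != MessageQueueErrorCode.IOTimeout)
                        throw;
                    // Timeout: loop around and re-check for cancellation.
                }
            }
        }
    }

    private static void HandleTask(string details) { }
}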
One thing to watch out for is things that expire (I think HttpContext does). If you are using them for operations that are fire-and-forget, remember that if the ASP.NET cleanup code runs before your operation is done, you will suddenly be unable to access certain information.
If this is for a web service, you should definitely consider thread pooling. Too many threads will bring your application to a grinding halt because they will eventually start competing for CPU time.
Is this for file or network IO? If so, you should also consider using asynchronous IO. It can be a bit more of a pain to program, but you don't have to worry about spawning off too many threads at once.
Programmatic Caching is one area which immediately comes to my mind. It is a great feature which needs to be used carefully. Since it is shared across requests, you have to put locks around it before updating it.
Another place I would check is any code accessing filesystem like writing to log files. If one request has a read-write lock on a file, other concurrent requests will error out if not handled properly.
Isn't there a limit of 25 total threads in the IIS configuration? At least in IIS 6, I believe. If you exceed that limit, interesting things (read: loooooooong response times) may happen.
Depending on what you need as far as multithreading is concerned, have you thought of spawning requests from the client? It's safe to spawn requests using AJAX and then act on the results in a callback. Or use a service as a backgrounding mechanism, which runs every X minutes and processes in the background that way.
