What kind of multi-threading issues do you have to be careful of in ASP.NET?
It's risky to spawn threads from the code-behind of an ASP.NET page, because the worker process will get recycled occasionally and your thread will die.
If you need to kick off long-running processes as a result of user actions on web pages, your best bet is to drop a message off in MSMQ and have a separate background service monitoring the queue. The service could take as long as it wants to accomplish the task, and the web page would be finished with its work almost immediately. You could accomplish the same thing with an async call to a web method, but don't rely on getting the response when the web method is finished working. From code-behind, it needs to be a quick fire-and-forget.
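If you go the MSMQ route, the page-side code can be as small as this rough sketch (the queue path and message body are illustrative assumptions, not from the answer); a separate Windows service would listen on the same queue and do the slow work:

using System.Messaging;

public static class WorkQueue
{
    // Hypothetical private queue; the background service reads from this same path.
    private const string QueuePath = @".\Private$\LongRunningWork";

    public static void Enqueue(string workItemId)
    {
        if (!MessageQueue.Exists(QueuePath))
            MessageQueue.Create(QueuePath);

        using (var queue = new MessageQueue(QueuePath))
        {
            // Fire-and-forget: the page returns as soon as the message is on the queue.
            queue.Send(workItemId, "work request");
        }
    }
}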
One thing to watch out for is things that expire (HttpContext does, I think). If you are using them in operations that are "fire and forget", remember that if the ASP.NET cleanup code runs before your operation is done, you will suddenly find you can't access certain information.
If this is for a web service, you should definitely consider thread pooling. Too many threads will bring your application to a grinding halt because they will eventually start competing for CPU time.
Is this for file or network IO? If so, you should also consider using asynchronous IO. It can be a bit more of a pain to program, but you don't have to worry about spawning off too many threads at once.
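For what it's worth, here is a minimal sketch of asynchronous file IO on .NET 4.5+ (the path is a placeholder); opening the stream with useAsync: true is what makes the await genuinely non-blocking:

using System.IO;
using System.Threading.Tasks;

public static class AsyncFileReader
{
    public static async Task<string> ReadAllTextAsync(string path)
    {
        // useAsync: true requests overlapped IO from the OS, so no thread
        // sits blocked while the disk does the work.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        using (var reader = new StreamReader(stream))
        {
            return await reader.ReadToEndAsync();
        }
    }
}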
Programmatic Caching is one area which immediately comes to my mind. It is a great feature which needs to be used carefully. Since it is shared across requests, you have to put locks around it before updating it.
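A minimal sketch of what that locking can look like (the cache key, the ReportData type, and the loader method are all hypothetical):

using System;
using System.Web;
using System.Web.Caching;

public static class ReportCache
{
    private static readonly object CacheLock = new object();

    public static ReportData Get()
    {
        var data = HttpRuntime.Cache["reportData"] as ReportData;
        if (data == null)
        {
            lock (CacheLock)
            {
                // Re-check inside the lock: another request may have filled it already.
                data = HttpRuntime.Cache["reportData"] as ReportData;
                if (data == null)
                {
                    data = LoadReportDataFromDatabase(); // hypothetical loader
                    HttpRuntime.Cache.Insert("reportData", data, null,
                        DateTime.UtcNow.AddMinutes(10), Cache.NoSlidingExpiration);
                }
            }
        }
        return data;
    }
}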
Another place I would check is any code accessing filesystem like writing to log files. If one request has a read-write lock on a file, other concurrent requests will error out if not handled properly.
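The simplest guard there is a shared lock around the file write, something like this sketch (the log path is made up):

using System;
using System.IO;

public static class SimpleLog
{
    private static readonly object LogLock = new object();

    public static void Write(string message)
    {
        // Serialize concurrent requests so they don't fight over the file handle.
        lock (LogLock)
        {
            File.AppendAllText(@"C:\Logs\app.log",
                DateTime.UtcNow.ToString("o") + " " + message + Environment.NewLine);
        }
    }
}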
Isn't there a limit of 25 total threads in the IIS configuration? At least in IIS 6, I believe. If you exceed that limit, interesting things (read: loooooooong response times) may happen.
Depending on what you need, as far as multi-threading is concerned, have you thought of spawning requests from the client? It's safe to spawn requests using AJAX and then act on the results in a callback. Or use a service as a backgrounding mechanism which runs every X minutes and does the processing in the background that way.
Related
With regard to the asynchronous ASP.NET web pages article on MSDN:
The advantages are obvious with long-running pages or high server load. So, given projects where you think demand may be high somewhere down the track, when usage grows, is there any reason NOT to implement async ASP.NET in every web application as a standard? Are there any disadvantages to the approach?
Secondary question: are there any real-world studies/examples of where the advantages start to appear, in different web app situations? Or is it just a matter of suck it and see?
From your own link:
Only I/O-bound operations are good candidates for becoming async action methods on an asynchronous controller class. An I/O-bound operation is an operation that doesn’t depend on the local CPU for completion. When an I/O-bound operation is active, the CPU just waits for data to be processed (that is, downloaded) from external storage (a database or a remote service). I/O-bound operations are in contrast to CPU-bound operations, where the completion of a task depends on the activity of the CPU.
Async pages are not free; they come at a price. They are generally good when your page is making an external call to a service or performing some long-running, non-CPU-bound operation. Otherwise, you are likely to thrash the CPU, leaving you with a worse situation than you had before going async.
The idea is to use async when you would be eating up a thread from your application's thread pool doing non-CPU intensive work (waiting for a response from a long-running service). That way, your application can continue processing requests and doesn't start queuing new ones, slowly draining the responsiveness from your app.
Here is another link with information when/when not to use async pages.
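To make the idea concrete, here is a rough sketch of an async MVC action on .NET 4.5 (the controller name and service URL are placeholders); the request thread goes back to the pool while the remote call is in flight:

using System.Net.Http;
using System.Threading.Tasks;
using System.Web.Mvc;

public class ReportsController : Controller
{
    private static readonly HttpClient Client = new HttpClient();

    public async Task<ActionResult> Summary()
    {
        // I/O-bound work: no worker thread is held while we wait on the remote service.
        string json = await Client.GetStringAsync("https://example.com/api/summary");
        return Content(json, "application/json");
    }
}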
Edit
As for what is considered "long running," you're faced with the crummy answer of "It depends." To figure this out, you would need to profile your application, see how many of your "long running" requests cause subsequent requests to be queued, instead of processed, by IIS. The decision comes down to being in a situation in which paying the costly toll of context switching is less than the return you're going to get for doing so. If your bottleneck is a certain page or service that causes incoming requests to be held off, it is probably a good idea to start thinking about async work. But, you might also be doing too much work in the request and it could be a "code smell" that you need to refactor your code.
In the end, it depends.
Here is an excerpt from MSDN.
In general, use asynchronous pipelines when the following conditions are true:
The operations are network-bound or I/O-bound instead of CPU-bound.
Testing shows that the blocking operations are a bottleneck in site performance and that IIS can service more requests by using asynchronous action methods for these blocking calls.
Parallelism is more important than simplicity of code.
You want to provide a mechanism that lets users cancel a long-running request.
While the link is about MVC, the idea holds true for other flavors of ASP.NET, too.
I am working on an ASP.NET project, and yesterday I saw a piece of code that uses System.Threading.Thread to offload some tasks to a new thread. The thread runs a few SQL statements and logs the result.
Isn't it better to use another approach? For example to have a Windows Service that performs the SQL batch. Then the web page will just enqueue the batch (via WCF).
In general, what are the best practices for multithreading in ASP.NET? Are there justified usages of threads/TPL tasks/etc. in a web page?
My thoughts on using multi-threading in ASP.NET:
ASP.NET recycles the AppDomain for various reasons, for example when you change web.config or periodically to avoid memory leaks. The thing is, you don't know exactly when a recycle will happen. A long-running thread is not suitable, because when ASP.NET recycles, it will take your thread down with it. The right approach in this case is for long-running tasks to run in a background process via a queue, as you mention.
For short-running, fire-and-forget tasks, TPL or async/await are the most appropriate, because they don't block a thread-pool thread that could otherwise be used for HTTP requests.
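On .NET 4.5.2 and later, HostingEnvironment.QueueBackgroundWorkItem is a reasonable home for that kind of short fire-and-forget work, since the runtime at least knows about it during shutdown; it is still not a durable queue, though. A sketch (the batch method is hypothetical):

using System.Web.Hosting;

public static class BackgroundWork
{
    public static void LogSqlBatch(int batchId)
    {
        // The runtime tracks this work and tries to delay AppDomain shutdown briefly,
        // but anything truly important still belongs in an out-of-process queue.
        HostingEnvironment.QueueBackgroundWorkItem(cancellationToken =>
        {
            RunSqlBatch(batchId, cancellationToken); // hypothetical data-access call
        });
    }
}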
In my opinion this should be solved by raising some kind of flag in the database and having a Windows service that periodically checks the flag and starts the job. If the job runs too frequently, a dedicated queue solution should be used (MSMQ, RabbitMQ, etc.) to avoid overloading the database or letting the table grow too fast. I don't think communicating directly with the Windows service via WCF or anything else is a good idea, because this may result in dropped messages.
That being said, sometimes a project needs to run on shared hosting and cannot set up a dedicated Windows service. In this case a thread is acceptable as a workaround that should be removed as soon as the project grows enough to have its own server.
I believe all other threading in ASP.NET is a sign of a problem, except for using Tasks to represent async operations, or in the extremely rare case when you want to perform a computation in parallel in a web project but your project has very few concurrent users (fewer concurrent users than the number of cores).
Why are Tasks useful in ASP.NET?
First reason to use Tasks for async operations is that as of .NET 4.5 async APIs return Tasks :)
Async operations (not to be confused with parallel computations) may be web service calls, database calls, etc. They may be useful for two things:
Fire several of them at once and your job will take roughly as long as the longest operation. If you fire them in sequential (non-async) fashion, they will take time equal to the sum of the times of each operation, which is obviously more (sketched below, after the next point).
They can improve scalability by releasing the thread executing the page - Node.js style. ASP.NET has supported this since forever, but in version 4.5 it is really easy to use. I'll go as far as claiming that it is easier than Node.js because of async/await. Releasing the thread is important because you may deplete the threads in the pool by having them wait. The result is that your website becomes slow once there is a certain number of users, despite the fact that CPU usage is around 30%, simply because new requests are waiting in the queue. If you increase the number of threads in the thread pool, you pay the price of constant context switching done by the OS. At a certain point you will hit 100% CPU usage, but 40% of it will be spent on context switching. You will increase the throughput, but with diminishing returns. A lot of threads also increase the memory footprint.
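To illustrate both points, here is a rough sketch on .NET 4.5 / MVC 4+ (the controller name and URLs are placeholders): the two calls run concurrently, and the request thread is released while they are pending:

using System.Net.Http;
using System.Threading.Tasks;
using System.Web.Mvc;

public class DashboardController : Controller
{
    private static readonly HttpClient Client = new HttpClient();

    public async Task<ActionResult> Index()
    {
        // Start both operations at once: total time is roughly the slower of the two.
        Task<string> ordersTask = Client.GetStringAsync("https://example.com/api/orders");
        Task<string> customersTask = Client.GetStringAsync("https://example.com/api/customers");

        // While these are in flight, the thread is back in the pool serving other requests.
        await Task.WhenAll(ordersTask, customersTask);

        ViewBag.Orders = ordersTask.Result;
        ViewBag.Customers = customersTask.Result;
        return View();
    }
}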
I have a long-running process in my MVC application (C#).
It's building data for many reports.
Some of the clients could take several minutes or longer to calculate. Is running the process in a separate thread the best option?
Is there another way to allow the process to run, while allowing the user to still use the rest of the site?
If threading is the best solution, any good sites or stackoverflow threads to look at on how to do this?
When I've had cases like those, I usually would build a service to asynchronously process requests, and return a handle that I could use to check on its status in a database. IMHO, splitting it off as a thread in the web application seems like you'd be trying to shove a square peg into a round hole.
I've used two methods to solve this. If the work is guaranteed to not be TOO long running, I've kicked off a thread to do the work and return immediately to the user. When we couldn't make that guarantee, we used a queue (we happened to use MSMQ) for executing long running tasks. This processing was done on a different server apart from IIS. A benefit of this is that we built in a wait and retry on failure mechanism. So besides it handling long running tasks, we also used it for anything that might fail in a way that was inconvenient to handle in our MVC app. The main example of this is sending an email. Rather than do that in the MVC app we would just toss an email task on the queue. We used the Command Pattern for the task objects placed on the queue. Once we had that mechanism in place, we stopped using the technique of spawning a thread from our MVC code.
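As a rough sketch of that command-pattern setup (the types, properties, and queue path are illustrative, not the original code):

using System;
using System.Messaging;

// The worker on the other server deserializes the message and calls Execute().
public interface ICommand
{
    void Execute();
}

public class SendEmailCommand : ICommand
{
    public string To { get; set; }
    public string Subject { get; set; }
    public string Body { get; set; }

    public void Execute()
    {
        // Actual SMTP work happens in the worker process, not in the MVC app.
    }
}

public static class TaskQueue
{
    public static void Enqueue(ICommand command)
    {
        using (var queue = new MessageQueue(@".\Private$\Tasks"))
        {
            // The default XML formatter serializes the command's public properties.
            queue.Send(command);
        }
    }
}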
Under Windows Server 2008 64-bit, IIS 7.0, and .NET 4.0, I have a long-running (> 30 minutes) ASP.NET application (using the ASP.NET thread pool, with synchronous request processing). The web application has no pages; its main purpose is reading huge files (> 1 GB) in chunks (~5 MB) and transferring them to the clients. Code:
while (reading)
{
    // Write only the bytes actually read into the buffer on this iteration;
    // the final chunk is usually smaller than the full buffer size.
    Response.OutputStream.Write(buffer, 0, bytesRead);
    Response.Flush();
}
A single-producer/single-consumer pattern is implemented, so for each request there are two threads. I don't use the Task library here, but please let me know if it has an advantage over traditional thread creation in this scenario. An HTTP handler (.ashx) is used instead of an (.aspx) page. Under stress testing, CPU utilization is not a problem, but with a single worker process, after 210 concurrent clients, new connections encounter timeouts. This is solved by web gardening since I don't use session state. I'm not sure if there's any big issue I've missed, but please let me know what other considerations should be taken into account, in your opinion.
For example, maybe IIS closes long-running TCP connections due to a "connection timeout", since normal ASP.NET pages are processed in less than 5 minutes, so I should increase that value.
I appreciate your ideas.
Personally, I would be looking at a different mechanism for this type of processing. HTTP requests/web applications are NOT designed for this type of thing, and stability is going to be VERY hard to achieve; you have a number of risks that could cause you major issues when working with this type of model.
I would move that processing off to a backend process, so that you are OUTSIDE of the ASP.NET runtime; that way you have more control over start/shutdown, etc.
First, Never. NEVER. NEVER! do any processing that takes more than a few seconds in a thread pool thread. There are a limited number of them, and they're used by the system for many things. This is asking for trouble.
Second, while the handler is a good idea, you're a little vague about what you mean by "generate on the fly." Do you mean you are encrypting a file on the fly and this encryption can take 30 minutes? Or do you mean you're pulling data from a database and assembling a file? Or that the file takes 30 minutes to download?
Edit:
As I said, don't use a thread pool for anything long running. Create your own thread, or if you're using .NET 4 use a Task and specify it as long running.
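For example, a minimal sketch (the work method is a placeholder):

using System.Threading;
using System.Threading.Tasks;

public static class LongWork
{
    public static Task Start()
    {
        // LongRunning hints the scheduler to give this work its own dedicated thread
        // instead of tying up a thread-pool thread for the whole duration.
        return Task.Factory.StartNew(
            () => GenerateLargeFile(),   // hypothetical long-running work
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
}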
Long running processes should not be implemented this way. Pass this off to a service that you set up.
If you do want to have a page hang for a client, consider interfacing from AJAX to something that does not block on IO threads - like node.js.
Pushing notifications to many clients is not something ASP.NET can handle due to thread usage, hence my node.js suggestion. If your load is low, you have other options.
Use web gardening for more stability in your application.
Turn off caching since you don't have .aspx pages.
It's hard to advise more without performance analysis. Use the VS built-in profiler and find the bottlenecks.
The Web 1.0 way of dealing with long running processes is to spawn them off on the server and return immediately. Have the spawned off service update a database with progress and pages on the site can query for progress.
The most common usage of this technique is getting a package delivery. You can't hold the HTTP connection open until my package shows up, so it just gives you a way to query for progress. The background process deals with orchestrating all of the steps it takes for getting the item, wrapping it up, getting it onto a UPS truck, etc. All along the way, each step is recorded in the database. Conceptually, it's the same.
Edit based on Question Edit: Just return a result page immediately, and generate the binary on the server in a spawned thread or process. Use Ajax to check to see if the file is ready and when it is, provide a link to it.
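A sketch of the server side of that polling (the job repository and its fields are hypothetical):

using System;
using System.Web.Mvc;

public class DownloadController : Controller
{
    // Hit from the page via Ajax every few seconds.
    public ActionResult Status(Guid jobId)
    {
        var job = JobRepository.Find(jobId); // hypothetical lookup against the progress table
        return Json(new
        {
            done = job.IsComplete,
            progress = job.PercentComplete,
            downloadUrl = job.IsComplete ? Url.Action("Download", new { jobId }) : null
        }, JsonRequestBehavior.AllowGet);
    }
}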
I have written an HttpModule that spawns a background thread. I'm using the thread like a Scheduled Task that runs in-process, which is really handy.
What are best practices for keeping track of this thread? I've never done this before, and I'm a little confused about some aspects of it:
How do I know if the thread is still running? I see it do its job, but is there another way to know if it's still alive? I downloaded ProcMon, but w3wp.exe spawns a boatload of threads, so I had no idea which was my thread. I named it, but that didn't help.
How do I "catch" the thread if it dies? Is there some kind of Dispose method where I can have it write to the EventLog or something if it fails? A "dying declaration" or something?
How do I actively stop the thread? If I want it to stop running this background process, how do I kill it without having to bounce IIS?
Is there any way to start it again, independently of the HttpModule? (I'm guessing the answer to this is no...)
Edit: Just to clarify, the intention is that my thread never goes away. It runs a function, then goes to sleep for a couple minutes, then wakes up and runs the function again. It's not like it's doing one task then ending.
When Jeff made Stackoverflow, he had a similar issue.
His solution was to use cache expiration. You'd put something in the cache, and when it expires, an event is fired on a non-user-facing thread. In the event handler for the expiration, you stick some code in to re-add the item to the cache and do whatever housekeeping work needs to be done for your application.
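A minimal sketch of that trick (the key name, interval, and housekeeping method are arbitrary):

using System;
using System.Web;
using System.Web.Caching;

public static class CacheScheduler
{
    private const string Key = "housekeeping-trigger";

    public static void Start()
    {
        if (HttpRuntime.Cache[Key] != null) return;

        // The dummy item expires every two minutes; the removal callback fires
        // on a background (non-request) thread.
        HttpRuntime.Cache.Insert(Key, new object(), null,
            DateTime.UtcNow.AddMinutes(2), Cache.NoSlidingExpiration,
            CacheItemPriority.NotRemovable, OnExpired);
    }

    private static void OnExpired(string key, object value, CacheItemRemovedReason reason)
    {
        try
        {
            DoHousekeeping(); // hypothetical work method
        }
        finally
        {
            Start(); // re-add the item so the cycle repeats
        }
    }
}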
Using this technique, your subquestions are easily answered:
You check that the item is still in the cache.
If the item is not in the cache, re-add it.
Remove the cache item from the cache.
Add the item back to the cache.
You could make a small management page to configure these options.
This gives you a nice way to roughly time housekeeping processes in your web application. It doesn't require a separate Windows Service, which is a big win.
From my experience, you can get this to work "good enough", but not perfectly. I would recommend implementing your repeating tasks in a Windows service. Depending on what the tasks do, the Windows service maybe wouldn't even have to talk to the web application or vice versa, e.g. if both work with the same database. Otherwise you could still use e.g. WCF for communication.
The big advantage is: The windows service will start with the OS, you can easily configure, start and stop it using the control panel, you have built-in monitoring via Windows Event Log, you can update the background service and the web application independently etc.
If this is not an option, e. g. because you are in a shared hosting environment, I would recommend the following:
Start your background thread in Application_Start (Global.asax) and store the thread reference in a static variable (a rough sketch follows after these points).
Wrap every method called on your background thread with try/catch, because since .NET 2.0, every unhandled exception on a background thread will shut down the application. (It will be restarted on the next request, but it slows down the next request, kills all current sessions and caches, and of course no timer will be active until the next request.)
On every request (implemented as an HttpModule, or in Global.asax again), check the Thread instance in the static variable (is it still != null, is the thread alive and running, etc.). If not, call the restart code. Use locking in the restart part to make sure that the thread will not be created twice at the same time.
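Put together, those points look roughly like this in Global.asax (the class members and the task method are illustrative):

using System;
using System.Threading;
using System.Web;

public class Global : HttpApplication
{
    public static Thread WorkerThread;
    private static readonly object RestartLock = new object();

    protected void Application_Start(object sender, EventArgs e)
    {
        StartWorker();
    }

    // Called from Application_Start and again from the per-request health check.
    public static void StartWorker()
    {
        lock (RestartLock)
        {
            if (WorkerThread != null && WorkerThread.IsAlive) return;

            WorkerThread = new Thread(WorkLoop) { IsBackground = true, Name = "SiteWorker" };
            WorkerThread.Start();
        }
    }

    private static void WorkLoop()
    {
        while (true)
        {
            try
            {
                DoRepeatingTask(); // hypothetical housekeeping method
            }
            catch (Exception)
            {
                // Log and swallow: an unhandled exception here would tear down the AppDomain.
            }
            Thread.Sleep(TimeSpan.FromMinutes(2));
        }
    }
}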
Even then you can't be sure that your background thread is always running, if you don't have regular traffic around the clock. Also keep in mind that in a shared hosting environment it is very common to shut down application pools if there is no activity for a few hours. You could try to improve this by setting up a scheduled task on a client machine on your own network doing a light-weight HTTP request on your application every few minutes, just to ensure that your application is always running.