What is the correct use of the ASP.NET Thread Pool? - asp.net

My scenario is this: I have a file that slowly gets populated over the course of an hour or two (mp3, video, etc). As this file is populated, many users are connected to the server to receive new data as it is added.
At the moment each visitor connects to the server, and an IHttpAsyncHandler allocates a thread from the thread pool to handle the request. However using the default thread pool settings, this means that only 20 visitors can connect to a server (single processor) at a time.
Because most of the time these threads are simply waiting for new data, what would be the best way to release the thread to the pool and have it re-activate when new data is available?
Many Thanks,
Ady

I would just use regular Threads for this purpose. The .NET ThreadPool is not really designed to support the releasing and re-activation of (long-running) threads depending on their internal state... at the very least, you would have to do some creative programming to achieve the desired behavior if you stick with a ThreadPool (i.e. break the logic into small asynchronous tasks that get executed by the ThreadPool).
If you go with Thread, then you will have direct control over all the active threads, so you can accept more visitors at the same time.

F# has a feature called Asynchronous Workflows that is ideally suited to this sort of thing. When your code is waiting on an external data source, the thread is returned to the thread pool for other uses. When new data arrives, the workflow gets a thread out of the pool and uses it to resume your code where you left off. In this way you never have to tie up a thread that's doing nothing but waiting on I/O.
It may be overkill to learn a new language just for this one use, but F# towers over every other CLR language when it comes to async I/O, and it's a really fun language, besides.
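For comparison, the same "give the thread back while waiting" idea can be sketched in C#. This is only an illustration under stated assumptions, not code from the question: DataNotifier, WaitForDataAsync and Publish are hypothetical names, and it assumes a runtime that has TaskCompletionSource with TaskCreationOptions.RunContinuationsAsynchronously (.NET 4.6 or later), which postdates the original discussion.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: each waiting request parks a TaskCompletionSource
// instead of blocking a thread-pool thread.
public sealed class DataNotifier
{
    private readonly object _gate = new object();
    private List<TaskCompletionSource<long>> _waiters =
        new List<TaskCompletionSource<long>>();

    // Called by a request handler. The returned task completes on the next Publish.
    public Task<long> WaitForDataAsync()
    {
        var tcs = new TaskCompletionSource<long>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        lock (_gate)
        {
            _waiters.Add(tcs);
        }
        return tcs.Task;
    }

    // Called by the writer after it appends data. Wakes every current waiter.
    public void Publish(long newLength)
    {
        List<TaskCompletionSource<long>> toRelease;
        lock (_gate)
        {
            // Swap the list so waiters are completed outside the lock.
            toRelease = _waiters;
            _waiters = new List<TaskCompletionSource<long>>();
        }
        foreach (TaskCompletionSource<long> tcs in toRelease)
        {
            tcs.TrySetResult(newLength);
        }
    }
}
```

A handler would await WaitForDataAsync(), send the newly available bytes, and loop. A real implementation would also need to compare the client's current position with the file length before waiting, so that a Publish arriving between the check and the wait is not missed.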

Related

How to best implement a blocking/waiting actor?

I'm fairly new to Akka and writing concurrent applications and I'm wondering what's a good way to implement an actor that would wait for a redis list and once an item becomes available it will process it, or send it to a different actor to process?
Would using the blocking function BRPOPLPUSH be better, or would a scheduler that will ask the actor to poll redis every second be a better way?
Also, on a normal system, how many of these actors can I spawn concurrently without consuming all the resources the system has to offer? How does one decide how many of each Actor type an actor system should be able to handle on the system it's running on?
As a rule of thumb you should never block inside receive. Each actor should rely only on CPU and never wait, sleep or block on I/O. When these conditions are met you can create even millions of actors working concurrently. Each actor is supposed to have a 600-650 byte memory footprint (see: Concurrency, Scalability & Fault-tolerance 2.0 with Akka Actors & STM).
Back to your main question. Unfortunately there is no official Redis client "compatible" with Akka philosophy, that is, completely asynchronous. What you need is a client that instead of blocking will return you a Future object of some sort and allow you to register callback when results are available. There are such clients e.g. for Perl and node.js.
However, I found the independent fyrie-redis project, which you might find useful. If you are bound to a synchronous client, the best you can do is either:
poll Redis periodically without blocking and inform some actor by sending it a message with the Redis reply, or
block inside an actor and understand the consequences
See also
Redis client library recommendations for use from Scala
BRPOPLPUSH will block for a long time (up to the timeout you specify), so I would favour a Scheduler instead, which still blocks, but for a shorter amount of time every second or so.
Whichever way you go, because you are blocking, you should read this section of the Akka docs which describes methods for working with blocking libraries.
Do you have control over the code that is inserting the item into redis? If so you could get that code to send your akka code a message (maybe over ActiveMQ using the akka camel support) to notify it when the item has been inserted into redis. This will be a more event driven way of working and prevent you from having to poll, or block for super long periods of time.

How does a (full featured) long polling server work abstractly

Since you're using an event loop as opposed to threads, how does the actual server look?
I know it uses an event loop, but how do you separate out the requests? And how do you prevent your server from running extremely slowly (since it, I assume, can only push one thing at a time since it's threadless?)
Some sort of pseudo-code would be great.
Forgive my ignorance; of course, if there's somewhere that explains it in a non-basic "this is good enough until you get 1000 visitors way", I'd be glad to know of it.
The implementation details of a long poll server would vary so much from platform to platform that your assumptions might not be correct.
I implemented a COMET server for our website using .NET. I leveraged HttpListener to do all the boring http stuff and Microsoft CCR to deal with all the async IO. It uses a pool of threads to service requests as and when they come in. It's not a thread per client, but it's not single threaded either, generally requiring a few tens of threads to stay fluid as user numbers rise. This approach means that we scale easily across multiple CPU cores. CCR's async enumerator pattern really helped keep the asynchronous logic nice and tidy, and I can read the code fairly easily a year later.
This approach has proved extremely scalable. I've tested up to 20000 clients, whereupon we became bound by network IO. It handles all our clients (who are "permanently" connected, reconnecting every 30s) ticking along at 1-2% server load. It's definitely worth reconsidering your assumption that you must choose either an event loop architecture or multiple threads. The middle ground works very nicely for me, and the .NET asynchronous programming model for dealing with IO bound tasks really takes you away from needing to micro-manage threads. Effectively, when there's IO data to process, a thread is borrowed from the pool to do that processing, and subsequently returned to the pool ready to service another request. All the complicated IOCP stuff is abstracted away.
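Since the question asked for something like pseudo-code, here is a rough C# sketch of that middle ground. It is not the CCR-based server described above: it swaps in async/await (assumes .NET 4.5 or later and C# 7), and LongPollServer, RunAsync and the waitForEvent delegate are hypothetical names. The accept loop never dedicates a thread to a client; each pending request is just a suspended continuation until its event arrives.

```csharp
using System;
using System.Net;
using System.Text;
using System.Threading.Tasks;

public static class LongPollServer
{
    // Accept loop. waitForEvent is a hypothetical delegate whose task
    // completes when there is something to push to the client.
    public static async Task RunAsync(HttpListener listener,
                                      Func<Task<string>> waitForEvent)
    {
        listener.Start();
        while (listener.IsListening)
        {
            HttpListenerContext context;
            try
            {
                context = await listener.GetContextAsync();
            }
            catch (HttpListenerException) { break; }   // listener was stopped
            catch (ObjectDisposedException) { break; } // listener was closed

            // Deliberately not awaited, so the loop keeps accepting clients.
            _ = HandleAsync(context, waitForEvent);
        }
    }

    private static async Task HandleAsync(HttpListenerContext context,
                                          Func<Task<string>> waitForEvent)
    {
        try
        {
            // No thread is held while the request waits here.
            string payload = await waitForEvent();
            byte[] body = Encoding.UTF8.GetBytes(payload);
            context.Response.ContentType = "text/plain";
            context.Response.ContentLength64 = body.Length;
            await context.Response.OutputStream.WriteAsync(body, 0, body.Length);
            context.Response.Close();
        }
        catch (HttpListenerException)
        {
            // The client went away before the response could be sent.
            context.Response.Abort();
        }
    }
}
```

A production server would add timeouts (so idle polls are answered and the client reconnects), per-client state, and error logging; those are left out to keep the shape of the loop visible.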

Asp.net thread question

In my ASP.NET MVC application I have a number of threads that wait for a certain length of time and wake up to do some cleanup tasks over and over. I have not deployed this application to a production server yet but on my dev machine they seem to work as expected. For these threads to work the same on IIS7 do I need to look out for anything? Will IIS7 keep my threads alive indefinitely? Are there implications to worry about?
Also I want to queue, lets say 50 objects that were created through various requests and process them all in one go. I'd like to maintain them inside a list and then process the list which means that the list object has to be kept alive indefinitely. I'd like to avoid serializing my objects into the DB in order to maintain this queue. What is the correct way of achieving this?
Will IIS7 keep my threads alive indefinitely?
No, if the application pool recycles (if there's a long inactivity or some memory threshold is hit) those threads will be stopped as the application will be unloaded from memory. If those objects are that precious, I wouldn't recommend keeping them in memory; instead, serialize them to some persistent storage so that they can be processed later in case of failure.
The design you describe is fine when you don't mind losing cached commands in the queue. Otherwise it would be better to go with a different design. ASP.NET isn't suited for this type of processing, because IIS can recycle the process. When that happens you lose your in-memory queue. IIS could also decide to unload the AppDomain because no new requests are coming in. In that case your threads will also stop running, which means that pending operations will still not be processed, even when you use a persisted queue.
You'd probably be better off with some sort of transactional queue, such as MSMQ or a custom table in the database (or look at the open source NServiceBus). Adding operations to the queue can be done by your web application and processing items can be done within a Windows service application that will not be recycled and can process the queue in a transactional way.
Since you're talking about multiple threads: when using a Windows service you can build it in such a way that it can run multiple threads or make it single threaded and run several instances of the same thread. This is a very flexible design that I used successfully in the past to distribute CPU and disk intensive operations over multiple machines.

Asynchronous pages in the ASP.NET framework - where are the other threads and how is it reattached?

Sorry for this dumb question on Asynchronous operations. This is how I understand it.
IIS has a limited set of worker threads waiting for requests. If one request is a long running operation, it will block that thread. This leads to fewer threads to serve requests.
Way to fix this - use asynchronous pages. When a request comes in, the main worker thread is freed and this other thread is created in some other place. The main thread is thus able to serve other requests. When the request completes on this other thread, another thread is picked from the main thread pool and the response is sent back to the client.
1) Where are these other threads located? Is there another thread pool?
2) If ASP.NET likes creating new threads in this other thread pool(?), why not increase the number of threads in the main worker pool - they are all running on the same machine anyway? I don't see the advantage of moving that request to this other thread pool. Memory/CPU should be the same, right?
3) If the main thread hands off a request to this other thread, why does the request not get disconnected? It magically hands off the request to another worker thread somewhere else and when the long running process completes, it picks a thread from the main worker pool and sends response to the client. I am amazed...but how does that work?
You didn't say which version of IIS or ASP.NET you're using. A lot of folks talk about IIS and ASP.NET as if they are one and the same, but they really are two components working together. Note that IIS 6 and 7 listen to an I/O completion port where they pick up completions from HTTP.sys. The IIS thread pool is used for this, and it has a maximum thread count of 256. This thread pool is designed in such a way that it does not handle long running tasks well. The recommendation from the IIS team is to switch to another thread if you're going to do substantial work, such as done by the ASP.NET ISAPI and/or ASP.NET "integrated mode" handler on IIS 7. Otherwise you will tie up IIS threads and prevent IIS from picking up completions from HTTP.sys. Chances are you don't care about any of this, because you're not writing native code, that is, you're not writing an ISAPI or native handler for the IIS 7 pipeline. You're probably just using ASP.NET, in which case you're more interested in its thread pool and how it works.
There is a blog post at http://blogs.msdn.com/tmarq/archive/2007/07/21/asp-net-thread-usage-on-iis-7-0-and-6-0.aspx that explains how ASP.NET uses threads. Note that for ASP.NET v2.0 and v3.5 on IIS 7 you should increase MaxConcurrentRequestsPerCPU to 5000--it is a bug that it was set to 12 by default on those platforms. The new default for MaxConcurrentRequestsPerCPU in ASP.NET v4.0 on IIS 7 is 5000.
To answer your three questions:
1) First, a little primer. Only 1 thread per CPU can execute at a time. When you have more than this, you pay a penalty--a context switch is necessary every time the CPU switches to another thread, and these are expensive. However, if a thread is blocked waiting on work...then it makes sense to switch to another thread, one that can execute now.
So if I have a thread that is doing a lot of computational work and using the CPU heavily, and this takes a long time, should I switch to another thread? No! The current thread is efficiently using the CPU, so switching will only incur the cost of a context switch.
So if I have a thread that makes an HTTP or SOAP request to another server and takes a long time, should I switch threads? Yes! You can perform the HTTP or SOAP request asynchronously, so that once the "send" has occurred, you can unwind the current thread and not use any threads until there is an I/O completion for the "receive". Between the "send" and the "receive", the remote server is busy, so locally you don't need to be blocking on a thread, but instead make use of the async APIs provided in .NET Framework so that you can unwind and be notified upon completion.
Ok, so your #1 question was "Where are these other threads located? Is there another thread pool?" This depends. Most code that runs in .NET Framework uses the CLR ThreadPool, which consists of two types of threads, worker threads and i/o completion threads. What about code that doesn't use CLR ThreadPool? Well, it can create its own threads, use its own thread pool, or whatever it wants because it has access to the Win32 APIs provided by the operating system. Based on what we discussed a bit ago, it really doesn't matter where the thread comes from, and a thread is a thread as far as the operating system and hardware are concerned.
2) In your second question, you state, "I don't see the advantage of moving that request to this other thread pool." You're correct in thinking that there is NO advantage to switching unless you're going to make up for that costly context switch you just performed in order to switch. That's why I gave an example of a slow HTTP or SOAP request to a remote server as an example of a good reason to switch. And by the way, ASP.NET does not create any threads. It uses the CLR ThreadPool, and the threads in that pool are entirely managed by the CLR. They do a pretty good job of determining when you need more threads. For example, that's why ASP.NET can easily scale from executing 1 request concurrently to executing 300 requests concurrently, without doing anything. The incoming requests are posted to the CLR ThreadPool via a call to QueueUserWorkItem, and the CLR decides when to call the WaitCallback (see MSDN).
3) The third question is, "If the main thread hands off a request to this other thread, why does the request not get disconnected?" Well, IIS picks up the I/O completion from HTTP.sys when the request initially arrives at the server. IIS then invokes ASP.NET's handler (or ISAPI). ASP.NET immediately queues the request to the CLR Threadpool, and returns a pending status to IIS. This pending status tells IIS that we're not done yet, but as soon as we are done we'll let you know. Now ASP.NET manages the life of that request. When a CLR ThreadPool thread invokes the ASP.NET WaitCallback (see MSDN), it can execute the entire request on that thread, which is the normal case. Or it can switch to one or more other threads if the request is what we call asynchronous--i.e. it has an asynchronous module or handler. Either way, there are well defined ways in which the request completes, and when it finally does, ASP.NET will tell IIS we're done, and IIS will send the final bytes to the client and close the connection if Keep-Alive is not being used.
Regards,
Thomas
Async pages in ASP.NET use asynchronous callbacks, and asynchronous callbacks use the Thread Pool, and it is the same thread pool used to serve ASP.NET requests.
However, it's not quite that simple. The .NET ThreadPool has two types of threads - worker threads and I/O threads. I/O threads use what's called an I/O Completion Port, which is (greatly oversimplifying here) a thread-free or thread-agnostic means of waiting for a read/write operation on a file handle to complete, subsequently running a callback method.
(Note that a file handle does not necessarily refer to a file on disk; as far as Windows is concerned, it could just as well be a socket, pipe, etc.)
A typical .NET web developer doesn't really need to know about any of this. Of course, if you were writing an actual web server, or any kind of network server, then you would definitely need to learn about these, because they are the only way to handle hundreds of incoming connections without actually spawning hundreds of threads to serve them. There's a Managed I/O Completion Port tutorial (CodeProject) if you're interested.
Anyway, getting back on topic; when you interact with the thread pool at a high level, i.e. by writing:
ThreadPool.QueueUserWorkItem(s => DoSomeWork(s));
This does not use an I/O completion port. Ever. It posts the work to one of the normal worker threads managed by thread pool. It's the same if you use async callbacks:
Func<int> asyncFunc;
IAsyncResult BeginOperation(object sender, EventArgs e, AsyncCallback cb,
    object state)
{
    asyncFunc = () => { Thread.Sleep(500); return 42; };
    return asyncFunc.BeginInvoke(cb, state);
}
void EndOperation(IAsyncResult ar)
{
    int result = asyncFunc.EndInvoke(ar);
    Console.WriteLine(result);
}
Again - same deal. Inside the EndOperation you're running on a ThreadPool worker thread. You can verify this by inserting the following debugging code:
void EndSimpleWait(IAsyncResult ar)
{
    int maxWorkers, maxIO, availableWorkers, availableIO;
    ThreadPool.GetMaxThreads(out maxWorkers, out maxIO);
    ThreadPool.GetAvailableThreads(out availableWorkers, out availableIO);
    int result = asyncFunc.EndInvoke(ar);
}
Slap a breakpoint in there and you'll see that availableWorkers is one less than maxWorkers, while maxIO and availableIO are the same.
But some async operations are "special" in .NET. This actually has nothing to do with ASP.NET directly - they'll use I/O completion ports in a Winforms or WPF app too. Examples are:
System.Net.Sockets.Socket (BeginReceive and a whole bunch of other BeginXYZ methods)
System.IO.FileStream (BeginRead and BeginWrite)
System.ServiceModel.ClientBase<T> (BeginInvoke)
System.Net.WebRequest (BeginGetResponse)
And so on, this is nowhere near a full list. Basically almost every class in the .NET Framework that exposes its own BeginXYZ and EndXYZ methods and could conceivably perform any I/O, probably uses I/O completion ports. That's to make it easier for you, the application developer, because I/O threads are kind of hard to implement yourself in .NET.
My guess is that the .NET Framework designers deliberately chose to make it difficult to post I/O operations (compared to worker threads, where you can just write ThreadPool.QueueUserWorkItem) because it's comparatively "dangerous" if you don't know how to use them properly; by contrast, it's actually pretty straightforward to spawn these in the Windows API.
As before, you can verify what's happening with some debugging code:
WebRequest request;
IAsyncResult BeginDownload(object sender, EventArgs e,
    AsyncCallback cb, object state)
{
    request = WebRequest.Create("http://www.example.com");
    return request.BeginGetResponse(cb, state);
}
void EndDownload(IAsyncResult ar)
{
    int maxWorkers, maxIO, availableWorkers, availableIO;
    ThreadPool.GetMaxThreads(out maxWorkers, out maxIO);
    ThreadPool.GetAvailableThreads(out availableWorkers, out availableIO);
    string html;
    using (WebResponse response = request.EndGetResponse(ar))
    {
        using (StreamReader reader = new
            StreamReader(response.GetResponseStream()))
        {
            html = reader.ReadToEnd();
        }
    }
}
If you step through this one, you'll see that the thread stats are different. The availableWorkers will match maxWorkers, but availableIO is one less than maxIO. That's because you're running on an I/O thread. That's also why you're not supposed to do any expensive computations in async callbacks - posting CPU-intensive work on an I/O completion port is inefficient and, well, bad.
All of this explains why it's strongly recommended that you use Async pages in ASP.NET when you need to perform any I/O operations. The pattern is only useful for I/O operations; non-I/O async operations will end up being posted to worker threads in the ThreadPool and you'll still end up blocking subsequent ASP.NET requests. But you can spawn a virtually unlimited number of async I/O operations and not give it a second thought; these won't use any threads at all until the I/O is complete and the callback is ready to begin.
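As a small illustration of the I/O-bound case, here is a sketch that reads a file with asynchronous I/O. It uses the awaitable ReadAsync method rather than the BeginRead/EndRead pair discussed above, so it assumes .NET 4.5 or later; AsyncFileRead and ReadTextAsync are hypothetical names. Passing useAsync: true asks for asynchronous (overlapped) file I/O, so the calling thread is free while each read is outstanding; how completions are delivered depends on the platform and runtime.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class AsyncFileRead
{
    // Reads a whole file as UTF-8 text without blocking the calling thread.
    public static async Task<string> ReadTextAsync(string path)
    {
        // useAsync: true requests asynchronous I/O from the operating system.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        using (var buffer = new MemoryStream())
        {
            var chunk = new byte[4096];
            int read;
            // Each await returns the thread to its caller until data is ready.
            while ((read = await stream.ReadAsync(chunk, 0, chunk.Length)) > 0)
            {
                buffer.Write(chunk, 0, read);
            }
            return Encoding.UTF8.GetString(buffer.ToArray());
        }
    }
}
```

The same shape applies to sockets and web requests: start the operation, release the thread, and continue when the completion arrives.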
So, to summarize - there is only one ThreadPool, but there are different kinds of threads in it, and if you're performing slow I/O operations then it's much more efficient to use the I/O threads. It's got nothing to do with CPU or memory, it's all about I/O and file handles.
As for #3, it's not really a question of "why doesn't the request get disconnected", more like a question of "why would it?" A socket doesn't get closed simply because there's no thread currently sending to or receiving data from it, same way your front door doesn't automatically close if there's nobody there to greet guests. Client operations may time out if the server doesn't answer them, and may subsequently choose to disconnect from their end, but that's another issue altogether.
1) The threads are in w3svc or whatever process is running the ASP.NET engine in your particular version of IIS.
2) Not sure what you mean here. You actually have control over how many threads are in the worker thread pool. This article is pretty good: http://msdn.microsoft.com/en-us/library/ms998549.aspx
3) I think you are confusing requests and connections... To be honest, I haven't a clue how the internals of IIS work, but generally in applications that handle multiple requests simultaneously there is ONE master listening thread that will then hand off the actual work to a child thread (and do nothing else). The original request is not "disconnected" because these things are happening at completely different levels of the network protocol stack. Windows Server has no problem accepting multiple connections on TCP port 80. Think about how TCP/IP works and the fact that it is sending multiple discrete packets of information. You are thinking of "connection" like a single hose going from spigot A to spigot B, but of course that's not how it really works. It is more akin to a bucket that is just collecting whatever gets spilled into it.
Hope this helps.
The answer also depends on which version of IIS you're talking about. In earlier versions, ASP.NET did not use "IIS threads". They were .NET ThreadPool threads. In IIS 7, the IIS and ASP.NET pipelines have been merged. I don't know which threads ASP.NET uses now.
The bottom line is, don't spawn your own threads.

Using ThreadPool.QueueUserWorkItem in ASP.NET in a high traffic scenario

I've always been under the impression that using the ThreadPool for (let's say non-critical) short-lived background tasks was considered best practice, even in ASP.NET, but then I came across this article that seems to suggest otherwise - the argument being that you should leave the ThreadPool to deal with ASP.NET related requests.
So here's how I've been doing small asynchronous tasks so far:
ThreadPool.QueueUserWorkItem(s => PostLog(logEvent))
And the article is suggesting instead to create a thread explicitly, similar to:
new Thread(() => PostLog(logEvent)){ IsBackground = true }.Start()
The first method has the advantage of being managed and bounded, but there's the potential (if the article is correct) that the background tasks are then vying for threads with ASP.NET request-handlers. The second method frees up the ThreadPool, but at the cost of being unbounded and thus potentially using up too many resources.
So my question is, is the advice in the article correct?
If your site was getting so much traffic that your ThreadPool was getting full, then is it better to go out-of-band, or would a full ThreadPool imply that you're getting to the limit of your resources anyway, in which case you shouldn't be trying to start your own threads?
Clarification: I'm just asking in the scope of small non-critical asynchronous tasks (eg, remote logging), not expensive work items that would require a separate process (in these cases I agree you'll need a more robust solution).
Other answers here seem to be leaving out the most important point:
Unless you are trying to parallelize a CPU-intensive operation in order to get it done faster on a low-load site, there is no point in using a worker thread at all.
That goes for both free threads, created by new Thread(...), and worker threads in the ThreadPool that respond to QueueUserWorkItem requests.
Yes, it's true, you can starve the ThreadPool in an ASP.NET process by queuing too many work items. It will prevent ASP.NET from processing further requests. The information in the article is accurate in that respect; the same thread pool used for QueueUserWorkItem is also used to serve requests.
But if you are actually queuing enough work items to cause this starvation, then you should be starving the thread pool! If you are running literally hundreds of CPU-intensive operations at the same time, what good would it do to have another worker thread to serve an ASP.NET request, when the machine is already overloaded? If you're running into this situation, you need to redesign completely!
Most of the time I see or hear about multi-threaded code being inappropriately used in ASP.NET, it's not for queuing CPU-intensive work. It's for queuing I/O-bound work. And if you want to do I/O work, then you should be using an I/O thread (I/O Completion Port).
Specifically, you should be using the async callbacks supported by whatever library class you're using. These methods are always very clearly labeled; they start with the words Begin and End. As in Stream.BeginRead, Socket.BeginConnect, WebRequest.BeginGetResponse, and so on.
These methods do use the ThreadPool, but they use IOCPs, which do not interfere with ASP.NET requests. They are a special kind of lightweight thread that can be "woken up" by an interrupt signal from the I/O system. And in an ASP.NET application, you normally have one I/O thread for each worker thread, so every single request can have one async operation queued up. That's literally hundreds of async operations without any significant performance degradation (assuming the I/O subsystem can keep up). It's way more than you'll ever need.
Just keep in mind that async delegates do not work this way - they'll end up using a worker thread, just like ThreadPool.QueueUserWorkItem. It's only the built-in async methods of the .NET Framework library classes that are capable of doing this. You can do it yourself, but it's complicated and a little bit dangerous and probably beyond the scope of this discussion.
The best answer to this question, in my opinion, is don't use the ThreadPool or a background Thread instance in ASP.NET. It's not at all like spinning up a thread in a Windows Forms application, where you do it to keep the UI responsive and don't care about how efficient it is. In ASP.NET, your concern is throughput, and all that context switching on all those worker threads is absolutely going to kill your throughput whether you use the ThreadPool or not.
Please, if you find yourself writing threading code in ASP.NET - consider whether or not it could be rewritten to use pre-existing asynchronous methods, and if it can't, then please consider whether or not you really, truly need the code to run in a background thread at all. In the majority of cases, you will probably be adding complexity for no net benefit.
Per Thomas Marquardt of the ASP.NET team at Microsoft, it is safe to use the ASP.NET ThreadPool (QueueUserWorkItem).
From the article:
Q) If my ASP.NET Application uses CLR ThreadPool threads, won’t I starve ASP.NET, which also uses the CLR ThreadPool to execute requests?
..
A) To summarize, don’t worry about starving ASP.NET of threads, and if you think there’s a problem here let me know and we’ll take care of it.
Q) Should I create my own threads (new Thread)? Won’t this be better for ASP.NET, since it uses the CLR ThreadPool.
A) Please don’t. Or to put it a different way, no!!! If you’re really smart—much smarter than me—then you can create your own threads; otherwise, don’t even think about it. Here are some reasons why you should not frequently create new threads:
It is very expensive, compared to QueueUserWorkItem...By the way, if you can write a better ThreadPool than the CLR’s, I encourage you to apply for a job at Microsoft, because we’re definitely looking for people like you!
Websites shouldn't go around spawning threads.
You typically move this functionality out into a Windows Service that you then communicate with (I use MSMQ to talk to them).
-- Edit
I described an implementation here: Queue-Based Background Processing in ASP.NET MVC Web Application
-- Edit
To expand why this is even better than just threads:
Using MSMQ, you can communicate to another server. You can write to a queue across machines, so if you determine, for some reason, that your background task is using up the resources of the main server too much, you can just shift it quite trivially.
It also allows you to batch-process whatever task you were trying to do (send emails/whatever).
I definitely think that general practice for quick, low-priority asynchronous work in ASP.NET would be to use the .NET thread pool, particularly for high-traffic scenarios as you want your resources to be bounded.
Also, the implementation of threading is hidden - if you start spawning your own threads, you have to manage them properly as well. Not saying you couldn't do it, but why reinvent that wheel?
If performance becomes an issue, and you can establish that the thread pool is the limiting factor (and not database connections, outgoing network connections, memory, page timeouts etc) then you tweak the thread pool configuration to allow more worker threads, higher queued requests, etc.
If you don't have a performance problem then choosing to spawn new threads to reduce contention with the ASP.NET request queue is classic premature optimization.
Ideally you wouldn't need to use a separate thread to do a logging operation though - just enable the original thread to complete the operation as quickly as possible, which is where MSMQ and a separate consumer thread / process come in to the picture. I agree that this is heavier and more work to implement, but you really need the durability here - the volatility of a shared, in-memory queue will quickly wear out its welcome.
You should use QueueUserWorkItem, and avoid creating new threads like you would avoid the plague. For a visual that explains why you won't starve ASP.NET, since it uses the same ThreadPool, imagine a very skilled juggler using two hands to keep a half dozen bowling pins, swords, or whatever in flight. For a visual of why creating your own threads is bad, imagine what happens in Seattle at rush hour when heavily used entrance ramps to the highway allow vehicles to enter traffic immediately instead of using a light and limiting the number of entrances to one every few seconds. Finally, for a detailed explanation, please see this link:
http://blogs.msdn.com/tmarq/archive/2010/04/14/performing-asynchronous-work-or-tasks-in-asp-net-applications.aspx
Thanks,
Thomas
That article is not correct. ASP.NET has its own pool of threads, managed worker threads, for serving ASP.NET requests. This pool is usually a few hundred threads and is separate from the ThreadPool pool, which is some smaller multiple of processors.
Using ThreadPool in ASP.NET will not interfere with ASP.NET worker threads. Using ThreadPool is fine.
It would also be acceptable to set up a single thread which is just for logging messages, using the producer/consumer pattern to pass log messages to that thread. In that case, since the thread is long-lived, you should create a single new thread to run the logging.
Using a new thread for every message is definitely overkill.
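A rough sketch of that single-logging-thread idea, assuming a bounded in-memory queue is acceptable. BackgroundLogger, its capacity of 10000 and the sink delegate are all hypothetical names and values chosen for illustration, not part of any library.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: one long-lived consumer thread drains a shared queue.
sealed class BackgroundLogger : IDisposable
{
    private readonly BlockingCollection<string> _queue =
        new BlockingCollection<string>(10000); // bounded so a slow sink cannot exhaust memory
    private readonly Action<string> _sink;
    private readonly Thread _worker;

    public BackgroundLogger(Action<string> sink)
    {
        _sink = sink;
        _worker = new Thread(Consume) { IsBackground = true, Name = "logger" };
        _worker.Start();
    }

    // Producer side, called from request threads. TryAdd returns at once;
    // it returns false (message dropped) when the queue is full.
    public bool Log(string message)
    {
        return _queue.TryAdd(message);
    }

    private void Consume()
    {
        // Blocks while the queue is empty and ends once CompleteAdding
        // has been called and the remaining items have been drained.
        foreach (var message in _queue.GetConsumingEnumerable())
        {
            try { _sink(message); }
            catch (Exception) { /* a failing sink must not kill the logging thread */ }
        }
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // no more messages accepted
        _worker.Join();          // wait for the backlog to be written
        _queue.Dispose();
    }
}
```

Request threads only pay for the enqueue, and dropping messages when the queue is full is a deliberate choice in this sketch to keep producers from blocking; a different policy may suit other applications.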
Another alternative, if you're only talking about logging, is to use a library like log4net. It handles logging in a separate thread and takes care of all the context issues that could come up in that scenario.
I'd say the article is wrong. If you're running a large .NET shop you can safely use the pool across multiple apps and multiple websites (using separate app pools), simply based on one statement in the ThreadPool documentation:
There is one thread pool per process. The thread pool has a default size of 250 worker threads per available processor, and 1000 I/O completion threads. The number of threads in the thread pool can be changed by using the SetMaxThreads method. Each thread uses the default stack size and runs at the default priority.
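The actual limits depend on the runtime version and host, so rather than relying on the quoted defaults it may be worth reading them at run time. The following is a small sketch; PoolInfo and Describe are hypothetical names, while the ThreadPool methods are the standard ones.

```csharp
using System;
using System.Threading;

static class PoolInfo
{
    // Reports the current limits of the process-wide CLR thread pool.
    public static string Describe()
    {
        int minWorkers, minIo, maxWorkers, maxIo, freeWorkers, freeIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIo);

        return string.Format(
            "worker threads: min={0}, max={1}, available={2}; " +
            "I/O threads: min={3}, max={4}, available={5}",
            minWorkers, maxWorkers, freeWorkers, minIo, maxIo, freeIo);
    }
}
```

ThreadPool.SetMaxThreads(workerThreads, completionPortThreads) takes the same two numbers and returns false when the request is rejected, so its result should be checked rather than assumed.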
I was asked a similar question at work last week and I'll give you the same answer. Why are you multi-threading web applications per request? A web server is a fantastic system, heavily optimized to serve many requests in a timely fashion (i.e. multi-threading). Think of what happens when you request almost any page on the web.
A request is made for some page
HTML is served back
The HTML tells the client to make further requests (JS, CSS, images, etc.)
Further information is served back
You give the example of remote logging, but that should be a concern of your logger. An asynchronous process should be in place to receive messages in a timely fashion. Sam even points out that your logger (log4net) should already support this.
Sam is also correct in that using the Thread Pool on the CLR will not cause issues with the thread pool in IIS. The thing to be concerned with here though, is that you are not spawning threads from a process, you are spawning new threads off of IIS threadpool threads. There is a difference and the distinction is important.
Threads vs Process
Both threads and processes are methods of parallelizing an application. However, processes are independent execution units that contain their own state information, use their own address spaces, and only interact with each other via interprocess communication mechanisms (generally managed by the operating system). Applications are typically divided into processes during the design phase, and a master process explicitly spawns sub-processes when it makes sense to logically separate significant application functionality. Processes, in other words, are an architectural construct.
By contrast, a thread is a coding construct that doesn't affect the architecture of an application. A single process might contain multiple threads; all threads within a process share the same state and same memory space, and can communicate with each other directly, because they share the same variables.
Source
You can use Parallel.For or Parallel.ForEach and set a limit on the number of threads they may use, so the work runs smoothly without starving the pool.
However, to run the loop in the background in an ASP.NET web application, you need to wrap it in a TPL task, as below.
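// Requires: using System.Threading; using System.Threading.Tasks;
var ts = new CancellationTokenSource();
CancellationToken ct = ts.Token;
var po = new ParallelOptions();
po.CancellationToken = ct;
po.MaxDegreeOfParallelism = 6; // upper limit on concurrent iterations
// Run the loop on a background task so the request thread is not blocked.
Task.Factory.StartNew(() =>
{
    Parallel.ForEach(collectionList, po, (collectionItem) =>
    {
        // Code here, e.g. PostLog(logEvent);
    });
});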
I do not agree with the referenced article (C#feeds.com). It is easy to create a new thread but dangerous. The optimal number of active threads to run on a single core is actually surprisingly low - less than 10. It is way too easy to cause the machine to waste time switching threads if threads are created for minor tasks. Threads are a resource that REQUIRE management. The WorkItem abstraction is there to handle this.
There is a trade-off here between reducing the number of threads available for requests and creating so many threads that none of them can process efficiently. This is a very dynamic situation, but I think one that should be actively managed (in this case by the thread pool) rather than leaving it to the processor to stay ahead of the creation of threads.
Finally the article makes some pretty sweeping statements about the dangers of using the ThreadPool but it really needs something concrete to back them up.
Whether or not IIS uses the same ThreadPool to handle incoming requests seems hard to get a definitive answer to, and also seems to have changed over versions. So it would seem like a good idea not to use ThreadPool threads excessively, so that IIS has a lot of them available. On the other hand, spawning your own thread for every little task seems like a bad idea. Presumably, you have some sort of locking in your logging, so only one thread could progress at a time, and the rest would just take turns getting scheduled and unscheduled (not to mention the overhead of spawning a new thread). Essentially, you run into the exact problems the ThreadPool was designed to avoid.
It seems that a reasonable compromise would be for your app to allocate a single logging thread that you could pass messages to. You would want to be careful that sending messages is as fast as possible so that you don't slow down your app.

Resources