I'm trying to detect runaway threads in my own application and kill them for good before they render the machine inaccessible.
However, I can only get the CPU time for a thread; that is a limitation of the API I'm using. Is there any way to estimate CPU utilization from that data?
I was thinking about comparing it to real time: if the two are close, then that thread is loading the CPU too much. What do you think about that heuristic, will it work?
CPU time divided by the real (wall-clock) time elapsed over the same interval will give you CPU utilization.
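A minimal sketch of that ratio in Python, since the question does not say which API is in use. The helper names `cpu_utilization` and `measure_thread` are made up for illustration, and `time.thread_time()` stands in for whatever per-thread CPU counter your API exposes.

```python
import threading
import time


def cpu_utilization(cpu_seconds, wall_seconds):
    """Return CPU time / real time as a fraction (1.0 = one core fully busy)."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


def measure_thread(work, *args):
    """Run work(*args) in its own thread and return that thread's utilization."""
    result = {}

    def runner():
        wall_start = time.monotonic()
        cpu_start = time.thread_time()  # CPU time of the calling thread only
        work(*args)
        cpu = time.thread_time() - cpu_start
        wall = time.monotonic() - wall_start
        result["utilization"] = cpu_utilization(cpu, wall)

    t = threading.Thread(target=runner)
    t.start()
    t.join()
    return result["utilization"]
```

A thread that mostly sleeps should come out near 0, and a tight loop near 1. For a watchdog you would probably sample the counter periodically and compare the deltas between two samples rather than totals since thread start.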
Is it possible to limit CPU and memory for a *nix process?
The CPU limit may look like "use no more than 10% of one core".
The memory limit may look like "use no more than 100 MB"; the OS may enforce the limit or kill the process if it tries to exceed it, both ways are fine.
Any *nix that could do that would be fine.
It seems possible to implement this with virtual machines, but that is not acceptable because the overhead is too high.
If you happen to use Solaris, the ability to limit resource usage is a native feature.
Memory (RAM) usage can be capped per process using the rcap.max-rss setting, while CPU usage can be limited per project using project.cpu-cap.
Note that Solaris also allows OS-level virtualization (a.k.a. zones), which has no significant overhead, if any, compared to a bare-metal OS instance.
Resource capping is part of Solaris zone configuration.
Try CPULimit
cpulimit is a simple program which attempts to limit the cpu usage of a process (expressed in percentage, not in cpu time). This is useful to control batch jobs, when you don't want them to eat too much cpu. It does not act on the nice value or other scheduling priority stuff, but on the real cpu usage. Also, it is able to adapt itself to the overall system load, dynamically and quickly.
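For the "limit it or kill it" variant, a rough sketch using POSIX resource limits from Python's standard `resource` module. This is not what cpulimit or Solaris do. `run_with_limits` and `_cap` are hypothetical helper names. Note the caveats: `RLIMIT_AS` caps virtual address space rather than resident memory, and `RLIMIT_CPU` caps total CPU seconds, not a percentage of a core.

```python
import resource
import subprocess


def _cap(kind, value):
    """Lower the soft limit for `kind`, never exceeding the existing hard limit."""
    soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, hard))


def run_with_limits(argv, max_bytes, max_cpu_seconds):
    """Run argv as a child process with address-space and CPU-time caps."""

    def apply_limits():
        # Runs in the child just before exec, so the parent is unaffected.
        _cap(resource.RLIMIT_AS, max_bytes)          # allocations beyond this fail
        _cap(resource.RLIMIT_CPU, max_cpu_seconds)   # SIGXCPU once exceeded

    return subprocess.run(argv, preexec_fn=apply_limits)
```

For example, `run_with_limits(["./batch_job"], 100 * 1024**2, 60)` (with `./batch_job` as a placeholder) would start the job with roughly a 100 MB address-space cap and 60 seconds of CPU time. A true "10% of one core" limit needs something like cpulimit or the Solaris caps mentioned above.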
We have a process that is computationally intensive. When it runs, it typically uses 99% of the available CPU. It is configured to take advantage of all available processors, so I believe this is OK. However, one of our customers is complaining because alarms go off on the server on which this process is running because of the high CPU utilization. I think that there is nothing wrong with high CPU utilization per se. The CPU drops back to normal when the process stops running, and the process does run to completion (no infinite loops, etc.). I'm just wondering if I am on solid ground when I say that high CPU usage is not a problem per se.
Thank you,
Elliott
"if I am on solid ground when I say that high CPU usage is not a problem per se"
You are on solid ground.
"We have a process that is computationally intensive"
Then I'd expect high CPU usage.
"The CPU drops back to normal when the process stops running"
Sounds good so far.
Chances are that the systems your client is using are configured to notify when CPU usage goes over a certain limit, as this is sometimes indicative of a problem (and sustained high usage can cause overheating and associated problems).
If this is expected behavior, your client needs to adjust their monitoring - but you need to ensure that the behavior is as expected on their systems and that it is not likely to cause problems (ensure that high CPU usage is not sustained).
An alarm by itself is not a sign of poor design. The real concern may be that the process starves other tasks on the system. Modern OSes usually take care of this by lowering the dynamic priority of the CPU-hungry process, so that others that are less demanding of CPU time get higher priority. You may tell the customer to "nice" the process to start with, since you probably don't care whether it runs for 10 minutes or 12 minutes. Just my 2 cents :)
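If the process can lower its own priority at startup, the "nice" suggestion could look roughly like this on a Unix-like system. This is a sketch; `lower_priority` is a made-up helper name, and the increment of 10 is an arbitrary choice.

```python
import os


def lower_priority(increment=10):
    """Raise this process's niceness by `increment` and return the new value.

    A higher niceness means a lower scheduling priority, so other processes
    get the CPU first when they need it.
    """
    if increment < 0:
        raise ValueError("increment must be non-negative")
    return os.nice(increment)
```

Calling `lower_priority()` once before the heavy computation starts has about the same effect as launching the program through the `nice` command. It does not cap CPU usage: the process will still use 100% of an otherwise idle machine.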
We have a requirement to handle 10,000 concurrent users.
Let me explain the system. The machine has two processors. ProcessModel in machine.config is set to autoconfig = true, which makes maxWorkerThreads = 20.
When I load-test my case with 30 users and watch CPU usage, it maxes out at 100%, and the number of threads in w3wp.exe is more than 70, whereas my default is 20 * 2 (CPUs) = 40.
Once the CPU hits 100%, most of the transactions fail or take the maximum time to respond.
Now the questions:
1. How do I get 30 more threads assigned to the same worker process?
2. How can I reduce CPU usage here?
You have a bit of an issue here. Increasing the number of threads will further increase CPU usage. (Your two goals are incompatible.) Not only are you asking each CPU to do more, but you'll also have additional overhead from context switching.
You could investigate using a non-blocking IO library, which would essentially mean only 1-2 threads per CPU. However, this could be a significant architecture change to your project (probably not feasible) - and what you might actually find is that most of the CPU was actually spent due to the work your code is performing, and not because of anything threading-related.
It sounds like you need to do some performance tuning and optimization of your application.
You should take a look at making async calls so that your threads do not remain active while the caller is waiting for a response.
http://msdn.microsoft.com/en-us/magazine/cc163463.aspx
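The linked article covers the ASP.NET side. As a language-neutral illustration of the same idea, here is a small Python asyncio sketch in which many requests wait concurrently on a single thread. `handle_request` and `serve` are hypothetical names, and `asyncio.sleep` stands in for a database or web-service call.

```python
import asyncio


async def handle_request(request_id):
    """Simulate one request that spends its time waiting on IO."""
    await asyncio.sleep(0.01)  # stand-in for a slow backend call; no thread is blocked
    return f"response {request_id}"


async def serve(n):
    """Handle n requests concurrently on one thread."""
    return await asyncio.gather(*(handle_request(i) for i in range(n)))
```

Running `asyncio.run(serve(30))` completes in roughly the time of one backend call, because no thread sits blocked while waiting. This only helps when the time goes into waiting; if the CPU is saturated by the work itself, the code needs profiling instead.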
I have a multi-threaded web application with about 1000~2000 threads in the production environment.
I expect w3wp.exe to use the CPU, but System Idle Process eats it. Why?
The Idle process isn't actually a real process, and it doesn't "eat" your CPU time. The %CPU you see next to it is actually the unused %CPU (more or less).
The poor performance of your application is most likely due to your 2000 threads. Windows (or indeed any operating system) was never meant to run so many threads at a time. You're wasting most of the time just context switching between them, with each getting a couple of milliseconds of processing time every ~30 seconds (15 ms * 2000 = 30 s).
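The 15 ms * 2000 estimate above can be written out as a small calculation. This is a back-of-the-envelope sketch only: it assumes every thread is runnable at once and that the scheduler is a plain round-robin, which real schedulers are not. `round_robin_wait` and its parameters are names chosen for illustration.

```python
def round_robin_wait(threads, quantum_ms, cores=1):
    """Worst-case seconds between two time slices of the same thread,
    assuming all threads are runnable and share `cores` equally."""
    return threads * quantum_ms / cores / 1000.0


# 2000 threads, 15 ms quantum, one core: the figure quoted above
print(round_robin_wait(2000, 15))           # 30.0
# The same load spread over four cores
print(round_robin_wait(2000, 15, cores=4))  # 7.5
```

Threads that are blocked on IO or locks are not scheduled at all, so the real delay is usually much smaller than this upper bound.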
Rethink your application.
The idle process is simply holding processor time until a program needs it; it's not actually eating any cycles at all. You can think of the system idle time as "available CPU".
System Idle Process is not a real process, it represents unused processor time.
This means that your app doesn't utilize the processor completely. It may be memory-bound or IO-bound; possibly the threads are waiting for each other, or for external resources? Context switching overhead could also be a culprit: unless you have 2000 cores, the threads are not actually all running at the same time but are assigned time slices by the task scheduler, and this also takes some time.
You have not provided a lot of details, so I can only speculate at this point. I would say it is likely that most of those threads are doing nothing. The ones that are doing something are probably IO-bound, meaning that they spend most of their time waiting for an external resource to respond.
Now let's talk about the "1000~2000 threads". There are very few cases (maybe none) where having that many threads is a good idea. I think your current issue is a perfect example of why. Most of those threads are (apparently, anyway) doing nothing but wasting resources. If you want to process multiple tasks in parallel, especially if they are IO-bound, then it is better to take advantage of pooled resources such as the ThreadPool or the Task Parallel Library.
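To illustrate the pooling idea (in Python rather than .NET, so this is an analogue of the ThreadPool, not its API), here is a sketch that pushes many IO-bound tasks through a small, fixed set of worker threads. `fetch` and `run_pooled` are hypothetical names, and the pool size of 16 is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(task_id):
    """Stand-in for an IO-bound task such as a database or HTTP call."""
    time.sleep(0.01)  # simulated wait on an external resource
    return task_id * 2


def run_pooled(task_ids, max_workers=16):
    """Process all tasks with at most `max_workers` threads alive at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() queues every task and returns results in input order
        return list(pool.map(fetch, task_ids))
```

Here `run_pooled(range(2000))` handles 2000 tasks with only 16 threads: each worker picks up the next queued task as soon as it finishes the previous one, instead of 2000 threads sitting idle.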
I've been thinking today about NodeJS and its attitude towards blocking, and it got me wondering: if a block of code is purely non-blocking, say computing some really long algorithm with all variables present on the stack, should it push a single-core, non-hyperthreaded CPU to 100% (as Windows Task Manager defines it), since it aims to complete the task as quickly as possible? Assume this is a calculation that can take minutes.
Yes, it should. The algorithm should run as fast as it can. It's the operating system's job to schedule time to other processes if necessary.
If your non-blocking, computation-intensive code doesn't use 100% of the CPU, then you are wasting cycles in the idle task. It always irritates me to see the idle task using 99% of the CPU.
As long as the CPU is "given" to other processes when they need it for their calculations, I suppose it's OK: why not use the CPU if it's available and there is work to do?
As RAM can be paged out to disk, all applications are potentially blocking. This would happen if the algorithm uses more RAM than available on the system. As a result, it won't hit 100%.