Operating systems related question, not sure if I can ask it here, but I thought I would get a proper explanation in this forum.
When a process is executing in user context... won't the higher-priority processes in kernel context be blocking the process in user context all the time?
The concepts are hazy for me.
......
There are two main kinds of schedulers in operating systems: preemptive schedulers and non-preemptive schedulers.
Non-preemptive schedulers behave the way you think: a process with higher rights and higher priority will keep using the CPU until it finishes OR until it blocks (on a mutex, for example, or with a call to yield, which explicitly releases the CPU so another process can be scheduled).
But non-preemptive schedulers are rare, and the Linux scheduler isn't that kind. It uses time slices to let a process work for a short period of time before de-scheduling it. It also takes priority into account, but it keeps scheduling processes with lower priority; you should take a look at this Linux scheduler article.
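To make the "explicit yield" idea concrete, here is a minimal sketch (my own illustration, not something from the answer above) of a process voluntarily releasing the CPU with POSIX sched_yield(). On a preemptive kernel like Linux this call is only a hint, because the scheduler will de-schedule the process anyway once its time slice expires.

```c
/* Minimal sketch: a process voluntarily releasing the CPU between chunks
 * of work.  On a preemptive kernel this is optional; the scheduler would
 * eventually preempt the process at the end of its time slice anyway. */
#include <sched.h>
#include <stdio.h>

int main(void)
{
    for (int i = 0; i < 5; i++) {
        printf("doing a chunk of work (%d)\n", i);
        /* Hint to the scheduler that another runnable process may go now. */
        sched_yield();
    }
    return 0;
}
```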
This Stack Overflow posting has a discussion that includes a run-down of how kernel mode works, with an explanation of some of the jargon. In particular, look at the section titled 'A brief primer on kernel vs. user mode'. It might help shed some light on your question.
http://en.wikipedia.org/wiki/Ring_0
Your process can be preempted in kernel mode as well, when it exhausts its quantum.
Wikipedia: Preemption
Related
Do you know if there is any open-source task scheduler that works without OS support?
Basically, we are looking for a lean scheduler that can schedule and preempt tasks on our TI AM335x based boards, which don't have any RTOS running on them.
There are no "portable" schedulers at that level, because the context switch functions are hardware-dependent. Therefore, you need a scheduler for your specific hardware (e.g., provided by TI in their development environments) or a very minimal RTOS.
There are RTOSs that are very minimal (a few KB of footprint) and can contain only the scheduler. Have a look for example at http://erika.tuxfamily.org. However, I'm afraid that your specific microcontroller is not supported.
Most simple RTOS kernels provide at least scheduling, synchronisation, and IPC. However, since they are also provided as static libraries, what you don't use will not be included in your product. That said, I find it difficult to think of a system where synchronisation, IPC and timer services would not be required or at least beneficial.
There are numerous such RTOS libraries including Segger embOS for example.
The only portable, hardware-independent solution I know of is Protothreads by Adam Dunkels. It is not true multitasking, just nice syntactic sugar over concurrent state machines, but it may help with your task.
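To show what that "syntactic sugar over state machines" boils down to, here is a simplified sketch of the underlying trick. It is not the real pt.h API; the TASK_* macros and the blink_task function are made up for illustration, but the resume-point-stored-in-a-struct idea is the same one Protothreads uses.

```c
/* Simplified illustration of the Protothreads idea: the "thread" is a
 * state machine whose resume point is stored in a struct, and the macros
 * hide a switch on that resume point.  Not the real pt.h API. */
#include <stdio.h>

struct task {
    int resume_line;   /* where to continue next time we are called */
    int counter;       /* task-local state lives in the struct, not on the stack */
};

#define TASK_BEGIN(t)  switch ((t)->resume_line) { case 0:
#define TASK_YIELD(t)  do { (t)->resume_line = __LINE__; return 1; \
                            case __LINE__:; } while (0)
#define TASK_END(t)    } (t)->resume_line = 0; return 0

/* Returns 1 while the task still has work to do, 0 when finished. */
static int blink_task(struct task *t)
{
    TASK_BEGIN(t);
    for (t->counter = 0; t->counter < 3; t->counter++) {
        printf("blink %d\n", t->counter);
        TASK_YIELD(t);          /* give other tasks a chance to run */
    }
    TASK_END(t);
}

int main(void)
{
    struct task t = { 0, 0 };
    while (blink_task(&t))      /* a trivial "scheduler": call until done */
        ;
    return 0;
}
```

Because the task's state lives in the struct rather than on the stack, the function can return to its caller at every yield and pick up where it left off the next time it is called.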
We have a high-traffic website which generates a lot of I/O. Within 10 minutes it reads over 10 GB of data (the w3wp process in question can be seen in Task Manager). For memory issues and application hangs I have been using WinDbg with success, but I don't know how I can find the object(s) / method(s) within a process which are responsible for the highest I/O.
Is this even possible?
Edit
The question is: is there a way to profile I/O operations in a .NET assembly, say, a list of threads sorted by highest disk I/O (or something similar that would help me figure out where to look)?
ANTS Performance Profiler
I have used this tool with great success - for example, finding the specific instructions which were causing ~512 GB of memory on a high-volume web farm to get chewed up within 5-10 minutes. That sounds like a very similar situation to yours.
Now, to be realistic - it's not going to magically solve your problem. It still requires a lot of setup, thorough analysis and detective work. But this tool definitely took the problem from "practically unsolvable" to "solvable within days".
Update:
As I mentioned in the comments (and Ben Emmett echoed), we can use ANTS to monitor memory, file system handles - pretty much any resource consumption and drill down the call stack to see the effects of specific routines.
I came across this tool, AppDynamics Lite, which displays your application's call costs and performance in a visual way. It might help you find out which functions are making the most costly I/O operations.
Quoting:
Understand the health of your CLR with key metrics like response time, throughput, exception rate, and garbage collection time, as well as key system resources like CPU, memory and disk I/O.
Worth giving it a shot as it is free to trial for 30 days. Hope it helps.
Ps: I'm not affiliated with AppDynamics in any way.
You can use the (free) Windows Performance Toolkit from Windows 8, which also runs on Windows Vista and later. There you can turn on system-wide profiling to see what is going on in all processes at once. No instrumentation is necessary. Only one reboot is required to set an arcane registry key, which WPRUI.exe does automatically.
With XPerf you can enable I/O Init stack walking so that a call stack is taken for every I/O that is started. The only issue is that the stacks will be broken for 64-bit processes, which means you will see only the first method of your code above the BCL methods, because there is a Windows 7 bug in the stack-walking capabilities of the OS.
A workaround is to NGen your assemblies, move to Server 2012, or switch to x86 for profiling to see deeper call stacks.
You will see all file I/O and CPU activity even without any call stacks, along with the file names and how long the hard disk was used. That should give you good information about which part of your app is causing the disk I/O. From the partial call stacks you should be able to pinpoint your issue even without full stacks.
The tool will give you much more insight than any commercially available profiler, at the expense of having to learn how to use it. Since the call stacks do not end at your code or in user mode but go down into the kernel, you can also determine if, for example, the virus scanner is causing significant I/O delays. But you need to know how your processor works. This toolset was originally aimed at kernel devs, which explains why you see so many seemingly useless columns.
In the picture below you see file I/O and CPU consumption stacked. When you select your high-I/O file in the disk I/O graph, it will highlight in the CPU consumption view all related call stacks which were taken while that I/O was active. This way you can directly navigate from the I/O to your potentially blocked threads.
I'm trying to get my head around the concept of a cooperative multitasking system and exactly how it works in a single-threaded application.
My understanding is that this is a "form of multitasking in which multiple tasks execute by voluntarily ceding control to other tasks at programmer-defined points within each task."
So if you have a list of tasks and one task is executing, how do you decide when to pass execution to another task? And when you give execution back to a previous task, how do you resume from where you were previously?
I find this a bit confusing because I don't understand how this can be achieved without a multithreaded application.
Any advice would be very helpful :)
Thanks
In your specific scenario, where a single process (or thread of execution) uses cooperative multitasking, you can use something like Windows fibers or the POSIX setcontext family of functions. I will use the term fiber here.
Basically, when one fiber is finished executing a chunk of work and wants to voluntarily allow other fibers to run (hence the "cooperative" term), it either manually switches to the other fiber's context or, more typically, it performs some kind of yield() or scheduler() call that jumps into the scheduler's context; the scheduler then finds a new fiber to run and switches to that fiber's context.
What do we mean by context here? Basically the stack and registers. There is nothing magic about the stack, it's just a block of memory the stack pointer happens to point to. There is also nothing magic about the program counter, it just points to the next instruction to execute. Switching contexts simply saves the current registers somewhere, changes the stack pointer to a different chunk of memory, updates the program counter to a different stream of instructions, copies that context's saved registers into the CPU, then does a jump. Bam, you're now executing different instructions with a different stack. Often the context switch code is written in assembly that is invoked in a way that doesn't modify the current stack or it backs out the changes, in either case it leaves no traces on the stack or in registers so when code resumes execution it has no idea anything happened. (Again, the theme: we assume that method calls fiddle with registers, push arguments to the stack, move the stack pointer, etc but that is just the C calling convention. Nothing requires you to maintain a stack at all or to have any particular method call leave any traces of itself on the stack).
Since each stack is separate, you don't have some continuous chain of seemingly random method calls eventually overflowing the stack (which might be the result if you naively tried to implement this scheme using standard C methods that continuously called each other). You could implement this manually with a state machine, where each fiber keeps track of where it is in its work and periodically returns to the calling dispatcher's method, but why bother when actual fiber/co-routine support is widely available?
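For concreteness, here is a minimal sketch of what the setcontext-family approach looks like: two fibers, each with its own stack, repeatedly yielding back to a tiny round-robin scheduler. This is my own illustration, not code from the answer; the ucontext functions are obsolescent in recent POSIX but still available on Linux/glibc.

```c
/* Minimal fiber sketch using the POSIX ucontext functions: each fiber has
 * its own stack and voluntarily yields back to a naive round-robin
 * scheduler running in the main context. */
#include <stdio.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t scheduler_ctx;          /* the context we yield back to */
static ucontext_t fiber_ctx[2];
static char       fiber_stack[2][STACK_SIZE];
static int        current;                /* index of the fiber now running */

static void yield_to_scheduler(void)
{
    /* Save this fiber's registers/stack pointer, restore the scheduler's. */
    swapcontext(&fiber_ctx[current], &scheduler_ctx);
}

static void fiber_body(int id)
{
    for (int i = 0; i < 3; i++) {
        printf("fiber %d, step %d\n", id, i);
        yield_to_scheduler();             /* cooperative: give up the CPU */
    }
}

int main(void)
{
    for (int i = 0; i < 2; i++) {
        getcontext(&fiber_ctx[i]);
        fiber_ctx[i].uc_stack.ss_sp   = fiber_stack[i];
        fiber_ctx[i].uc_stack.ss_size = STACK_SIZE;
        fiber_ctx[i].uc_link          = &scheduler_ctx;  /* return here when done */
        makecontext(&fiber_ctx[i], (void (*)(void))fiber_body, 1, i);
    }

    /* A naive scheduler: round-robin over the two fibers a few times. */
    for (int round = 0; round < 3; round++) {
        for (current = 0; current < 2; current++)
            swapcontext(&scheduler_ctx, &fiber_ctx[current]);
    }
    return 0;
}
```

The swapcontext() calls are exactly the "save the current registers somewhere, change the stack pointer, update the program counter" step described above, packaged into one library call.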
Also remember that cooperative multitasking is orthogonal to processes, protected memory, address spaces, etc. Witness Mac OS 9 or Windows 3.x. They supported the idea of separate processes. But when you yielded, the context was changed to the OS context, allowing the OS scheduler to run, which then potentially selected another process to switch to. In theory you could have a full protected virtual memory OS that still used cooperative multitasking. In those systems, if an errant process never yielded, the OS scheduler never ran, so all other processes in the system were frozen. **
The next natural question is what makes something pre-emptive... The answer is that the OS schedules an interrupt timer with the CPU to stop the currently executing task and switch back to the OS scheduler's context regardless of whether the current task cares to release the CPU or not, thus "pre-empting" it.
If the OS uses CPU privilege levels, the (kernel-configured) timer is not cancelable by lower-level (user mode) code, though in theory, if the OS didn't use such protections, an errant task could mask off or cancel the interrupt timer and hijack the CPU. There are some other scenarios, like IO calls, where the scheduler can be invoked outside the timer, and the scheduler may decide no other process has higher priority and return control to the same process without a switch... And in reality most OSes don't do a real context switch here because that's expensive; the scheduler code runs inside the context of whatever process was executing, so it has to be very careful not to step on the stack, to save register states, etc.
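A rough user-space analogy of that timer-driven pre-emption (my own sketch, not the kernel mechanism itself) is a periodic timer interrupting a busy loop whether or not the loop ever yields. In a real OS the timer interrupt would enter the kernel scheduler rather than a signal handler.

```c
/* User-space analogy of timer-driven pre-emption: setitimer() arms a
 * periodic timer, and SIGALRM interrupts the busy loop even though the
 * loop never voluntarily yields. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t ticks = 0;

static void on_tick(int signo)
{
    (void)signo;
    ticks++;                       /* this is where a scheduler would run */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    sigaction(SIGALRM, &sa, NULL);

    struct itimerval tv = { { 0, 100000 }, { 0, 100000 } };  /* every 100 ms */
    setitimer(ITIMER_REAL, &tv, NULL);

    while (ticks < 10) {
        /* "uncooperative" work that never yields voluntarily */
    }
    printf("interrupted %d times by the timer\n", (int)ticks);
    return 0;
}
```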
** You might ask why not just fire a timer if yield isn't called within a certain period of time. The answer lies in multi-threaded synchronization. In a cooperative system, you don't have to bother taking locks, worry about re-entrance, etc because you only yield when things are in a known good state. If this mythical timer fires, you have now potentially corrupted the state of the program that was interrupted. If programs have to be written to handle this, congrats... You now have a half-assed pre-emptive multitasking system. Might as well just do it right! And if you are changing things anyway, may as well add threads, protected memory, etc. That's pretty much the history of the major OSes right there.
The basic idea behind cooperative multitasking is trust - that each subtask will relinquish control, of its own accord, in a timely fashion, to avoid starving other tasks of processor time. This is why tasks in a cooperative multitasking system need to be tested extremely thoroughly, and in some cases certified for use.
I don't claim to be an expert, but I imagine cooperative tasks could be implemented as state machines, where passing control to the task would cause it to run for the absolute minimal amount of time it needs to make any kind of progress. For example, a file reader might read the next few bytes of a file, a parser might parse the next line of a document, or a sensor controller might take a single reading, before returning control back to a cooperative scheduler, which would check for task completion.
Each task would have to keep its internal state on the heap (at object level), rather than on the stack frame (at function level) like a conventional blocking function or thread.
And unlike conventional multitasking, which relies on a hardware timer to trigger a context switch, cooperative multitasking relies on the code to be written in such a way that each step of each long-running task is guaranteed to finish in an acceptably small amount of time.
The tasks will execute an explicit wait or pause or yield operation which makes the call to the dispatcher. There may be different operations for waiting on IO to complete or explicitly yielding in a heavy computation. In an application task's main loop, it could have a *wait_for_event* call instead of busy polling. This would suspend the task until it has input to process.
There may also be a time-out mechanism for catching runaway tasks, but it is not the primary means of switching (or else it wouldn't be cooperative).
One way to think of cooperative multitasking is to split a task into steps (or states). Each task keeps track of the next step it needs to execute. When it's the task's turn, it executes only that one step and returns. That way, in the main loop of your program you are simply calling each task in order, and because each task only takes up a small amount of time to complete a single step, we end up with a system which allows all of the tasks to share cpu time (ie. cooperate).
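A minimal sketch of that step-per-turn structure might look like the following. The task names and the 512-byte "chunk" are made up for illustration; the point is that each task keeps its progress in a struct and does only one small step per call, while the main loop round-robins over the tasks.

```c
/* Sketch of the step/state-machine approach: each task stores its progress
 * in a struct (not on the stack) and does one small step per call; the
 * main loop simply calls each task in turn. */
#include <stdbool.h>
#include <stdio.h>

struct copy_task {
    enum { COPY_OPEN, COPY_TRANSFER, COPY_DONE } state;
    int bytes_copied;
};

/* Returns true while there is still work left. */
static bool copy_step(struct copy_task *t)
{
    switch (t->state) {
    case COPY_OPEN:
        printf("opening files\n");
        t->state = COPY_TRANSFER;
        return true;
    case COPY_TRANSFER:
        t->bytes_copied += 512;           /* copy only a small chunk per turn */
        printf("copied %d bytes\n", t->bytes_copied);
        if (t->bytes_copied >= 2048)
            t->state = COPY_DONE;
        return true;
    case COPY_DONE:
    default:
        return false;
    }
}

int main(void)
{
    struct copy_task a = { COPY_OPEN, 0 };
    struct copy_task b = { COPY_OPEN, 0 };
    bool busy = true;

    while (busy) {                        /* the cooperative "scheduler" */
        busy = false;
        busy |= copy_step(&a);            /* each task runs one step ... */
        busy |= copy_step(&b);            /* ... then control returns here */
    }
    return 0;
}
```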
I mean, I make a request to the operating system, like kill SIG_NUMBER PID; what happens next? What are the actions taken by the operating system, and so on?
Thanks much
It depends on the OS of course - but generally, assuming you have sufficient privileges to deliver that signal to the process concerned, the OS will alter the process state for that process within the kernel. That will generally result in some "life cycle" state change for the process - i.e. to be terminated, terminating, dead, to be suspended, etc.
The actual call into the kernel (depending on OS) will be via a system call or maybe an 'ioctl' call via some appropriate device.
When it's the process's turn for some CPU time, the process scheduler will take the process state into account to determine what to do next. I'm deliberately being brief here as it's quite involved.
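The user-space side of this on a POSIX system can be sketched as follows (my own illustration): one process installs a handler with sigaction() and waits, and the signal is sent with kill(), either from the shell or from code. What happens in between - marking the signal pending, changing the process state, running the default action if no handler is installed - is the kernel work described above.

```c
/* Sketch of the user-space side of signal delivery: install a handler with
 * sigaction(), wait, and send the signal with kill(). */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

static void on_sigusr1(int signo)
{
    (void)signo;
    got_signal = 1;                 /* only async-signal-safe work here */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    printf("pid %d waiting for SIGUSR1 (try: kill -USR1 %d)\n",
           (int)getpid(), (int)getpid());

    while (!got_signal)
        pause();                    /* sleep until any signal is delivered */

    printf("handler ran; the kernel delivered the signal\n");

    /* Sending a signal from code looks like this (to ourselves here): */
    kill(getpid(), SIGUSR1);
    return 0;
}
```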
I'd suggest looking at some sample source - look at a Linux distro or OpenSolaris maybe (although that's quite complicated).
Example here - warning this is very complicated.
OpenSolaris signal handling in the kernel
For example, a process waiting for disk I/O to complete will sleep on the address of the buffer header corresponding to the data being transferred. When the interrupt routine for the disk driver notes that the transfer is complete, it calls wakeup on the buffer header. The interrupt uses the kernel stack for whatever process happened to be running at the time, and the wakeup is done from that system process.
Can you please explain the last line in the paragraph, which I have emphasised? It is about waking up a process which has been waiting for some event to occur and has therefore slept. This paragraph is from Galvin. By the way, can you suggest a good book or link for studying Unix operating systems?
Thanks.
There is some process running at the time the interrupt is received. The kernel doesn't change over to some other process context to handle it -- that would take time -- it just does what's necessary in the current context, and lets the scheduler know that the next time it schedules, the waiting process is ready to proceed.
There are a number of good internals books around. I'm fond of the various McKusick et al books, like The Design and Implementation of the FreeBSD Operating System.
Maurice Bach's The Design of the UNIX Operating System is the most well-known and comprehensive book on the subject.
The I/O completion interrupt will be executed as soon as the disk signals the end of the transfer. This happens regardless of what the kernel is currently doing. Interrupt handlers are usually very small and self-contained, so it is faster to re-use the current runtime environment (stack, CPU state, etc.) instead of doing a full context switch to a separate thread. On the downside, this means that interrupt handlers are only allowed to do very limited things, like setting a flag somewhere or enqueueing a work item. Also, they have to clean up very carefully after themselves, so that the running process is not disturbed.
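A schematic sketch of the sleep/wakeup-on-an-address pattern the quoted paragraph describes might look like the following. This is not real kernel code; all the names (sleep_on, wakeup, the struct proc fields) are made up, and a real kernel would also switch contexts, remove woken processes from the queue, and protect the queue from concurrent access.

```c
/* Schematic sketch of the classic sleep/wakeup-on-an-address pattern.
 * Not real kernel code; names are invented for illustration. */
#include <stdio.h>
#include <stddef.h>

struct proc {
    struct proc *next;
    const char  *name;
    void        *wait_chan;      /* the address this process sleeps on */
    int          runnable;
};

static struct proc *sleep_queue;

/* Called by a process in kernel context, e.g. while waiting for disk I/O:
 * record the channel (the buffer header's address) and mark the process
 * not runnable; the real kernel would then switch to another process. */
static void sleep_on(struct proc *p, void *chan)
{
    p->wait_chan = chan;
    p->runnable  = 0;
    p->next = sleep_queue;
    sleep_queue = p;
}

/* Called from the disk interrupt handler, on whatever process's kernel
 * stack happened to be in use when the interrupt arrived.  It does the
 * minimum: mark matching sleepers runnable.  They actually resume later,
 * when the scheduler next picks them. */
static void wakeup(void *chan)
{
    for (struct proc *p = sleep_queue; p != NULL; p = p->next) {
        if (p->wait_chan == chan) {
            p->wait_chan = NULL;
            p->runnable  = 1;
            printf("wakeup: %s is runnable again\n", p->name);
        }
    }
}

int main(void)
{
    char buffer_header[64];              /* stands in for the I/O buffer */
    struct proc reader = { NULL, "reader", NULL, 1 };

    sleep_on(&reader, buffer_header);    /* reader waits for its I/O */
    wakeup(buffer_header);               /* "interrupt handler" completes it */
    return 0;
}
```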
Eric Raymond's 'The Art of Unix Programming' should be read to understand the Unix philosophy and culture, and to actually know and appreciate the reasons behind its design.