The idea behind Intels hyperthreading is (as far as I understand) that one core is used for two threads in a time-multiplexed manner.
The HW support this by having the state-related resources doubled and time-sharing other resources. If the running thread stalls (e.g. because it has to fetch new data from RAM), the other thread gets access to the shared resources. The result is a better utilization of the shared resources.
So if one thread isn't ready, the other thread is allowed to run. In other words - a thread switch can happen when the executing thread stalls.
I've tried to find out what will happen if both threads are ready for a long time but I haven't been able to find the information.
What happens if the running thread doesn't stall?
Will the running thread continue as long as it is ready?
Will the core switch to the other thread after some time? If so - what is the criteria for the switch? Is it controlled by HW or SW?
Hyperthreading is simultaneous multithreading (SMT). So it doesn't just switch back and forth on some relatively coarse-grain scale (like stalls), in the case of Sandy Bridge and newer, the fetcher and the decoder alternate between the threads. Execution units are shared competitively, so even if neither thread is stalling they can still together achieve a better utilization than if they ran alone (but that's not typical). So the problems you identified don't apply, because it doesn't work like that in the first place.
I can't seem to find this specific implementation detail, or even a pointer to where in an OS book to find this.
Basically, main thread calls an async task (to be run later) on itself. So... when does it run?
Does it wait for the run loop to finish? Or does it just randomly interrupt the run-loop in the middle of any function?
I understand the registers will be the same (unless separate thread), but not really the instruction pointer and what happens to the stack, if anything does happen.
Thank you
In C# the task is scheduled to be run on the current SynchronizationContext. The context basically has a queue of tasks which it schedules to run on the threads it is associated with, in a GUI app there is only one thread so the task is scheduled to run there.
The GUI thread is not interrupted but it executes the task when it finishes all other tasks preceding it in the queue.
The threads of a process all share the same address space, not the same CPU registers. The thread scheduling is done depends on the programming language and the O/S. Usually there are explicit scheduling points, such as returning from a system call, blocking awaiting I/O completion, or between p-code instructions for interpreted languages. Some O/S implemtations reschedule depending on how long a thread has run for time-based scheduling. Often languages include a function that explicitly offers the CPU to any other thread or process by transferring control to the process or thread scheduler component of the O/S.
The act of switching from one thread or process to another is known as a context switch and is carefully tuned code because this is often done thousands of times per second. This can make the code difficult to follow.
The best explanation of this I've ever seen is http://www.amazon.com/The-Design-UNIX-Operating-System/dp/0132017997 classic.
I read about Jesque at https://github.com/gresrun and I would like to understand how does it perform under huge payload. Is the only way of queuing a job to create an instance of Job class and then using a Thread to start off the worker or are there any other approaches? I am a little skeptical about using java.lang.Thread objects like it is done in the example on this link for batch jobs where data payload is huge.
Actually spawing threads without control is never a good idea.
I would suggest the approach to put your workers in a BlockingQueue and then spawn a very limited number of threads (as much as your CPUs, in order to reduce contention) to start off those workers. Once the work has finished, the thread pick up a new worker and start the process again. Once there are no worker in the queue, the threads just hangs on the queue, waiting for new workers.
You can have a look at the Thread Pool Pattern
I'm trying to get around the concept of cooperative multitasking system and exactly how it works in a single threaded application.
My understanding is that this is a "form of multitasking in which multiple tasks execute by voluntarily ceding control to other tasks at programmer-defined points within each task."
So if you have a list of tasks and one task is executing, how do you determine to pass execution to another task? And when you give execution back to a previous task, how do resume from where you were previously?
I find this a bit confusing because I don't understand how this can be achieve without a multithreaded application.
Any advice would be very helpeful :)
Thanks
In your specific scenario where a single process (or thread of execution) uses cooperative multitasking, you can use something like Windows' fibers or POSIX setcontext family of functions. I will use the term fiber here.
Basically when one fiber is finished executing a chunk of work and wants to voluntarily allow other fibers to run (hence the "cooperative" term), it either manually switches to the other fiber's context or more typically it performs some kind of yield() or scheduler() call that jumps into the scheduler's context, then the scheduler finds a new fiber to run and switches to that fiber's context.
What do we mean by context here? Basically the stack and registers. There is nothing magic about the stack, it's just a block of memory the stack pointer happens to point to. There is also nothing magic about the program counter, it just points to the next instruction to execute. Switching contexts simply saves the current registers somewhere, changes the stack pointer to a different chunk of memory, updates the program counter to a different stream of instructions, copies that context's saved registers into the CPU, then does a jump. Bam, you're now executing different instructions with a different stack. Often the context switch code is written in assembly that is invoked in a way that doesn't modify the current stack or it backs out the changes, in either case it leaves no traces on the stack or in registers so when code resumes execution it has no idea anything happened. (Again, the theme: we assume that method calls fiddle with registers, push arguments to the stack, move the stack pointer, etc but that is just the C calling convention. Nothing requires you to maintain a stack at all or to have any particular method call leave any traces of itself on the stack).
Since each stack is separate, you don't have some continuous chain of seemingly random method calls eventually overflowing the stack (which might be the result if you naively tried to implement this scheme using standard C methods that continuously called each other). You could implement this manually with a state machine where each fiber kept a state machine of where it was in its work, periodically returning to the calling dispatcher's method, but why bother when actual fiber/co-routine support is widely available?
Also remember that cooperative multitasking is orthogonal to processes, protected memory, address spaces, etc. Witness Mac OS 9 or Windows 3.x. They supported the idea of separate processes. But when you yielded, the context was changed to the OS context, allowing the OS scheduler to run, which then potentially selected another process to switch to. In theory you could have a full protected virtual memory OS that still used cooperative multitasking. In those systems, if a errant process never yielded, the OS scheduler never ran, so all other processes in the system were frozen. **
The next natural question is what makes something pre-emptive... The answer is that the OS schedules an interrupt timer with the CPU to stop the currently executing task and switch back to the OS scheduler's context regardless of whether the current task cares to release the CPU or not, thus "pre-empting" it.
If the OS uses CPU privilege levels, the (kernel configured) timer is not cancelable by lower level (user mode) code, though in theory if the OS didn't use such protections an errant task could mask off or cancel the interrupt timer and hijack the CPU. There are some other scenarios like IO calls where the scheduler can be invoked outside the timer, and the scheduler may decide no other process has higher priority and return control to the same process without a switch... And in reality most OSes don't do a real context switch here because that's expensive, the scheduler code runs inside the context of whatever process was executing, so it has to be very careful not to step on the stack, to save register states, etc.
** You might ask why not just fire a timer if yield isn't called within a certain period of time. The answer lies in multi-threaded synchronization. In a cooperative system, you don't have to bother taking locks, worry about re-entrance, etc because you only yield when things are in a known good state. If this mythical timer fires, you have now potentially corrupted the state of the program that was interrupted. If programs have to be written to handle this, congrats... You now have a half-assed pre-emptive multitasking system. Might as well just do it right! And if you are changing things anyway, may as well add threads, protected memory, etc. That's pretty much the history of the major OSes right there.
The basic idea behind cooperative multitasking is trust - that each subtask will relinquish control, of its own accord, in a timely fashion, to avoid starving other tasks of processor time. This is why tasks in a cooperative multitasking system need to be tested extremely thoroughly, and in some cases certified for use.
I don't claim to be an expert, but I imagine cooperative tasks could be implemented as state machines, where passing control to the task would cause it to run for the absolute minimal amount of time it needs to make any kind of progress. For example, a file reader might read the next few bytes of a file, a parser might parse the next line of a document, or a sensor controller might take a single reading, before returning control back to a cooperative scheduler, which would check for task completion.
Each task would have to keep its internal state on the heap (at object level), rather than on the stack frame (at function level) like a conventional blocking function or thread.
And unlike conventional multitasking, which relies on a hardware timer to trigger a context switch, cooperative multitasking relies on the code to be written in such a way that each step of each long-running task is guaranteed to finish in an acceptably small amount of time.
The tasks will execute an explicit wait or pause or yield operation which makes the call to the dispatcher. There may be different operations for waiting on IO to complete or explicitly yielding in a heavy computation. In an application task's main loop, it could have a *wait_for_event* call instead of busy polling. This would suspend the task until it has input to process.
There may also be a time-out mechanism for catching runaway tasks, but it is not the primary means of switching (or else it wouldn't be cooperative).
One way to think of cooperative multitasking is to split a task into steps (or states). Each task keeps track of the next step it needs to execute. When it's the task's turn, it executes only that one step and returns. That way, in the main loop of your program you are simply calling each task in order, and because each task only takes up a small amount of time to complete a single step, we end up with a system which allows all of the tasks to share cpu time (ie. cooperate).
The thread will be started on each Application_Start event.
It will be a monitoring thread which is supposed to run constantly.
So even if the app shuts down, once it is restarted the thread will start too ensuring it runs all the time.
However I need to be sure that this thread will not be stopped / shut down while the application is running.
So in a few words, does anybody know if asp.net could shut down such a thread without actualy stopping / recycling the application.
As a matter of design, you shouldn't depend on asp.net to run threads like this. Little things like app recycling can cause you a lot of trouble.
Instead, create a windows service to execute the thread. This way you don't have to worry about it.
Update
I just wanted to add a little more information.
IIS has the ability to execute your app across multiple threads and processes. A standard site installation usually only has a single process (aka: web garden) assigned which spins up around 20 threads to handle request processing.
However, any IIS administrator can easily add more processes to the mix. They usually do this when a site can hose a single process either because request processing takes too long, or the number of handler threads isn't enough, or as a temporary measure if the app has enough problems that a single thread will hose the entire process fairly often.
If you have a thread being spun on app start, then it will create one for each worker process the site has. This may be unexpected behavior to you or your successors.
Also, monitoring apps are almost always completely separate to the application they are monitoring. One of the primary reasons is that in the event the monitored process dies, hangs, or otherwise becomes unresponsive then the monitoring app itself still needs to carry on and log this information. Otherwise the monitored process could very well hose the monitoring app itself.
So, do yourself a favor and move this to its own process. The best way to do this on an IIS server is to create a windows service and give it the appropriate execution rights to do what you need.