ksoftirqd is NOT using CPU and should be, why?

This happens on both Linux 2.6 and 3.8.
The Linux box is set up as a router and is passing a 3 GB file.
Running top, %SI is high at 30%, but ksoftirqd is at 0% CPU. So the question is: what thread is handling the softirqs? I've read the code and it is supposed to be ksoftirqd, but it is idle.
Is this an accounting issue?

Only when the load cannot be handled inline (the time reported as %SI) is the remaining work offloaded to ksoftirqd. That is why you see %SI at 30 and ksoftirqd at 0.
From the man page:
ksoftirqd is a per-cpu kernel thread that runs when the machine is under heavy soft-interrupt load. Soft interrupts are normally serviced on return from a hard interrupt, but it's possible for soft interrupts to be triggered more quickly than they can be serviced. If a soft interrupt is triggered for a second time while soft interrupts are being handled, the ksoftirq daemon is triggered to handle the soft interrupts in process context. If ksoftirqd is taking more than a tiny percentage of CPU time, this indicates the machine is under heavy soft interrupt load.
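If you want to check the accounting yourself, you can compute the softirq share from two samples of the aggregate cpu line in /proc/stat. This is only a minimal sketch, assuming the usual field order (user nice system idle iowait irq softirq steal ...). The names CpuSample, parse_cpu_line and softirq_percent are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Softirq jiffies and total jiffies taken from one "cpu ..." line.
struct CpuSample {
    std::uint64_t softirq = 0;
    std::uint64_t total = 0;
    bool ok = false;
};

// Parse the aggregate "cpu" line of /proc/stat.
// Assumed field order: user nice system idle iowait irq softirq steal ...
CpuSample parse_cpu_line(const std::string& line) {
    CpuSample sample;
    std::istringstream in(line);
    std::string label;
    if (!(in >> label) || label != "cpu") return sample;

    std::uint64_t value = 0;
    std::size_t index = 0;
    // Only the first 8 counters are summed: guest time is already part of user.
    while (index < 8 && (in >> value)) {
        if (index == 6) sample.softirq = value;  // 7th counter is softirq
        sample.total += value;
        ++index;
    }
    sample.ok = index >= 7;
    return sample;
}

// Percentage of CPU time spent in softirq context between two samples,
// or -1.0 if either sample is invalid or no time has passed.
double softirq_percent(const CpuSample& before, const CpuSample& after) {
    if (!before.ok || !after.ok || after.total <= before.total) return -1.0;
    return 100.0 * static_cast<double>(after.softirq - before.softirq) /
           static_cast<double>(after.total - before.total);
}
```

Feed it the first line of /proc/stat read twice, about a second apart. As far as I know, this counter accumulates softirq time whichever task happened to be interrupted, while the ksoftirqd line in top only reflects softirqs that ran in that thread's own context.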

Related

When a process makes a system call to transmit a TCP packet over the network, which of the following steps do NOT occur always?

I am teaching myself OS by going through the lecture notes of the course at IIT Bombay (https://www.cse.iitb.ac.in/~mythili/os/). One of the questions in the Process worksheet asks which of the following doesn't always happen in the situation described in the title. The answer is C.
A. The process moves to kernel mode.
B. The program counter of the CPU shifts to the kernel part of the address space.
C. The process is context-switched out and a separate kernel process starts execution.
D. The OS code that deals with handling TCP/IP packets is invoked
I'm a bit confused though. I thought when an interrupt routine occurs the process is context-switched out so other processes can run and the CPU is not idle during that time. The kernel, then, will take care of the packet sending. Why would C not be correct then?
You are right in saying that "when an interrupt routine occurs the process is context-switched out so other processes can run and the CPU is not idle during that time", but the words "generally or mostly" need to be added to it.
In most cases, there is another process waiting for CPU time that can be scheduled. However, that is not the case 100% of the time. The question hinges on the word "always": while the other options always occur in the given situation, option C is a choice that the OS makes at run time. If the OS determines that switching out this process would be less efficient than performing the system call and resuming the same process, it may not perform the context switch.
There is a cost associated with context switching. If the other processes are also blocked on I/O, it may be optimal for the OS NOT to switch context. There may also be other reasons not to switch: for example, if only one process is running, there is no other process to switch to!

How can code be asynchronous on a single-core CPU which is synchronous?

In a uniprocessor (UP) system, there's only one CPU core, so only one thread of execution can be happening at once. This thread of execution is synchronous (it gets a list of instructions in a queue and runs them one by one). When we write code, it compiles to a set of CPU instructions.
How can we have asynchronous behavior in software on a UP machine? Isn't everything just run in some fixed order chosen by the OS?
Even an out-of-order execution CPU gives the illusion of running instructions in program order. (This is separate from memory reordering observed by other cores or devices in the system. In a UP system, runtime memory reordering is only relevant for device drivers.)
An interrupt handler is a piece of code that runs asynchronously to the rest of the code, and can happen in response to an interrupt from a device outside the CPU. In user-space, a signal handler has equivalent semantics.
(Or a hardware interrupt can cause a context switch to another software thread. This is asynchronous as far as the software thread is concerned.)
Events like interrupts from network packets arriving or disk I/O completing happen asynchronously with respect to whatever the CPU was doing before the interrupt.
Asynchronous doesn't mean simultaneous, just that it can run between any two machine instructions of the rest of the code. A signal handler in a user-space program can run between any two machine instructions, so the code in the main program must work in a way that doesn't break if this happens.
e.g. A program with a signal-handler can't make any assumptions about data on the stack below the current stack pointer (i.e. in the un-reserved part of the stack). The red-zone in the x86-64 SysV ABI is a modification to this rule for user-space only, since the kernel can respect it when transferring control to a signal handler. The kernel itself can't use a red-zone, because hardware interrupts write to the stack outside of software control, before running the interrupt handler.
In an OS where I/O completion can result in the delivery of a POSIX signal (i.e. with POSIX async I/O), the timing of a signal can easily be determined by the timing of a hardware interrupt, so user-space code runs asynchronously with timing determined by things external to the computer. It's not just an issue for the kernel.
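The signal-handler case can be sketched in a few lines of standard C++. This is only an illustration: the names g_signal_seen, on_signal and handler_ran_after_raise are made up, and std::raise delivers the signal at a known point, whereas a real signal could arrive between any two instructions of the main code.

```cpp
#include <csignal>

// sig_atomic_t is the integer type the standard guarantees can be written
// from a signal handler; volatile keeps the compiler from caching its value.
volatile std::sig_atomic_t g_signal_seen = 0;

// The handler only sets a flag. It runs outside the normal control flow of
// the main code, so it must not touch state the main code is in the middle
// of modifying.
extern "C" void on_signal(int) { g_signal_seen = 1; }

// Installs the handler, sends SIGINT to this process, restores the default
// action, and reports whether the handler ran.
bool handler_ran_after_raise() {
    g_signal_seen = 0;
    if (std::signal(SIGINT, on_signal) == SIG_ERR) return false;
    std::raise(SIGINT);            // returns after the handler has finished
    std::signal(SIGINT, SIG_DFL);  // put the default action back
    return g_signal_seen == 1;
}
```

The main code never calls on_signal directly; it only observes the flag afterwards. That is the sense in which the handler is asynchronous even on a single core.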
On a multicore system, there are obviously far more ways for things to happen in different orders more of the time.
Many processors are capable of multithreading, and many operating systems can simulate multithreading on single-threaded processors by swapping tasks in and out of the processor.

Probe seems to consume the CPU

I've got an MPI program consisting of one master process that hands off commands to a bunch of slave processes. Upon receiving a command, a slave just calls system() to do it. While the slaves are waiting for a command, they are consuming 100% of their respective CPUs. It appears that Probe() is sitting in a tight loop, but that's only a guess. What do you think might be causing this, and what could I do to fix it?
Here's the code in the slave process that waits for a command. Watching the log and the top command at the same time suggests that when the slaves are consuming their CPUs, they are inside this function.
MpiMessage
Mpi::BlockingRecv() {
    LOG(8, "BlockingRecv");
    MpiMessage result;
    MPI::Status status;
    // Block until a message from any source with any tag is available.
    MPI::COMM_WORLD.Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, status);
    result.source = status.Get_source();
    result.tag = status.Get_tag();
    int num_elems = status.Get_count(MPI_CHAR);
    char buf[num_elems + 1];
    MPI::COMM_WORLD.Recv(
        buf, num_elems, MPI_CHAR, result.source, result.tag
    );
    buf[num_elems] = '\0';  // Recv does not null-terminate the payload
    result.data = buf;
    LOG(7, "BlockingRecv about to return (%d, %d)", result.source, result.tag);
    return result;
}
Yes; most MPI implementations, for the sake of performance, busy-wait on blocking operations. The assumption is that the MPI job is the only thing going on that we care about on the processor, and if the task is blocked waiting for communications, the best thing to do is to continually poll for that communication to reduce latency; so that there's virtually no delay between when the message arrives and when it's handed off to the MPI task. This typically means that CPU is pegged at 100% even when nothing "real" is being done.
That's probably the best default behaviour for most MPI users, but it isn't always what you want. Typically MPI implementations allow turning this off; with OpenMPI, you can turn this behaviour off with an MCA parameter,
mpirun -np N --mca mpi_yield_when_idle 1 ./a.out
It sounds like there are three ways to wait for an MPI message:
Aggressive busy wait. This will get the message into your receiving code as fast as possible. Some processor is doing nothing but checking for the incoming message. If you put all of your processors in this state, the rest of your system is going to be very slow. MPI uses aggressive mode by default.
Degraded busy wait. This will yield to other processes while doing its busy wait. If the number of processes you ask for is more than the number of processors you have, MPI switches to degraded mode. You can also force aggressive or degraded mode with an MCA parameter.
Polling. Even the degraded busy wait is still a busy wait, and it will keep one processor pegged at 100% for each process that is waiting. If you have other tasks on your system that you don't want to compete with, you can call MPI_Iprobe() in a loop with a sleep call before calling a blocking receive. I find a 100ms sleep is responsive enough for my tasks, and still keeps the CPU usage minimal when a worker is idle.
I did some searching and found that a busy wait is what you want if you are not sharing your processors with other tasks.
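The polling option can be sketched roughly as below. To keep the sketch compilable without an MPI installation, the probe is passed in as a callable; wait_with_sleep is a made-up helper name, and the MPI wiring shown in the trailing comment is an untested assumption about how you would plug in MPI_Iprobe.

```cpp
#include <chrono>
#include <thread>

// Calls `probe` until it returns true, sleeping `interval` between attempts
// so the waiting process does not peg a CPU. Returns the number of probe
// attempts that found nothing.
template <typename Probe>
int wait_with_sleep(Probe probe, std::chrono::milliseconds interval) {
    int misses = 0;
    while (!probe()) {
        ++misses;
        std::this_thread::sleep_for(interval);  // the CPU is free during this
    }
    return misses;
}

// With MPI the probe could look something like this (untested sketch):
//
//   MPI_Status status;
//   auto probe = [&] {
//       int flag = 0;
//       MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &status);
//       return flag != 0;
//   };
//   wait_with_sleep(probe, std::chrono::milliseconds(100));
//   // A message is now pending, so the blocking receive returns immediately.
```

The trade-off is latency: a message that arrives just after a failed probe waits up to one full interval before it is noticed.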

Stopping runaway OpenCL kernel

I accidentally wrote a while loop that would never break in a kernel and sent it to the GPU. After 30 seconds my screens started flickering; I realised what I had done and terminated the application by force. The problem is that I had to shut down the computer afterwards to make sure the kernels were gone. Therefore my questions are:
If I forcefully terminate the program (the program that's launching the kernels) without it freeing the GPU resources (freeing buffers, queues, kernels, CL.destroying) will the kernels still run?
If they are still running can I do anything to stop them? Say, like, release resources I don't have a handle to any more.
If you are using an NVIDIA card, then by terminating the application you will eventually free the resources on the card to allow it to run again. This is because NVIDIA has a watchdog monitor on the device (which you can turn off).
If you are using an AMD card, you are out of luck AFAIK and will have to restart the machine after every crash.

If an interrupt happens, how does the Unix kernel determine which process it's for?

Let's say Unix is executing process A and an interrupt at a higher level occurs. The OS then gets an interrupt number and looks up the routine to call in the IVT.
Now how does the OS know that this interrupt was for process A and not for process B? Process B might have issued a disk read that came back while the OS was executing process A.
Thanks
Start with this: http://en.wikipedia.org/wiki/MINIX
Go buy the book and read it; it will really help a lot.
Interrupts aren't "for" processes. They're for devices and handled by device drivers.
The device driver handles the interrupt and updates the state of the device.
If the device driver concludes that an I/O operation is complete, it can then check its queue of I/O requests to determine which operation completed. The operation is removed from the queue of pending operations.
The process which is waiting for that operation is now ready-to-run and can resume execution.
You are talking about a hardware interrupt and these are not targeted at processes.
If a process A requests a file, the filesystem layer, which already resides in the kernel, will fetch the file from the block device. The block device itself is handled by a driver.
When the interrupt occurs, triggered by the block device, the OS has this interrupt associated with the driver. So the driver is told to handle the interrupt. It will then query which blocks were read and see what it requested them for.
After the filesystem is told that the requested data is ready, it may further process it. Then, the process leaves blocked state.
In the next round of the scheduler, the scheduler may select to wake up this process. It may also select to wake up another process first.
As you can see, the interrupt occurrence is fully disconnected from the process operation.
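The flow described in these answers can be modeled with a toy simulation. Nothing here is real kernel code: Process, State and ToyDiskDriver are hypothetical names, and the "interrupt" is just a method call carrying the id of the completed request. The point it illustrates is that the driver's own bookkeeping maps a completion to the waiting process, not the interrupt itself.

```cpp
#include <map>
#include <string>

enum class State { Running, Blocked, Ready };

struct Process {
    std::string name;
    State state = State::Running;
};

// Toy disk driver: remembers which process is waiting on which request.
class ToyDiskDriver {
public:
    // System-call path: queue the request and block the caller.
    int submit_read(Process& caller) {
        int id = next_id_++;
        pending_[id] = &caller;
        caller.state = State::Blocked;
        return id;
    }

    // Interrupt path: the interrupt says nothing about processes. The driver
    // learns which request finished and looks up the waiter in its own queue.
    // Returns the process made ready, or nullptr for an unknown request.
    Process* on_interrupt(int completed_id) {
        auto it = pending_.find(completed_id);
        if (it == pending_.end()) return nullptr;
        Process* waiter = it->second;
        pending_.erase(it);
        waiter->state = State::Ready;  // the scheduler may pick it later
        return waiter;
    }

private:
    int next_id_ = 1;
    std::map<int, Process*> pending_;
};
```

If process B submits a read and the completion arrives while process A is running, on_interrupt marks B as ready and leaves A untouched. Which of them runs next is then up to the scheduler.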

Resources