How do I wait on three child processes? - unix

I'm trying to fork three child processes from a parent (running on a UNIX box), with this requirement:
The parent must wait until all three child processes have finished executing.
I'm using wait for this. Here's the code snippet:
#include <unistd.h>
#include <sys/signal.h>
#include <sys/types.h>
#include <sys/wait.h>
int main()
{
int stat;
/* ... */
Finally, in the parent, I do this:
wait (&stat);
/* ... */
return 0;
}
Question:
Do I need to call wait three times, or does a single call suffice?
I need to know how this works.

You have to issue three waits. Each wait blocks until a child exits, or returns immediately if a child has already exited. See wait.

You have to wait three times.
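As a minimal sketch of the "one wait per child" rule, the function below forks three children and then calls wait until all of them have been reaped. The name spawn_and_wait_all, the NUM_CHILDREN constant, and the placeholder child body are assumptions made for illustration; they do not come from the question's code.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_CHILDREN 3  /* illustrative constant, not from the question */

/* Forks NUM_CHILDREN children, then reaps each child that was started.
 * Returns the number of children reaped, or -1 if wait() fails. */
int spawn_and_wait_all(void)
{
    int started = 0;
    for (int i = 0; i < NUM_CHILDREN; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            break;
        }
        if (pid == 0) {
            /* Child: placeholder for the real work. */
            _exit(i);
        }
        started++;
    }

    int reaped = 0;
    int status;
    /* Each successful wait() collects exactly one terminated child. */
    while (reaped < started) {
        pid_t done = wait(&status);
        if (done < 0) {
            if (errno == EINTR)
                continue;  /* interrupted by a signal, try again */
            perror("wait");
            return -1;
        }
        if (WIFEXITED(status))
            printf("child %ld exited with status %d\n",
                   (long)done, WEXITSTATUS(status));
        reaped++;
    }
    return reaped;
}
```

Calling spawn_and_wait_all() from main should return 3 when all forks succeed. The children may be reaped in any order, so the printed lines are not guaranteed to follow fork order.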

Side note: If you don't want to block waiting for each to terminate in turn, you can instead install a signal handler for SIGCHLD and then call wait() to collect the return code once you know it is ready.

Related

Would changing process priority frequently have side effects?

I am doing embedded systems programming.
Our process runs at a higher priority by default. For some actions, such as invoking a shell command or writing a file, I was thinking of lowering its priority and then raising it again. So it would be a pair of function calls: "setdefaultpriority" and "improve priority".
There are lots of shell command invocations in our process. In one file, I may need to call tens of pairs of "setdefault..." and "improve..".
My question: would so many priority operations in one process have any bad effects?
In a non-root process, setpriority can only raise the nice value (that is, lower the priority); it can never lower it again.
What you can do instead is lower the priority in the child process, before it execs the shell command.
// error checks omitted
#include <sys/resource.h>
#include <sys/time.h>
#include <stdio.h>
#include <unistd.h>
#include <assert.h>
#include <sys/wait.h>
int main()
{
pid_t pid;
pid=fork();
assert(pid>=0);
if (!pid){
execlp("nice", "nice", (char*)0);
_exit(1);
}
wait(0);
pid=fork();
if (!pid){
setpriority(PRIO_PROCESS, 0, 10);
execlp("nice", "nice", (char*)0);
_exit(1);
}
wait(0); /* reap the second child before the parent exits */
return 0;
}
/* should print:
0
10
*/
The performance overhead of a system call as simple as setpriority should be negligible compared to the cost of fork and exec*.
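For the shell-command case from the question, the same idea can be wrapped in a helper. The sketch below is an assumption-laden illustration: the name run_shell_niced, its parameters, and the use of /bin/sh -c are not from the answer. It changes the nice value only in the child, so the calling process never has to lower and re-raise its own priority.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `cmd` through /bin/sh with the child's nice
 * value set to `niceness`. Returns the command's exit status, or -1 if
 * the child could not be created, waited for, or did not exit normally. */
int run_shell_niced(const char *cmd, int niceness)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* who == 0 means "the calling process", which here is the child. */
        if (setpriority(PRIO_PROCESS, 0, niceness) != 0)
            perror("setpriority");  /* carry on at the inherited priority */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                 /* only reached if exec failed */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_shell_niced("exit 3", 10) should return 3, while the parent's own priority is left unchanged.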

How to start detached process and wait for the parent to terminate?

Using QProcess to implement an updater I start a detached process from my application and immediately exit. In the spawned updater process I will overwrite the files as needed and then start the main app again. The problem is that the update of the files can sometimes fail if the main app does not close "fast enough" to release all the loaded libraries before the updater starts overwriting them. One way would be to wait for arbitrary amount of time like 1 sec and then start updating but I would rather implement something that actually checks if the parent process is not running anymore. I can pass its ID when I spawn it but that does not really cut it because there seems to be no function such as bool QProcess::isRunning(qint64 pid). I also don't think it is possible to connect signals and slots cross applications... Any ideas?
You can use a QSystemSemaphore class in both applications to wait.
app:
int main()
{
QSystemSemaphore sem( "some_uuid", 1, QSystemSemaphore::Create );
sem.acquire();
// ...
app.exec();
// TODO: start updater
// sem.release(); // not sure, that it will not be done automatically
return 0;
}
updater:
int main()
{
QSystemSemaphore sem( "some_uuid", 1, QSystemSemaphore::Open );
sem.acquire(); // Will wait for application termination
// ...
app.exec();
return 0;
}
Don't forget about error handling. You should also try to open and close the file "yourapp.exe" from "updater.exe" with full access, to be sure that the application has closed.
For a fully reliable result, it is preferable to use a platform-specific API.

Controlled premature termination of mpi program running under slurm?

I am running a script that makes multiple successive mpirun calls through Slurm's squeue command. Each call to mpirun writes its output to its own directory, but there is a dependency between them: a given run uses data from the previous run's output directory.
The MPI program internally performs an iterative optimization algorithm, which terminates when some convergence criteria are met. Every once in a while, the algorithm reaches a state in which those criteria are not quite met yet, but by plotting the output (which is continuously written to disk) one can quite easily tell that the important things have converged and that further iterations would not change the nature of the final result.
What I am therefore looking for is a way to manually terminate the run in a controlled way and have the outer script proceed to the next mpirun call. What is the best way to achieve this? I do not have direct access to the node on which the calculation is actually performed, but I do of course have access to all of Slurm's commands and the working directories of the individual runs. I have access to the MPI program's full source code.
One solution that would work is the following: to terminate a run manually, one places a file with a special name like killme in the working directory, which could easily be done with touch killme. The MPI program would regularly check for the existence of this file and terminate in a controlled manner if it exists. Neither the outer script nor Slurm would be involved at all, and the script would just continue with the next mpirun call. What do you think of this solution? Can you think of anything better?
Here is a short code snippet for catching SIGUSR1.
More detailed explanation can be found here.
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>
void sighandler(int signum, siginfo_t *info, void *ptr) {
fprintf(stderr, "Received signal %d\n", signum);
fprintf(stderr, "Signal originates from process %lu\n",
(unsigned long) info->si_pid);
fprintf(stderr, "Shutting down properly.\n");
exit(0);
}
int main(int argc, char** argv) {
struct sigaction act;
printf("pid %lu\n", (unsigned long) getpid());
memset(&act, 0, sizeof(act));
act.sa_sigaction = sighandler;
act.sa_flags = SA_SIGINFO;
sigaction(SIGUSR1, &act, NULL);
while (1) {
};
return 0;
}

mpi multiple init finalize

Assuming I have a good reason to do the following (I think I have), how can I make it work?
#include "mpi.h"
int main( int argc, char *argv[] )
{
int myid, numprocs;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
// ...
MPI_Finalize();
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
// ...
MPI_Finalize();
return 0;
}
I got the error:
--------------------------------------------------------------------------
Calling any MPI-function after calling MPI_Finalize is erroneous.
The only exceptions are MPI_Initialized, MPI_Finalized and MPI_Get_version.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** after MPI was finalized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[ange:13049] Abort after MPI_FINALIZE completed successfully; not able to guarantee that all other processes were killed!
The reason to do that:
I have Python wrapping around C++ code. Some wrapped classes have a constructor that calls MPI_Init and a destructor that calls MPI_Finalize. I would like to be able to freely create, delete, and re-create in Python the objects that wrap this C++ class. The ultimate goal is to create a web service entirely in Python that imports the Python C++ extension once and executes some Python code for each user request.
EDIT: I think I'll refactor the C++ code to make it possible to skip MPI_Init and MPI_Finalize in the constructor and destructor, so that they can be called exactly once in the Python script (using mpi4py).
You've basically got the right solution, so I'll just confirm. It is in fact erroneous to call MPI_Init and MPI_Finalize multiple times, and if you have an entity that calls these internally on creation/destruction, then you can only instantiate that entity once. If you want to create multiple instances, you'll need to change the entity to do one of the following:
Offer an option to not call Init and Finalize that the user can set externally
Use MPI_Initialized and MPI_Finalized to decide whether it needs to call either of the above

SIGCHLD handler reinstall

I've seen some examples of a SIGCHLD handler like this:
void child()
{
wait(0);
signal(SIGCHLD, child);
}
void server_main()
{
...
signal(SIGCHLD, child);
...
for (;;) {
...
switch(fork()) {
...
}
}
}
There are two parts of the handler that confuse me:
1). SIGCHLD is caught when the child terminates or is stopped. Why, then, is it necessary to call wait inside the handler? The signal has already arrived.
2). Why is it necessary to reinstall the SIGCHLD handler? Doesn't the signal call install the handler once and for all?
Thanks!
SIGCHLD is raised when the child process finishes execution. The child will, however, still be in the process table (as a so-called zombie process) so that the parent can fetch its exit value. Calling wait() removes that child process from the process table.
If you only create n child processes, then there's no reason for the signal handler to still be in place once all n child processes have died.
I suggest you take a look at sigaction instead, as the behaviour of signal varies between Unixes.
Doesn't the signal call install the handler once and for all?
You cannot rely on this behavior; perhaps the signal handler will be cleared, perhaps it will persist. This is part of the problem with historical signal handling. The signal(3) manpage on my system reports:
When a signal occurs, and func points to a function, it is
implementation-defined whether the equivalent of a:
signal(sig, SIG_DFL);
is executed or the implementation prevents some
implementation-defined set of signals (at least including
sig) from occurring until the current signal handling has
completed.
Unreliable signals have been nearly replaced by sigaction(2)-based signals introduced in SysVr4 and standardized in POSIX.1-2001:
struct sigaction {
void (*sa_handler)(int);
void (*sa_sigaction)(int, siginfo_t *, void *);
sigset_t sa_mask;
int sa_flags;
void (*sa_restorer)(void);
};
int sigaction(int signum, const struct sigaction *act,
struct sigaction *oldact);
These are sadly more complicated to write, but once you've written the code, you won't have to wonder if you need to re-install your handler -- and you won't have to worry that the signal will arrive a second time while handling the signal.
