I want to run a particular MPI function under google benchmark. Something like:
#include <mpi.h>
#include <benchmark/benchmark.h>
template<class Real>
void MPIInitFinalize(benchmark::State& state)
{
auto mpi = []() {
MPI_Init(nullptr, nullptr);
foo();
MPI_Finalize();
};
for(auto _ : state) {
mpi();
}
}
BENCHMARK_TEMPLATE(MPIInitFinalize, double);
BENCHMARK_MAIN();
Of course, we know what will happen:
*** The MPI_Init() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
I understand that MPI isn't cool with what I want to do. But google benchmark is simply too useful to not at least try to find a hack to make this work.
Is there anything that can be done? Can I fork a process and pass the lambda to it? Is there a threading pattern that will work? Even expensive things will be helpful, as I can just subtract the cost of doing whatever hack works without a call too foo() from the one which call foo().
If you don't need to include MPI_Init and MPI_Finalize in your time (which you probably don't want anyways) you can take alook at this gist: https://gist.github.com/mdavezac/eb16de7e8fc08e522ff0d420516094f5
It countains an example on how to benchmark MPI enabled code with google benchmark. The basic idea is to call google benchmark from your own main method (using ::benchmark::Initialize(&argc, argv) and ::benchmark::RunSpecifiedBenchmarks()), synchronize using MPI_Barrier, time your code using std::chrono::high_resolution_clock and using MPI_Allreduce to find the slowest process. You can then publish that time using state.SetIterationTime (but only on the main process).
Related
Here is a simple piece of code where a division by zero occurs. I'm trying to catch it :
#include <iostream>
int main(int argc, char *argv[]) {
int Dividend = 10;
int Divisor = 0;
try {
std::cout << Dividend / Divisor;
} catch(...) {
std::cout << "Error.";
}
return 0;
}
But the application crashes anyway (even though I put the option -fexceptions of MinGW).
Is it possible to catch such an exception (which I understand is not a C++ exception, but a FPU exception) ?
I'm aware that I could check for the divisor before dividing, but I made the assumption that, because a division by zero is rare (at least in my app), it would be more efficient to try dividing (and catching the error if it occurs) than testing each time the divisor before dividing.
I'm doing these tests on a WindowsXP computer, but would like to make it cross platform.
It's not an exception. It's an error which is determined at hardware level and is returned back to the operating system, which then notifies your program in some OS-specific way about it (like, by killing the process).
I believe that in such case what happens is not an exception but a signal. If it's the case: The operating system interrupts your program's main control flow and calls a signal handler, which - in turn - terminates the operation of your program.
It's the same type of error which appears when you dereference a null pointer (then your program crashes by SIGSEGV signal, segmentation fault).
You could try to use the functions from <csignal> header to try to provide a custom handler for the SIGFPE signal (it's for floating point exceptions, but it might be the case that it's also raised for integer division by zero - I'm really unsure here). You should however note that the signal handling is OS-dependent and MinGW somehow "emulates" the POSIX signals under Windows environment.
Here's the test on MinGW 4.5, Windows 7:
#include <csignal>
#include <iostream>
using namespace std;
void handler(int a) {
cout << "Signal " << a << " here!" << endl;
}
int main() {
signal(SIGFPE, handler);
int a = 1/0;
}
Output:
Signal 8 here!
And right after executing the signal handler, the system kills the process and displays an error message.
Using this, you can close any resources or log an error after a division by zero or a null pointer dereference... but unlike exceptions that's NOT a way to control your program's flow even in exceptional cases. A valid program shouldn't do that. Catching those signals is only useful for debugging/diagnosing purposes.
(There are some useful signals which are very useful in general in low-level programming and don't cause your program to be killed right after the handler, but that's a deep topic).
Dividing by zero is a logical error, a bug by the programmer. You shouldn't try to cope with it, you should debug and eliminate it. In addition, catching exceptions is extremely expensive- way more than divisor checking will be.
You can use Structured Exception Handling to catch the divide by zero error. How that's achieved depends on your compiler. MSVC offers a function to catch Structured Exceptions as catch(...) and also provides a function to translate Structured Exceptions into regular exceptions, as well as offering __try/__except/__finally. However I'm not familiar enough with MinGW to tell you how to do it in that compiler.
There's isn't a
language-standard way of catching
the divide-by-zero from the CPU.
Don't prematurely "optimize" away a
branch. Is your application
really CPU-bound in this context? I doubt it, and it isn't really an
optimization if you break your code.
Otherwise, I could make your code
even faster:
int main(int argc, char *argv[]) { /* Fastest program ever! */ }
Divide by zero is not an exception in C++, see https://web.archive.org/web/20121227152410/http://www.jdl.co.uk/briefings/divByZeroInCpp.html
Somehow the real explanation is still missing.
Is it possible to catch such an exception (which I understand is not a C++ exception, but a FPU exception) ?
Yes, your catch block should work on some compilers. But the problem is that your exception is not an FPU exception. You are doing integer division. I don’t know whether that’s also a catchable error but it’s not an FPU exception, which uses a feature of the IEEE representation of floating point numbers.
On Windows (with Visual C++), try this:
BOOL SafeDiv(INT32 dividend, INT32 divisor, INT32 *pResult)
{
__try
{
*pResult = dividend / divisor;
}
__except(GetExceptionCode() == EXCEPTION_INT_DIVIDE_BY_ZERO ?
EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
{
return FALSE;
}
return TRUE;
}
MSDN: http://msdn.microsoft.com/en-us/library/ms681409(v=vs.85).aspx
Well, if there was an exception handling about this, some component actually needed to do the check. Therefore you don't lose anything if you check it yourself. And there's not much that's faster than a simple comparison statement (one single CPU instruction "jump if equal zero" or something like that, don't remember the name)
To avoid infinite "Signal 8 here!" messages, just add 'exit' to Kos nice code:
#include <csignal>
#include <iostream>
#include <cstdlib> // exit
using namespace std;
void handler(int a) {
cout << "Signal " << a << " here!" << endl;
exit(1);
}
int main() {
signal(SIGFPE, handler);
int a = 1/0;
}
As others said, it's not an exception, it just generates a NaN or Inf.
Zero-divide is only one way to do that. If you do much math, there are lots of ways, like
log(not_positive_number), exp(big_number), etc.
If you can check for valid arguments before doing the calculation, then do so, but sometimes that's hard to do, so you may need to generate and handle an exception.
In MSVC there is a header file #include <float.h> containing a function _finite(x) that tells if a number is finite.
I'm pretty sure MinGW has something similar.
You can test that after a calculation and throw/catch your own exception or whatever.
Is it possible to have a Segmentation Fault on if incorrectly set the value of a function pointer?
Or will the interpreter/compiler detect that beforehand?
The details depend on the language you're using, but in general it's not just possible but likely.
C provides no guarantees whatsoever. You can just say e.g.
#include <stddef.h>
typedef void (*foo)( void );
int main( void ) {
((foo)NULL)( );
return 0;
}
which takes NULL, casts it to a function and calls it (or at least attempts to, and crashes.) As of writing, both gcc -Wall and clang -Wall will neither detect nor warn for even this pathological case.
With other languages, there may be more safeguards in place. But generally, don't expect your program to survive a bad function pointer.
void (*ptr)() = (void (*) ())0x0;
ptr();
Nothing prevents you from compiling/executing this, but it will fail for sure.
The following example produces the segmentation fault you mention:
int main(int argc, char *argv[]) {
void (*fun_ptr)() = (void (*)()) 1;
(*fun_ptr)();
return 0;
}
None of cc, clang, splint issue a warning. C assumes that the programmer knows what he is doing.
UPDATE
The following reference illustrates why a C allows for absolute memory addressing to be accessed through pointers.
Koenig, Andrew R., C Traps an Pitfalls, Bell Telephone Laboratories, Murray Hill, New Jersey, Technical Memorandum, 2.1. Understanding Declarations:
I once talked to someone who was writing a C program that was going to
run stand-alone in a small microprocessor. When this machine was
switched on, the hardware would call the subroutine whose address was
stored in location 0.
In order to simulate turning power on, we had to devise a C statement
that would call this subroutine explicitly. After some thought, we
came up with the following:
(*(void(*)())0)();
I am running a script that does multiple subsequent mpirun calls through slurms squeue command. Each call to mpirun will write its output to an own directory, but there is a dependency between them in the way that a given run will use data from the former runs output directory.
The mpi program internally performs some iterative optimization algorithm, which will terminate if some convergence criteria are met. Every once in a while it happens, that the algorithm reaches a state in which those criteria are not quite met yet, but by plotting the output (which is continuosly written to disk) one can quite easily tell that the important things have converged and that further iterations would not change the nature of the final result anymore.
What I am therefore looking for is a way to manually terminate the run in a controlled way and have the outer script proceed to the next mpirun call. What is the best way to achieve this? I do not have direct access to the node on which the calculation is actually performed, but I have of course access to all of slurms commands and the working directories of the individual runs. I have access to the mpi programs full source code.
One solution that would work is the following: If one manually wants to terminate a run, one places a file with a special name like killme in the working directory, which could easily be done with touch killme. The mpi program would regulary check for the existence of this file and terminate in a controlled manner if it exists. The outer script or slurm would not be involved at all here and the script would just continue with the next mpirun call. What do you think of this solution? Can you think of anything better?
Here is a short code snippet for getting SIGUSR1 as a signal.
More detailed explanation can be found here.
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>
void sighandler(int signum, siginfo_t *info, void *ptr) {
fprintf(stderr, "Received signal %d\n", signum);
fprintf(stderr, "Signal originates from process %lu\n",
(unsigned long) info->si_pid);
fprintf(stderr, "Shutting down properly.\n");
exit(0);
}
int main(int argc, char** argv) {
struct sigaction act;
printf("pid %lu\n", (unsigned long) getpid());
memset(&act, 0, sizeof(act));
act.sa_sigaction = sighandler;
act.sa_flags = SA_SIGINFO;
sigaction(SIGUSR1, &act, NULL);
while (1) {
};
return 0;
}
Assuming I have good reason to do the following (I think I have), how to make it works?
#include "mpi.h"
int main( int argc, char *argv[] )
{
int myid, numprocs;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
// ...
MPI_Finalize();
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
// ...
MPI_Finalize();
return 0;
}
I got the error:
--------------------------------------------------------------------------
Calling any MPI-function after calling MPI_Finalize is erroneous.
The only exceptions are MPI_Initialized, MPI_Finalized and MPI_Get_version.
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** after MPI was finalized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[ange:13049] Abort after MPI_FINALIZE completed successfully; not able to guarantee that all other processes were killed!
The reason to do that:
I've Python wrapping around C++ code. Some wrapped class have constructor that call MPI_Init, and destructor that call MPI_Finalize. I would like to be able in Python to freely create, delete re-create the Python object that wrap this C++ class. The ultimate goal is to create a webservice entirely in Python, that import the Python C++ exstension once, and execute some Python code given the user request.
EDIT: I think I'll refactor the C++ code to give possibility to not MPI_Init and MPI_Finalize in constructor and destructor, so it's possible to do it exactly one time in the Python script (using mpi4py).
You've basically got the right solution, so I'll just confirm. It is in fact erroneous to call MPI_Init and MPI_Finalize multiple times, and if you have an entity that calls these internally on creation/destruction, then you can only instantiate that entity once. If you want to create multiple instances, you'll need to change the entity to do one of the following:
Offer an option to not call Init and Finalize that the user can set externally
Use MPI_Initialized and MPI_Finalized to decide whether it needs to call either of the above
Recently ,I come across this problem as I memtioned in this Title.
I have tried by using QThread::terminate(),but I just can NOT stop
the thread ,which is in a dead loop (let's say,while(1)).
thanks a lot.
Terminating the thread is the easy solution to stopping an async operation, but it is usually a bad idea: the thread could be doing a system call or could be in the middle of updating a data structure when it is terminated, which could leave the program or even the OS in an unstable state.
Try to transform your while(1) into while( isAlive() ) and make isAlive() return false when you want the thread to exit.
QThreads can deadlock if they finish "naturally" during termination.
For example in Unix, if the thread is waiting on a "read" call, the termination attempt (a Unix signal) will make the "read" call abort with an error code before the thread is destroyed.
That means that the thread can still reach it's natural exit point while being terminated. When it does so, a deadlock is reached since some internal mutex is already locked by the "terminate" call.
My workaround is to actually make sure that the thread never returns if it was terminated.
while( read(...) > 0 ) {
// Do stuff...
}
while( wasTerminated )
sleep(1);
return;
wasTerminated here is actually implemented a bit more complex, using atomic ints:
enum {
Running, Terminating, Quitting
};
QAtomicInt _state; // Initialized to Running
void myTerminate()
{
if( _state.testAndSetAquire(Running, Terminating) )
terminate();
}
void run()
{
[...]
while(read(...) > 0 ) {
[...]
}
if( !_state.testAndSetAquire(Running, Quitting) ) {
for(;;) sleep(1);
}
}
Have you tried exit or quit?
Did the thread call QThread::setTerminationEnabled(false)? That would cause thread termination to delay indefinitely.
EDIT: I don't know what platform you're on, but I checked the Windows implementation of QThread::terminate. Assuming the thread was actually running to begin with, and termination wasn't disabled via the above function, it's basically a wrapper around TerminateThread() in the Windows API. This function accepts disrespect from no thread, and tends to leave a mess behind with resource leaks and similar dangling state. If it's not killing the thread, you're either dealing with zombie kernel calls (most likely blocked I/O) or have even bigger problems somewhere.
To use unnamed pipes
int gPipeFdTest[2]; //create a global integer array
As an when where you intend to create pipes use
if( pipe(gPipeFdTest) < 0)
{
perror("Pipe failed");
exit(1);
}
The above code will create a pipe which has two ends gPipeFdTest[0] for reading and gPipeFdTest[1] for writing. What you can do is in your run function set up to read the pipe using select system call. And from where you want to come out of run, there set up to write using write system call. I have used select system call for monitoring the read end of the pipe as it suits my implmentation. Try to figure all this out in your case. If you need any more help, give me a buzz.
Edit:
My problem was just like yours. I had a while(1) loop and the other things I tried needed mutexes and other fancy multithreading mumbo jumbo, which added complexity and debugging was nightmare. Using pipes absolved me from those complexities besides simplified the code. I am not saying that it is the best option but in my case it turned out to be the best and cleanest alternative. I was bugged my hung application before this solution.