I need to use MPI spawn on a cluster. As I understand it, I need to use MPI_Info_set to specify which nodes will run the spawned processes. I have tried MPI_Info_set(info, "add-host","node1,node2") but it does not work.
Below, I provide a small example of the spawning code:
MPI_Info info;
MPI_Info_create(&info);
MPI_Info_set(info,"add-host","node1,node2");
MPI_Comm_spawn("./mpiworker", MPI_ARGV_NULL,
dynamic_procs,
info, 0, MPI_COMM_WORLD,
&intercomm,
MPI_ERRCODES_IGNORE);
Is there anything else I can use?
The add-host key probably comes from Open MPI (see its man page); it is not supported in MPICH.
For MPICH, try one of these:
host - works for me,
hosts - should work, but it seems to be broken in the version I currently use: MPI spawns all processes on the first node given in the value. If that also happens in your case, I suggest assigning a host to each process manually using MPI_Comm_spawn_multiple.
Also, I have no idea how to find a list of all supported keys; the MPICH documentation does not seem to be complete on this.
Using MPI_Comm_spawn_multiple instead of plain MPI_Comm_spawn worked for me. The following code spawns 1 process per node. I could spawn more processes per node by extending the arrays below.
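MPI_Comm intercomm;  // intercommunicator to the spawned workers
MPI_Info info[2];
// One info object per command, each pinned to a node via the "host" key
MPI_Info_create(&info[0]);
MPI_Info_set(info[0], "host", "node1");
MPI_Info_create(&info[1]);
MPI_Info_set(info[1], "host", "node2");
char *cmds[2] = { "./mpiworker", "./mpiworker" };
int np[2] = { 1, 1 };  // number of processes to start per command
int errcodes[2];       // one error code per spawned process
MPI_Comm_spawn_multiple(2, cmds, MPI_ARGVS_NULL, np, info, 0, MPI_COMM_WORLD, &intercomm, errcodes);
// The info objects are no longer needed once the spawn has returned
MPI_Info_free(&info[0]);
MPI_Info_free(&info[1]);
// Parallel code follows below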
...
The above was tested on Ubuntu Bionic with MPICH version 3.3a2.
My example is based on the following page. If I find a more elegant way, I will repost.
I wrote code in AX 2009 to poll a directory on a network drive, every 1 second, waiting for a response file from another system. I noticed that using a file explorer window, I could see the file appear, yet my code was not seeing and processing the file for several seconds - up to 9 seconds (and 9 polls) after the file appeared!
The AX code calls System.IO.Directory::GetFiles() using ClrInterop:
interopPerm = new InteropPermission(InteropKind::ClrInterop);
interopPerm.assert();
files = System.IO.Directory::GetFiles(#POLLDIR,'*.csv');
// etc...
CodeAccessPermission::revertAssert();
After much experimentation, it emerges that my first call to ::GetFiles() in the program's lifetime starts a notional "ticking clock" with a period of 10 seconds. Only calls made on a 10-second tick find any new files that may have appeared, though calls in between still report files that were found on an earlier tick.
If the file is not there when I start the program, then all the other calls to ::GetFiles(), 1 second after the first call, 2 seconds after, etc., up to 9 seconds after, simply do not see the file, even though it may have been sitting there since 0.5s after the first call!
Then, reliably, and repeatably, the call 10s after the first call, will find the file. Then no calls from 11s to 19s will see any new file that might have appeared, yet the call 20s after the first call, will reliably see any new files. And so on, every 10 seconds.
Further investigation revealed that if the polled directory is on the AX AOS machine, this does not happen, and the file is found immediately, as one would expect, on the call after the file appears in the directory.
But this figure of 10s is reliable and repeatable, no matter what network drive I poll, no matter what server it's on.
Our network certainly doesn't have 10s of latency to see files; as I said, a file explorer window on the polled directory sees the file immediately.
What is going on?
Sounds like your issue is due to SMB caching - from this technet page:
Name, type, and ID: Directory Cache [DWORD] DirectoryCacheLifetime
Registry key the cache setting is controlled by: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters
This is a cache of recent directory enumerations performed by the
client. Subsequent enumeration requests made by client applications as
well as metadata queries for files in the directory can be satisfied
from the cache. The client also uses the directory cache to determine
the presence or absence of a file in the directory and uses that
information to prevent clients from repeatedly attempting to open
files which are known not to exist on the server. This cache is likely
to affect distributed applications running on multiple computers
accessing a set of files on a server – where the applications use an
out of band mechanism to signal each other about
modification/addition/deletion of files on the server.
In short, try setting the registry value
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters\DirectoryCacheLifetime
to 0
Thanks to @Jan B. Kjeldsen, I was able to solve my problem using FileSystemWatcher. Here is my implementation in X++:
class SelTestThreadDirPolling
{
}
public server static Container SetStaticFileWatcher(str _dirPath,str _filenamePattern,int _timeoutMs)
{
InteropPermission interopPerm;
System.IO.FileSystemWatcher fw;
System.IO.WatcherChangeTypes watcherChangeType;
System.IO.WaitForChangedResult res;
Container cont;
str fileName;
str oldFileName;
str changeType;
;
interopPerm = new InteropPermission(InteropKind::ClrInterop);
interopPerm.assert();
fw = new System.IO.FileSystemWatcher();
fw.set_Path(_dirPath);
fw.set_IncludeSubdirectories(false);
fw.set_Filter(_filenamePattern);
watcherChangeType = ClrInterop::parseClrEnum('System.IO.WatcherChangeTypes', 'Created');
res = fw.WaitForChanged(watcherChangeType,_timeoutMs);
if (res.get_TimedOut()) return conNull();
fileName = res.get_Name();
//ChangeTypeName can be: Created, Deleted, Renamed and Changed
changeType = System.Enum::GetName(watcherChangeType.GetType(), res.get_ChangeType());
fw.Dispose();
CodeAccessPermission::revertAssert();
if (changeType == 'Renamed') oldFileName = res.get_OldName();
cont += fileName;
cont += changeType;
cont += oldFileName;
return cont;
}
void waitFileSystemWatcher(str _dirPath,str _filenamePattern,int _timeoutMs)
{
container cResult;
str filename,changeType,oldFilename;
;
cResult=SelTestThreadDirPolling::SetStaticFileWatcher(_dirPath,_filenamePattern,_timeoutMs);
if (cResult)
{
[filename,changeType,oldFilename]=cResult;
info(strfmt("filename=%1, changeType=%2, oldFilename=%3",filename,changeType,oldFilename));
}
else
{
info("TIMED OUT");
}
}
void run()
{;
this.waitFileSystemWatcher(@'\\myserver\mydir','filepattern*.csv',10000);
}
I should acknowledge the following for forming the basis of my X++ implementation:
https://blogs.msdn.microsoft.com/floditt/2008/09/01/how-to-implement-filesystemwatcher-with-x/
I would guess DAXaholic's answer is correct, but you could try other solutions like EnumerateFiles.
In your case I would wait for the files rather than poll for them.
Using FileSystemWatcher there will be a minimal delay from file creation till your process wakes up. It is more tricky to use, but avoiding polling is a good thing. I have never used it over a network.
I tried to iterate over the basic blocks in a specific routine, but I ran into a problem:
VOID Routine(RTN rtn, VOID *v)
{
RTN_Open(rtn);
for (BBL bbl = RTN_BblHead(rtn); BBL_Valid(bbl); bbl = BBL_Next(bbl))
{ /* some code */ }
RTN_Close(rtn);
}
The compiler reports: error: deprecated-declarations.
How can I fix that error, or do it another way?
You have a deprecated-declarations warning because RTN_BblHead is now deprecated. Use RTN_InsHead instead.
From include\pin\gen\image.ph:
/* DO NOT EDIT */
/* RTN_BblHead is now deprecated. See RTN_InsHead.
*/
extern PIN_DEPRECATED_API BBL RTN_BblHead(RTN x);
This is also mentioned in the documentation: RTN_BblHead
You can also pass -Wno-deprecated-declarations to GCC to suppress this warning.
Edit
Remember that PIN is above all a DBI (dynamic binary instrumentation) framework: it is extremely good at instrumenting the executed code flow, and less good at breaking down code that is not executed.
Routine instrumentation lets the Pintool inspect and instrument an entire routine when the image it is contained in is first loaded, but as the documentation points out:
A Pintool can walk the instructions of a routine. There is not enough
information available to break the instructions into BBLs.
Pin finds the instructions of an RTN through static discovery, so it cannot guarantee that it will find all the instructions in the RTN, and this is even more difficult for BBLs. My guess is that they tried at some point (hence the availability of RTN_BblHead in the past) to provide static discovery of BBLs, but the discovery rate was too low (or too error prone) to be deemed acceptable, so the function became deprecated.
In short, yes, you need to walk an RTN instruction by instruction (knowing that Pin might miss some instructions, as this is done statically). You can only discover the BBLs of a routine if the routine is executed at some point.
I have a GSM modem and a PLC. The PLC sees the modem (I use a *.lib and the function block "openPort"), but I don't understand how to send an AT command to the modem, for example "ate0".
First, to increase your understanding of AT commands in general, read the V.250 specification. That will go a long way in making you an AT command expert.
Then for the actual implementation, I do not know Codesys, so the following is pseudo code of the structure you should have for handling AT commands:
the_modem = openPort();
...
// Start sending ATE0
writePort(the_modem, "ATE0\r");
do {
line = readLinePort(the_modem);
} while (! is_final_result_code(line));
// Sending of ATE0 command finished (successfully or not)
...
closePort(the_modem);
Whatever you do, never use delay, sleep or similar as a substitute for waiting for the final result code. You can look at the code for atinout for an example of the is_final_result_code function. You can also compare it to isFinalResponseError and isFinalResponseSuccess in ST-Ericsson's U300 RIL, but note that CONNECT is an intermediate result code, not a final one, so the name isFinalResponseSuccess is not 100% correct.
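To make the loop condition above concrete, here is a minimal sketch in C of what an is_final_result_code function could look like. It is not the atinout code, only an illustration: the list of result codes is my assumption of the common ones (the basic V.250 codes plus the +CME ERROR: and +CMS ERROR: forms used by GSM modems), and the helper starts_with is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if s begins with prefix. */
static bool starts_with(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Returns true if `line` (one response line, with or without its
 * trailing "\r\n") is a final result code, i.e. the modem has finished
 * processing the command. The code list is an assumption; extend it to
 * match what your modem actually sends. */
bool is_final_result_code(const char *line)
{
    static const char *const exact[] = {
        "OK", "ERROR", "NO CARRIER", "NO DIALTONE", "BUSY", "NO ANSWER"
    };
    size_t len = strlen(line);

    /* Ignore trailing CR/LF so "OK" and "OK\r\n" are treated alike. */
    while (len > 0 && (line[len - 1] == '\r' || line[len - 1] == '\n'))
        len--;

    for (size_t i = 0; i < sizeof exact / sizeof exact[0]; i++) {
        if (len == strlen(exact[i]) && strncmp(line, exact[i], len) == 0)
            return true;
    }

    /* Extended error reports carry a numeric or text payload after the colon. */
    return starts_with(line, "+CME ERROR:") || starts_with(line, "+CMS ERROR:");
}
```

CONNECT is deliberately not in the list, since it is an intermediate result code. Information responses such as "+CSQ: 15,99" also return false, so the read loop keeps going until the terminating OK or error arrives.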
I have an FTDI USB serial device which I use via the termios serial API. I set up the port so that it will time-out on read() calls in half a second (by using the VTIME parameter), and this works on Linux as well as on FreeBSD. On OpenBSD 5.1, however, the read() call simply blocks forever when no data is available (see below.) I would expect read() to return 0 after 500ms.
Can anyone think of a reason that the termios API would behave differently under OpenBSD, at least with respect to the timeout feature?
EDIT: The no-timeout problem is caused by linking against pthread. Regardless of whether I'm actually using any pthreads, mutexes, etc., simply linking against that library causes read() to block forever instead of timing out based on the VTIME setting. Again, this problem only manifests on OpenBSD -- Linux and FreeBSD work as expected.
if ((sd = open(devPath, O_RDWR | O_NOCTTY)) >= 0)
{
struct termios newtio;
char input;
memset(&newtio, 0, sizeof(newtio));
// set options, including non-canonical mode
newtio.c_cflag = (CREAD | CS8 | CLOCAL);
newtio.c_lflag = 0;
// when waiting for responses, wait until we haven't received
// any characters for 0.5 seconds before timing out
newtio.c_cc[VTIME] = 5;
newtio.c_cc[VMIN] = 0;
// set the input and output baud rates to 7812
cfsetispeed(&newtio, 7812);
cfsetospeed(&newtio, 7812);
if ((tcflush(sd, TCIFLUSH) == 0) &&
(tcsetattr(sd, TCSANOW, &newtio) == 0))
{
read(sd, &input, 1); // even though VTIME is set on the device,
// this read() will block forever when no
// character is available in the Rx buffer
}
}
From the termios man page:
Another dependency is whether the O_NONBLOCK flag is set by open() or
fcntl(). If the O_NONBLOCK flag is clear, then the read request is
blocked until data is available or a signal has been received. If the
O_NONBLOCK flag is set, then the read request is completed, without
blocking, in one of three ways:
1. If there is enough data available to satisfy the entire
request, and the read completes successfully the number of
bytes read is returned.
2. If there is not enough data available to satisfy the entire
request, and the read completes successfully, having read as
much data as possible, the number of bytes read is returned.
3. If there is no data available, the read returns -1, with errno
set to EAGAIN.
Can you check if this is the case?
Cheers.
Edit: The OP traced the problem back to linking against pthreads, which caused the read function to block. Upgrading to OpenBSD >5.2 resolved the issue, because the new rthreads implementation became the default threading library on OpenBSD. More info is in guenther@'s EuroBSD2012 slides.
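If upgrading is not an option, one common way to bound a read without relying on VTIME is to wait for the descriptor with poll() first. The following is only a sketch of that general technique, not something from the original question or answer, and I have not tested it on OpenBSD 5.1 with pthread linked in. The function name read_with_timeout is hypothetical.

```c
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Waits up to timeout_ms milliseconds for fd to become readable, then
 * reads at most len bytes into buf.
 * Returns the number of bytes read, 0 if nothing arrived before the
 * timeout (or on end of file), or -1 on error with errno set. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;   /* poll failed, e.g. interrupted by a signal */
    if (ready == 0)
        return 0;    /* timed out with no data available */

    return read(fd, buf, len);
}
```

With the descriptor from the question, read_with_timeout(sd, &input, 1, 500) would stand in for the plain read() call and give the same half-second limit that VTIME = 5 was meant to provide.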
Hey, I am making some stuff in Objective-C++... And I must say that I am a total newbie when it comes to the Objective-C part... I don't really want to learn it, I kinda just need it for accessing a few Mac APIs (ObjC is such a dumb language).
So - compiling with g++ -x objective-c++ - and I somehow keep getting this warning:
XXX may not respond to YYY
First it was with a NSScreen, now it is with a NSWindow:
NSWindow may not respond to +initWithContentRect:styleMask:backing:defer:
I saw somewhere that I should cast it to id, but didn't work, throwing absolutely cryptic errors...
So - WHAT does this warning actually mean and HOW am I supposed to make it stop?
EDIT: Okay, apparently I need to ALLOCATE an instance first, then I can call its init function... Anyways, now the GCC is reporting:
confused by earlier errors, bailing out
And NOTHING else. This is the ONLY error that it reports. I figured that there is some error in my code that doesn't get reported... So I will post the whole file where the problem is here:
ONXeWindow::ONXeWindow(int px, int py, int sw, int sh, bool resizable){
NSRect wr = NSMakeRect(px, py, sw, sh);
int wf = 1; // titled
wf += 2; // closable
wf += 4; // miniaturizable
wf += (resizable ? 8 : 0); // resizable
wf += (false ? 128 : 0); // metal bg
useWindow = [[NSWindow alloc] initWithContentRect:wr styleMask:wf backing:2 defer:YES];
}
Also, YES, framework AppKit was imported (in the header file) - I am not going to confuse you with my weird file scheme here.
The message isn't really cryptic, you just don't know the language (and don't care to, by your own admission).
Since Objective-C methods are dispatched dynamically at run time, you can call a method that the compiler doesn't have any knowledge of. The compiler is warning you that you're doing so. The + at the beginning of the method name means that you're calling a class method (a - would indicate that you're calling a method on an instance). Since NSWindow has no class method named initWithContentRect:styleMask:backing:defer:, the compiler is giving you a warning, and in this case it's a pretty good one.
You probably wrote something like:
NSWindow* myWindow = [NSWindow initWithContentRect:rect styleMask:0 backing:0 defer:NO];
when you meant to write something like:
NSWindow* myWindow = [[NSWindow alloc] initWithContentRect:rect styleMask:0 backing:0 defer:NO];
The first one sends the message directly to the class, but this is an instance method. You need to allocate an instance of NSWindow first, then send the init message. Also, clang tends to give much better warning and error messages than gcc. Clang 2.0 also handles C++ and ObjC++ pretty well, so it might be worth it to switch to clang.
Check out this example; it looks like you are not allocating your objects.