I have a Perl program that uses threads to run multiple workers doing the same task. For example, the Perl program parses a set of fixed-length files in a directory and creates output in Excel. The directory will contain multiple such fixed-length files. The Perl program spawns 10 threads, each of which parses one file and creates the desired output.
My intention is to get away from the threads in Perl, and I was wondering if I could use the asynchronous nature of node.js to call, let's say, 10 of these Perl programs on a set of 10 files, and so on, until all the files in the given directory are parsed.
Having researched node.js, I understand that I can call external programs (the Perl program in this specific case) from node.js. However, I was not able to ascertain whether these calls to external programs are asynchronous. If they are, then I could write a wrapper in node.js that calls the Perl program, let's say, 10 times, so that the 10 Perl programs run simultaneously (well, almost simultaneously, I guess) and I can get rid of the threads in the Perl program.
Can anyone help please?
Why can't MPI (by which I mean OpenMPI throughout this post) programs be executed like any other program, instead of having to be launched with mpirun?
In other words, why does MPI not simply provide headers/packages/... that you can import, letting you be master in your own house: using MPI when and where you want in your source code, and compiling your own parallel-processing-enabled executables?
I'm really a novice, but for example, I feel like the -np argument passed to mpirun could easily be fixed in the source code, or prompted for by the program itself, or read from a configuration file, or simply set to use all available cores, whose number will be determined by a surrounding scheduler script anyway, or ....
(Of course, you can argue that there is a certain convenience in having mpirun do this automatically in some sense, but that hardly justifies, in my view, taking away the coder's ability to write his own executable.)
For example, I really have little experience, but in Python you can do multiprocessing by simply calling functions of the multiprocessing module and then running your script like any other. Of course, MPI provides more than Python's multiprocessing, but if MPI for example has to start a background service, then I still don't understand why it can't do so automatically upon calls of MPI functions in the source.
For another possibly stupid example, CUDA programs do not require cudarun. And for a good reason, since if they did, and if you used both CUDA and MPI in parts of your program, you would now have to execute cudarun mpirun ./foo (or possibly mpirun cudarun ./foo) and if every package worked like this, you would soon have to have a degree in computer science to simply execute a program.
Maybe none of this is super important, as you can simply ship each of your MPI executables with a corresponding wrapper script, but it is kind of annoying, and I would still be interested in why this design choice was made.
You can spin up processes however you like; you'll just need some channel to send port information between processes (a command-line argument works). I've had to spin up processes manually, but it's far easier and less painful to use a preconstructed communicator. Still, if you have a good reason, you can do it.
I have a question into which I edited a minimal complete example. The key calls are MPI_Open_port, MPI_Comm_accept, MPI_Comm_connect, and MPI_Intercomm_merge. You have to merge the connecting processes one at a time. If you want to pursue this, be sure you have a good idea of the difference between an inter- and an intracommunicator. Here's the example for you:
Trying to start another process and join it via MPI but getting access violation
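For reference, a minimal sketch in C of how those four calls fit together. The port-passing mechanism (printing the port string and handing it to the client on the command line) and the single-client layout are assumptions, and error handling is omitted:

#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Comm inter, merged;
    char port_name[MPI_MAX_PORT_NAME];

    if (argc > 1 && strcmp(argv[1], "server") == 0) {
        /* Server: publish a port and wait for one client. */
        MPI_Open_port(MPI_INFO_NULL, port_name);
        printf("connect to: %s\n", port_name);  /* hand this string to the client */
        MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        /* high = 0: the server's ranks come first in the merged intracommunicator */
        MPI_Intercomm_merge(inter, 0, &merged);
        MPI_Close_port(port_name);
    } else {
        /* Client: connect using the port string given on the command line. */
        strncpy(port_name, argv[1], MPI_MAX_PORT_NAME);
        MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Intercomm_merge(inter, 1, &merged);
    }

    /* 'merged' is now an ordinary intracommunicator spanning both sides. */
    int rank, size;
    MPI_Comm_rank(merged, &rank);
    MPI_Comm_size(merged, &size);
    printf("rank %d of %d in merged communicator\n", rank, size);

    MPI_Comm_free(&merged);
    MPI_Comm_free(&inter);
    MPI_Finalize();
    return 0;
}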
I have some questions about performing file I/O using MPI.
A set of files is distributed across different processes.
I want each process to be able to read the files held by the other processes.
For example, in one-sided communication, each process exposes a window visible to the other processes. I need exactly the same functionality: create 'windows' for all files and share them so that any process can read any file from any offset.
Is this possible in MPI? I read a lot of documentation about MPI, but couldn't find anything covering exactly this.
The simple answer is that you can't do that automatically with MPI.
You can convince yourself by noting that MPI_File_open() is a collective call taking an intra-communicator as its first argument and returning a file handle to the opened file as its last argument. All processes in this communicator open the file, and therefore all of them must see the file. So unless a process sees a file, it cannot get an MPI_File handle to access it.
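For illustration, the collective call in question looks like this in use (a minimal sketch; the file name is arbitrary):

/* Every rank in the communicator takes part in the open, so every rank
 * must be able to see "data.bin" on its own file system. */
MPI_File fh;
MPI_File_open(MPI_COMM_WORLD, "data.bin",
              MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);
/* ... collective or independent reads ... */
MPI_File_close(&fh);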
Now, that doesn't mean there's no solution. A possibility could be to do by hand exactly what you described, namely (see the sketch after this list):
Each MPI process individually opens the file it sees and is responsible for; then
Each of these processes reads its local file into a buffer;
These individual buffers are all exposed, using either one global MPI_Win memory window or several individual ones, ready for one-sided read access; and finally
All read accesses to data previously stored in these individual local files are done through MPI_Get() calls using the memory window(s).
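Here is a minimal sketch of those four steps in C; the per-rank file naming is an assumption, and error handling is omitted:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Steps 1-2: each rank reads its own local file (named file_<rank>.dat
     * here, purely for illustration) into a buffer. */
    char fname[64];
    snprintf(fname, sizeof fname, "file_%d.dat", rank);
    FILE *f = fopen(fname, "rb");
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    rewind(f);
    char *buf = malloc(size);
    fread(buf, 1, size, f);
    fclose(f);

    /* Step 3: expose the buffer in a memory window. */
    MPI_Win win;
    MPI_Win_create(buf, (MPI_Aint)size, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Step 4: any rank can now read from any file; e.g. fetch 128 bytes
     * from offset 0 of the file held by rank 0 (assuming it is that large). */
    char chunk[128];
    MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
    MPI_Get(chunk, sizeof chunk, MPI_BYTE, 0, 0, sizeof chunk, MPI_BYTE, win);
    MPI_Win_unlock(0, win);

    MPI_Win_free(&win);
    free(buf);
    MPI_Finalize();
    return 0;
}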
The true limitation of this approach is that it requires reading all of the individual files fully into memory, so you need sufficient memory per node to store each of them. I'm well aware that this is a very big caveat that could make the solution completely impractical. However, if the memory is sufficient, it is an easy approach.
Another, even simpler solution would be to store the files on a shared file system, or to copy them all onto every local file system. I imagine this isn't an option, since otherwise the question wouldn't have been asked...
Finally, as a last resort, a possibility I see would be to dedicate one MPI process (or one OpenMP thread of an MPI process) per node to serve the files. This process would act as a "file server", answering "read" requests coming from the other MPI processes, serving them by reading the requested data from the file and sending it back via MPI. It's a bit lengthy to write, but it should work.
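A rough sketch in C of what such a server loop could look like; the request format, tags, and shutdown convention are all invented here:

#include <mpi.h>
#include <stdio.h>

#define TAG_REQ  1
#define TAG_DATA 2

/* A request: byte offset into the file, and number of bytes wanted.
 * A negative length is used here as the shutdown signal. */
struct request { long offset; long length; };

void serve_file(const char *path)
{
    FILE *f = fopen(path, "rb");
    char buf[1 << 16];
    for (;;) {
        struct request req;
        MPI_Status st;
        /* Note: sending the struct as raw bytes assumes a homogeneous cluster. */
        MPI_Recv(&req, sizeof req, MPI_BYTE, MPI_ANY_SOURCE, TAG_REQ,
                 MPI_COMM_WORLD, &st);
        if (req.length < 0)
            break;                               /* shutdown request */
        if (req.length > (long)sizeof buf)
            req.length = sizeof buf;             /* clamp to buffer size */
        fseek(f, req.offset, SEEK_SET);
        int n = (int)fread(buf, 1, (size_t)req.length, f);
        MPI_Send(buf, n, MPI_BYTE, st.MPI_SOURCE, TAG_DATA, MPI_COMM_WORLD);
    }
    fclose(f);
}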
I have a Go program which cannot be rewritten in Common Lisp for efficiency reasons. How can I run it via Common Lisp?
Options so far:
1. CFFI
Using the foreign function interface seems to me like the "correct" way to do this. However, the research I did led directly to a dead end. If this is the winner, what resources are there for learning how to interface with Go?
2. Sockets
Leaving the Go program running all the time while listening on a port would work. If this is the best way, I'll continue trying to make it work.
3. Execute System Command
This seems all kinds of wrong.
4. Unknown
Or is there an awesome way I haven't thought of yet?
It depends on what you want to do, but 1-3 are all viable options.
1. CFFI
To get this to work you will need to use FFI on both the Go and Lisp sides.
You will need to export the appropriate functions from Go as C functions, and then call them using CFFI from Lisp. See https://golang.org/cmd/cgo/#hdr-C_references_to_Go for how to export functions in Go. In this case you would build a dynamically linkable library (a .dll or .so file) rather than an executable.
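As a hedged illustration of what that looks like from the C side (the library and function names here are invented): building with go build -buildmode=c-shared produces both the shared library and a generated C header, and anything C can call through that header, CFFI can bind to as well.

/* Hypothetical C caller for a Go library built with:
 *   go build -buildmode=c-shared -o libhello.so hello.go
 * where hello.go contains a function preceded by an "//export Add"
 * comment. cgo generates libhello.h next to the library, declaring
 * the exported function with C linkage. */
#include <stdio.h>
#include "libhello.h"   /* generated by cgo */

int main(void)
{
    /* Calls straight into the Go runtime, which is initialized when
     * the library is loaded. */
    printf("%lld\n", (long long)Add(2, 3));
    return 0;
}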
2. Sockets (IPC)
The second option is to run your Go program as a daemon and use some form of IPC (such as sockets) to communicate between Lisp and Go. This works well if your program is long-running, or if it makes sense to have a server and one or more clients (the server could just as easily be the Lisp code as the Go code). Sockets in particular are also more flexible: you could write components in other languages, or change the language of one component without touching the others, as long as you maintain the same protocol. You could even run the components on separate hardware. However, using sockets may hurt performance.
There are other IPC methods available, such as FIFO files (named pipes), SHM, and message queues, but they are more system dependent than sockets.
3. System command (subprocess)
The third way is to start a subprocess. This is a viable option, but it has some caveats. First of all, the behavior of starting a subprocess depends on both the Lisp implementation and the operating system. UIOP smooths out a lot of the implementation differences, but some are too great to overcome. In particular, depending on the implementation, you may or may not be able to run a subprocess in parallel. If not, you will have to run a separate command every time you want to communicate with Go, which means waiting for the process to start up every time you need it. You also may or may not be able to send input to the subprocess after starting it.
Another option is to run a command to start a Go process, communicate with it using sockets or some other IPC, and then run a command to stop the process before closing the Lisp program.
Personally, I think that using sockets is the most attractive option, but depending on your needs, one of the other options might be better suited.
CFFI is for using C from Common Lisp. It's an easy way to get new features without too much hassle, since the libraries out there are usually written in C or have a C interface. If you can make a C library from your Go source, then you can do this and use the foreign-function facility from CL.
Sockets (or another two-way communication bus) are good if the Go program is a service that is supposed to provide something, e.g. an application server serving HTTP requests. If you only need to use the Go program once per run of the CL program, this usually isn't the way to go.
A subprocess is best if you can run your application with arguments and get back a result that is used in Common Lisp. It's not good if you are going to invoke the Go program many times, as each invocation has startup overhead (in which case sockets would be best).
The awesome way to do this is to make the whole thing in Common Lisp. If you choose an implementation with a good compiler and write it well, you might get away with having the whole application as a CL image. If you need to speed things up, you can focus on the slow parts and optimize them, or you can use CFFI by writing the optimizations in C. There is even an inline-C facility for SBCL that lets you write C right where you want the optimization in your CL code, so you don't need to put the optimizations in their own file and compile and link them separately.
MPI requires me to deploy the MPI program to each machine. Currently, I put the MPI program on NFS, but this method has two issues: NFS has latency problems, and it is not suitable for large clusters. I know that I could use some Linux shell commands to sync the program to each node, but that seems inconvenient, especially when I change the program frequently. Is there an easy way to do this?
There's nothing wrong with NFS or any other network file system in large clusters; it just means your file server isn't sized for the job. If you replace NFS with anything like ssh, ftp, scripts, or whatever, and change nothing else, I don't think it will make any significant difference. Also, if the loading time is a significant and bothersome component of the overall runtime, why use MPI in the first place?
OK, enough playing devil's advocate. One thing you can do is have nodes load your program onto other nodes in a binary-tree arrangement. You'll need a script that copies the executable to two other nodes along with a copy of the script, starts that script running asynchronously on those nodes, and then runs the executable locally. The result is a chain reaction of copying and running spreading across the network. The only difficult bit is choosing which nodes to copy to so that each one is visited exactly once. Done right, it will be a lot faster.
Depending on the nature of the application and the nature of the NFS network, using a shared file system for both the MPI implementation and the application "should" be able to scale with reasonable performance, to a point. Keep in mind that there is some NFS caching at the node level, so multiple ranks on the same node will not each have to traverse the network to reach the files.
In general terms, I tend to advise that NFS be discontinued at about 128 nodes or 1024 ranks in favor of local installations. That advice changes if the NFS is delivered with 10GigE, IPoIB, or if a high performance file system like SFS or GPFS is used.
If you are committed to local installations, then tools like rsync or scp are good candidates for distributing the bits. Script the final result. You can even tar to the shared file system and use a remote command (e.g. ssh, clush) to untar to local disk. The "solution" only needs to be robust, not polished or elegant.
I'll also chime in to say that NFS should be just fine in this use case, unless you have a cluster of over 100-200 nodes.
If you just want a lightweight tool for many-node parallel operations, I'd suggest pdsh. pdsh is a very common tool on HPC clusters. It includes a command called pdcp for doing parallel node copies, e.g.:
pdcp -w node[00-99] myfile /path/to/destination/myfile
Where the nodenames are node00, node01, ... node99.
Similarly, you use the pdsh command to run a command in parallel across all the nodes, e.g.:
pdsh -w node[00-99] /path/to/my/executable
Alternatively, if you're looking for something a little less ad-hoc for doing these operations, I can recommend Ansible as an easy and lightweight configuration management and deployment tool. It's not as simple to get started as pdsh, but might be more manageable in the long run...
For example, a simple Ansible playbook to copy a tarball to all nodes, extract it, and then execute a binary might look like:
---
- hosts: computenodes
  user: myname
  vars:
    num_procs: 32
  tasks:
    - name: copy and extract tarball to deployment location
      action: unarchive src=myapp.tar.gz dest=/path/to/deploy/
    - name: execute app
      action: command mpirun -np {{num_procs}} /path/to/deploy/myapp.exe
I manage Unix systems where programs like CGI scripts sometimes run forever, eating a lot of CPU time and wasting resources.
I want a program (typically invoked from cron) which can kill these runaways, based on the following criteria (combined with AND and OR):
Name (given by a regexp)
CPU time used
Elapsed time (for programs that are blocked on I/O)
I do not really know what to type into a search engine for this sort of program. I could certainly write it myself in Python, but I'm lazy, and maybe a good program already exists?
(I did not tag my question with a language name since a program in Perl or Ruby or whatever would work as well)
Try using system-level quota enforcement instead. Most systems will allow you to set a per-process CPU time limit for different users.
Examples:
Linux: /etc/security/limits.conf
FreeBSD: /etc/login.conf
CGI scripts can usually be run under their own user ID, for example using mod_suid for Apache.
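If you want the same limit applied programmatically rather than through login configuration, the underlying mechanism is setrlimit(2). A minimal C sketch of a wrapper follows; the 60/70-second values are arbitrary examples, and note that this limits CPU time, not elapsed wall-clock time:

/* Minimal sketch: impose a CPU-time limit on the current process (and
 * its children, which inherit it) before exec'ing the wrapped program.
 * The kernel sends SIGXCPU at the soft limit and SIGKILL at the hard
 * limit. */
#include <sys/resource.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct rlimit rl = { .rlim_cur = 60, .rlim_max = 70 };  /* seconds of CPU */
    if (setrlimit(RLIMIT_CPU, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }
    execvp(argv[1], &argv[1]);   /* run the wrapped program, e.g. the CGI */
    perror("execvp");
    return 1;
}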
This might be something more like what you were looking for:
http://devel.ringlet.net/sysutils/timelimit/
Most of the watchdog-like programs or libraries just check whether a given process is running, so I'd say you'd be better off writing your own, using the existing libraries that expose process information.