Hydra parameter sweeps in parallel (e.g., Ray Tune) - fb-hydra

I've seen the recently added ray_launcher plugin and the Ax plugin.
I was wondering: is there a way to launch parameter sweeps in parallel using the currently supported plugins? Or should we wait for a "ray_tune" plugin to be able to do so?
Thanks a lot in advance!

You can mix Launchers and Sweepers, and the jobs should run in parallel if the underlying Launcher supports it (which is the case for the Ray Launcher).
Keep in mind that the Sweeper itself decides how many jobs to run in each iteration; different Sweepers employ different strategies here.
If you are convinced things are not running in parallel when you are using the Ax Sweeper with the Ray Launcher, please open an issue in the Hydra repo.

Related

How to get and change the values of the projector lens system?

I am trying to write a Gatan DigitalMicrograph script to control the tilting of the incident electron beam before and after a specimen. I think that the values of the pre-specimen lens system can be read and changed by using commands such as EMGetBeamTilt, EMSetBeamTilt and EMChangeBeamTilt. However, I don't know how to read or control the status of the post-specimen lens system, such as a projector lens. What command or code should be written in order to control the projector lens system?
I would appreciate it if you could share some wisdom. Thank you very much in advance.
Unfortunately, only a limited number of microscope hardware components can be accessed by DM-script via a generalized interface. The generalized commands communicate to the microscope via a software interface which is implemented by the microscope vendor, so that the exact behaviour of each command (i.e. which lenses are driven when a value is changed) lies completely within the control of the microscope software and not DM. Commands to access specific lenses or microscope-specific controls are most often not available.
The available commands, while they can often be found in earlier versions as well, have been officially supported and documented since GMS 2.3. You will find the complete list of commands in the F1 help documentation (on online systems).

Crash around pthreads while integrating SQLite into RTP application on VxWorks

I am trying to integrate the SQLite library into an RTP application on VxWorks. I built SQLite and linked against it statically. I run a simple test that works well on other systems. The test is a really primitive one: sqlite3_open(), sqlite3_exec(), sqlite3_close(). The parameters are correct (it works on other systems).
I experience a SIGSEGV (signal code 11). I traced down to the point of the crash with printf()s and discovered that it crashes after the pthread_mutex_lock() call. What is interesting is that it returns from the function call and then crashes. I checked the stack size (by adding a taskDelay() before the actual crash); the stack is big enough and far from its limit.
I tried building SQLite both with and without SQLITE_HOMEGROWN_RECURSIVE_MUTEX, and I always build with SQLITE_THREADSAFE=1.
If someone has experienced something like that and managed to fix it - please let me know.
Here are a few details, just to outline the setup:
VxWorks version: 6.8
SQLite sources: 3.7.16.1
Development environment: Wind River
CPU Architecture: PowerPC
Thanks in advance
I have found it. I had no pthreads in my VxWorks OS. Now it works.
The strange thing is that there is no way to verify that while building an application against the pthreads library.
I understand there is no easy way to do that, but I would at least expect some kind of "stub" function rather than a SIGSEGV. Or am I asking too much for that kind of money?
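For anyone hitting the same thing: there is no compile-time check, but a small runtime self-test run before any SQLite call can at least turn the silent SIGSEGV into a readable error in many cases. Below is a minimal sketch (not from the original thread; pthread_selftest is a made-up helper name, and it assumes a C/C++ RTP build with SQLITE_THREADSAFE=1 and <pthread.h>/<sqlite3.h> on the include path):

```cpp
/* Hypothetical sanity check: exercise the pthread primitives SQLite
 * relies on before touching the database, so missing POSIX thread
 * support in the VxWorks image shows up as an error message instead
 * of a crash deep inside sqlite3_open(). */
#include <stdio.h>
#include <pthread.h>
#include <sqlite3.h>

static void *noop(void *arg) { return arg; }

int pthread_selftest(void)
{
    pthread_mutex_t m;
    pthread_t t;

    /* Mutex round trip: init, lock, unlock, destroy. */
    if (pthread_mutex_init(&m, NULL) != 0 ||
        pthread_mutex_lock(&m)       != 0 ||
        pthread_mutex_unlock(&m)     != 0) {
        fprintf(stderr, "pthread mutexes are not functional\n");
        return -1;
    }
    pthread_mutex_destroy(&m);

    /* Spawn and join one trivial thread. */
    if (pthread_create(&t, NULL, noop, NULL) != 0) {
        fprintf(stderr, "pthread_create failed - is POSIX thread "
                        "support configured into the VxWorks image?\n");
        return -1;
    }
    pthread_join(t, NULL);

    /* Only now open a throwaway in-memory database. */
    sqlite3 *db = NULL;
    if (sqlite3_open(":memory:", &db) != SQLITE_OK) {
        fprintf(stderr, "sqlite3_open: %s\n", sqlite3_errmsg(db));
        sqlite3_close(db);
        return -1;
    }
    sqlite3_close(db);
    return 0;
}
```

Of course, if the pthread calls themselves fault rather than returning an error code, even this check can do no better than crash a little earlier, before any SQLite state is involved.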

How to get all running processes in Qt

I have two questions:
Is there any API in Qt for getting all the processes that are running right now?
Given the name of a process, can I check if there is such a process currently running?
Process APIs are notoriously platform-dependent. Qt provides just the bare minimum for spawning new processes with QProcess. Interacting with any processes on the system (that you didn't start) is out of its depth.
It's also beyond the reach of things like Boost.Process. Well, at least for now. Note their comment:
Boost.Process' long-term goal is to provide a portable abstraction layer over the operating system that allows the programmer to manage any running process, not only those spawned by it. Due to the complexity in offering such an interface, the library currently focuses on child process management alone.
I'm unaware of any good C++ library for cross-platform arbitrary process listing and management. You kind of have to just pick the platforms you want to support and call their APIs. (Or call out to an external utility of some kind that will give you back the info you need.)
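To illustrate that last option, here is a rough sketch (not a complete or robust solution) of shelling out to a platform utility with QProcess and scraping its output. listProcessNames and isProcessRunning are made-up helper names, and it assumes Qt 5 on either Windows (tasklist available) or a Unix-like system (ps available):

```cpp
#include <QProcess>
#include <QString>
#include <QStringList>

// Hypothetical helper: list process names by running a platform tool.
QStringList listProcessNames()
{
    QProcess proc;
#ifdef Q_OS_WIN
    // 'tasklist /FO CSV /NH' prints one quoted CSV row per process.
    proc.start("tasklist", QStringList() << "/FO" << "CSV" << "/NH");
#else
    // 'ps -e -o comm=' prints one executable name per line, no header.
    proc.start("ps", QStringList() << "-e" << "-o" << "comm=");
#endif
    proc.waitForFinished();
    const QString out = QString::fromLocal8Bit(proc.readAllStandardOutput());

    QStringList names;
    for (const QString &line : out.split('\n', QString::SkipEmptyParts)) {
#ifdef Q_OS_WIN
        // First CSV field is the image name, wrapped in quotes.
        names << line.section(',', 0, 0).remove('"');
#else
        names << line.trimmed();
#endif
    }
    return names;
}

// Hypothetical helper: answer question 2 by scanning the list.
bool isProcessRunning(const QString &name)
{
    return listProcessNames().contains(name, Qt::CaseInsensitive);
}
```

Note that on Windows the names come back as image names such as notepad.exe, so match accordingly. For anything serious you would still want the native APIs (e.g. CreateToolhelp32Snapshot or EnumProcesses on Windows, /proc on Linux).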

Interworking MPI between Windows and Linux

I have several Windows boxes and Linux boxes interconnected with InfiniBand, and I need to run MPI jobs in both environments. Does anyone know the best way to interwork them?
Currently, I am considering using the beta release of the Windows binaries of Open MPI. Maybe I need to add additional pieces to my HPC software stack? Or should I just forget about MPI and code directly at a lower layer to get the Windows part working, since there are few jobs that need Windows anyway.
Any idea is appreciated. Many thanks!
So I dug through some forums and found that Open MPI currently does not support interworking of task spawning between Windows and Linux systems; however, MPICH2 seems to be capable of it. For sending and receiving messages using MPI_Send, I will need to investigate more.
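Not from the original thread, but when investigating that last point, a minimal ping-pong program is a quick way to check whether plain point-to-point messaging works between a Windows rank and a Linux rank before worrying about task spawning. A sketch in C-style C++ (mpi_ping.cpp is just an assumed file name), launched with at least two ranks and both hosts in the host list:

```cpp
/* Rank 0 sends an integer to rank 1 and waits for it to come back.
 * Run one rank on a Windows host and one on a Linux host to see
 * whether basic MPI_Send/MPI_Recv traffic crosses the two systems. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank = 0, value = 42;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 0 got %d back\n", value);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```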

distributed scheduling system for R scripts

I would like to schedule and distribute the execution of R scripts (using Rserve, for instance) on several machines - Windows or Ubuntu - with each task running on only one machine.
I don't want to reinvent the wheel and would like to use a system that already exists to distribute these tasks in an optimal manner and ideally have a GUI to control the proper execution of the scripts.
1/ Is there an R package or a library that can be used for that?
2/ One framework that seems to be quite widely used is MapReduce with Apache Hadoop.
I have no experience with this framework. What installation/plugin/setup would you advise for my purpose?
Edit: Here are more details about my setup:
I do indeed have an office full of machines (small servers or workstations) that are sometimes also used for other purposes. I want to use the computing power of all these machines and distribute my R scripts across them.
I also need a scheduler, e.g. a tool to run the scripts at a fixed time or at regular intervals.
I am using both Windows and Ubuntu, but a good solution on one of the systems would be sufficient for now.
Finally, I don't need the server to get back the results of the scripts. The scripts do things like accessing a database, saving files, etc., but do not return anything. I would just like to get back the errors/warnings, if there are any.
If what you are wanting to do is distribute jobs for parallel execution on machines you have physical access to, I HIGHLY recommend the doRedis backend for foreach. You can read the vignette PDF to get more details. The gist is as follows:
Why write a doRedis package? After all, the foreach package already has available many parallel back end packages, including doMC, doSNOW and doMPI. The doRedis package allows for dynamic pools of workers. New workers may be added at any time, even in the middle of running computations. This feature is relevant, for example, to modern cloud computing environments. Users can make an economic decision to "turn on" more computing resources at any time in order to accelerate running computations. Similarly, modern cluster resource allocation systems can dynamically schedule R workers as cluster resources become available.
Hadoop works best if the machines running Hadoop are dedicated to the cluster, and not borrowed. There's also considerable overhead to setting up Hadoop, which can be worth the effort if you need the map/reduce algorithm and distributed storage that Hadoop provides.
So what, exactly is your configuration? Do you have an office full of machines you're wanting to distribute R jobs on? Do you have a dedicated cluster? Is this going to be EC2 or other "cloud" based?
The devil is in the details, so you can get better answers if the details are explicit.
If you want the workers to do jobs and have the results of those jobs collected back on one master node, you'll be much better off using a dedicated R solution rather than a system like TakTuk or dsh, which are more general parallelization tools.
Look into TakTuk and dsh as starting points. You could perhaps roll your own mechanism with pssh or clusterssh, though these may be more effort.
