Julia - is ulimit working with a Julia session?

I searched and there doesn't seem to be a built-in Julia way to limit the RAM used, so I looked for a Linux solution instead.
According to this question, I can limit the RAM used by my command to 64 GB with:
ulimit -v 64000000
I am wondering if I do:
$ ulimit -v 64000000
$ julia
julia>
Am I doing everything right, i.e., will everything I do in that Julia session, such as launching a JuMP model, be limited to 64 GB of RAM?

Since ulimit sets resource limits for the shell in which it is run and for every process launched from that shell, I can find no reason why it should not also apply to a Julia process started from it.
The one exception that comes to mind is if you are running Julia processes on multiple nodes of a Linux cluster, e.g. with the Distributed.jl stdlib, but this is a rather niche case.
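A minimal sketch of the whole sequence, assuming a Linux shell (note that ulimit -v takes its value in kilobytes):
ulimit -v 64000000                            # ~64 GB; the value is in kilobytes
grep "Max address space" /proc/self/limits    # optional check: child processes inherit this limit
julia                                         # this session, including any JuMP model in it, is capped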

Related

Run a Hydra-configured project with SLURM and Horovod

Right now, I am using Horovod to run distributed training of my PyTorch models. I would like to start using Hydra config for the --multirun feature and enqueue all jobs with SLURM. I know there is the Submitit plugin, but I am not sure how the whole pipeline would work with Horovod. Right now, my command for training looks as follows:
CUDA_VISIBLE_DEVICES=2,3 horovodrun -np 2 python training_script.py \
--batch_size 30 \
...
Say I want to use Hydra's --multirun to run several multi-GPU experiments. I want to enqueue the runs with SLURM, since my resources are limited and the jobs would mostly have to run sequentially, and I want to use Horovod to synchronize the gradients of my networks. Would this setup work out of the box? Would I still need to specify CUDA_VISIBLE_DEVICES if SLURM took care of the resources? How would I need to adjust my run command or other settings to make this setup work? I am especially interested in how the multirun feature handles GPU resources. Any recommendations are welcome.
The Submitit plugin does support GPU allocation, but I am not familiar with Horovod and have no idea if this can work in conjunction with it.
One new feature of Hydra 1.0 is the ability to set or copy environment variables from the launching process.
This might come in handy in case Horovod is trying to set some environment variables. See the docs for info about it.
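As a rough sketch of how that could be combined with the Submitit SLURM launcher (the option names hydra/launcher=submitit_slurm and hydra.job.env_copy are my assumptions, and I'm assuming the script's arguments have been moved into the Hydra config, so double-check against the docs):
python training_script.py --multirun \
  hydra/launcher=submitit_slurm \
  hydra.job.env_copy="[CUDA_VISIBLE_DEVICES]" \
  batch_size=30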

What is the merit of a terminal multiplexer compared to a standard terminal app and job control?

I don't know what the merit of a terminal multiplexer like screen or tmux is compared to the combination of a standard terminal application and the job-control feature of a shell.
The features typically cited in favor of a terminal multiplexer are:
persistence
multiple windows
session sharing
The first two features, however, can be achieved with a terminal application like iTerm2 and the job-control feature of a shell like bash, as in the sketch below.
Session sharing is a novel feature, but it seems to be needed only in fairly rare situations.
What is the merit of a terminal multiplexer? Why do you use it?
I'm especially interested in its benefits for daily tasks.
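For concreteness, a sketch of the job-control workflow the question has in mind (bash's default bindings and builtins; the command name is a placeholder):
long_running_command   # some long-running foreground job
# Ctrl-Z               # suspend the foreground job
bg                     # let it continue in the background
jobs                   # list background jobs
fg %1                  # bring job 1 back to the foreground
disown -h %1           # keep it running after the terminal closes (but you cannot re-attach to it)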
I can tell you from my perspective as a devops/developer.
Almost every day I have to deploy a bunch of apps (a particular version) on multiple servers. Handling that without something like Terminator or Tmux would be a real pain.
In a single window I can have something like 4 panes (four terminals in one) and monitor things on 4 different servers, which by itself is a huge deal, without tabs or other terminal instances and whatnot.
In the first pane I can shut down nginx, in the second I can shut down all the processes with supervisord (a process manager), and in the third I can run the deploy process; if I quickly need to jump to some other server, I just use the fourth pane.
Some colleagues who only use a bunch of terminal instances can get really confused when they have to do many things quickly, constantly ssh-ing in and out, and if they are not careful they can end up on the wrong server because they switched to the wrong terminal instance and entered a command that wasn't meant for that particular server :).
A terminal multiplexer like Tmux really does help me to be quick and accurate.
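A minimal sketch of that daily workflow, using tmux's default key bindings (prefix Ctrl-b):
tmux new -s deploy       # start a named session
# Ctrl-b %               # split the window into two panes side by side
# Ctrl-b "               # split a pane top/bottom; repeat until you have four panes
# Ctrl-b o               # cycle between panes (each with an ssh session to a different server)
# Ctrl-b d               # detach; everything keeps running
tmux attach -t deploy    # re-attach later and pick up exactly where you left off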
There is a package manager for Tmux, which lets you install plugins and really supercharge your terminal even more!
On a different note, a lot of people use Tmux in combination with Vim, which lets you build some awesome workflows.
All in all, those were my two cents on the benefits of using a terminal multiplexer.

Will a shell script written on Ubuntu always be able to run on RHEL as well?

I want to know: if a shell script is created and tested on Ubuntu, will it always, without fail, also run on RHEL, provided the correct shell is invoked to run the script?
Ways in which the execution may differ when used on different distributions and different kernels:
Differences in the version and configuration of the Linux kernel - this may affect the presence and format of files such as those in /proc and /sys, or the presence of particular device drivers.
Differences in the version of the shell used - /bin/sh may be Bash on one system and Dash on another (see the sketch below), or Bash 3.x on one system and Bash 4.x on the other.
Differences in the installed programs your script invokes (and, if you got your package dependencies wrong, whether those programs are even present - what's "essential" on one distribution may be "optional" on another).
In short, different distributions have the same issues as different versions of one distribution, but more so.
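A small illustration of the /bin/sh difference (hypothetical script; on current Ubuntu /bin/sh is Dash, while on RHEL it is Bash):
#!/bin/sh
# Works when /bin/sh is Bash, but fails when it is Dash:
if [ "$1" == "start" ]; then    # Dash's [ only understands '=', not '=='
    echo "starting"
fi
# Portable form that behaves the same under both shells:
if [ "$1" = "start" ]; then
    echo "starting"
fi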
It depends on which shell/interpreter the script was written for and on the version of that shell. For example, a bash script written for bash 4.4 may not work in bash 2.0, and so on. It's not really about the distribution/kernel version you use but about the shell you use.
So, without details, it's not possible to say whether a script that works on Ubuntu will work on RHEL. If you use the same shell and the same version on both machines, then yes, it's going to work as expected (barring some very odd cases).

Parallel R on a Windows cluster

I've got a Windows HPC Server running with some nodes in the backend. I would like to run parallel R using multiple nodes from the backend. I think parallel R might be using SNOW on Windows, but I'm not too sure about it. My question is: do I need to install R on the backend nodes as well?
Say I want to use two nodes, 32 cores per node:
cl <- makeCluster(c(rep("COMP01",32),rep("COMP02",32)),type="SOCK")
Right now, it just hangs.
What else do I need to do? Do the backend nodes need some kind of sshd running to be able to communicate with each other?
Setting up snow on a Windows cluster is rather difficult. Each of the machines needs to have R and snow installed, but that's the easy part. To start a SOCK cluster, you would need an sshd daemon running on each of the worker machines, but you can still run into trouble, so I wouldn't recommend it unless you're good at debugging and Windows system administration.
I think your best option on a Windows cluster is to use MPI. I don't have any experience with MPI on Windows myself, but I've heard of people having success with the MPICH and DeinoMPI MPI distributions for Windows. Once MPI is installed on your cluster, you also need to install the Rmpi package from source on each of your worker machines. You would then create the cluster object using the makeMPIcluster function. It's a lot of work, but I think it's more likely to eventually work than trying to use a SOCK cluster due to the problems with ssh/sshd on Windows.
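For what it's worth, a hypothetical launch command once MPI and Rmpi are in place might look like this (the script name is made up; the script itself would call makeMPIcluster to spawn the workers):
mpiexec -n 1 Rscript run_parallel_job.R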
If you're desperate to run a parallel job once or twice on a Windows cluster, you could try using manual mode. It allows you to create a SOCK cluster without ssh:
workers <- c(rep("COMP01",32), rep("COMP02",32))
cl <- makeSOCKcluster(workers, manual=TRUE)
The makeSOCKcluster function will prompt you to start each one of the workers, displaying the command to use for each. You have to manually open a command window on the specified machine and execute the specified command. It can be extremely tedious, particularly with many workers, but at least it's not complicated or tricky. It can also be very useful for debugging in combination with the outfile='' option.

Is it possible to pause an R script that is running?

I am running some analysis in R which is going to take at least 24 hours to finish. Is it possible to pause the function midway, so that I can take my computer to work and back?
This is not possible AFAIK, but I believe you can just suspend your computer and the processes will automatically be paused.
If you are using Linux, you can also stop and continue a process manually using killall -STOP R and killall -CONT R commands. Take a look at this article and the comment section there, which contain useful information regarding this.
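For reference, a minimal sketch of that manual pause/resume sequence:
killall -STOP R    # pause all running R processes (sends SIGSTOP)
killall -CONT R    # resume them later (sends SIGCONT)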
On Windows, you can maybe use the Task Manager or install special software that is capable of doing that. But I really do not know as I do not use Windows on a regular basis.
EDIT: even if you use kill or killall to pause the process, you will still lose the data if you shut down the computer.
