Memory limit in jupyter notebook - jupyter-notebook

How do I set a maximum memory limit for a jupyter notebook process?
If I use too much RAM, the computer freezes and I have to press the power button to restart it manually.
Is there a way to automatically kill a Jupyter notebook process as soon as a user-set memory limit is exceeded, or to raise a memory error? Thanks.

Under Linux you can use "cgroups" to limit resources for any software running on your computer.
Install cgroup-tools with apt-get install cgroup-tools
Edit its configuration /etc/cgconfig.conf to make a profile for the particular type of work (e.g. numerical scientific computations):
group app/numwork {
    memory {
        memory.limit_in_bytes = 500000000;
    }
}
Apply that configuration to the process names you care about by listing them in /etc/cgrules.conf (in my case it is all julia executables, which I run through jupyter, but you can use it for any other software too):
*:julia memory app/numwork/
Finally, parse the config and set it as the current active config with the following commands:
~# cgconfigparser -l /etc/cgconfig.conf
~# cgrulesengd
I use this to set limits on processes running on a server that is used by my whole class of students.
This page has more details and other ways to use cgroups: https://wiki.archlinux.org/index.php/cgroups
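To check that a rule has taken effect, one option is to read /proc/<pid>/cgroup for the running process. The following is a minimal Python sketch, assuming a cgroup v1 layout (the one that memory.limit_in_bytes belongs to); parse_cgroup_file and cgroups_of are hypothetical helper names, not part of cgroup-tools.

```python
from pathlib import Path


def parse_cgroup_file(text):
    """Parse the content of /proc/<pid>/cgroup into {controller-list: path}."""
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        _hierarchy_id, controllers, path = line.split(":", 2)
        membership[controllers] = path
    return membership


def cgroups_of(pid="self"):
    """Return the cgroup membership of a process (Linux only)."""
    return parse_cgroup_file(Path(f"/proc/{pid}/cgroup").read_text())
```

For a julia process matched by the rule above, cgroups_of(pid).get("memory") would be expected to return "/app/numwork".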

Related

select a specific GPU to use in jupyter notebook

I'm coding in an .ipynb file on a Linux server.
The server has multiple GPUs, but I should only use an idle GPU so as not to accidentally abort other people's programs.
I already know that for an ordinary .py file we can set an environment variable on the command line to choose a GPU (e.g. export CUDA_VISIBLE_DEVICES=#), but will that work for a Jupyter notebook? If not, how can I specify which GPU to use?
You have to choose its name correctly. For instance, there may be three GPU devices available, named "cuda:0", "cuda:1", and "cuda:2". To choose the third one, run the following code:
import torch

# Use the third GPU if CUDA is available, otherwise fall back to the CPU.
if torch.cuda.is_available():
    dev = "cuda:2"
else:
    dev = "cpu"
device = torch.device(dev)
With TensorFlow, restrict the visible devices as below:
import tensorflow as tf

# Make only the first physical GPU visible to TensorFlow.
gpus = tf.config.list_physical_devices('GPU')
if len(gpus) > 0:
    tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
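The CUDA_VISIBLE_DEVICES approach mentioned in the question can also be applied from inside a notebook, by setting the variable in the first cell before torch or tensorflow is imported. A minimal sketch, assuming GPU index 2 is the idle one; select_gpu is a hypothetical helper name, not part of any library.

```python
import os


def select_gpu(index):
    """Expose a single GPU to this process; call before importing torch/tensorflow."""
    # Number devices by PCI bus ID so the index matches what nvidia-smi reports.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Assumption for illustration: GPU 2 is the idle one.
select_gpu(2)
```

After this, the selected GPU is the only one the process can see, so inside PyTorch it appears as "cuda:0" rather than "cuda:2".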

Cell run to completion but kernel is still running

I have a Jupyter notebook that does some data extraction. After the cell has finished executing (no * next to the cell) and I have the extraction results, the kernel still shows as running (and the CPU monitor shows ipykernel at 100%).
What could cause this, and how can I find out what is driving ipykernel to 100% CPU while no cell is running?
This is normally caused by an extension. Check whether you have an extension for monitoring or watching variables, and if so, disable it.
Alternatively, you can try JupyterLab, which has resolved many of Jupyter Notebook's issues with extensions.
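One way to see what the kernel process is doing while no cell appears to be running is to dump the stack of every thread from a notebook cell. This is a minimal sketch using only the standard library; snapshot_thread_stacks is a hypothetical helper name, and it only shows Python-level threads inside the kernel process, so it will not directly reveal work done by a front-end extension.

```python
import sys
import threading
import traceback


def snapshot_thread_stacks():
    """Return {thread name: formatted stack trace} for every thread in this process."""
    names = {t.ident: t.name for t in threading.enumerate()}
    stacks = {}
    for ident, frame in sys._current_frames().items():
        name = names.get(ident, f"unknown-{ident}")
        stacks[name] = "".join(traceback.format_stack(frame))
    return stacks


# Print one block per thread; a busy background thread shows where it is spinning.
for name, stack in snapshot_thread_stacks().items():
    print(f"--- {name} ---")
    print(stack)
```

Running the cell a few times and comparing the output can show whether the same thread is repeatedly busy in the same place.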

How to install xv6 on virtualbox or vmware?

I'm trying to run the xv6 operating system on VirtualBox or VMware on a Linux host. The official instructions only explain how to run the OS on QEMU. However, the official page (https://pdos.csail.mit.edu/6.828/2014/xv6.html) mentions that xv6 can also be booted directly on hardware, but it's not clear how.
I want to boot xv6 on VirtualBox or VMware first. I extracted the following command from the Makefile; it runs xv6 from the command line after it has been compiled with make.
/usr/bin/qemu-system-i386 -serial mon:stdio -drive file=fs.img,index=1,media=disk,format=raw -drive file=xv6.img,index=0,media=disk,format=raw -smp 2 -m 512
How should I proceed? If the procedure is already documented, a reference would be helpful.
The instructions are here, linked (via the 6.828 tools page) from your link, though they are a bit terse:
Using a Virtual Machine
Otherwise, the easiest way to get a compatible toolchain is to install
a modern Linux distribution on your computer. With platform
virtualization, Linux can cohabitate with your normal computing
environment. Installing a Linux virtual machine is a two step process.
First, you download the virtualization platform.
VirtualBox (free for Mac, Linux, Windows) — Download page
VMware Player (free for Linux and Windows, registration required)
VMware Fusion (Downloadable from IS&T for free).
VirtualBox is a little slower and less flexible, but free!
Once the virtualization platform is installed, download a boot disk
image for the Linux distribution of your choice.
Ubuntu Desktop is what we use.
This will download a file named something like
ubuntu-10.04.1-desktop-i386.iso. Start up your virtualization platform
and create a new (32-bit) virtual machine. Use the downloaded Ubuntu
image as a boot disk; the procedure differs among VMs but is pretty
simple. Type objdump -i, as above, to verify that your toolchain is
now set up. You will do your work inside the VM.
I can see how one could read that and not see the answer.
After the virtual machine is installed, download the Ubuntu Desktop .iso. Install that into the VM and fire it up. Presumably the Desktop will provide a clear mechanism for loading your OS. (Wait, I'm giving it a try. Will update with the result.)
It turns out that is simply an Ubuntu client desktop, and isn't anything special for running a sub-operating system.
Looking around some more, I found the commentary to be the best potential clue. It contains this (head scratcher) phrase:
To run xv6, install the QEMU PC simulators. To run in QEMU, run "make qemu".
If only it specified the context to get to that point! (Sorry I am not more help.)
I see that you want to boot it on VirtualBox or VMware, but another option would be to use Docker to run xv6. A great guide for getting started with xv6 through Docker is here.
The full guide is detailed and can help you get started.
It is an alternative option, but one that can hopefully get you going fast.
It only takes four steps to get going with xv6:
Step 1
- Download and set up Docker here
Step 2
- Run this command in PowerShell or bash to pull the Ubuntu image with xv6:
docker pull grantbot/xv6
Step 3
- To run the Docker image and get going with xv6, run this command:
docker run -it grantbot/xv6
Step 4
- Now, inside the shell in the Ubuntu image, run cd /home/a/xv6-public/ to enter the root folder of xv6.
Done
- Now you can compile and run xv6 with make qemu-nox
Step 1. Compile xv6
Download the code, unzip it, enter the directory, and compile the operating system image and the root file system image with the following command:
make xv6.img && make fs.img
Step 2. Write image to disk
Create two disks in an existing VMware virtual machine (my VMware version is 15.2.2 and the Linux guest is CentOS 7.8). The steps are: virtual machine settings -> add -> disk -> SCSI -> create a new virtual disk -> size 0.005 (allocate immediately, single file) -> name the disk "os", meaning that this disk holds the operating system.
Create another disk named "fs" in the same way to hold the root file system.
At this point there should be "sdb" and "sdc" in the /dev/ directory (sda is the current operating system itself). If you do not see "sdb" and "sdc", restart the guest operating system.
Write the operating system and root file system to the disk with the following command:
dd if=./xv6.img of=/dev/sdb bs=4k count=1000
dd if=./fs.img of=/dev/sdc bs=4k count=1000
Shut down the current virtual machine to ensure that the data has been synced to the disks. At this point the two images have been written to the disks. VMware stores each disk as a file in the directory of the current virtual machine, named os.vmdk and fs.vmdk; the next step attaches these two files to the new virtual machine.
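Because the dd commands above pass bs=4k count=1000, each of them copies at most 4096 * 1000 = 4,096,000 bytes. A quick sanity check on the image sizes can be sketched in Python, assuming it is run from the xv6 build directory; dd_copy_limit and fits_in_dd_copy are hypothetical helper names.

```python
import os


def dd_copy_limit(block_size, count):
    """Maximum number of bytes dd copies for the given bs and count."""
    return block_size * count


def fits_in_dd_copy(image_path, block_size=4096, count=1000):
    """True if the whole image is covered by dd with these bs/count values."""
    return os.path.getsize(image_path) <= dd_copy_limit(block_size, count)


# Example (run in the xv6 build directory):
# print(fits_in_dd_copy("fs.img"), fits_in_dd_copy("xv6.img"))
```

A False result only means that the tail of the image is not copied; whether that matters depends on whether the tail holds data or just padding, and raising count avoids the question.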
Step 3. Create xv6 virtual machine
Create an empty virtual machine. The steps are: customize (advanced) -> next -> install the operating system later -> choose "other" as the operating system type (and "other" as the version) -> name the virtual machine xv6 (the name is up to you) -> then accept the default configuration with "Next" all the way to completion.
Right-click the created virtual machine and delete the disk created by default. Then add the disk files created in the previous step to this virtual machine. The steps are: add -> "disk" -> IDE (note that this must be an IDE disk rather than a SCSI disk, because xv6 reads from an IDE disk) -> use an existing virtual disk -> select the os.vmdk generated in Step 2 -> complete.
Add fs.vmdk in the same way. Note that you must add os.vmdk first: because os.vmdk holds the operating system, it needs to be the first hard disk.
You now have a virtual machine with two disks: one is the OS disk and the other is the root file system disk. Everything is ready.
Start the virtual machine, and xv6 will boot.

Kill Jupyter Notebooks from within the notebooks

I am looking for a way to automatically kill the notebook's kernel to free up memory after some cells have finished running.
Is there a magic command that I can use from within the notebook cells to do that?
Use exit(); the kernel will restart automatically, but all the resources held by the old kernel will be released.

When using mpirun with R script, should I copy manually file/script on clusters?

I'm trying to understand how Open MPI's mpirun handles a script file associated with an external program, here an R process (doMPI/Rmpi).
I can't imagine that I have to copy my script to each host before running something like:
mpirun --prefix /home/randy/openmpi -H clust1,clust2 -n 32 R --slave -f file.R
But apparently it doesn't work until I copy the script 'file.R' to the cluster nodes and then run mpirun. And when I do this, the results are written on the cluster nodes, whereas I expected them to be returned to the working directory on localhost.
Is there another way to send an R job from localhost to multiple hosts, including the script to be evaluated?
Thanks!
I don't think it's surprising that mpirun doesn't know details of how scripts are specified to commands such as "R", but the Open MPI version of mpirun does include the --preload-files option to help in such situations:
--preload-files <files>
Preload the comma separated list of files to the current working
directory of the remote machines where processes will be
launched prior to starting those processes.
Unfortunately, I couldn't get it to work. That may be because I misunderstood something, but I suspect the option isn't well tested: very few people use it, since it is quite painful to do parallel computing without a distributed file system.
If --preload-files doesn't work for you either, I suggest that you write a little script that calls scp repeatedly to copy the script to the cluster nodes. There are some utilities that do that, but none seem to be very common or popular, which I again think is because most people prefer to use a distributed file system. Another option is to set up an sshfs file system.
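The "little script that calls scp repeatedly" could look like the following Python sketch. The host names come from the mpirun command in the question; the remote directory is a placeholder assumption, and build_scp_commands and copy_to_hosts are hypothetical helper names. It assumes scp is on the PATH and that key-based SSH login to each host already works.

```python
import subprocess


def build_scp_commands(local_path, hosts, remote_dir):
    """Build one scp argument list per host, copying local_path into remote_dir."""
    return [["scp", local_path, f"{host}:{remote_dir}/"] for host in hosts]


def copy_to_hosts(local_path, hosts, remote_dir):
    """Copy the file to every host in turn, stopping at the first failure."""
    for cmd in build_scp_commands(local_path, hosts, remote_dir):
        subprocess.run(cmd, check=True)


# Example (the remote directory is a placeholder):
# copy_to_hosts("file.R", ["clust1", "clust2"], "/path/to/workdir")
```

The same loop can be run in the other direction afterwards to fetch result files back to localhost, by swapping the source and destination arguments of scp.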
