Running Multiple Jupyter Notebooks which use GPUs results in decreased performance - jupyter-notebook

I'm using Jupyter notebooks to train neural networks on GPUs. Specifically, I have three notebooks open, each training a different neural network.
I'm seeing a drop in performance: the time to train each network has increased.
When I run nvidia-smi I can see the processes on each GPU and the GPU memory each process is using. The GPU memory is not maxed out on any one of the GPUs.
Is running multiple Jupyter notebooks causing the problem?
If the Jupyter notebooks aren't what's causing the problem, what else could it be, and how might I check?

Related

foreach..dopar runs significantly slower on Docker vs local laptop

I am currently running into a weird issue: a foreach ... %dopar% loop runs roughly 10x faster on my local laptop (Dell, Windows 10, 15 cores) than it does when the same R code is run in a Docker container with 8 cores. The code itself only sets ncores to 3, so I am puzzled by such a drastic difference in runtime. Has anyone here run into a similar issue with the doParallel package in Docker? If so, how did you resolve it?
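As a hedged first check (this is not the original code; the loop body below is just a placeholder), it may be worth confirming how many cores R actually sees inside the container and that the doParallel backend really has 3 workers registered before comparing timings:

library(doParallel)

# How many cores does R detect on the laptop vs. inside the container?
parallel::detectCores()

# Register exactly 3 workers, matching the ncores setting described above
cl <- makeCluster(3)
registerDoParallel(cl)
getDoParWorkers()   # should report 3

# Trivial placeholder loop, timed only to exercise the parallel backend
system.time(
  res <- foreach(i = 1:100, .combine = c) %dopar% sum(rnorm(1e5))
)

stopCluster(cl)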

Kernel dying in Jupyter notebooks with metpy

Running any MetPy function works fine until I try to plot, at which point the kernel repeatedly dies. I think this may be a hardware issue, but I'm curious to hear other explanations.

Notebook instance running R with a GPU

I am new to cloud computing and GCP. I am trying to create a notebook instance running R with a GPU. I got a basic instance with 1 core and 0 GPUs to start and I was able to execute some code which was cool. When I try to create an instance with a GPU I keep getting all sorts of errors about something called live migration, or that there are no resources available, etc. Can someone tell me how to start an R notebook instance with a GPU? It can't be this difficult.
CRAN (the Comprehensive R Archive Network) doesn't itself provide GPU support. However, following this link might help you set up a notebook instance running R with a GPU: you need a machine with the Nvidia GPU drivers installed, then install R and JupyterLab, and after that compile from source the R packages that require GPU support.
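As a minimal sketch of the "install R and Jupyter Lab" step (assuming the Nvidia drivers, R, and JupyterLab are already installed on the machine; this is not GCP-specific), R can be registered as a Jupyter kernel with the IRkernel package:

# Run inside an R session on the GPU machine
install.packages("IRkernel")

# Register this R installation as a Jupyter kernel
# (user = FALSE makes it system-wide and may require admin rights)
IRkernel::installspec(user = FALSE)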

Jupyter notebook on Ubuntu doesn't free memory

I have a piece of code (Python 3.7) in a Jupyter notebook cell that generates a pandas data frame (df) containing numpy arrays.
I am checking the memory consumption of the df just by looking at the system monitor app preinstalled in Ubuntu.
The problem is that if I run the cell a second time, the memory consumption doubles, even though the data frame is assigned to the same variable.
If I run the same cell multiple times, the system runs out of memory and the kernel dies on its own.
Using del df or gc.collect() doesn't free the memory either.
Restarting the notebook kernel is the only way to free the memory.
In practice, I would expect the memory to stay roughly the same because I am just reassigning a new df to the same variable over and over again.
Indeed, the memory accumulates only when I run the code in the notebook on a Linux machine. If I run the same code from a terminal with python script.py, or if I run the very same notebook on macOS, memory usage does not grow: I can run the same cell multiple times and the occupied memory stays stable, as expected.
Can you help me pinpoint where the problem is coming from and how to solve it?
P.S. Both Python and Jupyter are installed with Anaconda 2018.12 on Ubuntu 18.04.
I have asked the same question on the Ubuntu community since I am not sure this is strictly related to python itself, but I got no answers so far.

Parallel R on a Windows cluster

I've got a Windows HPC Server running with some nodes in the backend. I would like to run Parallel R using multiple nodes from the backend. I think Parallel R uses SNOW on Windows, but I'm not too sure about that. My question is: do I need to install R on the backend nodes as well?
Say I want to use two nodes, 32 cores per node:
library(snow)
cl <- makeCluster(c(rep("COMP01", 32), rep("COMP02", 32)), type = "SOCK")
Right now, it just hangs.
What else do I need to do? Do the backend nodes need some kind of sshd running to be able to communicate with each other?
Setting up snow on a Windows cluster is rather difficult. Each of the machines needs to have R and snow installed, but that's the easy part. To start a SOCK cluster, you would need an sshd daemon running on each of the worker machines, but you can still run into troubles, so I wouldn't recommend it unless you're good at debugging and Windows system administration.
I think your best option on a Windows cluster is to use MPI. I don't have any experience with MPI on Windows myself, but I've heard of people having success with the MPICH and DeinoMPI MPI distributions for Windows. Once MPI is installed on your cluster, you also need to install the Rmpi package from source on each of your worker machines. You would then create the cluster object using the makeMPIcluster function. It's a lot of work, but I think it's more likely to eventually work than trying to use a SOCK cluster due to the problems with ssh/sshd on Windows.
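As a rough illustration of the MPI route (assuming MPI plus the Rmpi and snow packages are already installed on every node, and with the worker count chosen only as an example), the cluster object is created much like the SOCK version:

library(Rmpi)
library(snow)

# Spawn 64 MPI workers across the nodes provided by the MPI launcher
cl <- makeMPIcluster(64)

# Sanity check: each worker reports the machine it is running on
clusterCall(cl, function() Sys.info()[["nodename"]])

stopCluster(cl)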
If you're desperate to run a parallel job once or twice on a Windows cluster, you could try using manual mode. It allows you to create a SOCK cluster without ssh:
library(snow)
workers <- c(rep("COMP01", 32), rep("COMP02", 32))
# manual = TRUE prints the launch command for each worker instead of using ssh
cl <- makeSOCKcluster(workers, manual = TRUE)
The makeSOCKcluster function will prompt you to start each one of the workers, displaying the command to use for each. You have to manually open a command window on the specified machine and execute the specified command. It can be extremely tedious, particularly with many workers, but at least it's not complicated or tricky. It can also be very useful for debugging in combination with the outfile='' option.
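For completeness, here is a small sketch of the manual workflow (the hostnames are placeholders) combined with the outfile option mentioned above, so each worker's output stays visible for debugging:

library(snow)

# With manual = TRUE, makeSOCKcluster prints one launch command per worker
# and waits until you have run it by hand on that machine.
# outfile = "" leaves the workers' output on their consoles for debugging.
cl <- makeSOCKcluster(c("COMP01", "COMP02"), manual = TRUE, outfile = "")

# Once every worker has connected, check that they all respond
clusterCall(cl, function() Sys.info()[["nodename"]])

stopCluster(cl)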
