Running out of Memory when building Stanza Document - jupyter-notebook

System specifications:
- Device: NVIDIA Jetson AGX Xavier (16 GB), JetPack 4.5.1
- CPU: 8-core
- RAM: 32 GB
The pipeline looks like this:
nlp = stanza.Pipeline('en', use_gpu=True, batch_size=100, tokenize_batch_size=32, pos_batch_size=32, depparse_batch_size=32)
doc = nlp(corpus)
I am trying to build a Stanza Document with the processors tokenizer, pos, depparse, error, sentiment, and ner.
While building the Stanza Document from a dataset of around 300 MB of text, I run out of memory (RAM), the Jupyter notebook stops, and the kernel dies; even with just 100 MB of data the kernel dies.
(I have tried both higher and lower batch sizes, but the problem persists.)
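One way to work around this (a minimal sketch, not from the original post) is to annotate the corpus in smaller slices instead of passing the whole 300 MB string to the pipeline at once, so the annotations for the full document never have to sit in memory at the same time. The helper annotate_in_chunks, the chunk size, and the paragraph-based splitting below are illustrative assumptions:
import stanza

nlp = stanza.Pipeline('en', use_gpu=True, tokenize_batch_size=32, pos_batch_size=32, depparse_batch_size=32)

def annotate_in_chunks(text, chunk_chars=1_000_000):
    """Yield annotated Documents for roughly 1 MB slices of the corpus."""
    start = 0
    while start < len(text):
        end = min(start + chunk_chars, len(text))
        # Prefer to cut at a paragraph break so sentences are not split mid-way.
        cut = text.rfind('\n\n', start, end)
        if cut <= start or end == len(text):
            cut = end
        yield nlp(text[start:cut])
        start = cut

docs = list(annotate_in_chunks(corpus))  # corpus is the 300 MB text from above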

Related

Can't get tensorflow-gpu to work in R due to CUDA issues

I'm trying to get started with Keras. I have one of the new fancy Nvidia GPUs, but I can't seem to get it off the ground despite the fact that I'm using a fresh installation of Ubuntu (20.04).
On my first attempt, I noticed that Ubuntu detected my graphics card, so I installed the driver by going into "Additional Drivers." I then installed Keras and TensorFlow using the following commands, which yielded no errors:
install.packages("keras")
library(keras)
install_keras(tensorflow = "gpu")
However, when I try to actually set up a Keras model,
model <- keras_model_sequential() %>%
layer_dense(units = 16, activation = "relu", input_shape = c(10000)) %>%
layer_dense(units = 16, activation = "relu") %>%
layer_dense(units = 1, activation = "sigmoid")
I get this awful error message:
2021-01-14 09:04:53.188680: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-01-14 09:04:53.189214: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-01-14 09:04:53.224466: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-01-14 09:04:53.224843: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:09:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.785GHz coreCount: 68 deviceMemorySize: 9.78GiB deviceMemoryBandwidth: 707.88GiB/s
2021-01-14 09:04:53.224860: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-01-14 09:04:53.226413: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-01-14 09:04:53.226446: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-01-14 09:04:53.226935: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-01-14 09:04:53.227061: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-01-14 09:04:53.227139: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/arta/.local/share/r-miniconda/envs/r-reticulate/lib:/usr/lib/R/lib:/usr/local/cuda-11.2/lib64:::/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/lib/server:/usr/local/cuda-11.2/lib64
2021-01-14 09:04:53.227437: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-01-14 09:04:53.227513: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-01-14 09:04:53.227519: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2021-01-14 09:04:53.228275: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-01-14 09:04:53.228290: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-01-14 09:04:53.228293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]
As you might notice, this error message mentions cuda-11.2; however, I got an almost identical error message when I was using my system's default cuda-10.1, which I suppose came with the driver.
I did a number of things, including downloading cuDNN directly from Nvidia's website and trying to install it using their documentation, and adding CUDA to PATH and LD_LIBRARY_PATH, to no avail.
Finally, I removed my r-reticulate conda environment so that I could reinstall TensorFlow from scratch, against cuda 11.2 instead of the default 10.1.
I followed the directions in this blog post, but substituted every instance of 10.1 with 11.2 and libcudnn.so.7 with libcudnn.so.8, since that is the newest version available and the one I downloaded to my system. That brings me to the above error message, which is almost the same as the one I got when I was using 10.1, the version that came by default with my computer.
Also, I noticed something strange when I tried to use Tensorflow in R again. I installed it using install_keras(tensorflow = "gpu") with no discernible problems, but when I called the following command:
imdb <- dataset_imdb(num_words = 10000)
It started downloading and installing it for me once again, but it gave me this warning:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-gpu 2.2.0 requires tensorboard<2.3.0,>=2.2.0, but you have tensorboard 2.4.0 which is incompatible.
tensorflow-gpu 2.2.0 requires tensorflow-estimator<2.3.0,>=2.2.0, but you have tensorflow-estimator 2.4.0 which is incompatible.
What am I supposed to make of this? Why is it that it can use the right installation of CUDA:
2021-01-14 09:00:06.766462: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
But it can't use another file somewhere else?
2021-01-14 09:04:53.227139: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/arta/.local/share/r-miniconda/envs/r-reticulate/lib:/usr/lib/R/lib:/usr/local/cuda-11.2/lib64:::/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/lib/server:/usr/local/cuda-11.2/lib64
What do I do now? Why can't I get gpu acceleration to work? My plan is to follow the directions in that blog post and purge all Nvidia software from Ubuntu and try again using 10.1, since that seems to be the most stable version.
Thanks to @RobertCrovella, I uninstalled CUDA, cuDNN, etc. because of the version mismatch and reinstalled CUDA 11.0 with cuDNN 8.0.
> tensorflow::tf_gpu_configured()
...
tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/device:GPU:0 with 8779 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3080, pci bus id: 0000:09:00.0, compute capability: 8.6)
GPU device name: /device:GPU:0[1] TRUE
Do I understand correctly that if I install CUDA 11.0 and the cuDNN 8.0 build for CUDA 11.0, then all these errors disappear?
I have installed CUDA 11.2 and found cuDNN 8 for CUDA 11.1. I then installed them, together with TensorFlow etc., using python3 (3.8, the Ubuntu 20.04.1 LTS default) and pip3. In Python everything seems to work, but in R it's broken.
I have created symbolic links to the existing versions, and my R code gets to the point where it should use the GPU, but it aborts with a core dump.

Jupyter notebook crashes after rendering pybullet-gym environment

I am using pybullet-gym to evaluate my policy model and visualize its interactions; however, when it renders the environment using the following sample code (taken from its own repo), the Jupyter notebook crashes and restarts its kernel:
import gym
import pybulletgym
env = gym.make('HumanoidPyBulletEnv-v0')
env.render()
env.reset()
Here is the message in shell:
startThreads creating 1 threads.
starting thread 0
started thread 0
argc=2
argv[0] = --unused
argv[1] = --start_demo_name=Physics Server
ExampleBrowserThreadFunc started
X11 functions dynamically loaded using dlopen/dlsym OK!
X11 functions dynamically loaded using dlopen/dlsym OK!
Creating context
Created GL 3.3 context
Direct GLX rendering context obtained
Making context current
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=GeForce MX150/PCIe/SSE2
GL_VERSION=3.3.0 NVIDIA 450.102.04
GL_SHADING_LANGUAGE_VERSION=3.30 NVIDIA via Cg compiler
pthread_getconcurrency()=0
Version = 3.3.0 NVIDIA 450.102.04
Vendor = NVIDIA Corporation
Renderer = GeForce MX150/PCIe/SSE2
b3Printf: Selected demo: Physics Server
startThreads creating 1 threads.
starting thread 0
started thread 0
MotionThreadFunc thread started
ven = NVIDIA Corporation
ven = NVIDIA Corporation
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":1"
after 21540 requests (21540 known processed) with 0 events remaining.
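A possible workaround (a hedged sketch, not from the original post) is to avoid opening an X11/OpenGL window from inside the notebook and instead render off-screen to an RGB array and show the frame inline; the rgb_array render mode and the matplotlib display used here are assumptions about what this environment supports:
import gym
import pybulletgym  # registers the PyBullet versions of the Gym environments
import matplotlib.pyplot as plt

env = gym.make('HumanoidPyBulletEnv-v0')
env.reset()
frame = env.render(mode='rgb_array')  # off-screen render, no GL/X11 window
plt.imshow(frame)
plt.axis('off')
plt.show()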

how can I maximize the GPU usage of Tensorflow 2.0 from R (with Keras library)?

I use R with Keras and tensorflow 2.0 on the GPU.
After connecting a second monitor to my GPU, I receive this error during a deep learning script:
I concluded that the GPU is short of memory and a solution seems to be this code:
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.allow_growth = True # dynamically grow the memory used on the GPU
config.log_device_placement = True # to log device placement (on which device the operation ran)
# (nothing gets printed in Jupyter, only if you run it standalone)
sess = tf.Session(config=config)
set_session(sess) # set this TensorFlow session as the default session for Keras
According to this post:
https://github.com/tensorflow/tensorflow/issues/7072#issuecomment-422488354
However, this code is not accepted by R. It says:
unexpected token from Tensorflow.
Error in tf.ConfigProto() : could not find function "tf.ConfigProto"
It seems that TensorFlow 2.0 does not accept this code, if I understand this post correctly:
https://github.com/tensorflow/tensorflow/issues/33504
Does anyone know how I can maximize the GPU usage from my R script with Keras library and Tensorflow 2.0 ?
Thank you!
To enable GPU memory growth using Keras or TensorFlow in R with TensorFlow 2.0, you need to find the corresponding functions in the tf object.
First, find your GPU device:
library(tensorflow)
gpu <- tf$config$experimental$get_visible_devices('GPU')[[1]]
Then enable memory growth for that device:
tf$config$experimental$set_memory_growth(device = gpu, enable = TRUE)
You can find more relevant functions by typing tf$config$experimental$ and then using tab autocompletion in RStudio.
Since these functions are labeled as experimental, they will likely change or move location in the future.
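For reference, a minimal sketch of the equivalent calls in Python with TensorFlow 2.x, which the R code above wraps via reticulate (this is not part of the original answer):
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Allocate GPU memory on demand instead of reserving it all up front.
    tf.config.experimental.set_memory_growth(gpus[0], True)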

Intel MKL/Xeon Phi Offload Runtime Issue - Auto Offload not working

I have set up my Xeon phi 3120A in Windows 10 Pro, with MPSS 3.8.4 and Parallel XE 2017 (Initial Release). I have chosen this Parallel XE as this was the last supported XE for the x100 series. I have installed the MKL version that is packaged with the Parallel XE 2017 (Initial Release).
What have I done / setup:
After setting up MPSS 3.8.4, and following the steps such as flashing and pinging, I have checked that micctrl -s shows “mic0 ready” (with linux image containing the appropriate KNC name), miccheck produces all "passes" and micinfo gives me a reading for all the key stats that the co-processor is providing.
Hence to me it looks like the co-processor is certainly installed and being recognised by my computer. I can also see that mic0 is up and running in the micsmc gui.
I have then set up my environment variables to enable automatic offload, namely MKL_MIC_ENABLE=1, OFFLOAD_DEVICES=0, MKL_MIC_MAX_MEMORY=2GB, MIC_ENV_PREFIX=MIC, MIC_OMP_NUM_THREADS=228, MIC_KMP_AFFINITY=balanced.
The Problem - Auto Offload is not working
When I go to run some simple code in R-3.4.3 (copied below, designed specifically for automatic offload), it keeps running the code through my host computer rather than running anything through the Xeon phi.
To support this, I cannot see any activity on the Xeon Phis when I look at the micsmc gui.
Hence, auto offload is not working.
The R code:
require(Matrix)
sink("output.txt")
N <- 16000
cat("Initialization...\n")
a <- matrix(runif(N*N), ncol=N, nrow=N);
b <- matrix(runif(N*N), ncol=N, nrow=N);
cat("Matrix-matrix multiplication of size ", N, "x", N, ":\n")
for (i in 1:5) {
  dt <- system.time(c <- a %*% b)
  gflops <- 2*N*N*N*1e-9/dt[3]
  cat("Trial: ", i, ", time: ", dt[3], " sec, performance: ", gflops, " GFLOP/s\n")
}
Other steps I have tried:
I then set the MKL_MIC_DISABLE_HOST_FALLBACK=1 environment variable and, as expected, when I ran the above code, R terminated.
In Using Intel® MKL Automatic Offload on Intel Xeon Phi Coprocessors, it says that if the HOST_FALLBACK flag is active and offload is attempted but fails (because the "offload runtime cannot find a coprocessor or cannot initialize it properly"), the program will terminate – and that is what is happening: R terminates completely. For completeness, this problem also occurs on R-3.5.1, Microsoft R Open 3.5.0 and R-3.2.1.
So my questions are:
1. What am I missing to make the R code run on the Xeon Phi? Can you please advise me on what I need to do to make this work?
2. (Linked to 1) Is there a way to check whether the MKL offload runtime can see the Xeon Phi, whether it is correctly set up, and what (if any) problem MKL is having initialising the Xeon Phi?
I will sincerely appreciate it if someone out there can help me – I believe I am missing a fundamental/simple step and have been tearing my hair out trying to make this work.
Many thanks in advance,
Rash

How to initialise JVM with larger maximum heap size using rJava

I am attempting to make use of the Stanford NLP tools (Java) in R, using the rJava package. When attempting to create a StanfordCoreNLP object I get the following error:
Error in .jnew("edu/stanford/nlp/pipeline/StanfordCoreNLP", props) : java.lang.OutOfMemoryError: Java heap space
To resolve this, I have attempted to initialise the JVM with a larger maximum heap size, using variations of the following code:
.jinit(parameters=c("-Xms1g","-Xmx4g"))
When the maximum heap is set to 1 GB using -Xmx1g, the JVM loads but I continue to get the OutOfMemoryError. When the maximum heap size is set to 2 or 3 GB (-Xmx2g or -Xmx3g), R stops responding. When it is set to 4 GB or higher (-Xmx4g), I get the following message:
Error in .jinit(parameters = c("-Xms1g", "-Xmx4g"), force.init = TRUE) : Cannot create Java virtual machine (-6)
How do you successfully initialise the JVM using rJava with a maximum heap size larger than 1 GB? I am using 32-bit versions of Java (v8 u51) and R (v3.2.0).
I am using 32bit versions of Java (v8 u51) and R (v3.2.0)
That's your problem right there. Switch to 64bit versions.
