Optimum memory usage in R

I am using a 64-bit Windows machine; both RStudio and R are 64-bit. They run on an EC2 instance of type r5.4xlarge, which has 16 cores and about 128 GB of memory. If I run memory.limit() I see 100 GB, since I set that limit in my .Rprofile file. Still, when I run my R script, Task Manager shows only 10 GB of memory in use.
How can I make sure R uses the available memory so that the script runs faster? If I run the same script on my local machine with 64 GB of RAM, it finishes in 5 minutes at 100% CPU usage, but on EC2 it takes 15 minutes at only 25% CPU usage. Please let me know if additional information is required.

I'm not sure that memory is the issue here.
Since you note that the server runs at only 25% CPU usage while your local machine runs at 100%, it could be that your code is parallelized locally but not on the VM.
Another thing to check: are you running Microsoft R Open locally, but not on the VM?
R Open uses the Intel MKL (Math Kernel Library) by default, which is a much faster implementation of the BLAS libraries.
You can tell the two apart with sessionInfo(). The standard R library shows something like:
other attached packages:
[1] Matrix_1.2-12
while R Open shows something like:
other attached packages:
[1] RevoUtilsMath_10.0.0
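To check whether BLAS performance explains the gap, a quick matrix-multiplication benchmark can be run on both machines (a minimal sketch; the matrix size is arbitrary):

```r
# Rough BLAS benchmark: MKL-backed builds such as R Open are typically
# much faster here than the reference BLAS shipped with standard R.
set.seed(1)
n <- 2000
m <- matrix(rnorm(n * n), n, n)
timing <- system.time(p <- m %*% m)
print(timing["elapsed"])  # compare this number across the two machines
```

If the elapsed time differs by a large factor between the two installations, the BLAS implementation, not memory, is the likely culprit.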

Related

Notebook instance running R with a GPU

I am new to cloud computing and GCP. I am trying to create a notebook instance running R with a GPU. I got a basic instance with 1 core and 0 GPUs to start and I was able to execute some code which was cool. When I try to create an instance with a GPU I keep getting all sorts of errors about something called live migration, or that there are no resources available, etc. Can someone tell me how to start an R notebook instance with a GPU? It can't be this difficult.
CRAN (the Comprehensive R Archive Network) doesn't provide GPU support out of the box. However, following this link might help you set up a notebook instance running R with a GPU: you need a machine with Nvidia GPU drivers installed, then install R and Jupyter Lab, and after that compile those R packages which require it for use with GPUs.

Unusually long package installation time on RStudio Server Pro Standard on GCP?

If we run install.packages("dplyr") on a GCP 'RStudio Server Pro Standard' VM, it takes around 3 minutes to install (on an instance with 4 cores / 15 GB RAM).
This seems unusual, as installation typically takes ~20 seconds on a laptop with equivalent specs.
Why so slow, and is there a quick and easy way of speeding this up?
Notes
I use the RStudio Server Pro Standard image from GCP marketplace to start the instance
Keen to know if there are any 'startup scripts' or similar I can set to run after the instance starts, e.g. to install a collection of commonly used packages
@user5783745 you can also adjust the Makevars file to allow multithreaded compilation, which will help speed up compilations.
I followed this RStudio community post, and dropped MAKEFLAGS = -j4 into ~/.R/Makevars.
This roughly halved the time it took to install dplyr from scratch on the RStudio Server Pro Standard for GCP instance I spun up (same specs as yours: 4 vCPUs, 15 GB RAM).
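If you'd rather not edit the file by hand, a small R sketch can write the flag for you (note: this overwrites any existing ~/.R/Makevars, so check first). The Ncpus argument to install.packages() is a complementary option that parallelizes across packages rather than within one compilation:

```r
# Write MAKEFLAGS into ~/.R/Makevars so C/C++ compilation uses 4 parallel jobs.
# Caution: this overwrites any existing ~/.R/Makevars.
makevars_dir <- path.expand("~/.R")
dir.create(makevars_dir, showWarnings = FALSE)
writeLines("MAKEFLAGS = -j4", file.path(makevars_dir, "Makevars"))

# Separately, compile multiple dependencies at once (one process per package):
# install.packages("dplyr", Ncpus = 4)
```

The two settings stack: Ncpus parallelizes across packages, MAKEFLAGS parallelizes within each package's compilation.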

How To Limit the Amount of RAM RServer/RSession Uses on Windows

I need to limit the amount of RAM RServer/RSession uses on our Windows 2012 server.
I am a system admin who has two users running RStudio on our server. Each RStudio instance spawns one or more RSession.exe processes. Sometimes one of the RSession.exe processes takes up too much RAM. Yesterday, one grew to about 60 GB, bringing almost everything else running on the server to a halt. (We have 72 GB available, but 48 GB of that is allocated to SQL Server.) I need to make sure that using R doesn't bring down my server or affect other applications running on it.
Info:
sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)
As suggested in some forums, I tried adding --max-mem-size=4000M to the shortcut they use to start R. This works with RGui.exe, but doesn't work with RStudio.exe. When I run memory.limit() from RGui, it returns 4000, as expected. When I run memory.limit() from RStudio, it returns 71999, regardless of the start parameters. And, in any case, I'm not certain this is the right place to set limits. Would a command-line switch like this for RStudio affect the RSession processes it spawns?
I also tried running memory.limit(4000) from within RStudio, but got the following message:
“Warning message: In memory.limit(4000) : cannot decrease memory limit: ignored”
I’ve also read something about setting memory limits in user profiles, but all of the instructions for doing so are Unix oriented and refer to files and directories that don’t exist on our Windows server. (Sorry, I’m not an R person, just a poor sysadmin trying to keep his server alive…)
One post noted this reply from support.rstudio.com on 2014/06/10: "We've got it on our list of things to investigate and hope to have a solution soon" (How to set memory limit in RStudio (desktop version)?). Is this a bug? If so, what work-arounds are there? If not, how can I limit RSession memory usage so it doesn't bring down my server?
Thank You,
Robbie
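One avenue worth trying (untested here, and specific to Windows builds of R): the same value that --max-mem-size supplies can also be set through the R_MAX_MEM_SIZE environment variable, which R reads at startup. A machine-wide environment variable may therefore reach RSession.exe even though RStudio ignores the command-line switch. For example, from an administrator command prompt:

```shell
rem Cap future R sessions at roughly 4 GB (Windows only; requires elevation)
setx /M R_MAX_MEM_SIZE 4000M
```

Whether RSession.exe actually inherits this depends on how RStudio launches its sessions, so verify with memory.limit() after a restart.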

Increasing memory size in R on Linux

I am using Linux through a virtual machine (I need to, in order to run R code that uses Linux-specific commands). I am using R version 3.3.1 x86_64-pc-linux-gnu on my virtual machine. I want to allocate a larger memory size for R, since my code fails to finish due to memory limits. I know that in R on Windows you can use memory.limit(size=specify_size) to increase the memory allocation; how would I do the same on Linux in a straightforward fashion?
Compiled from comments:
R will use everything it can in a Linux environment; unlike Windows, Linux does not cap an application's memory allowance. If the code requires more memory than the VM has available, allocate more memory when setting up the virtual machine.
The problem was solved by increasing the Base Memory of the Virtual Machine as well as increasing the CPU from 1 to 4. The code is running well now.
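To confirm how much memory the VM actually exposes to R, you can read /proc/meminfo from within R (Linux only; a minimal sketch, and MemAvailable requires a reasonably recent kernel):

```r
# Report total and currently available memory on Linux, in kB
meminfo <- readLines("/proc/meminfo")
mem <- grep("^(MemTotal|MemAvailable):", meminfo, value = TRUE)
print(mem)
```

If MemTotal is much lower than expected, the limit is being imposed by the VM configuration, not by R.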

how to override the 2GB memory limit when R starts

When R boots, the memory limit (as returned by memory.limit) is set to 2GB, regardless of the available memory on the computer. (I found that out recently). I imagine that at some point in the booting process, this limit is set to the actually available memory.
This can be seen by printing memory.limit() in the .Rprofile file which is sourced at startup. It prints "2047". On the other hand, when R has booted and I type memory.limit() in the console, I get "16289".
I use a custom .Rprofile file and I need to have access to more than 2GB during bootup.
How can I override this limit?
My current workaround is to set the limit myself in the .Rprofile using memory.limit(size=16289) but then I will have to edit this every time I work on a computer with a different amount of RAM which happens fairly often.
Is there an option I can change, a .ini file I can edit, or anything I can do about it?
OS and R version:
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Edit: this is not a duplicate, at least not a duplicate of the proposed question. It is not about managing available memory! I have 16GB of memory and memory.limit() shows that my limit is indeed 16GB.
It all started when I got a warning that I had "reached 2GB memory allocation" (implying that I had a 2GB memory limit). After investigating, it appears that R does indeed limit memory to 2GB during the startup process.
I want to load my data automatically when R starts, for this I have a small loading script in the .Rprofile. I load more than 2GB data hence I need to have access to my 16GB. My question is about achieving this. This has nothing at all in common with the proposed duplicate, except keywords...
I'm interpreting this as: you want memory.limit(size=16289) in your .Rprofile file, but you don't want to edit the specific number every time you move to a computer with a different amount of memory. Why not just pull the memory size dynamically? On Windows:
TOT_MEM <- as.numeric(gsub("\r", "", gsub("TotalVisibleMemorySize=", "",
  system('wmic OS get TotalVisibleMemorySize /Value', intern = TRUE)[3]))) / 1024
memory.limit(size = TOT_MEM)
which sets the limit to the total memory of the system, or
FREE_MEM <- as.numeric(gsub("\r", "", gsub("FreePhysicalMemory=", "",
  system('wmic OS get FreePhysicalMemory /Value', intern = TRUE)[3]))) / 1024
memory.limit(size = FREE_MEM)
which sets the limit to the physical memory free at startup.
Place either snippet in your .Rprofile, above where you load your data.
