The memory limit in R on Linux

The command "memory.limit()" is used to get memory in R on windows.
What is corresponding way to get memory in R on Linux

You can use ulimit -v to get (or set) the virtual-memory limit for a single process.
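For example, a minimal sketch of checking that limit from R's point of view (the 4000000 kB figure below is purely illustrative):
# Query the virtual-memory limit of the shell R was started from;
# "unlimited" is the usual default on Linux.
system("ulimit -v", intern = TRUE)

# To enforce a cap, set it in the shell *before* launching R, e.g.
#   ulimit -v 4000000    # roughly 4 GB, expressed in kB
# then start R from that shell; allocations beyond the cap will fail.
The limit is inherited from the shell that starts R, so it is normally set in the parent shell rather than from inside a running R session.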

Related

R: Memory Allocation on Ubuntu for R vs Rscript

I have created a script, and when I run it line-by-line on Ubuntu, everything works from end-to-end, however, when I run it using Rscript it fails due to a memory allocation error.
I have tried to remove excess variables using rm followed by gc, and I have tried to run the code using R CMD BATCH, but nothing seems to work.
Does anyone know why memory allocation is different between running R and Rscript?
Sorry, I cannot really provide a reproducible example for this problem because the data are sensitive. The packages I use are RJDBC, with the setting options(java.parameters = "-Xmx8g"), on a computer with 16 GB of RAM; I also use the parallel package, which is where the error occurs.
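For reference, a minimal sketch of the setup described above (object names are placeholders, not the real script):
# The JVM heap size must be set before RJDBC/rJava is loaded.
options(java.parameters = "-Xmx8g")
library(RJDBC)
library(parallel)

# ... build a large intermediate object, then release it explicitly ...
big_obj <- NULL   # placeholder for a large intermediate result
rm(big_obj)
gc()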

Memory limit in jupyter notebook

How do I set a maximum memory limit for a jupyter notebook process?
If I use too much RAM the computer freezes and I have to press the power button to restart it manually.
Is there a way of automatically killing a jupyter notebook process as soon as a user-set memory limit is surpassed or to throw a memory error? Thanks
Under Linux you can use "cgroups" to limit resources for any software running on your computer.
Install cgroup-tools with apt-get install cgroup-tools
Edit its configuration /etc/cgconfig.conf to make a profile for the particular type of work (e.g. numerical scientific computations):
group app/numwork {
    memory {
        memory.limit_in_bytes = 500000000;
    }
}
Apply that configuration to the process names you care about by listing them in /etc/cgrules.conf (in my case it is all julia executables, which I run through jupyter, but you can use it for any other software too):
*:julia memory app/numwork/
Finally, parse the config and set it as the current active config with the following commands:
~# cgconfigparser -l /etc/cgconfig.conf
~# cgrulesengd
I use this to set limits on processes running on a server that is used by my whole class of students.
This page has some more details and other ways to use cgroups: https://wiki.archlinux.org/index.php/cgroups
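If you want to verify from inside a limited process that the cap took effect, a minimal sketch in R (assuming cgroup v1 with the memory controller mounted at /sys/fs/cgroup/memory and the app/numwork group from the config above; the path is hypothetical for other setups):
# Read the limit the cgroup actually applied to this group of processes.
limit_file <- "/sys/fs/cgroup/memory/app/numwork/memory.limit_in_bytes"
if (file.exists(limit_file)) {
  cat("memory limit (bytes):", readLines(limit_file), "\n")
}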

Memory error when knitting in R

I try to knit an Rdata file in RStudio and get an error that it cannot allocate a vector of 1.8 MB.
I understand that I have a memory issue, so I tried to use memory.limit() as suggested in another post, but I got an error:
memory.size() is Windows-specific.
I also tried adding extra swap using a USB drive and changing the swappiness, but nothing happened. I'm using Ubuntu 14.04 32-bit and R version 3.3.0 32-bit, and I also have Windows installed.
The problem is that you're on a 32-bit installation of R and you've used up the (very small) amount of memory it can handle. If you just switch to 64-bit you won't get this error.
The error message is because you invoked a Windows-specific command on Ubuntu, but that's really not relevant because with 32-bit R there's a hard limit on memory which you've already hit.
I know this is confusing because it's complaining about a very small vector (1.8 MB), but that just means that the amount of memory 32-bit R has left is less than that.
If you were on Windows you might need to set the memory limit in addition to using 64-bit R, but if you're using Ubuntu then just using 64-bit R should solve the problem.
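A quick way to confirm which build you are actually running, from inside R itself:
# 8 means a 64-bit build, 4 means 32-bit
.Machine$sizeof.pointer
# architecture string, e.g. "x86_64" (64-bit) vs "i686" (32-bit)
R.version$arch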
RStudio Instructions
I'm basing this on my version of RStudio; yours might be slightly different, but it should be very close.
Click Tools
Click Global Options...
With the "General" (default) screen selected click Change where it says "R version:"
Select Use your machine's default version of R64 (64-bit)
Click OK
Allow RStudio to restart

Memory limit in R Studio

I tried to decrease the memory use and set a limit for RStudio, but it didn't work.
I've read this article and this question.
My computer is running Windows 8 64-bit with 8 GB of RAM. I just want to limit RStudio to 4 GB of memory.
The easiest would be to just use:
> memory.limit()
[1] 8157
> memory.limit(size=4000)
If you are running a server version of RStudio, it will be a bit different. You will have to change the file /etc/rstudio/rserver.conf and add rsession-memory-limit-mb=4000 to it.
If you do find that RStudio is resetting the memory limit with every new instance, you could try adding memory.limit(size=4000) to your .Rprofile file so that it is set with every start.
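A minimal sketch of that .Rprofile approach (memory.limit() is Windows-only, and 4000 MB is just the value from the question):
# ~/.Rprofile -- runs at the start of every R session
if (.Platform$OS.type == "windows") {
  utils::memory.limit(size = 4000)   # cap this session at ~4 GB
}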

When using mpirun with R script, should I copy manually file/script on clusters?

I'm trying to understand how Open MPI's mpirun handles the script file associated with an external program, here an R process (doMPI/Rmpi).
I can't imagine that I have to copy my script to each host before running something like:
mpirun --prefix /home/randy/openmpi -H clust1,clust2 -n 32 R --slave -f file.R
But apparently it doesn't work until I copy the script 'file.R' onto the cluster nodes and then run mpirun. When I do this, the results are written on the cluster nodes, but I expected them to be returned to the working directory on the localhost.
Is there another way to send an R job from localhost to multiple hosts, including the script to be evaluated?
Thanks!
I don't think it's surprising that mpirun doesn't know details of how scripts are specified to commands such as "R", but the Open MPI version of mpirun does include the --preload-files option to help in such situations:
--preload-files <files>
Preload the comma separated list of files to the current working
directory of the remote machines where processes will be
launched prior to starting those processes.
Unfortunately, I couldn't get it to work, which may be because I misunderstood something, but I suspect it isn't well tested because very few people use that option, since it is quite painful to do parallel computing without a distributed file system.
If --preload-files doesn't work for you either, I suggest that you write a little script that calls scp repeatedly to copy the script to the cluster nodes. There are some utilities that do that, but none seem to be very common or popular, which again I think is because most people prefer to use a distributed file system. Another option is to set up an sshfs file system.
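For example, a small helper along those lines could be run on the localhost before calling mpirun (the host names and file name are just the ones from the question, purely illustrative):
# Copy the script into the same working directory on every node.
hosts <- c("clust1", "clust2")                     # illustrative host names
for (h in hosts) {
  system(sprintf("scp file.R %s:%s/", h, getwd()))
}
# then, from the shell:
#   mpirun --prefix /home/randy/openmpi -H clust1,clust2 -n 32 R --slave -f file.R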
