I'm writing an R package which depends upon many other packages. When I load too many packages into the session I frequently get this error:
Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared object '/Library/Frameworks/R.framework/Versions/3.2/Resources/library/proxy/libs/proxy.so':
maximal number of DLLs reached...
This post, "Exceeded maximum number of DLLs in R", pointed out that the issue is in Rdynload.c of the base R code:
#define MAX_NUM_DLLS 100
Is there any way to bypass this issue except modifying and building from source?
As of R 3.4, you can set a different maximum number of DLLs using the environment variable R_MAX_NUM_DLLS. From the release notes:
The maximum number of DLLs that can be loaded into R e.g. via
dyn.load() can now be increased by setting the environment
variable R_MAX_NUM_DLLS before starting R.
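To see how close a session is to the limit, one can count the DLLs currently loaded and check whether the variable is set. This is a minimal sketch using only base R; the names n_loaded and max_dlls are illustrative and not part of the original answer.

```r
# Count the shared objects (DLLs) currently loaded into this R session.
n_loaded <- length(getLoadedDLLs())

# R_MAX_NUM_DLLS has to be set before R starts; here it is only read back.
max_dlls <- Sys.getenv("R_MAX_NUM_DLLS", unset = NA)

cat("DLLs currently loaded:", n_loaded, "\n")
if (is.na(max_dlls)) {
  cat("R_MAX_NUM_DLLS is not set; R's built-in default applies\n")
} else {
  cat("R_MAX_NUM_DLLS =", max_dlls, "\n")
}
```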
Increasing that number is of course "possible"... but it also costs a bit
(adding to the fixed memory footprint of R).
I did not set that limit, but I'm pretty sure it was also meant as a reminder for the useR to "clean up" a bit in her / his R session, i.e., not to load package namespaces unnecessarily. I cannot yet imagine that you need > 100 packages | namespaces loaded in your R session.
OTOH, some packages nowadays have a host of dependencies, so I agree that this at least may happen accidentally more frequently than in the past.
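A rough way to see which loaded packages contribute DLLs is to list the names returned by getLoadedDLLs() and drop the ones that ship with R itself. This is only a sketch: the vector core is an assumed list of base entries, and "somePackage" is a hypothetical name. Whether unloading a namespace actually releases its DLL depends on the package's own unload hook.

```r
# Names of all DLLs currently loaded in this session.
dll_names <- names(getLoadedDLLs())

# Assumed list of entries that belong to base R itself.
core <- c("base", "utils", "stats", "graphics", "grDevices",
          "methods", "tools", "(embedding)")

# DLLs that came in through add-on packages.
extra_dlls <- setdiff(dll_names, core)
print(extra_dlls)

# To drop a namespace that is no longer needed (hypothetical package name):
# unloadNamespace("somePackage")
```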
The real solution of course would be a code improvement that starts with a relatively small number of "DLLinfo" structures (say 32), and then allocates more batches (of size say 32) if needed.
Patches to the R sources (development trunk in subversion at https://svn.r-project.org/R/trunk/ ) are very welcome!
---- added Jan.26, 2017: In the meantime, we've had a public bug report about this, a proposed patch (which was not good enough: there is always an OS dependent limit on the number of open files), and today that bug report has been closed by R core member @TomasKalibera who implemented new code where the maximal number of loaded DLLs is set at
pmax(100, pmin(1000, 0.6* OS_dependent_getrlimit_or_equivalent()))
and so on Windows and Linux (and not yet tested, but "almost surely" macOS), the limit should be considerably higher than previously.
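To get a feel for what that formula yields, it can be evaluated in R for a few open-file limits. This is only an illustration: dll_limit is a hypothetical helper, the real computation happens in R's C sources, and how the result is rounded there is not stated in this answer.

```r
# Hypothetical helper that mirrors the formula quoted above.
dll_limit <- function(open_file_limit) {
  pmax(100, pmin(1000, 0.6 * open_file_limit))
}

# A few example soft limits on open files (assumed values for illustration).
limits <- c(128, 256, 1024, 4096)
print(data.frame(open_files = limits, max_dlls = dll_limit(limits)))
```

For these inputs the formula gives 100, 153.6, 614.4 and 1000, so the result is clamped at both ends: never below 100 and never above 1000.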
----- Update #2 (written Jan.5, 2018):
In Oct'17, the above change was made more automatic with the following commit to the sources (of the development version of R - only!)
r73545 | kalibera | 2017-10-12 14:41:20
Increase the number of DLLs that can be loaded by default. If needed,
increase the soft limit on open files.
and on the help page ?dyn.load (https://stat.ethz.ch/R-manual/R-devel/library/base/html/dynload.html) the ulimit -n <num_open_files> is now mentioned (section Note close to bottom).
So you might consider using R's development version till that becomes "main stream" in April.
Alternatively, you can run (in a terminal / shell)
ulimit -n 2048
and then start R from that terminal. Tomas Kalibera mentioned this to work on macOS.
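To confirm that the R session really inherited the raised limit, the soft limit can be read back from inside R. This is a sketch that assumes a Unix-alike where ulimit is a shell builtin; on other platforms it only prints a message.

```r
if (.Platform$OS.type == "unix") {
  # system() runs the command through the shell, where ulimit is a builtin.
  open_files <- system("ulimit -n", intern = TRUE)
  cat("Soft limit on open files for this session:", open_files, "\n")
} else {
  open_files <- NA_character_
  cat("ulimit is not available on this platform\n")
}
```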
I had this issue with the simpleSingleCell package from Bioconductor.
On macOS you can't exceed 256, so I set the following in the .Renviron file in my home directory:
R_MAX_NUM_DLLS=150
It's easy:
Go to the environment variables and add or edit:
variable_name = R_MAX_NUM_DLLS
value = 1000
Restart R.
This worked well for me.
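After restarting, it is worth checking that the setting was picked up. The sketch below assumes the per-user .Renviron in the home directory, as in the answer above; dll_entry and effective are illustrative names.

```r
# Look for the entry in the per-user .Renviron file, if there is one.
renviron <- path.expand("~/.Renviron")
if (file.exists(renviron)) {
  entries <- readLines(renviron, warn = FALSE)
  dll_entry <- grep("^[[:space:]]*R_MAX_NUM_DLLS[[:space:]]*=", entries, value = TRUE)
} else {
  dll_entry <- character(0)
}
print(dll_entry)

# The value the running session actually sees.
effective <- Sys.getenv("R_MAX_NUM_DLLS", unset = "(not set)")
cat("Value seen by this session:", effective, "\n")
```

If the file contains the entry but the session reports "(not set)", R was probably started before the file was edited or is reading a different startup file.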
I have been having this problem for more than a week now and I am running out of time and patience. The problem occurs both when I run my script on a Mac and when I run it on a PC (more RAM makes no difference to the result; it just aborts faster). When I try to run this line on my dataset, the session aborts.
set.seed(119)
tax_PR2 <- assignTaxonomy(seqtab,
"~/Desktop/Documents/Bruts/aeDNA_data_shared/pr2_version_4.11.1_dada2.fasta",
multithread=TRUE)
Does anyone have any idea of what the problem is? I verified my dataset (seqtab is currently considered by R as a large matrix of 3930724 elements of 20.2Mb), I verified the space I have on my computer, I have all the needed packages to run this line of code and I tried different sources of genome database for PR2 (PR2 version 4.11.1 or 4.12.0 etc...) and it always has the same result.
If you have any ideas I would appreciate them. I hope the information I gave is sufficient.
Packages installed:
library(BiocManager)
library(Rcpp)
library(dada2)
library(ff)
library(ggplot2)
library(gridExtra)
library(phyloseq)
library(vegan)
This is probably caused by a bug that was introduced in 1.14; see the GitHub issue here for more information: https://github.com/benjjneb/dada2/issues/916
We've just identified the cause, and a fix should be out soon. For immediate use, the workaround is to turn off multithreading, or to revert to the previous release 1.12.
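Applied to the call from the question, the multithreading workaround might look like the sketch below. It assumes seqtab and the reference file from the question exist, and it only runs the call when dada2 is installed; dada2_installed is an illustrative name.

```r
# Only attempt the call when dada2 is available in this library.
dada2_installed <- requireNamespace("dada2", quietly = TRUE)

if (dada2_installed) {
  # Check which release is installed (the bug was reported for 1.14).
  print(packageVersion("dada2"))

  if (exists("seqtab")) {
    set.seed(119)
    # Same call as in the question, but with multithreading turned off.
    tax_PR2 <- dada2::assignTaxonomy(
      seqtab,
      "~/Desktop/Documents/Bruts/aeDNA_data_shared/pr2_version_4.11.1_dada2.fasta",
      multithread = FALSE
    )
  }
}
```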
I have a main function in R which calls other files to run my program. I call the main file through a bat file (.exe). When I run it line by line it runs without a memory error, but when I call the bat file to run it, it halts and gives me the following error:
Cannot allocate memory greater than 51 MB.
How can I avoid this?
Memory limitations in R such as this are a recurring nightmare for a lot of us.
Very often the problem is a limit imposed by your OS (which can usually be changed from a Bash or PowerShell command line), by the architecture (32- vs. 64-bit), or by the availability of contiguous free RAM, regardless of overall available memory.
It's hard to say why something would not cause a memory issue when run line by line, but would hit the memory limit when run as a .bat.
What version of R are you running? Do you have both the 32-bit and 64-bit builds installed? Is the 32-bit build being called by Rscript when you run your .bat file, whereas you run the 64-bit build line by line? You can check the version of R that's being run with R.Version().
You can test this by running memory.limit() in both your R IDE/terminal and in your .bat file (be sure to print the result or save it as an object in your .bat run). Note that memory.limit() is Windows-only, and in R 4.2.0 and later it is a stub that no longer reports or changes the limit. You might also try setting memory.limit() in your .bat file, as it may simply have a smaller default there, perhaps because the R profile loaded by your IDE or terminal differs from the one loaded by the .bat file.
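A quick way to compare the two environments is to print the same diagnostics in the interactive session and at the top of the script the .bat file runs. This is a minimal sketch; env_info is an illustrative name.

```r
# Collect the details that distinguish a 32-bit from a 64-bit build.
env_info <- c(
  r_version    = R.version.string,
  architecture = R.version$arch,
  pointer_bits = as.character(8 * .Machine$sizeof.pointer)
)

# One "name: value" pair per line, easy to compare between the two runs.
cat(paste0(names(env_info), ": ", env_info), sep = "\n")
```

If pointer_bits differs between the two runs, the .bat file is launching a different build of R than the IDE.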
If architecture isn't the cause of your memory error, then you have several more troubleshooting steps to try:
Check memory usage in both environments (in R directly and via your .bat process) using this:
sort( sapply(ls(),function(x){object.size(get(x))}))
Run the garbage collector explicitly in your scripts with gc()
Check all object sizes to make sure there are no unexpected results in your .bat process: sort( sapply(ls(),function(x){format(object.size(get(x)), units = "Mb")}))
Try memory profiling:
Rprof(tf <- "rprof.log", memory.profiling=TRUE)
# ... run the code you want to profile here ...
Rprof(NULL)
summaryRprof(tf)
While this is a RAM issue, for good measure you might want to check that the compute power available is both sufficient and not varying between these two ways of running your code: parallel::detectCores()
Examine your performance with Hadley Wickham's lineprof tool (warning: it requires devtools and cannot profile inside lines that call C code)
References: While I'm pulling these snippets out of my own code, most of them originally came from other, related StackOverflow posts, such as:
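The object-size check from the list above can be wrapped in a small function so that the same code runs unchanged in both environments. This is a sketch: object_sizes_mb and demo_env are hypothetical names, and the two demo objects only stand in for real data.

```r
# Size of every object in an environment, in megabytes, largest first.
object_sizes_mb <- function(env = globalenv()) {
  nms <- ls(env)
  sizes <- vapply(
    nms,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  sort(sizes, decreasing = TRUE) / 1024^2
}

# Demonstration on a throwaway environment with one small and one large object.
demo_env <- new.env()
assign("small", 1:10, envir = demo_env)
assign("big", numeric(1e6), envir = demo_env)

sizes <- object_sizes_mb(demo_env)
print(round(sizes, 2))

# Release memory that is no longer referenced before the next heavy step.
invisible(gc())
```

Calling object_sizes_mb() with no argument reports on the global environment, which is what the one-liners above do.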
Reaching memory allocation in R
R Memory Allocation "Error: cannot allocate vector of size 75.1 Mb"
R memory limit warning vs "unable to allocate..."
How to compute the size of the allocated memory for a general type
R : Any other solution to "cannot allocate vector size n mb" in R?
Yes, you should be using 64-bit R if you can.
See this question, and this from the R docs.
Is there a simple way to trigger a crash in R? This is for testing purposes only, to see how a certain program that uses R in the background reacts to a crash and help determine if some rare problems are due to crashes or not.
The easiest way is to call C-code. C provides a standard function abort()[1] that does what you want. You need to call: .Call("abort").
As @Phillip pointed out, you may need to load libc first:
on Linux, dyn.load("/lib/x86_64-linux-gnu/libc.so.6") before issuing .Call("abort"). The path may of course vary depending on your system.
on OS X, dyn.load("/usr/lib/libc.dylib")
on Windows (I only tested it on XP, as I could not get hold of a newer version) you will need to install Rtools[2]. After that, call dyn.load("C:/.../Rtools/bin/cygwin1.dll").
There is an entire package on GitHub dedicated to this:
crash
R package that purposely crash an R session. WARNING: intended
for test.
How to install a package from github is covered in other questions.
I'm going to steal an idea from @Spacedman, but I'm giving him full conceptual credit by copying from his Twitter feed:
Segfault #rstats in one easy step:
options(device=function(){});plot(1)
reported. Danger, will crash your R session.
— Barry Rowlingson (#geospacedman) July 16, 2014
As mentioned in a comment to your question, the minimal approach is a simple call to the system function abort(). One way to do this in one line is:
R> Rcpp::cppFunction('int crashMe(int ignored) { ::abort(); }');
R> crashMe(123)
Aborted (core dumped)
$
or you can use the inline package:
R> library(inline)
R> crashMe <- cfunction(body="::abort();")
R> crashMe()
Aborted (core dumped)
$
You can of course also do this outside of Rcpp or inline, but then you need to deal with the system-dependent ways of compiling, linking and loading.
I'll do this in plain C because my C++-foo isn't Dirkian:
Create a C file, segv.c:
#include <signal.h>
void crashme(){raise(SIGSEGV);}
Compile it at the command line (Windows users will have to work this out for themselves):
R CMD SHLIB segv.c
In R, load and run:
dyn.load("segv.so") # or possibly .dll for Windows users
.C("crashme")
Producing a segfault:
> .C("crashme")
*** caught segfault ***
address 0x1d9e, cause 'unknown'
Traceback:
1: .C("crashme")
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 1
aborting ...
Segmentation fault
This is the same behaviour as the one Thomas references in the graphics system bug report, which I have filed and which might get fixed one day. However, this two-liner will always raise a segfault...
Maybe Dirk can one-line-Rcpp-ise it?
If you want to crash your R session, try this:
lapply("", function(x) eval(sys.call(1)))
(Save everything before running because this immediately results in "R Session Aborted")
Edit: This works for me on Windows 10.
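When experimenting with any of the crash triggers above, it can help to run them in a throwaway child R process so that the session doing the testing survives and can inspect the exit status. The sketch below is an assumption-laden stand-in, not one of the methods from the answers: it uses tools::pskill() with SIGKILL to terminate the child abnormally, which was chosen only because it needs no compiler. crash_expr can be replaced with one of the triggers above.

```r
# Path to the Rscript binary that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Stand-in for a crash: the child process kills itself abnormally.
crash_expr <- "tools::pskill(Sys.getpid(), tools::SIGKILL)"

# Run the expression in a separate process and capture its exit status.
status <- system2(rscript, c("-e", shQuote(crash_expr)),
                  stdout = FALSE, stderr = FALSE)

cat("Child R process exit status:", status, "\n")
```

A non-zero status indicates that the child did not exit normally, which is the condition the calling program has to handle.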
I've been fighting this problem for the second day straight, after a completely sleepless night, and I'm really starting to lose my patience and strength. It all started after I decided to provision another (paid) AWS EC2 instance in order to test my R code for dissertation data analysis. Previously I was using a single free tier t1.micro instance, which is painfully slow, especially when testing/running particular code. Time is much more valuable than the reasonable number of cents per hour that Amazon charges.
Therefore, I provisioned an m3.large instance, which I hope should have enough power to crunch my data comfortably fast. After EC2-specific setup, which included selecting Ubuntu 14.04 LTS as the operating system and some security setup, I installed R and RStudio Server per instructions via sudo apt-get install r-base r-base-dev as the ubuntu user. I also created ruser as a special user for running R sessions. Basically, the same procedure as on the smaller instance.
The current situation is that any command I issue at the R session command line results in a message like this: Error: could not find function "sessionInfo". The only function that works is q(). I suspect a permissions problem; however, I'm not sure how to investigate permission-related problems in an R environment. I'm also curious what could cause such a situation, considering that I was following recommendations from R Project and RStudio sources.
I was able to pinpoint the place that I think caused all that horror: it was just a small configuration file, "/etc/R/Rprofile.site", which I had previously updated with directives borrowed from R experts' posts here on StackOverflow. After removing the questionable contents, I was able to run R commands successfully. Out of curiosity and for sharing this hard-earned knowledge, here are the removed contents:
local({
# add DISS_FLOSS_PKGS to the default packages, set a CRAN mirror
DISS_FLOSS_PKGS <- c("RCurl", "digest", "jsonlite",
"stringr", "XML", "plyr")
#old <- getOption("defaultPackages")
r <- getOption("repos")
r["CRAN"] <- "http://cran.us.r-project.org"
#options(defaultPackages = c(old, DISS_FLOSS_PKGS), repos = r)
options(defaultPackages = DISS_FLOSS_PKGS, repos = r)
#lapply(list(DISS_FLOSS_PKGS), function() library)
library(RCurl)
library(digest)
library(jsonlite)
library(stringr)
library(XML)
library(plyr)
})
Any comments on this will be appreciated!
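A likely explanation, not confirmed in the post: sessionInfo() lives in the utils package, and the active options() line replaces defaultPackages with only the six add-on packages, so utils and the other standard packages are no longer attached at startup. A hedged sketch of a variant that appends to the defaults instead, as the commented-out lines in the original hint at, and leaves out the library() calls; extra_pkgs is an illustrative name:

```r
local({
  # Packages to attach at startup in addition to R's standard set.
  extra_pkgs <- c("RCurl", "digest", "jsonlite",
                  "stringr", "XML", "plyr")

  # Keep whatever R would attach by default (utils, stats, ...).
  old <- getOption("defaultPackages")

  # Set a CRAN mirror, as in the original file.
  r <- getOption("repos")
  r["CRAN"] <- "http://cran.us.r-project.org"

  # Append to the defaults rather than replacing them.
  options(defaultPackages = union(old, extra_pkgs), repos = r)
})
```

With this form the packages listed in extra_pkgs are attached after the standard ones when a session starts, provided they are installed.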
I've followed the rtest.java example code from the rJava installation (/usr/lib/R/site-library/rJava/jri/examples/rtest.java on Debian and derivatives) for building data.frames from Java arrays.
This works well for small data frames (~10000 rows); however, when I try to do this in anger (i.e. > 1000000 rows) it causes the Java runtime to segfault.
Oddly, I appear to be able to create the data.frame OK (making the usual rniPutXXXArray calls); however, when I come to save the data.frame (using an eval, after assigning the data.frame to an R symbol) the issue occurs.
I can see some debug output when I make calls to eval on the R engine; however, when I go via the low level interface (rniXXX) I get no debug output at all. Is there a way to switch on more debug output than I already have?
For what it's worth, here's the top of the segv message. I can of course provide more detail on request.
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f1be6259ea5, pid=6898, tid=139758087001856
#
# JRE version: 7.0_03-b21
# Java VM: OpenJDK 64-Bit Server VM (22.0-b10 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea7 2.1.3
# Distribution: Debian GNU/Linux unstable (sid), package 7u3-2.1.3-1
# Problematic frame:
# C [libR.so+0x117ea5] SET_VECTOR_ELT+0x11f5
...
Please ask on stats-rosuda-devel, including the actual code you're using. Note that with RNI calls you're responsible for protecting the objects. Unfortunately the example code skips that aspect, so what probably happens is that, due to the size of your objects, garbage collection occurs before you are done with the construction; some of the objects get collected, become invalid, and R crashes on you. If you want to be safe, protect the columns and then the generic vector you create out of them.
BTW: It is much safer to use the org.rosuda.REngine API instead of using RNI directly. It even provides a REXP.createDataFrame() method that does all the work for you.