On certain machines, loading packages on all cores eats up all available RAM, resulting in error 137, and my R session is killed. On my laptop (Mac) and on one Linux computer it works fine. On the Linux computer I actually want to run this on, a 32-core machine with 6 GB of RAM per core (192 GB in total), it does not. The sysadmin told me memory is limited on the compute nodes. However, as per my edit below, my memory requirements are not excessive by any stretch of the imagination.
How can I debug this and find out what is different? I am new to the parallel package.
Here is an example (it assumes the command install.packages(c("tidyverse", "OpenMx")) has been run in R under version 4.0.3):
I also note that it seems to be only true for the OpenMx and the mixtools packages. I excluded mixtools from the MWE because OpenMx is enough to generate the problem. tidyverse alone works fine.
A workaround I tried was to not load the packages on the cluster at all, but to evaluate .libPaths("~/R/x86_64-pc-linux-gnu-library/4.0/") in the expr body of clusterEvalQ and call functions through their namespace (e.g. OpenMx::vec) inside my own functions (sketched just before the edit below); that produced the same error. So I am stuck, because on two out of three machines it works fine, just not on the one I am supposed to use (a compute node).
.libPaths("~/R/x86_64-pc-linux-gnu-library/4.0/")
library(parallel)
num_cores <- detectCores()
cat("Number of cores found:")
print(num_cores)
working_mice <- makeCluster(num_cores)
clusterExport(working_mice, ls())
clusterEvalQ(working_mice, expr = {
  library("OpenMx")
  library("tidyverse")
})
Simply loading the packages seems to consume all available RAM and produces error 137. That is a problem because I need the libraries loaded on each available core, where their functions perform the actual work.
Subsequently I use DEoptim, but loading the packages is already enough to trigger the error.
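For reference, the namespace-only workaround mentioned above looked roughly like this (a sketch, not my actual code; get_openmx_version is just an illustrative placeholder):
clusterEvalQ(working_mice, expr = {
  # set the library path on every worker, but do not attach any packages
  .libPaths("~/R/x86_64-pc-linux-gnu-library/4.0/")
})
# worker functions then call package functions via :: instead of library()
get_openmx_version <- function() OpenMx::mxVersion()
clusterCall(working_mice, get_openmx_version)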
Edit
I have profiled the code using profmem and found that the part shown in the example asks for about 2 MB of memory, and the whole script I am trying to run for about 94.75 MB. I then also checked using my OS (Catalina) and caught the following processes, as seen on the screenshot.
None of these numbers strikes me as excessive, especially not on a node that has ~6 GB per CPU and 32 cores. Unless I am missing something major here.
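For what it's worth, the profiling was done roughly like this (a sketch; note that profmem only records allocations made in the master session, not in the worker processes):
library(parallel)
library(profmem)
p <- profmem({
  working_mice <- makeCluster(detectCores())
  clusterEvalQ(working_mice, library("OpenMx"))
  stopCluster(working_mice)
})
total(p)  # total bytes allocated by the profiled expression in this session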
I want to start by saying I'm not sure what is causing this issue for you. The following example may help you debug how much memory is being used by each child process.
Using mem_used from the pryr package will help you track how much RAM is used by an R session. The following shows the results of doing this on my local computer with 8 cores and 16 GB of RAM.
library(parallel)
library(tidyr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
num_cores <- detectCores()
cat("Number of cores found:")
#> Number of cores found:
print(num_cores)
#> [1] 8
working_mice <- makeCluster(num_cores)
clusterExport(working_mice, ls())
clust_memory <- clusterEvalQ(working_mice, expr = {
  start <- pryr::mem_used()
  library("OpenMx")
  mid <- pryr::mem_used()
  library("tidyverse")
  end <- pryr::mem_used()
  data.frame(mem_state = factor(c("start", "mid", "end"), levels = c("start", "mid", "end")),
             mem_used = c(start, mid, end), stringsAsFactors = FALSE)
})
to_GB <- function(x) paste(x/1e9, "GB")
tibble(
  clust_indx = seq_len(num_cores),
  mem = clust_memory
) %>%
  unnest(mem) %>%
  ggplot(aes(mem_state, mem_used, group = clust_indx)) +
  geom_line(position = 'stack') +
  scale_y_continuous(labels = to_GB)  # approximately
As you can see, each process uses about the same amount of RAM, ~160 MB, on my machine. According to pryr::mem_used(), the amount of RAM used per core is always the same after each library step.
In whatever environment you are working in, I'd recommend you do this on just 10 workers and see if it is using a reasonable amount of memory.
I also confirmed with htop that the child processes are each using only about 4.5 GB of virtual memory and approximately a similar amount of RAM.
The only thing I can think of that may be the issue is clusterExport(working_mice, ls()). This would only be an issue if you are not doing this in a fresh R session. For example, if you had 5 GB of data sitting in your global environment, each socket would be getting a copy.
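A quick way to check what clusterExport(working_mice, ls()) is about to copy is to total up the sizes of the objects in the calling environment (a sketch, run in the session that creates the cluster):
# size in bytes of every object ls() would export, largest first
export_sizes <- sort(sapply(ls(), function(nm) object.size(get(nm))), decreasing = TRUE)
export_sizes
sum(export_sizes) / 1e6  # approximate total MB copied to every worker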
Related
R's memory.size() is Windows-only. For other functions (such as windows()) the help page gives pointers to non-Windows counterparts.
But for memory.size() I could find no such pointers.
So here is my question: is there a function that does the same as memory.size(), but on Linux?
I think that this should be handled by the operating system. There is no built-in limit that I know of; if necessary, R will use all the memory that it can get.
To obtain information on the total and/or on the available memory in linux, you can try
system('grep MemTotal /proc/meminfo')
or
system('free -m')
or
system('lshw -class memory')
The last command will complain that you should run it as super-user, and it will warn that the output may not be accurate; but in my experience it still provides fairly useful output.
To obtain information on the memory usage of a running R script one could either monitor the currently used resources by starting top in a separate terminal, or use, e.g., the following system call from within the R script:
system(paste0("cat /proc/",Sys.getpid(),"/status | grep VmSize"))
Hope this helps.
Using the pryr library:
library("pryr")
mem_used()
# 27.9 MB
x <- mem_used()
x
# 27.9 MB
class(x)
# [1] "bytes"
The result is the same as in @RHertel's answer, but with pryr we can assign the result to a variable.
system('grep MemTotal /proc/meminfo')
# MemTotal: 263844272 kB
To assign the result of the system call to a variable, use intern = TRUE:
x <- system('grep MemTotal /proc/meminfo', intern = TRUE)
x
# [1] "MemTotal: 263844272 kB"
class(x)
# [1] "character"
Yes, memory.size() and memory.limit() do not work on Linux/Unix.
I can suggest the unix package.
To increase the memory limit on Linux:
install.packages("unix")
library(unix)
rlimit_as(1e12)  # raises the address-space limit to 1e12 bytes (roughly 1 TB)
You can also check the current limits with this:
rlimit_all()
For detailed information:
https://rdrr.io/cran/unix/man/rlimit.html
You can also find further info here:
limiting memory usage in R under linux
I have a project that downloads ~20 million PDFs, multithreaded, on an EC2 instance. I'm most proficient in R, and it's a one-off, so my initial assessment was that the time savings from bash scripting wouldn't be enough to justify the time spent on the learning curve. So I decided just to call curl from within an R script. The instance is a c4.8xlarge running RStudio Server on Ubuntu, with 36 cores and 60 GB of memory.
With every method I've tried, it runs up to the maximum RAM fairly quickly. It runs all right, but I'm concerned that swapping memory is slowing it down. curl_download and curl_fetch_disk work much more quickly than the native download.file function (one PDF every 0.05 seconds versus 0.2), but both run up to maximum memory extremely quickly and then seem to populate the directory with empty files. With the native function I was dealing with the memory problem by suppressing output through copious use of try() and invisible(). That doesn't seem to help at all with the curl package.
I have three related questions if anyone could help me with them.
(1) Is my understanding of how memory is utilized correct in that needlessly swapping memory would cause the script to slow down?
(2) curl_fetch_disk is supposed to be writing direct to disk, does anyone have any idea as to why it would be using so much memory?
(3) Is there any good way to do this in R or am I just better off learning some bash scripting?
Current method with curl_download
library(curl)

getfile_sweep.fun <- function(url, filename) {
  # download one file, swallowing errors and suppressing output
  invisible(
    try(
      curl_download(url, destfile = filename, quiet = TRUE)
    )
  )
}
Previous method with native download.file
getfile_sweep.fun <- function(url, filename) {
  # download one file via the system curl binary, swallowing errors and suppressing output
  invisible(
    try(
      download.file(url, destfile = filename, quiet = TRUE, method = "curl")
    )
  )
}
parLapply loop
library(parallel)

len <- nrow(url_sweep.df)

# indices at which workers should call gc() to release memory periodically
gc.vec <- unlist(lapply(0:35, function(x) x + seq(from = 100, to = len, by = 1000)))
gc.vec <- sort(gc.vec)

start.time <- Sys.time()
ptm <- proc.time()

cl <- makeCluster(detectCores() - 1, type = "FORK")
invisible(
  parLapply(cl, 1:len, function(x) {
    invisible(
      try(
        getfile_sweep.fun(
          url      = url_sweep.df[x, "url"],
          filename = url_sweep.df[x, "filename"]
        )
      )
    )
    if (x %in% gc.vec) {
      gc()
    }
  })
)
stopCluster(cl)
Sweep.time <- proc.time() - ptm
Sample of data -
Sample of url_sweep.df:
https://www.dropbox.com/s/anldby6tcxjwazc/url_sweep_sample.rds?dl=0
Sample of existing.filenames:
https://www.dropbox.com/s/0n0phz4h5925qk6/existing_filenames_sample.rds?dl=0
Notes:
1- I do not have such a powerful system available to me, so I cannot reproduce every issue mentioned.
2- All the comments are summarized here.
3- It was stated that the machine received an upgrade (EBS to provisioned SSD with 6000 IOPS/sec); however, the issue persists.
Possible issues:
A- If memory swapping starts to happen, you are not purely working with RAM anymore, and I think R would have a harder and harder time finding available contiguous memory spaces.
B- The workload and the time it takes to finish it, compared to the number of cores.
C- The parallel setup and the fork cluster.
Possible solutions and troubleshooting:
B- Limiting memory usage (see the sketch after this list)
C- Limiting the number of cores
D- If the code runs fine on a smaller machine like a personal desktop, then the issue lies in how the parallel usage is set up, or something with the fork cluster.
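For item B, one option is the unix package mentioned in an earlier answer; the 8 GB figure below is purely illustrative (a sketch):
library(unix)
# cap the address space of the master R process; FORK workers created
# afterwards should inherit the limit (8e9 bytes is an arbitrary example value)
rlimit_as(8e9)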
Things to still try:
A- In general, running jobs in parallel incurs overhead, and the more cores you have, the more you will see its effects. When you pass a lot of jobs that each take very little time (think less than a second), this results in increased overhead from constantly pushing jobs to the workers. Try limiting the cores to 8, just like your desktop, and run your code (a sketch follows at the end of this list): does it run fine? If yes, increase the workload as you increase the number of cores available to the program.
Start at the lower end of the spectrum for the number of cores and the amount of RAM, increase them as you increase the workload, and see where things fall over.
B- I will post a summary about parallelism in R; this might help you catch something we have missed.
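A minimal sketch of the suggestion in item A above: cap the worker count instead of using every core, then scale it up while watching memory.
library(parallel)
n_workers <- min(8L, detectCores() - 1L)  # 8 mirrors the desktop that worked fine
cl <- makeCluster(n_workers, type = "FORK")
# ... run the parLapply() sweep as before ...
stopCluster(cl)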
What worked:
Limiting the number of cores fixed the issue. As mentioned by the OP, they also made other changes to the code; however, I do not have access to them.
You can use the async interface instead. Short example below:
# callback run for each completed request: write the response body to a
# file named after the last path component of the URL
cb_done <- function(resp) {
  filename <- basename(urltools::path(resp$url))
  writeBin(resp$content, filename)
}
pool <- curl::new_pool()
# queue all downloads on the pool, then run them concurrently
for (u in urls) curl::curl_fetch_multi(u, pool = pool, done = cb_done)
curl::multi_run(pool = pool)
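If you also want to be notified when a request fails (for example, a connection error), curl_fetch_multi() accepts a fail callback as well (a sketch):
# log the error message for each request that could not be completed
cb_fail <- function(msg) message("download failed: ", msg)
for (u in urls) curl::curl_fetch_multi(u, pool = pool, done = cb_done, fail = cb_fail)
curl::multi_run(pool = pool)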
Is it possible to use the package R.cache together with the package parallel?
I'm doing some computations that are very time-consuming and I would like to use a cache while going parallel.
The parallel jobs are independent of each other. Yet I cannot load the R.cache package on the cluster workers.
library(parallel)
library(R.cache)
cl <- makeCluster(getOption("cl.cores", 2))
clusterExport(cl,varlist = ls())
clusterEvalQ(cl, library(R.cache))
## Error in checkForRemoteErrors(lapply(cl, recvResult)) :
## 2 nodes produced errors; first error: there is no package called ‘R.cache’
Nothing particular about R.cache here. You need to make sure R.cache is installed on each of the compute nodes.
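A quick way to verify that (a sketch) is to ask every worker whether the package can be found and which library paths it is searching:
clusterEvalQ(cl, list(
  has_R.cache = requireNamespace("R.cache", quietly = TRUE),
  lib_paths   = .libPaths()
))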
And yes, R.cache works and is designed to work on concurrent systems. See also https://github.com/HenrikBengtsson/R.cache/issues/18
(I'm the author of R.cache).
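Once R.cache is installed on every node, a worker function can use the usual loadCache()/saveCache() pattern (a sketch; the key structure and the placeholder computation are just illustrative):
worker_fun <- function(x) {
  key <- list(task = "my_slow_task", x = x)   # illustrative cache key
  res <- R.cache::loadCache(key)
  if (is.null(res)) {
    res <- sum(sqrt(seq_len(1e6))) + x        # placeholder for the expensive computation
    R.cache::saveCache(res, key = key)
  }
  res
}
parSapply(cl, 1:10, worker_fun)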
I am running the script as follows:
library(ff)
library(ffbase)
setwd("D:/My_package/Personal/R/reading")
x<-cbind(rnorm(1:100000000),rnorm(1:100000000),1:100000000)
system.time(write.csv2(x,"test.csv",row.names=FALSE))
#make ffdf object with minimal RAM overheads
system.time(x <- read.csv2.ffdf(file="test.csv", header=TRUE, first.rows=1000, next.rows=10000,levels=NULL))
#make increase by 5 of the column#1 of ffdf object 'x' by the chunk approach
chunk_size<-100
m<-numeric(chunk_size)
#list of chunks
chunks <- chunk(x, length.out=chunk_size)
#FOR loop to increase column#1 by 5
system.time(
for(i in seq_along(chunks)){
x[chunks[[i]],][[1]]<-x[chunks[[i]],][[1]]+5
}
)
# output of x
print(x)
#clear RAM used
rm(list = ls(all = TRUE))
gc()
#another option to run garbage collector explicitly.
gc(reset=TRUE)
The issue is that I still see some RAM unreleased even though all objects and functions have been swept away from the current environment.
Moreover, the next run of the script increases the portion of RAM left unreleased, as if it were cumulative (according to Task Manager on Win7 64-bit).
However, if I make a non-ffdf object and sweep it away, the memory is released fine after rm() and gc().
So my guess is that the unreleased RAM is connected with the specifics of ffdf objects and the ff package.
The only effective way to clear the RAM is to quit the current R session and start it again, but that is not very convenient.
I have scanned a bunch of posts about memory clean-up, including this one:
Tricks to manage the available memory in an R session
But I have not found a clear explanation of this situation or an effective way to overcome it (without restarting the R session).
I would be very grateful for your comments.
I have a 15.4 GB R data.table object with 29 million records and 135 variables. My system and R info are as follows:
Windows 7 x64 on an x86_64 machine with 16 GB RAM; "R version 3.1.1 (2014-07-10)" on "x86_64-w64-mingw32".
I get the following memory allocation error (see image)
I set my memory limits as follows:
#memory.limit(size=7000000)
#Change memory.limit to 40GB when using ff library
memory.limit(size=40000)
My questions are the following:
Should I change the memory limit to 7 TB?
Should I break the file into chunks and process it that way?
Any other suggestions?
Try to profile your code to identify which statements cause the "waste of RAM":
# install.packages("pryr")
library(pryr) # for memory debugging
memory.size(max = TRUE) # print max memory used so far (works only with MS Windows!)
mem_used()
gc(verbose=TRUE) # show internal memory stuff (see help for more)
# start profiling your code (with memory profiling enabled)
Rprof(pfile <- "rprof.log", memory.profiling = TRUE)
# !!! Your code goes here
# Print memory statistics within your code whereever you think it is sensible
memory.size(max = TRUE)
mem_used()
gc(verbose=TRUE)
# stop profiling your code
Rprof(NULL)
summaryRprof(pfile,memory="both") # show the memory consumption profile
Then evaluate the memory consumption profile...
Since your code stops with an "out of memory" exception, you should reduce the input data to an amount that makes your code workable and use this input for memory profiling...
You could try the ff package. It works well with on-disk data.
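For example, a minimal sketch (assuming the data sit in a file called big_file.csv; the chunk sizes are illustrative):
library(ff)
# read the file chunk-wise into an on-disk ffdf instead of an in-memory data.frame
x <- read.csv.ffdf(file = "big_file.csv", header = TRUE,
                   first.rows = 10000, next.rows = 100000)
dim(x)  # dimensions are available without holding all rows in RAM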