Error in serialize(data, node$con) : error writing to connection

I'm currently trying to run some code that implements parallel processing, but I'm running into this error:
Error: cannot allocate vector of size 2.1 Gb
Execution halted
Error in serialize(data, node$con) : error writing to connection
Calls: %dopar% ... postNode -> sendData -> sendData.SOCKnode -> serialize
Execution halted
Warning message:
system call failed: Cannot allocate memory
Error in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
Execution halted
I can't seem to figure out why there's a memory problem. If I take the code out of the foreach loop or change the foreach to a for loop, it works perfectly fine, so I don't think it has to do with the contents of the code itself, but rather something about the parallelization. Also, it seems to throw the error pretty soon after the code starts executing. Any ideas why this might be happening? Here's a look at my code:
list_storer <- list()
list_storer <- foreach(bt = 2:bootreps, .combine = list, .multicombine = TRUE) %dopar% {
  ur <- sample.int(nrow(dailydatyr), nrow(dailydatyr), replace = TRUE)
  ddyr_boot <- dailydatyr[ur, ]
  weightvar <- ddyr_boot[, c('ymd1_IssueD', 'MatD_ymd2')]
  weightvar <- abs(weightvar)
  x <- DM[ur, ]
  y <- log(ddyr_boot$dirtyprice2 / ddyr_boot$dirtyprice1)
  weightings <- rep(1, nrow(ddyr_boot))
  weightings <- weightings / (ddyr_boot$datenum2 - ddyr_boot$datenum1)
  treg <- repeatsales(y, x, maxdailyreturn, weightings, weightvar)
  zbtcol <- 0
  cnst <- NULL
  if (is.null(dums) == FALSE) {
    zbtcol <- length(treg) - ncol(x)
    cnst <- paste("tbs(", dums, ")_", (middleyr), sep = "")
    if (is.null(interactVar) == FALSE) {
      ninteract <- (length(treg) - ncol(x) - length(dums)) / length(dums)
      interact <- unlist(lapply(cnst, function(xla) paste(xla, "*c", c(1:ninteract), sep = "")))
      cnst <- c(cnst, interact)
    }
  }
  tregtotal <- tregtotal + (is.na(treg) == FALSE)
  treg[is.na(treg) == TRUE] <- 0
  list_storer[[length(list_storer) + 1]] <- treg
}
stopImplicitCluster(cl)

Parallelisation as done by foreach is a space-versus-time trade-off: we get faster execution at the expense of higher memory usage. The reason for the higher memory usage is that several R processes are started, and each of them needs its own memory to hold the data necessary for the calculation. Currently foreach is using an implicit PSOCK cluster. One way to solve this is to make the cluster creation explicit, using a lower number of processes. How low depends on the amount of memory you have and on the memory requirements of each job:
n <- parallel::detectCores()/2 # experiment!
cl <- parallel::makeCluster(n)
doParallel::registerDoParallel(cl)
<foreach>
parallel::stopCluster(cl)
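For concreteness, here is a minimal self-contained sketch of that pattern; the loop body is a stand-in that averages a bootstrap resample of placeholder data, not the repeatsales() bootstrap from the question.
library(foreach)
library(doParallel)

n <- max(1, parallel::detectCores() %/% 2)   # experiment: fewer workers use less memory
cl <- parallel::makeCluster(n)
doParallel::registerDoParallel(cl)

dat <- rnorm(1e4)   # placeholder data
bootreps <- 100

# Each task returns a small object; foreach combines the results for you
# (here with c()), so there is no need to assign into list_storer inside the loop.
boot_means <- foreach(bt = 1:bootreps, .combine = c) %dopar% {
  idx <- sample.int(length(dat), length(dat), replace = TRUE)
  mean(dat[idx])
}

parallel::stopCluster(cl)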

Related

Error: no more error handlers available (recursive errors?); invoking 'abort' restart problem in package

I am trying to use an R package I found, Bayesian Macroeconomics in R (BMR), and I stumbled into the following error when trying to estimate the model by MCMC:
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Error in tryCatch(evalq((function (parameters) : bad value
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Error in parent.frame() :
'rho' must be an environment not promise: detected in C-level eval
Error during wrapup: R_Reprotect: only 0 protected items, can't reprotect index 33
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
After this error, R aborts the current session and I have to start a new one. I tried to look into the package's code to see if the problem was there, but I am not good enough to find it. I searched on Google and some people have solved this problem using gc(), but it didn't work for me. This issue has already been reported in the package's GitHub repository, but it hasn't received any answers. I am really trying to work with DSGE models without using MATLAB/Octave, and this package was looking pretty good for that.
The code I am trying to replicate is this one:
rm(list=ls())
library(BMR)
source("nkm_model.R")
#
data(BMRVARData)
dsgedata <- USMacroData[24:211,-c(1,3)]
dsgedata <- as.matrix(dsgedata)
for(i in 1:2){
  dsgedata[,i] <- dsgedata[,i] - mean(dsgedata[,i])
}
#
obj <- new(dsgevar_gensys)
obj$set_model_fn(nkm_model_simple)
x <- c(1)
obj$eval_model(x)
#
lrem_obj = obj$lrem
lrem_obj$solve()
lrem_obj$shocks_cov <- matrix(c(1,0,0,0.125),2,2,byrow=TRUE)
sim_data <- lrem_obj$simulate(200,800)$sim_vals
sim_data <- cbind(sim_data[,3],sim_data[,5])
#
prior_pars <- cbind(c(1.0),
                    c(0.05))
prior_form <- c(1)
obj$set_prior(prior_form, prior_pars)
#
par_bounds <- cbind(c(-Inf),
                    c( Inf))
opt_bounds <- cbind(c(0.7),
                    c(3.0))
obj$set_bounds(opt_bounds[,1], opt_bounds[,2])
obj$opt_initial_lb <- opt_bounds[,1]
obj$opt_initial_ub <- opt_bounds[,2]
#
cons_term <- TRUE
p <- 1
lambda <- 1.0
obj$build(sim_data,cons_term,p,lambda)
mode_res <- obj$estim_mode(x,TRUE)
mode_check(obj,mode_res$mode_vals,25,1,"eta")
#
obj$mcmc_initial_lb <- opt_bounds[,1]
obj$mcmc_initial_ub <- opt_bounds[,2]
obj$estim_mcmc(x,50,100,100) # error here
var_names <- c("Output Gap","Output","Inflation","Natural Int","Nominal Int","Labour Supply",
               "Technology","MonetaryPolicy")
plot(obj,par_names="eta",save=FALSE)
IRF(obj,20,var_names=colnames(dsgedata),save=FALSE)
forecast(obj,10,back_data=10)
states(obj)
The error occurs at this line:
obj$estim_mcmc(x,50,100,100)

doSNOW and Foreach loop (R) on cluster?

I am using a cluster to run a foreach loop in parallel, using doSNOW. The loop works on my desktop, but I receive this error when running on the cluster:
Execution halted
Error in unserialize(node$con) : error reading from connection
Calls: local ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
The loop is rather large, so I have just provided a very basic sample here (I do not believe the error is in the loop, as it works on the desktop).
library(sp)
library(raster)
library(fields)
library(tidyr)
library(dplyr)
library(sphereplot)
library(dismo)
library(doSNOW)
library(parallel)
cores <- (detectCores()/2)/2
print(cores)
cl <- makeCluster(cores, type = "SOCK", outfile = "")
registerDoSNOW(cl)
FossilClimCoV <- foreach(i = 0:5, .combine = "rbind",
                         .packages = c("dplyr", "dismo", "sp", "raster",
                                       "fields", "tidyr", "sphereplot",
                                       "doSNOW", "parallel")) %dopar% {
  print(i)
  FossilTemp <- Fossils %>% dplyr::filter(Age == i)
  if (nrow(FossilTemp) > 0) {
    # BULK removed for ease
    return(FossilTemp1)
  }
}
I'm not sure how to fix this error. I don't understand why it will not work on the cluster, but will on my desktop.
EDIT 1
I have now resolved this large error by changing from a doSNOW backend to doParallel.
library(doParallel)
registerDoParallel(cores=3)
*foreach loop*
However, I now have a new error:
Calls: %dopar% -> <Anonymous>
Execution halted
If I change .errorhandling to "remove", the foreach loop always returns an empty vector.
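For reference, foreach's .errorhandling argument can also be set to "pass", which keeps each task's error object in the result instead of silently dropping failed tasks; that can help explain why "remove" leaves an empty result. Below is a minimal doParallel sketch with a placeholder loop body, not the climate-covariate code from the question.
library(foreach)
library(doParallel)

registerDoParallel(cores = 3)

# .errorhandling = "pass" returns the workers' error objects instead of dropping them
res <- foreach(i = 0:5, .errorhandling = "pass") %dopar% {
  if (i == 3) stop("simulated worker failure")   # placeholder failure
  i^2
}

# Failed tasks show up as error objects in the corresponding list slots
sapply(res, inherits, what = "error")

stopImplicitCluster()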

Parallel processing stopped working with error: object 'mcinteractive' not found

For a long time I've been successfully running a program which uses parallel processing. A couple of days ago the code stopped working with this error message:
"Error in get("mcinteractive", pkg) : object 'mcinteractive' not
found
traceback()
8: get("mcinteractive", pkg)
7: .customized_mcparallel({
result <- mclapply(X, function(...) {
res <- FUN(...)
writeBin(1L, progressFifo)
return(res)
}, ..., mc.cores = mc.cores, mc.preschedule = mc.preschedule,
mc.set.seed = mc.set.seed, mc.cleanup = mc.cleanup,
mc.allow.recursive = mc.allow.recursive)
if ("try-error" %in% sapply(result, class)) {
writeBin(-1L, progressFifo)
}
close(progressFifo)
result
})
6: pbmclapply(1:N, FUN = function(i) {
max_score = max(scores[i, ])
topLabels = names(scores[i, scores[i, ] >= max_score -
fine.tune.thres])
if (length(topLabels) == 0) {
return(names(which.max(scores[i, ])))
}
(I have more traceback if you are interested, but I think it mainly belongs to the "surrounding" code and is not so interesting for the error per se. Tell me if you need it and I'll make an edit!)
I do not know anything about parallel processing, and I haven't been able to understand the issue by digging into the code. From what I've understood, parallel::mcparallel is a function containing the argument mcinteractive, for which you can choose TRUE or FALSE. Earlier I got the tip to decrease the number of cores used in the processing. Before, I used 16 cores without any issues. After the error started occurring I tried setting the number of cores to both 8 and 1, with the same result. If it is some memory problem I guess I'm in the wrong forum, sorry! But I only experience problems when using RStudio, which is why I'm writing here. The only other thing I can think of that might be related is that my processing (through RStudio) sometimes gets stuck; the only thing I have found is that the RAM is full and I have to restart the computer, after which the processing works as usual again. However, this does not help with the new error when using parallel computation.
Does anyone recognize this issue and have any lead on what could be the cause? Is it the code, the package, RStudio, or my computer? Any checks I can run? :)
Edit:
Here is a short version of the errors I saw while digging into the code after changing pbmclapply to mclapply.
> packageVersion("parallel")
[1] ‘3.4.4’
> labels = parallel::pbmclapply(1:N, FUN = function(i) {
. . .
+ }, mc.cores = numCores)
Error: 'pbmclapply' is not an exported object from 'namespace:parallel'
> labels = pbmcapply::pbmclapply(1:N, FUN = function(i) {
. . .
+ }, mc.cores = numCores)
Error in get("mcinteractive", pkg) : object 'mcinteractive' not found
> labels = parallel::mclapply(1:N, FUN = function(i) {
. . .
+ }, mc.cores = numCores)
Warning message:
In parallel::mclapply(1:N, FUN = function(i) { :
all scheduled cores encountered errors in user code
#inside mclapply
> job.res <- lapply(seq_len(cores), inner.do)
Error in mcfork() : could not find function "mcfork"
#inside inner.do
> f <- parallel::mcfork()
Error: 'mcfork' is not an exported object from 'namespace:parallel'
Edit 2: I got a bit further in my error searching.
I had to add a triple colon before a lot of the parallel functions, meaning that I'm calling internal (non-exported) functions, which in turn should mean that parallel is not being found properly on my search path(?)
parallel:::mcfork()
parallel:::mc.advance.stream()
parallel:::selectChildren()
parallel:::isChild()
#Had to change .check_ncores(cores) to
parallel::detectCores()
This problem occurs because pbmclapply was updated and now only works with R > 3.5; updating R solved my problem.
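As a quick sanity check before digging into parallel internals, it is worth confirming the versions involved. A small illustrative snippet; it assumes pbmcapply is installed, and the R > 3.5 requirement comes from the statement above.
# Check the versions before debugging further
getRversion()                 # newer pbmcapply releases need R > 3.5, per the answer above
packageVersion("pbmcapply")
packageVersion("parallel")    # 'parallel' ships with R, so this tracks the R version

# mcfork() and friends are internal (non-exported) functions of 'parallel',
# which is why parallel::mcfork() fails with "not an exported object"
exists("mcfork", envir = asNamespace("parallel"))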

using package snow's parRapply: argument missing error

I want to find documents whose similarity with other documents is larger than a given value (0.1), by splitting the documents into blocks.
library(tm)
data("crude")
sample.dtm <- DocumentTermMatrix(
  crude, control = list(
    weighting = function(x) weightTfIdf(x, normalize = FALSE),
    stopwords = TRUE
  )
)
step = 5
n = nrow(sample.dtm)
block = n %/% step
start = (c(1:block)-1)*step+1
end = start+step-1
j = unlist(lapply(1:(block-1),function(x) rep(((x+1):block),times=1)))
i = unlist(lapply(1:block,function(x) rep(x,times=(block-x))))
ij <- cbind(i,j)
library(skmeans)
getdocs <- function(k){
  ci <- c(start[k[[1]]]:end[k[[1]]])
  cj <- c(start[k[[2]]]:end[k[[2]]])
  combi <- sample.dtm[ci]
  combj <- sample.dtm[cj]
  rownames(combi) <- ci
  rownames(combj) <- cj
  comb <- c(combi, combj)
  sim <- 1 - skmeans_xdist(comb)
  cat("Block", k[[1]], "with Block", k[[2]], "\n")
  flush.console()
  tri.sim <- upper.tri(sim, diag = FALSE)
  results <- tri.sim & sim > 0.1
  docs <- apply(results, 1, function(x) length(x[x == TRUE]))
  docnames <- names(docs)[docs > 0]
  gc()
  return(docnames)
}
It works well when using apply:
system.time(rmdocs<-apply(ij,1,getdocs))
When using parRapply:
library(snow)
library(skmeans)
cl<-makeCluster(2)
clusterExport(cl,list("getdocs","sample.dtm","start","end"))
system.time(rmdocs<-parRapply(cl,ij,getdocs))
Error:
Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: attempt to set 'rownames' on an object with no dimensions
Timing stopped at: 0.01 0 0.04
It seems that sample.dtm couldn't be used in parRapply. I'm confused. Can anyone help me? Thanks!
In addition to exporting objects, you need to load the necessary packages on the cluster workers. In your case, the result of not doing so is that there isn't a dimnames method defined for "DocumentTermMatrix" objects, causing rownames<- to fail.
You can load packages on the cluster workers with the clusterEvalQ function:
clusterEvalQ(cl, { library(tm); library(skmeans) })
After doing that, rownames(combi)<-ci will work correctly.
Also, if you want to see the output from cat, you should use the makeCluster outfile argument:
cl <- makeCluster(2, outfile='')
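Putting the pieces together, the corrected worker setup might look like the sketch below; it assumes getdocs, sample.dtm, start, end and ij are already defined in the master session as in the question.
library(snow)

cl <- makeCluster(2, outfile = "")   # outfile="" so the cat() progress output is visible

# Ship the required objects to the workers, then load the packages on each worker
clusterExport(cl, c("getdocs", "sample.dtm", "start", "end"))
clusterEvalQ(cl, { library(tm); library(skmeans) })

rmdocs <- parRapply(cl, ij, getdocs)

stopCluster(cl)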

Getting a random internal selfref error in data.table for R

I love data.table, it's fast and intuitive, what could be better?
Alas, here's my problem: when referring to a data.table within a foreach() loop (using the doMC implementation) I will occasionally get the following error:
EXAMPLE IN APPENDIX
Error in { :
Internal error: .internal.selfref prot is not itself an extptr
One of the annoying problems here is that I can't get it to reproduce with any consistency, but it will happen during some long (several hrs) tasks, so I want to make sure it never happens, if possible.
Since I refer to the same data.table, DT, in each loop, I tried running the following at the beginning of each loop:
setattr(DT,".internal.selfref",NULL)
...to remove the invalid/corrupted self ref attribute. This works and the internal selfref error no longer occurs. It's a workaround, though.
Any ideas for addressing the root problem?
Many thanks for any help!
Eric
Appendix: Abbreviated R Session Info to confirm latest versions:
R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
other attached packages:
[1] data.table_1.8.8 doMC_1.3.0
Example using simulated data -- you may have to run the history() function many times (like, hundreds) to get the error:
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Load packages and Prepare Data
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
require(data.table)
##this is the package we use for multicore
require(doMC)
##register n-2 of your machine's cores
registerDoMC(multicore:::detectCores()-2)
## Build simulated data
value.a <- runif(500,0,1)
value.b <- 1-value.a
value <- c(value.a,value.b)
answer.opt <- c(rep("a",500),rep("b",500))
answer.id <- rep( 6000:6499 , 2)
question.id <- rep( sample(c(1001,1010,1041,1121,1124),500,replace=TRUE) ,2)
date <- rep( (Sys.Date() - sample.int(150, size=500, replace=TRUE)) , 2)
user.id <- rep( sample(250:350, size=500, replace=TRUE) ,2)
condition <- substr(as.character(user.id),1,1)
condition[which(condition=="2")] <- "x"
condition[which(condition=="3")] <- "y"
##Put everything in a data.table
DT.full <- data.table(user.id = user.id,
                      answer.opt = answer.opt,
                      question.id = question.id,
                      date = date,
                      answer.id = answer.id,
                      condition = condition,
                      value = value)
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Daily Aggregation Function
##
##a basic function that aggregates all the values from
##all users for every question on a given day:
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
each.day <- function(val.date){
  DT <- DT.full[ date < val.date ]
  #count the number of updates per user (for weighting)
  setkey(DT, question.id, user.id)
  DT <- DT[ DT[answer.opt=="a",length(value),by="question.id,user.id"] ]
  setnames(DT, "V1", "freq")
  #retain only the most recent value from each user on each question
  setkey(DT, question.id, user.id, answer.id)
  DT <- DT[ DT[ ,answer.id == max(answer.id), by="question.id,user.id", ][[3]] ]
  #now get a weighted mean (with freq) of the value for each question
  records <- lapply(unique(DT$question.id), function(q.id) {
    DT <- DT[ question.id == q.id ]
    probs <- DT[ ,weighted.mean(value,freq), by="answer.opt" ]
    return(data.table(q.id = rep(q.id,nrow(probs)),
                      ans.opt = probs$answer.opt,
                      date = rep(val.date,nrow(probs)),
                      value = probs$V1))
  })
  return(do.call("rbind",records))
}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## foreach History Function
##
##to aggregate accross many days quickly
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
history <- function(start, end){
  #define a sequence of dates
  date.seq <- seq(as.Date(start),as.Date(end),by="day")
  #now run a foreach to get the history for each date
  hist <- foreach(day = date.seq, .combine = "rbind") %dopar% {
    #setattr(DT,".internal.selfref",NULL) #resolves occasional internal selfref error
    each.day(val.date = day)
  }
}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Examples
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##aggregate only one day
each.day(val.date = "2012-12-13")
##generate a history
hist.example <- history (start = "2012-11-01", end = Sys.Date())
Thanks for reporting and all the help in finding it! Now fixed in v1.8.11. From NEWS:
In long running computations where data.table is called many times
repetitively, the following error could sometimes occur, #2647 :
Internal error: .internal.selfref prot is not itself an extptr
Fixed. Thanks to theEricStone, StevieP and JasonB for (difficult) reproducible examples.
Possibly related is a memory leak in grouping, which is also now fixed.
Long outstanding (usually small) memory leak in grouping fixed, #2648. When the last group is smaller than the largest group, the difference in those sizes was not being released. Also in non-trivial aggregations where each group returns a different number of rows. Most users run a grouping query once and will never have noticed these, but anyone looping calls to grouping (such as when running in parallel, or benchmarking) may have suffered. Tests added.
Thanks to many including vc273 and Y T.
Memory leak in data.table grouped assignment by reference
Slow memory leak in data.table when returning named lists in j (trying to reshape a data.table)
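(For anyone hitting this now, a quick check that the installed data.table actually includes the fix is worth doing first; a minimal check, assuming you install from CRAN:)
packageVersion("data.table")   # the selfref fix landed in 1.8.11, per the NEWS entry above

# Upgrade if you are still on 1.8.8 or earlier
if (packageVersion("data.table") < "1.8.11") install.packages("data.table")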
A similar problem has been plaguing me for months. Perhaps we can see a pattern by putting our experiences together.
I've been waiting to post till I could create a reproducible example. Not possible thus far.
The bug doesn't happen in the same code location. In the past I've been able to avoid the error often by merely rerunning the exact same code. Other times I've reformulated an expression and rerun with success. In any case I'm pretty sure that these errors are truly internal to data.table.
I've saved the last 4 error messages in attempt to detect a pattern (pasted below).
---------------------------------------------------
[1] "err msg: location 1"
Error in selfrefok(x) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: my.fun1 ... $<- -> $<-.data.table -> [<-.data.table -> selfrefok
Execution halted
---------------------------------------------------
[1] "err msg: location 1"
Error in alloc.col(newx) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: my.fun1 -> $<- -> $<-.data.table -> copy -> alloc.col
Execution halted
---------------------------------------------------
[1] "err msg: location 2"
Error in shallow(x) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: print ... do.call -> lapply -> as.list -> as.list.data.table -> shallow
Execution halted
---------------------------------------------------
[1] "err msg: location 3"
Error in shallow(x) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: calc.book.summ ... .rbind.data.table -> as.list -> as.list.data.table -> shallow
Execution halted
Another similarity to the above example: I'm passing data.tables around among parallel threads, so they are being serialized/unserialized.
I will try the 'setattr' fix mentioned above.
hope this helps and thanks, jason
Here is a simplification of one of the code segments that seems to generate this error about 1 out of every 50-100k times it is run (thanks @MatthewDowle, by the way; data.table has been most useful):
require(data.table)
require(xts)

book <- data.frame(name = '',
                   s = 0,
                   Value = 0.0,
                   x = 0.0,
                   Qty = 0)[0, ]

for (thing in list(1,2,3,4,5)) {
  tmp <- xts(1:5, order.by = make.index.unique(rep(Sys.time(), 5)))
  colnames(tmp) <- 'A'
  tmp <- cbind(coredata(tmp[nrow(tmp), 'A']),
               coredata(colSums(tmp[, 'A'])),
               coredata(tmp[nrow(tmp), 'A']))
  book <- rbind(book,
                data.table(name = 'ALPHA',
                           s = 0*NA,
                           Value = tmp[1],
                           x = tmp[2],
                           Qty = tmp[3]))
}
something like this seems to be the cause of this error:
Error in shallow(x) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: my.function ... .rbind.data.table -> as.list -> as.list.data.table -> shallow
Execution halted
For the sake of reproducing the error, I have a script for you guys to pore over to figure out where this bug is coming from. The error reads:
Error in { :
task 96 failed - "Internal error: .internal.selfref prot is not itself an extptr"
Calls: apply ... system.time -> apply -> FUN -> %dopar% -> <Anonymous>
Execution halted
and I'm using doParallel to register my backend for foreach.
Context: I'm testing out classifiers on the MNIST hand-written digit dataset. You can get the data from me via
wget -nc https://www.dropbox.com/s/xr4i8gy11ed8bsh/digit_id_data_and_benchmarks.zip
just be sure to modify the script (above) so that it correctly points to load_data.R and load_data.R correctly points to the MNIST data -- though it may be easier for you to just clone my repo, hop on the random_gov branch, and then run dt_centric_random_gov.R.
Sorry I couldn't make a more minimal reproducible example, but as in @JasonB's answer, this error doesn't seem to pop up until you do a ton of calculations.
Edit: I re-ran my script using the suggested workaround above and it seemed to go off without a hitch.
