Parallel processing stopped working with error: object 'mcinteractive' not found

For a long time I've been successfully running a program which uses parallel processing. A couple of days ago the code stopped working with the error message:
"Error in get("mcinteractive", pkg) : object 'mcinteractive' not
found
traceback()
8: get("mcinteractive", pkg)
7: .customized_mcparallel({
result <- mclapply(X, function(...) {
res <- FUN(...)
writeBin(1L, progressFifo)
return(res)
}, ..., mc.cores = mc.cores, mc.preschedule = mc.preschedule,
mc.set.seed = mc.set.seed, mc.cleanup = mc.cleanup,
mc.allow.recursive = mc.allow.recursive)
if ("try-error" %in% sapply(result, class)) {
writeBin(-1L, progressFifo)
}
close(progressFifo)
result
})
6: pbmclapply(1:N, FUN = function(i) {
max_score = max(scores[i, ])
topLabels = names(scores[i, scores[i, ] >= max_score -
fine.tune.thres])
if (length(topLabels) == 0) {
return(names(which.max(scores[i, ])))
}
(I have more traceback if you are interested, but I think it mainly belongs to the "surrounding" code and is not so interesting for the error per se. Tell me if you need it and I'll make an edit!)
I do not know anything about parallel processing, and I haven't been able to understand the issue by digging into the code. From what I've understood, parallel::mcparallel is a function containing the argument mcinteractive, which can be set to TRUE or FALSE. Earlier I got the tip to decrease the number of cores used in the processing. Before, I used 16 cores without any issues. After the error started occurring I tried setting the number of cores to both 8 and 1, with the same result. If it is some memory problem I guess I'm in the wrong forum, sorrysorrysorry!! But I only experience problems when using RStudio, which is why I'm writing here. The only other thing I can think of that might be related is that my processing (through RStudio) sometimes gets stuck, and the only thing I've found is that the RAM is full and I have to restart the computer. Then the processing works as usual again. However, this does not help with the new error when using parallel computation.
Does anyone recognize this issue and have any lead on what could be the cause? Is it the code, the package, RStudio or my computer? Any checks I can run? :)
Edit:
Including a short version of the errors I get while searching through the code after changing pbmclapply to mclapply.
> packageVersion("parallel")
[1] ‘3.4.4’
> labels = parallel::pbmclapply(1:N, FUN = function(i) {
. . .
+ }, mc.cores = numCores)
Error: 'pbmclapply' is not an exported object from 'namespace:parallel'
> labels = pbmcapply::pbmclapply(1:N, FUN = function(i) {
. . .
+ }, mc.cores = numCores)
Error in get("mcinteractive", pkg) : object 'mcinteractive' not found
> labels = parallel::mclapply(1:N, FUN = function(i) {
. . .
+ }, mc.cores = numCores)
Warning message:
In parallel::mclapply(1:N, FUN = function(i) { :
all scheduled cores encountered errors in user code
#inside mclapply
> job.res <- lapply(seq_len(cores), inner.do)
Error in mcfork() : could not find function "mcfork"
#inside inner.do
> f <- parallel::mcfork()
Error: 'mcfork' is not an exported object from 'namespace:parallel'
Edit 2: I came a bit further in my error searching.
I had to add a triple colon before a lot of the parallel functions, which means I'm calling internal (unexported) functions (?), which in turn should mean that parallel is no longer part of my search path (?)
parallel:::mcfork()
parallel:::mc.advance.stream()
parallel:::selectChildren()
parallel:::isChild()
#Had to change .check_ncores(cores) to
parallel::detectCores()
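A quick namespace check confirms this behaviour: these functions exist inside parallel but are not exported, which is why :: fails and ::: works.
"mcfork" %in% getNamespaceExports("parallel")                         # FALSE: internal, needs :::
exists("mcfork", envir = asNamespace("parallel"), inherits = FALSE)   # TRUE: defined in the namespace
"mclapply" %in% getNamespaceExports("parallel")                       # TRUE: exported, :: is enough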

This problem occurs because the pbmcapply package was updated and its newer versions only work with R >= 3.5; updating R solved my problem.
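A quick way to confirm the version mismatch before (or after) updating:
getRversion()                         # recent pbmcapply needs R >= 3.5.0
packageVersion("pbmcapply")
# should be TRUE on R >= 3.5 (where the updated pbmcapply works), FALSE on the R 3.4.4 shown above
exists("mcinteractive", envir = asNamespace("parallel"), inherits = FALSE)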

Related

How to use tryCatch to skip errors in data downloading in R

I am trying to download data from the USGS website using the dataRetrieval package of R.
For that purpose, I have written a function called getstreamflow in R that works fine when I run, for example:
siteNumber <- c("094985005","09498501","09489500","09489499","09498502")
Streamflow = getstreamflow(siteNumber)
The output of the function is a list of data frames.
The function runs fine when there is no issue downloading the data, but for some stations I get the following error:
Request failed [404]. Retrying in 1.1 seconds...
Request failed [404]. Retrying in 3.3 seconds...
For: https://waterservices.usgs.gov/nwis/site/?siteOutput=Expanded&format=rdb&site=0946666666
To avoid that the function stops when encounters an error, I am trying to use tryCatch as in the following code:
Streamflow = tryCatch(
    expr = {
        getstreamflow(siteNumber)
    },
    error = function(e) {
        message(paste(siteNumber, " there was an error"))
    })
I want the function to skip the station and go on to the next one when it encounters an error. Currently, the output I get is the one presented below, which is obviously wrong because it says that there was an error for all the stations:
094985005 there was an error09498501 there was an error09489500 there was an error09489499 there was an error09498502 there was an error09511300 there was an error09498400 there was an error09498500 there was an error09489700 there was an error09500500 there was an error09489082 there was an error09510200 there was an error09489100 there was an error09490500 there was an error09510180 there was an error09494000 there was an error09490000 there was an error09489086 there was an error09489089 there was an error09489200 there was an error09489078 there was an error09510170 there was an error09493500 there was an error09493000 there was an error09498503 there was an error09497500 there was an error09510000 there was an error09509502 there was an error09509500 there was an error09492400 there was an error09492500 there was an error09497980 there was an error09497850 there was an error09492000 there was an error09497800 there was an error09510150 there was an error09499500 there was an error... <truncated>
What am I doing wrong with tryCatch?
Answer
You wrote the tryCatch outside of getstreamflow. Hence, if one site fails, then getstreamflow will return an error and nothing else. You should either supply 1 site at a time, or put the tryCatch inside getstreamflow.
Example
x <- 1:5
fun <- function(x) {
    for (i in x) if (i == 5) stop("ERROR")
    return(x^2)
}
tryCatch(fun(x), error = function(e) paste0("wrong", x))
This returns:
[1] "wrong1" "wrong2" "wrong3" "wrong4" "wrong5"
Multiple arguments
You indicated that you have both siteNumber and datatype to iterate over.
Using Map, we can define a function that takes two inputs:
Map(function(x, y) tryCatch(fun(x, y),
                            error = function(e) message(paste(x, " there was an error"))),
    x = siteNumber,
    y = datatype)
Using a for-loop, we can just iterate over them:
Streamflow <- vector(mode = "list", length = length(siteNumber))
for (i in seq_along(siteNumber)) {
    Streamflow[[i]] <- tryCatch(getstreamflow(siteNumber[i], datatype),
                                error = function(e) message(paste(siteNumber[i], " there was an error")))
}
Or, as suggested, just modify getstreamflow.
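A rough sketch of that last option, where download_one_site() is a hypothetical stand-in for whatever getstreamflow does for a single station:
getstreamflow <- function(siteNumber, datatype) {
    lapply(siteNumber, function(site) {
        tryCatch(
            download_one_site(site, datatype),   # hypothetical per-station download step
            error = function(e) {
                message(paste(site, " there was an error"))
                NULL   # placeholder keeps the result list aligned with siteNumber
            })
    })
}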

R - `try` in conjunction with capturing ALL console output?

Here's a piece of code I'm working with:
install.packages('BiocManager'); BiocManager::install('UniProt.ws')
requireNamespace('UniProt.ws')
uniprot_object <- UniProt.ws::UniProt.ws(
    UniProt.ws::availableUniprotSpecies(
        pattern = '^Homo sapiens$')$`taxon ID`)
query_results <- try(
    UniProt.ws::select(
        x = uniprot_object,
        keys = 'BAA08084.1',
        keytype = 'EMBL/GENBANK/DDBJ',
        columns = c('ENSEMBL','UNIPROTKB')))
This particular key/keytype combination is non-productive and produces the following output:
Getting mapping data for BAA08084.1 ... and ACC
error while trying to retrieve data in chunk 1:
no lines available in input
continuing to try
Error in `colnames<-`(`*tmp*`, value = `*vtmp*`) :
attempt to set 'colnames' on an object with less than two dimensions
Of the two [eE]rrors reported, only the second is a 'proper' R error object and, given the use of try, is accordingly captured in the variable query_results.
I am, however, desperate to capture the other error bit (no lines available in input) to inform downstream programmatic processes.
After playing with a plethora of capture.output, sink, purrr::quietly, etc. options found by startpaging (googling), I continue to fail capturing that bit. How can I do that?
As @Csd suggested, you could use tryCatch. The message that you are after is printed by the message() function in R, not stop(), so try() will ignore it. To capture output from message(), use code like this:
query_results <- tryCatch(
    UniProt.ws::select(
        x = uniprot_object,
        keys = 'BAA08084.1',
        keytype = 'EMBL/GENBANK/DDBJ',
        columns = c('ENSEMBL','UNIPROTKB')),
    message = function(e) conditionMessage(e))
This will abort evaluation when it gets any message, and return the message in query_results. If you are doing more than debugging, you probably want the message saved, but evaluation to continue. In that case, use withCallingHandlers instead. For example,
saveMessages <- c()
query_results <- withCallingHandlers(
    UniProt.ws::select(
        x = uniprot_object,
        keys = 'BAA08084.1',
        keytype = 'EMBL/GENBANK/DDBJ',
        columns = c('ENSEMBL','UNIPROTKB')),
    message = function(e)
        saveMessages <<- c(saveMessages, conditionMessage(e)))
When I run this version, query_results is unchanged (because the later error aborted execution), but the messages are saved:
saveMessages
[1] "Getting mapping data for BAA08084.1 ... and ACC\n"
[2] "error while trying to retrieve data in chunk 1:\n no lines available in input\ncontinuing to try\n"
Based on @user2554330's most excellent answer, I constructed an ugly thing that does exactly what I want:
- try to execute the statement
- don't fail fatally
- leave no ugly messages
- allow me access to errors and messages
So here it is in all its despicable glory:
saveMessages <- c()
query_results <- suppressMessages(
    withCallingHandlers(
        try(
            UniProt.ws::select(
                x = uniprot_object,
                keys = 'BAA08084.1',
                keytype = 'EMBL/GENBANK/DDBJ',
                columns = c('ENSEMBL','UNIPROTKB')),
            silent = TRUE),
        message = function(e)
            saveMessages <<- c(saveMessages, conditionMessage(e))))
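One way to use the captured pieces downstream (a sketch, not part of the original answer):
if (inherits(query_results, "try-error")) {
    # try() keeps the hard error as a condition attribute on the returned object
    err <- attr(query_results, "condition")
    message("select() failed: ", conditionMessage(err))
}
if (any(grepl("no lines available in input", saveMessages))) {
    # react to the soft error that was only ever emitted via message()
    message("non-productive key/keytype combination detected")
}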

Error in serialize(data, node$con) : error writing to connection

I'm currently trying to run some code that implements parallel processing, but I'm running into this error:
Error: cannot allocate vector of size 2.1 Gb
Execution halted
Error in serialize(data, node$con) : error writing to connection
Calls: %dopar% ... postNode -> sendData -> sendData.SOCKnode -> serialize
Execution halted
Warning message:
system call failed: Cannot allocate memory
Error in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode ->
unserialize Execution halted
I can't seem to figure out why there's a memory problem. If I take the code out of the foreach loop or change the foreach to a for loop, it works perfectly fine, so I don't think it has to do with the contents of the code itself, but rather something about the parallelization. Also, it seems to throw the error pretty soon after the code starts executing. Any ideas why this might be happening? Here's a look at my code:
list_storer <- list()
list_storer <- foreach(bt=2:bootreps, .combine=list, .multicombine=TRUE) %dopar% {
    ur <- sample.int(nrow(dailydatyr),nrow(dailydatyr),replace=TRUE)
    ddyr_boot <- dailydatyr[ur,]
    weightvar <- ddyr_boot[,c('ymd1_IssueD','MatD_ymd2')]
    weightvar <- abs(weightvar)
    x <- DM[ur,]
    y <- log(ddyr_boot$dirtyprice2/ddyr_boot$dirtyprice1)
    weightings <- rep(1,nrow(ddyr_boot))
    weightings <- weightings/(ddyr_boot$datenum2-ddyr_boot$datenum1)
    treg <- repeatsales(y,x,maxdailyreturn,weightings,weightvar)
    zbtcol <- 0
    cnst <- NULL
    if (is.null(dums) == FALSE){
        zbtcol <- length(treg)-ncol(x)
        cnst <- paste("tbs(",dums,")_",(middleyr),sep="")
        if (is.null(interactVar) == FALSE){
            ninteract <- (length(treg)-ncol(x)-length(dums))/length(dums)
            interact <- unlist(lapply(cnst,function(xla) paste(xla,"*c",c(1:ninteract),sep="")))
            cnst <- c(cnst,interact)
        }
    }
    tregtotal <- tregtotal + (is.na(treg)==FALSE)
    treg[is.na(treg)==TRUE] <- 0
    list_storer[[length(list_storer)+1]] <- treg
}
stopImplicitCluster(cl)
Parallelisation as done by foreach is a space vs. time trade-off. We get faster execution at the expense of higher memory usage. The reason for the higher memory usage is that several R processes are started and each of them needs its own memory to hold the data necessary for the calculation. Currently foreach is using an implicit PSOCK cluster. One way to solve this is to make the cluster creation explicit, using a lower number of processes. How low depends on the amount of memory you have and on the memory requirements of each job:
n <- parallel::detectCores()/2 # experiment!
cl <- parallel::makeCluster(n)
doParallel::registerDoParallel(cl)
<foreach>
parallel::stopCluster(cl)
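Filling in the <foreach> placeholder with a toy body, the full pattern might look like this (the real bootstrap code from the question would replace the toy body):
library(foreach)
library(doParallel)
n <- max(1, parallel::detectCores() %/% 2)   # experiment: fewer workers means less total memory
cl <- parallel::makeCluster(n)
doParallel::registerDoParallel(cl)
bootreps <- 10                               # toy value standing in for the question's setting
list_storer <- foreach(bt = 2:bootreps, .combine = list, .multicombine = TRUE) %dopar% {
    mean(rnorm(100))                         # toy body; foreach collects the last expression
}
parallel::stopCluster(cl)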

using package snow's parRapply: argument missing error

I want to find documents whose similarity with other documents is larger than a given value (0.1) by cutting the documents into blocks.
library(tm)
data("crude")
sample.dtm <- DocumentTermMatrix(
    crude, control = list(
        weighting = function(x) weightTfIdf(x, normalize = FALSE),
        stopwords = TRUE
    )
)
step = 5
n = nrow(sample.dtm)
block = n %/% step
start = (c(1:block)-1)*step+1
end = start+step-1
j = unlist(lapply(1:(block-1),function(x) rep(((x+1):block),times=1)))
i = unlist(lapply(1:block,function(x) rep(x,times=(block-x))))
ij <- cbind(i,j)
library(skmeans)
getdocs <- function(k){
    ci <- c(start[k[[1]]]:end[k[[1]]])
    cj <- c(start[k[[2]]]:end[k[[2]]])
    combi <- sample.dtm[ci]
    combj <- sample.dtm[cj]
    rownames(combi) <- ci
    rownames(combj) <- cj
    comb <- c(combi,combj)
    sim <- 1-skmeans_xdist(comb)
    cat("Block", k[[1]], "with Block", k[[2]], "\n")
    flush.console()
    tri.sim <- upper.tri(sim,diag=F)
    results <- tri.sim & sim>0.1
    docs <- apply(results,1,function(x) length(x[x==TRUE]))
    docnames <- names(docs)[docs>0]
    gc()
    return (docnames)
}
It works well when using apply
system.time(rmdocs<-apply(ij,1,getdocs))
When using parRapply
library(snow)
library(skmeans)
cl<-makeCluster(2)
clusterExport(cl,list("getdocs","sample.dtm","start","end"))
system.time(rmdocs<-parRapply(cl,ij,getdocs))
Error:
Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: attempt to set 'rownames' on an object with no dimensions
Timing stopped at: 0.01 0 0.04
It seems that sample.dtm couldn't be used in parRapply. I'm confused. Can anyone help me? Thanks!
In addition to exporting objects, you need to load the necessary packages on the cluster workers. In your case, the result of not doing so is that there isn't a dimnames method defined for "DocumentTermMatrix" objects, causing rownames<- to fail.
You can load packages on the cluster workers with the clusterEvalQ function:
clusterEvalQ(cl, { library(tm); library(skmeans) })
After doing that, rownames(combi)<-ci will work correctly.
Also, if you want to see the output from cat, you should use the makeCluster outfile argument:
cl <- makeCluster(2, outfile='')
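Putting those pieces together, the corrected setup might look roughly like this (it assumes getdocs, sample.dtm, start, end and ij are defined as in the question):
library(snow)
cl <- makeCluster(2, outfile = '')                    # outfile = '' makes cat() output visible
clusterEvalQ(cl, { library(tm); library(skmeans) })   # load the needed packages on every worker
clusterExport(cl, list("getdocs", "sample.dtm", "start", "end"))
system.time(rmdocs <- parRapply(cl, ij, getdocs))
stopCluster(cl)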

Getting a random internal selfref error in data.table for R

I love data.table, it's fast and intuitive, what could be better?
Alas, here's my problem: when referring to a data.table within a foreach() loop (using the doMC implementation) I will occasionally get the following error:
EXAMPLE IN APPENDIX
Error in { :
Internal error: .internal.selfref prot is not itself an extptr
One of the annoying problems here is that I can't get it to reproduce with any consistency, but it will happen during some long (several hrs) tasks, so I want to make sure it never happens, if possible.
Since I refer to the same data.table, DT, in each loop, I tried running the following at the beginning of each loop:
setattr(DT,".internal.selfref",NULL)
...to remove the invalid/corrupted self ref attribute. This works and the internal selfref error no longer occurs. It's a workaround, though.
Any ideas for addressing the root problem?
Many thanks for any help!
Eric
Appendix: Abbreviated R Session Info to confirm latest versions:
R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
other attached packages:
[1] data.table_1.8.8 doMC_1.3.0
Example using simulated data -- you may have to run the history() function many times (like, hundreds) to get the error:
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Load packages and Prepare Data
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
require(data.table)
##this is the package we use for multicore
require(doMC)
##register n-2 of your machine's cores
registerDoMC(multicore:::detectCores()-2)
## Build simulated data
value.a <- runif(500,0,1)
value.b <- 1-value.a
value <- c(value.a,value.b)
answer.opt <- c(rep("a",500),rep("b",500))
answer.id <- rep( 6000:6499 , 2)
question.id <- rep( sample(c(1001,1010,1041,1121,1124),500,replace=TRUE) ,2)
date <- rep( (Sys.Date() - sample.int(150, size=500, replace=TRUE)) , 2)
user.id <- rep( sample(250:350, size=500, replace=TRUE) ,2)
condition <- substr(as.character(user.id),1,1)
condition[which(condition=="2")] <- "x"
condition[which(condition=="3")] <- "y"
##Put everything in a data.table
DT.full <- data.table(user.id = user.id,
                      answer.opt = answer.opt,
                      question.id = question.id,
                      date = date,
                      answer.id = answer.id,
                      condition = condition,
                      value = value)
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Daily Aggregation Function
##
##a basic function that aggregates all the values from
##all users for every question on a given day:
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
each.day <- function(val.date){
    DT <- DT.full[ date < val.date ]
    #count the number of updates per user (for weighting)
    setkey(DT, question.id, user.id)
    DT <- DT[ DT[answer.opt=="a",length(value),by="question.id,user.id"] ]
    setnames(DT, "V1", "freq")
    #retain only the most recent value from each user on each question
    setkey(DT, question.id, user.id, answer.id)
    DT <- DT[ DT[ ,answer.id == max(answer.id), by="question.id,user.id", ][[3]] ]
    #now get a weighted mean (with freq) of the value for each question
    records <- lapply(unique(DT$question.id), function(q.id) {
        DT <- DT[ question.id == q.id ]
        probs <- DT[ ,weighted.mean(value,freq), by="answer.opt" ]
        return(data.table(q.id = rep(q.id,nrow(probs)),
                          ans.opt = probs$answer.opt,
                          date = rep(val.date,nrow(probs)),
                          value = probs$V1))
    })
    return(do.call("rbind",records))
}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## foreach History Function
##
##to aggregate accross many days quickly
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
history <- function(start, end){
    #define a sequence of dates
    date.seq <- seq(as.Date(start),as.Date(end),by="day")
    #now run a foreach to get the history for each date
    hist <- foreach(day = date.seq, .combine = "rbind") %dopar% {
        #setattr(DT,".internal.selfref",NULL) #resolves occasional internal selfref error
        each.day(val.date = day)
    }
}
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Examples
##~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
##aggregate only one day
each.day(val.date = "2012-12-13")
##generate a history
hist.example <- history (start = "2012-11-01", end = Sys.Date())
Thanks for reporting and all the help in finding it! Now fixed in v1.8.11. From NEWS :
In long running computations where data.table is called many times
repetitively, the following error could sometimes occur, #2647 :
Internal error: .internal.selfref prot is not itself an extptr
Fixed. Thanks to theEricStone, StevieP and JasonB for (difficult) reproducible examples.
Possibly related is a memory leak in grouping, which is also now fixed.
Long outstanding (usually small) memory leak in grouping fixed, #2648. When the last group is smaller than the largest group, the difference in those sizes was not being released. Also in non-trivial aggregations where each group returns a different number of rows. Most users run a grouping query once and will never have noticed these, but anyone looping calls to grouping (such as when running in parallel, or benchmarking) may have suffered. Tests added.
Thanks to many including vc273 and Y T.
Memory leak in data.table grouped assignment by reference
Slow memory leak in data.table when returning named lists in j (trying to reshape a data.table)
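A quick way to check whether an installed data.table already carries the fix (based on the version mentioned above):
packageVersion("data.table")              # the NEWS entry above says the fix landed in 1.8.11
if (packageVersion("data.table") < "1.8.11") {
    install.packages("data.table")        # update to a release that includes the fix
}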
A similar problem has been plaguing me for months. Perhaps we can see a pattern by putting our experiences together.
I've been waiting to post until I could create a reproducible example. Not possible thus far.
The bug doesn't happen in the same code location. In the past I've been able to avoid the error often by merely rerunning the exact same code. Other times I've reformulated an expression and rerun with success. In any case I'm pretty sure that these errors are truly internal to data.table.
I've saved the last 4 error messages in attempt to detect a pattern (pasted below).
---------------------------------------------------
[1] "err msg: location 1"
Error in selfrefok(x) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: my.fun1 ... $<- -> $<-.data.table -> [<-.data.table -> selfrefok
Execution halted
---------------------------------------------------
[1] "err msg: location 1"
Error in alloc.col(newx) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: my.fun1 -> $<- -> $<-.data.table -> copy -> alloc.col
Execution halted
---------------------------------------------------
[1] "err msg: location 2"
Error in shallow(x) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: print ... do.call -> lapply -> as.list -> as.list.data.table -> shallow
Execution halted
---------------------------------------------------
[1] "err msg: location 3"
Error in shallow(x) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: calc.book.summ ... .rbind.data.table -> as.list -> as.list.data.table -> shallow
Execution halted
Another similarity to the above example: I'm passing data.tables around among parallel threads, so they are being serialized/unserialized.
I will try the 'setattr' fix mentioned above.
hope this helps and thanks, jason
here is a simplification of one of the code segments that seems to generate this error 1 out of every 50-100k times it is run:
Thanks @MatthewDowle btw, data.table has been most useful. Here is one stripped-down bit of code:
require(data.table)
require(xts)
book <- data.frame(name='',
                   s=0,
                   Value=0.0,
                   x=0.0,
                   Qty=0)[0, ]
for (thing in list(1,2,3,4,5)) {
    tmp <- xts(1:5, order.by= make.index.unique(rep(Sys.time(), 5)))
    colnames(tmp) <- 'A'
    tmp <- cbind(coredata(tmp[nrow(tmp), 'A']),
                 coredata(colSums(tmp[, 'A'])),
                 coredata(tmp[nrow(tmp), 'A']))
    book <- rbind(book,
                  data.table(name='ALPHA',
                             s=0*NA,
                             Value=tmp[1],
                             x=tmp[2],
                             Qty=tmp[3]))
}
something like this seems to be the cause of this error:
Error in shallow(x) :
Internal error: .internal.selfref prot is not itself an extptr
Calls: my.function ... .rbind.data.table -> as.list -> as.list.data.table -> shallow
Execution halted
For the sake of reproducing the error, I have a script for you guys to pore over and figure out where this bug is coming from. The error reads:
Error in { :
task 96 failed - "Internal error: .internal.selfref prot is not itself an extptr"
Calls: apply ... system.time -> apply -> FUN -> %dopar% -> <Anonymous>
Execution halted
and I'm using doParallel to register my backend for foreach.
Context: I'm testing out classifiers on the MNIST hand-written digit dataset. You can get the data from me via
wget -nc https://www.dropbox.com/s/xr4i8gy11ed8bsh/digit_id_data_and_benchmarks.zip
just be sure to modify the script (above) so that it correctly points to load_data.R and load_data.R correctly points to the MNIST data -- though it may be easier for you to just clone my repo, hop on the random_gov branch, and then run dt_centric_random_gov.R.
Sorry I couldn't make a more minimal reproducible example, but like @JasonB's answer, this error doesn't seem to pop up until you do a ton of calculations.
Edit: I re-ran my script using the suggested work-around above and it seemed to go off without a hitch.
