This post describes how to pass packages to all clusters:
Function not found in R doParallel 'foreach' - Error in { : task 1 failed - "could not find function "raster""
However, I would like to pass functions that are not part of any R package to the clusters (because I use them inside the foreach loop). How can I do this?
Edit:
My only idea was to run another foreach loop over 1:number_of_clusters before the actual foreach loop and define the function on each cluster. But this is (a) not very elegant and (b) it doesn't work (the function still cannot be found).
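For reference, a minimal sketch of the standard approach, using base `parallel` (the function name `my_fun` is made up for illustration). `clusterExport` copies named objects from the master's workspace to every worker; foreach's `.export` argument, which takes a character vector of object names, does the same thing for `%dopar%` loops.

```r
library(parallel)

# Hypothetical user-defined function that is not part of any package
my_fun <- function(x) x^2

cl <- makeCluster(2)
# Copy my_fun from the master's workspace to every worker
clusterExport(cl, varlist = "my_fun")
res <- parSapply(cl, 1:4, function(i) my_fun(i))
stopCluster(cl)
res  # 1 4 9 16
```

With foreach, the equivalent is `foreach(i = 1:4, .export = "my_fun") %dopar% my_fun(i)`.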
I am trying to run a power analysis using a MonteCarlo approach in R.
I have created a function of two parameters that outputs a boolean (tested manually for all relevant values of the parameters). I have also run baby examples of the MonteCarlo function to make sure that I understand it and that it works well.
Yet when I try to run the real thing, I get the following error message:
Error in parse(text = all_funcs_found[i]) : <text>:1:1: unexpected '::'
1: ::
I read through the source code of the MonteCarlo function (which I found here) and found
# loop through non-primitive functions used in func and check from which package they are
for (i in 1:length(all_funcs_found)) {
  if (environmentName(environment(eval(parse(text = all_funcs_found[i])))) %in% env_names) {
    packages <- c(packages, env_names[which(env_names == environmentName(environment(eval(parse(text = all_funcs_found[i])))))])
  }
}
which doesn't really make sense to me - why should there be a problem there?
Thank you for any ideas.
I found the answer: the function I wrote was calling a function from a specific library in the form libraryname::functionname.
This works OK if you use the function once manually, but makes MonteCarlo break.
I solved the problem by first loading the relevant library, then removing the 'libraryname::' part from the definition of the main function. MonteCarlo then runs just fine.
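A minimal illustration of the failure mode, without the MonteCarlo package itself (the function names here are made up): MonteCarlo scans the body of your simulation function for names and parses each one, and a `pkg::fun` call contributes a bare `::` token that `parse()` cannot handle on its own.

```r
# A simulation function written with pkg::fun qualification;
# this style is what tripped MonteCarlo's function scanner
sim_qualified <- function(n) stats::rnorm(n)

# The token "::" on its own is not parseable, which matches
# the error produced inside MonteCarlo's loop
bad <- try(parse(text = "::"), silent = TRUE)
inherits(bad, "try-error")  # TRUE

# The fix: attach the package first, then call the function unqualified
library(stats)
sim_plain <- function(n) rnorm(n)
length(sim_plain(5))  # 5
```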
I’m trying to implement parallel computing in an R package that calls C from R with the .C function. It seems that the nodes of the cluster can’t access the dynamic library. I have made a parallel socket cluster, like this:
cl <- makeCluster(2)
I would like to evaluate a C function called valgrad from my R package on each of the nodes in my cluster using clusterEvalQ, from the R package parallel. However, my code is producing an error. I compile my package, but when I run
out <- clusterEvalQ(cl, cresults <- .C(C_valgrad, …))
where … represents the arguments in the C function valgrad. I get this error:
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
2 nodes produced errors; first error: object 'C_valgrad' not found
I suspect there is a problem with clusterEvalQ’s ability to access the dynamic library. I attempted to fix this problem by loading the glmm package into the cluster using
clusterEvalQ(cl, library(glmm))
but that did not fix the problem.
I can evaluate valgrad on each of the clusters using the foreach function from the foreach R package, like this:
out <- foreach(1:no_cores) %dopar% {.C(C_valgrad, …)}
no_cores is the number of nodes in my cluster. However, this function doesn’t allow any of the results of the evaluation of valgrad to be accessed in any subsequent calculation on the cluster.
How can I either
(1) make the results of the evaluation of valgrad accessible for later calculations on the cluster or
(2) use clusterEvalQ to evaluate valgrad?
You have to load the external library. But this is not done with library calls, it's done with dyn.load.
The following two functions are useful if you work with more than one operating system; they use the built-in variable .Platform$dynlib.ext.
Note also the unload function: you will need it if you develop a library of C functions. If you change a C function, the dynamic library has to be unloaded and then (the new version) reloaded before testing.
See Writing R Extensions, file R-exts.pdf in the doc folder, section 5 or on CRAN.
dynLoad <- function(dynlib) {
  dynlib <- paste(dynlib, .Platform$dynlib.ext, sep = "")
  dyn.load(dynlib)
}

dynUnload <- function(dynlib) {
  dynlib <- paste(dynlib, .Platform$dynlib.ext, sep = "")
  dyn.unload(dynlib)
}
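As for making results available to later calculations (point 1 of the question), a sketch below shows that objects assigned inside clusterEvalQ persist in each worker's global environment, so subsequent cluster calls can reuse them. The dyn.load line is shown as a comment because the path to the compiled library is package-specific, and sum(1:3) stands in for the real .C() call.

```r
library(parallel)

cl <- makeCluster(2)

# On each worker, load the compiled code first (path is package-specific):
# clusterEvalQ(cl, dyn.load(paste0("glmm", .Platform$dynlib.ext)))

# Assignments made via clusterEvalQ persist in each worker's global
# environment, so later cluster calls can reuse them:
clusterEvalQ(cl, cresults <- sum(1:3))   # stand-in for the .C() call
later <- clusterEvalQ(cl, cresults * 2)  # reuses the stored result
stopCluster(cl)
unlist(later)  # 12 12
```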
I'm trying to call the Error() function but it says could not find function "Error". I checked the docs and Error does not seem to be part of the R base package. This is a very hard function to search for because "Error" is a very overloaded word. What package is Error() in? For context, I'm running an ANOVA. I'm pretty sure this isn't user-defined, since I see multiple tutorials referencing it without defining it.
EDIT:
Here are the tutorials:
https://datascienceplus.com/two-way-anova-with-repeated-measures/ , http://personality-project.org/r/r.guide/r.anova.html#withinone (look at usages of Error() in within-subjects/repeated-measures ANOVA)
EDIT2:
Here is the model answer from the tutorial. There does not seem to be any information about how the 'Error' function is defined or where it comes from:
model <- aov(wm$iq ~ wm$condition + Error(wm$subject / wm$condition))
The Error() in this case specifies the error term for aov(): it appears inside the model formula passed to aov() and is not a function on its own. I've also tried searching for Error using the package sos, which yields 0 results:
# install.packages("sos")
library(sos)
results <- findFn("Error")
filtered_results <- results[results$Function == "Error", ]
nrow(filtered_results)
Output:
[1] 0
You might want to read this Cross Validated post on how to set the Error term within the aov() function.
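For completeness, a small self-contained illustration with made-up data (the variable names mirror the tutorial's): Error() only has meaning inside an aov() formula, where it marks the within-subject error stratum.

```r
# Made-up repeated-measures data: 6 subjects, 3 conditions each
set.seed(1)
wm <- data.frame(
  subject   = factor(rep(1:6, each = 3)),
  condition = factor(rep(c("a", "b", "c"), times = 6)),
  iq        = rnorm(18, mean = 100, sd = 15)
)

# Error() is interpreted by aov() inside the formula; it is not a
# function you can call on its own
model <- aov(iq ~ condition + Error(subject / condition), data = wm)
class(model)  # contains "aovlist"
```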
I'm in the process of writing an R package. One of my functions takes another function and some other data-related arguments and runs a %dopar% loop using the foreach package. This foreach-using function is used inside the one of the main functions of my package.
When I call the main function from another file, after having loaded my package, I get the message
Error in { : task 1 failed - "could not find function "some_function"
where some_function is some function from my package. I get this message, with varying missing function, when I set the .export argument in the call to foreach to any of the following:
ls(as.environment("package:mypackagename"))
ls(.GlobalEnv)
ls(environment())
ls(parent.env(environment()))
ls(parent.env(parent.env(environment())))
And even concatenations of the above. I also tried passing my package name to the .packages argument, which only yields the error message
Error in e$fun(obj, substitute(ex), parent.frame(), e$data) : worker initialization failed: there is no package called ‘mypackagename’
I feel like I have tried just about everything, and I really need this piece of code to work. I should note that it does work if I use %do% instead of %dopar%. What am I doing wrong?
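One common cause of the "varying missing function" symptom, sketched below with base parallel (foreach's PSOCK backend behaves the same way, and the function names are made up): the function handed to the workers travels automatically, but the functions it calls do not, so helpers must be exported by name too (or the package must be properly installed on the workers, not merely loaded via devtools).

```r
library(parallel)

# Two hypothetical package functions; the outer one calls the inner
helper_fun <- function(x) x + 1
some_function <- function(x) helper_fun(x) * 2

cl <- makeCluster(2)

# The function passed to the workers is shipped automatically, but the
# functions it calls are not; this reproduces the error:
fail <- try(parSapply(cl, 1:2, some_function), silent = TRUE)
inherits(fail, "try-error")  # TRUE: could not find function "helper_fun"

# Exporting the helper (with foreach: .export = "helper_fun") fixes it:
clusterExport(cl, "helper_fun")
ok <- parSapply(cl, 1:2, some_function)
stopCluster(cl)
ok  # 4 6
```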
I need to modify the function gamGPDfit() in the package QRM to solve a problem. The function gamGPDfit() in turn calls other functions fit.GPD() and gamGPDfitUp() to compute the estimates of the parameters.
The structure of the function is shown below:
#######################################################
gamGPDfit <- function(..., init = fit.GPD(...), ...)
{
  ...
  Par <- gamGPDfitUp(...)
  ...
  return(list(...))
}
<environment: namespace:QRM>
#######################################################
Now, when I type fit.GPD, I get the function's source in the console and can make the necessary modifications. However, the other function, gamGPDfitUp(), returns
> gamGPDfitUp
Error: object 'gamGPDfitUp' not found
The question is, how do I get at such a built-in function defined within another function's package? Does it have to do with the QRM namespace environment? If so, how do I obtain the function to modify it?
I have attached the function and the call of the gamGPDfitUp() is indicated in colour red.
There are a couple of things that may come in handy.
One is help(":::"), "Accessing exported and internal variables in a namespace". You can probably access gamGPDfitUp by prefixing it with QRM:::.
Another function is fixInNamespace, which allows you to modify functions inside packages. The help page for this one lists a few more interesting tools. Play around with this and it should solve most of your problems.
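A short sketch of these tools, using stats (always installed) in place of QRM, which may not be available everywhere; with QRM installed, QRM:::gamGPDfitUp retrieves the unexported function the same way.

```r
# ::: reads straight from a package's namespace, so it also finds
# objects the package does not export (with QRM: QRM:::gamGPDfitUp)
f <- stats:::var

# getFromNamespace() is the programmatic equivalent
g <- getFromNamespace("var", "stats")
identical(f, g)  # TRUE

# To edit a hidden function in place, fixInNamespace() opens it in an
# editor and assignInNamespace() writes a modified copy back, e.g.
# fixInNamespace("gamGPDfitUp", ns = "QRM")
```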