Way to load all functions in a package in R?

I am attempting to take a function and apply it in a slightly different way in my own function. However, I keep running into dependency issues: errors that such-and-such function was not found.
This is despite the fact that the function itself works just fine when only rlist is loaded.
This is list.select in the rlist package:
function (.data, ...)
{
    args <- set_argnames(dots(...))
    quote <- as.call(c(quote(list), args))
    list.map.internal(.data, quote, parent.frame())
}
If I attempt to run this function I get an error that the function set_argnames was not found.
When I go into that function here:
function (args, data = args)
{
    argnames <- getnames(args, character(length(args)))
    indices <- !nzchar(argnames) & vapply(args, is.name, logical(1L))
    argnames[indices] <- as.character(args[indices])
    setnames(data, argnames)
}
I run into errors of a similar nature.
I have installed and loaded all of the package's dependencies: LinkingTo, Imports, and even Suggests. Nothing seems to solve the issue.
Is there a way to load all the dependent functions of a function?
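One common workaround (a sketch, not part of the original question; my_select is a made-up name): copy the function locally and point its environment at rlist's namespace, so that unexported helpers like set_argnames, dots, and list.map.internal resolve there.

my_select <- function (.data, ...) {
    args <- set_argnames(dots(...))
    quote <- as.call(c(quote(list), args))
    list.map.internal(.data, quote, parent.frame())
}
# Free variables (the unexported helpers) are now looked up in rlist's namespace:
environment(my_select) <- asNamespace("rlist")

Individual internals can also be reached directly with triple colons, e.g. rlist:::set_argnames, though that is discouraged in package code.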

Related

adding a *local* function to an existing R-package

Many great R packages exist. However, they often use slightly different names for the same behaviour. As I often use different packages, the differing names get in my way, so I would like to extend the original package by adding local functions. E.g.
in the package "rethinking" we use the function "extract.samples()" to obtain the samples from the posterior distribution.
in the package "rstanarm" we use the function "as.matrix()" instead.
It would be nice to add the function "extract.samples()" to my local repository, and to define that it is called only if the input parameter is an rstanarm object. Thus, I really would like to extend the package: if I load "rethinking", "rethinking::extract.samples()" is used, and if I load "rstanarm", the function "rstanarm::extract.samples()" is used.
What I currently do is the following:
extract.samples = function(object, n=1000, ...){
    if ( class(object)[[1]] == 'stanreg' ){
        # rstanarm object:
        SIMULATIONS = as.matrix(object)
    } else if ( attr(class(object), "package") == 'rethinking' ){
        SIMULATIONS = rethinking::extract.samples(object, n=n, ...)
    }
    return(invisible( SIMULATIONS ))
}
Thus, I explicitly have to take care of all the possible objects and parameter settings. This becomes messy if a third package defines the function "extract.samples", or if the two packages use different parameters. I wonder if there is a more robust method.
The proper way to do this is to create your own package that exports a generic function and several methods for it. If you can edit rethinking, then just do that; otherwise, create a third package. I'll assume you're doing that.
Here's what the code could look like:
extract.samples <- function(x, ...) {
    UseMethod("extract.samples")
}
extract.samples.stanreg <- function(x, ...) {
    as.matrix(x, ...)
}
extract.samples.default <- function(x, ...) {
    rethinking::extract.samples(x, ...)
}
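With these definitions, S3 dispatch picks a method from the class of the first argument. A rough illustration (the toy object below is made up, standing in for a real rstanarm fit):

fit <- structure(list(), class = "stanreg")
class(fit)              # "stanreg", so extract.samples(fit) uses the stanreg method
# extract.samples(fit)  # would call as.matrix(fit); a real fit returns its draws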

How do I pass a function argument into map_df()?

I am trying to create a function to clean data and return as a data.frame in R.
I'm using the map_df() function to return the cleaned data as a data.frame, and have a function written to clean the data.
The first thing I do is pull a list of files from a folder, then iterate through them and clean each file. I have a pre-defined set specifying which column names to pull (stored in selectCols) in case of variation between files:
files <- list.files(filepath,full.names=F)
colInd <- which(names(fread(files[i],nrows=0)) %in% gsub("_","",selectCols))
I also have a function to clean my data, which uses fread() to read in the .csv files. It takes colInd and i as arguments to clean files iteratively.
cleanData <- function(files,i,colInd) {
    addData <- fread(files[i],select=c(colInd))
    [...]
}
Overall it looks like this (as a recursive function):
i <- 1
files <- list.files(filepath,full.names=F)
iterateCleaning <- function(files,i) {
    colInd <- which(names(fread(files[i],nrows=0)) %in% gsub("_","",selectCols))
    if (length(colInd)==length(selectCols)) {
        newData <- map_df(files,cleanData)
        saveToFolder(newData,i,files)
    }
    else {}
    i=i+1
    if (i < length(files)){
        iterateCleaning(files,i)
    }
    else {}
}
When I try to run without specifying the arguments for my function I get this error:
Error in fread(files,select=c(colInd)):
argument "colInd" is missing, with no default.
When I insert it into my map_df() I do it like so:
newData <- map_df(files,i,colInd,cleanData)
Then I get this error:
Error in as_mapper(.f,...): object 'colInd' not found.
Any suggestions for resolving this error? As I understand it, map_df() applies the function to each element, but I don't need it applied to the i and colInd inputs; I just need them passed along to the function I am calling in map_df(). How can I call map_df() on a function that requires additional arguments?
I read the documentation but it seemed a bit confusing. It says to use "." for a single-argument function and .x and .y for a two-argument function, but I'm not sure what that means. My initial guess was something like these, but neither line works:
newData <- map_df(files,cleanData,.i,.colInd)
newData <- map_df(files,cleanData,.x=i,.y=colInd)
Any recommendations? Will I have the same output if I just call map_df() afterwards on the output of my function?
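One way this is usually resolved (a sketch, assuming cleanData can be reworked to take a single file path rather than an index): any arguments passed to map_df() after the function are forwarded to every call, so fixed inputs like colInd can be supplied that way.

library(purrr)
library(data.table)

# Hypothetical rework: one file in, cleaned data out.
cleanData <- function(file, colInd) {
    addData <- fread(file, select = colInd)
    # ... cleaning steps ...
    addData
}

# Everything after the function argument is passed on to each cleanData() call:
newData <- map_df(files, cleanData, colInd = colInd)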

How to list all the function signatures in an R file?

Is there an R function that lists all the functions in an R script file along with their arguments?
i.e. an output of the form:
func1(var1, var2)
func2(var4, var10)
.
.
.
func10(varA, varB)
Using [sys.]source has the very undesirable side-effect of executing the code inside the file. At worst this has security implications, but even “benign” code may simply have unintended side-effects when executed. At best it just takes unnecessary time (and potentially a lot of it).
It’s actually unnecessary to execute the code, though: it is enough to parse it, and then do some syntactical analysis.
The actual code is trivial:
file_parsed = parse(filename)
functions = Filter(is_function, file_parsed)
function_names = unlist(Map(function_name, functions))
And there you go, function_names contains a vector of function names. Extending this to also list the function arguments is left as an exercise to the reader. Hint: there are two approaches. One is to eval the function definition (now that we know it’s a function definition, this is safe); the other is to “cheat” and just get the list of arguments to the function call.
The implementation of the functions used above is also not particularly hard. There’s probably even something already in R core packages (‘utils’ has a lot of stuff) but since I’m not very familiar with this, I’ve just written them myself:
is_function = function (expr) {
    if (! is_assign(expr)) return(FALSE)
    value = expr[[3L]]
    is.call(value) && as.character(value[[1L]]) == 'function'
}

function_name = function (expr) {
    as.character(expr[[2L]])
}

is_assign = function (expr) {
    is.call(expr) && as.character(expr[[1L]]) %in% c('=', '<-', 'assign')
}
This correctly recognises function declarations of the forms
f = function (…) …
f <- function (…) …
assign('f', function (…) …)
It won’t work for more complex code, since assignments can be arbitrarily complex and in general are only resolvable by actually executing the code. However, the three forms above probably account for ≫ 99% of all named function definitions in practice.
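For the argument-listing extension left as an exercise above, here is a sketch of the second (“cheat”) approach: read the argument names straight out of the parsed function call, without evaluating anything (function_args and the output format are my own illustrative choices).

function_args = function (expr) {
    # expr[[3L]] is the `function(...)` call; its second element is the
    # pairlist of formals, whose names are the argument names.
    names(expr[[3L]][[2L]])
}

signatures = vapply(functions, function (fn) {
    sprintf('%s(%s)', function_name(fn), toString(function_args(fn)))
}, character(1L))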
UPDATE: Please refer to the answer by @Konrad Rudolph instead.
You can create a new environment, source your file in that environment and then list the functions in it using lsf.str() e.g.
test.env <- new.env()
sys.source("myfile.R", envir = test.env)
lsf.str(envir=test.env)
rm(test.env)
or if you want to wrap it as a function:
listFunctions <- function(filename) {
    temp.env <- new.env()
    sys.source(filename, envir = temp.env)
    functions <- lsf.str(envir=temp.env)
    rm(temp.env)
    return(functions)
}
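Usage is then simply (using the same placeholder filename as above):

listFunctions("myfile.R")

Keep in mind the caveat from the other answer: sys.source() executes the file, so only run this on code you trust.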

Environment chaining in R

In my R development I need to wrap function primitives in proto objects so that a number of arguments can be automatically passed to the functions when the $perform() method of the object is invoked. The function invocation internally happens via do.call(). All is well, except when the function attempts to access variables from the closure within which it is defined. In that case, the function cannot resolve the names.
Here is the smallest example I have found that reproduces the behavior:
library(proto)
make_command <- function(operation) {
    proto(
        func = operation,
        perform = function(., ...) {
            func <- with(., func) # unbinds proto method
            do.call(func, list(), envir=environment(operation))
        }
    )
}
test_case <- function() {
    result <- 100
    make_command(function() result)$perform()
}
# Will generate error:
# Error in function () : object 'result' not found
test_case()
I have a reproducible testthat test that also produces a lot of diagnostic output. The diagnostic output has me stumped: by looking up the parent environment chain, my diagnostic code, which lives inside the function, finds and prints the very same variable the function fails to find. See this gist.
How can the environment for do.call be set up correctly?
This was the final answer after an offline discussion with the poster:
make_command <- function(operation) {
    proto(perform = function(.) operation())
}
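A quick sanity check (sketch): operation() is now invoked as an ordinary closure, so result is found in test_case()'s frame by lexical scoping.

test_case <- function() {
    result <- 100
    make_command(function() result)$perform()
}
test_case()  # 100, no error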
I think the issue here is clearer and easier to explore if you:
Replace the anonymous function within make_command() with a named one.
Make that function open a browser() (instead of trying to get result). That way you can look around to see where you are and what's going on.
Try this, which should clarify the cause of your problem:
test_case <- function() {
    result <- 100
    myFun <- function() browser()
    make_command(myFun)$perform()
}
test_case()
## Then from within the browser:
#
parent.env(environment())
# <environment: 0x0d8de854>
# attr(,"class")
# [1] "proto" "environment"
get("result", parent.env(environment()))
# Error in get("result", parent.env(environment())) :
# object 'result' not found
#
parent.frame()
# <environment: 0x0d8ddfc0>
get("result", parent.frame()) ## (This works, so points towards a solution.)
# [1] 100
Here's the problem. Although you think you're evaluating myFun(), whose environment is the evaluation frame of test_case(), your call to do.call(func, ...) is really evaluating func(), whose environment is the proto environment within which it was defined. After looking for and not finding result in its own frame, the call to func() follows the rules of lexical scoping, and next looks in the proto environment. Neither it nor its parent environment contains an object named result, resulting in the error message you received.
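The same effect in miniature (toy names; the environment rebinding mimics what happens to the function stored in the proto object):

f <- function() result                         # 'result' is a free variable
environment(f) <- new.env(parent = baseenv())  # rebind f's enclosure
result <- 100
# f()  # error: 'result' is looked up along f's new enclosure chain, not in the caller's frame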
If this doesn't immediately make sense, you can keep poking around within the browser. Here are a few further calls you might find helpful:
environment(get("myFun", parent.frame()))
ls(environment(get("myFun", parent.frame())))
environment(get("func", parent.env(environment())))
ls(environment(get("func", parent.env(environment()))))

automatic redirection of functions

The language is R.
I have a couple of files:
utilities.non.foo.R
utilities.foo.R
utilities.R
foo is an in-house package that has been cobbled together (for image processing, although this is irrelevant). It works great, but only on Linux machines, and it is a huge pain to try and compile it even on those.
Basically, utilities.foo.R contains a whole lot of functions that require package foo.
The functions in here are called functionname.foo.
I'm about to start sharing this code with external collaborators who don't have this package or Linux, so I've written a file utilities.non.foo.R, which contains all the functions in utilities.foo.R, except the dependency on package foo has been removed.
These functions are all called functionname.non.foo.
The file utilities.R has a whole heap of this, for each function:
functionname <- function(...) {
    if ( fooIsLoaded() ) {
        functionname.foo(...)
    } else {
        functionname.non.foo(...)
    }
}
The idea is that one only needs to load utilities.R and if you happen to have package foo (e.g. my internal collaborators), you will use that backend. If you don't have foo (external collaborators), you'll use the non-foo backend.
My question is: is there some way to do the redirection for each function name without explicitly writing the above bit of code for every single function name?
This reminds me of how (e.g.) there is a generic print function, with methods print.table, print.data.frame, etc., but the user only needs to call print, and the appropriate method is chosen automatically.
I'd like to have that, except the method.class would be more like method.depends_on_which_package_is_loaded.
Is there any way to avoid writing a redirection function per function in my utilities.R file?
As Dirk says, just use a package. In this case, put all your new *.non.foo functions in a new package, which is also called foo. Distribute this foo to your collaborators, instead of your in-house version. That way your utilities code can just be
functionname <- function(...) functionname.foo(...)
without having to make any checks at all.
Here is an idea: write a function that sets f to either f.foo or f.non.foo. It could be called in a loop, over all functions in a given namespace (or all functions whose name ends in .foo); a sketch of such a loop follows the example below.
dispatch <- function(s) {
    if ( fooIsLoaded() ) {
        f <- get( paste(s, "foo", sep=".") )
    } else {
        f <- get( paste(s, "non.foo", sep=".") )
    }
    assign( s, f, envir=.GlobalEnv ) # You may want to use a namespace
}
f.foo <- function() cat("foo\n")
f.non.foo <- function() cat("non-foo\n")
fooIsLoaded <- function() TRUE
dispatch("f")
f()
fooIsLoaded <- function() FALSE
dispatch("f")
f()
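The loop mentioned above could look like this sketch (it scans the global environment for *.foo names; the pattern handling is my own and would need adjusting for a real namespace):

base_names <- sub("\\.foo$", "", ls(.GlobalEnv, pattern = "\\.foo$"))
for (s in base_names[!grepl("\\.non$", base_names)]) {
    dispatch(s)  # binds each name s to its .foo or .non.foo version
}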
A simpler solution would be to give the same name to both functions, but put them in different namespaces/packages.
This sounds quite inefficient and inelegant, but how about
funify = function(f, g, package="ggplot2") {
    if (paste("package:", package, sep="") %in% search()) f
    else {
        message("how do you intend to work without ", package)
        g
    }
}
detach(package:ggplot2)
foo = funify(paste, function(x) letters[x])
foo(1:10)
library(ggplot2)
foo = funify(paste, function(x) letters[x])
foo(1:10)
