R List functions in file - r

How do I list all functions of a certain R file doing something like
list = list.all.functions(file.name, alphabetical = TRUE, ...)
where list is a string vector containing the names of the functions in file.name?
The solution of How to list all the functions and their arguments in an R file? gives no output for me (since I am not interested in arguments I opened a new question).
EDIT
File allometry.R starts with
#==========================================================================================#
#==========================================================================================#
# Standing volume of a tree. #
#------------------------------------------------------------------------------------------#
dbh2vol <<- function(hgt,dbh,ipft){
vol = pft$b1Vol[ipft] * hgt * dbh ^ pft$b2Vol[ipft]
return(vol)
}#end function dbh2ca
#==========================================================================================#
#==========================================================================================#
My main looks like
rm(list=ls())
here = "/directory/of/allometry.R/"
setwd(here)
is_function = function (expr) {
if (! is_assign(expr))
return(FALSE)
value = expr[[3]]
is.call(value) && as.character(value[[1]]) == 'function'
}
function_name = function (expr)
as.character(expr[[2]])
is_assign = function (expr)
is.call(expr) && as.character(expr[[1]]) %in% c('=', '<-', 'assign')
file_parsed = parse("allometry.R")
functions = Filter(is_function, file_parsed)
function_names = unlist(Map(function_name, functions))

Probably too late to join the party, but better late than never.
There is a package called NCmisc which has a function that lists all functions used in a file and returns a list whose component names are the packages those functions belong to. Any functions defined in the global environment appear under the .GlobalEnv list component. Simply load all packages the file uses and then run the following:
library(NCmisc)
all.functions <- list.functions.in.file(
filename = "/path/to/file/my_file.R")
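A base-R alternative is the parsing approach from the question. One likely reason it returns nothing for allometry.R is that the file assigns with <<-, which is_assign does not list. The following is a minimal sketch under that assumption; the helper name list_functions_in_code and the sample code string are hypothetical and only serve the illustration.

```r
# Hypothetical helper: names of functions assigned at the top level of parsed R code.
list_functions_in_code <- function(exprs) {
  assign_ops <- c("=", "<-", "<<-", "assign")  # "<<-" added for files like allometry.R
  is_assign <- function(expr) {
    is.call(expr) && is.name(expr[[1]]) && length(expr) >= 3 &&
      as.character(expr[[1]]) %in% assign_ops
  }
  is_function <- function(expr) {
    if (!is_assign(expr)) return(FALSE)
    target <- expr[[2]]
    value <- expr[[3]]
    # Keep only plain names (or strings, for assign()) bound to a function definition.
    (is.name(target) || is.character(target)) &&
      is.call(value) && identical(value[[1]], as.name("function"))
  }
  function_name <- function(expr) as.character(expr[[2]])
  sort(as.character(unlist(lapply(Filter(is_function, exprs), function_name))))
}

# Illustrative input; for a real file use parse("allometry.R") instead.
code <- "
dbh2vol <<- function(hgt, dbh, ipft) hgt * dbh
helper = function(x) x
constant <- 42
"
function_names <- list_functions_in_code(parse(text = code))
print(function_names)
```

The sketch only sees top-level assignments; functions created inside other functions or via other mechanisms are not detected.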

Related

Capturing ellipsis arguments from within an internal function

I'm trying to extract arguments passed to ... from within an internal function to perform validity check. Since the only purpose of the function is to check ellipsis, I'd like the function to have no parameter and capture the ellipsis from the parent function internally.
Here's a simple example of what I'd like to do:
check_dots <- function() {
# capture ... arguments here
if (rlang::dots_n(...) == 1L && ... == "foo") {
stop()
}
}
(function(...) {
check_dots()
"success"
})("foo", "bar")
I've tried using formals(fun = rlang::caller_fn()) to extract ... arguments without success.
The following, using base R, does what you want:
check_dots = function () {
call = match.call(definition = sys.function(-1L), call = sys.call(-1L), expand.dots = FALSE)
if (length(call$...) == 1L && call$...[[1L]] == 'foo') stop('error')
}
‘rlang’ has caller_call as a rough equivalent of match.call, but it’s missing an option to prevent expanding dots, so I don’t know how to do the same as above using ‘rlang’.
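To see the base R version in action, here is a small sketch that wraps it in a named function. The wrapper name outer and the error message are assumptions made for the illustration.

```r
check_dots <- function() {
  # Match the caller's call against the caller's definition, keeping ... unexpanded.
  call <- match.call(definition = sys.function(-1L), call = sys.call(-1L),
                     expand.dots = FALSE)
  dots <- call$...
  if (length(dots) == 1L && identical(dots[[1L]], "foo")) {
    stop("a single 'foo' is not allowed")
  }
  invisible(TRUE)
}

# Hypothetical caller that delegates the check of its ... arguments.
outer <- function(...) {
  check_dots()
  "success"
}

ok <- outer("foo", "bar")
failed <- tryCatch(outer("foo"), error = function(e) conditionMessage(e))
print(ok)
print(failed)
```

Note that the arguments are captured unevaluated, so a call such as outer(x) with x <- "foo" would pass the check, because the captured argument is the symbol x rather than the string.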

Run testthat test in separate R session (how to combine the outcomes)

I need to test package loading operations (for my multiversion package) and know that unloading namespaces and stuff is dangerous work. So I want to run every test in a fresh R session. Running my tests in parallel does not meet this demand since it will reuse slaves, and these get dirty.
So I thought callr::r would help me out. Unfortunately I am again stuck with the minimally documented reporters it seems.
The following is a minimal example. Placed in file test-mytest.R.
test_that('test 1', {
expect_equal(2+2, 5)
})
reporter_in <- testthat::get_reporter()
# -- 1 --
reporter_out <- callr::r(
function(reporter) {
reporter <- testthat::with_reporter(reporter, {
testthat::test_that("test inside", {
testthat::expect_equal('this', 'wont match')
})
})
},
args = list(reporter = reporter_in),
show = TRUE
)
# -- 2 --
testthat::set_reporter(reporter_out)
# -- 3 --
test_that('test 2', {
expect_equal(2+2, 8)
})
I called this test file using:
# to be able to check the outcome, work with a specific reporter
summary <- testthat::SummaryReporter$new()
testthat::test_file('./tests/testthat/test-mytest.R', reporter = summary)
Which seems to do what I want, but when looking at the results...
> summary$end_reporter()
== Failed ===============================================================================================
-- 1. Failure (test-load_b_pick_last_true.R:5:5): test 1 ------------------------------------------------
2 + 2 (`actual`) not equal to 5 (`expected`).
`actual`: 4
`expected`: 5
== DONE =================================================================================================
...it is only the first test that is returned.
How it works:
An ordinary test is executed.
The reporter, currently in use, is obtained (-- 1 --)
callr::r is used to call a testthat block including a test.
Within the call, I tried using set_reporter, but with_reporter is practically identical.
The callr::r call returns the reporter (tried it with get_reporter(), but with_reporter also returns the reporter (invisibly))
Now the returned reporter seems fine, but when setting it as the actual reporter with set_reporter, it seems that it is not overwriting the actual reporter.
Note that at -- 2 --, the reporter_out contains both test outcomes.
Question
I am not really sure what I expect it to do, but in the end I want the results to be added to the original reporter ((summary or) reporter_in that is, if that is not some kind of copy).
One workaround I can think of would be to move the actual test execution outside of the callr::r call, but gather the testcases inside.
I think it is neat, as long as you can place these helper functions (see the elaborate example) in your package, you can write tests with little overhead.
It doesn't answer how to work with the 'reporter' object though...
Simple example:
test_outcome <- callr::r(
function() {
# devtools::load_all()
list(
check1 = mypackage::sum(5,5), # some imaginary exported functions sum and name.
check2 = mypackage::name()
)
}
)
test_that('My test case', {
expect_equal(test_outcome$check1, 10)
expect_equal(test_outcome$check2, 'Siete')
})
Elaborate example
Note that from .add_test to .exp_true are only function definitions which can better be included in your package so they will be available when being loaded with devtools::load_all(). load_all also loads not-exported functions by default.
test_outcome <- callr::r(
function() {
# devtools::load_all()
# Defining helper functions
tst <- list(desc = 'My first test', tests = list())
.add_test <- function(type, A, B) {
# To show at least something about what is actually tested when returning the result, we can add the actual `.exp_...` call to the test.
call <- as.character(sys.call(-1))
tst$tests[[length(tst$tests) + 1]] <<- list(
type = type, a = A, b = B,
# (I couldn't find a better way to create a nice call string)
call = paste0(call[1], '(', paste0(collapse = ', ', call[2:length(call)]), ')'))
}
.exp_error <- function(expr, exp_msg) {
err_msg <- ''
tryCatch({expr}, error = function(err) {
err_msg <<- err$message
})
.add_test('error', err_msg, exp_msg)
}
.exp_match <- function(expr, regex) {
.add_test('match', expr, regex)
}
.exp_equal <- function(expr, ref) {
.add_test('equal', expr, ref)
}
.exp_false <- function(expr) {
.add_test('false', expr, FALSE)
}
.exp_true <- function(expr) {
.add_test('true', expr, TRUE)
}
# Performing the tests
.exp_match('My name is Siete', 'My name is .*')
.exp_equal(mypackage::sum(5,5), 10) # some imaginary exported functions sum and name.
.exp_match(mypackage::name(), 'Siete')
.exp_false('package:testthat' %in% search())
return(tst)
},
show = TRUE)
# Performing the actual testthat tests:
.run_test_batch <- function(test_outcome) {
test_that(test_outcome$desc, {
for (test in test_outcome$tests) {
# 'test' is a list with the fields 'type', 'a', 'b' and 'call'.
# Where 'type' can contain 'match', 'error', 'true', 'false' or 'equal'.
if (test$type == 'equal') {
with(test, expect_equal(a, b, label = call))
} else if (test$type == 'true') {
expect_true( test$a, label = test$call)
} else if (test$type == 'false') {
expect_false(test$a, label = test$call)
} else if (test$type %in% c('match', 'error')) {
with(test, expect_match(a, b, label = call))
}
}
})
}
.run_test_batch(test_outcome)
When moving the functions to your package you would need the following initialize function too.
tst <- new.env(parent = emptyenv())
tst$desc = ''
tst$tests = list()
.initialize_test <- function(desc) {
tst$desc = desc
tst$tests = list()
}
It works as follows:
An empty list is created: tst
By calling .exp_... functions, tests are added to that list
The list with tests is returned by the function in callr::r
Then we loop over the list and execute every test

Why does this happen when a user-defined R function does not return a value?

The function shown below contains no return statement. However, after executing it, I can confirm that a value was assigned to d as expected.
Why does this work without a return? Any explanation will be appreciated.
Code
# requires the plotly and dplyr packages
library(plotly)
library(dplyr)
accumulate_by <- function(dat, var) {
var <- lazyeval::f_eval(var, dat)
lvls <- plotly:::getLevels(var)
dats <- lapply(seq_along(lvls), function(x) {
cbind(dat[var %in% lvls[seq(1, x)], ], frame = lvls[[x]])
})
dplyr::bind_rows(dats)
}
d <- txhousing %>%
filter(year > 2005, city %in% c("Abilene", "Bay Area")) %>%
accumulate_by(~date)
In the function, the last assignment creates 'dats', which is then returned via bind_rows(dats), the last expression evaluated. We don't need an explicit return statement. If two objects need to be returned, we can place them in a list.
In some languages like Python, generators are used for memory efficiency: they yield values one at a time instead of creating the whole output in memory. Consider two functions in Python:
def get_square(n):
    result = []
    for x in range(n):
        result.append(x**2)
    return result
When we run it
get_square(4)
#[0, 1, 4, 9]
The same function can be written as a generator. Instead of returning a complete list, it yields one value at a time:
def get_square(n):
    for x in range(n):
        yield x**2
Running the function
get_square(4)
#<generator object get_square at 0x0000015240C2F9E8>
By casting with list, we get the same output
list(get_square(4))
#[0, 1, 4, 9]
There is always a return :) You just don't have to be explicit about it.
All R expressions return something, including control structures and user-defined functions. (Control structures are just functions, by the way, so you can just remember that everything is a value or a function call, and everything evaluates to a value.)
For functions, the return value is the last expression evaluated in the execution of the function. So, for
f <- function(x) 2 + x
when you call f(3) you will invoke the function + with two parameters, 2 and x. These evaluate to 2 and 3, respectively, so `+`(2, 3) evaluates to 5, and that is the result of f(3).
When you call the return function -- and remember, this is a function -- you just leave the control-flow of a function early. So,
f <- function(x) {
if (x < 0) return(0)
x + 2
}
works as follows: When you call f, it will call the if function to figure out what to do in the first statement. The if function will evaluate x < 0 (which means calling the function < with parameters x and 0). If x < 0 is true, if will evaluate return(0). If it is false, it will evaluate its else part (which, because if has a special syntax when it comes to functions, isn't shown, but is NULL). If x < 0 is not true, f will evaluate x + 2 and return that. If x < 0 is true, however, the if function will evaluate return(0). This is a call to the function return, with parameter 0, and that call will terminate the execution of f and make the result 0.
Be careful with return. It is a function so
f <- function(x) {
if (x < 0) return;
x + 2
}
is perfectly valid R code, but it will not return when x < 0. The if call will just evaluate to the function return but not call it.
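A quick way to convince yourself is to run both variants side by side. This is a minimal sketch; the names f_bare and f_call are made up for the comparison.

```r
# Bare `return`: the if just evaluates to the function object, which is discarded.
f_bare <- function(x) {
  if (x < 0) return;
  x + 2
}

# `return(0)`: the function is actually called and exits early.
f_call <- function(x) {
  if (x < 0) return(0)
  x + 2
}

print(f_bare(-1))  # 1, because execution falls through to x + 2
print(f_call(-1))  # 0
```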
The return function is also a little special in that it can return from the parent call of control structures. Strictly speaking, return isn't evaluated in the frame of f in the examples above, but from inside the if calls. It is just handled specially so it can return from f.
With non-standard evaluation this isn't always the case.
With this function
f <- function(df) {
with(df, if (any(x < 0)) return("foo") else return("bar"))
"baz"
}
you might think that
f(data.frame(x = rnorm(10)))
should return either "foo" or "bar". After all, we return in either case in the if statement. However, the if statement is evaluated inside with and it doesn't work that way. The function will return baz.
For non-local returns like that, you need to use callCC, and then it gets more technical (as if this wasn't technical enough).
If you can, try to avoid return completely and rely on functions returning the last expression they evaluate.
Update
Just to follow up on the comment below about loops. When you call a loop, you will most likely call one of the built-in primitive functions. And, yes, they return NULL. But you can write your own, and they will follow the rule that they return the last expression they evaluate. You can, for example, implement for in terms of while like this:
`for` <- function(itr_var, seq, body) {
itr_var <- as.character(substitute(itr_var))
body <- substitute(body)
e <- parent.frame()
j <- 1
while (j <= length(seq)) {
assign(x = itr_var, value = seq[[j]], envir = e)
eval(body, envir = e)
j <- j + 1
}
"foo"
}
This function will definitely return "foo", so this
for(i in 1:5) { print(i) }
evaluates to "foo". If you want it to return NULL, you have to be explicit about it (or just let the return value be the result of the while loop; if that is the primitive while, it returns NULL).
The point I want to make is that the rule that functions return the last expression they evaluate has to do with how the functions are defined, not how you call them. The loops use non-standard evaluation, so the last expression in the loop body you provide might or might not be the last value they evaluate. For the primitive loops, it is not.
Except for their special syntax, there is nothing magical about loops. They follow the rules all functions follow. With non-standard evaluation it can get a bit tricky to work out from a function call what the last expression they will evaluate might be, because the function body looks like it is what the function evaluates. It is, to a degree, if the function is sensible, but the loop body is not the function body. It is a parameter. If it wasn't for the special syntax, and you had to provide loop bodies as normal parameters, there might be less confusion.

Can R recognize the type of distribution used as a function argument?

Background
I have a simple function called TBT. This function has a single argument called x. A user can provide any type rdistribution_name() (e.g., rnorm(), rf(), rt(), rbinom() etc.) existing in R for argument x, EXCEPT ONE: "rcauchy()".
Question
I was wondering how R could recognize that a user has provided an rcauchy() as the input for x, and when this is the case, then R issues a warning message?
Here is my R code with no success:
TBT = function(x) {
if( x == rcauchy(...) ) { warning("\n\tThis type of distribution is not supported.") }
}
TBT( x = rcauchy(1e4) )
Error in TBT(rcauchy(10000)) : '...' used in an incorrect context
If you are expecting them to call the random function directly when they call your function, you could do
TBT <- function(x) {
xcall <- match.call()$x
if (is.call(xcall) && identical(xcall[[1]], quote(rcauchy))) {
warning("\n\tThis type of distribution is not supported.")
}
}
TBT( x = rcauchy(1e4) )
But this would not catch cases like
x <- rcauchy(1e4)
TBT( x )
R can't track where the data in the x variable came from.
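Both situations can be checked with a small sketch. It uses substitute(x) as a variant of match.call()$x, and the helper warned is a hypothetical convenience that reports whether evaluating an expression raised a warning.

```r
TBT <- function(x) {
  xcall <- substitute(x)  # the unevaluated expression supplied for x
  if (is.call(xcall) && identical(xcall[[1]], quote(rcauchy))) {
    warning("This type of distribution is not supported.")
  }
  invisible(NULL)
}

# Hypothetical helper: TRUE if evaluating expr signals a warning, FALSE otherwise.
warned <- function(expr) tryCatch({ expr; FALSE }, warning = function(w) TRUE)

direct <- warned(TBT(rcauchy(10)))  # rcauchy() appears literally in the call
x <- rcauchy(10)
indirect <- warned(TBT(x))          # only the symbol x is visible to TBT
other <- warned(TBT(rnorm(10)))
print(c(direct = direct, indirect = indirect, other = other))
```

Only the direct call is detected, which matches the limitation described above.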

Order of methods in R reference class and multiple files

There is one thing I really don't like about R reference class: the order you write the methods matters. Suppose your class goes like this:
myclass = setRefClass("myclass",
fields = list(
x = "numeric",
y = "numeric"
))
myclass$methods(
afunc = function(i) {
message("In afunc, I just call bfunc...")
bfunc(i)
}
)
myclass$methods(
bfunc = function(i) {
message("In bfunc, I just call cfunc...")
cfunc(i)
}
)
myclass$methods(
cfunc = function(i) {
message("In cfunc, I print out the sum of i, x and y...")
message(paste("i + x + y = ", i+x+y))
}
)
myclass$methods(
initialize = function(x, y) {
x <<- x
y <<- y
}
)
And then you start an instance, and call a method:
x = myclass(5, 6)
x$afunc(1)
You will get an error:
Error in x$afunc(1) : could not find function "bfunc"
I am interested in two things:
Is there a way to work around this nuisance?
Does this mean I can never split a really long class file into multiple files? (e.g. one file for each method.)
Calling bfunc(i) isn't going to invoke the method since it doesn't know what object it is operating on!
In your method definitions, .self is the object the method was invoked on. So change your code to:
myclass$methods(
afunc = function(i) {
message("In afunc, I just call bfunc...")
.self$bfunc(i)
}
)
(and similarly for bfunc). Are you coming from C++ or some language where functions within methods are automatically invoked within the object's context?
Some languages make this more explicit, for example in Python a method with one argument like yours actually has two arguments when defined, and would be:
def afunc(self, i):
    [code]
but called like:
x.afunc(1)
then within afunc there is the self variable, which refers to x (although calling it self is a universal convention, it could be called anything).
In R, the .self is a little bit of magic sprinkled over reference classes. I don't think you could change it to .this even if you wanted.
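As a compact illustration of the .self approach, the sketch below defines methods in separate methods() calls, with the caller defined before the callee. The class name Counter is made up, and the default initializer is used instead of a custom one.

```r
library(methods)

# Hypothetical class mirroring the question's fields.
Counter <- setRefClass("Counter",
  fields = list(x = "numeric", y = "numeric"))

# afunc is defined before bfunc exists, so it reaches bfunc through .self.
Counter$methods(
  afunc = function(i) .self$bfunc(i)
)
Counter$methods(
  bfunc = function(i) i + x + y
)

obj <- Counter$new(x = 5, y = 6)
result <- obj$afunc(1)
print(result)
```

Because each methods() call is independent here, the calls could in principle live in separate files that are sourced after the class definition.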

Resources