"could not find function" when using functions as arguments - r

I have two .R files, plotDataSet(..) and plotAllDataSets(). plotDataSet(..) makes a call to curve(..) (in the R graphics library), while plotAllDataSets() makes a call to plotDataSet(..). plotDataSet(..) takes a function as a parameter, and passes it to curve(..).
I want to pass in my function argument for curve(..) into plotDataSet(..) from a list of functions, such as:
v <- c(function(x){x}, function(x){x*x}, function(x){x*x}, function(x){x*x*x},
function(x){x*x}, function(x){x*x*x}, function(x){x*x*x})
for (i in 1:7) {
plotSaveData(data, v[i], i)
}
I get the following output: Error in eval(expr, envir, enclos) :
could not find function "expectedOrderEquation".
Interestingly, when I call plotDataSet(..) and pass in a function like function(x){x*x}, it works fine:
for (i in 1:7) {
plotSaveData(data, function(x) {x}, i)
}
But this won't let me call plotSaveData(..) while cycling through a list of functions.
Can someone please explain why this does not work?
I hope this is sufficient, but I am happy to provide more context as needed. Also, I am a bit new to R, so any corrections to my descriptions would be helpful.

use double brackets instead of single brackets
v[[i]] instead of v[i]
Have a look at the difference between these two:
v[[i]] (3)
v[i] (3) # error
The single brackets returns a list, whose contents is a function
The double brackets returns the function.

Related

Encountering errors when writing a quicksort function in R using a partition function

I've been writing this quicksort function in R trying to incorporate a partition function I've created as well. However, I've been encountering bugs when comparing p and r. It keeps telling me my argument is of length 0, however, I thought I declared the p and r objects when I initially called the quicksort function.
partition <- function(input,p, r){
pivot = input[r]
while(p<r){
while(input[p]<pivot) {p<-p+1}
while(input[r]>pivot) {r<-r-1}
if(input[p]==input[r]) {p<-p+1}
else if (p<r){
tmp <- input[p]
input[p] = input[r]
input[r] = tmp
}
}
return(r)
}
quicksort<- function(input,p,r){
if(p<r){
j<- partition(input,p,r)
input <- quicksort(input,p,j-1)
input <- quicksort(input,j+1,r)
}
}
input <- c(500,700,800,100,300,200,900,400,1000,600)
print("Input:")
print(input)
quicksort(input,1,10)
print("Output:")
print(input)
The error in question is caused because input[p] is of length zero. Why? Because in this instance input is NULL. input isn't NULL for the first few goes, so what would make it NULL?
Your quicksort function is designed to take an input, change it (if p<r), and then output it. But, you've left out the output step. If p<r then this is taken care of implicitly by the last input <- ... line, but if not then the function doesn't do anything and just returns NULL.
The output from one call to quicksort is the input to the next, and so this NULL propagates and breaks the next call.
Recursive functions are beautiful but often frustrating to debug. I recommend liberally sprinkling print() statements around while you're still developing it so you can see what it's doing more easily.

Subsetting data as generic function in R

I am trying to create a function that plots graphs for either an entire dataset, or a subset of the data. The function needs to be able to do both so that you can plot the subset if you so wish. I am struggling with just coming up with the generic subset function.
I currently have this code (I am more of a SAS user so R is confusing me a bit):
subset<-function(dat, varname, val)
if(dat$varname==val) {
data<-subset(dat, dat$varname==val)
}
But R keeps returning this error message:
Error in if (dat$varname == val) { : argument is of length zero
Could someone help me to resolve this? Thanks so much! I figure it may have to do with the way I wrote it.
First off all the $ operator can not handle variables. In your code you are always looking up a column named varname.
Replace $varname with [varname] instead.
The next error is that you are conditioning on a vector, dat$varname==val will be vector of booleans.
A third error in your code is that you are naming your function subset and thus overlayering the subset function in the base package. So the inner call to subset will be a recursive call to your own function. To fix this rename your function or you have to specify that it is the subset function in the base package you are calling with base::subset(dat, dat[varname]==val).
The final error in the code is that your function does not return anything. Do not assign the result to the variable data but return it instead.
Here is how the code should look like.
mySubset<-function(dat, varname, val)
if(any(dat[varname]==val)) {
subset(dat, dat[varname]==val)
} else {
NA
}
Or even better
mySubset <- function(dat,varname,val) dat[dat[varname] == val]

using callCC with higher-order functions in R

I'm trying to figure out how to get R's callCC function for short-circuiting evalutation of a function to work with functions like lapply and Reduce.
Motivation
This would make Reduce and and lapply have asymptotic efficiency > O(n), by allowing you to
exit a computation early.
For example, if I'm searching for a value in a list I could map a 'finder' function across the list, and the second it is found lapply stops running and that value is returned (much like breaking a loop, or using a return statement to break out early).
The problem is I am having trouble writing the functions that lapply and Reduce should take using a style that callCC requires.
Example
Say I'm trying to write a function to find the value '100' in a list: something equivalent to
imperativeVersion <- function (xs) {
for (val in xs) if (val == 100) return (val)
}
The function to pass to lapply would look like:
find100 <- function (val) { if (val == 100) SHORT_CIRCUIT(val) }
functionalVersion <- function (xs) lapply(xs, find100)
This (obviously) crashes, since the short circuiting function hasn't been defined yet.
callCC( function (SHORT_CIRCUIT) lapply(1:1000, find100) )
The problem is that this also crashes, because the short circuiting function wasn't around when find100 was defined. I would like for something similar to this to work.
the following works because SHORT_CIRCUIT IS defined at the time that the function passed to lapply is created.
callCC(
function (SHORT_CIRCUIT) {
lapply(1:1000, function (val) {
if (val == 100) SHORT_CIRCUIT(val)
})
)
How can I make SHORT_CIRCUIT be defined in the function passed to lapply without defining it inline like above?
I'm aware this example can be achieved using loops, reduce or any other number of ways. I am looking for a solution to the problem of using callCC with lapply and Reduce in specific.
If I was vague or any clarification is needed please leave a comment below. I hope someone can help with this :)
Edit One:
The approach should be 'production-quality'; no deparsing functions or similar black magic.
I found a soluton to this problem:
find100 <- function (val) {
if (val == 100) SHORT_CIRCUIT(val)
}
short_map <- function (fn, coll) {
callCC(function (SHORT_CIRCUIT) {
clone_env <- new.env(parent = environment(fn))
clone_env$SHORT_CIRCUIT <- SHORT_CIRCUIT
environment(fn) <- clone_env
lapply(coll, fn)
})
}
short_map(find100, c(1,2,100,3))
The trick to making higher-order functions work with callCC is to assign the short-circuiting function into the input functions environment before carrying on with the rest of the program. I made a clone of the environment to avoid unintended side-effects.
You can achieve this using metaprogramming in R.
#alexis_laz's approach was in fact already metaprogramming.
However, he used strings which are a dirty hack and error prone. So you did well to reject it.
The correct way to approach #alexis_laz's approach would be by wrangling on code level. In base R this is done using substitute(). There are however better packages e.g. rlang by Hadley Wickham. But I give you a base R solution (less dependency).
lapply_ <- function(lst, FUN) {
eval.parent(
substitute(
callCC(function(return_) {
lapply(lst_, FUN_)
}),
list(lst_ = lst, FUN_=substitute(FUN))))
}
Your SHORT_CIRCUIT function is actually a more general, control flow return function (or a break function which takes an argument to return it). Thus, I call it return_.
We want to have a lapply_ function, in which we can in the FUN= part use a return_ to break out of the usual lapply().
As you showed, this is the aim:
callCC(
function (return_) {
lapply(1:1000, function (x) if (x == 100) return_(x))
}
)
Just with the problem, that we want to be able to generalize this expression.
We want
callCC(
function(return_) lapply(lst, FUN_)
)
Where we can use inside the function definition we give for FUN_ the return_.
We can let, however, the function defintion see return_ only if we insert the function definition code into this expression.
This exactly #alexis_laz tried using string and eval.
Or you did this by manipulating environment variables.
We can safely achieve the insertion of literal code using substitute(expr, replacer_list) where expr is the code to be manipulated and replacer_list is the lookup table for the replacement of code.
By substitute(FUN) we take the literal code given for FUN= for lapply_ without evaluating it. This expression returns literal quoted code (better than the string in #alexis_laz's approach).
The big substitute command says: "Take the expression callCC(function(return_) lapply(lst_, FUN_)) and replace lst_ in this expression by the list given for coll and FUN_ by the literal quoted expression given for FUN.
This replaced expression is then evaluated in the parent environment (eval.parent()) meaning: the resulting expression replaces the lapply_() call and is executed exactly where it was placed.
Such use of eval.parent() (or eval( ... , envir=parent.frame())) is fool proof. (otherwise, tidyverse packages wouldn't be production level ...).
So in this way, you can generalize callCC() calls.
lapply_(1:1000, FUN=function(x) if (x==100) return_(x))
## [1] 100
I don't know if it can be of use, but:
find100 <- "function (val) { if (val == 100) SHORT_CIRCUIT(val) }"
callCC( function (SHORT_CIRCUIT) lapply(1:1000, eval(parse(text = find100))) )
#[1] 100

Environment chaining in R

In my R development I need to wrap function primitives in proto objects so that a number of arguments can be automatically passed to the functions when the $perform() method of the object is invoked. The function invocation internally happens via do.call(). All is well, except when the function attempts to access variables from the closure within which it is defined. In that case, the function cannot resolve the names.
Here is the smallest example I have found that reproduces the behavior:
library(proto)
make_command <- function(operation) {
proto(
func = operation,
perform = function(., ...) {
func <- with(., func) # unbinds proto method
do.call(func, list(), envir=environment(operation))
}
)
}
test_case <- function() {
result <- 100
make_command(function() result)$perform()
}
# Will generate error:
# Error in function () : object 'result' not found
test_case()
I have a reproducible testthat test that also outputs a lot of diagnostic output. The diagnostic output has me stumped. By looking up the parent environment chain, my diagnostic code, which lives inside the function, finds and prints the very same variable the function fails to find. See this gist..
How can the environment for do.call be set up correctly?
This was the final answer after an offline discussion with the poster:
make_command <- function(operation) {
proto(perform = function(.) operation())
}
I think the issue here is clearer and easier to explore if you:
Replace the anonymous function within make_command() with a named one.
Make that function open a browser() (instead of trying to get result). That way you can look around to see where you are and what's going on.
Try this, which should clarify the cause of your problem:
test_case <- function() {
result <- 100
myFun <- function() browser()
make_command(myFun)$perform()
}
test_case()
## Then from within the browser:
#
parent.env(environment())
# <environment: 0x0d8de854>
# attr(,"class")
# [1] "proto" "environment"
get("result", parent.env(environment()))
# Error in get("result", parent.env(environment())) :
# object 'result' not found
#
parent.frame()
# <environment: 0x0d8ddfc0>
get("result", parent.frame()) ## (This works, so points towards a solution.)
# [1] 100
Here's the problem. Although you think you're evaluating myFun(), whose environment is the evaluation frame of test_case(), your call to do.call(func, ...) is really evaluating func(), whose environment is the proto environment within which it was defined. After looking for and not finding result in its own frame, the call to func() follows the rules of lexical scoping, and next looks in the proto environment. Neither it nor its parent environment contains an object named result, resulting in the error message you received.
If this doesn't immediately make sense, you can keep poking around within the browser. Here are a few further calls you might find helpful:
environment(get("myFun", parent.frame()))
ls(environment(get("myFun", parent.frame())))
environment(get("func", parent.env(environment())))
ls(environment(get("func", parent.env(environment()))))

R warning() wrapper - raise to parent function

I have a wrapper around the in-built warning() function in R that basically calls warning(sprintf(...)):
warningf <- function(...)
warning(sprintf(...))
This is because I use warning(sprintf(...)) so often that I decided to make a function out of it (it's in a package I have of functions I use often).
I then use warningf when I write functions. i.e., instead of writing:
f <- function() {
# ... do stuff
warning(sprintf('I have %i bananas!',2))
# ... do stuff
}
I write:
f <- function() {
# ... do stuff
warningf('I have %i bananas!',2)
# ... do stuff
}
If I call the first f(), I get:
Warning message:
In f() : I have 2 bananas!
This is good - it tells me where the warning came from f() and what went wrong.
If I call the second f(), I get:
Warning message:
In warningf("I have %i bananas!",2) : I have 2 bananas!
This is not ideal - it tells me the warning was in the warningf function (of course, because it's the warningf function that calls warning, not f), masking the fact that it actually came from the f() function.
So my question is : Can I somehow "raise" the warning call so it displays the warning in f() message instead of the warning in warningf ?
One way of dealing with this is to get a list of the environments in your calling stack, and then pasting the name of the parent frame in your warning.
You do this with the function sys.call() which returns an item in the call stack. You want to extract the second from last element in this list, i.e. the parent to warningf:
warningf <- function(...){
parent.call <- sys.call(sys.nframe() - 1L)
warning(paste("In", deparse(parent.call), ":", sprintf(...)), call.=FALSE)
}
Now, if I run your function:
> f()
Warning message:
In f() : I have 2 bananas!
Later edit : deparse(parent.call) converts the call to a string in the case that the f() function had arguments, and shows the call as it was specified (ie including arguments etc).
I know it's old but, sys.call(sys.nframe() - 1L), or sys.call(-1),
returns a vector, with the function name and the argument.
If you use it inside paste() it will raise two warnings, one from the function and one from the argument.
The answer doesn't show because f() has no arguments.
sys.call(sys.nframe() - 1L)[1] does the trick.

Resources