I am trying to refactor this. In Python, I would use a decorator. What's the 'R'tful way to do this? Say, we have this pattern
good_input <- format_input( bad_input )
bad_output <- use_this_func( good_input )
good_output <- format_output( bad_output )
And then again,
good_input <- format_input( bad_input )
bad_output <- use_this_other_func( good_input )
good_output <- format_output( bad_output )
As you can imagine, this proliferates like wild mushrooms. I want something close to this solution
use_this_robust_func <- wrapper( use_this_func ) # Or wrapper( use_this_other_func )
good_output <- use_this_robust_func( bad_input )
I'm trying to wrap the call to use_this_func and use_this_other_func (and related functions) with format_input and format_output. Using in part this question, so far I have
wrapper <- function( func_not_robust ){
func_robust <- function( ... ){
# This is the bit I haven't figured out
... format_input( ) ... # supposed to convert bad input - the function argument - to good
bad_output <- func_not_robust( ... ) # supposed to take good input as argument
good_output <- format_output( bad_output )
return( good_output )
}
return( func_robust )
}
Sorry for the pseudo-code. Note I am not sure that this is the way to go in R. I'm not wedded to the sketch of the solution above, which is born from translating Python - and badly at that - to R. How would an R native do this? Thanks in advance.
I think you are pretty much there. Here's an example where the first stage of cleaning is to replace negative input values with NAs, and the output cleaning is simply to negate everything:
format_input <- function(x){
x[x<0] <- NA
return(x)
}
format_output <- function(x){
return(-x)
}
wrapper <- function(f){
force(f)
g = function(bad_input){
good_input = format_input(bad_input)
bad_output = f(good_input)
good_output = format_output(bad_output)
return(good_output)
}
g
}
Then:
> wrapper(sqrt)(c(-2,2))
[1] NA -1.414214
wrapper(sqrt) returns a "closure", which is a function with enclosed data. The function f has the value of the function sqrt as part of that enclosure.
The force call is needed because f doesn't get evaluated when g is created: R passes arguments lazily, as "promises", and the promise for f is only evaluated the first time the wrapped version runs. If the variable that was passed as f has changed in the meantime (creating closures in a loop is the classic case), the wrapped version can pick up the wrong function. Adding a force call for the arguments of a closure generator costs next to nothing. It's a bit cargo-cult programming but never a problem.
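To make the lazy-evaluation point concrete, here is a minimal sketch in which the variable passed to the generator is reassigned before the closure is first called. The names make_lazy and make_forced are hypothetical and only serve the illustration:

```r
# Closure generator without force(): f stays an unevaluated promise
make_lazy <- function(f) {
  function(x) f(x)
}

# Same generator, but the promise is evaluated immediately
make_forced <- function(f) {
  force(f)
  function(x) f(x)
}

h <- sqrt
lazy   <- make_lazy(h)
forced <- make_forced(h)

h <- abs          # reassign h before either closure has been called

lazy(4)           # 4: the promise is evaluated only now and finds abs
forced(4)         # 2: sqrt was captured when the closure was created
```

Once the promise has been evaluated it is cached, so later changes to h no longer affect either closure.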
A more flexible solution might be to specify the input and output cleaning functions as arguments to the closure generator, with defaults:
wrapper <- function(f, fi=format_input, fo=format_output){
force(f) ; force(fi); force(fo)
g = function(bad_input){
good_input = fi(bad_input)
bad_output = f(good_input)
good_output = fo(bad_output)
return(good_output)
}
g
}
Then I can wrap sqrt with different input and output formatters. For example, to swap the negating output formatter for one that returns positive values:
> make_pos = function(x){abs(x)}
> wrapper(sqrt,fo=make_pos)(c(-2,2))
[1] NA 1.414214
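The question's sketch used ... so that the wrapped function could take extra arguments. One way to support that, as a rough sketch: clean only the first argument and forward the rest untouched. The name wrapper_dots and the choice of identity as the default formatter are assumptions made for this example, not part of the answer above:

```r
# Assumed variant: only the first argument is cleaned; everything in ...
# is forwarded unchanged to the wrapped function.
wrapper_dots <- function(f, fi = identity, fo = identity) {
  force(f); force(fi); force(fo)
  function(bad_input, ...) {
    fo(f(fi(bad_input), ...))
  }
}

# Clean the input with abs(), then pass digits = 2 through to round()
robust_round <- wrapper_dots(round, fi = abs)
robust_round(-3.14159, digits = 2)   # 3.14
```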
An even more flexible solution is to spot that you are generating chains of functions here. Your output is format_output(sqrt(format_input(bad_input))). This is function composition and there's a function in the functional package to do that:
> require(functional)
> w = Compose(format_input, sqrt, format_output)
> w(c(-2,2))
[1] NA -1.414214
This perhaps gets more useful when you have more than three functions in your composition. You could, for example, have a list of functions and compose them all together using do.call....
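If you would rather not depend on an extra package, a list of functions can also be composed in base R with Reduce. This is only a sketch: compose_all is a hypothetical helper, and format_input and format_output are repeated from above so the block runs on its own:

```r
format_input  <- function(x) { x[x < 0] <- NA; x }
format_output <- function(x) -x

# Apply the functions in fs one after another, left to right,
# starting from the input value x
compose_all <- function(fs) {
  function(x) Reduce(function(acc, f) f(acc), fs, x)
}

pipeline <- list(format_input, sqrt, format_output)
w <- compose_all(pipeline)
w(c(-2, 2))   # NA -1.414214
```

Adding another cleaning step is then just a matter of appending a function to the list.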
Once you see patterns in functional programming it's addictive. I'll stop now.
I want to write a function. However, the assumption is that I don't know the input arguments of the function. I just have a character vector to define input arguments of the function. Consider the following code:
f <- expression(exp(-d^2/s^2) )
fx <- function(d, s){ eval( f[[1]] ) }
In the above code, I know the parameters of the expression and can easily define a function that evaluates it. But I get the expression from the user, so I don't know in advance what the parameters are. So, I want something like this:
f <- expression(exp(-d^2/s^2) )
v = all.vars(f)
#"d" "s"
fx <- function(?){ eval( f[[1]] ) }
I want the elements of v (here d and s) to become the function's arguments in place of the ?. Is there any way?
There are a few ways to construct functions programmatically, one being:
body <- quote(exp(-d^2/s^2))
arg_names <- all.vars(body)
args <- setNames(rep(NA,length(arg_names)),arg_names)
fx <- as.function(c(args,body))
> fx(1,1)
[1] 0.3678794
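Note that the version above gives every argument a default of NA. If the arguments should be required instead, a variant along these lines may work; make_function is a hypothetical helper, and it assumes the user's expression arrives as a single string:

```r
# Build function(<vars>) <expr> from a string, with no default values
make_function <- function(expr_text, env = parent.frame()) {
  body_expr <- parse(text = expr_text)[[1]]
  arg_names <- all.vars(body_expr)
  # alist(x = ) holds one "empty" argument; repeat it once per variable
  args <- setNames(rep(alist(x = ), length(arg_names)), arg_names)
  as.function(c(args, list(body_expr)), envir = env)
}

fx <- make_function("exp(-d^2/s^2)")
names(formals(fx))   # "d" "s"
fx(d = 1, s = 1)     # 0.3678794
```

Evaluating user-supplied text like this runs arbitrary code, so it is only appropriate when the input is trusted.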
I am eager to use the R package BB to solve a system of non-linear equations, but the syntax does not seem to allow for parameters to be passed to the system of equations. Very strange since this would severely limit what appears to be an otherwise very appealing and powerful alternative to nleqslv().
To be clear: "Normally", you expect a solver to have a space for passing parameters to the underlying objective function. For example, in nleqslv:
out <- nleqslv(in_x, obj_fn, jac = NULL, other_pars1, other_pars2, method = "Broyden")
Where "in_x" is the vector of initial guesses at a solution, and the "other_pars1, other_pars2" are additional fixed parameters (can be scalars, vectors, matrices, whatever) required by "obj_fn".
In BBsolve, on the other hand, you just have
out <- BBsolve(in_x, obj_fn)
With no space to put in all the "other_pars1, other_pars2" required by obj_fn.
Create a function that "attaches" additional parameters to your objective function. The key concept here is that the return value is itself a function:
gen_obj_fn <- function( obj_fn, other_pars1, other_pars2 )
{
function(x) { obj_fn( x, other_pars1, other_pars2 ) }
}
The output of gen_obj_fn can now be passed directly to BBsolve:
## Previous call using nleqslv():
out <- nleqslv( in_x, myFun, jac = NULL, myParam1, myParam2, ... )
## Equivalent call using BBsolve():
myObjF <- gen_obj_fn( myFun, myParam1, myParam2 )
is.function( myObjF ) ## TRUE
out <- BBsolve( in_x, myObjF )
You haven't shown how you are using BBsolve. As I said in my comment, BBsolve certainly does accept additional function arguments.
But you must name those arguments.
See this example for how to do what you seem to want:
library(nleqslv)
f <- function(x,p1=3,p2=2) {
y <- numeric(2)
y[1] <- 10*x[1]+3*x[2]^2 - p1
y[2] <- x[1]^2 -exp(x[2]) -p2
y
}
xstart <- c(1,1)
nleqslv(xstart, f)
library(BB)
BBsolve(xstart,f)
Try slightly different values for p1 and p2:
nleqslv(xstart,f,p1=2.7,p2=2.1)
BBsolve(xstart,f,p1=2.7,p2=2.1)
Both functions find the same solution.
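The two approaches in these answers (wrapping the parameters in a closure, or passing them as named arguments that the solver forwards through ...) can be compared side by side without installing either package. The sketch below uses base R's uniroot purely as a stand-in solver; the objective function and its parameters a and b are made up for the illustration:

```r
# Toy objective: a * x - b, with root at x = b / a
obj_fn <- function(x, a, b) a * x - b

# Approach 1: fix the parameters in a closure
gen_obj_fn <- function(obj_fn, a, b) {
  force(obj_fn); force(a); force(b)
  function(x) obj_fn(x, a, b)
}
root_closure <- uniroot(gen_obj_fn(obj_fn, a = 2, b = 6),
                        interval = c(0, 10))$root

# Approach 2: pass the parameters by name; uniroot forwards them via ...
root_dots <- uniroot(obj_fn, interval = c(0, 10), a = 2, b = 6)$root

c(root_closure, root_dots)   # both close to 3
```

The closure approach works with any solver, including ones that do not forward extra arguments at all.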
I need a function created by a list of commands to fully evaluate so that it is identical to the "manual" version of the function.
Background: I am using ScaleR functions in Microsoft R Server and need to apply a set of transformations as a function. ScaleR is very picky about needing to be passed a function that is phrased exactly as specified below:
functionThatWorks <- function(data) {
data$marital_status_p1_ismarried <- impute(data$marital_status_p1_ismarried)
return(data)
}
I have a function that creates this list of transformations (and hundreds more, hence the need to functionalize its writing).
transformList <- list ("data$ismarried <- impute(data$ismarried)",
"data$issingle <- impute(data$issingle)")
This line outputs the evaluated string that I want to the console, but I am unaware of a way to move it from console output to being used in a function:
cat(noquote(unlist(bquote( .(noquote(transformList[1]))))))
I need to evaluate functionIWant so that it is identical to functionThatWorks.
functionIWant <- function(data){
eval( cat(noquote(unlist(bquote( .(noquote(transformList[1])))))) )
return(data)
}
identical(functionThatWorks, functionIWant)
EDIT: Adding in the answer based on @dww's code. It works well in ScaleR. It is identical, apart from insignificant whitespace.
functionIWant <- function(){}
formals(functionIWant) <- alist(data=NULL)
functionIWant.text <- parse(text = c(
paste( bquote( .(noquote(transformList[1]))), ";", "return(data)\n")
))
body(functionIWant) <- as.call(c(as.name("{"), functionIWant.text))
Maybe something like this?
# 1st define a 'hard-coded' function
f1 <- function (x = 2)
{
y <- x + 1
y^2
}
f1(3)
# [1] 16
# now create a similar function from a character vector
f2 <- function(){}
formals(f2) <- alist(x=2)
f2.text <- parse(text = c('y <- x + 1', 'y^2'))
body(f2) <- as.call(c(as.name("{"), f2.text))
f2(3)
# [1] 16
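The same technique extends to a whole list of transformation strings. The following is a sketch only: build_transform is a hypothetical helper, and impute is replaced by a stand-in that sets NA to 0, since the real impute from the question is not shown:

```r
# Hypothetical stand-in for the question's impute(): replace NA with 0
impute <- function(v) { v[is.na(v)] <- 0; v }

transformList <- list("data$ismarried <- impute(data$ismarried)",
                      "data$issingle <- impute(data$issingle)")

# Build function(data) { <all transformations>; data } from the strings
build_transform <- function(lines) {
  f <- function(data) NULL
  exprs <- parse(text = c(unlist(lines), "data"))
  body(f) <- as.call(c(list(as.name("{")), as.list(exprs)))
  f
}

transform_fn <- build_transform(transformList)
df <- data.frame(ismarried = c(1, NA), issingle = c(NA, 0))
transform_fn(df)   # both columns now have 0 in place of NA
```

Printing transform_fn shows a body made of the parsed statements, which is what makes it comparable to a hand-written function.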
The function testfun1, defined below, does what I want it to do. (For the reasoning behind all this, see the background info below the code example.) My question is why what I tried in testfun2 doesn't work. To me, both appear to be doing exactly the same thing. As shown by the print in testfun2, the evaluation of the helper function inside testfun2 takes place in the correct environment, yet the variables from the main function environment get magically passed to the helper function in testfun1, but not in testfun2. Does anyone know why?
helpfun <- function(){
x <- x^2 + y^2
}
testfun1 <- function(x,y){
xy <- x*y
environment(helpfun) <- sys.frame(sys.nframe())
x <- eval(as.call(c(as.symbol("helpfun"))))
return(list(x=x,xy=xy))
}
testfun1(x = 2,y = 1:3)
## works as intended
eval.here <- function(fun){
environment(fun) <- parent.frame()
print(environment(fun))
eval(as.call(c(as.symbol(fun))))
}
testfun2 <- function(x,y){
print(sys.frame(sys.nframe()))
xy <- x*y
x <- eval.here("helpfun")
return(list(x=x,xy=xy))
}
testfun2(x = 2,y = 1:3)
## helpfun can't find variable 'x' despite having the same environment as in testfun1...
Background info: I have a large R code in which I want to call helperfunctions inside my main function. They alter variables of the main function environment. The purpose of all this is mainly to unclutter my code. (Main function code is currently over 2000 lines, with many calls to various helperfunctions which themselves are 40-150 lines long...)
Note that the number of arguments to my helper functions is very high, so that the traditional explicit passing of function arguments ( "helpfun(arg1 = arg1, arg2 = arg2, ... , arg50 = arg50)") would be cumbersome and doesn't yield the uncluttering of the code that I am aiming for. Therefore, I need to pass the variables from the parent frame to the helper functions anonymously.
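In your eval.here, fun is the character string "helpfun", not the function, so the environment assignment never touches the function itself; the call that follows runs the original helpfun, whose environment is still the global one. Look the function up with get first, then change the environment of that copy and call it. Use this instead: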
eval.here <- function(fun){
fun <- get(fun)
environment(fun) <- parent.frame()
print(environment(fun))
fun()
}
Result:
> testfun2(x = 2,y = 1:3)
<environment: 0x0000000013da47a8>
<environment: 0x0000000013da47a8>
$x
[1] 5 8 13
$xy
[1] 2 4 6
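A stripped-down version of the same idea, without the print calls, shows that only the local copy of the helper has its environment changed. The names run_in_caller and main are hypothetical, and helpfun is simplified to return its result directly:

```r
helpfun <- function() x^2 + y^2

# Fetch the function by name, re-home a copy in the caller's frame, call it
run_in_caller <- function(fun_name) {
  fun <- get(fun_name, mode = "function")
  environment(fun) <- parent.frame()
  fun()
}

main <- function(x, y) run_in_caller("helpfun")

env_before <- environment(helpfun)
main(x = 2, y = 1:3)                          # 5 8 13
identical(environment(helpfun), env_before)   # TRUE: the original is untouched
```

Because the original helpfun is never modified, the same helper can be reused from several different calling functions.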
I am constructing an approximating function recursively (adaboost). I would like to create the resulting learning function along the way (not to apply the approximation directly to my test data but keep the function that leads to it)
Unfortunately, it seems that R looks up the value a variable name refers to long after the point where the name is used.
#defined in plyr as well
id <- function(x) {x}
#my first classifier
modelprevious <- function(inputx, k) { k(0)}
#one step of my superb model
modelf <- function(x) 2*x #for instance
#I update my classifier
modelCurrent <- function(inputx, k)
{ modelprevious(inputx, function(res) {k(res + modelf(inputx))})}
#it works
modelCurrent(2,id) #4
#Problem
modelf <- function(x) 3*x
modelCurrent(2,id) #6 WTF !!
The same function with the same arguments returns something different, which is quite annoying!
So how is it possible to capture the value represented by modelf, so that the resulting function depends only on its arguments and on the bindings in place when it was defined, and not on some global state?
Given that problem, I don't see how one can build functions recursively in R if one cannot touch local variables, apart from going through ugly hacks with quote/parse.
You need a factory:
modelCurrent = function(mf){
return(function(inputx,k){
modelprevious(
inputx,
function(res){
k(res+mf(inputx))
} # function(res)
) # modelprevious
} # inner function
) # return
} # top function
Now you use the factory to create models with the modelf function that you want it to use:
> modelf <- function(x) 2*x
> m1 = modelCurrent(modelf)
> m1(2,id)
[1] 4
> modelf <- function(x) 3*x
> m1(2,id) # no change.
[1] 4
You can always make them on an ad-hoc basis:
> modelCurrent(modelf)(2,id)
[1] 6
and there you can see the factory created a function using the current definition of modelf, so it multiplied by three.
There's one last ginormous WTF!?! that will hit you. Watch carefully:
> modelf <- function(x) 2*x
> m1 = modelCurrent(modelf)
> m1(2,id)
[1] 4
>
> m1 = modelCurrent(modelf) # create a function using the 2* modelf
> modelf <- function(x) 3*x # change modelf...
> m1(2,id) # WTF?!
[1] 6
This is because mf isn't evaluated when the factory is called: it is only used inside the inner function, so it stays unevaluated until the inner function is first called, and by then modelf has changed.
The trick is to force evaluation of the mf in the outer function, typically using force:
modelCurrent = function(mf){
force(mf)
return(function(inputx,k){
modelprevious(
inputx,
function(res){
k(res+mf(inputx))
} # function(res)
) # modelprevious
} # inner function
) # return
} # top function
This has led me to premature baldness, because if you forget this and think there's some odd bug going on, and then try sticking print(mf) in place to see what's going on, you'll be evaluating mf and thus getting the behaviour you wanted. By inspecting the data, you changed it! A Heisenbug!
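For the original goal of building a model up step by step, the same factory-plus-force pattern can be applied inside a loop. The sketch below is a simplification: it drops the continuation argument k from the question, and make_step and add_step are hypothetical names:

```r
# Each factory forces its arguments so the values are fixed at creation time
make_step <- function(k) {
  force(k)
  function(x) k * x
}

add_step <- function(previous, step) {
  force(previous); force(step)
  function(x) previous(x) + step(x)
}

model <- function(x) 0            # initial model
for (k in 2:3) {
  model <- add_step(model, make_step(k))
}

model(2)   # 10, i.e. 0 + 2*2 + 3*2
```

Without force(previous), the promise would be resolved only on the first call, when model already refers to the final closure, so the function would end up calling itself.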