accumulating functions and closures in R - r

I am constructing an approximating function recursively (adaboost). I would like to create the resulting learning function along the way (not to apply the approximation directly to my test data but keep the function that leads to it)
unfortunately, it seems that R updates the value to which a variable name refers to long after it is used.
#defined in plyr as well
id <- function(x) {x}
#my first classifier
modelprevious <- function(inputx, k) { k(0)}
#one step of my superb model
modelf <- function(x) 2*x #for instance
#I update my classifier
modelCurrent <- function(inputx, k)
{ modelprevious(inputx, function(res) {k(res + modelf(inputx))})}
#it works
modelCurrent(2,id) #4
#Problem
modelf <- function(x) 3*x
modelCurrent(2,id) #6 WTF !!
The same function with the same argument return something different, which is quite annoying !
So how is it possible to capture the value represented by modelf so that the resulting function only depends on its argument at the time of the binding, and not of some global state ?
Given that problem I dont see how one can do a recursive function building in R if one can not touch local variable, apart going through ugly hacks of quote/parse

You need a factory:
modelCurrent = function(mf){
return(function(inputx,k){
modelprevious(
inputx,
function(res){
k(res+mf(inputx))
} # function(res)
) # modelprevious
} # inner function
) # return
} # top function
Now you use the factory to create models with the modelf function that you want it to use:
> modelf <- function(x) 2*x
> m1 = modelCurrent(modelf)
> m1(2,id)
[1] 4
> modelf <- function(x) 3*x
> m1(2,id) # no change.
[1] 4
You can always make them on an ad-hoc basis:
> modelCurrent(modelf)(2,id)
[1] 6
and there you can see the factory created a function using the current definition of modelf, so it multiplied by three.
There's one last ginormous WTF!?! that will hit you. Watch carefully:
> modelf <- function(x) 2*x
> m1 = modelCurrent(modelf)
> m1(2,id)
[1] 4
>
> m1 = modelCurrent(modelf) # create a function using the 2* modelf
> modelf <- function(x) 3*x # change modelf...
> m1(2,id) # WTF?!
[1] 6
This is because when the factory is called, mf isn't evaluated - that's because the inner function isn't called, and mf isn't used until the inner function is called.
The trick is to force evaluation of the mf in the outer function, typically using force:
modelCurrent = function(mf){
force(mf)
return(function(inputx,k){
modelprevious(
inputx,
function(res){
k(res+mf(inputx))
} # function(res)
) # modelprevious
} # inner function
) # return
} # top function
This has lead me to premature baldness, because if you forget this and think there's some odd bug going on, and then try sticking print(mf) in place to see what's going on, you'll be evaluating mf and thus getting the behaviour you wanted. By inspecting the data, you changed it! A Heisenbug!

Related

Create an R function programmatically with non-fixed body

In a for loop I make a "string-formula" and allocate it to e.g. body1. And when I try to make a function with that body1 it fails... And I have no clue what I should try else...
This question How to create an R function programmatically? helped me a lot but sadly only quote is used to set the body...
I hope you have an idea how to work around with this issue.
And now my code:
A.m=matrix(c(3,4,2,2,1,1,1,3,2),ncol=3,byrow=TRUE)
for(i in 1:dim(A.m)[1]) {
body=character()
# here the string-formula emerges
for(l in 1:dim(A.m)[2]) {
body=paste0(body,"A.m[",i,",",l,"]","*x[",l,"]+")
}
# only the last plus-sign is cutted off
assign(paste0("body",i),substr(body,1,nchar(body)-1))
}
args=alist(x = )
# just for your convenience the console output
body1
## [1] "A.m[1,1]*x[1]+A.m[1,2]*x[2]+A.m[1,3]*x[3]"
# in this code-line I don't know how to pass body1 in feasible way
assign("Function_1", as.function(c(args, ???body1???), env = parent.frame())
And this is my aim:
Function_1(x=c(1,1,1))
## 9 # 3*1 + 4*1 + 2*1
Since you have a string, you need to parse that string. You can do
assign("Function_1",
as.function(c(args, parse(text=body1)[[1]])),
env = parent.frame())
Though I would strongly discourage the use of assign for filling your global environment with a bunch of variables with indexes in their name. In general that makes things much tougher to program with. It would be much easier to collect all your functions in a list. For example
funs <- lapply(1:dim(A.m)[1], function(i) {
body <- ""
for(l in 1:dim(A.m)[2]) {
body <- paste0(body,"A.m[",i,",",l,"]","*x[",l,"]+")
}
body <- substr(body,1,nchar(body)-1)
body <- parse(text=body)[[1]]
as.function(c(alist(x=), body), env=parent.frame())
})
And then you can call the different functions by extracting them with [[]]
funs[[1]](x=c(1,1,1))
# [1] 9
funs[[2]](x=c(1,1,1))
# [1] 4
Or you can ever call all the functions with an lapply
lapply(funs, function(f, ...) f(...), x=c(1,1,1))
# [[1]]
# [1] 9
# [[2]]
# [1] 4
# [[3]]
# [1] 6
Although if this is actually what your function is doing, there are easier ways to do this in R using matrix multiplication %*%. Your Function_1 is the same as A.m[1,] %*% c(1,1,1). You could make a generator funciton like
colmult <- function(mat, row) {
function(x) {
as.numeric(mat[row,] %*% x)
}
}
And then create the same list of functions with
funs <- lapply(1:3, function(i) colmult(A.m, i))
Then you don't need any string building or parsing which tends to be error prone.

R - 2 fun same code, one works and one gives $ operator is invalid for atomic vectors

i know might of u think this question and duplicated,but i do really have something kinda freaks me out, i was learning R and i had assignment to do something i do manage to solve it but i wonder why this error appear while the 2 code is very similar, is that something i don't understand in R
the first who has give me an error was :
makeCacheMatrix <- function(x = matrix()) {
#i for invirse
i <- NULL
set <- function(y){
x <<- y
i <<- NULL
}
get <- function(){x}
setinv <- function(solved){i <<- solved}
getinv <- function(){i}
#a list that has the 4 internal methods
list(set = set, get = get,
setinv = setinv,
getinv = getinv)
}
## Write a short comment describing this function
# this method check first if the data already cached,
# if they are it return it with message says "Getting cached data"
# if not it calculated, cache it and then return it
cacheSolve <- function(x, ...) {
## Return a matrix that is the inverse of 'x'
i <- x$getinv()
if(!is.null(i)){
message("Getting cached data")
return(i)
}
#calculate the inverse
x <- x$get()
i <- solve(x)
#print(i)
#print(class(i))
x$setinv(i)
i
}
this is how i call my function and the only result i get is
ps uncomment the print and clas funs will give the correct answers:
> source('~/R/cachematrix[not workng].R')
> x <- makeCacheMatrix(matrix(rnorm(16), 4,4))
> cacheSolve(x)
Error in x$setinv : $ operator is invalid for atomic vectors
after some time i said to myself why using too match variable and i use the methods inside each other for more sample code (one line is better)
but somehow the code works, for me it's the same code the only thing i have did the to pass the methods to each other instead of passing it into variable then pass the variable to method (it's the same really)
the code became like this now :
## Write a short comment describing this function
# this method retuen a matrix that has a list
# this list has 4 method as getters and sitters
makeCacheMatrix <- function(x = matrix()) {
#i for invirse
i <- NULL
set <- function(y){
x <<- y
i <<- NULL
}
get <- function(){x}
setinv <- function(solved){i <<- solved}
getinv <- function(){i}
#a list that has the 4 internal methods
list(set = set, get = get,
setinv = setinv,
getinv = getinv)
}
## Write a short comment describing this function
# this method check first if the data alredy cached,
# if they are it return it with messege sys "Getting cached data"
# if not it calculated, cache it and then return it
cacheSolve <- function(x, ...) {
## Return a matrix that is the inverse of 'x'
i <- x$getinv()
if(!is.null(i)){
message("Getting cached data")
return(i)
}
#calculate the invirse
i <- solve(x$get())
#cache the invirse for later
x$setinv(i)
i
}
and this's how i call my function:
> source('~/R/ProgrammingAssignment2/cachematrix.R')
> x <- makeCacheMatrix(matrix(rnorm(16), 4,4))
> cacheSolve(x)
[,1] [,2] [,3] [,4]
[1,] 0.09904578 -0.4586855 -0.2487849 -0.3421875
[2,] -1.84896897 0.8476203 0.7990204 0.5919526
[3,] 0.70645287 -0.1508695 -0.7141914 -0.2729974
[4,] 1.37441746 -0.9853108 -0.5607929 0.6553295
works good, i just wanna know what happen there, and why the first code give me an error and the second one didn't while both suppose to have the same logic, Thanks in advance mates
ps. i'm using : R version 3.3.2 on linux mint 18.1 with the latest version of rstudio

Make return from S3 indexing function "[" invisible

Is it possible to return an invisible object when using the S3 indexing function "[" on a custom class? For example, in the code below, is there a way to make the last line of code not print anything?
mat <- function(x) {
structure(x, class="mat")
}
"[.mat" <- function(x, i, j) {
invisible(unclass(x)[i,j])
}
m1 <- mat(matrix(1:10, ncol=2))
m1[1:2,]
[,1] [,2]
[1,] 1 6
[2,] 2 7
You are running into issues with the visibility mechanism caused by primitive functions. Consider:
> length.x <- function(x) invisible(23)
> length(structure(1:10, class="x"))
[1] 23
> mean.x <- function(x) invisible(23)
> mean(structure(1:10, class="x"))
> # no output
length is a primitive, but mean is not. From R Internals:
Whether the returned value of a top-level R expression is printed is controlled by the global boolean variable R_Visible. This is set (to true or false) on entry to all primitive and internal functions based on the eval column of the table in file src/main/names.c: the appropriate setting can be extracted by the macro PRIMPRINT.
and
Internal and primitive functions force the documented setting of R_Visible on return, unless the C code is allowed to change it (the exceptions above are indicated by PRIMPRINT having value 2).
So it would seem that you cannot force invisible returns from primitive generics like [, length, etc., and you must resort to workarounds like the one suggested by Alex.
The problem is that the value returned from [.mat is not of class mat since you're using unclass, so it uses the default printing method for whatever class it has. To fix this, just ensure that the returned object is still a mat and define a printing method for mat objects.
mat <- function(x) {
class(x) <- "mat"
x
}
`[.mat` <- function(x, i, j) {
y <- mat(unclass(x)[i, j])
invisible(y)
}
print.mat <- function(x, ...) {
invisible(x)
}
test <- mat(matrix(1:10, ncol = 2))
test[1, 1]
# Nothing is printed

R: on.exit - use returned value without knowing its name

I have below function. I cannot alter the function in any way except the first block of code in the function.
In this simple example I want to display apply some function on returning object.
The point is the name of variable returned by function may vary and I'm not able to guess it.
Obviously I also cannot wrap the f function into { x <- f(); myfun(x); x }.
The below .Last.value in my on.exit call represents the value to be returned by f function.
f <- function(param){
# the only code I know - start
on.exit(if("character" %in% class(.Last.value)) message(print(.Last.value)) else message(class(.Last.value)))
# the only code I know - end
# real processing of f()
a <- "aaa"
"somethiiiing"
if(param==1L) return(a)
b <- 5L
"somethiiiing"
if(param==2L) return(b)
"somethiiiing"
return(32)
}
f(1L)
# function
# [1] "aaa"
f(2L)
# aaa
# [1] 5
f(3L)
# integer
# [1] 32
Above code with .Last.value seems to be working with lag (so in fact not working) and also the .Last.value is probably not the way to go as I want to use the value few times like if(fun0(x)) fun1(x) else fun2(x), and because returned value might be a big object, copy it on the side is also bad approach.
Any way to use on.exit or any other function which can help me to run my function on the f function results without knowing result variable name?
In a similar way to how you are modifying the function, you could easily wrap it as well. Here's a reproducible example.
library(data.table)
append.log<-function(x) {
cat(paste("value:",x,"\n"))
}
idx.dt <- data.table:::`[.data.table`
environment(idx.dt)<-asNamespace("data.table")
idx.wrap <- function(...) {
x<-do.call(idx.dt, as.list(substitute(...())), envir=parent.frame())
append.log(if(is(x, "data.table")) {
nrow(x)
} else { NA })
x
}
environment(idx.wrap)<-asNamespace("data.table")
(unlockBinding)("[.data.table",asNamespace("data.table"))
assign("[.data.table",idx.wrap,envir=asNamespace("data.table"),inherits=FALSE)
dt<-data.table(a=1:10, b=seq(2, 20, by=2), c=letters[1:10])
dt[a%%2==0]
Since R 3.2.0 it is fully possible, thanks to new function returnValue.
Working example below.
f <- function(x, err = FALSE){
pt <- proc.time()[[3L]]
on.exit(message(paste("proc.time:",round(proc.time()[[3L]]-pt,4),"\nnrow:",as.integer(nrow(returnValue()))[1L])))
Sys.sleep(0.001)
if(err) stop("some error")
return(x)
}
dt <- data.frame(a = 1:5, b = letters[1:5])
f(dt)
f(dt, err=T)
f(dt)
f(dt[dt$a %in% 2:3 & dt$b %in% c("c","d"),])

Anonymous passing of variables from current environment to subfunction calls

The function testfun1, defined below, does what I want it to do. (For the reasoning of all this, see the background info below the code example.) The question I wanted to ask you is why what I tried in testfun2 doesn't work. To me, both appear to be doing the exact same thing. As shown by the print in testfun2, the evaluation of the helper function inside testfun2 takes place in the correct environment, but the variables from the main function environment get magically passed to the helper function in testfun1, but not in testfun2. Does anyone of you know why?
helpfun <- function(){
x <- x^2 + y^2
}
testfun1 <- function(x,y){
xy <- x*y
environment(helpfun) <- sys.frame(sys.nframe())
x <- eval(as.call(c(as.symbol("helpfun"))))
return(list(x=x,xy=xy))
}
testfun1(x = 2,y = 1:3)
## works as intended
eval.here <- function(fun){
environment(fun) <- parent.frame()
print(environment(fun))
eval(as.call(c(as.symbol(fun))))
}
testfun2 <- function(x,y){
print(sys.frame(sys.nframe()))
xy <- x*y
x <- eval.here("helpfun")
return(list(x=x,xy=xy))
}
testfun2(x = 2,y = 1:3)
## helpfun can't find variable 'x' despite having the same environment as in testfun1...
Background info: I have a large R code in which I want to call helperfunctions inside my main function. They alter variables of the main function environment. The purpose of all this is mainly to unclutter my code. (Main function code is currently over 2000 lines, with many calls to various helperfunctions which themselves are 40-150 lines long...)
Note that the number of arguments to my helper functions is very high, so that the traditional explicit passing of function arguments ( "helpfun(arg1 = arg1, arg2 = arg2, ... , arg50 = arg50)") would be cumbersome and doesnt yield the uncluttering of the code that I am aiming for. Therefore, I need to pass the variables from the parent frame to the helper functions anonymously.
Use this instead:
eval.here <- function(fun){
fun <- get(fun)
environment(fun) <- parent.frame()
print(environment(fun))
fun()
}
Result:
> testfun2(x = 2,y = 1:3)
<environment: 0x0000000013da47a8>
<environment: 0x0000000013da47a8>
$x
[1] 5 8 13
$xy
[1] 2 4 6

Resources