Trying to define, evaluate and integrate R functions recursively in a loop

This is the integral I'm trying to evaluate.
What I'm trying to do is put the g_k(x,s) into a list so I can use them recursively in the loop.
n = 500
k = 3
g_1 <- 1
rho <- sqrt( ((k-1)*(n-k))/(k*(n-k+1)) )
tau2 <- (sqrt( n / (k*(n-k+1))))
g_lst <- c()
#Coded discretely for k=1,2,3,4
g1 <- function(x,s){ return( 1 ) }
g_lst <- c(g_lst,g1)
g2 <- function(x,s){ return( integrate(function(y){g_lst[[1]](y,s)*(dnorm(y,rho*x,tau2)+dnorm(y,-rho*x,tau2))},0,s)[[1]] ) }
g_lst <- c(g_lst,g2)
g3 <- function(x,s){ return( integrate(function(y){g_lst[[2]](y,s)*(dnorm(y,rho*x,tau2)+dnorm(y,-rho*x,tau2))},0,s)[[1]] ) }
g_lst <- c(g_lst,g3)
g4 <- function(x,s){ return( integrate(function(y){g_lst[[3]](y,s)*(dnorm(y,rho*x,tau2)+dnorm(y,-rho*x,tau2))},0,s)[[1]] ) }
g_lst <- c(g_lst,g4)
#Trying to generalise to k=1,2,...,k
g_lst2 <- c()
g1 <- function(x,s){ return( 1 ) }
g_lst2 <- c(g_lst2,g1)
for (i in 1:3){
i <- force(i)
gk <- function(x,s){ return( integrate(function(y){g_lst[[i]](y,s)*(dnorm(y,rho*x,tau2)+dnorm(y,-rho*x,tau2))},0,s)[[1]] ) }
force(gk)
g_lst2 <- c(g_lst2,gk)
}
Here are the values I'm getting from evaluating the functions in the two lists. g_lst gives me the correct values, whereas in g_lst2 every g_lst2[[i]] with i >= 2 gives me the value of g_lst2[[4]].
From the threads I could find on Stack Exchange, I feel I need to use the force() function, but the way I've used it is not helping.
g_lst2[[2]](2,2)
#[1] 1.811419
g_lst2[[3]](2,2)
#[1] 1.811419
g_lst2[[4]](2,2)
#[1] 1.811419
g_lst[[2]](2,2)
#[1] 0.7380149
g_lst[[3]](2,2)
#[1] 1.156224
g_lst[[4]](2,2)
#[1] 1.811419

Using force() inside a for loop doesn't really help because a for loop doesn't create a new scope. Instead, create a generator function (a function that returns a function) and use it to build your list. For example:
#Trying to generalise to k=1,2,...,k
g_lst2 <- c()
g1 <- function(x,s){ return( 1 ) }
g_lst2 <- c(g_lst2,g1)
for (i in 1:3){
make_gk <- function(i) {
force(i)
function(x,s){ return( integrate(function(y){g_lst2[[i]](y,s)*(dnorm(y,rho*x,tau2)+dnorm(y,-rho*x,tau2))},0,s)[[1]] ) }
}
gk <- make_gk(i)
g_lst2 <- c(g_lst2,gk)
}
Note that we are calling force() inside the function because that creates a new scope.
This will return the desired values:
g_lst2[[2]](2,2)
# [1] 0.7380149
g_lst2[[3]](2,2)
# [1] 1.156224
g_lst2[[4]](2,2)
# [1] 1.811419
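The difference between the two loops comes down to which environment each closure captures. A minimal sketch of that point, using hypothetical names (shared, separate, make_getter) that are not part of the question:

```r
# Closures built directly in a for loop all share the loop's environment,
# so every one of them sees the final value of i.
shared <- list()
for (i in 1:3) {
  shared[[i]] <- function() i
}

# A factory function gets a fresh environment on every call; force()
# evaluates the argument right away, so later changes to the loop
# variable cannot leak into the returned closure.
make_getter <- function(i) {
  force(i)
  function() i
}
separate <- list()
for (i in 1:3) {
  separate[[i]] <- make_getter(i)
}

shared_values <- vapply(shared, function(f) f(), integer(1))
separate_values <- vapply(separate, function(f) f(), integer(1))
```

Here shared_values is 3 3 3 while separate_values is 1 2 3, which mirrors the behaviour of g_lst2 before and after the fix.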
Alternatively, you could use lapply, which takes care of the forcing for you:
g_lst2 <- c()
g1 <- function(x,s){ return( 1 ) }
g_lst2 <- c(g_lst2,g1)
g_lst2 <- c(g_lst2, lapply(1:3, function(i)
function(x,s){ return( integrate(function(y){g_lst2[[i]](y,s)*(dnorm(y,rho*x,tau2)+dnorm(y,-rho*x,tau2))},0,s)[[1]] ) }
))

You could easily use recursion to generalize. Below are two ways to do this.
n <- 500
k <- 3
rho <- sqrt( ((k-1)*(n-k))/(k*(n-k+1)) )
tau2 <- (sqrt( n / (k*(n-k+1))))
g <- function(x,s,k){
if(k == 1) 1
else
integrate(function(y)g(y,s, k-1)*(
dnorm(y, x*rho, tau2) + dnorm(y, -x*rho, tau2)),0,s)[[1]]
}
g(2,2,1)
#> [1] 1
g(2,2,2)
#> [1] 0.7380149
g(2,2,3)
#> [1] 1.156224
g(2,2,4)
#> [1] 1.811419
The second way is to return a function, then evaluate that function at the points needed:
g1 <- function(k){
if(k == 1) function(...) 1
else function(x,s)
integrate(function(y)g1(k-1)(y,s)*(
dnorm(y, x*rho, tau2) + dnorm(y, -x*rho, tau2)),0,s)[[1]]
}
g1(1)(2,2)
#> [1] 1
g1(2)(2,2)
#> [1] 0.7380149
g1(3)(2,2)
#> [1] 1.156224
g1(4)(2,2)
#> [1] 1.811419
Created on 2023-02-10 with reprex v2.0.2
Of course, this method is slow because it does not save intermediate results. For example, computing g(2,2,10) already requires the lower-order values, so if you later need g(2,2,8) you should read it from a table of already computed values rather than compute it again. The method above does recompute it. We can avoid the recomputation by using memoization.
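As a minimal sketch of memoization in general (not the answer's own code), the hypothetical helper memoise_fn below caches results in an environment keyed by the deparsed arguments. The names slow_square and n_calls are illustrative only. Applying the same idea to g assumes the same (x, s, k) combinations recur.

```r
# Hypothetical helper: wrap a function so repeated calls with the same
# arguments are read from a cache instead of being recomputed.
memoise_fn <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(...) {
    # Build a string key from the argument values.
    key <- paste(deparse(list(...)), collapse = "")
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(...), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

n_calls <- 0
slow_square <- function(x) {
  n_calls <<- n_calls + 1  # count how often the real work happens
  x^2
}

fast_square <- memoise_fn(slow_square)
a <- fast_square(4)  # computed
b <- fast_square(4)  # served from the cache
```

After both calls, n_calls is 1 because the second call never reaches slow_square.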

Related

Is there a way to use do.call without explicitly providing arguments

Part of a custom function I am trying to create allows the user to provide a function as a parameter. For example
#Custom function
result <- function(.func){
do.call(.func, list(x,y))
}
#Data
x <- 1:2
y <- 0:1
#Call function
result(.func = function(x,y){ sum(x, y) })
However, the code above assumes that the user is providing a function with arguments x and y. Is there a way to use do.call (or something similar) so that the user can provide a function with different arguments? I think that the correct solution might be along the lines of:
#Custom function
result <- function(.func){
do.call(.func, formals(.func))
}
#Data
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
#Call function
result(.func = function(m,n){ sum(m, n) })
result(.func = function(x,y,z){ sum(x,y,z) })
But this is not it.
1) Use formals/names/mget to get the values in a list. An optional argument, envir, will allow the user to specify the environment that the variables are located in so it knows where to look. The default if not specified is the parent frame, i.e. the caller.
result1 <- function(.func, envir = parent.frame()) {
do.call(.func, mget(names(formals(.func)), envir))
}
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
result1(.func = function(m,n) sum(m, n) )
## [1] 9
result1(.func = function(x,y,z) sum(x,y,z) )
## [1] 14
result1(function(Time, demand) Time + demand, list2env(BOD))
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
1a) Another possibility is to evaluate the body. This also works if envir is specified as a data frame whose columns are to be looked up.
result1a <- function(.func, envir = parent.frame()) {
eval(body(.func), envir)
}
result1a(.func = function(m,n) sum(m, n) )
## [1] 9
result1a(.func = function(x,y,z) sum(x,y,z) )
## [1] 14
result1a(function(Time, demand) Time + demand, BOD)
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
2) Another design which is even simpler is to provide a one-sided formula interface. Formulas have environments so we can use that to look up the variables.
result2 <- function(fo, envir = environment(fo)) eval(fo[[2]], envir)
result2(~ sum(m, n))
## [1] 9
result2(~ sum(x,y,z))
## [1] 14
result2(~ Time + demand, BOD)
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
3) Even simpler yet is to just pass the result of the computation as an argument.
result3 <- function(x) x
result3(sum(m, n))
## [1] 9
result3(sum(x,y,z))
## [1] 14
result3(with(BOD, Time + demand))
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
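The building blocks behind approaches 1) and 2) can be inspected one at a time. The sketch below assumes the variables live in the environment where the code runs; the intermediate names (arg_names, arg_values, via_mget, via_formula) are introduced only for illustration.

```r
m <- 1:3
n <- 0:2
f <- function(m, n) sum(m, n)

# formals() describes the parameters; only their names are needed for lookup.
arg_names <- names(formals(f))
# mget() fetches the variables with those names as a named list.
arg_values <- mget(arg_names, envir = environment())
via_mget <- do.call(f, arg_values)

# A one-sided formula is a call to `~`; its second element is the
# unevaluated expression, and the formula remembers the environment
# in which it was created.
fo <- ~ sum(m, n)
via_formula <- eval(fo[[2]], environment(fo))
```

Both via_mget and via_formula come out as 9, matching the outputs shown above.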
This works.
#Custom function
result <- function(.func){
do.call(.func, lapply(formalArgs(.func), as.name))
}
#Data
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
#Call function
result(.func = function(m,n){ sum(m, n) })
result(.func = function(x,y,z){ sum(x,y,z) })
This seems like a bit of a pointless function, since the examples in your question imply that what you are trying to do is evaluate the body of the passed function using variables in the calling environment. You can certainly do this easily enough:
result <- function(.func){
eval(body(.func), envir = parent.frame())
}
This gives the expected results from your examples:
x <- 1:2
y <- 0:1
result(.func = function(x,y){ sum(x, y) })
#> [1] 4
and
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
result(.func = function(m,n){ sum(m, n) })
#> [1] 9
result(.func = function(x,y,z){ sum(x,y,z) })
#> [1] 14
But note that, when the user types:
result(.func = function(x,y){ ...user code... })
They get the same result they would already get if they didn't use your function and simply typed
...user code....
You could argue that it would be helpful with a pre-existing function like mean.default:
x <- 1:10
na.rm <- TRUE
trim <- 0
result(mean.default)
#> [1] 5.5
But this means users have to name their variables as the parameters being passed to the function, and this is just a less convenient way of calling the function.
It might be useful if you could demonstrate a use case where what you are proposing doesn't make the user's code longer or more complex.
You could also use ..., but like the other responses, I don't quite see the value, or perhaps I don't fully understand the use-case.
result <- function(.func, ...){
do.call(.func, list(...))
}
Create function
f1 <- function(a,b) sum(a,b)
Pass f1 and values to result()
result(f1, m,n)
Output:
[1] 9
Here is how I would do it based on your clarifying comments.
Basically since you say your function will take a data.frame as input, the function you are asking for essentially just reverses the order of arguments you pass to do.call()... which takes a function, then a list of arguments. A data.frame is just a special form of list where all elements (columns) are vectors of equal length (number of rows)
result <- function(.data, .func) {
# .data is a data.frame, which is a list of argument vectors of equal length
do.call(.func, .data)
}
result(data.frame(a=1, b=1:5), function(a, b) a * b)
result(data.frame(c=1:10, d=1:10), function(c, d) c * d)

Decorate a function to count the number of times it gets called while preserving the original function's args

I want to write a decorator function that adds a counter to a function, counting the number of times it was called. E.g.
foo <- function(x) {x}
foo <- counter_decorator(foo)
foo(1)
foo(1)
# => the counter gets incremented with each call and has the value 2 now
The approach below basically works, but:
1) I want the inner function (which is returned by the decorator) to have the same formal args as the original function and not just an ellipsis (i.e. ...). I am not sure how to accomplish that. Any ideas?
2) I am not sure if the whole approach is a good one. Alternatives or improvements are appreciated.
Here is what I did so far:
# Init or reset counter
counter_init <- function() {
.counters <<- list()
}
# Decorate a function with a counter
#
# Each time the function is called the counter is incremented
#
# fun: function to be decorated
# fun_name: name in .counters list to store number of times in
#
counter_decorator <- function(fun, fun_name = NULL)
{
# use function name if no name is passed explicitly
if (is.null(fun_name)) {
fun_name <- deparse(substitute(fun))
}
fun <- force(fun) # evaluate the promise now so the wrapper keeps the original function (prevents infinite recursion)
function(...) { # ==> ellipsis not optimal!
n <- .counters[[fun_name]]
if (is.null(n)) {
n <- 0
}
.counters[[fun_name]] <<- n + 1
fun(...)
}
}
Now let's create some functions and decorate them.
library(dplyr) # for pipe
# Create functions and decorate them with a counter
# create and decorate in second call
add_one <- function(x) {
x + 1
}
add_one <- counter_decorator(add_one)
# create and decorate the piping way by passing the fun_name arg
add_two <- {function(x) {
x + 2
}} %>% counter_decorator(fun_name = "add_two")
mean <- counter_decorator(mean)
counter_init()
for (i in 1:100) {
add_one(1)
add_two(1)
mean(1)
}
What we get in the .counters list is
> .counters
$add_one
[1] 100
$add_two
[1] 100
$mean
[1] 100
which is basically what I want.
1) The trace command can be used. Use untrace to undo the trace or set .counter to any desired value to start over again from that value.
f <- function(x) x
trace(f, quote(.counter <<- .counter + 1), print = FALSE)
.counter <- 0
f(1)
## [1] 1
f(1)
## [1] 1
.counter
## [1] 2
2) This variation stores the counter in an attribute of f.
f <- function(x) x
trace(f, quote(attr(f, "counter") <<- attr(f, "counter") + 1), print = FALSE)
attr(f, "counter") <- 0
f(1)
## [1] 1
f(1)
## [1] 1
attr(f, "counter")
## [1] 2
3) This variation stores the counter in an option.
f <- function(x) x
trace(f, quote(options(counter = getOption("counter", 0) + 1)), print = FALSE)
f(1)
## [1] 1
f(1)
## [1] 1
getOption("counter")
## [1] 2
This method stores the counter within the wrapper function itself instead of somewhere in the user's environment or the package environment. (There's nothing wrong with the latter; the former can be problematic or at least annoying/discourteous.)
The biggest side-effect (liability?) of this is that when the package is detached or reloaded (i.e., during development), the counter is cleared/re-initialized.
counter_decorator <- function(fun) {
.counter <- 0L
fun2 <- function(...) {
.counter <<- .counter + 1L
cl <- match.call()
cl[[1]] <- fun
eval.parent(cl)
}
formals(fun2) <- formals(args(fun))
fun2
}
Demo:
foo <- function(x, y) x + y
foo2 <- counter_decorator(foo)
get(".counter", envir = environment(foo2))
# [1] 0
foo2(5, 9)
# [1] 14
foo2(5, 11)
# [1] 16
foo2(5, 13)
# [1] 18
get(".counter", envir = environment(foo2))
# [1] 3
Same formals:
formals(foo)
# $x
# $y
formals(foo2)
# $x
# $y
Edited (twice) to better track primitives where formals(.) is NULL; in that case, we can use formals(args(fun)).
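A short sketch of why formals(args(fun)) is the safer choice, using the base primitive sum and a hypothetical closure foo:

```r
# sum is a primitive, so formals() has nothing to report.
prim_formals <- formals(sum)

# args() returns a closure with the same argument list,
# which formals() can then read.
prim_arg_names <- names(formals(args(sum)))

# For an ordinary closure both routes give the same argument names.
foo <- function(x, y) x + y
closure_direct <- names(formals(foo))
closure_via_args <- names(formals(args(foo)))
```

prim_formals is NULL, while prim_arg_names holds "..." and "na.rm".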
Adapted for your preferred methodology, albeit with a little poetic liberty:
counters <- local({
.counters <- list()
function(init = FALSE) {
out <- .counters # will return counters *before* initialization
if (init) .counters <<- list()
out
}
})
counter_decorator <- function(fun, fun_name) {
if (missing(fun_name)) {
fun_name <- deparse(substitute(fun))
}
count <- get(".counters", envir = environment(counters))
count[[fun_name]] <- 0L
assign(".counters", count, envir = environment(counters))
fun2 <- function(...) {
.count <- get(".counters", envir = environment(counters))
.count[[fun_name]] <- if (is.null(.count[[fun_name]])) 1L else .count[[fun_name]] + 1L
assign(".counters", .count, envir = environment(counters))
cl <- match.call()
cl[[1]] <- fun
eval.parent(cl)
}
formals(fun2) <- formals(args(fun))
fun2
}
add_one <- function(x) {
x + 1
}
add_one <- counter_decorator(add_one)
add_two <- {function(x) {
x + 2
}} %>% counter_decorator(fun_name = "add_two")
new_mean <- counter_decorator(mean)
for (i in 1:100) {
add_one(1)
add_two(1)
new_mean(1)
}
counters()
# $add_one
# [1] 100
# $add_two
# [1] 100
# $mean
# [1] 100
formals(new_mean)
# $x
# $...
Initialization is not strictly required. Re-initialization returns the counters before reinitializing, so you don't need a double-call to get the values and then reset (and if you don't care about previous values, just ignore its return).
counters(TRUE)
# $add_one
# [1] 100
# $add_two
# [1] 100
# $mean
# [1] 100
counters()
# list()
add_one(10)
# [1] 11
counters()
# $add_one
# [1] 1

Implement table() function as a user defined function

x <- c(1,2,3,2,1)
table(x)
# x
# 1 2 3
# 2 2 1
This outputs how many times each element occurs in the vector.
I am trying to imitate the above function using function().
Below is my code:
TotalTimes = function(x){
times = 0
y = unique(x)
for (i in 1:length(y)) {
for (i in 1:length(x)) {
if(y[i] == x[i])
times = times + 1
}
return(times)
}
}
What would be the right approach?
Here's a one-liner, using rle():
f <- function(x) {
with(rle(sort(x)), setNames(lengths, values))
}
f(c(1,2,3,2,1))
# 1 2 3
# 2 2 1
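To see why the one-liner works, it can help to look at what rle() returns for the sorted vector. A small sketch with intermediate names (runs, counts) added for illustration:

```r
x <- c(1, 2, 3, 2, 1)

# After sorting, equal values sit next to each other, so each run
# corresponds to one distinct value.
runs <- rle(sort(x))

# runs$values holds the distinct values, runs$lengths how often each occurs.
counts <- setNames(runs$lengths, runs$values)
```

Here runs$values is 1 2 3 and runs$lengths is 2 2 1, so counts is the named vector printed above.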
Alternatively, here's an option that's less "tricky", and is probably a better model for learning to code in an R-ish way:
f2 <- function(x) {
ss <- sort(x)
uu <- unique(ss)
names(uu) <- uu
sapply(uu, function(u) sum(ss == u))
}
f2(c(1,2,3,2,1))
# 1 2 3
# 2 2 1
function(x) {
q = sapply(unique(x), function(i) sum(x == i))
names(q) = unique(x)
return(q)
}
Here is one method using base R:
# data
x <- c(1,2,3,2,1)
# set up
y <- sort(unique(x))
counts <- rep_len(0, length.out=length(y))
names(counts) <- y
for(i in seq_along(x)) {
counts[x[i] == y] <- counts[x[i] == y] + 1
}
Wrapping it in a function:
table2 <- function(x) {
# transform x into character vector to reduce search cost in loop
x <- as.character(x)
y <- sort(unique(x))
counts <- rep_len(0, length.out=length(y))
names(counts) <- y
for(i in seq_along(x)) {
counts[x[i]] <- counts[x[i]] + 1L
}
return(counts)
}
This version only accepts a single vector, of course. At @Frank's suggestion, the function version is slightly different, and possibly faster, in that it converts the input x to character. The potential speed-up comes from the lookup counts[x[i]], where the element of counts is referenced by name (x[i]) rather than found by a search using "==".

Why are variable values in closures getting lost after repeatedly calling lapply?

I'm attempting to use a series of lapply calls to build a list of curried functions, so that ideally the last lapply call returns the final desired value. The currying works, but after the second application lapply seems to always apply the last element in the list.
Example:
curry <- function(fn, ...) {
arglist <- list(...)
function(...) {
do.call(fn, append(arglist, list(...)))
}
}
# rcurry is used only to init the first lapply.
rcurry <- function(v1, fn, ...) {
arglist <- append(list(v1), list(...))
function(...) {
do.call(fn, append(arglist, list(...)))
}
}
myadd <- function(a,b,c) {
a+b+c
}
This works as expected:
# you can achieve the same by closure:
# curry.a <- lapply(c(10, 1000), FUN = function(a) { curry(myadd, a) })
curry.a <- lapply(list(10, 1000), rcurry, myadd)
curry.a[[1]](1,2)
curry.a[[2]](1,2)
# > [1] 13
# > [1] 1003
The next lapply of curry "mangles the scope":
# this does give the desired output:
# curry.a.b <- list(curry(curry.a[[1]], 1), curry(curry.a[[2]], 1))
curry.a.b <- lapply(curry.a, curry, 1)
curry.a.b[[1]](2)
curry.a.b[[2]](2)
# > [1] 1003
# > [1] 1003
It doesn't seem to be a result of the curry or rcurry function. Using roxygen's Curry function does the same thing. Creating curry.a by closure as above, or using curry.a <- list(curry(myadd, 10), curry(myadd, 1000)), also gives the same result.
And of course the final curry:
# it doesn't work if you re-define this:
# curry.a.b <- list(curry(curry.a[[1]], 1), curry(curry.a[[2]], 2))
curry.a.b.c <- lapply(curry.a.b, curry, 2)
lapply(curry.a.b.c, do.call, list())
# > [1] 1003
# > [1] 1003
What's going on here?
fn in curry is not evaluated inside the function body before the closure is returned, so it is still a promise.
If you force it, you get what you expect:
curry <- function(fn, ...) {
force(fn)
arglist <- list(...)
function(...) {
do.call(fn, append(arglist, list(...)))
}
}
then,
> curry.a.b <- lapply(curry.a, curry, 1)
> curry.a.b[[1]](2)
[1] 13
> curry.a.b[[2]](2)
[1] 1003
>
> curry.a.b.c <- lapply(curry.a.b, curry, 2)
> lapply(curry.a.b.c, do.call, list())
[[1]]
[1] 13
[[2]]
[1] 1003
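The effect of an unforced promise can be seen without lapply. A minimal sketch with two hypothetical helpers, make_adder_lazy and make_adder_forced:

```r
# n is not evaluated when the closure is created, only when it is first used.
make_adder_lazy <- function(n) function(x) x + n

# force(n) evaluates the promise immediately.
make_adder_forced <- function(n) {
  force(n)
  function(x) x + n
}

k <- 1
add_lazy <- make_adder_lazy(k)
add_forced <- make_adder_forced(k)
k <- 100  # change k before either closure has been called

lazy_result <- add_lazy(1)      # promise evaluated now, sees k = 100
forced_result <- add_forced(1)  # n was fixed at 1 when the closure was made
```

lazy_result is 101 and forced_result is 2.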
Internally, lapply creates a local variable X that each call of the function refers to. If the function does not evaluate X while lapply is running, X remains a promise, and once lapply has finished, that promise yields the same (i.e., last) value in every closure. (Since R 3.2.0 the apply functions force the arguments they pass, so this describes older versions.) So lapply behaves much like the following for loop:
f0 <- function(i) function() i
f1 <- function(i) {force(i); function() i}
f <- local({
r0 <- list()
r1 <- list()
for (i in 1:2) {
r0[[i]] <- f0(i)
r1[[i]] <- f1(i)
}
list(r0 = r0, r1 = r1)
})
then,
> f$r0[[1]]()
[1] 2
> f$r1[[1]]()
[1] 1
> f$r0[[2]]()
[1] 2
> f$r1[[2]]()
[1] 2

Return system.time from evaluated function

R version 2.12, Windows XP
I am attempting to write a function (say 'g') that takes one argument, a function (say 'f'), and returns the matched function. Furthermore, enclosed within the body of 'g' is a statement that tells the resulting object to return the value of system.time when the object is called. An example will clarify.
What I want:
g <- function(f) {...}
z <- g(mean)
z(c(1, 4, 7))
with output
user system elapsed
0.04 0.00 0.04
What I have:
g <- function(f) {if (!exists("x")) {x <- match.fun(f)} else {system.time(x)}}
z <- g(mean)
z(c(1, 4, 7))
with output
[1] 4
Any help is greatly appreciated.
Maybe this will help:
g <- function(f){
function(x){
zz <- system.time(
xx <- match.fun(f)(x)
)
list(value=xx, system.time=zz)
}
}
In use:
g(mean)(c(1, 4, 7))
$value
[1] 4
$system.time
user system elapsed
0 0 0
You may want to think about how you return the values. I used a list, but another option is to print the system time as a side effect and return the calculated value.
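The side-effect variant mentioned above could look roughly like this. It is a sketch only; the wrapper name timed is hypothetical.

```r
# Hypothetical wrapper: print the timing, return only the computed value.
timed <- function(f) {
  f <- match.fun(f)
  function(...) {
    timing <- system.time(value <- f(...))
    print(timing)  # side effect: show user/system/elapsed times
    value          # the caller gets the plain result
  }
}

timed_mean <- timed(mean)
res <- timed_mean(c(1, 4, 7))
```

The caller can then use res like any ordinary return value, at the cost of not being able to capture the timing programmatically.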
Recently I made a similar function for myself:
with_times <- function(f) {
f <- match.fun(f)
function(...) {
.times <- system.time(res <- f(...))
attr(res, "system.time") <- as.list(na.omit(.times))
res
}
}
For example:
g <- function(x,y) {r<-x+y; Sys.sleep(.5); r}
g(1, 1)
# [1] 2
g2 <- with_times(g)
w <- g2(1, 1)
Timings can be extracted in two ways:
attributes(w)$system.time
# $user.self
# [1] 0
# $sys.self
# [1] 0
# $elapsed
# [1] 0.5
or
attr(w, "system.time")
# $user.self
# [1] 0
# $sys.self
# [1] 0
# $elapsed
# [1] 0.5
