Handling of closures in data.table - r

I am using the data.table package to return a list of function closures in a j expression as output by the approxfun function from the stats package. Basically, on each Date, I would like a closure that allows me to calculate an arbitrary yval based on an arbitrary xval as determined by approxfun.
However, approxfun is only valid when there are at least two unique values of x passed to the function. In the case where there is only one unique value of x, I would like to return a function that returns the one unique value of y.
In the code below, I perform this step by checking the .N value and returning a different function depending on whether or not .N is > 1.
library(data.table)
set.seed(10)
N <- 3
x <- data.table(Date = Sys.Date() + rep(1:N, each = 3), xval = c(0, 30, 90), yval = rnorm(N * 3))
x <- x[-c(2:3), ]
##interpolation happens correctly
x2 <- x[order(Date, xval), {
if(.N > 1){
afun <- approxfun(xval, yval, rule = 1)
}else{
afun <- function(v) yval
}
print(afun(30))
list(Date, afun = list(afun))
}, by = Date]
##evaluation does NOT happen correctly, the val used is the last...
sapply(x2[, afun], do.call, args = list(v = 30))
When evaluating the function 'afun' in the context of the j expression, the correct value of 'yval' is printed. However, when I go back after the fact to evaluate the first function, the yval returned is the last yval in the group created by the 'by' grouping for the function that is not created by approxfun (all the closures created by approxfun work as expected).
My suspicion is that this has to do with something I am missing with lazy evaluation. I tried the additional code below using the 'force' function but was unsuccessful.
x3 <- x[order(Date, xval), {
if(.N > 1){
afun <- approxfun(xval, yval, rule = 1)
}else{
fn <- function(x){
force(x)
function(v) x
}
afun <- fn(yval)
}
print(afun(30))
list(Date, afun = list(afun))
}, by = Date]
sapply(x3[, afun], do.call, args = list(v = 30))
Has anyone else encountered this issue? Is it something I am missing with base R or something I am missing with data.table?
Thanks in advance for the help

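Yes, this is the typical data.table reference vs copy FAQ: data.table reuses the same memory for yval across groups, so a closure that only holds a reference to it ends up seeing the last group's value. Taking an explicit copy() works as expected: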
x2 <- x[order(Date, xval), {
if(.N > 1){
afun <- approxfun(xval, yval, rule = 1)
}else{
fn <- function(){
#ensure the value is copied
x <- copy(yval)
function(v) x
}
afun <- fn()
}
print(afun(30))
list(Date, afun = list(afun))
}, by = Date]
#[1] 0.01874617
#[1] 0.2945451
#[1] -0.363676
sapply(x2[, afun], do.call, args = list(v = 30))
#[1] 0.01874617 0.29454513 -0.36367602
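The same fallback logic can also be sketched outside data.table, where each closure gets its own environment and no copy() is needed. This is only an illustration: make_interp and the data frame d are made-up names, the data are invented, and the fallback assumes that a group with a single unique x should return its first y.

```r
# Hypothetical helper: build an interpolator, falling back to a constant
# function when fewer than two unique x values are available.
make_interp <- function(x, y) {
  if (length(unique(x)) > 1) {
    approxfun(x, y, rule = 1)
  } else {
    y0 <- y[1]  # evaluated now and stored in this call's own environment
    function(v) rep(y0, length(v))
  }
}

# Made-up data: group "a" has one point, group "b" has three.
d <- data.frame(g = c("a", "b", "b", "b"),
                xval = c(0, 0, 30, 90),
                yval = c(5, 1, 2, 4))

# One closure per group, built from that group's rows only.
funs <- lapply(split(d, d$g), function(s) make_interp(s$xval, s$yval))
vals <- sapply(funs, function(f) f(30))
vals
```

Here vals holds 5 for group "a" (the constant fallback) and 2 for group "b" (the interpolated value at xval = 30).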

Related

Creating a function in R but getting a replacement has length zero error

I tried to create a function f so that, when a value x is passed in, it returns the corresponding value y. But when I try to run the code to plot, it gives me an error that says that my y_values has no length.
f <- function(x){
if (x<0){
print(y_values<-x*x*x)
}
if(x>0 & x<=1){
print(y_values<-x*x)
}
if(x>1){
print(y_values<-sqrt(x))
}
}
x_values <- seq(-2, 2, by = 0.1)
y_values <- rep(NA, length(x_values))
for (i in seq_along(x_values)) {
x <- x_values[i]
y_values[i] <- f(x)
}
# output
plot(x_values, y_values, type = "l")
Two issues:
From ?print
‘print’ prints its argument and returns it invisibly (via
‘invisible(x)’)
So all your function f does is print the values to the console (instead of returning them).
As per your definition of f, the function does not know how to deal with x=0; so this will create a problem when you store the output of f(0) later.
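A small sketch of where the "replacement has length zero" error comes from. The function g below is a made-up stand-in for the original f, reduced to its last branch: a function returns the value of its last expression, and an if without else whose condition is FALSE returns NULL.

```r
# g is a hypothetical stand-in for the original f: its last expression is an
# `if` without `else`, so it returns NULL whenever that condition is FALSE.
g <- function(x) {
  if (x > 1) {
    print(y <- sqrt(x))
  }
}

no_value <- is.null(g(0))  # TRUE: no branch ran, so there is nothing to return

y_values <- rep(NA, 3)
# Assigning NULL into a vector element is what triggers the error.
msg <- tryCatch({
  y_values[1] <- g(0)
  "no error"
}, error = function(e) conditionMessage(e))
msg
```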
We can fix these issues by slightly altering f as
f <- function(x) {
y_values <- NA
if (x<0){
y_values<-x*x*x
}
if(x>0 & x<=1){
y_values<-x*x
}
if(x>1){
y_values<-sqrt(x)
}
return(y_values)
}
Then
x_values <- seq(-2, 2, by = 0.1)
y_values <- rep(NA, length(x_values))
for (i in seq_along(x_values)) {
x <- x_values[i]
y_values[i] <- f(x)
}
plot(x_values, y_values, type = "l")
You could also use Vectorize to obtain a vectorised function f2, which allows you to pass x_values as a vector, thereby avoiding the explicit for loop:
f2 <- Vectorize(f)
x_values <- seq(-2, 2, by = 0.1)
y_values <- f2(x_values)
The resulting plot is the same.
I would recommend you explore other methods for coding something like this.
Here is one option that doesn't use a for loop. If you are simply practising for loops, then the fix Mauritus Evers made should work for you.
library(tidyverse)
data.frame(x_values = seq(-2, 2, by = 0.1)) %>%
mutate(y_values = case_when(x_values < 0 ~ x_values^3,
x_values>=0 & x_values<=1 ~ x_values^2,
x_values>1 ~ sqrt(x_values))) %>%
ggplot(aes(x_values, y_values)) + geom_point()
Note that I changed your code to produce output when x_value = 0.
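For comparison, a base R sketch of the same piecewise definition using logical indexing instead of a loop. The name f_vec is made up, and, like the case_when version, it assumes that x = 0 belongs to the x^2 branch.

```r
# Hypothetical vectorised version: fill each piece by logical indexing.
f_vec <- function(x) {
  y <- rep(NA_real_, length(x))
  neg <- x < 0
  mid <- x >= 0 & x <= 1   # assumption: x = 0 is handled by the x^2 branch
  pos <- x > 1
  y[neg] <- x[neg]^3
  y[mid] <- x[mid]^2
  y[pos] <- sqrt(x[pos])
  y
}

res <- f_vec(c(-2, 0, 0.5, 4))
res

x_values <- seq(-2, 2, by = 0.1)
y_values <- f_vec(x_values)
```

The four test inputs give -8, 0, 0.25 and 2, and y_values can be passed to plot(x_values, y_values, type = "l") as before.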

Create new functions using a list of functions and list of function parameters to be passed

I am trying to create new functions from a list of functions and a list of parameters to be passed to these functions, but have been unable to do so. Please see the example below.
fun_list <- list(f = function(x, params) {x+params[1]},
z = function(a, params) {a * params[1] * params[2]})
params_list <- list(f = 1, z = c(3, 5))
# goal is to create 2 new functions in global environment
# fnew <- function(x) {x+1}
# znew <- function(a) {a*3*5}
# I've tried
for(x in names(fun_list)){
force(x)
assign(paste0(x, "new"), function(...) fun_list[[x]] (..., params = params_list[[x]]))
}
The goal is to do this dynamically for arbitrary functions and parameters.
Well, force() doesn't work in a for-loop because for loops do not create new environments. Based on a previous question of mine, I created a capture() function
capture <- function(...) {
vars <- sapply(substitute(...()), deparse);
pf <- parent.frame();
Map(assign, vars, mget(vars, envir=pf, inherits = TRUE), MoreArgs=list(envir=pf))
}
this allows
for(x in names(fun_list)) {
f = local({
capture(x);
p = params_list[[x]];
f = fun_list[[x]];
function(x) f(x, p)
})
assign(paste0(x, "new"), f)
}
where we create a local, private environment for the functions to store their default parameter values.
Which gives
fnew(2)
# [1] 3
znew(2)
# [1] 30
How about this:
for(x in names(fun_list)) {
formals(fun_list[[x]])$params <- params_list[[x]]
assign(paste0(x, "new"), fun_list[[x]])
}
This is similar in spirit:
ps <- list(fp=1,zp=c(3,5))
f0s <- substitute(list(f=function(x)x+fp,z=function(a)a*zp1*zp2),as.list(unlist(ps)))
f0s # list(f = function(x) x + 1, z = function(a) a * 3 * 5)
fs <- eval(f0s)
fs$f(1) # 2
To do the fancy thing described in the OP, you'd probably have to mess with formals.
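Another way to get a fresh environment per function is a small function factory, since every call to a function creates its own environment. This is only a sketch: bind_params and new_funs are made-up names, and the two lists are copied from the question.

```r
fun_list <- list(f = function(x, params) {x + params[1]},
                 z = function(a, params) {a * params[1] * params[2]})
params_list <- list(f = 1, z = c(3, 5))

# Hypothetical factory: each call gets its own environment holding
# `fun` and `params`, so nothing is shared between the new functions.
bind_params <- function(fun, params) {
  force(fun)
  force(params)
  function(...) fun(..., params = params)
}

new_funs <- Map(bind_params, fun_list, params_list)
names(new_funs) <- paste0(names(fun_list), "new")

# Optionally copy them into the global environment, as the question asks:
# list2env(new_funs, envir = globalenv())

a <- new_funs$fnew(2)   # 2 + 1
b <- new_funs$znew(2)   # 2 * 3 * 5
```

Here force() does work, because it runs inside a function call rather than inside the body of a for loop.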

Difficulty in applying a function over a sequence, when the function doesn't take the sequence as an argument

Difficulty in applying a function over a sequence, when the function doesn’t take the sequence as an argument and when the function uses values returned from the function itself at the previous value of the sequence.
I have the following problem. In my main method function mainn, I will first initialise fa, mu.a, mu.b, sigma, which will then be used as arguments in the step2 function. (Please note that I am pretty sure that my px.q4 and step2 are working correctly/properly in the sense that px.q4 returns a value and step2 returns 3 values)
This is where my question enters.
I would like to apply step2 over a sequence, and every time it is applied (except for the first iteration), the step2 function should use NOT the initial values but the values of fa.iter, mu.a.iter, mu.b.iter returned by the previous call to step2. Can this be done in R? Or do I have to use a for-loop for this sort of question?
I was able to do this using a for-loop. However, I want to learn more about R. Is there a (more efficient) particular command that I can use to get around the problem?
#this function returns a value
px.q4 <- function(fa, mu.a, mu.b,sigma)
{
v <- fa + mu.a + mu.b * sigma
v
}
#this function returns 3 values, fa.iter, mu.a.iter, and mu.b.iter
#for completeness I include the full code of what i am trying to do.
#But all you need to know is that the function step2 returns 3 values .
step2 <- function(fa, mu.a, mu.b)
{
#set prev = iter values // this also allow set initial values = prev in
#the 1st iteration
mu.a.prev <- mu.a.iter
mu.b.prev <- mu.b.iter
fa.prev <- fa.iter
#draw a trail point x.trail from propsal distribution ~ N(x_i-1,0.1)
mu.a.trail <- rnorm(1, mu.a.prev, 0.1)
mu.b.trail <- rnorm(1, mu.b.prev, 0.1)
fa.trail <- rnorm(1, fa.prev, 0.1)
while(fa.trail < 0 || fa.trail > 1)
{
fa.trail <- rnorm(1, fa.prev, sigma)
}
#if p(x_trail) >= p(x_i-1) set x_i = x_trail
a <- px.q4(fa.trail, mu.a.trail, mu.b.trail, sigma)
b <- px.q4(fa.prev ,mu.a.prev , mu.b.prev,sigma)
if(a >= b)
{
mu.a.iter <- mu.a.trail
mu.b.iter <- mu.b.trail
fa.iter <- fa.trail
}else{
r <- runif(1,min = 0, max = 1)
if(r < a/b){
mu.a.iter <- mu.a.trail
mu.b.iter <- mu.b.trail
fa.iter <- fa.trail
}else{
mu.a.iter <- mu.a.prev
mu.b.iter <- mu.b.prev
fa.iter <- fa.prev
}
}
res <- list(mu.a.iter, mu.b.iter, fa.iter)
res
}
#main body
mainn <- function(n,fa,mu.a,mu.b)
{
sigma <- 0.3
mu.a.init <- mu.a #initial values
mu.b.init <- mu.b
fa.init <- fa #(must be between 0 and 1)
#set initial values = iter values (for entering for loop)
mu.a.iter <- mu.a.init
mu.b.iter <- mu.b.init
fa.iter <- fa.init
#this is where the logical flaw comes in; how can I overcome it?
y <- sapply(n,FUN = step2)
}
You can use Reduce to apply a function recursively over a list:
for instance,
Reduce(`+`, 1:10, accumulate=TRUE)
is equivalent to cumsum(1:10).
In your case, you are not applying the function over a list,
but just iterating it: you can still use Reduce, if you ignore the second argument.
# It is easier if your function takes a vector and returns a vector.
# The second argument is ignored.
step2 <- function(x, u) cumsum(x) + rnorm(length(x))
r <- Reduce(step2, 1:100, init=c(0,0,0), accumulate = TRUE)
But there is nothing wrong in using a loop: it should not be significantly slower, unless your function is very fast.
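Applied to a state of three named values, the same idea might look like the sketch below. The step function and its acceptance rule are made up for illustration and are not the asker's step2; only the Reduce pattern is the point.

```r
# Hypothetical step: takes the previous state (a named vector) and returns
# the next one. The second argument (the iteration index) is ignored.
step <- function(state, i) {
  proposal <- state + rnorm(length(state), sd = 0.1)
  # Toy acceptance rule, assumed purely for illustration.
  if (sum(proposal) >= sum(state) || runif(1) < 0.5) proposal else state
}

set.seed(1)
init <- c(fa = 0.5, mu.a = 0, mu.b = 1)

# accumulate = TRUE keeps every intermediate state, starting with init.
chain <- Reduce(step, seq_len(100), init, accumulate = TRUE)
length(chain)   # 101: the initial state plus 100 steps

# Stack the states into a matrix, one row per iteration.
trace <- do.call(rbind, chain)
dim(trace)      # 101 rows, 3 columns
```

Each call to step receives the value returned by the previous call, which is the behaviour the question asks for.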

Wrapping very long functions in RExcel VBA?

When you want to use R functions in VBA via RExcel, you have to use
RInterface.RRun "..."
Then, if you'd like to define your own R function, you can simply
RInterface.RRun "y <- function(x) { ... }"
If y is made up of more than one command line, you can separate each line with ;, as you would in the R environment.
But... what if your y function is very very long?
A 20 to 30 line R function is damn difficult to write this way in VBA, and there's a limit to the length of VBA statements.
So: how may I wrap?
Here's an example of a quite long R function: can you show me how to put it in VBA using RExcel?
bestIV <- function(dT, IVTS.t, Spot, r) {
b <- r
xout <- seq(0, max(T), dT)
sfm <- matrix(NA, nrow = length(K), ncol = length(xout))
for(i in 1:length(K)) {
sfm[i,] <- approx(x = T, y = IVTS.t[i,], xout = xout, rule = 2)$y
}
sfm[,1] <- sfm[,1] + sfm[,2] - sfm[,3]
rownames(sfm) <- K
colnames(sfm) <- xout
Option <- matrix(NA, nrow = length(K), ncol = length(xout))
for(i in 1:length(K)) {
for(j in 1:length(xout)) {
TypeFlag <- ifelse(K[i] < Spot, 'p', 'c')
Option[i,j] <- GBSOption(TypeFlag = TypeFlag, S = Spot, X = K[i],
Time = xout[j] / 365, r = r, b = b,
sigma = sfm[i,j] / 100)#price
}
}
rownames(Option) <- K
colnames(Option) <- xout
dP <- (cbind(0, -t(apply(X = Option, MARGIN = 1, FUN = diff))) / Option)[,-(1:2)]
dV <- dP / dT
min.V <- which(dV == min(dV), arr.ind = TRUE, useNames = TRUE)
Strike <- as.numeric(dimnames(min.V)[1])
Maturity <- as.numeric(unlist(dimnames(dV)[2]))[min.V[2]]
Days <- dT
Mat <- c(dV[which(dV == min(dV))], Strike, Maturity, Days)
names(Mat) <- c('Value', 'Strike', 'Maturity', 'Days')
return(Mat)
}
Thanks,
Put your R code in your spreadsheet (in a range of cells) and use this function instead:
RInterface.RunRCodeFromRange range
Executes the commands in range on a worksheet
(allows to use commands prepared for interactive execution with R to be run in macro code)
You are passing a string as an argument to a VBA function. Thus your question reduces to "how can I concatenate strings in VBA".
The answer is to use the concatenation operator &, like this:
"a" & "b"
Say you have an R function:
y <- function(x, a, b){
return(x)
}
Then you can do this in VBA:
RInterface.RRun "y <- function(x, a, b) {" & _
"return(x)" & _
"}"

Is there a better way to create quantile "dummies" / factors in R?

I'd like to assign factors representing quantiles. Thus I need them to be numeric.
That's why I wrote the following function, which is basically the answer to my problem:
qdum <- function(v,q){
qd = quantile(v,1:(q)/q)
v = as.data.frame(v)
v$b = 0
names(v) <- c("a","b")
i=1
for (i in 1:q){
if(i == 1)
v$b[ v$a < qd[1]] = 1
else
v$b[v$a > qd[i-1] & v$a <= qd[i]] = i
}
all = list(qd,v)
return(all)
}
you may laugh now :) .
The returned list contains a variable that can be used to assign every observation to its corresponding quantile. My question is now: is there a better way (more "native" or "core") to do it? I know about quantcut (from the gtools package), but at least with the parameters I used, I ended up only with those unhandy(? - at least to me) thresholds.
Any feedback that helps me get better is appreciated!
With base R, use quantile to figure out the splits and then cut to convert the numeric variable to discrete:
qcut <- function(x, n) {
cut(x, quantile(x, seq(0, 1, length = n + 1)), labels = seq_len(n),
include.lowest = TRUE)
}
or if you just want the number:
qcut2 <- function(x, n) {
findInterval(x, quantile(x, seq(0, 1, length = n + 1)), all.inside = TRUE)
}
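A quick check of the findInterval approach on made-up data (the vector 1:20 is just an example). qcut2 is repeated here so that the snippet runs on its own.

```r
qcut2 <- function(x, n) {
  findInterval(x, quantile(x, seq(0, 1, length.out = n + 1)), all.inside = TRUE)
}

x <- 1:20
q <- qcut2(x, 4)
q
# Each quartile gets 5 of the 20 observations.
counts <- table(q)
counts
```

With all.inside = TRUE the maximum value lands in the last group instead of opening a group of its own.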
I'm not sure what quantcut is but I would do the following
qdum <- function(v, q) {
library(Hmisc)
quantilenum <- cut2(v, g=q)
levels(quantilenum) <- 1:q
cbind(v, quantilenum)
}
