Problems checking a package containing generated functions

I wish I knew how to make this example smaller, but I don't understand the problem well enough to do that.
I have a package that rewrites R functions to make them tail-recursive: tailr. It does a bit of analysis of a recursive function and then translates it into a looping function. For example, it will translate this factorial function
factorial <- function(n, acc) {
if (n <= 1) acc
else factorial(n - 1, acc * n)
}
into this version
factorial <- function(n, acc) {
.tailr_n <- n
.tailr_acc <- acc
callCC(function(escape) {
repeat {
n <- .tailr_n
acc <- .tailr_acc
if (n <= 1)
escape(acc)
else {
.tailr_n <<- n - 1
.tailr_acc <<- acc * n
}
}
})
}
The generated function is not pretty, but it does work.
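As a quick sanity check, the two versions can be compared directly. The sketch below copies both definitions from above under the hypothetical names factorial_rec and factorial_loop_manual (the names are only for illustration, so that both stay in scope and neither masks base::factorial); callCC() is part of base R.

```r
# Recursive original, renamed so it does not mask base::factorial.
factorial_rec <- function(n, acc) {
  if (n <= 1) acc
  else factorial_rec(n - 1, acc * n)
}

# Hand-copied loop version, in the shape shown above.
factorial_loop_manual <- function(n, acc) {
  .tailr_n <- n
  .tailr_acc <- acc
  callCC(function(escape) {
    repeat {
      n <- .tailr_n
      acc <- .tailr_acc
      if (n <= 1)
        escape(acc)
      else {
        # `<<-` updates the variables in the enclosing function
        .tailr_n <<- n - 1
        .tailr_acc <<- acc * n
      }
    }
  })
}

# Compare both versions for n = 1..10 with an accumulator of 1.
results_match <- all(vapply(
  1:10,
  function(n) factorial_rec(n, 1) == factorial_loop_manual(n, 1),
  logical(1)
))
results_match
```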
My problem is if I write a package that uses the transformation, one that contains only these lines of R:
#' Computes the factorial.
#' @param n A number
#' @param acc Accumulator to make the function tail-recursive
#' @return factorial of n
#' @export
factorial <- function(n, acc) {
if (n <= 1) acc
else factorial(n - 1, acc * n)
}
#' Computes the factorial.
#' @param n A number
#' @return factorial of n
#' @param acc Accumulator to make the function tail-recursive
#' @export
factorial_loop <- tailr::loop_transform(factorial)
running devtools::check() gives me this error:
Error in attr(e, "srcref")[[i]] : subscript out of bounds
Calls: <Anonymous> ... <Anonymous> -> collectUsage -> collectUsageFun -> walkCode -> h
Execution halted
If I put a dummy version of the transformation into the package, I do not get an error
dummy_transform_body <- function(expr) {
rlang::expr({
.tailr_n <- n
.tailr_acc <- acc
callCC(function(escape) {
repeat {
n <- .tailr_n
acc <- .tailr_acc
if (n <= 1)
escape(acc)
else {
.tailr_n <<- n - 1
.tailr_acc <<- acc * n
}
}
})
})
}
dummy_transform <- function(fun) {
fun_q <- rlang::enquo(fun)
new_fun_body <- dummy_transform_body(body(fun))
result <- rlang::new_function(
args = formals(fun),
body = new_fun_body,
env = rlang::get_env(fun_q)
)
result
}
#' Computes the factorial.
#' @param n A number
#' @return factorial of n
#' @param acc Accumulator to make the function tail-recursive
#' @export
factorial_loop_dummy <- dummy_transform(factorial)
I don't see any differences between the two functions, so I am puzzled why the check accepts the dummy but not the real version.
> body(factorial_loop) == body(factorial_loop_dummy)
[1] TRUE
> environment(factorial_loop)
<environment: namespace:Test>
> environment(factorial_loop_dummy)
<environment: namespace:Test>
> formals(factorial_loop)
$n
$acc
> formals(factorial_loop_dummy)
$n
$acc
> attributes(factorial_loop())
Error in factorial_loop() : argument "n" is missing, with no default
> attributes(factorial_loop)
NULL
> attributes(factorial_loop_dummy)
NULL
The error mentions the attribute srcref, but neither transformed function has any attributes. If I explicitly set the srcref attribute it doesn't help with the error though.
Any ideas, anyone?
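One thing attributes() on the function itself does not show is attributes stored deeper inside the body. When code is parsed with source references kept, a `{` call carries its own "srcref" attribute, and that can be inspected separately. The sketch below uses a small hypothetical block parsed from text; for a real function, the same check would be attr(body(f), "srcref"). Whether this is what trips up the check here is an assumption, not a confirmed diagnosis.

```r
# Parse a small block while keeping source references.
block <- parse(
  text = "{\n  a <- 1\n  b <- 2\n}",
  keep.source = TRUE
)[[1]]

# The `{` call itself carries a "srcref" attribute,
# stored alongside the call rather than on any function.
block_srcref <- attr(block, "srcref")
length(block_srcref)  # number of stored source references
length(block)         # number of elements in the call, including `{`

# Dropping the attribute leaves the code itself unchanged.
stripped <- block
attr(stripped, "srcref") <- NULL
has_srcref_after <- !is.null(attr(stripped, "srcref"))
```

If the two lengths disagree for a generated body, code that indexes the source references by statement position could plausibly fail, which would be consistent with a "subscript out of bounds" message.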
Update 2018/03/20:
The problem seems to be with the quasi-quotation splicing in my transformation function. If I comment out the !!! statements below and manually insert the cases for the factorial, the error goes away.
dummy_transform_body <- function(fun_expr, info) {
vars <- names(formals(info$fun))
tmp_assignments <- vector("list", length = length(vars))
locals_assignments <- vector("list", length = length(vars))
for (i in seq_along(vars)) {
local_var <- as.symbol(vars[[i]])
tmp_var <- parse(text = paste(".tailr_", vars[[i]], sep = ""))[[1]]
tmp_assignments[[i]] <- rlang::expr(rlang::UQ(tmp_var) <- rlang::UQ(local_var))
locals_assignments[[i]] <- rlang::expr(rlang::UQ(local_var) <- rlang::UQ(tmp_var))
}
# this would be a nice pipeline, but it is a bit much to require
# magrittr just for this
fun_expr <- make_returns_explicit(fun_expr, FALSE, info)
fun_expr <- simplify_returns(fun_expr, info)
fun_expr <- handle_recursive_returns(fun_expr, info)
fun_expr <- returns_to_escapes(fun_expr, info)
fun_expr <- simplify_nested_blocks(fun_expr)
rlang::expr({
#!!! tmp_assignments
.tailr_n <- n
.tailr_acc <- acc
callCC(function(escape) {
repeat {
#!!! locals_assignments
n <<- .tailr_n
acc <<- .tailr_acc
!! fun_expr
next
}
})
})
}
Another Update:
...Deleted the previous update... The hack with putting the splicing inside another block doesn't work for me any longer...
Yet another update...
Ok, I still have absolutely no idea why the splicing isn't working. I made other dummy-functions where it did. So I am really interested if someone has any ideas. In any case, I managed to rewrite my tailr function to avoid !!! and that seems to work now.
repeat_body <- as.call(
c(`{`, locals_assignments, fun_expr, quote(next))
)
call_cc_stmt <- rlang::expr(
callCC(function(escape) {
repeat {
!!repeat_body
}
})
)
as.call(
c(`{`, tmp_assignments, call_cc_stmt)
)
This is just a lot less elegant and the generated code is uglier--but I hide that by setting srcref to the original code, so no one need ever know.
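The base-R construction used in the rewrite can be tried on its own. This is a minimal sketch with hypothetical statements (x, y and the list name assignments are placeholders, not part of tailr): a `{` call is assembled from a list of expressions with as.call() and then evaluated.

```r
# A list of statements to splice into a block, without using !!!.
assignments <- list(quote(x <- 1), quote(y <- 2))

# Build the call `{ x <- 1; y <- 2; x + y }` by hand.
# Everything is wrapped in lists so c() simply concatenates them.
block <- as.call(c(
  list(as.name("{")),
  assignments,
  list(quote(x + y))
))

# Evaluate in a fresh environment to keep x and y out of the workspace.
result <- eval(block, new.env())
result
```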

Related

Evaluating an integral in R multiple times

I am trying to integrate the following function with respect to x
integrand <- function(x) {
f1 <- pnorm((1/sqrt(u/x))*( sqrt((t*u*v)/x) - sqrt(x/(t*u*v)) ))}
where,
v=10
u=5
However, I need to integrate considering different values of t, so tried defining a sequence of values as:
t=seq(0,100,0.1)
And used the sapply function as:
data=sapply(t, function(x) integrate(integrand, lower = 0 , upper = 10000)$value )
I got these errors:
Error in integrate(integrand, lower = 0, upper = 10000) :
evaluation of function gave a result of wrong length
In addition: Warning messages:
1: In (t * u * v)/x : longer object length is not a multiple of shorter object length
2: In x/(t * u * v) : longer object length is not a multiple of shorter object length
3: In (1/sqrt(u/x)) * (sqrt((t * u * v)/x) - sqrt(x/(t * u * v))) :
longer object length is not a multiple of shorter object length
I haven't had any luck.
I would greatly appreciate any help.
Regards!
You can still use sapply like so:
sapply(t, function(t) {
integrate(function(x) {
pnorm((1/sqrt(u/x))*( sqrt((t*u*v)/x) - sqrt(x/(t*u*v)) ))
}, lower = 0, upper = 1000)$value
})
Output
[1] 0.000000 5.416577 10.251273 15.146418 20.084907 25.049283 ...
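The reason the original call failed is that integrand never received the current value from sapply, so the t inside it was the whole global vector and got recycled against x. An alternative to nesting the function is to make the parameter an explicit argument and let integrate() forward it through its ... argument. The sketch below uses a simpler hypothetical integrand (decay, with a parameter named rate) whose integral is known, so the result can be checked.

```r
# Integrand with an explicit parameter instead of a global variable.
decay <- function(x, rate) exp(-rate * x)

rates <- c(0.5, 1, 2)

# integrate() passes extra named arguments on to the integrand,
# so each call sees a single value of `rate`.
areas <- vapply(
  rates,
  function(r) integrate(decay, lower = 0, upper = Inf, rate = r)$value,
  numeric(1)
)

# The exact value of each integral is 1 / rate.
cbind(rate = rates, numeric = areas, exact = 1 / rates)
```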
A previous post covers a similar problem with a specific solution.
Adapted to this question, the code would be:
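t <- seq(0, 100, 0.1)
fu <- list()
int <- numeric()
for (i in seq_along(t))
{
    # one integrand per value of t; u and v are the constants from the question
    fu[[i]] <- function(x) {
        pnorm((1/sqrt(u/x))*( sqrt((t[i]*u*v)/x) - sqrt(x/(t[i]*u*v)) ))
    }
    # integrate the i-th function and keep only the numeric value
    int[i] <- integrate(fu[[i]], lower = 0, upper = 1000)$value
}
int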

Manually written function doesn't behave the same as the gamma function

I implemented a function that calculates the value of the gamma function. When I try to multiply f5(a) by a number, I get the error: Error in result * f5(a) : non-numeric argument to binary operator. If I instead use result * gamma(a), which is the built-in function, it works just fine. It seems I cannot do any arithmetic with f5, even though it returns the same result as gamma.
f5 <- function(a)
{
f <- function(x)
x^(a-1)*exp(-x)
integrate(f, 0, Inf)
}
f6 <- function(a)
{
if (a < 0)
print("a is negative")
else if (a%%1 == 0)
return (factorial(a-1))
else
{
result <- 1
while (a > 1)
{
result <- result * (a - 1)
a <- a - 1
}
result <- result * f5(a)
result
}
}
gamma(0.3)
f5(0.3)
f6(0.3)
This is because of the class of object that gets returned from f5().
class(f5(0.3))
[1] "integrate"
This is a named list object, and you can call the specific value from it:
names(f5(a))
[1] "value" "abs.error" "subdivisions" "message" "call"
You want the value component. Modifying f6() to the code below makes it work:
f6 <- function(a){
if (a < 0){
print("a is negative")
}else if (a%%1 == 0){
return (factorial(a-1))
}else{
result <- 1
while (a > 1){
result <- result * (a - 1)
a <- a - 1
}
result <- result * f5(a)$value
result
}
}
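The difference between the integrate object and its value component can be seen in isolation. This is a small sketch using a = 2.5 (a value picked here for illustration because the integrand is well behaved at 0), compared against the built-in gamma().

```r
a <- 2.5

# integrate() returns an object of class "integrate" (a list), not a number.
res <- integrate(function(x) x^(a - 1) * exp(-x), lower = 0, upper = Inf)
class(res)

# Arithmetic on the whole object fails, as in the question.
arith_fails <- inherits(try(2 * res, silent = TRUE), "try-error")

# The numeric estimate lives in the `value` component.
gamma_numeric <- res$value
c(numeric = gamma_numeric, builtin = gamma(a))
```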

Why does "Sum()" succeed where "+" fails in recursive R function?

I am experimenting with the functional programming paradigm in R. I have defined a function that sums a sequence of integers from n to m. When I use sum() the function returns the expected result:
sumRange <- function(n, m) {
if (n <= m) {
return(sum(n, sumRange((n + 1), m)))
}
}
sumRange(1, 10)
# [1] 55
However, when I use the + operator the function returns numeric(0):
sumRange <- function(n, m) {
if (n <= m) {
return(n + sumRange((n + 1), m))
}
}
sumRange(1, 10)
# numeric(0)
Why does the operator + not work in this recursive function? Is there a way to rewrite the function so that it does?
The issue is that you never specify an else condition, hence at the end of the recursion it appears that R is returning NULL when the if condition fails. Returning 0 as the else condition fixes your problem:
sumRange <- function(n, m) return(ifelse (n <= m, (n + sumRange((n+1), m)), 0))
sumRange(1, 10)
[1] 55
Note that this is essentially defining a base case for your recursion. A base case, when hit, ends the recursion and causes the calls on the stack to be unwound.
To see the issue with the way you phrased your code, try writing out your function explicitly:
sumRange <- function(n, m) {
    if (n <= m) {
        return(n + sumRange((n+1), m))
    }
    # but what gets returned if n > m ?
    # nothing is returned explicitly, so the result is NULL
}
I'm not an R guru, but as far as I can tell this behavior is well defined in R: an if without an else evaluates to NULL (invisibly) when its condition is FALSE, so the function returns NULL. Still, it is clearer to state the base case explicitly than to rely on that implicit NULL.
Demo
If no return is executed (via an explicit or implicit return statement), an R function returns a NULL object.
If you add a numeric value to this object, the result is simply numeric(0).
So what happens in the second case is that when n reaches 11, the function returns NULL, and the earlier calls then add their values to it. But NULL + 10 + 9 + ... = numeric(0).
Check this with
no_ret <- function ()
{
# just return nothing
}
obj <- no_ret()
obj
# NULL
class(obj)
# "NULL"
new_obj <- obj + 10
new_obj
# numeric(0)
When the first function is executed, what sum() receives is effectively a vector with a NULL in it. For example,
vec <- c(NULL, 10, 9, ...) is actually vec <- c(10, 9, ...), so you get the expected outcome.
> c(NULL, 10:1)
[1] 10 9 8 7 6 5 4 3 2 1
> sum(NULL, 10:1)
[1] 55
> NULL + 10:1
integer(0)
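Putting the pieces together, a version with an explicit base case that keeps the + operator could look like the sketch below. The name sum_range is hypothetical, chosen to avoid clashing with the definitions above.

```r
# Recursive sum of the integers n..m with an explicit base case.
sum_range <- function(n, m) {
  if (n > m) {
    return(0)  # base case: the empty range sums to 0
  }
  n + sum_range(n + 1, m)
}

total <- sum_range(1, 10)
total

# The two behaviours discussed above, side by side:
null_plus <- NULL + 1      # arithmetic with NULL yields a zero-length vector
null_sum  <- sum(NULL, 1)  # sum() simply ignores the NULL
```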

How to initialize R function during first run or whenever input changes

I'm new to R and have some trouble understanding so-called "environments" and how to use them properly. What I miss a lot in the R language are static variables (like in Java).
I'm writing a program with a couple of functions that need to initialize during their first run. To achieve this, for each function I've created a new environment which is only accessed by that particular function (for example, "f1" is only accessed from inside "myfunction1").
What I don't like about my solution is that there is some additional code outside of the function body and it's not too readable. Is there a simpler way to achieve the same thing? If so, it would be nice if you could provide a modified example to show how it works. Thank you.
f1 <- new.env()
f1$initialized <- FALSE
f1$o <- NULL
f1$length <- NULL
f1$compute <- NULL
myfunction1 <- function(x) {
if(f1$initialized == FALSE){
f1$initialized <- TRUE
f1$compute <- 2*pi^2+3
}
if(is.null(f1$length) || f1$length!=length(x)){
f1$length <- length(x)
if(f1$length==2) {f1$o<-read.table("data_1.txt")}
else {f1$o<-read.table("data_2.txt")}
}
print("Lets print something!")
return(f1$o * f1$compute * x + 1000)
}
If you are familiar with Java, then using Reference Classes may be a good way to go. This seems to do what you are looking for:
myclass <- setRefClass('myclass', fields = list(initilized = 'logical',
o = 'data.frame',
len = 'numeric',
compute = 'numeric'))
#constructor
myclass$methods(initialize = function(initialized, len){
initilized <<- initialized
len <<- len
})
#method
myclass$methods(myfunction1 = function(x){
if(initilized == FALSE){
initilized <<- TRUE
compute <<- 2*pi^2+3
}
if(is.null(len) || len != length(x)){
len <<- length(x)
if(len==2) {o <<- read.table("data_1.txt")}
else {o <<- read.table("data_2.txt")}
}
print("Lets print something!")
return(o * compute * x + 1000)
})
obj <- myclass$new(FALSE, 0)
obj$myfunction1(2)
Check out ?ReferenceClasses for information on what's going on here (much more OOP styled and has some support for class inheritance, which sounds like what you want anyway).
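A lighter alternative that keeps all state next to the function is a closure created with local(). This is only a sketch of the idea, not a drop-in replacement: the file reading from the question is left out, and the names my_function and n_init are hypothetical (n_init exists just to show that initialization runs once).

```r
my_function <- local({
  # Private state, visible only to the function defined below.
  initialized <- FALSE
  compute <- NULL
  n_init <- 0

  function(x) {
    if (!initialized) {
      # Runs only on the first call; `<<-` updates the private state.
      initialized <<- TRUE
      compute <<- 2 * pi^2 + 3
      n_init <<- n_init + 1
    }
    compute * x + 1000
  }
})

first  <- my_function(c(1, 2))
second <- my_function(c(1, 2))

# The state can still be inspected through the function's environment.
environment(my_function)$n_init
```

The state behaves much like a static variable: it persists between calls, and nothing outside the function body needs to be set up separately.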

R program keeps getting numeric(0) answer

I am a beginner in R. Here's the formula I'm trying to code to find the lambda that maximizes the log likelihood of some bigrams. When the bigrams are not found, the P_b (bigram) function fails, but the P_u (unigram) function should provide the unigram result (lambda = 0).
It works for bigrams that are found. When they're not found, though, I only get numeric(0), not the unigram result.
p.mix <- function(w2, w1) {
(1-lambda) * uni.dfrm$prob[uni.dfrm$token==w2] + lambda * p.bi(w2,w1)
}
The p.bi() function looks complicated because of the indexing so I'm reluctant to post it but it does work when the bigrams are found. It just looks up the count of times w' appears after w and divides it by the times w appears, but I have to go through another vector of vocabulary words so it looks ugly.
When w' is never found occurring after w, instead of a zero count, there's no row at all, which is what apparently causes the numeric(0) result. That's what the mixed model is supposed to solve, but I can't get it to work. Any ideas how this can work?
You can add a test for the case where w2 is numeric(0), for example:
p.mix <- function(w2, w1) {
if(length(w2)>0){
res <- (1-lambda) * uni.dfrm$prob[uni.dfrm$token==w2] +
lambda * p.bi(w2,w1)
}else res <- 0
res
}
EDIT
p.mix <- function(w2, w1) {
if(length(w2) && length(uni.dfrm$prob[uni.dfrm$token==w2]) > 0)
(1-lambda) * uni.dfrm$prob[uni.dfrm$token==w2] + lambda * p.bi(w2,w1)
else 0
}
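Since the question describes the empty result as coming from the bigram lookup, another option is to guard each term rather than w2 itself. The sketch below assumes hypothetical helpers or_zero and p_mix_safe and a made-up probability table; it takes the two probabilities as arguments instead of calling p.bi(), whose code is not shown.

```r
# Replace a zero-length lookup result with 0.
or_zero <- function(p) if (length(p) == 0) 0 else p

# Mixture of unigram and bigram probabilities that tolerates missing rows.
p_mix_safe <- function(p_uni, p_bi, lambda) {
  (1 - lambda) * or_zero(p_uni) + lambda * or_zero(p_bi)
}

# Hypothetical unigram table and a lookup that finds nothing.
probs <- c(the = 0.6, cat = 0.4)
missing_lookup <- probs[names(probs) == "dog"]  # zero-length result

# Without the guard, the empty term wipes out the whole expression.
unguarded <- 0.7 * probs[["cat"]] + 0.3 * missing_lookup

# With the guard, the unigram part survives.
guarded <- p_mix_safe(probs[["cat"]], missing_lookup, lambda = 0.3)
guarded
```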
