R ff ffbase ffwhich error in a function call?

Here is my code, which calls ffwhich inside a function:
library(ffbase)
rm(a,b)
test <- function(x) {
a <- 1
b <- 3
ffwhich(x, x > a & x < b)
}
x <- ff(1:10)
test(x)
Error in eval(expr, envir, enclos) (from <text>#1) : object 'a' not found
traceback()
6: eval(expr, envir, enclos)
5: eval(e)
4: which(eval(e))
3: ffwhich.ff_vector(x, x > a & x < b)
2: ffwhich(x, x > a & x < b) at #4
1: test(x)
Might this be caused by lazy evaluation? eval() cannot find a and b, which are bound inside the function test. How can I use ffwhich in a function?
R 2.15.2
ffbase 0.6-3
ff 2.2-10
OS opensuse 12.2 64 bit

Yes, it looks like an eval issue, as Arun indicates. I normally use the following pattern with ffwhich, which behaves like an eval.
library(ffbase)
rm(a,b)
test <- function(x) {
a <- 1
b <- 3
idx <- x > a & x < b
idx <- ffwhich(idx, idx == TRUE)
idx
}
x <- ff(1:10)
test(x)
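The scoping problem behind the error can be reproduced without ff at all. The following is only a sketch in base R: which_nse and which_nse_fixed are hypothetical stand-ins, not the actual ffbase implementation, and they assume that the condition is captured with substitute and evaluated against the data.

```r
# Hypothetical stand-in for a function that evaluates its condition
# non-standardly (NOT the real ffwhich code).
which_nse <- function(x, expr) {
  e <- substitute(expr)
  # 'enclos' defaults to this function's own frame, so variables that
  # live in the caller's frame (a and b below) are not visible.
  hits <- eval(e, list(x = x))
  which(hits)
}

# Same idea, but unresolved names are looked up in the caller's frame.
which_nse_fixed <- function(x, expr) {
  e <- substitute(expr)
  hits <- eval(e, list(x = x), enclos = parent.frame())
  which(hits)
}

test_broken <- function(x) {
  a <- 1
  b <- 3
  which_nse(x, x > a & x < b)
}

test_fixed <- function(x) {
  a <- 1
  b <- 3
  which_nse_fixed(x, x > a & x < b)
}

# Fails, because a is only defined inside test_broken
tryCatch(test_broken(1:10), error = function(e) conditionMessage(e))

test_fixed(1:10)
# [1] 2
```

Computing the logical vector first, as in the answer above, sidesteps the issue because the comparison is then evaluated in the frame of test, where a and b exist.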

I had the same problem, and the answer above did not solve it, because it does not let us pass the condition as an argument to the function.
I found a way to do that. Here it is:
require(ffbase) # ffdf is a class provided by ff, not a package of its own
# the data ::
x <- as.ffdf( data.frame(a = c(1:4,1),b=5:1))
x[,]
# Now the function below is working ::
idx_ffdf <- function(data, condition){
exp <-substitute( (condition) %in% TRUE)
# substitute captures the condition unevaluated.
# %in% TRUE makes the condition FALSE wherever it is NA.
idx <- do.call(ffwhich, list(data, exp) ) # here is the trick: do.call !!!
return(idx)
}
# testing :
idx <- idx_ffdf(x,a==1)
idx[] # gives the correct 1,5 ...
idx <- idx_ffdf(x,b>3)
idx[] # gives the correct 1,2 ...
Hope this helps somebody !
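The same substitute plus do.call pattern can be tried in base R, with subset playing the role of ffwhich. This is a sketch only: filter_rows is a hypothetical helper name, and it assumes that the target function captures its condition with substitute, as subset does for data frames.

```r
# Hypothetical wrapper: forward an unevaluated condition to a function
# that uses non-standard evaluation (here base::subset).
filter_rows <- function(data, condition) {
  cond <- substitute(condition)       # capture the condition, unevaluated
  do.call(subset, list(data, cond))   # do.call splices the call back in
}

df <- data.frame(a = c(1:4, 1), b = 5:1)
res <- filter_rows(df, a == 1)
res$b
# [1] 5 1
```

Calling subset(data, condition) directly inside the wrapper would not work, because subset would then see the symbol condition rather than a == 1.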

Related

Does `gmp` library disregard index ordering in `[<-`?

Consider these two examples:
> foo <- 1:5
> foo2 <- c(10,20)
> foo[3:2] <- foo2
> foo
[1] 1 20 10 4 5
> bar <- as.bigz(1:5)
> bar2 <- as.bigz(c(10,20))
> bar[3:2] <- bar2
> bar
Big Integer ('bigz') object of length 5:
[1] 1 10 20 4 5
Am I missing something in how bigz objects are indexed, or is this a bug in the library?
Added: gmp 0.6.5 and R 4.2.0 on Windows.
This is likely irrelevant, but here is what I see when stepping through `[` in the debugger:
a_5 <- as.bigz(c(5,4,3,2,1))
a_5[2:3]
debugging in: `[.bigz`(a_5, 2:3)
debug at /home/chris/r_TMPDIR/RtmpVMOZqo/R.INSTALL4bd8b27ba6a92/gmp/R/biginteger.R#480: {
mdrop <- missing(drop)
Narg <- nargs() - (!mdrop)
matrixAccess = Narg > 2
has.j <- !missing(j)
if (!is.null(attr(x, "nrow")) & matrixAccess) {
.Call(matrix_get_at_z, x, i, j)
}
else {
if (has.j)
stop("invalid vector subsetting")
r <- .Call(biginteger_get_at, x, i)
attr(r, "nrow") <- NULL
r
}
}
Browse[2]> Q
So the debugger starts in `[.bigz`(a_5, 2:3), our first [, whereas:
bar = as.bigz(c(300, 400))
a_5[2:3] <- bar[2:1]
debugging in: `[.bigz`(bar, 2:1)
debug at /home/chris/r_TMPDIR/RtmpVMOZqo/R.INSTALL4bd8b27ba6a92/gmp/R/biginteger.R#480: {
mdrop <- missing(drop)
Narg <- nargs() - (!mdrop)
matrixAccess = Narg > 2
has.j <- !missing(j)
if (!is.null(attr(x, "nrow")) & matrixAccess) {
.Call(matrix_get_at_z, x, i, j)
}
else {
if (has.j)
stop("invalid vector subsetting")
r <- .Call(biginteger_get_at, x, i)
attr(r, "nrow") <- NULL
r
}
}
and here the debugger enters the RHS call, `[.bigz`(bar, 2:1), contrary to what we would expect from ordinary values:
a_norm <- c(5,4,3,2,1)
b_nums <- c(200, 300)
a_norm[2:3] = b_nums[2:1]
a_norm
[1] 5 300 200 2 1
where both the LHS ('where we want it') and the RHS ('what it is') occur in a single line. Note that I couldn't get debug to trigger in this instance.
Feature or flaw?
I received confirmation from one of the authors (M Maechler) that this is one of several bugs in gmp related to indexing and subsetting. There are also problems when attempting to run certain apply functions, BTW. We'll just have to wait for the next version to be released.
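Until a fix is released, one possible stopgap is to sort the index yourself and permute the values to match, so that the assignment no longer depends on index order. This is only a sketch: assign_in_order is a hypothetical helper, it is shown on a plain vector, and I have not verified it against bigz objects. It assumes the bug is limited to the index order being ignored.

```r
# Hypothetical helper: make the assignment independent of index order by
# sorting the index and applying the same permutation to the values.
# Assumes length(value) == length(idx).
assign_in_order <- function(x, idx, value) {
  ord <- order(idx)
  x[idx[ord]] <- value[ord]
  x
}

foo <- 1:5
foo <- assign_in_order(foo, 3:2, c(10, 20))
foo
# [1]  1 20 10  4  5
```

On ordinary vectors this gives the same result as foo[3:2] <- c(10, 20), which is the behaviour the question expected from bigz.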

How to work around the fact that `fun<-` starts by evaluating `value`?

Consider the following function, which replaces the LHS with value wherever the condition is TRUE:
`==<-` <- function (e1, e2, value) replace(e1, e1 == e2, value)
If x == 3, replace x with 42:
x <- 3
x == 3 <- 42
x
# [1] 42
So far so good, but what if value has side effects? As it stands, value is evaluated even if my condition is FALSE.
# desired: if x == 100, stop
x == 100 <- stop("equals 100!")
# Error: equals 100!
Is there a way around this?
I list some workarounds I found below, but I would like to see if there are more.
EDIT: this addresses sotos' comment:
`==<-` <- function (e1, e2, value) {
cond <- e1 == e2
if(any(cond))
replace(e1, cond, value)
else e1
}
x <- 3; x == 100 <- 'xyz'
x
# [1] 3
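The eager evaluation described in the question can be seen in isolation with a replacement function that never touches value. A minimal sketch; `ignore<-` and the evaluated flag are hypothetical names used only for this illustration.

```r
# Hypothetical replacement function that ignores `value` entirely.
`ignore<-` <- function(x, value) x

evaluated <- FALSE
x <- 1
ignore(x) <- {
  evaluated <- TRUE   # side effect on the right-hand side
  99
}

x
# [1] 1
evaluated
# [1] TRUE
```

The right-hand side runs before `ignore<-` is called at all, which is why every workaround has to hand over something inert (a quoted call, a formula, a function) instead of the expression itself.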
Here are a few ways to work around this:
1. quote and modify ==<- so it always evaluates quoted calls
2. Use ~ as a quoting function
3. Use ~ as a shorthand for functions and use rlang::as_function
4. Use a function delay to quote input and add a class delayed so that only unquoted inputs and delayed quoted inputs will be evaluated.
5. Override <- to recognize ==<- and always delay the lhs
The last way is the only one that works without changing the interface, but it does so by overriding <-, which is generally not advisable.
1. quote and modify ==<- so it always evaluates quoted calls
If we know we don't want to assign unevaluated calls, we can make sure our function evaluates everything, and just quote our input.
`==<-` <- function (e1, e2, value) {
cond <- e1 == e2
if(any(cond))
replace(e1, e1 == e2, eval.parent(value))
else e1
}
x <- 42
x == 100 <- quote(stop("equals 100!"))
x <- 100
x == 100 <- quote(stop("equals 100!"))
# Error in eval(expr, envir, enclos) : equals 100!
2. Use ~ as a quoting function
If we know we don't want to assign formulas, we can use a ~ instead of quoting.
`==<-` <- function (e1, e2, value) {
cond <- e1 == e2
if(any(cond))
replace(e1, e1 == e2,
if(inherits(value, "formula"))
eval.parent(as.list(value)[[2]])
else
value)
else e1
}
x <- 42
x == 100 <- ~stop("equals 100!")
x <- 100
x == 100 <- ~stop("equals 100!")
# Error in eval(expr, envir, enclos) : equals 100!
3. Use ~ as a shorthand for functions and use rlang::as_function
If we know we don't want to assign functions nor formulas we can go a step further and build a feature out of it.
`==<-` <- function (e1, e2, value) {
cond <- e1 == e2
if(any(cond))
replace(e1, e1 == e2,
if(inherits(value, "formula") || is.function(value))
rlang::as_function(value)(e1)
else
value)
else e1
}
x <- 42
x == 100 <- ~stop("equals 100!")
x <- 100
x == 100 <- ~stop("equals 100!")
# Error in eval(expr, envir, enclos) : equals 100!
x == 100 <- sqrt
x
# [1] 10
4. Use a function delay to quote input and add a class delayed
We can create a function delay which will quote the value expression and add a class "delayed" which our function will recognize to trigger the call at the right moment :
`==<-` <- function (e1, e2, value) {
cond <- e1 == e2
if(any(cond))
replace(e1, e1 == e2,
# test the value argument itself, not the global x
if (inherits(value, "delayed")) eval.parent(value) else value)
else e1
}
delay <- function(x) {
x <- substitute(x)
class(x) <- "delayed"
x
}
x <- 42
x == 100 <- delay(stop("equals 100!"))
x <- 100
x == 100 <- delay(stop("equals 100!"))
# Error in eval(expr, envir, enclos) : equals 100!
The good part is that it can work with any code that might trigger an error, the bad part is that delay is a strange function that makes sense only in a specific context.
We can mitigate the awkwardness by defining a proper printing method referring to the package help:
print.delayed <- function(x,...){
message(
"Delayed call, useful as a `value` argument of `mmassign` assignment functions.\n",
"See ?mmassign::delay.")
print(unclass(x), ...)
invisible(x)
}
delay(stop("equals 100!"))
# Delayed call, useful as a `value` argument of `mmassign` assignment functions.
# See ?mmassign::delay.
# stop("equals 100!")
Following the same principle, we can design a STOP function that is delayed by construction:
STOP <- function(...) `class<-`(substitute(stop(...)), "delayed")
x <- 42
x == 100 <- STOP("equals 100!")
x <- 100
x == 100 <- STOP("equals 100!")
# Error in eval(expr, envir, enclos) : equals 100!
STOP("equals 100!")
# Delayed call, useful as a `value` argument of `mmassign` assignment functions.
# See ?mmassign::delay.
# stop("equals 100!")
5. Override <- to recognize ==<- and always delay the lhs
If we override <- we can make it work, but that's bad practice of course, so this is just for fun. If the first element of the LHS is ==, then quote value, add the class "delayed", and proceed as above.
`<-` <- function(e1,e2) {
.Primitive("<-")(lhs, match.call()[[2]])
if(length(lhs) > 1 && identical(lhs[[1]],quote(`==`))) {
invisible(eval.parent(substitute(
.Primitive("<-")(e1,e2),
list(e1=substitute(e1),
e2= substitute(`class<-`(quote(e2),"delayed"))
))))
} else {
invisible(eval.parent(substitute(.Primitive("<-")(e1,e2))))
}
}
x <- 4
x == 100 <-stop("equals 100!")
x <- 100
x == 100 <-stop("equals 100!")
# Error in eval(expr, envir, enclos) : equals 100!

Is it possible to construct an assignment expression using rlang?

I'm trying to use the rlang package to construct an expression that does an assignment, given a right-hand side expression (the value to assign) and a left-hand side expression (the place to assign it to). For example, let's say I want to construct and evaluate the expression a <- 5:
> library(rlang)
> a <- "Not 5"
> lhs <- quo(a)
> rhs <- quo(5)
> eval_tidy(quo( (!!lhs) <- (!!rhs)) ) # Error
Error in (~a) <- (~5) : could not find function "(<-"
> eval_tidy(quo(`<-`(!!lhs, !!rhs))) # Error
Error in ~a <- ~5 : could not find function "~<-"
> eval_tidy(quo(`<-`(!!f_rhs(lhs), !!rhs))) # No error, but no effect
[1] 5
> stopifnot(a == 5)
Error: a == 5 is not TRUE
> print(a)
[1] "Not 5"
As you can see, none of the above methods of constructing and evaluating this assignment have the desired effect. Is there any way to do this correctly?
Edit: Using assign instead of <- is not a good solution, because it only works for variables, not elements of objects. For example, it won't work for:
> a <- list(ShouldBeFive="Not 5")
> lhs <- quo(a$ShouldBeFive)
Edit 2: I have written a proof of concept that demonstrates what I'm trying to accomplish. It defines an assign_general function that allows arbitrary left-hand sides, e.g. assign_general(a[[1]], 5) is equivalent to a[[1]] <- 5. However, my implementation seems kind of hackish, I don't know what corner cases I may have missed, and I'm still not sure if there's a more direct way to do it, so I'm still interested to see if anyone has a better solution.
1) rlang::lang We can use rlang::lang like this:
library(rlang)
# inputs
a <- "Not 5"
lhs <- quote(a)
rhs <- 5
L <- lang("<-", lhs, rhs)
eval(L)
a
## [1] 5
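In more recent rlang releases, lang() appears to have been retired in favour of call2(). Assuming a current rlang is installed, the same construction would look roughly like this:

```r
library(rlang)

# inputs
a <- "Not 5"
lhs <- quote(a)
rhs <- 5

# call2() builds the call `a <- 5` from its parts
cl <- call2("<-", lhs, rhs)
eval(cl)
a
## [1] 5
```

As with lang, lhs can be any valid assignment target, for example quote(a$ShouldBeFive) when a is a list.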
2) call Alternatively, without rlang, use call in place of lang:
# inputs
a <- "Not 5"
lhs <- quote(a)
rhs <- 5
cc <- call("<-", lhs, rhs)
eval(cc)
a
## [1] 5
2a) Both of the above also work in the case that lhs is an appropriate expression. For example using the built-in data frame BOD:
# inputs
BOD2 <- BOD
lhs <- quote(BOD2$xyz)
rhs <- 5
cc <- call("<-", lhs, rhs)
eval(cc)
names(BOD2)
## [1] "Time" "demand" "xyz"
2b) assign_general
assign_general <- function(lhs, rhs, envir = parent.frame()) {
cc <- call("<-", substitute(lhs), substitute(rhs))
eval(cc, envir = envir)
}
# test
a <- 1:5
assign_general(a[3], 5)
a
## [1] 1 2 5 4 5
Some alternatives to the call statement would be:
cc <- substitute(call("<-", lhs, rhs))
or
cc <- substitute(lhs <- rhs)
2c) Of course this would be sufficient:
assign_general2 <- `<-`
a <- 1:5
assign_general2(a[3], 5)
a
## [1] 1 2 5 4 5
3) rlang version of assign_general An rlang implementation of assign_general in (2b) can be obtained by replacing call with lang and substitute with enexpr:
library(rlang)
assign_general3 <- function(lhs, rhs, envir = parent.frame()) {
L <- lang("<-", enexpr(lhs), enexpr(rhs))
eval(L, envir = envir)
}
# test
a <- 1:5
assign_general3(a[3], 5)
a
## [1] 1 2 5 4 5
4) strings Another possibility is to deparse the arguments into strings:
assign_general4 <- function(lhs, rhs, envir = parent.frame()) {
s <- paste(deparse(substitute(lhs)), "<-", deparse(substitute(rhs)))
p <- parse(text = s)
eval(p, envir = envir)
}
# test
a <- 1:5
assign_general4(a[3], 5)
a
## [1] 1 2 5 4 5
One advantage of metaprogramming is that expressions can be supplied as strings, and the L-value being assigned to can be named by a string as well. The assign function is reusable for this in some cases.
rhs <- "1 > 0"
assign("lhs", eval(parse(text = rhs)))
lhs
[1] TRUE
Above, both lhs and rhs are passed as strings, and the value of the parsed expression is assigned to the L-value.
With a little bit of dark magic and some luck I was able to achieve what you are after:
library(rlang)
expr<-quote(x<-1) # just some sample assignment operator to modify
a <- list(ShouldBeFive="Not 5")
lhs <- quo(a[[1]])
rhs <- quo(5)
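# UQE() is no longer available in recent rlang versions;
# quo_get_expr() extracts the expression from a quosure
expr[[2]] <- quo_get_expr(lhs)
expr[[3]] <- quo_get_expr(rhs)
expr
## a[[1]] <- 5
eval(expr)
a$ShouldBeFive
## [1] 5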
Here's a hopefully cleaner alternative that does not depend on rlang:
b <- list(ShouldBeSix="Not 6")
lhs <- quote(b[[1]])
rhs <- quote(6)
eval(substitute(x <- value,list(x = lhs, value = eval(rhs))))
b$ShouldBeSix
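Another base R way to splice both sides into an assignment is bquote, which also leaves the right-hand side unevaluated until the whole call is run. A small sketch along the same lines (the variable name assignment is just for illustration):

```r
b <- list(ShouldBeSix = "Not 6")
lhs <- quote(b[[1]])
rhs <- quote(1 + 5)

# .( ) inserts the stored expressions into the template `lhs <- rhs`
assignment <- bquote(.(lhs) <- .(rhs))
assignment
## b[[1]] <- 1 + 5

eval(assignment)
b$ShouldBeSix
## [1] 6
```

Because the result is an ordinary call, it can be inspected or modified before eval is applied.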

data.table by with substitute, eval and deparse

Problem 1: Why is this not consistent?
library(data.table)
dt <- data.table(x=1:4, y=c(1,1,2,2), z=c(1,2,1,2))
test1 <- function(dt, a){
t <- deparse(substitute(a))
dt[,list(x=sum(x)), by=t]
}
test1(dt, y) # Works well
y x
1: 1 3
2: 2 7
test2 <- function(dt, a){
dt[,list(x=sum(x)), by=deparse(substitute(a))]
}
test2(dt, y)
# Error: 'by' appears to evaluate to column names but isn't c() or key().
Problem 2:
It seems that evaluating in either frame works. Why is that, and which one should I use?
test1 <- function(dt, a){
dt[,list(x=sum(x)), by=eval(substitute(a))]
}
test1(dt, y)
substitute x
1: 1 3
2: 2 7
test2 <- function(dt, a){
dt[,list(x=sum(x)), by=eval(substitute(a), parent.frame())]
}
test2(dt, y)
substitute x
1: 1 3
2: 2 7
You didn't reproduce the full error:
test2(dt, y)
Error in `[.data.table`(dt, , list(x = sum(x)), by = deparse(substitute(a))) :
'by' appears to evaluate to column names but isn't c() or key(). Use by=list(...) if you can. Otherwise, by=eval(deparse(substitute(a))) should work. This is for efficiency so data.table can detect which columns are needed.
As suggested (or perhaps merely hinted at), you can succeed simply by wrapping the expression in c():
test2 <- function(dt, a){
dt[,list(x=sum(x)), by=c(deparse(substitute(a)))]
}
> test2(dt, y)
y x
1: 1 3
2: 2 7
I think the c() forces an evaluation.
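If the goal is just a function that groups by a column passed unquoted, it may be easier to turn the name into a string first and hand data.table a plain character vector. A sketch assuming data.table is installed; sum_by and col are hypothetical names.

```r
library(data.table)

dt <- data.table(x = 1:4, y = c(1, 1, 2, 2), z = c(1, 2, 1, 2))

# Hypothetical helper: group by a column given as a bare name
sum_by <- function(dt, a) {
  col <- deparse(substitute(a))          # "y" when called as sum_by(dt, y)
  dt[, list(x = sum(x)), by = c(col)]    # by receives a character vector
}

res <- sum_by(dt, y)
res$x
# [1] 3 7
```

Doing the deparse outside the [ call keeps the by argument simple enough for data.table to recognise it as column names.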

Subset data.table by logical column

I have a data.table with a logical column. Why can't the name of the logical column be used directly as the i argument? See the example.
library(data.table)
dt <- data.table(x = c(T, T, F, T), y = 1:4)
# Works
dt[dt$x]
dt[!dt$x]
# Works
dt[x == T]
dt[x == F]
# Does not work
dt[x]
dt[!x]
From ?data.table
Advanced: When i is a single variable name, it is not considered an
expression of column names and is instead evaluated in calling scope.
So dt[x] will try to evaluate x in the calling scope (in this case the global environment)
You can get around this by using ( or { or force
dt[(x)]
dt[{x}]
dt[force(x)]
x is not defined in the global environment. If you try this,
> with(dt, dt[x])
x y
1: TRUE 1
2: TRUE 2
3: TRUE 4
It would work. Or this:
> attach(dt)
> dt[!x]
x y
1: FALSE 3
EDIT:
According to the documentation, the j parameter takes a column name. Indeed:
> dt[x]
Error in eval(expr, envir, enclos) : object 'x' not found
> dt[j = x]
[1] TRUE TRUE FALSE TRUE
The i parameter, in turn, takes a numeric or logical expression (which x itself should be). However, data.table does not seem to treat x as a logical column unless it is wrapped like this:
> dt[i = x]
Error in eval(expr, envir, enclos) : object 'x' not found
> dt[i = as.logical(x)]
x y
1: TRUE 1
2: TRUE 2
3: TRUE 4
This should also work and is arguably more natural:
setkey(dt, x)
dt[J(TRUE)]
dt[J(FALSE)]

Resources