data.table CJ with string as input

How can CJ be run with a string as input? The following MWE illustrates what is needed:
library(data.table)
# This is the desired output (when needed.cols==2)
dt.wanted <- CJ(X.1=c(1L, 2L), X.2=c(1L, 2L))
# Here is an example with needed.cols as variable
needed.cols <- 2L
use.text <- paste0("X.", 1L:needed.cols, "=c(1L, 2L)", collapse=", ")
# Here are some failing attempts
dt.fail <- CJ(use.text)
dt.fail <- CJ(eval(use.text))
dt.fail <- CJ(get(use.text))
So it is the use.text I want to make scriptable (because it varies, not only with needed.cols).

IIUC, you are looking for a function to pass a list of arguments into ... of a function. You can do it using do.call as follows:
do.call(CJ, eval(parse(text=paste0("list(",use.text,")"))))
Hope that is what you are looking for...
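If the column names and values are available as R objects rather than only as text, a sketch that avoids parse() altogether is to build a named list and pass it to do.call. The names col.names and cj.args are made up for illustration, and the code assumes every column takes the same values c(1L, 2L), as in the question.

```r
library(data.table)

needed.cols <- 2L

# Hypothetical helper objects: one name per column, one value vector per name
col.names <- paste0("X.", seq_len(needed.cols))
cj.args <- setNames(rep(list(c(1L, 2L)), needed.cols), col.names)

# do.call spreads the named list over CJ's ... argument
dt.built <- do.call(CJ, cj.args)
dt.built
```

This keeps the varying part as data (a list) instead of code in a string, which is usually easier to debug than eval(parse(...)).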

The get-function is the standard way of promoting a character value to a true R name value.
Is this what you want:
col.wanted =2
dt.wanted[ , get(paste0("X.", col.wanted) )]
#[1] 1 2 1 2
Getting multiple columns based on evaluation of a more complex expression might require somewhat more baroque efforts:
> use.text <- paste0("list(", paste0("X.", 1L:needed.cols, collapse=", "),")")
> use.text
[1] "list(X.1, X.2)"
> dt.wanted[ , eval(use.text)]
[1] "list(X.1, X.2)"
> dt.wanted[ , parse(text=use.text)]
expression(list(X.1, X.2))
> dt.wanted[ , eval(parse(text=use.text))]
   X.1 X.2
1:   1   1
2:   1   2
3:   2   1
4:   2   2
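If only the column names vary, a possibly simpler sketch is to keep them in a character vector and select with with = FALSE, so no text needs to be parsed. The names cols.wanted and dt.sub are illustrative only.

```r
library(data.table)

dt.wanted <- CJ(X.1 = c(1L, 2L), X.2 = c(1L, 2L))
needed.cols <- 2L

# Column names as plain strings
cols.wanted <- paste0("X.", seq_len(needed.cols))

# with = FALSE makes j be treated as a vector of column names
dt.sub <- dt.wanted[, cols.wanted, with = FALSE]
dt.sub
```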

Related

Modify a function call captured in exp

For example, changing cumsum in the output of expr(cumsum(1:3)) to cumprod.
Currently the only thing I can think of is converting the output of expr(cumsum(1:3)) to a string, editing it, then changing it back to a function call.
This seems like a pretty poor solution though and I'm struggling to find a better way.
library(rlang)
f(expr(cumsum(1:4)), cumprod)
# [1] 1 2 6 24
This is basically what I'm trying to achieve. Can you help me find a starting point?

If you apply gsub to an expression, R coerces it to a character vector and performs the substitution; you can then turn the result back into an expression with parse:
y <- 1:4
x <- expression({cumsum(y)})
x.2 <- gsub("cumsum", "cumprod", x)
class(x.2)
# [1] "character"
x.2 <- parse(text = x.2)
eval(x)
# [1] 1 3 6 10
eval(x.2)
# [1] 1 2 6 24

Here is an option using rlang
f <- function(ex, fn) {
ex1 <- as.character(ex)
fn <- enquo(fn)[-1]
eval_tidy(parse_expr(glue::glue('{fn}({ex1[-1]})')))
}
f(expr(cumsum(1:4)), cumprod)
#[1] 1 2 6 24

Note that if you replaced cumsum with cumprod the output would be a vector 4 long, not 24, so we assume you meant to replace it with prod.
We use substitute to substitute cumsum with the value of the cumsum argument and then evaluate the resulting expression.
f here uses no packages -- the input in the question uses expr from rlang but even that is not really needed since we could have used quote(...) in place of expr(...).
f <- function(.x, cumsum) eval.parent(do.call("substitute", list(.x)))
# test
f(expr(cumsum(1:4)), prod)
## [1] 24
f(expr(cumsum(1:4)), cumprod)
## [1] 1 2 6 24

I like @David Arenburg's answer, so I'm posting it here and marking it.
It's not clear to me how you decide which function you want to replace (because : is also a function). But if you always want to replace the outer one, you could define the following function
f <- function(x, y) {
  tmp <- substitute(x)
  tmp[[1]] <- substitute(y)
  eval(tmp)
}
and then use it as follows
f(cumsum(1:4), cumprod)
#[1] 1 2 6 24
– David Arenburg
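The same idea also works on a call that has already been captured, without converting it to a string. A minimal sketch in base R, assuming only the outermost function should be swapped (ex and result are illustrative names):

```r
# Capture the call without evaluating it; rlang::expr() would give the same object
ex <- quote(cumsum(1:4))

# A call can be indexed like a list: element 1 is the function, the rest are its arguments
ex[[1]] <- as.name("cumprod")

ex                  # the modified, still unevaluated call
result <- eval(ex)  # evaluate it
result
```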

How to read user input into the subset command

I have some R command like this
subset(
(aggregate(cbind(var1,var2)~Ei+Mi+hours,a, FUN=mean)),
(aggregate(cbind(var1,var2)~Ei+Mi+hours,a, FUN=mean))$Ei == c(1:EXP)
)
I want to do
1) Ask the user to input the var1 and var2
2) Get those variables into the subset command line as shown above and
continue with other things.
Note: for reading the user input I have variables like
c(ax,bx,cx,dx,ex,fx,gx,hx,ix,jx,kx,lx,mx,nx,ox) = c(1:15) and each
variable is mapped to number 1 to 15. So displaying this for user and
asking the user to select any number between 1 to 15 and then
checking the corresponding variable for the entered number and
reading this into the command line is what's the best method, I think.
So how can I implement this?
Regarding the answer:
Just wondering about one possible scenario: what if the user wants to enter multiple numbers in one go (e.g. 1,2,3)? How can these be read using readline, as in the answer below:
v1 <- quote(var1 <- as.numeric(readline('Enter Variable 1: ')))
eval(v1)
xx <- paste0(letters[1:15], 'x')
xx[var1]
How to read multiple variables in this case?

Here's a rough example of the readline interactive prompt. When v1 is evaluated, the user will be prompted to enter a value. That value is then stored as var1.
> v1 <- quote(var1 <- as.numeric(readline('Enter Variable 1: ')))
> eval(v1)
Enter Variable 1: 1000 ## user enters 1000, for example
> 100 + var1 + 50 ## example to show captured output as object
## [1] 1150
So in your case it might go something like
> v1 <- quote(var1 <- as.numeric(readline('Enter a number from 1 to 15: ')))
> eval(v1)
Enter a number from 1 to 15: 7
> var1
## [1] 7
> xx <- paste0(letters[1:15], 'x')
> xx
## [1] "ax" "bx" "cx" "dx" "ex" "fx" "gx" "hx" "ix" "jx" "kx" "lx" "mx" "nx" "ox"
> xx[var1]
## [1] "gx"
I borrowed this idea for a function from this older SO post. You can return the output invisibly and it will still take in the user values.
input.fun <- function(){
v1 <- readline("var1: ")
v2 <- readline("var2: ")
v3 <- readline("var3: ")
v4 <- readline("var4: ")
v5 <- readline("var5: ")
out <- sapply(c(v1, v2, v3, v4, v5), as.numeric, USE.NAMES = FALSE)
invisible(out)
}
> x <- input.fun()
var1: 7
var2: 4
var3: 8
var4: 5
var5: 2
> x
[1] 7 4 8 5 2
In response to your edit: I'm not sure if this is the standard method for reading multiple numbers in one line, but it works.
> xx <- readline('Enter numbers separated by a space: ')
Enter numbers separated by a space: 4 12 67 9 2
> as.numeric(strsplit(xx, ' ')[[1]])
## [1] 4 12 67 9 2
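To also accept comma-separated input such as 1,2,3, one option is to split on commas and whitespace together. This is only a sketch: parse_numbers is a made-up helper name, and a fixed string stands in for the readline() result so the example runs non-interactively.

```r
# Hypothetical helper: turn "1, 2 3" into c(1, 2, 3)
parse_numbers <- function(txt) {
  # Split on any run of commas and/or whitespace
  pieces <- strsplit(trimws(txt), "[,[:space:]]+")[[1]]
  as.numeric(pieces)
}

xx <- paste0(letters[1:15], "x")

# Interactively this would be: parse_numbers(readline("Enter numbers: "))
picked <- parse_numbers("1, 2,3")
xx[picked]
```

Non-numeric entries become NA (with a warning), so the result may still need checking before it is used as an index.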

Here's a possibility using scan()
#sample data
df<-data.frame(
ax=runif(50),
bx=runif(50),
cx=runif(50),
dx=runif(50),
Ei=sample(letters[1:5], 50, replace=T)
)
#get vars
vars<-c(NA,NA)
while(any(is.na(vars))) {
cat(paste("enter var number", sum(!is.na(vars))+1),"\n")
cat(paste(seq_along(names(df)), ":", names(df)), sep="\n")
try(n<-scan(what=integer(), nmax=1), silent=T)
vars[min(which(is.na(vars)))]<-n
}
#--pause
#use vars
subset(aggregate(df[,vars], df[,c("Ei"), drop=F], FUN=mean), Ei=="a")
It's not super robust, but if you copy the first half (before the pause) it will ask you for two variable numbers, and then if you run the second half, it will use those two values. I've adjusted the aggregate and subset to be more appropriate for variable usage which means not using the formula syntax.
I did not do any error checking. That's left as an exercise for the asker.

Remove quotes from vector element in order to use it as a value

Suppose that I have a vector x whose elements I want to use to extract columns from a matrix or data frame M.
If x[1] = "A", I cannot use M$x[1] to extract the column with header name A, because M$A is recognized while M$"A" is not. How can I remove the quotes so that M$x[1] is M$A rather than M$"A" in this instance?

Don't use $ in this case; use [ instead. Here's a minimal example (if I understand what you're trying to do).
mydf <- data.frame(A = 1:2, B = 3:4)
mydf
# A B
# 1 1 3
# 2 2 4
x <- c("A", "B")
x
# [1] "A" "B"
mydf[, x[1]] ## As a vector
# [1] 1 2
mydf[, x[1], drop = FALSE] ## As a single column `data.frame`
# A
# 1 1
# 2 2
I think you would find your answer in the R Inferno. Start around Circle 8: "Believing it does as intended", one of the "string not the name" sub-sections.... You might also find some explanation in the line "The main difference is that $ does not allow computed indices, whereas [[ does." from the help page at ?Extract.
Note that this approach is taken because the question specified using the approach to extract columns from a matrix or data frame, in which case, the [row, column] mode of extraction is really the way to go anyway (and the $ approach would not work with a matrix).
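For a data frame, [[ is the computed-index counterpart of $ mentioned in ?Extract. A short sketch using the same mydf and x as above (col_a, m and col_b are illustrative names):

```r
mydf <- data.frame(A = 1:2, B = 3:4)
x <- c("A", "B")

# [[ evaluates its argument, so a name stored in a variable works
col_a <- mydf[[x[1]]]

# For a matrix, use [row, column] with the name as a string
m <- as.matrix(mydf)
col_b <- m[, x[2]]

col_a
col_b
```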

Assigning NA to groups of variables using data.table

I am trying to assign NA for specific values (0 and 99) to a group of variables (9 variables, from p05_1 to p05_9) using data.table. I don't get any error, but nothing happens when I use this code:
Here is a short example:
v_1 <- c(0,0,1,2,3,4,4,99)
v_2 <- c(1,2,2,2,3,99,1,0)
dat <- data.table(v_1,v_2)
for(n in 1:9) {
char <- sprintf('p05_%s', n)
st[eval(parse(text=char)) %in% c(0,99), eval(parse(text=char)) := NA_integer_]
}
Best.

This is related to this question and answer
For data.table to switch into its "eval in j" mode, the whole call should be eval(...).
Otherwise, your call is parsed as
`:=`(eval(parse(text=char)), NA_integer_)
which [.data.table won't pick up as an attempt to use eval in j.
I haven't tested this for i, but it could be safe to do the same there anyway,
something like
for(n in 1:2) {
chari <- paste0(sprintf('v_%s' ,n), ' %in% c(0,99)')
charj <- sprintf('v_%s := NA_integer_', n)
dat[eval(parse(text=chari)), eval(parse(text=charj))]
}
should work. Note I have fudged the call to %in% to avoid sprintf giving an error using % as a regular character.

An alternative to the eval(parse(text= route, in this case :
for (n in 1:2) {
vnam = paste0("v_",n)
set(dat, which(dat[[vnam]]%in%c(0,99)), vnam, NA_integer_)
}
Note that [[ in base R doesn't take a copy of the column (it's copy-on-write), so that can be a good way to refer to a single column. Looping set and [[ can be worth it if there are a lot of columns (say 10,000+).

Here's another alternative using the replace() function:
> dat[, lapply(list(v_1, v_2), function(x) replace(x, x %in% c(0, 99), NA_integer_))]
   V1 V2
1: NA  1
2: NA  2
3:  1  2
4:  2  2
5:  3  3
6:  4 NA
7:  4  1
8: NA NA
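For completeness, a sketch that does the same replacement by reference for a whole set of columns in one call, with no loop and no eval(parse(...)). It assumes the example dat from the question; cols is an illustrative name.

```r
library(data.table)

dat <- data.table(v_1 = c(0, 0, 1, 2, 3, 4, 4, 99),
                  v_2 = c(1, 2, 2, 2, 3, 99, 1, 0))

# Names of the columns to clean, built as strings
cols <- paste0("v_", 1:2)

# .SD holds only the columns named in .SDcols; (cols) := assigns back to them
dat[, (cols) := lapply(.SD, function(x) replace(x, x %in% c(0, 99), NA)),
    .SDcols = cols]
dat
```

With the real data, cols would presumably be paste0("p05_", 1:9).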

How Can I vectorize this function to return an index vector?

I'm new to R and am trying to get a handle on the apply family of functions. Specifically, I am trying to write a higher-order function that will accept 2 character vectors, "host", and "guest" (which do not need to be the same length) and return me an index vector the same length as "host", with the resulting elements corresponding to their indices in guest (NA if not there).
host <- c("A","B","C","D")
guest <- c("D","C","A","F")
matchIndices <- function(x,y)
{
return(match(x,y))
}
This code returns 3 as expected:
matchIndices(host[1],guest)
This is the loop I'd like to be able to replace with a succinct apply function (sapply?)
for (i in 1:length(host))
{ idx <- matchIndices(host[i],guest);
cat(paste(idx,host[i],"\n",sep=";"))
}
This code "works" in that it produces the output below, but I really want the result to be a vector, and I have a hunch that one of the apply functions will do the trick. I'm just stuck on how to write it. Any help would be most appreciated. Thanks.
3;A;
NA;B;
2;C;
1;D;

host <- c("A","B","C","D")
guest <- c("D","C","A","F")
matchIndices <- function(x,y) {
return(match(x,y))
}
One (inefficient) way is to sapply over the host vector, passing in guest as an argument (note you could just simplify this to sapply(host, match, guest) but this illustrates a general way of approaching this sort of thing):
> sapply(host, matchIndices, guest)
A B C D
3 NA 2 1
However, this can be done directly using match as it accepts a vector first argument:
> match(host, guest)
[1] 3 NA 2 1
If you want a named vector as output,
> matched <- match(host, guest)
> names(matched) <- host
> matched
A B C D
3 NA 2 1
which could be wrapped into a function
matchIndices2 <- function(x, y) {
matched <- match(x, y)
names(matched) <- x
return(matched)
}
returning
> matchIndices2(host, guest)
A B C D
3 NA 2 1
If you really want the names and the matches stuck together into a vector of strings, then:
> paste(match(host, guest), host, sep = ";")
[1] "3;A" "NA;B" "2;C" "1;D"

If you want the output vector in the host;guestNum format, you could use do.call, paste, and match as follows:
> do.call(paste, list(host, sapply(host, match, guest), sep = ';'))
[1] "A;3" "B;NA" "C;2" "D;1"

sapply(host , function(x) which(guest==x))
$A
[1] 3
$B
integer(0)
$C
[1] 2
$D
[1] 1
unlist(sapply(host , function(x) which(guest==x)))
A C D
3 2 1
paste(host, sapply(host , function(x) which(guest==x)), sep=":", collapse=" ")
[1] "A:3 B:integer(0) C:2 D:1"
