Evaluate dataframe$column expression stored as a string value - r

Can a string of the form below be evaluated so that it is equivalent to the same "literal" expression?
Example data and code:
df.name = data.frame(col1 = 1:5, col2 = LETTERS[seq(1:5)], col3 = letters[seq(1:5)], stringsAsFactors = FALSE)
col.name = "col2"
row.num = "4"
var1 = str_c("df.name$", col.name,"[",row.num,"]")
> var1
[1] "df.name$col2[4]"
The literal works as expected
> df.name$col2[4]
[1] "D"
get() is not equivalent:
get(var1)
## Error in get(var1) : object 'df.name$col2[4]' not found
This form of get() "works" but does not solve the problem
get("df.name")$col2[4]
[1] "D"
Per other posts I've tried eval(parse()) and eval(parse(text())) without success.
I'm trying to create a function that will search (subset) df.name using the col.name passed to the function. I want to avoid writing a separate function for each column name, though that will work since I can code df.name$col2[row.num] as a "literal".
EDIT
The example code should have shown the row.num as type numeric / integer, i.e., row.num = 4

You are almost there:
> eval(parse(text = var1))
[1] "D"
Because parse() expects a file as its first argument by default, you need to pass the string through the text parameter.
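A minimal sketch of the same call, using base R's paste0() in place of str_c() so that no package is needed. The use of str2lang() (available from R 3.6.0) as an alternative to parse(text = ) is an addition here, not part of the answer above.

```r
df.name <- data.frame(col1 = 1:5, col2 = LETTERS[1:5], col3 = letters[1:5],
                      stringsAsFactors = FALSE)
col.name <- "col2"
row.num <- 4

# Build the expression as a string: "df.name$col2[4]"
var1 <- paste0("df.name$", col.name, "[", row.num, "]")

# parse() needs the string in its 'text' argument, otherwise it looks for a file
res1 <- eval(parse(text = var1))

# str2lang() parses exactly one expression from a string (assumes R >= 3.6.0)
res2 <- eval(str2lang(var1))

print(res1)
print(res2)
```

Both calls evaluate the string in the calling environment, so df.name must be visible there.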

I'm trying to create a function that will search (subset) df.name using the col.name passed to the function.
Set up data:
df.name = data.frame(col1 = 1:5, col2 = LETTERS[1:5], ## seq() is unnecessary
col3 = letters[1:5],
stringsAsFactors = FALSE)
col.name = "col2"
row.num = "4"
Solving your ultimate (index the data frame by column name) rather than your proximal (figure out how to use get()/eval() etc.) question: as @RichardScriven points out,
f <- function(col.name, row.num, data = df.name) {
  ## [[ ]] accepts the column name as a string
  return(data[[col.name]][as.numeric(row.num)])
}
should work. It would probably be more idiomatic if you specified the row number as numeric rather than character, if possible ...
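A short sketch of the numeric variant suggested above. The helper name get_cell and the argument order are hypothetical choices for illustration; the data frame is passed explicitly rather than taken from a default.

```r
df.name <- data.frame(col1 = 1:5, col2 = LETTERS[1:5], col3 = letters[1:5],
                      stringsAsFactors = FALSE)

# Hypothetical helper: look up cells by column name (string) and row number(s)
get_cell <- function(data, col.name, row.num) {
  stopifnot(col.name %in% names(data))  # fail early on an unknown column
  data[[col.name]][row.num]
}

one_cell <- get_cell(df.name, "col2", 4)
two_cells <- get_cell(df.name, "col3", c(1, 5))

print(one_cell)
print(two_cells)
```

Because row.num is numeric, no as.numeric() conversion is needed, and a vector of row numbers works as well.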

Related

R function that selects certain columns from a dataframe

I am trying to figure out how to write a function in R that can select specific columns from a dataframe (df) for subsetting:
Essentially I have df with columns or colnames : count_A.x, count_B.x, count_C.x, count_A.y, count_B.y, count_C.y.
I would ideally like a function where I can select both "count_A.x" and "count_A.y" columns by simply specifying count_A in function argument.
I tried the following:
e.g. pull_columns2 <- function(df,count_char){
df_subset<- df%>% select(,c(count_char.x, count_char.y))
}
Unfortunately, when I run the call below, it rightly complains that column count_char.x does not exist; count_char.x is not "converted" to count_A.x:
pull_columns2(df, count_A)
We can use
pull_columns2 <- function(df,count_char){
df_subset<- df %>% select(contains(count_char))
df_subset
}
#> then use it as follows
df %>% pull_columns2("count_A")
Try
select_func = function(df, pattern){
return(df[colnames(df)[which(grepl(pattern, colnames(df)))]])
}
df = data.frame("aaa" = 1:10, "aab" = 1:10, "bb" = 1:10, "ca" = 1:10)
select_func(df,"b")
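Both answers above match the pattern anywhere in the column name, so "count_A" would also pick up a column such as count_AB.x. The sketch below is a base R alternative that builds the exact names instead. It assumes the suffixes are always ".x" and ".y"; the helper name pull_columns and the sample data are hypothetical.

```r
# Hypothetical sample data with the column layout described in the question
df <- data.frame(count_A.x = 1:3, count_B.x = 4:6,
                 count_A.y = 7:9, count_B.y = 10:12)

# Hypothetical helper: select <prefix>.x and <prefix>.y by exact name
pull_columns <- function(df, prefix) {
  wanted <- paste0(prefix, c(".x", ".y"))
  # intersect() silently skips a suffix that is not present in df
  df[, intersect(wanted, names(df)), drop = FALSE]
}

res <- pull_columns(df, "count_A")
print(names(res))
```

As in the answers above, the prefix is passed as a quoted string rather than a bare name.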

Convert a quosure with dashes to a string?

When I do:
> quo(DLX6-AS1)
The output is:
<quosure>
expr: ^DLX6 - AS1
env: global
Which inserts spaces around the dash.
When I try to convert that to a string, I get either:
quo(DLX6-AS1) %>% quo_name
"DLX6 - AS1"
or
quo(DLX6-AS1) %>% rlang::quo_name
or
quo(`DLX6-AS1`) %>% rlang::quo_name
Error: Can't convert a call to a string
How can I make it possible to use strings with dashes in my function? The function takes in a gene name and looks up that row in a dataframe, but some of the genes are concatenated by a dash:
geneFn <- function(exp.df = seurat.object@data, gene = SOX2) {
gene <- enquo(gene)
exp.df <- exp.df[as_name(gene), ]
}
> geneFn(DLX6-AS1)
Thanks!
This has been asked before here: https://github.com/r-lib/rlang/issues/770 , but it doesn't answer how to actually do this.
What version of rlang do you have? For me this works:
quo(`DLX6-AS1`) %>% quo_name()
#> [1] "DLX6-AS1"
You do need to use backticks when column names have special characters, otherwise they are interpreted as code.
Note that as_name() or as_label() is recommended over quo_name(); the name quo_name() is misleading and the function might be deprecated in the future.
One option would be to stick with bare row names but wrap names that aren't syntactically valid (like names with dashes) in backticks. This could be confusing if someone else is supposed to use this function.
Here's a small, reproducible example:
library(rlang)
dat = data.frame(x1 = letters[1:2],
x2 = LETTERS[1:2])
row.names(dat) = c("DLX6-AS1", "other")
geneFn <- function(exp.df = dat, gene = other) {
gene <- enquo(gene)
exp.df[as_name(gene), ]
}
geneFn(gene = other)
# x1 x2
# other b B
geneFn(gene = `DLX6-AS1`)
# x1 x2
# DLX6-AS1 a A
If you have many names like this, it may be simpler to pass quoted names instead of bare names. This also simplifies the function a bit since you don't need tidyeval.
geneFn2 <- function(exp.df = dat, gene = "other") {
exp.df[gene, ]
}
geneFn2(gene = "other")
# x1 x2
# other b B
geneFn2(gene = "DLX6-AS1")
# x1 x2
# DLX6-AS1 a A
Another option is to make the row names syntactically valid. The make.names() function can help with this.
make.names( row.names(dat) )
[1] "DLX6.AS1" "other"
Then you could assign these new row names to replace the old and go ahead with your original function with the new names.
row.names(dat) = make.names( row.names(dat) )
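If the original names still matter after renaming, a lookup from the syntactic name back to the original can be kept alongside. This is a small base R sketch; the lookup vector is an assumption added for illustration, not part of the answer above.

```r
dat <- data.frame(x1 = letters[1:2], x2 = LETTERS[1:2])
row.names(dat) <- c("DLX6-AS1", "other")

# Named vector: names are the syntactic versions, values are the originals
lookup <- setNames(row.names(dat), make.names(row.names(dat)))

# Translate a syntactic name back to the original row name, then index
orig <- lookup[["DLX6.AS1"]]
row_vals <- dat[orig, ]

print(orig)
print(row_vals)
```

Note that make.names() can map two different names to the same result; make.names(x, unique = TRUE) avoids such collisions.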
What about:
geneFn <- function(exp.df = seurat.object@data, gene = SOX2) {
gene <- sub(" - ","-", deparse(enexpr(gene)))
exp.df <- exp.df[gene, ]
}

R NameValue from CSV String - access value via name

I am new to R and have a question not knowing how to solve it. Maybe you can help?
I do have a separated name/value input string: param1=test;param2=3;param3=140;
I would like to access a value via its name in R.
Something like using
myParams["param1"]
I already tried something like:
input = "param1=test;param2=3;param3=140;"
output1 = strsplit(input,";")[[1]]
output2 = do.call(rbind, strsplit(output1, "="))
to get a matrix but am missing the rest..
You could define a custom function myParams:
# Your sample data
input = "param1=test;param2=3;param3=140;"
output1 = strsplit(input,";")[[1]]
output2 = do.call(rbind, strsplit(output1, "="))
# Define function
myParams <- function(par, df = output2) {
return(df[which(df[, 1] == par), 2])
}
myParams("param1");
#[1] "test"
myParams("param2");
#[1] "3"
A simple way would be to create a dataframe out of that matrix first and then access the value via row names
input = "param1=test;param2=3;param3=140;"
output1 = strsplit(input,";")[[1]]
output2 = do.call(rbind, strsplit(output1, "="))
temp = data.frame(output2,row.names = TRUE)
# X2
#param1 test
#param2 3
#param3 140
# the names are row names, so they go in the row position of [ , ]
temp["param1", ]
#test
temp["param2", ]
#3
temp["param3", ]
#140
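Since the question asks for myParams["param1"]-style access, a named character vector is another possibility. The sketch below is one way to build it in base R; it is an alternative to the two answers above, not a restatement of them.

```r
input <- "param1=test;param2=3;param3=140;"

# Split into "name=value" pieces, then split each piece at "="
pairs <- strsplit(strsplit(input, ";")[[1]], "=")

# Values become the vector elements, names become the vector names
myParams <- setNames(vapply(pairs, `[`, character(1), 2),
                     vapply(pairs, `[`, character(1), 1))

p1 <- myParams[["param1"]]
p3 <- as.numeric(myParams[["param3"]])  # values are strings until converted

print(p1)
print(p3)
```

Single brackets, myParams["param1"], return the value with its name attached; double brackets return the bare value.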

extract string using substring in column data.frame using another column's integer as argument

If I have a data.frame how can I use the v2 values to substring v1.
df <- data.frame(v1 = c("jsdlfkjs", "fjdslkkkkfj", "jdkskksjdjslak"),
v2 = c(3,4,2))
I want to apply something like this:
res <- substring(df$v1, start = df$v2-1, stop = df$v2+1)
and get
res
# [1] "sdl" "dsl" "jdk"
You're using the wrong argument names for substring(). Look at ?substring for more information. substring() takes first and last; start and stop belong to substr().
res <- substring(df$v1, first = df$v2-1, last = df$v2+1)
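The two spellings side by side, as a runnable sketch. stringsAsFactors = FALSE is added here as an assumption so that v1 is a character column on older R versions too.

```r
df <- data.frame(v1 = c("jsdlfkjs", "fjdslkkkkfj", "jdkskksjdjslak"),
                 v2 = c(3, 4, 2),
                 stringsAsFactors = FALSE)

# substring() names its position arguments 'first' and 'last'
res <- substring(df$v1, first = df$v2 - 1, last = df$v2 + 1)

# substr() names the same positions 'start' and 'stop'
res2 <- substr(df$v1, start = df$v2 - 1, stop = df$v2 + 1)

print(res)
print(res2)
```

Both functions are vectorised over the position arguments, so each string is cut at its own v2 value.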

zoo create new column with dynamic column name

I am trying to add a column to a zoo object. I found merge which works well
test = zoo(data.frame('x' = c(1,2,3)))
test = merge(test, 'x1' = 0)
However when I try to name the column dynamically, it no longer works
test = merge(test, paste0('x',1) = 0)
Error: unexpected '=' in "merge(test,paste0('x',1) ="
I have been working with data frames and the same syntax works
test = data.frame('x' = c(1,2,3))
test[paste0('x',1)] = 0
Can someone help explain what the problem is and how to get around this?
Try setNames :
setNames( merge(test, 0), c(names(test), paste0("x", 1)) )
or names<-.zoo like this:
test2 <- merge(test, 0)
names(test2) <- c(names(test), paste0("x", 1))
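The underlying issue is that the left-hand side of = in a call must be a literal name, so it cannot be computed with paste0(). A named list passed through do.call() is one general way to supply a computed argument name. The sketch below shows this on a plain data frame with cbind() so that it runs without zoo; whether the same pattern gives identical column names with merge() on a zoo object is not checked here.

```r
test <- data.frame(x = c(1, 2, 3))

# Compute the new column name at run time
new_name <- paste0("x", 1)

# A named list carries the computed name; do.call() turns it into 'x1 = 0'
extra <- setNames(list(0), new_name)
test2 <- do.call(cbind, c(list(test), extra))

print(names(test2))
print(test2)
```

The scalar 0 is recycled to the number of rows, as it would be in cbind(test, x1 = 0).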
I found this solution very easy and elegant. It uses the eval() function to interpret a string as an R command. Thus, you are completely free to assemble the string exactly the way you want:
test = merge(test, paste0("x",1) = 0)
# does not work (see question)
test[,"x1"] <- 0
# does not work for uninitialized columns
test$x1 <- 0
# works to initialize a new column
# so lets trick R by assembling this command out of strings:
newcolumn <- "x1"
eval(parse(text=paste0("test$",newcolumn," <- 0")))
# welcome test$x1 :-)
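merge() takes the new column's name from the argument name, which has to be written literally; an expression such as paste0('x',1) is not evaluated on the left of =. Passing a variable there does not help either, since merge(test, var = 0) would name the column "var". Why not build the name first and assign it after merging:
test = zoo(data.frame('x' = c(1,2,3)))
var <- paste0('x',1)
test = merge(test, 0)
names(test)[ncol(test)] <- var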
