I'm trying to write my own function and need to pass a column name into it as an argument.
Use mtcars dataset as an example:
get.param <- function(data, var){
data %>% select(eval(var)) %>%
head()
}
get.param(mtcars, 'hp')
In the above function, replacing eval() with get() gave me an error.
I'm a little confused about which one I should use. I use get() in some other functions and it works. What is the difference between the two?
You can get it to work via
get.param <- function(data, var){
var <- enquo(var)
data %>% select(!!var) %>%
head()
}
get.param(mtcars, hp)
hp
Mazda RX4 110
Mazda RX4 Wag 110
Datsun 710 93
Hornet 4 Drive 110
Hornet Sportabout 175
Valiant 105
Normally one does not use get or eval with dplyr. See the vignette in the rlang package for how it is done with that package; however, in this particular case one can just pass var directly to select adding parentheses around it so that it does not confuse it with a column called "var" should it exist. If you are not worried about that edge case you could omit the parentheses.
get.param <- function(data, var) {
data %>% select((var)) %>% head
}
get.param(mtcars, 'hp')
giving:
hp
Mazda RX4 110
Mazda RX4 Wag 110
Datsun 710 93
Hornet 4 Drive 110
Hornet Sportabout 175
Valiant 105
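In more recent releases of dplyr, wrapping the argument in all_of() is another way to say that var holds column names rather than being a column itself. The following is only a sketch: the function name get_param_all_of is made up for illustration, and it assumes a dplyr version that exports all_of().

```r
library(dplyr)

# Hypothetical variant of get.param(); all_of() marks `var` as a
# character vector of column names, not a column called "var".
get_param_all_of <- function(data, var) {
  data %>%
    select(all_of(var)) %>%
    head()
}

get_param_all_of(mtcars, "hp")
get_param_all_of(mtcars, c("hp", "mpg"))
```

Like the parenthesised version, this avoids the clash with a column named "var", and it also accepts several names at once.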
Another possibility is to use ... as shown below, which gives the same answer. In this variation we don't need the parentheses to eliminate the edge case. It also allows multiple columns to be specified.
get.param <- function(data, ...) {
data %>% select(...) %>% head
}
get.param(mtcars, 'hp')
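On the original question of eval() versus get(), a small base R sketch may help. It shows only the general behaviour of the two functions outside dplyr; how get() behaves inside select() can depend on the dplyr version, so treat this as a likely explanation rather than a definitive one.

```r
var <- "hp"

# eval() on a character string simply returns the string unchanged,
# and select() accepts a string as a column name.
eval(var)

# get() looks up an object *named* "hp". No such object exists here,
# so the lookup fails; the error message is captured instead of stopping.
res <- tryCatch(get(var), error = function(e) conditionMessage(e))
res

# get() succeeds once an object with that name exists.
hp <- 1:3
get(var)
```

So eval(var) hands the string "hp" to select(), while get(var) tries to find a variable called hp, which normally exists only as a column of the data frame.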
I want to use the name of the object passed to a function to build the name of an output object created in the global environment.
fun_mtcars <- function(name_ref,...){
df <- name_ref %>%
select(mpg,...)
.GlobalEnv$selec_name_ref <- df
}
fun_mtcars(mtcars,disp)
In the global environment, a new data frame was created with the name "selec_name_ref", but I want it to be named "selec_mtcars".
I could do selec_mtcars <- fun_mtcars(mtcars,disp),
but I have a lot of function calls to execute.
We may extract the object name as a string with deparse/substitute, use that in paste0 to build the new object name, and assign to .GlobalEnv with [[ instead of $:
fun_mtcars <- function(name_ref,...){
name_ref_str <- deparse(substitute(name_ref))
df <- name_ref %>%
select(mpg,...)
.GlobalEnv[[paste0("select_", name_ref_str)]] <- df
}
-checking
fun_mtcars(mtcars,disp)
> head(select_mtcars)
mpg disp
Mazda RX4 21.0 160
Mazda RX4 Wag 21.0 160
Datsun 710 22.8 108
Hornet 4 Drive 21.4 258
Hornet Sportabout 18.7 360
Valiant 18.1 225
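The same deparse/substitute idea also works with assign() and an explicit target environment, which makes it easier to try out without touching the global environment. This is a sketch only: the function name select_into and its env argument are assumptions added for illustration.

```r
library(dplyr)

# Hypothetical helper: `env` is an assumed extra argument that says
# where the result should be stored (the global environment by default).
select_into <- function(name_ref, ..., env = globalenv()) {
  # Capture the name of the object the caller passed, e.g. "mtcars"
  name_ref_str <- deparse(substitute(name_ref))
  df <- name_ref %>% select(mpg, ...)
  # Create "select_<name>" in the chosen environment
  assign(paste0("select_", name_ref_str), df, envir = env)
  invisible(df)
}

results <- new.env()
select_into(mtcars, disp, env = results)
ls(results)
```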
I am looking for an easy way to make my function work both with input that comes from Shiny (i.e. string input) and with the typical interactive use that Tidyverse functions enable through NSE, without needing to duplicate my code to handle each case separately.
An example of usage:
library(dplyr)
flexible_input <- function(var){
mtcars %>%
select(var)
}
# This works for NSE
nse_input <- function(var){
mtcars %>%
select({{ var }})
}
# This works for Shiny, but now I am essentially duplicating my code
shiny_input <- function(var){
mtcars %>%
select(.data[[var]])
}
flexible_input(mpg)
flexible_input('mpg')
If flexible_input needs to accept both string and unquoted input, convert the argument to a symbol with ensym() and unquote it with !!:
flexible_input <- function(var){
mtcars %>%
dplyr::select(!! rlang::ensym(var))
}
-testing
> flexible_input(mpg) %>% head
mpg
Mazda RX4 21.0
Mazda RX4 Wag 21.0
Datsun 710 22.8
Hornet 4 Drive 21.4
Hornet Sportabout 18.7
Valiant 18.1
> flexible_input("mpg") %>% head
mpg
Mazda RX4 21.0
Mazda RX4 Wag 21.0
Datsun 710 22.8
Hornet 4 Drive 21.4
Hornet Sportabout 18.7
Valiant 18.1
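If several columns are needed, the same approach can be extended with rlang::ensyms() and the !!! operator. The sketch below assumes a function name, flexible_inputs, that does not appear in the original answer.

```r
library(dplyr)

# Hypothetical multi-column variant: each argument may be a bare
# column name or a string; ensyms() turns all of them into symbols.
flexible_inputs <- function(...) {
  vars <- rlang::ensyms(...)
  mtcars %>%
    select(!!!vars)
}

# Mixed unquoted and string input
flexible_inputs(mpg, "hp") %>% head()
```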
I often create a "vector" of the variables I use most often while I'm coding. Usually if I just pass the vector object to select, it works perfectly. Is there any way I can include the helper functions in it as strings?
For example I could do
library(dplyr)
x = c('matches("cyl")')
mtcars %>%
select_(x)
but this is not preferable because 1) select_ is deprecated and 2) it's not scalable (i.e., x = c('hp', 'matches("cyl")') will not grab both of the relevant columns).
Is there any way I could use tidyselect helper functions as part of a vector?
Note: if I do something like:
x = c(matches("cyl"))
#> Error: `matches()` must be used within a *selecting* function.
#> ℹ See <https://tidyselect.r-lib.org/reference/faq-selection-context.html>.
I get an error, so I'll definitely need to enquo it somehow.
You are trying to turn a string into code, which might not be the best approach. However, you can use parse_exprs with !!!.
library(dplyr)
library(rlang)
x = c('matches("cyl")')
mtcars %>% select(!!!parse_exprs(x))
# cyl
#Mazda RX4 6
#Mazda RX4 Wag 6
#Datsun 710 4
#Hornet 4 Drive 6
#Hornet Sportabout 8
#...
x = c('matches("cyl")', 'hp')
mtcars %>% select(!!!parse_exprs(x))
# cyl hp
#Mazda RX4 6 110
#Mazda RX4 Wag 6 110
#Datsun 710 4 93
#Hornet 4 Drive 6 110
#Hornet Sportabout 8 175
#....
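If the goal is only to reuse a set of selections, one alternative that avoids parsing strings into code is to store the plain column names and the regular expression separately and pass them to all_of() and matches(). This is a sketch under that assumption; the variable names cols, pattern and picked are made up for illustration.

```r
library(dplyr)

cols <- c("hp", "wt")   # plain column names, stored as strings
pattern <- "cyl"        # regular expression handed to matches()

picked <- mtcars %>%
  select(all_of(cols), matches(pattern))

head(picked)
```

The helpers are still called inside select(), so the selection-context error does not arise, and only data (not code) is stored in the vectors.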
I'm wondering if there is a way to select a group of columns by the name of the first column in the group and then all the next columns either a) to the end of the data frame, or b) to another column, also using its name.
a) As an example for the first question, in the mtcars dataset, is there a way to select the columns from drat to the end of the data frame? (Something like mtcars[,'drat':ncol(mtcars)])
b) For the second question, is there a way to select the columns starting at cyl and ending at wt? (Something like mtcars[,'cyl':'wt'])
Many elegant solutions have already been provided, but one can also use base R to get the desired result, using which:
Ans a:
mtcars[,which(names(mtcars) == "drat"):ncol(mtcars)]
Ans b:
mtcars[,which(names(mtcars) == "cyl"):which(names(mtcars) == "wt")]
# cyl disp hp drat wt
#Mazda RX4 6 160.0 110 3.90 2.620
#Mazda RX4 Wag 6 160.0 110 3.90 2.875
#Datsun 710 4 108.0 93 3.85 2.320
#Hornet 4 Drive 6 258.0 110 3.08 3.215
#Hornet Sportabout 8 360.0 175 3.15 3.440
#......so on
We can do this with select from dplyr
Answer a)
mtcars %>% select(drat:get(last(names(.))))
Answer b)
mtcars %>% select(cyl:wt)
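For case a), recent versions of dplyr also provide the last_col() helper, which may read more directly than looking up the last name with get(). A minimal sketch, assuming a dplyr version that exports last_col():

```r
library(dplyr)

# Select from drat through whatever the last column happens to be
to_end <- mtcars %>%
  select(drat:last_col())

names(to_end)
```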
In dplyr, the select function does exactly this (no quotes needed):
mtcars %>%
select(cyl:wt)
If we need to use a quoted string, convert it to a symbol with sym and then unquote it with !!:
mtcars %>%
select(!! (rlang::sym("cyl")): !!(rlang::sym("wt")))
This is most useful when the column names are stored in objects:
a <- "cyl"
b <- "wt"
mtcars %>%
select(!! (rlang::sym(a)): !!(rlang::sym(b)))
Or another option is
mtcars %>%
select(!! rlang::parse_expr(glue::glue("{a}:{b}")))
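Another way to express a range between two names stored as strings is to wrap each end in all_of(). This is offered as a sketch, assuming a dplyr/tidyselect version where all_of() is available and can be used on either side of the : operator.

```r
library(dplyr)

a <- "cyl"
b <- "wt"

# all_of() resolves each string to a column; `:` then takes the range
rng <- mtcars %>%
  select(all_of(a):all_of(b))

names(rng)
```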
EXAMPLE DATASET:
mtcars
mpg cyl disp hp drat wt ...
Mazda RX4 21.0 6 160 110 3.90 2.62 ...
Mazda RX4 Wag 21.0 6 160 110 3.90 2.88 ...
Datsun 710 22.8 4 108 93 3.85 2.32 ...
............
Recommended ggplot way:
ggplot(mtcars,aes(x=mpg)) + geom_histogram()
The way I want to do it:
ggplot(mtcars,aes(x=[,1]) +geom_histogram
or
ggplot(mtcars,aes(x=[[1]]))+geom_histogram
Why can't ggplot let me refer to my variable by its column position? I need to refer to it by column number, not name. Why is ggplot so strict here? Is there any workaround for this?
The problem you're facing is that aes() evaluates its arguments as expressions within the data.frame that you pass it, so it expects bare column names. A column name held as a string, or looked up by position, can't be evaluated the same way.
Fortunately, there is a solution: use the aes_string option, as follows:
library(ggplot2)
my_data <- mtcars
names(my_data)
ggplot(my_data, aes_string(x=names(my_data)[1]))+
geom_histogram()
This works because names(my_data)[1] returns a string, and is perfectly acceptable for the aes_string option.
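In recent ggplot2 releases aes_string() is marked as deprecated, and the .data pronoun is the usual replacement for string input. The sketch below assumes a ggplot2 version that supports .data inside aes(); the variable name first_col and the bins value are choices made for illustration.

```r
library(ggplot2)

my_data <- mtcars
first_col <- names(my_data)[1]   # look the column up by position

# .data[[...]] lets aes() take the column name from a string
p <- ggplot(my_data, aes(x = .data[[first_col]])) +
  geom_histogram(bins = 30) +
  xlab(first_col)

p
```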