I am exporting multiple data frames stored in a list to different sheets of one Excel file, and I can do this with the code below (using mtcars as an example):
library(tidyverse)
library(ImportExport)
data_list <- split(mtcars, mtcars$cyl)
table_name <- names(data_list)
# Run
excel_export(data_list, "foo.xlsx", table_names = table_name)
Then I wanted to do this another way, because the dplyr documentation says:
.y to refer to the key, a one row tibble with one column per
grouping variable that identifies the group.
So I thought .y would be equivalent to the table_name variable I created, and I tried this:
data_list %>% excel_export("foo.xlsx", table_names = .y)
Then I got an error:
Error in .jcall(wb, "Lorg/apache/poi/ss/usermodel/Sheet;",
"createSheet", : object '.y' not found
Could someone explain why this fails, and how I can do this with .y?
Any help will be highly appreciated.
If you reference a quote, I think it makes sense to use the function that the quote documents. In this case, I found it in the help for group_map (which also covers group_walk, a companion function that is called for its side effects).
You still need to group_by the data. More specifically, it needs to operate on a tbl_df (not a list), typically (but not always) a grouped tibble.
I have neither ImportExport nor xlsx (on which the former depends) installed, so I'll proxy your action with write.csv.
library(dplyr)
mtcars %>%
group_by(cyl) %>%
group_walk(~ write.csv(.x, paste0(.y, ".csv")))
The side-effect of this is that three files are created in the current directory named 4.csv, 6.csv, and 8.csv.
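To see what .x and .y hold for each group, here is a minimal sketch that builds file names instead of writing files. The "cyl_..._rows.csv" naming scheme is only an illustration and not something from the question; group_map is used in place of group_walk so that the result can be inspected.

```r
library(dplyr)

# For each group, .x is the group's rows (without the grouping column)
# and .y is a one-row tibble holding the key, here the column cyl.
file_names <- mtcars %>%
  group_by(cyl) %>%
  group_map(~ paste0("cyl_", .y$cyl, "_", nrow(.x), "_rows.csv"))

# group_map returns a list with one entry per group
unlist(file_names)
```

This should give one name per cyl group (4, 6 and 8), each combining the key taken from .y with the row count taken from .x.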
If you want to operate on a named list, you could also use one of:
# using: data_list <- split(mtcars, mtcars$cyl)
purrr::imap(data_list,
~ write.csv(.x, paste0(.y, ".csv")))
Map(write.csv, data_list, paste0(names(data_list), ".csv"))
The effect is the same.
.x and .y are not global variables that can be used anywhere; they are reserved for specific functions (usually the map family).
From ?map
.f A function, formula, or vector (not necessarily atomic).
If a formula, e.g. ~ .x + 2, it is converted to a function. There are three ways to refer to the arguments:
For a single argument function, use .
For a two argument function, use .x and .y
For more arguments, use ..1, ..2, ..3 etc
Let's take a simple list
library(purrr)
listvec1 <- list(a = 1:3, b = 4:6, c = 2:4)
1) Let's say we want to multiply every element in the list by 2. map passes a single argument to the function, so we use . here.
map(listvec1, ~. * 2)
#$a
#[1] 2 4 6
#$b
#[1] 8 10 12
#$c
#[1] 4 6 8
Coincidentally, .x works here as well:
map(listvec1, ~.x * 2)
but if you use .y, it gives an error because map passes only one argument.
map(listvec1, ~.y * 2)
Error in .f(.x[[i]], ...) : the ... list contains fewer than 2 elements
2) Let's take another list and add listvec1 to listvec2. For this, we can use map2, which takes two arguments, so here we refer to them as .x and .y.
listvec2 <- list(a = 7, b = 8, c = 9)
map2(listvec1,listvec2, ~.x + .y)
# this could be simplified to the call below, but this is just an example
#map2(listvec1,listvec2, `+`)
#$a
#[1] 8 9 10
#$b
#[1] 12 13 14
#$c
#[1] 11 12 13
In the same way, we use .x and .y in imap, where .x is the element and .y is either the name of the element (if present) or its index.
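A short sketch of the imap behaviour, reusing listvec1 from above. The label format and the second, unnamed list are made up for illustration.

```r
library(purrr)

listvec1 <- list(a = 1:3, b = 4:6, c = 2:4)

# Named list: .x is the element, .y is its name
labelled <- imap(listvec1, ~ paste0(.y, ": ", sum(.x)))

# Unnamed list: .y falls back to the position of the element
positions <- imap(list(10, 20), ~ .y)

unlist(labelled)
unlist(positions)
```

For the export in the question, this is the pattern that makes the list names available as .y without a separate table_name variable.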
In the post above .y means nothing, so using it as
data_list %>% excel_export("foo.xlsx", table_names = .y)
would definitely yield an error. You need to use the specific functions described above to use .x and .y. So if you want to use pipes for your command, you should use
data_list %>% excel_export("foo.xlsx", table_names = table_name)
I have three datasets a, b, c with identical variable names. I want to check whether these variables contain missing/invalid values.
I have a checking function check_variables() that checks missing or invalid values (for example the function could just be is.na).
While I could apply my checking function check_variables() explicitly to each of these datasets, like:
check.output = list(
a = check_variables(a),
b = check_variables(b),
c = check_variables(c)
)
purrr offers a nice all-in-one-step solution for this problem:
list(a,b,c) %>%
map(~ .x %>% check_variables())
But this step only maps check_variables() to the elements within each dataset in the list. Instead, I want check_variables() to be mapped to each dataset as a whole. Is there a way to effectively map functions to the datasets in the list instead of the elements within each dataset?
If you want to modify the independent variables you can pass a list of variable names to edit then use get and assign to access and modify them.
library(purrr)
library(magrittr)
a = list(var = 1)
b = list(var = 2)
c = list(var = 3)
# get the current environment. alternative is to use functions like
# parent.frame from within the loop but that can get confusing
e = environment()
c('a','b','c') %>%
map(function(x){
ls = get(x,envir = e)
# whatever modification you want to make on the list
ls$var = ls$var+1
assign(x,ls,envir = e)
})
Note that in real life, as @MrFlick stated, you probably don't want to do this. Keep a, b, c in a single list and your downstream analysis will be easier, since I assume they will have to be processed through the same pipeline. map will happily return your modified list, which you can either use to overwrite the original list or assign to a new variable. Alternatively, use a for loop over list indexes to modify the original list on the go or fill a pre-allocated new variable.
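A minimal sketch of the single-list approach, assuming the same toy a, b, c values as in the code above. The name datasets is hypothetical.

```r
library(purrr)

# Keep the three objects together in one named list
datasets <- list(a = list(var = 1), b = list(var = 2), c = list(var = 3))

# map returns a new list; overwrite the original with the modified copy
datasets <- map(datasets, function(ls) {
  ls$var <- ls$var + 1
  ls  # return the modified element
})

datasets$b$var
```

No get, assign or environment handling is needed here, because the function simply returns the changed element.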
If the purpose is to apply check_variables(), which takes a dataset (table) and returns a single TRUE or FALSE, then the issue might be related to the use of vectorized functions.
R and its packages have many vectorized functions, such as is.na. When such a function is applied to a vector like c(1, NA, 2) or to a data frame, it is applied to each element, giving FALSE TRUE FALSE rather than a single value such as TRUE (any element is NA) or FALSE (all elements are NA).
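The difference between the element-wise result and an aggregated one can be sketched in base R. The small data frame df below is a made-up example.

```r
x <- c(1, NA, 2)

element_wise <- is.na(x)       # one value per element
any_missing  <- any(is.na(x))  # single value: is at least one element NA?
all_missing  <- all(is.na(x))  # single value: are all elements NA?

df <- data.frame(x = c(1, 2, 3), y = c(3, NA, 5))

# Aggregate within each column, or over the whole table
per_column  <- sapply(df, function(col) any(is.na(col)))
whole_table <- any(is.na(df))
```

Here per_column has one logical per variable, while whole_table collapses the entire data frame to a single logical.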
When check_variables() is composed of such vectorized functions, we need to "aggregate" their results with functions like all or any. Furthermore, we need to control the scope of the aggregation in order to control whether check_variables() is applied to elements, to variables (columns), or to the entire table (data frame):
require(tidyverse) # in production code, import only `dplyr` and `tidyr`
require(purrr)
a = data.frame(x = c(1,2,3), y =c(3,NA,5))
b = data.frame(x = c(1,NA,3), y =c(3,4,5))
c = data.frame(x = c(1,NA,3), y =c(3,NA,4))
# apply `check.func` to variables (columns)
# aggregation has to be limited to the scope of each variable (column)
# `dplyr::summarize_all` happens to work this way
check.vars = function(list.tbls, check.func) list.tbls %>% map(~ .x %>% summarize_all(check.func) )
# apply `check.func` on the entire table
# as long as `check.func` takes a table and returns a single value
# we can directly apply this function
check.tbls = function(list.tbls, check.func) list.tbls %>% map(~ check.func(.x))
## Some sample functions
# check that no element under the scope is NA
# takes either a vector or a table, returns a boolean
has.no.na = . %>% is.na %>% any %>% `!`
# check that all elements under the scope are less than 5; NAs are counted as FALSE
# takes either a vector or a table, returns a boolean
is.lt.5 = . %>% `<`(5) %>% all %>% replace_na(FALSE)
# check that all elements under the scope are less than 5; NAs are ignored, so all-NA input gives TRUE
# takes either a vector or a table, returns a boolean
is.lt.5.rm.na = . %>% `<`(5) %>% all(na.rm = TRUE)
## Use of sample functions to check variables within each dataset
list(a,b,c) %>% check.vars(has.no.na)
list(a,b,c) %>% check.vars(is.lt.5)
## Use of sample functions to check each dataset
list(a,b,c) %>% check.tbls(has.no.na)
list(a,b,c) %>% check.tbls(is.lt.5)
list(a,b,c) %>% check.tbls(is.lt.5.rm.na)
I have a data frame and want to filter it in one of two ways, by either column "this" or column "that". I would like to be able to refer to the column name as a variable. How (in dplyr, if that makes a difference) do I refer to a column name by a variable?
library(dplyr)
df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
df
# this that
# 1 1 1
# 2 2 1
# 3 2 2
df %>% filter(this == 1)
# this that
# 1 1 1
But say I want to use the variable column to hold either "this" or "that", and filter on whatever the value of column is. Both as.symbol and get work in other contexts, but not this:
column <- "this"
df %>% filter(as.symbol(column) == 1)
# [1] this that
# <0 rows> (or 0-length row.names)
df %>% filter(get(column) == 1)
# Error in get("this") : object 'this' not found
How can I turn the value of column into a column name?
Using rlang's injection paradigm
From the current dplyr documentation (emphasis by me):
dplyr used to offer twin versions of each verb suffixed with an underscore. These versions had standard evaluation (SE) semantics: rather than taking arguments by code, like NSE verbs, they took arguments by value. Their purpose was to make it possible to program with dplyr. However, dplyr now uses tidy evaluation semantics. NSE verbs still capture their arguments, but you can now unquote parts of these arguments. This offers full programmability with NSE verbs. Thus, the underscored versions are now superfluous.
So, essentially we need to perform two steps to be able to refer to the value "this" of the variable column inside dplyr::filter():
1) We need to turn the variable column, which is of type character, into type symbol. Using base R, this can be achieved with the function as.symbol(), which is an alias for as.name(). The former is preferred by the tidyverse developers because it follows a more modern terminology (R types instead of S modes). Alternatively, the same can be achieved with rlang::sym() from the tidyverse.
2) We need to inject the symbol from 1) into the dplyr::filter() expression. This is done by the so-called injection operator !!, which is basically syntactic sugar that allows modifying a piece of code before R evaluates it.
(In earlier versions of dplyr (or the underlying rlang respectively) there used to be situations (incl. yours) where !! would collide with the single !, but this is not an issue anymore since !! gained the right operator precedence.)
Applied to your example:
library(dplyr)
df <- data.frame(this = c(1, 2, 2),
that = c(1, 1, 2))
column <- "this"
df %>% filter(!!as.symbol(column) == 1)
# this that
# 1 1 1
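To make the "modify code before R evaluates it" idea visible, here is a small sketch using rlang::expr() and rlang::eval_tidy() directly. This is an illustration of the mechanism rather than how dplyr::filter() is implemented; the name injected is hypothetical.

```r
library(rlang)

df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))
column <- "this"

# Build the expression: !! replaces sym(column) with the symbol `this`
# before anything is evaluated
injected <- expr(!!sym(column) == 1)
injected

# Evaluate the finished expression with df as the data mask
keep <- eval_tidy(injected, data = df)
df[keep, ]
```

Printing injected shows the expression after injection, which is the same code you would have typed by hand inside filter().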
Using alternative solutions
Other ways to refer to the value "this" of the variable column inside dplyr::filter() that don't rely on rlang's injection paradigm include:
Via the tidyselection paradigm, i.e. dplyr::if_any()/dplyr::if_all() with tidyselect::all_of()
df %>% filter(if_any(.cols = all_of(column),
.fns = ~ .x == 1))
Via rlang's .data pronoun and base R's [[:
df %>% filter(.data[[column]] == 1)
Via magrittr's . argument placeholder and base R's [[:
df %>% filter(.[[column]] == 1)
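Since the point of holding the column name in a variable is usually to reuse the filter, here is a sketch that wraps the .data pronoun variant in a function. The function name filter_by_column and its arguments are hypothetical, and the sketch assumes the data has no column called value that could shadow the argument.

```r
library(dplyr)

# Hypothetical helper: `column` is a string naming the column to filter on
filter_by_column <- function(data, column, value) {
  data %>% filter(.data[[column]] == value)
}

df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))

by_this <- filter_by_column(df, "this", 1)
by_that <- filter_by_column(df, "that", 1)
```

The same call now works for either column, which is what the question asks for.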
I would steer clear of using get() altogether. It seems like it would be quite dangerous in this situation, especially if you're programming. You could use either an unevaluated call or a pasted character string, but you'll need to use filter_() instead of filter().
df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
column <- "this"
Option 1 - using an unevaluated call:
You can hard-code y as 1, but here I show it as y to illustrate how you can change the expression values easily.
expr <- lazyeval::interp(quote(x == y), x = as.name(column), y = 1)
## or
## expr <- substitute(x == y, list(x = as.name(column), y = 1))
df %>% filter_(expr)
# this that
# 1 1 1
Option 2 - using paste() (and obviously easier):
df %>% filter_(paste(column, "==", 1))
# this that
# 1 1 1
The main thing about these two options is that we need to use filter_() instead of filter(). In fact, from what I've read, if you're programming with dplyr you should always use the *_() functions.
I used this post as a helpful reference: character string as function argument r, and I'm using dplyr version 0.3.0.2.
Here's another solution for the latest dplyr version:
df <- data.frame(this = c(1, 2, 2),
that = c(1, 1, 2))
column <- "this"
df %>% filter(.[[column]] == 1)
# this that
#1 1 1
Regarding Richard's solution, I just want to add that if the column is of type character, you can use shQuote to filter by character values.
For example, you can use
df %>% filter_(paste(column, "==", shQuote("a")))
If you have multiple filters, you can specify collapse = "&" in paste.
df %>% filter_(paste(c("column1","column2"), "==", shQuote(c("a","b")), collapse = "&"))
The latest way to do this is to use my.data.frame %>% filter(.data[[myName]] == 1), where myName is a variable in the calling environment that contains the column name.
Or using filter_at
library(dplyr)
df %>%
filter_at(vars(column), any_vars(. == 1))
Like Salim B explained above but with a minor change:
df %>% filter(1 == !!as.name(column))
i.e. just reverse the condition, because !! otherwise behaves like
!!(as.name(column)==1)
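A small base R sketch of where this grouping comes from: to R's own parser, !! is simply two negations, and ! binds more loosely than ==. The name x is a placeholder. As an earlier answer notes, current rlang versions treat !! specially, so this shows only the plain parse.

```r
# Parse (but do not evaluate) the expression
parsed <- quote(!!x == 1)

outer_call <- as.character(parsed[[1]])       # outermost function
inner_call <- as.character(parsed[[2]][[1]])  # function one level down
innermost  <- deparse(parsed[[2]][[2]])       # what the two negations wrap

c(outer_call, inner_call, innermost)
```

The comparison ends up inside both negations, which is the grouping the reversed condition avoids.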
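You can use the across(all_of()) syntax, which takes a string as its argument (note that recent dplyr versions deprecate across() inside filter() in favour of if_any()/if_all()):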
column = "this"
df %>% filter(across(all_of(column)) == 1)