I'm using the mtcars dataset for this example.
I have a function which creates a named list using a variable:
make_list <- function(df, variable_name) {
a <- df %>%
list(variable_name = .)
return(a)
}
When I use this function:
mylist <- make_list(mtcars, "car_info")
head(mylist)
$variable_name
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
The list element is named variable_name rather than car_info.
How do I change the function (while still using the pipe) so that the name passed in variable_name is used?
If you want to continue using the pipe, you can use setNames:
make_list <- function(df, variable_name) {
df %>%
list() %>%
setNames(variable_name)
}
make_list(mtcars, "car_info")
Output:
$car_info
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Alternatively, assign the name in a separate step:
make_list <- function(df, variable_name) {
a <- df %>% list
names(a) <- variable_name
return(a)
}
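To see why the original version always yields variable_name, it may help to compare the two calls side by side. The following is a minimal base R sketch (the toy vector 1:3 stands in for the data frame and is only an illustration): argument names inside list() are taken literally, whereas setNames() receives the name as an ordinary value.

```r
variable_name <- "car_info"

# Names on the left of `=` inside list() are used literally, never evaluated.
literal <- list(variable_name = 1:3)
names(literal)
#> [1] "variable_name"

# setNames() takes the name as a regular argument, so the variable is evaluated.
dynamic <- setNames(list(1:3), variable_name)
names(dynamic)
#> [1] "car_info"
```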
Try this:
make_list <- function(df, variable_name) {
a <- df %>%
list()
names(a) <- variable_name
return(a)
}
mylist <- make_list(mtcars, "car_info")
Output (Some rows):
mylist
$car_info
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
rlang has a list2() function that supports dynamic names through !! and the := operator:
make_list <- function(df, variable_name) {
rlang::list2(!! variable_name := df)
}
make_list(mtcars, "car_info")
#> $car_info
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
Or tibble::lst works the same:
make_list <- function(df, variable_name) tibble::lst(!! variable_name := df)
Related
Using the R inbuilt dataset
mtcars
I want to make a column called "want".
mtcars$want<-NA
When column "carb" (Column A) is equal to 1, copy the value of column "qsec" (Column B) into column "want" (Column C).
If carb is not equal to 1, do nothing.
The first 5 rows of the new dataset should look like this:
mpg cyl disp hp drat wt qsec vs am gear carb want
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 NA
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 NA
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 18.61
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 19.44
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 NA
This should do the job:
mtcars$want <- ifelse(mtcars$carb == 1, mtcars$qsec, NA)
head(mtcars, 5)
mpg cyl disp hp drat wt qsec vs am gear carb want
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 NA
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 NA
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 18.61
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 19.44
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 NA
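An alternative sketch, assuming you would rather leave the other rows untouched than rebuild the whole column: create the column first, then assign only to the rows where carb is 1 using a logical index. The names cars and is_one are illustrative and not part of the question.

```r
cars <- mtcars                      # work on a copy of the built-in data
cars$want <- NA_real_               # numeric NA so the column stays numeric

# Logical index of the rows that meet the condition.
is_one <- cars$carb == 1

# Only the matching rows are overwritten; all others keep NA.
cars$want[is_one] <- cars$qsec[is_one]

head(cars[, c("carb", "qsec", "want")], 5)
```

The same pattern extends to further conditions by repeating the last two steps with a different index.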
If you only want blank cells in the printout, you could try the following (note that this converts the "want" column to character, so the blanks are empty strings rather than NA):
mtcars$want <- ifelse(mtcars$carb == 1, mtcars$qsec, "")
head(mtcars, 5)
mpg cyl disp hp drat wt qsec vs am gear carb want
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 18.61
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 19.44
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
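Because ifelse() with "" as the fallback returns a character vector, one hedged option is to keep the numeric column with NA in the data and build a separate copy only for display. This is a sketch; the names df and display are illustrative.

```r
df <- head(mtcars, 5)
df$want <- ifelse(df$carb == 1, df$qsec, NA)   # stays numeric, with NA

# Display-only copy: format the numbers, then blank out the missing ones.
display <- df
display$want <- ifelse(is.na(df$want), "", format(df$want))

class(df$want)       # numeric column is preserved in the data
class(display$want)  # character column used only for printing
print(display)
```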
If it is helpful, a loop over the rows also works. You can modify the loop or add further conditionals as appropriate to fill in the other values of the column.
#written in R version 4.2.1
data(mtcars)
mtcars$want = 0
for(i in seq_len(nrow(mtcars))){
if(mtcars$carb[i] == 1){
mtcars$want[i] = mtcars$qsec[i]
}
}
Result:
head(mtcars)
# mpg cyl disp hp drat wt qsec vs am gear carb want
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 0.00
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 0.00
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 18.61
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 19.44
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 0.00
#Valiant
You can first give the new column "want" a default value, for example 2. Then use ifelse to apply the condition and return the existing "want" value when nothing should change, like this:
mtcars$want <- 2
library(dplyr)
mtcars %>%
mutate(want = ifelse(carb == 1, qsec, want)) %>%
head(5)
#> mpg cyl disp hp drat wt qsec vs am gear carb want
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 2.00
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 2.00
#> Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 18.61
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 19.44
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 2.00
Created on 2022-06-30 by the reprex package (v2.0.1)
I am trying to write a function using rlang so that I can subset data based on supplied expression. Although the actual function is complicated, here is a minimal version of it that illustrates the problem.
minimal version of needed function
library(rlang)
# define a function
foo <- function(data, expr = NULL) {
if (!quo_is_null(enquo(expr))) {
dplyr::filter(data, !!enexpr(expr))
} else {
data
}
}
# does the function work? yes
head(foo(mtcars, NULL)) # with NULL
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
head(foo(mtcars, mpg > 20)) # with expression
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#> Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
#> Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
problems with purrr::pmap
When used with purrr::pmap(), it works as expected when expr is NULL, but not otherwise. Instead of list, I also tried using alist to supply the input.
library(purrr)
# works when expression is `NULL`
pmap(
.l = list(data = list(head(mtcars)), expr = list(NULL)),
.f = foo
)
#> [[1]]
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
# but not otherwise
pmap(
.l = list(data = list(head(mtcars)), expr = list("mpg > 20")),
.f = foo
)
#> Error: Problem with `filter()` input `..1`.
#> ℹ Input `..1` is `"mpg > 20"`.
#> x Input `..1` must be a logical vector, not a character.
Created on 2021-07-20 by the reprex package (v2.0.0)
One way to make this work is to wrap the expression in quote() instead of passing it as a string:
purrr::pmap(
.l = list(data = list(head(mtcars)), expr = list(quote(mpg > 20))),
.f = foo
)
-output
[[1]]
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
which also works with the NULL
pmap(
.l = list(data = list(head(mtcars)), expr = list(quote(NULL))),
.f = foo
)
[[1]]
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
Same output with subset
subset(head(mtcars), mpg > 20)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Another option is to modify the function, replacing enexpr with parse_expr so that the string is parsed into an expression:
foo1 <- function(data, expr = NULL) {
if (!quo_is_null(enquo(expr))) {
dplyr::filter(data, !!parse_expr(expr))
} else {
data
}
}
-testing
> pmap(
+ .l = list(data = list(head(mtcars)), expr = list(NULL)),
+ .f = foo1
+ )
[[1]]
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
>
> pmap(
+ .l = list(data = list(head(mtcars)), expr = list("mpg > 20")),
+ .f = foo1
+ )
[[1]]
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
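The underlying distinction is that "mpg > 20" is a character string, while quote(mpg > 20) is an unevaluated call. A minimal base R sketch of the same idea, without rlang or purrr, might look like the following. Here filter_by is a hypothetical helper written only for this illustration, and str2lang() assumes R 3.6.0 or later.

```r
d <- head(mtcars)

# A string and a quoted call are different kinds of objects.
is.character("mpg > 20")
is.call(quote(mpg > 20))

# Hypothetical helper: accepts NULL, a string, or a quoted expression.
filter_by <- function(data, expr = NULL) {
  if (is.null(expr)) return(data)                  # nothing to filter on
  if (is.character(expr)) expr <- str2lang(expr)   # parse the string into a call
  keep <- eval(expr, data, parent.frame())         # evaluate with columns in scope
  data[keep, , drop = FALSE]
}

filters <- list(NULL, "mpg > 20", quote(mpg > 20))
results <- lapply(filters, function(e) filter_by(d, e))

# Number of rows returned for each filter.
vapply(results, nrow, integer(1))
```

Accepting both forms in one helper is a design choice for convenience; parsing strings is best reserved for input you control.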
I want to copy data frame a to a new data frame b inside a function.
a <- mtcars
saveData <- function(x, y){
y <- x
return(y)
}
saveData(a, b)
In this example, the function should create the object/data frame b. b should be a copy of a (i.e., mtcars)
The crux is to flexibly "name" objects.
I excessively played around with assign(), deparse(), and substitute(), but I could not make it work.
It is not good practice to save data in the global environment from within a function. However, if you want to do it, here is a way:
saveData <- function(x, y){
assign(deparse(substitute(y)), x, envir = parent.frame())
}
a <- mtcars
b
Error: object 'b' not found
saveData(a, b)
b
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#...
Another idea is to use list2env, but you have to convert to a named list, so your second argument will need to be a character, i.e.
saveData <- function(x, y) {
v1 <- setNames(list(x), y)
list2env(v1, envir = .GlobalEnv)
}
saveData(a, 'b')
b
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2
#.....
NOTE: I wouldn't recommend adding stuff to your global environment. It is better to keep such objects in lists.
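Following that advice, here is a minimal sketch of the list-based approach: instead of creating b in the calling environment, the function returns an updated list and the copy is stored under a name. The names store and save_data are hypothetical.

```r
a <- mtcars

# Hypothetical helper: returns the updated list rather than modifying an environment.
save_data <- function(store, x, name) {
  store[[name]] <- x
  store
}

store <- list()
store <- save_data(store, a, "b")

names(store)
identical(store$b, a)
```

The copy is then available as store$b (or store[["b"]] when the name is held in a variable), and nothing is created as a side effect.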
I have the following list of data frames:
dflist <- list(mtcars, mtcars)
dflist[[1]] %>%
mutate(cyl2 = cyl * 2)
This works!
dflist %>%
map(.x, ~.x$cyl2 = .x$cyl * 2)
Error: unexpected '=' in:
"dflist %>%
map(.x, ~.x$cyl2 ="
This results in an error. I tried other options, but the function does not accept the = sign. What is wrong there?
Try :
library(dplyr)
library(purrr)
dflist %>% map(~.x %>% mutate(cyl2 = cyl * 2))
#[[1]]
# mpg cyl disp hp drat wt qsec vs am gear carb cyl2
#1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 12
#2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 12
#3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 8
#4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 12
#5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 16
#....
#[[2]]
# mpg cyl disp hp drat wt qsec vs am gear carb cyl2
#1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 12
#2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 12
#3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 8
#4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 12
#5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 16
#...
Or keeping it in base R:
lapply(dflist, function(x) transform(x, cyl2 = cyl * 2))
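As for why the original attempt fails: inside a call's argument list, = is reserved for naming arguments, so an expression such as ~.x$cyl2 followed by = is a syntax error. A minimal base R sketch of the assignment-style version, assuming you want to keep the x$cyl2 form, uses <- inside the function body and returns the modified data frame:

```r
dflist <- list(mtcars, mtcars)

# Assign with `<-` inside the function body, then return the data frame itself.
result <- lapply(dflist, function(x) {
  x$cyl2 <- x$cyl * 2
  x
})

head(result[[1]][, c("cyl", "cyl2")], 3)
```

Returning x explicitly matters here; without it the function would return only the assigned vector.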
You can also try:
modify(dflist, ~ update_list(., cyl2 = ~ cyl * 2))
[[1]]
mpg cyl disp hp drat wt qsec vs am gear carb cyl2
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 12
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 12
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 8
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 12
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 16
[[2]]
mpg cyl disp hp drat wt qsec vs am gear carb cyl2
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 12
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 12
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 8
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 12
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 16
We can use transform without an anonymous function call in base R:
lapply(dflist, transform, cyl2 = cyl * 2)
I have a matrix and I would like to eliminate two columns by their names.
My code was :
trn_data = subset(trn_data, select = -c("Rye flour","Barley products"))
but R gave me an error message like this:
Error in -c("Rye flour", "Barley products") :
invalid argument to unary operator
I tried this
trn_data = subset(trn_data, select = -c(Rye flour,Barley products))
Also returning an error:
Error: unexpected symbol in "trn_data=subset(trn_data,select =-c(Rye flour"
How can I fix this? Is there any other method that can eliminate two columns by their names?
You should not provide the names as characters to subset. This works:
trn_data_subset <- subset(trn_data, select = -c(`Rye flour`,`Barley products`))
If the column names contain spaces, wrap them in backticks (grave accents).
Here's an example using mtcars dataset:
mtexample <- mtcars[1:4,]
names(mtexample)[1] <- "mpg with space"
mtexample
#> mpg with space cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
subset(mtexample, select = -c(`mpg with space`, cyl))
#> disp hp drat wt qsec vs am gear carb
#> Mazda RX4 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 258 110 3.08 3.215 19.44 1 0 3 1
You can also do it like this:
within(trn_data, rm(`Rye flour`,`Barley products`))
or
trn_data[, !(colnames(trn_data) %in% c("Rye flour","Barley products"))]
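If trn_data really is a matrix rather than a data frame, indexing by column name is a safe choice. Below is a sketch with a small made-up matrix (the values and the third column name are invented for illustration; only the two dropped names come from the question). drop = FALSE keeps the result a matrix even when a single column remains.

```r
# Made-up stand-in for trn_data; "Other" is an invented column name.
trn_data <- matrix(
  1:6, nrow = 2,
  dimnames = list(NULL, c("Rye flour", "Barley products", "Other"))
)

to_drop <- c("Rye flour", "Barley products")

# Keep every column whose name is not in to_drop.
kept <- trn_data[, setdiff(colnames(trn_data), to_drop), drop = FALSE]

colnames(kept)
dim(kept)
```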
With dplyr, we can still use - with a double-quoted column name:
library(dplyr)
mtexample %>%
select(-"mpg with space")
# cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 6 258 110 3.08 3.215 19.44 1 0 3 1
data
mtexample <- mtcars[1:4,]
names(mtexample)[1] <- "mpg with space"