I'm looking to add the input of a function to the column name created in the function.
new_col = function(dat_, period_) {
dat_ %>% mutate(paste0('mpg',period_) = mpg + period_
}
mtcars %>% newcol(20)
In this function, mpg20 should be created where it's values are equal to mpg + 20.
You can use :
library(dplyr)
new_col = function(dat_, period_) {
dat_ %>% mutate(!!paste0('mpg',period_) := mpg + period_)
}
mtcars %>% new_col(20)
# mpg cyl disp hp drat wt qsec vs am gear carb mpg20
#Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 41.0
#Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 41.0
#Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 42.8
#Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 41.4
#Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 38.7
#Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 38.1
#...
#...
!! and := combination is used when we have a character variable at the LHS.
Related
I have a column including lots of "0" and other values (f.i. 2 or 2,3 etc). Is there any possibility to rename the columns with 0 to "None" and all other values to "others"? I wanted to use fct_recode or fct_collapse but cant figure out how to include all other values. Do you have any idea? I must not be necessarily include the fct_recode function.
Thanks a lot
Philipp
I tried to use fct_recode, fct_collapse
Here is a way to do it using mtcars and the vs column as an example:
cars <- mtcars
head(cars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
cars$vs <- ifelse(cars$vs == 0, "none", "other")
head(cars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 none 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 none 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 other 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 other 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 none 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 other 0 3 1
Note that R coerces the vs column from numeric to character. But you could do that explicitly first for clarity:
cars$vs <- as.character(cars$vs)
Using dplyr, we can do this on multiple colums as
library(dplyr)
df1 <- df1 %>%
mutate(across(everything(), ~ case_when(.x == 0 ~ "none", TRUE ~ "other")))
Or in base R
df1[] <- c("other", "none")[1 + (df1 == 0)]
I need to re-ordering the columns' position in a dataframe with 500 columns. In fact, I only want the last column to be moved between the third and the fourth columns.
Here is what I tried:
df[ ,c(1, 2, 3, ncol(df), 4:ncol(df)-1)]
But it gives me a vector of values which are the columns' number. Would you someone tell me what I expect wrong from this code?
The issue maybe related to the operator precedence - wrap the (ncol(df)-1) within bracket (assuming the original object is a data.frame)
library(data.table)
df <- df[ ,c(1, 2, 3, ncol(df), 4:(ncol(df)-1)), with = FALSE]
Or use setcolorder to update the original object
setcolorder(df, c(1, 2, 3, ncol(df), 4:(ncol(df)-1)))
NOTE: with = FALSE was added after the OP confirmed it is a data.table object
Or another option is select
library(dplyr)
df <- df %>%
select(1:3, last_col(), everything())
Or with relocate
df <- df %>%
relocate(last_col(), .before = 4)
-reproducible example testing
> data(mtcars)
> head(mtcars)[, c(1, 2, 3, ncol(mtcars), 4:(ncol(mtcars)-1))]
mpg cyl disp carb hp drat wt qsec vs am gear
Mazda RX4 21.0 6 160 4 110 3.90 2.620 16.46 0 1 4
Mazda RX4 Wag 21.0 6 160 4 110 3.90 2.875 17.02 0 1 4
Datsun 710 22.8 4 108 1 93 3.85 2.320 18.61 1 1 4
Hornet 4 Drive 21.4 6 258 1 110 3.08 3.215 19.44 1 0 3
Hornet Sportabout 18.7 8 360 2 175 3.15 3.440 17.02 0 0 3
Valiant 18.1 6 225 1 105 2.76 3.460 20.22 1 0 3
> head(mtcars) %>% select(1:3, last_col(), everything())
mpg cyl disp carb hp drat wt qsec vs am gear
Mazda RX4 21.0 6 160 4 110 3.90 2.620 16.46 0 1 4
Mazda RX4 Wag 21.0 6 160 4 110 3.90 2.875 17.02 0 1 4
Datsun 710 22.8 4 108 1 93 3.85 2.320 18.61 1 1 4
Hornet 4 Drive 21.4 6 258 1 110 3.08 3.215 19.44 1 0 3
Hornet Sportabout 18.7 8 360 2 175 3.15 3.440 17.02 0 0 3
Valiant 18.1 6 225 1 105 2.76 3.460 20.22 1 0 3
> ?relocate
> head(mtcars) %>% relocate(last_col(), .before = 4)
mpg cyl disp carb hp drat wt qsec vs am gear
Mazda RX4 21.0 6 160 4 110 3.90 2.620 16.46 0 1 4
Mazda RX4 Wag 21.0 6 160 4 110 3.90 2.875 17.02 0 1 4
Datsun 710 22.8 4 108 1 93 3.85 2.320 18.61 1 1 4
Hornet 4 Drive 21.4 6 258 1 110 3.08 3.215 19.44 1 0 3
Hornet Sportabout 18.7 8 360 2 175 3.15 3.440 17.02 0 0 3
Valiant 18.1 6 225 1 105 2.76 3.460 20.22 1 0 3
I want to rename() some variables in my data programmatically, so I can to it via map at some point.
I'm looking for the equivalent of,
library(tidyverse)
mtcars %>% rename(
"MPG" = "mpg"
)
but using environment variables instead. I tried !!sym() by doing the following,
library(tidyverse)
new_name <- "MPG"
old_name <- "mpg"
mtcars %>% rename(
!!sym(new_name) = !!sym(old_name)
)
However, I get the error Error: unexpected ')' in ")". I am not sure what I am missing here!
We could use setNames and evaluate (!!!)
head(mtcars %>%
rename(!!! setNames(old_name, new_name)))
-output
MPG cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 6 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
You can use {{}} -
library(dplyr)
new_name <- "MPG"
old_name <- "mpg"
mtcars %>% rename({{new_name}} := {{old_name}}) %>% head
# MPG cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
I have a set a variables say Var1, Var2 to Varn. They all take three possible values 0, 1, and 2. I want to replace all 2 as 1
like so
df$Var1[df$Var1 >= 1] <- 1
This does the job. But when I try to write a function to do this
MakeBinary <- function(varName dfName){dfName$varName[dfName$varNAme > = 1] <- 1}
and use this function like:
MakeBinary(Var2, df)
I got an error message: Error in $<-.data.frame(*tmp*, "varName", value = numeric(0)) :
replacement has 0 rows, data has 512.
I just want to know why I got this message. Thanks. My sample size is 512.
If we are passing column name as string, then use [[ instead of $ and return the dataset
MakeBinary <- function(varName, dfName){
dfName[[varName]][dfName[[varName]] >= 1] <- 1
dfName
}
MakeBinary("Var2", df)
example with mtcars
MakeBinary("carb", head(mtcars))
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 1
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 1
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 1
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
Unquoted arguments for variable names can be passed as well, but it needs to be converted to string
MakeBinary <- function(varName, dfName){
varName <- deparse(substitute(varName))
dfName[[varName]][dfName[[varName]] >= 1] <- 1
dfName
}
MakeBinary(Var2, df)
Using a reproducible example with mtcars
MakeBinary(carb, head(mtcars))
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 1
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 1
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 1
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
I'm new to writing functions in R, but want to write a function to add 1% of the median of a variable to itself, using dplyr, and replace the variable with this transformation.
x is a numeric variable.
add_median <- function(df, x) {
x <- enquo(x)
x <- quo_name(x)
mutate(x=x+.01*median(x, na.rm=T))
}
When I run newDF <- DF %>% add_median(variable_of_interest), I get the following error:
Error in 0.01 * median(x, na.rm = T) : non-numeric argument to binary operator
What am I doing wrong here?
We could change the function to evaluate with {{}} and then use assign (:=) instead of = in mutate
library(dplyr)
add_median <- function(df, x) {
df %>%
mutate({{x}} := {{x}} + .01 * median({{x}}, na.rm = TRUE))
}
If we need to change multiple columns, use mutate_at
add_median_multiple <- function(df, vec){
df %>%
mutate_at(vars(vec), ~ . + .01 * median(., na.rm = TRUE))
}
-testing
data(mtcars)
head(mtcars) %>%
add_median(mpg)
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.21 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.21 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 23.01 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.61 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.91 8 360 175 3.15 3.440 17.02 0 0 3 2
#Valiant 18.31 6 225 105 2.76 3.460 20.22 1 0 3 1
comparison with original 'mpg' column
head(mtcars)
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
add_median_multiple(head(mtcars), c('mpg', 'wt'))
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.21 6 160 110 3.90 2.65045 16.46 0 1 4 4
#Mazda RX4 Wag 21.21 6 160 110 3.90 2.90545 17.02 0 1 4 4
#Datsun 710 23.01 4 108 93 3.85 2.35045 18.61 1 1 4 1
#Hornet 4 Drive 21.61 6 258 110 3.08 3.24545 19.44 1 0 3 1
#Hornet Sportabout 18.91 8 360 175 3.15 3.47045 17.02 0 0 3 2
#Valiant 18.31 6 225 105 2.76 3.49045 20.22 1 0 3 1