How to add a column using if_else - R

I'm trying to add a column to a data frame using add_column and if_else, but I can't get it to work. I don't know how to write a correct logical test using the logical OR operator ("|").
I have this kind of data:
dataframe1
variable1 variable2 variable3
(char) (char) (char)
value value value
value value value
value value value
I try this:
dataframe2 <- dataframe1%>%
add_column(newcolumn_name = if_else(variable3 == "value1" | "value2", TRUE, FALSE))
And I get this error:
Unknown or uninitialised column: value1.
Error in variable3 == "value1" | "value2" : operations are possible only for numeric, logical or complex types

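Consider extracting the column with .$. The == comparison can be replaced with %in%. The | operator is an element-wise logical OR, so both of its operands must be logical (or numeric) vectors. Because == binds more tightly than |, the test variable3 == "value1" | "value2" is read as (variable3 == "value1") | "value2", and the bare string "value2" on the right is what triggers the error. In addition, == and %in% already return a logical vector, so we don't need if_else/ifelse at all: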
library(dplyr)
library(tibble)
dataframe1 %>%
add_column(newcolumn_name = .$variable3 %in% c("value1", "value2"))
Using a reproducible example
head(mtcars) %>%
add_column(new_column_name = .$carb %in% c(1, 4))
mpg cyl disp hp drat wt qsec vs am gear carb new_column_name
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 TRUE
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 TRUE
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 TRUE
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 TRUE
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 FALSE
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 TRUE
Also, this can be done within dplyr itself, i.e. using mutate, so we don't need to extract the column:
head(mtcars) %>%
mutate(new_column_name = carb %in% c(1, 4))
mpg cyl disp hp drat wt qsec vs am gear carb new_column_name
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 TRUE
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 TRUE
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 TRUE
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 TRUE
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 FALSE
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 TRUE

I was able to do that with this code:
dataf2 <- dataf %>%
add_column(newcol = ifelse(dataf$var3=="value1" | dataf$var3=="value2", TRUE, FALSE) )
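As a small sketch of how the two working answers relate (the vector x below is made up for illustration): writing out both comparisons on either side of | gives the same result as %in%, except for missing values, where | returns NA and %in% returns FALSE. The last line reproduces the failure from the question.

```r
# Hypothetical data, only for illustration
x <- c("value1", "value2", "value3", NA)

# Each side of | must itself be a logical vector
either <- x == "value1" | x == "value2"

# %in% tests membership in a set of values
member <- x %in% c("value1", "value2")

either  # TRUE TRUE FALSE NA
member  # TRUE TRUE FALSE FALSE

# The form from the question fails because "value2" is not logical;
# tryCatch() captures the error message instead of stopping
bad <- tryCatch(x == "value1" | "value2",
                error = function(e) conditionMessage(e))
bad
```

If NA rows should stay NA in the new column, the explicit | form is the one to use; if they should count as FALSE, %in% is simpler.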

Related

Replace all other values in R

I have a column containing lots of "0" and other values (e.g. 2 or 2,3 etc.). Is there a way to rename the values that are 0 to "None" and all other values to "others"? I wanted to use fct_recode or fct_collapse but can't figure out how to include all other values. Do you have any idea? I don't necessarily have to use fct_recode.
Thanks a lot
Philipp
I tried to use fct_recode, fct_collapse
Here is a way to do it using mtcars and the vs column as an example:
cars <- mtcars
head(cars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
cars$vs <- ifelse(cars$vs == 0, "none", "other")
head(cars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 none 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 none 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 other 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 other 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 none 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 other 0 3 1
Note that R coerces the vs column from numeric to character. But you could do that explicitly first for clarity:
cars$vs <- as.character(cars$vs)
Using dplyr, we can do this on multiple columns as
library(dplyr)
df1 <- df1 %>%
mutate(across(everything(), ~ case_when(.x == 0 ~ "none", TRUE ~ "other")))
Or in base R
df1[] <- c("other", "none")[1 + (df1 == 0)]
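The base R one-liner relies on logical-to-number coercion: df1 == 0 is a logical matrix, adding 1 turns FALSE/TRUE into 1/2, and those numbers are used as positions in c("other", "none"). A minimal sketch on a made-up data frame (the contents of df1 are an assumption, since the question does not show the real data):

```r
# Hypothetical data frame, only for illustration
df1 <- data.frame(a = c(0, 2, 0), b = c(3, 0, 5))

# 1 where the value is non-zero, 2 where it is 0
idx <- 1 + (df1 == 0)
idx

# df1[] keeps the data frame structure while replacing every cell
df1[] <- c("other", "none")[idx]
df1
```

Unlike the case_when version, this form leaves NA cells as NA, because NA == 0 is NA and indexing with NA returns NA.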

write r function to modify value in data frame

I have a set of variables, say Var1, Var2, ..., Varn. They all take three possible values: 0, 1, and 2. I want to replace every 2 with 1,
like so
df$Var1[df$Var1 >= 1] <- 1
This does the job. But when I try to write a function to do this
MakeBinary <- function(varName, dfName){dfName$varName[dfName$varName >= 1] <- 1}
and use this function like:
MakeBinary(Var2, df)
I got an error message:
Error in `$<-.data.frame`(`*tmp*`, "varName", value = numeric(0)) :
replacement has 0 rows, data has 512.
I just want to know why I got this message. Thanks. My sample size is 512.
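The $ operator does not evaluate what follows it, so dfName$varName looks for a column literally named "varName". No such column exists, the lookup returns NULL, and the assignment ends up with a zero-length replacement. If we are passing the column name as a string, use [[ instead of $, and return the dataset at the end of the function: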
MakeBinary <- function(varName, dfName){
dfName[[varName]][dfName[[varName]] >= 1] <- 1
dfName
}
MakeBinary("Var2", df)
example with mtcars
MakeBinary("carb", head(mtcars))
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 1
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 1
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 1
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
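To see where the original error comes from, here is a small sketch (the data frame d and the variable col are made up for illustration). It shows that $ uses the name after it literally, while [[ uses the string stored in the variable.

```r
# Hypothetical data, only for illustration
d <- data.frame(Var1 = c(0, 1, 2), Var2 = c(2, 0, 1))
col <- "Var2"

by_dollar  <- d$col     # looks for a column literally named "col"
by_bracket <- d[[col]]  # looks up the column named by the value of col

is.null(by_dollar)  # TRUE
by_bracket          # 2 0 1
```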
An unquoted variable name can be passed as well, but it needs to be converted to a string inside the function
MakeBinary <- function(varName, dfName){
varName <- deparse(substitute(varName))
dfName[[varName]][dfName[[varName]] >= 1] <- 1
dfName
}
MakeBinary(Var2, df)
Using a reproducible example with mtcars
MakeBinary(carb, head(mtcars))
# mpg cyl disp hp drat wt qsec vs am gear carb
#Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 1
#Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 1
#Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 1
#Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
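Since the question mentions Var1 to Varn, one possible way to recode several columns in one step, without calling a helper once per column, is sketched below. The data frame and the selection of columns by name pattern are assumptions for illustration.

```r
# Hypothetical data with values 0, 1 and 2
df <- data.frame(id = 1:4,
                 Var1 = c(0, 1, 2, 2),
                 Var2 = c(2, 0, 1, 0))

# Pick the columns whose names start with "Var"
vars <- grep("^Var", names(df), value = TRUE)

# pmin() caps every value at 1, so 2 becomes 1 while 0 and 1 are unchanged
df[vars] <- lapply(df[vars], function(x) pmin(x, 1))
df
```

The id column is untouched because it is not part of vars.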

Specifying where columns should be placed in r

When I create a new variable, is there a way to specify in the function where to place it?
Right now, it adds it to the end of the dataframe, but for ease of viewing in Excel for example, I'd like to place a new calculated column beside the columns I used for the calculation.
Here's an example of code:
rawdata2 <- (rawdata1 %>% unite(location, locations1,locations2, locations3,
na.rm = TRUE, remove=TRUE)
%>% select(-location7, -location16)
%>% unite(Sector, Sectors, na.rm=TRUE, remove=TRUE)
%>% unite(TypeofSpace, TypesofSpace, type.of.spaceOther, na.rm=TRUE,
remove=TRUE)
)
You can rearrange the columns in your data frame. It looks like you are using dplyr::select in your example.
library(dplyr)
head(mtcars)
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
mtcars2 <- mtcars %>%
select(mpg, carb, everything()) ## moves carb up behind mpg
head(mtcars2)
# mpg carb cyl disp hp drat wt qsec vs am gear
# Mazda RX4 21.0 4 6 160 110 3.90 2.620 16.46 0 1 4
# Mazda RX4 Wag 21.0 4 6 160 110 3.90 2.875 17.02 0 1 4
# Datsun 710 22.8 1 4 108 93 3.85 2.320 18.61 1 1 4
# Hornet 4 Drive 21.4 1 6 258 110 3.08 3.215 19.44 1 0 3
# Hornet Sportabout 18.7 2 8 360 175 3.15 3.440 17.02 0 0 3
# Valiant 18.1 1 6 225 105 2.76 3.460 20.22 1 0 3
You can do the same thing with base subsetting. For example, in a data frame with 11 columns you can move the 11th column to directly after the first with
mtcars3 <- mtcars[,c(1,11,2:10)]
identical(mtcars2, mtcars3)
# [1] TRUE
I ended up using relocate, documentation here: dplyr.tidyverse.org/reference/relocate.html
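A minimal sketch of the relocate approach, assuming dplyr 1.0.0 or later, where relocate() and the .after argument of mutate() are available. The computed column ratio is made up for illustration.

```r
library(dplyr)

# Move an existing column: carb is placed directly after mpg
mtcars2 <- head(mtcars) %>%
  relocate(carb, .after = mpg)
names(mtcars2)

# Or place a new column at creation time with mutate()'s .after argument
mtcars3 <- head(mtcars) %>%
  mutate(ratio = hp / wt, .after = wt)  # ratio is a hypothetical column
names(mtcars3)
```

The second form matches the original question most closely, since the new column lands beside the columns used to calculate it without a separate reordering step.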

calculate new column in dataframe or list using a function with column as param

I'm trying to calculate a new column with a user defined function that needs data from same row and a fixed value valid for all rows:
myfunc <- function(ds,colname,val1,col1,col2){
# content of new column <colname> should be computed from:
ds[colname] = val1 + ds[col1] * ds[col2] # for each row of ds
return(ds)
}
v1 = 2
data(mtcars)
mt = head(mtcars)
mt
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
apply(mt,'newcol',v1,mt$wt,mt$qsec)
mt
What I would like to see in mt$newcol in the first row is 2 + 2.620 * 16.46 (-> 45.12), and similarly for all other rows.
So, how can I send a fixed value (v1) and two values from each row to my function and store returned value in this row in a new column?
Thanks
dplyr approach:
library(dplyr)
data(mtcars)
myfunc <- function(ds, new_column, val1, col1, col2){
name <- rownames(ds)
ds <- ds %>%
mutate(!!as.name(new_column) := val1 + !!as.name(col1) + !!as.name(col2),
car_name = name) %>%
select(car_name, mpg:!!as.name(new_column))
return(ds)
}
head(
myfunc(ds = mtcars,
new_column = "new_column",
val1 = 2,
col1 = "hp",
col2 = "vs")
)
output
car_name mpg cyl disp hp drat wt qsec vs am gear carb new_column
1 Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 112
2 Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 112
3 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 96
4 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 113
5 Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 177
6 Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 108
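Note that the function above adds the two columns, while the question asked for val1 + col1 * col2. A base R sketch of the formula from the question, using [[ to look up columns by name (the function name add_scaled_product is made up here):

```r
# Hypothetical helper: new column = val1 + col1 * col2
add_scaled_product <- function(ds, colname, val1, col1, col2) {
  # [[ ]] returns plain vectors, so the arithmetic is vectorised over rows
  ds[[colname]] <- val1 + ds[[col1]] * ds[[col2]]
  ds
}

mt <- head(mtcars)
mt <- add_scaled_product(mt, "newcol", 2, "wt", "qsec")
mt$newcol[1]  # 2 + 2.620 * 16.46
```

The column names are passed as strings rather than as mt$wt and mt$qsec, and the result has to be assigned back to mt because R functions do not modify their arguments in place.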

Opposite function to add_rownames in dplyr

As an intermediate step I generate a data frame with one column as character strings and the rest are numbers. I'd like to convert it to a matrix, but first I have to convert that character column into row names and remove it from the data frame.
Is there a simple way to do this in dplyr? A function like to_rownames() that is the opposite of add_rownames()?
I saw a solution using a custom function, but it's really out of dplyr philosophy.
You can now use the tibble package:
tibble::column_to_rownames()
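A minimal sketch of how this could be used for the case in the question, with a made-up data frame holding one character column and two numeric columns. The counterpart in the other direction is tibble::rownames_to_column(), which replaces the deprecated add_rownames().

```r
library(tibble)

# Hypothetical data: one character column and two numeric columns
df <- data.frame(a = c("w", "x", "y", "z"), b = 1:4, c = 5:8)

# Turn column a into row names (dropping it), then convert to a matrix
m <- as.matrix(column_to_rownames(df, var = "a"))
m
```

Because the character column is removed before as.matrix() is called, the resulting matrix stays numeric instead of being coerced to character.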
This provides both an NSE (non-standard evaluation) and a standard evaluation version of such a function:
library(dplyr)
df <- data_frame(a=sample(letters, 4), b=c(1:4), c=c(5:8))
reset_rownames <- function(df, col="rowname") {
stopifnot(is.data.frame(df))
col <- as.character(substitute(col))
reset_rownames_(df, col)
}
reset_rownames_ <- function(df, col="rowname") {
stopifnot(is.data.frame(df))
nm <- data.frame(df)[, col]
df <- df[, !(colnames(df) %in% col)]
rownames(df) <- nm
df
}
m <- "rowname"
head(as.matrix(reset_rownames(add_rownames(mtcars), "rowname")))
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
head(as.matrix(reset_rownames_(add_rownames(mtcars), m)))
## mpg cyl disp hp drat wt qsec vs am gear carb
## Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
## Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
## Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
## Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
## Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
## Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
Perhaps to_rownames() or set_rownames() makes more sense. ¯\_(ツ)_/¯ YMMV.
If you really need a matrix you can just save the character column to a separate variable, drop it, and then create the matrix
library(dplyr)
df <- data_frame(a = sample(letters, 4), b = c(1:4), c = c(5:8))
letters <- df %>% select(a)
a.matrix <- df %>% select(-a) %>% as.matrix
Not sure what you are going to do after that, but this gets you as far as you asked for...

Resources