recode several variables placed in a vector in a column - r

I want to know how to recode several values, stored in a vector, within a column. Here is an example of my code.
library(tidyverse)
df <- data.frame(
"num" = 1:5,
"letter" = c("a", "b", "c", "d" , "e"))
vector<-c("a","c")
df<-df %>% mutate(letter=recode(letter,vector="no"))

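recode doesn't take a vector of old values that way: it matches on argument names, so vector = "no" looks for the literal value "vector" rather than the contents of vector. Instead of recode you can use a simple if_else statement like this: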
df<-df %>% mutate(letter= if_else(letter %in% vector, 'no', letter))
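If you would rather keep recode, one possible sketch is to build a named lookup vector (names are the old values, elements are the replacements) and splice it into the call with !!!. The names lookup and recoded are only illustrative, and stringsAsFactors = FALSE is set here so the column is character regardless of R version.

```r
library(dplyr)

df <- data.frame(num = 1:5,
                 letter = c("a", "b", "c", "d", "e"),
                 stringsAsFactors = FALSE)
vector <- c("a", "c")

# Named lookup: c(a = "no", c = "no")
lookup <- setNames(rep("no", length(vector)), vector)

# !!! splices the lookup into recode() as named arguments;
# values without a match are left unchanged
recoded <- df %>% mutate(letter = recode(letter, !!!lookup))
recoded$letter
# "no" "b" "no" "d" "e"
```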

You can replace the values of letter that appear in vector using %in% -
df$letter[df$letter %in% vector] <- 'no'
df
# num letter
#1 1 no
#2 2 b
#3 3 no
#4 4 d
#5 5 e
Or in data.table -
library(data.table)
setDT(df)[letter %in% vector, letter := 'no']
df

Related

mutate and if_any with condition over multiple columns

I tried to combine mutate, case_when and if_any to create a variable = 1 if any of the variables whose name begins with "string" is equal to a specific string.
I can't figure out what I'm missing in the combination of these conditions.
I'm trying:
df <-data.frame(string1= c("a","b", "c"), string2= c("d", "a", "f"), string3= c("a", "d", "c"), id= c(1,2,3))
df <- df%>%
mutate(cod = case_when(if_any(starts_with("string") == "a" ~1 )))
The syntax was slightly wrong, but you were close. Note that if_any works like across: the call has the form if_any(columns, condition), and the condition should be given as a function, using function(x), \(x), or a ~ formula.
df %>%
mutate(cod = case_when(if_any(starts_with("string"), ~ .x == "a") ~ 1))
string1 string2 string3 id cod
1 a d a 1 1
2 b a d 2 1
3 c f c 3 NA
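Note that rows with no match get NA in the output above. If 0 is preferred for those rows, a minimal sketch (assuming a 0/1 integer indicator is what is wanted) is to convert the logical result of if_any directly; res is just an illustrative name.

```r
library(dplyr)

df <- data.frame(string1 = c("a", "b", "c"),
                 string2 = c("d", "a", "f"),
                 string3 = c("a", "d", "c"),
                 id = c(1, 2, 3))

# if_any() returns TRUE/FALSE per row; as.integer() maps that to 1/0
res <- df %>%
  mutate(cod = as.integer(if_any(starts_with("string"), ~ .x == "a")))
res$cod
# 1 1 0
```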

Filter rows by a vector on a single character field with more than one possible hit

I'm having trouble with an R script.
Given a tibble and a character vector:
tbEsempio <- tibble(C1 = c(100,150,200,343,563), C2 = c("A", "A&B", "D", "D$C", "E"))
vcValori <- c("A","B","C")
I need to filter tbEsempio on C2 using the values in the vcValori vector.
If I use %like% without the vector, it works:
tbEsempioFiltered <- tbEsempio[which(tbEsempio$C2 %like% "A" |
tbEsempio$C2 %like% "B" |
tbEsempio$C2 %like% "C" ), ]
but when I use vcValori with the %in% operator instead of %like%
tbEsempioFiltered <- tbEsempio %>% filter(C2 %in% {{vcValori}})
the rows that contain two (or more) values joined by a separator (rows 2 and 4 in the example) are not included.
Is there a solution using the tidyverse framework?
You could use
library(dplyr)
tbEsempio %>%
filter(grepl(paste(vcValori, collapse = "|"), C2))
This returns
# A tibble: 3 x 2
C1 C2
<dbl> <chr>
1 100 A
2 150 A&B
3 343 D$C
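One caveat: the pasted pattern is treated as a regular expression, so values containing characters such as $ or . would need escaping. As a hedged alternative, the sketch below matches each value literally with fixed = TRUE and combines the results. contains_any is a hypothetical helper, not part of any package, and it assumes the vector of values is non-empty.

```r
library(dplyr)

tbEsempio <- tibble(C1 = c(100, 150, 200, 343, 563),
                    C2 = c("A", "A&B", "D", "D$C", "E"))
vcValori <- c("A", "B", "C")

# Hypothetical helper: TRUE where x contains at least one of `values`,
# each matched as a literal substring rather than as a regex
contains_any <- function(x, values) {
  hits <- lapply(values, function(v) grepl(v, x, fixed = TRUE))
  Reduce(`|`, hits)
}

res <- tbEsempio %>% filter(contains_any(C2, vcValori))
res$C1
# 100 150 343
```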

extract a column in dataframe based on condition for another column R

I want to extract a column from a dataframe in R based on a condition on another column in the same dataframe. The dataframe is given below.
b <- c(1,2,3,4)
g <- c("a", "b" ,"b", "c")
df <- data.frame(b,g)
row.names(df) <- c("aa", "bb", "cc" , "dd")
I want to extract all values of column b as a dataframe (with rownames) where column g has the value 'b'.
My required output is given below:
df
b
bb 2
cc 3
I have tried several methods like which or subset but they do not work. I have also tried to find the answer to this question on Stack Overflow but I was not able to find it. Is there a way to do it?
Thanks,
You can use the subset function in base R -
subset(df, g == 'b', select = b)
# b
#bb 2
#cc 3
Using data.table
library(data.table)
setDT(df, key = 'g')['b', .(b)]
b
1: 2
2: 3
Or with collapse
library(collapse)
sbt(df, g == 'b', b)
b
1 2
2 3
This is the basic way of slicing data in R:
df[df$g == 'b',]['b']
Or the tidyverse answer
df %>%
filter(g == 'b') %>%
select(b)
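Since the question asks for a dataframe that keeps its rownames, another base R sketch is to index rows and the column in one step with drop = FALSE, which stops the single column from being simplified to a plain vector. The name res is only illustrative.

```r
b <- c(1, 2, 3, 4)
g <- c("a", "b", "b", "c")
df <- data.frame(b, g)
row.names(df) <- c("aa", "bb", "cc", "dd")

# drop = FALSE keeps the result as a data frame, row names included
res <- df[df$g == "b", "b", drop = FALSE]
res
#    b
# bb 2
# cc 3
```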

How do I mutate a list-column to a common one leaving only the last value when there is a vector in the list?

I am trying to use purrr::map_chr to get the last element of the vector in a list-column as the actual value in case that it exists.
The reproducible example:
library(data.table)
library(purrr)
x <- data.table(one = c("a", "b", "c"), two = list("d", c("e","f","g"), NULL))
I want to keep the data as it is but turn my list-column into a regular column, with "g" as the value for x[2,2]. What I've tried:
x %>% mutate(two = ifelse(is.null(.$two), map_chr(~NA_character_), map_chr(~last(.))))
The result should be the following:
# one two
# a d
# b g
# c NA
Thanks in advance!
Here is an option. We can use if/else instead of ifelse here
library(dplyr)
library(tidyr)
x %>%
mutate(two = map_chr(two, ~ if(is.null(.x)) NA_character_ else last(.x)))
# one two
#1 a d
#2 b g
#3 c NA
Or replace the NULL elements with NA and extract the last
x %>%
mutate(two = map_chr(two, ~ last(replace(.x, is.null(.), NA))))
I would propose this solution which is a bit cleaner.
library(tidyverse)
df <- tibble(one = c("a", "b", "c"), two = list("d", c("e","f","g"), NULL))
df %>%
mutate_at("two", replace_na, NA_character_) %>%
mutate_at("two", map_chr, last)
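For comparison, the same idea can be sketched in base R without purrr, using vapply with an explicit character(1) return type. last_or_na is a hypothetical helper name, and a plain data.frame stands in for the data.table from the question.

```r
x <- data.frame(one = c("a", "b", "c"))
x$two <- list("d", c("e", "f", "g"), NULL)  # list-column with a NULL entry

# Hypothetical helper: last element as character, or NA for empty/NULL
last_or_na <- function(v) {
  if (length(v) == 0) NA_character_ else as.character(v[[length(v)]])
}

# vapply() guarantees exactly one character value per row
x$two <- vapply(x$two, last_or_na, character(1))
x$two
# "d" "g" NA
```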

R: Replacing values of one list element with values of a second list element

I want to replace the values of one element of a list with the values of a second element of a list. Specifically,
I have a list containing multiple data sets.
Each data set has 2 variables
The variables are factors
The n'th element of the second variable of each data set needs to be replaced with the n'th element of the first variable in each data set
Also, the replaced value should be called "replaced"
dat1 <- data.frame(names1 =c("a", "b", "c", "f", "x"),values= c("val1_1", "val2_1", "val3_1", "val4_1", "val5_1"))
dat1$values <- as.factor(dat1$values)
dat2 <- data.frame(names1 =c("a", "b", "f2", "s5", "h"),values= c("val1_2", "val2_2", "val3_2", "val4_2", "val5_2"))
dat2$values <- as.factor(dat2$values)
list1 <- list(dat1, dat2)
The result should be the same list, but just with the 5th value replaced.
[[1]]
names1 values
1 a val1_1
2 b val2_1
3 c val3_1
4 f val4_1
5 replaced x
[[2]]
names1 values
1 a val1_2
2 b val2_2
3 f2 val3_2
4 s5 val4_2
5 replaced h
A base R approach using lapply. Since both columns are factors, we need to add the new levels before assigning the new values; otherwise those values would become NA.
n <- 5
lapply(list1, function(x) {
levels(x$values) <- c(levels(x$values), as.character(x$names1[n]))
x$values[n] <- x$names1[n]
levels(x$names1) <- c(levels(x$names1), "replaced")
x$names1[n] <- "replaced"
x
})
#[[1]]
# names1 values
#1 a val1_1
#2 b val2_1
#3 c val3_1
#4 f val4_1
#5 replaced x
#[[2]]
# names1 values
#1 a val1_2
#2 b val2_2
#3 f2 val3_2
#4 s5 val4_2
#5 replaced h
Another approach is to convert both columns to character, replace the values at the required position, and convert them back to factors. However, since every data frame in the list can be huge, converting all the values to character and back to factor just to change one value could be computationally expensive.
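For reference, the convert-to-character approach described above might look like the following sketch. It assumes both columns are factors (forced here with stringsAsFactors = TRUE), and result is only an illustrative name. Note that rebuilding the factors drops levels that are no longer used, such as the overwritten fifth value.

```r
dat1 <- data.frame(names1 = c("a", "b", "c", "f", "x"),
                   values = c("val1_1", "val2_1", "val3_1", "val4_1", "val5_1"),
                   stringsAsFactors = TRUE)
dat2 <- data.frame(names1 = c("a", "b", "f2", "s5", "h"),
                   values = c("val1_2", "val2_2", "val3_2", "val4_2", "val5_2"),
                   stringsAsFactors = TRUE)
list1 <- list(dat1, dat2)
n <- 5

result <- lapply(list1, function(x) {
  x[] <- lapply(x, as.character)  # factors -> character, keeps data frame shape
  x$values[n] <- x$names1[n]      # copy the n-th name into values
  x$names1[n] <- "replaced"       # then overwrite the n-th name
  x[] <- lapply(x, factor)        # character -> factors again
  x
})
result[[1]][n, ]
```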
Here is one option with tidyverse. Loop through the list with map, slice the row of interest (in this case, it is the last row, so n() can be used), mutate the column value and bind with the original data without the last row
library(tidyverse)
map(list1, ~ .x %>%
slice(n()) %>%
mutate(values = names1, names1 = 'replaced') %>%
bind_rows(.x %>% slice(-n()), .))
#[[1]]
# names1 values
#1 a val1_1
#2 b val2_1
#3 c val3_1
#4 f val4_1
#5 replaced x
#[[2]]
# names1 values
#1 a val1_2
#2 b val2_2
#3 f2 val3_2
#4 s5 val4_2
#5 replaced h
Or it can be made more compact with fct_c from forcats. Different factor levels can be combined together with fct_c for the 'values' and 'names1' column
library(forcats)
map(list1, ~ .x %>%
mutate(values = fct_c(values[-n()], names1[n()]),
names1 = fct_c(names1[-n()], factor('replaced'))))
Or using a similar approach in base R, where we loop through the list with lapply, convert the data.frame to a matrix, rbind the matrix without its last row to the values of interest, and convert back to a data.frame. In R versions before 4.0.0 the default stringsAsFactors = TRUE converts the columns to factors; from R 4.0.0 onward the default is FALSE, so pass stringsAsFactors = TRUE to as.data.frame if factors are needed.
lapply(list1, function(x) as.data.frame(rbind(as.matrix(x)[-5, ],
c('replaced', as.character(x$names1[5])))))
