Create new columns looping an array inside mutate (dplyr) - r

I have the following dummy data frame called df:
A1 A2 A3 B1 B2 B3 C1 C2 C3
1 1 1 2 2 2 3 3 3
and I would like to sum columns that contain the same letter into a new column (naming it using the corresponding letter).
I would expect this result:
A1 A2 A3 B1 B2 B3 C1 C2 C3 A B C
1 1 1 2 2 2 3 3 3 3 6 9
I know I can achieve this result using mutate from dplyr:
df %>% mutate(
  A = A1 + A2 + A3,
  B = B1 + B2 + B3,
  C = C1 + C2 + C3)
Is there any way to do it using a vector like letters <- c("A", "B", "C") and looping over that vector inside the mutate function? Something like:
df %>% mutate(letters = paste0(letters, "1") + paste0(letters, "2") + paste0(letters, "3"))

One dplyr and purrr solution could be:
library(dplyr)
library(purrr)
bind_cols(df, map_dfc(.x = LETTERS[1:3],
                      ~ df %>%
                        transmute(!!.x := rowSums(select(., starts_with(.x))))))
A1 A2 A3 B1 B2 B3 C1 C2 C3 A B C
1 1 1 1 2 2 2 3 3 3 3 6 9
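The same idea can also be sketched in base R with a plain loop over the prefix vector. This is only an illustration, not the answerer's method: the names prefixes and source_cols are made up here, and the sketch assumes every column to be summed starts with exactly one of the prefixes.

```r
# Toy data frame from the question.
df <- data.frame(A1 = 1, A2 = 1, A3 = 1,
                 B1 = 2, B2 = 2, B3 = 2,
                 C1 = 3, C2 = 3, C3 = 3)

prefixes <- c("A", "B", "C")   # hypothetical name; avoids masking base::letters
source_cols <- names(df)       # fix the input columns before adding new ones

for (p in prefixes) {
  # Sum, row by row, every original column whose name starts with the prefix.
  df[[p]] <- rowSums(df[source_cols[startsWith(source_cols, p)]])
}

print(df)
```

Capturing source_cols before the loop keeps the newly created A, B and C columns out of later sums if the loop is run again.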


Is there a way to change data frame entries in R from numeric to a specific character?

If I have a data frame like so:
df <- data.frame(
  a = c(1,1,1,2,2,2,3,3,3),
  b = c(1,2,3,1,2,3,1,2,3)
)
which looks like this:
> df
a b
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
Is there a quick way to change the columns a and b to match the example below, without explicitly having to type it all out?
> df
a b
a1 b1
a1 b2
a1 b3
a2 b1
a2 b2
a2 b3
a3 b1
a3 b2
a3 b3
In other words, I'm trying to take the name of the column and just place it in front of the value that was in that row originally.
We can use cur_column() to return the current column name within across() and paste (str_c) it in front of each value in that column:
library(dplyr)
library(stringr)
df1 <- df %>%
  mutate(across(everything(), ~ str_c(cur_column(), .)))
# a b
#1 a1 b1
#2 a1 b2
#3 a1 b3
#4 a2 b1
#5 a2 b2
#6 a2 b3
#7 a3 b1
#8 a3 b2
#9 a3 b3
Or using base R:
df[] <- Map(paste0, names(df), df)
Another option is:
df[] <- paste0(names(df)[col(df)], unlist(df))
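To see why the Map version works, the sketch below runs it on the question's data. Map pairs each column name with the matching column and calls paste0 on the pair, and assigning into a copy with [] keeps the data frame shape. The name prefixed is introduced here only for illustration.

```r
df <- data.frame(
  a = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  b = c(1, 2, 3, 1, 2, 3, 1, 2, 3)
)

prefixed <- df   # work on a copy so the original stays numeric
# Map() calls paste0("a", df$a), then paste0("b", df$b);
# prefixed[] <- ... replaces the columns but keeps the data frame structure.
prefixed[] <- Map(paste0, names(df), df)

print(prefixed)
```

Note that the columns become character vectors, so any later arithmetic needs the original df.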

How to write two vectors of different length into one data frame by writing same values into same row?

I want to write two vectors of different length with partly equal values into one data frame. The same values should be written in the same row.
ef1 <- c('A1', 'A2', 'B0', 'B1', 'C1', 'C2')
ef2 <- c('A1', 'A2', 'C1', 'C2', 'D1', 'D2')
If I write them in one data frame, it looks like this:
df <- data.frame (ef1, ef2)
> df
ef1 ef2
1 A1 A1
2 A2 A2
3 B0 C1
4 B1 C2
5 C1 D1
6 C2 D2
But what I want is this:
> df
ef1 ef2
1 A1 A1
2 A2 A2
3 B0 NA
4 B1 NA
5 C1 C1
6 C2 C2
7 NA D1
8 NA D2
I'm grateful for any help.
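One option is match: build the union of both vectors, then look each value up in ef1 and in ef2. Values that are missing from a vector come back as NA.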
(tmp <- unique(c(ef1, ef2)))
# [1] "A1" "A2" "B0" "B1" "C1" "C2" "D1" "D2"
out <- data.frame(ef1 = ef1[match(tmp, ef1)],
ef2 = ef2[match(tmp, ef2)])
# ef1 ef2
#1 A1 A1
#2 A2 A2
#3 B0 <NA>
#4 B1 <NA>
#5 C1 C1
#6 C2 C2
#7 <NA> D1
#8 <NA> D2
Another solution uses dplyr's full_join. The idea is to artificially create a merging column (a), do a full join on it, and then drop it.
library(dplyr)
tibble(a = ef1, ef1) %>%
  full_join(tibble(a = ef2, ef2), by = "a") %>%
  select(-a)
# A tibble: 8 x 2
ef1 ef2
<chr> <chr>
1 A1 A1
2 A2 A2
3 B0 NA
4 B1 NA
5 C1 C1
6 C2 C2
7 NA D1
8 NA D2
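For readers without dplyr, the same full join can be sketched with base R's merge. The helper column key and the names left, right and merged are assumptions made for this example.

```r
ef1 <- c("A1", "A2", "B0", "B1", "C1", "C2")
ef2 <- c("A1", "A2", "C1", "C2", "D1", "D2")

# 'key' is a hypothetical helper column used only for matching.
left  <- data.frame(key = ef1, ef1 = ef1, stringsAsFactors = FALSE)
right <- data.frame(key = ef2, ef2 = ef2, stringsAsFactors = FALSE)

# all = TRUE keeps keys found in either input (a full outer join).
merged <- merge(left, right, by = "key", all = TRUE)
merged$key <- NULL   # drop the helper column again

print(merged)
```

By default merge sorts the result by the key, which happens to give the wanted row order here. For data where the original order matters, the match approach above is the safer choice.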

group 2 variables and then delimit the strings

I am trying to group by two variables and split the comma-separated values without increasing the number of rows.
#my dataframe
> df
g1 g2 g3
1 a1 a2 77.7,81.7
2 a1 a2 77.7,81.7
3 b2 b3 3,1,5
4 b2 b3 3,1,5
5 b2 b3 3,1,5
Expected Output:
g1 g2 g3
1 a1 a2 77.7
2 a1 a2 81.7
3 b2 b3 3
4 b2 b3 1
5 b2 b3 5
I tried the code below, but it does not group the rows and the result is not in the expected format. Please help!
df <- data.frame(g1 = c("a1","a1","b2","b2","b2"), g2 = c("a2","a2","b3","b3","b3"), g3 = c("77.7,81.7","77.7,81.7","3,1,5","3,1,5","3,1,5"))
s <- strsplit(df$g3, split = ",")
data.frame(V1 = rep(df$g1, sapply(s, length)), V2 = unlist(s))
Building on Chris Ruehlemann's answer: you can use the following and it will still work if values reappear.
df$g3_split <- unlist(lapply(split(df,df$g1), function(x) unique(unlist(strsplit(x$g3, ","))) ))
g1 g2 g3 g3_split
1 a1 a2 77.7,81.7 77.7
2 a1 a2 77.7,81.7 81.7
3 b2 b3 3,77.7,5 3
4 b2 b3 3,77.7,5 77.7
5 b2 b3 3,77.7,5 5
df <- data.frame(g1 = c("a1","a1","b2","b2","b2"),
                 g2 = c("a2","a2","b3","b3","b3"),
                 g3 = c("77.7,81.7","77.7,81.7","3,1,5","3,1,5","3,1,5"), stringsAsFactors = FALSE)
df$g3_split <- unique(unlist(strsplit(df$g3, ",")))
g1 g2 g3 g3_split
1 a1 a2 77.7,81.7 77.7
2 a1 a2 77.7,81.7 81.7
3 b2 b3 3,1,5 3
4 b2 b3 3,1,5 1
5 b2 b3 3,1,5 5
If you want to replace g3 with the new values, just assign unique(unlist(strsplit(df$g3, ","))) to df$g3 instead of df$g3_split.
An option with separate_rows:
library(dplyr)
library(tidyr)
df %>%
  mutate(g3_split = g3) %>%
  separate_rows(g3_split) %>%
  distinct(g3_split, .keep_all = TRUE)
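A possible base R variant that does the split per (g1, g2) group, without relying on unique or on the rows being sorted by group, is sketched below. It assumes that every group has exactly as many rows as its string has comma-separated values. The helper group_key is a name invented for this example.

```r
df <- data.frame(g1 = c("a1", "a1", "b2", "b2", "b2"),
                 g2 = c("a2", "a2", "b3", "b3", "b3"),
                 g3 = c("77.7,81.7", "77.7,81.7", "3,1,5", "3,1,5", "3,1,5"),
                 stringsAsFactors = FALSE)

group_key <- paste(df$g1, df$g2)   # hypothetical helper: one key per (g1, g2) pair

# Within each group, split the first string and hand out one piece per row.
df$g3_split <- ave(df$g3, group_key, FUN = function(x) {
  strsplit(x[1], ",", fixed = TRUE)[[1]][seq_along(x)]
})

print(df)
```

Because ave writes the pieces back into the original row positions, a value that occurs in more than one group (such as 77.7 in the second example above) is kept in each of them.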

access first row of group_by dataset

I have a data frame df1 with columns a, b, c. I want to assign c = 0 to the first row of each group returned by group_by(a, b). I tried something like
t <- df1 %>% group_by(a,b) %>% filter(row_number(a)==1) %>% mutate(c= 0)
But it reduced the number of rows. The expected output is
a b c
a1 b1 0
a1 b1 NA
a2 b2 0
a2 b2 NA
You can use seq_along to number elements in each group from 1 to the total number of elements within each group (2, in this case). Then use ifelse to set the first element of 'c' for each group to be 0 and leave the other element as is.
df %>%
group_by(a, b) %>%
mutate(c = ifelse(seq_along(c) == 1, 0, c))
# A tibble: 4 x 3
# Groups: a, b [2]
# a b c
# <fct> <fct> <dbl>
#1 a1 b1 0.
#2 a1 b1 NA
#3 a2 b2 0.
#4 a2 b2 NA
df <- data.frame(a = rep(c("a1", "a2"), each = 2),
b = rep(c("b1", "b2"), each = 2),
c = NA)
# a b c
#1 a1 b1 NA
#2 a1 b1 NA
#3 a2 b2 NA
#4 a2 b2 NA
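Without dplyr, the first row of each group can also be found with duplicated. The following is a minimal sketch on the same example data, where first_in_group is a name chosen here for illustration.

```r
df <- data.frame(a = rep(c("a1", "a2"), each = 2),
                 b = rep(c("b1", "b2"), each = 2),
                 c = NA)

# duplicated() is FALSE for the first occurrence of each (a, b) combination.
first_in_group <- !duplicated(df[c("a", "b")])
df$c[first_in_group] <- 0   # all other rows keep their current value

print(df)
```

This keeps every row, unlike the filter attempt in the question, and it does not require the rows of a group to be adjacent.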

how to sort data frame by column names in R?

How can I sort the below data frame df to df1?
a1 a4 a3 a5 a2
sorted data frame
a1 a2 a3 a4 a5
We can use mixedorder from the gtools package:
library(gtools)
df1 <- df[mixedorder(colnames(df))]
# a1 a3 a9 a10
#1 1 3 1 2
#2 2 4 2 3
#3 3 5 3 4
#4 4 6 4 5
#5 5 7 5 6
df <- data.frame(a1 = 1:5, a10=2:6, a3 = 3:7, a9= 1:5)
In base R, assuming the numbers in the column names don't go into double digits (order() sorts the names as strings, so a10 would come before a2):
# a1 a4 a3 a5 a2
#1 1 4 3 5 2
df[, order(names(df))]
# a1 a2 a3 a4 a5
#1 1 2 3 4 5
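If the suffixes can reach double digits, one base R workaround is to order by the numeric part of each name instead of the full string. The sketch below assumes every column name contains exactly one run of digits. The names suffix and sorted are introduced here only for the example.

```r
df <- data.frame(a1 = 1:5, a10 = 2:6, a3 = 3:7, a9 = 1:5)

# Strip every non-digit character, then order by the remaining number.
suffix <- as.numeric(gsub("[^0-9]", "", names(df)))
sorted <- df[order(suffix)]

print(names(sorted))
```

Columns with different letter prefixes would need a second sort key, which this sketch does not handle.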
Assuming there is no "hole" in the numbers suffixing the column names, you can also use dplyr:
library(dplyr)
df1 <- select(df, num_range("a", 1:5))
