Condensing data frame with same names and different values [duplicate] - r

This question already has answers here:
How to use Aggregate function in R
(3 answers)
How to sum a variable by group
(18 answers)
Closed 5 years ago.
I have a data frame that I am trying to condense. There are multiple values of X with the same name but with different Y values associated with them:
X Y
1 a 1
2 b 3
3 a 2
4 c 4
5 b 7
I want to condense the data frame so there are no duplicate names in X, like below:
X Y
1 a 3
2 b 10
3 c 4

Using tidyverse:
library(tidyverse)
# Group the rows by name, then sum Y within each group.
df <- df %>%
  group_by(X) %>%
  summarise(Y = sum(Y))
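The same condensing can probably also be done without any packages. Here is a minimal sketch with base R's aggregate, assuming the data frame is called df and has the columns X and Y shown in the question (the name condensed is made up for illustration):

```r
# Rebuild the example data from the question.
df <- data.frame(X = c("a", "b", "a", "c", "b"),
                 Y = c(1, 3, 2, 4, 7))

# Sum Y within each distinct value of X.
condensed <- aggregate(Y ~ X, data = df, FUN = sum)
print(condensed)
```

The formula Y ~ X reads as "Y grouped by X", so swapping FUN for mean or max gives other per-group summaries.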

Related

How can I sort a column alphabetically in a data frame? [duplicate]

This question already has answers here:
Sort (order) data frame rows by multiple columns
(19 answers)
Closed 1 year ago.
I'm beginning with R and I have a question.
I have this:
x <- data.frame(x0=c(1:10), x1=c("z", "a","a","a","a","a","c","b","b","b"))
So basically two columns. I want to sort the rows alphabetically by x1 while keeping each row intact,
so that the row 1 - z (both x0 and x1) ends up last.
I've tried sort(), but that only sorts the column x1 on its own, not x0 and x1 together.
Thanks
In base R you can subset and order:
x[order(x$x1),]
x0 x1
2 2 a
3 3 a
4 4 a
5 5 a
6 6 a
8 8 b
9 9 b
10 10 b
7 7 c
1 1 z
With dplyr you can use arrange:
library(dplyr)
x %>%
  arrange(x1)
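If ties within x1 should also come out in a particular order, order() accepts more than one key. A small sketch, assuming you want x1 ascending and x0 descending within each letter (the name sorted is just for illustration):

```r
# Rebuild the example data from the question.
x <- data.frame(x0 = 1:10,
                x1 = c("z", "a", "a", "a", "a", "a", "c", "b", "b", "b"))

# Sort by x1 ascending; break ties by x0 descending (negating a numeric key reverses it).
sorted <- x[order(x$x1, -x$x0), ]
print(sorted)
```

Negating a key only works for numeric columns; for character keys, order() has a decreasing argument instead.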

Reshaping dataframe to list values over unique id - back and forth [duplicate]

This question already has answers here:
Collapse text by group in data frame [duplicate]
(2 answers)
Collapse / concatenate / aggregate a column to a single comma separated string within each group
(6 answers)
Closed 3 years ago.
I want to condense information in a dataframe to reduce the number of rows.
Consider the dataframe:
df <- data.frame(id=c("A","A","A","B","B","C","C","C"),b=c(4,5,6,1,2,7,8,9))
df
id b
1 A 4
2 A 5
3 A 6
4 B 1
5 B 2
6 C 7
7 C 8
8 C 9
I want to collapse the dataframe to all unique values of "id" and list the values in variable b. The result should look like
df.results <- data.frame(id=c("A","B","C"),b=c("4,5,6","1,2","7,8,9"))
df.results
id b
1 A 4,5,6
2 B 1,2
3 C 7,8,9
A solution for the first step is:
library(dplyr)
df.results <- df %>%
  group_by(id) %>%
  summarise(b = toString(b)) %>%
  ungroup()
How would you turn df.results back into df?
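One possible way to go back, sketched in base R: split each collapsed string on the comma and repeat the id once per piece. This assumes the values are separated by a comma with optional whitespace (toString inserts ", "), and the names parts and df.back are made up for illustration:

```r
# The collapsed data frame from the question.
df.results <- data.frame(id = c("A", "B", "C"),
                         b = c("4,5,6", "1,2", "7,8,9"),
                         stringsAsFactors = FALSE)

# Split each string on a comma followed by optional whitespace.
parts <- strsplit(df.results$b, ",\\s*")

# Repeat each id once per extracted value and convert the values back to numbers.
df.back <- data.frame(id = rep(df.results$id, lengths(parts)),
                      b = as.numeric(unlist(parts)),
                      stringsAsFactors = FALSE)
print(df.back)
```

If tidyr is available, its separate_rows function is likely a shorter route to the same result.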

How to sum a specific column of replicate rows in dataframe? [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
How to group by two columns in R
(4 answers)
Closed 3 years ago.
I have a data frame which contains a lot of replicated rows. I would like to sum up the last column of the replicated rows and remove the replicates at the same time. Could anyone tell me how to do that?
The example is here:
name <- c("a","b","c","a","c")
position <- c(192,7,6,192,99)
score <- c(1,2,3,2,5)
df <- data.frame(name,position,score)
> df
name position score
1 a 192 1
2 b 7 2
3 c 6 3
4 a 192 2
5 c 99 5
#I would like to sum the score together if the first two columns are the
#same. The ideal result is like this way
name position score
1 a 192 3
2 b 7 2
3 c 6 3
4 c 99 5
Sincerely thank you for the help.
Try this:
library(dplyr)
# Rows count as replicates when both name and position match.
df %>%
  group_by(name, position) %>%
  summarise(score = sum(score, na.rm = TRUE))
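For comparison, here is a sketch of the same grouping on two columns with base R's aggregate, using the example data from the question (the name res is made up for illustration):

```r
# Rebuild the example data from the question.
name <- c("a", "b", "c", "a", "c")
position <- c(192, 7, 6, 192, 99)
score <- c(1, 2, 3, 2, 5)
df <- data.frame(name, position, score)

# Sum score within each distinct (name, position) pair.
res <- aggregate(score ~ name + position, data = df, FUN = sum)
print(res)
```

The rows of the result may come out in a different order than in the dplyr version, so sort afterwards if the order matters.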

Countif function in R by fixing the range column [duplicate]

This question already has answers here:
Count number of rows per group and add result to original data frame
(11 answers)
Closed 5 years ago.
How can I do a COUNTIF in R?
In Excel we can write the formula as
"COUNTIF($K$2:$K$205,K2),COUNTIF($K$2:$K$205,K3),.... "
How can I do the same in R, so that each row gets the count shown in the Countif column below?
value col Countif
1 A 3
1 A 3
1 A 3
4 A 2
4 A 2
3 A 1
99 B 2
99 B 2
1000 B 4
1000 B 4
1000 B 4
1000 B 4
We can use the convenient add_count function from dplyr:
library(dplyr)
df1 %>%
  add_count(value, col)
which is similar to the following, except that add_count names the new column n by default:
df1 %>%
  group_by(value, col) %>%
  mutate(count = n())
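A base R sketch of the same idea uses ave, which applies a function within groups and returns a vector as long as the input. The data frame df1 is rebuilt here from the table in the question, and the column name Countif is taken from there:

```r
# Rebuild the example data from the question.
df1 <- data.frame(value = c(1, 1, 1, 4, 4, 3, 99, 99, 1000, 1000, 1000, 1000),
                  col = c(rep("A", 6), rep("B", 6)))

# For every row, count how many rows share the same value and col.
df1$Countif <- ave(df1$value, df1$value, df1$col, FUN = length)
print(df1)
```

The first argument to ave only supplies the vector to be overwritten with the counts; the remaining arguments define the groups.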

R join same row and calculate mean value [duplicate]

This question already has answers here:
Grouping functions (tapply, by, aggregate) and the *apply family
(10 answers)
Closed 7 years ago.
I have a data frame that looks like this:
data <- data.frame(y = c(1, 1, 2, 2, 3, 4, 5, 5), x = c(5, 5, 10, 30, 5, 10, 4, 8))
y x
1 1 5
2 1 5
3 2 10
4 2 30
5 3 5
6 4 10
7 5 4
8 5 8
How can I merge the rows that have the same value in the y column and replace the x value with the mean of those rows?
I would like something like this:
y x
1 1 5
2 2 20
3 3 5
4 4 10
7 5 6
I'm trying:
unique(data)
But it only drops duplicated rows instead of taking the mean of the rows with the same y.
It is easy with dplyr, like this:
library("dplyr")
data %>%
  group_by(y) %>%
  summarise(x = mean(x))
We can use aggregate
aggregate(x~y, data, mean)
Use plyr:
# Create dummy data.
nel <- 30
df <- data.frame(x = round(5 * runif(nel)), y = round(10 * runif(nel)))
# Summarise the mean of y for each value of x.
library(plyr)
df$x <- as.factor(df$x)
res <- ddply(df, .(x), summarise, mu = mean(y))
