Consolidate rows with same value using R [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 4 years ago.
I have a CSV file with 4 columns. I would like to sum the values in the 4th column across rows where the 3rd column has the same value.
The data that I have looks like this:
Now I want to aggregate the above data like this:
Can anyone help me with ideas?

Using aggregate would do the trick here. Below I'm summing the value variable using id as the group (notice ids 6 and 10 are repeating).
df <- data.frame(id = c(1,2,3,4,5,6,6,7,8,9,10,10),
value = c(9,5,6,8,4,3,2,5,3,5,1,2))
df_sum <- aggregate(value ~ id, data=df, FUN=sum)
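The same formula interface carries over to the four-column layout described in the question. A minimal sketch, assuming hypothetical column names (a, b, key, amount) and made-up values, since the original data was not shown; with a real file the data frame would come from read.csv() instead:

```r
# Hypothetical stand-in for the CSV; column names and values are made up.
csv_data <- data.frame(
  a      = c("x", "x", "y", "y", "z"),
  b      = c(1, 2, 3, 4, 5),
  key    = c("k1", "k1", "k2", "k2", "k3"),  # 3rd column: grouping key
  amount = c(10, 20, 5, 5, 7)                # 4th column: values to add up
)

# Sum the 4th column within each distinct value of the 3rd column.
totals <- aggregate(amount ~ key, data = csv_data, FUN = sum)
print(totals)
```

Columns that are not named in the formula (a and b here) are dropped from the result, which is usually what is wanted when collapsing rows.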

Related

How to add a new column to calculate mean for each group using dplyr in R [duplicate]

This question already has answers here:
Adding a column of means by group to original data [duplicate]
(4 answers)
Closed 2 years ago.
I have a table with 2 columns.
Type: 1 or 2 or 3 or 4
Data: corresponding data (there are multiple data for each type)
Now I want to create a third column that contains the mean of the data for each type, i.e., all the rows with type 1 have the same mean value. I think I should do it with the mutate function, but I am not sure how to proceed.
data %>% mutate(meanData = ifelse(...))
Can somebody help?
Thank you in advance.
We can do a group by operation
library(dplyr)
data <- data %>%
  group_by(Type) %>%
  mutate(meanData = mean(Data))
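If dplyr is not available, base R's ave() gives a comparable result. This is an alternative to the answer above rather than part of it; the sketch below uses made-up values for the Type and Data columns:

```r
# Toy data; the values are made up for illustration.
data <- data.frame(Type = c(1, 1, 2, 2, 3),
                   Data = c(10, 20, 30, 50, 7))

# ave() returns a vector as long as its input, holding each row's group
# statistic; its default FUN is mean.
data$meanData <- ave(data$Data, data$Type)
print(data)
```

Unlike the dplyr version, the result stays an ordinary ungrouped data frame.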

How to add together duplicate values in columns? [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 3 years ago.
I have three columns: loan_id, amount, and date. I have 1,048,575 entries, and I need to combine all the duplicates in the loan_id column (there are different payments on the same loan_id). In the resulting table, the amount values should be added together for each loan_id.
A sample of my data looks like this:
Try
aggregate(df$amount,list(df$loan_id),sum)
So you want the total amount for each loan_id irrespective of date?
One way to do aggregate functions like this in R is by using the data.table package.
library(data.table)
# assuming you start with a data.frame
mydata = data.table(mydata)
mydata[,sum(amount), by=loan_id]

R sum on rows based on day [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 5 years ago.
I have a data frame containing sales by day. Because there are many values for each day, I'd like to rebuild my data frame so that it has just one value per day: the sum of all the sales for that day.
What I have:
What I'd like to have:
(The values are not calculated correctly; they are just for the example.)
I tried aggregate and similar functions, but it doesn't work and I don't know how to do this.
Thanks for the help.
Aggregate should work
# assuming the data frame is named df
aggregate(df$TOT_OP_MAIN_PAID_MNT, by = list(DATE_ID = df$DATE_ID), FUN = sum)
This should work
df <- data.frame(DATE_ID = 1, TOT_OP_MAIN_PAID_MNT = 1)
aggregate(TOT_OP_MAIN_PAID_MNT ~ DATE_ID, df, sum)

How can I group by name in R and apply sum to the other 2 columns? [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 6 years ago.
I am trying to summarize this dataset by grouping by name (Almeria, Ath Bilbao,...) and have the sum of its corresponding values in column 2 (HalfTimeResult) and 3 (FullTimeResult). I tried with the aggregate and group_by functions but have not been able to obtain the right output.
Which function should I use, and how, to obtain an output like this?
This is the dataset that I am working with:
We can use data.table
library(data.table)
setDT(df1)[, lapply(.SD, sum), by = HomeTeam]
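A base R alternative to the data.table call above is aggregate() with cbind() on the left-hand side of the formula. A minimal sketch, using the column names mentioned in the question and made-up values, since the original dataset was not shown:

```r
# Toy data; the team names come from the question, the numbers are made up.
df1 <- data.frame(
  HomeTeam       = c("Almeria", "Ath Bilbao", "Almeria", "Ath Bilbao"),
  HalfTimeResult = c(1, 0, 2, 1),
  FullTimeResult = c(2, 1, 3, 1)
)

# cbind() lets a single aggregate() call sum several columns per group.
team_totals <- aggregate(cbind(HalfTimeResult, FullTimeResult) ~ HomeTeam,
                         data = df1, FUN = sum)
print(team_totals)
```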

How to take a data.frame and create a table of only the Max values? [duplicate]

This question already has answers here:
How to find the highest value of a column in a data frame in R?
(10 answers)
Closed 7 years ago.
I have the following dataframe:
I would like to create a table of the column headers, in a column with their maximum value to the right of them. I will be looking to embed this in a Shiny App. Does anyone know how I can do this?
I used:
colMax <- function(data) sapply(data, max, na.rm=TRUE)
Then I called as.data.frame(colMax(data)) and it worked as needed.
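To get the layout described in the question (column headers on the left, maxima on the right), the named vector can be split into two columns. A minimal sketch with a made-up data frame; the names scores and max_table are hypothetical:

```r
# Helper from the post: maximum of every column, ignoring NA values.
colMax <- function(data) sapply(data, max, na.rm = TRUE)

# Toy data; the values are made up for illustration.
scores <- data.frame(a = c(1, 5, 3), b = c(9, NA, 2), c = c(4, 4, 8))

maxima <- colMax(scores)  # named numeric vector, one entry per column

# Two-column table: column header on the left, its maximum on the right.
max_table <- data.frame(column = names(maxima), max = unname(maxima))
print(max_table)
```

This assumes every column is numeric; max() stops with an error on factor columns.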
Duplicate: How to find the highest value of a column in a data frame in R?
