This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 5 years ago.
I have a data frame containing sales by day, but there are many values for each day, so I'd like to rebuild my data frame to have just one value per day: the sum of all the sales for that day.
What I have:
What I'd like to have:
(The values are not calculated correctly; they are only for illustration.)
I tried aggregate and similar functions, but it doesn't work and I don't know how to do this.
Thanks for your help.
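aggregate should work, as long as the columns are referenced through the data frame (assumed here to be named df):
aggregate(df$TOT_OP_MAIN_PAID_MNT, by = list(DATE_ID = df$DATE_ID), FUN = sum)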
This should work
df <- data.frame(DATE_ID = 1, TOT_OP_MAIN_PAID_MNT = 1)
aggregate(TOT_OP_MAIN_PAID_MNT ~ DATE_ID, df, sum)
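The one-row data frame above does not show any grouping, so here is a minimal sketch with several sales per day. The values are made up for illustration; only the column names come from the question.

```r
# Made-up sample: several sales on the same day (values are illustrative only)
df <- data.frame(
  DATE_ID = c(1, 1, 2, 2, 2, 3),
  TOT_OP_MAIN_PAID_MNT = c(10, 5, 3, 4, 1, 7)
)

# One row per DATE_ID, with the paid amounts summed within each day
daily <- aggregate(TOT_OP_MAIN_PAID_MNT ~ DATE_ID, data = df, FUN = sum)
daily
```

The result has one row per distinct DATE_ID, sorted by DATE_ID.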
This question already has answers here:
Grouping functions (tapply, by, aggregate) and the *apply family
(10 answers)
Closed 2 years ago.
The given data is like this:
df<-data.frame(farmer=c("F1","F1","F1","F2","F2","F2","F3","F4","F4"),
c2=c(4,4,5,3,3,3,1,2,1))
df
Question: I would like to count the farmers whose values in the c2 column are all the same.
The expected output for this data is 2 (F2 and F3). How can I do this in code? Which function should be used?
You can try the code below
nrow(unique(subset(df,ave(c2,farmer)==c2)))
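Here ave(c2, farmer) returns the per-farmer mean of c2 (the mean is ave's default function), so subset keeps only the rows whose c2 equals that mean, and nrow + unique counts the distinct rows that remain, which is one per matching farmer in this data.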
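Comparing against the group mean can misfire on other data: a farmer with values 1, 2, 3 has mean 2, so the middle row would pass the filter. A possible alternative, sketched here with base R's tapply, checks directly whether each farmer has a single distinct value. The names all_same and n_same are just illustrative.

```r
df <- data.frame(farmer = c("F1", "F1", "F1", "F2", "F2", "F2", "F3", "F4", "F4"),
                 c2 = c(4, 4, 5, 3, 3, 3, 1, 2, 1))

# TRUE for each farmer whose c2 values are all identical
all_same <- tapply(df$c2, df$farmer, function(x) length(unique(x)) == 1)

# Count those farmers
n_same <- sum(all_same)
n_same
```

names(all_same)[all_same] lists which farmers qualified.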
This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 3 years ago.
I have three columns: loan_id, amount, date. I have 1,048,575 entries and I need to combine all the duplicates in the loan_id column (there are different payments on the same loan_id), so that a second table holds the amount values summed for each loan_id.
A sample of my data looks like this:
Try
aggregate(df$amount,list(df$loan_id),sum)
So you want the total amount for each loan_id irrespective of date?
One way to do aggregate functions like this in R is by using the data.table package.
library(data.table)
# assuming you start with a data.frame
mydata = data.table(mydata)
mydata[,sum(amount), by=loan_id]
This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 4 years ago.
I have a CSV file with 4 columns. I would like to sum the values of the 4th column wherever the values of the 3rd column are the same.
The data that I have looks like this:
Now I want to aggregate the above data like this:
Can anyone help me with ideas?
Using aggregate would do the trick here. Below I'm summing the value variable using id as the group (notice ids 6 and 10 are repeating).
df <- data.frame(id = c(1,2,3,4,5,6,6,7,8,9,10,10),
value = c(9,5,6,8,4,3,2,5,3,5,1,2))
df_sum <- aggregate(value ~ id, data=df, FUN=sum)
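Since the question refers to columns by position (3rd and 4th) rather than by name, the same idea can be sketched with positional indexing. The data frame below is a made-up stand-in for the result of read.csv(), and all column names are hypothetical.

```r
# Made-up stand-in for a 4-column CSV; column names are hypothetical
df <- data.frame(a = 1:4,
                 b = c("p", "q", "r", "s"),
                 key = c("A", "A", "B", "B"),
                 amount = c(1, 2, 3, 4))

# Sum the 4th column within each distinct value of the 3rd column
totals <- aggregate(df[[4]], by = list(key = df[[3]]), FUN = sum)
names(totals)[2] <- "total"  # aggregate names the summed column "x" by default
totals
```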
This question already has answers here:
Convert continuous numeric values to discrete categories defined by intervals
(2 answers)
Closed 6 years ago.
I have a data frame in R that has a personal ID, an income and some other variables. I would like to add a new column to this data that assigns each person to an income group (0-24,999, 25,000-49,999, 50,000-74,999, 75,000-99,999, etc.).
I then want to be able to make frequency tables of this data compared with some of the other variables (eg: weekly hours worked, age).
I should be fine to figure out the latter of these problems, but I am having trouble figuring out how to categorise my data. Any help would be greatly appreciated.
Thank you.
We can use cut or findInterval to group the "Variable"
gr <- cut(df1$Variable, breaks = c(0, 24999, 49999,74999,99999, Inf))
Then, use table to get the frequency count
table(gr, df1$age)
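One detail worth checking: with cut's default right = TRUE, the intervals are open on the left, so a value of exactly 0 falls outside the first interval and becomes NA. A possible variant, sketched below, uses round breakpoints with right = FALSE so each group is closed on the left. The data and the column names income and age are assumptions for illustration.

```r
# Hypothetical data; the column names income and age are assumptions
df1 <- data.frame(income = c(0, 12000, 25000, 49999, 60000, 99999, 150000),
                  age = c(23, 35, 41, 29, 52, 38, 47))

# Left-closed intervals [0, 25000), [25000, 50000), ... so that
# 0 and 25000 each land in the intended group
df1$income_group <- cut(df1$income,
                        breaks = c(0, 25000, 50000, 75000, 100000, Inf),
                        labels = c("0-24,999", "25,000-49,999", "50,000-74,999",
                                   "75,000-99,999", "100,000+"),
                        right = FALSE)

# Frequency count per income group
table(df1$income_group)
```

Passing a second variable to table, as in the answer above, gives the cross-tabulation.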
This question already has answers here:
Select rows with min value by group
(10 answers)
Closed 9 years ago.
I have a dataframe with three columns: Batch, Trial, Time.
Five Trials (0-4) are run for each Batch number.
I want to pull out the row with the smallest time from each Batch and put them into a new dataframe.
I'm not sure where to start.
Assuming the data frame is named df, try:
df.new <- df[df$Time == ave(df$Time, df$Batch, FUN = min), ]
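To see what this selects, here is a small sketch on made-up data. The Time values are illustrative only; the column names come from the question.

```r
# Made-up data: two batches, trials 0-4 (times are illustrative only)
df <- data.frame(Batch = rep(c(1, 2), each = 5),
                 Trial = rep(0:4, times = 2),
                 Time = c(5.2, 4.8, 6.1, 4.9, 5.5, 3.9, 4.4, 3.7, 4.0, 4.2))

# Keep the rows whose Time equals the minimum Time within their Batch
df.new <- df[df$Time == ave(df$Time, df$Batch, FUN = min), ]
df.new
```

If two trials in a batch tie for the minimum, both rows are kept.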