Calculate the sum of observation values with conditions [duplicate] - r

This question already has answers here:
Grouping functions (tapply, by, aggregate) and the *apply family
(10 answers)
Closed 2 years ago.
The given data is like this:
df<-data.frame(farmer=c("F1","F1","F1","F2","F2","F2","F3","F4","F4"),
c2=c(4,4,5,3,3,3,1,2,1))
df
Question: I would like to count the farmers whose values in the c2 column are all the same.
The expected output for this data is 2 (farmers F2 and F3). How can I do this in code? Which function should I use?

You can try the code below
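nrow(unique(subset(df, ave(c2, farmer, FUN = function(x) length(unique(x))) == 1)))
where subset + ave keeps only the rows of farmers whose c2 values are all identical (ave returns, for every row, the number of distinct c2 values in that farmer's group), and nrow + unique counts those farmers.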
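As an alternative sketch (not part of the answer above), the same count can be computed per group with tapply, flagging each farmer whose c2 values collapse to a single distinct value. The names same_c2 and n_same are illustrative assumptions.

```r
# Data from the question
df <- data.frame(farmer = c("F1", "F1", "F1", "F2", "F2", "F2", "F3", "F4", "F4"),
                 c2 = c(4, 4, 5, 3, 3, 3, 1, 2, 1))

# TRUE for each farmer whose c2 values are all identical
same_c2 <- tapply(df$c2, df$farmer, function(x) length(unique(x)) == 1)

# Number of such farmers
n_same <- sum(same_c2)
n_same
```

Because same_c2 is named by farmer, it also shows which farmers qualify, not only how many.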

Related

How do I calculate the mean of selected columns? [duplicate]

This question already has answers here:
How to get the mean of specific columns in dataframe and store in vector (in R)
(1 answer)
Find mean of multiple columns in R
(4 answers)
Closed 2 months ago.
I have a large dataset. I want to calculate the means of some of its columns together. I am not sure how to use colMeans() for this; I have only found how to calculate means for categories and rows.
Take R's built-in iris dataset as an example.
colMeans(iris[, 1:3], na.rm = TRUE)  # select columns 1 to 3
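Columns can also be selected by name rather than by position, which tends to be more robust if the column order changes. A minimal sketch on the same built-in dataset; the variable names cols and m are illustrative assumptions.

```r
# Pick the columns to average by name
cols <- c("Sepal.Length", "Petal.Length")

# colMeans returns a named numeric vector, one mean per selected column
m <- colMeans(iris[, cols], na.rm = TRUE)
m
```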

Count occurrences of value in a set of variables in R (per column) [duplicate]

This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 1 year ago.
I have this data and I want to find out how many ones and how many zeros are in each column (e.g. Arts and Crafts). I have been trying different things but nothing has worked. Does anyone have any suggestions?
You can use the table() function in R, which tabulates how often each distinct value occurs. I have also used unlist() here to convert the list to a vector.
df1 <- read.csv("Your_CSV_file_name_here.csv")
table(unlist(df1$ArtsAndCrafts))
If you want to count the zeros and ones row-wise instead, you can refer to this question on Stack Overflow.
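To get the counts for every column at once, table() can be applied column by column with sapply. The sketch below uses a small made-up data frame, since the original data is not shown; the Sports column and all values are assumptions.

```r
# Hypothetical toy data standing in for the CSV from the question
df1 <- data.frame(ArtsAndCrafts = c(1, 0, 1, 1),
                  Sports = c(0, 0, 1, 0))

# Fixing the factor levels keeps a row for 0 and 1 even if one never occurs
counts <- sapply(df1, function(x) table(factor(x, levels = c(0, 1))))
counts
```

The result is a matrix with one row per value ("0", "1") and one column per variable.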

How to add together duplicate values in columns? [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 3 years ago.
I have three columns: loan_id, amount, and date. I have 1,048,575 entries and I need to combine all the duplicates in the loan_id column (there are different payments on the same loan_id), so that in the second table the amount values are added together for each loan_id.
A sample of what my data looks like:
Try
aggregate(df$amount,list(df$loan_id),sum)
So you want the total amount for each loan_id irrespective of date?
One way to do aggregate functions like this in R is by using the data.table package.
library(data.table)
# assuming you start with a data.frame
mydata = data.table(mydata)
mydata[,sum(amount), by=loan_id]
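For comparison, a base R sketch using the formula interface of aggregate, which keeps the original column names in the result. The toy data below is made up for illustration and only mirrors the column names from the question.

```r
# Hypothetical toy data standing in for the real loan table
mydata <- data.frame(loan_id = c(101, 101, 102, 103, 103),
                     amount = c(50, 25, 10, 5, 5),
                     date = as.Date("2020-01-01") + 0:4)

# Sum amount within each loan_id; the date column is ignored
totals <- aggregate(amount ~ loan_id, data = mydata, FUN = sum)
totals
```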

R sum on rows based on day [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 5 years ago.
I have a data frame containing sales by day, but there are many values for each day, so I'd like to rebuild my data frame to have just one value per day: the sum of all the sales on that day.
What I have:
What I'd like to have:
(the values are not calculated correctly, but it's just for the example)
I tried aggregate and similar functions, but it doesn't work and I don't know how to do this...
Thanks for your help
Aggregate should work
# assuming the data frame is called df
aggregate(df$TOT_OP_MAIN_PAID_MNT, by = list(DATE_ID = df$DATE_ID), FUN = sum)
This should work
df <- data.frame(DATE_ID = 1, TOT_OP_MAIN_PAID_MNT = 1)
aggregate(TOT_OP_MAIN_PAID_MNT ~ DATE_ID, df, sum)

How can I group by name in R and apply sum to the other 2 columns? [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 6 years ago.
I am trying to summarize this dataset by grouping by name (Almeria, Ath Bilbao,...) and have the sum of its corresponding values in column 2 (HalfTimeResult) and 3 (FullTimeResult). I tried with the aggregate and group_by functions but have not been able to obtain the right output.
What function and how would I use it to obtain an output like this?
This is the dataset that I am working with:
We can use data.table
library(data.table)
setDT(df1)[, lapply(.SD, sum), by = HomeTeam]
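A base R sketch of the same grouping, in case data.table is not available: cbind on the left-hand side of the formula lets aggregate sum several columns at once. The rows below are made up for illustration; only the column names come from the question and the answer.

```r
# Hypothetical toy data using the column names from the question
df1 <- data.frame(HomeTeam = c("Almeria", "Ath Bilbao", "Almeria", "Ath Bilbao"),
                  HalfTimeResult = c(1, 0, 2, 1),
                  FullTimeResult = c(2, 1, 3, 1))

# Sum both result columns within each team
res <- aggregate(cbind(HalfTimeResult, FullTimeResult) ~ HomeTeam,
                 data = df1, FUN = sum)
res
```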
