Count for every unique value in a column - R [duplicate]

This question already has answers here:
How to count the number of unique values by group? [duplicate]
(1 answer)
count number of rows in a data frame in R based on group [duplicate]
(8 answers)
Closed 2 years ago.
I have a dataframe that contains a column representing the 'Year' and another column that represents 'Type':
a Year Creams
1 2004 11
2 2004 12
3 2001 13
4 2004 14
5 2002 15
. .... ..
How do I count the occurrences of each year in the 'Year' column so that the result looks like this:
a Year TypeCount
1 2004 3
2 2002 1
3 2001 1
The result can go into another data frame, I don't mind. I just need it in a form suitable for making a graph at the end.
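One possible base R approach is sketched below. The sample data only reproduces the five rows shown in the question, and the names `df` and `counts` are assumptions chosen for illustration.

```r
# Sample data copied from the question (only the five rows shown).
df <- data.frame(
  Year   = c(2004, 2004, 2001, 2004, 2002),
  Creams = c(11, 12, 13, 14, 15)
)

# table() tallies each distinct Year; as.data.frame() turns the tally
# into a two-column data frame (Year comes back as a factor).
counts <- as.data.frame(table(Year = df$Year), responseName = "TypeCount")

# Optional: put the most frequent year first.
counts <- counts[order(-counts$TypeCount), ]
counts
```

The resulting data frame can be handed to a plotting function, for example `barplot(counts$TypeCount, names.arg = as.character(counts$Year))`.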

Related

keep unique value for each groups using R [duplicate]

This question already has answers here:
Select the first row by group
(8 answers)
Collapsing data frame by selecting one row per group
(4 answers)
Select the row with the maximum value in each group
(19 answers)
Closed 2 months ago.
I want to keep only the unique values for each person.
I have this data frame:
name size
john 16
khaled 15
john 15
Alex 16
john 16
In the output, I need to remove the duplicated size values for each name:
name size
john 16
khaled 15
john 15
Alex 16
What is the best function or library to do that?
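A minimal base R sketch of one way to do this, assuming the data frame is called `df` and that "duplicate" means a repeated (name, size) pair; `result` is a hypothetical name:

```r
# Sample data rebuilt from the question.
df <- data.frame(
  name = c("john", "khaled", "john", "Alex", "john"),
  size = c(16, 15, 15, 16, 16)
)

# duplicated() flags every row whose (name, size) pair has already
# appeared earlier, so negating it keeps the first occurrence of each pair.
result <- df[!duplicated(df[, c("name", "size")]), ]
result
```

Because these are the only two columns, `unique(df)` should give the same rows; the `duplicated()` form is useful when the data frame has further columns that must not take part in the comparison.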

How to create a dataset of ids that get treated at some point in time? (R) [duplicate]

This question already has answers here:
Select groups which have at least one of a certain value
(3 answers)
How to keep all instances of a column ID if a specific value is found in another column 1 or more times [duplicate]
(2 answers)
Closed 11 months ago.
I have a longitudinal dataset with 3 important variables: ID, Year, Treatment
I would like to keep all the IDs that get treated at some point in time and drop all the IDs that never get treated. How do I do this in R?
Example:
ID Year Treatment
0001 2000 0
0001 2001 0
0001 2002 0
0002 2000 0
0002 2001 0
0002 2002 1
I would like to keep all observations of ID 0002 (treated at some point in time), but drop all of ID 0001 (never treated). I have a very big dataset with many more IDs than that, so I cannot do this manually.
Thanks in advance.
Find the IDs that have treatment, then subset those IDs:
d[ d$ID %in% unique(d[ d$Treatment == 1, "ID" ]), ]
# ID Year Treatment
# 4 0002 2000 0
# 5 0002 2001 0
# 6 0002 2002 1
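The same filter can also be written as a per-group test. This is only a sketch: the sample data is rebuilt from the question, with ID stored as character so the leading zeros survive, and `ever_treated` and `treated` are hypothetical names.

```r
# Sample data rebuilt from the question.
d <- data.frame(
  ID        = rep(c("0001", "0002"), each = 3),
  Year      = rep(2000:2002, times = 2),
  Treatment = c(0, 0, 0, 0, 0, 1)
)

# ave() applies any() within each ID and spreads the result back over
# that ID's rows, giving one TRUE/FALSE per row of d.
ever_treated <- ave(d$Treatment == 1, d$ID, FUN = any)

# Keep every row belonging to an ID that was treated at least once.
treated <- d[ever_treated, ]
treated
```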

How to sum a variable by group but do not aggregate the data frame in R? [duplicate]

This question already has answers here:
Count number of rows per group and add result to original data frame
(11 answers)
Calculate group mean, sum, or other summary stats. and assign column to original data
(4 answers)
Closed 4 years ago.
Although I have found a lot of ways to calculate the sum of a variable by group, all the approaches end up creating a new data set which aggregates the duplicate cases.
To be more precise, if I have a data frame:
id year
1 2010
1 2015
1 2017
2 2011
2 2017
3 2015
and I want to count the number of times I have the same ID by the different years, there are a lot of ways (using aggregate, tapply, dplyr, sqldf etc) which use a "group by" kind of functionality that in the end will give something like:
id count
1 3
2 2
3 1
I haven't managed to find a way to calculate the same thing but keep my original data frame, in order to obtain:
id year count
1 2010 3
1 2015 3
1 2017 3
2 2011 2
2 2017 2
3 2015 1
and therefore not aggregate my duplicate cases.
Has anybody already figured this out?
Thank you in advance
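One way to get a per-group count without collapsing the rows is base R's `ave()`. This is a minimal sketch, assuming the data frame is named `df`:

```r
# Sample data copied from the question.
df <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3),
  year = c(2010, 2015, 2017, 2011, 2017, 2015)
)

# ave() computes length() within each id and repeats the result on
# every row of that id, so the data frame keeps all of its rows.
df$count <- ave(df$id, df$id, FUN = length)
df
```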

R counter, counting frequency in a table [duplicate]

This question already has answers here:
Numbering rows within groups in a data frame
(10 answers)
Add column with order counts
(2 answers)
Closed 6 years ago.
I have the following data set:
id year
2 20332 2005
3 6383 2005
14 20332 2006
15 6806 2006
16 23100 2006
I would like to have an additional column, which counts the number of years the id variable is already available:
id year Counter
2 20332 2005 1
3 6383 2005 1
14 20332 2006 2
15 6806 2006 1
16 23100 2006 1
The dataset is currently not sorted according to the year. I thought about mutate rather than a function.
Any ideas? Thanks!
We can use ave from base R
df1$Counter <- with(df1, ave(id, id, FUN = seq_along))
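Since the question mentions that the data is not sorted by year, the counter only follows the years if the rows are ordered first. A sketch under that assumption, using the question's rows deliberately shuffled so the sort matters:

```r
# Rows from the question, shuffled so that the year order is broken.
df1 <- data.frame(
  id   = c(20332, 6383, 20332, 6806, 23100),
  year = c(2006, 2005, 2005, 2006, 2006)
)

# Sort by year so that seq_along() numbers each id's rows chronologically.
df1 <- df1[order(df1$year), ]

# Running row number within each id.
df1$Counter <- ave(df1$id, df1$id, FUN = seq_along)
df1
```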

How to create a step-by-step cumulation of data? [duplicate]

This question already has answers here:
Calculating cumulative sum for each row
(6 answers)
Closed 7 years ago.
My question is probably really basic, but I couldn't find an easy solution for it. We have a data.frame without the "overall" column. The overall column must hold the cumulative number of pies (in my case) eaten up to a given time period. What is the easiest way to create it in R for an arbitrary number of rows? Thanks!
Year Pies eaten Pies eaten(overall)
1 1960 3 3
2 1961 2 5
3 1962 5 10
4 1963 1 11
5 1964 7 18
6 1965 4 22
We can use cumsum
df1$Pies_eaten_Overall <- cumsum(df1$Pies_eaten)
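A self-contained version of the above, with the table from the question rebuilt as `df1`. The column name `Pies_eaten` is an assumption, since the original header contains a space:

```r
# Sample data copied from the question.
df1 <- data.frame(
  Year       = 1960:1965,
  Pies_eaten = c(3, 2, 5, 1, 7, 4)
)

# cumsum() returns the running total in row order, so the rows are
# assumed to be sorted by Year already.
df1$Pies_eaten_Overall <- cumsum(df1$Pies_eaten)
df1
```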
