Countif function in R by fixing the range column [duplicate]

This question already has answers here:
Count number of rows per group and add result to original data frame
(11 answers)
Closed 5 years ago.
How can I do a COUNTIF in R?
In Excel we can write the formula as
"COUNTIF($K$2:$K$205,K2),COUNTIF($K$2:$K$205,K3),.... "
How can I do the same in R? The desired result is the Countif column below:
value col Countif
1 A 3
1 A 3
1 A 3
4 A 2
4 A 2
3 A 1
99 B 2
99 B 2
1000 B 4
1000 B 4
1000 B 4
1000 B 4

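We can use the convenient add_count function from dplyr, which appends the group size as a new column (named n by default):
library(dplyr)
df1 %>%
add_count(value, col)
This is similar to the following, except that the new column is named count and the result stays grouped:
df1 %>%
group_by(value, col) %>%
mutate(count = n())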
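For readers without dplyr, here is a minimal base R sketch of the same idea. It assumes the data frame is called df1, as in the answer above, and rebuilds it from the table in the question. Unlike add_count(value, col), it counts on value alone, which mirrors the single-column Excel formula and gives the same result for this sample.

```r
# Sample data copied from the question (the name df1 follows the answer above)
df1 <- data.frame(
  value = c(1, 1, 1, 4, 4, 3, 99, 99, 1000, 1000, 1000, 1000),
  col   = c(rep("A", 6), rep("B", 6))
)

# Tabulate the whole column once, like the fixed range $K$2:$K$205
counts <- table(df1$value)

# Look up each row's own value in that table, like the relative K2, K3, ...
df1$Countif <- as.integer(counts[as.character(df1$value)])

print(df1)
```

Tabulating once and then indexing avoids recounting the full column for every row, which is what the Excel formula does conceptually.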

Related

Select a category in dataframe for operation in r? [duplicate]

This question already has answers here:
Conditional replacement of values in a data.frame
(5 answers)
Closed last year.
Having this dataframe:
dat=data.frame(a=c("ll","pp","ml","ml","v"),value=c(1,2,12,1,2))
I want to multiply by 10 only the values that correspond to a == "ml".
In base R:
dat=data.frame(a=c("ll","pp","ml","ml","v"),value=c(1,2,12,1,2))
dat$value[dat$a=="ml"] = dat$value[dat$a=="ml"] * 10
dat
Output:
a value
1 ll 1
2 pp 2
3 ml 120
4 ml 10
5 v 2
Another solution is to use ifelse() inside dplyr's mutate():
library(dplyr)
dat %>%
mutate(value = ifelse(a == "ml", value*10, value))
a value
1 ll 1
2 pp 2
3 ml 120
4 ml 10
5 v 2

Create new column for mean by group in original dataframe in R [duplicate]

This question already has answers here:
Adding a column of means by group to original data [duplicate]
(4 answers)
Closed 2 years ago.
I have a dataframe that looks like the following:
unit_id outcome
1 3
1 5
1 4
2 1
2 2
2 3
I know how to calculate the mean for each unit_id.
df <- df %>%
group_by(unit_id) %>%
summarise(mean = mean(outcome))
This yields:
unit_id mean
1 4
2 2
I am trying to figure out a way to get the mean for each unit_id and include that in the original dataframe. I would like the output to look like the following.
unit_id outcome mean
1 3 4
1 5 4
1 4 4
2 1 2
2 2 2
2 3 2
We can use mutate instead of summarise
library(dplyr)
df <- df %>%
group_by(unit_id) %>%
mutate(mean = mean(outcome))

How to sum a specific column of replicate rows in dataframe? [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
How to group by two columns in R
(4 answers)
Closed 3 years ago.
I have a data frame that contains a lot of duplicate rows. I would like to sum the last column across the duplicates and collapse them into a single row at the same time. Could anyone tell me how to do that?
The example is here:
name <- c("a","b","c","a","c")
position <- c(192,7,6,192,99)
score <- c(1,2,3,2,5)
df <- data.frame(name,position,score)
> df
name position score
1 a 192 1
2 b 7 2
3 c 6 3
4 a 192 2
5 c 99 5
#I would like to sum the score together if the first two columns are the
#same. The ideal result is like this way
name position score
1 a 192 3
2 b 7 2
3 c 6 3
4 c 99 5
Sincerely thank you for the help.
Try this:
library(dplyr)
df %>%
group_by(name, position) %>%
summarise(score = sum(score, na.rm = TRUE))
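As a cross-check, the same grouping can be sketched in base R with aggregate(), using the example data from the question. The name res is only an illustrative variable, and the row order of the result may differ from the order shown in the question.

```r
# Example data from the question
name <- c("a", "b", "c", "a", "c")
position <- c(192, 7, 6, 192, 99)
score <- c(1, 2, 3, 2, 5)
df <- data.frame(name, position, score)

# Sum score within each unique (name, position) pair;
# duplicated pairs collapse into a single row
res <- aggregate(score ~ name + position, data = df, FUN = sum)

print(res)
```

The formula interface puts the column to be summed on the left of ~ and the grouping columns on the right.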

In R: Sum by group without summarising [duplicate]

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 6 years ago.
I have searched a lot, but not found a solution.
I have the following data frame:
Age no.observations Factor
1 1 4 A
2 1 3 A
3 1 12 A
4 1 5 B
5 1 9 B
6 1 3 B
7 2 12 A
8 2 3 A
9 2 6 A
10 2 7 B
11 2 9 B
12 2 1 B
I would like to create another column with the sum by the categories Age and Factor, thus having 19 for the first three rows, 17 for the next three, etc. I want this to be a column added to this data.frame, so dplyr's summarise function alone does not help, because it collapses each group to one row.
Use mutate with group_by instead of summarise, so that every row is kept:
library(dplyr)
df %>%
group_by(Age, Factor) %>%
mutate(no.observations.in.group = sum(no.observations)) %>%
ungroup()
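A base R sketch of the same per-group sum, for comparison, uses ave() on the data from the question. The new column name no.observations.in.group simply follows the dplyr answer above.

```r
# Data from the question
df <- data.frame(
  Age = rep(c(1, 2), each = 6),
  no.observations = c(4, 3, 12, 5, 9, 3, 12, 3, 6, 7, 9, 1),
  Factor = rep(rep(c("A", "B"), each = 3), times = 2)
)

# ave() applies FUN within each Age/Factor group and returns a vector
# as long as the input, so no rows are dropped
df$no.observations.in.group <- ave(df$no.observations, df$Age, df$Factor, FUN = sum)

print(df)
# The group sums are 19, 17, 21 and 17, each repeated on its three rows
```

Note that FUN must be passed by name, because ave() treats every unnamed argument after the first as a grouping variable.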

How to create a new row that would show me the number of observations in a group in an unbalanced panel dataset in R? [duplicate]

This question already has answers here:
Create counter with multiple variables [duplicate]
(6 answers)
Closed 6 years ago.
I have a dataset that looks like this:
id time
1 1
1 2
2 5
2 3
3 2
3 7
3 8
I want to add another column that numbers the observations within each group, like this:
id time label
1 1 1
1 2 2
2 5 1
2 3 2
3 2 1
3 7 2
3 8 3
We can use ave
df1$label <- with(df1, ave(seq_along(id), id, FUN=seq_along))
Or with dplyr
library(dplyr)
df1 %>%
group_by(id) %>%
mutate(label = row_number())
