Receive the total sum score of every number [duplicate] - r

This question already has answers here:
Count number of occurences for each unique value
(14 answers)
Closed 2 years ago.
Using this data frame as input:
df1 <- data.frame(num = c(1,1,1,2,2,2,3))
How can I get the number of times each value occurs in the num column?
Example output:
num frequency
1 3
2 3
3 1

Use table and coerce the result to a data frame:
as.data.frame(table(df1$num))
# Var1 Freq
# 1 1 3
# 2 2 3
# 3 3 1
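or, when the values are exactly the integers 1 to max(num) and first appear in ascending order, as they do here (tabulate counts 1, 2, ..., max(num) in that order, while unique lists values in order of first appearance):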
with(df1, data.frame(num=unique(num), freq=tabulate(num)))
# num freq
# 1 1 3
# 2 2 3
# 3 3 1
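If the column names should match the requested output, one possible sketch in base R is to name the table dimension and pass responseName. The object name counts is only illustrative and is not from the original answer.

```r
df1 <- data.frame(num = c(1, 1, 1, 2, 2, 2, 3))

# Naming the table dimension sets the first column name;
# responseName sets the name of the count column.
counts <- as.data.frame(table(num = df1$num), responseName = "frequency")

# table() stores the values as a factor; convert back to numeric if needed.
counts$num <- as.numeric(as.character(counts$num))
counts
```

This approach does not depend on the values being consecutive positive integers.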

Related

Can you create a new dataframe based on number of rows from other dataframe on R? [duplicate]

This question already has answers here:
Count number of rows within each group
(17 answers)
count number of rows in a data frame in R based on group [duplicate]
(8 answers)
Closed 2 years ago.
I have a df with 900k rows and each row has an action (about 80 different actions in total) and a number (about 500 different numbers in total), so it looks something like this:
Action Number
a 1
b 3
a 7
b 3
b 1
How can I create a new df in R with one row per combination of number and action, holding the number, the action, and the number of rows with that combination, so it looks something like this:
Number Action Total
1 a 1
1 b 1
3 b 2
7 a 1
Try with dplyr:
library(dplyr)
#Code
newdf <- df %>% group_by(Number,Action) %>% summarise(N=n())
Output:
# A tibble: 4 x 3
# Groups: Number [3]
Number Action N
<int> <chr> <int>
1 1 a 1
2 1 b 1
3 3 b 2
4 7 a 1
Or in base R creating an indicator variable N and using aggregate():
#Base R
df$N <- 1
newdf <- aggregate(N~.,data=df,sum)
Output:
Action Number N
1 a 1 1
2 b 1 1
3 b 3 2
4 a 7 1
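A variation on the base R approach, sketched here under the assumption that the original df should stay unmodified and that the count column should be called Total as in the question. The indicator column is added only to a temporary copy via transform(), and the rows are then sorted to match the requested layout.

```r
# Example data from the question
df <- data.frame(Action = c("a", "b", "a", "b", "b"),
                 Number = c(1, 3, 7, 3, 1))

# Add the indicator column to a temporary copy and sum it per group
newdf <- aggregate(Total ~ Number + Action,
                   data = transform(df, Total = 1L),
                   FUN = sum)

# Sort by Number, then Action, to match the requested output
newdf <- newdf[order(newdf$Number, newdf$Action), ]
rownames(newdf) <- NULL
newdf
```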

R, dplyr: Is there a way to add order of groups when there are multiple rows per group without creating a new data frame? [duplicate]

This question already has answers here:
How to create a consecutive group number
(13 answers)
Closed 2 years ago.
I have data from an experiment that has multiple rows per item (each row has the reading time for one word of a sentence of n words), and multiple items per subject. Items can be varying numbers of rows. Items were presented in a random order, and their order in the data as initially read in reflects the sequence they saw the items in. What I'd like to do is add a column that contains the order in which the subject saw that item (i.e., 1 for the first item, 2 for the second, etc.).
Here's an example of some input data that has the relevant properties:
d <- data.frame(Subject = c(1,1,1,1,1,2,2,2,2,2),
Item = c(2,2,2,1,1,1,1,2,2,2))
Subject Item
1 2
1 2
1 2
1 1
1 1
2 1
2 1
2 2
2 2
2 2
And here's the output I want:
Subject Item order
1 2 1
1 2 1
1 2 1
1 1 2
1 1 2
2 1 1
2 1 1
2 2 2
2 2 2
2 2 2
I know I can do this by setting up a temp data frame that filters d to unique combinations of Subject and Item, adding order to that as something like 1:n() or row_number(), and then using a join function to put it back together with the main data frame. What I'd like to know is whether there's a way to do this without having to create a new data frame just to store the order---can this be done inside dplyr's mutate somehow if I group by Subject and Item, for instance?
Here's one way:
d %>%
group_by(Subject) %>%
mutate(order = match(Item, unique(Item))) %>%
ungroup()
# # A tibble: 10 x 3
# Subject Item order
# <dbl> <dbl> <int>
# 1 1 2 1
# 2 1 2 1
# 3 1 2 1
# 4 1 1 2
# 5 1 1 2
# 6 2 1 1
# 7 2 1 1
# 8 2 2 2
# 9 2 2 2
# 10 2 2 2
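To see why match(Item, unique(Item)) yields the presentation order, it may help to run it on the Item values of a single subject. This is only an illustration; the names items and ord are not from the original answer.

```r
# Item values for Subject 1, in the order they were presented
items <- c(2, 2, 2, 1, 1)

# unique() keeps the distinct items in order of first appearance: 2, then 1
seen <- unique(items)

# match() returns each item's position in that list of distinct items,
# which is exactly the order in which the item was first seen
ord <- match(items, seen)
ord
# [1] 1 1 1 2 2
```

Grouping by Subject first means this numbering restarts for every subject.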
Here is a base R option
transform(d,
order = ave(Item, Subject, FUN = function(x) as.integer(factor(x, levels = unique(x))))
)
or
transform(d,
order = ave(Item, Subject, FUN = function(x) match(x, unique(x)))
)
both giving
Subject Item order
1 1 2 1
2 1 2 1
3 1 2 1
4 1 1 2
5 1 1 2
6 2 1 1
7 2 1 1
8 2 2 2
9 2 2 2
10 2 2 2

How to sort a column from ascending order for EACH ID in R [duplicate]

This question already has answers here:
Sort (order) data frame rows by multiple columns
(19 answers)
Closed 7 years ago.
I want to sort Chrom# in ascending order (1 to 23) within each unique ID. As shown below, there are multiple rows with the same ID, and each row holds one chrom value. For example, for MB-0002 the chrom values should run 1,1,1,2,4,22... etc. How do I write the R code for this? I am new to R so any help would be appreciated. Thanks so much!
sample dataset
If you can use dplyr::arrange then you can easily sort by two variables.
tmp <- data.frame(id=c("a","a","b","a","b","c","a","b","c"),
value=c(3,2,4,1,2,1,7,4,3))
tmp
# id value
# 1 a 3
# 2 a 2
# 3 b 4
# 4 a 1
# 5 b 2
# 6 c 1
# 7 a 7
# 8 b 4
# 9 c 3
library(dplyr)
tmp %>% arrange(id, value)
# id value
# 1 a 1
# 2 a 2
# 3 a 3
# 4 a 7
# 5 b 2
# 6 b 4
# 7 b 4
# 8 c 1
# 9 c 3
FYI, an image doesn't work as a usable sample dataset.
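For readers without dplyr, the same two-key sort can be sketched in base R with order(). The name sorted is illustrative.

```r
tmp <- data.frame(id = c("a", "a", "b", "a", "b", "c", "a", "b", "c"),
                  value = c(3, 2, 4, 1, 2, 1, 7, 4, 3))

# order() sorts by id first and breaks ties using value
sorted <- tmp[order(tmp$id, tmp$value), ]

# Optional: renumber the rows after sorting
rownames(sorted) <- NULL
sorted
```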

R: getting list of matching data frame values [duplicate]

This question already has answers here:
Collapse / concatenate / aggregate a column to a single comma separated string within each group
(6 answers)
Closed 7 years ago.
Two data sets:
people <- read.table(text="
pid
1
2
3
4
", header=TRUE)
comps <- read.table(text="
pid comp rank
1 1 0
1 3 1
1 2 2
2 4 0
2 1 1
2 3 2
3 1 0
3 2 1
3 4 2
", header=TRUE)
Trying to get a data frame of each unique pid with a list of their comparisons, like:
pid comps
1 1,3,2
2 4,1,3
3 1,2,4
Can't quite get there..
You can do this with aggregate:
aggregate(comp~pid, comps, paste, collapse=",")
# pid comp
# 1 1 1,3,2
# 2 2 4,1,3
# 3 3 1,2,4
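The call above concatenates values in the order the rows appear. If the rows might not already be in rank order, and assuming rank is meant to define the order of the list (the question does not say so explicitly), a sketch would sort before aggregating. The reversed copy shuffled is a hypothetical stand-in for unordered input.

```r
comps <- read.table(text = "
pid comp rank
1 1 0
1 3 1
1 2 2
2 4 0
2 1 1
2 3 2
3 1 0
3 2 1
3 4 2
", header = TRUE)

# Hypothetical scenario: the rows arrive in some other order
shuffled <- comps[rev(seq_len(nrow(comps))), ]

# Sort by pid, then rank, so paste() sees each group's rows in rank order
by_rank <- shuffled[order(shuffled$pid, shuffled$rank), ]
result <- aggregate(comp ~ pid, by_rank, paste, collapse = ",")
result
```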

Duplicating data frame rows by freq value in same data frame [duplicate]

This question already has answers here:
Repeat each row of data.frame the number of times specified in a column
(10 answers)
Closed 7 years ago.
I have a data frame with names by type and their frequencies. I'd like to expand this data frame so that the names are repeated according to their name-type frequency.
For example, this:
> df = data.frame(name=c('a','b','c'),type=c(0,1,2),freq=c(2,3,2))
name type freq
1 a 0 2
2 b 1 3
3 c 2 2
would become this:
> df_exp
name type
1 a 0
2 a 0
3 b 1
4 b 1
5 b 1
6 c 2
7 c 2
Appreciate any suggestions on an easy way to do this.
You can just use rep to "expand" your data.frame rows:
df[rep(sequence(nrow(df)), df$freq), c("name", "type")]
# name type
# 1 a 0
# 1.1 a 0
# 2 b 1
# 2.1 b 1
# 2.2 b 1
# 3 c 2
# 3.1 c 2
There's also a function expandRows in the splitstackshape package that does exactly this. Here the count is given as a column name; the function can also accept a vector specifying how many times to replicate each row (with count.is.col = FALSE):
library(splitstackshape)
expandRows(df, "freq")
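If the row names should run 1 to 7 as in the requested output rather than 1, 1.1, 2, ..., a small base R sketch is to reset them after expanding. Here seq_len() is used in place of sequence(); the two give the same index for this case.

```r
df <- data.frame(name = c("a", "b", "c"), type = c(0, 1, 2), freq = c(2, 3, 2))

# Repeat each row index freq times, then keep only the wanted columns
df_exp <- df[rep(seq_len(nrow(df)), df$freq), c("name", "type")]

# Drop the 1, 1.1, 2, ... row names in favour of 1..n
rownames(df_exp) <- NULL
df_exp
```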
