Count specific data in a data frame & display [duplicate] - r

This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 6 years ago.
I have an exam data frame with the columns student.id and marks:
student.id marks
2 2
2 2
2 -1
3 -1
3 -1
3 2
4 2
4 -1
5 2
5 2
5 2
How can I sum the total marks for each student.id, as in the table below?
student.id total-marks
2
3
4
5
How can I obtain the above table with the respective total marks? Thanks.

If your data.frame is called df, this should work:
aggregate(marks~student.id,df,FUN="sum")
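As a minimal, self-contained sketch of the call above, the following rebuilds the question's data (the name df comes from the answer; the column name total.marks is only an illustrative choice) and sums marks per student.id:

```r
# Rebuild the exam data from the question
df <- data.frame(
  student.id = c(2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5),
  marks      = c(2, 2, -1, -1, -1, 2, 2, -1, 2, 2, 2)
)

# Sum marks within each student.id; the result has one row per id
totals <- aggregate(marks ~ student.id, data = df, FUN = sum)

# Rename the summed column (illustrative name, not from the question)
names(totals)[2] <- "total.marks"
totals
```

For this data the totals come out as 3, 0, 1 and 6 for students 2, 3, 4 and 5.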

Related

increasing value by one with each occurrence of non-repeated number [duplicate]

This question already has answers here:
Increment by 1 for every change in column
(6 answers)
Closed 2 years ago.
v <- c(1,1,2,3,3,3,1,1,3,4,4)
I'm trying to create a vector of elements in which the first occurrence of a non-repeated number always increases by one relative to the previous number.
This is the desired output
1,1,2,3,3,3,4,4,5,6,6
What would be an efficient way of doing this?
A base R option with rle
> with(rle(v),rep(seq_along(values),lengths))
[1] 1 1 2 3 3 3 4 4 5 6 6
or data.table::rleid
> data.table::rleidv(v)
[1] 1 1 2 3 3 3 4 4 5 6 6

Merging Overlapping Intervals in R [duplicate]

This question already has answers here:
Overlap join with start and end positions
(5 answers)
Merge overlapping ranges into unique groups, in dataframe
(2 answers)
Collapse rows with overlapping ranges
(5 answers)
Closed 4 years ago.
I have a problem where I get information on the range of occupied cells. There may be multiple start and end entries of the range which can overlap for the same test. Not all the "test" have entries.
I have a data frame in R and want to merge all the unique ranges for each "test".
x<-data.frame(test=c(2,3,3,2,3,4),start=c(1,1,1,2,3,4),end=c(1,2,3,3,4,4))
> x
test start end
1 2 1 1
2 3 1 2
3 3 1 3
4 2 2 3
5 3 3 4
6 4 4 4
I would like to transform this data frame into:
test start end
1 2 1 1
2 2 2 3
3 3 1 4
4 4 4 4
In the end I just want to know how many cells are occupied by the ranges for each "test". Test 2 has (1,1) and (2,3), which means 3 cells; test 3 has (1,4), so 4 cells; test 4 has (4,4), so 1 cell. Tests 1 and 5 to n have no entries, so they all occupy 0 cells. With the merged data frame stored as y:
u<-unique(y[,1])
a<-rep(0,length(u))
for(i in 1:length(u)){
a[i]<-sum(y[which(y[,1]==u[i]),3]-y[which(y[,1]==u[i]),2])+length(which(y[,1]==u[i]))
}
> a
[1] 3 4 1
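One possible way to get from x to the merged data frame is sketched below in base R. It is only an illustration under stated assumptions: merge_ranges is a hypothetical helper name, and ranges that merely touch (such as (1,1) and (2,3)) are kept separate because the desired output above keeps them separate.

```r
x <- data.frame(test  = c(2, 3, 3, 2, 3, 4),
                start = c(1, 1, 1, 2, 3, 4),
                end   = c(1, 2, 3, 3, 4, 4))

# Hypothetical helper: merge overlapping ranges within one "test"
merge_ranges <- function(d) {
  d <- d[order(d$start, d$end), ]
  # Largest end seen before each row (-Inf for the first row)
  prev_end <- c(-Inf, head(cummax(d$end), -1))
  # A new group starts whenever a range begins after everything before it ended
  grp <- cumsum(d$start > prev_end)
  data.frame(test  = d$test[1],
             start = as.vector(tapply(d$start, grp, min)),
             end   = as.vector(tapply(d$end, grp, max)))
}

y <- do.call(rbind, lapply(split(x, x$test), merge_ranges))
rownames(y) <- NULL
y

# Occupied cells per test: each merged range covers end - start + 1 cells
cells <- tapply(y$end - y$start + 1, y$test, sum)
cells
```

With the question's data this reproduces the four merged rows and the counts 3, 4 and 1 for tests 2, 3 and 4.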

Tallying values in single column and separating into Rows in R [duplicate]

This question already has answers here:
Counting the number of elements with the values of x in a vector
(20 answers)
Closed 6 years ago.
I have a single column of numbers. I'm wondering how I can separate it out so that the output has one column per distinct value, holding the tally for that value. I've tried playing around with "separate" but I can't figure out how to make it work.
Here's my data frame:
2
2
2
2
2
4
4
4
I'd like it to be
2 4
5 3
You can use the table() function.
> df
V1
1 2
2 2
3 2
4 2
5 2
6 4
7 4
8 4
> table(df$V1)
2 4
5 3
We can also use tabulate, which would be faster (note that it returns only the counts, without the value labels):
tabulate(factor(df1$V1))
#[1] 5 3

How to create a new row that would show me the number of observations in a group in an unbalanced panel dataset in R? [duplicate]

This question already has answers here:
Create counter with multiple variables [duplicate]
(6 answers)
Closed 6 years ago.
I have a dataset that looks like this:
id time
1 1
1 2
2 5
2 3
3 2
3 7
3 8
And I want to add another column that numbers the observations within each group:
id time label
1 1 1
1 2 2
2 5 1
2 3 2
3 2 1
3 7 2
3 8 3
We can use ave
df1$label <- with(df1, ave(seq_along(id), id, FUN=seq_along))
Or with dplyr
library(dplyr)
df1 %>%
  group_by(id) %>%
  mutate(label = row_number())
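If the size of each group is wanted rather than a running counter, the same ave pattern can be used with length. The sketch below rebuilds the question's data as df1 (the name used in the answer); the column name n is an illustrative assumption.

```r
df1 <- data.frame(id   = c(1, 1, 2, 2, 3, 3, 3),
                  time = c(1, 2, 5, 3, 2, 7, 8))

# Running counter within each id (as in the answer above)
df1$label <- with(df1, ave(seq_along(id), id, FUN = seq_along))

# Number of observations in each id, repeated on every row of the group
df1$n <- with(df1, ave(seq_along(id), id, FUN = length))
df1
```

Here label counts 1, 2, ... inside each id, while n holds 2, 2 and 3 for ids 1, 2 and 3.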

R create new data.frame from old one according to column values [duplicate]

This question already has answers here:
Split data.frame based on levels of a factor into new data.frames
(3 answers)
Closed 7 years ago.
I have this simple data.frame
x=c(1,2,3,4,5,6)
y=c(5,6,1,2,4,5)
z=c(1,1,1,2,2,2)
data=data.frame(x,y,z)
I want to get
data1=
x y z
1 1 5 1
2 2 6 1
3 3 1 1
and
data2=
x y z
4 4 2 2
5 5 4 2
6 6 5 2
according to the z values.
Try this:
split(data, data$z)
The result is a list of data frames, one per value of z.
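As a short sketch of how the pieces can be pulled out of that list, the elements are named after the values of z, so they can be accessed by name. The variable name parts is an illustrative assumption; data1 and data2 follow the question.

```r
data <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                   y = c(5, 6, 1, 2, 4, 5),
                   z = c(1, 1, 1, 2, 2, 2))

# One data frame per distinct value of z, stored in a named list
parts <- split(data, data$z)
names(parts)

# Extract the individual data frames by name
data1 <- parts[["1"]]
data2 <- parts[["2"]]
data1
data2
```

Keeping the pieces in the list is often more convenient than separate variables, since lapply can then be applied to all groups at once.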
