This question already has an answer here:
Include levels of zero count in result of table()
(1 answer)
Closed 2 years ago.
I have this vector:
x <- c(3,4,5,7,7,9,9,9,10,10,10,10,11,11,11,11,11,11,11,12,12,12,12,12,12,12,12,12,12,13,13,13,13,13,13,13,13,13,13,13,13,13,13,14,14,14,15,15)
I want the frequency of each value of the sequence 3:15 within that vector. table(x) gives the frequencies of the values that occur, but a value such as 6, which has a frequency of 0, is not shown.
Convert x to a factor with explicit levels before calling table():
table(factor(x, levels = 3:15))
# 3 4 5 6 7 8 9 10 11 12 13 14 15
# 1 1 1 0 2 0 3 4 7 10 14 3 2
Or, for the general case:
table(factor(x, levels = min(x):max(x)))
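To see why this works, here is a minimal sketch on a shorter vector (the names `v` and `counts` are made up for illustration). table() reports every level of a factor, including levels that never occur, so declaring the full range as levels is what produces the zero entries.

```r
# Hypothetical short vector; 2 and 4 are missing from the range 1:5
v <- c(1, 3, 3, 5)

# On the raw vector, table() only reports values that actually occur
names(table(v))

# Declaring the full range as factor levels keeps the empty ones
counts <- table(factor(v, levels = 1:5))
counts
```

Note that values outside the declared levels become NA in the factor and are dropped by table() under its default settings, so the levels should cover the whole range of interest.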
This question already has answers here:
Repeat each row of data.frame the number of times specified in a column
(10 answers)
Closed 2 years ago.
library(data.table)
dataHAVE=data.frame("student"=c(1,2,3),
"score" = c(10,11,12),
"count"=c(4,1,2))
dataWANT=data.frame("student"=c(1,1,1,1,2,3,3),
"score"=c(10,10,10,10,11,12,12),
"count"=c(4,4,4,4,1,2,2))
setDT(dataHAVE)dataHAVE[rep(1:.N,count)][,Indx:=1:.N,by=student]
I have data 'dataHAVE' and want to produce 'dataWANT', which repeats each 'student' row 'count' times. I tried the data.table code above, since that is the kind of solution I am after, but I get this error:
Error: unexpected symbol in "setDT(dat)dat"
I cannot resolve it. Thank you so much.
Try:
setDT(dataHAVE)[rep(1:.N,count)]
Output:
student score count
1: 1 10 4
2: 1 10 4
3: 1 10 4
4: 1 10 4
5: 2 11 1
6: 3 12 2
7: 3 12 2
You could also replace 1:.N with .I and do setDT(dataHAVE)[dataHAVE[, rep(.I, count)]].
Just FYI, there's also a nice function in tidyr that does a similar thing:
tidyr::uncount(dataHAVE, count, .remove = FALSE)
Here is a base R solution:
dataWANT<-do.call(rbind,
c(with(dataHAVE,rep(split(dataHAVE,student),count)),
make.row.names = FALSE))
such that
> dataWANT
student score count
1 1 10 4
2 1 10 4
3 1 10 4
4 1 10 4
5 2 11 1
6 3 12 2
7 3 12 2
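A shorter base R variant, offered as a sketch rather than as the answerer's method: repeat the row indices with rep() and subset by them. The names `idx` and `expanded` are illustrative, and the input is rebuilt here as a plain data.frame.

```r
# Same input as in the question, as a plain data.frame
dataHAVE <- data.frame(student = c(1, 2, 3),
                       score   = c(10, 11, 12),
                       count   = c(4, 1, 2))

# Repeat each row index `count` times, then subset by those indices
idx <- rep(seq_len(nrow(dataHAVE)), times = dataHAVE$count)
expanded <- dataHAVE[idx, ]

# Reset the row names, which otherwise come out as 1, 1.1, 1.2, ...
rownames(expanded) <- NULL
expanded
```

This avoids splitting and re-binding the data frame, which may matter for larger inputs.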
This question already has answers here:
Calculating cumulative sum for each row
(6 answers)
Closed 5 years ago.
I have 2 columns, "x" and "y" generated with this code:
x = 1:8
y = c(2,7,1,3,5,4,1,2)
data = data.frame(x,y)
It looks like this:
x y
1 2
2 7
3 1
4 3
5 5
6 4
7 1
8 2
Now I want a column "z" that holds the running total of "y" up to and including each row:
x y z
1 2 2
2 7 9
3 1 10
4 3 13
5 5 18
6 4 22
7 1 23
8 2 25
I have tried everything without any luck.
Use cumsum, the cumulative sum function.
data$z <- cumsum(data$y)
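As a quick check, the sketch below rebuilds the data from the question and compares cumsum() with a step-by-step accumulation via Reduce(). The name `z_check` is made up for illustration.

```r
x <- 1:8
y <- c(2, 7, 1, 3, 5, 4, 1, 2)
data <- data.frame(x, y)

# Running total of y
data$z <- cumsum(data$y)

# Reduce() with accumulate = TRUE builds the same running sum one step at a time
z_check <- Reduce(`+`, data$y, accumulate = TRUE)
all.equal(data$z, z_check)
```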
Probably not the cleanest way, but this is easy to understand and works well:
data$z <- NA
# Build the running total one row at a time
for (i in seq_len(nrow(data))) {
  if (i == 1) {
    data[i, 'z'] <- data[i, 'y']
  } else {
    data[i, 'z'] <- data[i, 'y'] + data[i - 1, 'z']
  }
}
This question already has answers here:
R: Differences by group and adding
(3 answers)
Closed 6 years ago.
I have the following dataset:
df <- data.frame (id= c(1,1,1,2,2), time = c(13,14,17,17,17))
id time
1 1 13
2 1 14
3 1 17
4 2 17
5 2 17
For each id, I want to subtract the previous time from the current time, with 0 for the first row of each id. My ideal output is:
#output
id time diff
1 1 13 0
2 1 14 1
3 1 17 3
4 2 17 0
5 2 17 0
What is the most efficient way for that?
Thanks to Zheyuan Li.
This is a great solution:
df$diff <- with(df, ave(time, id, FUN = function (x) c(0, diff(x))))
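A self-contained sketch of how that line behaves on the question's data: ave() applies the function to `time` within each `id` and returns a vector aligned with the original rows. Because diff() returns one element fewer than its input, a 0 is prepended for the first row of each group.

```r
df <- data.frame(id = c(1, 1, 1, 2, 2), time = c(13, 14, 17, 17, 17))

# Within each id: 0 for the first row, then current time minus previous time
df$diff <- with(df, ave(time, id, FUN = function(x) c(0, diff(x))))
df
```

A group with a single row also works, since diff() then returns an empty vector and only the leading 0 remains.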
This question already has answers here:
How to sum a variable by group
(18 answers)
Closed 6 years ago.
I have searched a lot, but not found a solution.
I have the following data frame:
Age no.observations Factor
1 1 4 A
2 1 3 A
3 1 12 A
4 1 5 B
5 1 9 B
6 1 3 B
7 2 12 A
8 2 3 A
9 2 6 A
10 2 7 B
11 2 9 B
12 2 1 B
I would like to create another column with the sum by the categories Age and Factor, giving 19 for the first three rows, 17 for the next three, and so on. I want this to be a column added to this data.frame, so dplyr's summarise function, which collapses each group to one row, does not help.
Use mutate with group_by instead of summarise, which keeps every row:
df %>%
group_by(Age, Factor) %>%
mutate(no.observations.in.group = sum(no.observations)) %>%
ungroup()
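For readers without dplyr, a base R sketch that should give the same column uses ave() with both grouping variables. The data frame is rebuilt here from the table in the question.

```r
df <- data.frame(
  Age             = rep(1:2, each = 6),
  no.observations = c(4, 3, 12, 5, 9, 3, 12, 3, 6, 7, 9, 1),
  Factor          = rep(rep(c("A", "B"), each = 3), times = 2)
)

# ave() returns one value per row: the sum within each Age/Factor group
df$no.observations.in.group <- ave(df$no.observations, df$Age, df$Factor, FUN = sum)
df$no.observations.in.group
```

FUN has to be passed by name, because ave() treats unnamed arguments after the first as grouping variables.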
This question already has answers here:
Split data.frame based on levels of a factor into new data.frames
(3 answers)
Closed 7 years ago.
I have this simple data.frame
x=c(1,2,3,4,5,6)
y=c(5,6,1,2,4,5)
z=c(1,1,1,2,2,2)
data=data.frame(x,y,z)
I want to get
data1=
x y z
1 1 5 1
2 2 6 1
3 3 1 1
and
data2=
x y z
4 4 2 2
5 5 4 2
6 6 5 2
according to the z values.
Try this:
split(data, data$z)
The result is a list of data frames, one per value of z.
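Since the result is a list, the individual data frames can be pulled out by name. A minimal sketch, where `parts` is an illustrative name and `data1`/`data2` follow the question:

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(5, 6, 1, 2, 4, 5)
z <- c(1, 1, 1, 2, 2, 2)
data <- data.frame(x, y, z)

# One data frame per distinct value of z; the list names are the z values
parts <- split(data, data$z)
names(parts)

# Extract the individual pieces by name
data1 <- parts[["1"]]
data2 <- parts[["2"]]
data2
```

Keeping the pieces in the list is often more convenient than creating separate variables, because lapply() can then process all groups at once.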