How do I use plyr to number rows?

Basically, I want an auto-incremented id column that restarts within each of my cohorts, in this case .(kmer, cvCut):
> myDataFrame
size kmer cvCut cumsum
1 8132 23 10 8132
10000 778 23 10 13789274
30000 324 23 10 23658740
50000 182 23 10 28534840
100000 65 23 10 33943283
200000 25 23 10 37954383
250000 584 23 12 16546507
300000 110 23 12 29435303
400000 28 23 12 34697860
600000 127 23 2 47124443
600001 127 23 2 47124570
I want to add a column that numbers the rows within each kmer/cvCut group:
> myDataFrame
size kmer cvCut cumsum newID
1 8132 23 10 8132 1
10000 778 23 10 13789274 2
30000 324 23 10 23658740 3
50000 182 23 10 28534840 4
100000 65 23 10 33943283 5
200000 25 23 10 37954383 6
250000 584 23 12 16546507 1
300000 110 23 12 29435303 2
400000 28 23 12 34697860 3
600000 127 23 2 47124443 1
600001 127 23 2 47124570 2

I'd do it like this:
library(plyr)
ddply(df, c("kmer", "cvCut"), transform, newID = seq_along(kmer))
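For comparison, the same per-group counter can be sketched in base R with ave(), without loading plyr. This is only an illustration: the data frame below retypes the size, kmer and cvCut columns from the question (row names and cumsum are left out), and newID is the column name used in the question.

```r
# Data retyped from the question (row names and cumsum omitted)
myDataFrame <- data.frame(
  size  = c(8132, 778, 324, 182, 65, 25, 584, 110, 28, 127, 127),
  kmer  = 23,
  cvCut = c(10, 10, 10, 10, 10, 10, 12, 12, 12, 2, 2)
)

# ave() applies seq_along() within each kmer/cvCut group and puts the
# results back in the original row order
myDataFrame$newID <- ave(seq_len(nrow(myDataFrame)),
                         myDataFrame$kmer, myDataFrame$cvCut,
                         FUN = seq_along)
myDataFrame
```

One difference worth knowing: ave() keeps the rows in their original order, whereas ddply() returns them sorted by the grouping columns.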

Just add a new column each time plyr calls you:
R> DF <- data.frame(kmer=sample(1:3, 50, replace=TRUE),
cvCut=sample(LETTERS[1:3], 50, replace=TRUE))
R> library(plyr)
R> ddply(DF, .(kmer, cvCut), function(X) data.frame(X, newId=1:nrow(X)))
kmer cvCut newId
1 1 A 1
2 1 A 2
3 1 A 3
4 1 A 4
5 1 A 5
6 1 A 6
7 1 A 7
8 1 A 8
9 1 A 9
10 1 A 10
11 1 A 11
12 1 B 1
13 1 B 2
14 1 B 3
15 1 B 4
16 1 B 5
17 1 B 6
18 1 C 1
19 1 C 2
20 1 C 3
21 2 A 1
22 2 A 2
23 2 A 3
24 2 A 4
25 2 A 5
26 2 B 1
27 2 B 2
28 2 B 3
29 2 B 4
30 2 B 5
31 2 B 6
32 2 B 7
33 2 C 1
34 2 C 2
35 2 C 3
36 2 C 4
37 3 A 1
38 3 A 2
39 3 A 3
40 3 A 4
41 3 B 1
42 3 B 2
43 3 B 3
44 3 B 4
45 3 C 1
46 3 C 2
47 3 C 3
48 3 C 4
49 3 C 5
50 3 C 6
R>
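What ddply() does here is split-apply-combine: it cuts DF into one piece per (kmer, cvCut) combination, calls the function on each piece, and binds the results back together. A rough base R sketch of those three steps, assuming a DF built the same way as above (set.seed() is added only to make the sample reproducible; pieces, numbered and result are illustrative names):

```r
set.seed(42)  # only so the random example is reproducible
DF <- data.frame(kmer  = sample(1:3, 50, replace = TRUE),
                 cvCut = sample(LETTERS[1:3], 50, replace = TRUE))

# split: one data frame per (kmer, cvCut) combination that actually occurs
pieces <- split(DF, list(DF$kmer, DF$cvCut), drop = TRUE)

# apply: add a counter running from 1 to the number of rows in each piece
numbered <- lapply(pieces, function(X) data.frame(X, newId = seq_len(nrow(X))))

# combine: stack the pieces back into a single data frame
result <- do.call(rbind, numbered)
rownames(result) <- NULL
head(result)
```

The order of the groups in result follows the order of the pieces returned by split(), which is not necessarily the order ddply() uses.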

I think that this is what you want:
Load the data:
x <- read.table(textConnection(
"id size kmer cvCut cumsum
1 8132 23 10 8132
10000 778 23 10 13789274
30000 324 23 10 23658740
50000 182 23 10 28534840
100000 65 23 10 33943283
200000 25 23 10 37954383
250000 584 23 12 16546507
300000 110 23 12 29435303
400000 28 23 12 34697860
600000 127 23 2 47124443
600001 127 23 2 47124570"), header=TRUE)
Use ddply:
library(plyr)
ddply(x, .(kmer, cvCut), function(d) cbind(d, newID = seq_len(nrow(d))))

Related

Recode column every nth element in R

I'm looking to recode a column, say the following:
df <- data.frame(col1 = rep(3, 100),
col2 = rep(NA, 100))
I want to recode col2 as 1 for rows 1:5, 2 for rows 6:10, 3 for 11:15, etc. So, every five rows I would add +1 to the assigned value. Any way to automate this process to avoid manually recoding 100 rows?

There are a lot of ways to do that. Here are a couple of them:
Using rep:
df$col2 <- rep(1:nrow(df), each = 5, length.out = nrow(df))
Using ceiling:
df$col2 <- ceiling(seq(nrow(df))/5)

dplyr way
library(dplyr)
df %>% mutate(col2 = ((row_number()-1) %/% 5)+1)
OR
A simple for loop
for(i in 0:((nrow(df)/5)-1)){
df[(seq_len(nrow(df))-1) %/% 5 == i,2] <- i+1
}
> df
col1 col2
1 3 1
2 3 1
3 3 1
4 3 1
5 3 1
6 3 2
7 3 2
8 3 2
9 3 2
10 3 2
11 3 3
12 3 3
13 3 3
14 3 3
15 3 3
16 3 4
17 3 4
18 3 4
19 3 4
20 3 4
21 3 5
22 3 5
23 3 5
24 3 5
25 3 5
26 3 6
27 3 6
28 3 6
29 3 6
30 3 6
31 3 7
32 3 7
33 3 7
34 3 7
35 3 7
36 3 8
37 3 8
38 3 8
39 3 8
40 3 8
41 3 9
42 3 9
43 3 9
44 3 9
45 3 9
46 3 10
47 3 10
48 3 10
49 3 10
50 3 10
51 3 11
52 3 11
53 3 11
54 3 11
55 3 11
56 3 12
57 3 12
58 3 12
59 3 12
60 3 12
61 3 13
62 3 13
63 3 13
64 3 13
65 3 13
66 3 14
67 3 14
68 3 14
69 3 14
70 3 14
71 3 15
72 3 15
73 3 15
74 3 15
75 3 15
76 3 16
77 3 16
78 3 16
79 3 16
80 3 16
81 3 17
82 3 17
83 3 17
84 3 17
85 3 17
86 3 18
87 3 18
88 3 18
89 3 18
90 3 18
91 3 19
92 3 19
93 3 19
94 3 19
95 3 19
96 3 20
97 3 20
98 3 20
99 3 20
100 3 20

Since the pattern repeats every 5 rows, you can use rep(row_number(), each = 5); length.out = n() then truncates the result to the length of the column.
I learned this from Ronak's answer to "dplyr: Mutate a new column with sequential repeated integers of n time in a dataframe".
Thanks to Ronak!
df %>% mutate(col2 = rep(row_number(), each=5, length.out = n()))
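To see what each = 5 combined with length.out does, here is a small base R check on a vector whose length is deliberately not a multiple of 5 (n = 12 is an arbitrary choice for illustration). It also compares the result with the ceiling() approach from the other answer:

```r
n <- 12  # arbitrary length, not a multiple of 5

# each = 5 repeats every value five times; length.out cuts the result to n
by_rep <- rep(seq_len(n), each = 5, length.out = n)
by_rep
# 1 1 1 1 1 2 2 2 2 2 3 3

# the ceiling() approach yields the same block numbers
by_ceiling <- ceiling(seq_len(n) / 5)
all(by_rep == by_ceiling)
```

The last block is simply shorter when the number of rows is not divisible by 5.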

Gen variable conditional on other variables from a different dataframe

I have DF1:
c01 p01 c02 p02 c03 p03 c04 p04
1 0 1 20 1 33 1 49
2 3 2 21 2 34 2 50
3 4 3 21 3 38 3 50
4 6 4 23 4 40 4 51
5 7 5 24 5 41 5 53
6 9 6 27 6 41 6 54
7 11 7 29 7 41 7 55
8 15 8 31 8 43 8 57
9 15 9 33 9 47 9 57
10 16 10 33 10 49 10 60
And I have DF2:
type round
A 1
B 1
A 2
B 2
A 3
B 3
A 4
B 4
What I want is to generate a new variable in DF2 along these lines (pseudocode):
DF2$g1<- if(DF2$round==1, 0)
DF2$g2<- if(c01==4 & round==1,DF2$p01)
DF2$g3<- if(c01==4 & round==2,DF2$p02)
DF2$g4<- if(c01==4 & round==3,DF2$p03)
DF2$g5<- if(c01==4 & round==4,DF2$p04)
DF2$g6<- if(c01==4 & round==5,DF2$p05)
So DF2 becomes:
type round g
A 1 6
B 1 6
A 2 23
B 2 23
A 3 40
B 3 40
A 4 51
B 4 51
Is there a way to loop this? In the original data frame I have 40 rounds, i.e. c01 to c40 and p01 to p40.
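One possible way to loop this, sketched in base R under the assumption that the columns follow the cNN/pNN naming shown above and that the c-value of interest is 4 (target and lookup_round are made-up names for illustration). For each round it builds the two column names with sprintf() and picks the p-value from the row where the c-column equals the target:

```r
# DF1 and DF2 retyped from the question
DF1 <- data.frame(
  c01 = 1:10, p01 = c(0, 3, 4, 6, 7, 9, 11, 15, 15, 16),
  c02 = 1:10, p02 = c(20, 21, 21, 23, 24, 27, 29, 31, 33, 33),
  c03 = 1:10, p03 = c(33, 34, 38, 40, 41, 41, 41, 43, 47, 49),
  c04 = 1:10, p04 = c(49, 50, 50, 51, 53, 54, 55, 57, 57, 60)
)
DF2 <- data.frame(type  = rep(c("A", "B"), times = 4),
                  round = rep(1:4, each = 2))

target <- 4  # assumed c-value to look up

# For round r, return pNN from the first row where cNN equals the target
lookup_round <- function(r) {
  c_col <- sprintf("c%02d", r)
  p_col <- sprintf("p%02d", r)
  DF1[[p_col]][DF1[[c_col]] == target][1]
}

DF2$g <- vapply(DF2$round, lookup_round, numeric(1))
DF2
```

With the DF1 shown above this gives 6, 23 and 40 for rounds 1 to 3 and 51 for round 4, since p04 is 51 in the row where c04 == 4. The same function should work unchanged for 40 rounds as long as the column names keep the two-digit pattern.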

(Update) Add index column to data.frame based on two columns

Example data.frame:
df = read.table(text = 'colA colB
2 7
2 7
2 7
2 7
1 7
1 7
1 7
89 5
89 5
89 5
88 5
88 5
70 5
70 5
70 5
69 5
69 5
44 4
44 4
44 4
43 4
42 4
42 4
41 4
41 4
120 1
100 1', header = TRUE)
I need to add an index column based on colA and colB. colB gives the exact number of rows in each group, but the same colB value can occur in more than one group. Each group consists of the rows with a given colA value and colA - 1.
Expected output:
colA colB index_col
2 7 1
2 7 1
2 7 1
2 7 1
1 7 1
1 7 1
1 7 1
89 5 2
89 5 2
89 5 2
88 5 2
88 5 2
70 5 3
70 5 3
70 5 3
69 5 3
69 5 3
44 4 4
44 4 4
44 4 4
43 4 4
42 4 5
42 4 5
41 4 5
41 4 5
120 1 6
100 1 7
UPDATE
How can I adapt the code that works for the above df for the same purpose, but with colB counting rows grouped on colA, colA - 1 and colA - 2 (i.e. considering 3 days instead of 2)?
new_df = read.table(text = 'colA colB
3 10
3 10
3 10
2 10
2 10
2 10
2 10
1 10
1 10
1 10
90 7
90 7
89 7
89 7
89 7
88 7
88 7
71 7
71 7
70 7
70 7
70 7
69 7
69 7
44 5
44 5
44 5
43 5
42 5
41 5
41 5
41 5
40 5
40 5
120 1
100 1', header = TRUE)
Expected output:
colA colB index_col
3 10 1
3 10 1
3 10 1
2 10 1
2 10 1
2 10 1
2 10 1
1 10 1
1 10 1
1 10 1
90 7 2
90 7 2
89 7 2
89 7 2
89 7 2
88 7 2
88 7 2
71 7 3
71 7 3
70 7 3
70 7 3
70 7 3
69 7 3
69 7 3
44 5 4
44 5 4
44 5 4
43 5 4
42 5 4
41 5 5
41 5 5
41 5 5
40 5 5
40 5 5
120 1 6
100 1 7
Thanks

We can use rleid
library(data.table)
index_col <-setDT(df)[, if(colB[1L] < .N) ((seq_len(.N)-1) %/% colB[1L])+1
else as.numeric(colB), rleid(colB)][, rleid(V1)]
df[, index_col := index_col]
df
# colA colB index_col
# 1: 2 7 1
# 2: 2 7 1
# 3: 2 7 1
# 4: 2 7 1
# 5: 1 7 1
# 6: 1 7 1
# 7: 1 7 1
# 8: 70 5 2
# 9: 70 5 2
#10: 70 5 2
#11: 69 5 2
#12: 69 5 2
#13: 89 5 3
#14: 89 5 3
#15: 89 5 3
#16: 88 5 3
#17: 88 5 3
#18: 120 1 4
#19: 100 1 5
Or a one-liner would be
setDT(df)[, index_col := df[, ((seq_len(.N)-1) %/% colB[1L])+1, rleid(colB)][, as.integer(interaction(.SD, drop = TRUE, lex.order = TRUE))]]
Update
Based on the new update in the OP's post
setDT(new_df)[, index_col := cumsum(c(TRUE, abs(diff(colA))> 1))
][, colB := .N , index_col]
new_df
# colA colB index_col
# 1: 3 10 1
# 2: 3 10 1
# 3: 3 10 1
# 4: 2 10 1
# 5: 2 10 1
# 6: 2 10 1
# 7: 2 10 1
# 8: 1 10 1
# 9: 1 10 1
#10: 1 10 1
#11: 71 7 2
#12: 71 7 2
#13: 70 7 2
#14: 70 7 2
#15: 70 7 2
#16: 69 7 2
#17: 69 7 2
#18: 90 7 3
#19: 90 7 3
#20: 89 7 3
#21: 89 7 3
#22: 89 7 3
#23: 88 7 3
#24: 88 7 3
#25: 44 2 4
#26: 43 2 4
#27: 120 1 5
#28: 100 1 6

An approach in base R:
df$idxcol <- cumsum(c(1,abs(diff(df$colA)) > 1) + c(0,diff(df$colB) != 0) > 0)
which gives:
> df
colA colB idxcol
1 2 7 1
2 2 7 1
3 2 7 1
4 2 7 1
5 1 7 1
6 1 7 1
7 1 7 1
8 70 5 2
9 70 5 2
10 70 5 2
11 69 5 2
12 69 5 2
13 89 5 3
14 89 5 3
15 89 5 3
16 88 5 3
17 88 5 3
18 120 1 4
19 100 1 5
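To make the cumsum() trick easier to follow, here is the same expression taken apart on a short made-up vector (the values of colA and colB below are invented for illustration and are not the question's data). A new group starts wherever colA jumps by more than 1 or colB changes:

```r
# Small invented example, not the data from the question
colA <- c(2, 2, 1, 70, 70, 69, 89, 88, 120, 100)
colB <- c(3, 3, 3, 3, 3, 3, 2, 2, 1, 1)

# 1 where colA differs from the previous row by more than 1 (first row always 1)
jump_in_A <- c(1, abs(diff(colA)) > 1)

# 1 where colB differs from the previous row (first row always 0)
change_in_B <- c(0, diff(colB) != 0)

# a row opens a new group if either flag is set; cumsum() numbers the groups
idxcol <- cumsum(jump_in_A + change_in_B > 0)
idxcol
# 1 1 1 2 2 2 3 3 4 5
```

Because + binds more tightly than >, the two flag vectors are added first and the comparison with 0 is applied to their sum.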
On the updated example data, you need to adapt the approach to:
n <- 1
idx1 <- cumsum(c(1, diff(df$colA) < -n) + c(0, diff(df$colB) != 0) > 0)
idx2 <- ave(df$colA, cumsum(c(1, diff(df$colA) < -n)), FUN = function(x) c(0, cumsum(diff(x)) < -n ))
idx2[idx2==1 & c(0,diff(idx2))==0] <- 0
df$idxcol <- idx1 + cumsum(idx2)
which gives:
> df
colA colB idxcol
1 2 7 1
2 2 7 1
3 2 7 1
4 2 7 1
5 1 7 1
6 1 7 1
7 1 7 1
8 89 5 2
9 89 5 2
10 89 5 2
11 88 5 2
12 88 5 2
13 70 5 3
14 70 5 3
15 70 5 3
16 69 5 3
17 69 5 3
18 44 4 4
19 44 4 4
20 44 4 4
21 43 4 4
22 42 4 5
23 42 4 5
24 41 4 5
25 41 4 5
26 120 1 6
27 100 1 7
For new_df, just change n to 2 and you will get the desired output for that as well.

Group rows and add sum column of unique values

Here an example of my data.frame:
df = read.table(text='colA colB colC
10 11 7
10 34 7
10 89 7
10 21 7
2 23 5
2 21 5
2 56 5
22 14 3
22 19 3
22 90 3
11 19 2
11 45 2
1 45 0
1 23 0
9 8 0
9 11 0
9 21 0', header = TRUE)
I need to group the rows by colA and colC and add a new column which states the sum of unique values based on colB.
Step by step, here is what I need to do for this specific data.frame:
group rows with colA = 10 and 9, colA = 2 and 1, colA = 22 and colA = 11;
find the unique values of colB per each group;
add the unique values in a new col (newcolD).
Note that colC states the total number of observations for colA = 10 and 9, colA = 2 and 1, colA = 22 and colA = 11.
The data.frame needs to remain ordered decreasingly by colC.
My expected output is:
colA colB colC newcolD
10 11 7 5
10 34 7 5
10 89 7 5
10 21 7 5
9 8 0 5
9 11 0 5
9 21 0 5
2 23 5 4
2 21 5 4
2 56 5 4
1 45 0 4
1 23 0 4
22 14 3 3
22 19 3 3
22 90 3 3
11 19 2 2
11 45 2 2
Note that in df the duplicated colB values are 11 and 21 for group 10 and 9, and 23 for group 2 and 1.

You can do that with dplyr. The trick is to create a new grouping column which groups consecutive values in colA. This is done with cumsum(c(1, diff(colA) < -1)) in the example below.
df1 = read.table(text='colA colB colC
10 11 7
10 34 7
10 89 7
10 21 7
2 23 5
2 21 5
2 56 5
22 14 3
22 19 3
22 90 3
1 45 0
1 23 0
9 8 0
9 11 0
9 21 0', header = TRUE,stringsAsFactors=FALSE)
library(dplyr)
df1 %>%
arrange(desc(colA)) %>%
group_by(group_sequential = cumsum(c(1, diff(colA) < -1))) %>%
mutate(newcolD=n_distinct(colB))
colA colB colC group_sequential newcolD
<int> <int> <int> <dbl> <int>
1 22 14 3 1 3
2 22 19 3 1 3
3 22 90 3 1 3
4 10 11 7 2 5
5 10 34 7 2 5
6 10 89 7 2 5
7 10 21 7 2 5
8 9 8 0 2 5
9 9 11 0 2 5
10 9 21 0 2 5
11 2 23 5 3 4
12 2 21 5 3 4
13 2 56 5 3 4
14 1 45 0 3 4
15 1 23 0 3 4
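The same grouping idea can also be sketched without dplyr. The version below is an assumption-laden base R translation, not the answerer's code: it retypes colA and colB from the first data set (colC is left out because it is not needed for the count), sorts by descending colA, builds the group id with the same cumsum() expression, and counts distinct colB values per group with ave():

```r
# colA and colB retyped from the first df1 above
df1 <- data.frame(
  colA = c(10, 10, 10, 10, 2, 2, 2, 22, 22, 22, 1, 1, 9, 9, 9),
  colB = c(11, 34, 89, 21, 23, 21, 56, 14, 19, 90, 45, 23, 8, 11, 21)
)

# sort so that consecutive colA values (e.g. 10 and 9) end up adjacent
df1 <- df1[order(-df1$colA), ]

# start a new group whenever colA drops by more than 1
grp <- cumsum(c(1, diff(df1$colA) < -1))

# number of distinct colB values within each group, repeated for every row
df1$newcolD <- ave(df1$colB, grp, FUN = function(v) length(unique(v)))
df1
```

On this data the result matches the newcolD column printed above: 3 for the group with colA 22, 5 for 10 and 9, and 4 for 2 and 1.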
EDIT FOR NEW DATA
With the data you added, we need to create a custom grouping. I use case_when in the example below. The result matches the order you show in the desired output. In the text, you wrote that you wanted the table to be sorted by colC; to do so, change the last line to arrange(desc(colC)).
df1 = read.table(text='colA colB colC
10 11 7
10 34 7
10 89 7
10 21 7
2 23 5
2 21 5
2 56 5
22 14 3
22 19 3
22 90 3
11 19 2
11 45 2
1 45 0
1 23 0
9 8 0
9 11 0
9 21 0', header = TRUE,stringsAsFactors=FALSE)
library(dplyr)
df1 %>%
  group_by(group_sequential = case_when(colA %in% c(10, 9) ~ 1,
                                        colA %in% c(2, 1) ~ 2,
                                        colA == 22 ~ 3,
                                        colA == 11 ~ 4)) %>%
mutate(newcolD=n_distinct(colB)) %>%
arrange(desc(newcolD))
colA colB colC group_sequential newcolD
<int> <int> <int> <dbl> <int>
1 10 11 7 1 5
2 10 34 7 1 5
3 10 89 7 1 5
4 10 21 7 1 5
5 9 8 0 1 5
6 9 11 0 1 5
7 9 21 0 1 5
8 2 23 5 2 4
9 2 21 5 2 4
10 2 56 5 2 4
11 1 45 0 2 4
12 1 23 0 2 4
13 22 14 3 3 3
14 22 19 3 3 3
15 22 90 3 3 3
16 11 19 2 4 2
17 11 45 2 4 2

You're really not making it easy for us, reposting slight variations of the same question instead of updating the old one and presenting conditions that are vague and inconsistent with what the desired output implies. Anyhow, here is my attempt. This is more an answer to the second question you posted, as that was a bit more general in form.
It's a bit messy: it's pretty much a direct translation of your conditions into a for loop with some if statements. I chose to focus on your written conditions rather than the expected output, as those were easier to understand. If you want a better answer, please consider cleaning up your question(s) considerably.
df1 <- read.table(text="
colA colB colC
10 11 7
10 34 7
10 89 7
10 21 7
2 23 5
2 21 5
2 56 5
22 14 3
22 19 3
22 90 3
11 19 2
11 45 2
1 45 0
1 23 0
9 8 0
9 11 0
9 21 0", header=TRUE)
df2 <- read.table(text="
colA colB colC
10 11 7
10 34 7
10 89 7
10 21 7
2 23 5
2 21 5
2 56 5
33 24 3
33 78 3
22 14 3
22 19 3
22 90 3
11 19 2
11 45 2
1 45 0
1 23 0
9 8 0
9 11 0
9 21 0
32 11 0", header=TRUE)
df <- df1
for (i in 1:nrow(df)) {
df$colD[i] <- ifelse(df$colC[i] == 0,
0,
length(unique(df$colA[1:i])))
if (any(df$colA[i]-1 == df$colA[1:i]) & df$colC[i] != 0) {
df$colD[i] <- df$colD[which(df$colA[i]-1 == df$colA[1:i])][1]
}
}
# colA colB colC colD
# 10 11 7 1
# 10 34 7 1
# 10 89 7 1
# 10 21 7 1
# 2 23 5 2
# 2 21 5 2
# 2 56 5 2
# 22 14 3 3
# 22 19 3 3
# 22 90 3 3
# 11 19 2 1
# 11 45 2 1
# 1 45 0 0
# 1 23 0 0
# 9 8 0 0
# 9 11 0 0
# 9 21 0 0
df <- df2
for (i in 1:nrow(df)) {
df$colD[i] <- ifelse(df$colC[i] == 0,
0,
length(unique(df$colA[1:i])))
if (any(df$colA[i]-1 == df$colA[1:i]) & df$colC[i] != 0) {
df$colD[i] <- df$colD[which(df$colA[i]-1 == df$colA[1:i])][1]
}
}
df
# colA colB colC colD
# 10 11 7 1
# 10 34 7 1
# 10 89 7 1
# 10 21 7 1
# 2 23 5 2
# 2 21 5 2
# 2 56 5 2
# 33 24 3 3
# 33 78 3 3
# 22 14 3 4
# 22 19 3 4
# 22 90 3 4
# 11 19 2 1
# 11 45 2 1
# 1 45 0 0
# 1 23 0 0
# 9 8 0 0
# 9 11 0 0
# 9 21 0 0
# 32 11 0 0
To also group the rows where colC is zero, it's sufficient to adjust the conditionals like this:
for (i in 1:nrow(df)) {
df$colD[i] <- length(unique(df$colA[1:i]))
if (any(df$colA[i]-1 == df$colA[1:i])) {
df$colD[i] <- df$colD[which(df$colA[i]-1 == df$colA[1:i])][1]
}
}

Quartiles by group saved as new variable in data frame

I have data that look something like this:
id <- c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8,9,9,9)
yr <- c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
gr <- c(3,4,5,3,4,5,3,4,5,4,5,6,4,5,6,4,5,6,5,6,7,5,6,7,5,6,7)
x <- c(33,48,31,41,31,36,25,38,28,17,39,53,60,60,19,39,34,47,20,28,38,15,17,49,48,45,39)
df <- data.frame(id,yr,gr,x)
id yr gr x
1 1 1 3 33
2 1 2 4 48
3 1 3 5 31
4 2 1 3 41
5 2 2 4 31
6 2 3 5 36
7 3 1 3 25
8 3 2 4 38
9 3 3 5 28
10 4 1 4 17
11 4 2 5 39
12 4 3 6 53
13 5 1 4 60
14 5 2 5 60
15 5 3 6 19
16 6 1 4 39
17 6 2 5 34
18 6 3 6 47
19 7 1 5 20
20 7 2 6 28
21 7 3 7 38
22 8 1 5 15
23 8 2 6 17
24 8 3 7 49
25 9 1 5 48
26 9 2 6 45
27 9 3 7 39
I would like to create a new variable in the data frame that contains the quantiles of "x" computed within each unique combination of "yr" and "gr". That is, rather than finding the quantiles of "x" based on all 27 rows of data in the example, I would like to compute the quantiles by two grouping variables: yr and gr. For instance, the quantiles of "x" when yr = 1 and gr = 3, yr = 1 and gr = 4, etc.
Once these values are computed, I would like them to be appended to the data frame as a single column, say "x_quant".
I am able to split the data into the separate groups I need, and I know how to calculate quantiles, but I am having trouble combining the two steps in a way that is amenable to creating a new column in the existing data frame.
Any help y'all can provide would be greatly appreciated! Thank you much!
~kj

# turn "yr" and "gr" into sortable column
df$y <- paste(df$yr,"",df$gr)
df.ordered <- df[order(df$y),] #sort df based on group
grp <- split(df.ordered,df.ordered$y);grp
# get quantiles and turn results into string
q <- vector('list')
for (i in 1:length(grp)) {
a <- quantile(grp[[i]]$x)
q[i] <- paste(a[1],"",a[2],"",a[3],"",a[4],"",a[5])
}
x_quant <- unlist(sapply(q, `[`, 1))
x_quant <- rep(x_quant,each=3)
# append quantile back to data frame. Gave new column a more descriptive name
df.ordered$xq_0_25_50_75_100 <- x_quant
df.ordered$y <- NULL
df <- df.ordered;df
Output:
> # turn "yr" and "gr" into sortable column
> df$y <- paste(df$yr,"",df$gr)
> df.ordered <- df[order(df$y),] #sort df based on group
> grp <- split(df.ordered,df.ordered$y);grp
$`1 3`
id yr gr x y
1 1 1 3 33 1 3
4 2 1 3 41 1 3
7 3 1 3 25 1 3
$`1 4`
id yr gr x y
10 4 1 4 17 1 4
13 5 1 4 60 1 4
16 6 1 4 39 1 4
$`1 5`
id yr gr x y
19 7 1 5 20 1 5
22 8 1 5 15 1 5
25 9 1 5 48 1 5
$`2 4`
id yr gr x y
2 1 2 4 48 2 4
5 2 2 4 31 2 4
8 3 2 4 38 2 4
$`2 5`
id yr gr x y
11 4 2 5 39 2 5
14 5 2 5 60 2 5
17 6 2 5 34 2 5
$`2 6`
id yr gr x y
20 7 2 6 28 2 6
23 8 2 6 17 2 6
26 9 2 6 45 2 6
$`3 5`
id yr gr x y
3 1 3 5 31 3 5
6 2 3 5 36 3 5
9 3 3 5 28 3 5
$`3 6`
id yr gr x y
12 4 3 6 53 3 6
15 5 3 6 19 3 6
18 6 3 6 47 3 6
$`3 7`
id yr gr x y
21 7 3 7 38 3 7
24 8 3 7 49 3 7
27 9 3 7 39 3 7
> # get quantiles and turn results into string
> q <- vector('list')
> for (i in 1:length(grp)) {
+ a <- quantile(grp[[i]]$x)
+ q[i] <- paste(a[1],"",a[2],"",a[3],"",a[4],"",a[5])
+ }
> x_quant <- unlist(sapply(q, `[`, 1))
> x_quant <- rep(x_quant,each=3)
> # append quantile back to data frame
> df.ordered$xq_0_25_50_75_100 <- x_quant
> df.ordered$y <- NULL
> df <- df.ordered
> df
id yr gr x xq_0_25_50_75_100
1 1 1 3 33 25 29 33 37 41
4 2 1 3 41 25 29 33 37 41
7 3 1 3 25 25 29 33 37 41
10 4 1 4 17 17 28 39 49.5 60
13 5 1 4 60 17 28 39 49.5 60
16 6 1 4 39 17 28 39 49.5 60
19 7 1 5 20 15 17.5 20 34 48
22 8 1 5 15 15 17.5 20 34 48
25 9 1 5 48 15 17.5 20 34 48
2 1 2 4 48 31 34.5 38 43 48
5 2 2 4 31 31 34.5 38 43 48
8 3 2 4 38 31 34.5 38 43 48
11 4 2 5 39 34 36.5 39 49.5 60
14 5 2 5 60 34 36.5 39 49.5 60
17 6 2 5 34 34 36.5 39 49.5 60
20 7 2 6 28 17 22.5 28 36.5 45
23 8 2 6 17 17 22.5 28 36.5 45
26 9 2 6 45 17 22.5 28 36.5 45
3 1 3 5 31 28 29.5 31 33.5 36
6 2 3 5 36 28 29.5 31 33.5 36
9 3 3 5 28 28 29.5 31 33.5 36
12 4 3 6 53 19 33 47 50 53
15 5 3 6 19 19 33 47 50 53
18 6 3 6 47 19 33 47 50 53
21 7 3 7 38 38 38.5 39 44 49
24 8 3 7 49 38 38.5 39 44 49
27 9 3 7 39 38 38.5 39 44 49
>
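If a single numeric value per row is enough, for example the group median (the 50% quantile), ave() can attach it without sorting or splitting by hand. A minimal sketch on the question's data; x_q50 and grp are illustrative names, and another probability can be substituted for 0.5:

```r
# Data from the question
id <- rep(1:9, each = 3)
yr <- rep(1:3, times = 9)
gr <- c(3,4,5,3,4,5,3,4,5,4,5,6,4,5,6,4,5,6,5,6,7,5,6,7,5,6,7)
x  <- c(33,48,31,41,31,36,25,38,28,17,39,53,60,60,19,39,34,47,
        20,28,38,15,17,49,48,45,39)
df <- data.frame(id, yr, gr, x)

# one factor level per yr/gr combination that actually occurs
grp <- interaction(df$yr, df$gr, drop = TRUE)

# median of x within each group, repeated for every row of that group
df$x_q50 <- ave(df$x, grp, FUN = function(v) quantile(v, probs = 0.5))
head(df)
```

Unlike the string column above, this keeps the rows in their original order and stores the quantile as a number, at the cost of needing one column per quantile.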
