I am trying to create a vector where I have 3 repetitions of the number 1, then 3 repetitions of the number 2, and so on up to, for instance, 3 repetitions of the number 36.
c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5...)
I tried rep(3, seq(1:36)) but got the following error:
Error in rep(3, seq(1:36)) : argument 'times' incorrect
What formulation do I need to use to properly generate the vector I want?
sort(rep(1:36, 3))
Or, even better, as @Wimpel mentioned in the comments, use the each argument of the rep function.
rep(1:36, each = 3)
output
# [1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9 9 9 10 10 10 11 11 11 12 12 12 13 13 13 14 14 14 15 15 15 16 16 16 17 17 17 18 18 18 19 19 19 20 20 20 21 21 21 22
# [65] 22 22 23 23 23 24 24 24 25 25 25 26 26 26 27 27 27 28 28 28 29 29 29 30 30 30 31 31 31 32 32 32 33 33 33 34 34 34 35 35 35 36 36 36
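A minimal sketch of the difference between the times and each arguments of rep, which may also explain the error in the question. The variable names are only illustrative, and the short vector 1:3 stands in for 1:36.

```r
x <- 1:3

# times repeats the whole vector; each repeats every element in place.
by_times <- rep(x, times = 3)   # 1 2 3 1 2 3 1 2 3
by_each  <- rep(x, each = 3)    # 1 1 1 2 2 2 3 3 3

# A vector-valued times needs one entry per element of x.
by_times_vec <- rep(x, times = c(3, 3, 3))   # same values as each = 3

# rep(3, 1:36) fails because x = 3 has length 1 while times has length 36.
err <- tryCatch(rep(3, 1:36), error = function(e) conditionMessage(e))
```

In other words, the original call asked for the single value 3 to be repeated, with 36 different repeat counts supplied for that one element.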
This one should work, although it is probably not the most elegant.
reps = c()
n = 36
for(i in 1:n){
reps = append(reps, rep(i, 3))
}
reps
Alternatively, use the rep function properly (see the documentation, ?rep, for the argument each):
rep(1:36,each = 3)
The rep approach is preferable (see the existing answers).
Here are some other options:
> kronecker(1:36, rep(1, 3))
[1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9
[26] 9 9 10 10 10 11 11 11 12 12 12 13 13 13 14 14 14 15 15 15 16 16 16 17 17
[51] 17 18 18 18 19 19 19 20 20 20 21 21 21 22 22 22 23 23 23 24 24 24 25 25 25
[76] 26 26 26 27 27 27 28 28 28 29 29 29 30 30 30 31 31 31 32 32 32 33 33 33 34
[101] 34 34 35 35 35 36 36 36
> c(outer(rep(1, 3), 1:36))
[1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9
[26] 9 9 10 10 10 11 11 11 12 12 12 13 13 13 14 14 14 15 15 15 16 16 16 17 17
[51] 17 18 18 18 19 19 19 20 20 20 21 21 21 22 22 22 23 23 23 24 24 24 25 25 25
[76] 26 26 26 27 27 27 28 28 28 29 29 29 30 30 30 31 31 31 32 32 32 33 33 33 34
[101] 34 34 35 35 35 36 36 36
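One detail worth checking if the type of the result matters: because rep(1, 3) is a double vector, the kronecker and outer variants return doubles, while rep(1:36, each = 3) keeps the integer type. A small sketch comparing them (variable names are illustrative):

```r
by_rep   <- rep(1:36, each = 3)
by_kron  <- kronecker(1:36, rep(1, 3))
by_outer <- c(outer(rep(1, 3), 1:36))

typeof(by_rep)    # "integer"
typeof(by_kron)   # "double"

# After converting to integer, all three hold the same values.
same_values <- identical(as.integer(by_kron), by_rep) &&
  identical(as.integer(by_outer), by_rep)
```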
I have the following vector of indices
V_ind = cumsum(c(10,9,8,7,6,5,4,3,2,1))
[1] 10 19 27 34 40 45 49 52 54 55
And I created the following FOR LOOP
k=1
for(ind in V_ind){
if(ind<=10){
print("ok")
}else{
print(c(V_ind[1:k]))
k = k + 1
}
}
Which gives as a result
[1] "ok"
[1] 10
[1] 10 19
[1] 10 19 27
[1] 10 19 27 34
[1] 10 19 27 34 40
[1] 10 19 27 34 40 45
[1] 10 19 27 34 40 45 49
[1] 10 19 27 34 40 45 49 52
[1] 10 19 27 34 40 45 49 52 54
However, what I am trying to achieve is the following result
[1] "ok"
[1] 10
[1] 9 10 19
[1] 8 9 10 18 19 27
[1] 7 8 9 10 17 18 19 26 27 34
[1] 6 7 8 9 10 16 17 18 19 25 26 27 33 34 40
[1] 5 6 7 8 9 10 15 16 17 18 19 24 25 26 27 32 33 34 39 40 45
[1] 4 5 6 7 8 9 10 14 15 16 17 18 19 23 24 25 26 27 31 32 33 34 38 39 40 44 45 49
[1] 3 4 5 6 7 8 9 10 13 14 15 16 17 18 19 22 23 24 25 26 27 30 31 32 33 34 37 38 39 40 43 44 45 48 49 52
[1] 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 29 30 31 32 33 34 36 37 38 39 40 42 43 44 45 47 48 49 51 52 54
This result goes as follows:
In the first iteration, we just print OK
In the second iteration, we extract the first element of the vector V_ind,
In the third iteration, we extract the first and second element of the vector V_ind together with the first element of the vector V_ind minus 1 that is the number 9.
In the fourth iteration, we extract the first, second and third elements of the vector V_ind, together with the first element minus 1, i.e. 9, the first element minus 2, i.e. 8, and the second element minus 1, i.e. 18.
In the fifth iteration, we extract the first, second, third and fourth elements of the vector V_ind, together with the first element minus 3, 2 and 1, i.e. 7, 8, 9, the second element minus 2 and 1, i.e. 17, 18, and the third element minus 1, i.e. 26.
This procedure continues until the end of the FOR LOOP. Is it possible to do this in R in a generic way?
One option using purrr could be:
library(purrr)
map(.x = accumulate(V_ind, c),
~ map2(.x,
rev(seq_along(.x) - 1),
function(y, z) seq(y - z, y, 1)) %>%
reduce(c))
[[1]]
[1] 10
[[2]]
[1] 9 10 19
[[3]]
[1] 8 9 10 18 19 27
[[4]]
[1] 7 8 9 10 17 18 19 26 27 34
[[5]]
[1] 6 7 8 9 10 16 17 18 19 25 26 27 33 34 40
[[6]]
[1] 5 6 7 8 9 10 15 16 17 18 19 24 25 26 27 32 33 34 39 40 45
[[7]]
[1] 4 5 6 7 8 9 10 14 15 16 17 18 19 23 24 25 26 27 31 32 33 34 38 39 40 44 45 49
[[8]]
[1] 3 4 5 6 7 8 9 10 13 14 15 16 17 18 19 22 23 24 25 26 27 30 31 32 33 34 37 38 39 40 43 44 45 48 49 52
[[9]]
[1] 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 29 30 31 32 33 34 36 37 38 39 40 42 43 44 45 47 48
[42] 49 51 52 54
[[10]]
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41
[42] 42 43 44 45 46 47 48 49 50 51 52 53 54 55
And if it's important, you can simply add the "OK" iteration retrospectively:
append("OK",
map(.x = accumulate(V_ind, c),
~ map2(.x,
rev(seq_along(.x) - 1),
function(y, z) seq(y - z, y, 1)) %>%
reduce(c)))
Likewise, if you need to leave out the last number from the original vector:
append("OK",
map(.x = accumulate(head(V_ind, -1), c),
~ map2(.x,
rev(seq_along(.x) - 1),
function(y, z) seq(y - z, y, 1)) %>%
reduce(c)))
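For readers without purrr, the same idea can be sketched in base R. This is only an illustration of the logic above (take the first k elements, then expand each one into a run that counts up to it), not the answer's own code; the name res is made up here.

```r
V_ind <- cumsum(c(10, 9, 8, 7, 6, 5, 4, 3, 2, 1))

res <- lapply(seq_along(V_ind), function(k) {
  prefix  <- V_ind[seq_len(k)]      # first k elements of V_ind
  offsets <- rev(seq_len(k) - 1)    # k-1, ..., 1, 0: how far back each run starts
  # For each element y with offset z, build the run (y - z):y and join the runs.
  unlist(Map(function(y, z) seq(y - z, y), prefix, offsets))
})

res[[3]]   # 8 9 10 18 19 27
```

Here lapply over seq_along(V_ind) plays the role of accumulate, Map replaces map2, and unlist replaces reduce(c).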
for (i in seq_along(V_ind)) {
if (i == 1) {
print("ok")
} else if (i == 2) {
print(V_ind[1])
} else {
out_vector <- V_ind[seq(i - 1)]
max_minus <- i - 2
minus_indices <- rep(seq(max_minus), rev(seq(max_minus)) + 1)
minus_vector <- c()
for (j in rev(seq(max_minus))) {
minus_vector <- c(minus_vector, rev(seq(0, j)))
}
out_vector <- numeric(length(minus_vector))
for (k in seq_along(out_vector)) {
out_vector[k] <- V_ind[minus_indices[k]] - minus_vector[k]
}
print(c(out_vector, V_ind[i - 1]))
}
}
[1] "ok"
[1] 10
[1] 9 10 19
[1] 8 9 10 18 19 27
[1] 7 8 9 10 17 18 19 26 27 34
[1] 6 7 8 9 10 16 17 18 19 25 26 27 33 34 40
[1] 5 6 7 8 9 10 15 16 17 18 19 24 25 26 27 32 33 34 39 40 45
[1] 4 5 6 7 8 9 10 14 15 16 17 18 19 23 24 25 26 27 31 32 33 34 38 39 40 44 45 49
[1] 3 4 5 6 7 8 9 10 13 14 15 16 17 18 19 22 23 24 25 26 27 30 31 32 33 34 37 38 39 40 43 44 45 48 49 52
[1] 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 29 30 31 32 33 34 36 37 38 39 40 42 43 44 45 47 48 49
[43] 51 52 54
You could explicitly define the indices at which to subtract and how much to subtract (+1 is added to the seq so that 0 is also subtracted). Then you just have to append the last item (V_ind[i - 1]), where no subtraction is performed, to the vector.
Another option, where sequence plays the key role:
lapply(seq_along(x), function(n){
x[rep(1:n, n:1)] - rev(sequence(1:n) - 1)
})
# [[1]]
# [1] 10
#
# [[2]]
# [1] 9 10 19
#
# [[3]]
# [1] 8 9 10 18 19 27
#
# [[4]]
# [1] 7 8 9 10 17 18 19 26 27 34
Where x is a subset of your vector:
x = cumsum(10:7)
If desired, just prepend "ok" to the above with c().
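To see why the sequence-based one-liner works, it may help to look at its pieces for a single n. The intermediate names idx and offsets are introduced here purely for illustration.

```r
x <- cumsum(10:7)                    # 10 19 27 34
n <- 3

idx     <- rep(1:n, n:1)             # 1 1 1 2 2 3: which element of x each slot uses
offsets <- rev(sequence(1:n) - 1)    # 2 1 0 1 0 0: how much to subtract in each slot
result  <- x[idx] - offsets          # 8 9 10 18 19 27
```

The first element is used n times, the second n - 1 times, and so on, and within each group the subtracted amount counts down to 0.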
I have a very large dataset (> 200000 lines) with 6 variables (only the first two shown)
>head(gt7)
ChromKey POS
1 2447 25
2 2447 183
3 26341 75
4 26341 2213
5 26341 2617
6 54011 1868
I have converted the ChromKey variable to a factor variable made up of > 55000 levels.
> gt7[1] <- lapply(gt7[1], factor)
> is.factor(gt7$ChromKey)
[1] TRUE
I can further make a table with counts of ChromKey levels
> table(gt7$ChromKey)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
88 88 44 33 11 11 33 22 121 11 22 11 11 11 22 11 33
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
22 22 44 55 22 11 22 66 11 11 11 22 11 11 11 187 77
35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
77 11 44 11 11 11 11 11 11 22 66 11 22 11 44 22 22
... output cropped
Which I can save in table format
> table <- table(gt7$ChromKey)
> head(table)
1 2 3 4 5 6
88 88 44 33 11 11
I would like to know whether it is possible to have a table (and histogram) of the number of levels with specific count numbers. From the example above, I would expect
88 44 33 11
2 1 1 2
I would very much appreciate any hint.
We can apply table again to the output to get the frequency of each count:
table(table(gt7$ChromKey))
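A small self-contained sketch of the same idea, using made-up data that mimics the first six levels shown in the question (the real gt7 is not available here, so chrom_key is a stand-in):

```r
# Toy stand-in for gt7$ChromKey: six levels with counts 88, 88, 44, 33, 11, 11.
chrom_key <- factor(rep(1:6, times = c(88, 88, 44, 33, 11, 11)))

per_level       <- table(chrom_key)   # rows per level
count_of_counts <- table(per_level)   # how many levels share each row count

count_of_counts
# barplot(count_of_counts)   # one possible way to draw the requested histogram
```

Note that the result is ordered by count value (11, 33, 44, 88) rather than in the order shown in the question.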
I want to display a vector consistently in different R environments.
For example, a vector like this
c(1:30)
should display 24 values per row
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
[25] 25 26 27 28 29 30
and not
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
The closest thing to what you are looking for is to use options() to configure the width of the results window:
options(width = 75)
c(1:30)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
[24] 24 25 26 27 28 29 30
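If the goal is the same layout regardless of the console, one possible pattern is to set the width only for the duration of a single print call and then restore it. The helper name print_at_width below is hypothetical, not part of base R.

```r
# Hypothetical helper: print x at a fixed line width, then restore the old width.
print_at_width <- function(x, width = 75) {
  old <- options(width = width)   # options() returns the previous values
  on.exit(options(old))           # restore the caller's setting on exit
  print(x)
  invisible(x)
}

width_before <- getOption("width")
lines <- capture.output(print_at_width(1:30))   # capture the printed lines
width_after <- getOption("width")               # unchanged by the helper
```

The number of values per row still depends on how wide the index label and each value are, so a given width does not guarantee the same count per row for every vector.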
What's the name for the [1] below?
What is its significance?
Is it always only [1]? If not, then under what conditions is it something else? (example please)
> bb <- c(5,6,7)
> bb
[1] 5 6 7
It shows the index of the first element printed on that line. In your case, it shows
bb <- c(5,6,7)
> bb
# [1] 5 6 7
Try,
c(1:50)
#[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
#[35] 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
You can also avoid that being displayed by using cat
cat(c(1:50))
#1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
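A rough way to check what the bracketed number means is to print a vector at a narrow width and compare each label with the first value on its line. This sketch assumes a plain numeric vector of whole numbers; the regular expressions are only meant for that case.

```r
x <- seq(10, 500, by = 10)    # 50 values, so printing wraps onto several lines

old <- options(width = 40)    # narrow width to force wrapping
lines <- capture.output(print(x))
options(old)                  # restore the previous width

# Pull the bracketed label and the first value from each printed line.
labels       <- as.integer(sub("^ *\\[([0-9]+)\\].*$", "\\1", lines))
first_values <- as.numeric(sub("^ *\\[[0-9]+\\] +([0-9]+).*$", "\\1", lines))

# Each label is the position in x of the first value on that line.
all(x[labels] == first_values)
```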