Sequence of Number, through Iteration - r

I have the following vector of indices
V_ind = cumsum(c(10,9,8,7,6,5,4,3,2,1))
[1] 10 19 27 34 40 45 49 52 54 55
And I created the following FOR LOOP
k=1
for(ind in V_ind){
if(ind<=10){
print("ok")
}else{
print(c(V_ind[1:k]))
k = k + 1
}
}
Which gives as a result
[1] "ok"
[1] 10
[1] 10 19
[1] 10 19 27
[1] 10 19 27 34
[1] 10 19 27 34 40
[1] 10 19 27 34 40 45
[1] 10 19 27 34 40 45 49
[1] 10 19 27 34 40 45 49 52
[1] 10 19 27 34 40 45 49 52 54
However, what I am trying to achieve is the following result
[1] "ok"
[1] 10
[1] 9 10 19
[1] 8 9 10 18 19 27
[1] 7 8 9 10 17 18 19 26 27 34
[1] 6 7 8 9 10 16 17 18 19 25 26 27 33 34 40
[1] 5 6 7 8 9 10 15 16 17 18 19 24 25 26 27 32 33 34 39 40 45
[1] 4 5 6 7 8 9 10 14 15 16 17 18 19 23 24 25 26 27 31 32 33 34 38 39 40 44 45 49
[1] 3 4 5 6 7 8 9 10 13 14 15 16 17 18 19 22 23 24 25 26 27 30 31 32 33 34 37 38 39 40 43 44 45 48 49 52
[1] 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 29 30 31 32 33 34 36 37 38 39 40 42 43 44 45 47 48 49 51 52 54
This result goes as follows:
In the first iteration, we just print OK
In the second iteration, we extract the first element of the vector V_ind,
In the third iteration, we extract the first and second elements of the vector V_ind, together with the first element minus 1, i.e. 9.
In the fourth iteration, we extract the first, second and third elements of the vector V_ind, together with the first element minus 1 and minus 2, i.e. 9 and 8, and the second element minus 1, i.e. 18.
In the fifth iteration, we extract the first, second, third and fourth elements of the vector V_ind, together with the first element minus 1, 2 and 3, i.e. 9, 8 and 7, the second element minus 1 and 2, i.e. 18 and 17, and the third element minus 1, i.e. 26.
This procedure continues until the end of the FOR LOOP. Can this be done in R in a generic way?

One option using purrr could be:
map(.x = accumulate(V_ind, c),
~ map2(.x,
rev(seq_along(.x) - 1),
function(y, z) seq(y - z, y, 1)) %>%
reduce(c))
[[1]]
[1] 10
[[2]]
[1] 9 10 19
[[3]]
[1] 8 9 10 18 19 27
[[4]]
[1] 7 8 9 10 17 18 19 26 27 34
[[5]]
[1] 6 7 8 9 10 16 17 18 19 25 26 27 33 34 40
[[6]]
[1] 5 6 7 8 9 10 15 16 17 18 19 24 25 26 27 32 33 34 39 40 45
[[7]]
[1] 4 5 6 7 8 9 10 14 15 16 17 18 19 23 24 25 26 27 31 32 33 34 38 39 40 44 45 49
[[8]]
[1] 3 4 5 6 7 8 9 10 13 14 15 16 17 18 19 22 23 24 25 26 27 30 31 32 33 34 37 38 39 40 43 44 45 48 49 52
[[9]]
[1] 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 29 30 31 32 33 34 36 37 38 39 40 42 43 44 45 47 48
[42] 49 51 52 54
[[10]]
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41
[42] 42 43 44 45 46 47 48 49 50 51 52 53 54 55
And if it's important, you can simply add the "OK" iteration retrospectively:
append("OK",
map(.x = accumulate(V_ind, c),
~ map2(.x,
rev(seq_along(.x) - 1),
function(y, z) seq(y - z, y, 1)) %>%
reduce(c)))
Likewise, if you need to leave out the last number from the original vector:
append("OK",
map(.x = accumulate(head(V_ind, -1), c),
~ map2(.x,
rev(seq_along(.x) - 1),
function(y, z) seq(y - z, y, 1)) %>%
reduce(c)))
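For readers who would rather avoid the purrr dependency, the same idea can be sketched in base R. This is only an illustration and not part of the answer above; the names res and runs are assumptions introduced here.

```r
# Same vector as in the question
V_ind <- cumsum(c(10, 9, 8, 7, 6, 5, 4, 3, 2, 1))

# For the first n elements, expand element j into the run
# (V_ind[j] - (n - j)):V_ind[j], then concatenate all runs.
res <- lapply(seq_along(V_ind), function(n) {
  runs <- Map(function(y, z) seq(y - z, y),
              V_ind[seq_len(n)],
              rev(seq_len(n) - 1))
  unlist(runs)
})

res[[3]]  # 8 9 10 18 19 27
```

As with the purrr version, "ok" can be prepended with c("ok", res), and head(V_ind, -1) can be used to drop the last iteration.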

for (i in seq_along(V_ind)) {
if (i == 1) {
print("ok")
} else if (i == 2) {
print(V_ind[1])
} else {
# Which element of V_ind each output position is derived from
max_minus <- i - 2
minus_indices <- rep(seq(max_minus), rev(seq(max_minus)) + 1)
# How much to subtract at each position: j, j - 1, ..., 0 per element
minus_vector <- c()
for (j in rev(seq(max_minus))) {
minus_vector <- c(minus_vector, rev(seq(0, j)))
}
out_vector <- numeric(length(minus_vector))
for (k in seq_along(out_vector)) {
out_vector[k] <- V_ind[minus_indices[k]] - minus_vector[k]
}
# Append the last element, from which nothing is subtracted
print(c(out_vector, V_ind[i - 1]))
}
}
[1] "ok"
[1] 10
[1] 9 10 19
[1] 8 9 10 18 19 27
[1] 7 8 9 10 17 18 19 26 27 34
[1] 6 7 8 9 10 16 17 18 19 25 26 27 33 34 40
[1] 5 6 7 8 9 10 15 16 17 18 19 24 25 26 27 32 33 34 39 40 45
[1] 4 5 6 7 8 9 10 14 15 16 17 18 19 23 24 25 26 27 31 32 33 34 38 39 40 44 45 49
[1] 3 4 5 6 7 8 9 10 13 14 15 16 17 18 19 22 23 24 25 26 27 30 31 32 33 34 37 38 39 40 43 44 45 48 49 52
[1] 2 3 4 5 6 7 8 9 10 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 29 30 31 32 33 34 36 37 38 39 40 42 43 44 45 47 48 49
[43] 51 52 54
You could define the indices at which to subtract, and how much to subtract, explicitly (1 is added to the repeat counts so that each element also appears with 0 subtracted). Then you just have to append the last item (V_ind[i - 1]), from which nothing is subtracted, to the vector.

Another option where sequence plays the key role
lapply(seq_along(x), function(n){
x[rep(1:n, n:1)] - rev(sequence(1:n) - 1)
})
# [[1]]
# [1] 10
#
# [[2]]
# [1] 9 10 19
#
# [[3]]
# [1] 8 9 10 18 19 27
#
# [[4]]
# [1] 7 8 9 10 17 18 19 26 27 34
Where x is a subset of your vector:
x = cumsum(10:7)
If desired, just prepend "ok" to the above with c().
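To see why this works, it may help to look at the two building blocks separately for a single n. The following is a small sketch; the names idx, offs and out are chosen here for illustration only.

```r
x <- cumsum(10:7)  # 10 19 27 34
n <- 3

# Which element of x each output position comes from
idx <- rep(1:n, n:1)            # 1 1 1 2 2 3

# How much to subtract at each position
offs <- rev(sequence(1:n) - 1)  # 2 1 0 1 0 0

out <- x[idx] - offs
out                             # 8 9 10 18 19 27
```

rep(1:n, n:1) repeats the first element most often and the last element once, while sequence(1:n) concatenates 1, 1:2, ..., 1:n, which after shifting and reversing gives the countdown for each element.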

Related

Generating a vector with n repetitions of x, then y, then z, with a fixed upper bound

I am trying to create a vector where I have 3 repetitions of the number 1, then 3 repetitions of the number 2, and so on up to, for instance, 3 repetitions of the number 36.
c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5...)
I have tried the following use of rep() but got the following error:
Error in rep(3, seq(1:36)) : argument 'times' incorrect
What formulation do I need to use to properly generate the vector I want?
sort(rep(1:36, 3))
Or even better, as @Wimpel mentioned in the comments, use the each argument of the rep function.
rep(1:36, each = 3)
output
# [1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9 9 9 10 10 10 11 11 11 12 12 12 13 13 13 14 14 14 15 15 15 16 16 16 17 17 17 18 18 18 19 19 19 20 20 20 21 21 21 22
# [65] 22 22 23 23 23 24 24 24 25 25 25 26 26 26 27 27 27 28 28 28 29 29 29 30 30 30 31 31 31 32 32 32 33 33 33 34 34 34 35 35 35 36 36 36
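The error in the question comes from passing a times vector of length 36 for an x of length 1; times must have either length 1 or the same length as x. A short sketch of the three relevant forms, using a small vector so the results are easy to check (the variable names are illustrative):

```r
x <- 1:3

by_times <- rep(x, times = 3)         # 1 2 3 1 2 3 1 2 3
by_each  <- rep(x, each = 3)          # 1 1 1 2 2 2 3 3 3
by_vec   <- rep(x, times = c(3, 3, 3))  # 1 1 1 2 2 2 3 3 3

identical(by_each, by_vec)            # TRUE
```

With a vector for times, each element of x gets its own repeat count, which is why its length has to match length(x).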
This one should work, although it is probably not the most elegant.
reps = c()
n = 36
for(i in 1:n){
reps = append(reps, rep(i, 3))
}
reps
Alternatively, use the rep function properly (see the documentation, ?rep, for the argument each):
rep(1:36,each = 3)
The rep approach is preferable (see the existing answers).
Here are some other options:
> kronecker(1:36, rep(1, 3))
[1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9
[26] 9 9 10 10 10 11 11 11 12 12 12 13 13 13 14 14 14 15 15 15 16 16 16 17 17
[51] 17 18 18 18 19 19 19 20 20 20 21 21 21 22 22 22 23 23 23 24 24 24 25 25 25
[76] 26 26 26 27 27 27 28 28 28 29 29 29 30 30 30 31 31 31 32 32 32 33 33 33 34
[101] 34 34 35 35 35 36 36 36
> c(outer(rep(1, 3), 1:36))
[1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9
[26] 9 9 10 10 10 11 11 11 12 12 12 13 13 13 14 14 14 15 15 15 16 16 16 17 17
[51] 17 18 18 18 19 19 19 20 20 20 21 21 21 22 22 22 23 23 23 24 24 24 25 25 25
[76] 26 26 26 27 27 27 28 28 28 29 29 29 30 30 30 31 31 31 32 32 32 33 33 33 34
[101] 34 34 35 35 35 36 36 36

Changing multiple column names with a vector

I have tried to change column names based on a vector as follows:
library(data.table)
df <- fread(
"radio1 radio2 radio3 radio4 radio5 radio6 radio7
8 12 18 32 40 36 32
6 12 18 24 30 36 30
8 16 18 24 30 36 18
4 12 12 24 30 36 24
6 16 24 32 40 48 24
8 12 18 24 30 36 30
8 12 18 24 30 36 18
8 16 24 32 40 48 40
8 16 24 24 30 48 48",
header = TRUE
)
var <- c("radio1","radio2","radio3","radio4","radio5", "radio6", "radio7")
recode <- c("A","B","C","D","E", "F", "G")
variables <- cbind(var, recode)
variables <- as.data.table(variables)
for (i in seq_len(ncol(df))) {
colnames(df[[i]]) <- variables$recode[match(names(df)[i], variables $var)]
}
However, I get the error:
Error in `colnames<-`(`*tmp*`, value = variables$recode[match(names(df)[i], :
attempt to set 'colnames' on an object with less than two dimensions
What am I doing wrong? Is there a better way to do this?
You can use match directly.
names(df) <- variables$recode[match(names(df), variables$var)]
df
# A B C D E F G
#1: 8 12 18 32 40 36 32
#2: 6 12 18 24 30 36 30
#3: 8 16 18 24 30 36 18
#4: 4 12 12 24 30 36 24
#5: 6 16 24 32 40 48 24
#6: 8 12 18 24 30 36 30
#7: 8 12 18 24 30 36 18
#8: 8 16 24 32 40 48 40
#9: 8 16 24 24 30 48 48
By changing colnames(df[[i]]) to colnames(df)[i], the loop works fine:
for (i in seq_len(ncol(df))) {
colnames(df)[i] <- variables$recode[match(names(df)[i], variables$var)]
}
> df
A B C D E F G
1: 8 12 18 32 40 36 32
2: 6 12 18 24 30 36 30
3: 8 16 18 24 30 36 18
4: 4 12 12 24 30 36 24
5: 6 16 24 32 40 48 24
6: 8 12 18 24 30 36 30
7: 8 12 18 24 30 36 18
8: 8 16 24 32 40 48 40
9: 8 16 24 24 30 48 48
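The original error arises because df[[i]] extracts a single column as a plain vector, which has no dimensions and therefore no column names to set. As a hedged sketch, the same renaming can also be done with a named character vector as a lookup table; a small base R data.frame stands in for the data.table here, and the name lookup is an assumption of this example.

```r
# Small data.frame standing in for the data.table in the question
df <- data.frame(radio1 = c(8, 6), radio2 = c(12, 12), radio3 = c(18, 18))

# A single column is a plain vector: no dimensions, hence no colnames
is.null(dim(df[[1]]))  # TRUE

# Named character vector used as a lookup table: old name -> new name
lookup <- c(radio1 = "A", radio2 = "B", radio3 = "C")
names(df) <- unname(lookup[names(df)])

names(df)  # "A" "B" "C"
```

Indexing the lookup by the current names keeps the mapping correct even if the columns are in a different order than the lookup.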

Repeating elements in a vector with a for loop

I want to make a vector from 3:50 in R, looking like
3 4 4 5 6 6 7 8 8 .. 50 50
I want to use a for loop in a for loop but it's not doing what I want.
f <- c()
for (i in 3:50) {
for(j in 1:2) {
f = c(f, i)
}
}
What is wrong with it?
Another option is to use an embedded rep:
rep(3:50, rep(1:2, 24))
which gives:
[1] 3 4 4 5 6 6 7 8 8 9 10 10 11 12 12 13 14 14 15 16 16 17 18 18 19 20 20
[28] 21 22 22 23 24 24 25 26 26 27 28 28 29 30 30 31 32 32 33 34 34 35 36 36 37 38 38
[55] 39 40 40 41 42 42 43 44 44 45 46 46 47 48 48 49 50 50
This utilizes the fact that the times argument of rep can also be an integer vector whose length equals the length of the x argument; each element of x is then repeated by the corresponding count.
You can generalize this to:
s <- 3
e <- 50
v <- 1:2
rep(s:e, rep(v, (e-s+1)/2))
Yet another option, using a mix of rep and rep_len:
v <- 3:50
rep(v, rep_len(1:2, length(v)))
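One practical difference between the two variants is that rep_len does not require the range to contain an even number of values. A minimal sketch with a shorter, odd-length range (the range 3:9 and the names times and out are chosen here purely for illustration):

```r
v <- 3:9  # 7 elements, so (e - s + 1) / 2 would not be a whole number

# Cycle through 1, 2 until there is one count per element of v
times <- rep_len(1:2, length(v))  # 1 2 1 2 1 2 1

out <- rep(v, times)
out  # 3 4 4 5 6 6 7 8 8 9
```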
A solution based on sapply.
as.vector(sapply(0:23 * 2 + 2, function(x) x + c(1, 2, 2)))
# [1] 3 4 4 5 6 6 7 8 8 9 10 10 11 12 12 13 14 14 15 16 16 17 18 18 19 20 20 21 22 22 23 24 24 25 26 26
# [37] 27 28 28 29 30 30 31 32 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42 43 44 44 45 46 46 47 48 48 49 50 50
Benchmarking
Here is a performance comparison of all the current answers. The results show that cumsum(rep(c(1, 1, 0), 24)) + 2L (m8) is the fastest, while rep(3:50, rep(1:2, 24)) (m1) is almost as fast as m8.
library(microbenchmark)
library(ggplot2)
perf <- microbenchmark(
m1 = {rep(3:50, rep(1:2, 24))},
m2 = {rep(3:50, each = 2)[c(TRUE, FALSE, TRUE, TRUE)]},
m3 = {v <- 3:50; sort(c(v,v[v %% 2 == 0]))},
m4 = {as.vector(t(cbind(seq(3,49,2),seq(4,50,2),seq(4,50,2))))},
m5 = {as.vector(sapply(0:23 * 2 + 2, function(x) x + c(1, 2, 2)))},
m6 = {sort(c(3:50, seq(4, 50, 2)))},
m7 = {rep(seq(3, 50, 2), each=3) + c(0, 1, 1)},
m8 = {cumsum(rep(c(1, 1, 0), 24)) + 2L},
times = 10000L
)
perf
# Unit: nanoseconds
# expr min lq mean median uq max neval
# m1 514 1028 1344.980 1029 1542 190200 10000
# m2 1542 2570 3083.716 3084 3085 191229 10000
# m3 26217 30329 35593.596 31871 34442 5843267 10000
# m4 43180 48321 56988.386 50891 55518 6626173 10000
# m5 30843 35984 42077.543 37526 40611 6557289 10000
# m6 40611 44209 50092.131 46779 50891 446714 10000
# m7 13879 16449 19314.547 17478 19020 6309001 10000
# m8 0 1028 1256.715 1028 1542 71454 10000
Use the rep function together with a recycled logical index, ...[c(TRUE, FALSE, TRUE, TRUE)]:
rep(3:50, each = 2)[c(TRUE, FALSE, TRUE, TRUE)]
## [1] 3 4 4 5 6 6 7 8 8 9 10 10 11 12 12 13 14 14 15 16 16 17 18 18 19
## [26] 20 20 21 22 22 23 24 24 25 26 26 27 28 28 29 30 30 31 32 32 33 34 34 35 36
## [51] 36 37 38 38 39 40 40 41 42 42 43 44 44 45 46 46 47 48 48 49 50 50
If you use a logical vector (TRUE/FALSE) as index (inside [ ]), a TRUE leads to selection of the corresponding element and a FALSE leads to omission. If the logical index vector (c(TRUE, FALSE, TRUE, TRUE)) is shorter than the indexed vector (rep(3:50, each = 2) in your case), the index vector is recycled.
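The recycling can be made visible on a shorter vector by expanding the mask by hand and comparing the results. This is only a sketch; the names mask, full_mask and kept are introduced here.

```r
x <- rep(3:6, each = 2)  # 3 3 4 4 5 5 6 6
mask <- c(TRUE, FALSE, TRUE, TRUE)

# What R does implicitly: repeat the mask until it covers all of x
full_mask <- rep_len(mask, length(x))

kept <- x[mask]
kept                           # 3 4 4 5 6 6
identical(kept, x[full_mask])  # TRUE
```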
Also a side note: Whenever you use R code like
x = c(x, something)
or
x = rbind(x, something)
or similar, you are adopting a C-like programming style in R. This makes your code unnecessarily complex and might lead to low performance and out-of-memory issues if you work with large (say, 200MB+) data sets. R is designed to spare you that kind of low-level tinkering with data structures.
For more information about the gluttons and their punishment, read the R Inferno, Circle 2: Growing Objects.
The easiest way I have found is to create another vector containing only the even values (based on OP's intention) and then simply join the two vectors. An example could be:
v <- 3:50
sort(c(v,v[v %% 2 == 0]))
# [1] 3 4 4 5 6 6 7 8 8 9 10 10 11 12 12 13 14 14 15 16 16
#[22] 17 18 18 19 20 20 21 22 22 23 24 24 25 26 26 27 28 28
#[40] 29 30 30 31 32 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42
#[61] 43 44 44 45 46 46 47 48 48 49 50 50
Here is a loop-free 1 line solution:
> as.vector(t(cbind(seq(3,49,2),seq(4,50,2),seq(4,50,2))))
[1] 3 4 4 5 6 6 7 8 8 9 10 10 11 12 12 13 14 14 15 16 16 17
[23] 18 18 19 20 20 21 22 22 23 24 24 25 26 26 27 28 28 29 30 30 31 32
[45] 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42 43 44 44 45 46 46
[67] 47 48 48 49 50 50
It forms a matrix whose first column is the odd numbers in the range 3:50 and whose second and third columns are the even numbers in that range and then (by taking the transpose) reads it off row by row.
The problem with your nested loop approach is that the fundamental pattern is one of length 3, repeated 24 times (instead of a pattern of length 2 repeated 48 times). If you wanted to use a nested loop, the outer loop could iterate 24 times and the inner loop 3 times. The first pass through the outer loop could construct 3,4,4. The second pass could construct 5,6,6. Etc. Since there are 24*3 = 72 elements, you can pre-allocate the vector (by using f <- vector("numeric",72) ) so that you aren't growing it 1 element at a time. The idiom f <- c(f,i) that you are using at each stage copies all of the old elements just to create a new vector which is only 1 element longer. Here there are too few elements for it to really make a difference, but if you try to create large vectors that way the performance can be shockingly bad.
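A minimal sketch of the nested loop described above, with a pre-allocated result vector. The counter pos and the helper variable odd are names introduced for this illustration.

```r
# 24 groups of 3 elements each, allocated once up front
f <- vector("numeric", 72)
pos <- 1

for (k in 1:24) {
  odd <- 2 * k + 1  # 3, 5, ..., 49
  for (j in 1:3) {
    # First slot holds the odd number, the next two the following even number
    f[pos] <- if (j == 1) odd else odd + 1
    pos <- pos + 1
  }
}

head(f, 9)  # 3 4 4 5 6 6 7 8 8
```

Writing into existing positions avoids the repeated copying that f <- c(f, i) causes, although for a vector this small any of the vectorized answers is simpler.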
Here is a method that combines portions of a couple of the other answers.
rep(seq(3, 50, 2), each=3) + c(0, 1, 1)
[1] 3 4 4 5 6 6 7 8 8 9 10 10 11 12 12 13 14 14 15 16
[21] 16 17 18 18 19 20 20 21 22 22 23 24 24 25 26 26 27 28 28 29
[41] 30 30 31 32 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42
[61] 43 44 44 45 46 46 47 48 48 49 50 50
Here is a second method using cumsum
cumsum(rep(c(1, 1, 0), 24)) + 2L
This should be very quick.
This should work too.
sort(c(3:50, seq(4, 50, 2)))
Another idea, though not competing in speed with fastest solutions:
mat <- matrix(3:50,nrow=2)
c(rbind(mat,mat[2,]))
# [1] 3 4 4 5 6 6 7 8 8 9 10 10 11 12 12 13 14 14 15 16 16 17 18 18 19 20 20 21 22 22
# [31] 23 24 24 25 26 26 27 28 28 29 30 30 31 32 32 33 34 34 35 36 36 37 38 38 39 40 40 41 42 42
# [61] 43 44 44 45 46 46 47 48 48 49 50 50

What is the name and reason for the [1] at the output prompt?

What's the name for the [1] below.
What is its significance?
Is it always only [1]? If not, then under what conditions is it something else? (example please)
> bb <- c(5,6,7)
> bb
[1] 5 6 7
It shows the index of the first element printed on that line. In your case, the whole vector fits on one line, so only [1] is shown:
bb <- c(5,6,7)
> bb
# [1] 5 6 7
Try,
c(1:50)
#[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
#[35] 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
You can also avoid that being displayed by using cat
cat(c(1:50))
#1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
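Where the line breaks fall, and therefore which indices appear in brackets, depends on the console width. The following sketch narrows the width temporarily so that even a short vector wraps; the width of 40 and the names old and lines are arbitrary choices for illustration.

```r
x <- 101:130

# Narrow the print width temporarily so the vector wraps
old <- options(width = 40)
lines <- capture.output(print(x))
options(old)

# Each wrapped line starts with the index of its first element
cat(lines, sep = "\n")
```

The same vector printed at a wider setting would show fewer bracketed indices, or just [1] if everything fits on one line.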

split a list and increment for loop by 10

How to split a list in r?
I want to split a list in an incremental manner.
For example:
x <- 1:50
n <- 5
spt <- split(x,cut(x,quantile(x,(0:n)/n), include.lowest=TRUE, labels=FALSE))
we get
$`1`
[1] 1 2 3 4 5 6 7 8 9 10
$`2`
[1] 11 12 13 14 15 16 17 18 19 20
$`3`
[1] 21 22 23 24 25 26 27 28 29 30
$`4`
[1] 31 32 33 34 35 36 37 38 39 40
$`5`
[1] 41 42 43 44 45 46 47 48 49 50
I don't want this output. I want the output like below,
$`1`
[1] 1 2 3 4 5 6 7 8 9 10
$`2`
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
$`3`
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
$`4`
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
$`5`
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
any idea?
I also want to know how to increment a for loop by 10 in R.
Thanks.
We can use seq
lapply(seq(10,50, by=10), function(i) x[1:i])
Or, as @RichardScriven mentioned in the comments, the seq(10,50, by=10) can be replaced by 1:5 * 10L
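As for the second part of the question, a for loop in R iterates over whatever vector it is given, so stepping by 10 amounts to looping over seq(10, 50, by = 10). A small sketch that builds the same list with an explicit loop (the names steps and res are assumptions of this example):

```r
x <- 1:50
steps <- seq(10, 50, by = 10)  # 10 20 30 40 50

# Pre-allocate one list slot per step
res <- vector("list", length(steps))
for (k in seq_along(steps)) {
  i <- steps[k]
  res[[k]] <- x[1:i]
}

lengths(res)  # 10 20 30 40 50
```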
