Shuffling and combining two vectors [duplicate] - r

I would like to merge 2 vectors this way :
a = c(1,2,3)
b = c(11,12,13)
merged vector : c(1,11,2,12,3,13)
How could I do it ?

This will work using rbind :
c(rbind(a, b))
For example:
a = c(1,2,3)
b = c(11,12,13)
c(rbind(a,b))
#[1] 1 11 2 12 3 13
Explanation
This works because R stores arrays in column-major order.
When you rbind() the two vectors, you get:
rbind_result <- rbind(a, b)
rbind_result
# [,1] [,2] [,3]
# a 1 2 3
# b 11 12 13
Then c() coerces rbind_result into a column-wise flattened vector:
merged <- c(rbind_result)
merged
# [1] 1 11 2 12 3 13
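The same column-major flattening extends to more than two vectors of equal length. A minimal sketch, assuming a third vector d that is not part of the original question:

```r
a <- c(1, 2, 3)
b <- c(11, 12, 13)
d <- c(101, 102, 103)  # hypothetical third vector, for illustration only

# rbind() stacks the vectors as rows; c() then reads the matrix column by column
merged3 <- c(rbind(a, b, d))
merged3
# [1]   1  11 101   2  12 102   3  13 103
```

Note that rbind() recycles shorter vectors, so this approach only suits inputs of equal length.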

The rbind() answer by @jalapic is excellent. Here's an alternative that creates a new vector and then assigns the alternating values to it.
a <- c(1,2,3)
b <- c(11,12,13)
x <- vector(class(a), length(c(a, b)))
x[c(TRUE, FALSE)] <- a
x[c(FALSE, TRUE)] <- b
x
# [1] 1 11 2 12 3 13
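The assignment above relies on R recycling a short logical index along the whole vector. A small sketch of that recycling on its own (the names idx, odd_positions and even_positions are just for illustration):

```r
idx <- seq_len(6)

# c(TRUE, FALSE) is recycled to TRUE FALSE TRUE FALSE TRUE FALSE
odd_positions <- idx[c(TRUE, FALSE)]
odd_positions
# [1] 1 3 5

# c(FALSE, TRUE) selects the remaining positions
even_positions <- idx[c(FALSE, TRUE)]
even_positions
# [1] 2 4 6
```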
And one more that shows append
c(sapply(seq_along(a), function(i) append(a[i], b[i], i)))
# [1] 1 11 2 12 3 13

Just wanted to add a simpler solution that works when the vectors have unequal lengths and you want to append the extra data to the end.
> a <- 1:3
> b <- 11:17
> c(a, b)[order(c(seq_along(a)*2 - 1, seq_along(b)*2))]
[1] 1 11 2 12 3 13 14 15 16 17
Explanation:
c(a, b) creates a vector of the values in a and b.
seq_along(a)*2 - 1 creates a vector of the first length(a) odd numbers.
seq_along(b)*2 creates a vector of the first length(b) even numbers.
order(...) returns the indexes that sort the combined seq_along vectors, i.e. x[order(x)] is sorted. Since the first seq_along vector contains the odd numbers and the second contains the evens, order takes the first element from the first vector, then the first element of the second vector, then the second element of the first vector, and so on, interspersing the two sets of indexes and leaving the extra data at the tail.
By indexing c(a, b) using the order vector, we will intersperse a and b.
As a note, since seq_along returns integer(0) when the input is NULL, this solution works even if one of the vectors has length 0.
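To make the steps above concrete, here is the same one-liner split into its intermediate values (the names key, ord and merged are introduced only for this illustration):

```r
a <- 1:3
b <- 11:17

# odd keys for a, even keys for b
key <- c(seq_along(a) * 2 - 1, seq_along(b) * 2)
key
# [1]  1  3  5  2  4  6  8 10 12 14

# positions in c(a, b) that sort the keys
ord <- order(key)
ord
# [1]  1  4  2  5  3  6  7  8  9 10

merged <- c(a, b)[ord]
merged
# [1]  1 11  2 12  3 13 14 15 16 17
```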

I had to solve a similar problem, but my vectors were of unequal length, and I didn't want to recycle the shorter vector, just append the tail of the longer one.
The solution from @RichardScriven didn't work for me (though I may have done something wrong and didn't try hard to troubleshoot).
Here is my solution:
#' Riffle-merges two vectors, possibly of different lengths
#'
#' Takes two vectors and interleaves the elements. If one vector is longer than
#' the other, it appends the tail of the longer vector to the output vector.
#' @param a First vector
#' @param b Second vector
#' @return Interleaved vector as described above.
#' @author Matt Pettis
riffle <- function(a, b) {
  len_a <- length(a)
  len_b <- length(b)
  len_comm <- pmin(len_a, len_b)
  len_tail <- abs(len_a - len_b)
  if (len_a < 1) stop("First vector has length less than 1")
  if (len_b < 1) stop("Second vector has length less than 1")
  # interleave the part both vectors have in common
  riffle_common <- c(rbind(a[1:len_comm], b[1:len_comm]))
  if (len_tail == 0) return(riffle_common)
  # append whatever is left of the longer vector
  if (len_a > len_b) {
    return(c(riffle_common, a[(len_comm + 1):len_a]))
  } else {
    return(c(riffle_common, b[(len_comm + 1):len_b]))
  }
}
# Try it out
riffle(1:7, 11:13)
[1] 1 11 2 12 3 13 4 5 6 7
riffle(1:3, 11:17)
[1] 1 11 2 12 3 13 14 15 16 17
HTH,
Matt

@MBo's answer to my question at https://stackoverflow.com/a/58773002/2556061 implies a solution for evenly interlacing vectors of unequal length. I'm reporting it here for reference.
interleave <- function(x, y)
{
  m <- length(x)
  n <- length(y)
  xi <- yi <- 1
  len <- m + n
  err <- len %/% 2
  res <- vector()
  for (i in 1:len)
  {
    err <- err - m
    if (err < 0)
    {
      res[i] <- x[xi]
      xi <- xi + 1
      err <- err + len
    } else
    {
      res[i] <- y[yi]
      yi <- yi + 1
    }
  }
  res
}
gives
interleave(1:10, 100:120)
c(100, 1, 101, 102, 2, 103, 104, 3, 105, 106, 4, 107, 108, 5, 109, 110, 111, 6, 112, 113, 7, 114, 115, 8, 116, 117, 9, 118, 119, 10, 120)

A tidyverse approach is vctrs::vec_interleave:
vctrs::vec_interleave(a, b)
#[1] 1 11 2 12 3 13

Related

R: Logical Conditions Not Being Respected

I am working with the R programming language. I am trying to build a loop that performs the following :
Step 1: Keep generating two random numbers "a" and "b" until both "a" and "b" are greater than 12
Step 2: Track how many random numbers had to be generated until it took for Step 1 to be completed
Step 3: Repeat Step 1 and Step 2 100 times
Since I do not know how to keep generating random numbers until a condition is met, I tried to generate a large amount of random numbers hoping that the condition is met (there is probably a better way to write this):
results <- list()
for (i in 1:100){
# do until break
repeat {
# repeat many random numbers
a = rnorm(10000,10,1)
b = rnorm(10000,10,1)
# does any pair meet the requirement
if (any(a > 12 & b > 12)) {
# put it in a data.frame
d_i = data.frame(a,b)
# end repeat
break
}
}
# select all rows until the first time the requirement is met
# it must be met, otherwise the loop would not have ended
d_i <- d_i[1:which(d_i$a > 10 & d_i$b > 10)[1], ]
# prep other variables and only keep last row (i.e. the row where the condition was met)
d_i$index = seq_len(nrow(d_i))
d_i$iteration = as.factor(i)
e_i = d_i[nrow(d_i),]
results[[i]] <- e_i
}
results_df <- do.call(rbind.data.frame, results)
Problem: When I look at the results, I noticed that the loop is incorrectly considering the condition to be met, for example:
head(results_df)
a b index iteration
4 10.29053 10.56263 4 1
5 10.95308 10.32236 5 2
3 10.74808 10.50135 3 3
13 11.87705 10.75067 13 4
1 10.17850 10.58678 1 5
14 10.14741 11.07238 1 6
For instance, in each one of these rows - both "a" and "b" are smaller than 12.
Does anyone know why this is happening and can someone please show me how to fix this problem?
Thanks!
How about this way? Since you tagged the question while-loop, I used one.
res <- matrix(0, nrow = 0, ncol = 3)
for (j in 1:100){
a <- rnorm(1, 10, 1)
b <- rnorm(1, 10, 1)
i <- 1
while(a < 12 | b < 12) {
a <- rnorm(1, 10, 1)
b <- rnorm(1, 10, 1)
i <- i + 1
}
x <- c(a,b,i)
res <- rbind(res, x)
}
head(res)
[,1] [,2] [,3]
x 12.14232 12.08977 399
x 12.27158 12.01319 1695
x 12.57345 12.42135 302
x 12.07494 12.64841 600
x 12.03210 12.07949 82
x 12.34006 12.00365 782
dim(res)
[1] 100 3
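One likely reason the question's rows show values below 12: the repeat loop tests a > 12 & b > 12, but the later row selection uses a > 10 & b > 10, so it stops at the first pair above 10. Below is a minimal sketch that keeps the chunked approach and applies one threshold in both places. The helper name first_hit and its arguments are assumptions for illustration, not part of the original code.

```r
set.seed(1)

# Hypothetical helper: draw pairs in chunks until one pair exceeds the threshold,
# and report that pair together with the number of pairs drawn so far.
first_hit <- function(threshold = 12, chunk = 10000) {
  offset <- 0
  repeat {
    a <- rnorm(chunk, 10, 1)
    b <- rnorm(chunk, 10, 1)
    hit <- which(a > threshold & b > threshold)  # same condition used for selection
    if (length(hit) > 0) {
      k <- hit[1]
      return(data.frame(a = a[k], b = b[k], index = offset + k))
    }
    offset <- offset + chunk
  }
}

results_df <- do.call(rbind, lapply(1:100, function(i) {
  cbind(first_hit(), iteration = i)
}))
head(results_df)
```

Every row of results_df should now have both a and b above 12, and index counts how many pairs were generated before that happened.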

How to add possible divisor numbers?

How do I retrieve the maximum sum over chains of divisors of a number?
I have the function below, which gives the divisors of a number.
Code
divisors <- function(x) {
y <- seq_len(ceiling(x / 2))
y[x %% y == 0]
}
Example
Divisors of 99 will give the below possible values.
divisors(99)
[1] 1 3 9 11 33
My expected Logic :
Go from the last value to the first value in the divisors output.
The last number is 33. The next largest number that divides 33 is 11, so I select 11. Moving on from 11, the next largest number that divides 11 is 1, so I select 1. Now add all the numbers.
33 + 11 + 1 = 45
Move to the next number, 11. The next largest number that divides 11 is 1, so I select 1. Now add all the numbers.
11 + 1 = 12
Move to the next number, 9. The next largest number that divides 9 is 3, so I select 3. The next largest number that divides 3 is 1, so I select 1. Now add all the numbers.
9 + 3 + 1 = 13
Move to the next number, 3. The next largest number that divides 3 is 1, so I select 1. Now add all the numbers.
3+1=4
Now maximum among these is 45.
Now I am struggling to write this logic in R . Help / Advice much appreciated.
Note : Prime numbers can be ignored.
update
For large integers, e.g., the maximum integer .Machine$integer.max (prime number), you can run the code below (note that I modified functions divisors and f a bit)
divisors <- function(x) {
y <- seq(x / 2)
y[as.integer(x) %% y == 0]
}
f <- function(y) {
if (length(y) <= 2) {
return(as.integer(sum(y)))
}
l <- length(y)
h <- y[l]
yy <- y[-l]
h + f(yy[h %% yy == 0])
}
and you will see
> n <- .Machine$integer.max - 1
> x <- divisors(n)
> max(sapply(length(x):2, function(k) f(head(x, k))))
[1] 1569603656
You can define a recursive function f that gives successive divisors
f <- function(y) {
if (length(y) == 1) {
return(y)
}
h <- y[length(y)]
yy <- y[-length(y)]
c(f(yy[h %% yy == 0]), h)
}
and you will see all possible successive divisor tuples
> sapply(rev(seq_along(x)), function(k) f(head(x, k)))
[[1]]
[1] 1 11 33
[[2]]
[1] 1 11
[[3]]
[1] 1 3 9
[[4]]
[1] 1 3
[[5]]
[1] 1
Then, we apply f within sapply like below
> max(sapply(rev(seq_along(x)), function(k) sum(f(head(x, k)))))
[1] 45
which gives the desired output.
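The chain walk described in the question can also be sketched without recursion. This is only an illustration under the assumption that "next immediate number" means the largest smaller divisor of the current value; the helper name chain_sum is hypothetical.

```r
divisors <- function(x) {
  y <- seq_len(ceiling(x / 2))
  y[x %% y == 0]
}

# Hypothetical helper: starting at `start`, repeatedly step to the largest
# smaller value in `divs` that divides the current value, summing along the way.
chain_sum <- function(start, divs) {
  total <- start
  cur <- start
  repeat {
    smaller <- divs[divs < cur & cur %% divs == 0]
    if (length(smaller) == 0) break
    cur <- max(smaller)
    total <- total + cur
  }
  total
}

x <- divisors(99)
sums <- sapply(x, chain_sum, divs = x)
sums
# [1]  1  4 13 12 45
max(sums)
# [1] 45
```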
You can also use the following solution. It may sound a little bit complicated and of course there is always an easier, more efficient solution. However, I thought this could be useful to you. I will take it from your divisors output:
> x
[1] 1 3 9 11 33
# First I created a list whose first element is our original x and from then on
# I subset the first element till the last element of the list
lst <- lapply(0:(length(x)-1), function(a) x[1:(length(x)-a)])
> lst
[[1]]
[1] 1 3 9 11 33
[[2]]
[1] 1 3 9 11
[[3]]
[1] 1 3 9
[[4]]
[1] 1 3
[[5]]
[1] 1
Then I wrote a custom function in order to implement your conditions and gather your desired output. For this purpose I created a function factory which in fact is a function that creates a function:
As you might have noticed the outermost function does not take any argument. It only sets up an empty vector out to save our desired elements in. It is created in the execution environment of the outermost function to shield it from any changes that might affect it in the global environment
The inner function is the one that takes our vector x, so in general we call the whole setup like fnf()(x). The first element of our out vector is the first element of the sorted x (33). Then I found all elements of x that divide it, i.e. those that leave a remainder of 0. After I found them I took the second one (11), since the first one was 33 itself, and stored it in our out vector. Then I modified the original x vector to omit the max value (33) and repeated the same process.
Since we were going to repeat the process over again, I thought this might be a good case for recursion. Recursion is a programming technique in which a function calls itself from within its own body. As you might have noticed, I used fn inside the function to repeat the process, each time with fewer values.
This may sound a bit complicated, but I believe there are some good points here for you to pick up for future exploration. I found them very useful and hope that's the case for you too.
fnf <- function() {
out <- c()
fn <- function(x) {
out <<- c(out, x[1])
z <- x[out[length(out)]%%x == 0]
if(length(z) >= 2) {
out[length(out) + 1] <<- z[2]
} else {
return(out)
}
x <- x[!duplicated(x)][which(x[!duplicated(x)] == z[2]):length(x[!duplicated(x)])]
fn(x)
out[!duplicated(out)]
}
}
# The result of applying the custom function on `lst` would result in your
# divisor values
lapply(lst, function(x) fnf()(sort(x, decreasing = TRUE)))
[[1]]
[1] 33 11 1
[[2]]
[1] 11 1
[[3]]
[1] 9 3 1
[[4]]
[1] 3 1
[[5]]
[1] 1
In the end we sum each element and extract the max value
Reduce(max, lapply(lst, function(x) sum(fnf()(sort(x, decreasing = TRUE)))))
[1] 45
To test a very large integer, I used @ThomasIsCoding's modified divisors function:
divisors <- function(x) {
y <- seq(x / 2)
y[as.integer(x) %% y == 0]
}
x <- divisors(.Machine$integer.max - 1)
lst <- lapply(0:(length(x)-1), function(a) x[1:(length(x)-a)])
Reduce(max, lapply(lst, function(x) sum(fnf()(sort(x, decreasing = TRUE)))))
[1] 1569603656
You'll need to recurse. If I understand correctly, this should do what you want:
fact <- function(x) {
x <- as.integer(x)
div <- seq_len(abs(x)/2)
factors <- div[x %% div == 0L]
return(factors)
}
maxfact <- function(x) {
factors <- fact(x)
if (length(factors) < 3L) {
return(sum(factors))
} else {
return(max(factors + mapply(maxfact, factors)))
}
}
maxfact(99)
[1] 45

Swapping elements between more than 2 arrays

Swapping elements within a single array (x) is a classic problem in computer science. The immediate (but by no means only, e.g., XOR) solution in a low-level language like C is to use a temporary variable:
tmp = x[0]
x[0] = x[1]
x[1] = tmp
The above algorithm swaps the first and second elements of x.
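For comparison, a small sketch of the same temporary-variable swap in R, where indexing starts at 1. The vectors here are made up for illustration; the second form uses vectorised index assignment, which needs no explicit temporary.

```r
x <- c(10, 20, 30)

# classic swap with a temporary variable
tmp  <- x[1]
x[1] <- x[2]
x[2] <- tmp
x
# [1] 20 10 30

# the same swap in one step: the right-hand side is evaluated before assignment
y <- c(10, 20, 30)
y[c(1, 2)] <- y[c(2, 1)]
y
# [1] 20 10 30
```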
Swapping elements between two arrays, x and y, is similar:
tmp = x[0]
x[0] = y[1]
y[1] = tmp
What about for the case of 3 arrays with the added restriction that an element of Array 1 must be swapped with an element of Array 2 and an element of Array 2 must be swapped with an element of Array 3? Elements in Arrays 1 and 3 are not swapped with one another.
How can such an approach (with the added restriction) be generalized to k arrays?
You could create a for-loop that repeats your set of instructions:
l=list(x = c(1,2,3,4,5),y = c(5,4,3,2,1),z = c(6,7,8,9,10))
swap_elements <- function(l)
{
for(i in 1:(length(l)-1))
{
tmp = l[[i]][1]
l[[i]][1] = l[[i+1]][2]
l[[i+1]][2] = tmp
}
return(l)
}
Output:
> swap_elements(l)
$x
[1] 4 2 3 4 5
$y
[1] 7 1 3 2 1
$z
[1] 6 5 8 9 10
If the arrays are stacked into a matrix, you can lag the rows to create the required action:
k <- 6
#generate dummy data with k rows and 3 columns
mat <- matrix(seq_len(3*k), nrow=k, byrow=TRUE)
mat
#lag the matrix
mat[c(seq_len(k)[-1], 1),]
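To see what the row index does, here is a smaller sketch with 3 rows (the size is chosen only to keep the output short): each row is replaced by the one below it, and the first row wraps around to the end.

```r
small <- matrix(seq_len(9), nrow = 3, byrow = TRUE)
small
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
# [3,]    7    8    9

# row order 2, 3, 1: every row moves up by one, the first row goes last
rotated <- small[c(seq_len(3)[-1], 1), ]
rotated
#      [,1] [,2] [,3]
# [1,]    4    5    6
# [2,]    7    8    9
# [3,]    1    2    3
```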

Find first greater element with higher index

I have two vectors, A and B. For every element in A I want to find the index of the first element in B that is greater and has higher index. The length of A and B are the same.
So for vectors:
A <- c(10, 5, 3, 4, 7)
B <- c(4, 8, 11, 1, 5)
I want a result vector:
R <- c(3, 3, 5, 5, NA)
Of course I can do it with two loops, but it's very slow, and I don't know how to use apply() in this situation, when the indices matter. My data set has vectors of length 20000, so the speed is really important in this case.
A few bonus questions:
1) What if I have a sequence of numbers (like seq = 2:10), and I want to find the first number in B that is higher than a+s for every a of A and every s of seq?
2) Like question 1), but I want to know the first greater and the first lower value, and create a matrix that stores which one came first. So for example I have a of A, and 10 from seq. I want to find the first value of B that is higher than a+10 or lower than a-10, and then store its index and value.
sapply(sapply(seq_along(A), function(x) which(B[-seq(x)] > A[x]) + x), "[", 1)
[1] 3 3 5 5 NA
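The one-liner can be hard to read, so here is a more explicit sketch of the same idea using vapply. The name first_greater is introduced only for this illustration; the result should match the expected vector from the question.

```r
A <- c(10, 5, 3, 4, 7)
B <- c(4, 8, 11, 1, 5)

first_greater <- vapply(seq_along(A), function(j) {
  # positions in B that come after j and hold a value greater than A[j]
  hits <- which(seq_along(B) > j & B > A[j])
  if (length(hits) > 0) hits[1] else NA_integer_
}, integer(1))

first_greater
# [1]  3  3  5  5 NA
```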
This is a great example of when sapply is less efficient than loops.
Although the sapply does make the code look neater, you are paying for that neatness with time.
Instead you can wrap a while loop inside a for loop inside a nice, neat function.
Here are benchmarks comparing a nested-apply loop against nested for-while loop (and a mixed apply-while loop, for good measure). Update: added the vapply..match.. mentioned in comments. Faster than sapply, but still much slower than while loop.
BENCHMARK:
test elapsed relative
1 for.while 0.069 1.000
2 sapply.while 0.080 1.159
3 vapply.match 0.101 1.464
4 nested.sapply 0.104 1.507
Notice you save about a third of the time compared with the nested sapply; the savings will likely be larger when you start adding the sequences to A.
For the second part of your question:
If you have this all wrapped up in a nice function, it is easy to add a seq to A
# Sample data
A <- c(10, 5, 3, 4, 7, 100, 2)
B <- c(4, 8, 11, 1, 5, 18, 20)
# Sample sequence
S <- seq(1, 12, 3)
# matrix with all index values (with names cleaned up)
indexesOfB <- t(sapply(S, function(s) findIndx.gt(A+s, B)))
dimnames(indexesOfB) <- list(S, A)
Lastly, if you want to instead find values of B less than A, just swap the operation in the function.
(You could include an if-clause in the function and use only a single function. I find it more efficient to have two separate functions.)
findIndx.gt(A, B) # [1] 3 3 5 5 6 NA 8 NA NA
findIndx.lt(A, B) # [1] 2 4 4 NA 8 7 NA NA NA
Then you can wrap it up in one nice package
rangeFindIndx(A, B, S)
# A S indxB.gt indxB.lt
# 10 1 3 2
# 5 1 3 4
# 3 1 5 4
# 4 1 5 NA
# 7 1 6 NA
# 100 1 NA NA
# 2 1 NA NA
# 10 4 6 4
# 5 4 3 4
# ...
FUNCTIONS
(Notice they depend on reshape2)
rangeFindIndx <- function(A, B, S) {
  # For each s in S, and for each a in A,
  # find the first value of B, which is higher than a+s, or lower than a-s
  require(reshape2)
  # Create gt & lt matrices; add dimnames for melting function
  indexesOfB.gt <- sapply(S, function(s) findIndx.gt(A+s, B))
  indexesOfB.lt <- sapply(S, function(s) findIndx.lt(A-s, B))
  # name both matrices (the original assigned to indexesOfB.gt twice)
  dimnames(indexesOfB.gt) <- dimnames(indexesOfB.lt) <- list(A, S)
  # melt the matrices and combine into one
  gtltMatrix <- cbind(melt(indexesOfB.gt), melt(indexesOfB.lt)$value)
  # clean up their names
  names(gtltMatrix) <- c("A", "S", "indxB.gt", "indxB.lt")
  return(gtltMatrix)
}
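findIndx.gt <- function(A, B) {
  # assumes length(A) == length(B) >= 1
  lng <- length(A)
  ret <- integer(0)
  for (j in seq_len(lng - 1)) {
    i <- j + 1
    # advance while B[i] is not strictly greater than A[j]
    while (i <= lng && B[[i]] <= A[[j]]) {
      i <- i + 1
    }
    # i == lng is still a valid hit; only i > lng means nothing was found
    ret <- c(ret, if (i <= lng) i else NA_integer_)
  }
  # the last element has no later index to compare against
  c(ret, NA_integer_)
}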
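findIndx.lt <- function(A, B) {
  # assumes length(A) == length(B) >= 1
  lng <- length(A)
  ret <- integer(0)
  for (j in seq_len(lng - 1)) {
    i <- j + 1
    # advance while B[i] is not strictly less than A[j]
    # (the comparison is the only difference from findIndx.gt)
    while (i <= lng && B[[i]] >= A[[j]]) {
      i <- i + 1
    }
    # i == lng is still a valid hit; only i > lng means nothing was found
    ret <- c(ret, if (i <= lng) i else NA_integer_)
  }
  # the last element has no later index to compare against
  c(ret, NA_integer_)
}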
