I'm trying to recycle a vector, but I don't want R's default recycling behavior.
Imagine I have two vectors with unequal numbers of elements:
gen1 = 2:10
gen2 = 1:10
rbind(gen1,gen2)
This gives this table
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
gen1 2 3 4 5 6 7 8 9 10 2
gen2 1 2 3 4 5 6 7 8 9 10
As you can see in the last column, the 2 gets paired with 10. But I want this:
gen1 = c(2,2:10)
gen2 = 1:10
rbind(gen1,gen2)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
gen1 2 2 3 4 5 6 7 8 9 10
gen2 1 2 3 4 5 6 7 8 9 10
Now the 2 is duplicated, but at the front. Obviously I do not want to do this by hand, since I have a whole collection of these unequal-length vectors that I want to apply this trick to. Is there a way to do this?
Or is there perhaps a way to pair each element with the 'closest' possible position in the other vector?
For example, if I have
[,1] [,2] [,3] [,4] [,5] [,6]
gen1 8 9 10 8 9 10
gen2 5 6 7 8 9 10
I would like this to be:
[,1] [,2] [,3] [,4] [,5] [,6]
gen1 8 8 9 9 10 10
gen2 5 6 7 8 9 10
First example in question
1) Convert each to a ts series with appropriate alignment and then use na.locf.
library(zoo)
# inputs
gen1 <- 2:10; gen2 = 1:10
t(na.locf(cbind(gen1 = ts(gen1, start = 2), gen2 = ts(gen2)), fromLast = TRUE))
giving:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
gen1 2 2 3 4 5 6 7 8 9 10
gen2 1 2 3 4 5 6 7 8 9 10
2) It can also be written with pipes like this
cbind(gen1 = ts(gen1, start = 2), gen2 = ts(gen2)) |>
na.locf(fromLast = TRUE) |>
t()
3) If you want to derive the alignment from the data itself, use this:
maxlen <- max(length(gen1), length(gen2))
cbind(gen1 = ts(gen1, end = maxlen), gen2 = ts(gen2, end = maxlen)) |>
na.locf(fromLast = TRUE) |>
t()
4) Another approach is to use dynamic time warping.
library(dtw)
with(dtw(gen1, gen2), rbind(gen1 = gen1[index1], gen2 = gen2[index2]))
giving:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
gen1 2 2 3 4 5 6 7 8 9 10
gen2 1 2 3 4 5 6 7 8 9 10
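For the first example, the same front-filled result can also be sketched in base R, without zoo or dtw. This is a different and much simpler technique than the ones above: it left-pads the shorter vector with its own first value. The helper name pad_front is hypothetical, and the sketch assumes the vectors are meant to be aligned at their ends, as in approach 3.

```r
# Hypothetical helper: left-pad x to length n by repeating its first value.
pad_front <- function(x, n) {
  stopifnot(length(x) >= 1, length(x) <= n)
  c(rep(x[1], n - length(x)), x)
}

gen1 <- 2:10
gen2 <- 1:10
maxlen <- max(length(gen1), length(gen2))

# Pad both vectors to the common length, then stack them as rows.
res <- rbind(gen1 = pad_front(gen1, maxlen),
             gen2 = pad_front(gen2, maxlen))
res
```

Because the padding length is computed per vector, the same call can be mapped over a whole list of vectors with lapply.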
Last example in question
The last example in the question seems entirely different and is just a matter of sorting each row.
# input in reproducible form
m <- rbind(gen1 = c(8, 9, 10, 8, 9, 10), gen2 = c(5, 6, 7, 8, 9, 10))
t(apply(m, 1, sort))
giving
[,1] [,2] [,3] [,4] [,5] [,6]
gen1 8 8 9 9 10 10
gen2 5 6 7 8 9 10
I think this is the trick:
gen1 = 2:10   # rbind() recycles this to length 10 (with a warning)
gen2 = 1:10
ddd=rbind(gen1,gen2)
ddd[1,] = sort(ddd[1,])
ddd
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
gen1 2 2 3 4 5 6 7 8 9 10
gen2 1 2 3 4 5 6 7 8 9 10
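The sorting idea can also be sketched without relying on rbind's implicit recycling (and the warning it emits when lengths are not multiples of each other). The helper name recycle_sorted is hypothetical, and the sketch assumes the values are meant to appear in increasing order, as in both examples from the question.

```r
# Hypothetical helper: recycle x to length n explicitly, then sort so that
# repeated values end up next to each other.
recycle_sorted <- function(x, n) sort(rep_len(x, n))

gen1 <- 8:10
gen2 <- 5:10

res <- rbind(gen1 = recycle_sorted(gen1, length(gen2)), gen2 = gen2)
res
```

With gen1 <- 2:10 and gen2 <- 1:10 the same helper puts the duplicated 2 at the front.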
Related
I have this vector
b=c(5,8,9)
I want to form combinations of b, selecting 2 items at a time, such that the original elements of b make up the first row, to get
[,1] [,2] [,3]
[1,] 5 8 9
[2,] 8 9 5
I tried combn(b, 2) and it gives me this
[,1] [,2] [,3]
[1,] 5 5 8
[2,] 8 9 9
Can I get help to achieve my desired result?
Since the second row of your desired result is not uniquely defined, there is no need for any sophisticated tools:
b <- 1:10
rbind(b, c(b[-1], b[1]))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# b 1 2 3 4 5 6 7 8 9 10
# 2 3 4 5 6 7 8 9 10 1
In this case I only "shift" b by one position in the second row, which indeed results in a permutation. I'm assuming that the elements of b don't repeat.
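The shift can be wrapped in a small function and applied to the vector from the question. The name rotate and its k argument are hypothetical, introduced only for this sketch; like the answer above, it assumes the elements of b do not repeat.

```r
# Hypothetical helper: rotate v to the left by k positions.
rotate <- function(v, k = 1) {
  n <- length(v)
  if (n == 0) return(v)
  k <- k %% n                 # rotating by a multiple of n is a no-op
  if (k == 0) return(v)
  c(v[-seq_len(k)], v[seq_len(k)])
}

b <- c(5, 8, 9)
res <- rbind(b, rotate(b))
res
```

Each column then pairs an element of b with its successor, wrapping around at the end.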
I'm doing cross validation and I want to separate the data into 3 folds.
I create a matrix with mat=matrix(sample.int(10, 6*10, TRUE), 6, 10), which looks like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 10 9 3 3 3 4 4 3 9
[2,] 9 3 5 1 3 9 5 5 4 8
[3,] 7 6 6 3 8 2 3 10 7 4
[4,] 7 4 10 8 7 5 2 6 2 8
[5,] 9 7 7 5 3 9 5 8 7 8
[6,] 3 3 1 2 9 3 6 7 6 9
I want to get then 3 matrices with the data:
fold 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 10 9 3 3 3 4 4 3 9
[2,] 9 3 5 1 3 9 5 5 4 8
fold 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 7 6 6 3 8 2 3 10 7 4
[2,] 7 4 10 8 7 5 2 6 2 8
fold 3
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 9 7 7 5 3 9 5 8 7 8
[2,] 3 3 1 2 9 3 6 7 6 9
Here is the code I wrote:
require(stats)
mat=matrix(sample.int(10, 6*10, TRUE), 6, 10)
folds=cut(seq(1, nrow(mat)), breaks = 3, labels = FALSE)
#Perform 3 fold cross validation
for(i in 1:3){
#segment your data by folds using the which() function
testIndexes=which(folds==i, arr.ind = TRUE)
testData=mat[testIndexes,]
trainData=mat[-testIndexes,]
}
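The training data for each iteration comes back as a single matrix with the two remaining folds stacked together. I want to get those two folds as separate matrices.
For example, this is the training set generated when fold 3 is the test set; it should be split into fold 1 and fold 2.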
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 10 9 3 3 3 4 4 3 9
[2,] 9 3 5 1 3 9 5 5 4 8
[3,] 7 6 6 3 8 2 3 10 7 4
[4,] 7 4 10 8 7 5 2 6 2 8
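One possible way to keep the folds separate is to build a list with one matrix per fold and then drop the test fold from that list inside the loop. This is only a sketch: the names fold_list, test_data and train_folds are hypothetical, and the seed is an assumption added for reproducibility.

```r
set.seed(1)  # assumed seed, only so the example is reproducible
mat <- matrix(sample.int(10, 6 * 10, replace = TRUE), 6, 10)
folds <- cut(seq_len(nrow(mat)), breaks = 3, labels = FALSE)

# One matrix per fold; drop = FALSE keeps single-row folds as matrices.
fold_list <- lapply(split(seq_len(nrow(mat)), folds),
                    function(idx) mat[idx, , drop = FALSE])

for (i in seq_along(fold_list)) {
  test_data   <- fold_list[[i]]  # the held-out fold
  train_folds <- fold_list[-i]   # list of the remaining folds, kept separate
  # fit and evaluate the model here
}
```

If a single training matrix is needed later, do.call(rbind, train_folds) stacks the remaining folds again.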
Each iteration of my sapply call outputs an n*m matrix, where n is fixed and m is not.
For example, if I run this in R:
sapply(1:3, function(x) {matrix(1:9, 3)})
and it will output:
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
[4,] 4 4 4
[5,] 5 5 5
[6,] 6 6 6
[7,] 7 7 7
[8,] 8 8 8
[9,] 9 9 9
However, what I want is something like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
Any idea for this? Thanks
One solution is:
do.call(cbind, lapply(1:3, function(x) {matrix(1:9, 3)}))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
We can use replicate
`dim<-`(replicate(3, matrix(1:9, 3)), c(3, 3*3))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#[1,] 1 4 7 1 4 7 1 4 7
#[2,] 2 5 8 2 5 8 2 5 8
#[3,] 3 6 9 3 6 9 3 6 9
Here is the code I am working with:
A <- matrix(1:9, nrow = 3)
A
cbind(A,A,A)
This gives an output:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
The desired output is...
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 1 1 4 4 4 7 7 7
[2,] 2 2 2 5 5 5 8 8 8
[3,] 3 3 3 6 6 6 9 9 9
In addition I tried this...
test <- (sapply(A , function(maybe) rep(maybe,each=3)))
test
Which gives an output of:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 2 3 4 5 6 7 8 9
[2,] 1 2 3 4 5 6 7 8 9
[3,] 1 2 3 4 5 6 7 8 9
The help is much appreciated.
Use rep with column indexing: A[,rep(1:3, each=3)]
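The same indexing can be written so that it does not hard-code the number of columns. In this sketch the variable times is a hypothetical name for the number of repeats.

```r
A <- matrix(1:9, nrow = 3)
times <- 3  # how often each column is repeated

# Repeat each column index `times` times, then index the columns with it.
res <- A[, rep(seq_len(ncol(A)), each = times), drop = FALSE]
res
```

Using each = repeats every column in place, whereas times = would reproduce the cbind(A, A, A) layout.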
I want to extract every nth element of row for each row in a matrix, here is my code:
x <- matrix(1:16,nrow=2)
x
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 3 5 7 9 11 13 15
[2,] 2 4 6 8 10 12 14 16
I have tried:
sapply(x, function(l) x[seq(1,8,2)])
which clearly fails.
I want to pull every 2nd value from each row of "x". The desired output would be something like:
[,1] [,2] [,3] [,4]
[1,] 3 7 11 15
[2,] 4 8 12 16
You are overcomplicating it. This gives you what you need:
x[,seq(2, 8, 2)]
or, more generally
x[,seq(2, ncol(x), 2)]
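For an arbitrary step, the indexing can be wrapped in a small function. The name every_nth_col is hypothetical; the guard for n larger than the number of columns is an addition of this sketch, since seq() would otherwise raise an error in that case.

```r
# Hypothetical helper: keep columns n, 2n, 3n, ... of matrix m.
every_nth_col <- function(m, n) {
  if (n > ncol(m)) return(m[, 0, drop = FALSE])  # nothing to select
  m[, seq(n, ncol(m), by = n), drop = FALSE]
}

x <- matrix(1:16, nrow = 2)
res <- every_nth_col(x, 2)
res
```

drop = FALSE keeps the result a matrix even when only one column is selected.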