I'm doing cross-validation and I want to split the data into 3 folds.
I create a matrix with mat=matrix(sample.int(10, 9*100, TRUE), 6, 10), which looks like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 10 9 3 3 3 4 4 3 9
[2,] 9 3 5 1 3 9 5 5 4 8
[3,] 7 6 6 3 8 2 3 10 7 4
[4,] 7 4 10 8 7 5 2 6 2 8
[5,] 9 7 7 5 3 9 5 8 7 8
[6,] 3 3 1 2 9 3 6 7 6 9
I then want to get 3 matrices with the data:
fold 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 10 9 3 3 3 4 4 3 9
[2,] 9 3 5 1 3 9 5 5 4 8
fold 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 7 6 6 3 8 2 3 10 7 4
[2,] 7 4 10 8 7 5 2 6 2 8
fold 3
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 9 7 7 5 3 9 5 8 7 8
[2,] 3 3 1 2 9 3 6 7 6 9
Here is the code I wrote:
require(stats)
mat=matrix(sample.int(10, 9*100, TRUE), 6, 10)
folds=cut(seq(1, nrow(mat)), breaks = 3, labels = FALSE)
#Perform 3-fold cross-validation
for(i in 1:3){
#segment your data by folds using the which() function
testIndexes=which(folds==i, arr.ind = TRUE)
testData=mat[testIndexes,]
trainData=mat[-testIndexes,]
}
The training data from the two remaining folds (for example, folds 1 and 2 when fold 3 is the test set) come back joined in a single matrix; I want to generate them separately.
This is the generated training set, which should be split into two folds:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 10 9 3 3 3 4 4 3 9
[2,] 9 3 5 1 3 9 5 5 4 8
[3,] 7 6 6 3 8 2 3 10 7 4
[4,] 7 4 10 8 7 5 2 6 2 8
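One possible way to keep the folds apart is to build a list with one matrix per fold and then drop the test fold from that list inside the loop. This is only a sketch: the names fold_list and trainFolds are made up for illustration, the seed is added purely for reproducibility, and the matrix is filled with exactly 6*10 values.

```r
# Example data and fold labels, following the question
set.seed(1)  # assumption: seed added only to make the example reproducible
mat <- matrix(sample.int(10, 6 * 10, TRUE), 6, 10)
folds <- cut(seq_len(nrow(mat)), breaks = 3, labels = FALSE)

# One matrix per fold; drop = FALSE keeps single-row folds as matrices
fold_list <- lapply(split(seq_len(nrow(mat)), folds),
                    function(idx) mat[idx, , drop = FALSE])

for (i in seq_along(fold_list)) {
  testData   <- fold_list[[i]]  # the held-out fold
  trainFolds <- fold_list[-i]   # list of the remaining folds, kept separate
  # trainFolds[[1]] and trainFolds[[2]] are the two training folds;
  # do.call(rbind, trainFolds) would recombine them if that is ever needed
}
```

Keeping the folds in a list means the training folds never have to be split again after the fact.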
I was able to write a function in R to "shift" the columns of a matrix over to the right by one:
shift <- function(disc){
mat <- matrix(nrow = 4, ncol = 12)
mat[,1] <- disc[,12]
for(i in 1:11){
mat[,i+1] <- disc[,i]
}
return(mat)
}
So to see how that works:
> disc0
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 2 5 10 7 16 8 7 8 8 3 4 12
[2,] 3 3 14 14 21 21 9 9 4 4 6 6
[3,] 8 9 10 11 12 13 14 15 4 5 6 7
[4,] 14 11 14 14 11 14 11 14 11 11 14 11
> shift(disc0)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 12 2 5 10 7 16 8 7 8 8 3 4
[2,] 6 3 3 14 14 21 21 9 9 4 4 6
[3,] 7 8 9 10 11 12 13 14 15 4 5 6
[4,] 11 14 11 14 14 11 14 11 14 11 11 14
What if I wanted to shift over 3 times, for example? I could do this manually:
> x <- disc0
> x <- shift(x)
> x <- shift(x)
> x <- shift(x)
> x
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 3 4 12 2 5 10 7 16 8 7 8 8
[2,] 4 6 6 3 3 14 14 21 21 9 9 4
[3,] 5 6 7 8 9 10 11 12 13 14 15 4
[4,] 11 14 11 14 11 14 14 11 14 11 14 11
So the original first column (2,3,8,14) is now in the 4th column.
But how can I automate this? I want to write a function that will repeat my shift function n times. Thanks in advance.
You could write a function that takes in the shift parameter:
shift <- function(x, num = 1){
n <- ncol(x)
x[, c((n - num +1):n, 1:(n - num))]
}
mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 2 3 4 5 6 7 8
[2,] 1 2 3 4 5 6 7 8
[3,] 1 2 3 4 5 6 7 8
[4,] 1 2 3 4 5 6 7 8
shift(mat)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 8 1 2 3 4 5 6 7
[2,] 8 1 2 3 4 5 6 7
[3,] 8 1 2 3 4 5 6 7
[4,] 8 1 2 3 4 5 6 7
shift(mat,2)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 7 8 1 2 3 4 5 6
[2,] 7 8 1 2 3 4 5 6
[3,] 7 8 1 2 3 4 5 6
[4,] 7 8 1 2 3 4 5 6
shift(mat,3)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 6 7 8 1 2 3 4 5
[2,] 6 7 8 1 2 3 4 5
[3,] 6 7 8 1 2 3 4 5
[4,] 6 7 8 1 2 3 4 5
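The index expression above assumes 1 <= num < ncol(x); with num = 0 or num >= ncol(x) the ranges run backwards and the subsetting no longer does what is intended. A hedged variant using modular arithmetic is sketched below. The name shift_mod is hypothetical and not part of the answer above.

```r
# Circular column shift that also accepts num = 0, num >= ncol(x) and negative num
shift_mod <- function(x, num = 1) {
  n <- ncol(x)
  # new column j takes old column ((j - 1 - num) mod n) + 1
  idx <- ((seq_len(n) - 1 - num) %% n) + 1
  x[, idx, drop = FALSE]
}

mat <- matrix(rep(1:8, each = 4), nrow = 4)
shift_mod(mat, 3)[1, ]   # 6 7 8 1 2 3 4 5, same as shift(mat, 3) above
shift_mod(mat, 0)[1, ]   # 1 2 3 4 5 6 7 8, unchanged
```

Because R's %% returns a non-negative result for a positive divisor, a negative num shifts to the left with the same code.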
You may use a for loop with the one-step shift function from the question:
n <- 3
for(i in seq_len(n)) {
disc0 <- shift(disc0)
}
disc0
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#[1,] 3 4 12 2 5 10 7 16 8 7 8 8
#[2,] 4 6 6 3 3 14 14 21 21 9 9 4
#[3,] 5 6 7 8 9 10 11 12 13 14 15 4
#[4,] 11 14 11 14 11 14 14 11 14 11 14 11
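Since the question asks for a function rather than a loop at top level, the loop can be wrapped up as sketched below. The names shift_once and shift_n are made up for this example, and shift_once is a size-independent stand-in for the 4 x 12 shift function in the question.

```r
# One-step shift for a matrix of any size: the last column moves to the front
shift_once <- function(m) {
  n <- ncol(m)
  m[, c(n, seq_len(n - 1)), drop = FALSE]
}

# Apply the one-step shift `times` times
shift_n <- function(m, times) {
  for (i in seq_len(times)) m <- shift_once(m)
  m
}

m <- matrix(1:8, nrow = 2, byrow = TRUE)  # rows: 1 2 3 4 / 5 6 7 8
shift_n(m, 3)                              # rows: 2 3 4 1 / 6 7 8 5
```

Unlike the loop above, this leaves the input matrix untouched and returns the shifted copy.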
I have a vector of numbers:
x <- 1:10
I am after a matrix with n rows. Here is an example with a 3-row matrix:
1 2 3 4 5 6 7 8 9 10
NA 1 2 3 4 5 6 7 8 9
NA NA 1 2 3 4 5 6 7 8
What would be the best way to create one?
There is an odd little function called embed that will do it...
t(embed(c(NA, NA, 1:10), 3))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 2 3 4 5 6 7 8 9 10
[2,] NA 1 2 3 4 5 6 7 8 9
[3,] NA NA 1 2 3 4 5 6 7 8
For a vector x and a matrix of n rows, the equivalent would be
t(embed(c(rep(NA, n-1), x), n))
There may be a simpler way to do this, but one way to create this matrix would be:
create_matrix <- function(x, n) {
t(sapply(seq(n), function(m) c(rep(NA, m - 1), head(x, length(x) - m + 1))))
}
create_matrix(1:10, 3)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,] 1 2 3 4 5 6 7 8 9 10
#[2,] NA 1 2 3 4 5 6 7 8 9
#[3,] NA NA 1 2 3 4 5 6 7 8
create_matrix(c(4, 3, 6, 8, 7), 4)
# [,1] [,2] [,3] [,4] [,5]
#[1,] 4 3 6 8 7
#[2,] NA 4 3 6 8
#[3,] NA NA 4 3 6
#[4,] NA NA NA 4 3
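As a quick sanity check, the embed approach and the sapply approach can be compared directly. The wrapper name lag_matrix is hypothetical; create_matrix is repeated from above so the snippet runs on its own.

```r
# embed-based version wrapped as a function (hypothetical name)
lag_matrix <- function(x, n) {
  t(embed(c(rep(NA, n - 1), x), n))
}

# sapply-based version, repeated from the answer above
create_matrix <- function(x, n) {
  t(sapply(seq(n), function(m) c(rep(NA, m - 1), head(x, length(x) - m + 1))))
}

a <- lag_matrix(1:10, 3)
b <- create_matrix(1:10, 3)
isTRUE(all.equal(a, b))  # TRUE: both give the same 3 x 10 matrix
```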
I am trying to apply a function to each cell of a matrix in R. I want to add 3 to each cell that is > 0.
Example:
mat <- matrix(data=0:9, nrow=5, ncol=10, byrow=FALSE)
mat3 <- apply(mat, MARGIN = 1, FUN= function(mat) if(mat != 0) {mat+3})
But first, that created a list of length 5, and second, 3 was added to all the cells rather than only the cells > 0.
For this simple case, it would be preferable to use the solutions from @akrun or @Karolis Koncevičius, but you can also do:
apply(mat, 2, function(x) ifelse(x > 0, x + 3, x))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 8 0 8 0 8 0 8 0 8
[2,] 4 9 4 9 4 9 4 9 4 9
[3,] 5 10 5 10 5 10 5 10 5 10
[4,] 6 11 6 11 6 11 6 11 6 11
[5,] 7 12 7 12 7 12 7 12 7 12
You don't need apply at all; you can use ifelse directly:
ifelse(mat > 0, mat+3, mat)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 8 0 8 0 8 0 8 0 8
[2,] 4 9 4 9 4 9 4 9 4 9
[3,] 5 10 5 10 5 10 5 10 5 10
[4,] 6 11 6 11 6 11 6 11 6 11
[5,] 7 12 7 12 7 12 7 12 7 12
But a faster solution would be:
mat[mat > 0] <- mat[mat > 0] + 3
mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 8 0 8 0 8 0 8 0 8
[2,] 4 9 4 9 4 9 4 9 4 9
[3,] 5 10 5 10 5 10 5 10 5 10
[4,] 6 11 6 11 6 11 6 11 6 11
[5,] 7 12 7 12 7 12 7 12 7 12
We could also do this on the fly with
mat + (mat > 0) * 3
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,] 0 8 0 8 0 8 0 8 0 8
#[2,] 4 9 4 9 4 9 4 9 4 9
#[3,] 5 10 5 10 5 10 5 10 5 10
#[4,] 6 11 6 11 6 11 6 11 6 11
#[5,] 7 12 7 12 7 12 7 12 7 12
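To confirm that the four approaches above agree, they can be run side by side on the example matrix. This is only a sketch, and the result names (via_apply and so on) are invented for the comparison.

```r
mat <- matrix(data = 0:9, nrow = 5, ncol = 10)

via_apply  <- apply(mat, 2, function(x) ifelse(x > 0, x + 3, x))  # column-wise ifelse
via_ifelse <- ifelse(mat > 0, mat + 3, mat)                       # ifelse on the whole matrix
via_arith  <- mat + (mat > 0) * 3                                 # logical mask times 3

via_index <- mat                                                  # work on a copy
via_index[via_index > 0] <- via_index[via_index > 0] + 3          # logical indexing

# All four should describe the same matrix
all(via_apply == via_ifelse, via_ifelse == via_arith, via_arith == via_index)
```

The logical-indexing and arithmetic versions avoid calling a function per column, which is why they are suggested above as the quicker options.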
Each iteration of my sapply call will output an n*m matrix. n is fixed, m is not.
For example, if I run this in R:
sapply(1:3, function(x) {matrix(1:9, 3)})
and it will output:
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
[4,] 4 4 4
[5,] 5 5 5
[6,] 6 6 6
[7,] 7 7 7
[8,] 8 8 8
[9,] 9 9 9
However, what I want is something like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
Any ideas for this? Thanks
One solution is:
do.call(cbind, lapply(1:3, function(x) {matrix(1:9, 3)}))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
We can use replicate
`dim<-`(replicate(3, matrix(1:9, 3)), c(3, 3*3))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#[1,] 1 4 7 1 4 7 1 4 7
#[2,] 2 5 8 2 5 8 2 5 8
#[3,] 3 6 9 3 6 9 3 6 9
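Since the question says m varies between iterations, note that the do.call(cbind, lapply(...)) form still works in that case, whereas reshaping with dim<- relies on every piece having the same size. A small sketch with a made-up generator in which iteration k returns a 3 x k matrix:

```r
# Each iteration returns a matrix with 3 rows and k columns (k = 1, 2, 3)
pieces <- lapply(1:3, function(k) matrix(seq_len(3 * k), nrow = 3))

# Bind the pieces side by side: 3 rows and 1 + 2 + 3 = 6 columns
res <- do.call(cbind, pieces)
dim(res)  # 3 6
```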
Here is the code I am working with:
A <- matrix(1:9, nrow = 3)
A
cbind(A,A,A)
This gives an output:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
The desired output is...
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 1 1 4 4 4 7 7 7
[2,] 2 2 2 5 5 5 8 8 8
[3,] 3 3 3 6 6 6 9 9 9
In addition I tried this...
test <- (sapply(A , function(maybe) rep(maybe,each=3)))
test
Which gives an output of:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 2 3 4 5 6 7 8 9
[2,] 1 2 3 4 5 6 7 8 9
[3,] 1 2 3 4 5 6 7 8 9
Any help is much appreciated.
Use rep with column indexing: A[,rep(1:3, each=3)]
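A slightly more general sketch of the same idea, using seq_len(ncol(A)) so the column count is not hard-coded. The helper name rep_cols is made up for illustration.

```r
# Repeat every column of a matrix `each` times, keeping the copies adjacent
rep_cols <- function(A, each) {
  A[, rep(seq_len(ncol(A)), each = each), drop = FALSE]
}

A <- matrix(1:9, nrow = 3)
rep_cols(A, 3)[1, ]   # 1 1 1 4 4 4 7 7 7

# For contrast, times = 3 cycles through the columns, like cbind(A, A, A)
identical(A[, rep(1:3, times = 3)], cbind(A, A, A))  # TRUE
```

The difference between the attempted and desired outputs comes down to each versus times in rep.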