Create matrix by layering array - r

I have a vector of numbers:
x <- 1:10
I am after a matrix with n rows, where each row is the vector shifted one more position to the right and padded with NA. Here is an example with a 3-row matrix:
1 2 3 4 5 6 7 8 9 10
NA 1 2 3 4 5 6 7 8 9
NA NA 1 2 3 4 5 6 7 8
What would be the best way to create one?

There is an odd little function called embed that will do it...
t(embed(c(NA, NA, 1:10), 3))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 2 3 4 5 6 7 8 9 10
[2,] NA 1 2 3 4 5 6 7 8 9
[3,] NA NA 1 2 3 4 5 6 7 8
For a vector x and a matrix of n rows, the equivalent would be
t(embed(c(rep(NA, n-1), x), n))
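As a minimal sketch, the one-liner above can be wrapped in a small helper. The name layer_matrix is made up for illustration and does not come from the answer; the body is the same embed call with the padding written out.

```r
# Hypothetical helper (the name is an assumption, not from the original answer).
layer_matrix <- function(x, n) {
  # Pad the front with n - 1 NAs so embed() has enough values for every lag.
  padded <- c(rep(NA, n - 1), x)
  # embed() puts each lag in its own column; transpose so each lag is a row.
  t(embed(padded, n))
}

res <- layer_matrix(1:10, 3)
res
```

Row k of the result is x shifted right by k - 1 positions, so the first row is always the original vector.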

There may be a simpler way to do this, but one way to create this matrix would be
create_matrix <- function(x, n) {
  t(sapply(seq_len(n), function(m) c(rep(NA, m - 1), head(x, length(x) - m + 1))))
}
create_matrix(1:10, 3)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,] 1 2 3 4 5 6 7 8 9 10
#[2,] NA 1 2 3 4 5 6 7 8 9
#[3,] NA NA 1 2 3 4 5 6 7 8
create_matrix(c(4, 3, 6, 8, 7), 4)
# [,1] [,2] [,3] [,4] [,5]
#[1,] 4 3 6 8 7
#[2,] NA 4 3 6 8
#[3,] NA NA 4 3 6
#[4,] NA NA NA 4 3

Related

Convert Character String to Matrix

I have a large (rectangular) vector of strings, e.g:
my.strings <- c("1234567", "1234567", "1234567", "1234567")
which I would like to convert to a matrix:
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
# [1,] 1 2 3 4 5 6 7
# [2,] 1 2 3 4 5 6 7
# [3,] 1 2 3 4 5 6 7
# [4,] 1 2 3 4 5 6 7
Is there a simple way to do this in R? (Unfortunately, yes the strings of numbers are indeed character strings and not numeric.)
We could use strsplit to split at '', and then rbind the list elements after converting the type
do.call(rbind, type.convert(strsplit(my.strings, ""), as.is = TRUE))
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 2 3 4 5 6 7
[2,] 1 2 3 4 5 6 7
[3,] 1 2 3 4 5 6 7
[4,] 1 2 3 4 5 6 7
Here, we assume all strings have the same number of characters (nchar). If they differ, the split vectors will have different lengths, so the shorter ones have to be padded with NA before binding them into a matrix:
lst1 <- type.convert(strsplit(my.strings, ""), as.is = TRUE)
mx <- max(lengths(lst1))
do.call(rbind, lapply(lst1, `length<-`, mx))
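To make the padding step concrete, here is a small sketch with strings of unequal length. The input vector ragged is invented for illustration, and as.integer is used instead of type.convert only to keep the example short.

```r
# Illustrative input (not from the question): strings of different lengths.
ragged <- c("1234567", "123", "12345")

# Split each string into single characters and convert them to integers.
parts <- lapply(strsplit(ragged, ""), as.integer)

# Extend every vector to the longest length; `length<-` fills with NA.
width <- max(lengths(parts))
padded <- do.call(rbind, lapply(parts, `length<-`, width))
padded
```

Shorter strings end up right-padded with NA, so every row has the same number of columns.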
Another possible solution:
library(tidyverse)
my.strings <- c("1234567", "1234567", "1234567", "1234567")
my.strings %>%
  sapply(function(x) str_split(x, "") %>% unlist %>% as.numeric) %>%
  unname %>% t
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 1 2 3 4 5 6 7
#> [2,] 1 2 3 4 5 6 7
#> [3,] 1 2 3 4 5 6 7
#> [4,] 1 2 3 4 5 6 7
Here's another way:
matrix(as.numeric(unlist(strsplit(my.strings, ""))), nrow = length(my.strings), byrow = TRUE)
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1 2 3 4 5 6 7
[2,] 1 2 3 4 5 6 7
[3,] 1 2 3 4 5 6 7
[4,] 1 2 3 4 5 6 7

Function "if" applied to each cell

I am trying to apply a function to each cell of a matrix in R. I want to add 3 to every cell that is > 0.
Example:
mat <- matrix(data=0:9, nrow=5, ncol=10, byrow=F)
mat3 <- apply(mat, MARGIN = 1, FUN= function(mat) if(mat != 0) {mat+3})
But first, that created a list of length 5, and second, 3 gets added to all the cells.
For this simple case, it would be preferable to use the solutions from @akrun or @Karolis Koncevičius, but you can also do:
apply(mat, 2, function(x) ifelse(x > 0, x + 3, x))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 8 0 8 0 8 0 8 0 8
[2,] 4 9 4 9 4 9 4 9 4 9
[3,] 5 10 5 10 5 10 5 10 5 10
[4,] 6 11 6 11 6 11 6 11 6 11
[5,] 7 12 7 12 7 12 7 12 7 12
You don't need apply at all; you can use ifelse directly:
ifelse(mat > 0, mat+3, mat)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 8 0 8 0 8 0 8 0 8
[2,] 4 9 4 9 4 9 4 9 4 9
[3,] 5 10 5 10 5 10 5 10 5 10
[4,] 6 11 6 11 6 11 6 11 6 11
[5,] 7 12 7 12 7 12 7 12 7 12
But a faster solution would be:
mat[mat > 0] <- mat[mat > 0] + 3
mat
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 8 0 8 0 8 0 8 0 8
[2,] 4 9 4 9 4 9 4 9 4 9
[3,] 5 10 5 10 5 10 5 10 5 10
[4,] 6 11 6 11 6 11 6 11 6 11
[5,] 7 12 7 12 7 12 7 12 7 12
We could also do this on the fly with
mat + (mat > 0) * 3
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,] 0 8 0 8 0 8 0 8 0 8
#[2,] 4 9 4 9 4 9 4 9 4 9
#[3,] 5 10 5 10 5 10 5 10 5 10
#[4,] 6 11 6 11 6 11 6 11 6 11
#[5,] 7 12 7 12 7 12 7 12 7 12
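The original attempt ran into trouble because if expects a single TRUE or FALSE rather than a whole row, whereas comparison and arithmetic operators work element-wise on a matrix. As a quick sanity check (the variable names below are for illustration only), the three vectorized approaches above give the same result:

```r
mat <- matrix(0:9, nrow = 5, ncol = 10)

# Approach 1: vectorized ifelse on the whole matrix.
via_ifelse <- ifelse(mat > 0, mat + 3, mat)

# Approach 2: a logical mask times 3 (TRUE counts as 1, FALSE as 0).
via_mask <- mat + (mat > 0) * 3

# Approach 3: logical indexing on a copy, leaving mat itself unchanged.
via_index <- mat
via_index[via_index > 0] <- via_index[via_index > 0] + 3

same <- isTRUE(all.equal(via_ifelse, via_mask)) &&
  isTRUE(all.equal(via_mask, via_index))
same
```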

R: Generate 3 separate folds from a matrix

I'm doing cross validation and I want to separate the data into 3 folds.
I create a matrix with mat=matrix(sample.int(10, 9*100, TRUE), 6, 10), which looks like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 10 9 3 3 3 4 4 3 9
[2,] 9 3 5 1 3 9 5 5 4 8
[3,] 7 6 6 3 8 2 3 10 7 4
[4,] 7 4 10 8 7 5 2 6 2 8
[5,] 9 7 7 5 3 9 5 8 7 8
[6,] 3 3 1 2 9 3 6 7 6 9
I then want to get 3 matrices with the data:
fold 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 10 9 3 3 3 4 4 3 9
[2,] 9 3 5 1 3 9 5 5 4 8
fold 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 7 6 6 3 8 2 3 10 7 4
[2,] 7 4 10 8 7 5 2 6 2 8
fold 3
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 9 7 7 5 3 9 5 8 7 8
[2,] 3 3 1 2 9 3 6 7 6 9
Here is the code I tried:
require(stats)
mat=matrix(sample.int(10, 9*100, TRUE), 6, 10)
folds=cut(seq(1, nrow(mat)), breaks = 3, labels = FALSE)
#Perform 3-fold cross validation
for(i in 1:3){
  #segment your data by folds using the which() function
  testIndexes=which(folds==i, arr.ind = TRUE)
  testData=mat[testIndexes,]
  trainData=mat[-testIndexes,]
}
The training data that I get from fold 1 and fold 2 are combined into one matrix; I want to generate them separately.
This is the generated training set, which should be separated into two folds:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 10 10 9 3 3 3 4 4 3 9
[2,] 9 3 5 1 3 9 5 5 4 8
[3,] 7 6 6 3 8 2 3 10 7 4
[4,] 7 4 10 8 7 5 2 6 2 8
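One possible approach, sketched here under the assumption that each fold should simply be its own matrix, is to split the row indices by fold and subset the matrix once per fold. The names fold_rows, fold_list, test_data and train_folds are illustrative; set.seed and the data length of 6 * 10 are added only to make the example reproducible.

```r
set.seed(42)  # assumption: fixed seed so the example is reproducible
mat <- matrix(sample.int(10, 6 * 10, TRUE), 6, 10)
folds <- cut(seq_len(nrow(mat)), breaks = 3, labels = FALSE)

# Row indices belonging to each fold, as a list of length 3.
fold_rows <- split(seq_len(nrow(mat)), folds)

# One matrix per fold; drop = FALSE keeps a matrix even for one-row folds.
fold_list <- lapply(fold_rows, function(idx) mat[idx, , drop = FALSE])

# With fold i as the test set, the training folds stay separate in a list.
i <- 3
test_data <- fold_list[[i]]
train_folds <- fold_list[-i]
```

Here train_folds is a list of two 2-row matrices rather than one 4-row matrix, so each training fold can be used on its own.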

Efficient creation of a matrix of offsets

Goal
I want to use a long vector of numbers, to create a matrix where each column is a successive offset (lag or lead) of the original vector. If n is the maximum offset, the matrix will have dimensions [length(vector), n * 2 + 1] (because we want offsets in both directions, and include the 0 offset, i.e. the original vector).
Example
To illustrate, consider the following vector:
test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
[1] 2 8 1 10 7 5 9 3 4 6
Expected output
Now we create offsets of values, let's say for n == 3:
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] NA NA NA 2 8 1 10
[2,] NA NA 2 8 1 10 7
[3,] NA 2 8 1 10 7 5
[4,] 2 8 1 10 7 5 9
[5,] 8 1 10 7 5 9 3
[6,] 1 10 7 5 9 3 4
[7,] 10 7 5 9 3 4 6
[8,] 7 5 9 3 4 6 NA
[9,] 5 9 3 4 6 NA NA
[10,] 9 3 4 6 NA NA NA
I am looking for an efficient solution. data.table or tidyverse solutions more than welcome.
Returning only the rows that have no NA's (i.e. rows 4 to 7) is also ok.
Current solution
lags <- lapply(3:1, function(x) dplyr::lag(test, x))
leads <- lapply(1:3, function(x) dplyr::lead(test, x))
l <- c(lags, test, leads)
matrix(unlist(l), nrow = length(test))
In base R, you can use embed to get rows 4 through 7. You have to reverse the column order, however.
embed(test, 7)[, 7:1]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 2 8 1 10 7 5 9
[2,] 8 1 10 7 5 9 3
[3,] 1 10 7 5 9 3 4
[4,] 10 7 5 9 3 4 6
data
test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
This will produce what you need...
n <- 3
t(embed(c(rep(NA,n), test, rep(NA,n)), length(test)))[length(test):1,]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] NA NA NA 2 8 1 10
[2,] NA NA 2 8 1 10 7
[3,] NA 2 8 1 10 7 5
[4,] 2 8 1 10 7 5 9
[5,] 8 1 10 7 5 9 3
[6,] 1 10 7 5 9 3 4
[7,] 10 7 5 9 3 4 6
[8,] 7 5 9 3 4 6 NA
[9,] 5 9 3 4 6 NA NA
[10,] 9 3 4 6 NA NA NA
This can be solved by constructing the matrix from a long vector and returning only the wanted columns and rows:
test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
n_offs <- 3L
n_row <- length(test) + n_offs + 1L
matrix(rep(c(rep(NA, n_offs), test), n_row), nrow = n_row)[1:length(test), 1:(n_offs * 2L + 1L)]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] NA NA NA 2 8 1 10
[2,] NA NA 2 8 1 10 7
[3,] NA 2 8 1 10 7 5
[4,] 2 8 1 10 7 5 9
[5,] 8 1 10 7 5 9 3
[6,] 1 10 7 5 9 3 4
[7,] 10 7 5 9 3 4 6
[8,] 7 5 9 3 4 6 NA
[9,] 5 9 3 4 6 NA NA
[10,] 9 3 4 6 NA NA NA
A variant which just returns the same result as embed(test, 7)[, 7:1] is:
matrix(rep(test, length(test) + 1L), nrow = length(test) + 1L)[
seq_len(length(test) - 2L * n_offs), seq_len(n_offs * 2L + 1L)]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 2 8 1 10 7 5 9
[2,] 8 1 10 7 5 9 3
[3,] 1 10 7 5 9 3 4
[4,] 10 7 5 9 3 4 6
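Another way to express the same idea, offered as a sketch rather than a benchmarked solution, is to build a matrix of positions with outer and look all values up in one step. The function name offset_matrix is hypothetical and not taken from any of the answers.

```r
# Hypothetical helper: column j holds x at offsets -n, ..., n relative to row i.
offset_matrix <- function(x, n) {
  # idx[i, j] is the position in x that belongs in row i, column j.
  idx <- outer(seq_along(x), -n:n, "+")
  # Positions that fall outside the vector become NA.
  idx[idx < 1 | idx > length(x)] <- NA
  # Indexing with NA yields NA, which gives the padding for free.
  matrix(x[idx], nrow = length(x))
}

test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
offsets <- offset_matrix(test, 3)
offsets
```

Rows 4 to 7 of the result are the complete rows, the same ones that embed(test, 7)[, 7:1] returns.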

How to combine matrices with sapply in R

Each iteration of my sapply call outputs an n*m matrix. n is fixed, m is not.
For example, if I run this in R:
sapply(1:3, function(x) {matrix(1:9, 3)})
it outputs:
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 2 2 2
[3,] 3 3 3
[4,] 4 4 4
[5,] 5 5 5
[6,] 6 6 6
[7,] 7 7 7
[8,] 8 8 8
[9,] 9 9 9
However, what I want is something like this:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
Any ideas? Thanks
One solution is:
do.call(cbind, lapply(1:3, function(x) {matrix(1:9, 3)}))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 4 7 1 4 7 1 4 7
[2,] 2 5 8 2 5 8 2 5 8
[3,] 3 6 9 3 6 9 3 6 9
We can also use replicate and reset the dimensions (this works when every matrix has the same size):
`dim<-`(replicate(3, matrix(1:9, 3)), c(3, 3*3))
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
#[1,] 1 4 7 1 4 7 1 4 7
#[2,] 2 5 8 2 5 8 2 5 8
#[3,] 3 6 9 3 6 9 3 6 9
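sapply flattens each returned matrix into a single column, which is why the original call produced a 9-row result. Since the question says m varies, here is a small sketch (the input matrices are invented for illustration) showing that the lapply plus cbind approach still works when each matrix has a different number of columns, as long as the row count is fixed:

```r
# Illustrative input: three matrices with 3 rows and 1, 2, and 3 columns.
pieces <- lapply(1:3, function(m) matrix(seq_len(3 * m), nrow = 3))

# cbind only needs the row counts to match, so the widths may differ.
combined <- do.call(cbind, pieces)
dim(combined)
```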
