Goal
I want to use a long vector of numbers to create a matrix where each column is a successive offset (lag or lead) of the original vector. If n is the maximum offset, the matrix will have dimensions [length(vector), n * 2 + 1] (because we want offsets in both directions, plus the 0 offset, i.e. the original vector).
Example
To illustrate, consider the following vector:
test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
[1] 2 8 1 10 7 5 9 3 4 6
Expected output
Now we create offsets of values, let's say for n == 3:
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] NA NA NA 2 8 1 10
[2,] NA NA 2 8 1 10 7
[3,] NA 2 8 1 10 7 5
[4,] 2 8 1 10 7 5 9
[5,] 8 1 10 7 5 9 3
[6,] 1 10 7 5 9 3 4
[7,] 10 7 5 9 3 4 6
[8,] 7 5 9 3 4 6 NA
[9,] 5 9 3 4 6 NA NA
[10,] 9 3 4 6 NA NA NA
I am looking for an efficient solution. data.table or tidyverse solutions more than welcome.
Returning only the rows that contain no NAs (i.e. rows 4 to 7) is also OK.
Current solution
lags <- lapply(3:1, function(x) dplyr::lag(test, x))
leads <- lapply(1:3, function(x) dplyr::lead(test, x))
l <- c(lags, list(test), leads)  # wrap test in list() so it stays one element
matrix(unlist(l), nrow = length(test))
In base R, you can use embed to get rows 4 through 7. You have to reverse the column order, however.
embed(test, 7)[, 7:1]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 2 8 1 10 7 5 9
[2,] 8 1 10 7 5 9 3
[3,] 1 10 7 5 9 3 4
[4,] 10 7 5 9 3 4 6
data
test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
This will produce what you need...
n <- 3
t(embed(c(rep(NA,n), test, rep(NA,n)), length(test)))[length(test):1,]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] NA NA NA 2 8 1 10
[2,] NA NA 2 8 1 10 7
[3,] NA 2 8 1 10 7 5
[4,] 2 8 1 10 7 5 9
[5,] 8 1 10 7 5 9 3
[6,] 1 10 7 5 9 3 4
[7,] 10 7 5 9 3 4 6
[8,] 7 5 9 3 4 6 NA
[9,] 5 9 3 4 6 NA NA
[10,] 9 3 4 6 NA NA NA
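The padding idea can also be wrapped in a small helper. This is only a sketch: `offset_matrix` is a hypothetical name, and instead of transposing it asks embed for windows of width 2n + 1 and reverses the columns.

```r
# Hypothetical helper: row i holds x[i - n], ..., x[i + n],
# with NA wherever the window runs off either end of x.
offset_matrix <- function(x, n) {
  width  <- 2L * n + 1L
  padded <- c(rep(NA, n), x, rep(NA, n))
  # embed() lists each window last-element-first, so reverse the columns
  embed(padded, width)[, width:1, drop = FALSE]
}

test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
res <- offset_matrix(test, 3L)
dim(res)   # one row per element of test, 2 * 3 + 1 columns
res[4, ]   # first row that contains no NA
```

Dropping the incomplete rows afterwards is then a matter of `res[complete.cases(res), ]`.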
This can be solved by constructing the matrix from a long vector and returning only the wanted columns and rows:
test <- c(2, 8, 1, 10, 7, 5, 9, 3, 4, 6)
n_offs <- 3L
n_row <- length(test) + n_offs + 1L
matrix(rep(c(rep(NA, n_offs), test), n_row), nrow = n_row)[1:length(test), 1:(n_offs * 2L + 1L)]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] NA NA NA 2 8 1 10
[2,] NA NA 2 8 1 10 7
[3,] NA 2 8 1 10 7 5
[4,] 2 8 1 10 7 5 9
[5,] 8 1 10 7 5 9 3
[6,] 1 10 7 5 9 3 4
[7,] 10 7 5 9 3 4 6
[8,] 7 5 9 3 4 6 NA
[9,] 5 9 3 4 6 NA NA
[10,] 9 3 4 6 NA NA NA
A variant which just returns the same result as embed(test, 7)[, 7:1] is:
matrix(rep(test, length(test) + 1L), nrow = length(test) + 1L)[
seq_len(length(test) - 2L * n_offs), seq_len(n_offs * 2L + 1L)]
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 2 8 1 10 7 5 9
[2,] 8 1 10 7 5 9 3
[3,] 1 10 7 5 9 3 4
[4,] 10 7 5 9 3 4 6
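Why the recycling trick works is easier to see on a short vector. A minimal sketch (the vector `v` is made up for illustration): when the matrix has one more row than the vector has elements, each column starts one element later than the previous one, so reading across a row gives consecutive elements.

```r
v <- 1:4
# 5 rows for a length-4 vector: every column is shifted by one position
m <- matrix(rep(v, length(v) + 1L), nrow = length(v) + 1L)
m[1, ]   # v[1], v[2], v[3], v[4]
m[2, ]   # v[2], v[3], v[4], then wraps around to v[1]
```

The wrap-around in the second row is why the full version prepends the NA padding to the vector and keeps only the first rows and columns.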
Related
I have a vector comprised of a set of numbers, for example:
vec <- c(1, 2, 3, 4, 5)
I wish to produce a matrix that contains the sum of each pair of elements of that vector - in this case:
[,1] [,2] [,3] [,4] [,5]
[1,] 2 3 4 5 6
[2,] 3 4 5 6 7
[3,] 4 5 6 7 8
[4,] 5 6 7 8 9
[5,] 6 7 8 9 10
You can use outer()
outer(vec,vec,"+")
Output:
[,1] [,2] [,3] [,4] [,5]
[1,] 2 3 4 5 6
[2,] 3 4 5 6 7
[3,] 4 5 6 7 8
[4,] 5 6 7 8 9
[5,] 6 7 8 9 10
Note: this may also be written as:
outer(vec,vec,`+`)
Here's one way.
vec <- c(1, 2, 3, 4, 5)
matrix(rowSums(expand.grid(vec, vec)), ncol = length(vec))
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 2 3 4 5 6
#> [2,] 3 4 5 6 7
#> [3,] 4 5 6 7 8
#> [4,] 5 6 7 8 9
#> [5,] 6 7 8 9 10
sapply also works, though it may not be as efficient as outer:
sapply(vec,`+`,vec)
[,1] [,2] [,3] [,4] [,5]
[1,] 2 3 4 5 6
[2,] 3 4 5 6 7
[3,] 4 5 6 7 8
[4,] 5 6 7 8 9
[5,] 6 7 8 9 10
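The three approaches above can be checked against each other. A quick sketch; the names `by_outer`, `by_grid`, and `by_sapply` are only for illustration:

```r
vec <- c(1, 2, 3, 4, 5)

by_outer  <- outer(vec, vec, "+")
by_grid   <- matrix(rowSums(expand.grid(vec, vec)), ncol = length(vec))
by_sapply <- sapply(vec, `+`, vec)

# stopifnot() raises an error if any of the results disagree
stopifnot(isTRUE(all.equal(by_outer, by_grid)),
          isTRUE(all.equal(by_outer, by_sapply)))
```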
Suppose I have a list of nine 2 x 2 matrices, defined by:
mat_list <- list(matrix(1, 2, 2), matrix(2, 2, 2), matrix(3, 2, 2),
matrix(4, 2, 2), matrix(5, 2, 2), matrix(6, 2, 2),
matrix(7, 2, 2), matrix(8, 2, 2), matrix(9, 2, 2))
I would like to merge these matrices into a single 6 x 6 matrix. It would look like this:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 4 4 7 7
[2,] 1 1 4 4 7 7
[3,] 2 2 5 5 8 8
[4,] 2 2 5 5 8 8
[5,] 3 3 6 6 9 9
[6,] 3 3 6 6 9 9
I can accomplish this task using the following code:
do.call( cbind, list( do.call( rbind, mat_list[1:3]),
do.call( rbind, mat_list[4:6]),
do.call( rbind, mat_list[7:9])) )
But how can this be generalized for a very large list of matrices? It would be too tedious to write out the list of do.call functions.
One option is to split the list into groups of three, rbind each group, and cbind the results:
do.call(
cbind,
lapply(
split(mat_list, ceiling(seq_along(mat_list) / 3)),
function(x) do.call(rbind, x)
)
)
which gives
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 4 4 7 7
[2,] 1 1 4 4 7 7
[3,] 2 2 5 5 8 8
[4,] 2 2 5 5 8 8
[5,] 3 3 6 6 9 9
[6,] 3 3 6 6 9 9
or
do.call(cbind, Map(function(x) do.call(rbind, x), data.frame(matrix(mat_list, 3, 3))))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 4 4 7 7
[2,] 1 1 4 4 7 7
[3,] 2 2 5 5 8 8
[4,] 2 2 5 5 8 8
[5,] 3 3 6 6 9 9
[6,] 3 3 6 6 9 9
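For a longer list, the split approach can be turned into a function with the group size as a parameter. A sketch under the assumption that the list is ordered column block by column block, as in the question; `bind_blocks` and `per_col` are hypothetical names:

```r
# Hypothetical helper: stack every `per_col` consecutive matrices vertically,
# then place the resulting stacks side by side.
bind_blocks <- function(mats, per_col) {
  stopifnot(length(mats) %% per_col == 0)
  groups <- split(mats, ceiling(seq_along(mats) / per_col))
  do.call(cbind, lapply(unname(groups), function(g) do.call(rbind, g)))
}

mat_list <- lapply(1:9, function(k) matrix(k, 2, 2))
res <- bind_blocks(mat_list, 3)
dim(res)   # 6 x 6 for nine 2 x 2 blocks in groups of three
```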
I have a vector of numbers:
x <- 1:10
I am after a matrix with n rows, where each row is the previous one shifted right by one position. Here is an example of a 3-row matrix:
1 2 3 4 5 6 7 8 9 10
NA 1 2 3 4 5 6 7 8 9
NA NA 1 2 3 4 5 6 7 8
What would be the best way to create one?
There is an odd little function called embed that will do it...
t(embed(c(NA, NA, 1:10), 3))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 2 3 4 5 6 7 8 9 10
[2,] NA 1 2 3 4 5 6 7 8 9
[3,] NA NA 1 2 3 4 5 6 7 8
For a vector x and a matrix of n rows, the equivalent would be
t(embed(c(rep(NA, n-1), x), n))
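As a runnable sketch of that general form (`lagged_rows` is a hypothetical name, not a base R function):

```r
# Hypothetical wrapper around the embed() one-liner:
# row k is x shifted right by k - 1 positions, padded with NA on the left.
lagged_rows <- function(x, n) t(embed(c(rep(NA, n - 1), x), n))

res <- lagged_rows(1:10, 3)
res[1, ]   # the original vector
res[3, ]   # two leading NAs, then the first eight values
```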
There may be a simpler way to do this, but one way to create this matrix would be:
create_matrix <- function(x, n) {
t(sapply(seq(n), function(m) c(rep(NA, m - 1), head(x, length(x) - m + 1))))
}
create_matrix(1:10, 3)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,] 1 2 3 4 5 6 7 8 9 10
#[2,] NA 1 2 3 4 5 6 7 8 9
#[3,] NA NA 1 2 3 4 5 6 7 8
create_matrix(c(4, 3, 6, 8, 7), 4)
# [,1] [,2] [,3] [,4] [,5]
#[1,] 4 3 6 8 7
#[2,] NA 4 3 6 8
#[3,] NA NA 4 3 6
#[4,] NA NA NA 4 3
I'm facing a problem constructing a randomised matrix in which some values are already given (they need to stay fixed, so no further randomisation there).
Let's see:
The matrix should end up being 10 by 10:
n <- 10
I want my first rows to be the data I enter, e.g.:
row1<- c(1,4,7,6,5,3,2,8,9,10)
row2<- c(10,7,3,2,1,4,5,9,8,6)
row3<- c(9,2,4,3,8,7,10,1,6,5)
To build a matrix with 10 rows (and 10 columns), I combined those rows with samples (without replacement, because I want each number to be unique in each row).
first.rows<-rbind(row1,row2,row3,sample(n,n,replace=F),sample(n,n,replace=F),sample(n,n,replace=F),sample(n,n,replace=F),sample(n,n,replace=F),sample(n,n,replace=F),sample(n,n,replace=F))
output:
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
row1    1    4    7    6    5    3    2    8    9    10
row2   10    7    3    2    1    4    5    9    8     6
row3    9    2    4    3    8    7   10    1    6     5
        6    1    5    4    2   10    3    8    7     9
        2    5    7    8    9    6    1    3    4    10
       10    6    4    1    8    3    7    2    5     9
        8    5    3    2    4    1   10    7    6     9
       10    7    9    6    8    2    5    4    3     1
        1   10    8    4    7    3    5    2    6     9
        2    1   10    4    8    9    3    6    5     7
So far so good.
However, nothing here enforces unique numbers in the columns, which is what I need. I understand why: I used rbind, so the no-duplicates guarantee of sample applies only within each row. But I do not know how else to approach this problem. Rows 1-3 should stay exactly as they are now.
Any ideas?
I think my previous solution, "Fixed values not repeated over column and row", can be modified to work. You need a solver, but instead of starting with an empty grid, it starts with a pre-filled matrix:
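# x is your matrix; cells that are not filled yet should be NA
# x must be a square matrix of dimension n (a big n will take longer to converge)
backtrack = function(x){
  n = ncol(x)
  stopifnot(ncol(x) == nrow(x))
  # One slot per cell in row-major order: free cells get a shuffled candidate
  # list, pre-filled cells stay NULL. Pre-allocating keeps every slot
  # addressable even when the last cells of the matrix are pre-filled.
  cells = vector("list", n * n)
  k = 1
  for (i in 1:n){
    for (j in 1:n){
      if (is.na(x[i, j]))
        cells[[k]] = sample(1:n)
      k = k + 1
    }
  }
  i = 0  # zero-based index of the current cell
  while (i < n*n){
    # pre-filled cell: leave it alone and move on
    if (is.null(cells[[i+1]])){
      i = i + 1
      next
    }
    candidates = cells[[i + 1]]
    # drawn before the emptiness check; val is ignored if candidates is empty
    idx = sample(1:length(candidates), 1)
    val = candidates[idx]
    if (length(candidates) == 0){
      # dead end: reset this cell, then step back to the previous free cell
      # (skipping pre-filled cells so their values are never cleared)
      cells[[i + 1]] = sample(1:n)
      i = i - 1
      while (i >= 0 && is.null(cells[[i + 1]])) i = i - 1
      if (i < 0) stop("no valid completion exists for this matrix")
      x[as.integer(i/n) + 1, i %% n + 1] = NA
    }
    else {
      rr = as.integer(i/n) + 1
      cc = i %% n + 1
      if ((val %in% x[rr, ]) || (val %in% x[, cc])){
        # val already occurs in this row or column: discard it
        candidates = candidates[-idx]
        cells[[i + 1]] = candidates
      }
      else{
        # val fits: place it and advance to the next cell
        x[rr, cc] = val
        candidates = candidates[-idx]
        cells[[i + 1]] = candidates
        i = i + 1
      }
    }
  }
  x
}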
Empty initial matrix
set.seed(1)
x = backtrack(matrix(NA, nrow = 10, ncol = 10))
print(x)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 8 10 4 6 9 7 1 2 3 5
[2,] 5 6 9 8 1 10 4 3 2 7
[3,] 10 7 1 2 8 9 5 4 6 3
[4,] 3 9 8 10 6 5 7 1 4 2
[5,] 9 1 6 4 7 3 2 5 10 8
[6,] 1 4 10 3 2 6 8 7 5 9
[7,] 2 8 5 9 10 1 3 6 7 4
[8,] 6 5 2 7 3 4 10 9 8 1
[9,] 4 3 7 1 5 2 6 8 9 10
[10,] 7 2 3 5 4 8 9 10 1 6
Pre-filled initial matrix
m = matrix(NA, ncol = 10, nrow = 10)
m[1, ] = c(1,4,7,6,5,3,2,8,9,10)
m[2, ] = c(10,7,3,2,1,4,5,9,8,6)
m[3, ] = c(9,2,4,3,8,7,10,1,6,5)
x = backtrack(m)
print(x)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 4 7 6 5 3 2 8 9 10
[2,] 10 7 3 2 1 4 5 9 8 6
[3,] 9 2 4 3 8 7 10 1 6 5
[4,] 5 9 6 8 3 2 4 7 10 1
[5,] 7 1 5 10 9 6 3 2 4 8
[6,] 2 5 8 1 10 9 6 3 7 4
[7,] 6 3 1 4 7 5 8 10 2 9
[8,] 8 10 9 5 4 1 7 6 3 2
[9,] 3 6 10 9 2 8 1 4 5 7
[10,] 4 8 2 7 6 10 9 5 1 3
NOTE: I haven't tested it for bugs.
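Since the solver is untested, it may be worth checking its output. A minimal sketch of such a check; `has_unique_rows_cols` is a hypothetical helper, and the 4 x 4 input is an illustrative cyclic square rather than output of backtrack():

```r
# Hypothetical checker: TRUE if no value repeats within any row or any column
has_unique_rows_cols <- function(m) {
  rows_ok <- all(apply(m, 1, anyDuplicated) == 0)
  cols_ok <- all(apply(m, 2, anyDuplicated) == 0)
  rows_ok && cols_ok
}

# Illustrative input: entry [i, j] is (i + j - 2) %% 4 + 1
good <- outer(0:3, 0:3, "+") %% 4 + 1
bad  <- good
bad[2, 1] <- bad[1, 1]   # introduces a duplicate in column 1 and in row 2

has_unique_rows_cols(good)
has_unique_rows_cols(bad)
```

The same function can be applied to the result of backtrack(m), together with a comparison of its first three rows against the fixed input rows.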
I'm trying to create a 6x6 matrix with each cell value equal to the sum of its row index and its column index. I can do this using loops, but I'm wondering if there is a way to do this using vectorized functions.
Use the outer function with "+":
outer(1:6, 1:6, "+")
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 2 3 4 5 6 7
[2,] 3 4 5 6 7 8
[3,] 4 5 6 7 8 9
[4,] 5 6 7 8 9 10
[5,] 6 7 8 9 10 11
[6,] 7 8 9 10 11 12
Incidentally, this is basically a shortcut for the following vectorized approach:
matrix(rep(1:6, 6) + rep(1:6, each = 6), nrow = 6)
Here's another possibility:
m <- matrix(NA,6,6)
m <- col(m)+row(m)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 2 3 4 5 6 7
#[2,] 3 4 5 6 7 8
#[3,] 4 5 6 7 8 9
#[4,] 5 6 7 8 9 10
#[5,] 6 7 8 9 10 11
#[6,] 7 8 9 10 11 12
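Both ideas carry over to non-square matrices. A small sketch, where `index_sum` is a hypothetical helper name, that builds the matrix with outer and checks it against the row() + col() version:

```r
# Hypothetical helper: entry [i, j] is i + j for an nr x nc matrix
index_sum <- function(nr, nc) outer(seq_len(nr), seq_len(nc), "+")

m <- index_sum(2, 3)
# row() and col() only need the shape of m, not its contents
stopifnot(isTRUE(all.equal(m, row(m) + col(m))))
m[2, 3]   # 2 + 3
```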