I am trying to create a block circulant matrix in R. The structure of a block circulant matrix is given below.
C0 C1 ... Cn-1
Cn-1 C0 C1 ... Cn-2
Cn-2 Cn-1 C0 ... Cn-3
and so on
I have the blocks
C0 .... Cn-1
What is the easiest way to create the matrix? Is there a function already available?
Thanks for a challenging question! Here is a solution summing kronecker products of your matrices with sub- and super-diagonals.
Sample data, a list of matrices:
C <- lapply(1:3, matrix, nrow = 2, ncol = 2)
My solution:
bcm <- function(C) {
require(Matrix)
n <- length(C)
Reduce(`+`, lapply((-n+1):(n-1),
function(i) kronecker(as.matrix(bandSparse(n, n, -i)),
C[[1 + (i %% n)]])))
}
bcm(C)
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 1 1 3 3 2 2
# [2,] 1 1 3 3 2 2
# [3,] 2 2 1 1 3 3
# [4,] 2 2 1 1 3 3
# [5,] 3 3 2 2 1 1
# [6,] 3 3 2 2 1 1
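To make the Kronecker step more concrete, here is a minimal sketch (my own illustration, not part of the answer above; `P`, `B` and `K` are hypothetical names). A 0/1 pattern matrix marks which block positions are filled, and kronecker() drops a copy of the block at each marked position:

```r
# Hypothetical illustration: put a 2x2 block on the block subdiagonal
P <- matrix(0, 3, 3)
P[cbind(2:3, 1:2)] <- 1   # ones at (2,1) and (3,2), i.e. the subdiagonal
B <- matrix(2, 2, 2)      # the block to place
K <- kronecker(P, B)      # 6x6 matrix; B sits at block positions (2,1) and (3,2)
dim(K)
```

Summing one such product per diagonal, each with its own block, is what builds up the full matrix.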
I don't know if this is particularly efficient, but as I interpret your question it does what you want.
rotList <- function(L,n) {
if (n==0) return(L)
c(tail(L,n),head(L,-n))
}
rowFun <- function(n,matList) do.call(rbind,rotList(matList,n))
bcMat <- function(matList) {
n <- length(matList)
do.call(cbind,lapply(0:(n-1),rowFun,matList))
}
Example:
bcMat(list(diag(3),matrix(1:9,nrow=3),matrix(4,nrow=3,ncol=3)))
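Note that both solutions above put C0, C1, ... down the first block column. If you need the layout exactly as written in the question (C0 C1 ... Cn-1 across the first block row), the following is a minimal sketch using plain index arithmetic. The name `block_circulant` is hypothetical, and I am assuming all blocks have the same dimensions:

```r
# Hypothetical helper: block (i, j) of the result is blocks[[((j - i) %% n) + 1]]
block_circulant <- function(blocks) {
  n <- length(blocks)
  rows <- lapply(seq_len(n), function(i) {
    idx <- ((seq_len(n) - i) %% n) + 1   # rotate the block order for block row i
    do.call(cbind, blocks[idx])
  })
  do.call(rbind, rows)
}

C <- lapply(1:3, matrix, nrow = 2, ncol = 2)
block_circulant(C)
```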
I think what you are looking for is circulant.matrix from the lgcp package.
If x is a matrix whose columns are the bases of the sub-blocks of a
block circulant matrix, then this function returns the block circulant
matrix of interest.
For example:
library(lgcp)
x <- matrix(1:8,ncol=4)
circulant(x)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] 1 2 3 4 5 6 7 8
# [2,] 2 1 4 3 6 5 8 7
# [3,] 7 8 1 2 3 4 5 6
# [4,] 8 7 2 1 4 3 6 5
# [5,] 5 6 7 8 1 2 3 4
# [6,] 6 5 8 7 2 1 4 3
# [7,] 3 4 5 6 7 8 1 2
# [8,] 4 3 6 5 8 7 2 1
Alternative approach
Here is a highly inefficient approach using kronecker and Reduce
bcirc <- function(list.blocks){
P <- lapply(seq_along(list.blocks), function(x,y) x ==y, x = circulant(seq_along(list.blocks)))
Reduce('+',Map(P = P, A=list.blocks, f = function(P,A) kronecker(P,A)))
}
Benchmarking against the solutions by @flodel and @Ben Bolker:
library(microbenchmark)
microbenchmark(bcm(C), bcirc(C), bcMat(C))
Unit: microseconds
expr min lq median uq max neval
bcm(C) 10836.719 10925.7845 10992.8450 11141.1240 21622.927 100
bcirc(C) 444.983 455.7275 479.5790 487.0370 569.105 100
bcMat(C) 288.558 296.4350 309.8945 348.4215 2190.231 100
Is something like this what you are looking for?
> vec <- 1:4
> sapply(rev(seq_along(vec)),function(x) c(tail(vec,x),head(vec,-x)) )
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 1
[3,] 3 4 1 2
[4,] 4 1 2 3
The What
Given some matrix:
mat <- matrix(1:10,ncol=2)
I want to transform it to the following triplet format: (i,j,v) where i is the row index, j is the column index and v is the value at i,j (you can see why at the bottom)
What I have tried:
matrixToTriplet <- function(mat) {
i <- 1:nrow(mat)
nj <- ncol(mat)
j <- 1:nj
output <- matrix(numeric(0), ncol=3)
for(i_ in i) {
curr <- c(rep(i_, times=nj),j,mat[i_,])
output <- rbind(output, matrix(curr, ncol=3))
}
output
}
The output should be:
> matrixToTriplet(mat)
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 2 6
[3,] 2 1 2
[4,] 2 2 7
[5,] 3 1 3
[6,] 3 2 8
[7,] 4 1 4
[8,] 4 2 9
[9,] 5 1 5
[10,] 5 2 10
I also had another version using apply and sapply instead of for but those would explode very quickly. The kind of sizes I am working on is quite large, 1600x1600 matrices.
The Why
One might ask "why?" The reason is that I need the is and js as features for a model to predict v. If there is a better way to do this, I am interested to hear it.
For those who really like expand.grid:
cbind(expand.grid(seq(nrow(mat)), seq(ncol(mat))), as.vector(mat))
You can do this with row and col:
x <- t(mat)
cbind(c(col(x)), c(row(x)), c(x))
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 1 2 6
# [3,] 2 1 2
# [4,] 2 2 7
# [5,] 3 1 3
# [6,] 3 2 8
# [7,] 4 1 4
# [8,] 4 2 9
# [9,] 5 1 5
# [10,] 5 2 10
If the row order does not matter in the final output, we can also do it with cbind(c(row(mat)), c(col(mat)), c(mat)) directly.
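Since the question mentions using i and j as model features, a data frame with named columns may be more convenient than a bare matrix. A minimal sketch, assuming the column names `i`, `j`, `v` from the question (the object name `triplets` is my own):

```r
mat <- matrix(1:10, ncol = 2)

# One row per cell, in column-major order; columns are named for modelling
triplets <- data.frame(i = c(row(mat)), j = c(col(mat)), v = c(mat))
head(triplets, 3)
```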
A benchmark will be helpful when talking about efficiency:
library(microbenchmark)
bmf <- function(mat, ...){
microbenchmark(
a = {x <- t(mat);cbind(c(col(x)), c(row(x)), c(x))},
a2 = {cbind(c(row(mat)), c(col(mat)), c(mat))},
b = {cbind(which(mat > 0, arr.ind = TRUE), val = c(mat))},
c = {cbind(expand.grid(seq(nrow(mat)), seq(ncol(mat))), as.vector(mat))},
...)
}
mat <- matrix(seq_len(10*10), 10, 10)
bmf(mat, times = 10)
# Unit: microseconds
# expr min lq mean median uq max neval
# a 7.985 9.239 18.2556 15.0415 22.756 47.065 10
# a2 4.310 4.681 5.5257 5.2405 5.755 9.099 10
# b 17.032 21.672 35.8950 28.7505 59.170 68.436 10
# c 216.101 228.736 267.7217 243.9465 288.455 380.096 10
mat <- matrix(seq_len(1000*1000), 1000, 1000)
bmf(mat, times = 10)
# Unit: milliseconds
# expr min lq mean median uq max neval
# a 17.70805 20.51167 36.73432 21.79357 24.56775 111.6796 10
# a2 14.61793 20.95486 37.70526 25.58968 30.91322 98.44344 10
# b 41.74630 45.49698 76.61307 47.86678 122.90142 178.8363 10
# c 14.40912 17.84025 25.39672 19.29968 20.12222 85.2515 10
The simplest way is to use which() with arr.ind = TRUE, which does exactly what you want. The catch is that it expects a logical input, so we need a condition that is TRUE for every element. In this case all values are greater than 0, so we can do:
# as.vector suggested by @snoram and verified by @mt1022 to be faster
cbind(which(mat > 0, arr.ind = TRUE), val = as.vector(mat))
# row col val
# [1,] 1 1 1
# [2,] 2 1 2
# [3,] 3 1 3
# [4,] 4 1 4
# [5,] 5 1 5
# [6,] 1 2 6
# [7,] 2 2 7
# [8,] 3 2 8
# [9,] 4 2 9
#[10,] 5 2 10
If you cannot find a condition that is TRUE for every value, you can create a new matrix with the same dimensions as mat and all values set to TRUE using relist:
cbind(which(relist(TRUE, mat), arr.ind = TRUE), value = as.vector(mat))
# row col value
# [1,] 1 1 1
# [2,] 2 1 2
# [3,] 3 1 3
# [4,] 4 1 4
# [5,] 5 1 5
# [6,] 1 2 6
# [7,] 2 2 7
# [8,] 3 2 8
# [9,] 4 2 9
#[10,] 5 2 10
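Another way to get an all-TRUE matrix of the right shape, sketched here as an assumption-free base R alternative (the names `all_true` and `trip` are my own), is to build it directly from the dimensions:

```r
mat <- matrix(1:10, ncol = 2)

# Logical array with the same dimensions as mat, every entry TRUE
all_true <- array(TRUE, dim = dim(mat))
trip <- cbind(which(all_true, arr.ind = TRUE), value = as.vector(mat))
trip
```

This does not depend on the values in mat at all, so it also works when the matrix contains zeros, negatives or NAs.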
Just for fun, here is an option using the Matrix package.
mat <- matrix(1:10,ncol=2)
#create sparse matrix
library(Matrix)
M <- Matrix(mat, sparse = TRUE)
#turn into triplet representation
M <- as(M, "TsparseMatrix")
#indices are zero-based within Matrix package
m <- cbind(M@i + 1, M@j + 1, M@x) #do you really need a matrix as output?
m[order(m[,1]),] #probably you don't need this step
# [,1] [,2] [,3]
# [1,] 1 1 1
# [2,] 1 2 6
# [3,] 2 1 2
# [4,] 2 2 7
# [5,] 3 1 3
# [6,] 3 2 8
# [7,] 4 1 4
# [8,] 4 2 9
# [9,] 5 1 5
#[10,] 5 2 10
I have two matrices m1 and m2.
m1 <- matrix(1:16, ncol = 4)
m2 <- matrix(16:1, ncol = 4)
# > m1
# [,1] [,2] [,3] [,4]
# [1,] 1 5 9 13
# [2,] 2 6 10 14
# [3,] 3 7 11 15
# [4,] 4 8 12 16
# > m2
# [,1] [,2] [,3] [,4]
# [1,] 16 12 8 4
# [2,] 15 11 7 3
# [3,] 14 10 6 2
# [4,] 13 9 5 1
I want to find the minimum between the two matrices for each cell within a moving kernel of 3x3. The outer margins should be ignored, i.e. they can be filled with NAs and the min function should then have na.rm = TRUE. The result should look like this:
# > m3
# [,1] [,2] [,3] [,4]
# [1,] 1 1 3 3
# [2,] 1 1 2 2
# [3,] 2 2 1 1
# [4,] 3 3 1 1
I have already tried a combination of pmin{base} and runmin{caTools} like this:
pmin(runmin(m1, 3, endrule = "keep"),
runmin(m2, 3, endrule = "keep"))
However, this did not work. Probably due to the fact that
"If x is a matrix than each column will be processed separately."
(from ?runmin)
Is there any package that performs such operations, or is it possible with apply?
Here is a base R approach:
m = pmin(m1, m2)
grid = expand.grid(seq(nrow(m)), seq(ncol(m)))
x = apply(grid, 1, function(u) {
min(m[max(1,u[1]-1):min(nrow(m), u[1]+1), max(1,u[2]-1):min(ncol(m), u[2]+1)])
})
dim(x) = dim(m)
#> x
# [,1] [,2] [,3] [,4]
#[1,] 1 1 3 3
#[2,] 1 1 2 2
#[3,] 2 2 1 1
#[4,] 3 3 1 1
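If you need other kernel sizes, the same idea can be wrapped in a function. This is only a sketch: `moving_min` and its half-width argument `k` are hypothetical names, and it uses plain loops rather than anything optimised. The window is clipped at the borders, which has the same effect as NA padding with na.rm = TRUE:

```r
# Hypothetical helper: minimum over a (2k+1) x (2k+1) window, clipped at the edges
moving_min <- function(m, k = 1L) {
  out <- m
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      rows <- max(1L, i - k):min(nrow(m), i + k)
      cols <- max(1L, j - k):min(ncol(m), j + k)
      out[i, j] <- min(m[rows, cols])
    }
  }
  out
}

m1 <- matrix(1:16, ncol = 4)
m2 <- matrix(16:1, ncol = 4)
m3 <- moving_min(pmin(m1, m2), k = 1L)   # k = 1 gives the 3x3 kernel
m3
```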
I want to build a Hankel matrix in R using only the matrix(), seq() and rep() functions. So far I have tried the following:
#Do this exercise with other packages, need to rework
install.packages("matrixcalc")
library(matrixcalc)
E1 <- hankel.matrix( 5, seq( 1, 9 ) )
print(E1)
#Use matrix() only, not efficient
E2 <- matrix(c(1,2,3,4,5,2,3,4,5,6,3,4,5,6,7,4,5,6,7,8,5,6,7,8,9), ncol=5)
print(E2)
#Use seq(), but this did not work
E3 <- matrix(c(seq(1:5),seq(2:6),seq(3:7),seq(4:8),seq(5:9)), ncol=5)
print(E3)
E1 uses a library to build the Hankel matrix. In E2 I entered the numbers manually, but that takes a lot of time for a bigger matrix. I tried to use seq(), but it did not work; it produces this:
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 2 2 2 2
[3,] 3 3 3 3 3
[4,] 4 4 4 4 4
[5,] 5 5 5 5 5
I am still very new with R so every idea is welcome.
You can do this :
matrix(rep(1:5,5)+rep(0:4,each=5),ncol=5)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 2 3 4 5
# [2,] 2 3 4 5 6
# [3,] 3 4 5 6 7
# [4,] 4 5 6 7 8
# [5,] 5 6 7 8 9
Or more elegant but using outer:
outer(0:4,1:5,'+')
EDIT :
the rep solution works like this:
12345 12345 12345 ... (rep with times: repeat the whole vector n times)
+ 00000 11111 22222 ... (rep with each: repeat each element n times)
= 12345 23456 34567 ...
outer can be tricky at first; maybe this answer here can help you understand it and debug it in general.
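The rep approach generalises to any size n while still using only matrix(), seq() and rep(). A minimal sketch (the function name `hankel_seq` is my own):

```r
# Hypothetical helper: n x n Hankel matrix of the sequence 1, 2, ..., 2n - 1
hankel_seq <- function(n) {
  matrix(rep(seq(1, n), n) + rep(seq(0, n - 1), each = n), ncol = n)
}

hankel_seq(5)
```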
I'm reversing the order of the two different solutions so that the qualifying one is at the top:
A general function (meaning one that doesn't depend on the values being sequential) that only uses a couple of extra functions (like c() and "[") to do the work:
N <- c(9L, 7L, 3L, 2L, 1L, 8L, 4L, 5L, 6L, 10L)
hankel2 <- function(N, n){stopifnot(length(N)==2*n);
matrix( rep(N,n)[c(rep(TRUE,n),rep(FALSE,n+1))], n) }
hankel2(N,5)
[,1] [,2] [,3] [,4] [,5]
[1,] 9 7 3 2 1
[2,] 7 3 2 1 8
[3,] 3 2 1 8 4
[4,] 2 1 8 4 5
[5,] 1 8 4 5 6
The trick with that first (of three) efforts was to depend on argument recycling of logical vectors when used inside the "[" function. It creates a gap of n+1 items after choosing n items by indexing with FALSE ( which has the effect of omitting items.)
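If the recycling of logical indices is unfamiliar, here is a tiny standalone illustration (the names `x` and `picked` are just for this example). The logical vector is shorter than x, so R repeats it until it covers every element:

```r
x <- 1:12

# Pattern keep, keep, drop is recycled four times across the 12 elements
picked <- x[c(TRUE, TRUE, FALSE)]
picked
```

In hankel2 the pattern is n TRUEs followed by n+1 FALSEs, so each successive run of kept elements starts one position later in N.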
Embed is a cute little function that has a fairly opaque help file but occasionally delivers very compact code:
> x <- 1:10
> embed (x, 5)[1:5, 5:1]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 2 3 4 5 6
[3,] 3 4 5 6 7
[4,] 4 5 6 7 8
[5,] 5 6 7 8 9
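To see why the column reversal (`5:1`) is needed, it helps to look at what embed returns on a small input. A quick sketch (the name `e` is just for this example): each row is a sliding window, but with the most recent value first.

```r
e <- embed(1:5, 2)
e          # each row holds a window in reverse order: the first row is 2 1
e[, 2:1]   # reversing the columns puts each window in increasing order
```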
You could make a function:
> hankel <- function( n ) embed(1:(2*n),n)[1:n, n:1]
> hankel(5)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 2 3 4 5 6
[3,] 3 4 5 6 7
[4,] 4 5 6 7 8
[5,] 5 6 7 8 9
(Admittedly not playing by the specifications although I wondered if any of the solutions so far would stand up to a vector that wasn't sequential. This one does:)
> hankel5 <- function( n ) embed(sample(1:10,10),5)[1:n, n:1]
> hankel5(5)
[,1] [,2] [,3] [,4] [,5]
[1,] 3 5 7 9 4
[2,] 5 7 9 4 10
[3,] 7 9 4 10 1
[4,] 9 4 10 1 8
[5,] 4 10 1 8 2
So this is the other general function:
hankel <- function( N, n ) {stopifnot(length(N) == 2*n); embed(N,n)[1:n, n:1]}
I am trying to create a matrix by drawing random block rows from another matrix. I have managed to do so with a loop.
set.seed(1)
a_matrix <- matrix(1:10,10,5) # the matrix with original sample
b_matrix <- matrix(NA,10, 5) # a matrix to store the bootstrap sample
S2<- seq(from =1 , to = 10, by =2) #[1] 1 3 5 7 9
m <- 2 # block size of m
for (r in S2){ start_point<-sample(1:(nrow(a_matrix)-1), 1, replace=T)
#randomly choose a number 1 to length of a_matrix -1
b_block <- a_matrix[start_point:(start_point+(m-1)), 1:ncol(a_matrix)]
# randomly select blocks from matrix a
b_matrix[r,]<-as.matrix((b_block)[1,])
b_matrix[(r+1),]<-as.matrix((b_block)[2,]) # put the blocks into matrix b
}
b_matrix
#we now have a b_matrix that is made of random blocks (size m=2)
#of the original a_matrix
The loop method works, but it is clearly not very efficient, and it cannot be extended to other block sizes (e.g. a block size of 3). What is a cleaner and more extensible approach? Thanks in advance.
Here I tried to clean it up a bit and generalize the use of m:
random_block_sample <- function(a_matrix, m = 2L) {
N <- nrow(a_matrix)
stopifnot(m <= N)
n <- ceiling(N / m)
s <- sample(N - m + 1L, n, TRUE) # start_point
i <- unlist(lapply(s, seq, length.out = m))
b_matrix <- a_matrix[i, , drop = FALSE]
head(b_matrix, N)
}
set.seed(1L)
random_block_sample(a_matrix, m = 2L)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 3 3 3 3 3
# [2,] 4 4 4 4 4
# [3,] 4 4 4 4 4
# [4,] 5 5 5 5 5
# [5,] 6 6 6 6 6
# [6,] 7 7 7 7 7
# [7,] 9 9 9 9 9
# [8,] 10 10 10 10 10
# [9,] 2 2 2 2 2
# [10,] 3 3 3 3 3
set.seed(1L)
random_block_sample(a_matrix, m = 5L)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 2 2 2 2 2
# [2,] 3 3 3 3 3
# [3,] 4 4 4 4 4
# [4,] 5 5 5 5 5
# [5,] 6 6 6 6 6
# [6,] 3 3 3 3 3
# [7,] 4 4 4 4 4
# [8,] 5 5 5 5 5
# [9,] 6 6 6 6 6
# [10,] 7 7 7 7 7
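The key step is expanding each sampled start point into a run of m consecutive row indices. A small standalone sketch with made-up start points (the values in `starts` are only for illustration, not sampled):

```r
starts <- c(3L, 7L)   # hypothetical start points
m <- 2L               # block size

# Each start s becomes s, s + 1, ..., s + m - 1
idx <- unlist(lapply(starts, seq, length.out = m))
idx
```

Indexing the original matrix with this vector pulls out whole blocks in one go, which is why no loop over rows is needed.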
In R, let M be the matrix
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 3 3
[3,] 2 4 5
[4,] 6 7 8
I would like to select the submatrix m
[,1] [,2] [,3]
[1,] 1 3 3
[2,] 2 4 5
[3,] 6 7 8
using unique on M[,1], keeping, for each duplicated value, the row with the maximal value in the second column of M.
In the end, the algorithm should keep row [2,] from the set {[1,], [2,]}. Unfortunately, unique() returns a vector of the actual values after eliminating duplicates, not the row numbers.
Is there a way to get the answer without the plyr package?
Thanks a lot,
Avitus
Here's how:
is.first.max <- function(x) seq_along(x) == which.max(x)
M[as.logical(ave(M[, 2], M[, 1], FUN = is.first.max)), ]
# [,1] [,2] [,3]
# [1,] 1 3 3
# [2,] 2 4 5
# [3,] 6 7 8
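To see what ave is doing here, it can help to look at the intermediate vector. A minimal sketch using the matrix from the question (the name `flags` is my own): ave applies the function within each group of M[, 1] and returns the results in the original row order, with the logical values coerced to 0/1 because the input column is numeric.

```r
M <- matrix(c(1, 1, 2, 6,  2, 3, 4, 7,  3, 3, 5, 8), 4)
is.first.max <- function(x) seq_along(x) == which.max(x)

# One flag per row: 1 where the row holds the first maximum of its group
flags <- ave(M[, 2], M[, 1], FUN = is.first.max)
flags
```

That coercion is why the answer wraps the result in as.logical() before using it as a row index.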
You're looking for duplicated.
m <- as.matrix(read.table(text="1 2 3
1 3 3
2 4 5
6 7 8"))
m <- m[order(m[,2], decreasing=TRUE), ]
m[!duplicated(m[,1]),]
# V1 V2 V3
# [1,] 6 7 8
# [2,] 2 4 5
# [3,] 1 3 3
Not the most efficient:
M <- matrix(c(1,1,2,6,2,3,4,7,3,3,5,8),4)
t(sapply(unique(M[,1]),function(i) {temp <- M[M[,1]==i,,drop=FALSE]
temp[which.max(temp[,2]),]
}))
# [,1] [,2] [,3]
#[1,] 1 3 3
#[2,] 2 4 5
#[3,] 6 7 8