I have a symmetric positive definite matrix, generated with the following code (genPositiveDefMat() is from the clusterGeneration package):
library(clusterGeneration)
set.seed(123)
m <- genPositiveDefMat(
dim = 3,
covMethod = "unifcorrmat",
rangeVar = c(0,1) )
x <- as.matrix(m$Sigma)
diag(x) <- 1
x
#Output
[,1] [,2] [,3]
[1,] 1.0000000 -0.2432303 -0.4110525
[2,] -0.2432303 1.0000000 -0.1046602
[3,] -0.4110525 -0.1046602 1.0000000
Now I want to iterate over the matrix and, in each iteration, replace one symmetric pair of off-diagonal elements with NA. For example,
Iteration 1:
x[1,2] = x[2,1] <- NA
Iteration 2:
x[1,3] = x[3,1] <- NA
and so on....
My idea was to check using a for loop
Prototype:
for( r in 1:nrow(x)
for( c in 1:ncol(x)
if x[r,c]=x[c,r]<-NA
else
x[r,c]
The issue with my code is that for row 1 and column 1 the values are equal, so it sets the element to 0 (which is wrong). Also, as soon as a value is not NA, it exits the loop.
I'd appreciate any help here.
Thanks
If you need the replacement done iteratively, you can use the indexes of the values selected by upper.tri(x)/lower.tri(x) to do the replacements pair by pair. That allows you to pass the result to a function before/after each replacement, e.g.:
mat <- x  # the matrix from the question
idx <- which(lower.tri(mat), arr.ind=TRUE)
sel <- cbind(
replace(mat, , seq_along(mat))[ idx ],
replace(mat, , seq_along(mat))[ idx[,2:1] ]
)
# [,1] [,2]
#[1,] 2 4 ##each row represents the lower/upper pair
#[2,] 3 7
#[3,] 6 8
for( i in seq_len(nrow(sel)) ) {
mat[ sel[i,] ] <- NA
print(mean(mat, na.rm=TRUE))
}
#[1] 0.2812249
#[1] 0.5581359
#[1] 1
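If each iteration should instead start again from the original matrix (rather than accumulating NAs as in the loop above), a minimal sketch could look like the following. The small matrix x and the names pairs and results are made up for illustration, and whether the question wants cumulative or one-at-a-time replacement is an assumption.

```r
# Illustrative symmetric matrix (not the one from the question)
x <- matrix(c( 1.0, -0.2, -0.4,
              -0.2,  1.0, -0.1,
              -0.4, -0.1,  1.0), nrow = 3, byrow = TRUE)

# One row per off-diagonal pair: (row, col) positions in the lower triangle
pairs <- which(lower.tri(x), arr.ind = TRUE)

results <- vector("list", nrow(pairs))
for (k in seq_len(nrow(pairs))) {
  r  <- pairs[k, 1]
  cc <- pairs[k, 2]
  y <- x            # fresh copy, so only one pair is NA at a time
  y[r, cc] <- NA    # lower-triangle element
  y[cc, r] <- NA    # its mirror image in the upper triangle
  results[[k]] <- y
}
```

Because only lower.tri() positions are visited, the diagonal is never touched and each pair is handled exactly once.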
I have a 4x100 matrix where I would like to multiply column 1 by row 1 of its transpose, and so on, and store the resulting matrices somewhere so that I can take their sum later on.
I really don't know where to start, because each column-row multiplication gives a 4x4 matrix, so I cannot store the results in a single matrix.
data:
mm num[1:4,1:100]
mm_t num[1:100,1:4]
I'm thinking of creating a list in some way
list1=list()
for(i in 1:100){
list1[i] <- mm[,i]%*%mm_t[i,]
}
but I need some more indices, I think, because this just leaves me with a single number in each entry.
First, your description of the data is not clear. Second, are you trying to multiply each value by itself, or to do matrix multiplication?
We create a 4x100 matrix and its transpose:
mm <- matrix(1:400, nrow = 4, ncol = 100)
mm.t <- t(mm)
Then we can do the matrix multiplication (which is what you did; by the definition of matrix multiplication you get a 4 x 4 matrix, see https://www.wikiwand.com/en/Matrix_multiplication).
If we want to multiply each element by itself (so mm[1,1] by mm[1,1]), then:
mm * mm
This will result in 4x100 matrix where each value is the square of the original value.
If we want the matrix multiplication of each column with itself, then:
sapply(1:100, function(x) {
mm[, x] %*% mm[, x]
})
This results in 100 values: each one is the dot product of a column (a length-4 vector) with itself.
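As a small cross-check, the same 100 column-wise dot products can be obtained without an explicit loop. This is only a sketch; the names dots_loop and dots_vec are made up for illustration.

```r
mm <- matrix(1:400, nrow = 4, ncol = 100)

# Loop version, as above: dot product of each column with itself
dots_loop <- sapply(seq_len(ncol(mm)), function(j) mm[, j] %*% mm[, j])

# Vectorised version: square every element, then sum within each column
dots_vec <- colSums(mm * mm)
```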
Let's start with some sample data. Please get in the habit of including things like this in your question:
nr = 4
nc = 100
set.seed(47)
mm = matrix(runif(nr * nc), nrow = nr)
Here's a working answer, very similar to your attempt:
result = list()
for (i in 1:ncol(mm)) result[[i]] = mm[, i] %*% t(mm[, i])
result[1:2]
# [[1]]
# [,1] [,2] [,3] [,4]
# [1,] 0.9544547 0.3653018 0.7439585 0.8035430
# [2,] 0.3653018 0.1398132 0.2847378 0.3075428
# [3,] 0.7439585 0.2847378 0.5798853 0.6263290
# [4,] 0.8035430 0.3075428 0.6263290 0.6764924
#
# [[2]]
# [,1] [,2] [,3] [,4]
# [1,] 0.3289532 0.3965557 0.2231443 0.2689613
# [2,] 0.3965557 0.4780511 0.2690022 0.3242351
# [3,] 0.2231443 0.2690022 0.1513691 0.1824490
# [4,] 0.2689613 0.3242351 0.1824490 0.2199103
As to why yours didn't work, we can experiment and see that indeed we get a number rather than a matrix. The reason is that when you subset a single row or column of a matrix, the dimensions are "dropped" and it is coerced to a plain vector. And when you matrix multiply two vectors, you get their dot product.
mmt = t(mm)
mm[, 1] %*% mmt[1, ]
# [,1]
# [1,] 2.350646
dim(mm[, 1])
# NULL
dim(mmt[1, ])
# NULL
We can avoid this by specifying drop = FALSE when subsetting:
dim(mmt[1, , drop = FALSE])
# [1] 1 4
So a slight modification of your attempt, just adding drop = FALSE, makes it work:
res2 = list()
for (i in 1:ncol(mm)) res2[[i]] = mm[, i] %*% mmt[i, , drop = FALSE]
identical(result, res2)
# [1] TRUE
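Since the question mentions summing the 4x4 matrices later on, here is a hedged sketch of that step. It relies on the fact that the sum of the outer products of the columns equals mm %*% t(mm); the names outer_list, total_loop and total_direct are made up for illustration.

```r
set.seed(47)
mm <- matrix(runif(4 * 100), nrow = 4)

# One 4x4 outer product per column, stored in a list
outer_list <- lapply(seq_len(ncol(mm)), function(i) mm[, i] %*% t(mm[, i]))

# Sum of all the 4x4 matrices in the list
total_loop <- Reduce(`+`, outer_list)

# The same sum in a single matrix product
total_direct <- mm %*% t(mm)
```

If only the sum is needed, the list can be skipped entirely; tcrossprod(mm) computes the same product.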
I have a data frame of 300x300 elements. Each of them is either -1 or +1:
[,1] [,2] [,3]
[1,] 1 -1 -1
[2,] 1 1 1
[3,] -1 -1 1
[4,] 1 1 -1
What I want is to iterate over my data frame, and multiply each value with every neighbouring value.
I.e:
For element [1,1] in my original data frame I want the product of [1,1], [1,2] and [2,1]
For element [2,2] in my original data frame I want the product of [2,2], [1,2], [2,1], [2,3] and [3,2].
I have tried to create 4 new data frames, each shifted 1 element to the right, left, up and down, respectively:
x_up <- shift(x, 1, dir='up')
x_up <- as.array(x_up)
dim(x_up) <- dims
x_down <- shift(x, 1, dir='down')
x_down <- as.array(x_down)
dim(x_down) <- dims
x_left <- shift(x, 1, dir='left')
x_left <- as.array(x_left)
dim(x_left) <- dims
x_right <- shift(x, 1, dir='right')
x_right <- as.array(x_right)
dim(x_right) <- dims
where x is my original data frame.
When I use this approach, I can see that the new data frames are not shifted correctly; several of them are identical. I checked this with identical().
Is there another approach to my problem?
Edit:
shift() is from the 'binhf' package
I think there's probably a smarter way to do this, but the standard approach would be iterating over each element and multiplying its surroundings.
Starting with:
mat <- matrix(c(1, 1, -1, 1, -1, 1, -1, 1, -1, 1, 1, -1), ncol=3)
To avoid out-of-bounds problems at the bottom and right edges, you must add a row and a column of 1's as margins (1 is neutral when multiplying; if you were summing, it would have to be 0's, for example). At the top and left edges an index of 0 selects nothing, so no padding is needed there.
mat2 <- addmargins(mat, FUN=function(x) 1)
Now you create an empty matrix to hold the output, and then iterate over the elements and multiply the neighbors.
out <- matrix(nrow=nrow(mat), ncol=ncol(mat))
for (i in 1:nrow(mat)) {
for (j in 1:ncol(mat)) {
out[i,j] <- prod(mat[i,j], mat2[i-1, j], mat2[i, j-1], mat2[i+1, j], mat2[i, j+1])
}
}
Resulting in:
> out
[,1] [,2] [,3]
[1,] -1 1 1
[2,] -1 1 -1
[3,] 1 1 1
[4,] -1 1 -1
This took less than a second for a 300x300 matrix, so it might be enough for you.
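The same neighbour product can also be sketched without the double loop, by padding the matrix with 1's on all four sides and multiplying shifted views of the padded matrix. The function name neighbour_product and the helper names are made up for illustration; this is one possible variant, not the approach used above.

```r
neighbour_product <- function(m) {
  nr <- nrow(m)
  nc <- ncol(m)
  # Pad with 1's on every side; 1 is neutral under multiplication
  p <- matrix(1, nrow = nr + 2, ncol = nc + 2)
  p[2:(nr + 1), 2:(nc + 1)] <- m
  rows <- 2:(nr + 1)
  cols <- 2:(nc + 1)
  # centre * up * down * left * right, all element-wise
  p[rows, cols] * p[rows - 1, cols] * p[rows + 1, cols] *
    p[rows, cols - 1] * p[rows, cols + 1]
}

mat <- matrix(c(1, 1, -1, 1, -1, 1, -1, 1, -1, 1, 1, -1), ncol = 3)
out_vec <- neighbour_product(mat)
```

For this mat, out_vec should agree with the out matrix printed above.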
This should do the trick:
ind <- which(x==x, arr.ind=TRUE) # index matrix
# find distances (need distances of 1 or 0)
dist.mat <- as.matrix(dist(ind))
inds2mult <- apply(dist.mat, 1, function(ii) which(ii <= 1))
# get product of each list element in inds2mult
# and reform into appropriate matrix
matrix(
sapply(inds2mult, function(ii) prod(unlist(x)[ii])),
ncol=ncol(x))
# [,1] [,2] [,3]
#[1,] -1 1 1
#[2,] -1 1 -1
#[3,] 1 1 1
#[4,] -1 1 -1
To get around memory issues with large matrices in the call to dist, you can try the fields.rdist.near function (with a delta value of 1) from the fields package:
x <- matrix(rep(-1, 300*300), ncol=300)
ind <- which(x==x, arr.ind=TRUE) # index matrix
library(fields)
ind.list <- fields.rdist.near(ind, delta=1) # took my computer ~ 15 - 20 seconds
inds2mult <- tapply(ind.list$ind[,2], ind.list$ind[,1], list)
matrix(
sapply(inds2mult, function(ii) prod(unlist(x)[ii])),
ncol=ncol(x))
The delta argument from the fields.rdist.near help page:
Threshhold distance. All pairs of points that separated by more
than delta in distance are ignored.
I have a symmetric matrix mat:
A B C
A 1 . .
B . 1 .
C . . 1
And I want to find its two largest elements. Since it's a symmetric matrix, I thought of using upper.tri like so:
mat.tri<-upper.tri(mat) # convert to upper tri
mat.ord<-order(mat.tri,na.last=TRUE,decreasing=TRUE)[1:2] # order by largest
a.ind<-which(mat%in%mat.tri[mat.ord]) # get absolute indices
r.ind<-arrayInd(a.ind,dim(mat)) # get relative indices
# get row/colnames using these indices
So the above is a very roundabout way of doing things, and even then the output has 'duplicate' rows, in that they are just transposes of each other.
Anyone got a more intuitive way of doing this?
Thanks.
Liberally borrowing from the excellent ideas of @SimonO'Hanlon and @lukeA, you can construct a two-liner function to do what you want. I use:
arrayInd() to return the array index
order() to order the upper triangular elements
and the additional trick of setting the lower triangular matrix to NA, using m[lower.tri(m)] <- NA
Try this:
whichArrayMax <- function(m, n=2){
m[lower.tri(m)] <- NA
arrayInd(order(m, decreasing=TRUE)[seq(n)], .dim=dim(m))
}
mat <- matrix( c(1,2,3,2,1,5,3,5,1) , 3 , byrow = TRUE )
mat
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 1 5
[3,] 3 5 1
whichArrayMax(mat, 2)
[,1] [,2]
[1,] 2 3
[2,] 1 3
To find only the single largest element:
arrayInd(which.max(mat), .dim=dim(mat))
which is basically the same as which( mat == max(mat) , arr.ind = TRUE )[1,] from @SimonO'Hanlon, but more efficient.
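The question also asks for the row and column names at those positions. A minimal sketch of that last step, assuming the matrix carries dimnames, could look like this; the function name top_pairs and the include_diag switch are made up for illustration.

```r
top_pairs <- function(m, n = 2, include_diag = TRUE) {
  # Blank out the lower triangle (and optionally the diagonal)
  m[lower.tri(m, diag = !include_diag)] <- NA
  # Positions of the n largest remaining values (NAs sort last)
  idx <- arrayInd(order(m, decreasing = TRUE)[seq_len(n)], .dim = dim(m))
  data.frame(row   = rownames(m)[idx[, 1]],
             col   = colnames(m)[idx[, 2]],
             value = m[idx],
             stringsAsFactors = FALSE)
}

mat <- matrix(c(1, 2, 3, 2, 1, 5, 3, 5, 1), 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
res <- top_pairs(mat, 2)
```

For something like a correlation matrix, where the diagonal is always 1, include_diag = FALSE keeps the diagonal out of the ranking.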
For example: I have a list of matrices, and I would like to evaluate their differences, sort of a 3-D diff. So if I have:
m1 <- matrix(1:4, ncol=2)
m2 <- matrix(5:8, ncol=2)
m3 <- matrix(9:12, ncol=2)
mat.list <- list(m1,m2,m3)
I want to obtain
mat.diff <- list(m2-m1, m3-m2)
The solution I found is the following:
mat.diff <- mapply(function (A,B) B-A, mat.list[-length(mat.list)], mat.list[-1])
Is there a nicer/built-in way to do this?
You can do this with just lapply or other ways of looping:
mat.diff <- lapply( tail( seq_along(mat.list), -1 ),
function(i) mat.list[[i]] - mat.list[[ i-1 ]] )
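Another base R variant, sketched here for comparison, is Map(), which is mapply() with SIMPLIFY = FALSE and therefore always returns a list. By contrast, the mapply() call in the question uses the default simplification, which would be expected to collapse the equal-sized results into a single matrix.

```r
m1 <- matrix(1:4, ncol = 2)
m2 <- matrix(5:8, ncol = 2)
m3 <- matrix(9:12, ncol = 2)
mat.list <- list(m1, m2, m3)

# Pair each matrix (from the 2nd on) with its predecessor and subtract
mat.diff <- Map(`-`, mat.list[-1], mat.list[-length(mat.list)])
```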
You can use combn to generate the pairs of list indexes and apply a function to each combination.
combn(1:length(l),2,FUN=function(x)
if(diff(x) == 1) ## apply just for consecutive index
l[[x[2]]]-l[[x[1]]],
simplify = FALSE) ## to get a list
Using @Arun's data (with l being the list of matrices), I get the following; non-consecutive pairs give NULL:
[[1]]
[,1] [,2]
[1,] 4 4
[2,] 4 4
[[2]]
NULL
[[3]]
[,1] [,2]
[1,] 4 4
[2,] 4 4
I've got a matrix (mat1), say 100 rows and 100 columns; I want to create another matrix where every row is the same as the 1st row in mat1 (except that I want to keep the 1st col as the original values)
I've managed to do this using a loop:
mat2 <- mat1
for(i in 1:nrow(mat1))
{
mat2[i,2:ncol(mat2)] <- mat1[1,2:ncol(mat1)]
}
This works and produces the result I expect; however, I'd have thought there should be a way to do it without a loop. I've tried:
mat2 <- mat1
mat2[c(2:100),2:ncol(mat2)] <- mat1[1,2:ncol(mat1)]
Can someone point out my error?!
Thanks,
Chris
The problem is the way R fills matrices, by columns. Here is a simple example that illustrates this:
mat1 <- matrix(1:9, ncol = 3)
mat2 <- matrix(1:9, ncol = 3)
mat2[-1, -1] <- mat1[1, -1]
> mat2
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 4 4
[3,] 3 7 7
mat1[1, -1] is the vector 4,7, and you can see that R has used it to fill the selected part of mat2 column-wise. You wanted a row-wise operation.
One solution is to replicate the replacement vector as many times as is required:
> mat2[-1, -1] <- rep(mat1[1, -1], each = nrow(mat1)-1)
> mat2
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 4 7
[3,] 3 4 7
This works because the rep() call replicates each value in the vector when we use the "each" argument, instead of replicating (repeating) the vector:
> rep(mat1[1, -1], each = nrow(mat1)-1)
[1] 4 4 7 7
The default behaviour would also give the wrong answer:
> rep(mat1[1, -1], nrow(mat1)-1)
[1] 4 7 4 7
In part, the problem you are seeing is also the way R extends arguments to the appropriate length for the replacement. R actually, and silently, extended the replacement vector exactly in the way rep(mat1[1, -1], nrow(mat1)-1) does, which when coupled with the fill-by-column principle gave the behaviour you saw.
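Putting the rep(..., each = ) idea into a small helper, a hedged sketch could look like this. The name copy_first_row is made up for illustration, and the helper overwrites every row (including the first, which is unchanged by it) in all columns except the first.

```r
copy_first_row <- function(m) {
  out <- m
  if (ncol(m) > 1) {
    # Repeat each value of the first row nrow(m) times, so that
    # column-wise filling places the same value down each column
    out[, -1] <- rep(m[1, -1], each = nrow(m))
  }
  out
}

mat1 <- matrix(1:9, ncol = 3)
mat2 <- copy_first_row(mat1)
```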
Try
mat2[c(2:nrow(mat2)), 2:ncol(mat2)] <- mat1[rep.int(1,nrow(mat1)-1),2:ncol(mat1)]
Another option...
n = 5
mat1 = matrix(sample(n^2, n^2), n, n)
# use matrix with byrow to copy 1st row n times
mat2 = matrix(rep(mat1[1, ], n), n, n, byrow = TRUE)
# copy 1st column
mat2[ , 1] = mat1[ , 1]
mat1
mat2