I have a vector of data such as the following:
data <- c(1, 3, 4, 7)
I would like to apply a function to every pair of elements in the vector and get back an upper-triangular matrix, as the following loop does:
mat <- matrix(data = NA, nrow = length(data), ncol = length(data))
for (i in 1:(length(data) - 1)) {
for (j in (i+1):length(data)) {
mat[i, j] <- "-"(data[j], data[i])
}
}
But I would like to do so with an apply type function instead of a for loop.
I am unsure how to do so. Any suggestions?
Thanks!
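We can use combn() to fill the lower triangle in column order and then transpose (mat is created under "data" below):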
mat[lower.tri(mat, diag=FALSE)] <- combn(data, 2,
FUN= function(x) x[2]-x[1])
t(mat)
# [,1] [,2] [,3] [,4]
#[1,] NA 2 3 6
#[2,] NA NA 1 4
#[3,] NA NA NA 3
#[4,] NA NA NA NA
data
mat <- matrix(data = NA, nrow = length(data), ncol = length(data))
Using outer:
t(outer(data,data,"-"))*
NA^lower.tri(matrix(0,length(data),length(data)),diag=TRUE)
# [,1] [,2] [,3] [,4]
#[1,] NA 2 3 6
#[2,] NA NA 1 4
#[3,] NA NA NA 3
#[4,] NA NA NA NA
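The outer() idea can be wrapped so that any vectorized two-argument function can be plugged in. This is only a sketch: the name pairwise_upper is made up for illustration, and it assumes f is vectorized (as "-" is).

```r
# Hypothetical helper, not part of the answers above.
# Returns a matrix whose [i, j] entry (for i < j) is f(x[j], x[i]).
pairwise_upper <- function(x, f) {
  m <- t(outer(x, x, f))      # outer gives f(x[i], x[j]); t() swaps the roles
  m[!upper.tri(m)] <- NA      # blank out the diagonal and the lower triangle
  m
}

data <- c(1, 3, 4, 7)
res <- pairwise_upper(data, "-")
res
```

For a function that is not vectorized, wrapping it in Vectorize() before passing it in is one option.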
I'd like to find the values shared between two matrices at the same position, and return their locations (row-col) in a matrix.
set.seed(123)
m <- matrix(sample(4), 2, 2, byrow = T)
# m
# [,1] [,2]
# [1,] 2 3
# [2,] 1 4
m2 <- matrix(sample(4), 2, 2, byrow = F)
# m2
# [,1] [,2]
# [1,] 4 2
# [2,] 1 3
Expected output:
# [,1] [,2]
# [1,] NA NA
# [2,] "2-1" NA
Bonus if this could be generalized to non-identical matrices (different dim).
Equal sizes
One option would be
replace(m * NA, m == m2, paste(row(m), col(m), sep = "-")[m == m2])
# [,1] [,2]
# [1,] NA NA
# [2,] "2-1" NA
Different sizes
I believe that in this case, regardless of the approach, you will first need to trim both matrices to be of equal size.
set.seed(12)
(m <- matrix(sample(6), 2, 3, byrow = TRUE))
# [,1] [,2] [,3]
# [1,] 1 5 4
# [2,] 6 3 2
(m2 <- matrix(sample(6), 3, 2, byrow = FALSE))
# [,1] [,2]
# [1,] 2 5
# [2,] 4 3
# [3,] 1 6
out <- matrix(NA, max(nrow(m), nrow(m2)), max(ncol(m), ncol(m2)))
mrow <- min(nrow(m), nrow(m2))
mcol <- min(ncol(m), ncol(m2))
mTrim <- m[1:mrow, 1:mcol]
m2Trim <- m2[1:mrow, 1:mcol]
out[1:mrow, 1:mcol][mTrim == m2Trim] <- paste(row(mTrim), col(mTrim), sep = "-")[mTrim == m2Trim]
out
# [,1] [,2] [,3]
# [1,] NA "1-2" NA
# [2,] NA "2-2" NA
# [3,] NA NA NA
The function below gives the desired output, but only works when dim() is equal for the two matrices.
To generalize it to matrices of different sizes, one solution would be to subset the bigger matrix first.
The key is which(mat1==mat2, arr.ind=T) to get row-col index:
which(m==m2, arr.ind=T)
row col
[1,] 2 1
Inside a function:
find_in_matr <- function(mat1, mat2) {
if (!all(dim(mat1) == dim(mat2))) {
stop("mat1 and mat2 need to have the same dim()!")
}
m <- mat1
m[] <- NA # copy mat1 dim, and empty values
loc <- which(mat1==mat2, arr.ind=T) # find positions (both indxs)
m[loc] <- mapply(paste, sep="-", loc[, 1], loc[, 2]) # paste indxs
return(m)
}
Example:
set.seed(123)
m <- matrix(sample(4), 2, 2, byrow = T)
# m
# [,1] [,2]
# [1,] 2 3
# [2,] 1 4
m2 <- matrix(sample(4), 2, 2, byrow = F)
# m2
# [,1] [,2]
# [1,] 4 2
# [2,] 1 3
find_in_matr(m, m2)
# [,1] [,2]
# [1,] NA NA
# [2,] "2-1" NA
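For matrices of different sizes, the "subset the bigger matrix first" idea could look roughly like this. It is a sketch only: find_in_overlap is a hypothetical name, and it assumes that "shared" means equal values at the same row and column within the overlapping top-left block.

```r
# Hypothetical helper, not part of the answer above.
find_in_overlap <- function(mat1, mat2) {
  nr <- min(nrow(mat1), nrow(mat2))
  nc <- min(ncol(mat1), ncol(mat2))
  # Result has the size of the larger extent in each direction.
  out <- matrix(NA_character_,
                max(nrow(mat1), nrow(mat2)),
                max(ncol(mat1), ncol(mat2)))
  # Trim both inputs to the common block; drop = FALSE keeps them matrices.
  a <- mat1[seq_len(nr), seq_len(nc), drop = FALSE]
  b <- mat2[seq_len(nr), seq_len(nc), drop = FALSE]
  loc <- which(a == b, arr.ind = TRUE)              # row/col of each match
  out[loc] <- paste(loc[, 1], loc[, 2], sep = "-")  # label each match
  out
}

a <- matrix(c(1, 6, 5, 3, 4, 2), 2, 3)  # 2 x 3
b <- matrix(c(2, 4, 1, 5, 3, 6), 3, 2)  # 3 x 2
res <- find_in_overlap(a, b)
res
```

Indexing with the two-column matrix from which(..., arr.ind = TRUE) puts each label at its own position, so no recycling is involved.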
Silly piped version
library(magrittr)
(m == m2) %>%
`[<-`(!., NA) %>%
`[<-`((w <- which(., arr = T)), apply(w, 1, paste, collapse = '-'))
# [,1] [,2]
# [1,] NA NA
# [2,] "2-1" NA
An ifelse() version. The labels must be built for every position, so that x lines up with m; a shorter vector of labels would be recycled by ifelse() and end up at the wrong positions:
x <- paste(row(m), col(m), sep = "-")
ifelse(m != m2, NA, x)
# [,1] [,2]
# [1,] NA NA
# [2,] "2-1" NA
This method works for any pair of matrices with equal dimensions.
e.g.
set.seed(999)
m1 <- matrix(sample(1:3, 12, replace = T), 3, 4)
m2 <- matrix(sample(1:3, 12, replace = T), 3, 4)
x <- paste(row(m1), col(m1), sep = "-")
ifelse(m1 != m2, NA, x)
# [,1] [,2] [,3] [,4]
# [1,] NA "1-2" NA "1-4"
# [2,] NA NA "2-3" NA
# [3,] "3-1" NA NA "3-4"
Starting from an n x m matrix, I would like to create a new m x m matrix that contains the correlation between every pair of columns of the starting matrix. So if my input is a 3x3 matrix, I would like to calculate the correlation of columns 1 and 2, 1 and 3, and 2 and 3, and assign the results to the destination matrix. Naively, I used two nested for loops (~O(n^2)):
for (i in 1:n) {
for (j in i+1:n) {
if (j <= n) {
tmp = cor(inMatrix[, i], inMatrix[, j])
dstMatrix[i,j] = tmp;
}
}
}
This appears to work, but I was wondering whether there is a better way to achieve it in R.
The simple cor(inMatrix) does it (whole matrix directly passed to cor()):
n <- 7
m <- 5
set.seed(123)
inMatrix <- replicate(m, sample(c(1, - 1), 1) * cumsum(runif(n)))
inMatrix
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0.7883051 -0.4566147 0.04205953 -0.7085305 -0.7954674
# [2,] 1.1972821 -1.4134481 0.36998025 -1.2525965 -0.8200811
# [3,] 2.0802995 -1.8667822 1.32448390 -1.8467385 -1.2978771
# [4,] 3.0207667 -2.5443529 2.21402322 -2.1358983 -2.0563366
# [5,] 3.0663232 -3.1169863 2.90682662 -2.2830119 -2.2727445
# [6,] 3.5944287 -3.2199110 3.54733344 -3.2460361 -2.5909256
# [7,] 4.4868478 -4.1197359 4.54160321 -4.1483352 -2.8225513
dstMatrix <- matrix(nrow = m, ncol = m)
for (i in 1:(m - 1)) {
  # Parentheses matter: i+1:m would parse as i + (1:m)
  for (j in (i+1):m) {
    dstMatrix[i, j] <- cor(inMatrix[, i], inMatrix[, j])
  }
}
dstMatrix
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA -0.9823516 0.9902370 -0.9688212 -0.9825973
# [2,] NA NA -0.9811424 0.9570599 0.9626469
# [3,] NA NA NA -0.9742235 -0.9862355
# [4,] NA NA NA NA 0.9331879
# [5,] NA NA NA NA NA
dstMatrix_2 <- cor(inMatrix)
dstMatrix_2
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1.0000000 -0.9823516 0.9902370 -0.9688212 -0.9825973
# [2,] -0.9823516 1.0000000 -0.9811424 0.9570599 0.9626469
# [3,] 0.9902370 -0.9811424 1.0000000 -0.9742235 -0.9862355
# [4,] -0.9688212 0.9570599 -0.9742235 1.0000000 0.9331879
# [5,] -0.9825973 0.9626469 -0.9862355 0.9331879 1.0000000
dstMatrix == dstMatrix_2
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA TRUE TRUE FALSE TRUE
# [2,] NA NA TRUE FALSE TRUE
# [3,] NA NA NA FALSE TRUE
# [4,] NA NA NA NA FALSE
# [5,] NA NA NA NA NA
# The differences are at the level of machine precision (about 1e-16), i.e. floating-point rounding, so all.equal() is a safer comparison than == here:
dstMatrix - dstMatrix_2
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA 0 0 -1.110223e-16 0.000000e+00
# [2,] NA NA 0 2.220446e-16 0.000000e+00
# [3,] NA NA NA -1.110223e-16 0.000000e+00
# [4,] NA NA NA NA 1.110223e-16
# [5,] NA NA NA NA NA
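To get only the upper triangle from cor() and to compare it with pairwise results without tripping over rounding, something along these lines may help. The data here are illustrative random numbers, not the inMatrix above, and the variable names are made up.

```r
set.seed(123)
x <- matrix(rnorm(7 * 5), nrow = 7, ncol = 5)  # illustrative data only

# Full correlation matrix, then blank out the diagonal and lower triangle.
upper_cor <- cor(x)
upper_cor[!upper.tri(upper_cor)] <- NA

# The same values computed pair by pair, for comparison.
idx <- combn(ncol(x), 2)                       # 2 x 10 matrix of column pairs
by_pair <- matrix(NA_real_, ncol(x), ncol(x))
by_pair[t(idx)] <- apply(idx, 2, function(p) cor(x[, p[1]], x[, p[2]]))

# Tolerant comparison: ignores differences of floating-point rounding size.
same <- isTRUE(all.equal(upper_cor, by_pair))
same
```

all.equal() uses a small numeric tolerance, which is why it is usually preferred over == for computed doubles.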
Compute the correlation coefficient for each combination of columns. combn() is used to get the pairs of column numbers.
As per @Sotos, the function can be passed directly to combn(), which avoids using apply():
cor_vals <- combn(1:col_n, 2, function(x) cor(mat1[, x[1]], mat1[, x[2]]))
# cor_vals <- apply(combn(1:col_n, 2), 2, function(x) cor(mat1[, x[1]], mat1[, x[2]]))
Assign names to the correlation values:
cor_vals <- setNames(cor_vals, combn(1:col_n, 2, paste0, collapse = ''))
cor_vals
# 12 13 23
# 0.1621491 -0.8211970 0.4299367
Data:
set.seed(1L)
row_n <- 3
col_n <- 3
mat1 <- matrix(runif(row_n * col_n, min = 0, max = 20), nrow = row_n, ncol = col_n)
I want to set to NA every element of a matrix whose value is greater than or equal to the corresponding value of a given vector, comparing down each column. For example, I can create a matrix:
set.seed(1)
zz <- matrix(data = round(10L * runif(12)), nrow = 4, ncol = 3)
which gives for zz:
[,1] [,2] [,3]
[1,] 8 5 7
[2,] 6 5 1
[3,] 5 10 3
[4,] 9 1 9
and for the comparison vector (for example):
xx <- round(10L * runif(4))
where xx is:
[1] 6 3 8 2
if I perform this operation:
apply(zz,2,function(x) x >= xx)
I get:
[,1] [,2] [,3]
[1,] TRUE FALSE TRUE
[2,] TRUE TRUE FALSE
[3,] FALSE TRUE FALSE
[4,] TRUE FALSE TRUE
What I want is an NA everywhere I have a TRUE element, and the number from the zz matrix everywhere I have a FALSE (e.g., manually ...):
NA 5 NA
NA NA 1
5 NA 3
NA 1 NA
I can cobble together some "for" loops to do what I want, but is there a vectorized way to do this?
Thanks for any tips.
You could simply do:
zz[zz>=xx] <- NA
# [,1] [,2] [,3]
#[1,] NA 5 NA
#[2,] NA NA 1
#[3,] 5 NA 3
#[4,] NA 1 NA
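The one-liner relies on R storing matrices column by column and recycling xx down each column, which lines up here because length(xx) equals nrow(zz). A small sketch that makes this explicit with sweep() (the names mask, mask_sweep and zz_na are only for illustration):

```r
zz <- matrix(c(8, 6, 5, 9, 5, 5, 10, 1, 7, 1, 3, 9), nrow = 4, ncol = 3)
xx <- c(6, 3, 8, 2)

# Implicit recycling: xx is reused for each column of zz.
mask <- zz >= xx

# Explicit version: compare row i of zz against xx[i].
mask_sweep <- sweep(zz, 1, xx, ">=")

# Work on a copy so the original zz is kept.
zz_na <- zz
zz_na[mask] <- NA
zz_na
```

If the vector were meant to be compared across rows instead (one value per column), sweep(zz, 2, ...) would be the analogous call.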
Here is one option to get the expected output. We get a logical matrix (zz >= xx), using NA^ on that returns NA for the TRUE values and 1 for the FALSE, then multiply it with original matrix 'zz' so that NA remains as such while the 1 changes to the corresponding value in 'zz'.
NA^(zz >= xx)*zz
# [,1] [,2] [,3]
#[1,] NA 5 NA
#[2,] NA NA 1
#[3,] 5 NA 3
#[4,] NA 1 NA
Or another option is ifelse
ifelse(zz >= xx, NA, zz)
data
zz <- structure(c(8, 6, 5, 9, 5, 5, 10, 1, 7, 1, 3, 9), .Dim = c(4L, 3L))
xx <- c(6, 3, 8, 2)
m <- "mData"
assign(m, matrix(data = NA, nrow = 4, ncol = 5))
Now I want to use the variable m to assign values to the mData matrix.
assign(m[1, 2], 35) does not work.
Any solution would be much appreciated.
I'm kind of ashamed to post this, but there is a way to do it. It feels so wrong because the R way would be to build a list of matrices and then operate on them by passing a function to lapply.
assign.by.char <- function(x, ...) {
eval.parent(assign(x, do.call(`[<-`, list(get(x) , ...)))) }
assign.by.char(m, 1,2,35)
[,1] [,2] [,3] [,4] [,5]
[1,] NA 35 NA NA NA
[2,] NA NA NA NA NA
[3,] NA NA NA NA NA
[4,] NA NA NA NA NA
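A variant that names the target environment explicitly, followed by the list-based approach mentioned above. Both are sketches: assign_by_name and mats are hypothetical names, and the envir argument defaulting to the caller's frame is an assumption about where the matrix lives.

```r
# Hypothetical helper: update an element of the object whose name is in `name`.
assign_by_name <- function(name, ..., envir = parent.frame()) {
  # Build the modified copy: `[<-`(object, i, j, value)
  value <- do.call(`[<-`, list(get(name, envir = envir), ...))
  # Write it back under the same name in the chosen environment.
  assign(name, value, envir = envir)
  invisible(value)
}

mData <- matrix(NA, nrow = 4, ncol = 5)
m <- "mData"
assign_by_name(m, 1, 2, 35)
mData[1, 2]

# List-based alternative: keep matrices in a named list and index by name.
mats <- list(mData = matrix(NA, nrow = 4, ncol = 5))
mats[[m]][1, 2] <- 35
mats[[m]][1, 2]
```

The list version avoids get() and assign() altogether, which tends to be easier to follow and to pass to lapply().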
If you really need to use assign(), you could do it with replace()
m <- matrix(, 3, 3)
assign("m", replace(m, cbind(1, 2), 35))
m
# [,1] [,2] [,3]
# [1,] NA 35 NA
# [2,] NA NA NA
# [3,] NA NA NA
Or you can use assign directly (a variant of @BondedDust's solution)
assign(m, `[<-`(get(m), cbind(1,2), 35))
mData
# [,1] [,2] [,3]
#[1,] NA 35 NA
#[2,] NA NA NA
#[3,] NA NA NA
Or as a function
assign.by.char <- function(x, ...){
eval.parent(assign(x, `[<-`(get(x), ...)))}
data
mData <- matrix(, 3, 3)
m <- 'mData'
I created an empty matrix with matrix(). How can I test whether a given matrix is empty? I know that is.na(matrix()) is TRUE, but for a matrix with larger dimensions it returns one value per element, so it cannot be used directly as a test.
By "empty" I mean that every element is NA or NULL.
I'm guessing that you are just looking for all. Here's a small example:
M1 <- matrix(NA, ncol = 3, nrow = 3)
# [,1] [,2] [,3]
# [1,] NA NA NA
# [2,] NA NA NA
# [3,] NA NA NA
M2 <- matrix(c(1, rep(NA, 8)), ncol = 3, nrow = 3)
M2
# [,1] [,2] [,3]
# [1,] 1 NA NA
# [2,] NA NA NA
# [3,] NA NA NA
all(is.na(M1))
# [1] TRUE
all(is.na(M2))
# [1] FALSE
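If NULL or zero-length input should also count as empty, the check can be wrapped in a small helper. This is a sketch; is_empty_matrix is a made-up name, and treating NULL and zero-length objects as empty is an assumption based on the question's wording.

```r
# Hypothetical helper: TRUE for NULL, zero-length input, or all-NA content.
is_empty_matrix <- function(x) {
  is.null(x) || length(x) == 0 || all(is.na(x))
}

M1 <- matrix(NA, ncol = 3, nrow = 3)
M2 <- matrix(c(1, rep(NA, 8)), ncol = 3, nrow = 3)

is_empty_matrix(matrix())  # 1 x 1 matrix holding NA
is_empty_matrix(M1)
is_empty_matrix(M2)
is_empty_matrix(NULL)
```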