I have a matrix of values that also contains NAs like:
> df <- matrix(rexp(200), 10)
> df[ df < 0.5 ] <- NA
> df
[,1] [,2] [,3] [,4] [,5]
[1,] 2.124043 1.6119230 NA 0.7222127 1.400924
[2,] 4.143728 NA NA 1.0343577 NA
[3,] 2.395984 0.6794447 0.8327695 1.0258656 NA
[4,] NA NA NA NA 1.421674
[5,] NA 1.0446031 0.7762776 NA NA
I would like to scramble each column in my matrix and realised that I can do so using:
> df<- df[sample(nrow(df)),]
> df
[,1] [,2] [,3] [,4] [,5]
[1,] 2.395984 0.6794447 0.8327695 1.0258656 NA
[2,] 2.124043 1.6119230 NA 0.7222127 1.400924
[3,] NA NA NA NA 1.421674
[4,] 4.143728 NA NA 1.0343577 NA
[5,] NA 1.0446031 0.7762776 NA NA
However, I would like to randomise in this way while keeping the positions of the NAs the same as before. Does anybody know of an easy way to do so?
Thanks a lot!
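Wrap it in apply() to shuffle within each column, touching only the non-NA values:
apply(X = df,
      MARGIN = 2,
      FUN = function(x) {
        idx <- which(!is.na(x))  # positions of the non-NA values
        # permute by index: sample(x) on a single number would draw from 1:x instead
        x[idx] <- x[idx][sample.int(length(idx))]
        return(x)
      })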
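To check that a shuffle of this kind leaves the NA pattern untouched, here is a minimal sketch. The helper name `shuffle_keep_na` and the toy matrix are made up for illustration. It permutes by index, so a column with a single non-NA value is handled safely.

```r
# Hypothetical helper: shuffle only the non-NA entries of a vector
shuffle_keep_na <- function(x) {
  idx <- which(!is.na(x))                    # positions holding real values
  x[idx] <- x[idx][sample.int(length(idx))]  # permute those values by index
  x
}

set.seed(42)
toy <- matrix(c(2.1, NA,  0.8, 4.1,
                NA,  NA,  1.4, NA,
                1.6, 0.7, NA,  1.0), nrow = 4)
shuffled <- apply(toy, 2, shuffle_keep_na)

# NA positions are the same before and after
all(is.na(toy) == is.na(shuffled))

# each column still holds the same set of values
all(sapply(seq_len(ncol(toy)), function(j)
  identical(sort(toy[, j]), sort(shuffled[, j]))))
```

Both checks should come back TRUE. The second column, which has only one non-NA value, keeps that value as it is.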
I have a sparse matrix that I'm reading into R and converting to a matrix with the following code
gt <-readMM("matrix.mtx")
gt_0 <- as.matrix(gt)
However, the blank fields that are within the gt object are converted to 0 during the gt_0 <- as.matrix(gt) call.
The problem is that the actual values of my matrix are binary (0|1) so filling with 0 makes downstream analyses impossible.
I would like the blanks to be filled with NA rather than 0, if possible.
Many thanks for any suggestions.
Just replace 0s with NAs after you convert from dgTMatrix to matrix
# Sample data
library(Matrix)
gt <- Matrix(0+1:28, nrow = 4)
gt[-3,c(2,4:5,7)] <- gt[ 3, 1:4] <- gt[1:3, 6] <- 0
gt <- as(gt, "dgTMatrix")
# Convert to matrix and replace 0s with NAs
gt_0 <- as.matrix(gt)
gt_0[gt_0 == 0] <- NA
gt_0
# [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#[1,] 1 NA 9 NA NA NA NA
#[2,] 2 NA 10 NA NA NA NA
#[3,] NA NA NA NA 19 NA 27
#[4,] 4 NA 12 NA NA 24 NA
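Note that the replacement above also turns genuine 0 values into NA. If the file stores its real zeros explicitly (an assumption; many writers drop them), one possible sketch is to start from an all-NA matrix and fill in only the stored triplets. The toy object `gt_demo` is made up for illustration, and the code relies on the `i`, `j` (both 0-based) and `x` slots of a `dgTMatrix`, with no duplicated (i, j) pairs.

```r
library(Matrix)

# Toy triplet matrix with one explicitly stored 0 at position (2, 2)
gt_demo <- new("dgTMatrix",
               i = c(0L, 1L, 2L), j = c(0L, 1L, 0L),
               x = c(1, 0, 1), Dim = c(3L, 2L))

# Start from all NA, then write only the entries that are actually stored
gt_na <- matrix(NA_real_, nrow = nrow(gt_demo), ncol = ncol(gt_demo))
gt_na[cbind(gt_demo@i + 1L, gt_demo@j + 1L)] <- gt_demo@x
gt_na
```

Here the stored 0 at (2, 2) survives as 0, while the three positions with no stored entry become NA. Whether this applies to a given .mtx file depends on how the file was written.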
Starting from an n x m matrix, I would like to create a new m x m matrix that contains the correlation between each pair of columns of the starting matrix. So if my input is a 3x3 matrix, I would like to calculate the correlation of the column pairs 12, 13 and 23 and assign the results to the destination matrix. Naively, I used two nested for loops (~O(n^2)):
for (i in 1:n) {
for (j in i+1:n) {
if (j <= n) {
tmp = cor(inMatrix[, i], inMatrix[, j])
dstMatrix[i,j] = tmp;
}
}
}
This appears to be working, but I was wondering whether there is a better way to achieve it in R.
The simple cor(inMatrix) does it (whole matrix directly passed to cor()):
n <- 7
m <- 5
set.seed(123)
inMatrix <- replicate(m, sample(c(1, - 1), 1) * cumsum(runif(n)))
inMatrix
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0.7883051 -0.4566147 0.04205953 -0.7085305 -0.7954674
# [2,] 1.1972821 -1.4134481 0.36998025 -1.2525965 -0.8200811
# [3,] 2.0802995 -1.8667822 1.32448390 -1.8467385 -1.2978771
# [4,] 3.0207667 -2.5443529 2.21402322 -2.1358983 -2.0563366
# [5,] 3.0663232 -3.1169863 2.90682662 -2.2830119 -2.2727445
# [6,] 3.5944287 -3.2199110 3.54733344 -3.2460361 -2.5909256
# [7,] 4.4868478 -4.1197359 4.54160321 -4.1483352 -2.8225513
dstMatrix <- matrix(nrow = m, ncol = m)
for (i in 1:(m - 1)) {
  for (j in (i + 1):m) {
    # j never exceeds m here, so no bounds check is needed
    dstMatrix[i, j] <- cor(inMatrix[, i], inMatrix[, j])
  }
}
dstMatrix
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA -0.9823516 0.9902370 -0.9688212 -0.9825973
# [2,] NA NA -0.9811424 0.9570599 0.9626469
# [3,] NA NA NA -0.9742235 -0.9862355
# [4,] NA NA NA NA 0.9331879
# [5,] NA NA NA NA NA
dstMatrix_2 <- cor(inMatrix)
dstMatrix_2
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1.0000000 -0.9823516 0.9902370 -0.9688212 -0.9825973
# [2,] -0.9823516 1.0000000 -0.9811424 0.9570599 0.9626469
# [3,] 0.9902370 -0.9811424 1.0000000 -0.9742235 -0.9862355
# [4,] -0.9688212 0.9570599 -0.9742235 1.0000000 0.9331879
# [5,] -0.9825973 0.9626469 -0.9862355 0.9331879 1.0000000
dstMatrix == dstMatrix_2
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA TRUE TRUE FALSE TRUE
# [2,] NA NA TRUE FALSE TRUE
# [3,] NA NA NA FALSE TRUE
# [4,] NA NA NA NA FALSE
# [5,] NA NA NA NA NA
# The differences are on the order of machine precision, i.e. floating-point rounding rather than a real disagreement:
dstMatrix - dstMatrix_2
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA 0 0 -1.110223e-16 0.000000e+00
# [2,] NA NA 0 2.220446e-16 0.000000e+00
# [3,] NA NA NA -1.110223e-16 0.000000e+00
# [4,] NA NA NA NA 1.110223e-16
# [5,] NA NA NA NA NA
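Because the two results differ only by rounding, a tolerance-based comparison is the safer check. A minimal sketch, rebuilding the same `inMatrix` and using base R's `all.equal()`; the names `full_cor` and `pair_cor` are made up for illustration:

```r
set.seed(123)
inMatrix <- replicate(5, sample(c(1, -1), 1) * cumsum(runif(7)))

full_cor <- cor(inMatrix)                       # all column pairs at once
pair_cor <- cor(inMatrix[, 1], inMatrix[, 4])   # a single pair, as in the loop

# Exact equality can fail because of floating-point rounding
full_cor[1, 4] == pair_cor

# Comparison within a small numerical tolerance
isTRUE(all.equal(full_cor[1, 4], pair_cor))
```

The `==` test may print FALSE on some platforms, while the `all.equal()` check treats the two values as equal.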
Compute the correlation coefficient for each combination of columns. The combn function is used to get the pairs of column numbers.
As per @Sotos, the function can be passed directly to combn, which avoids using apply():
cor_vals <- combn(1:col_n, 2, function(x) cor(mat1[, x[1]], mat1[, x[2]]))
# cor_vals <- apply(combn(1:col_n, 2), 2, function(x) cor(mat1[, x[1]], mat1[, x[2]]))
Assign names to the correlation values:
cor_vals <- setNames(cor_vals, combn(1:col_n, 2, paste0, collapse = ''))
cor_vals
# 12 13 23
# 0.1621491 -0.8211970 0.4299367
Data:
set.seed(1L)
row_n <- 3
col_n <- 3
mat1 <- matrix(runif(row_n * col_n, min = 0, max = 20), nrow = row_n, ncol = col_n)
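As a cross-check, the pairwise values from combn() can be compared with the full correlation matrix. This is only a sketch; `mat_demo`, `pair_vals` and `cor_full` are names made up for illustration, and a 4 x 5 matrix is used so that there are more than three pairs.

```r
set.seed(1L)
mat_demo <- matrix(runif(4 * 5, min = 0, max = 20), nrow = 4, ncol = 5)
k <- ncol(mat_demo)

# One correlation per column pair, in combn() order: 12, 13, 14, 15, 23, ...
pair_vals <- combn(seq_len(k), 2,
                   function(p) cor(mat_demo[, p[1]], mat_demo[, p[2]]))

cor_full <- cor(mat_demo)

# lower.tri() walks the pairs as (2,1), (3,1), (4,1), ..., which is the same
# order as combn() because the correlation matrix is symmetric
isTRUE(all.equal(pair_vals, cor_full[lower.tri(cor_full)]))
```

Note that upper.tri() would list the pairs in a different order once there are more than three columns, which is why the lower triangle is used here.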
I have a character matrix M1 with names and a named list L1 with values.
I want to create a matrix M2 with values, the same size as M1. Each cell in M2 should hold the value in L1 that corresponds to the name in the same cell of M1. Where M1's name is not in L1, the cell in M2 should be NA.
I tried to do so but haven't managed to.
Here's an example of what I'd want to do.
>M1
[,1] [,2] [,3] [,4]
[1,] "n1" "n4" "n7" "n10"
[2,] "n2" "n5" "n8" "n11"
[3,] "n3" "n6" "n9" "n12"
> L1
$n1
[1] 1
$n2
[1] 2
$n8
[1] 3
$n25
[1] 4
From there M2 should end up being:
> M2
[,1] [,2] [,3] [,4]
[1,] 1 NA NA NA
[2,] 2 NA 3 NA
[3,] NA NA NA NA
Reproducible example:
dput(m1)
structure(c("n1", "n2", "n3", "n4", "n5", "n6", "n7", "n8", "n9",
"n10", "n11", "n12"), .Dim = 3:4)
dput(L1)
structure(list(n1 = 1, n2 = 2, n8 = 3, n25 = 4), .Names = c("n1",
"n2", "n8", "n25"))
One way is to unlist L1 and match the names with each element of the matrix
apply(m1, 1:2, function(i) unlist(L1)[match(i, names(L1))])
# [,1] [,2] [,3] [,4]
#[1,] 1 NA NA NA
#[2,] 2 NA 3 NA
#[3,] NA NA NA NA
We can also match 'm1' against the names of 'L1' to get the index, use that index to pull the values from 'L1', and then restore the dimensions:
`dim<-`(unlist(L1)[match(m1, names(L1))], dim(m1))
# [,1] [,2] [,3] [,4]
#[1,] 1 NA NA NA
#[2,] 2 NA 3 NA
#[3,] NA NA NA NA
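A further variation on the same idea, offered as a sketch rather than a canonical answer: index the unlisted values by name. Indexing a named vector with a name it does not contain yields NA, which gives the required fill for free. The intermediate name `vals` is made up for illustration.

```r
m1 <- structure(c("n1", "n2", "n3", "n4", "n5", "n6", "n7", "n8", "n9",
                  "n10", "n11", "n12"), .Dim = 3:4)
L1 <- list(n1 = 1, n2 = 2, n8 = 3, n25 = 4)

vals <- unlist(L1)                          # named numeric vector
M2 <- matrix(unname(vals[as.vector(m1)]),   # names absent from L1 give NA
             nrow = nrow(m1), ncol = ncol(m1))
M2
```

Name lookup with `[` is exact, so "n1" does not accidentally match "n10".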
m <- "mData"
assign(m, matrix(data = NA, nrow = 4, ncol = 5))
Now I want to use the variable m to assign values to the mData matrix.
assign(m[1, 2], 35) will not work.
Any solution will be much appreciated.
I'm kind of ashamed to post this, but there is a way to do it. It feels so wrong because the R way would be to build a list of matrices and then transform them by passing a function to lapply.
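assign.by.char <- function(x, ...) {
  env <- parent.frame()  # work in the caller's environment so the change persists
  assign(x, do.call(`[<-`, list(get(x, envir = env), ...)), envir = env)
}
assign.by.char(m, 1, 2, 35)
mData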
[,1] [,2] [,3] [,4] [,5]
[1,] NA 35 NA NA NA
[2,] NA NA NA NA NA
[3,] NA NA NA NA NA
[4,] NA NA NA NA NA
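For comparison, the list-based approach mentioned above might look like the sketch below. The list name `mats` and its second element `other` are hypothetical and only there for illustration.

```r
# Keep the matrices in a named list instead of separate variables
mats <- list(mData = matrix(NA, nrow = 4, ncol = 5),
             other = matrix(NA, nrow = 2, ncol = 2))  # hypothetical second matrix

m <- "mData"
mats[[m]][1, 2] <- 35   # pick the matrix by name, then index as usual
mats[[m]]

# The same transformation can then be applied to every matrix at once
mats <- lapply(mats, function(x) { x[is.na(x)] <- 0; x })
```

With this layout the name held in `m` is just a list key, so neither assign() nor get() is needed.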
If you really need to use assign(), you could do it with replace()
m <- matrix(, 3, 3)
assign("m", replace(m, cbind(1, 2), 35))
m
# [,1] [,2] [,3]
# [1,] NA 35 NA
# [2,] NA NA NA
# [3,] NA NA NA
Or you can use assign directly (a variant of @BondedDust's solution):
assign(m, `[<-`(get(m), cbind(1,2), 35))
mData
# [,1] [,2] [,3]
#[1,] NA 35 NA
#[2,] NA NA NA
#[3,] NA NA NA
Or as a function
assign.by.char <- function(x, ...) {
  env <- parent.frame()  # assign in the caller's environment, not inside the function
  assign(x, `[<-`(get(x, envir = env), ...), envir = env)
}
Data:
mData <- matrix(, 3, 3)
m <- 'mData'
I created an empty matrix with matrix(). When I need to test whether a given matrix is empty, how can I do that? I know that is.na(matrix()) is TRUE, but for a matrix with larger dimensions is.na() returns one value per element, so it cannot give a single answer.
By empty I mean a matrix whose elements are all NA or NULL.
I'm guessing that you are just looking for all(). Here's a small example:
M1 <- matrix(NA, ncol = 3, nrow = 3)
M1
# [,1] [,2] [,3]
# [1,] NA NA NA
# [2,] NA NA NA
# [3,] NA NA NA
M2 <- matrix(c(1, rep(NA, 8)), ncol = 3, nrow = 3)
M2
# [,1] [,2] [,3]
# [1,] 1 NA NA
# [2,] NA NA NA
# [3,] NA NA NA
all(is.na(M1))
# [1] TRUE
all(is.na(M2))
# [1] FALSE
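If NULL and zero-length inputs should also count as empty, the check can be wrapped in a small helper. This is a sketch; the function name `is_empty_matrix` is made up for illustration.

```r
# Hypothetical helper: TRUE for NULL, zero-length, or all-NA input
is_empty_matrix <- function(x) {
  is.null(x) || length(x) == 0 || all(is.na(x))
}

is_empty_matrix(matrix())                    # TRUE
is_empty_matrix(matrix(NA, 3, 3))            # TRUE
is_empty_matrix(matrix(nrow = 0, ncol = 3))  # TRUE
is_empty_matrix(NULL)                        # TRUE
is_empty_matrix(matrix(c(1, NA), 1, 2))      # FALSE
```

The `||` operator short-circuits, so is.na() is never called on NULL.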