I have a sparse matrix that I'm reading into R and converting to a dense matrix with the following code:
library(Matrix)
gt <- readMM("matrix.mtx")
gt_0 <- as.matrix(gt)
However, the blank fields that are within the gt object are converted to 0 during the gt_0 <- as.matrix(gt) call.
The problem is that the actual values of my matrix are binary (0|1) so filling with 0 makes downstream analyses impossible.
If possible, I would like the blanks to be filled with NA rather than 0.
Many thanks for any suggestions
Just replace the 0s with NAs after you convert from dgTMatrix to matrix. Note that this also turns any 0 that is explicitly stored in the sparse matrix into NA.
# Sample data
library(Matrix)
gt <- Matrix(0+1:28, nrow = 4)
gt[-3,c(2,4:5,7)] <- gt[ 3, 1:4] <- gt[1:3, 6] <- 0
gt <- as(gt, "dgTMatrix")
# Convert to matrix and replace 0s with NAs
gt_0 <- as.matrix(gt)
gt_0[gt_0 == 0] <- NA
gt_0
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#[1,]    1   NA    9   NA   NA   NA   NA
#[2,]    2   NA   10   NA   NA   NA   NA
#[3,]   NA   NA   NA   NA   19   NA   27
#[4,]    4   NA   12   NA   NA   24   NA
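If the file stores genuine 0 values explicitly (as the binary data in the question suggests), replacing every 0 would erase them along with the blanks. Here is a minimal sketch of an alternative, assuming the matrix is in triplet form (`dgTMatrix`, which is what `readMM()` typically returns for numeric data) and has no duplicated (i, j) entries: start from an all-NA matrix and fill in only the stored entries. The small `gt` below is made up for illustration.

```r
library(Matrix)

# Hypothetical 3 x 3 triplet matrix with an explicitly stored 0 at [2, 2]
gt <- new("dgTMatrix",
          i = c(0L, 1L, 2L),   # zero-based row indices of stored entries
          j = c(0L, 1L, 2L),   # zero-based column indices of stored entries
          x = c(1, 0, 1),      # stored values, including a real 0
          Dim = c(3L, 3L))

# Start with NA everywhere, then write back only the stored entries
gt_na <- matrix(NA_real_, nrow = nrow(gt), ncol = ncol(gt))
gt_na[cbind(gt@i + 1L, gt@j + 1L)] <- gt@x
gt_na
```

With this approach the stored 0 at `[2, 2]` stays 0, and only positions that were never stored become NA.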
I have a list of roughly 1100 individual character vectors, each of which corresponds to a particular set of genes (each element is a gene symbol of the form "ENSG000011", "ENSG000012", etc.).
I want to merge these vectors into a single data.frame/matrix, such that each item in the list becomes its own column. However, each of the items in the list is of a different length.
I cannot seem to find a way of doing this.
I've tried a number of approaches within R, but the format never looks quite right (e.g. it pastes all of the items of the list into one row, on top of one another, or I get an error because the elements are of different lengths).
Using Base R we need to...
First let's create a sample dataset with 4 vectors:
a <- rnorm(10)
b <- rnorm(5)
c <- rnorm(7)
d <- rnorm(20)
Then we can put them in a list as:
f <- list(a,b,c,d)
Then we need to find the length of the longest vector:
max_len <- max(sapply(f, length))
Then we need to pad every vector to max_len by appending NAs (so if max_len = 20 and the current vector has length(current) = 10, the last 10 values need to be NA):
f1 <- lapply(f, function(x) c(x, rep(NA, max_len - length(x))))
Then you can turn this into a matrix as:
matrix(unlist(f1), ncol = length(f1), byrow = FALSE)
which results in
[,1] [,2] [,3] [,4]
[1,] -0.53487289 -1.8570456 0.8304454 -0.6440267
[2,] 0.04283173 -1.2541836 0.9579962 -1.1664334
[3,] -1.31686110 -0.6789986 0.9424487 0.4073388
[4,] -0.54987484 -0.4326257 -1.5165032 0.1990406
[5,] 0.31529161 -0.2712977 0.1347272 -0.2479010
[6,] -1.08465865 NA 0.7442857 -1.1319033
[7,] 1.11283161 NA -0.8397640 0.2636702
[8,] 0.08882676 NA NA -0.1332037
[9,] 0.76028752 NA NA 0.1607880
[10,] -2.68513818 NA NA -2.3300150
[11,] NA NA NA -0.3356175
[12,] NA NA NA 0.8115210
[13,] NA NA NA 1.1668857
[14,] NA NA NA 0.5538027
[15,] NA NA NA -0.8910439
[16,] NA NA NA -1.4056796
[17,] NA NA NA -1.6713585
[18,] NA NA NA 0.2557690
[19,] NA NA NA -0.5970861
[20,] NA NA NA 0.1851019
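The same padding idea carries over to the character vectors from the question. The sketch below is only an illustration: the list `genes`, its element names, and the gene symbols are made up. Setting `length(x) <- max_len` pads a vector with NA, and `sapply()` then simplifies the equal-length results into a matrix with one column per list element.

```r
# Hypothetical list of gene-symbol vectors of different lengths
genes <- list(
  set_a = c("ENSG000011", "ENSG000012", "ENSG000013"),
  set_b = c("ENSG000021"),
  set_c = c("ENSG000031", "ENSG000032")
)

max_len <- max(lengths(genes))

# Extending the length of a vector pads it with NA
padded <- sapply(genes, function(x) {
  length(x) <- max_len
  x
})

# One column per list element, NA where a set has fewer genes
gene_df <- as.data.frame(padded, stringsAsFactors = FALSE)
gene_df
```

Keeping the list named means the columns of `gene_df` carry the set names automatically.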
I have a matrix of values that also contains NAs like:
> df <- matrix(rexp(200), 10)
> df[ df < 0.5 ] <- NA
> df
[,1] [,2] [,3] [,4] [,5]
[1,] 2.124043 1.6119230 NA 0.7222127 1.400924
[2,] 4.143728 NA NA 1.0343577 NA
[3,] 2.395984 0.6794447 0.8327695 1.0258656 NA
[4,] NA NA NA NA 1.421674
[5,] NA 1.0446031 0.7762776 NA NA
I would like to scramble the values in my matrix, and realised that I can shuffle whole rows using:
> df<- df[sample(nrow(df)),]
> df
[,1] [,2] [,3] [,4] [,5]
[1,] 2.395984 0.6794447 0.8327695 1.0258656 NA
[2,] 2.124043 1.6119230 NA 0.7222127 1.400924
[3,] NA NA NA NA 1.421674
[4,] 4.143728 NA NA 1.0343577 NA
[5,] NA 1.0446031 0.7762776 NA NA
However, I would like to randomise the values while keeping the positions of the NAs the same as before. Does anybody know of an easy way to do so?
Thanks a lot!
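Wrap it in an apply to shuffle only the non-NA values within each column
apply(X = df,
      MARGIN = 2,
      FUN = function(x) {
        idx <- which(!is.na(x))
        # Permute by position: sample(x) on a single number n would draw from 1:n
        x[idx] <- x[idx[sample.int(length(idx))]]
        return(x)
      })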
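A quick way to convince yourself that such a column-wise shuffle leaves the NA pattern untouched is to compare the NA masks before and after. This is a sketch on made-up data; the helper name `shuffle_keep_na` and the seed are arbitrary choices for illustration.

```r
# Hypothetical helper: permute the non-NA values of a vector, leave NAs in place
shuffle_keep_na <- function(x) {
  idx <- which(!is.na(x))
  x[idx] <- x[idx[sample.int(length(idx))]]
  x
}

set.seed(1)
df <- matrix(rexp(50), 10)
df[df < 0.5] <- NA

shuffled <- apply(df, 2, shuffle_keep_na)

# The NA positions are identical before and after shuffling
same_na <- identical(is.na(shuffled), is.na(df))

# Each column still holds the same set of values (sort() drops NAs)
same_values <- all(vapply(seq_len(ncol(df)), function(j) {
  identical(sort(df[, j]), sort(shuffled[, j]))
}, logical(1)))
```

Both `same_na` and `same_values` should come out TRUE regardless of the seed, since the helper only reorders values among positions that were already non-NA.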
I would like to apply a function to each element of a matrix, taking into account the upper, lower, left and right neighbouring cells:
a=1; b=3; c=8; d=2
m <- matrix(1:20, nrow=4, ncol=5)
mn <- matrix(NA, nrow=4, ncol=5)
for(i in 2:3){
for(j in 2:4){
mn[i,j] <- a*m[i-1,j]+b*m[i+1,j]+c*m[i,j-1]+d*m[i,j+1]
}
}
Is there an alternative way using sapply or something else?
You can subset multiple rows and columns of a matrix, like this:
.i = 2:3
.j = 2:4
mn[.i,.j] = a*m[.i-1,.j] + b*m[.i+1,.j] + c*m[.i,.j-1] + d*m[.i,.j+1]
> mn
# [,1] [,2] [,3] [,4] [,5]
# [1,] NA NA NA NA NA
# [2,] NA 62 118 174 NA
# [3,] NA 76 132 188 NA
# [4,] NA NA NA NA NA
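The shifted-index trick can be wrapped in a small function so it works for any matrix size. This is only a sketch: the name `neighbour_combine` is made up, border cells are left as NA as in the loop version, and `drop = FALSE` is added so that a single interior row or column is still treated as a matrix.

```r
# Hypothetical wrapper around the shifted-index approach
neighbour_combine <- function(m, a, b, c, d) {
  nr <- nrow(m)
  nc <- ncol(m)
  out <- matrix(NA_real_, nrow = nr, ncol = nc)
  if (nr < 3 || nc < 3) return(out)  # no interior cells to fill

  i <- 2:(nr - 1)  # interior rows
  j <- 2:(nc - 1)  # interior columns
  out[i, j] <- a * m[i - 1, j, drop = FALSE] +  # upper neighbour
               b * m[i + 1, j, drop = FALSE] +  # lower neighbour
               c * m[i, j - 1, drop = FALSE] +  # left neighbour
               d * m[i, j + 1, drop = FALSE]    # right neighbour
  out
}

m <- matrix(1:20, nrow = 4, ncol = 5)
mn <- neighbour_combine(m, a = 1, b = 3, c = 8, d = 2)
mn[2:3, 2:4]
```

For the 4 x 5 example above, the interior block matches the values produced by the double loop (62, 118, 174 in row 2 and 76, 132, 188 in row 3).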
m <- "mData"
assign(m, matrix(data = NA, nrow = 4, ncol = 5))
Now I want to use variable m to assign values to the mData matrix
assign(m[1, 2], 35) will not work.
Any solution will be much appreciated.
I'm kind of ashamed to post this but there would be a way to do this. It feels so wrong because the R-way would be to build a list of matrices and then operate on them by passing a function to transform them using lapply.
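assign.by.char <- function(x, ..., env = parent.frame()) {
  # Read and write in the caller's environment, not inside this function
  assign(x, do.call(`[<-`, list(get(x, envir = env), ...)), envir = env)
}
assign.by.char(m, 1, 2, 35)
mData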
[,1] [,2] [,3] [,4] [,5]
[1,] NA 35 NA NA NA
[2,] NA NA NA NA NA
[3,] NA NA NA NA NA
[4,] NA NA NA NA NA
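For comparison, the list-based approach mentioned above might look like the following sketch. The list name `mats` is made up for illustration; the point is that a character string can index a list element directly, so neither `assign()` nor `get()` is needed.

```r
# Hypothetical container holding matrices by name
mats <- list(mData = matrix(NA_real_, nrow = 4, ncol = 5))

m <- "mData"

# Use the string to pick the matrix, then assign into it as usual
mats[[m]][1, 2] <- 35

mats[[m]]
```

Additional matrices can be added under other names and processed together with `lapply()`.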
If you really need to use assign(), you could do it with replace()
m <- matrix(, 3, 3)
assign("m", replace(m, cbind(1, 2), 35))
m
# [,1] [,2] [,3]
# [1,] NA 35 NA
# [2,] NA NA NA
# [3,] NA NA NA
Or you can use assign directly (a variant of @BondedDust's solution)
assign(m, `[<-`(get(m), cbind(1,2), 35))
mData
# [,1] [,2] [,3]
#[1,] NA 35 NA
#[2,] NA NA NA
#[3,] NA NA NA
Or as a function
assign.by.char <- function(x, ..., env = parent.frame()) {
  assign(x, `[<-`(get(x, envir = env), ...), envir = env)
}
data
mData <- matrix(, 3, 3)
m <- 'mData'
I created an empty matrix with matrix(). How can I test whether a given matrix is empty? I know that is.na(matrix()) is TRUE, but if the given matrix has larger dimensions, is.na() returns one value per element, so it cannot answer the question on its own.
By "empty" I mean that every element is NA or NULL.
I'm guessing that you are just looking for all. Here's a small example:
M1 <- matrix(NA, ncol = 3, nrow = 3)
M1
# [,1] [,2] [,3]
# [1,] NA NA NA
# [2,] NA NA NA
# [3,] NA NA NA
M2 <- matrix(c(1, rep(NA, 8)), ncol = 3, nrow = 3)
M2
# [,1] [,2] [,3]
# [1,] 1 NA NA
# [2,] NA NA NA
# [3,] NA NA NA
all(is.na(M1))
# [1] TRUE
all(is.na(M2))
# [1] FALSE
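If the check is needed in several places, it can be wrapped in a small helper. This is a sketch with a made-up name, `is_all_na`; note that it also returns TRUE for a zero-length matrix and for NULL, because `all()` of an empty logical vector is TRUE.

```r
# Hypothetical helper: TRUE when every element is NA (or there are no elements)
is_all_na <- function(x) {
  all(is.na(x))
}

M1 <- matrix(NA, nrow = 3, ncol = 3)                # all NA
M2 <- matrix(c(1, rep(NA, 8)), nrow = 3, ncol = 3)  # one real value
M0 <- matrix(nrow = 0, ncol = 3)                    # no elements at all

is_all_na(M1)    # TRUE
is_all_na(M2)    # FALSE
is_all_na(M0)    # TRUE
is_all_na(NULL)  # TRUE
```

If a zero-length matrix should not count as empty in your case, add a `length(x) > 0` condition to the helper.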