convert all zeros of a matrix in R to NA [duplicate] - r

Possible Duplicate:
Replace all 0 values to NA in R
Going off of this question: is there a function or idiom in R similar to x[is.na(x)] <- 0, except that it changes every zero in a matrix to NA?

You can do it like this:
x[x == 0] <- NA
For example:
x = matrix(rep(0:1, 50), nrow=10)
x[x == 0] <- NA
print(x)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] NA NA NA NA NA NA NA NA NA NA
# [2,] 1 1 1 1 1 1 1 1 1 1
# [3,] NA NA NA NA NA NA NA NA NA NA
# [4,] 1 1 1 1 1 1 1 1 1 1
# [5,] NA NA NA NA NA NA NA NA NA NA
# [6,] 1 1 1 1 1 1 1 1 1 1
# [7,] NA NA NA NA NA NA NA NA NA NA
# [8,] 1 1 1 1 1 1 1 1 1 1
# [9,] NA NA NA NA NA NA NA NA NA NA
#[10,] 1 1 1 1 1 1 1 1 1 1
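If the original matrix should stay untouched, the same replacement can be done on a copy. A minimal sketch using base R's replace(); the names x and y and the small sample matrix are only for illustration:

```r
# Small matrix containing zeros (filled column by column)
x <- matrix(c(0, 1, 2, 0, 3, 0), nrow = 2)

# replace() returns a modified copy; x itself is left unchanged
y <- replace(x, x == 0, NA)
print(y)
```

This is equivalent to copying x and then running the x[x == 0] <- NA assignment on the copy.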


How do I loop correctly?

Here is the data below. I'm not sure which type of looping I should be using, but here is what I am looking to do: If, for row 1, there is a 6 present, then for column 7 we have "Yes", if there is no 6 present, then column 7 has "No". Ignore columns 8 & 9.
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 1 6 1 1 6 1 NA NA NA
[2,] 5 5 5 5 5 5 NA NA NA
[3,] 1 1 6 1 1 6 NA NA NA
[4,] 5 5 5 5 5 5 NA NA NA
[5,] 6 1 1 6 1 1 NA NA NA
[6,] 5 5 5 5 5 5 NA NA NA
[7,] 1 6 1 1 6 1 NA NA NA
[8,] 5 5 5 5 5 5 NA NA NA
[9,] 1 1 6 1 1 6 NA NA NA
[10,] 5 5 5 5 5 5 NA NA NA
Here is the code that I have.
b <- 10
n <- 6
data.matrix <- matrix(data=NA,nrow = b, ncol = n+3) # b and n must be defined before this call
for (i in 1:b)
{
data.matrix[,1:n] <- sample(6,n,replace=T)
}
Side Note: I keep getting this error
"the condition has length > 1 and only the first element will be used"
Here is a solution using apply:
a[,7] <- apply(a, 1, function(x) ifelse(max(x,na.rm = T) == 6,"YES","NO"))
where a is the input data.frame/tibble. As commented above, if you have a matrix, convert it to a data.frame first and then perform this operation.
Here is a solution with apply and which:
res <- apply(data.matrix, 1, function(x) {
x[[7]] <- length(which(x == 6)) > 0
x
})
res <- t(res)
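A loop is not strictly needed here. The following is a rough vectorized sketch, assuming the six values per row live in a numeric matrix; the names rolls, has_six and result are made up for this example. It also avoids calling if() on a whole vector, which is a common way to trigger the "condition has length > 1" message.

```r
set.seed(1)
b <- 10
n <- 6

# One independent sample of n values per row
rolls <- matrix(sample(6, b * n, replace = TRUE), nrow = b, ncol = n)

# TRUE for every row that contains at least one 6
has_six <- rowSums(rolls == 6) > 0

# Use a data frame so the numbers stay numeric next to the Yes/No column
result <- data.frame(rolls, has_six = ifelse(has_six, "Yes", "No"))
print(result)
```

Writing "Yes"/"No" directly into a numeric matrix would coerce every cell to character, which is why the result is built as a data frame.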

R- Replace all values in rows of dataframe after first NA by NA

I have a dataframe of 3500 observations and 278 variables. For each row going from the first column, I want to replace all values occurring after the first NA by NAs. For instance, I want to go from a dataframe like so:
X1 X2 X3 X4 X5
1 3 NA 6 9
1 NA 4 6 18
6 7 NA 3 1
10 1 2 NA 2
To something like
X1 X2 X3 X4 X5
1 3 NA NA NA
1 NA NA NA NA
6 7 NA NA NA
10 1 2 NA NA
I tried using the following nested for loop, but it is not terminating:
for(i in 2:3500){
firstna <- min(which(is.na(df[i,])))
df[i, firstna:278] <- NA
}
Is there a more efficient way to do this? Thanks in advance.
You could do something like this:
# sample data
mat <- matrix(1, 10, 10)
set.seed(231)
mat[sample(100, 7)] <- NA
You can use apply with cumsum and is.na to keep track of where NAs need to be placed (i.e. places across the row where the cumulative sum of NAs is greater than 0). Then, use those locations to assign NAs to the original structure in the appropriate places.
mat[t(apply(is.na(mat), 1, cumsum)) > 0 ] <- NA
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,] 1 1 1 1 1 1 NA NA NA NA
# [2,] NA NA NA NA NA NA NA NA NA NA
# [3,] 1 1 1 1 1 1 1 1 1 1
# [4,] 1 1 1 1 1 1 1 1 1 1
# [5,] 1 1 1 NA NA NA NA NA NA NA
# [6,] 1 1 1 1 1 1 1 1 1 1
# [7,] 1 NA NA NA NA NA NA NA NA NA
# [8,] 1 1 1 1 1 1 1 1 1 1
# [9,] 1 1 1 1 1 1 1 1 1 1
#[10,] 1 1 NA NA NA NA NA NA NA NA
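To see why the cumulative sum works, it can help to look at the intermediate matrix on a tiny example. The matrix m and the name na_count below are just for illustration:

```r
m <- rbind(c(1, NA, 3),
           c(4,  5, 6),
           c(NA, 8, NA))

# Running count of NAs along each row; apply() returns the rows as
# columns, so the result is transposed back with t()
na_count <- t(apply(is.na(m), 1, cumsum))
print(na_count)

# Any cell at or after the first NA in its row has a count above zero
m[na_count > 0] <- NA
print(m)
```

Rows without any NA keep a count of zero everywhere and are left unchanged.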
This works fine with data frames too. Using the provided example data:
d<-read.table(text="
X1 X2 X3 X4 X5
1 3 NA 6 9
1 NA 4 6 18
6 7 NA 3 1
10 1 2 NA 2 ", header=TRUE)
d[t(apply(is.na(d), 1, cumsum)) > 0 ] <- NA
# X1 X2 X3 X4 X5
#1 1 3 NA NA NA
#2 1 NA NA NA NA
#3 6 7 NA NA NA
#4 10 1 2 NA NA
We can use rowCumsums from library(matrixStats)
library(matrixStats)
d*NA^rowCumsums(+(is.na(d)))
# X1 X2 X3 X4 X5
#1 1 3 NA NA NA
#2 1 NA NA NA NA
#3 6 7 NA NA NA
#4 10 1 2 NA NA
Or a base R option is
d*NA^do.call(cbind,Reduce(`+`,lapply(d, is.na), accumulate=TRUE))
I did this using the cumany function from the dplyr package, which returns TRUE for each element after the condition is met.
df <- read.table(text = "X1 X2 X3 X4 X5
1 3 NA 6 9
1 NA 4 6 18
6 7 NA 3 1
10 1 2 NA 2 ",
header = T)
library(plyr)
library(dplyr)
na_row_replace <- function(x){
x[which(cumany(is.na(x)))] <- NA
return(x)
}
adply(df, 1, na_row_replace)

Index matrix with randomly sampled columns

I'm trying to fill the columns of an index matrix with samples from 1:whatever using a for loop. The purpose of this is a bootstrap coding problem. The issue I'm getting is that the for loop won't run correctly once it reaches a number that does not divide the number of rows. For some reason R thinks I want an equal representation of the sampled numbers in each column. How do I get this to stop?
index.mat=matrix(NA,nr=12,nc=10,byrow=FALSE)
for(i in 1:5)
{
index.mat[,i] <- sample(1:i, i, replace=TRUE)
print(index.mat)
}
will print
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 1 2 1 NA NA NA NA NA NA
[2,] 1 1 2 4 NA NA NA NA NA NA
[3,] 1 1 2 1 NA NA NA NA NA NA
[4,] 1 1 2 2 NA NA NA NA NA NA
[5,] 1 1 2 1 NA NA NA NA NA NA
[6,] 1 1 2 4 NA NA NA NA NA NA
[7,] 1 1 2 1 NA NA NA NA NA NA
[8,] 1 1 2 2 NA NA NA NA NA NA
[9,] 1 1 2 1 NA NA NA NA NA NA
[10,] 1 1 2 4 NA NA NA NA NA NA
[11,] 1 1 2 1 NA NA NA NA NA NA
[12,] 1 1 2 2 NA NA NA NA NA NA
as the final matrix before giving the error
Error in index.mat[, i] <- sample(1:i, i, replace = TRUE) :
number of items to replace is not a multiple of replacement length
Just use sample(i, size = 12, replace = TRUE).
Your LHS is index.mat[,i] which has length 12.
Your RHS is sample(1:i, i, replace = TRUE), which has length i.
By nature, R will recycle the RHS when the lengths don't match up -- this means, when i=2, your RHS is length 2 and it will simply be repeated 6 times to match the LHS length 12.
In this particular case, if the RHS length isn't a divisor of the LHS length, you get an error -- which happens first when, you guessed it, i=5 (since 1, 2, 3, and 4 all divide 12 evenly).
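Putting that together, here is a short sketch that first reproduces the failure for i = 5 and then fills every column with a full-length sample. The variable attempt is only there to capture the outcome of the failing assignment:

```r
index.mat <- matrix(NA_integer_, nrow = 12, ncol = 10)

# A length-5 replacement cannot be recycled into a 12-row column
attempt <- tryCatch({
  index.mat[, 5] <- sample(1:5, 5, replace = TRUE)
  "assigned"
}, error = function(e) "error")
print(attempt)

# Draw exactly one value per row so the lengths always match
set.seed(42)
for (i in seq_len(ncol(index.mat))) {
  index.mat[, i] <- sample(i, size = nrow(index.mat), replace = TRUE)
}
print(index.mat)
```

Using nrow(index.mat) as the sample size keeps the loop correct if the matrix dimensions change later.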

How to order columns and rows to create a relatively dense submatrix

I have a large matrix which comprises 1,2 and missing (coded as NA) values. The matrix has 500000 rows by 10000 columns. There are approximately 0.05% 1- or 2-values, and the remaining values are NA.
I would like to reorder the rows and columns of the matrix so that the top left corner of the matrix comprises a relatively high number of 1s and 2s compared to the rest of the matrix. In other words, I would like to create a relatively datarich subset of the matrix, by reordering the matrix rows and columns.
Is there an efficient method of achieving this in R?
In particular, I'm interested in solutions where sorting by the number of non-NA values in each row and column is not sufficient to produce a dense corner.
In addition, I'll add a constraint. The size of the dense corner will be pre-defined.
In the following example, the goal is to reorder the rows and columns so that the top leftmost 3x3 submatrix is relatively dense (i.e. few or no NA values).
m1 <- matrix(c(rep(c(rep(NA, 3), rep(1, 7)), 1),
rep(c(rep(2, 3), rep(NA, 7)), 7),
rep(c(rep(NA, 3), rep(1, 7)), 2)
), nrow=10, byrow=TRUE)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] NA NA NA 1 1 1 1 1 1 1
[2,] 2 2 2 NA NA NA NA NA NA NA
[3,] 2 2 2 NA NA NA NA NA NA NA
[4,] 2 2 2 NA NA NA NA NA NA NA
[5,] 2 2 2 NA NA NA NA NA NA NA
[6,] 2 2 2 NA NA NA NA NA NA NA
[7,] 2 2 2 NA NA NA NA NA NA NA
[8,] 2 2 2 NA NA NA NA NA NA NA
[9,] NA NA NA 1 1 1 1 1 1 1
[10,] NA NA NA 1 1 1 1 1 1 1
The rows and columns are ordered by the number of non-NA values (using code from an answer below):
m1 <- m1[order(rowSums(is.na(m1))), order(colSums(is.na(m1)))]
However, this does not result in a dense 3x3 top leftmost corner:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] NA NA NA 1 1 1 1 1 1 1
[2,] NA NA NA 1 1 1 1 1 1 1
[3,] NA NA NA 1 1 1 1 1 1 1
[4,] 2 2 2 NA NA NA NA NA NA NA
[5,] 2 2 2 NA NA NA NA NA NA NA
[6,] 2 2 2 NA NA NA NA NA NA NA
[7,] 2 2 2 NA NA NA NA NA NA NA
[8,] 2 2 2 NA NA NA NA NA NA NA
[9,] 2 2 2 NA NA NA NA NA NA NA
[10,] 2 2 2 NA NA NA NA NA NA NA
I thought that there may be a set of optimisation procedures that I could implement, as my working matrix is too large to do the reorganisation by eye.
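One possible starting point, offered only as a rough heuristic and not as an established solution to this problem: alternate between picking the k rows that are best filled within the current k columns and the k columns that are best filled within those rows, until the selection stops changing. The function dense_corner and its arguments are hypothetical names, the sketch assumes a matrix small enough to hold in memory as a dense object, and it gives no guarantee of finding the densest possible corner.

```r
m1 <- matrix(c(rep(c(rep(NA, 3), rep(1, 7)), 1),
               rep(c(rep(2, 3), rep(NA, 7)), 7),
               rep(c(rep(NA, 3), rep(1, 7)), 2)),
             nrow = 10, byrow = TRUE)

# Hypothetical helper: greedy alternating selection of k rows and k columns
dense_corner <- function(m, k, max_iter = 20) {
  filled <- !is.na(m)
  # Start from the k columns with the most non-NA values overall
  cols <- order(colSums(filled), decreasing = TRUE)[seq_len(k)]
  rows <- integer(0)
  for (iter in seq_len(max_iter)) {
    # Best-filled rows within the current columns, then the reverse
    new_rows <- order(rowSums(filled[, cols, drop = FALSE]),
                      decreasing = TRUE)[seq_len(k)]
    new_cols <- order(colSums(filled[new_rows, , drop = FALSE]),
                      decreasing = TRUE)[seq_len(k)]
    converged <- setequal(new_rows, rows) && setequal(new_cols, cols)
    rows <- new_rows
    cols <- new_cols
    if (converged) break
  }
  # Move the selected rows and columns to the top-left corner
  m[c(rows, setdiff(seq_len(nrow(m)), rows)),
    c(cols, setdiff(seq_len(ncol(m)), cols))]
}

reordered <- dense_corner(m1, k = 3)
print(reordered[1:3, 1:3])
```

For a matrix of 500000 by 10000 with very few non-NA cells, the same idea would presumably need a sparse representation rather than a dense logical matrix.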

Assign Value to Diagonal Entries of Matrix

I need to access and assign single slots of an m*n matrix inside a for loop. The code so far:
rowCount <- 9
similMatrix = matrix(nrow = rowCount - 1, ncol = rowCount)
show(similMatrix)
for(i in (rowCount - 1)){
for (j in rowCount)
if (i == j){
similMatrix[i == j] <- 0;
}
}
show(similMatrix)
so if i = j the NA value in the matrix needs to be replaced with 0.
You want the function diag<-
m <- matrix(1:12, nrow=3)
m
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
diag(m) <- 0
m
[,1] [,2] [,3] [,4]
[1,] 0 4 7 10
[2,] 2 0 8 11
[3,] 3 6 0 12
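The same replacement function also works on the non-square matrix from the question. A minimal sketch reusing the question's rowCount and similMatrix:

```r
rowCount <- 9
similMatrix <- matrix(nrow = rowCount - 1, ncol = rowCount)

# diag<- only touches the cells where row index equals column index,
# so it is safe on an 8 x 9 matrix as well
diag(similMatrix) <- 0
print(similMatrix)
```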
For the purpose of setting the "diagonal" elements to zero you have already been given an answer, but I wonder if you were hoping for something more general. The code failed for two reasons: the loop indices were constructed incorrectly, and the matrix was indexed incorrectly. This would have succeeded:
for(i in 1:(rowCount - 1)){ # need an expression that returns a sequence
for (j in 1:rowCount) # ditto
if (i == j){
similMatrix[i,j] <- 0; # need to index the matrix with two elements if using i,j
}
}
#----------
> show(similMatrix)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 0 NA NA NA NA NA NA NA NA
[2,] NA 0 NA NA NA NA NA NA NA
[3,] NA NA 0 NA NA NA NA NA NA
[4,] NA NA NA 0 NA NA NA NA NA
[5,] NA NA NA NA 0 NA NA NA NA
[6,] NA NA NA NA NA 0 NA NA NA
[7,] NA NA NA NA NA NA 0 NA NA
[8,] NA NA NA NA NA NA NA 0 NA
But loops in R are generally considered a last resort (sometimes for the wrong reasons). There is a much more compact way of doing the same "loop" operation, and it generalizes more widely than just setting the diagonal.
similMatrix[ row(similMatrix) == col(similMatrix) ] <- 0
> similMatrix
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 0 NA NA NA NA NA NA NA NA
[2,] NA 0 NA NA NA NA NA NA NA
[3,] NA NA 0 NA NA NA NA NA NA
[4,] NA NA NA 0 NA NA NA NA NA
[5,] NA NA NA NA 0 NA NA NA NA
[6,] NA NA NA NA NA 0 NA NA NA
[7,] NA NA NA NA NA NA 0 NA NA
[8,] NA NA NA NA NA NA NA 0 NA
If you wanted to set the subdiagonal to zero you could just use:
similMatrix[ row(similMatrix)-1 == col(similMatrix) ] <- 0
You can avoid generating the extra row and col matrices using this:
mind <- min( dim(similMatrix) )
# avoid going outside dimensions if not square
similMatrix[ cbind( seq(mind),seq(mind) ) ] <- 0
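Indexing with a two-column matrix of (row, column) pairs extends to other cell patterns as well. A small sketch on a made-up 4 x 5 matrix; the names m and k are just for illustration:

```r
m <- matrix(NA_real_, nrow = 4, ncol = 5)
k <- min(dim(m))

# Each row of the index matrix is one (row, column) pair
m[cbind(seq_len(k), seq_len(k))] <- 0      # main diagonal
m[cbind(seq_len(k), seq_len(k) + 1)] <- 1  # cells just right of the diagonal
print(m)
```

The second assignment stays inside the matrix only because this example has more columns than rows; for other shapes the index pairs would need to be trimmed first.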
