I want to set to NA every element of a matrix whose value is greater than or equal to the corresponding value of a given vector, comparing column by column. For example, I can create a matrix:
set.seed(1)
zz <- matrix(data = round(10L * runif(12)), nrow = 4, ncol = 3)
which gives for zz:
[,1] [,2] [,3]
[1,] 8 5 7
[2,] 6 5 1
[3,] 5 10 3
[4,] 9 1 9
and for the comparison vector (for example):
xx <- round(10L * runif(4))
where xx is:
[1] 6 3 8 2
If I perform this operation:
apply(zz,2,function(x) x >= xx)
I get:
[,1] [,2] [,3]
[1,] TRUE FALSE TRUE
[2,] TRUE TRUE FALSE
[3,] FALSE TRUE FALSE
[4,] TRUE FALSE TRUE
What I want is an NA wherever I have a TRUE element, and the number from the zz matrix wherever I have a FALSE (e.g., done manually):
NA 5 NA
NA NA 1
5 NA 3
NA 1 NA
I can cobble together some "for" loops to do what I want, but is there a vectorized way to do this?
Thanks for any tips.
You could simply do:
zz[zz>=xx] <- NA
# [,1] [,2] [,3]
#[1,] NA 5 NA
#[2,] NA NA 1
#[3,] 5 NA 3
#[4,] NA 1 NA
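The one-liner works because R stores a matrix in column-major order and recycles the shorter operand, so when length(xx) equals nrow(zz) each column of zz is compared against xx element by element. A minimal sketch, assuming the zz and xx values shown in the question, that checks the direct comparison against the apply() version:

```r
zz <- matrix(c(8, 6, 5, 9, 5, 5, 10, 1, 7, 1, 3, 9), nrow = 4, ncol = 3)
xx <- c(6, 3, 8, 2)

# Column-by-column comparison, as in the question
by_apply <- apply(zz, 2, function(col) col >= xx)

# Direct comparison: xx is recycled down each column of zz
by_recycling <- zz >= xx

# TRUE when both approaches flag exactly the same cells
same_mask <- all(by_apply == by_recycling)
same_mask
```

This relies on length(xx) matching nrow(zz). If the vector held one threshold per column instead, sweep(zz, 2, xx, `>=`) would be the analogous call.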
Here is one option to get the expected output. The comparison zz >= xx gives a logical matrix; raising NA to that power returns NA for the TRUE values (NA^1) and 1 for the FALSE values (NA^0). Multiplying the result by the original matrix 'zz' leaves each NA as it is and turns each 1 into the corresponding value in 'zz'.
NA^(zz >= xx)*zz
# [,1] [,2] [,3]
#[1,] NA 5 NA
#[2,] NA NA 1
#[3,] 5 NA 3
#[4,] NA 1 NA
Or another option is ifelse
ifelse(zz >= xx, NA, zz)
data
zz <- structure(c(8, 6, 5, 9, 5, 5, 10, 1, 7, 1, 3, 9), .Dim = c(4L, 3L))
xx <- c(6, 3, 8, 2)
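As a quick check, the three approaches suggested so far (logical-index assignment, NA^, and ifelse) can be run side by side on this data. This is only a sketch; the res_* names are made up for the illustration.

```r
zz <- structure(c(8, 6, 5, 9, 5, 5, 10, 1, 7, 1, 3, 9), .Dim = c(4L, 3L))
xx <- c(6, 3, 8, 2)

mask <- zz >= xx

# 1. Logical-index assignment on a copy, so zz itself is left untouched
res_assign <- zz
res_assign[mask] <- NA

# 2. NA^TRUE is NA and NA^FALSE is 1, so the product blanks or keeps each cell
res_power <- NA^mask * zz

# 3. Element-wise choice between NA and the original value
res_ifelse <- ifelse(mask, NA, zz)

all_agree <- isTRUE(all.equal(res_assign, res_power)) &&
  isTRUE(all.equal(res_assign, res_ifelse))
all_agree
```

The assignment form modifies the matrix in place, while the other two return a new matrix, which may matter if the original zz is still needed.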
Related
I need to compare each item in an array against every element in a matrix. The matrix and array can be any size. I can't use loops or if statements - mainly functions like apply(), ifelse(), and so on. NA data can be ignored. Here is an example:
x <- c(1,0,1,0,1,1,1,1,0,1,0,1)
y <- c(1, NA, 1, NA)
The y array needs to be compared with x row by row: once every element of y has been compared with the next length(y) elements of x, the comparison continues on a new row. The solution I want is:
[,1] [,2] [,3] [,4]
[1,] TRUE NA TRUE NA
[2,] TRUE NA TRUE NA
[3,] FALSE NA FALSE NA
This function compares with logical equivalence:
answer <- function(x,y){
z <- x == y
print(z)
}
This returns the correct values, but as a single vector; a new row should begin after every length(y) elements (that is, after every second NA).
[1] TRUE NA TRUE NA TRUE NA TRUE NA FALSE NA FALSE NA
When I try to turn that answer into a matrix with length(y) columns (4 in this case), the output isn't correct.
answer <- function(x,y){
z <- x == y
z2 <- matrix(z, ncol = length(y))
print(z2)
}
The returned matrix is filled straight down each column:
[,1] [,2] [,3] [,4]
[1,] TRUE NA TRUE NA
[2,] NA TRUE NA FALSE
[3,] TRUE NA FALSE NA
What can I use to make the comparison through each column in a row instead of each row in a column? Can I use apply(z, 1, some built-in function) or a nested apply(apply()) function? The difficulty has been to ensure that the resulting matrix is the correct size, with the correct answer, while compensating for any size array/matrix comparison.
We can expand y to the same length as x and then do the comparison:
x == y[col(x)]
# [,1] [,2] [,3] [,4]
#[1,] TRUE NA TRUE NA
#[2,] FALSE NA TRUE NA
#[3,] TRUE NA FALSE NA
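If the comparison is by row instead (one element of y per row of x, which requires length(y) to equal nrow(x); that does not hold for the 3 x 4 example here, where the two lines below would not agree)
x == y
Or
x == y[row(x)]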
Or with sweep
sweep(x, 2, y, FUN = `==`)
# [,1] [,2] [,3] [,4]
#[1,] TRUE NA TRUE NA
#[2,] FALSE NA TRUE NA
#[3,] TRUE NA FALSE NA
data
x <- matrix(c(1,0,1,0,1,1,1,1,0,1,0,1), nrow=3, ncol=4)
y <- c(1, NA, 1, NA)
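To see what y[col(x)] does, it may help to look at the pieces. col(x) holds each cell's own column number, so indexing y with it repeats each element of y once per row. A small sketch on the data above; y_expanded, by_col and by_sweep are names chosen only for this illustration:

```r
x <- matrix(c(1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1), nrow = 3, ncol = 4)
y <- c(1, NA, 1, NA)

col(x)                     # each cell holds its own column number
y_expanded <- y[col(x)]    # plain vector of length 12, in column-major order

by_col   <- x == y_expanded          # keeps the dimensions of x
by_sweep <- sweep(x, 2, y, FUN = `==`)

# TRUE when the two approaches give the same logical matrix
isTRUE(all.equal(by_col, by_sweep))
```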
You can do
t(t(x) == y)
# [,1] [,2] [,3] [,4]
#[1,] TRUE NA TRUE NA
#[2,] FALSE NA TRUE NA
#[3,] TRUE NA FALSE NA
x <- matrix(c(1,0,1,0,1,1,1,1,0,1,0,1), nrow=3, ncol=4)
y <- c(1, NA, 1, NA)
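If x is kept as a plain vector, as in the question, the layout asked for there can probably be obtained by filling the matrix by row instead of by column. A minimal sketch, assuming length(x) is a multiple of length(y); compare_rows is a hypothetical name:

```r
x <- c(1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1)
y <- c(1, NA, 1, NA)

compare_rows <- function(x, y) {
  # x == y recycles y along x; byrow = TRUE then starts a new row
  # after every length(y) comparisons
  matrix(x == y, ncol = length(y), byrow = TRUE)
}

res <- compare_rows(x, y)
res
```

Note that this differs from the answers above, which start from x already stored as a 3 x 4 matrix filled by column and therefore pair different elements of x with each element of y.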
Given the following setup:
> vals = matrix(nrow = 3,ncol = 4)
[,1] [,2] [,3] [,4]
[1,] NA NA NA NA
[2,] NA NA NA NA
[3,] NA NA NA NA
> position = matrix(c(1,2,1, 4,3,2, NA,NA,3, NA,NA,4), nrow = 3, ncol = 4)
[,1] [,2] [,3] [,4]
[1,] 1 4 NA NA
[2,] 2 3 NA NA
[3,] 1 2 3 4
> temp = c(10, 5, 8, 6, 9, 2, 4, 3)
I'm trying to populate vals with the values held in temp. However, the values must be placed in the spots given by position. Specifically, each row in position represents a row in vals, and the values represent the column in which the value must be placed.
For example, position[2,2] = 3. Since that's position's second row, the respective value must go into vals[2,3]. The final result would be:
[,1] [,2] [,3] [,4]
[1,] 10 NA NA 5
[2,] NA 8 6 NA
[3,] 9 2 4 3
This would be straightforward with for-loops, but can it be done without them?
We can use row/column matrix indexing. Build the row index with row(position) and take the column index from 'position' itself; transpose both (t) and flatten them to vectors (c) so the pairs run in row-major order, matching the order of 'temp'. cbind the two vectors into a two-column index matrix, drop the rows that contain NA (na.omit), and assign (<-) 'temp' to the elements of 'vals' selected by that index.
vals[na.omit(cbind(c(t(row(position))), c(t(position))))] <- temp
vals
# [,1] [,2] [,3] [,4]
#[1,] 10 NA NA 5
#[2,] NA 8 6 NA
#[3,] 9 2 4 3
data
position <- structure(c(1, 2, 1, 4, 3, 2, NA, NA, 3, NA, NA, 4), .Dim = 3:4)
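The same idea can be spelled out step by step, which makes the intermediate index matrix easier to inspect. This is a sketch using the question's vals and temp; row_idx, col_idx, keep and idx are names introduced only for the illustration, and a plain !is.na() filter stands in for na.omit.

```r
vals <- matrix(nrow = 3, ncol = 4)
position <- structure(c(1, 2, 1, 4, 3, 2, NA, NA, 3, NA, NA, 4), .Dim = 3:4)
temp <- c(10, 5, 8, 6, 9, 2, 4, 3)

# Row and column indices, both read off in row-major order
row_idx <- c(t(row(position)))
col_idx <- c(t(position))

# Two-column (row, column) index matrix without the NA slots
keep <- !is.na(col_idx)
idx <- cbind(row_idx[keep], col_idx[keep])

# Each row of idx addresses one cell of vals
vals[idx] <- temp
vals
```

The approach assumes that temp has exactly as many values as there are non-NA entries in position, and that those values are listed row by row.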
I created an empty matrix with matrix(). How can I test whether a given matrix is empty? I know that is.na(matrix()) is TRUE, but for a matrix with more rows or columns is.na() returns a whole matrix of values rather than a single answer.
By empty I mean that every element is NA or NULL.
I'm guessing that you are just looking for all. Here's a small example:
M1 <- matrix(NA, ncol = 3, nrow = 3)
M1
# [,1] [,2] [,3]
# [1,] NA NA NA
# [2,] NA NA NA
# [3,] NA NA NA
M2 <- matrix(c(1, rep(NA, 8)), ncol = 3, nrow = 3)
M2
# [,1] [,2] [,3]
# [1,] 1 NA NA
# [2,] NA NA NA
# [3,] NA NA NA
all(is.na(M1))
# [1] TRUE
all(is.na(M2))
# [1] FALSE
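If the check is needed in several places, it could be wrapped in a small helper. The sketch below is one possible reading of "empty" (NULL, zero-length, or all NA); is_empty_matrix is a hypothetical name, not a base R function.

```r
# Hypothetical helper: TRUE for NULL, for zero-length input,
# or when every element is NA
is_empty_matrix <- function(m) {
  is.null(m) || all(is.na(m))
}

is_empty_matrix(matrix())                            # a single NA
is_empty_matrix(matrix(NA, nrow = 3, ncol = 3))      # all NA
is_empty_matrix(matrix(c(1, rep(NA, 8)), nrow = 3))  # one real value
is_empty_matrix(NULL)
```

Because all() of a zero-length logical vector is TRUE, a matrix with zero rows or columns also counts as empty under this definition, and the same function works for higher-dimensional arrays.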
I want to do two things:
How do I remove rows that contain NA/NaN/Inf?
How do I set the value of a data point from NA/NaN/Inf to 0?
So far I have tried the following for NA values, but I get an error and a warning.
> eg <- data[rowSums(is.na(data)) == 0,]
Error in rowSums(is.na(data)) :
'x' must be an array of at least two dimensions
In addition: Warning message:
In is.na(data) : is.na() applied to non-(list or vector) of type 'closure'
I guess I'll throw my hat into the ring with my preferred methods:
# sample data
m <- matrix(c(1,2,NA,NaN,1,Inf,-1,1,9,3),5)
# remove all rows with non-finite values
m[!rowSums(!is.finite(m)),]
# replace all non-finite values with 0
m[!is.finite(m)] <- 0
library(functional)
m[apply(m, 1, Compose(is.finite, all)),]
Demonstration:
m <- matrix(c(1,2,3,NA,4,5), 3)
m
## [,1] [,2]
## [1,] 1 NA
## [2,] 2 4
## [3,] 3 5
m[apply(m, 1, Compose(is.finite, all)),]
## [,1] [,2]
## [1,] 2 4
## [2,] 3 5
Note: Compose(is.finite, all) is equivalent to function(x) all(is.finite(x))
To set the values to 0, use matrix indexing:
m[!is.finite(m)] <- 0
m
## [,1] [,2]
## [1,] 1 0
## [2,] 2 4
## [3,] 3 5
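For anyone who would rather not load an extra package, the row filter can be written with an anonymous function, as the note on Compose suggests. A minimal base-R sketch on the same demonstration matrix (keep and m_clean are illustrative names); drop = FALSE is an addition here so that the result stays a matrix even if only one row survives:

```r
m <- matrix(c(1, 2, 3, NA, 4, 5), 3)

# Keep rows whose entries are all finite
keep <- apply(m, 1, function(r) all(is.finite(r)))

# drop = FALSE preserves the matrix shape for any number of kept rows
m_clean <- m[keep, , drop = FALSE]
m_clean
```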
NaRV.omit(x) is my preferred option for question 1. The mnemonic NaRV stands for "not a regular value".
require(IDPmisc)
m <- matrix(c(1,2,3,NA,5, NaN, 7, 8, 9, Inf, 11, 12, -Inf, 14, 15), 5)
> m
[,1] [,2] [,3]
[1,] 1 NaN 11
[2,] 2 7 12
[3,] 3 8 -Inf
[4,] NA 9 14
[5,] 5 Inf 15
> NaRV.omit(m)
[,1] [,2] [,3]
[1,] 2 7 12
attr(,"na.action")
[1] 1 3 4 5
attr(,"class")
[1] "omit"
Just another way (for the first question):
m <- structure(c(1, 2, 3, NA, 4, 5, Inf, 5, 6, NaN, 7, 8),
.Dim = c(4L, 3L))
# [,1] [,2] [,3]
# [1,] 1 4 6
# [2,] 2 5 NaN
# [3,] 3 Inf 7
# [4,] NA 5 8
m[complete.cases(m * 0), , drop=FALSE]
# [,1] [,2] [,3]
# [1,] 1 4 6
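The m * 0 trick works because multiplying by zero turns every non-finite value into something is.na() recognises: finite numbers become 0, NA stays NA, and Inf * 0 is NaN. A short sketch to illustrate, using a made-up vector vals and the matrix above:

```r
vals <- c(3, -2.5, NA, NaN, Inf, -Inf)

vals * 0          # finite values become 0; the rest become NA or NaN
is.na(vals * 0)   # flags exactly the non-finite entries

m <- structure(c(1, 2, 3, NA, 4, 5, Inf, 5, 6, NaN, 7, 8),
               .Dim = c(4L, 3L))

# Same row selection as counting non-finite values per row
same_rows <- all(complete.cases(m * 0) == (rowSums(!is.finite(m)) == 0))
same_rows
```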
I can't think of anything other than Matthew's answer for the second part.
I have a vector with numerical and NA elements.
For example,
data<-c(.4, -1, 1, NA, 8, NA, -.4)
data[complete.cases(data), ]
But what's the function to separate them into different vectors so I can compare them using graphs such as a boxplot and ECDF?
It's not clear what problem you are trying to solve. complete.cases creates a logical vector for selection (if you use it correctly.) You can negate it to get the other ones. You cannot address a vector as you attempted with [ , ] but if 'data' were a dataframe (or a matrix) that would have worked.
data<-c(.4, -1, 1, NA, 8, NA, -.4)
data[complete.cases(data) ]
#[1] 0.4 -1.0 1.0 8.0 -0.4
data[!complete.cases(data) ]
#[1] NA NA
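If the goal is to keep the two groups as separate vectors for plotting, the selections above can simply be stored. A rough sketch, assuming the numeric values are what should go into the boxplot and ECDF; observed, missing and Fn are names chosen for the illustration:

```r
data <- c(.4, -1, 1, NA, 8, NA, -.4)

observed <- data[complete.cases(data)]    # the five numeric values
missing  <- data[!complete.cases(data)]   # the NA entries

Fn <- ecdf(observed)   # empirical CDF, returned as a function
Fn(1)                  # proportion of observed values <= 1

# In an interactive session the plots could then be drawn with:
# boxplot(observed)
# plot(Fn)
```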
If one is trying to select the non-NA items it might be easier to use !is.na(data) as the selection vector. The following test case shows that complete.cases works with matrices as well as data.frames:
> dat <- matrix( sample(c(1,2,NA), 12, rep=TRUE), 3)
> dat
[,1] [,2] [,3] [,4]
[1,] 1 1 1 1
[2,] NA NA 2 2
[3,] 1 NA 2 1
> dat[ complete.cases(dat), ]
[1] 1 1 1 1
> dat[ ! complete.cases(dat), ]
[,1] [,2] [,3] [,4]
[1,] NA NA 2 2
[2,] 1 NA 2 1
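Note that the first selection above came back as a plain vector because only one row was complete. If a matrix is wanted regardless of how many rows survive, drop = FALSE can be added. A small sketch, with dat rebuilt by hand to match the printed sample (the original came from sample(), so the exact values are an assumption):

```r
dat <- matrix(c(1, NA, 1,  1, NA, NA,  1, 2, 2,  1, 2, 1), nrow = 3)

as_vector <- dat[complete.cases(dat), ]                 # dimensions dropped
as_matrix <- dat[complete.cases(dat), , drop = FALSE]   # stays a 1 x 4 matrix

dim(as_vector)
dim(as_matrix)
```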