How to ignore NA in ifelse statement - r

I came to R from SAS, where a numeric missing value compares as smaller than any number (effectively negative infinity). So we can just say:
positiveA = A > 0;
In R, I have to be verbose like:
positiveA <- ifelse(is.na(A),0, ifelse(A > 0, 1, 0))
I find this syntax hard to read. Is there any way I can modify the ifelse function to treat NA as a special value that is always false in every comparison? If not, treating NA as -Inf would work too.
Similarly, I would like to set NA to '' (blank) in ifelse statements for character variables.
Thanks.

This syntax is easier to read:
x <- c(NA, 1, 0, -1)
(x > 0) & (!is.na(x))
# [1] FALSE TRUE FALSE FALSE
(The outer parentheses aren't necessary, but will make the statement easier to read for almost anyone other than the machine.)
Edit:
## If you want 0s and 1s
((x > 0) & (!is.na(x))) * 1
# [1] 0 1 0 0
Finally, you can make the whole thing into a function:
isPos <- function(x) {
  ((x > 0) & (!is.na(x))) * 1
}
isPos(x)
# [1] 0 1 0 0
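If you want one reusable wrapper for any condition, a small helper that turns NA into FALSE is one option. This is only a sketch: na_false is a made-up name, not a base R function, and the data are invented for illustration. The character case from the question is handled by overwriting NA with an empty string.

```r
# Hypothetical helper (not part of base R): treat NA in a condition as FALSE.
na_false <- function(cond) {
  cond & !is.na(cond)  # NA & FALSE is FALSE, so the NA entries drop out
}

A <- c(NA, 2, 0, -3)
positiveA <- as.integer(na_false(A > 0))
positiveA
# [1] 0 1 0 0

# For character vectors, replace NA with a blank string directly.
s <- c("a", NA, "c")
s[is.na(s)] <- ""
```

The helper works on the result of any comparison, so the NA handling is written once instead of inside every ifelse call.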

Replacing an NA value with zero seems rather strange behaviour to expect. R treats NA values as missing (although far behind the scenes, where you should never need to go, an integer NA is stored as a very large negative number).
All you need to do is A > 0, or as.numeric(A > 0) if you want 0 and 1 rather than TRUE and FALSE.
# some dummy data
A <- seq(-1, 1, length.out = 11)
# add NA value as second value
A[2] <- NA
positiveA <- A>0
positiveA
[1] FALSE NA FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
as.numeric(positiveA)
[1] 0 NA 0 0 0 0 1 1 1 1 1
Note that ifelse(A > 0, 1, 0) would also work.
The NA values are "retained", or dealt with appropriately. R is sensible here.
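To see how R carries the NA along, and how to drop it only at the point where you summarise, here is a minimal sketch (the vector is made up for illustration):

```r
NA > 0        # NA: the comparison is unknown
NA & FALSE    # FALSE: the result is FALSE whatever the missing value is
NA | TRUE     # TRUE: the result is TRUE whatever the missing value is

A <- c(-1, NA, 1)
n_positive <- sum(A > 0, na.rm = TRUE)  # count positives, skipping the NA
n_positive
# [1] 1
```

The second line is also why combining a comparison with !is.na(x) yields FALSE rather than NA for missing entries.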

Try this:
positiveA <- ifelse(!is.na(A) & A > 0, 1, 0)

If you are working with integers, you can use %in%.
For example, if your numbers can go up to 2:
test <- c(NA, 2, 1, 0, -1)
Other people have suggested using
(test > 0) & (!is.na(test))
or
ifelse(!is.na(test) & test > 0, 1, 0)
My solution is simpler and gives the same TRUE/FALSE result as the first of these:
test %in% 1:2
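One caveat worth checking before relying on this: %in% only matches the values you enumerate, so it stops agreeing with the comparison as soon as a value falls outside that set. A small sketch with made-up data containing a non-integer:

```r
test2 <- c(NA, 1.5, 2, 0, -1)   # made-up data with a non-integer value

in_set   <- test2 %in% 1:2                  # matches only the listed values
positive <- (test2 > 0) & (!is.na(test2))   # matches anything above zero

in_set
# [1] FALSE FALSE  TRUE FALSE FALSE
positive
# [1] FALSE  TRUE  TRUE FALSE FALSE
```

Both forms return FALSE for the NA, but only the comparison picks up 1.5.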

You can use the missing argument of if_else_ from hablar:
library(hablar)
x <- c(NA, 1, 0, -1)
if_else_(x > 0, TRUE, FALSE, missing = FALSE)
which gives you
[1] FALSE TRUE FALSE FALSE

Related

The logical comparison inside the loop crashes when it encounters zero

Let me explain my problem with code:
example_1 <- sample(-100:100, 100) # simple sample for my question
example_1[30] <- NA # changed one of them to NA
not_equal_zero <- matrix(NA, 100, 1) # matrix to find out if there is any zeros (1 for TRUE, 0 for FALSE)
for (i in 1:100) { # check each observation if it is 0 assign 1 to "not equal zero matrix"
  if (example_1[i] == 0) {
    not_equal_zero[i] <- 1
  } else {
    not_equal_zero[i] <- 0
  }
}
When i = 30 it hits the NA and terminates with an error. I am not only checking against zero; I also have other special values. What is the solution to this problem?
2 == 0 # it gives FALSE
0 == 0 # it gives TRUE
NA == 0 # it gives NA but i need FALSE
NA gives NA when compared to anything. You probably want to replace:
if (example_1[i] == 0)
with:
if (!is.na(example_1[i]) && example_1[i] == 0)
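The same guard also works without the loop, because both is.na and == are vectorised. A minimal sketch, assuming the data from the question (the seed and the name is_zero are my own additions):

```r
set.seed(42)                         # seed chosen only for reproducibility
example_1 <- sample(-100:100, 100)
example_1[30] <- NA

# 1 where the value is zero, 0 otherwise; the NA entry becomes 0, not NA.
# Note the single &, which works element by element (&& is for scalars).
is_zero <- as.integer(!is.na(example_1) & example_1 == 0)
```

For a single special value, example_1 %in% 0 gives the same logical pattern, since %in% never returns NA.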

Cut elements from the beginning and end of an R vector

For time series analysis I handle data that often contains leading and trailing zero elements. In this example, there are 3 zeros at the beginning and 2 at the end. I want to get rid of these elements, and filter for the contents in the middle (that also may contain zeros)
vec <- c(0, 0, 0, 1, 2, 0, 3, 4, 0, 0)
I did this by looping from the beginning and end, and masking out the unwanted elements.
mask <- rep(TRUE, length(vec))
# from the beginning
i <- 1
while (i <= length(vec) && vec[i] == 0) {
  mask[i] <- FALSE
  i <- i + 1
}
# from the end
i <- length(vec)
while (i >= 1 && vec[i] == 0) {
  mask[i] <- FALSE
  i <- i - 1
}
cleanvec <- vec[mask]
cleanvec
[1] 1 2 0 3 4
This works, but I wonder if there is a more efficient way to do this, avoiding the loops.
vec[ min(which(vec != 0)) : max(which(vec != 0)) ]
Basically the which(vec != 0) part gives the positions of the numbers that are different from 0, and then you take the min and max of them.
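One edge case to keep in mind: if the vector is all zeros, which() returns nothing and min/max fall over with a warning. A hedged sketch that wraps the same idea in a function with a guard (trim_zeros is a hypothetical name, not an existing function):

```r
# Hypothetical helper: drop leading and trailing zeros, keep interior ones.
trim_zeros <- function(v) {
  nz <- which(v != 0)                    # positions of the non-zero elements
  if (length(nz) == 0) return(v[0])      # all zeros: return an empty vector
  v[nz[1]:nz[length(nz)]]                # which() is sorted, so first/last = min/max
}

trim_zeros(c(0, 0, 0, 1, 2, 0, 3, 4, 0, 0))
# [1] 1 2 0 3 4
```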
We could use the range and Reduce to get the sequence
vec[Reduce(`:`, range(which(vec != 0)))]
#[1] 1 2 0 3 4
Take the cumsum forward and backward of abs(vec) and keep only elements > 0. If it were known that all elements of vec were non-negative, as in the question, then we could optionally omit abs.
vec[cumsum(abs(vec)) > 0 & rev(cumsum(rev(abs(vec)))) > 0]
## [1] 1 2 0 3 4
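Splitting the cumsum one-liner into its two masks may make it easier to follow. This is just the same expression broken into steps; the names fwd and bwd are mine:

```r
vec <- c(0, 0, 0, 1, 2, 0, 3, 4, 0, 0)

fwd <- cumsum(abs(vec)) > 0             # FALSE until the first non-zero element
bwd <- rev(cumsum(rev(abs(vec)))) > 0   # FALSE after the last non-zero element

cleanvec <- vec[fwd & bwd]              # keep positions inside both ranges
cleanvec
# [1] 1 2 0 3 4
```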

R get index satisfying the condition [duplicate]

I am looking for an expression that returns the index of the first element of a vector satisfying a condition.
For example-
I have a vector b = c(0.1, 0.2, 0.7, 0.9)
I want to know the first index of b for which say b >0.65. In this case the answer should be 3
I tried which.min(subset(b, b > 0.65))
But this gives me 1 instead of 3.
Please help
Use which and take the first element of the result:
which(b > 0.65)[1]
#[1] 3
Be careful: which.max gives a misleading answer if the condition is never met. It returns 1 instead of NA:
> a <- c(1, 2, 3, 2, 5)
> a >= 6
[1] FALSE FALSE FALSE FALSE FALSE
> which(a >= 6)[1]
[1] NA # desirable
> which.max(a >= 6)
[1] 1 # not desirable
Why? When all elements are equal, which.max returns 1:
> b <- c(2, 2, 2, 2, 2)
> which.max(b)
[1] 1
Note: FALSE < TRUE
You may use which.max:
which.max(b > 0.65)
# [1] 3
From ?which.max: "For a logical vector x, [...] which.max(x) return[s] the index of the first [...] TRUE".
b > 0.65
# [1] FALSE FALSE TRUE TRUE
You should also have a look at the result of your code subset(b, b > 0.65) to see why it can't give you the desired result.
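Spelled out, the original attempt fails because subset() throws away the positions, so which.min indexes into the shorter vector. The sketch below shows that, plus match(TRUE, ...) as another base R way to get the first hit that returns NA when nothing matches:

```r
b <- c(0.1, 0.2, 0.7, 0.9)

kept <- subset(b, b > 0.65)     # c(0.7, 0.9): original positions are lost
idx_in_subset <- which.min(kept)
idx_in_subset                   # 1, an index into kept, not into b

first_hit <- match(TRUE, b > 0.65)
first_hit
# [1] 3

no_hit <- match(TRUE, b > 1)    # condition never met
no_hit
# [1] NA
```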

How to prevent NULL from killing my ifelse vectorising?

I would like to vectorize a function like this:
if (i > 0)
  fun1(a[i])
else
  fun2(i)
where fun1, fun2 are already vectorized, and a is a vector.
My attempt:
ifelse(i > 0, fun1(a[i]), fun2(i))
However it is wrong!
> a <- c(1,2,3,4,5)
> i<-c(0,1,2,3)
> ifelse(i > 0, a[i], 0)
[1] 0 2 3 1 # expected 0 1 2 3
Do I have to use sapply? Is there any simple alternative that works?
There is nothing wrong. a[i] evaluates to c(1,2,3) since a[0] is ignored. This is recycled to c(1,2,3,1) to match length(i). So you get 0 2 3 1 from your ifelse because the first element of i > 0 is FALSE, and the others come from the recycled a[i].
There is a workaround, though: you can replace non-positive indices with NA:
> ifelse(i > 0, a[ifelse(i > 0, i, NA)], 0)
[1] 0 1 2 3
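Another way around the recycling, if you prefer to avoid ifelse altogether, is to fill a result vector by logical index so that each function only sees its own elements. A sketch under assumptions: fun1 and fun2 below are stand-ins I made up, since the question does not show the real ones.

```r
a <- c(1, 2, 3, 4, 5)
i <- c(0, 1, 2, 3)

a[i]                                 # 1 2 3: the zero index is silently dropped

# Stand-in functions (assumptions, not from the question).
fun1 <- function(v) v
fun2 <- function(k) k * 0

pos <- i > 0
out <- numeric(length(i))
out[pos]  <- fun1(a[i[pos]])         # positive indices only, lengths line up
out[!pos] <- fun2(i[!pos])           # everything else
out
# [1] 0 1 2 3
```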

Creation of a specific vector without loop or recursion in R

I've got a first vector, let's say x, that consists only of 1's and -1's. Then, I have a second vector y that consists of 1's, -1's, and zeros. Now, I'd like to create a vector z that contains a 1 at index i if x[i] equals 1 and a 1 exists in y among the n preceding elements (y[(i-n):i])...
More formally: z <- ifelse(x == 1 && 1 %in% y[(index(y)-n):index(y)],1,0)
I'm looking to create such a vector in R without looping or recursion. The expression above does not work because y[(index(y)-n):index(y)] is not evaluated element by element.
Thanks a lot for your support
Here's an approach that uses the cumsum function to test for the number of ones that have been seen so far. If the number of ones at position i is larger than the number of ones at position i-n, then the condition on the right will be satisfied.
## Generate some random y's.
> y <- sample(-1:1, 25, replace=TRUE)
> y
[1] 0 1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 0 0 -1 -1 -1 1 -1 1 1 0 0 0 1
> n <- 3
## Compute number of ones seen at each position.
> cs <- cumsum(ifelse(y == 1, 1, 0))
> lagged.cs <- c(rep(0, n), cs[1:(length(cs)-n)])
> (cs - lagged.cs) > 0
[1] FALSE TRUE TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE FALSE
[25] TRUE
You could use lapply like this. It is essentially a pretty way to write a loop, and I'm not sure whether it will be faster (it may or may not).
y1 <- unlist(lapply(seq_along(x), function(i) 1 %in% y[max(1, i - n):i]))
z <- as.numeric(x==1) * as.numeric(y1)
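To tie the two answers together, here is a sketch that builds z from the cumsum idea and checks it against the element-by-element version. Note one assumption: the cumsum answer as written lags by n, which covers y[(i-n+1):i]; to cover the question's window y[(i-n):i] the sketch lags by n + 1 instead. The seed and the random x and y are made up, and the vector must be longer than n + 1.

```r
set.seed(1)                                    # only for reproducibility
x <- sample(c(-1, 1), 25, replace = TRUE)
y <- sample(-1:1, 25, replace = TRUE)
n <- 3

# Vectorised: number of ones in y[(i-n):i] via a lagged cumulative sum.
cs <- cumsum(y == 1)
lagged_cs <- c(rep(0, n + 1), head(cs, -(n + 1)))
z_cumsum <- as.numeric(x == 1 & (cs - lagged_cs) > 0)

# Element by element, for comparison.
y1 <- vapply(seq_along(x), function(i) 1 %in% y[max(1, i - n):i], logical(1))
z_loop <- as.numeric(x == 1) * as.numeric(y1)

identical(z_cumsum, z_loop)
# [1] TRUE
```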
