Screen which element is within a range in R [duplicate] - r

I would like to ask whether there is a way to check which elements of a vector, for example
c(13, 20, 1, 5, 40, 15, 6, 8)
are within a range, e.g. > 5 and <= 30, giving output like the following:
[1] TRUE TRUE FALSE FALSE TRUE TRUE TRUE

Isn't it just this?
x <- c(13, 20, 1, 5, 40, 15, 6, 8)
x > 5 & x <= 30
#[1] TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE
We can also use between from dplyr or data.table. Both include the lower and upper boundaries (lower <= x & x <= upper), so for whole numbers we can raise the lower bound to 6:
dplyr::between(x, 6, 30)
#[1] TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE
Or
data.table::between(x, 6, 30)
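Moving a bound by one only works for whole numbers. For non-integer data, a small base R helper can state explicitly which end of the range is open. This is a minimal sketch; the name in_range is hypothetical and not part of any package.

```r
# Hypothetical helper: TRUE where lower < x <= upper (open below, closed above).
in_range <- function(x, lower, upper) {
  x > lower & x <= upper
}

x <- c(13, 20, 1, 5, 40, 15, 6, 8)
in_range(x, 5, 30)
#[1]  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE

# Non-integer values near the boundaries need no adjustment of the bounds
in_range(c(5.5, 30.5), 5, 30)
#[1]  TRUE FALSE
```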

First of all, you omitted a FALSE in your expected result.
But you can achieve that by doing this:
x <- c(13, 20, 1, 5, 40, 15, 6, 8) # named x rather than c, to avoid confusion with the base function c()
a <- x > 5 & x <= 30
print(a)

Related

How to validate a condition in a for loop

I am studying R and Data Science. In a question, I need to check whether each number in an array is even.
My code:
vetor <- list(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
for (i in vetor) {
if (i %% 2 == 0) {
print(i)
}
}
But the result is a warning message:
Warning message:
In if (i%%2 == 0) { :
a condição tem comprimento > 1 e somente o primeiro elemento será usado
Translating:
The condition has a length > 1 and only the first element will be used.
What I need is for each element of the list to be checked and, if it is even, printed.
In R, how can I do it?
The wrapper for list is not needed
vetor <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
running the OP's code
for (i in vetor) {
if (i %% 2 == 0) {
print(i)
}
}
#[1] 2
#[1] 4
#[1] 6
#[1] 8
#[1] 10
These are vectorized operations. We don't need a loop
vetor[vetor %% 2 == 0]
#[1] 2 4 6 8 10
When we wrap the vector with list, it returns a list of length 1 whose single element is the whole vector. The for loop in R is a for-each loop, not the traditional counter-controlled three-part loop, so i will be the whole vetor vector.
Because if/else expects a single element and not a vector of length greater than 1, it results in the warning message (and, from R 4.2.0 onward, an error).
Or, if we want to store it in a list with each element of length 1, use as.list:
vetor <- as.list(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
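The difference between list() and as.list() can be checked directly. The sketch below uses hypothetical variable names (wrapped, split_up, evens) and a shorter vector; Filter is a base R function that keeps the list elements for which the predicate returns TRUE.

```r
wrapped  <- list(c(1, 2, 3, 4, 5, 6))      # one element holding the whole vector
split_up <- as.list(c(1, 2, 3, 4, 5, 6))   # six elements, each of length 1

length(wrapped)   # 1
length(split_up)  # 6

# Keep only the list elements that are even numbers
evens <- Filter(function(v) v %% 2 == 0, split_up)
unlist(evens)
#[1] 2 4 6
```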
Let's break down your code and dig into each step to see what happened ...
You should notice that vetor is a list, i.e.,
> vetor
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10
In this case, the iterator i in vetor denotes the array in vetor, which can be seen from
> for (i in vetor) {
+ str(i)
+ }
num [1:10] 1 2 3 4 5 6 7 8 9 10
Therefore, when you have condition i%%2==0, you are indeed running
> for (i in vetor) {
+ print(i %% 2 == 0)
+ }
[1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
which is not a single logical value, as the condition of an if ... else ... statement must be. That is the reason you got the warning.
Regarding the workaround, you can refer to @akrun's answer, which should help you a lot.

unexpected results when comparing Biostrings subsequences using the identical function

I'm checking if a sequence is present at the beginning and at the end of a longer sequence. I considered using identical but this gives me a surprising result:
library(Biostrings)
EcoRI <- DNAString("GAATTC")
myseq <- DNAString("GAATTCGGGGAAAATTTTCCCCGAATTC")
EcoRI
# 6-letter "DNAString" instance
#seq: GAATTC
subseq(myseq, 1, 6)
# 6-letter "DNAString" instance
#seq: GAATTC
subseq(myseq, 23, 28)
# 6-letter "DNAString" instance
#seq: GAATTC
identical(EcoRI, subseq(myseq, 1, 6))
#TRUE
identical(EcoRI, subseq(myseq, 23, 28))
#FALSE
identical(subseq(myseq, 1, 6), subseq(myseq, 23, 28))
#FALSE
An easy fix is to use:
identical(toString(EcoRI), toString(subseq(myseq, 23, 28)))
# TRUE
But I don't understand why identical on the DNAString objects returns FALSE sometimes.
Does identical also compare the offset attributes?
attributes(EcoRI)$offset
#[1] 0
attributes(subseq(myseq, 1, 6))$offset
#[1] 0
attributes(subseq(myseq, 23, 28))$offset
#[1] 22

For values in list above a number set boolean (true, false) in R

Suppose, I have a list
a <- c(3, 5, 2, 7, 9)
and I want to do a vector operation, something like:
a_greater_than_five <- a[a>5]
but I want the result to be something like this:
a_greater_than_five <- c(FALSE, FALSE, FALSE, TRUE, TRUE)
Something similar to numpy in python: Check if all values in list are greater than a certain number
> a <- c(3, 5, 2, 7, 9)
> Result <- a>5
> Result
[1] FALSE FALSE FALSE TRUE TRUE
If
a <- c(3, 5, 2, 7, 9)
then simply
a > 5
# [1] FALSE FALSE FALSE TRUE TRUE
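Once the comparison has produced a logical vector, base R can summarise it in several ways. A short sketch (the name above is just an illustrative variable):

```r
a <- c(3, 5, 2, 7, 9)
above <- a > 5

all(above)    # FALSE: not every value exceeds 5
any(above)    # TRUE: at least one value exceeds 5
sum(above)    # 2: each TRUE counts as 1
which(above)  # 4 5: positions of the values above 5
```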

Counting number of occurrences in R

I'm learning R and I face a problem I don't know how to solve. I have an input subset similar to the following, with a client_id and 7 integers:
(client_id, 10, 8, -5, 8, 1, -23, 12)
I would like to return the same vector but with additional fields. The first would contain the number of values that are negative. For the above example, the result would be:
(client_id, 10, 8, -5, 8, 1, -23, 12, 2)
because there are just 2 negative numbers in the 7 integers.
A second field would be the number of values that are greater than 0 and less than 10:
(client_id, 10, 8, -5, 8, 1, -23, 12, 2, 3)
because there are 3 values between 0 and 10.
Can anyone help me with this issue?
Thanks a lot.
How about this:
> t <- c("client_id", 10, 8, -5, 8, 1, -23, 12) # create vector
> nums <- as.numeric(t[2:length(t)]) # get numbers only from vector
> sum(nums < 0) # counts the numbers less than 0
[1] 2
> sum(nums > 0 & nums < 10) # counts the number > 0 and < 10
[1] 3
> t <- append(t,sum(nums < 0)) # append to original vector
> t <- append(t,sum(nums > 0 & nums < 10))
> t
[1] "client_id" "10" "8" "-5" "8" "1" "-23" "12" "2" "3"
>
If your vector is x (assuming client_id is numeric, so that x is a numeric vector), it would be
x <- c(x, length(which(x[-1]<0)), length(which(x[-1]>0 & x[-1]<10)))
In reply to @Nishanth: you can also do
x <- c(x, sum(x[-1]<0), sum(x[-1]>0 & x[-1]<10))
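Mixing a character id with numbers in one vector coerces everything to character, as the first answer's output shows. One possible alternative, sketched here under the assumption that the id and the seven integers can be stored separately, keeps the values numeric and collects the counts in a list. The names values, n_negative, n_between and record are hypothetical.

```r
client_id <- "client_id"                  # placeholder id, as in the question
values <- c(10, 8, -5, 8, 1, -23, 12)     # stays numeric, no coercion

n_negative <- sum(values < 0)                 # 2
n_between  <- sum(values > 0 & values < 10)   # 3 (elementwise &, not &&)

record <- list(client_id = client_id, values = values,
               n_negative = n_negative, n_between = n_between)
str(record)
```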

Check the element of a vector is between the element of second vector in R [duplicate]

I have two vectors. I want to check whether the first element of the first vector is between the first and second elements of the second vector, then whether the second element of the first vector is between the third and fourth elements of the second vector, and so on. How can I do this in R?
For example, if we have two vectors
a = c(1.5, 2, 3.5)
b = c(1, 2, 3, 5, 3, 8)
the final result in R should be TRUE for 1.5 and 3.5, and FALSE for 2.
x <- c(1.5,3.5,3.5,3.5,4)
y <- 1:5
x > y & x < c(y[-1],NA)
#[1] TRUE FALSE TRUE FALSE FALSE
You need to take care of the vector lengths and think about what you want the result to be for the last element of x.
More robust solution:
x <- c(1.5,3.5,3.5,3.5,4)
findInterval(x,y) == seq_along(x)
#[1] TRUE FALSE TRUE FALSE FALSE
x1 <- c(1.5,3.5)
findInterval(x1,y) == seq_along(x1)
#[1] TRUE FALSE
x2 <- c(1.5,3.5,1:5+0.5)
findInterval(x2,y) == seq_along(x2)
#[1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE
Here's one way.
s <- seq_along(a)
b[s] < a[s] & a[s] < b[s+1]
# [1] TRUE FALSE TRUE
Maybe this is not an ideal and fastest solution, but it works.
a <- rnorm(99)
b <- rnorm(100)
m <- cbind(b[-length(b)], b[-1])
a > m[,1] & a < m[,2]
You should check the lengths of both initial vectors.
Here is one-line solution:
sapply(1:length(a), function(i) {a[i] > b[i] & a[i] < b[i+1]})
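The answers above compare each element with consecutive elements of the second vector. The question as worded pairs the elements of b instead (first with second, third with fourth, and so on). A minimal sketch of that reading, assuming b has exactly twice as many elements as a and that the bounds are exclusive; lower, upper and result are hypothetical names.

```r
a <- c(1.5, 2, 3.5)
b <- c(1, 2, 3, 5, 3, 8)
stopifnot(length(b) == 2 * length(a))

# Logical index vectors are recycled, which picks out every other element
lower <- b[c(TRUE, FALSE)]   # 1st, 3rd, 5th elements: 1 3 3
upper <- b[c(FALSE, TRUE)]   # 2nd, 4th, 6th elements: 2 5 8

result <- a > lower & a < upper
result
#[1]  TRUE FALSE  TRUE
```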
