Check if a number is between two others - r

I am looking for a function that verifies if a number is between two other numbers. I also need to control if I want a strict comparison (a
I know the function between() in dplyr. Yet, I have to know the upper and lower numbers.
MyNumber = 8
First = 2
Second = 10
# This will return TRUE
between(MyNumber, lower = First, upper = Second)
# But this will return FALSE
between(MyNumber, lower = Second, upper = First)
# This will return TRUE. I want it to return FALSE
First = 8
between(MyNumber, lower = First, upper = Second)
I need a function that returns TRUE no matter what is the order.

Something like:
between2 <- function(number,bounds) { number > min(bounds) & number < max(bounds)}
between2(8, c(2,10))
[1] TRUE
between2(8, c(10,2))
[1] TRUE
This function also deals with your added condition
between2(8,c(8,10))
[1] FALSE

You could do it with a simple arithmetics:
between <- function(number, first, second) { (first - number) * (second - number) < 0 }
Here are some example outputs:
> between(8, 2, 10)
[1] TRUE
> between(8, 10, 2)
[1] TRUE
> between(8, 10, 12)
[1] FALSE
> between(8, 1, 2)
[1] FALSE

You could use %in% with the : function, once you now first and last:
first <- 2
last <- 10
number <- 8
number %in% first:last
[1] TRUE
first <- 10
last <- 2
number <- 8
number %in% first:last
[1] TRUE
first <- 10
last <- 12
number <- 8
number %in% first:last
[1] FALSE
first <- 12
last <- 10
number <- 8
number %in% first:last
[1] FALSE
In a function, and strict lets you consider or not strict comparison:
my_between <- function(n, f, l, strict = FALSE) {
if (!strict) {
n %in% f:l # if strict == FALSE (default)
} else {
n %in% (f+1):(l-1) # if strict == TRUE
}
}
my_between(8, 2, 10)

What's wrong with
f_between <- function (num, L, R) num>=min(L,R) & num<=max(L,R)
f_between(8, 2, 10)
#[1] TRUE
f_between(6, 6, 10)
#[1] TRUE
f_between(2, -10, -2)
#[1] FALSE
f_between(3, 5, 7)
#[1] FALSE

Related

For loop with variables in quotation mark [duplicate]

What is the difference between = and ==? I have found cases where the double equal sign will allow my script to run while one equal sign produces an error message. When should I use == instead of =?
It depends on context as to what = means. == is always for testing equality.
= can be
in most cases used as a drop-in replacement for <-, the assignment operator.
> x = 10
> x
[1] 10
used as the separator for key-value pairs used to assign values to arguments in function calls.
rnorm(n = 10, mean = 5, sd = 2)
Because of 2. above, = can't be used as a drop-in replacement for <- in all situations. Consider
> rnorm(N <- 10, mean = 5, sd = 2)
[1] 4.893132 4.572640 3.801045 3.646863 4.522483 4.881694 6.710255 6.314024
[9] 2.268258 9.387091
> rnorm(N = 10, mean = 5, sd = 2)
Error in rnorm(N = 10, mean = 5, sd = 2) : unused argument (N = 10)
> N
[1] 10
Now some would consider rnorm(N <- 10, mean = 5, sd = 2) poor programming, but it is valid and you need to be aware of the differences between = and <- for assignment.
== is always used for equality testing:
> set.seed(10)
> logi <- sample(c(TRUE, FALSE), 10, replace = TRUE)
> logi
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> logi == TRUE
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> seq.int(1, 10) == 5L
[1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
Do be careful with == too however, as it really means exactly equal to and on a computer where floating point operations are involved you may not get the answer you were expecting. For example, from ?'==':
> x1 <- 0.5 - 0.3
> x2 <- 0.3 - 0.1
> x1 == x2 # FALSE on most machines
[1] FALSE
> identical(all.equal(x1, x2), TRUE) # TRUE everywhere
[1] TRUE
where all.equal() tests for equality allowing for a little bit of fuzziness due to loss of precision/floating point operations.
= is basically a synonym for assignment ( <- ), but most often used when passing values into functions.
== is a test for equality
In the simplest of terms, take these two lines of code for example:
1) x = 10
2) x == 10
The first line (x = 10) means "I am commanding that x is equal to 10."
The second line (x == 10) means "I am asking the question, is x equal to 10?"
If you write "x == 10" first, it will give you an error message and tell you that x is not found.
If you write "x = 10," this will store x as 10.
After you have written "x = 10", then if you write "x == 10," it will respond "TRUE", as in "yes, x does equal 10, because you made x equal to 10." But if you write "x == 11" or "x == 12" or x == anything besides 10, then it will respond that "FALSE," as in "no, x does not equal 11 or 12 or anything besides 10, because you made x equal to 10."
(=) is a Assignment operator while (==) is a Equal to operator.
(=) is used for assigning the values from right to left while (==) is used for showing equality between values.
Example:
$test = 1;
if($test=2){
echo "Hello";
}
if($test==2){
echo "world";
}
//The result is Hello because = is assigning the value to $test and the second condition is false because it check the equality of $test to the value 2.
I hope this will help.

Checking for sequences in an R vector

I'm looking for a function or operation such that if I have
A <- c(1, 2, 3, 4, 5)
and
B <- c(1, 2, 3)
and C <- c(2, 1)
I'd get a TRUE when checking whether A contained B, and FALSE when checking whether A contained C
basically, the equivalent of the %in% operator but that actually cares about the order of elements
In a perfect world, I'd be able to do this without some kind of apply statement, but I may end up having to
Well, if one's allowd to use a kind-of apply loop, then this could work:
"%seq_in%" = function(b,a) any(sapply(1:(length(a)-length(b)+1),function(i) all(a[i:(i+length(b)-1)]==b)))
(edited thanks to bug-finding by John Coleman!)
EDIT 2:
I couldn't resist trying to solve the 'non-contiguous' case, too:
# find_subseq() returns positions within vec of ordered elements of x, or stops with NA upon failing
find_subseq = function(x,vec) {
p=match(x[1],vec)
if(is.na(p)||length(x)==1){ p }
else { c(p,p+find_subseq(x[-1],vec[-seq_len(p)])) }
}
"%seq_somewhere_in%" = function(b,a) all(!is.na(find_subseq(b,a)))
Examples:
1:3 %seq_in% 1:10
[1] TRUE
c(3,1,2) %seq_in% 1:10
[1] FALSE
c(1,2,3) %seq_in% c(3,2,1,2,3)
[1] TRUE
2:1 %seq_in% c(1,2,1)
[1] TRUE
1:3 %seq_somewhere_in% c(1,10,10,2,10,10,10,3,10)
[1] TRUE
Maybe you can define a custom function subseq_check like below
subseq_check <- function(x,y) grepl(toString(y),toString(x),fixed = TRUE)
which gives
> subseq_check(A,B)
[1] TRUE
> subseq_check(A,C)
[1] FALSE
A Hard-core approach
subseq_find <- function(x,y) {
inds <- which(x == head(y,1))
if (length(inds)==0) return(FALSE)
any(sapply(inds, function(k) all(x[k:(k+length(y)-1)]==y)))
}
such that
> subseq_find(A,B)
[1] TRUE
> subseq_find(A,C)
[1] FALSE

What is the difference between = and == in R?

What is the difference between = and ==? I have found cases where the double equal sign will allow my script to run while one equal sign produces an error message. When should I use == instead of =?
It depends on context as to what = means. == is always for testing equality.
= can be
in most cases used as a drop-in replacement for <-, the assignment operator.
> x = 10
> x
[1] 10
used as the separator for key-value pairs used to assign values to arguments in function calls.
rnorm(n = 10, mean = 5, sd = 2)
Because of 2. above, = can't be used as a drop-in replacement for <- in all situations. Consider
> rnorm(N <- 10, mean = 5, sd = 2)
[1] 4.893132 4.572640 3.801045 3.646863 4.522483 4.881694 6.710255 6.314024
[9] 2.268258 9.387091
> rnorm(N = 10, mean = 5, sd = 2)
Error in rnorm(N = 10, mean = 5, sd = 2) : unused argument (N = 10)
> N
[1] 10
Now some would consider rnorm(N <- 10, mean = 5, sd = 2) poor programming, but it is valid and you need to be aware of the differences between = and <- for assignment.
== is always used for equality testing:
> set.seed(10)
> logi <- sample(c(TRUE, FALSE), 10, replace = TRUE)
> logi
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> logi == TRUE
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> seq.int(1, 10) == 5L
[1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
Do be careful with == too however, as it really means exactly equal to and on a computer where floating point operations are involved you may not get the answer you were expecting. For example, from ?'==':
> x1 <- 0.5 - 0.3
> x2 <- 0.3 - 0.1
> x1 == x2 # FALSE on most machines
[1] FALSE
> identical(all.equal(x1, x2), TRUE) # TRUE everywhere
[1] TRUE
where all.equal() tests for equality allowing for a little bit of fuzziness due to loss of precision/floating point operations.
= is basically a synonym for assignment ( <- ), but most often used when passing values into functions.
== is a test for equality
In the simplest of terms, take these two lines of code for example:
1) x = 10
2) x == 10
The first line (x = 10) means "I am commanding that x is equal to 10."
The second line (x == 10) means "I am asking the question, is x equal to 10?"
If you write "x == 10" first, it will give you an error message and tell you that x is not found.
If you write "x = 10," this will store x as 10.
After you have written "x = 10", then if you write "x == 10," it will respond "TRUE", as in "yes, x does equal 10, because you made x equal to 10." But if you write "x == 11" or "x == 12" or x == anything besides 10, then it will respond that "FALSE," as in "no, x does not equal 11 or 12 or anything besides 10, because you made x equal to 10."
(=) is a Assignment operator while (==) is a Equal to operator.
(=) is used for assigning the values from right to left while (==) is used for showing equality between values.
Example:
$test = 1;
if($test=2){
echo "Hello";
}
if($test==2){
echo "world";
}
//The result is Hello because = is assigning the value to $test and the second condition is false because it check the equality of $test to the value 2.
I hope this will help.

Check the element of a vetcor is between the element of second vector in R [duplicate]

This question already has answers here:
Check which elements of a vector is between the elements of another one in R
(4 answers)
Closed 9 years ago.
I have two vectors. I want to check the first element of first vector is between first and second element of second vector , then check the second element of first vector is between the third and forth element of the second vector ,.....How can I do this in R?
For example, If we have tow vectors
a = c(1.5, 2, 3.5)
b = c(1, 2, 3, 5, 3, 8)
the final result in R should be for 1.5 is TRUE and 3.5 is TRUE and for 2 is FALSE.
x <- c(1.5,3.5,3.5,3.5,4)
y <- 1:5
x > y & x < c(y[-1],NA)
#[1] TRUE FALSE TRUE FALSE FALSE
You need to take care of vector lengths and think about, what you want the result to be for the last element of x and of course.
More robust solution:
x <- c(1.5,3.5,3.5,3.5,4)
findInterval(x,y) == seq_along(x)
#[1] TRUE FALSE TRUE FALSE FALSE
x1 <- c(1.5,3.5)
findInterval(x1,y) == seq_along(x1)
#[1] TRUE FALSE
x2 <- c(1.5,3.5,1:5+0.5)
findInterval(x2,y) == seq_along(x2)
#[1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE
Here's one way.
s <- seq_along(a)
b[s] < a[s] & a[s] < b[s+1]
# [1] TRUE FALSE TRUE
Maybe this is not an ideal and fastest solution, but it works.
a <- rnorm(99)
b <- rnorm(100)
m <- cbind(b[-length(b)], b[-1])
a > m[,1] & a < m[,2]
You should check the lengths of both initial vectors.
Here is one-line solution:
sapply(1:length(a), function(i) {a[i] > b[i] & a[i] < b[i+1]})

How to check if a vector contains n consecutive numbers

Suppose that my vector numbers contains c(1,2,3,5,7,8), and I wish to find if it contains 3 consecutive numbers, which in this case, are 1,2,3.
numbers = c(1,2,3,5,7,8)
difference = diff(numbers) //The difference output would be 1,1,2,2,1
To verify that there are 3 consecutive integers in my numbers vector, I've tried the following with little reward.
rep(1,2)%in%difference
The above code works in this case, but if my difference vector = (1,2,2,2,1), it would still return TRUE even though the "1"s are not consecutive.
Using diff and rle, something like this should work:
result <- rle(diff(numbers))
any(result$lengths>=2 & result$values==1)
# [1] TRUE
In response to the comments below, my previous answer was specifically only testing for runs of length==3 excluding longer lengths. Changing the == to >= fixes this. It also works for runs involving negative numbers:
> numbers4 <- c(-2, -1, 0, 5, 7, 8)
> result <- rle(diff(numbers4))
> any(result$lengths>=2 & result$values==1)
[1] TRUE
Benchmarks!
I am including a couple functions of mine. Feel free to add yours. To qualify, you need to write a general function that tells if a vector x contains n or more consecutive numbers. I provide a unit test function below.
The contenders:
flodel.filter <- function(x, n, incr = 1L) {
if (n > length(x)) return(FALSE)
x <- as.integer(x)
is.cons <- tail(x, -1L) == head(x, -1L) + incr
any(filter(is.cons, rep(1L, n-1L), sides = 1, method = "convolution") == n-1L,
na.rm = TRUE)
}
flodel.which <- function(x, n, incr = 1L) {
is.cons <- tail(x, -1L) == head(x, -1L) + incr
any(diff(c(0L, which(!is.cons), length(x))) >= n)
}
thelatemail.rle <- function(x, n, incr = 1L) {
result <- rle(diff(x))
any(result$lengths >= n-1L & result$values == incr)
}
improved.rle <- function(x, n, incr = 1L) {
result <- rle(diff(as.integer(x)) == incr)
any(result$lengths >= n-1L & result$values)
}
carl.seqle <- function(x, n, incr = 1) {
if(!is.numeric(x)) x <- as.numeric(x)
z <- length(x)
y <- x[-1L] != x[-z] + incr
i <- c(which(y | is.na(y)), z)
any(diff(c(0L, i)) >= n)
}
Unit tests:
check.fun <- function(fun)
stopifnot(
fun(c(1,2,3), 3),
!fun(c(1,2), 3),
!fun(c(1), 3),
!fun(c(1,1,1,1), 3),
!fun(c(1,1,2,2), 3),
fun(c(1,1,2,3), 3)
)
check.fun(flodel.filter)
check.fun(flodel.which)
check.fun(thelatemail.rle)
check.fun(improved.rle)
check.fun(carl.seqle)
Benchmarks:
x <- sample(1:10, 1000000, replace = TRUE)
library(microbenchmark)
microbenchmark(
flodel.filter(x, 6),
flodel.which(x, 6),
thelatemail.rle(x, 6),
improved.rle(x, 6),
carl.seqle(x, 6),
times = 10)
# Unit: milliseconds
# expr min lq median uq max neval
# flodel.filter(x, 6) 96.03966 102.1383 144.9404 160.9698 177.7937 10
# flodel.which(x, 6) 131.69193 137.7081 140.5211 185.3061 189.1644 10
# thelatemail.rle(x, 6) 347.79586 353.1015 361.5744 378.3878 469.5869 10
# improved.rle(x, 6) 199.35402 200.7455 205.2737 246.9670 252.4958 10
# carl.seqle(x, 6) 213.72756 240.6023 245.2652 254.1725 259.2275 10
After diff you can check for any consecutive 1s -
numbers = c(1,2,3,5,7,8)
difference = diff(numbers) == 1
## [1] TRUE TRUE FALSE FALSE TRUE
## find alteast one consecutive TRUE
any(tail(difference, -1) &
head(difference, -1))
## [1] TRUE
It's nice to see home-grown solutions here.
Fellow Stack Overflow user Carl Witthoft posted a function he named seqle() and shared it here.
The function looks like this:
seqle <- function(x,incr=1) {
if(!is.numeric(x)) x <- as.numeric(x)
n <- length(x)
y <- x[-1L] != x[-n] + incr
i <- c(which(y|is.na(y)),n)
list(lengths = diff(c(0L,i)),
values = x[head(c(0L,i)+1L,-1L)])
}
Let's see it in action. First, some data:
numbers1 <- c(1, 2, 3, 5, 7, 8)
numbers2 <- c(-2, 2, 3, 5, 6, 7, 8)
numbers3 <- c(1, 2, 2, 2, 1, 2, 3)
Now, the output:
seqle(numbers1)
# $lengths
# [1] 3 1 2
#
# $values
# [1] 1 5 7
#
seqle(numbers2)
# $lengths
# [1] 1 2 4
#
# $values
# [1] -2 2 5
#
seqle(numbers3)
# $lengths
# [1] 2 1 1 3
#
# $values
# [1] 1 2 2 1
#
Of particular interest to you is the "lengths" in the result.
Another interesting point is the incr argument. Here we can set the increment to, say, "2" and look for sequences where the difference between the numbers are two. So, for the first vector, we would expect the sequence of 3, 5, and 7 to be detected.
Let's try:
> seqle(numbers1, incr = 2)
$lengths
[1] 1 1 3 1
$values
[1] 1 2 3 8
So, we can see that we have a sequence of 1 (1), 1 (2), 3 (3, 5, 7), and 1 (8) if we set incr = 2.
How does it work with ECII's second challenge? Seems OK!
> numbers4 <- c(-2, -1, 0, 5, 7, 8)
> seqle(numbers4)
$lengths
[1] 3 1 2
$values
[1] -2 5 7
Simple but works
numbers = c(-2,2,3,4,5,10,6,7,8)
x1<-c(diff(numbers),0)
x2<-c(0,diff(numbers[-1]),0)
x3<-c(0,diff(numbers[c(-1,-2)]),0,0)
rbind(x1,x2,x3)
colSums(rbind(x1,x2,x3) )==3 #Returns TRUE or FALSE where in the vector the consecutive intervals triplet takes place
[1] FALSE TRUE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
sum(colSums(rbind(x1,x2,x3) )==3) #How many triplets of consecutive intervals occur in the vector
[1] 3
which(colSums(rbind(x1,x2,x3) )==3) #Returns the location of the triplets consecutive integers
[1] 2 3 7
Note that this will not work for consecutive negative intervals c(-2,-1,0) because of how diff() works

Resources