What is the difference between = and == in R? - r

What is the difference between = and ==? I have found cases where the double equal sign will allow my script to run while one equal sign produces an error message. When should I use == instead of =?

It depends on context as to what = means. == is always for testing equality.
= can be
in most cases used as a drop-in replacement for <-, the assignment operator.
> x = 10
> x
[1] 10
used as the separator for key-value pairs used to assign values to arguments in function calls.
rnorm(n = 10, mean = 5, sd = 2)
Because of 2. above, = can't be used as a drop-in replacement for <- in all situations. Consider
> rnorm(N <- 10, mean = 5, sd = 2)
[1] 4.893132 4.572640 3.801045 3.646863 4.522483 4.881694 6.710255 6.314024
[9] 2.268258 9.387091
> rnorm(N = 10, mean = 5, sd = 2)
Error in rnorm(N = 10, mean = 5, sd = 2) : unused argument (N = 10)
> N
[1] 10
Now some would consider rnorm(N <- 10, mean = 5, sd = 2) poor programming, but it is valid and you need to be aware of the differences between = and <- for assignment.
== is always used for equality testing:
> set.seed(10)
> logi <- sample(c(TRUE, FALSE), 10, replace = TRUE)
> logi
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> logi == TRUE
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> seq.int(1, 10) == 5L
[1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
Do be careful with == too however, as it really means exactly equal to and on a computer where floating point operations are involved you may not get the answer you were expecting. For example, from ?'==':
> x1 <- 0.5 - 0.3
> x2 <- 0.3 - 0.1
> x1 == x2 # FALSE on most machines
[1] FALSE
> identical(all.equal(x1, x2), TRUE) # TRUE everywhere
[1] TRUE
where all.equal() tests for equality allowing for a little bit of fuzziness due to loss of precision/floating point operations.

= is basically a synonym for assignment ( <- ), but most often used when passing values into functions.
== is a test for equality

In the simplest of terms, take these two lines of code for example:
1) x = 10
2) x == 10
The first line (x = 10) means "I am commanding that x is equal to 10."
The second line (x == 10) means "I am asking the question, is x equal to 10?"
If you write "x == 10" first, it will give you an error message and tell you that x is not found.
If you write "x = 10," this will store x as 10.
After you have written "x = 10", then if you write "x == 10," it will respond "TRUE", as in "yes, x does equal 10, because you made x equal to 10." But if you write "x == 11" or "x == 12" or x == anything besides 10, then it will respond that "FALSE," as in "no, x does not equal 11 or 12 or anything besides 10, because you made x equal to 10."

(=) is a Assignment operator while (==) is a Equal to operator.
(=) is used for assigning the values from right to left while (==) is used for showing equality between values.
Example:
$test = 1;
if($test=2){
echo "Hello";
}
if($test==2){
echo "world";
}
//The result is Hello because = is assigning the value to $test and the second condition is false because it check the equality of $test to the value 2.
I hope this will help.

Related

For loop with variables in quotation mark [duplicate]

What is the difference between = and ==? I have found cases where the double equal sign will allow my script to run while one equal sign produces an error message. When should I use == instead of =?
It depends on context as to what = means. == is always for testing equality.
= can be
in most cases used as a drop-in replacement for <-, the assignment operator.
> x = 10
> x
[1] 10
used as the separator for key-value pairs used to assign values to arguments in function calls.
rnorm(n = 10, mean = 5, sd = 2)
Because of 2. above, = can't be used as a drop-in replacement for <- in all situations. Consider
> rnorm(N <- 10, mean = 5, sd = 2)
[1] 4.893132 4.572640 3.801045 3.646863 4.522483 4.881694 6.710255 6.314024
[9] 2.268258 9.387091
> rnorm(N = 10, mean = 5, sd = 2)
Error in rnorm(N = 10, mean = 5, sd = 2) : unused argument (N = 10)
> N
[1] 10
Now some would consider rnorm(N <- 10, mean = 5, sd = 2) poor programming, but it is valid and you need to be aware of the differences between = and <- for assignment.
== is always used for equality testing:
> set.seed(10)
> logi <- sample(c(TRUE, FALSE), 10, replace = TRUE)
> logi
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> logi == TRUE
[1] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE
> seq.int(1, 10) == 5L
[1] FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
Do be careful with == too however, as it really means exactly equal to and on a computer where floating point operations are involved you may not get the answer you were expecting. For example, from ?'==':
> x1 <- 0.5 - 0.3
> x2 <- 0.3 - 0.1
> x1 == x2 # FALSE on most machines
[1] FALSE
> identical(all.equal(x1, x2), TRUE) # TRUE everywhere
[1] TRUE
where all.equal() tests for equality allowing for a little bit of fuzziness due to loss of precision/floating point operations.
= is basically a synonym for assignment ( <- ), but most often used when passing values into functions.
== is a test for equality
In the simplest of terms, take these two lines of code for example:
1) x = 10
2) x == 10
The first line (x = 10) means "I am commanding that x is equal to 10."
The second line (x == 10) means "I am asking the question, is x equal to 10?"
If you write "x == 10" first, it will give you an error message and tell you that x is not found.
If you write "x = 10," this will store x as 10.
After you have written "x = 10", then if you write "x == 10," it will respond "TRUE", as in "yes, x does equal 10, because you made x equal to 10." But if you write "x == 11" or "x == 12" or x == anything besides 10, then it will respond that "FALSE," as in "no, x does not equal 11 or 12 or anything besides 10, because you made x equal to 10."
(=) is a Assignment operator while (==) is a Equal to operator.
(=) is used for assigning the values from right to left while (==) is used for showing equality between values.
Example:
$test = 1;
if($test=2){
echo "Hello";
}
if($test==2){
echo "world";
}
//The result is Hello because = is assigning the value to $test and the second condition is false because it check the equality of $test to the value 2.
I hope this will help.

Checking for sequences in an R vector

I'm looking for a function or operation such that if I have
A <- c(1, 2, 3, 4, 5)
and
B <- c(1, 2, 3)
and C <- c(2, 1)
I'd get a TRUE when checking whether A contained B, and FALSE when checking whether A contained C
basically, the equivalent of the %in% operator but that actually cares about the order of elements
In a perfect world, I'd be able to do this without some kind of apply statement, but I may end up having to
Well, if one's allowd to use a kind-of apply loop, then this could work:
"%seq_in%" = function(b,a) any(sapply(1:(length(a)-length(b)+1),function(i) all(a[i:(i+length(b)-1)]==b)))
(edited thanks to bug-finding by John Coleman!)
EDIT 2:
I couldn't resist trying to solve the 'non-contiguous' case, too:
# find_subseq() returns positions within vec of ordered elements of x, or stops with NA upon failing
find_subseq = function(x,vec) {
p=match(x[1],vec)
if(is.na(p)||length(x)==1){ p }
else { c(p,p+find_subseq(x[-1],vec[-seq_len(p)])) }
}
"%seq_somewhere_in%" = function(b,a) all(!is.na(find_subseq(b,a)))
Examples:
1:3 %seq_in% 1:10
[1] TRUE
c(3,1,2) %seq_in% 1:10
[1] FALSE
c(1,2,3) %seq_in% c(3,2,1,2,3)
[1] TRUE
2:1 %seq_in% c(1,2,1)
[1] TRUE
1:3 %seq_somewhere_in% c(1,10,10,2,10,10,10,3,10)
[1] TRUE
Maybe you can define a custom function subseq_check like below
subseq_check <- function(x,y) grepl(toString(y),toString(x),fixed = TRUE)
which gives
> subseq_check(A,B)
[1] TRUE
> subseq_check(A,C)
[1] FALSE
A Hard-core approach
subseq_find <- function(x,y) {
inds <- which(x == head(y,1))
if (length(inds)==0) return(FALSE)
any(sapply(inds, function(k) all(x[k:(k+length(y)-1)]==y)))
}
such that
> subseq_find(A,B)
[1] TRUE
> subseq_find(A,C)
[1] FALSE

Check if a number is between two others

I am looking for a function that verifies if a number is between two other numbers. I also need to control if I want a strict comparison (a
I know the function between() in dplyr. Yet, I have to know the upper and lower numbers.
MyNumber = 8
First = 2
Second = 10
# This will return TRUE
between(MyNumber, lower = First, upper = Second)
# But this will return FALSE
between(MyNumber, lower = Second, upper = First)
# This will return TRUE. I want it to return FALSE
First = 8
between(MyNumber, lower = First, upper = Second)
I need a function that returns TRUE no matter what is the order.
Something like:
between2 <- function(number,bounds) { number > min(bounds) & number < max(bounds)}
between2(8, c(2,10))
[1] TRUE
between2(8, c(10,2))
[1] TRUE
This function also deals with your added condition
between2(8,c(8,10))
[1] FALSE
You could do it with a simple arithmetics:
between <- function(number, first, second) { (first - number) * (second - number) < 0 }
Here are some example outputs:
> between(8, 2, 10)
[1] TRUE
> between(8, 10, 2)
[1] TRUE
> between(8, 10, 12)
[1] FALSE
> between(8, 1, 2)
[1] FALSE
You could use %in% with the : function, once you now first and last:
first <- 2
last <- 10
number <- 8
number %in% first:last
[1] TRUE
first <- 10
last <- 2
number <- 8
number %in% first:last
[1] TRUE
first <- 10
last <- 12
number <- 8
number %in% first:last
[1] FALSE
first <- 12
last <- 10
number <- 8
number %in% first:last
[1] FALSE
In a function, and strict lets you consider or not strict comparison:
my_between <- function(n, f, l, strict = FALSE) {
if (!strict) {
n %in% f:l # if strict == FALSE (default)
} else {
n %in% (f+1):(l-1) # if strict == TRUE
}
}
my_between(8, 2, 10)
What's wrong with
f_between <- function (num, L, R) num>=min(L,R) & num<=max(L,R)
f_between(8, 2, 10)
#[1] TRUE
f_between(6, 6, 10)
#[1] TRUE
f_between(2, -10, -2)
#[1] FALSE
f_between(3, 5, 7)
#[1] FALSE

What is the behavior of the ampersand operator in R's sum(...) function

Below, a line from a script I'm translating from R into Python. I'm more experienced at Python than I am at R, and I'm running into a little trouble here:
val = sum(l & f==v)
Let l be a vector of true/false values. Let f be a vector of trivial values, and v some possible value of f to test against. I expect l and f to be of the same length. The f==v part will also yield a boolean array. Now I am left with the question what the &/ampersand (Logical AND, according to the R documentation) will do in this context. Will the sum() function return the sum of a boolean array that indicates where both the l and f==v boolean arrays are true? Or wil it sum all true values for both arrays and add them up?
Thank you in advance!
Let define several vectors :
l <- c(TRUE, FALSE, TRUE, FALSE, TRUE)
v <- 1:5
f <- rep(c(1, 4), c(3, 2))
now let see what we have when we decompose your line sum(l & f==v):
In this line, == has precedence over &:
fev <- f==v
fev
[1] TRUE FALSE FALSE TRUE FALSE
Then we do l & fev:
lafev <- l & fev
[1] TRUE FALSE FALSE FALSE FALSE
lastly, we sum:
sum(lafev)
[1] 1
The sum tells us how many simultaneous TRUE there are in l and f==v by converting the logical values to numeric: TRUE becomes 1 and FALSE becomes 0. So, in this example, 1.
Here is a sample python version:
l = [True, True, False]
f = [True, False, True]
v = [True, True, True]
f_eq_v = [f[i] == v[i] for i in range(len(f))] # f == v in R
val = sum([l[i] & f_eq_v[i] for i in range(len(l))]) # val = sum(l & f==v) in R

Creating if + & statement

I'm using R and need to create code that checks if a matrix dimensions has 3x3. I am stuck on the proper code using if() to check the nrow and ncol. Any help or suggestion greatly appreciated.
localSmoother <- function(myMatrix = matrix(),
smoothingMatrix = matrix(1, nrow = 3, ncol = 3))
{
if(class(smoothingMatrix) != "matrix") stop ("smoothingMatrix must be a matrix.")
if(ncol(smoothingMatrix) != "3" & if(nrow(smoothingMatrix) != "3")
stop ("smoothingMatrix must have dimensions of 3x3")
print(myMatrix) # see what myMatrix is.
print(smoothingMatrix) # see what smoothingMatrix is.
return(matrix(, nrow = 3, ncol = 3))
}
# # TEST the CODE:
localSmoother(myMatrix = diag(x = 2, nrow = 5, ncol = 5), smoothingMatrix = matrix(2, nrow = 5, ncol = 3))
# # Error in localSmoother(myMatrix = diag(x = 2, nrow = 5, ncol = 5), smoothingMatrix = matrix(2, :
# # smoothingMatrix must be 3x3
Most immediately, you have two if in your conditional. You should just have one.
if(ncol(smoothingMatrix) != "3" & if(nrow(smoothingMatrix) != "3")
## should be
if(ncol(smoothingMatrix) != "3" & nrow(smoothingMatrix) != "3")
Next, your logic asks when the number of rows is not 3 and the number of columns is not 3. This condition works as intended (returns TRUE and follows to the stop statement in the if block) if your dimension is c(5,4), but would fail if your dimension is c(5,3).
x <- matrix(1:20,nrow=5)
dim(x)
# [1] 5 4
ncol(x) != "3" & nrow(x) != "3"
# [1] TRUE
x <- matrix(1:12,nrow=3)
dim(x)
# [1] 3 4
ncol(x) != "3" & nrow(x) != "3"
# [1] FALSE
I would instead use either of the following equivalent lines:
if(!(ncol(smoothingMatrix) == 3 && nrow(smoothingMatrix) == 3))
## is the same as
if(ncol(smoothingMatrix) != 3 || nrow(smoothingMatrix) != 3)
Note two things:
I'm using the logical operators && and || instead of the vectorized operators & and |. Try c(TRUE, FALSE) & TRUE vs. c(TRUE, FALSE) && TRUE.
I am using the numeric form 3 instead of the character "3". R will coerce the number to a character, so the equality test works here, but it could trip you up in other cases.
It may be easier to compare on the dimension of the matrix:
if(!isTRUE(all.equal(dim(smoothingMatrix),c(3L,3L))))
(isTRUE is needed because all.equal returns either TRUE or a character vector describing the differences. Observe that all.equal(1,0) does not return FALSE but instead a character vector describing the differences. Any if statements around all.equal then throw an error if equality doesn't hold.)
all.equal(1,0)
# [1] "Mean relative difference: 1"
if(all.equal(1,0)) print(TRUE)
# Error in if (all.equal(1, 0)) print(TRUE) :
# argument is not interpretable as logical
#Blue Magister's answer is great with nice explanations, go there first.
For your specific task, you might find the stopifnot function useful. From ?stopifnot
stopifnot(...) If any of the expressions in ... are not all TRUE, stop is called, producing an error message indicating the first of the elements of ... which were not true.
Your code could then be
stopifnot(is.matrix(smoothingMatrix),
ncol(smoothingMatrix) == 3,
nrow(smoothingMatrix) == 3)
The downside is that the error messages are a bit less helpful, the upside is that you don't have to write them yourself, and the code is nice and clean.

Resources