This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 5 years ago.
I have a 1 column matrix with labels and a numeric vector.
I want to extract the labels in the matrix which are equal to one of the entries in that vector, more specifically:
> mat
[,1]
intercept 20.86636535
crim -0.23802478
zn 0.03822050
indus 0.05135584
chas 2.43504780
> vec
[1] -0.23802478 0.05135584
> mat[2, 1] == vec[1]
crim
FALSE
Currently I'm stuck with the first step. I have no idea why it returns FALSE while they hold the same numeric values.
I'd use round(as.numeric(mat[, 1]), 5) %in% round(vec, 5)
, as there may well be floating point issues: the console prints only a limited number of significant digits, so two values can look identical and still differ further out.
Doing so yields:
[1] FALSE TRUE FALSE TRUE FALSE
Basically, you need to pull the (only) column out of the matrix as a plain vector (mat[, 1] drops the matrix dimensions by default, and as.numeric strips the row names). The rounding (in this case, to 5 decimal places) then bridges the floating point problem that I mentioned before (along with David Arenburg).
I hope that helps you.
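To get from the logical vector back to the labels the question asks for, the row names can be indexed with it. The following is only a sketch: the matrix is rebuilt from the printed values, and the tiny offset added to it is an assumption made here to imitate hidden digits that the console does not show.

```r
# Rebuild the example; the 1e-10 offset is an assumed stand-in for the
# unseen trailing digits that make == fail in the original data.
printed <- c(20.86636535, -0.23802478, 0.03822050, 0.05135584, 2.43504780)
mat <- matrix(printed + 1e-10, ncol = 1,
              dimnames = list(c("intercept", "crim", "zn", "indus", "chas"), NULL))
vec <- c(-0.23802478, 0.05135584)

exact <- mat[2, 1] == vec[1]                  # FALSE, as in the question
hit <- round(mat[, 1], 5) %in% round(vec, 5)  # compare after rounding
labels <- rownames(mat)[hit]                  # labels of the matching rows
labels
# [1] "crim"  "indus"
```

The number of digits passed to round() has to be chosen to suit the data: too few and distinct values collapse, too many and the floating point noise is kept.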
Related
This question already has answers here:
Why does "one" < 2 equal FALSE in R?
(2 answers)
Why is the expression "1"==1 evaluating to TRUE? [duplicate]
(1 answer)
Closed 3 years ago.
Just like the title says, why is "1" == 1 TRUE? What is the real reason behind this? Is R trying to be kind or is this something else? I was thinking that since "1" (or any number, it really doesn't matter) is read by R as a character, it would automatically return FALSE when compared with as.numeric(1) or as.integer(1).
> as.character(1) == as.numeric(1)
[1] TRUE
or
> "1" == 1
[1] TRUE
I guess it is a simple question but I'd like to get an answer. Thank you.
According to ?==
For numerical and complex values, remember == and != do not allow for the finite representation of fractions, nor for rounding error. Using all.equal with identical is almost always preferable.
In another paragraph, it is also written
x, y
atomic vectors, symbols, calls, or other objects for which methods have been written. If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.
identical(as.character(1), as.numeric(1))
#[1] FALSE
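A small sketch of what that coercion rule means in practice: in a mixed character/numeric comparison the numeric side is converted to a string, and the two strings are then compared. The variable names below are only illustrative.

```r
# The numeric operand is coerced with as.character() before comparing.
same_as_string <- "1" == 1        # "1" == as.character(1), i.e. "1" == "1"
same_as_string
# [1] TRUE

padded <- "1.0" == 1              # "1.0" == "1": different strings
padded
# [1] FALSE

strict <- identical("1", 1)       # identical() never coerces
strict
# [1] FALSE
```

So the TRUE is a string comparison that happens to succeed, not a numeric one, which is why "1.0" == 1 is FALSE even though both sides denote the same number.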
This question already has answers here:
Is there an R function for finding the index of an element in a vector?
(4 answers)
Closed 4 years ago.
I am confused about the which function. Basically I thought that it checks at which position of an input object (e.g., a vector) a logical condition is true. As seen in the documentation:
which(LETTERS == "R")
[1] 18
In other words, it goes through all LETTERS values and checks if value == R. But this seems to be a misunderstanding. If I input
a <- c("test","test2","test3","test4")
b <- c("test","test3")
which(a==b)
[1] 1
it returns [1] 1 although test3 does also appear in both vectors. Also, if I input a shorter vector for a, it returns a warning:
a <- c("test","test2","test3")
b <- c("test","test3")
which(a==b)
[1] 1
Warning message:
In a == b : longer object length is not a multiple of shorter object length
My question here is twofold:
How can I return the positions of a character vector a that match a character vector b?
How does which() operate because I obviously misunderstand the function.
Thank you for your answers
Edit: thank you for your quick replies, you clarified my misunderstanding!
You need to give which an input that tells it what elements of a are in b:
which(a%in%b)
[1] 1 3
which essentially identifies which elements are TRUE in a logical vector.
== compares values element by element, (a[1]==b[1]); (a[2]==b[2]); ..., and not as sets.
For set operations, use %in%.
Use a[which(a %in% b)] to get [1] "test" "test3"
which() returns the indices of the TRUE elements (!), not the values.
which(a %in% b) will return
[1] 1 3
The reason for the strange warning message is R's recycling:
Warning message:
In a == b : longer object length is not a multiple of shorter object length
When you compare a vector of length 4 with a vector of length 2 element by element (using ==), R 'recycles' the shorter vector. With lengths 4 and 2 this works silently, and the comparison is (a[1]==b[1], a[2]==b[2], a[3]==b[1], a[4]==b[2]). With lengths 3 and 2, as in your second example, you get the warning because the longer length is not a whole multiple of the shorter one; the comparison is still carried out, with b recycled as far as needed.
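The recycling can be made visible by writing out what == actually compares. This is a sketch using the vectors from the question; match() is shown as one further base R option that returns the position in a of each element of b.

```r
a <- c("test", "test2", "test3", "test4")
b <- c("test", "test3")

# a == b silently recycles b to c("test", "test3", "test", "test3")
elementwise <- a == rep(b, length.out = length(a))
elementwise
# [1]  TRUE FALSE FALSE FALSE

pos_in <- which(a %in% b)   # positions of a whose value occurs anywhere in b
pos_in
# [1] 1 3

pos_match <- match(b, a)    # for each element of b, its first position in a
pos_match
# [1] 1 3
```

match() keeps the order of b and gives NA for elements that are not found, whereas which(a %in% b) always returns positions in increasing order.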
This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 5 years ago.
I am computing simple row sums. Two of the rows give 0 (which is what every row should give), but the last one gives a tiny non-zero value rather than exactly zero.
# generate the row values whose sums should be zero.
d<-0.8
c<-1-d
a<-0.5
b<-0.5
e<-0.2
f<-1-e
Perc<-c(-1, a,b,c,-1,d,e,f,-1)
# Put them in a 3x3 matrix
div<-matrix(ncol = 3, byrow = TRUE,Perc)
# Do the row sum
rowSums(div)
# RESULT
[1] 0.000000e+00 0.000000e+00 5.551115e-17
rowSums(div)[3]==0
[1] FALSE
I am using this version of R: version 3.4.1 (2017-06-30) -- "Single Candle"
Any idea why, and how I can fix this?
This happens because most decimal fractions (0.2 and 0.8, for example) have no exact binary floating point representation, so each stored value carries a tiny rounding error that can survive a sum.
The fix here is to use the all.equal function, which compares two numbers up to a small tolerance (about 1.5e-8 by default) instead of bit for bit.
all.equal(sum(div[3, ]), 0)
[1] TRUE
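One caveat worth a sketch: when the values differ, all.equal returns a character description rather than FALSE, so inside if() it is usually wrapped in isTRUE(). To test all rows at once, a plain tolerance check is an alternative; the tolerance used below is an assumption chosen to match all.equal's default.

```r
# Same matrix as in the question, written without the helper variables.
Perc <- c(-1, 0.5, 0.5, 1 - 0.8, -1, 0.8, 0.2, 1 - 0.2, -1)
div <- matrix(Perc, ncol = 3, byrow = TRUE)

tol <- sqrt(.Machine$double.eps)        # assumed tolerance, about 1.5e-8
rows_zero <- abs(rowSums(div)) < tol    # vectorised check for every row
rows_zero
# [1] TRUE TRUE TRUE

msg <- all.equal(1, 1.1)                # a string, not FALSE
msg
# [1] "Mean relative difference: 0.1"
safe <- isTRUE(all.equal(1, 1.1))       # a proper logical for use in if()
safe
# [1] FALSE
```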
This question already has answers here:
r programming - check for every value in a vector if it is numeric
(2 answers)
Closed 6 years ago.
Suppose I have a vector
x <- c('a', 'b', 1, 2)
What is the easiest way for me to get an output that indicates whether or not the components of x are numeric? I.e., the output should be
something(x)
[1] FALSE FALSE TRUE TRUE
The way I know how to do this is to convert x to a matrix and use apply:
apply(as.matrix(x), FUN = is.numeric, MARGIN = 1)
But after testing, this actually doesn't work - I forgot that the types are coerced to become strings.
We can use is.na and as.numeric
!is.na(as.numeric(x))
#[1] FALSE FALSE TRUE TRUE
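with a friendly warning ("NAs introduced by coercion") for the elements that are not numbers.
Or use grepl to match only elements made up of digits and dots, with an optional leading minus sign, from start (^) to end ($)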
grepl("^-?[0-9.]+$", x)
#[1] FALSE FALSE TRUE TRUE
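If the warning is unwanted, the conversion can be wrapped in suppressWarnings(). The sketch below also shows one way the two approaches disagree: the regular expression accepts any mix of digits and dots, so it is looser than as.numeric(). The name is_num is only illustrative.

```r
x <- c("a", "b", 1, 2)      # stored as c("a", "b", "1", "2")

# Elements that convert cleanly are numeric; the rest become NA.
is_num <- !is.na(suppressWarnings(as.numeric(x)))
is_num
# [1] FALSE FALSE  TRUE  TRUE

# "1.2.3" passes the pattern but is not a number.
by_regex <- grepl("^-?[0-9.]+$", "1.2.3")
by_regex
# [1] TRUE
by_conversion <- !is.na(suppressWarnings(as.numeric("1.2.3")))
by_conversion
# [1] FALSE
```

Note that the conversion approach also reports FALSE for a genuine NA in x, which may or may not be what is wanted.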
This question already has answers here:
Why are these numbers not equal?
(6 answers)
Closed 9 years ago.
If I type:
x<-seq(0,20,.05)
x[30]
x[30]==1.45
Why do I obtain FALSE from the last line of code? What did I do wrong here?
This question has been asked a million times, albeit in different forms. This is due to floating point inaccuracy. Also here's another link on floating point errors you may want to catch up on!
Try this to first see what's going on:
x <- seq(0, 20, 0.5)
sprintf("%.20f", x[30]) # convert value to string with 20 decimal places
# [1] "14.50000000000000000000"
x[30] == 14.5
# [1] TRUE
All is well so far. Now, try this:
x <- seq(0, 20, 0.05)
sprintf("%.20f", x[30]) # convert value to string with 20 decimal places
# [1] "1.45000000000000017764"
x[30] == 1.45
# [1] FALSE
You can see that the machine is able to represent this number accurately only up to a certain number of digits, here about 15 or 16 significant digits. So a direct comparison with == of course gives FALSE. Instead, you could use all.equal, which has a tolerance parameter that defaults to .Machine$double.eps ^ 0.5. On my machine this evaluates to 1.490116e-08. This means that if the difference between x[30] and 1.45 (measured relative to the first value, unless that value is itself smaller than the tolerance) is below this threshold, all.equal evaluates to TRUE.
all.equal(x[30], 1.45)
[1] TRUE
Another way of doing this is to explicitly check with a specific threshold (as @eddi's answer shows).
This has to do with the fact that these are doubles, and the correct way of comparing doubles in any language is to do something like:
abs(x[30] - 1.45) < 1e-8 # or whatever precision you think is appropriate
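The same threshold check is vectorised, so it can also locate a value in the sequence. A minimal sketch, where near() is a hypothetical helper defined here and not a base R function:

```r
# Hypothetical helper: TRUE where a and b differ by less than tol.
near <- function(a, b, tol = 1e-8) abs(a - b) < tol

x <- seq(0, 20, 0.05)
exact_hits <- which(x == 1.45)     # exact comparison finds nothing
exact_hits
# integer(0)
close_hits <- which(near(x, 1.45)) # tolerance comparison finds element 30
close_hits
# [1] 30
```

The tolerance should be small compared with the spacing of the values being compared (0.05 here), otherwise neighbouring elements would match too.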