1 %% 0.1 = 0.1 AND 1 %% 0.2 = 0.2? - r

Am I missing something, or is this a bug in the base package?
I am running on R-2.15.2, on Win 7-32
x %% y: modulus (x mod y), e.g. 5 %% 2 is 1
(from: http://www.statmethods.net/management/operators.html)
> 1 %% 0.1
[1] 0.1
> 1 %% 0.2
[1] 0.2
Both of these should be 0.
The examples below work as expected.
For example, 1 %% 0.15 should be 0.1 (1.0 = 6 x 0.15 + 0.1):
> 1 %% 0.11 # expected result
[1] 0.01
> 1 %% 0.15
[1] 0.1
> 1 %% 0.3
[1] 0.1
> 1 %% 0.4
[1] 0.2
> 1 %% 0.5
[1] 0

First of all, I cannot reproduce this using R version 2.15.1 running on x86_64.
If that's what happens in your environment, this almost certainly has to do with the fact that neither 0.1 nor 0.2 can be represented exactly using binary floating-point arithmetic:
> sprintf("%.20f", 0.1)
[1] "0.10000000000000000555"
> sprintf("%.20f", 0.2)
[1] "0.20000000000000001110"
The documentation for %% has the following to say:
%% and x %/% y can be used for non-integer y, e.g. 1 %/% 0.2, but the results are subject to representation error and so may be platform-dependent. Because the IEC 60559 representation of 0.2 is a binary fraction slightly larger than 0.2, the answer to 1 %/% 0.2 should be 4 but most platforms give 5.
There are many other similar pitfalls having to do with the properties of floating-point arithmetic (not just in R). The classic paper on the subject is "What Every Computer Scientist Should Know About Floating-Point Arithmetic".
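If the raw result matters less than getting a clean 0 for "exact" multiples, one possible workaround is to compute the remainder as usual and then snap anything within a small tolerance of 0 or of y to 0. This is only a sketch: the helper name mod_tol and the default tol = 1e-9 are made up for illustration and are not part of base R, and it assumes y > 0.

```r
# Hypothetical helper, not part of base R: a modulus that snaps remainders
# lying within `tol` of 0 or of y to exactly 0. Assumes y > 0.
mod_tol <- function(x, y, tol = 1e-9) {
  r <- x %% y
  r[abs(r) < tol | abs(r - y) < tol] <- 0
  r
}

mod_tol(1, 0.1)   # 0, whether the platform's raw result is 0 or about 0.1
mod_tol(1, 0.2)   # 0
mod_tol(1, 0.15)  # about 0.1; a genuine remainder is left untouched
```

The tolerance has to be chosen with the scale of the data in mind; 1e-9 is only reasonable for values of roughly unit magnitude.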

Related

Subset does not work with some numeric values but with others

I ran into a very strange problem that I have never seen before and don't know how to solve. I can subset a data.frame for some numeric values but not for others.
Here is the data I use:
library(dplyr)
ws <- seq(0, 1, by=.1)
kombos <- expand.grid(weightjaw2 = ws,
weightjaw3 = ws) %>% as.data.frame
kombos$kombi <- 1:nrow(kombos)
kombos$weightjaw2 <- as.numeric(kombos$weightjaw2)
kombos$weightjaw3 <- as.numeric(kombos$weightjaw3)
class(kombos$weightjaw2)
[1] "numeric"
Now, I need to subset this data.frame. This works well for, say, the value 0.1.
kombos %>% filter(weightjaw2==0.1)
weightjaw2 weightjaw3 kombi
1 0.1 0.0 2
2 0.1 0.1 13
3 0.1 0.2 24
4 0.1 0.3 35
5 0.1 0.4 46
6 0.1 0.5 57
7 0.1 0.6 68
8 0.1 0.7 79
9 0.1 0.8 90
10 0.1 0.9 101
11 0.1 1.0 112
Strangely enough, this does not work for values of 0.3, 0.6, and 0.7.
kombos %>% filter(weightjaw2==0.3)
[1] weightjaw2 weightjaw3 kombi
<0 rows> (or 0-length row.names)
The same holds for subset(kombos, weightjaw2==0.3). Why is that and how can I solve this?
EDIT
I solved this using dplyr::near():
kombos %>% filter(near(weightjaw2, 0.3))
The == operator requires both sides to be exactly equal. Because of floating-point representation error, the values that seq() stores in the 'weightjaw2' column are not exactly equal to 0.3. One option is to convert the column to character inside filter() to subset the rows:
library(dplyr)
kombos %>%
filter(as.character(weightjaw2) == 0.3)
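The mismatch can also be reproduced, and worked around, without dplyr. A minimal base-R sketch, assuming an absolute tolerance of 1e-8 is appropriate for these weights (the tolerance and the variable name tol are illustrative choices):

```r
ws <- seq(0, 1, by = 0.1)

# seq() builds the 4th element as 0 + 3 * 0.1, which is not the same
# double as the literal 0.3.
sprintf("%.20f", ws[4])
sprintf("%.20f", 0.3)
ws[4] == 0.3                    # FALSE

# Tolerance-based comparisons succeed where == fails.
isTRUE(all.equal(ws[4], 0.3))   # TRUE
tol <- 1e-8                     # illustrative tolerance, adjust to your data
which(abs(ws - 0.3) < tol)      # 4
```

The same abs(x - target) < tol condition can be used inside subset() or filter() in place of ==.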

Normalize blocks/sub-matrices within a matrix

I want to normalize (i.e., scale to 0-1) blocks/sub-matrices within a square matrix based on row/column names. It is important that the normalized matrix corresponds to the original matrix. The code below extracts a block, e.g. all row/column names == "A", and normalizes it by its maximum value. How do I put the normalized blocks back together so that every value sits in the same position as in the original matrix? In other words, I cannot simply bind the blocks together and then, e.g., sort the normalized matrix by the original matrix's row/column names.
#dummy code
mat <- matrix(round(runif(81, 0, 50)), 9, 9)  # 81 values fill the 9 x 9 matrix
rownames(mat) <- rep(LETTERS[1:3],3)
colnames(mat) <- rep(LETTERS[1:3],3)
mat.n <- matrix(0,nrow(mat),ncol(mat), dimnames = list(rownames(mat),colnames(mat)))
for(i in 1:length(LETTERS[1:3])){
? <- mat[rownames(mat)==LETTERS[1:3][i],colnames(mat)==LETTERS[1:3][i]] / max(mat[rownames(mat)==LETTERS[1:3][i],colnames(mat)==LETTERS[1:3][i]])
#For example,
mat.n[rownames(mat)==LETTERS[1:3][i],colnames(mat)==LETTERS[1:3][i]] <- # doesn't work
}
UPDATE
Using ave() as @G. Grothendieck suggested works for the blocks, but I'm not sure how it normalizes beyond that.
mat.n <- mat / ave(mat, rownames(mat)[row(mat)], colnames(mat)[col(mat)], FUN = max)
Within a block the normalization works, e.g.
mat[rownames(mat)=="A",colnames(mat)=="A"]
A A A
A 13 18 15
A 38 33 41
A 12 18 47
mat.n[rownames(mat.n)=="A",colnames(mat.n)=="A"]
A A A
A 0.2765957 0.3829787 0.3191489
A 0.8085106 0.7021277 0.8723404
A 0.2553191 0.3829787 1.0000000
But beyond that, it looks weird.
> round(mat.n,1)
A B C A B C A B C
A 0.3 0.2 0.1 0.4 0.2 1.0 0.3 0.9 1.0
B 0.9 0.8 0.9 0.4 0.5 0.4 0.4 0.9 0.0
C 0.0 0.4 0.4 0.0 0.8 0.5 0.4 0.9 0.0
A 0.8 0.9 0.5 0.7 0.9 0.6 0.9 0.4 0.4
B 0.1 0.8 0.7 1.0 0.3 0.5 0.1 1.0 0.8
C 0.4 0.0 0.2 0.2 0.2 0.6 1.0 0.4 1.0
A 0.3 0.4 0.3 0.4 0.6 0.8 1.0 1.0 0.3
B 0.6 0.2 0.5 0.9 0.3 0.2 0.9 0.3 1.0
C 0.5 0.9 0.7 1.0 0.4 0.5 1.0 1.0 0.9
In this case, I would expect three 1s across the whole matrix, one for each block. But there are 10 1s, e.g. mat.n[3,2] and mat.n[1,9]. I'm not sure how this function normalizes between blocks.
UPDATE 2
#Original matrix.
#Suggested solution produces `NaN`
mat <- as.matrix(read.csv(text=",1.21,1.1,2.2,1.1,1.1,1.21,2.2,2.2,1.21,1.22,1.22,1.1,1.1,2.2,2.1,2.2,2.1,2.2,2.2,2.2,1.21,2.1,2.1,1.21,1.21,1.21,1.21,1.21,2.2,1.21,2.2,1.1,1.22,1.22,1.22,1.22,1.21,1.22,2.1,2.1,2.1,1.22
1.21,0,0,0,0,0,0,0,0,292,13,0,0,0,0,0,0,0,0,0,0,22,0,0,94,19,79,0,9,0,126,0,0,0,0,0,0,0,0,0,0,0,0
1.1,0,0,0,155,166,0,0,0,0,0,0,4,76,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,34,0,0,0,0,0,0,0,0,0,0
2.2,0,0,0,0,0,0,0,0,0,0,0,0,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1.1,0,201,0,0,79,0,0,0,0,0,0,0,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1.1,0,33,0,91,0,0,0,0,0,0,0,0,9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1.21,8,0,0,0,0,0,0,0,404,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,37,26,18,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2.2,0,0,0,0,0,0,0,9,0,0,0,0,0,0,0,0,0,162,79,1,0,0,0,0,0,0,0,0,10,0,27,0,0,0,0,0,0,0,0,0,0,0
2.2,0,0,0,0,0,0,9,0,0,0,0,0,0,0,0,0,0,33,17,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0
1.21,207,0,0,0,0,1644,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,0,0,16,17,402,0,0,0,606,0,0,0,0,0,0,0,0,0,0,0,0
1.22,13,0,0,0,0,0,0,0,0,0,12,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,26,0,0,15,0,0,0,0,0
1.22,0,0,0,0,0,0,0,0,0,71,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,374,6,121,6,21,0,0,0,0
1.1,0,0,0,44,0,0,0,0,0,0,0,0,103,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,33,0,0,0,0,0,0,0,0,0,0
1.1,0,0,0,24,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,0,0,0,0,0,0,0,0,0,0,0,10,0,0,0,0,0,0,0,0,0,0
2.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,0,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,18,0,0,0,0,353,116,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,29,0,5,0
2.2,0,0,0,0,0,0,0,37,0,0,0,0,0,4,0,0,0,36,46,62,0,0,0,0,0,0,0,0,0,0,73,0,0,0,0,0,0,1,0,0,0,0
2.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,61,0,0,0,0,0,0,0,38,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0
2.2,17,0,23,0,0,0,444,65,0,0,0,0,0,0,0,78,0,0,42,30,15,0,0,0,0,0,0,0,4,0,18,0,0,0,0,0,0,0,0,0,0,0
2.2,0,0,0,0,0,0,75,8,0,0,0,0,0,0,0,87,0,74,0,85,0,0,0,0,0,0,0,0,1,0,19,0,25,0,0,0,0,0,0,0,0,0
2.2,0,0,13,0,0,0,12,20,0,0,0,0,0,0,0,118,0,29,92,0,25,0,0,0,0,0,0,0,0,0,16,0,48,0,0,0,0,0,0,0,0,0
1.21,14,0,1,0,0,0,0,0,17,0,0,0,0,0,0,0,0,0,0,14,0,0,0,0,0,0,0,0,3,0,20,0,0,0,0,0,0,0,0,0,0,0
2.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,204,0,0,0,0,0,0,0,133,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,44,0,0
2.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,67,0,0,0,0,0,0,143,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,12,15,0
1.21,79,0,0,0,0,0,0,0,34,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,38,26,6,9,0,112,0,0,0,0,0,0,0,0,0,0,0,0
1.21,11,0,0,0,0,17,0,0,49,0,0,0,0,0,0,0,0,0,0,0,0,0,0,28,0,0,0,32,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1.21,40,0,0,0,0,0,0,0,122,0,0,0,0,0,0,0,0,0,0,0,3,0,0,24,11,0,887,20,0,389,0,0,0,0,0,0,0,0,0,0,0,0
1.21,14,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,0,50,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1.21,34,0,0,0,0,26,0,0,56,0,0,0,0,0,0,0,0,0,0,0,0,0,0,54,9,297,13,0,0,16,0,0,0,0,0,0,0,0,0,0,0,0
2.2,0,0,0,0,0,0,39,0,0,0,0,0,0,0,0,25,0,17,12,20,25,0,0,0,0,0,0,0,0,0,393,0,7,0,0,0,0,0,0,0,0,0
1.21,177,0,0,0,0,8,0,0,775,0,0,0,0,0,0,0,0,0,0,0,0,0,0,113,0,227,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2.2,0,0,0,0,0,0,21,17,0,0,0,0,0,0,0,0,0,42,30,16,0,0,0,0,0,0,0,0,165,0,0,0,0,0,0,0,0,0,0,0,0,0
1.1,0,6,0,28,0,0,0,0,0,0,0,9,30,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1.22,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,4,37,0,0,0,0,0,0,0,0,3,0,0,0,0,14,7,0,0,18,0,0,0,0
1.22,0,0,0,0,0,0,0,0,0,44,785,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,21,0,44,177,13,24,0,0,0,0
1.22,0,0,0,0,0,0,30,0,0,182,9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,12,0,1231,135,17,0,0,0,0
1.22,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,73,1308,0,669,16,0,0,0,8
1.21,0,0,0,0,0,0,0,0,0,15,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,13,33,197,626,0,44,0,0,0,0
1.22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,24,37,12,80,0,0,0,0,16
2.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,24,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,24,54,0
2.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,0,0,0,0,0,0,27,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,75,0,0,0
2.1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,58,0,1,0,0,0,0,28,24,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,61,2,0,0
1.22,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,31,9,0,0,0,0"))
ids <- read.csv(text=",x
1,1.21
2,1.1
3,2.2
4,1.1
5,1.1
6,1.21
7,2.2
8,2.2
9,1.21
10,1.22
11,1.22
12,1.1
13,1.1
14,2.2
15,2.1
16,2.2
17,2.1
18,2.2
19,2.2
20,2.2
21,1.21
22,2.1
23,2.1
24,1.21
25,1.21
26,1.21
27,1.21
28,1.21
29,2.2
30,1.21
31,2.2
32,1.1
33,1.22
34,1.22
35,1.22
36,1.22
37,1.21
38,1.22
39,2.1
40,2.1
41,2.1
42,1.22")
mat <- mat[,-1]
rownames(mat) <- ids$x
colnames(mat) <- ids$x
ans <- mat / ave(mat, rownames(mat)[row(mat)], colnames(mat)[col(mat)], FUN = max)
Any help is much appreciated, thanks.
Use ave to get the maxima:
mat / ave(mat, rownames(mat)[row(mat)], colnames(mat)[col(mat)], FUN = max)
For example, there are 9 ones, as expected, and there is one 1 in each block, also as expected. (There could be more than 9 if the matrix happened to have multiple maxima in one or more blocks, but there should not be fewer than 9.)
set.seed(123)
mat <- matrix(round(runif(81, 0, 50)), 9, 9)  # 81 values fill the 9 x 9 matrix
rownames(mat) <- rep(LETTERS[1:3],3)
colnames(mat) <- rep(LETTERS[1:3],3)
ans <- mat / ave(mat, rownames(mat)[row(mat)], colnames(mat)[col(mat)], FUN = max)
sum(ans == 1)
## [1] 9
# there are no duplicates (i.e. a block showing up more than once) hence
# there is exactly one 1 in each block
w <- which(ans == 1, arr.ind = TRUE)
anyDuplicated(cbind(rownames(mat)[w[, 1]], colnames(mat)[w[, 2]]))
## [1] 0
ADDED
If some blocks are entirely zero (which is the case in UPDATE 2) then you will get NaNs for those blocks. If you want 0s instead for the all-zero blocks try this:
xmax <- function(x) if (all(x == 0)) 0 else x/max(x)
ave(mat, rownames(mat)[row(mat)], colnames(mat)[col(mat)], FUN = xmax)
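As a quick check of the all-zero case, here is a small made-up 4 x 4 matrix with two labels, A and B, in which the (A, B) block contains only zeros. The data are invented purely for illustration; xmax is the function defined above, repeated so that the snippet runs on its own.

```r
# Invented 4 x 4 example: labels A and B alternate, giving four 2 x 2 blocks.
mat <- matrix(c(2, 0, 4,  0,
                1, 5, 3, 10,
                8, 0, 6,  0,
                7, 2, 9,  4), nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "B", "A", "B"), c("A", "B", "A", "B")))

xmax <- function(x) if (all(x == 0)) 0 else x / max(x)
res <- ave(mat, rownames(mat)[row(mat)], colnames(mat)[col(mat)], FUN = xmax)

anyNA(res)     # FALSE: the all-zero (A, B) block stays 0 instead of NaN
sum(res == 1)  # 3: one maximum in each of the three non-zero blocks
```

Every value stays in its original position, so res can be compared cell by cell with mat.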

Round sequence of numbers to chosen numbers

I have a vector of numbers from 0 to 1. I'd like to divide them into X groups. For example, if X=5, round the numbers down into 5 groups: all numbers from 0 to 0.2 become 0, all from 0.2 to 0.4 become 0.2, etc.
For example, if I have x <- c(0.34,0.07,0.56) and X=5 as explained above, I'll get (0.2, 0, 0.4).
So far, the only way I have found to do that is by looping over the entire vector. Is there a more elegant way to do that?
You can simply do:
floor(x*X)/X
# [1] 0.2 0.0 0.4
More test cases:
X = 10
floor(x*X)/X
# [1] 0.3 0.0 0.5
X = 2
floor(x*X)/X
# [1] 0.0 0.0 0.5
X = 5
floor(x*X)/X
# [1] 0.2 0.0 0.4
Data:
x <- c(0.34,0.07,0.56)
Try:
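cut.alt <- function(x, X) {
  # X + 1 break points 0, 1/X, ..., 1 define X left-closed intervals
  breaks <- (0:X)/X
  cut(x, breaks=breaks, labels=as.character(breaks[-(X + 1)]),
      right=FALSE, include.lowest=TRUE)
}
cut with breaks set to (0:X)/X divides the vector x into X groups like OP asks; right=FALSE makes each interval include its lower bound, and include.lowest=TRUE puts x = 1 into the top group. Labelling each level with its lower cutoff gives the answer (as a factor).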
Or using plyr:
library(plyr)
round_any(x, 1/X,floor)
# [1] 0.2 0.0 0.4
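Another base-R option, offered as a sketch rather than a recommendation: findInterval() looks up which bin each value falls into, and the lower edge of that bin is the rounded value. This assumes all values lie in [0, 1]; the helper name lower_edge is made up for this example.

```r
x <- c(0.34, 0.07, 0.56)
X <- 5

# Hypothetical helper: map each value to the lower edge of its bin.
lower_edge <- function(x, X) {
  breaks <- (0:X) / X            # 0, 1/X, ..., 1
  breaks[findInterval(x, breaks)]
}

lower_edge(x, X)
# [1] 0.2 0.0 0.4
```

Unlike cut(), this returns a plain numeric vector, so no conversion from a factor is needed afterwards.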

How do you generate a regular non-integer sequence in julia?

How are regular, non-integer sequences generated in julia?
I'm trying to get 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
In MATLAB, I would use
0.1:0.1:1
And in R
seq(0.1, 1, by = 0.1)
But I can't find anything except integer sequences in julia (e.g., 1:10). Searching for "sequence" in the docs only gives me information about how strings are sequences.
Similarly to Matlab, but with the difference that 0.1:0.1:1 defines a Range:
julia> typeof(0.1:0.1:1)
Range{Float64} (constructor with 3 methods)
and thus if an Array is needed:
julia> [0.1:0.1:1]
10-element Array{Float64,1}:
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
Unfortunately, this use of Range is only briefly mentioned at this point of the documentation.
Edit: As mentioned in the comments by @ivarne, it is possible to achieve a similar result using linspace:
julia> linspace(.1,1,10)
10-element Array{Float64,1}:
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
but note that the results are not exactly the same due to rounding differences:
julia> linspace(.1,1,10)==[0.1:0.1:1]
false
The original answer is now deprecated. You should use collect() to generate a sequence.
## In Julia
> collect(0:.1:1)
11-element Array{Float64,1}:
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
## In R
> seq(0, 1, .1)
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
They are generated the same way as in Matlab
julia> sequence = 0:.1:1
0.0:0.1:1.0
Alternatively, you can use the range() function, which allows you to specify the length, step size, or both
julia> range(0, 1, length = 5)
0.0:0.25:1.0
julia> range(0, 1, step = .01)
0.0:0.01:1.0
julia> range(0, step = .01, length = 5)
0.0:0.01:0.04
You can still do all of the things you would normally do with a vector, e.g. indexing
julia> sequence[4]
0.3
math and stats...
julia> sum(sequence)
5.5
julia> using Statistics
julia> mean(sequence)
0.5
This will (in most cases) work the same way as a vector, but nothing is actually allocated. It may be convenient to materialize the vector, but in most cases you shouldn't (it's less performant). This works because
julia> sequence isa AbstractArray
true
If you truly need the vector, you can collect(), splat (...) or use a comprehension:
julia> v = collect(sequence)
11-element Array{Float64,1}:
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
julia> v == [sequence...] == [x for x in sequence]
true

modulus bug in R [duplicate]

Possible Duplicate:
Why are these numbers not equal?
I just noticed this apparent bug in R. I'm guessing it has to do with the way 0.6 is represented, but does anyone know exactly what's going on?
According to R:
0.3 %% 0.2 = 0.1
0.4 %% 0.2 = 0
0.5 %% 0.2 = 0.1
**0.6 %% 0.2 = 0.2**
0.7 %% 0.2 = 0.1
0.8 %% 0.2 = 0
What's going on?
In addition to @Joshua Ulrich's comment,
from ?'%%':
%% and x %/% y can be used for non-integer y, e.g. 1 %/% 0.2, but the results are subject to representation error and so may be platform-dependent. Because the IEC 60559 representation of 0.2 is a binary fraction slightly larger than 0.2, the answer to 1 %/% 0.2 should be 4 but most platforms give 5.
This is also similar to why we get this:
> .1 + .1 + .1 == .3
[1] FALSE
As @Ben Bolker pointed out, you may want to use something like
> 3:8 %% 2 / 10
[1] 0.1 0.0 0.1 0.0 0.1 0.0
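That idea generalizes: when the inputs are known to have a fixed number of decimal places, scale them to whole numbers, take the modulus there, and scale back. A sketch, assuming at most one decimal place (hence scale = 10); the variable names are illustrative.

```r
x <- c(0.3, 0.4, 0.5, 0.6, 0.7, 0.8)
y <- 0.2
scale <- 10  # assumption: inputs have at most one decimal place

# round() removes the representation error before the modulus is taken.
(round(x * scale) %% round(y * scale)) / scale
# [1] 0.1 0.0 0.1 0.0 0.1 0.0
```

The scale factor must match the precision of the data; with two decimal places it would be 100.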

Resources