Julia: Float sum is wrong?

How can I make sure that by adding 0.2 at every iteration I get the correct result?
some = 0.0
for i in 1:10
    some += 0.2
    println(some)
end
The code above gives me:
0.2
0.4
0.6000000000000001
0.8
1.0
1.2
1.4
1.5999999999999999
1.7999999999999998
1.9999999999999998

Floats are only approximately correct: 0.2 has no exact binary representation, so every addition introduces a small rounding error, and these errors accumulate the longer you keep adding. You can still calculate with floats quite precisely. If you need to check whether a result is correct, use isapprox(a, b) or a ≈ b instead of ==.
For example:
some = 0.
for i in 1:1000000
    some += 0.2
end
isapprox(some, 1000000 * 0.2)
# true
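To make the difference between == and ≈ concrete, here is a minimal sketch that repeats the accumulation from the question for three steps. The helper name accumulate_steps is made up for this illustration; wrapping the loop in a function also avoids the global-scope pitfalls of updating a variable inside a top-level loop in a script.

```julia
# Add `step` to an accumulator `n` times, mirroring the loop in the question.
# (`accumulate_steps` is a hypothetical helper, not a Base function.)
function accumulate_steps(step, n)
    acc = zero(step)
    for _ in 1:n
        acc += step
    end
    return acc
end

acc = accumulate_steps(0.2, 3)
println(acc)         # 0.6000000000000001, the third line of the question's output
println(acc == 0.6)  # false: exact comparison sees the rounding error
println(acc ≈ 0.6)   # true: isapprox allows a small relative tolerance
```

So the sum is not "wrong"; it just should not be compared with == against a decimal literal.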
Otherwise, you can add integer-valued numbers in the for loop (these are represented exactly, so no error accumulates) and then divide by 10.
some = 0.
for i in 1:10
    some += 2.
    println(some/10.)
end
#0.2
#0.4
#0.6
#0.8
#1.0
#1.2
#1.4
#1.6
#1.8
#2.0
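A variation on the same idea, offered as a sketch rather than a recommendation: Julia's built-in Rational type can hold the step 1//5 exactly, so the accumulation itself is exact and rounding only happens once, when each value is converted to Float64. The function name rational_steps is made up for this example.

```julia
# Accumulate an exact rational step and convert each partial sum to Float64.
# (`rational_steps` is a hypothetical helper name used only for illustration.)
function rational_steps(step::Rational, n::Integer)
    acc = zero(step)
    out = Float64[]
    for _ in 1:n
        acc += step              # exact rational arithmetic, no rounding here
        push!(out, Float64(acc)) # a single rounding per value
    end
    return out
end

vals = rational_steps(1//5, 10)
println(vals)
```

Rational arithmetic is slower than Float64 arithmetic, so this is mainly useful when the exact values matter more than speed.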
More info about counting with floats:
https://en.wikipedia.org/wiki/Floating-point_arithmetic

You can also iterate over a range instead of accumulating a sum yourself. Ranges use some clever tricks to return more "natural" values:
julia> collect(0:0.2:2)
11-element Vector{Float64}:
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
1.8
2.0
julia> collect(range(0.0, step=0.2, length=11))
11-element Vector{Float64}:
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
1.8
2.0
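As a small sketch of how this compares with the loop from the question, the snippet below builds the values both ways and compares them. The helper name by_accumulation is made up for this example.

```julia
# Values produced by repeatedly adding the step, as in the question.
# (`by_accumulation` is a hypothetical helper name.)
function by_accumulation(step, n)
    acc = 0.0
    out = Float64[]
    for _ in 1:n
        acc += step
        push!(out, acc)
    end
    return out
end

from_loop  = by_accumulation(0.2, 10)
from_range = collect(0:0.2:2)[2:end]   # drop the leading 0.0

println(from_loop[3])             # 0.6000000000000001
println(from_range[3])            # 0.6
println(from_loop ≈ from_range)   # true: the two agree within tolerance

# A range can also be iterated directly, without collecting it first.
for x in 0:0.2:2
    println(x)
end
```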

Related

How can I efficiently turn a vector of 1-dim array into a 2-dim array(matrix) in Julia?

As mentioned, I have a vector of one-row matrices, such as:
P_predefined = [[.3 .4 .2 .1], [.2 .3 .5 0.], [.1 0. .8 .1], [.4 0. 0. .6]]
I would like to turn it into a 2-D matrix. I tried vcat, which I expected to behave like vstack in Python, but it doesn't work.
vcat(algorithm.predefinedP)
It still returns a vector:
[[0.3 0.4 0.2 0.1], [0.2 0.3 0.5 0.0], [0.1 0.0 0.8 0.1], [0.4 0.0 0.0 0.6]] #Vector{Matrix{Float64}}
What is the right way to do this?
Julia 1.9 has stack, which can be used on earlier Julia versions via the package Compat.
julia> using Compat
julia> P_predefined = vec.([[.3 .4 .2 .1], [.2 .3 .5 0.], [.1 0. .8 .1], [.4 0. 0. .6]])
4-element Vector{Vector{Float64}}:
[0.3, 0.4, 0.2, 0.1]
[0.2, 0.3, 0.5, 0.0]
[0.1, 0.0, 0.8, 0.1]
[0.4, 0.0, 0.0, 0.6]
julia> stack(P_predefined)
4×4 Matrix{Float64}:
0.3 0.2 0.1 0.4
0.4 0.3 0.0 0.0
0.2 0.5 0.8 0.0
0.1 0.0 0.1 0.6
julia> stack(P_predefined; dims=1)
4×4 Matrix{Float64}:
0.3 0.4 0.2 0.1
0.2 0.3 0.5 0.0
0.1 0.0 0.8 0.1
0.4 0.0 0.0 0.6
Note: your P_predefined is a vector of 1xn matrices, instead of vectors. I've used vec here to convert them to vectors.
vcat(A...)
Concatenate along dimension 1. To efficiently concatenate a large
vector of arrays, use reduce(vcat, x).
julia> reduce(vcat, P_predefined)
4×4 Matrix{Float64}:
0.3 0.4 0.2 0.1
0.2 0.3 0.5 0.0
0.1 0.0 0.8 0.1
0.4 0.0 0.0 0.6
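Note that the result of reduce(vcat, ...) depends on whether the elements are 1×n matrices (as in the question) or plain vectors (as after vec. in the previous answer). The following sketch, using the data from the question and illustrative variable names, shows the difference:

```julia
# The question's data: a vector of 1×4 matrices.
rows = [[0.3 0.4 0.2 0.1], [0.2 0.3 0.5 0.0], [0.1 0.0 0.8 0.1], [0.4 0.0 0.0 0.6]]

# Stacking 1×4 matrices vertically gives a 4×4 matrix, one input per row.
by_rows = reduce(vcat, rows)

# With plain vectors, hcat puts one input per column instead.
by_cols = reduce(hcat, vec.(rows))

# vcat on plain vectors just concatenates them into one long vector.
flat = reduce(vcat, vec.(rows))

println(size(by_rows))                    # (4, 4)
println(by_cols == permutedims(by_rows))  # true
println(length(flat))                     # 16
```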
Since you mention efficiency: in my benchmark, a comprehension was more than 2x faster than reduce(vcat, ...).
using BenchmarkTools
m, n = length(P_predefined), length(P_predefined[1])
@btime mat = [$P_predefined[i][j] for i=1:$m, j=1:$n]
# 29.347 ns (1 allocation: 192 bytes)
@btime mat = reduce(vcat, $P_predefined)
# 72.131 ns (1 allocation: 192 bytes)

Show what the calculated bins breaks are in a histogram

It is my understanding that when plotting a histogram, not every unique data point gets its own bin; instead, an algorithm calculates how many bins to use. How do I find out how the data were partitioned into bins (e.g. 0-5, 6-10, ...)? How do I get R to show me where the breaks are via text output?
I've found various methods to calculate the number of bins, but that's just theory.
I think you need to use $breaks:
set.seed(10)
hist(rnorm(200,0,1),20)$breaks
[1] -2.4 -2.2 -2.0 -1.8 -1.6 -1.4 -1.2 -1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0 2.2 2.4

Remove values in vector from double variable in R

I have a variable of type double X: 1.5 1.3 0.6 1.8 2.9 2.1 1.5 1.4 5.8 0.0
and a vector V: c(0.6,2.9). I want to remove the values in V from X
test<-X[!X %in% V]
The values are not removed from test:
test
[1] 1.5 1.3 0.6 1.8 2.9 2.1 1.5 1.4 5.8 0.0
I tried the following:
are.equal <- function(x, y, eps = .Machine$double.eps^0.5) abs(x - y) < eps
test=X[!(are.equal(X,0.6))]
With this, 0.6 was removed.
I could have something odd in my data or my system.
Any idea?

rollapply function on specific column of dataframes within list

I must admit to complete lunacy when trying to understand how functions within functions are defined and passed in R. The examples always presume you understand every nuance and don't provide descriptions of the process. I have yet to come across a plain English, idiots guide break down of the process. So the first question is do you know of one?
Now my physical problem.
I have a list of data.frames: fileData.
I want to use the rollapply() function on specific columns in each data.frame. I then want all the results(lists) combined. So starting with one of the data.frames using the built in mtcars dataframes as an example:
Of course I need to tell rollapply() to use the function PPI() along with the associated parameters which are the columns.
PPI <- function(a, b){
  value = (a + b)
  PPI = sum(value)
  return(PPI)
}
I tried this:
f <- function(x) PPI(x$mpg, x$disp)
fileData<- list(mtcars, mtcars, mtcars)
df <- fileData[[1]]
and got stopped at
rollapply(df, 20, f)
Error in x$mpg : $ operator is invalid for atomic vectors
I think this is related to Zoo using matrices but other numerous attempts couldn't resolve the rollapply issue. So moving onto what I believe is next:
lapply(fileData, function(x) rollapply ......
Seems a mile away. Some guidance and solutions would be very welcome.
Thanks.
I will try to help you and show how you can debug the problem. One very helpful skill in R is learning how to debug. Generally I use the browser function.
Problem:
Here I am changing your function f by adding one line:
f <- function(x) {
  browser()
  PPI(x$changeFactor_A, x$changeFactor_B)
}
Now when you run :
rollapply(df, 1, f)
The debugger stops and you can inspect the value of the argument x:
Browse[1]> x
[1,]
1e+05
As you can see, x is a scalar value, so you can't apply the $ operator to it; hence the error:
Error in x$changeFactor_A : $ operator is invalid for atomic vectors
general guides
Now I will explain how you should do this.
Either you change your PPI function to take a single parameter, excess, so that you do the subtraction outside of it (easier)
Or you use mapply to get a generalized solution. (Harder but more general and very useful)
Avoid using $ within functions. Personally, I use it only on the R console.
complete solution:
I assume that your data.frames (zoo objects) have changeFactor_A and changeFactor_B columns.
sapply(fileData, function(dat) {
  dat <- transform(dat, excess = changeFactor_A - changeFactor_B)
  rollapply(dat[, 'excess'], 2, sum)
})
Or more generally:
sapply(fileData, function(dat) {
  excess <- get_excess(dat, 'changeFactor_A', 'changeFactor_B')
  rollapply(excess, 2, sum)
})
Where
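get_excess <- function(data, colA, colB) {
  ### do whatever you want here, as long as you return a vector;
  ### for example, the difference of the two columns:
  excess <- data[, colA] - data[, colB]
  excess
}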
Look at the "Usage" section of the help page to ?rollapply. I'll admit that R help pages are not easy to parse, and I see how you got confused.
The problem is that rollapply can deal with ts, zoo or general numeric vectors, but only a single series. You are feeding it a function that takes two arguments, asset and benchmark. Granted, your f and PPI can trivially be vectorized, but rollapply simply isn't made for that.
Solution: calculate your excess outside rollapply (it is a simple vectorized calculation that does not involve any rolling window), and only then apply rollapply to it:
> mtcars$excess <- mtcars$mpg-mtcars$disp
> rollapply(mtcars$excess, 3, sum)
[1] -363.2 -460.8 -663.1 -784.8 -893.9 ...
You may possibly be interested in mapply, which vectorizes a function for multiple arguments, similarly to apply and friends, which work on single arguments. However, I know of no analogue of mapply with rolling windows.
I sweated away and took some time to slowly understand how to break down the process and protocol of calling a function with arguments from another function. A great site that helped was Advanced R from the one and only Hadley Wickham, again! The pictures showing the process breakdown are near ideal. Although I still needed my thinking cap on for a few details.
Here is a complete example with notes. Hopefully someone else finds it useful.
library(zoo)
#Create a list of dataframes for the example.
listOfDataFrames<- list(mtcars, mtcars, mtcars)
#Give each element a name.
names(listOfDataFrames) <- c("A", "B", "C")
#This is a simple function just for the example!
#I want to perform this function on column 'col' of matrix 'm'.
#Of course to make the whole task worthwhile, this function is usually something more complex.
fApplyFunction <- function(m,col){
  mean(m[,col])
}
#This function is called from lapply() and does 'something' to the dataframe that is passed.
#I created this function to keep lapply() very simple.
#The something is to apply the function fApplyFunction(), which requires an argument 'thisCol'.
fOnEachElement <- function(thisDF, thisCol){
  #Convert to matrix for zoo library.
  thisMatrix <- as.matrix(thisDF)
  rollapply(thisMatrix, 5, fApplyFunction, thisCol, partial = FALSE, by.column = FALSE)
}
#This is where the program really starts!
#
#Apply a function to each element of list.
#The list is 'listOfDataFrames', with each element being a dataframe.
#The function to apply to each element is 'fOnEachElement'
#The additional argument for 'fOnEachElement' is "vs", which is the name of the column I want the function performed on.
#lapply() returns each result as an element of a list.
listResults <- lapply(listOfDataFrames, fOnEachElement, "vs")
#Combine all elements of the list into one dataframe.
combinedResults <- do.call(cbind, listResults)
#Now that I understand the argument passing, I could call rollapply() directly from lapply()...
#Note that ONLY the additional arguments of rollapply() are passed. The primary argument is passed automatically by lapply().
listResults2 <- lapply(listOfDataFrames, rollapply, 5, fApplyFunction, "vs", partial = FALSE, by.column = FALSE)
Results:
> combinedResults
A B C
[1,] 0.4 0.4 0.4
[2,] 0.6 0.6 0.6
[3,] 0.6 0.6 0.6
[4,] 0.6 0.6 0.6
[5,] 0.6 0.6 0.6
[6,] 0.8 0.8 0.8
[7,] 0.8 0.8 0.8
[8,] 0.8 0.8 0.8
[9,] 0.6 0.6 0.6
[10,] 0.4 0.4 0.4
[11,] 0.2 0.2 0.2
[12,] 0.0 0.0 0.0
[13,] 0.0 0.0 0.0
[14,] 0.2 0.2 0.2
[15,] 0.4 0.4 0.4
[16,] 0.6 0.6 0.6
[17,] 0.8 0.8 0.8
[18,] 0.8 0.8 0.8
[19,] 0.6 0.6 0.6
[20,] 0.4 0.4 0.4
[21,] 0.2 0.2 0.2
[22,] 0.2 0.2 0.2
[23,] 0.2 0.2 0.2
[24,] 0.4 0.4 0.4
[25,] 0.4 0.4 0.4
[26,] 0.4 0.4 0.4
[27,] 0.2 0.2 0.2
[28,] 0.4 0.4 0.4
> listResults
$A
[1] 0.4 0.6 0.6 0.6 0.6 0.8 0.8 0.8 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 0.8 0.6
[20] 0.4 0.2 0.2 0.2 0.4 0.4 0.4 0.2 0.4
$B
[1] 0.4 0.6 0.6 0.6 0.6 0.8 0.8 0.8 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 0.8 0.6
[20] 0.4 0.2 0.2 0.2 0.4 0.4 0.4 0.2 0.4
$C
[1] 0.4 0.6 0.6 0.6 0.6 0.8 0.8 0.8 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 0.8 0.6
[20] 0.4 0.2 0.2 0.2 0.4 0.4 0.4 0.2 0.4
> listResults2
$A
[1] 0.4 0.6 0.6 0.6 0.6 0.8 0.8 0.8 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 0.8 0.6
[20] 0.4 0.2 0.2 0.2 0.4 0.4 0.4 0.2 0.4
$B
[1] 0.4 0.6 0.6 0.6 0.6 0.8 0.8 0.8 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 0.8 0.6
[20] 0.4 0.2 0.2 0.2 0.4 0.4 0.4 0.2 0.4
$C
[1] 0.4 0.6 0.6 0.6 0.6 0.8 0.8 0.8 0.6 0.4 0.2 0.0 0.0 0.2 0.4 0.6 0.8 0.8 0.6
[20] 0.4 0.2 0.2 0.2 0.4 0.4 0.4 0.2 0.4

How do I make a list of numbers by tenths?

I know that 1:10 will give me a vector of all integers from 1 to 10, but how can I get numbers from 1 to 2 going up by tenths (i.e., 1.0, 1.1, 1.2, ..., 2.0)?
Try seq
> seq(1, 2, by = 0.1)
[1] 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
Just in the spirit of there is more than one way to do things, another option is:
> (10:20)/10
[1] 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0

Resources