Julia - Alternate fieldnames for types

If I have a type called Measurements
type Measurements
x
y
z
end
where x represents width, y height, and z depth.
However, for other types the intended meaning of such field names may not be so clear. Is there a way to make it so that I could call both of the following?
julia> m = Measurements(10,5,12);
julia> m.x
10
julia> m.width
10
I have tried overloading getfield like so
julia> Base.getfield(m::Measurement, width) = m.x
which returns the error
ERROR: cannot add methods to a builtin function

The comments suggest a few ways to get better looking field access, and DNF's suggestion to use accessor functions (setters/getters) is a good encapsulation pattern. Another option allowed by the Julia language is to define:
Base.getindex(v::Measurements,s::Symbol) = s==:width ? v.x : s==:height ? v.y : s==:depth ? v.z : throw(BoundsError(s))
then,
m[:width], m[:height], m[:depth]
work. Is this simple enough? In the same way, more aliases can be added. There might be some minor runtime cost involved, so hot loops should stick to the usual dot-based field access.
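For completeness, the accessor-function (getter) approach mentioned above is also only a few lines. A minimal sketch (the names width, height and depth are just illustrative):
width(m::Measurements) = m.x
height(m::Measurements) = m.y
depth(m::Measurements) = m.z
julia> width(Measurements(10, 5, 12))
10
Unlike the getindex approach, these trivial getters compile down to plain field access, so they are also fine in hot loops.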


Failure to report number that is too small

I did the following calculations in Julia
z = LinRange(-0.09025000000000001,0.19025000000000003,5)
d = Normal.(0.05*(1-0.95) .+ 0.95.*z .- 0.0051^2/2, 0.0051 .* (similar(z) .*0 .+1))
minimum(cdf.(d, (z[3]+z[2])/2))
The problem I have is that the last line sometimes gives me the correct result 4.418051841202834e-239, and sometimes reports the error DomainError with NaN: Normal: the condition σ >= zero(σ) is not satisfied. I think this is because 4.418051841202834e-239 is too small. But I was wondering why my code can give me different results.
In addition to points mentioned by others, here are a few more:
Firstly, don't use LinRange when numerical accuracy is of importance. This is what the range function is for. LinRange can be used when numerical precision is of lesser importance, since it is faster. From the docstring of range:
Special care is taken to ensure intermediate values are computed rationally. To avoid this induced overhead, see the LinRange constructor.
Example:
julia> LinRange(-0.09025000000000001,0.19025000000000003,5) .- range(-0.09025000000000001,0.19025000000000003,5)
0.0:-3.469446951953614e-18:-1.3877787807814457e-17
Secondly, this is a pretty terrible way to create a vector of a certain value:
0.0051 .* (similar(z) .*0 .+1)
Others have mentioned ones etc., but I think it's better to use fill:
fill(0.0051, size(z))
which directly fills the array with the right value. Perhaps one should use convert(eltype(z), 0.0051) inside fill.
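Combining the two suggestions, a sketch:
fill(convert(eltype(z), 0.0051), size(z))  # same value, with the element type of z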
Thirdly, don't create this vector at all! You use broadcasting, so just use the scalar value:
d = Normal.(0.05*(1-0.95) .+ 0.95.*z .- 0.0051^2/2, 0.0051) # look! just a scalar!
This is how broadcasting works, it expands singleton dimensions implicitly to match other arguments (without actually wasting that memory).
Much of the point of broadcasting is that you don't need to create that sort of 'dummy arrays' anymore. If you find yourself doing that, give it another think; constant-valued arrays are inherently wasteful, and you shouldn't need to create them.
There are two problems:
Noted by Dan Getz: similar does not initialize the values, and quite often unused areas of memory contain bit patterns corresponding to NaN. In that case multiplying by 0 does not help, since NaN * 0 is still NaN. Instead you want ones(eltype(z), size(z)).
You need to use higher precision than Float64. BigFloat is one way to go - you just need to remember to call setprecision(BigFloat, 128) so that you actually control how many bits you use. However, a much more time-efficient solution (if you run computations at scale) is to use a dedicated package such as DoubleFloats.
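For illustration, a minimal BigFloat-based sketch (assuming Distributions is loaded, as in the question):
using Distributions
setprecision(BigFloat, 128)                   # control how many bits are used
lo = parse(BigFloat, "-0.09025000000000001")
hi = parse(BigFloat, "0.19025000000000003")
z = range(lo, hi, length=5)
d = Normal.(0.05*(1-0.95) .+ 0.95 .* z .- 0.0051^2/2, 0.0051)
minimum(cdf.(d, (z[3]+z[2])/2))               # evaluated at 128-bit precision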
Sample corrected code using DoubleFloats below:
julia> z = LinRange(df64"-0.09025000000000001",df64"0.19025000000000003",5)
5-element LinRange{Double64, Int64}:
-0.09025000000000001,-0.020125,0.05000000000000001,0.12012500000000002,0.19025000000000003
julia> d = Normal.(0.05*(1-0.95) .+ 0.95.*z .- 0.0051^2/2, 0.0051 .* ones(eltype(z),size(z)))
5-element Vector{Normal{Double64}}:
Normal{Double64}(μ=-0.083250505, σ=0.0051)
Normal{Double64}(μ=-0.016631754999999998, σ=0.0051)
Normal{Double64}(μ=0.049986995000000006, σ=0.0051)
Normal{Double64}(μ=0.11660574500000001, σ=0.0051)
Normal{Double64}(μ=0.18322449500000001, σ=0.0051)
julia> minimum(cdf.(d, (z[3]+z[2])/2))
4.418051841203009e-239
The problem in the code is similar(z), which produces a vector with undefined entries that is then used without initialization. Use ones(length(z)) instead.

What is the Rust way to deal with constant math when we only have a generic trait

Here is my broken example code. Note that I just want to double the value and allow anything which knows how to multiply.
fn by_two<T: Mul<Output = T>>(x: T) -> T {
x * 2
}
I think the issue you are running into is that you need to say what you are multiplying with. 2 defaults to an {integer}, which if unspecified counts as an i32, so here is that example:
fn by_two<T: Mul<i32, Output=T>>(x: T) -> T {
x * 2
}
Alternatively, you could just leave the result up to the trait. This is generally* equivalent to just using x * 2 without calling the function.
fn by_two<T: Mul<i32>>(x: T) -> <T as Mul<i32>>::Output {
x * 2
}
However, this won't work with most types since Rust usually expects primitives to be multiplied with other primitives of the same type. We could change it to use From<i32> to make it more inclusive.
fn by_two<T: Mul + From<i32>>(x: T) -> <T as Mul>::Output {
x * T::from(2)
}
At this point you may be noticing the problem. We need to figure out what type "2" is so it can multiply with T. The From<i32> solution works well for primitive types, but you will start to run into difficulties if you tried to plug a vector or matrix into the function. One solution would be to add a separate generic to allow the user to specify what it gets multiplied with.
fn by_two<T: Mul<K>, K: From<i32>>(x: T) -> <T as Mul<K>>::Output {
x * K::from(2)
}
However, this is starting to get a bit cumbersome. Unlike the previous versions, we can't let the compiler infer a type to use for K since there might be multiple types which could satisfy the requirements put on K.
One option which doesn't quite solve our problem, but may help us a bit would be to use a crate like num-traits. Though to be honest, this really just amounts to more or less the same thing as example 2, but with a more generalized concept of 2.
use num_traits::identities::One;
fn by_two<T: One + Add<Output=T>>(x: T) -> T {
x * (T::one() + T::one())
}
You have a ton of options, but none are all-inclusive. If you want to keep your math as generic as possible, you are probably best off writing the function first, then determining what mathematical properties the values must satisfy to make it work. You can use num-traits for basic trait bounds such as differentiating between floats and ints. Then for more complex concepts, I recommend using alga.
Well... what is two? For example, if I define my multiplication on Color as mixing two colors, then what does it mean to multiply a color by two?
The easiest way to solve this is by choosing a very small data type that two fits in (such as u8) and accepting any T which can (try to) convert from u8:
fn by_two<T: Mul<Output = T> + TryFrom<u8>>(x: T) -> T {
x * 2u8.try_into().ok().expect("could not convert two into T") // 2u8, so the literal's type matches the TryFrom<u8> bound
}
The more general (but cumbersome) way is to use the num crate, and accept anything that can be multiplied, and has a concept of One, using One + One as two.

dropping singleton dimensions in julia

I'm just playing around with Julia (1.0), and one thing that I use a lot in Python/numpy/MATLAB is the squeeze function to drop singleton dimensions.
I found out that one way to do this in Julia is:
a = rand(3, 3, 1);
a = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
The second line seems a bit cumbersome and not easy to read and parse instantly (this could also be my bias that I bring from other languages). However, I wonder if this is the canonical way to do this in Julia?
The actual answer to this question surprised me. What you are asking could be rephrased as:
why doesn't dropdims(a) remove all singleton dimensions?
I'm going to quote Tim Holy from the relevant issue here:
it's not possible to have squeeze(A) return a type that the compiler can infer---the sizes of the input matrix are a runtime variable, so there's no way for the compiler to know how many dimensions the output will have. So it can't possibly give you the type stability you seek.
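If you want to see this for yourself, one option (just a sketch) is to ask the compiler what it infers for such a function:
f(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
@code_warntype f(rand(3, 3, 1))   # the inferred return type is not a concrete Array{Float64,N}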
Type stability aside, there are also some other surprising implications of what you have written. For example, note that:
julia> f(a) = dropdims(a, dims = tuple(findall(size(a) .== 1)...))
f (generic function with 1 method)
julia> f(rand(1,1,1))
0-dimensional Array{Float64,0}:
0.9939103383167442
In summary, including such a method in Base Julia would encourage users to use it, resulting in potentially type-unstable code that, under some circumstances, will not be fast (something the core developers are strenuously trying to avoid). In languages like Python, rigorous type-stability is not enforced, and so you will find such functions.
Of course, nothing stops you from defining your own method as you have. And I don't think you'll find a significantly simpler way of writing it. For example, the proposition for Base that was not implemented was the method:
function squeeze(A::AbstractArray)
singleton_dims = tuple((d for d in 1:ndims(A) if size(A, d) == 1)...)
return squeeze(A, singleton_dims)
end
Just be aware of the potential implications of using it.
Let me simply add that "uncontrolled" dropdims (drop any singleton dimension) is a frequent source of bugs. For example, suppose you have some loop that asks for a data array A from some external source, and you run R = sum(A, dims=2) on it and then get rid of all singleton dimensions. But then suppose that one time out of 10000, your external source returns A for which size(A, 1) happens to be 1: boom, suddenly you're dropping more dimensions than you intended and perhaps at risk for grossly misinterpreting your data.
If you specify those dimensions manually instead (e.g., dropdims(R, dims=2)) then you are immune from bugs like these.
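A small sketch of that safer, explicit pattern:
A = rand(3, 4, 5)
R = sum(A, dims=2)        # size (3, 1, 5)
R = dropdims(R, dims=2)   # size (3, 5): only the dimension you reduced is dropped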
You can get rid of tuple in favor of a comma ,:
dropdims(a, dims = (findall(size(a) .== 1)...,))
I'm a bit surprised at Colin's revelation; surely something relying on 'reshape' is type stable? (plus, as a bonus, returns a view rather than a copy).
julia> function squeeze( A :: AbstractArray )
keepdims = Tuple(i for i in size(A) if i != 1);
return reshape( A, keepdims );
end;
julia> a = randn(2,1,3,1,4,1,5,1,6,1,7);
julia> size( squeeze(a) )
(2, 3, 4, 5, 6, 7)
No?

Can I use a subtype of a function parameter in the function definition?

I would like to use a subtype of a function parameter in my function definition. Is this possible? For example, I would like to write something like:
g{T1, T2<:T1}(x::T1, y::T2) = x + y
So that g will be defined for any x::T1 and any y that is a subtype of T1. Obviously, if I knew, for example, that T1 would always be Number, then I could write g{T<:Number}(x::Number, y::T) = x + y and this would work fine. But this question is for cases where T1 is not known until run-time.
Read on if you're wondering why I would want to do this:
A full description of what I'm trying to do would be a bit cumbersome, but what follows is a simplified example.
I have a parameterised type, and a simple method defined over that type:
type MyVectorType{T}
x::Vector{T}
end
f1!{T}(m::MyVectorType{T}, xNew::T) = (m.x[1] = xNew)
I also have another type, with an abstract super-type defined as follows
abstract MyAbstract
type MyType <: MyAbstract ; end
I create an instance of MyVectorType with vector element type set to MyAbstract using:
m1 = MyVectorType(Array(MyAbstract, 1))
I now want to place an instance of MyType in MyVectorType. I can do this, since MyType <: MyAbstract. However, I can't do this with f1!, since the function definition means that xNew must be of type T, and T will be MyAbstract, not MyType.
The two solutions I can think of to this problem are:
f2!(m::MyVectorType, xNew) = (m.x[1] = xNew)
f3!{T1, T2}(m::MyVectorType{T1}, xNew::T2) = T2 <: T1 ? (m.x[1] = xNew) : error("Oh dear!")
The first is essentially a duck-typing solution. The second performs the appropriate error check in the first step.
Which is preferred? Or is there a third, better solution I am not aware of?
The ability to define a function g{T, S<:T}(::Vector{T}, ::S) has been referred to as "triangular dispatch" as an analogy to diagonal dispatch: f{T}(::Vector{T}, ::T). (Imagine a table with a type hierarchy labelling the rows and columns, arranged such that the super types are to the top and left. The rows represent the element type of the first argument, and the columns the type of the second. Diagonal dispatch will only match the cells along the diagonal of the table, whereas triangular dispatch matches the diagonal and everything below it, forming a triangle.)
This simply isn't implemented yet. It's a complicated problem, especially once you start considering the scoping of T and S outside of function definitions and in the context of invariance. See issues #3766 and #6984 for more details.
So, practically, in this case, I think duck-typing is just fine. You're relying upon the implementation of MyVectorType to do the error checking when it assigns its elements, which it should be doing in any case.
The solution in base julia for setting elements of an array is something like this:
f!{T}(A::Vector{T}, x::T) = (A[1] = x)
f!{T}(A::Vector{T}, x) = f!(A, convert(T, x))
Note that it doesn't worry about the type hierarchy or the subtype "triangle." It just tries to convert x to T… which is a no-op if x::S, S<:T. And convert will throw an error if it cannot do the conversion or doesn't know how.
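For example, applied to a couple of concrete element types (a small usage sketch, using the same pre-0.5 syntax as the rest of this post):
A = Number[1, 2, 3]
f!(A, 4)      # 4 isa Number, so the first method assigns it directly
B = [1.0, 2.0, 3.0]
f!(B, 4)      # 4 is an Int, not a Float64: the second method converts, then assigns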
UPDATE: This is now implemented on the latest development version (0.6-dev)! In this case I think I'd still recommend using convert like I originally answered, but you can now define restrictions within the static method parameters in a left-to-right manner.
julia> f!{T1, T2<:T1}(A::Vector{T1}, x::T2) = "success!"
julia> f!(Any[1,2,3], 4.)
"success!"
julia> f!(Integer[1,2,3], 4.)
ERROR: MethodError: no method matching f!(::Array{Integer,1}, ::Float64)
Closest candidates are:
f!{T1,T2<:T1}(::Array{T1,1}, ::T2<:T1) at REPL[1]:1
julia> f!([1.,2.,3.], 4.)
"success!"

Unit-safe square roots

I just wondered how it is possible to write a user-defined square root function (sqrt) in a way that it interacts properly with F#'s unit system.
What it should be like:
let sqrt (x : float<'u ^ 2>) =
    let x' = x / 1.0<'u ^ 2> // Delete unit
    (x' ** 0.5) * 1.0<'u> // Reassign unit
But this is disallowed due to nonzero constants not being allowed to have generic units.
Is there a way to write this function? With the builtin sqrt it works fine, so what magic does it perform?
Allowing nonzero generic constants would make it very easy to break the safety of the type system for units (see Andrew Kennedy's papers). I believe that the answer to your last question is that sqrt is indeed magic in some sense in that it shouldn't be possible to define a parametric function with that type signature through normal means. However, it is possible to do what you want (at least in the current version of F#) by taking advantage of boxing and casting:
let sqrt (x : float<'u^2>) =
    let x' = (float x) ** 0.5 (* delete unit and calculate sqrt *)
    ((box x') :?> float<'u>)
kvb is right; more generally:
If you have a non-unit aware algorithm (e.g. say you write 'cube root'), and you want to put units on it, you can wrap the algorithm in a function with the right type signature and use e.g. "float" to 'cast away' the units as they come in and the box-and-downcast approach to 'add back' the appropriate units on the way out.
In the RTM release (after Beta2), F# will have primitive library functions for the 'add back units' step, since the box-and-downcast approach is currently a bit of a hack to overcome the lack of these primitives in the language/library.
