Type Hierarchy for Containers Values confusing [duplicate] - julia

I try to understand typing in Julia and encounter the following problem with Array. I wrote a function bloch_vector_2d(Array{Complex,2}); the detailed implementation is irrelevant. When calling, here is the complaint:
julia> bloch_vector_2d(rhoA)
ERROR: MethodError: no method matching bloch_vector_2d(::Array{Complex{Float64},2})
Closest candidates are:
bloch_vector_2d(::Array{Complex,2}) at REPL[56]:2
bloch_vector_2d(::StateAB) at REPL[54]:1
Stacktrace:
[1] top-level scope at REPL[64]:1
The problem is that an array of parent type is not automatically a parent of an array of child type.
julia> Complex{Float64} <: Complex
true
julia> Array{Complex{Float64},2} <: Array{Complex,2}
false
I think it would make sense to impose in julia that Array{Complex{Float64},2} <: Array{Complex,2}. Or what is the right way to implement this in Julia? Any helps or comments are appreciated!

This issue is discussed in detail in the Julia Manual here.
Quoting the relevant part of it:
In other words, in the parlance of type theory, Julia's type parameters are invariant, rather than being covariant (or even contravariant). This is for practical reasons: while any instance of Point{Float64} may conceptually be like an instance of Point{Real} as well, the two types have different representations in memory:
An instance of Point{Float64} can be represented compactly and efficiently as an immediate pair of 64-bit values;
An instance of Point{Real} must be able to hold any pair of instances of Real. Since objects that are instances of Real can be of arbitrary size and structure, in practice an instance of Point{Real} must be represented as a pair of pointers to individually allocated Real objects.
Now going back to your question how to write a method signature then you have:
julia> Array{Complex{Float64},2} <: Array{<:Complex,2}
true
Note the difference:
Array{<:Complex,2} represents a union of all types that are 2D arrays whose eltype is a subtype of Complex (i.e. no array will have this exact type).
Array{Complex,2} is a type that an array can have and this type means that you can store Complex values in it that can have mixed parameter.
Here is an example:
julia> x = Complex[im 1im;
1.0im Float16(1)im]
2×2 Array{Complex,2}:
im 0+1im
0.0+1.0im 0.0+1.0im
julia> typeof.(x)
2×2 Array{DataType,2}:
Complex{Bool} Complex{Int64}
Complex{Float64} Complex{Float16}
Also note that the notation Array{<:Complex,2} is the same as writing Array{T,2} where T<:Complex (or more compactly Matrix{T} where T<:Complex).

This is more of a comment, but I can't hesitate posting it. This question apprars so often. I'll tell you why that phenomenon must arise.
A Bag{Apple} is a Bag{Fruit}, right? Because, when I have a JuicePress{Fruit}, I can give it a Bag{Apple} to make some juice, because Apples are Fruits.
But now we run into a problem: my fruit juice factory, in which I process different fruits, has a failure. I order a new JuicePress{Fruit}. Now, I unfortunately get delivered a replacement JuicePress{Lemon} -- but Lemons are Fruits, so surely a JuicePress{Lemon} is a JuicePress{Fruit}, right?
However, the next day, I feed apples to the new press, and the machine explodes. I hope you see why: JuicePress{Lemon} is not a JuicePress{Fruit}. On the contrary: a JuicePress{Fruit} is a JuicePress{Lemon} -- I can press lemons with a fruit-agnostic press! They could have sent me a JuicePress{Plant}, though, since Fruits are Plants.
Now we can get more abstract. The real reason is: function input arguments are contravariant, while function output arguments are covariant (in an idealized setting)2. That is, when we have
f : A -> B
then I can pass in supertypes of A, and end up with subtypes of B. Hence, when we fix the first argument, the induced function
(Tree -> Apple) <: (Tree -> Fruit)
whenever Apple <: Fruit -- this is the covariant case, it preserves the direction of <:. But when we fix the second one,
(Fruit -> Juice) <: (Apple -> Juice)
whenever Fruit >: Apple -- this inverts the diretion of <:, and therefore is called contravariant.
This carries over to other parametric data types, since there, too, you usually have "output-like" parameters (as in the Bag), and "input-like" parameters (as with the JuicePress). There can also be parameters that behave like neither (e.g., when they occur in both fashions) -- these are then called invariant.
There are now two ways in which languages with parametric types solve this problem. The, in my opinion, more elegant one is to mark every parameter: no annotation means invariant, + means covariant, - means contravariant (this has technical reasons -- those parameters are said to occur in "positive" and "negative position"). So we had the Bag[+T <: Fruit], or the JuicePress[-T <: Fruit] (should be Scala syntax, but I haven't tried it). This makes subtyping more complicated, though.
The other route to go is what Julia does (and, BTW, Java): all types are invariant1, but you can specify upper and lower unions at the call site. So you have to say
makejuice(::JoicePress{>:T}, ::Bag{<:T}) where {T}
And that's how we arrive at the other answers.
1Except for tuples, but that's weird.
2This terminology comes from category theory. The Hom-functor is contravariant in the first, and covariant in the second argument. There's an intuitive realization of subtyping through the "forgetful" functor from the category Typ to the poset of Types under the <: relation. And the CT terminology in turn comes from tensors.

While the "how it works" discussion has been done in the another answer, the best way to implement your method is the following:
function bloch_vector_2d(a::AbstractArray{Complex{T}}) where T<:Real
sum(a) + 5*one(T) # returning something to see how this is working
end
Now this will work like this:
julia> bloch_vector_2d(ones(Complex{Float64},4,3))
17.0 + 0.0im

Related

Arrays of abstract type in julia in functions

I try to understand typing in Julia and encounter the following problem with Array. I wrote a function bloch_vector_2d(Array{Complex,2}); the detailed implementation is irrelevant. When calling, here is the complaint:
julia> bloch_vector_2d(rhoA)
ERROR: MethodError: no method matching bloch_vector_2d(::Array{Complex{Float64},2})
Closest candidates are:
bloch_vector_2d(::Array{Complex,2}) at REPL[56]:2
bloch_vector_2d(::StateAB) at REPL[54]:1
Stacktrace:
[1] top-level scope at REPL[64]:1
The problem is that an array of parent type is not automatically a parent of an array of child type.
julia> Complex{Float64} <: Complex
true
julia> Array{Complex{Float64},2} <: Array{Complex,2}
false
I think it would make sense to impose in julia that Array{Complex{Float64},2} <: Array{Complex,2}. Or what is the right way to implement this in Julia? Any helps or comments are appreciated!
This issue is discussed in detail in the Julia Manual here.
Quoting the relevant part of it:
In other words, in the parlance of type theory, Julia's type parameters are invariant, rather than being covariant (or even contravariant). This is for practical reasons: while any instance of Point{Float64} may conceptually be like an instance of Point{Real} as well, the two types have different representations in memory:
An instance of Point{Float64} can be represented compactly and efficiently as an immediate pair of 64-bit values;
An instance of Point{Real} must be able to hold any pair of instances of Real. Since objects that are instances of Real can be of arbitrary size and structure, in practice an instance of Point{Real} must be represented as a pair of pointers to individually allocated Real objects.
Now going back to your question how to write a method signature then you have:
julia> Array{Complex{Float64},2} <: Array{<:Complex,2}
true
Note the difference:
Array{<:Complex,2} represents a union of all types that are 2D arrays whose eltype is a subtype of Complex (i.e. no array will have this exact type).
Array{Complex,2} is a type that an array can have and this type means that you can store Complex values in it that can have mixed parameter.
Here is an example:
julia> x = Complex[im 1im;
1.0im Float16(1)im]
2×2 Array{Complex,2}:
im 0+1im
0.0+1.0im 0.0+1.0im
julia> typeof.(x)
2×2 Array{DataType,2}:
Complex{Bool} Complex{Int64}
Complex{Float64} Complex{Float16}
Also note that the notation Array{<:Complex,2} is the same as writing Array{T,2} where T<:Complex (or more compactly Matrix{T} where T<:Complex).
This is more of a comment, but I can't hesitate posting it. This question apprars so often. I'll tell you why that phenomenon must arise.
A Bag{Apple} is a Bag{Fruit}, right? Because, when I have a JuicePress{Fruit}, I can give it a Bag{Apple} to make some juice, because Apples are Fruits.
But now we run into a problem: my fruit juice factory, in which I process different fruits, has a failure. I order a new JuicePress{Fruit}. Now, I unfortunately get delivered a replacement JuicePress{Lemon} -- but Lemons are Fruits, so surely a JuicePress{Lemon} is a JuicePress{Fruit}, right?
However, the next day, I feed apples to the new press, and the machine explodes. I hope you see why: JuicePress{Lemon} is not a JuicePress{Fruit}. On the contrary: a JuicePress{Fruit} is a JuicePress{Lemon} -- I can press lemons with a fruit-agnostic press! They could have sent me a JuicePress{Plant}, though, since Fruits are Plants.
Now we can get more abstract. The real reason is: function input arguments are contravariant, while function output arguments are covariant (in an idealized setting)2. That is, when we have
f : A -> B
then I can pass in supertypes of A, and end up with subtypes of B. Hence, when we fix the first argument, the induced function
(Tree -> Apple) <: (Tree -> Fruit)
whenever Apple <: Fruit -- this is the covariant case, it preserves the direction of <:. But when we fix the second one,
(Fruit -> Juice) <: (Apple -> Juice)
whenever Fruit >: Apple -- this inverts the diretion of <:, and therefore is called contravariant.
This carries over to other parametric data types, since there, too, you usually have "output-like" parameters (as in the Bag), and "input-like" parameters (as with the JuicePress). There can also be parameters that behave like neither (e.g., when they occur in both fashions) -- these are then called invariant.
There are now two ways in which languages with parametric types solve this problem. The, in my opinion, more elegant one is to mark every parameter: no annotation means invariant, + means covariant, - means contravariant (this has technical reasons -- those parameters are said to occur in "positive" and "negative position"). So we had the Bag[+T <: Fruit], or the JuicePress[-T <: Fruit] (should be Scala syntax, but I haven't tried it). This makes subtyping more complicated, though.
The other route to go is what Julia does (and, BTW, Java): all types are invariant1, but you can specify upper and lower unions at the call site. So you have to say
makejuice(::JoicePress{>:T}, ::Bag{<:T}) where {T}
And that's how we arrive at the other answers.
1Except for tuples, but that's weird.
2This terminology comes from category theory. The Hom-functor is contravariant in the first, and covariant in the second argument. There's an intuitive realization of subtyping through the "forgetful" functor from the category Typ to the poset of Types under the <: relation. And the CT terminology in turn comes from tensors.
While the "how it works" discussion has been done in the another answer, the best way to implement your method is the following:
function bloch_vector_2d(a::AbstractArray{Complex{T}}) where T<:Real
sum(a) + 5*one(T) # returning something to see how this is working
end
Now this will work like this:
julia> bloch_vector_2d(ones(Complex{Float64},4,3))
17.0 + 0.0im

Julia: write a union of types using "abstract type"

I am trying to write the following union of types:
FloatVector = Union{Array{Float64,1}, DataArray{Float64,1}, Array{Union{Float64, Missings.Missing},1}};
using the abstract types syntax. Ideally, I would like to do something similar to this link. I tried the below, but unfortunately it does not work:
abstract type FloatVector end
type Array{Float64,1} <: FloatVector end
type DataArray{Float64,1} <: FloatVector end
type Array{Union{Float64, Missings.Missing},1} <: FloatVector end
I am not confident with abstract types and I couldn't find a good reference on a similar problem. I would be glad if you could explain me how to proceed and the advantages over the Union.
It is not possible to do what you want using abstract types in Julia. Union is a way to represent your requirement.
Now to understand why observe that in Julia type system you have three restrictions:
Only abstract types can have subtypes.
A type can have only one supertype.
You have to specify a supertype of a type when defining it.
In this way we know that types create a tree like structure, where the root is Any (with the exception of Union{} which has no values and is subtype of all other types, but you will probably not need this in practice).
Your definition would violate rule #2 and rule #3. Observe that all those types are already defined (either in Base or in a package) - so you cannot redefine them (rule #3). Also they already have a supertype so you cannot add another (rule #2). See for example the result of this call:
julia> supertype(Array{Float64, 1})
DenseArray{Float64,1}
And you see that Array{Float64, 1} is already defined and has a supertype.
Having said that maybe another more general definition will be just enough for you:
const FloatOrMissingVector = AbstractVector{<:Union{Float64, Missing}}
This definition has two differences:
It allows other than dense vectors (which I guess is what you would want anyway as typically you do not care if a vector is dense or sparse)
It would allow e.g. Vector{Missing} as a subtype (so a structure containing only missing values) - if you would want to rule this out you would have to specify union as you have proposed above.

Function of parameter type in type definition

Assume I want to store I vector together with its norm. I expected the corresponding type definition to be straightforward:
immutable VectorWithNorm1{Vec <: AbstractVector}
vec::Vec
norm::eltype(Vec)
end
However, this doesn't work as intended:
julia> fieldtype(VectorWithNorm1{Vector{Float64}},:norm)
Any
It seems I have to do
immutable VectorWithNorm2{Vec <: AbstractVector, Eltype}
vec::Vec
norm::Eltype
end
and rely on the user to not abuse the Eltype parameter. Is this correct?
PS: This is just a made-up example to illustrate the problem. It is not the actual problem I'm facing.
Any calculations on a type parameter currently do not work
(although I did discuss the issue with Jeff Bezanson at JuliaCon, and he seemed amenable to fixing it).
The problem currently is that the expression for the type of norm gets evaluated when the parameterized type is defined, and gets called with a TypeVar, but it is not yet bound to a value, which is what you really need it to be called with, at the time that that parameter is actually bound to create a concrete type.
I've run into this a lot, where I want to do some calculation on the number of bits of a floating point type, i.e. to calculate and use the number of UInts needed to store a fp value of a particular precision, and use an NTuple{N,UInt} to hold the mantissa.

How can I dispatch on traits relating two types, where the second type that co-satisfies the trait is uniquely determined by the first?

Say I have a Julia trait that relates to two types: one type is a sort of "base" type that may satisfy a sort of partial trait, the other is an associated type that is uniquely determined by the base type. (That is, the relation from BaseType -> AssociatedType is a function.) Together, these types satisfy a composite trait that is the one of interest to me.
For example:
using Traits
#traitdef IsProduct{X} begin
isnew(X) -> Bool
coolness(X) -> Float64
end
#traitdef IsProductWithMeasurement{X,M} begin
#constraints begin
istrait(IsProduct{X})
end
measurements(X) -> M
#Maybe some other stuff that dispatches on (X,M), e.g.
#fits_in(X,M) -> Bool
#how_many_fit_in(X,M) -> Int64
#But I don't want to implement these now
end
Now here are a couple of example types. Please ignore the particulars of the examples; they are just meant as MWEs and there is nothing relevant in the details:
type Rope
color::ASCIIString
age_in_years::Float64
strength::Float64
length::Float64
end
type Paper
color::ASCIIString
age_in_years::Int64
content::ASCIIString
width::Float64
height::Float64
end
function isnew(x::Rope)
(x.age_in_years < 10.0)::Bool
end
function coolness(x::Rope)
if x.color=="Orange"
return 2.0::Float64
elseif x.color!="Taupe"
return 1.0::Float64
else
return 0.0::Float64
end
end
function isnew(x::Paper)
(x.age_in_years < 1.0)::Bool
end
function coolness(x::Paper)
(x.content=="StackOverflow Answers" ? 1000.0 : 0.0)::Float64
end
Since I've defined these functions, I can do
#assert istrait(IsProduct{Rope})
#assert istrait(IsProduct{Paper})
And now if I define
function measurements(x::Rope)
(x.length)::Float64
end
function measurements(x::Paper)
(x.height,x.width)::Tuple{Float64,Float64}
end
Then I can do
#assert istrait(IsProductWithMeasurement{Rope,Float64})
#assert istrait(IsProductWithMeasurement{Paper,Tuple{Float64,Float64}})
So far so good; these run without error. Now, what I want to do is write a function like the following:
#traitfn function get_measurements{X,M;IsProductWithMeasurement{X,M}}(similar_items::Array{X,1})
all_measurements = Array{M,1}(length(similar_items))
for i in eachindex(similar_items)
all_measurements[i] = measurements(similar_items[i])::M
end
all_measurements::Array{M,1}
end
Generically, this function is meant to be an example of "I want to use the fact that I, as the programmer, know that BaseType is always associated to AssociatedType to help the compiler with type inference. I know that whenever I do a certain task [in this case, get_measurements, but generically this could work in a bunch of cases] then I want the compiler to infer the output type of that function in a consistently patterned way."
That is, e.g.
do_something_that_makes_arrays_of_assoc_type(x::BaseType)
will always spit out Array{AssociatedType}, and
do_something_that_makes_tuples(x::BaseType)
will always spit out Tuple{Int64,BaseType,AssociatedType}.
AND, one such relationship holds for all pairs of <BaseType,AssociatedType>; e.g. if BatmanType is the base type to which RobinType is associated, and SupermanType is the base type to which LexLutherType is always associated, then
do_something_that_makes_tuple(x::BatManType)
will always output Tuple{Int64,BatmanType,RobinType}, and
do_something_that_makes_tuple(x::SuperManType)
will always output Tuple{Int64,SupermanType,LexLutherType}.
So, I understand this relationship, and I want the compiler to understand it for the sake of speed.
Now, back to the function example. If this makes sense, you will have realized that while the function definition I gave as an example is 'correct' in the sense that it satisfies this relationship and does compile, it is un-callable because the compiler doesn't understand the relationship between X and M, even though I do. In particular, since M doesn't appear in the method signature, there is no way for Julia to dispatch on the function.
So far, the only thing I have thought to do to solve this problem is to create a sort of workaround where I "compute" the associated type on the fly, and I can still use method dispatch to do this computation. Consider:
function get_measurement_type_of_product(x::Rope)
Float64
end
function get_measurement_type_of_product(x::Paper)
Tuple{Float64,Float64}
end
#traitfn function get_measurements{X;IsProduct{X}}(similar_items::Array{X,1})
M = get_measurement_type_of_product(similar_items[1]::X)
all_measurements = Array{M,1}(length(similar_items))
for i in eachindex(similar_items)
all_measurements[i] = measurements(similar_items[i])::M
end
all_measurements::Array{M,1}
end
Then indeed this compiles and is callable:
julia> get_measurements(Array{Rope,1}([Rope("blue",1.0,1.0,1.0),Rope("red",2.0,2.0,2.0)]))
2-element Array{Float64,1}:
1.0
2.0
But this is not ideal, because (a) I have to redefine this map each time, even though I feel as though I already told the compiler about the relationship between X and M by making them satisfy the trait, and (b) as far as I can guess--maybe this is wrong; I don't have direct evidence for this--the compiler won't necessarily be able to optimize as well as I want, since the relationship between X and M is "hidden" inside the return value of the function call.
One last thought: if I had the ability, what I would ideally do is something like this:
#traitdef IsProduct{X} begin
isnew(X) -> Bool
coolness(X) -> Float64
∃ ! M s.t. measurements(X) -> M
end
and then have some way of referring to the type that uniquely witnesses the existence relationship, so e.g.
#traitfn function get_measurements{X;IsProduct{X},IsWitnessType{IsProduct{X},M}}(similar_items::Array{X,1})
all_measurements = Array{M,1}(length(similar_items))
for i in eachindex(similar_items)
all_measurements[i] = measurements(similar_items[i])::M
end
all_measurements::Array{M,1}
end
because this would be somehow dispatchable.
So: what is my specific question? I am asking, given that you presumably by this point understand that my goals are
Have my code exhibit this sort of structure generically, so that
I can effectively repeat this design pattern across a lot of cases
and then program in the abstract at the high-level of X and M,
and
do (1) in such a way that the compiler can still optimize to the best of its ability / is as aware of the relationship among
types as I, the coder, am
then, how should I do this? I think the answer is
Use Traits.jl
Do something pretty similar to what you've done so far
Also do ____some clever thing____ that the answerer will indicate,
but I'm open to the idea that in fact the correct answer is
Abandon this approach, you're thinking about the problem the wrong way
Instead, think about it this way: ____MWE____
I'd also be perfectly satisfied by answers of the form
What you are asking for is a "sophisticated" feature of Julia that is still under development, and is expected to be included in v0.x.y, so just wait...
and I'm less enthusiastic about (but still curious to hear) an answer such as
Abandon Julia; instead use the language ________ that is designed for this type of thing
I also think this might be related to the question of typing Julia's function outputs, which as I take it is also under consideration, though I haven't been able to puzzle out the exact representation of this problem in terms of that one.

Hash instability in Julia composite types

In Julia, composite types with at least one field that have identical values hash to different values. This means composite types don't function correctly if you use them as dictionary keys or anything else that is dependent on the hashed value. This behavior is inconsistent with the behavior of other types, such as Vector{Int} as well.
More concretely,
vectors of non-composite types that are different objects but have identical values hash to the same value:
julia> hash([1,2,3])==hash([1,2,3])
true
composite types with no fields hash to the same value:
julia> type A end
julia> hash(A())==hash(A())
true
composite types with at least one field hash to different values if they're different objects that have the same value:
julia> type B
b::Int
end
julia> hash(B(1))==hash(B(1))
false
however, the same object maintains its hash even if the underlying values change:
julia> b=B(1)
julia> hash(b)
0x0b2c67452351ff52
julia> b.b=2;
julia> hash(b)
0x0b2c67452351ff52
this is inconsistent with the behavior of vectors (if you change an element, the hash changes):
julia> a = [1,2,3];
julia> hash(a)
0xd468fb40d24a17cf
julia> a[1]=2;
julia> hash(a)
0x777c61a790f5843f
this issue is not present for immutable types:
julia> immutable C
c::Int
end
julia> hash(C(1))==hash(C(1))
true
Is there something fundamental driving this behavior from a language design perspective? Are there plans to fix or correct this behavior?
I'm not a Julia language designer, but I'l say this sort of behavior is not surprising when comparing mutable and immutable values. Your type B is mutable: it's not entirely obvious that two instances of it, even if they have the same value for field b, should be considered equal. If you feel like this should be the case, you are free to implement a hash function for it. But in general, mutable objects have independent identities. Immutable entities, like C are impossible to tell apart from each other, therefore it makes sense for them to obey structural hashing.
Two bank accounts that happen to have $5 in them are not identical and probably shouldn't hash to the same number. But two instances of $5 are impossible to distinguish from each other.

Resources