How to define multiple dispatch, when we have so many structs? - julia

If we have only a few structs, using multiple dispatch isn't a problem.
But when we have many structs, how should we use multiple dispatch?
For example, we have N structs like this:
struct An
a::Float64
end
And a function like:
f!(a::Ai) = exp(a.a)
When N is large, writing one method per struct becomes a headache.
And this function is short and simple; a real function could be much larger!

If the function definition is the same for all the structs, you could define them as a concrete type of some abstract type, and leave only one function that dispatches on the abstract type:
julia> abstract type Allmystructs end
julia> struct A1 <: Allmystructs
a::Float64
end
julia> struct A2 <: Allmystructs
a::Float64
end
julia> f(A :: Allmystructs) = exp(A.a)
f (generic function with 1 method)
julia> test1 = A1(5)
A1(5.0)
julia> test2 = A2(8)
A2(8.0)
julia> f(test1)
148.4131591025766
julia> f(test2)
2980.9579870417283
Of course, this may not be what you are looking for if the function definition for each type of struct is different. In that case, metaprogramming can be your friend.
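The two cases can also coexist. The following is a minimal sketch, assuming most structs share one definition and only a few need their own; the names AllMyStructs, A3, and the doubled result for A3 are hypothetical, chosen only for illustration:

```julia
abstract type AllMyStructs end

struct A1 <: AllMyStructs
    a::Float64
end
struct A2 <: AllMyStructs
    a::Float64
end
struct A3 <: AllMyStructs
    a::Float64
end

# Shared fallback: applies to every subtype that has no method of its own.
f(x::AllMyStructs) = exp(x.a)

# Hypothetical special case: A3 gets its own, more specific method.
f(x::A3) = 2 * exp(x.a)

f(A1(0.0))  # dispatches to the fallback
f(A3(0.0))  # dispatches to the A3-specific method
```

Dispatch picks the most specific applicable method, so only the structs that really differ need an extra definition.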

You can enumerate the names of your structs in a loop, and use @eval to generate and evaluate code for each of them:
julia> for S in [:A1, :A2, :A3]
@eval begin
struct $S
a::Float64
end
f(x::$S) = exp(x.a)
end
end
julia> A2(2)
A2(2.0)
julia> f(A2(2))
7.38905609893065
I defined the structs in the same place here because I was trying it out in the console.
But there might be better alternatives to this. eval is usually considered a sign of suboptimal design.

Usually when code has a lot of very similar structs, this suggests that maybe composition is an alternative.
As an example, suppose we have a colored geometric shape library with a lot of different structs:
const Point = Tuple{Float64, Float64}
struct Disc
center::Point
radius::Float64
red::Float64
green::Float64
blue::Float64
end
struct Rectangle
topleft::Point
bottomright::Point
red::Float64
green::Float64
blue::Float64
end
# ... etc., e.g. Triangle, Hexagon
Now suppose we want to introduce a luminance() function which returns the perceived luminance of the color of the shape. One way is to define a method for each struct, but since the methods are all the same, we can also do:
const Shape = Union{Disc, Rectangle, Triangle, Hexagon}
luminance(shape::Shape) = 0.299*shape.red + 0.587*shape.green + 0.114*shape.blue
This is still a little annoying because we need to have all the shapes available in one place in order to list them. Adding new shapes would be a hassle. So indeed, we can make an abstract type Shape end and have each shape subtype it, as suggested in the accepted answer. But in many ways, this approach is still unsatisfactory, because it constrains all future Shapes to share the same layout!
A better way to approach this problem is to decouple the red, green, and blue properties shared by all the colored shapes. Thus we introduce a type hierarchy as:
const Point = Tuple{Float64, Float64}
struct Color
red::Float64
green::Float64
blue::Float64
end
abstract type Figure end
struct Disc <: Figure
center::Point
radius::Float64
end
struct Rectangle <: Figure
topleft::Point
bottomright::Point
end
struct ColoredShape{F <: Figure}
figure::F
color::Color
end
Now, instead of using Rectangle((0.0, 0.0), (1.0, 1.0), 0.5, 0.5, 0.5) to represent a gray rectangle, we would use ColoredShape(Rectangle((0.0, 0.0), (1.0, 1.0)), Color(0.5, 0.5, 0.5)). Instead of defining multiple identical luminance methods, we would define it just once for the Color struct. (You could also, optionally, define another method for ColoredShape that delegates to the property color, but this is only one additional method instead of N!) This pattern also allows the functionality we define for colors to be reused across other contexts, besides colored shapes.
In general, it is preferable to split concepts down to the smallest digestible pieces for re-usability and understandability. If there are lots of very similar structs, such that defining functions for all of them seems to be a chore, this would suggest that there could possibly be some shared functionality to factor out.
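Put together, the composition approach could look like the sketch below. The type definitions repeat the ones above (only Rectangle is included, to keep it short), and the variable name gray is just for illustration:

```julia
const Point = Tuple{Float64, Float64}

struct Color
    red::Float64
    green::Float64
    blue::Float64
end

abstract type Figure end

struct Rectangle <: Figure
    topleft::Point
    bottomright::Point
end

struct ColoredShape{F <: Figure}
    figure::F
    color::Color
end

# Defined once, for the Color struct only.
luminance(c::Color) = 0.299 * c.red + 0.587 * c.green + 0.114 * c.blue

# Optional delegating method: one extra method, however many figures exist.
luminance(s::ColoredShape) = luminance(s.color)

gray = ColoredShape(Rectangle((0.0, 0.0), (1.0, 1.0)), Color(0.5, 0.5, 0.5))
luminance(gray)
```

Adding a Triangle or Hexagon later only requires a new subtype of Figure; no luminance method has to be touched.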

Related

Arrays of abstract type in julia in functions

I am trying to understand typing in Julia and have run into the following problem with Array. I wrote a function bloch_vector_2d(Array{Complex,2}); the detailed implementation is irrelevant. When I call it, I get this complaint:
julia> bloch_vector_2d(rhoA)
ERROR: MethodError: no method matching bloch_vector_2d(::Array{Complex{Float64},2})
Closest candidates are:
bloch_vector_2d(::Array{Complex,2}) at REPL[56]:2
bloch_vector_2d(::StateAB) at REPL[54]:1
Stacktrace:
[1] top-level scope at REPL[64]:1
The problem is that an array of parent type is not automatically a parent of an array of child type.
julia> Complex{Float64} <: Complex
true
julia> Array{Complex{Float64},2} <: Array{Complex,2}
false
I think it would make sense for Julia to impose that Array{Complex{Float64},2} <: Array{Complex,2}. Or what is the right way to implement this in Julia? Any help or comments are appreciated!
This issue is discussed in detail in the Julia Manual, in the section on parametric composite types.
Quoting the relevant part of it:
In other words, in the parlance of type theory, Julia's type parameters are invariant, rather than being covariant (or even contravariant). This is for practical reasons: while any instance of Point{Float64} may conceptually be like an instance of Point{Real} as well, the two types have different representations in memory:
An instance of Point{Float64} can be represented compactly and efficiently as an immediate pair of 64-bit values;
An instance of Point{Real} must be able to hold any pair of instances of Real. Since objects that are instances of Real can be of arbitrary size and structure, in practice an instance of Point{Real} must be represented as a pair of pointers to individually allocated Real objects.
Now going back to your question how to write a method signature then you have:
julia> Array{Complex{Float64},2} <: Array{<:Complex,2}
true
Note the difference:
Array{<:Complex,2} represents a union of all types that are 2D arrays whose eltype is a subtype of Complex (i.e. no array will have this exact type).
Array{Complex,2} is a type that an array can have, and this type means that you can store Complex values in it that can have mixed parameters.
Here is an example:
julia> x = Complex[im 1im;
1.0im Float16(1)im]
2×2 Array{Complex,2}:
im 0+1im
0.0+1.0im 0.0+1.0im
julia> typeof.(x)
2×2 Array{DataType,2}:
Complex{Bool} Complex{Int64}
Complex{Float64} Complex{Float16}
Also note that the notation Array{<:Complex,2} is the same as writing Array{T,2} where T<:Complex (or more compactly Matrix{T} where T<:Complex).
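To see the difference at the method level, here is a small sketch. The function names nentries and nentries_strict are hypothetical and stand in for any function that takes a complex matrix:

```julia
# Accepts any 2D array whose element type is a subtype of Complex.
nentries(m::Matrix{<:Complex}) = length(m)

# Accepts only arrays whose element type is exactly the abstract type Complex.
nentries_strict(m::Matrix{Complex}) = length(m)

concrete = ones(Complex{Float64}, 2, 2)   # Matrix{Complex{Float64}}
mixed = Complex[im 1im; 1.0im 2.0im]      # Matrix{Complex}

nentries(concrete)        # matches
nentries(mixed)           # also matches, since Complex <: Complex
nentries_strict(mixed)    # matches
# nentries_strict(concrete) would throw a MethodError
```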
This is more of a comment, but I can't resist posting it. This question appears so often. I'll tell you why that phenomenon must arise.
A Bag{Apple} is a Bag{Fruit}, right? Because, when I have a JuicePress{Fruit}, I can give it a Bag{Apple} to make some juice, because Apples are Fruits.
But now we run into a problem: my fruit juice factory, in which I process different fruits, has a failure. I order a new JuicePress{Fruit}. Now, I unfortunately get delivered a replacement JuicePress{Lemon} -- but Lemons are Fruits, so surely a JuicePress{Lemon} is a JuicePress{Fruit}, right?
However, the next day, I feed apples to the new press, and the machine explodes. I hope you see why: JuicePress{Lemon} is not a JuicePress{Fruit}. On the contrary: a JuicePress{Fruit} is a JuicePress{Lemon} -- I can press lemons with a fruit-agnostic press! They could have sent me a JuicePress{Plant}, though, since Fruits are Plants.
Now we can get more abstract. The real reason is: function input arguments are contravariant, while function output arguments are covariant (in an idealized setting) [2]. That is, when we have
f : A -> B
then I can pass in supertypes of A, and end up with subtypes of B. Hence, when we fix the first argument, the induced function
(Tree -> Apple) <: (Tree -> Fruit)
whenever Apple <: Fruit -- this is the covariant case, it preserves the direction of <:. But when we fix the second one,
(Fruit -> Juice) <: (Apple -> Juice)
whenever Fruit >: Apple -- this inverts the direction of <:, and therefore is called contravariant.
This carries over to other parametric data types, since there, too, you usually have "output-like" parameters (as in the Bag), and "input-like" parameters (as with the JuicePress). There can also be parameters that behave like neither (e.g., when they occur in both fashions) -- these are then called invariant.
There are now two ways in which languages with parametric types solve this problem. The, in my opinion, more elegant one is to mark every parameter: no annotation means invariant, + means covariant, - means contravariant (this has technical reasons -- those parameters are said to occur in "positive" and "negative position"). So we had the Bag[+T <: Fruit], or the JuicePress[-T <: Fruit] (should be Scala syntax, but I haven't tried it). This makes subtyping more complicated, though.
The other route is what Julia does (and, BTW, Java): all types are invariant [1], but you can specify upper and lower bounds at the use site. So you have to say
makejuice(::JuicePress{>:T}, ::Bag{<:T}) where {T}
And that's how we arrive at the other answers.
[1] Except for tuples, but that's weird.
[2] This terminology comes from category theory. The Hom-functor is contravariant in the first, and covariant in the second argument. There's an intuitive realization of subtyping through the "forgetful" functor from the category Typ to the poset of Types under the <: relation. And the CT terminology in turn comes from tensors.
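The subtype relations discussed in this answer can be checked directly. This is a small sketch using hypothetical Fruit, Apple, and Lemon types, with Vector standing in for a generic container:

```julia
abstract type Fruit end
struct Apple <: Fruit end
struct Lemon <: Fruit end

# Parametric types are invariant in their parameters.
Vector{Apple} <: Vector{Fruit}       # false

# An upper bound accepts any element type below Fruit ("output-like" use).
Vector{Apple} <: Vector{<:Fruit}     # true

# A lower bound accepts any element type above Apple ("input-like" use).
Vector{Fruit} <: Vector{>:Apple}     # true
Vector{Lemon} <: Vector{>:Apple}     # false

# Tuples are the exception: they are covariant.
Tuple{Apple} <: Tuple{Fruit}         # true
```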
While the "how it works" discussion has been done in another answer, the best way to implement your method is the following:
function bloch_vector_2d(a::AbstractArray{Complex{T}}) where T<:Real
sum(a) + 5*one(T) # returning something to see how this is working
end
Now this will work like this:
julia> bloch_vector_2d(ones(Complex{Float64},4,3))
17.0 + 0.0im

Create multiple Methods automatically

I can define a function that handles Integers:
function twice(a::Int64) a + a end
This function cannot handle Floats. If I want that, I need to define another method:
function twice(a::Float64) a + a end
But wait, this looks exactly the same, apart from the type definition. So, when I dream at night, there is a possibility to create such method definitions (everything identical apart from the type/combination of types) with a... macro maybe? Something like
@create_all_methods ("twice(a::$type ) a + a end", ["Int64", "Float64"])
Is this possible and if so, how?
Does the question maybe make no sense at all, because there is no situation where function twice(a) a + a end wouldn't achieve the exact same thing anyway?
Thanks for your help in advance.
There are multiple ways to achieve that. The easiest would be to just omit the type of a. Then your method would look like:
function twice(a) a + a end
This is equivalent to
function twice(a::Any) a + a end
But maybe you don't want to define this for all types, or you already have another definition for twice(a::Any), so you could restrict your definition to the common supertype of Int64 and Float64. This common supertype can be found with typejoin(Float64, Int64) and yields the result Real, so your definition would now be
function twice(a::Real) a + a end
This method also applies to other subtypes of Real, such as Int32, so if you really want the method only for Int64 and Float64 you can create a union type. Then the method would look like
function twice(a::Union{Int64, Float64}) a + a end
Finally, it is indeed possible to achieve what you wanted to achieve with your macro. It does not make sense in this case, but either the function eval or the macro @eval is often used in more complicated cases. Your code could then look like
for T in (Int64, Float64)
@eval function twice(a::$T) a + a end
end
If you have just started learning Julia, I would not advise using eval, as there are some dangers and anti-patterns associated with it.
A straightforward way of creating methods for two or more input types is to use Union:
twice(a::Union{T1, T2, T3}) = a + a
where T1, T2, T3, etc. are concrete types, such as Int or Float64.
More commonly, you would define it for some abstract supertype instead, for example
twice(a::Number) = a + a
But most of the time, you should just start by defining a generic function
twice(a) = a + a
and then add types when you find that it becomes necessary. Most of the time it isn't.
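As a quick illustration of the trade-off, the sketch below contrasts the untyped definition with a Union-restricted one. The names twice_any and twice_union are hypothetical, used here only to keep the two variants apart:

```julia
# Untyped: works for every argument type that supports +.
twice_any(a) = a + a

# Restricted: only Int64 and Float64 are accepted.
twice_union(a::Union{Int64, Float64}) = a + a

twice_any(2)          # Int64
twice_any(1.5)        # Float64
twice_any(1 // 2)     # Rational, no extra method needed
twice_any([1, 2])     # arrays work too

twice_union(2.5)
# twice_union(Int32(2)) would throw a MethodError
```

The untyped version is still compiled separately for each concrete argument type it is called with, so leaving the annotation out does not by itself cost performance.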

Type parameters and inner constructors in julia 0.6

I am in doubt about how to restrict type parameters for parametric types with abstract types in julia 0.6, using the where syntax.
Consider the example where I want to make a parametric abstract type that takes integers, and define structs inheriting from that. If I try:
abstract type AbstractFoo{T} where T<: Integer end
it fails, but instead I can use the non-where syntax
abstract type AbstractFoo{T<:Integer} end
1. Is this the recommended format?
Given this, how do I implement my subtype
mutable struct Foo{T} <: AbstractFoo{T} where T <: Integer
bar::T
end
fails too (Invalid subtyping). I can bypass the where syntax again with
mutable struct Foo{T<:Integer} <: AbstractFoo{T}
bar::T
end
But that seems redundant (because T is already restricted to be an Integer). 2. Could I leave it out?
mutable struct Foo{T} <: AbstractFoo{T}
bar::T
end
Finally, with the deprecation of inner constructor syntax, is there any way around defining the inner constructor as:
mutable struct Foo{T} <: AbstractFoo{T}
bar::T
Foo{T}(x::T) where T = new(x)
end
This makes Foo(3) impossible - requiring me to use Foo{Int}(3). 3. Is this intentional or is there a better way around this?
EDIT: I guess for the inner constructor question I can always define an outer constructor Foo(x::T) where {T} = Foo{T}(x).
I would write:
abstract type AbstractFoo{T<:Integer} end
mutable struct Foo{T} <: AbstractFoo{T}
bar::T
Foo(x::T) where T = new{T}(x)
end
This 1) limits x to Integer and 2) allows writing Foo(2).
About the questions:
1. Yes, that is the right, and the only right, format.
2. That's valid and T will be restricted to Integer, but you may get a worse error message, since it arises from AbstractFoo, not Foo. Your users may not notice that Foo is a subtype of AbstractFoo and get confused.
3. This is intended, and it is one of the main purposes of introducing the where syntax. In the new syntax, T{S} always specifies the type parameter of T, and where S introduces a type parameter of the function. So Foo{T}(x) where T = defines what Foo{Int}(2) should do, and Foo(x::T) where T = defines what Foo(2) should do. After where was introduced, there is no separate "inner constructor" concept anymore. You can define any function inside any struct (not necessarily constructors of that type), and define any constructor outside the type definition; the only difference is that inside the type definition you have access to new.
I am on slightly shaky ground here, but I had a similar question not long ago. Based on what I learned from that, I suggest that you try this and see how it works for you:
abstract type AbstractFoo{T} end
mutable struct Foo{T<:Integer} <: AbstractFoo{T}
bar::T
end
It seems most reasonable to me to restrict the parameter on the concrete type rather than on the abstract one.
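A short sketch of how this variant behaves, assuming a current Julia 1.x release (the question was asked for 0.6) and relying only on the default constructors that Julia generates:

```julia
abstract type AbstractFoo{T} end

mutable struct Foo{T<:Integer} <: AbstractFoo{T}
    bar::T
end

Foo(3)         # parameter inferred from the argument
Foo{Int8}(3)   # explicit parameter; the argument is converted to Int8
# Foo(3.0) would throw a MethodError, since Float64 is not an Integer
```

With the restriction on the concrete type, both Foo(3) and Foo{Int}(3) are available without writing any constructor by hand.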

generation of self-referential immutable types in Julia

Julia lang documentation explains how inner constructors and the new() function can be used to construct self-referential objects:
type SelfReferential
obj::SelfReferential
SelfReferential() = (x = new(); x.obj = x)
end
However this approach does not work for immutable types, because it essentially uses mutation of the incompletely initialized instance x.
How can I generate a self-referential immutable object in Julia?
Since you seemed to have used Haskell before, I will tailor this answer from a functional programming perspective. A common use case of self-referential immutable types is in creating a lazy list.
Because Julia is a strict (i.e. not lazy) language, it is not possible for an immutable object to directly reference itself.
This does not preclude, however, referencing itself indirectly, using a mutable object like a Ref or Vector.
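A minimal sketch of the indirect approach, written in current Julia 1.x syntax (struct instead of immutable). The Node type and its field names are hypothetical:

```julia
struct Node
    value::Int
    next::Base.RefValue{Node}   # mutable cell that will point back to the node
end

cell = Ref{Node}()       # uninitialized reference
node = Node(1, cell)     # immutable object holding the cell
cell[] = node            # close the loop through the mutable cell

node.next[] === node     # true
```

The Node itself is never mutated; only the Ref it holds is.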
For the particular case of lazy structures, I might recommend confining the mutability to a special object, say, Lazy{T}. For instance,
import Base: getindex
type Lazy
thunk
value
Lazy(thunk) = new(thunk)
end
evaluate!(lazy::Lazy) = (lazy.value = lazy.thunk(); lazy.value)
getindex(lazy::Lazy) = isdefined(lazy, :value) ? lazy.value : evaluate!(lazy)
Then, for instance, it's possible to make a simple lazy list as follows:
import Base: first, tail, start, next, done, iteratorsize, HasLength, SizeUnknown
abstract List
immutable Cons <: List
head
tail::Lazy
end
immutable Nil <: List end
macro cons(x, y)
quote
Cons($(esc(x)), Lazy(() -> $(esc(y))))
end
end
first(xs::Cons) = xs.head
tail(xs::Cons) = xs.tail[]
start(xs::Cons) = xs
next(::Cons, xs) = first(xs), tail(xs)
done(::List, ::Cons) = false
done(::List, ::Nil) = true
iteratorsize(::Nil) = HasLength()
iteratorsize(::Cons) = SizeUnknown()
Which indeed works as it would in a language like Haskell:
julia> xs = @cons(1, ys)
Cons(1,Lazy(false,#3,#undef))
julia> ys = @cons(2, xs)
Cons(2,Lazy(false,#5,#undef))
julia> [take(xs, 5)...]
5-element Array{Int64,1}:
1
2
1
2
1
This functionality may seem complex, but it luckily has already been implemented in Lazy.jl.
It is important to note that the above code creates a lot of overhead due to type instability and mutable types. If your goal in using immutable was not expressiveness, but rather performance, then obviously such an approach is not appropriate. But it is in general not possible to have a stack-allocated structure that references itself, so in cases where you want maximal performance, it's best to avoid self-reference entirely.
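The code above targets an old Julia release (type, immutable, and the start/next/done iteration protocol). As a rough sketch, and not a tested drop-in replacement, the same idea could be ported to Julia 1.x, where iteration is defined through Base.iterate. The port below is an assumption about how the answer would translate, not the author's code:

```julia
# Mutable cell that evaluates its thunk at most once.
mutable struct Lazy
    thunk
    value
    Lazy(thunk) = new(thunk)   # value is left undefined until first access
end
Base.getindex(l::Lazy) = isdefined(l, :value) ? l.value : (l.value = l.thunk())

abstract type List end
struct Cons <: List
    head
    tail::Lazy
end
struct Nil <: List end

# Wrap the tail expression in a closure so it is evaluated lazily.
macro cons(x, y)
    :(Cons($(esc(x)), Lazy(() -> $(esc(y)))))
end

# Julia 1.x iteration protocol: the state is the remaining list.
Base.iterate(xs::List, state::List = xs) =
    state isa Nil ? nothing : (state.head, state.tail[])
Base.IteratorSize(::Type{<:List}) = Base.SizeUnknown()

xs = @cons(1, ys)   # ys is not defined yet; the thunk delays the lookup
ys = @cons(2, xs)
collect(Iterators.take(xs, 5))   # elements 1, 2, 1, 2, 1
```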

Can I use a subtype of a function parameter in the function definition?

I would like to use a subtype of a function parameter in my function definition. Is this possible? For example, I would like to write something like:
g{T1, T2<:T1}(x::T1, y::T2) = x + y
So that g will be defined for any x::T1 and any y that is a subtype of T1. Obviously, if I knew, for example, that T1 would always be Number, then I could write g{T<:Number}(x::Number, y::T) = x + y and this would work fine. But this question is for cases where T1 is not known until run-time.
Read on if you're wondering why I would want to do this:
A full description of what I'm trying to do would be a bit cumbersome, but what follows is a simplified example.
I have a parameterised type, and a simple method defined over that type:
type MyVectorType{T}
x::Vector{T}
end
f1!{T}(m::MyVectorType{T}, xNew::T) = (m.x[1] = xNew)
I also have another type, with an abstract super-type defined as follows
abstract MyAbstract
type MyType <: MyAbstract ; end
I create an instance of MyVectorType with vector element type set to MyAbstract using:
m1 = MyVectorType(Array(MyAbstract, 1))
I now want to place an instance of MyType in MyVectorType. I can do this, since MyType <: MyAbstract. However, I can't do this with f1!, since the function definition means that xNew must be of type T, and T will be MyAbstract, not MyType.
The two solutions I can think of to this problem are:
f2!(m::MyVectorType, xNew) = (m.x[1] = xNew)
f3!{T1, T2}(m::MyVectorType{T1}, xNew::T2) = T2 <: T1 ? (m.x[1] = xNew) : error("Oh dear!")
The first is essentially a duck-typing solution. The second performs the appropriate error check in the first step.
Which is preferred? Or is there a third, better solution I am not aware of?
The ability to define a function g{T, S<:T}(::Vector{T}, ::S) has been referred to as "triangular dispatch" as an analogy to diagonal dispatch: f{T}(::Vector{T}, ::T). (Imagine a table with a type hierarchy labelling the rows and columns, arranged such that the super types are to the top and left. The rows represent the element type of the first argument, and the columns the type of the second. Diagonal dispatch will only match the cells along the diagonal of the table, whereas triangular dispatch matches the diagonal and everything below it, forming a triangle.)
This simply isn't implemented yet. It's a complicated problem, especially once you start considering the scoping of T and S outside of function definitions and in the context of invariance. See issue #3766 and #6984 for more details.
So, practically, in this case, I think duck-typing is just fine. You're relying upon the implementation of MyVectorType to do the error checking when it assigns its elements, which it should be doing in any case.
The solution in base julia for setting elements of an array is something like this:
f!{T}(A::Vector{T}, x::T) = (A[1] = x)
f!{T}(A::Vector{T}, x) = f!(A, convert(T, x))
Note that it doesn't worry about the type hierarchy or the subtype "triangle." It just tries to convert x to T… which is a no-op if x::S, S<:T. And convert will throw an error if it cannot do the conversion or doesn't know how.
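For readers on a current Julia version, the same convert-based pattern can be sketched with where syntax. The function name setfirst! is hypothetical:

```julia
# Exact or subtype match: store the value directly.
setfirst!(A::Vector{T}, x::T) where {T} = (A[1] = x)

# Anything else: try to convert to the element type first.
setfirst!(A::Vector{T}, x) where {T} = setfirst!(A, convert(T, x))

v = [1.0, 2.0]
setfirst!(v, 3)            # Int is converted to Float64

w = Integer[1, 2]
setfirst!(w, Int8(5))      # Int8 <: Integer, stored without conversion

# setfirst!([1, 2], 1.5) would throw an InexactError from convert
```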
UPDATE: This is now implemented on the latest development version (0.6-dev)! In this case I think I'd still recommend using convert like I originally answered, but you can now define restrictions within the static method parameters in a left-to-right manner.
julia> f!{T1, T2<:T1}(A::Vector{T1}, x::T2) = "success!"
julia> f!(Any[1,2,3], 4.)
"success!"
julia> f!(Integer[1,2,3], 4.)
ERROR: MethodError: no method matching f!(::Array{Integer,1}, ::Float64)
Closest candidates are:
f!{T1,T2<:T1}(::Array{T1,1}, ::T2<:T1) at REPL[1]:1
julia> f!([1.,2.,3.], 4.)
"success!"
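In Julia 1.x the curly-brace method parameter syntax shown above is no longer accepted; the same restriction is written with where. A minimal sketch, using the hypothetical function name tri:

```julia
# T2 must be a subtype of the vector's element type T1.
tri(A::Vector{T1}, x::T2) where {T1, T2<:T1} = "success!"

tri(Any[1, 2, 3], 4.0)         # Float64 <: Any
tri([1.0, 2.0, 3.0], 4.0)      # Float64 <: Float64
# tri(Integer[1, 2, 3], 4.0) would throw a MethodError, since Float64 is not an Integer
```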

Resources