Type unstable & factory constructor - julia

Say, I have a type hierarchy
abstract A
immutable B <: A end
immutable C <: A end
The constructor of A follows factory pattern:
function A(x::Int)
if x > 0
B()
else
C()
end
end
It returns different subtypes based on the input, as expected. However, it is also type-unstable, as I cannot find a way to force the return type to be A.
So, is it bad to use the factory pattern here? Does type instability only affect immutable types and not mutable types, since the latter are reference types anyway?
Do I have to opt for a parametric type instead?
immutable D{T <: A}
type::T
end
function D(x::Int)
if x > 0
D(B())
else
D(C())
end
end
It feels a bit bad.
Actually, how bad is it to have type-unstable functions? Is it worthwhile to trade type stability for better code readability?
Alternatively, should I define typealias A Union{B,C} instead?

Well, you could do this:
function A(x::Int)
if x > 0
B()::A
else
C()::A
end
end
but it doesn't help:
julia> @code_warntype A(5)
Variables:
x::Int64
Body:
begin # none, line 2:
unless (Base.slt_int)(0,x::Int64)::Bool goto 0 # none, line 3:
return $(Expr(:new, :((top(getfield))(Main,:B)::Type{B})))
goto 1
0: # none, line 5:
return $(Expr(:new, :((top(getfield))(Main,:C)::Type{C})))
1:
end::Union{B,C}
You can't create instances of an abstract type. Moreover, in current julia, any abstract type is automatically "type-unstable," meaning that the compiler can't generate optimized code for it. So there is no such thing as "forcing the return type to be A" and then having that somehow make the function type-stable (in the sense of obtaining great performance).
You can implement a type-stable factory pattern, but the output type should be determined by the input types, not the input values. For example:
A(x::Vector) = B()
A(x::Matrix) = C()
is a type-stable constructor for objects of the A hierarchy.
If there aren't obvious types to use to signal your intent, you can always use Val:
A(x, ::Type{Val{1}}) = B()
A(x, ::Type{Val{2}}) = C()
A(1, Val{1}) # returns B()
A(1, Val{2}) # returns C()
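The contrast between dispatching on values and dispatching on types can be sketched in current syntax. This is only a minimal illustration, assuming Julia 1.x (abstract type / struct instead of abstract / immutable, and Val instances instead of Val types); unstable_A is a hypothetical name for the value-based factory from the question.

```julia
abstract type A end
struct B <: A end
struct C <: A end

# Value-based factory (hypothetical name): the concrete result depends on a
# runtime value, so inference can only narrow it down to Union{B, C}.
unstable_A(x::Int) = x > 0 ? B() : C()

# Type-based factory: each method returns exactly one concrete type.
A(::Val{:b}) = B()
A(::Val{:c}) = C()

# Inferred return type for each matching method of unstable_A.
println(Base.return_types(unstable_A, (Int,)))
println(A(Val(:b)))
println(A(Val(:c)))
```

Which Val to pass still has to be known at compile time for the call site to be type-stable; computing Val(x) from a runtime value just moves the instability to the caller.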

Related

Trying to pass an array into a function

I'm very new to Julia, and I'm trying to just pass an array of numbers into a function and count the number of zeros in it. I keep getting the error:
ERROR: UndefVarError: array not defined
I really don't understand what I am doing wrong, so I'm sorry if this seems like such an easy task that I can't do.
function number_of_zeros(lst::array[])
count = 0
for e in lst
if e == 0
count + 1
end
end
println(count)
end
lst = [0,1,2,3,0,4]
number_of_zeros(lst)
There are two issues with your function definition:
As noted in Shayan's answer and Dan's comment, the array type in Julia is called Array (capitalized) rather than array. To see:
julia> array
ERROR: UndefVarError: array not defined
julia> Array
Array
Empty square brackets are used to instantiate an array, and if preceded by a type, they specifically instantiate an array holding objects of that type:
julia> x = Int[]
Int64[]
julia> push!(x, 3); x
1-element Vector{Int64}:
3
julia> push!(x, "test"); x
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Int64
Thus when you do Array[] you are actually instantiating an empty vector of Arrays:
julia> y = Array[]
Array[]
julia> push!(y, rand(2)); y
1-element Vector{Array}:
[0.10298669573927233, 0.04327245960128345]
Now it is important to note that there's a difference between a type and an object of a type, and if you want to restrict the types of input arguments to your functions, you want to do this by specifying the type that the function should accept, not an instance of this type. To see this, consider what would happen if you had fixed your array typo and passed an Array[] instead:
julia> f(x::Array[])
ERROR: TypeError: in typeassert, expected Type, got a value of type Vector{Array}
Here Julia complains that you have provided a value of the type Vector{Array} in the type annotation, when you should have provided a type.
More generally though, you should think about why you are adding any type restrictions to your functions. If you define a function without any input types, Julia will still compile a method instance specialised for the types of the arguments provided when you first call the function, and will therefore generate (most of the time) machine code that is optimal with respect to the specific types passed.
That is, there is no difference between
number_of_zeros(lst::Vector{Int64})
and
number_of_zeros(lst)
in terms of runtime performance when the second definition is called with an argument of type Vector{Int64}. Some people still like type annotations as a form of error check, but you also need to consider that adding type annotations makes your methods less generic and will often restrict you from using them in combination with code other people have written. The most common example of this are Julia's excellent autodiff capabilities - they rely on running your code with dual numbers, which are a specific numerical type enabling automatic differentiation. If you strictly type your functions as suggested (Vector{Int}) you preclude your functions from being automatically differentiated in this way.
Finally, a note of caution about the Array type: Julia's arrays can be multidimensional, which means that Array{Int} is not a concrete type:
julia> isconcretetype(Array{Int})
false
to make it concrete, the dimensionality of the array has to be provided:
julia> isconcretetype(Array{Int, 1})
true
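Being non-concrete does not make Array{Int} unusable in a method signature: it simply matches integer arrays of any dimensionality. A small sketch, assuming Julia 1.x; zeros_in is a hypothetical name.

```julia
# Array{Int} leaves the dimensionality open, so one method covers vectors and matrices.
zeros_in(lst::Array{Int}) = count(iszero, lst)

v = [0, 1, 2, 3, 0, 4]   # Vector{Int}, i.e. Array{Int, 1}
m = [0 1; 0 0]           # Matrix{Int}, i.e. Array{Int, 2}

println(zeros_in(v))
println(zeros_in(m))
println(Vector{Int} === Array{Int, 1})   # Vector{T} is an alias for Array{T, 1}
```

Concreteness matters mostly for struct fields and container element types, where an abstract annotation prevents the compiler from knowing the memory layout.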
First, it might be better to avoid variable names similar to function names. count is a built-in function of Julia. So if you want to use the count function in the number_of_zeros function, you will undoubtedly face a problem.
Second, consider returning the value instead of printing it (although the print call was not in the right place anyway).
Third, count + 1 computes a new value without storing it; use += to update the variable.
Last but not least, type names in Julia conventionally start with a capital letter, so there is no standard type called array. It's Array.
Here is the correction of your code.
function number_of_zeros(lst::Array{Int64})
counter = 0
for e in lst
if e == 0
counter += 1
end
end
return counter
end
lst = [0,1,2,3,0,4]
number_of_zeros(lst)
would result in 2.
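For comparison, the loop can also be replaced by Base's count with a predicate, and the signature can be left unannotated so that the same method accepts any iterable. This is a sketch assuming Julia 1.x; the name count_zeros is hypothetical.

```julia
# count(pred, itr) counts the elements for which the predicate returns true.
count_zeros(itr) = count(iszero, itr)

println(count_zeros([0, 1, 2, 3, 0, 4]))   # Vector{Int}
println(count_zeros([0.0, 1.5, 0.0]))      # Vector{Float64}
println(count_zeros((0, 1, 0)))            # Tuple
```

Because nothing is annotated, Julia still compiles a specialised method instance for each argument type it is called with.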
Additional explanation
First, it might be better to avoid variable names similar to function names. count is a built-in function of Julia. So if you want to use the count function in the number_of_zeros function, you will undoubtedly face a problem.
Check this example:
function number_of_zeros(lst::Array{Int64})
count = 0
for e in lst
if e == 0
count += 1
end
end
return count, count(==(1), lst)
end
number_of_zeros(lst)
This code will lead to this error:
ERROR: MethodError: objects of type Int64 are not callable
Maybe you forgot to use an operator such as *, ^, %, / etc. ?
Stacktrace:
[1] number_of_zeros(lst::Vector{Int64})
@ Main \t.jl:10
[2] top-level scope
@ \t.jl:16
Because the local variable count shadows the count function! You can avoid such problems by calling the function through its module:
function number_of_zeros(lst::Array{Int64})
count = 0
for e in lst
if e == 0
count += 1
end
end
return count, Base.count(==(1), lst)
end
The point is that I wrote Base.count, so the compiler knows which count I mean.

Julia: Parametric types with inner constructor: new and typeof

Trying to understand parametric types and the new function available for inner methods. The manual states "special function available to inner constructors which created a new object of the type". See the section of the manual on new here and the section of the manual on inner constructor methods here.
Consider an inner method designed to calculate the sum of x, where x could be, say, a vector or a tuple, and is given the parametric type T. A natural thing to want is for the type of the elements of x to be inherited by their sum s. I don't seem to need new for that, correct?
struct M{T}
x::T
s
function M(x)
s = sum(x)
x,s
end
end
julia> M([1,2,3])
([1, 2, 3], 6)
julia> M([1.,2.,3.])
([1.0, 2.0, 3.0], 6.0)
julia> typeof(M([1.,2.,3.]))
Tuple{Vector{Float64}, Float64}
Edit: Correction! I intended to have the last line of the inner constructor be M(x,s)... It's still an interesting question, so I won't correct it. How does M(x,s) differ from new{typeof(x)}(x,s)?
One usage of new I have seen is in combination with typeof(), something like:
struct M{T}
x::T
s
function M(x)
s = sum(x)
new{typeof(x)}(x,s)
end
end
julia> M([1,2,3])
M{Vector{Int64}}([1, 2, 3], 6)
julia> M([1.,2.,3.])
M{Vector{Float64}}([1.0, 2.0, 3.0], 6.0)
What if I wanted to constrain s to the same type as x? That is, for instance, if x is a vector, then s should be a vector (in this case, a vector of one element). How would I do that? If I replace the last line of the inner constructor with x, new{typeof(x)}(s), I get the understandable error:
MethodError: Cannot `convert` an object of type Int64 to an object of type Vector{Int64}
Here are the rules:
If you are writing an outer constructor for a type M, the constructor should return an instance of M by eventually calling the inner constructor, like this: M(<args>).
If you are writing an inner constructor, this will override the default inner constructor. So you must return an instance of M by calling new(<args>).
The new "special function" exists to allow the construction of a type that doesn't have a constructor yet. Observe the following example:
julia> struct A
x::Int
function A(x)
A(x)
end
end
julia> A(4)
ERROR: StackOverflowError:
Stacktrace:
[1] A(::Int64) at ./REPL[3]:4 (repeats 79984 times)
This is a circular definition of the constructor for A, which results in a stack overflow. You cannot pull yourself up by your bootstraps, so Julia provides the new function as a way to circumvent this problem.
You should provide the new function with a number of arguments equal to the number of fields in your struct. Note that the new function will attempt to convert the types of its inputs to match the declared types of the fields of your struct:
julia> struct B
x::Float64
B(x) = new(x)
end
julia> B(5)
B(5.0)
julia> B('a')
B(97.0)
julia> B("a")
ERROR: MethodError: Cannot `convert` an object of type String to an object
of type Float64
(The inner constructor for B above is exactly the same as the default inner constructor.)
When you're defining parametric types, the new function must be provided with a number of parameters equal to the number of parameters for your type (and in the same order), analogously to the default inner constructor for parametric types. First observe how the default inner constructor for parametric types is used:
julia> struct Foo{T}
x::T
end
julia> Foo{String}("a")
Foo{String}("a")
Now if you were writing an inner constructor for Foo, instead of writing Foo{T}(x) inside the constructor, you would replace the Foo with new, like this: new{T}(x).
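As a rough sketch of that substitution (assuming Julia 1.x; Wrapped is a hypothetical type standing in for Foo), an explicit inner constructor that mirrors the default one could look like this, with an outer constructor added to infer the parameter:

```julia
struct Wrapped{T}
    x::T
    # Same role as the default inner constructor Wrapped{T}(x),
    # written with new{T} instead of the type's own name.
    Wrapped{T}(x) where {T} = new{T}(x)
end

# Outer constructor: infer T from the argument and delegate to the inner one.
Wrapped(x::T) where {T} = Wrapped{T}(x)

println(Wrapped{Float64}(1))   # new converts 1 to the declared field type
println(Wrapped("a"))
```

Once any inner constructor is defined, the default constructors are no longer generated, which is why the outer Wrapped(x) method has to be written by hand here.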
You might need typeof to help define the constructor, but often you don't. Here's one way you could define your M type:
struct M{I, T}
x::I
s::T
function M(x::I) where I
s = sum(x)
new{I, typeof(s)}(x, s)
end
end
I'm using typeof here so that I could be any iterable type that returns numbers:
julia> typeof(M(1:3))
M{UnitRange{Int64},Int64}
julia> g = (rand() for _ in 1:10)
Base.Generator{UnitRange{Int64},var"#5#6"}(var"#5#6"(), 1:10)
julia> typeof(M(g))
M{Base.Generator{UnitRange{Int64},var"#5#6"},Float64}
Note that providing the parameters for your type is required when you are using new inside an inner constructor for a parametric type:
julia> struct C{T}
x::Int
C(x) = new(x)
end
ERROR: syntax: too few type parameters specified in "new{...}" around REPL[6]:1
Remember, a constructor is designed to construct something. Specifically, the constructor M is designed to construct a value of type M. Your example constructor
struct M{T}
x::T
s
function M(x)
s = sum(x)
x,s
end
end
means that the result of evaluating the expression M([1 2 3]) is a tuple, not an instance of M. If I encountered such a constructor in the wild, I'd assume it was a bug and report it. new is the internal magic that allows you to actually construct a value of type M.
It's a matter of abstraction. If you just want a tuple in the first place, then forget about the structure called M and just define a function m at module scope that returns a tuple. But if you intend to treat this as a special data type, potentially for use with dynamic dispatch but even just for self-documentation purposes, then your constructor should return a value of type M.
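Coming back to the part of the question about constraining s to the same type as x: one possible approach is to declare both fields with the same parameter and wrap the sum in a one-element vector before calling new. This is only a sketch under the assumption that x is a plain Vector and Julia 1.x is used; SummedVec is a hypothetical name chosen to avoid clashing with M.

```julia
struct SummedVec{T<:Vector}
    x::T
    s::T   # same declared type as x
    function SummedVec(x::T) where {T<:Vector}
        # [sum(x)] is a one-element vector; new converts it to T if needed.
        new{T}(x, [sum(x)])
    end
end

a = SummedVec([1, 2, 3])
b = SummedVec([1.0, 2.0, 3.0])
println(a)
println(b)
```

For narrow element types such as Int8, sum widens the result, so the conversion performed by new can throw an InexactError if the sum does not fit.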

Is it possible to get the return type of a Julia function in an unevaluated context?

I want to get the result type of a function call in Julia without evaluating the function, and use that type. The desired usage looks somewhat like this:
foo(x::Int32) = x
foo(x::Float32) = x
y = 0.0f0
# Assert that y has the type of result of foo(Float32)
y::#resultof foo(Float32) # This apparently does not work in Julia
While in the case above, I can simply use y::typeof(foo(1.0f0)) with evaluation of a dummy variable, in more complicated cases, initializing a dummy variable might be inconvenient and expensive. For example, I want to use the iterator type returned by function eachline(filename::AbstractString; keep::Bool=false), but using the typeof really requires opening a file successfully, which looks like an overkill.
Coming from a C++ background, what I am asking is whether there is an equivalent of std::result_of in Julia. The question is almost the same as this one, but for Julia.
After some research I see that Julia allows for return value of different types in one function, where the type inference looks very hard. For example,
foo(x::Int64) = x == 1 ? 1 : 1.0
The return type can now be Int64 or Float64, depending on the input value. Nevertheless, in this case, I am still wondering if there are some macro tricks that can deduce that the return type is of Union{ Int64, Float64 }?
To summarize, my questions are:
Fundamentally, is it possible to get the function return type by only supplying argument types in Julia?
If 1 is not possible, for functions that have one deterministic return type (as in the 1st example), is it possible to get the return type unevaluated?
(Might be unrelated to what I want, but I think it can boost my understanding) When Julia code is compiled, are the return types of the functions known? Or is the type information only determined at run time?
1) Yes, Base.return_types(foo, (Int64,)) will return an array containing the return type you're asking for, i.e. Union{ Int64, Float64 } in this case. If you drop the second argument, a tuple specifying the input argument types, you'll get all possible inferred return types.
It should be noted, however, that the compiler might at any point decide to return Any or any other correct but imprecise return type.
2) see 1) ?
3) For given input argument types, the compiler tries to infer the return type at compile-time. If multiple are possible, it will infer a Union type. If it fails completely, or there are too many different return types, it will infer Any as the return type. In the latter cases, the actual return type is only known at runtime.
Demonstration of 1):
julia> foo(x::Float32) = x
foo (generic function with 1 method)
julia> foo(x::Int32) = x
foo (generic function with 2 methods)
julia> foo(x::Int64) = x == 1 ? 1 : 1.0
foo (generic function with 3 methods)
julia> Base.return_types(foo)
3-element Array{Any,1}:
Union{Float64, Int64}
Int32
Float32
julia> Base.return_types(foo, (Int64,))
1-element Array{Any,1}:
Union{Float64, Int64}
julia> Base.return_types(foo, (Int32,))
1-element Array{Any,1}:
Int32
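Applied to the type assertion from the question, this could be used roughly as follows. It is a sketch assuming Julia 1.x; since the result comes from type inference, it may be less precise than the type actually returned at run time.

```julia
foo(x::Int32) = x
foo(x::Float32) = x

# Ask inference for the return type of foo(::Float32) without calling foo.
T = first(Base.return_types(foo, (Float32,)))

y = 0.0f0
y::T          # type assertion against the inferred type; throws a TypeError on mismatch
println(T)
```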

Unicity of complex key dictionaries in Go but not in Julia?

In Go, when I use a struct as a key for a map, the keys are unique.
For example, the following code produces a map with only one key: map[{x 1}:1]
package main
import (
"fmt"
)
type MyT struct {
A string
B int
}
func main() {
dic := make(map[MyT]int)
for i := 1; i <= 10; i++ {
dic[MyT{"x", 1}] = 1
}
fmt.Println(dic)
}
// result : map[{x 1}:1]
I tried to do the same in Julia and had a strange surprise:
This Julia code, similar to the Go one, produces a dictionary with 10 keys!
type MyT
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
println(dic)
# Dict(MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1,MyT("x",1)=>1)
println(keys(dic))
# MyT[MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1),MyT("x",1)]
So what did I do wrong?
Thank you @DanGetz for the solution!
immutable MyT # or struct MyT with julia >= 0.6
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
println(dic) # Dict(MyT("x", 1)=>1)
println(keys(dic)) # MyT[MyT("x", 1)]
Mutable values hash by identity in Julia, since without additional knowledge about what a type represents, one cannot know if two values with the same structure mean the same thing or not. Hashing mutable objects by value can be especially problematic if you mutate a value after using it as a dictionary key – this is not a problem when hashing by identity since the identity of a mutable object remains the same even when it is modified. On the other hand, it's perfectly safe to hash immutable objects by value – since they cannot be mutated, and accordingly that is the default behavior for immutable types. In the given example, if you make MyT immutable you will automatically get the behavior you're expecting:
immutable MyT # `struct MyT` in 0.6
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
julia> dic
Dict{MyT,Int64} with 1 entry:
MyT("x", 1) => 1
julia> keys(dic)
Base.KeyIterator for a Dict{MyT,Int64} with 1 entry. Keys:
MyT("x", 1)
For a type holding a String and an Int value that you want to use as a hash key, immutability is probably the right choice. In fact, immutability is the right choice more often than not, which is why the keywords introducing structural types were changed in 0.6 to struct for immutable structures and mutable struct for mutable structures, on the principle that people will reach for the shorter, simpler name first, so that should be the better default choice, i.e. immutability.
As @ntdef has written, you can change the hashing behavior of your type by overloading the Base.hash function. However, his definition is incorrect in a few respects (which is probably our fault for failing to document this more prominently and thoroughly):
The method signature of Base.hash that you want to overload is Base.hash(::T, ::UInt).
The Base.hash(::T, ::UInt) method must return a UInt value.
If you are overloading Base.hash, you should also overload Base.== to match.
So this would be a correct way to make your mutable type hash by value (new Julia session required to redefine MyT):
type MyT # `mutable struct MyT` in 0.6
A::String
B::Int64
end
import Base: ==, hash
==(x::MyT, y::MyT) = x.A == y.A && x.B == y.B
hash(x::MyT, h::UInt) = hash((MyT, x.A, x.B), h)
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
julia> dic
Dict{MyT,Int64} with 1 entry:
MyT("x", 1) => 1
julia> keys(dic)
Base.KeyIterator for a Dict{MyT,Int64} with 1 entry. Keys:
MyT("x", 1)
This is kind of annoying to do manually, but the AutoHashEquals package automates this, taking the tedium out of it. All you need to do is prefix the type definition with the @auto_hash_equals macro:
using AutoHashEquals
@auto_hash_equals type MyT # `@auto_hash_equals mutable struct MyT` in 0.6
A::String
B::Int64
end
Bottom line:
If you have a type that should have value-based equality and hashing, seriously consider making it immutable.
If your type really has to be mutable, then think hard about whether it's a good idea to use as a hash key.
If you really need to use a mutable type as a hash key with value-based equality and hashing semantics, use the AutoHashEquals package.
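The three situations above can be compared side by side in current syntax. This is a minimal sketch assuming Julia 1.x (struct / mutable struct); ImmKey, MutKey, ValKey and distinct_keys are hypothetical names introduced only for this illustration.

```julia
struct ImmKey            # immutable: hashed and compared by value by default
    A::String
    B::Int64
end

mutable struct MutKey    # mutable: hashed and compared by identity by default
    A::String
    B::Int64
end

mutable struct ValKey    # mutable, with value-based == and hash defined below
    A::String
    B::Int64
end
Base.:(==)(x::ValKey, y::ValKey) = x.A == y.A && x.B == y.B
Base.hash(x::ValKey, h::UInt) = hash((ValKey, x.A, x.B), h)

# Insert ten structurally identical keys and count the distinct entries.
function distinct_keys(K)
    dic = Dict{K, Int64}()
    for i in 1:10
        dic[K("x", 1)] = 1
    end
    return length(dic)
end

println(distinct_keys(ImmKey))
println(distinct_keys(MutKey))
println(distinct_keys(ValKey))
```

This should report 1, 10 and 1 entries respectively. Note that mutating a ValKey after it has been inserted would leave it stored under a stale hash, which is the hazard described above.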
You did not do anything wrong. The difference between the languages is in how they choose to hash a struct when using it as a key in the map/Dict. In Go, structs are hashed by their values rather than their pointer addresses. This allows programmers to more easily implement multidimensional maps by using structs rather than maps of maps. See this blog post for more info.
Reproducing Julia's Behavior in Go
To reproduce Julia's behavior in Go, redefine the map to use a pointer to MyT as the key type:
func main() {
dic := make(map[MyT]int)
pdic := make(map[*MyT]int)
for i := 1; i <= 10; i++ {
t := MyT{"x", 1}
dic[t] = 1
pdic[&t] = 1
}
fmt.Println(dic)
fmt.Println(pdic)
}
Here, pdic uses the pointer to a MyT struct as its key type. Because each MyT created in the loop has a different memory address, the key will be different. This produces the output:
map[{x 1}:1]
map[0x1040a140:1 0x1040a150:1 0x1040a160:1 0x1040a180:1 0x1040a1b0:1 0x1040a1c0:1 0x1040a130:1 0x1040a170:1 0x1040a190:1 0x1040a1a0:1]
You can play with this on play.golang.org. Unlike in Julia (see below), the way the map type is implemented in Go means you cannot specify a custom hashing function for a user-defined struct.
Reproducing Go's Behavior in Julia
Julia uses the function Base.hash(::K, ::UInt) to hash keys for its Dict{K,V} type. While it doesn't explicitly say so in the documentation, the default hashing algorithm uses the output from object_id, as you can see in the source code. To reproduce Go's behavior in Julia, define a new hash function for your type that hashes the values of the struct:
Base.hash(t::MyT, h::UInt) = Base.hash((t.A, t.B), h)
Note that you should also define the == operator in the same way to guarantee that isequal(x,y) implies hash(x)==hash(y), as mentioned in the documentation.
However, the easiest way to get Julia to act like Go in your example is to redefine MyT as immutable. As an immutable type, Julia will hash MyT by its value rather than its object_id. As an example:
immutable MyT
A::String
B::Int64
end
dic = Dict{MyT, Int64}()
for i in 1:10
dic[MyT("x", 1)] = 1
end
dic[MyT("y", 2)] = 2
println(dic) # prints "Dict(MyT("y",2)=>2,MyT("x",1)=>1)"
Edit: Please refer to @StefanKarpinski's answer. The Base.hash function must return a UInt for it to be a valid hash, so my example won't work. Also there's some funkiness regarding user defined types which involves recursion.
The reason you get 10 different keys is due to the fact that Julia uses the hash function when determining the key to a dict. In this case, I'm guessing that it's using the address of the object in memory as the key for the dictionary. If you'd like to explicitly make (A,B) the unique key, you'll need to override the hash function for your particular type, with something like this:
Base.hash(x::MyT) = (x.A, x.B)
That will replicate the Go behavior, with only one item in the Dict.
Here's the documentation to the hash function.
Hope that helps!

Check if a type implements an interface in Julia

How to check that a type implements an interface in Julia?
For example, the iteration interface is implemented by the functions start, next, done.
I need to have a specialization of a function depending on whether the argument type implements a given interface or not.
EDIT
Here is an example of what I would like to do.
Consider the following code:
a = [7,8,9]
f = 1.0
s = Set()
push!(s,30)
push!(s,40)
function getsummary(obj)
println("Object of type ", typeof(obj))
end
function getsummary{T<:AbstractArray}(obj::T)
println("Iterable Object starting with ", next(obj, start(obj))[1])
end
getsummary(a)
getsummary(f)
getsummary(s)
The output is:
Iterable Object starting with 7
Object of type Float64
Object of type Set{Any}
Which is what we would expect since Set is not an AbstractArray. But clearly my second method only requires the type T to implement the iteration interface.
My issue isn't only related to the iteration interface but to all interfaces defined by a set of functions.
EDIT-2
I think my question is related to
https://github.com/JuliaLang/julia/issues/5
Since we could have imagined something like T<:Iterable
Typically, this is done with traits. See Traits.jl for one implementation; a similar approach is used in Base to dispatch on Base.iteratorsize, Base.linearindexing, etc. For instance, this is how Base implements collect using the iteratorsize trait:
"""
collect(element_type, collection)
Return an `Array` with the given element type of all items in a collection or iterable.
The result has the same shape and number of dimensions as `collection`.
"""
collect{T}(::Type{T}, itr) = _collect(T, itr, iteratorsize(itr))
_collect{T}(::Type{T}, itr, isz::HasLength) = copy!(Array{T,1}(Int(length(itr)::Integer)), itr)
_collect{T}(::Type{T}, itr, isz::HasShape) = copy!(similar(Array{T}, indices(itr)), itr)
function _collect{T}(::Type{T}, itr, isz::SizeUnknown)
a = Array{T,1}(0)
for x in itr
push!(a,x)
end
return a
end
See also Mauro Werder's talk on traits.
I would define an iterability(::T) trait as follows:
immutable Iterable end
immutable NotIterable end
iterability(T) =
if method_exists(length, (T,)) || !isa(Base.iteratorsize(T), Base.HasLength)
Iterable()
else
NotIterable()
end
which seems to work:
julia> iterability(Set)
Iterable()
julia> iterability(Number)
Iterable()
julia> iterability(Symbol)
NotIterable()
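In current Julia the iteration interface is the single function iterate rather than start, next and done, so the same trait idea can be sketched as follows. This assumes Julia 1.x and treats "has a one-argument iterate method" as the definition of iterable, which is an assumption of this sketch rather than an official rule; getsummary is the function from the question, rewritten to return a string.

```julia
struct Iterable end
struct NotIterable end

# Hypothetical trait function: a type counts as iterable if iterate accepts it.
iterability(::Type{T}) where {T} =
    hasmethod(iterate, Tuple{T}) ? Iterable() : NotIterable()

# Dispatch on the trait value instead of on a supertype such as AbstractArray.
getsummary(obj) = getsummary(iterability(typeof(obj)), obj)
getsummary(::Iterable, obj) = string("Iterable object starting with ", first(obj))
getsummary(::NotIterable, obj) = string("Object of type ", typeof(obj))

println(getsummary([7, 8, 9]))
println(getsummary(Set([30])))
println(getsummary(:a))
```

As in the output above, numbers count as iterable under this definition, because Base defines iterate for Number.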
You can check whether a type implements an interface via methodswith as follows:
foo(a_type::Type, an_interface::Symbol) = an_interface ∈ [i.name for i in methodswith(a_type, true)]
julia> foo(EachLine, :done)
true
But I don't quite understand the dynamic dispatch approach you mentioned in the comment. What does the generic function look like? What are the input and output of the function? I guess you want something like this?
function foo(a_type::Type, an_interface::Symbol)
# assume bar baz are predefined
if an_interface ∈ [i.name for i in methodswith(a_type, true)]
# call function bar
else
# call function baz
end
end
or some metaprogramming stuff to generate those functions respectively at compile time?
