Error checking in outer constructor in Julia

I have the following struct with two outer constructors
struct SingleSpinState <: EPState
spins::BitArray{1}
end
SingleSpinState(n_sites::Int) = SingleSpinState(rand(Bool, n_sites))
SingleSpinState(n_sites::Int, n_particles::Int) = SingleSpinState(vcat(trues(n_particles), falses(n_sites - n_particles)))
In the second constructor I would like to check that n_sites > n_particles. According to the documentation, essential error checking should go in inner constructors, yet it seems to me that the above situation is quite common: the outer constructor uses the inner constructor, but its arguments are constrained in some way.
What is the proper way to deal with this situation?

You can define multiple inner constructors:
julia> struct SingleSpinState
           spins::BitVector
           SingleSpinState(n_sites::Int) = new(bitrand(n_sites))
           function SingleSpinState(n_sites::Int, n_particles::Int)
               if !(n_sites > n_particles)
                   throw(ArgumentError("n_sites must be larger than n_particles"))
               end
               new([trues(n_particles); falses(n_sites-n_particles)])
           end
       end
julia> SingleSpinState(2)
SingleSpinState(Bool[false, true])
julia> SingleSpinState(2, 1)
SingleSpinState(Bool[true, false])
julia> SingleSpinState(2, 3)
ERROR: ArgumentError: n_sites must be larger than n_particles
Stacktrace:
[...]
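If the constructors should stay outside the type, the same check can also live in the outer method itself, before it calls the default constructor. The following is a minimal sketch of that alternative, assuming Julia 1.x; the name SpinState is hypothetical and stands in for SingleSpinState (without the EPState supertype) so that the snippet runs on its own.

```julia
# Hypothetical stand-in for SingleSpinState, without the EPState supertype.
struct SpinState
    spins::BitVector
end

# Outer constructor: validate the arguments, then call the default constructor.
function SpinState(n_sites::Int, n_particles::Int)
    n_sites > n_particles ||
        throw(ArgumentError("n_sites must be larger than n_particles"))
    return SpinState([trues(n_particles); falses(n_sites - n_particles)])
end

state = SpinState(3, 1)
```

A check placed only in an outer constructor can be bypassed by calling the default constructor directly with a ready-made BitVector, which is why the inner-constructor version is the stricter of the two.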

Related

Julia: Parametric types with inner constructor: new and typeof

I am trying to understand parametric types and the new function available to inner constructor methods. The manual describes new as a "special function available to inner constructors which created a new object of the type". See the manual sections on new and on inner constructor methods.
Consider an inner method designed to calculate the sum of x, where x could be, say, a vector or a tuple, and is given the parametric type T. A natural thing to want is for the type of the elements of x to be inherited by their sum s. I don't seem to need new for that, correct?
struct M{T}
x::T
s
function M(x)
s = sum(x)
x,s
end
end
julia> M([1,2,3])
([1, 2, 3], 6)
julia> M([1.,2.,3.])
([1.0, 2.0, 3.0], 6.0)
julia> typeof(M([1.,2.,3.]))
Tuple{Vector{Float64}, Float64}
Edit: Correction! I intended to have the last line of the inner constructor be M(x,s)... It's still an interesting question, so I won't correct it. How does M(x,s) differ from new{typeof(x)}(x,s)?
One usage of new I have seen is in combination with typeof(), something like:
struct M{T}
x::T
s
function M(x)
s = sum(x)
new{typeof(x)}(x,s)
end
end
julia> M([1,2,3])
M{Vector{Int64}}([1, 2, 3], 6)
julia> M([1.,2.,3.])
M{Vector{Float64}}([1.0, 2.0, 3.0], 6.0)
What if I wanted to constrain s to the same type as x? That is, for instance, if x is a vector, then s should be a vector (in this case, a vector of one element). How would I do that? If I replace the last line of the inner constructor with x, new{typeof(x)}(s), I get the understandable error:
MethodError: Cannot `convert` an object of type Int64 to an object of type Vector{Int64}
Here are the rules:
If you are writing an outer constructor for a type M, the constructor should return an instance of M by eventually calling the inner constructor, like this: M(<args>).
If you are writing an inner constructor, this will override the default inner constructor. So you must return an instance of M by calling new(<args>).
The new "special function" exists to allow the construction of a type that doesn't have a constructor yet. Observe the following example:
julia> struct A
x::Int
function A(x)
A(x)
end
end
julia> A(4)
ERROR: StackOverflowError:
Stacktrace:
[1] A(::Int64) at ./REPL[3]:4 (repeats 79984 times)
This is a circular definition of the constructor for A, which results in a stack overflow. You cannot pull yourself up by your bootstraps, so Julia provides the new function as a way to circumvent this problem.
You should provide the new function with a number of arguments equal to the number of fields in your struct. Note that the new function will attempt to convert the types of its inputs to match the declared types of the fields of your struct:
julia> struct B
x::Float64
B(x) = new(x)
end
julia> B(5)
B(5.0)
julia> B('a')
B(97.0)
julia> B("a")
ERROR: MethodError: Cannot `convert` an object of type String to an object
of type Float64
(The inner constructor for B above is exactly the same as the default inner constructor.)
When you're defining parametric types, the new function must be provided with a number of parameters equal to the number of parameters for your type (and in the same order), analogously to the default inner constructor for parametric types. First observe how the default inner constructor for parametric types is used:
julia> struct Foo{T}
x::T
end
julia> Foo{String}("a")
Foo{String}("a")
Now if you were writing an inner constructor for Foo, instead of writing Foo{T}(x) inside the constructor, you would replace the Foo with new, like this: new{T}(x).
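As a minimal sketch of that replacement (the name Foo2 is hypothetical, chosen so it does not clash with Foo above), the default inner constructor written out by hand could look like this:

```julia
# Hypothetical type Foo2: same shape as Foo, with the inner constructor spelled out.
struct Foo2{T}
    x::T
    # `new{T}` plays the role that `Foo2{T}` plays outside the type definition.
    Foo2{T}(x) where {T} = new{T}(x)
end

a = Foo2{String}("a")
b = Foo2{Float64}(1)   # the argument is converted to the field type Float64
```

Because an inner constructor is defined here, Julia no longer supplies the convenience method that infers T from the argument, so calling Foo2("a") without a parameter raises a MethodError unless an outer constructor is added.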
You might need typeof to help define the constructor, but often you don't. Here's one way you could define your M type:
struct M{I, T}
    x::I
    s::T
    function M(x::I) where I
        s = sum(x)
        new{I, typeof(s)}(x, s)
    end
end
I'm using typeof here so that I could be any iterable type that returns numbers:
julia> typeof(M(1:3))
M{UnitRange{Int64},Int64}
julia> g = (rand() for _ in 1:10)
Base.Generator{UnitRange{Int64},var"#5#6"}(var"#5#6"(), 1:10)
julia> typeof(M(g))
M{Base.Generator{UnitRange{Int64},var"#5#6"},Float64}
Note that providing the parameters for your type is required when you are using new inside an inner constructor for a parametric type:
julia> struct C{T}
x::Int
C(x) = new(x)
end
ERROR: syntax: too few type parameters specified in "new{...}" around REPL[6]:1
Remember, a constructor is designed to construct something. Specifically, the constructor M is designed to construct a value of type M. Your example constructor
struct M{T}
x::T
s
function M(x)
s = sum(x)
x,s
end
end
means that the result of evaluating the expression M([1 2 3]) is a tuple, not an instance of M. If I encountered such a constructor in the wild, I'd assume it was a bug and report it. new is the internal magic that allows you to actually construct a value of type M.
It's a matter of abstraction. If you just want a tuple in the first place, then forget about the structure called M and just define a function m at module scope that returns a tuple. But if you intend to treat this as a special data type, potentially for use with dynamic dispatch but even just for self-documentation purposes, then your constructor should return a value of type M.
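For the last part of the question (constraining s to have the same type as x), one option is to declare both fields with the same type parameter and build s as a one-element vector before calling new. This is only a sketch under assumptions: the name VecSum is hypothetical, x is restricted to Vector, and the sum is converted back to the element type of x, which can fail for element types such as Bool whose sum does not fit the element type.

```julia
# Hypothetical type: both fields share the parameter T, so s has the type of x.
struct VecSum{T}
    x::T
    s::T
    function VecSum(x::Vector{E}) where {E}
        s = E[sum(x)]            # one-element vector with the element type of x
        new{Vector{E}}(x, s)
    end
end

v = VecSum([1.0, 2.0, 3.0])
```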

Why use Inner Constructor Methods in Julia?

If I am understanding the docs correctly, the value of Inner Constructor Methods is that I can use them as a regular constructor but with some additional changes to the values?
For example, using a normal constructor it is not possible to take the constructor arguments and add the number 1 to them but with an Inner Constructor, this is possible?
An inner constructor allows you to replace the default constructor. For example:
julia> struct A
x::Int
A(a::Int,b::Int)=new(a+b)
end
julia> A(3)
ERROR: MethodError: no method matching A(::Int64)
julia> A(3,5)
A(8)
Note that when no inner constructor is defined, a default one still exists and takes one argument per field. Adding outer constructors does not override the behavior of that inner constructor:
julia> struct B
x::Int
end
julia> B(a::Int,b::Int)=B(a+b);
julia> B(3)
B(3)
julia> B(3,5)
B(8)
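Beyond changing the arguments, an inner constructor can enforce an invariant that no outer constructor is able to bypass, since every instance has to be created through it. A minimal sketch, modeled on the OrderedPair example from the Julia manual:

```julia
struct OrderedPair
    x::Real
    y::Real
    # Only this constructor can create an OrderedPair, so x <= y always holds.
    OrderedPair(x, y) = x > y ? error("out of order") : new(x, y)
end

p = OrderedPair(1, 2)
```

Calling OrderedPair(2, 1) raises an error instead of producing an invalid object.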

Stack overflow when I am trying to make a composite type with a matrix as a field

In the following code, I have a composite type, and in my real code several of the fields are matrices. In this example there is only one. I keep getting a stack overflow when I try to construct the composite type. Here is the code sample:
struct Tables
eij::Array{Float64,2}
end
Tables(eij::Array{Float64,2}) = Tables(eij) # works if this is commented out
x = 5 # arbitrary matrix dimension
e_ij = Array{Float64,2}(undef,x,x)
for i=1:x
for j=1:x
e_ij[i,j] = i*j/2.3 #dummy numbers, but not the case in real code
end
end
vdwTable = Tables([e_ij[i,j] for i=1:x,j=1:x])
I use the temporary variable e_ij to make the matrix first since I don't want the composite Tables to be mutable. So, my reasoning is that by generating the tables first in dummy variables like e_ij, I can then initialize the immutable Tables that I really want.
If I comment out the outer constructor for the struct Tables it works. However, I actually want to have several different outer constructors for cases where different fields are not passed data to be initialized. In these cases, I want to give them default matrices.
The error I get is the following: ERROR: LoadError: StackOverflowError: on the line vdwTable = Tables([e_ij[i,j] for i=1:x,j=1:x])
When you define a composite type, an inner constructor is automatically defined, so this:
struct Tables
eij::Array{Float64,2}
end
is equivalent to this:
struct Tables
eij::Array{Float64,2}
Tables(eij::Array{Float64,2}) = new(eij)
end
When you define this outer constructor
Tables(eij::Array{Float64,2}) = Tables(eij)
you get in the way of the inner constructor. Your outer constructor just calls itself recursively until you get a stack overflow.
Doing this, on the other hand,
Tables(eij) = Tables(eij)
is actually equivalent to this:
Tables(eij::Any) = Tables(eij)
so when you subsequently call
vdwTable = Tables([e_ij[i,j] for i=1:x,j=1:x])
it then just ignores your outer constructor, because there is a more specific method match, namely the inner constructor. So that particular outer constructor is quite useless: it will either be ignored or recurse until the stack overflows.
The simplest solution is: just don't make an outer constructor. If you do need an outer one to enforce some conditions, make sure that it doesn't shadow the inner constructor by having the same type signature. For example,
Tables() = Tables(zeros(5, 5))
should work.
I would probably do it like this, though:
struct Tables
eij::Array{Float64,2}
Tables(eij=zeros(5, 5)) = new(eij)
end
For your second example, with two fields, you can try this:
struct Tables
eij::Array{Float64,2}
sij::Array{Float64,2}
Tables(eij=zeros(5,5), sij=zeros(5,5)) = new(eij, sij)
end
Your inputs will be converted to Float64 matrices, if that is possible, otherwise an exception will be raised.
DNF gave a proper explanation so +1. I would just like to add one small comment (not an answer to the question but something that is relevant from my experience), that is too long for a comment.
When you do not specify an inner constructor yourself, Julia automatically defines one inner and one outer constructor:
julia> struct Tables
eij::Array{Float64,2}
end
julia> methods(Tables)
# 2 methods for generic function "(::Type)":
[1] Tables(eij::Array{Float64,2}) in Main at REPL[1]:2
[2] Tables(eij) in Main at REPL[1]:2
while defining an inner constructor suppresses definition of the outer constructor:
julia> struct Tables
eij::Array{Float64,2}
Tables(eij::Array{Float64,2}) = new(eij)
end
julia> methods(Tables)
# 1 method for generic function "(::Type)":
[1] Tables(eij::Array{Float64,2}) in Main at REPL[1]:3
So the cases are not 100% equivalent. The purpose of the auto-generated outer constructor is to perform an automatic conversion of its argument if it is possible, see e.g. (this is the result in the first case - when no inner constructor was defined):
julia> @code_lowered Tables([true false
true false])
CodeInfo(
1 ─ %1 = (Core.apply_type)(Main.Array, Main.Float64, 2)
│ %2 = (Base.convert)(%1, eij)
│ %3 = %new(Main.Tables, %2)
└── return %3
)
while in the second case the same call would throw a method error.
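The difference can be checked directly. The sketch below assumes Julia 1.x and uses two hypothetical type names, TablesDefault and TablesStrict, so that both variants can be defined in the same session:

```julia
# Default constructors only: both the typed and the converting method exist.
struct TablesDefault
    eij::Array{Float64,2}
end

# Explicit inner constructor: the converting method is not generated.
struct TablesStrict
    eij::Array{Float64,2}
    TablesStrict(eij::Array{Float64,2}) = new(eij)
end

bools = [true false; true false]      # a Matrix{Bool}
converted = TablesDefault(bools)      # Bool entries are converted to Float64

# The strict variant has no method accepting a Matrix{Bool}.
strict_failed = try
    TablesStrict(bools)
    false
catch err
    err isa MethodError
end
```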

Can I build a parameterless constructor for a parametric type in an outer constructor?

In order to instantiate a type like x = MyType{Int}()
I can define an inner constructor.
immutable MyType{T}
x::Vector{T}
MyType() = new(T[])
end
Is it possible to achieve the same objective using an outer constructor?
This can be done using the following syntax:
(::Type{MyType{T}}){T}() = MyType{T}(T[])
The thing in the first set of parentheses describes the called object. ::T means "of type T", so this is a definition for calling an object of type Type{MyType{T}}, which means the object MyType{T} itself. Next {T} means that T is a parameter of this definition, and a value for it must be available in order to call this definition. So MyType{Int} matches, but MyType doesn't. From there on, the syntax should be familiar.
This syntax is definitely a bit fiddly and unintuitive, and we hope to improve it in a future version of the language, hopefully v0.6.
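In Julia 0.6 and later, the same outer constructor can be written with where syntax. A minimal sketch, assuming Julia 1.x (where immutable is spelled struct); the type name EmptyVec is hypothetical and mirrors MyType:

```julia
# Hypothetical type mirroring MyType; no inner constructor is defined.
struct EmptyVec{T}
    x::Vector{T}
end

# Outer constructor with no arguments; T is taken from the call EmptyVec{Int}().
EmptyVec{T}() where {T} = EmptyVec{T}(T[])

v = EmptyVec{Int}()
```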
I may be wrong, but you cannot build a parameterless function like this:
julia> f{T}() = show(T)
WARNING: static parameter T does not occur in signature for f at none:1.
The method will not be callable.
f (generic function with 1 method)
therefore you won't be able to do this:
julia> immutable MyType{T}
x::Vector{T}
end
julia> MyType{T}() = MyType{T}(T[])
WARNING: static parameter T does not occur in signature for call at none:1.
The method will not be callable.
MyType{T}
julia> x = MyType{Int}()
ERROR: MethodError: `convert` has no method matching convert(::Type{MyType{Int64}})
...
Every outer constructor is also a function.
You can say
f(T::Type) = show(T)
and also
MyType(T::Type) = MyType(T[])
But Julia needs to see the type in the call to know which you want.

Julia: why must parametric types have outer constructors?

The following works:
type TypeA
x :: Array
y :: Int
TypeA(x :: Array ) = new(x, 2)
end
julia> y = TypeA([1,2,3])
TypeA([1,2,3],2)
This does not:
type TypeB{S}
x :: Array{S}
y :: Int
TypeB{S}( x:: Array{S} ) = new(x,2)
end
julia> y = TypeB([1,2,3])
ERROR: `TypeB{S}` has no method matching TypeB{S}(::Array{Int64,1})
In order to get the second case to work, one has to declare the constructor outside of the type declaration. This is slightly undesirable.
My question is why this problem exists from a Julia-design standpoint so I can better reason about the Julia type-system.
Thank you.
This works:
type TypeB{S}
x::Array{S}
y::Int
TypeB(x::Array{S}) = new(x,2)
end
TypeB{Int}([1,2,3])
which I figured out by reading the manual, but I must admit I don't really understand inner constructors that well, especially for parametric types. I think it's because you are actually defining a family of types, so the inner constructor is only sensible for each individual type; hence you need to specify the {Int} to say which type you want. You can add an outer constructor to make it easier, i.e.
type TypeB{S}
x::Array{S}
y::Int
TypeB(x::Array{S}) = new(x,2)
end
TypeB{S}(x::Array{S}) = TypeB{S}(x)
TypeB([1,2,3])
I think it'd be good to bring it up on the Julia issues page, because I feel like this outer helper constructor could be provided by default.
EDIT: This Julia issue points out the problems with providing an outer constructor by default.
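In current Julia (1.x), type is spelled mutable struct and an inner constructor of a parametric type passes the parameters explicitly through new{S}. With that syntax the inner constructor can infer S from its argument, so no outer helper is needed. A minimal sketch; the name TypeC is hypothetical and mirrors TypeB:

```julia
# Hypothetical Julia 1.x counterpart of TypeB.
mutable struct TypeC{S}
    x::Array{S}
    y::Int
    # S is inferred from the argument and passed on explicitly to new.
    TypeC(x::Array{S}) where {S} = new{S}(x, 2)
end

c = TypeC([1, 2, 3])
```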
