How can I dispatch on traits relating two types, where the second type that co-satisfies the trait is uniquely determined by the first? - julia

Say I have a Julia trait that relates to two types: one type is a sort of "base" type that may satisfy a sort of partial trait, the other is an associated type that is uniquely determined by the base type. (That is, the relation from BaseType -> AssociatedType is a function.) Together, these types satisfy a composite trait that is the one of interest to me.
For example:
using Traits
#traitdef IsProduct{X} begin
isnew(X) -> Bool
coolness(X) -> Float64
end
#traitdef IsProductWithMeasurement{X,M} begin
#constraints begin
istrait(IsProduct{X})
end
measurements(X) -> M
#Maybe some other stuff that dispatches on (X,M), e.g.
#fits_in(X,M) -> Bool
#how_many_fit_in(X,M) -> Int64
#But I don't want to implement these now
end
Now here are a couple of example types. Please ignore the particulars of the examples; they are just meant as MWEs and there is nothing relevant in the details:
type Rope
color::ASCIIString
age_in_years::Float64
strength::Float64
length::Float64
end
type Paper
color::ASCIIString
age_in_years::Int64
content::ASCIIString
width::Float64
height::Float64
end
function isnew(x::Rope)
(x.age_in_years < 10.0)::Bool
end
function coolness(x::Rope)
if x.color=="Orange"
return 2.0::Float64
elseif x.color!="Taupe"
return 1.0::Float64
else
return 0.0::Float64
end
end
function isnew(x::Paper)
(x.age_in_years < 1.0)::Bool
end
function coolness(x::Paper)
(x.content=="StackOverflow Answers" ? 1000.0 : 0.0)::Float64
end
Since I've defined these functions, I can do
#assert istrait(IsProduct{Rope})
#assert istrait(IsProduct{Paper})
And now if I define
function measurements(x::Rope)
(x.length)::Float64
end
function measurements(x::Paper)
(x.height,x.width)::Tuple{Float64,Float64}
end
Then I can do
#assert istrait(IsProductWithMeasurement{Rope,Float64})
#assert istrait(IsProductWithMeasurement{Paper,Tuple{Float64,Float64}})
So far so good; these run without error. Now, what I want to do is write a function like the following:
#traitfn function get_measurements{X,M;IsProductWithMeasurement{X,M}}(similar_items::Array{X,1})
all_measurements = Array{M,1}(length(similar_items))
for i in eachindex(similar_items)
all_measurements[i] = measurements(similar_items[i])::M
end
all_measurements::Array{M,1}
end
Generically, this function is meant to be an example of "I want to use the fact that I, as the programmer, know that BaseType is always associated to AssociatedType to help the compiler with type inference. I know that whenever I do a certain task [in this case, get_measurements, but generically this could work in a bunch of cases] then I want the compiler to infer the output type of that function in a consistently patterned way."
That is, e.g.
do_something_that_makes_arrays_of_assoc_type(x::BaseType)
will always spit out Array{AssociatedType}, and
do_something_that_makes_tuples(x::BaseType)
will always spit out Tuple{Int64,BaseType,AssociatedType}.
AND, one such relationship holds for all pairs of <BaseType,AssociatedType>; e.g. if BatmanType is the base type to which RobinType is associated, and SupermanType is the base type to which LexLutherType is always associated, then
do_something_that_makes_tuple(x::BatManType)
will always output Tuple{Int64,BatmanType,RobinType}, and
do_something_that_makes_tuple(x::SuperManType)
will always output Tuple{Int64,SupermanType,LexLutherType}.
So, I understand this relationship, and I want the compiler to understand it for the sake of speed.
Now, back to the function example. If this makes sense, you will have realized that while the function definition I gave as an example is 'correct' in the sense that it satisfies this relationship and does compile, it is un-callable because the compiler doesn't understand the relationship between X and M, even though I do. In particular, since M doesn't appear in the method signature, there is no way for Julia to dispatch on the function.
So far, the only thing I have thought to do to solve this problem is to create a sort of workaround where I "compute" the associated type on the fly, and I can still use method dispatch to do this computation. Consider:
function get_measurement_type_of_product(x::Rope)
Float64
end
function get_measurement_type_of_product(x::Paper)
Tuple{Float64,Float64}
end
#traitfn function get_measurements{X;IsProduct{X}}(similar_items::Array{X,1})
M = get_measurement_type_of_product(similar_items[1]::X)
all_measurements = Array{M,1}(length(similar_items))
for i in eachindex(similar_items)
all_measurements[i] = measurements(similar_items[i])::M
end
all_measurements::Array{M,1}
end
Then indeed this compiles and is callable:
julia> get_measurements(Array{Rope,1}([Rope("blue",1.0,1.0,1.0),Rope("red",2.0,2.0,2.0)]))
2-element Array{Float64,1}:
1.0
2.0
But this is not ideal, because (a) I have to redefine this map each time, even though I feel as though I already told the compiler about the relationship between X and M by making them satisfy the trait, and (b) as far as I can guess--maybe this is wrong; I don't have direct evidence for this--the compiler won't necessarily be able to optimize as well as I want, since the relationship between X and M is "hidden" inside the return value of the function call.
One last thought: if I had the ability, what I would ideally do is something like this:
#traitdef IsProduct{X} begin
isnew(X) -> Bool
coolness(X) -> Float64
∃ ! M s.t. measurements(X) -> M
end
and then have some way of referring to the type that uniquely witnesses the existence relationship, so e.g.
#traitfn function get_measurements{X;IsProduct{X},IsWitnessType{IsProduct{X},M}}(similar_items::Array{X,1})
all_measurements = Array{M,1}(length(similar_items))
for i in eachindex(similar_items)
all_measurements[i] = measurements(similar_items[i])::M
end
all_measurements::Array{M,1}
end
because this would be somehow dispatchable.
So: what is my specific question? I am asking, given that you presumably by this point understand that my goals are
Have my code exhibit this sort of structure generically, so that
I can effectively repeat this design pattern across a lot of cases
and then program in the abstract at the high-level of X and M,
and
do (1) in such a way that the compiler can still optimize to the best of its ability / is as aware of the relationship among
types as I, the coder, am
then, how should I do this? I think the answer is
Use Traits.jl
Do something pretty similar to what you've done so far
Also do ____some clever thing____ that the answerer will indicate,
but I'm open to the idea that in fact the correct answer is
Abandon this approach, you're thinking about the problem the wrong way
Instead, think about it this way: ____MWE____
I'd also be perfectly satisfied by answers of the form
What you are asking for is a "sophisticated" feature of Julia that is still under development, and is expected to be included in v0.x.y, so just wait...
and I'm less enthusiastic about (but still curious to hear) an answer such as
Abandon Julia; instead use the language ________ that is designed for this type of thing
I also think this might be related to the question of typing Julia's function outputs, which as I take it is also under consideration, though I haven't been able to puzzle out the exact representation of this problem in terms of that one.

Related

The Idiomatic Way To Do OOP, Types and Methods in Julia

As I'm learning Julia, I am wondering how to properly do things I might have done in Python, Java or C++ before. For example, previously I might have used an abstract base class (or interface) to define a family of models through classes. Each class might then have a method like calculate. So to call it I might have model.calculate(), where the model is an object from one of the inheriting classes.
I get that Julia uses multiple dispatch to overload functions with different signatures such as calculate(model). The question I have is how to create different models. Do I use the type system for that and create different types like:
abstract type Model end
type BlackScholes <: Model end
type Heston <: Model end
where BlackScholes and Heston are different types of model? If so, then I can overload different calculate methods:
function calculate(model::BlackScholes)
# code
end
function calculate(model::Heston)
# code
end
But I'm not sure if this is a proper and idiomatic use of types in Julia. I will greatly appreciate your guidance!
This is a hard question to answer. Julia offers a wide range of tools to solve any given problem, and it would be hard for even a core developer of the language to assert that one particular approach is "right" or even "idiomatic".
For example, in the realm of simulating and solving stochastic differential equations, you could look at the approach taken by Chris Rackauckas (and many others) in the suite of packages under the JuliaDiffEq umbrella. However, many of these people are extremely experienced Julia coders, and what they do may be somewhat out of reach for less experienced Julia coders who just want to model something in a manner that is reasonably sensible and attainable for a mere mortal.
It is is possible that the only "right" answer to this question is to direct users to the Performance Tips section of the docs, and then assert that as long as you aren't violating any of the recommendations there, then what you are doing is probably okay.
I think the best way I can answer this question from my own personal experience is to provide an example of how I (a mere mortal) would approach the problem of simulating different Ito processes. It is actually not too far off what you have put in the question, although with one additional layer. To be clear, I make no claim that this is the "right" way to do things, merely that it is one approach that utilizes multiple dispatch and Julia's type system in a reasonably sensible fashion.
I start off with an abstract type, for nesting specific subtypes that represent specific models.
abstract type ItoProcess ; end
Now I define some specific model subtypes, e.g.
struct GeometricBrownianMotion <: ItoProcess
mu::Float64
sigma::Float64
end
struct Heston <: ItoProcess
mu::Float64
kappa::Float64
theta::Float64
xi::Float64
end
Note, in this case I don't need to add constructors that convert arguments to Float64, since Julia does this automatically, e.g. GeometricBrownianMotion(1, 2.0) will work out-of-the-box, as Julia will automatically convert 1 to 1.0 when constructing the type.
However, I might want to add some constructors for common parameterizations, e.g.
GeometricBrownianMotion() = GeometricBrownianMotion(0.0, 1.0)
I might also want some functions that return useful information about my models, e.g.
number_parameter(model::GeometricBrownianMotion) = 2
number_parameter(model::Heston) = 4
In fact, given how I've defined the models above, I could actually be a bit sneaky and define a method that works for all subtypes:
number_parameter(model::T) where {T<:ItoProcess} = length(fieldnames(typeof(model)))
Now I want to add some code that allows me to simulate my models:
function simulate(model::T, numobs::Int, stval) where {T<:ItoProcess}
# code here that is common to all subtypes of ItoProcess
simulate_inner(model, somethingelse)
# maybe more code that is common to all subtypes of ItoProcess
end
function simulate_inner(model::GeometricBrownianMotion, somethingelse)
# code here that is specific to GeometricBrownianMotion
end
function simulate_inner(model::Heston, somethingelse)
# code here that is specific to Heston
end
Note that I have used the abstract type to allow me to group all code that is common to all subtypes of ItoProcess in the simulate function. I then use multiple dispatch and simulate_inner to run any code that needs to be specific to a particular subtype of ItoProcess. For the aforementioned reasons, I hesitate to use the phrase "idiomatic", but let me instead say that the above is quite a common pattern in typical Julia code.
The one thing to be careful of in the above code is to ensure that the output type of the simulate function is type-stable, that is, the output type can be uniquely determined by the input types. Type stability is usually an important factor in ensuring performant Julia code. An easy way in this case to ensure type-stability is to always return Matrix{Float64} (if the output type is fixed for all subtypes of ItoProcess then obviously it is uniquely determined). I examine a case where the output type depends on input types below for my estimate example. Anyway, for simulate I might always return Matrix{Float64} since for GeometricBrownianMotion I only need one column, but for Heston I will need two (the first for price of the asset, the second for the volatility process).
In fact, depending on how the code is used, type-stability is not always necessary for performant code (see eg using function barriers to prevent type-instability from flowing through to other parts of your program), but it is a good habit to be in (for Julia code).
I might also want routines to estimate these models. Again, I can follow the same approach (but with a small twist):
function estimate(modeltype::Type{T}, data)::T where {T<:ItoProcess}
# again, code common to all subtypes of ItoProcess
estimate_inner(modeltype, data)
# more common code
return T(some stuff generated from function that can be used to construct T)
end
function estimate_inner(modeltype::Type{GeometricBrownianMotion}, data)
# code specific to GeometricBrownianMotion
end
function estimate_inner(modeltype::Type{Heston}, data)
# code specific to Heston
end
There are a few differences from the simulate case. Instead of inputting an instance of GeometricBrownianMotion or Heston, I instead input the type itself. This is because I don't actually need an instance of the type with defined values for the fields. In fact, the values of those fields is the very thing I am attempting to estimate! But I still want to use multiple dispatch, hence the ::Type{T} construct. Note also I have specified an output type for estimate. This output type is dependent on the ::Type{T} input, and so the function is type-stable (output type can be uniquely determined by input types). But common with the simulate case, I have structured the code so that code that is common to all subtypes of ItoProcess only needs to be written once, and code that is specific to the subtypes is separted out.
This answer is turning into an essay, so I should tie it off here. Hopefully this is useful to the OP, as well as anyone else getting into Julia. I just want to finish by emphasizing that what I have done above is only one approach, there are others that will be just as performant, but I have personally found the above to be useful from a structural perspective, as well as reasonably common across the Julia ecosystem.

Puzzling results for Julia typeof

I am puzzled by the following results of typeof in the Julia 1.0.0 REPL:
# This makes sense.
julia> typeof(10)
Int64
# This surprised me.
julia> typeof(function)
ERROR: syntax: unexpected ")"
# No answer at all for return example and no error either.
julia> typeof(return)
# In the next two examples the REPL returns the input code.
julia> typeof(in)
typeof(in)
julia> typeof(typeof)
typeof(typeof)
# The "for" word returns an error like the "function" word.
julia> typeof(for)
ERROR: syntax: unexpected ")"
The Julia 1.0.0 documentation says for typeof
"Get the concrete type of x."
The typeof(function) example is the one that really surprised me. I expected a function to be a first-class object in Julia and have a type. I guess I need to understand types in Julia.
Any suggestions?
Edit
Per some comment questions below, here is an example based on a small function:
julia> function test() return "test"; end
test (generic function with 1 method)
julia> test()
"test"
julia> typeof(test)
typeof(test)
Based on this example, I would have expected typeof(test) to return generic function, not typeof(test).
To be clear, I am not a hardcore user of the Julia internals. What follows is an answer designed to be (hopefully) an intuitive explanation of what functions are in Julia for the non-hardcore user. I do think this (very good) question could also benefit from a more technical answer provided by one of the more core developers of the language. Also, this answer is longer than I'd like, but I've used multiple examples to try and make things as intuitive as possible.
As has been pointed out in the comments, function itself is a reserved keyword, and is not an actual function istself per se, and so is orthogonal to the actual question. This answer is intended to address your edit to the question.
Since Julia v0.6+, Function is an abstract supertype, much in the same way that Number is an abstract supertype. All functions, e.g. mean, user-defined functions, and anonymous functions, are subtypes of Function, in the same way that Float64 and Int are subtypes of Number.
This structure is deliberate and has several advantages.
Firstly, for reasons I don't fully understand, structuring functions in this way was the key to allowing anonymous functions in Julia to run just as fast as in-built functions from Base. See here and here as starting points if you want to learn more about this.
Secondly, because each function is its own subtype, you can now dispatch on specific functions. For example:
f1(f::T, x) where {T<:typeof(mean)} = f(x)
and:
f1(f::T, x) where {T<:typeof(sum)} = f(x) + 1
are different dispatch methods for the function f1
So, given all this, why does, e.g. typeof(sum) return typeof(sum), especially given that typeof(Float64) returns DataType? The issue here is that, roughly speaking, from a syntactical perspective, sum needs to serves two purposes simultaneously. It needs to be both a value, like e.g. 1.0, albeit one that is used to call the sum function on some input. But, it is also needs to be a type name, like Float64.
Obviously, it can't do both at the same time. So sum on its own behaves like a value. You can write f = sum ; f(randn(5)) to see how it behaves like a value. But we also need some way of representing the type of sum that will work not just for sum, but for any user-defined function, and any anonymous function. The developers decided to go with the (arguably) simplest option and have the type of sum print literally as typeof(sum), hence the behaviour you observe. Similarly if I write f1(x) = x ; typeof(f1), that will also return typeof(f1).
Anonymous functions are a bit more tricky, since they are not named as such. What should we do for typeof(x -> x^2)? What actually happens is that when you build an anonymous function, it is stored as a temporary global variable in the module Main, and given a number that serves as its type for lookup purposes. So if you write f = (x -> x^2), you'll get something back like #3 (generic function with 1 method), and typeof(f) will return something like getfield(Main, Symbol("##3#4")), where you can see that Symbol("##3#4") is the temporary type of this anonymous function stored in Main. (a side effect of this is that if you write code that keeps arbitrarily generating the same anonymous function over and over you will eventually overflow memory, since they are all actually being stored as separate global variables of their own type - however, this does not prevent you from doing something like this for n = 1:largenumber ; findall(y -> y > 1.0, x) ; end inside a function, since in this case the anonymous function is only compiled once at compile-time).
Relating all of this back to the Function supertype, you'll note that typeof(sum) <: Function returns true, showing that the type of sum, aka typeof(sum) is indeed a subtype of Function. And note also that typeof(typeof(sum)) returns DataType, in much the same way that typeof(typeof(1.0)) returns DataType, which shows how sum actually behaves like a value.
Now, given everything I've said, all the examples in your question now make sense. typeof(function) and typeof(for) return errors as they should, since function and for are reserved syntax. typeof(typeof) and typeof(in) correctly return (respectively) typeof(typeof), and typeof(in), since typeof and in are both functions. Note of course that typeof(typeof(typeof)) returns DataType.

dealing with types in kwargs in Julia

How can I use kwargs in a Julia function and declare their types for speed?
function f(x::Float64; kwargs...)
kwargs = Dict(kwargs)
if haskey(kwargs, :c)
c::Float64 = kwargs[:c]
else
c::Float64 = 1.0
end
return x^2 + c
end
f(0.0, c=10.0)
yields:
ERROR: LoadError: syntax: multiple type declarations for "c"
Of course I can define the function as f(x::Float64, c::Float64=1.0) to achieve the result, but I have MANY optional arguments with default values to pass, so I'd prefer to use kwargs.
Thanks.
Related post
As noted in another answer, this really only matters if you're going to have a type instability. If you do, the answer is to layer your functions. Have a top layer which does type checking and all sorts of setup, and then call a function which uses dispatch to be fast. For example,
function f(x::Float64; kwargs...)
kwargs = Dict(kwargs)
if haskey(kwargs, :c)
c = kwargs[:c]
else
c = 1.0
end
return _f(x,c)
end
_f(x,c) = x^2 + c
If most of your time is spent in the inner function, then this will be faster (it might not be for very simple functions). This allows for very general usage too, where you have have a keyword argument be by default nothing and do and if nothing ... which could setup a complicated default, and not have to worry about the type stability since it will be shielded from the inner function.
This kind of high-level type-checking wrapper above a performance sensitive inner function is used a lot in DifferentialEquations.jl. Check out the high-level wrapper for the SDE solvers which led to nice speedups by insuring type stability (the inner function is sde_solve) (or check out the solve for ODEProblem, it's much more complex since it handles conversions to different pacakges but it's the same idea).
A simpler answer for small examples like yours may be possible after this PR merges.
To fix some confusion, here's a declaration form:
function f(x::Float64; kwargs...)
local c::Float64 # Ensures the type of `c` will be `Float64`
kwargs = Dict(kwargs)
if haskey(kwargs, :c)
c = float(kwargs[:c])
else
c = 1.0
end
return x^2 + c
end
This will force anything that saves to c to convert to a Float64 or error, resulting in a type-stability, but is not as general of a solution. What form you use really depends on what you're doing.
Lastly, there's also the type assert, as #TotalVerb showed:
function f(x::Float64; c::Float64=1.0, kwargs...)
return x^2 + c
end
That's clean, or you could assert in the function:
function f(x::Float64; kwargs...)
kwargs = Dict(kwargs)
if haskey(kwargs, :c)
c = float(kwargs[:c])::Float64
else
c = 1.0
end
return x^2 + c
end
which will cause convertions only on the lines where the assertion occurs (i.e. the #TotalVerb form won't dispatch, so you can't make another function with c::Int, and it will only assert (convert) when the keyword arg is first read in).
Summary
The first solution will dispatch to be type stable in _f no matter what type the user makes c, and so if _f is a long calculation, this will get pretty much optimal performance, but for really quick calls it will have dispatch overhead.
The second solution will fix any type stability by forcing anything you set c to be a Float64 (it will try to convert, and if it can't, error). Thus this gets speed by forcing type stability, or erroring.
The assert in the keyword spot (#TotalVerb's answer) is the cleanest, but won't auto-convert later (so you could get a type-instability. But if you don't accidentally convert it later, then you have type stability, types can be inferred, and so you'll get optimal performance) and you can't extend it to cases where the function has c passed in as other types (no dispatch).
The last solution is pretty much the same as 3, except not as nice. I wouldn't recommend it. If you're doing something complicated with asserts, you likely are designing something wrong or really want to do something like the first (dispatch in a longer function call which is type stable).
But note that dispatch with version 3 may be fixed in the near future, which would allow you to have a different function with c::Float64 and c::Int (if necessary). Hopefully your solution is in here somewhere.
Note that declaring types does not give you increased performance; you may wish to relax the type constraints on x and c for your code to be more generic. Anyway, this is probably what you want:
function f(x::Float64; c::Float64=1.0, kwargs...)
return x^2 + c
end
See the keyword arguments section of the manual.

How to declare array type that can have int and floats

I am a newbie to Julia and still trying to figure out everything.
I want to restrict input variable type to array that can contain int and floats.
I would really appreciate any help.
function foo(array::??)
As I mentioned in the comment, you don't want to mix them for performance reasons. However, if your array can be either Floats or Ints, but you don't know which it will be, then the best approach is to make it dispatch on the parametric type:
function foo{T<:Number,N}(array::Array{T,N})
This will make it compile a separate function for arrays of each number type (only when needed), and since the type will be known for the compiler, it will run an optimized version of the function whether you give it foo([0.1,0.3,0.4]), foo([1 2 3]), foo([1//2 3//4]), etc.
Updated syntax in Julia 0.6+
function foo(array::Array{T,N}) where {T<:Number,N}
For more generality, you can use Array{Union{Int64,Float64},N} as a type. This will allow Floats and Ints, and you can use its constructor like
arr = Array{Union{Int64,Float64},2}(4,4) # The 2 is the dimension, (4,4) is the size
and you can allow dispatching onto weird things like this as well by doing
function foo{T,N}(array::Array{T,N})
i.e. just remove the restriction on T. However, since the compiler cannot know in advance whether any element of the array is an Int or a Float, it cannot optimize it very well. So in general you should not do this...
But let me explain one way you can work with this and still get something with decent performance. It also works by multiple dispatch. Essentially, if you encase your inner loops with a function call which is a strictly typed dispatch, then when doing all of the hard calculations it can know exactly what type it is and optimize the code anyways. This is best explained by an example. Let's say we want to do:
function foo{T,N}(array::Array{T,N})
for i in eachindex(array)
val = array[i]
# do algorithm X on val
end
end
You can check using #code_warntype that val will not compile as an Int64 or Float64 because it won't know until runtime what type it will be for each i. If you check #code_llvm (or #code_native for the assembly) you see that there is a really long code that is generated in order to handle this. What we can instead do is define
function inner_foo{T<:Number}(val::T)
# Do algorithm X on val
end
and then instead define foo as
function foo2{T,N}(array::Array{T,N})
for i in eachindex(array)
inner_foo(array[i])
end
end
While this looks the same to you, it is very different to the compiler. Note that inner_foo(array[i]) dispatches a specially-compiled function for whatever number type it sees, so in foo2 algorithm X is calculated efficiently, and the only non-efficient part is the wrapping above inner_foo (so if all your time is spent in inner_foo, you will get basically maximal performance).
This is why Julia is built around multiple-dispatch: it's a design which allows you to push things out to optimized functions whenever possible. Julia is fast because of it. Use it.
This should be a comment to Chris' answer, but I don't have enough points to comment.
As Chris points out, using function barriers can be quite useful to generate optimal code. However be aware that dynamic dispatch has some overhead. This may or may not be important depending on the complexity of the inner function.
function foo1{T,N}(array::Array{T,N})
s = 0.0
for i in eachindex(array)
val = array[i]
s += val*val
end
s
end
function foo2{T,N}(array::Array{T,N})
s = 0.0
for i in eachindex(array)
s += inner_foo(array[i])
end
s
end
function foo3{T,N}(array::Array{T,N})
s = 0.0
for i in eachindex(array)
val = array[i]
if isa(val, Float64)
s += inner_foo(val::Float64)
else
s += inner_foo(val::Int64)
end
end
s
end
function inner_foo{T<:Number}(val::T)
val*val
end
For A = Array{Union{Int64,Float64},N}, foo2 doesn't provide much speedup over foo1 since benefit of the optimised inner_foo is countered by the cost of dynamic dispatch.
foo3 is much faster (~7 times) and could be used if possible types are limited and known ahead of time (as in above example where elements are either Int64 or Float64)
See https://groups.google.com/forum/#!topic/julia-users/OBs0fmNmjCU for further discussion.

Define a new method with only a few changes

I want to write a version that accepts a supplementary argument. The difference with the initial version only resides in a few lines of codes, potentially within loops. A typical example is to user a vector of weight w.
One solution is to completely rewrite a new function
function f(Vector::a)
...
for x in a
...
s += x[i]
...
end
...
end
function f(a::Vector, w::Vector)
...
for x in a
...
s += x[i] * w[i]
...
end
...
end
This solution duplicates code and therefore makes the program harder to maintain.
I could split ... into different helper functions, which are called by both functions, but the resulting code would be hard to follow
Another solution is to write only one function and use a ? : structure for each line that should be changed
function f(a, w::Union(Nothing, Vector) = nothing)
....
for x in a
...
s += (w == nothing)? x[i] : x[i] * w[i]
...
end
....
end
This code requires to check a condition at every step in a loop, which does not sound efficient, compared to the first version.
I'm sure there is a better solution, maybe using macros. What would be a good way to deal with this?
There are lots of ways to do this sort of thing, ranging from optional arguments to custom types to metaprogramming with #eval'ed code generation (this would splice in the changes for each new method as you loop over a list of possibilities).
I think in this case I'd use a combination of the approaches suggested by #ColinTBowers and #GnimucKey.
It's fairly simple to define a custom array type that is all ones:
immutable Ones{N} <: AbstractArray{Int,N}
dims::NTuple{N, Int}
end
Base.size(O::Ones) = O.dims
Base.getindex(O::Ones, I::Int...) = (checkbounds(O, I...); 1)
I've chosen to use an Int as the element type since it tends to promote well. Now all you need is to be a bit more flexible in your argument list and you're good to go:
function f(a::Vector, w::AbstractVector=Ones(size(a))
…
This should have a lower overhead than either of the other proposed solutions; getindex should inline nicely as a bounds check and the number 1, there's no type instability, and you don't need to rewrite your algorithm. If you're sure that all your accesses are in-bounds, you could even remove the bounds checking as an additional optimization. Or on a recent 0.4, you could define and use Base.unsafe_getindex(O::Ones, I::Int...) = 1 (that won't quite work on 0.3 since it's not guaranteed to be defined for all AbstractArrays).
In this case, using Optional Arguments may play the trick.
Just make the w argument default to ones().
I've come up against this problem a few times. If you want to avoid the conditional if statement inside the loop, one possibility is to use multiple dispatch over some dummy types. For example:
abstract MyFuncTypes
type FuncWithNoWeight <: MyFuncTypes; end
evaluate(x::Vector, i::Int, ::FuncWithNoWeight) = x[i]
type FuncWithWeight{T} <: MyFuncTypes
w::Vector{T}
end
evaluate(x::Vector, i::Int, wT::FuncWithWeight) = x[i] * wT.w[i]
function f(a, w::MyFuncTypes=FuncWithNoWeight())
....
for x in a
...
s += evaluate(x, i, w)
...
end
....
end
I extend the evaluate method over FuncWithNoWeight and FuncWithWeight in order to get the appropriate behaviour. I also nest these types within an abstract type MyFuncTypes, which is the second input to f (with default value of FuncWithNoWeight). From here, multiple dispatch and Julia's type system takes care of the rest.
One neat thing about this approach is that if you decide later on you want to add a third type of behaviour inside the loop (not necessarily even weighting, pretty much any type of transformation will be possible), it is as simple as defining a new type, nesting it under MyFuncTypes, and extending the evaluate method to the new type.
UPDATE: As Matt B. has pointed out, the first version of my answer accidentally introduced type instability into the function with my solution. As a general rule I typically find that if Matt posts something it is worth paying close attention (hint, hint, check out his answer). I'm still learning a lot about Julia (and am answering questions on StackOverflow to facilitate that learning). I've updated my answer to remove the type instability pointed out by Matt.

Resources