Efficient way to pass optional argument to a function in Julia

I would like to create a function in Julia which accepts an optional argument
let's call it "BMI", which is itself a function, such that, if this optional argument is not included, "do_something" skips a block of instructions.
That is, something like
function do_something(age, height; BMI=None)
print("hi, I am $age years old and my height is $height")
if BMI!=None
print("My BMI is $(BMI(age,height))")
end
print("bye")
end
What is the best way to accomplish this in Julia?

There are a few approaches to your problem. First of all, you may use nothing to distinguish whether a BMI was passed to your function:
function do_something(age, height; BMI = nothing)
print("hi, I am $age years old and my height is $height")
if !isnothing(BMI)
print("My BMI is $(BMI(age,height))")
end
print("bye")
end
If you are on an older version of Julia (before 1.1, I think, which is when isnothing was added) you should use BMI !== nothing; note the double equals sign. It is better than !=, because !== is the negation of the identity test ===, while != calls ==, which can be overloaded for arbitrary types. It may not look important in your particular case, but it is better to build good habits from the start.
At the same time, I would recommend using multiple dispatch. It may look excessive here, but it gives you a taste and feel of Julia and also makes it possible to extend your initial declaration naturally:
do_bmi(bmi::Nothing, age, height) = nothing
do_bmi(bmi, age, height) = print("My BMI is $(bmi(age,height))")
function do_something(age, height; BMI = nothing)
print("hi, I am $age years old and my height is $height")
do_bmi(BMI, age, height)
print("bye")
end
For example, if you would like to give the user the possibility to choose BMI from a set of predefined functions, abbreviated by some String, all you have to do is define this function
function do_bmi(bmi::AbstractString, age, height)
if bmi == "standard"
do_bmi((a, h) -> a^2/h, age, height)
else
println("Unknown BMI keyword $bmi")
end
end
and call your original function like this
do_something(20, 170, BMI = "standard")
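For readers coming from Python (where the question's None comes from), the same dispatch-on-argument-type idea can be roughly sketched with functools.singledispatch. This is only an analogy under stated assumptions: singledispatch dispatches on the first argument alone, and the functions here return strings instead of printing so the behaviour is easy to check. The names mirror the Julia code above; the return-a-list design is an assumption made for illustration.

```python
from functools import singledispatch


@singledispatch
def do_bmi(bmi, age, height):
    # Fallback: treat bmi as a callable taking (age, height).
    return f"My BMI is {bmi(age, height)}"


@do_bmi.register(type(None))
def _(bmi, age, height):
    # No BMI function supplied: skip the block entirely.
    return None


@do_bmi.register(str)
def _(bmi, age, height):
    # A keyword selects one of the predefined formulas.
    if bmi == "standard":
        return do_bmi(lambda a, h: a**2 / h, age, height)
    return f"Unknown BMI keyword {bmi}"


def do_something(age, height, BMI=None):
    # Collect the output lines instead of printing them.
    lines = [f"hi, I am {age} years old and my height is {height}"]
    bmi_line = do_bmi(BMI, age, height)
    if bmi_line is not None:
        lines.append(bmi_line)
    lines.append("bye")
    return lines
```

Calling do_something(20, 170) yields only the greeting and "bye", while do_something(20, 170, BMI="standard") adds the BMI line in between.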

Related

How do I represent sparse arrays in Pari/GP?

I have a function that returns integer values to integer input. The output values are relatively sparse; the function only returns around 2^14 unique outputs for input values 1....2^16. I want to create a dataset that lets me quickly find the inputs that produce any given output.
At present, I'm storing my dataset in a Map of Lists, with each output value serving as the key for a List of input values. This seems slow and appears to use a whole lot of stack space. Is there a more efficient way to create/store/access my dataset?
Added:
It turns out the time taken by my sparsearray() function depends heavily on the ratio of output values (i.e., keys) to input values (values stored in the lists). Here's the time taken for a function that requires many lists, each with only a few values:
? sparsearray(2^16,x->x\7);
time = 126 ms.
Here's the time taken for a function that requires only a few lists, each with many values:
? sparsearray(2^12,x->x%7);
time = 218 ms.
? sparsearray(2^13,x->x%7);
time = 892 ms.
? sparsearray(2^14,x->x%7);
time = 3,609 ms.
As you can see, the time roughly quadruples each time n doubles, so it grows quadratically rather than linearly!
Here's my code:
\\ sparsearray takes two arguments, an integer "n" and a closure "myfun",
\\ and returns a Map() in which each key is a number, and each key is associated
\\ with a List() of the input numbers for which the closure produces that output.
\\ E.g.:
\\ ? sparsearray(10,x->x%3)
\\ %1 = Map([0, List([3, 6, 9]); 1, List([1, 4, 7, 10]); 2, List([2, 5, 8])])
sparsearray(n,myfun=(x)->x)=
{
my(m=Map(),output,oldvalue=List());
for(loop=1,n,
output=myfun(loop);
if(!mapisdefined(m,output),
/* then */
oldvalue=List(),
/* else */
oldvalue=mapget(m,output));
listput(oldvalue,loop);
mapput(m,output,oldvalue));
m
}
To some extent, the behavior you are seeing is to be expected. PARI appears to pass lists and maps by value rather than reference except to the special inbuilt functions for manipulating them. This can be seen by creating a wrapper function like mylistput(list,item)=listput(list,item);. When you try to use this function you will discover that it doesn't work because it is operating on a copy of the list. Arguably, this is a bug in PARI, but perhaps they have their reasons. The upshot of this behavior is each time you add an element to one of the lists stored in the map, the entire list is being copied, possibly twice.
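To see why copying the whole list on every insertion is so costly, here is a rough Python model of it. It is a simplification and not a description of PARI's internals: the function name is hypothetical, and it assumes exactly one full copy of the bucket per insertion. It counts how many elements get copied in total.

```python
def copies_when_list_is_copied(n, f):
    """Count elements copied if every insertion first copies the whole bucket."""
    buckets = {}
    copied = 0
    for x in range(1, n + 1):
        key = f(x)
        old = buckets.get(key, [])
        new = list(old)        # models the by-value copy of the stored list
        copied += len(old)
        new.append(x)
        buckets[key] = new     # store the updated copy back under the key
    return copied


small = copies_when_list_is_copied(2**12, lambda x: x % 7)
large = copies_when_list_is_copied(2**13, lambda x: x % 7)
ratio = large / small
```

With only 7 keys, each bucket grows to about n/7 elements, so the total number of copied elements grows like n^2; in this model the ratio comes out close to 4 when n doubles, which matches the shape of the timings in the question.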
The following is a solution that avoids this issue.
sparsearray(n,myfun=(x)->x)=
{
my(vi=vector(n, i, i)); \\ input values
my(vo=vector(n, i, myfun(vi[i]))); \\ output values
my(perm=vecsort(vo,,1)); \\ obtain order of output values as a permutation
my(list=List(), bucket=List(), key);
for(loop=1, #perm,
if(loop==1||vo[perm[loop]]<>key,
if(#bucket, listput(list,[key,Vec(bucket)]);bucket=List()); key=vo[perm[loop]]);
listput(bucket,vi[perm[loop]])
);
if(#bucket, listput(list,[key,Vec(bucket)]));
Mat(Col(list))
}
The output is a matrix in the same format as a map. If you would rather have a map, it can be converted with Map(...), but you probably want a matrix for processing, since there is no built-in function on a map to get the list of keys.
I did a little bit of reworking of the above to try and make something more akin to GroupBy in C#. (a function that could have utility for many things)
VecGroupBy(v, f)={
my(g=vector(#v, i, f(v[i]))); \\ groups
my(perm=vecsort(g,,1));
my(list=List(), bucket=List(), key);
for(loop=1, #perm,
if(loop==1||g[perm[loop]]<>key,
if(#bucket, listput(list,[key,Vec(bucket)]);bucket=List()); key=g[perm[loop]]);
listput(bucket, v[perm[loop]])
);
if(#bucket, listput(list,[key,Vec(bucket)]));
Mat(Col(list))
}
You would use this like VecGroupBy([1..300],i->i%7).
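For comparison, the sort-then-scan idea behind VecGroupBy can be sketched in Python as follows. This is an illustrative translation, not the GP code itself; group_by is a hypothetical name, and the result is a list of (key, values) pairs rather than a matrix.

```python
def group_by(values, f):
    """Group values by f(value): sort the indices by key, then scan once."""
    values = list(values)
    keys = [f(v) for v in values]
    # Stable sort of the indices by key, like vecsort(g,,1) returning a permutation.
    order = sorted(range(len(values)), key=keys.__getitem__)
    groups = []
    for i in order:
        # Start a new bucket whenever the key changes.
        if not groups or groups[-1][0] != keys[i]:
            groups.append((keys[i], []))
        groups[-1][1].append(values[i])
    return groups
```

Because the sort is stable, the values inside each group keep their original order; the cost is dominated by the O(n log n) sort.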
There is no good native GP solution because of the way garbage collection occurs: passing arguments by reference has to be restricted in GP's memory model (from version 2.13 on, it is supported for function arguments using the ~ modifier, but not for map components).
Here is a solution using the libpari function vec_equiv(), which returns the equivalence classes of identical objects in a vector.
install(vec_equiv,G);
sparsearray(n, f=x->x)=
{
my(v = vector(n, x, f(x)), e = vec_equiv(v));
[vector(#e, i, v[e[i][1]]), e];
}
? sparsearray(10, x->x%3)
%1 = [[0, 1, 2], [Vecsmall([3, 6, 9]), Vecsmall([1, 4, 7, 10]), Vecsmall([2, 5, 8])]]
(you have 3 values corresponding to the 3 given sets of indices)
The behaviour is linear as expected
? sparsearray(2^20,x->x%7);
time = 307 ms.
? sparsearray(2^21,x->x%7);
time = 670 ms.
? sparsearray(2^22,x->x%7);
time = 1,353 ms.
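The shape of that result (one vector of distinct values plus one vector of index classes) can be mimicked in Python with hash buckets that are appended to in place. This is a sketch of the data layout only, not of how vec_equiv() is implemented; sparse_index is a hypothetical name.

```python
from collections import defaultdict


def sparse_index(n, f):
    """Return (keys, classes); classes[i] lists the inputs x in 1..n with f(x) == keys[i]."""
    buckets = defaultdict(list)
    for x in range(1, n + 1):
        buckets[f(x)].append(x)   # in-place append, so nothing is copied
    keys = sorted(buckets)
    return keys, [buckets[k] for k in keys]


keys, classes = sparse_index(10, lambda x: x % 3)
```

Each input is touched once, so building the buckets is linear in n; only the final sort of the distinct keys adds extra work.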
Use mapput, mapget and mapisdefined methods on a map created with Map(). If multiple dimensions are required, then use a polynomial or vector key.
I guess that is what you are already doing, and I'm not sure there is a better way. Do you have some code? From personal experience, 2^16 values with 2^14 keys should not be an issue with regards to speed or memory - there may be some unnecessary copying going on in your implementation.

What is the "best practices" way to check if optional arguments are used in a function call in julia

In python I might have a function like this:
def sum_these(x, y=None):
if y is None:
y = 1
return x + y
What is the equivalent use in julia? To be exact I know I could probably do:
function sum_these(x, y=0)
if y == 0
y = 1
end
x + y
end
However I'd rather not use zero, instead some value with the same meaning as None in python
EDIT
Just for clarity's sake, the result of these example functions isn't important. The events (if any) that happen when y is None are important, and for my cases setting y=[some number] is undesirable. After some searching, I think that, unless someone provides a better solution, the way to go is something like:
function sum_these(x, y=nothing)
if y == nothing
do stuff
end
return something
end
That's not only perfectly fine, but because nothing is a singleton of type Nothing (called Void before Julia 0.7), the y == nothing check will actually compile away, so the if statement has no runtime cost here. I talk about this in depth in a blog post, but what it really means is that function auto-specialization allows checks against nothing to always be free in type-stable/inferrable functions.
However, you may want to consider splitting this into two different functions:
function sum_these(x)
return something
end
function sum_these(x, y)
do stuff
return something
end
Of course, this is just a style difference and the right choice is determined by how much code is shared in the return something.
Maybe use multiple dispatch with an empty fallback?
function f(x, y=nothing)
...
do_something(x, y)
...
return something
end
do_something(x, y) = nothing
function do_something(x, y::Nothing)  # Nothing was called Void before Julia 0.7
...
end
Add other relevant variables to do_something as necessary, and have it return something or mutate its arguments as necessary.
Your question is a bit confusing. It seems like what you want is
function sum_these(x, y=1)
return x + y
end
But that doesn't quite do what you are asking either, since even if you call sum_these(3, 0) in your example, it replaces 0 with 1.
Also in Python, I would use
def sum_these(x, y=1):
return x + y
Perhaps I misunderstand your question.
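The pitfall with using 0 as the "missing" marker, mentioned in this answer, can be made concrete with a small Python comparison. The two function names are hypothetical, chosen only to tell the variants apart.

```python
def sum_these_zero_marker(x, y=0):
    # 0 doubles as the "not supplied" marker, so an explicit 0 is lost.
    if y == 0:
        y = 1
    return x + y


def sum_these_none_marker(x, y=None):
    # None is never a valid number here, so an explicit 0 survives.
    if y is None:
        y = 1
    return x + y
```

With the zero marker, sum_these_zero_marker(3, 0) silently becomes 3 + 1, whereas sum_these_none_marker(3, 0) returns 3 and only falls back to 1 when y is omitted. This is the same reason a dedicated value such as nothing is preferable in Julia.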

Add my custom loss function to torch

I want to add a loss function to torch that calculates the edit distance between predicted and target values.
Is there an easy way to implement this idea?
Or do I have to write my own class with backward and forward functions?
If your criterion can be represented as a composition of existing modules and criteria, it's a good idea to simply construct such a composition using containers. The only problem is that standard containers are designed to work with modules only, not criteria. The difference is in the :forward method signature:
module:forward(input)
criterion:forward(input, target)
Luckily, we are free to define our own container which is able to work with criteria too. For example, sequential:
local GeneralizedSequential, _ = torch.class('nn.GeneralizedSequential', 'nn.Sequential')
function GeneralizedSequential:forward(input, target)
return self:updateOutput(input, target)
end
function GeneralizedSequential:updateOutput(input, target)
local currentOutput = input
for i=1,#self.modules do
currentOutput = self.modules[i]:updateOutput(currentOutput, target)
end
self.output = currentOutput
return currentOutput
end
Below is an illustration of how to implement nn.CrossEntropyCriterion having this generalized sequential container:
function MyCrossEntropyCriterion(weights)
local criterion = nn.GeneralizedSequential() -- local, to avoid creating a global
criterion:add(nn.LogSoftMax())
criterion:add(nn.ClassNLLCriterion(weights))
return criterion
end
Check whether everything is correct:
output = torch.rand(3,3)
target = torch.Tensor({1, 2, 3})
mycrit = MyCrossEntropyCriterion()
-- print(mycrit)
print(mycrit:forward(output, target))
print(mycrit:backward(output, target))
crit = nn.CrossEntropyCriterion()
-- print(crit)
print(crit:forward(output, target))
print(crit:backward(output, target))
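The container idea itself is independent of Torch. As a rough, framework-free sketch in Python (all class names here are hypothetical and only the forward pass is modelled, with no backward), every stage receives both input and target, and plain modules simply ignore the target:

```python
import math


class LogSoftMax:
    def forward(self, inputs, target=None):
        # A "module": uses only the input and ignores the target.
        m = max(inputs)
        lse = m + math.log(sum(math.exp(x - m) for x in inputs))
        return [x - lse for x in inputs]


class ClassNLL:
    def forward(self, log_probs, target):
        # A "criterion": needs the target (a 0-based class index here).
        return -log_probs[target]


class GeneralizedSequential:
    def __init__(self, *stages):
        self.stages = list(stages)

    def forward(self, inputs, target):
        # Pass the target along to every stage, as in the Lua container above.
        out = inputs
        for stage in self.stages:
            out = stage.forward(out, target)
        return out


scores = [1.0, 2.0, 3.0]
loss = GeneralizedSequential(LogSoftMax(), ClassNLL()).forward(scores, 2)
```

In Lua the extra target argument is silently dropped by functions that do not declare it; in this Python sketch the modules have to accept it explicitly as an optional parameter.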
Just to add to the accepted answer, you have to be careful that the loss function you define (edit distance in your case) is differentiable with respect to the network parameters.
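To illustrate why this matters for edit distance in particular, here is a standard dynamic-programming Levenshtein distance sketched in Python (not Torch code; edit_distance is a hypothetical helper). It works on discrete sequences and returns an integer, so it is piecewise constant with respect to continuous network outputs and provides no useful gradient by itself.

```python
def edit_distance(a, b):
    """Levenshtein distance between sequences a and b, keeping one DP row at a time."""
    prev = list(range(len(b) + 1))          # distance from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # deleting the first i items of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]
```

Using it for training would therefore need some differentiable surrogate or a method that does not rely on gradients of the loss; which one is appropriate depends on the task.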

Parametric Type Creation

I'm struggling to understand parametric type creation in julia. I know that I can create a type with the following:
type EconData
values
dates::Array{Date}
colnames::Array{ASCIIString}
function EconData(values, dates, colnames)
if size(values, 1) != size(dates, 1)
error("Date/data dimension mismatch.")
end
if size(values, 2) != size(colnames, 2)
error("Name/data dimension mismatch.")
end
new(values, dates, colnames)
end
end
ed1 = EconData([1;2;3], [Date(2014,1), Date(2014,2), Date(2014,3)], ["series"])
However, I can't figure out how to specify how values will be typed. It seems reasonable to me to do something like
type EconData{T}
values::Array{T}
...
function EconData(values::Array{T}, dates, colnames)
...
However, this (and similar attempts) simply produce an error:
ERROR: `EconData{T}` has no method matching EconData{T}(::Array{Int64,1}, ::Array{Date,1}, ::Array{ASCIIString,2})
How can I specify the type of values?
The answer is that things get funky with parametric types and inner constructors. In fact, I think it's probably the most confusing thing in Julia. The immediate solution is to provide a suitable outer constructor:
using Dates
type EconData{T}
values::Vector{T}
dates::Array{Date}
colnames::Array{ASCIIString}
function EconData(values, dates, colnames)
if size(values, 1) != size(dates, 1)
error("Date/data dimension mismatch.")
end
if size(values, 2) != size(colnames, 2)
error("Name/data dimension mismatch.")
end
new(values, dates, colnames)
end
end
EconData{T}(v::Vector{T},d,n) = EconData{T}(v,d,n)
ed1 = EconData([1,2,3], [Date(2014,1), Date(2014,2), Date(2014,3)], ["series"])
What also would have worked is to have done
ed1 = EconData{Int}([1,2,3], [Date(2014,1), Date(2014,2), Date(2014,3)], ["series"])
My explanation might be wrong, but I think the problem is that there is no parametric type constructor method made by default, so you have to call the constructor for a specific instantiation of the type (my second version) or add the outer constructor yourself (first version).
Some other comments: you should be explicit about dimensions, i.e. if all your fields are vectors (1D), use Vector{T} or Array{T,1}, and if they are matrices (2D) use Matrix{T} or Array{T,2}. Make it parametric on the dimension if you need to. If you don't, slow code could be generated because functions using this type aren't really sure about the actual data structure until runtime, so will have lots of checks.

OCaml: Does storing some values to be used later introduce "side effects"?

For a homework assignment, we've been instructed to complete a task without introducing any "side-effects". I've looked up "side-effects" on Wikipedia, and though I get that in theory it means "modifies a state or has an observable interaction with calling functions", I'm having trouble figuring out specifics.
For example, would creating a value that holds a non-compile time result be introducing side effects?
Say I had (might not be syntactically perfect):
val myList = (someFunction x y);;
if List.exists ((=) 7) myList then true else false;;
Would this introduce side-effects? I guess maybe I'm confused on what "modifies a state" means in the definition of side-effects.
No; a side-effect refers to e.g. mutating a ref cell with the assignment operator :=, or other things where the value referred to by a name changes over time. In this case, myList is an immutable value that never changes during the program, thus it is effect-free.
See also
http://en.wikipedia.org/wiki/Referential_transparency_(computer_science)
A good way to think about it is "have I changed anything which any later code (including running this same function again later) could ever possibly see other than the value I'm returning?" If so, that's a side effect. If not, then you can know that there isn't one.
So, something like:
let inc_nosf v = v+1
has no side effects because it just returns a new value which is one more than an integer v. So if you run the following code in the ocaml toplevel, you get the corresponding results:
# let x = 5;;
val x : int = 5
# inc_nosf x;;
- : int = 6
# x;;
- : int = 5
As you can see, the value of x didn't change. So, since we didn't save the return value, then nothing really got incremented. Our function itself only modifies the return value, not x itself. So to save it into x, we'd have to do:
# let x = inc_nosf x;;
val x : int = 6
# x;;
- : int = 6
This works because the inc_nosf function has no side effects (that is, it only communicates with the outside world using its return value, not by making any other changes).
But something like:
let inc_sf r = r := !r+1
has side effects because it changes the value stored in the reference represented by r. So if you run similar code in the top level, you get this, instead:
# let y = ref 5;;
val y : int ref = {contents = 5}
# inc_sf y;;
- : unit = ()
# y;;
- : int ref = {contents = 6}
So, in this case, even though we still don't save the return value, it got incremented anyway. That means there must have been changes to something other than the return value. In this case, that change was the assignment using := which changed the stored value of the ref.
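The same contrast can be reproduced in Python as a quick sanity check of the idea. This is only an analogy: a one-element list stands in for an OCaml ref cell, and the names mirror the OCaml functions above.

```python
def inc_nosf(v):
    # Pure: the only observable result is the returned value.
    return v + 1


def inc_sf(cell):
    # Impure: mutates the caller's one-element list and returns None.
    cell[0] += 1


x = 5
result = inc_nosf(x)   # x itself is untouched

y = [5]
inc_sf(y)              # y's contents change even though nothing is returned
```

After running this, x is still 5 and result is 6, while y has become [6]: the change to y is visible to later code without passing through a return value, which is exactly what makes it a side effect.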
As a good rule of thumb, in OCaml, if you avoid using refs, records with mutable fields, classes, strings, arrays, and hash tables, then you will avoid any risk of side effects from mutation. You can safely use string literals as long as you avoid modifying the string in place using functions like String.set or String.fill (in recent OCaml versions strings are immutable and such in-place operations live in the Bytes module). Basically, any function which can modify a data type in place will cause a side effect.
