I'm upgrading a NetLogo model from v5.3.1 to v6.0.1. In the model, I have a series of lists that I combine and manipulate using the map primitive. I've tried to update the code to use the new anonymous procedures, but I can't quite figure it out. I was using the ? syntax, but ? is no longer defined.
Original code:
C, WC-Alpha, A, and Z are all lists
alpha is a constant
set C-alpha map [? ^ (- alpha)] C ;creates a vector of C^-alpha
set R map [? * (A * Z)] WC-alpha ;creates R vector
Have you had a look at the dictionary entry for map? It shows the new syntax, where essentially you define the variable to be used by map. For example, yours might look like:
set C-alpha map [ i -> i ^ (- alpha) ] C
where you explicitly state that you will be using i as the variable for the mapping operation. This allows for more readable code in map and other anonymous procedures.
As a way to get to know Julia, I am writing a script that converts a Python Keras (v1.1.0) model to a Julia Flux model, and I am struggling with implementing regularization (I have read https://fluxml.ai/Flux.jl/stable/models/regularisation/).
So, in Keras's json model I have something like: "W_regularizer": {"l2": 0.0010000000474974513, "name": "WeightRegularizer", "l1": 0.0} for each Dense layer. I want to use these coefficients to create regularization in the Flux model. The problem is that, in Flux it is added directly to the loss instead of being defined as a property of the layer itself.
To avoid posting too much code here, I've added it to the repo. Here is a small script that takes the json and creates Flux's Chain: https://github.com/iegorval/Keras2Flux.jl/blob/master/Keras2Flux/src/Keras2Flux.jl
Now, I want to create a penalty for each Dense layer with the predefined l1/l2 coefficient. I tried to do it like this:
using Pkg
pkg"activate /home/username/.julia/dev/Keras2Flux"
using Flux
using Keras2Flux
using LinearAlgebra
function get_penalty(model::Chain, regs::Array{Any, 1})
    index_model = 1
    index_regs = 1
    penalties = []
    for layer in model
        if layer isa Dense
            penalty(m) = regs[index_regs](m[index_model].W)
            push!(penalties, penalty)
            index_regs += 1
        end
        index_model += 1
    end
    total_penalty(m) = sum([p(m) for p in penalties])
    return total_penalty
end
model, regs = convert_keras2flux("examples/keras_1_1_0.json")
penalty = get_penalty(model, regs)
So, I create a penalty function for each Dense layer and then sum it up to the total penalty. However, it gives me this error:
ERROR: LoadError: BoundsError: attempt to access 3-element Array{Any,1} at index [4]
I understand what it means but I really don't understand how to fix it. So, it seems that when I call total_penalty(model), it uses index_regs == 4 (so, the values of index_regs and index_model as they are AFTER the for-cycle). Instead, I want to use their actual indices that I had while pushing the given penalty to the list of penalties.
On the other hand, if I did it not as a list of functions but as a list of values, it also would not be correct, because I will define the loss as:
loss(x, y) = binarycrossentropy(model(x), y) + total_penalty(model)
If I were to use just a list of values, then I would have a static total_penalty, while it should be recalculated for every Dense layer every time during model training.
I would be thankful if somebody with Julia experience could give me some advice, because I am definitely failing to understand how it works in Julia and, specifically, in Flux. How would I create a total_penalty that is recalculated automatically during training?
There are a couple parts to your question, and since you are new to Flux (and Julia?), I will answer in steps. But I suggest the solution at the end as a cleaner way to handle this.
First, there is the issue of p(m) calculating the penalty using index_regs and index_model as the values after the for-loop. This is because of the scoping rules in Julia. When you define the closure penalty(m) = regs[index_regs](m[index_model].W), index_regs refers to the variable defined in get_penalty, not to the value it held at that moment. So, as index_regs changes, so does the output of p(m). The other issue is the naming of the function as penalty(m). Every time you run this line, you are redefining penalty, and with it every reference to penalty that you pushed onto penalties. Instead, you should prefer to create an anonymous function. Here is how we incorporate these changes:
function get_penalty(model::Chain, regs::Array{Any, 1})
    index_model = 1
    index_regs = 1
    penalties = []
    for layer in model
        if layer isa Dense
            penalty = let i = index_regs, index_model = index_model
                m -> regs[i](m[index_model].W)
            end
            push!(penalties, penalty)
            index_regs += 1
        end
        index_model += 1
    end
    total_penalty(m) = sum([p(m) for p in penalties])
    return total_penalty
end
I used i and index_model in the let block to drive home the scoping rules. I'd encourage you to replace the anonymous function in the let block with global penalty(m) = ... (and remove the assignment to penalty before the let block) to see the difference of using anonymous vs named functions.
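The capture behaviour can also be reproduced without Flux. The following is only a minimal sketch; make_late, make_early and counter are made-up names for illustration. The first function captures the loop counter directly, the second copies it into a fresh let binding before building the closure.

```julia
# Closures capture the variable itself, not the value it had at creation time.
function make_late()
    counter = 1
    fs = []
    for _ in 1:3
        push!(fs, () -> counter)   # every closure shares the same `counter`
        counter += 1
    end
    return fs
end

# A `let` block introduces a fresh binding, so each closure keeps its own value.
function make_early()
    counter = 1
    fs = []
    for _ in 1:3
        f = let i = counter
            () -> i
        end
        push!(fs, f)
        counter += 1
    end
    return fs
end

late = [f() for f in make_late()]    # all closures see the final value of `counter`
early = [f() for f in make_early()]  # each closure sees the value at creation time
```

Here late holds the value that counter has after the loop in every position, which is the same effect that produces the out-of-bounds index in the question, while early holds the three distinct values.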
But, if we go back to your original issue, you want to calculate the regularization penalty for your model using the stored coefficients. Ideally, these would be stored with each Dense layer as in Keras. You can recreate the same functionality in Flux:
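using Flux, Functors, LinearAlgebra  # Functors.jl provides @functor; LinearAlgebra provides norm

struct RegularizedDense{T, LT<:Dense}
    layer::LT  # the wrapped Dense layer
    w_l1::T    # L1 coefficient
    w_l2::T    # L2 coefficient
end

@functor RegularizedDense

(l::RegularizedDense)(x) = l.layer(x)

penalty(l) = 0
penalty(l::RegularizedDense) =
    l.w_l1 * norm(l.layer.W, 1) + l.w_l2 * norm(l.layer.W, 2)
penalty(model::Chain) = sum(penalty(layer) for layer in model)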
Then, in your Keras2Flux source, you can redefine get_regularization to return w_l1_reg and w_l2_reg instead of functions. And in create_dense you can do:
function create_dense(config::Dict{String,Any}, prev_out_dim::Int64=-1)
    # ... code you have already written
    dense = Dense(in, out, activation; initW = init, initb = zeros)
    w_l1, w_l2 = get_regularization(config)
    return RegularizedDense(dense, w_l1, w_l2)
end
Lastly, you can compute your loss function like so:
loss(x, y, m) = binarycrossentropy(m(x), y) + penalty(m)
# ... later for training
train!((x, y) -> loss(x, y, m), training_data, params)
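We define loss as a function of (x, y, m), rather than having it refer to a global model, to avoid performance issues: accessing non-constant global variables is slow in Julia.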
So, in the end, this approach is cleaner because after model construction, you don't need to pass around an array of regularization functions and figure out how to index each function correctly with the corresponding dense layer.
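The idea of letting each layer carry its own coefficients can be tried out without Flux. What follows is a rough sketch under stated assumptions: PlainLayer, RegLayer, layer_penalty and model_penalty are hypothetical stand-ins invented for this example (they are not Flux types or functions), and the penalty formula simply mirrors the one used above.

```julia
using LinearAlgebra  # for norm

# Hypothetical stand-ins for Flux layers, used only for this sketch.
struct PlainLayer
    W::Matrix{Float64}
end

struct RegLayer
    W::Matrix{Float64}
    w_l1::Float64
    w_l2::Float64
end

# Layers without coefficients contribute nothing.
layer_penalty(l) = 0.0
# Regularized layers read their own coefficients; no external indexing is needed.
layer_penalty(l::RegLayer) = l.w_l1 * norm(l.W, 1) + l.w_l2 * norm(l.W, 2)
# The model penalty is recomputed from the current weights on every call.
model_penalty(layers) = sum(layer_penalty(l) for l in layers)

layers = (RegLayer([3.0 4.0], 0.0, 0.5), PlainLayer([1.0 1.0]), RegLayer([1.0 -2.0], 1.0, 0.0))
total = model_penalty(layers)

layers[1].W[1, 1] = 0.0            # simulate a weight update during training
updated = model_penalty(layers)    # the penalty reflects the new weights
```

Because the penalty is a function of the layers rather than a stored number, calling it again after the weights change gives a different result, which is the behaviour the question asks for.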
If you prefer to keep the regularizer and model separate (i.e. have standard Dense layers in your model chain), then you can do that too. Let me know if you want that solution, but I'll leave it out for now.
I have a function which converts a Map k v (from Data.Map.Strict in containers) to a regular function (k -> v). The code is of the following form:
import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)
funcFromMap :: (Ord k) => Map.Map k v -> k -> v
funcFromMap map = (\k -> fromMaybe (error "error message") $ Map.lookup k map)
When I run my application with time profiling, this function takes the top spot at ~40% total time. This is surprising, because it only gets called on the result of a fold that performs dynamic programming computations that I would expect to be significantly more expensive than funcFromMap. Is writing a function of the above form generally a bad idea for some reason?
p.s: The rest of my code is designed to avoid lookups for keys that aren't in the map, so I think this implementation should at least be safe.
Note that you aren't turning a Map into a function. You're turning a function (namely Map.lookup) into another function. Haskell programming is all about turning functions into other functions, so if that was inefficient, we'd all be in a lot of trouble!
In short, there's nothing wrong with funcFromMap (except that it already exists as the function (!), as @chi pointed out), and there's no reason it should be inefficient.
First, make sure you're reading the "individual" column instead of the "inherited" column in the profile output. The "individual" column gives the time actually spent in the function itself.
If the "individual" column genuinely says 40%, then what's happened is that lookup has been inlined into your funcFromMap, and for some reason the actual map lookups are very expensive in your application. I think we'd need to see a minimal example that illustrates the problem to say why.
Does anyone know why Julia chose a design in which a function cannot modify the parameters it is given? If we want that behavior anyway, we have to go through a very artificial process and represent the data as a ridiculous single-element array.
Ada, which had the same kind of limitation, abandoned it in its 2012 redesign to the great satisfaction of its users. A small keyword (like out in Ada) could very well indicate that the modifications made to a parameter should be kept on output.
From my experience in Julia it is useful to understand the difference between a value and a binding.
Each value in Julia has a concrete type and a location in memory. A value can be mutable or immutable. In particular, when you define your own composite type you can decide whether objects of this type should be mutable (mutable struct) or immutable (struct).
Of course Julia has built-in types; some of them are mutable (e.g. arrays) and others are immutable (e.g. numbers, strings). There are design trade-offs between them. From my perspective, two major benefits of immutable values are:
if a compiler works with immutable values it can perform many optimizations to speed up code;
a user can be sure that passing an immutable value to a function will not change it, and such encapsulation can simplify code analysis.
However, if you want to wrap an immutable value in a mutable wrapper, a standard way to do it is to use Ref, like this:
julia> x = Ref(1)
Base.RefValue{Int64}(1)

julia> x[]
1

julia> x[] = 10
10

julia> x
Base.RefValue{Int64}(10)

julia> x[]
10
You can pass such values to a function and modify them inside. Of course Ref introduces a different type so method implementation has to be a bit different.
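As a small illustration of passing a Ref to a function, here is a sketch in which increment! is a made-up helper name, not part of Julia's standard library:

```julia
# Hypothetical helper: increments the value stored in a Ref in place.
function increment!(counter::Ref{Int})
    counter[] += 1
    return nothing
end

c = Ref(0)
increment!(c)
increment!(c)
value = c[]   # the caller sees both updates
```

The function never rebinds c; it only changes the contents of the mutable wrapper that c refers to, which is why the change is visible outside.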
A variable is a name bound to a value. In general, except for some special cases like:
rebinding a variable from module A in module B;
redefining some constants, e.g. trying to reassign a function name with a non-function value;
rebinding a variable that has a specified type of allowed values with a value that cannot be converted to this type;
you can rebind a variable to point to any value you wish. Rebinding is performed most of the time using = or some special constructs (like in for, let or catch statements).
Now, getting to the point: a function is passed a value, not a binding. You can modify the binding of a function parameter (in other words, you can rebind the parameter to a different value), but this parameter is a fresh variable whose scope lies inside the function.
If, for instance, we wanted a call like:
x = 10
f(x)
to change the binding of variable x, this would be impossible because f does not even know of the existence of x. It only gets passed its value. In particular, as I have noted above, adding such functionality would break the rule that module A cannot rebind variables from module B, as f might be defined in a module different from the one where x is defined.
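The difference between rebinding a parameter and mutating the value it names can be seen in a short sketch. The function names rebind and mutate! are invented for this example:

```julia
# Rebinding a parameter only changes what the local name refers to.
function rebind(v)
    v = [0, 0, 0]   # `v` now names a new array; the caller's array is untouched
    return nothing
end

# Mutating the value that the parameter names is visible to the caller.
function mutate!(v)
    v[1] = 99
    return nothing
end

a = [1, 2, 3]
rebind(a)
after_rebind = copy(a)   # snapshot: `a` is unchanged by rebind
mutate!(a)               # `a` itself now has a different first element
```

After both calls, a is still bound to the same array object. Only its contents changed, and only through mutate!.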
What to do
Actually it is easy enough to work without this feature from my experience:
What I typically do is simply return a value from a function that I assign to a variable. In Julia it is very easy because of tuple unpacking syntax like e.g. x,y,z = f(x,y,z), where f can be defined e.g. as f(x,y,z) = 2x,3y,4z;
You can use macros, which are expanded before code execution and can therefore modify the binding of a variable, e.g. macro plusone(x) return esc(:($x = $x+1)) end; writing y=100; @plusone(y) will then change the binding of y;
Finally you can use Ref as discussed above (or any other mutable wrapper - as you have noted in your question).
"Does anyone know the reasons why Julia chose a design of functions where the parameters given as inputs cannot be modified?" asked by Schemer
Your question is wrong because you assume the wrong things.
Parameters are variables
When you pass things to a function, often those things are values and not variables.
for example:
function double(x::Int64)
    2 * x
end
Now what happens when you call it with a literal value rather than a variable, for example double(4)?
There is no point in the function modifying its parameter x. Furthermore, the function has no idea how it is called.
Furthermore, Julia is built for speed.
A function that modifies its parameters will be hard to optimise because it causes side effects. A side effect is when a procedure/function changes objects/things outside of its scope.
If a function does not modify a variable passed as one of its arguments, then you can be sure that:
the variable will not have its value changed
the result of the function can be optimised to a constant
not calling the function will not break the program's behaviour
Those three properties are a large part of what makes code written in a FUNCTIONAL style easier to optimise than code that relies on side effects.
Furthermore, when you move into parallel or multi-threaded programming, you absolutely DO NOT WANT a variable having its value changed without you (the programmer) knowing about it.
"How would you implement with your proposed macro, the function F(x) which returns a boolean value and modifies c by c:= c + 1. F can be used in the following piece of Ada code : c:= 0; While F(c) Loop ... End Loop;" asked by Schemer
I would write
function F(x)
    boolean_result = perform_some_logic()
    return (boolean_result, x+1)
end

flag = true
c = 0
(flag,c) = F(c)
while flag
    global flag, c   # needed when this runs at top level in a script
    (flag,c) = F(c)
end
"Unfortunately no, because, and I should have said that, c has to take again the value 0 when F return the value False (c increases as long the Loop lives and return to 0 when it dies). " said Schemer
Then I would write
function F(x)
    boolean_result = perform_some_logic()
    if boolean_result
        return (true, x+1)
    end
    return (false, 0)
end

flag = true
c = 0
(flag,c) = F(c)
while flag
    global flag, c   # needed when this runs at top level in a script
    (flag,c) = F(c)
end
One reason that pushes me away from functional languages like Lisp is that I have no idea how to do a 'raw' array iteration. Say I have an array in C that represents the screen pixels' RGB values. Changing colors is trivial with a for loop in C, but how do you do this elegantly in Lisp?
Sorry, I haven't phrased my question correctly.
In C, when I want to change color on the screen, I simply write a for loop over a part of the array.
But in Scheme, Clojure or Haskell all data is immutable. So when I change a part of a matrix, I get back a brand new matrix. That's a bit awkward. Is there a 'clean' way to change the color of a part of a matrix without recursing over the whole array and making copies?
In a functional language, you would use recursion.
The recursion scheme can be named.
For example, to recurse over an array of data, applying a function to each pixel, you can manually recurse over the structure of the array:
map f [] = []
-- the empty array
map f (x:xs) = f x : map f xs
-- apply f to the head of the array, and loop on the tail.
(in Haskell).
This recursive form is so common it is called map in most libraries.
To "iterate" through an array in some language like Lisp is a simple map.
The structure is (map f x) where f is a function you want applied to every element of the list/array x.
I'm new to OCaml, and I'd like to implement Gaussian Elimination as an exercise. I can easily do it with a stateful algorithm, meaning keep a matrix in memory and recursively operating on it by passing around a reference to it.
This statefulness, however, smacks of imperative programming. I know there are capabilities in OCaml to do this, but I'd like to ask if there is some clever functional way I haven't thought of first.
OCaml arrays are mutable, and it's hard to avoid treating them just like arrays in an imperative language.
Haskell has immutable arrays, but from my (limited) experience with Haskell, you end up switching to monadic, mutable arrays in most cases. Immutable arrays are probably amazing for certain specific purposes. I've always imagined you could write a beautiful implementation of dynamic programming in Haskell, where the dependencies among array entries are defined entirely by the expressions in them. The key is that you really only need to specify the contents of each array entry one time. I don't think Gaussian elimination follows this pattern, and so it seems it might not be a good fit for immutable arrays. It would be interesting to see how it works out, however.
You can use a Map to emulate a matrix. The key would be a pair of integers referencing the row and column. You'll want to use your own get x y function to ensure x < n and y < n though, instead of accessing the Map directly. (edit) You can use the compare function in Pervasives directly.
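module OrderedPairs = struct
  type t = int * int
  (* polymorphic compare (Pervasives.compare; Stdlib.compare in current OCaml) *)
  let compare = compare
end
module Pairs = Map.Make (OrderedPairs)

let get_ n set x y =
  assert (x < n && y < n);
  Pairs.find (x, y) set

let set_ n set x y v =
  assert (x < n && y < n);
  (* Map.add takes the key, then the value, then the map *)
  Pairs.add (x, y) v set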
Actually, having a general set of functions (get x y and set x y at a minimum), without specifying the implementation, would be an even better option. The functions then can be passed to the function, or be implemented in a module through a functor (a better solution, but having a set of functions just doing what you need would be a first step since you're new to OCaml). In this way you can use a Map, Array, Hashtbl, or a set of functions to access a file on the hard-drive to implement the matrix if you wanted. This is the really important aspect of functional programming; that you trust the interface over exploiting the side-effects, and not worry about the underlying implementation --since it's presumed to be pure.
The answers so far are using/emulating mutable data-types, but what does a functional approach look like?
To see, let's decompose the problem into some functional components:
Gaussian elimination involves a sequence of row operations, so it is useful first to define a function taking 2 rows and scaling factors, and returning the resultant row operation result.
The row operations we want should eliminate a variable (column) from a particular row, so let's define a function which takes a pair of rows and a column index and uses the previously defined row operation to return the modified row with that column entry set to zero.
Then we define two functions, one to convert a matrix into triangular form, and another to back-substitute a triangular matrix to the diagonal form (using the previously defined functions) by eliminating each column in turn. We could iterate or recurse over the columns, and the matrix could be defined as a list, vector or array of lists, vectors or arrays. The input is not changed, but a modified matrix is returned, so we can finally do:
let out_matrix = to_diagonal (to_triangular in_matrix);
What makes it functional is not whether the data types (array or list) are mutable, but how they are used. This approach may not be particularly 'clever' or the most efficient way to do Gaussian elimination in OCaml, but using pure functions lets you express the algorithm cleanly.