I made a simple recursive function, and expected it to work (but it doesn't):
open System
open System.Threading
let f =
let r = Random()
let rec d =
printfn "%d" (r.Next())
Thread.Sleep(1000)
d
d
f
With the help of Intellisense I ended up with the following working function (but without understanding why previous function didn't work):
open System
open System.Threading
let f : unit =
let r = Random()
let rec d() =
printfn "%d" (r.Next())
Thread.Sleep(1000)
d()
d()
f
So why do I need to explicitly state unit and ()?
In the first version, you declared a recursive object (let rec d), or a value. You're saying that the d object is recursive, but how an object could be recursive? How does it call itself? Of course, this doesn't make sense.
It's not possible to use recursive objects in F# and this is the reason why your first version doesn't work.
In the second version, you declared a recursive function (let rec d()). Adding (), you're explicitly stating that d is a function.
Furthermore you explicitly stated, with unit, that the function f (called just once) will not return anything, or, at least, you're saying that f will return a value of a not specific type. In F#, even the simplest functions must always return a value.
In your case, F# will try to infer the type that f will return. Because there's no specific type annotation and your f is not doing something (like a calculation) that will return a specific value using a specific type, the F# compiler will assign a generic return type to f, but your code is still ambiguous and you have to specify the unit type (the simplest type that a F# function could return) to be more specific.
The value restriction error is related indeed to F#'s powerful type inference. Please have a look at this interesting article about this error.
In your first attempt, you define not a function, but a value. The value d is defined in terms of itself - that is, in order to know what d is, you need to first know what d is. No wonder it doesn't work!
To make this a bit more clear, I will point out that your definition is of the same kind as this:
let x = x
Would you expect this to work?
In your second attempt, you gave d a parameter. It is the parameter that made it a function and not a value. Compare:
let rec x() = x()
This will still cause a stack overflow when executed, but at least it will compile: it's a function that unconditionally calls itself.
You didn't have to give it specifically a unit parameter, any parameter would do. You could have made it a number, a string, or even a generic type. It's just that unit is the simplest option when you don't care what it is.
And you didn't actually need to annotate f with a type. That was an extraneous step.
In conclusion, I'd like to point out that even in your second code block, f is still a value, not a function. In practical terms it means that the code inside f will be executed just once, when f is defined, and not every time you mention f as part of some other expression, which is apparently what you intuitively expect.
Related
Does anyone know the reasons why Julia chose a design of functions where the parameters given as inputs cannot be modified? This requires, if we want to use it anyway, to go through a very artificial process, by representing these data in the form of a ridiculous single element table.
Ada, which had the same kind of limitation, abandoned it in its 2012 redesign to the great satisfaction of its users. A small keyword (like out in Ada) could very well indicate that the possibility of keeping the modifications of a parameter at the output is required.
From my experience in Julia it is useful to understand the difference between a value and a binding.
Values
Each value in Julia has a concrete type and location in memory. Value can be mutable or immutable. In particular when you define your own composite type you can decide if objects of this type should be mutable (mutable struct) or immutable (struct).
Of course Julia has in-built types and some of them are mutable (e.g. arrays) and other are immutable (e.g. numbers, strings). Of course there are design trade-offs between them. From my perspective two major benefits of immutable values are:
if a compiler works with immutable values it can perform many optimizations to speed up code;
a user is can be sure that passing an immutable to a function will not change it and such encapsulation can simplify code analysis.
However, in particular, if you want to wrap an immutable value in a mutable wrapper a standard way to do it is to use Ref like this:
julia> x = Ref(1)
Base.RefValue{Int64}(1)
julia> x[]
1
julia> x[] = 10
10
julia> x
Base.RefValue{Int64}(10)
julia> x[]
10
You can pass such values to a function and modify them inside. Of course Ref introduces a different type so method implementation has to be a bit different.
Variables
A variable is a name bound to a value. In general, except for some special cases like:
rebinding a variable from module A in module B;
redefining some constants, e.g. trying to reassign a function name with a non-function value;
rebinding a variable that has a specified type of allowed values with a value that cannot be converted to this type;
you can rebind a variable to point to any value you wish. Rebinding is performed most of the time using = or some special constructs (like in for, let or catch statements).
Now - getting to the point - function is passed a value not a binding. You can modify a binding of a function parameter (in other words: you can rebind a value that a parameter is pointing to), but this parameter is a fresh variable whose scope lies inside a function.
If, for instance, we wanted a call like:
x = 10
f(x)
change a binding of variable x it is impossible because f does not even know of existence of x. It only gets passed its value. In particular - as I have noted above - adding such a functionality would break the rule that module A cannot rebind variables form module B, as f might be defined in a module different than where x is defined.
What to do
Actually it is easy enough to work without this feature from my experience:
What I typically do is simply return a value from a function that I assign to a variable. In Julia it is very easy because of tuple unpacking syntax like e.g. x,y,z = f(x,y,z), where f can be defined e.g. as f(x,y,z) = 2x,3y,4z;
You can use macros which get expanded before code execution and thus can have an effect modifying a binding of a variable, e.g. macro plusone(x) return esc(:($x = $x+1)) end and now writing y=100; #plusone(y) will change the binding of y;
Finally you can use Ref as discussed above (or any other mutable wrapper - as you have noted in your question).
"Does anyone know the reasons why Julia chose a design of functions where the parameters given as inputs cannot be modified?" asked by Schemer
Your question is wrong because you assume the wrong things.
Parameters are variables
When you pass things to a function, often those things are values and not variables.
for example:
function double(x::Int64)
2 * x
end
Now what happens when you call it using
double(4)
What is the point of the function modifying it's parameter x , it's pointless. Furthermore the function has no idea how it is called.
Furthermore, Julia is built for speed.
A function that modifies its parameter will be hard to optimise because it causes side effects. A side effect is when a procedure/function changes objects/things outside of it's scope.
If a function does not modifies a variable that is part of its calling parameter then you can be safe knowing.
the variable will not have its value changed
the result of the function can be optimised to a constant
not calling the function will not break the program's behaviour
Those above three factors are what makes FUNCTIONAL language fast and NON FUNCTIONAL language slow.
Furthermore when you move into Parallel programming or Multi Threaded programming, you absolutely DO NOT WANT a variable having it's value changed without you (The programmer) knowing about it.
"How would you implement with your proposed macro, the function F(x) which returns a boolean value and modifies c by c:= c + 1. F can be used in the following piece of Ada code : c:= 0; While F(c) Loop ... End Loop;" asked by Schemer
I would write
function F(x)
boolean_result = perform_some_logic()
return (boolean_result,x+1)
end
flag = true
c = 0
(flag,c) = F(c)
while flag
do_stuff()
(flag,c) = F(c)
end
"Unfortunately no, because, and I should have said that, c has to take again the value 0 when F return the value False (c increases as long the Loop lives and return to 0 when it dies). " said Schemer
Then I would write
function F(x)
boolean_result = perform_some_logic()
if boolean_result == true
return (true,x+1)
else
return (false,0)
end
end
flag = true
c = 0
(flag,c) = F(c)
while flag
do_stuff()
(flag,c) = F(c)
end
I am puzzled by the following results of typeof in the Julia 1.0.0 REPL:
# This makes sense.
julia> typeof(10)
Int64
# This surprised me.
julia> typeof(function)
ERROR: syntax: unexpected ")"
# No answer at all for return example and no error either.
julia> typeof(return)
# In the next two examples the REPL returns the input code.
julia> typeof(in)
typeof(in)
julia> typeof(typeof)
typeof(typeof)
# The "for" word returns an error like the "function" word.
julia> typeof(for)
ERROR: syntax: unexpected ")"
The Julia 1.0.0 documentation says for typeof
"Get the concrete type of x."
The typeof(function) example is the one that really surprised me. I expected a function to be a first-class object in Julia and have a type. I guess I need to understand types in Julia.
Any suggestions?
Edit
Per some comment questions below, here is an example based on a small function:
julia> function test() return "test"; end
test (generic function with 1 method)
julia> test()
"test"
julia> typeof(test)
typeof(test)
Based on this example, I would have expected typeof(test) to return generic function, not typeof(test).
To be clear, I am not a hardcore user of the Julia internals. What follows is an answer designed to be (hopefully) an intuitive explanation of what functions are in Julia for the non-hardcore user. I do think this (very good) question could also benefit from a more technical answer provided by one of the more core developers of the language. Also, this answer is longer than I'd like, but I've used multiple examples to try and make things as intuitive as possible.
As has been pointed out in the comments, function itself is a reserved keyword, and is not an actual function istself per se, and so is orthogonal to the actual question. This answer is intended to address your edit to the question.
Since Julia v0.6+, Function is an abstract supertype, much in the same way that Number is an abstract supertype. All functions, e.g. mean, user-defined functions, and anonymous functions, are subtypes of Function, in the same way that Float64 and Int are subtypes of Number.
This structure is deliberate and has several advantages.
Firstly, for reasons I don't fully understand, structuring functions in this way was the key to allowing anonymous functions in Julia to run just as fast as in-built functions from Base. See here and here as starting points if you want to learn more about this.
Secondly, because each function is its own subtype, you can now dispatch on specific functions. For example:
f1(f::T, x) where {T<:typeof(mean)} = f(x)
and:
f1(f::T, x) where {T<:typeof(sum)} = f(x) + 1
are different dispatch methods for the function f1
So, given all this, why does, e.g. typeof(sum) return typeof(sum), especially given that typeof(Float64) returns DataType? The issue here is that, roughly speaking, from a syntactical perspective, sum needs to serves two purposes simultaneously. It needs to be both a value, like e.g. 1.0, albeit one that is used to call the sum function on some input. But, it is also needs to be a type name, like Float64.
Obviously, it can't do both at the same time. So sum on its own behaves like a value. You can write f = sum ; f(randn(5)) to see how it behaves like a value. But we also need some way of representing the type of sum that will work not just for sum, but for any user-defined function, and any anonymous function. The developers decided to go with the (arguably) simplest option and have the type of sum print literally as typeof(sum), hence the behaviour you observe. Similarly if I write f1(x) = x ; typeof(f1), that will also return typeof(f1).
Anonymous functions are a bit more tricky, since they are not named as such. What should we do for typeof(x -> x^2)? What actually happens is that when you build an anonymous function, it is stored as a temporary global variable in the module Main, and given a number that serves as its type for lookup purposes. So if you write f = (x -> x^2), you'll get something back like #3 (generic function with 1 method), and typeof(f) will return something like getfield(Main, Symbol("##3#4")), where you can see that Symbol("##3#4") is the temporary type of this anonymous function stored in Main. (a side effect of this is that if you write code that keeps arbitrarily generating the same anonymous function over and over you will eventually overflow memory, since they are all actually being stored as separate global variables of their own type - however, this does not prevent you from doing something like this for n = 1:largenumber ; findall(y -> y > 1.0, x) ; end inside a function, since in this case the anonymous function is only compiled once at compile-time).
Relating all of this back to the Function supertype, you'll note that typeof(sum) <: Function returns true, showing that the type of sum, aka typeof(sum) is indeed a subtype of Function. And note also that typeof(typeof(sum)) returns DataType, in much the same way that typeof(typeof(1.0)) returns DataType, which shows how sum actually behaves like a value.
Now, given everything I've said, all the examples in your question now make sense. typeof(function) and typeof(for) return errors as they should, since function and for are reserved syntax. typeof(typeof) and typeof(in) correctly return (respectively) typeof(typeof), and typeof(in), since typeof and in are both functions. Note of course that typeof(typeof(typeof)) returns DataType.
I am sorry about the title, but I couldn't find a better one.
Let's define
function test(n)
print("test executed")
return n
end
f(n) = test(n)
Every time we call f we get
f(5)
test executed
5
Is there a way to tell julia to evaluate test once in the definition of f?
I expect that this is probably not going to be possible, in which case I have a slightly different question. If ar=[1,2,:x,-2,2*:x] is there any way to define f(x) to be the sum of ar, i.e. f(x) = 3*x+1?
If you want to compile based on type information, you can use #generated functions. But it seems like you want to compile based on the runtime values of the input. In this case, you might want to do memoization. There is a library Memoize that provides a macro for memoizing functions.
How can I use kwargs in a Julia function and declare their types for speed?
function f(x::Float64; kwargs...)
kwargs = Dict(kwargs)
if haskey(kwargs, :c)
c::Float64 = kwargs[:c]
else
c::Float64 = 1.0
end
return x^2 + c
end
f(0.0, c=10.0)
yields:
ERROR: LoadError: syntax: multiple type declarations for "c"
Of course I can define the function as f(x::Float64, c::Float64=1.0) to achieve the result, but I have MANY optional arguments with default values to pass, so I'd prefer to use kwargs.
Thanks.
Related post
As noted in another answer, this really only matters if you're going to have a type instability. If you do, the answer is to layer your functions. Have a top layer which does type checking and all sorts of setup, and then call a function which uses dispatch to be fast. For example,
function f(x::Float64; kwargs...)
kwargs = Dict(kwargs)
if haskey(kwargs, :c)
c = kwargs[:c]
else
c = 1.0
end
return _f(x,c)
end
_f(x,c) = x^2 + c
If most of your time is spent in the inner function, then this will be faster (it might not be for very simple functions). This allows for very general usage too, where you have have a keyword argument be by default nothing and do and if nothing ... which could setup a complicated default, and not have to worry about the type stability since it will be shielded from the inner function.
This kind of high-level type-checking wrapper above a performance sensitive inner function is used a lot in DifferentialEquations.jl. Check out the high-level wrapper for the SDE solvers which led to nice speedups by insuring type stability (the inner function is sde_solve) (or check out the solve for ODEProblem, it's much more complex since it handles conversions to different pacakges but it's the same idea).
A simpler answer for small examples like yours may be possible after this PR merges.
To fix some confusion, here's a declaration form:
function f(x::Float64; kwargs...)
local c::Float64 # Ensures the type of `c` will be `Float64`
kwargs = Dict(kwargs)
if haskey(kwargs, :c)
c = float(kwargs[:c])
else
c = 1.0
end
return x^2 + c
end
This will force anything that saves to c to convert to a Float64 or error, resulting in a type-stability, but is not as general of a solution. What form you use really depends on what you're doing.
Lastly, there's also the type assert, as #TotalVerb showed:
function f(x::Float64; c::Float64=1.0, kwargs...)
return x^2 + c
end
That's clean, or you could assert in the function:
function f(x::Float64; kwargs...)
kwargs = Dict(kwargs)
if haskey(kwargs, :c)
c = float(kwargs[:c])::Float64
else
c = 1.0
end
return x^2 + c
end
which will cause convertions only on the lines where the assertion occurs (i.e. the #TotalVerb form won't dispatch, so you can't make another function with c::Int, and it will only assert (convert) when the keyword arg is first read in).
Summary
The first solution will dispatch to be type stable in _f no matter what type the user makes c, and so if _f is a long calculation, this will get pretty much optimal performance, but for really quick calls it will have dispatch overhead.
The second solution will fix any type stability by forcing anything you set c to be a Float64 (it will try to convert, and if it can't, error). Thus this gets speed by forcing type stability, or erroring.
The assert in the keyword spot (#TotalVerb's answer) is the cleanest, but won't auto-convert later (so you could get a type-instability. But if you don't accidentally convert it later, then you have type stability, types can be inferred, and so you'll get optimal performance) and you can't extend it to cases where the function has c passed in as other types (no dispatch).
The last solution is pretty much the same as 3, except not as nice. I wouldn't recommend it. If you're doing something complicated with asserts, you likely are designing something wrong or really want to do something like the first (dispatch in a longer function call which is type stable).
But note that dispatch with version 3 may be fixed in the near future, which would allow you to have a different function with c::Float64 and c::Int (if necessary). Hopefully your solution is in here somewhere.
Note that declaring types does not give you increased performance; you may wish to relax the type constraints on x and c for your code to be more generic. Anyway, this is probably what you want:
function f(x::Float64; c::Float64=1.0, kwargs...)
return x^2 + c
end
See the keyword arguments section of the manual.
I have read a lot about continuations and a very common definition I saw is, it returns the control state.
I am taking a functional programming course taught in SML.
Our professor defined continuations to be:
"What keeps track of what we still have to do"
; "Gives us control of the call stack"
A lot of his examples revolve around trees. Before this chapter, we did tail recursion. I understand that tail recursion lets go of the stack to hold the recursively called functions by having an additional argument to "build" up the answer. Reversing a list would be built in a new accumulator where we append to it accordingly. Also, he said something about functions are called(but not evaluated) except till we reach the end where we replace backwards. He said an improved version of tail recursion would be using CPS(Continuation Programming Style).
Could someone give a simplified explanation of what continuations are and why they are favoured over other programming styles?
I found this stackoverflow link that helped me, but still did not clarify the idea for me:
I just don't get continuations!
Continuations simply treat "what happens next" as first class objects that can be used once unconditionally, ignored in favour of something else, or used multiple times.
To address what Continuation Passing Style is, here is some expression written normally:
let h x = f (g x)
g is applied to x and f is applied to the result.
Notice that g does not have any control. Its result will be passed to f no matter what.
in CPS this is written
let h x next = (g x (fun result -> f result next))
g not only has x as an argument, but a continuation that takes the output of g and returns the final value. This function calls f in the same manner, and gives next as the continuation.
What happened? What changed that made this so much more useful than f (g x)? The difference is that now g is in control. It can decide whether to use what happens next or not. That is the essence of continuations.
An example of where continuations arise are imperative programming languages where you have control structures. Whiles, blocks, ordinary statements, breaks and continues are all generalized through continuations, because these control structures take what happens next and decide what to do with it, for example we can have
...
while(condition1) {
statement1;
if(condition2) break;
statement2;
if(condition3) continue;
statement3;
}
return statement3;
...
The while, the block, the statement, the break and the continue can all be described in a functional model through continuations. Each construct can be considered to be a function that accepts the
current environment containing
the enclosing scopes
optional functions accepting the current environment and returning a continuation to
break from the inner most loop
continue from the inner most loop
return from the current function.
all the blocks associated with it (if-blocks, while-block, etc)
a continuation to the next statement
and returns the new environment.
In the while loop, the condition is evaluated according to the current environment. If it is evaluated to true, then the block is evaluated and returns the new environment. The result of evaluating the while loop again with the new environment is returned. If it is evaluated to false, the result of evaluating the next statement is returned.
With the break statement, we lookup the break function in the environment. If there is no function found then we are not inside a loop and we give an error. Otherwise we give the current environment to the function and return the evaluated continuation, which would be the statement after the the while loop.
With the continue statement the same would happen, except the continuation would be the while loop.
With the return statement the continuation would be the statement following the call to the current function, but it would remove the current enclosing scope from the environment.