I am learning Julia, Here is something I am unable to figure out.
Case1: I started Julia console and overridden the default sqrt function with a number 10. So now the function doesn't work. To me coming from R its bit of surprise, In R usually even if we override a function it works because of the method dispatch. Clearly, Julia's way of doing it is different which is okay. But now I am unable to reset to its natural state. I have to restart Julia to make it(sqrt) work again.
julia> sqrt = 10
10
julia> sqrt(5)
ERROR: MethodError: objects of type Int64 are not callable
Stacktrace:
[1] top-level scope at REPL[2]:1
julia> sqrt = Nothing
Nothing
julia> sqrt(5)
ERROR: MethodError: no method matching Nothing(::Int64)
Closest candidates are:
Nothing() at boot.jl:324
Stacktrace:
[1] top-level scope at REPL[4]:1
Case2: I started Julia console and used sqrt function to calculate things and it worked. But now if I try to reset to a constant it's not working. (I am assuming its because its compiled already and hence can't be overridden).
julia> sqrt(7)
2.6457513110645907
julia> sqrt = 10
ERROR: cannot assign a value to variable Base.sqrt from module Main
Stacktrace:
[1] top-level scope at REPL[2]:1
My question is:
Is there any way to reset the function to original state without restarting Julia?
Also is my assumption for case2 is correct?
I am not sure if its already answered somewhere. I tried to find it out, but couldn't. I am still giving a read on this thing. But asking for help if anyone knows this. Thanks in advance.
AFAIU, the rationale for this behavior is as follows:
when you run sqrt(5) in a program, it means that you know about the sqrt function. If, later on, you try to assign a new value to sqrt, Julia forbids you to do so, on the grounds that sqrt in your program actually refers to the Base.sqrt function, which is a constant. (Note that the same is true of any function that got exported by a package you're using; there is no specificity related to Base here, except that you don't have to explicitly using Base in order to be able to call the functions it defines).
if you don't use sqrt first as a function, Julia can't assume that you know about Base.sqrt. So if the first mention of sqrt in your program is an assignment, it will happily create a new variable of that name. This makes Julia more future-proof: suppose that you write a program that declares and uses a variable named foo. As of Julia 1.5.3, no Base.foo function exists. But now imagine that Julia 1.6 introduces a Base.foo function. We wouldn't want such a change to break your existing code, which should continue working with newer Julia versions. Therefore the safe choice in this case consists in letting you freely declare a variable foo without worrying about the name collision with Base.foo.
Now in the case where you accidentally create in an interactive session a global variable that collides with an existing function, a simple solution would be to simply reassign your variable to the function from Base (or whichever module the original function came from):
julia> sqrt = 1
1
# Oops, looks like I made a mistake
julia> sqrt(2)
ERROR: MethodError: objects of type Int64 are not callable
Stacktrace:
[1] top-level scope at REPL[2]:1
# Let's try and fix it
julia> sqrt = Base.sqrt
sqrt (generic function with 20 methods)
julia> sqrt(2)
1.4142135623730951
Related
I have two methods with the same name and same amount of "normal" arguments. However, they differ in the number of keyword arguments (which are all non-optional).
I think the following minimal example illustrates my point well:
julia> f(a; b) = a+b
f (generic function with 1 method)
julia> f(a; b, c)= a+b+c
f (generic function with 1 method)
julia> f(1; b=1)
ERROR: UndefKeywordError: keyword argument c not assigned
Stacktrace:
[1] top-level scope
# REPL[72]:1
julia> f(1; b=1, c=2)
4
The second line of output ("f (generic function with 1 method)") already shows that Julia doesn't understand what I want - it should be 2 methods (which it would be e.g. if I wrote f(a, b; c) = a+b+c).
I already found this in the Julia discourse forum, which seems to explain why this doesn't work for multiple dispatch. However, in my case not the types but the counts differ, so I hope that there is a (neat) solution.
Regarding why I want this: I have a function that makes some calculations (to be precise it calculates Stark curves) for different molecules. This function needs to act differently for different molecule types (to be precise different symmetries) and also takes different arguments (different quantum numbers).
So that's why I need multiple dispatch - but why keyword arguments, you might ask. That is because I create the quantum numbers using distributions and passing them as NamedTuple in connection with ....
I'd like to not change this behaviour, but maybe you were curious to why I would need something like this.
In your example, you can reuse the keyword arguments as positional arguments to make f(1; b=1) work, but it won't behave like true keyword dispatch. For one, f(1; c=2) would call _f(a,b).
function f(a; b=missing, c=missing)
_f(a, skipmissing((b, c))...)
end
_f(a,b,c) = a+b+c
_f(a,b) = a+b
Not sure if this is applicable to the actual use case you described, though, things can be harder to reorder than b and c, and as a comment noted, a NamedTuple (which is ordered) is already dispatchable.
As a novice to Julia, I, like many others, am perplexed by the fact that loops in Julia create their own local scope (but not on the REPL nor within functions). There is much discussion online about this topic, but most of the questions here are about the particulars of this behaviour, such as why doing a=1 inside the loop doesn't affect variable a outside the loop, but a[1]=1 works. I get how it works now, for the most part.
My question is why was Julia implemented with this behaviour. Is there a benefit to this from the prespective of the user? I cannot think of one. Or was it necessary for some technical reason?
I appologise if this has been asked already, but all the questions and answers I've seen so far were about how this works and how to deal with it, but I am curious about WHY Julia was implemented this way.
Firstly, loops in Julia only introduce a new scope of the sort that hides variables existing outside the loop (as per your complaint) if the scope outside the loop is global scope. So, for instance
function foo()
a = 0
# Loop does not hide existing variable `a`, will work just fine
for i = 100
a += i^2
end
return a
end
julia> foo()
10000
in other words
# Anywhere other than global scope
a = 0
for i = 100
a += i^2
end
a == 10000 # TRUE
This is because in Julia, as in many many other languages, global scope may be considered harmful. At the very least, a for loop with global scope would encounter significant performance penalties. For instance, consider the following:
julia> a = 0
0
julia> #time for i=1:100
# Technically this "global" keyword is superfluous since we're running this at the repl, but doesn't hurt to be explicit
global a += rand()^2
end
0.000022 seconds (200 allocations: 3.125 KiB)
julia> function bar()
a = 0
for i=1:100
a += rand()^2
end
return a
end
bar (generic function with 1 method)
julia> #time bar()
0.000002 seconds
33.21364180865362
Note the massive difference in allocations (the bottom version has zero) and the ~10x time difference.
Now, you may have noticed I used a special keyword global there in the global example, but since this was being run in the REPL, that doesn't actually do anything other than make it explicit what is happening.
That brings us to the other significant difference you have noticed: when run in the REPL, for loops appear not to introduce a new scope, even though the REPL is certainly global scope. This is because it turns out to be a huge pain when debugging to have to add a bunch of global qualifiers to code you have copy-pasted from somewhere deeper in your program (say within a function, where loops do not hide outside variables). So for the sake of convenience when debugging, the REPL effectively adds those global keywords for you, making the presumption that if you cared about performance you wouldn't just be pasting raw loops into the REPL, and if you are just pasting raw loops into the REPL, you're probably debugging or something.
In a script, however, it is presumed that you do care about performance, so you will get an error if you try to use a global variable within a loop without explicitly declaring it as such.
The details are substantially more complicated, as the other answer explains in more technically correct terms. Some of this complication, as far as I know, is due to a reversal on the decision of whether or not global variables should or should not be accessible by default within a loop in the REPL that happened around the time of Julia v0.7.
for loops in Julia introduce a so called local (soft) scope, see https://docs.julialang.org/en/v1/manual/variables-and-scoping/#man-scope-table.
The rules for local (soft) scope are (quoting):
If x is not already a local variable and all of the scope constructs containing the assignment are soft scopes (loops, try/catch blocks, or struct blocks), the behavior depends on whether the global variable x is defined:
if global x is undefined, a new local named x is created in the scope of the assignment;
if global x is defined, the assignment is considered ambiguous:
in non-interactive contexts (files, eval), an ambiguity warning is printed and a new local is created;
in interactive contexts (REPL, notebooks), the global variable x is assigned.
So your statement:
why doing a=1 inside the loop doesn't affect variable a outside the loop
is only true in non-interactive contexts if the for loop is not inside a hard local scope (typically if for loop is in a global scope), and the variable you assign to is defined in global scope. However, you will get a warning then.
Now the crucial part of your question is I think:
My question is why was Julia implemented with this behaviour. Is there a benefit to this from the prespective of the user?
The answer is that for loop creates a new binding for a variable that is defined within its scope. To see the consequence consider the following code (I assume that variable x is not defined in enclosing scope so that x is defined in local scope):
julia> v = []
Any[]
julia> for i in 1:2
x = i
push!(v, () -> x)
end
julia> v[1]()
1
julia> v[2]()
2
Whe have created two anonymous functions and all works as you probably expected.
Now let us check what would happen in Python:
>>> v = []
>>> for i in range(1, 3):
... x = i
... v.append(lambda: x)
...
>>> v[0]()
2
>>> v[1]()
2
The result might surprise you. Both anonymous functions return 2. This is a consequence of not creating a local variable with a new binding in each iteration of the loop.
However, if in Julia you were working in REPL and x were defined in global scope you would get:
julia> x = 0
0
julia> v = []
Any[]
julia> for i in 1:2
x = i
push!(v, () -> x)
end
julia> v[1]()
2
julia> v[2]()
2
just like in Python.
The other consideration, as explained in the other answer is performance. But most likely performance critical code is written inside a function anyway, and the discussed performance considerations are only relevant in global scope.
EDIT
This is a design choice of Matlab, quoting from https://research.wmz.ninja/articles/2017/05/closures-in-matlab.html:
When an anonymous function is created, the immediate values of the referenced local variables will be captured. Hence if any changes to the referenced local variables made after the creation of this anonymous function will not affect this anonymous function.
So as you can see in Matlab there is a difference of anonymous function vs. a closure, which does something different:
When a nested function is created, the immediate values of the referenced local variables will not be captured. When the nested function is called, it will use the current values of the referenced local variables.
In Julia there is no such difference as you can see in the examples above.
And quoting the documentation of Matlab https://www.mathworks.com/help/matlab/matlab_prog/anonymous-functions.html:
Because a, b, and c are available at the time you create parabola, the function handle includes those values. The values persist within the function handle even if you clear the variables:
(but I think it is not as explicit as the explanation I linked above)
function somefun()
x::Int = 1
x = 0.5
end
this compiles with no warning. of course calling it produces an InexactError: Int64(0.5). question: can you enforce a compile time check?
Julia is a dynamic language in this sense. So, no, it appears you cannot detect if the result of an assignment will result in such an error without running the function first, as this kind of type checking is done at runtime.
I wasn't sure myself, so I wrapped this function in a module to force (pre)compilation in the absence of running the function, and the result was that no error was thrown, which confirms this idea. (see here if you want to see what I mean by this).
Having said this, to answer the spirit of your question: is there a way to avoid such obscure runtime errors from creeping up in unexpected ways?
Yes there is. Consider the following two, almost equivalent functions:
function fun1(x ); y::Int = x; return y; end;
function fun2(x::Int); y::Int = x; return y; end;
fun1(0.5) # ERROR: InexactError: Int64(0.5)
fun2(0.5) # ERROR: MethodError: no method matching fun2(::Float64)
You may think, big deal, we exchanged one error for another. But this is not the case. In the first instance, you don't know that your input will cause a problem until the point where it gets used in the function. Whereas in the second case, you are effectively enforcing a type check at the point of calling the function.
This is a trivial example of programming "by contract", by making use of Julia's elegant type-checking system. See Design By Contract for details.
So the answer to your question is, yes, if you rethink your design and follow good programming practices, such that this kind of errors are caught early on, then you can avoid having them occuring later on in obscure scenarios where they are hard to fix or detect.
The Julia manual provides a style guide which may also be of help (the example I give above is right at the top!).
It's worth thinking through what "compile time" really is in Julia — because it's probably not what you're thinking.
When you define the function:
julia> function somefun()
x::Int = 1
x = 0.5
end
somefun (generic function with 1 method)
You are not compiling it. Julia won't compile it, in fact, until you call it. Julia's compiler can be thought of as Just-Barely-Ahead-Of-Time, standing in contrast to typical JIT or AOT designs.
Now, when you call the function it compiles it and then runs it which throws the error. You can see this compilation happening the very first time you call the function — it takes a bit more time and memory as it generates and caches the specialized code:
julia> #time try somefun() catch end
0.005828 seconds (6.76 k allocations: 400.791 KiB)
julia> #time try somefun() catch end
0.000107 seconds (6 allocations: 208 bytes)
So perhaps you can see that with Julia's compilation model it doesn't so much matter if it gets caught at compile time or not — even if Julia refused to compile (and cache) the code it'd behave exactly like what you currently see. It'd still allow you to define the function in the first place, and it'd still only throw its error upon calling the function.
The question you mean to ask is if Julia could (or should) catch this error at function definition time. And then the question is really — is it ok to define a method that always results in an error? What about a function like error itself? In Julia, it's totally fine to define a method that unconditionally errors like this one, and there can be good reasons to do so.
Now, there are ways to ask Julia if it is able to detect that this method will always unconditionally error:
julia> #code_typed somefun()
CodeInfo(
1 ─ invoke Base.convert(Main.Int::Type{Int64}, 0.5::Float64)::Union{}
└── $(Expr(:unreachable))::Union{}
) => Union{}
This is the very first step in Julia's process of compilation, and in this case it can see that everything beyond convert(Int, 0.5) is unreachable — that is, it errors. Further, it knows that since the function will never return, it's return type is Union{} (that is, no possible type can ever be returned!) So you can ask Julia to do this step with, for example, the #inferred macro as part of a test suite.
I'm currently programming on Julia via command line session.
I know that the predefined functions in Julia (e.g. sqrt) can take on a variable value but in a particular session I tried to use the function as sqrt(25) and it gave me 5.0 but in the same session when I wrote sqrt=9, then it says
"Error: cannot assign variable Base.sqrt from module Main"
and if I have to make this happen I have to open a new session all over again and assign a variable value to sqrt sqrt=9and when I do so then again it says
ERROR:MethodError: objects of type Int64 are not callable
when I try to use sqrt as a function.
The same thing happens with pi.
The topic you are asking about is a bit tricky. Although I agree with the general recommendation of PilouPili it is sometimes not that obvious.
Your question can be decomposed into two issues:
ERROR:MethodError: objects of type Int64 are not callable
This one is pretty clear, I guess, and should be expected if you have some experience in other programming languages. The situation is that name sqrt in the current scope is bound to value 9 and objects of type Int64 are not callable.
The other error
"Error: cannot assign variable Base.sqrt from module Main"
is more complex and may be non obvious. You are free to use name sqrt for your own variables in your current scope until you call or reference the sqrt function. Only after such operations the binding to sqrt is resolved in the current scope (only recently some corner case bugs related to when bindings are resolved were fixed https://github.com/JuliaLang/julia/issues/30234). From this moment you are not allowed to change the value of sqrt because Julia disallows assigning values to variables imported from other modules.
The relevant passages from the Julia manual are (https://docs.julialang.org/en/latest/manual/modules/):
The statement using Lib means that a module called Lib will be available for resolving names as needed. When a global variable is encountered that has no definition in the current module, the system will search for it among variables exported by Lib and import it if it is found there. This means that all uses of that global within the current module will resolve to the definition of that variable in Lib.
and
Once a variable is made visible via using or import, a module may not create its own variable with the same name. Imported variables are read-only; assigning to a global variable always affects a variable owned by the current module, or else raises an error.
To better understand what these rules mean I think it is best to illustrate them with an example:
julia> module A
export x, y
x = 10
y = 100
end
Main.A
julia> using .A
julia> x = 1000
1000
julia> y
100
julia> y = 1000
ERROR: cannot assign variable A.y from module Main
on the other hand by calling import explicitly instead of using you resolve the binding immediately:
julia> module A
export x, y
x = 10
y = 100
end
Main.A
julia> import .A: x, y
julia> x = 1000
ERROR: cannot assign variable A.x from module Main
And you can see that it is not something specific to Base or functions but any module and any variable. And this is actually the case when you might encounter this problem in practice, as it is rather obvious that sqrt is a predefined function, but something like SOME_CONSTANT might or might not be defined and exported by the modules you call using on and the behavior of Julia code will differ in cases when you first assign to SOME_CONSTANT in global scope and when you first read SOME_CONSTANT.
Finally, there is a special case when you want to add methods to functions defined in the other module - which is allowed or not depending on how you introduce the name in the current scope, you can read about the details here what are the rules in this case.
Bogumił's answer is good and accurate, but I think it can be described a bit more succinctly.
Julia — like most programming languages — allows you to define your own variables with the same name as things that are built-in (or more generally, provided by using a package). This is a great thing, because otherwise you'd have to tip-toe around the many hundreds of names that Julia and its packages provide.
There are two catches, though:
Once you've used a built-in name or a name from any package, you can no longer define your own variables with the same name (in the same scope). That's the Error: cannot assign variable Base.sqrt from module Main: you've already used sqrt(4) and now are trying to define sqrt=2. The workaround? Just use a different name for your variable.
You can define your own variable with the same name as a built-in name or a name from any package if you've not used it yet, but then it takes on the definition you've given it. That's what's happening with ERROR: MethodError: objects of type Int64 are not callable: you've defined something like sqrt=2 and then tried to use sqrt(4). The workaround? You can still reference the other definition by qualifying it with its module name. In this case sqrt is provided by Base, so you can still call Base.sqrt. You can even re-assign sqrt = Base.sqrt to restore its original definition.
In this manner, the names provided by Base and other packages you're using are a bit like Schrödinger's cat: they're in this kinda-there-but-not-really state until you do something with them. If you look at them first, then they exist, but if you define your own variable first, then they don't.
I am puzzled by the following results of typeof in the Julia 1.0.0 REPL:
# This makes sense.
julia> typeof(10)
Int64
# This surprised me.
julia> typeof(function)
ERROR: syntax: unexpected ")"
# No answer at all for return example and no error either.
julia> typeof(return)
# In the next two examples the REPL returns the input code.
julia> typeof(in)
typeof(in)
julia> typeof(typeof)
typeof(typeof)
# The "for" word returns an error like the "function" word.
julia> typeof(for)
ERROR: syntax: unexpected ")"
The Julia 1.0.0 documentation says for typeof
"Get the concrete type of x."
The typeof(function) example is the one that really surprised me. I expected a function to be a first-class object in Julia and have a type. I guess I need to understand types in Julia.
Any suggestions?
Edit
Per some comment questions below, here is an example based on a small function:
julia> function test() return "test"; end
test (generic function with 1 method)
julia> test()
"test"
julia> typeof(test)
typeof(test)
Based on this example, I would have expected typeof(test) to return generic function, not typeof(test).
To be clear, I am not a hardcore user of the Julia internals. What follows is an answer designed to be (hopefully) an intuitive explanation of what functions are in Julia for the non-hardcore user. I do think this (very good) question could also benefit from a more technical answer provided by one of the more core developers of the language. Also, this answer is longer than I'd like, but I've used multiple examples to try and make things as intuitive as possible.
As has been pointed out in the comments, function itself is a reserved keyword, and is not an actual function istself per se, and so is orthogonal to the actual question. This answer is intended to address your edit to the question.
Since Julia v0.6+, Function is an abstract supertype, much in the same way that Number is an abstract supertype. All functions, e.g. mean, user-defined functions, and anonymous functions, are subtypes of Function, in the same way that Float64 and Int are subtypes of Number.
This structure is deliberate and has several advantages.
Firstly, for reasons I don't fully understand, structuring functions in this way was the key to allowing anonymous functions in Julia to run just as fast as in-built functions from Base. See here and here as starting points if you want to learn more about this.
Secondly, because each function is its own subtype, you can now dispatch on specific functions. For example:
f1(f::T, x) where {T<:typeof(mean)} = f(x)
and:
f1(f::T, x) where {T<:typeof(sum)} = f(x) + 1
are different dispatch methods for the function f1
So, given all this, why does, e.g. typeof(sum) return typeof(sum), especially given that typeof(Float64) returns DataType? The issue here is that, roughly speaking, from a syntactical perspective, sum needs to serves two purposes simultaneously. It needs to be both a value, like e.g. 1.0, albeit one that is used to call the sum function on some input. But, it is also needs to be a type name, like Float64.
Obviously, it can't do both at the same time. So sum on its own behaves like a value. You can write f = sum ; f(randn(5)) to see how it behaves like a value. But we also need some way of representing the type of sum that will work not just for sum, but for any user-defined function, and any anonymous function. The developers decided to go with the (arguably) simplest option and have the type of sum print literally as typeof(sum), hence the behaviour you observe. Similarly if I write f1(x) = x ; typeof(f1), that will also return typeof(f1).
Anonymous functions are a bit more tricky, since they are not named as such. What should we do for typeof(x -> x^2)? What actually happens is that when you build an anonymous function, it is stored as a temporary global variable in the module Main, and given a number that serves as its type for lookup purposes. So if you write f = (x -> x^2), you'll get something back like #3 (generic function with 1 method), and typeof(f) will return something like getfield(Main, Symbol("##3#4")), where you can see that Symbol("##3#4") is the temporary type of this anonymous function stored in Main. (a side effect of this is that if you write code that keeps arbitrarily generating the same anonymous function over and over you will eventually overflow memory, since they are all actually being stored as separate global variables of their own type - however, this does not prevent you from doing something like this for n = 1:largenumber ; findall(y -> y > 1.0, x) ; end inside a function, since in this case the anonymous function is only compiled once at compile-time).
Relating all of this back to the Function supertype, you'll note that typeof(sum) <: Function returns true, showing that the type of sum, aka typeof(sum) is indeed a subtype of Function. And note also that typeof(typeof(sum)) returns DataType, in much the same way that typeof(typeof(1.0)) returns DataType, which shows how sum actually behaves like a value.
Now, given everything I've said, all the examples in your question now make sense. typeof(function) and typeof(for) return errors as they should, since function and for are reserved syntax. typeof(typeof) and typeof(in) correctly return (respectively) typeof(typeof), and typeof(in), since typeof and in are both functions. Note of course that typeof(typeof(typeof)) returns DataType.