Scope of for loops in Julia [duplicate]

This snippet of code is from JuliaBoxTutorials
myfriends = ["Ted", "Robyn", "Barney", "Lily", "Marshall"]
i = 1;
while i <= length(myfriends)
    friend = myfriends[i]
    println("Hi $friend, it's great to see you!")
    i += 1
end
giving this error when run with Julia v1.0
UndefVarError: i not defined
Stacktrace:
[1] top-level scope at ./In[12]:5 [inlined]
[2] top-level scope at ./none:0
But when i += 1 is replaced with global i += 1, it works. I guess this still worked in v0.6, and the tutorial will be adapted once the new Intro to Julia is published this Friday.
I was just wondering: is it possible to write a while loop without declaring a global variable?

As @Michael Paul and @crstnbr already replied in the comments, the scoping rules have been changed (Scope of variables in Julia). The for and while loops introduce a new scope: at top level their bodies can still read outside (global) variables, but an assignment inside the loop creates a new local instead of updating the global. You can get write access using the global keyword, but the recommended workflow is wrapping your code in functions.
One of the benefits of the new design is that the user is forced to avoid such global constructs which directly affect the performance of functions - which cannot be type stable when they access global variables.
One downside is the confusion when experimenting in the REPL and seeing such errors.
In my opinion the new behaviour is the cleaner one with respect to predictability. It was however a very tough and long-running discussion within the whole Julia community ;)
There is currently a discussion about whether the REPL should be changed to behave like the old one by making use of let-wraps: https://github.com/JuliaLang/julia/issues/28789
This is not practical to do manually (it is much more complicated than using the global keyword); see the example by Stefan Karpinski: https://github.com/JuliaLang/julia/issues/28789#issuecomment-414666648
Anyway, for the sake of completeness (although I would not recommend doing this), here is a version using global:
myfriends = ["Ted", "Robyn", "Barney", "Lily", "Marshall"]
i = 1;
N = length(myfriends)
while i <= N # length(myfriends) would also work here:
    # reading a global needs no declaration
    global i, myfriends # only i strictly needs this, because the loop assigns to it
    friend = myfriends[i]
    println("Hi $friend, it's great to see you!")
    i += 1
end
Note however that this is also completely valid:
myfriends = ["Ted", "Robyn", "Barney", "Lily", "Marshall"]
greet(friend) = println("Hi $friend, it's great to see you!")
for friend in myfriends
    greet(friend)
end
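The recommended workflow of wrapping the code in a function can be sketched as follows. This is only a minimal sketch: the function name greet_all and its return value are illustrative assumptions, not part of the tutorial. Inside a function, i is an ordinary local variable, so the loop can update it without any global declaration.

```julia
# Hypothetical helper: greet every friend and return how many were greeted.
function greet_all(friends)
    i = 1                          # local to the function, not a global
    while i <= length(friends)
        friend = friends[i]
        println("Hi $friend, it's great to see you!")
        i += 1                     # updates the enclosing local i
    end
    return i - 1
end

myfriends = ["Ted", "Robyn", "Barney", "Lily", "Marshall"]
greeted = greet_all(myfriends)
```

Because all state lives inside the function, the same code should behave identically in the REPL, in a notebook, and in a script.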

I found that this works in Julia v1.0:
let
    myfriends = ["Ted", "Robyn", "Barney", "Lily", "Marshall"]
    i = 1;
    while i <= length(myfriends) # this function here seems to work no problem
        friend = myfriends[i]
        println("Hi $friend, it's great to see you!")
        i = i + 1 # gives syntax error when prefixed with global
    end
end
In fact, it will give me a syntax error if I try to make i global :) within the while loop.

Related

How to reassign result of concatenation in Julia?

I need to create a Vector of Vectors of a predefined structure in Julia. For now I am trying to do it via iterative concatenation:
struct Scenario
    prob::Float64 # probability
    time::Float64 # duration of visit
    profit::Int64 # profit of visit
end
possible_times = [60, 90, 120, 150, 180]
scenarios = Scenario[]
for point in 1:num_points
    profit = rand(1:4)
    new_scenario = [Scenario(0.2, possible_times[i], profit) for i=1:5]
    scenarios = vcat(scenarios, new_scenario)
end
display(scenarios)
But I got the following
Warning: Assignment to `scenarios` in soft scope is ambiguous because a global variable by the same name exists: `scenarios` will be treated as a new local. Disambiguate by using `local scenarios` to suppress this warning or `global scenarios` to assign to the existing global variable.
ERROR: LoadError: UndefVarError: scenarios not defined
So the first question is: how do I save the result of the intermediate concatenation? And the second question: is this the correct way to achieve the goal, or is there a better one?
Normally use append! instead of vcat:
for point in 1:num_points
    profit = rand(1:4)
    new_scenario = [Scenario(0.2, possible_times[i], profit) for i=1:5]
    append!(scenarios, new_scenario)
end
If you want to use vcat use the global keyword:
for point in 1:num_points
    profit = rand(1:4)
    new_scenario = [Scenario(0.2, possible_times[i], profit) for i=1:5]
    global scenarios = vcat(scenarios, new_scenario)
end
The point is that in scenarios = vcat(scenarios, new_scenario) you reassign the scenarios variable which is in global scope.
In general, the situation is a bit more complex (Julia behavior will depend on whether the code is run in interactive or non-interactive session), as you can read in this section of the Julia Manual (bullet 3 in this section on Soft scope). But if you do not want to dig into the details of scoping a simple and safe rule is: if you assign to a global variable then prefix the assignment operation with global.
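Another way to sidestep the question entirely is to build the vector inside a function, where scenarios is a local variable and plain reassignment is unambiguous. The following is a minimal sketch: the function name build_scenarios and the value 3 for num_points are assumptions made for illustration.

```julia
struct Scenario
    prob::Float64    # probability
    time::Float64    # duration of visit
    profit::Int64    # profit of visit
end

# Hypothetical helper: build all scenarios inside a function.
function build_scenarios(num_points, possible_times)
    scenarios = Scenario[]                        # local, so no `global` is needed
    for point in 1:num_points
        profit = rand(1:4)
        new_scenario = [Scenario(0.2, t, profit) for t in possible_times]
        scenarios = vcat(scenarios, new_scenario) # reassigns the local variable
    end
    return scenarios
end

# num_points = 3 is an arbitrary value chosen for this example.
scenarios = build_scenarios(3, [60, 90, 120, 150, 180])
```

Since the loop runs inside the function, the same code should behave the same way in interactive and non-interactive sessions.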

Does each assignment mean that a copy is being made?

Recently I learned that in R there are no references; rather, all objects are immutable and each assignment makes a copy.
Uh-oh.
Copying large matrices over and over seems pretty horrible...
Now I'm in a paranoia, copypasting code all the time because I'm afraid of making helper functions (passing parameters = assignment? returning values = assignment?), I'm afraid of making helper variables if I'm not 100% sure an object would be copied anyway...
Example:
What I would love to make:
foo = function(someGivenLargeObject) {
    returnedMatrix = someGivenLargeObject$someLargeMatrix # <- BAD?!?!?!?!
    if(someCondition)
        returnedMatrix = operateOn(returnedMatrix)
    if(otherCondition)
        returnedMatrix = operateOn(returnedMatrix)
    returnedMatrix
}
What I'm making instead:
foo = function(someGivenLargeObject) { # <- still BAD?!?!?!
    returnedMatrix = NULL # <- No copy of someLargeMatrix is made!
    if(someCondition)
        returnedMatrix = operateOn(someGivenLargeObject$someLargeMatrix)
    if(otherCondition)
        returnedMatrix = operateOn(
            if(is.null(returnedMatrix))
                someGivenLargeObject$someLargeMatrix
            else
                returnedMatrix
        ) # <- ^ Incredible clutter! Unreadable!
    if(is.null(returnedMatrix))
        return(someGivenLargeObject$someLargeMatrix)
    else
        return(returnedMatrix) # <- does return copy stuff?!?!?!?!
}
The readability loss in the second version of the function is pretty amazing IMO; yet is this the price of avoiding the unnecessary copying of someLargeMatrix in case neither someCondition nor otherCondition holds? Because the line returnedMatrix = someGivenLargeObject$someLargeMatrix would necessitate this copying?
Or am I being paranoid, and may I safely go with the more readable version of the function because making a reference to someLargeMatrix doesn't necessitate copying? (BUT THERE ARE NO REFERENCES IN R!!!)
Also, I hope that a function call / function return doesn't copy stuff either?
Side note: Just so that it is clear: I haven't yet run into a case where I knew an object was copied unnecessarily in a situation like the one I described above. I'm just perplexed by having read that "there are no references in R", so this question is based on my worries about what this lack of references might imply, rather than on any empirical observation.
Donald Knuth famously said "premature optimization is the root of all evil":
http://wiki.c2.com/?PrematureOptimization
It is good to be aware of this, but code clarity is in most cases more important.
R is usually smart enough to figure out when a copy is needed.
(Not every assignment causes a copy; a copy is made only when an object that is shared by more than one name is later modified.)

Finding a Module's path, using the Module object

What is the sane way to go from a Module object to a path to the file in which it was declared?
To be precise, I am looking for the file where the keyword module occurs.
The indirect method is to find the location of the automatically defined eval method in each module.
moduleloc(mm::Module) = first(functionloc(mm.eval, (Symbol,)))
for example
moduleloc(mm::Module) = first(functionloc(mm.eval, (Symbol,)))
using DataStructures
moduleloc(DataStructures)
Outputs:
/home/oxinabox/.julia/v0.6/DataStructures/src/DataStructures.jl
This indirect method works, but it feels like a bit of a kludge.
Have I missed some inbuilt function to do this?
I will remind answerers that Modules are not the same thing as packages.
Consider the existence of submodules, or even modules that are loaded by include-ing some absolute path that is outside the package directory or load path.
Modules simply do not store the file location where they were defined. You can see that for yourself in their definition in C. Your only hope is to look through the bindings they hold.
Methods, on the other hand, do store their file location. And eval is the one function that is defined in every single module (although not baremodules). Slightly more correct might be:
moduleloc(mm::Module) = first(functionloc(mm.eval, (Any,)))
as that more precisely mirrors the auto-defined eval method.
If you aren't looking for a programmatic way of doing it you can use the methods function.
using DataFrames
locations = methods(DataFrames.readtable).ms
It's for all methods but it's hardly difficult to find the right one unless you have an enormous number of methods that differ only in small ways.
There is now pathof:
using DataStructures
pathof(DataStructures)
"/home/ederag/.julia/packages/DataStructures/59MD0/src/DataStructures.jl"
See also: pkgdir.
pkgdir(DataStructures)
"/home/ederag/.julia/packages/DataStructures/59MD0"
Tested with julia-1.7.3
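Note that pathof only knows about modules that were loaded as packages, so it returns nothing for a submodule. A minimal sketch of a workaround, assuming it is enough to find the entry file of the enclosing package (which is not necessarily the file where the submodule's module keyword occurs): walk up to the root module with Base.moduleroot first. The helper name package_file is hypothetical.

```julia
using Test  # standard-library package, used here only as an example module

# Hypothetical helper: entry file of the package that a module (or any of its
# submodules) belongs to, or `nothing` if it was not loaded from a package.
function package_file(m::Module)
    root = Base.moduleroot(m)   # walk up the parent chain to the top-level module
    return pathof(root)
end

test_path = package_file(Test)  # entry file of the Test standard library
main_path = package_file(Main)  # Main was not loaded from a package file
```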
require obviously needs to perform that operation. Looking into loading.jl, I found that finding the module path has changed a bit recently: in v0.6.0, there is a function
load_hook(prefix::String, name::String, ::Void)
which you can call "manually":
julia> Base.load_hook(Pkg.dir(), "DataFrames", nothing)
"/home/philipp/.julia/v0.6/DataFrames/src/DataFrames.jl"
However, this has changed for the better in the current master; there's now a function find_package, which we can copy:
macro return_if_file(path)
    quote
        path = $(esc(path))
        isfile(path) && return path
    end
end
function find_package(name::String)
    endswith(name, ".jl") && (name = chop(name, 0, 3))
    for dir in [Pkg.dir(); LOAD_PATH]
        dir = abspath(dir)
        @return_if_file joinpath(dir, "$name.jl")
        @return_if_file joinpath(dir, "$name.jl", "src", "$name.jl")
        @return_if_file joinpath(dir, name, "src", "$name.jl")
    end
    return nothing
end
and add a little helper:
find_package(m::Module) = find_package(string(module_name(m)))
Basically, this takes Pkg.dir() and looks in the "usual locations".
Additionally, chop in v0.6.0 doesn't take these additional arguments, which we can fix by adding
chop(s::AbstractString, m, n) = SubString(s, m, endof(s)-n)
Also, if you're not on Unix, you might want to care about the definitions of isfile_casesensitive above the linked code.
And if you're not so concerned about corner cases, maybe this is enough or can serve as a basis:
function modulepath(m::Module)
    name = string(module_name(m))
    Pkg.dir(name, "src", "$name.jl")
end
julia> Pkg.dir("DataStructures")
"/home/liso/.julia/v0.7/DataStructures"
Edit: I now realized that you want to use Module object!
julia> m = DataStructures
julia> Pkg.dir(repr(m))
"/home/liso/.julia/v0.7/DataStructures"
Edit2: I am not sure if you are trying to find path to module or to object defined in module (I hope that parsing path from next result is easy):
julia> repr(which(DataStructures.eval, (String,)))
"eval(x) in DataStructures at /home/liso/.julia/v0.7/DataStructures/src/DataStructures.jl:3"

How does one clear or remove a global in julia?

Is there any syntax that does something similar to MATLAB's "clear" i.e. if I have a global variable "a". How do I get rid of it? How do I do the analog of
clear a
See the latest answer to this question here: https://docs.julialang.org/en/v1/manual/faq/#How-do-I-delete-an-object-in-memory%3F
Retrieved from the docs:
Julia does not have an analog of MATLAB’s clear function; once a name
is defined in a Julia session (technically, in module Main), it is
always present.
If memory usage is your concern, you can always replace objects with
ones that consume less memory. For example, if A is a gigabyte-sized
array that you no longer need, you can free the memory with A = 0. The
memory will be released the next time the garbage collector runs; you
can force this to happen with GC.gc().
Julia 0.6 < 1.0
In Julia 0.6, you can remove the variable and free up its memory by calling clear!().
You have to call clear! on the symbolic name of the variable:
julia> x = 5
5
julia> sizeof(x)
8
julia> clear!(:x)
julia> sizeof(x)
0
As DFN pointed out, this won't actually remove the objects but set them to nothing. This is useful for freeing up memory from your workspace, as you can "delete" the memory footprint of non-constant objects.
Julia 1.0+
This does not work in Julia 1.0+. If you are using 1.0+, you will have to set the variable to nothing and let the garbage collector take it from there.
This is from the official docs here.
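For Julia 1.0+, the approach of rebinding the name can be sketched as follows. This is a minimal sketch, assuming a non-constant global named A; the array size is arbitrary, and Base.summarysize is used only to illustrate the change in footprint.

```julia
A = rand(1000, 1000)                  # roughly 8 MB of Float64 values
size_before = Base.summarysize(A)     # bytes reachable from A

A = nothing                           # drop the only reference to the array
GC.gc()                               # optionally force a collection right away

size_after = Base.summarysize(A)      # footprint of what A now refers to
```

The name A itself stays defined; only the large object it used to refer to becomes eligible for collection.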
As of 0.3.9, it's possible to clear all global variables (get a new workspace), through the workspace() function.
It's also possible to get the variables from the last workspace by using LastMain (e.g. LastMain.foobar).
So currently the only way of doing what you desire, is to clear everything and transfer everything but the variable you want to your new workspace.
Currently, one doesn't. There is, however, an issue to track that feature:
https://github.com/JuliaLang/julia/issues/2385
For Julia-0.6.4,
clear!(:x)
works as mentioned by @niczky, and it also works in IJulia.
However, for Julia-1.0.0,
clear!(:x)
... throws up the following:
ERROR: UndefVarError: clear! not defined
Stacktrace:
[1] top-level scope at none:0
So, it's broken for Julia-1.0.0.
clear!(:x) definitely does not work with Julia 0.6.0 in a notebook (IJulia)! You may choose to use x = 0 as an alternative.

Is stricter error reporting available in R?

In PHP we can do error_reporting(E_ALL) or error_reporting(E_ALL|E_STRICT) to have warnings about suspicious code. In g++ you can supply -Wall (and other flags) to get more checking of your code. Is there something similar in R?
As a specific example, I was refactoring a block of code into some functions. In one of those functions I had this line:
if(nm %in% fields$non_numeric)...
Much later I realized that I had overlooked adding fields to the parameter list, but R did not complain about an undefined variable.
(Posting as an answer rather than a comment)
How about ?codetools::checkUsage (codetools is a built-in package) ... ?
This is not really an answer, I just can't resist showing how you could declare globals explicitly. @Ben Bolker should post his comment as the answer.
To avoid seeing globals, you can take a function "up" one environment: it will be able to see all the standard functions and such (mean, etc.), but not anything you put in the global environment:
explicit.globals = function(f) {
    name = deparse(substitute(f))
    env = parent.frame()
    enclos = parent.env(.GlobalEnv)
    environment(f) = enclos
    env[[name]] = f
}
Then getting a global is just retrieving it from .GlobalEnv:
global = function(n) {
    name = deparse(substitute(n))
    env = parent.frame()
    env[[name]] = get(name, .GlobalEnv)
}
assign('global', global, env=baseenv())
And it would be used like
a = 2
b = 3
f = function() {
    global(a)
    a
    b
}
explicit.globals(f)
And called like
> f()
Error in f() : object 'b' not found
I personally wouldn't go for this but if you're used to PHP it might make sense.
Summing up, there is really no correct answer: as Owen and gsk3 point out, R functions will use globals if a variable is not in the local scope. This may be desirable in some situations, so how could the "error" be pointed out?
checkUsage() does nothing that R's built-in error-checking does not (in this case). checkUsageEnv(.GlobalEnv) is a useful way to check a file of helper functions (and might be great as a pre-hook for svn or git; or as part of an automated build process).
I feel the best approach when refactoring is to start by moving all global code into a function (e.g. call it main()), so that the only global code left is the call to that function. Do this first, then start extracting functions, etc.
