In Julia am I allowed to create & use static fields? Let me explain my problem with a simplified example. Let's say we have a type:
type Foo
bar::Dict()
baz::Int
qux::Float64
function Foo(fname,baz_value,qux_value)
dict = JLD.load(fname)["dict"] # It is a simple dictionary loading from a special file
new(dict,baz_value,quz_value)
end
end
Now, As you can see, I load a dictionary from a jld file and store it into the Foo type with the other two variables baz and qux_value. Now, let's say I will create 3 Foo object type.
vars = [ Foo("mydictfile.jld",38,37.0) for i=1:3]
Here, as you can see, all of the Foo objects load the same dictionary. This is a quite big file (~10GB)and I don't want to load it many times. So,
I simply ask that, is there any way in julia so that, I load it just once and all of there 3 types can reach it? (That's way I simply use Static keyword inside the question)
For such a simple question, my approach might look like silly, but as a next step, I make this Foo type iterable and I need to use this dictionary inside the next(d::Foo, state) function.
EDIT
Actually, I've found a way right now. But I want to ask that whether this is a correct or not.
Rather than giving the file name to the FOO constructor, If I load the dictionary into a variable before creating the objects and give the same variable into all of the constructors, I guess all the constructors just create a pointer to the same dictionary rather than creating again and again. Am I right ?
So, modified version will be like that:
dict = JLD.load("mydictfile.jld")["dict"]
vars = [ Foo(dict,38,37.0) for i=1:3]
By the way,I still want to hear if I do the same thing completely inside the Foo type (I mean constructor of it)
You are making the type "too special" by adding the inner constructor. Julia provides default constructors if you do not provide an inner constructor; these just fill in the fields in an object of the new type.
So you can do something like:
immutable Foo{K,V}
bar::Dict{K,V}
baz::Int
qux::Float64
end
dict = JLD.load("mydictfile.jld")["dict"]
vars = [Foo(dict, i, i+1) for i in 1:3]
Note that it was a syntax error to include the parentheses after Dict in the type definition.
The {K,V} makes the Foo type parametric, so that you can make different kinds of Foo type, with different Dict types inside, if necessary. Even if you only use it for a single type of Dict, this will give more efficient code, since the type parameters K and V will be inferred when you create the Foo object. See the Julia manual: http://docs.julialang.org/en/release-0.5/manual/performance-tips/#avoid-fields-with-abstract-containers
So now you can try the code without even having the JLD file available (as we do not, for example):
julia> dict = Dict("a" => 1, "b" => 2)
julia> vars = [Foo(dict, i, Float64(i+1)) for i in 1:3]
3-element Array{Foo{String,Int64},1}:
Foo{String,Int64}(Dict("b"=>2,"a"=>1),1,2.0)
Foo{String,Int64}(Dict("b"=>2,"a"=>1),2,3.0)
Foo{String,Int64}(Dict("b"=>2,"a"=>1),3,4.0)
You can see that it is indeed the same dictionary (i.e. only a reference is actually stored in the type object) by modifying one of them and seeing that the others also change, i.e. that they point to the same dictionary object:
julia> vars[1].bar["c"] = 10
10
julia> vars
3-element Array{Foo{String,Int64},1}:
Foo{String,Int64}(Dict("c"=>10,"b"=>2,"a"=>1),1,2.0)
Foo{String,Int64}(Dict("c"=>10,"b"=>2,"a"=>1),2,3.0)
Foo{String,Int64}(Dict("c"=>10,"b"=>2,"a"=>1),3,4.0)
Related
I want to generate a Dict with undef values so I can later loop over the keys and fill in correct values. I can initialise an such a Dict using concrete types in the following way and it all works fine:
currencies = ["USD", "AUD", "GBP"]
struct example
num::Float64
end
undef_array = Array{example}(undef,3)
Dict{String,example}(zip(currencies, undef_array))
When my struct has an abstract type however I can still generate the undef array but I cannot create the dict. I get an error "UndefRefError: access to undefined reference"
abstract type abstract_num end
struct example2
num::abstract_num
end
undef_array = Array{example2}(undef,3)
Dict{String,example2}(zip(currencies, undef_array))
Although it is possible to create such a Dict with a concrete array:
struct numnum <: abstract_num
num::Float64
end
def_array = [example2(numnum(5.0)), example2(numnum(6.0)), example2(numnum(4.5))]
Dict{String,example2}(zip(currencies, def_array))
Question
My question is whether it is possible to generate a Dict with undef values of a type that relies on an abstract type? Is it is possible what is the best way to do it?
In your second (not working) example, undef_array is an array whos elements aren't initialized:
julia> undef_array = Array{example2}(undef,3)
3-element Array{example2,1}:
#undef
#undef
#undef
The reason is that it's not possible to instantiate an object of type example2 because your abstract type abstract_num (the type of the field of example2) doesn't have any concrete subtypes and, thus, can't be instantiated either. As a consequence even indexing undef_array[1] gives an UndefRefError and, hence, also zip won't work.
Compare this to the first case where the array elements are (arbitrarily) initialized:
julia> undef_array = Array{example}(undef,3)
3-element Array{example,1}:
example(1.17014136e-315)
example(1.17014144e-315)
example(1.17014152e-315)
and undef_array[1] works just fine.
Having said that, I'm not really sure what you try to achieve here. Why not just create a mydict = Dict{String, example2}() and fill it with content when the time comes? (As said above, you would have to define concrete subtypes of abstract_num first)
For performance reasons you should, in general, avoid creating types with fields of an abstract type.
Try:
a=Dict{String,Union{example3,UndefInitializer}}(currencies .=>undef)
However, for representing missing values the type Missing is usually more appropriate:
b=Dict{String,Union{example3,Missing}}(currencies .=>missing)
Please note that typeof(undef) yields UndefInitializer while typeof(missing) yields Missing - hence the need for Union types in the Dict. The dot (.) you can see above (.=>) is the famous Julia dot operator.
Moreover, I recommend to keep to Julia's naming conversion - struct and DataType names should start with a Capital Letter.
Last but not least, in your first example where concrete type Float64 was given, Julia has allocated the array to some concrete address in memory - beware that it can contain some garbage data (have a look at console log below):
julia> undef_array = Array{example}(undef,3)
3-element Array{example,1}:
example(9.13315366e-316)
example(1.43236026e-315)
example(1.4214423e-316)
I saw this example in the Julia language documentation. It uses something called Base. What is this Base?
immutable Squares
count::Int
end
Base.start(::Squares) = 1
Base.next(S::Squares, state) = (state*state, state+1)
Base.done(S::Squares, s) = s > S.count;
Base.eltype(::Type{Squares}) = Int # Note that this is defined for the type
Base.length(S::Squares) = S.count;
Base is a module which defines many of the functions, types and macros used in the Julia language. You can view the files for everything it contains here or call whos(Base) to print a list.
In fact, these functions and types (which include things like sum and Int) are so fundamental to the language that they are included in Julia's top-level scope by default.
This means that we can just use sum instead of Base.sum every time we want to use that particular function. Both names refer to the same thing:
Julia> sum === Base.sum
true
Julia> #which sum # show where the name is defined
Base
So why, you might ask, is it necessary is write things like Base.start instead of simply start?
The point is that start is just a name. We are free to rebind names in the top-level scope to anything we like. For instance start = 0 will rebind the name 'start' to the integer 0 (so that it no longer refers to Base.start).
Concentrating now on the specific example in docs, if we simply wrote start(::Squares) = 1, then we find that we have created a new function with 1 method:
Julia> start
start (generic function with 1 method)
But Julia's iterator interface (invoked using the for loop) requires us to add the new method to Base.start! We haven't done this and so we get an error if we try to iterate:
julia> for i in Squares(7)
println(i)
end
ERROR: MethodError: no method matching start(::Squares)
By updating the Base.start function instead by writing Base.start(::Squares) = 1, the iterator interface can use the method for the Squares type and iteration will work as we expect (as long as Base.done and Base.next are also extended for this type).
I'll grant that for something so fundamental, the explanation is buried a bit far down in the documentation, but http://docs.julialang.org/en/release-0.4/manual/modules/#standard-modules describes this:
There are three important standard modules: Main, Core, and Base.
Base is the standard library (the contents of base/). All modules
implicitly contain using Base, since this is needed in the vast
majority of cases.
In julia how do we know if a type is manipulated by value or by reference?
In java for example (at least for the sdk):
the basic types (those that have names starting with lower case letters, like "int") are manipulated by value
Objects (those that have names starting with capital letters, like "HashMap") and arrays are manipulated by reference
It is therefore easy to know what happens to a type modified inside a function.
I am pretty sure my question is a duplicate but I can't find the dup...
EDIT
This code :
function modifyArray(a::Array{ASCIIString,1})
push!(a, "chocolate")
end
function modifyInt(i::Int)
i += 7
end
myarray = ["alice", "bob"]
modifyArray(myarray)
#show myarray
myint = 1
modifyInt(myint)
#show myint
returns :
myarray = ASCIIString["alice","bob", "chocolate"]
myint = 1
which was a bit confusing to me, and the reason why I submitted this question. The comment of #StefanKarpinski clarified the issue.
My confusion came from the fact i considred += as an operator , a method like push! which is modifying the object itself . but it is not.
i += 7 should be seen as i = i + 7 ( a binding to a different object ). Indeed this behavior will be the same for modifyArray if I use for example a = ["chocolate"].
The corresponding terms in Julia are mutable and immutable types:
immutable objects (either bitstypes, such as Int or composite types declared with immutable, such as Complex) cannot be modified once created, and so are passed by copying.
mutable objects (arrays, or composite types declared with type) are passed by reference, so can be modified by calling functions. By convention such functions end with an exclamation mark (e.g., sort!), but this is not enforced by the language.
Note however that an immutable object can contain a mutable object, which can still be modified by a function.
This is explained in more detail in the FAQ.
I think the most rigourous answer is the one in
Julia function argument by reference
Strictly speaking, Julia is not "call-by-reference" but "call-by-value where the value is a
reference" , or "call-by-sharing", as used by most languages such as
python, java, ruby...
Is there a way to enforce a dictionary being constant?
I have a function which reads out a file for parameters (and ignores comments) and stores it in a dict:
function getparameters(filename::AbstractString)
f = open(filename,"r")
dict = Dict{AbstractString, AbstractString}()
for ln in eachline(f)
m = match(r"^\s*(?P<key>\w+)\s+(?P<value>[\w+-.]+)", ln)
if m != nothing
dict[m[:key]] = m[:value]
end
end
close(f)
return dict
end
This works just fine. Since i have a lot of parameters, which i will end up using on different places, my idea was to let this dict be global. And as we all know, global variables are not that great, so i wanted to ensure that the dict and its members are immutable.
Is this a good approach? How do i do it? Do i have to do it?
Bonus answerable stuff :)
Is my code even ok? (it is the first thing i did with julia, and coming from c/c++ and python i have the tendencies to do things differently.) Do i need to check whether the file is actually open? Is my reading of the file "julia"-like? I could also readall and then use eachmatch. I don't see the "right way to do it" (like in python).
Why not use an ImmutableDict? It's defined in base but not exported. You use one as follows:
julia> id = Base.ImmutableDict("key1"=>1)
Base.ImmutableDict{String,Int64} with 1 entry:
"key1" => 1
julia> id["key1"]
1
julia> id["key1"] = 2
ERROR: MethodError: no method matching setindex!(::Base.ImmutableDict{String,Int64}, ::Int64, ::String)
in eval(::Module, ::Any) at .\boot.jl:234
in macro expansion at .\REPL.jl:92 [inlined]
in (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at .\event.jl:46
julia> id2 = Base.ImmutableDict(id,"key2"=>2)
Base.ImmutableDict{String,Int64} with 2 entries:
"key2" => 2
"key1" => 1
julia> id.value
1
You may want to define a constructor which takes in an array of pairs (or keys and values) and uses that algorithm to define the whole dict (that's the only way to do so, see the note at the bottom).
Just an added note, the actual internal representation is that each dictionary only contains one key-value pair, and a dictionary. The get method just walks through the dictionaries checking if it has the right value. The reason for this is because arrays are mutable: if you did a naive construction of an immutable type with a mutable field, the field is still mutable and thus while id["key1"]=2 wouldn't work, id.keys[1]=2 would. They go around this by not using a mutable type for holding the values (thus holding only single values) and then also holding an immutable dict. If you wanted to make this work directly on arrays, you could use something like ImmutableArrays.jl but I don't think that you'd get a performance advantage because you'd still have to loop through the array when checking for a key...
First off, I am new to Julia (I have been using/learning it since only two weeks). So do not put any confidence in what I am going to say unless it is validated by others.
The dictionary data structure Dict is defined here
julia/base/dict.jl
There is also a data structure called ImmutableDict in that file. However as const variables aren't actually const why would immutable dictionaries be immutable?
The comment states:
ImmutableDict is a Dictionary implemented as an immutable linked list,
which is optimal for small dictionaries that are constructed over many individual insertions
Note that it is not possible to remove a value, although it can be partially overridden and hidden
by inserting a new value with the same key
So let us call what you want to define as a dictionary UnmodifiableDict to avoid confusion. Such object would probably have
a similar data structure as Dict.
a constructor that takes a Dict as input to fill its data structure.
specialization (a new dispatch?) of the the method setindex! that is called by the operator [] =
in order to forbid modification of the data structure. This should be the case of all other functions that end with ! and hence modify the data.
As far as I understood, It is only possible to have subtypes of abstract types. Therefore you can't make UnmodifiableDict as a subtype of Dict and only redefine functions such as setindex!
Unfortunately this is a needed restriction for having run-time types and not compile-time types. You can't have such a good performance without a few restrictions.
Bottom line:
The only solution I see is to copy paste the code of the type Dict and its functions, replace Dict by UnmodifiableDict everywhere and modify the functions that end with ! to raise an exception if called.
you may also want to have a look at those threads.
https://groups.google.com/forum/#!topic/julia-users/n-lqjybIO_w
https://github.com/JuliaLang/julia/issues/1974
REVISION
Thanks to Chris Rackauckas for pointing out the error in my earlier response. I'll leave it below as an illustration of what doesn't work. But, Chris is right, the const declaration doesn't actually seem to improve performance when you feed the dictionary into the function. Thus, see Chris' answer for the best resolution to this issue:
D1 = [i => sind(i) for i = 0.0:5:3600];
const D2 = [i => sind(i) for i = 0.0:5:3600];
function test(D)
for jdx = 1:1000
# D[2] = 2
for idx = 0.0:5:3600
a = D[idx]
end
end
end
## Times given after an initial run to allow for compiling
#time test(D1); # 0.017789 seconds (4 allocations: 160 bytes)
#time test(D2); # 0.015075 seconds (4 allocations: 160 bytes)
Old Response
If you want your dictionary to be a constant, you can use:
const MyDict = getparameters( .. )
Update Keep in mind though that in base Julia, unlike some other languages, it's not that you cannot redefine constants, instead, it's just that you get a warning when doing so.
julia> const a = 2
2
julia> a = 3
WARNING: redefining constant a
3
julia> a
3
It is odd that you don't get the constant redefinition warning when adding a new key-val pair to the dictionary. But, you still see the performance boost from declaring it as a constant:
D1 = [i => sind(i) for i = 0.0:5:3600];
const D2 = [i => sind(i) for i = 0.0:5:3600];
function test1()
for jdx = 1:1000
for idx = 0.0:5:3600
a = D1[idx]
end
end
end
function test2()
for jdx = 1:1000
for idx = 0.0:5:3600
a = D2[idx]
end
end
end
## Times given after an initial run to allow for compiling
#time test1(); # 0.049204 seconds (1.44 M allocations: 22.003 MB, 5.64% gc time)
#time test2(); # 0.013657 seconds (4 allocations: 160 bytes)
To add to the existing answers, if you like immutability and would like to get performant (but still persistent) operations which change and extend the dictionary, check out FunctionalCollections.jl's PersistentHashMap type.
If you want to maximize performance and take maximal advantage of immutability, and you don't plan on doing any operations on the dictionary whatsoever, consider implementing a perfect hash function-based dictionary. In fact, if your dictionary is a compile-time constant, these can even be computed ahead of time (using metaprogramming) and precompiled.
In Julia, is there any way to get the name of a passed to a function?
x = 10
function myfunc(a)
# do something here
end
assert(myfunc(x) == "x")
Do I need to use macros or is there a native method that provides introspection?
You can grab the variable name with a macro:
julia> macro mymacro(arg)
string(arg)
end
julia> #mymacro(x)
"x"
julia> #assert(#mymacro(x) == "x")
but as others have said, I'm not sure why you'd need that.
Macros operate on the AST (code tree) during compile time, and the x is passed into the macro as the Symbol :x. You can turn a Symbol into a string and vice versa. Macros replace code with code, so the #mymacro(x) is simply pulled out and replaced with string(:x).
Ok, contradicting myself: technically this is possible in a very hacky way, under one (fairly limiting) condition: the function name must have only one method signature. The idea is very similar the answers to such questions for Python. Before the demo, I must emphasize that these are internal compiler details and are subject to change. Briefly:
julia> function foo(x)
bt = backtrace()
fobj = eval(current_module(), symbol(Profile.lookup(bt[3]).func))
Base.arg_decl_parts(fobj.env.defs)[2][1][1]
end
foo (generic function with 1 method)
julia> foo(1)
"x"
Let me re-emphasize that this is a bad idea, and should not be used for anything! (well, except for backtrace display). This is basically "stupid compiler tricks", but I'm showing it because it can be kind of educational to play with these objects, and the explanation does lead to a more useful answer to the clarifying comment by #ejang.
Explanation:
bt = backtrace() generates a ... backtrace ... from the current position. bt is an array of pointers, where each pointer is the address of a frame in the current call stack.
Profile.lookup(bt[3]) returns a LineInfo object with the function name (and several other details about each frame). Note that bt[1] and bt[2] are in the backtrace-generation function itself, so we need to go further up the stack to get the caller.
Profile.lookup(...).func returns the function name (the symbol :foo)
eval(current_module(), Profile.lookup(...)) returns the function object associated with the name :foo in the current_module(). If we modify the definition of function foo to return fobj, then note the equivalence to the foo object in the REPL:
julia> function foo(x)
bt = backtrace()
fobj = eval(current_module(), symbol(Profile.lookup(bt[3]).func))
end
foo (generic function with 1 method)
julia> foo(1) == foo
true
fobj.env.defs returns the first Method entry from the MethodTable for foo/fobj
Base.decl_arg_parts is a helper function (defined in methodshow.jl) that extracts argument information from a given Method.
the rest of the indexing drills down to the name of the argument.
Regarding the restriction that the function have only one method signature, the reason is that multiple signatures will all be listed (see defs.next) in the MethodTable. As far as I know there is no currently exposed interface to get the specific method associated with a given frame address. (as an exercise for the advanced reader: one way to do this would be to modify the address lookup functionality in jl_getFunctionInfo to also return the mangled function name, which could then be re-associated with the specific method invocation; however, I don't think we currently store a reverse mapping from mangled name -> Method).
Note also that (1) backtraces are slow (2) there is no notion of "function-local" eval in Julia, so even if one has the variable name, I believe it would be impossible to actually access the variable (and the compiler may completely elide local variables, unused or otherwise, put them in a register, etc.)
As for the IDE-style introspection use mentioned in the comments: foo.env.defs as shown above is one place to start for "object introspection". From the debugging side, Gallium.jl can inspect DWARF local variable info in a given frame. Finally, JuliaParser.jl is a pure-Julia implementation of the Julia parser that is actively used in several IDEs to introspect code blocks at a high level.
Another method is to use the function's vinfo. Here is an example:
function test(argx::Int64)
vinfo = code_lowered(test,(Int64,))
string(vinfo[1].args[1][1])
end
test (generic function with 1 method)
julia> test(10)
"argx"
The above depends on knowing the signature of the function, but this is a non-issue if it is coded within the function itself (otherwise some macro magic could be needed).