Julia: select all except a few indices [duplicate] - julia

I have an array a=rand(100), I want to get every value except the values at the indices notidx=[2;50]. Is there a clean way to get a at the other values? I am looking for a good way to do both a copy and a view.
Currently I make the array [1;3:49;51:100] by symdiff(1:100,notidx), but a[symdiff(1:length(a),notidx)] and view(a,a[symdiff(1:length(a),notidx)]) are not very clean (or understandable to others) ways of doing this.

I don't have anything super clean, but you can do
a[setdiff(1:end, notidx)]
which is slightly cleaner than what you had, or
ind = trues(length(a))
ind[notidx] = false
a[ind]
which is pretty verbose but very clear.

Update:
If you are using julia-v0.5+, you can also use the new generator expression, for example:
view(a, [i for i in indices(a)... if i ∉ notidx])
and
[a[i] for i in indices(a)... if i ∉ notidx]
Old post:
To get a copy, you can firstly make a copy of a, then manipulate it with deleteat! to delete those values at specific indices. After you've done this, it's convenient to get a view of a via indexin:
a = rand(100)
# => 100-element Array{Float64,1}:
0.62636
0.488919
0.499884
....
b = copy(a)
deleteat!(b, [2,50])
# => 98-element Array{Float64,1}:
0.62636
0.499884
....

Related

Immutable dictionary

Is there a way to enforce a dictionary being constant?
I have a function which reads out a file for parameters (and ignores comments) and stores it in a dict:
function getparameters(filename::AbstractString)
f = open(filename,"r")
dict = Dict{AbstractString, AbstractString}()
for ln in eachline(f)
m = match(r"^\s*(?P<key>\w+)\s+(?P<value>[\w+-.]+)", ln)
if m != nothing
dict[m[:key]] = m[:value]
end
end
close(f)
return dict
end
This works just fine. Since i have a lot of parameters, which i will end up using on different places, my idea was to let this dict be global. And as we all know, global variables are not that great, so i wanted to ensure that the dict and its members are immutable.
Is this a good approach? How do i do it? Do i have to do it?
Bonus answerable stuff :)
Is my code even ok? (it is the first thing i did with julia, and coming from c/c++ and python i have the tendencies to do things differently.) Do i need to check whether the file is actually open? Is my reading of the file "julia"-like? I could also readall and then use eachmatch. I don't see the "right way to do it" (like in python).
Why not use an ImmutableDict? It's defined in base but not exported. You use one as follows:
julia> id = Base.ImmutableDict("key1"=>1)
Base.ImmutableDict{String,Int64} with 1 entry:
"key1" => 1
julia> id["key1"]
1
julia> id["key1"] = 2
ERROR: MethodError: no method matching setindex!(::Base.ImmutableDict{String,Int64}, ::Int64, ::String)
in eval(::Module, ::Any) at .\boot.jl:234
in macro expansion at .\REPL.jl:92 [inlined]
in (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at .\event.jl:46
julia> id2 = Base.ImmutableDict(id,"key2"=>2)
Base.ImmutableDict{String,Int64} with 2 entries:
"key2" => 2
"key1" => 1
julia> id.value
1
You may want to define a constructor which takes in an array of pairs (or keys and values) and uses that algorithm to define the whole dict (that's the only way to do so, see the note at the bottom).
Just an added note, the actual internal representation is that each dictionary only contains one key-value pair, and a dictionary. The get method just walks through the dictionaries checking if it has the right value. The reason for this is because arrays are mutable: if you did a naive construction of an immutable type with a mutable field, the field is still mutable and thus while id["key1"]=2 wouldn't work, id.keys[1]=2 would. They go around this by not using a mutable type for holding the values (thus holding only single values) and then also holding an immutable dict. If you wanted to make this work directly on arrays, you could use something like ImmutableArrays.jl but I don't think that you'd get a performance advantage because you'd still have to loop through the array when checking for a key...
First off, I am new to Julia (I have been using/learning it since only two weeks). So do not put any confidence in what I am going to say unless it is validated by others.
The dictionary data structure Dict is defined here
julia/base/dict.jl
There is also a data structure called ImmutableDict in that file. However as const variables aren't actually const why would immutable dictionaries be immutable?
The comment states:
ImmutableDict is a Dictionary implemented as an immutable linked list,
which is optimal for small dictionaries that are constructed over many individual insertions
Note that it is not possible to remove a value, although it can be partially overridden and hidden
by inserting a new value with the same key
So let us call what you want to define as a dictionary UnmodifiableDict to avoid confusion. Such object would probably have
a similar data structure as Dict.
a constructor that takes a Dict as input to fill its data structure.
specialization (a new dispatch?) of the the method setindex! that is called by the operator [] =
in order to forbid modification of the data structure. This should be the case of all other functions that end with ! and hence modify the data.
As far as I understood, It is only possible to have subtypes of abstract types. Therefore you can't make UnmodifiableDict as a subtype of Dict and only redefine functions such as setindex!
Unfortunately this is a needed restriction for having run-time types and not compile-time types. You can't have such a good performance without a few restrictions.
Bottom line:
The only solution I see is to copy paste the code of the type Dict and its functions, replace Dict by UnmodifiableDict everywhere and modify the functions that end with ! to raise an exception if called.
you may also want to have a look at those threads.
https://groups.google.com/forum/#!topic/julia-users/n-lqjybIO_w
https://github.com/JuliaLang/julia/issues/1974
REVISION
Thanks to Chris Rackauckas for pointing out the error in my earlier response. I'll leave it below as an illustration of what doesn't work. But, Chris is right, the const declaration doesn't actually seem to improve performance when you feed the dictionary into the function. Thus, see Chris' answer for the best resolution to this issue:
D1 = [i => sind(i) for i = 0.0:5:3600];
const D2 = [i => sind(i) for i = 0.0:5:3600];
function test(D)
for jdx = 1:1000
# D[2] = 2
for idx = 0.0:5:3600
a = D[idx]
end
end
end
## Times given after an initial run to allow for compiling
#time test(D1); # 0.017789 seconds (4 allocations: 160 bytes)
#time test(D2); # 0.015075 seconds (4 allocations: 160 bytes)
Old Response
If you want your dictionary to be a constant, you can use:
const MyDict = getparameters( .. )
Update Keep in mind though that in base Julia, unlike some other languages, it's not that you cannot redefine constants, instead, it's just that you get a warning when doing so.
julia> const a = 2
2
julia> a = 3
WARNING: redefining constant a
3
julia> a
3
It is odd that you don't get the constant redefinition warning when adding a new key-val pair to the dictionary. But, you still see the performance boost from declaring it as a constant:
D1 = [i => sind(i) for i = 0.0:5:3600];
const D2 = [i => sind(i) for i = 0.0:5:3600];
function test1()
for jdx = 1:1000
for idx = 0.0:5:3600
a = D1[idx]
end
end
end
function test2()
for jdx = 1:1000
for idx = 0.0:5:3600
a = D2[idx]
end
end
end
## Times given after an initial run to allow for compiling
#time test1(); # 0.049204 seconds (1.44 M allocations: 22.003 MB, 5.64% gc time)
#time test2(); # 0.013657 seconds (4 allocations: 160 bytes)
To add to the existing answers, if you like immutability and would like to get performant (but still persistent) operations which change and extend the dictionary, check out FunctionalCollections.jl's PersistentHashMap type.
If you want to maximize performance and take maximal advantage of immutability, and you don't plan on doing any operations on the dictionary whatsoever, consider implementing a perfect hash function-based dictionary. In fact, if your dictionary is a compile-time constant, these can even be computed ahead of time (using metaprogramming) and precompiled.

Strange Dict behavior with keys of a custom type

I have a recursive function which utilizes a global dict to store values already obtained when traversing the tree. However, at least some of the values stored in the dict seem to disappear! This simplified code shows the problem:
type id
level::Int32
x::Int32
end
Vdict = Dict{id,Float64}()
function getV(w::id)
if haskey(Vdict,w)
return Vdict[w]
end
if w.level == 12
return 1.0
end
w.x == -111 && println("dont have: ",w)
local vv = 0.0
for j = -15:15
local wj = id(w.level+1,w.x+j)
vv += getV(wj)
end
Vdict[w] = vv
w.x == -111 && println("just stored: ",w)
vv
end
getV(id(0,0))
The output has many lines like this:
just stored: id(11,-111)
dont have: id(11,-111)
just stored: id(11,-111)
dont have: id(11,-111)
just stored: id(11,-111)
dont have: id(11,-111)
...
Do I have a silly error, or is there a bug in Julia's dict?
By default, custom types come with implementations of equality and hashing by object identity. Since your id type is mutable, Julia is conservative and assumes that you care about distinguishing each instance from another (since they could potentially diverge):
julia> type Id # There's a strong convention to capitalize type names in Julia
level::Int32
x::Int32
end
julia> x = Id(11, -111)
y = Id(11, -111)
x == y
false
julia> x.level = 12; (x,y)
(Id(12,-111),Id(11,-111))
Julia doesn't know whether you care about the object's long-term behavior or its current value.
There are two ways to make this behave as you'd like:
Make your custom type immutable. It looks like you don't need to mutate the contents of Id. The simplest and most straightforward way to solve this is to define it as immutable Id. Now Id(11, -111) is completely indistinguishable from any other construction of Id(11, -111) since its values can never change. As a bonus, you may see better performance, too.
If you do need to mutate the values, you could alternatively define your own implementations of == and Base.hash so they only care about the current value:
==(a::Id, b::Id) = a.level == b.level && a.x == b.x
Base.hash(a::Id, h::Uint) = hash(a.level, hash(a.x, h))
As #StefanKarpinski just pointed out on the mailing list, this isn't the default for mutable values "since it makes it easy to stick something in a dict, then mutate it, and 'lose it'." That is, the object's hash value has changed but the dictionary stored it in a place based upon its old hash value, and now you can no longer access that key/value pair by key lookup. Even if you create a second object with the same original properties as the first it won't be able to find it since the dictionary checks equality after finding a hash match. The only way to lookup that key is to mutate it back to its original value or explicitly asking the dictionary to Base.rehash! its contents.
In this case, I highly recommend option 1.

Returning multiple values in Ruby, to be used to call a function

Is it possible to return multiple values from a function?
I want to pass the return values into another function, and I wonder if I can avoid having to explode the array into multiple values
My problem?
I am upgrading Capybara for my project, and I realized, thanks to CSS 'contains' selector & upgrade of Capybara, that the statement below will no longer work
has_selector?(:css, "#rightCol:contains(\"#{page_name}\")")
I want to get it working with minimum effort (there are a lot of such cases), So I came up with the idea of using Nokogiri to convert the css to xpath. I wanted to write it so that the above function can become
has_selector? xpath(:css, "#rightCol:contains(\"#{page_name}\")")
But since xpath has to return an array, I need to actually write this
has_selector?(*xpath(:css, "#rightCol:contains(\"#{page_name}\")"))
Is there a way to get the former behavior?
It can be assumed that right now xpath func is like the below, for brevity.
def xpath(*a)
[1, 2]
end
You cannot let a method return multiple values. In order to do what you want, you have to change has_selector?, maybe something like this:
alias old_has_selector? :has_selector?
def has_selector? arg
case arg
when Array then old_has_selector?(*arg)
else old_has_selector?(arg)
end
end
Ruby has limited support for returning multiple values from a function. In particular a returned Array will get "destructured" when assigning to multiple variables:
def foo
[1, 2]
end
a, b = foo
a #=> 1
b #=> 2
However in your case you need the splat (*) to make it clear you're not just passing the array as the first argument.
If you want a cleaner syntax, why not just write your own wrapper:
def has_xpath?(xp)
has_selector?(*xpath(:css, xp))
end

Assert type information onto computed results in Julia

Problem
I read in an array of strings from a file.
julia> file = open("word-pairs.txt");
julia> lines = readlines(file);
But Julia doesn't know that they're strings.
julia> typeof(lines)
Array{Any,1}
Question
Can I tell Julia this somehow?
Is it possible to insert type information onto a computed result?
It would be helpful to know the context where this is an issue, because there might be a better way to express what you need - or there could be a subtle bug somewhere.
Can I tell Julia this somehow?
No, because the readlines function explicitly creates an Any array (a = {}): https://github.com/JuliaLang/julia/blob/master/base/io.jl#L230
Is it possible to insert type information onto a computed result?
You can convert the array:
r = convert(Array{ASCIIString,1}, w)
Or, create your own readstrings function based on the link above, but using ASCIIString[] for the collection array instead of {}.
Isaiah is right about the limits of readlines. More generally, often you can say
n = length(A)::Int
when generic type inference fails but you can guarantee the type in your particular case.
As of 0.3.4:
julia> typeof(lines)
Array{Union(ASCIIString,UTF8String),1}
I just wanted to warn against:
convert(Array{ASCIIString,1}, lines)
that can fail (for non-ASCII) while I guess, in this case nothing needs to be done, this should work:
convert(Array{UTF8String,1}, lines)

How to create a collection in Julia?

This seems like a really basic question, but can't find the answer. How do I create a collection in Julia? For example, I want to open a text file and parse each line to create an (iterable or otherwise) collection. Obviously I don't know how many elements there are in advance.
I can iterate through the lines like this
I = each_line(open(fileName,"r"))
state = start(I)
while !done(I, state)
(i, state) = next(I, state)
println(i)
end
But I don't know how to put each i into an array or other collection. I tried
map( i -> println(i), each_line(open(fileName,"r") ) )
But got the error
no method map(Function,EachLine)
You could do this:
lines = String[]
for line in each_line(open(fileName))
push!(lines, line)
end
And then lines contains the list of lines. You need the String in the first line to make the array extensible.
Standard collections and supported operations are mainly covered in the standard library documentation.
Specifically, the Deques section covers all of the operations supported by the 1d Array type (vector), including push! and pop! as well as insertion, resizing, etc.
Omar's answer is correct, and I will just add a small qualification: String[] creates a 1d array of Strings. The same constructor syntax may be used for example to create Int[], Float[], or even Any[] vectors. The latter type may hold objects of any type.
Depending on your Julia version, you may also be able to write collect(eachline(open("LICENSE.md"))) or [eachline(open("LICENSE.md"))...]. I think these won't work in 0.1.x versions but will working in newer 0.2 development versions (which are recommended at this point – 0.2 is on its way soon).

Resources