Vectorized indexing for a dictionary - julia

I would like to find a succinct syntax in Julia for indexing a dictionary in a vectorized manner. In R, I would do the following:
dict <- c("a" = 1, "b" = 2)
keys <- c("a", "a", "b", "b", "a")
dict[keys]
In Julia, if I have a dict and keys like this,
dict = Dict(:a => 1, :b => 2)
keys = [:a, :a, :b, :b, :a]
then I can achieve the desired result using a list comprehension:
julia> [dict[key] for key in keys]
5-element Array{Int64,1}:
1
1
2
2
1
Is there a more succinct vectorized syntax, similar to the R syntax?

getindex.(Ref(dict), keys)
You can wrap it in Ref so you don't need to [] it.

Here's a little macro (using the brilliant MacroTools package):
using MacroTools
macro vget(ex)
#capture(ex, dict_.[idxvec_])
:(map(i->$dict[i], $idxvec)) |> esc
end
Then you can:
d = Dict(:a => 1, :b => 2)
ks = [:a, :a, :b, :b, :a]
#vget d.[ks]
I probably wouldn't want to use that in production code, something like your list comprehension, or a simple map: map(i->d[i], ks), is more readable/explicit/standard, but it is fun :D

If you're going to be frequently using the dictionary as a lookup table, then it might be worth creating a closure to use as a lookup function:
make_lookup(dict) = key -> dict[key]
dict = Dict(:a => 1, :b => 2)
lookup = make_lookup(dict)
Then you can use lookup in a vectorized fashion:
julia> keys = [:a, :a, :b, :b, :a];
julia> lookup.(keys)
5-element Array{Int64,1}:
1
1
2
2
1

You can use the vectorized version of getindex:
julia> getindex.([dict], keys)
5-element Array{Int64,1}:
1
1
2
2
1
Note that dict is wrapped in an array so that getindex does not attempt to broadcast over the elements of the dictionary:
julia> getindex.(dict, keys)
ERROR: ArgumentError: broadcasting over dictionaries and `NamedTuple`s is reserved
Stacktrace:
[1] broadcastable(::Dict{Symbol,Int64}) at ./broadcast.jl:615
[2] broadcasted(::Function, ::Dict{Symbol,Int64}, ::Array{Symbol,1}) at ./broadcast.jl:1164
[3] top-level scope at none:0

Related

Filtering a dictionary in julia

I want to filter a dictionary using filter() function but I am having trouble with it. What I wish to accomplish is, to return the key for some condition of the value. However I am getting a method error
using Agents: AbstractAgent
# Define types
mutable struct Casualty <: AbstractAgent
id::Int
ts::Int
rescued::Bool
function Casualty(id,ts; rescued = false)
new(id,ts,rescued)
end
end
mutable struct Rescuer <: AbstractAgent
id::Int
actions::Int
dist::Float64
function Rescuer(id; action = rand(1:3) , dist = rand(1)[1])
new(id,action,dist)
end
end
cas1 = Casualty(1,2)
cas2 = Casualty(2,3)
resc1 = Rescuer(3)
agents = Dict(1=> cas1, 2 => cas2, 3 => resc1)
Now to filter
filter((k,v) -> v isa Casualty, agents)
# ERROR: MethodError: no method matching (::var"#22#23")(::Pair{Int64, AbstractAgent})
# what I truly wish to achieve is return the key for some condition of the value
filter((k,v) -> k ? v isa Casualty : "pass", agents)
# ofcourse I am not sure how to "pass" using this format
Any idea how I can achieve this. Thanks
For dictionaries filter gets a key-value pair, so do either (destructuring Pair):
julia> dict = Dict(1=>"a", 2=>"b", 3=>"c")
Dict{Int64, String} with 3 entries:
2 => "b"
3 => "c"
1 => "a"
julia> filter(((k,v),) -> k == 1 || v == "c", dict)
Dict{Int64, String} with 2 entries:
3 => "c"
1 => "a"
or for example (getting Pair as a whole):
julia> filter(p -> first(p) == 1 || last(p) == "c", dict)
Dict{Int64, String} with 2 entries:
3 => "c"
1 => "a"
julia> filter(p -> p[1] == 1 || p[2] == "c", dict)
Dict{Int64, String} with 2 entries:
3 => "c"
1 => "a"
EDIT
Explanation why additional parentheses are needed:
julia> f = (x, y) -> (x, y)
#1 (generic function with 1 method)
julia> g = ((x, y),) -> (x, y)
#3 (generic function with 1 method)
julia> methods(f)
# 1 method for anonymous function "#1":
[1] (::var"#1#2")(x, y) in Main at REPL[1]:1
julia> methods(g)
# 1 method for anonymous function "#3":
[1] (::var"#3#4")(::Any) in Main at REPL[2]:1
julia> f(1, 2)
(1, 2)
julia> f((1, 2))
ERROR: MethodError: no method matching (::var"#1#2")(::Tuple{Int64, Int64})
Closest candidates are:
(::var"#1#2")(::Any, ::Any) at REPL[1]:1
julia> g(1, 2)
ERROR: MethodError: no method matching (::var"#3#4")(::Int64, ::Int64)
Closest candidates are:
(::var"#3#4")(::Any) at REPL[2]:1
julia> g((1, 2))
(1, 2)
As you can see f takes 2 positional argument, while g takes one positional argument that gets destructured (i.e. the assumption is that argument passed to g is iterable and has at least 2 elements).
See also https://docs.julialang.org/en/v1/manual/functions/#Argument-destructuring.
Now comes the tricky part:
julia> h1((x, y)) = (x, y)
h1 (generic function with 1 method)
julia> methods(h1)
# 1 method for generic function "h1":
[1] h1(::Any) in Main at REPL[1]:1
julia> h2 = ((x, y)) -> (x, y)
#1 (generic function with 1 method)
julia> methods(h2)
# 1 method for anonymous function "#1":
[1] (::var"#1#2")(x, y) in Main at REPL[3]:1
In this example h1 is a named function. In this case it is enough to just wrap arguments in extra parentheses to get destructuring behavior. For anonymous functions, because of how Julia parser works an extra , is needed - if you omit it the extra parentheses are ignored.
Now let us check filter docstring:
filter(f, d::AbstractDict)
Return a copy of d, removing elements for which f is false.
The function f is passed key=>value pairs.
As you can see from this docstring f is passed a single argument that is Pair. That is why you need to use either destructuring or define a single argument function and extract its elements inside the function.
The right syntax is:
filter(((k,v),) -> v isa Casualty, agents)
which prints
julia> filter(((k,v),) -> v isa Casualty, agents)
Dict{Int64, AbstractAgent} with 2 entries:
2 => Casualty(2, 3, false)
1 => Casualty(1, 2, false)
About the problem of only getting involved keys... I have no idea beside:
julia> filter(((k,v),) -> v isa Casualty, agents) |> keys
which prints
julia> filter(((k,v),) -> v isa Casualty, agents) |> keys
KeySet for a Dict{Int64, AbstractAgent} with 2 entries. Keys:
2
1

Dictionary element-wise operations in Julia

I would like to broadcast an operation to all values of a dictionary. For an array, I know I can broadcast an element-wise operation using:
julia> b1 = [1, 2, 3]
julia> b1./2
3-element Array{Float64,1}:
0.5
1.0
1.5
What is an efficient way of broadcasting the same operation to all values of a dictionary? Say, for the dictionary
a1 = Dict("A"=>1, "B"=>2)
An iteration protocol is defined for both keys and values of dictionaries, so you can just do, eg:
julia> d = Dict("a"=>1, "b"=>2)
Dict{String,Int64} with 2 entries:
"b" => 2
"a" => 1
julia> values(d).^2
2-element Array{Int64,1}:
4
1
If you want to alter the dictionary in-place, use map!, eg:
julia> map!(x->x^2, values(d))
Base.ValueIterator for a Dict{String,Int64} with 2 entries. Values:
4
1
julia> d
Dict{String,Int64} with 2 entries:
"b" => 4
"a" => 1
However, your function must output a type that can be converted back to the dictionary value type. In my example, I'm squaring Int which yields Int. However, in the question you are dividing by 2, which obviously yields Float64. If the float cannot be converted back to an integer, then you'll get an error.
Note, you can broadcast over keys also, e.g.:
julia> f(x) = "hello mr $(x)"
f (generic function with 1 method)
julia> f.(keys(d))
2-element Array{String,1}:
"hello mr b"
"hello mr a"
but this can not be done in-place, i.e. you can't use map! on keys.
Importantly, note that you should not instantiate the collection. Indeed, this would be inefficient. So avoid constructs like: collect(values(d)) ./ 2.

What is the difference between fields and properties in Julia?

Julia has the setter functions setproperty! and setfield! and the getter functions getproperty and getfield that operate on structs. What is the difference between properties and fields in Julia?
For example, the following seems to indicate that they do the same thing:
julia> mutable struct S
a
end
julia> s = S(2)
S(2)
julia> getfield(s, :a)
2
julia> getproperty(s, :a)
2
julia> setfield!(s, :a, 3)
3
julia> s
S(3)
julia> setproperty!(s, :a, 4)
4
julia> s
S(4)
fields are simply the "components" of a struct. The struct
struct A
b
c::Int
end
has the fields b and c. A call to getfield returns the object that is bound to the field:
julia> a = A("foo", 3)
A("foo", 3)
julia> getfield(a, :b)
"foo"
In early versions of Julia, the syntax a.b used to "lower", i.e. be the same as, writing getfield(a, :b). What has changed now is that a.b lowers to getproperty(a, :b) with the default fallback
getproperty(a::Type, v::Symbol) = getfield(a, v)
So by default, nothing has changed. However, authors of structs can overload getproperty (it is not possible to overload getfield) to provide extra functionality to the dot-syntax:
julia> function Base.getproperty(a::A, v::Symbol)
if v == :c
return getfield(a, :c) * 2
elseif v == :q
return "q"
else
return getfield(a, v)
end
end
julia> a.q
"q"
julia> getfield(a, :q)
ERROR: type A has no field q
julia> a.c
6
julia> getfield(a, :c)
3
julia> a.b
"foo"
So we can add extra functionality to the dot syntax (dynamically if we want). As a concrete example where this is useful is for the package PyCall.jl where you used to have to write pyobject[:field] while it is possible now to implement it such that you can write pyobject.field.
The difference between setfield! and setproperty! is analogous to the difference between getfield and getproperty, explained above.
In addition, it is possible to hook into the function Base.propertynames to provide tab completion of properties in the REPL. By default, only the field names will be shown:
julia> a.<TAB><TAB>
b c
But by overloading propertynames we can make it also show the extra property q:
julia> Base.propertynames(::A) = (:b, :c, :q)
julia> a.<TAB><TAB>
b c q

Unpack dict entries inside a function

I want to unpack parameters that are stored in a dictionary. They should be available inside the local scope of a function afterwards. The name should be the same as the key which is a symbol.
macro unpack_dict()
code = :()
for (k,v) in dict
ex = :($k = $v)
code = quote
$code
$ex
end
end
return esc(code)
end
function assign_parameters(dict::Dict{Symbol, T}) where T<:Any
#unpack_dict
return a + b - c
end
dict = Dict(:a => 1,
:b => 5,
:c => 6)
assign_parameters(dict)
However, this code throws:
LoadError: UndefVarError: dict not defined
If I define the dictionary before the macro it works because the dictionary is defined.
Does someone has an idea how to solve this? Using eval() works but is evaluated in the global scope what I want to avoid.
If you want to unpack them then the best method is to simply unpack them directly:
function actual_fun(d)
a = d[:a]
b = d[:b]
c = d[:c]
a+b+c
end
This will be type stable, relatively fast and readable.
You could, for instance, do something like this (I present you two options to avoid direct assignment to a, b, and c variables):
called_fun(d) = helper(;d...)
helper(;kw...) = actual_fun(;values(kw)...)
actual_fun(;a,b,c, kw...) = a+b+c
function called_fun2(d::Dict{T,S}) where {T,S}
actual_fun(;NamedTuple{Tuple(keys(d)), NTuple{length(d), S}}(values(d))...)
end
and now you can write something like:
julia> d = Dict(:a=>1, :b=>2, :c=>3, :d=>4)
Dict{Symbol,Int64} with 4 entries:
:a => 1
:b => 2
:d => 4
:c => 3
julia> called_fun(d)
6
julia> called_fun2(d)
6
But I would not recommend it - it is not type stable and not very readable.
AFACT other possibilities will have similar shortcomings as during compile time Julia knows only types of variables not their values.
EDIT: You can do something like this:
function unpack_dict(dict)
ex = :()
for (k,v) in dict
ex = :($ex; $k = $v)
end
return :(myfun() = ($ex; a+b+c))
end
runner(d) = eval(unpack_dict(d))
and then run:
julia> d = Dict(:a=>1, :b=>2, :c=>3, :d=>4)
Dict{Symbol,Int64} with 4 entries:
:a => 1
:b => 2
:d => 4
:c => 3
julia> runner(d)
myfun (generic function with 1 method)
julia> myfun()
6
but again - I feel this is a bit messy.
In Julia 1.7, you can simply unpack named tuples into the local scope, and you can easily "spread" a dict into a named tuple.
julia> dict = Dict(:a => 1, :b => 5, :c => 6)
Dict{Symbol, Int64} with 3 entries:
:a => 1
:b => 5
:c => 6
julia> (; a, b, c) = (; sort(dict)...)
(a = 1, b = 5, c = 6)
julia> a, b, c
(1, 5, 6)
(The dict keys are sorted so that the named tuple produced is type-stable; if the keys were produced in arbitrary order then this would result in a named tuple with fields in arbitrary order as well.)

Extracting parameter types in Julia

Suppose I write a function in Julia that takes a Dict{K,V} as an argument, then creates arrays of type Array{K,1} and Array{V,1}. How can I extract the types K and V from the Dict object so that I can use them to create the arrays?
Sven and John's answers are both quite right. If you don't want to introduce method type parameters the way John's code does, you can use the eltype function:
julia> d = ["foo"=>1, "bar"=>2]
["foo"=>1,"bar"=>2]
julia> eltype(d)
(ASCIIString,Int64)
julia> eltype(d)[1]
ASCIIString (constructor with 1 method)
julia> eltype(d)[2]
Int64
julia> eltype(keys(d))
ASCIIString (constructor with 1 method)
julia> eltype(values(d))
Int64
As you can see, there are a few ways to skin this cat, but I think that eltype(keys(d)) and eltype(values(d)) are by far the clearest and since the keys and values functions just return immutable view objects, the compiler is clever enough that this doesn't actually create any objects.
If you're writing a function that will do this for you, you can make the types a parameter of the function, which may save you some run-time lookups:
julia> function foo{K, V}(d::Dict{K, V}, n::Integer = 0)
keyarray = Array(K, n)
valarray = Array(V, n)
# MAGIC HAPPENS
return keyarray, valarray
end
foo (generic function with 2 methods)
julia> x, y = foo(["a" => 2, "b" => 3])
([],[])
julia> typeof(x)
Array{ASCIIString,1}
julia> typeof(y)
Array{Int64,1}
You can use keys and values in combination with typeof:
# an example Dict{K, V}
d = Dict{Int64, ASCIIString}()
# K
typeof(keys(d))
Array{Int64,1}
# V
typeof(values(d))
Array{ASCIIString,1}
If you are just interested in the types, you can use eltype(d), or define even more specific functions
keytype{K}(d::Dict{K}) = K
valuetype{K,V}(d::Dict{K,V}) = V
and find the type immediately through
keytype(d)
valuetype(d)
As far as I understand, this should be pretty efficient because the compiler can deduce most of this at compile time.
For iterables, the eltype approach that is already given in other answers works well enough. E.g.
julia> d = Dict(:a => 1, :b => 2)
Dict{Symbol, Int64} with 2 entries:
:a => 1
:b => 2
julia> K, V = eltype(keys(d)), eltype(values(d))
(Symbol, Int64)
julia> K
Symbol
julia> V
Int64
This approach is generalisable to all iterables. However I would like to offer another approach which will generalise to other kinds of object (and still doesn't use the method type signature approach):
julia> d = Dict(:a => 1, :b => 2)
Dict{Symbol, Int64} with 2 entries:
:a => 1
:b => 2
julia> typeof(d)
Dict{Symbol, Int64}
julia> K, V = typeof(d).parameters
svec(Symbol, Int64)
julia> K
Symbol
julia> V
Int64
You can see that the Dict{Symbol, Int64} type has a field parameters which you can unpack to obtain the type parameters. This works on structs too:
julia> struct MyStruct{S, T}
a::S
b::T
end
julia> x = MyStruct(1, 1.0)
MyStruct{Int64, Float64}(1, 1.0)
julia> S, T = typeof(x).parameters
svec(Int64, Float64)
julia> S
Int64
julia> T
Float64

Resources