How to manipulate named tuples - julia

I like the idea of NamedTuple a lot, as a middle ground between Tuple and full, user-defined composite types.
I know how to build a named tuple and access one of its fields:
julia> nt = (a=1, b=2.0)
(a = 1, b = 2.0)
julia> nt.a
1
However, I don't know much more, and I don't even know whether it is possible to do more than that. I'm thinking of the many ways we can manipulate plain tuples (usually involving splatting), and I wonder whether some of them apply to named tuples as well. For example, how to:
dynamically build a NamedTuple from lists of fields and values
grow a NamedTuple, i.e. add new field-value pairs to it
"update" (in an immutable sense) a field in an existing named tuple

The NamedTupleTools package contains a lot of tools aiming at making the use of NamedTuples more straightforward. But here are a few elementary operations that can be performed on them "manually":
Creation
# regular syntax
julia> nt = (a=1, b=2.)
(a = 1, b = 2.0)
# empty named tuple (useful as a seed that will later grow)
julia> NamedTuple()
NamedTuple()
# only one entry => don't forget the comma
julia> (a=1,)
(a = 1,)
Growth and "modification"
It is possible to merge two named tuples to create a new one:
julia> merge(nt, (c=3, d=4.))
(a = 1, b = 2.0, c = 3, d = 4.0)
...or to re-use an existing NamedTuple by splatting it in the creation of a
new one:
julia> (; nt..., c=3, d=4.)
(a = 1, b = 2.0, c = 3, d = 4.0)
When the same field name appears multiple times, the last occurrence is
kept. This allows for a form of "copy with modification":
julia> nt
(a = 1, b = 2.0)
julia> merge(nt, (b=3,))
(a = 1, b = 3)
julia> (; nt..., b=3)
(a = 1, b = 3)
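The reverse operation, keeping or dropping fields, can be sketched along the same lines. This is a minimal sketch, assuming the `NamedTuple{names}(nt)` selection constructor and the unexported `Base.structdiff` helper behave as documented in recent Julia versions; the example values are made up for illustration.

```julia
nt = (a = 1, b = 2.0, c = "three")

# Keep only the listed fields (the names must already exist in `nt`)
subset = NamedTuple{(:a, :c)}(nt)

# Drop the listed fields; `Base.structdiff` is not exported, so it is qualified
without_b = Base.structdiff(nt, NamedTuple{(:b,)})

println(subset)
println(without_b)
```

As with `merge`, both calls return new named tuples and leave `nt` untouched.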
Dynamic manipulations
Using field=>value pairs in the various techniques presented above allows for
more dynamic manipulations:
julia> field = :c;
julia> merge(nt, [field=>1])
(a = 1, b = 2.0, c = 1)
julia> (; nt..., field=>1)
(a = 1, b = 2.0, c = 1)
The same technique can be used to build NamedTuples from existing dynamic data structures:
julia> dic = Dict(:a=>1, :b=>2);
julia> (; dic...)
(a = 1, b = 2)
julia> arr = [:a=>1, :b=>2];
julia> (; arr...)
(a = 1, b = 2)
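When the field names and the values live in two separate lists, as in the first bullet of the question, they can be paired up before splatting. A minimal sketch, where `fields` and `vals` are hypothetical example data:

```julia
fields = [:a, :b, :c]          # hypothetical list of field names
vals   = [1, 2.0, "three"]     # hypothetical list of values, same length

# Pair names and values lazily, then splat the pairs after the semicolon
nt_zip = (; zip(fields, vals)...)

# Equivalent construction through the type parameter holding the names
nt_ctor = NamedTuple{Tuple(fields)}(Tuple(vals))

println(nt_zip)
println(nt_ctor)
```

Both forms produce the same named tuple; the second one may be convenient when the names are already available as a tuple of symbols.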
Iteration
Iterating on a NamedTuple iterates on its values:
julia> for val in nt
           println(val)
       end
1
2.0
Like all key->value structures, the keys function can be used to iterate over the fields:
julia> for field in keys(nt)
           val = nt[field]
           println("$field => $val")
       end
a => 1
b => 2.0
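Two further iteration patterns may be useful here, sketched under the assumption that `pairs` and `map` behave on named tuples as they do in current Julia releases: `pairs` yields name/value pairs in one pass, and `map` applies a function to every value while keeping the names.

```julia
nt = (a = 1, b = 2.0)

# Iterate over names and values together
lines = ["$k => $v" for (k, v) in pairs(nt)]

# Transform every value; the result is again a named tuple with the same names
doubled = map(x -> 2x, nt)

println(lines)
println(doubled)
```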

Related

Add element to NamedTuple

I have written a function that adds an element to a NamedTuple:
function Base.setindex!(nt::NamedTuple, key::String, value::Any)
    return (; nt..., key=value)
end
nt = (; a=1, b=2)
setindex!(nt, "c", 3)
The issue is though that the added value has the key "key", and not the actual string that key represents as seen below:
(a = 1, b = 2, key = 3)
How can I "evaluate" the key-variable before adding it to the NamedTuple?
This is how I would do it (note that this creates a new NamedTuple and does not update the passed nt as this is not possible):
julia> setindex(nt::NamedTuple, key::AbstractString, value) =
           merge(nt, (Symbol(key) => value,))
setindex (generic function with 2 methods)
julia> setindex((a=1, b=2), "c", 3)
(a = 1, b = 2, c = 3)
julia> setindex((a=1, b=2), "b", 3) # note what happens if you re-use the key that is already present
(a = 1, b = 3)
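The difference between the question's version and this one comes down to `key = value` versus `key => value` after the semicolon. A small sketch of the two forms side by side (the variable names are illustrative only):

```julia
key = "c"

# `key = 3` is taken literally: the field is named `key`
literal = (; a = 1, key = 3)

# A pair is evaluated first, so the field gets the name stored in `key`
dynamic = (; a = 1, Symbol(key) => 3)

println(literal)
println(dynamic)
```

The name has to be a Symbol, which is why the string is converted with `Symbol(key)`.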
You could also try using Accessors. This is a Julia package providing a macro based syntax for working with immutable types. This syntax makes the code more readable.
julia> using Accessors
julia> nt = (;a=1, b=2)
(a = 1, b = 2)
julia> new_nt = @set nt.a = 33
(a = 33, b = 2)
julia> new_nt = @insert new_nt.c = 44
(a = 33, b = 2, c = 44)

Why (; [(:x, 1), (:y, 2)]...) creates a NamedTuple?

I'm still learning Julia, and I recently came across the following code excerpt that flummoxed me:
res = (; [(:x, 10), (:y, 20)]...) # why the semicolon in front?
println(res) # (x = 10, y = 20)
println(typeof(res)) # NamedTuple{(:x, :y), Tuple{Int64, Int64}}
I understand the "splat" operator ..., but what happens when the semicolon appears first in a tuple? In other words, how does putting a semicolon in (; [(:x, 10), (:y, 20)]...) create a NamedTuple? Is this some undocumented feature/trick?
Thanks for any pointers.
Yes, this is actually a documented feature, but perhaps not a very well known one. As the documentation for NamedTuple notes:
help?> NamedTuple
search: NamedTuple @NamedTuple
NamedTuple
NamedTuples are, as their name suggests, named Tuples. That is, they're a tuple-like
collection of values, where each entry has a unique name, represented as a Symbol.
Like Tuples, NamedTuples are immutable; neither the names nor the values can be
modified in place after construction.
Accessing the value associated with a name in a named tuple can be done using field
access syntax, e.g. x.a, or using getindex, e.g. x[:a]. A tuple of the names can be
obtained using keys, and a tuple of the values can be obtained using values.
[... some other non-relevant parts of the documentation omitted ...]
In a similar fashion as to how one can define keyword arguments programmatically, a
named tuple can be created by giving a pair name::Symbol => value or splatting an
iterator yielding such pairs after a semicolon inside a tuple literal:
julia> (; :a => 1)
(a = 1,)
julia> keys = (:a, :b, :c); values = (1, 2, 3);
julia> (; zip(keys, values)...)
(a = 1, b = 2, c = 3)
As in keyword arguments, identifiers and dot expressions imply names:
julia> x = 0
0
julia> t = (; x)
(x = 0,)
julia> (; t.x)
(x = 0,)
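The documentation speaks of pairs, while the snippet in the question splats `(name, value)` tuples. As the question's own output shows, both forms are accepted; a short sketch comparing them:

```julia
# Splatting a vector of (name, value) tuples, as in the question
from_tuples = (; [(:x, 10), (:y, 20)]...)

# Splatting a vector of name => value pairs, as in the documentation
from_pairs = (; [:x => 10, :y => 20]...)

println(from_tuples)
println(from_pairs)
```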

How to promote named tuple fields to variable and associated value?

I have a model with many parameters where I am passing them as a named tuple. Is there a way to promote the values into the variable scope in my function?
parameters = (
    τ₁ = 0.035,
    β₁ = 0.00509,
    θ = 1,
    τ₂ = 0.01,
    β₂ = 0.02685,
    ...
)
And then used like so currently:
function model(init,params) # params would be the parameters above
    foo = params.β₁ ^ params.θ
end
Is there a way (macro?) to get the parameters into my variable scope directly so that I can do this:
function model(init,params) # params would be the parameters above
    @promote params # hypothetical macro to bring each named tuple field into scope
    foo = β₁ ^ θ
end
The latter looks a lot nicer with some math-heavy code.
You can use @unpack from the UnPack.jl package [1]:
julia> nt = (a = 1, b = 2, c = 3);
julia> @unpack a, c = nt; # selectively unpack a and c
julia> a
1
julia> c
3
[1] This was formerly part of the Parameters.jl package, which still exports @unpack and has other similar functionality you might find useful.
Edit: As noted in the comments, writing a general macro @unpack x is not possible, since the field names are runtime information. You could, however, define a macro specific to your own type or named tuple that unpacks a fixed set of fields:
julia> macro myunpack(x)
           return esc(quote
               a = $(x).a
               b = $(x).b
               c = $(x).c
               nothing
           end)
       end;
julia> nt = (a = 1, b = 2, c = 3);
julia> @myunpack nt
julia> a, b, c
(1, 2, 3)
However, I think it is clearer to use @unpack, since this version "hides" the assignments and it is not clear where the variables a, b and c come from when reading the code.
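A package-free alternative may also be worth considering: property destructuring, which binds local variables named after the listed fields. This is a sketch assuming Julia 1.7 or later, where the `(; a, b) = x` syntax is available; the parameter values are a subset of those in the question.

```julia
# Assumes Julia 1.7+ for the `(; name1, name2) = x` destructuring syntax
params = (τ₁ = 0.035, β₁ = 0.00509, θ = 1)

function model(init, params)
    (; β₁, θ) = params   # binds β₁ and θ from the fields of the same name
    return β₁ ^ θ
end

foo = model(nothing, params)
println(foo)
```

Like @unpack, this keeps the list of unpacked names visible at the point of use.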

unexpected behavior of dict.keys on in

For faster collision checks, I create a Dict, used as a sparse array, from an array of objects of type T2.
But this code throws an exception:
KeyError: key [142, 69, 77] not found
Here is my code:
Ans=Dict{T1,Vector{T2}}()
for i in L
    pos=pos_func(i)
    if (pos in Ans.keys)
        push!(Ans[pos],i)
    else
        Ans[pos]=Vector{T2}([i])
    end
end
I caught the exception and printed pos, Ans.keys and (pos in Ans.keys). I found that pos is one of Ans.keys and that (pos in Ans.keys) == true. Even so, I cannot get Ans[pos].
Julia Version 1.4.0 Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
What is the reason for this behaviour? Why does the same code work only about half of the time?
You should use the keys() function in order to get the keys of your dictionary instead of accessing the keys field. (Note that, in most cases, it is not a good idea to access internal fields of Julia objects, especially when accessor methods exist).
And in the particular case of testing whether a given key appears in a Dict, using haskey() would be even more idiomatic.
The following should work:
# Some definitions so that your example is runnable
julia> T1 = Int;
julia> T2 = Int;
julia> L = 1:10;
julia> pos_func(i) = i%3;
julia> Ans=Dict{T1,Vector{T2}}()
Dict{Int64,Array{Int64,1}} with 0 entries
julia> for i in L
           pos = pos_func(i)
           if haskey(Ans, pos)  # <- haskey(Ans, pos) instead of pos in Ans.keys
               push!(Ans[pos], i)
           else
               Ans[pos] = T2[i]  # or maybe simply [i], unless your collection L is heterogeneous
           end
       end
julia> Ans
Dict{Int64,Array{Int64,1}} with 3 entries:
0 => [3, 6, 9]
2 => [2, 5, 8]
1 => [1, 4, 7, 10]
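The "look up, or create if missing" branch can also be folded into a single call with `get!`. A minimal sketch, reusing the stand-in definitions from the example above (`Int` keys and values, `i % 3` as the position function); `group_by` is a hypothetical helper name:

```julia
# Hypothetical helper: group items by the key computed with `f`
function group_by(f, items)
    groups = Dict{Int,Vector{Int}}()
    for item in items
        # get! returns the stored vector, inserting an empty one first if needed
        push!(get!(() -> Int[], groups, f(item)), item)
    end
    return groups
end

groups = group_by(i -> i % 3, 1:10)
println(groups)
```

This performs one dictionary lookup per item and never touches the internal fields of the Dict.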

List comprehensions and tuples in Julia

I am trying to do in Julia what this Python code does. (Find all pairs from the two lists whose combined value is above 7.)
#Python
def sum_is_large(a, b):
    return a + b > 7
l1 = [1,2,3]
l2 = [4,5,6]
l3 = [(a,b) for a in l1 for b in l2 if sum_is_large(a, b)]
print(l3)
There is no if for list comprehensions in Julia. And if I use filter(), I'm not sure if I can pass two arguments. So my best suggestion is this:
#Julia
function sum_is_large(pair)
    a, b = pair
    return a + b > 7
end
l1 = [1,2,3]
l2 = [4,5,6]
l3 = filter(sum_is_large, [(i,j) for i in l1, j in l2])
print(l3)
I don't find this very appealing. So my question is, is there a better way in Julia?
Using the very popular package Iterators.jl, in Julia:
using Iterators # install using Pkg.add("Iterators")
filter(x->sum(x)>7,product(l1,l2))
is an iterator producing the pairs. So to get the same printout as the OP:
l3iter = filter(x->sum(x)>7,product(l1,l2))
for p in l3iter println(p); end
The iterator approach is potentially much more memory efficient. Of course, one could just call l3 = collect(l3iter) to get the pair vector.
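In current Julia versions the same lazy pipeline can be written without an extra package, since `Iterators.filter` and `Iterators.product` are part of Base. A sketch of that variant, offered as an assumption about what the answer would look like today rather than as the original author's code:

```julia
l1 = [1, 2, 3]
l2 = [4, 5, 6]

# Lazy: no intermediate array of all nine pairs is allocated
l3iter = Iterators.filter(x -> sum(x) > 7, Iterators.product(l1, l2))

# Materialize only if a vector is actually needed
l3 = collect(l3iter)
println(l3)
```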
@user2317519, just curious, is there an equivalent iterator form for python?
Guards (if) are now available in Julia v0.5 (currently in the release-candidate stage):
julia> v1 = [1, 2, 3];
julia> v2 = [4, 5, 6];
julia> v3 = [(a, b) for a in v1, b in v2 if a+b > 7]
3-element Array{Tuple{Int64,Int64},1}:
(3,5)
(2,6)
(3,6)
Note that generators are also now available:
julia> g = ( (a, b) for a in v1, b in v2 if a+b > 7 )
Base.Generator{Filter{##18#20,Base.Prod2{Array{Int64,1},Array{Int64,1}}},##17#19}(#17,Filter{##18#20,Base.Prod2{Array{Int64,1},Array{Int64,1}}}(#18,Base.Prod2{Array{Int64,1},Array{Int64,1}}([1,2,3],[4,5,6])))
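One detail worth spelling out: with the comma form the first variable varies fastest, whereas chaining two `for` clauses nests them as in Python. A short sketch comparing the two orders and materializing the generator:

```julia
v1 = [1, 2, 3]
v2 = [4, 5, 6]

# Comma form: `a` varies fastest, so pairs come out grouped by `b`
g = ((a, b) for a in v1, b in v2 if a + b > 7)
comma_order = collect(g)

# Two `for` clauses: outer loop over `a`, inner over `b`, like the Python code
python_order = [(a, b) for a in v1 for b in v2 if a + b > 7]

println(comma_order)
println(python_order)
```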
Another option, similar to the one from @DanGetz and also using Iterators.jl:
function expensive_fun(a, b)
    return (a + b)
end
Then, if the condition is also complicated, it can be defined as a function:
condition(x) = x > 7
And last, filter the results:
>>> using Iterators
>>> result = filter(condition, imap(expensive_fun, l1, l2))
result is an iterable that is only computed when needed (inexpensive) and can be collected collect(result) if required.
If the filter condition is simple enough, a one-liner would be:
>>> result = filter(x->(x > 7), imap(expensive_fun, l1, l2))
Note: imap works natively for arbitrary number of parameters.
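A package-free sketch of the same idea, assuming that `imap` pairs its arguments positionally the way `map` does (so it walks `l1` and `l2` in lockstep rather than over all combinations). The generator plays the role of `imap` here:

```julia
expensive_fun(a, b) = a + b
condition(x) = x > 7

l1 = [1, 2, 3]
l2 = [4, 5, 6]

# Lazy element-wise map over the zipped inputs, then a lazy filter
mapped = (expensive_fun(a, b) for (a, b) in zip(l1, l2))
result = Iterators.filter(condition, mapped)

# Nothing is computed until the iterator is consumed
collected = collect(result)
println(collected)
```

Under that assumption, only positionally matched elements are combined, which differs from the all-pairs result asked for in the question.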
Perhaps something like this:
julia> filter(pair -> pair[1] + pair[2] > 7, [(i, j) for i in l1, j in l2])
3-element Array{Tuple{Any,Any},1}:
(3,5)
(2,6)
(3,6)
although I'd agree it doesn't look like it ought to be the best way...
I'm surprised nobody mentions the ternary operator to implement the conditional:
julia> l3 = [sum_is_large((i,j)) ? (i,j) : nothing for i in l1, j in l2]
3x3 Array{Tuple,2}:
nothing nothing nothing
nothing nothing (2,6)
nothing (3,5) (3,6)
or even just a normal if block within a compound statement, i.e.
[ (if sum_is_large((x,y)); (x,y); end) for x in l1, y in l2 ]
which gives the same result.
I feel this result makes a lot more sense than filter(), because in julia the a in A, b in B construct is interpreted dimensionally, and therefore the output is in fact an "array comprehension" with appropriate dimensionality, which clearly in many cases would be advantageous and presumably the desired behaviour (whether we include a conditional or not).
Whereas filter will always return a vector. Obviously, if you really want a vector result you can always collect the result; or for a conditional list comprehension like the one here, you can simply remove nothing elements from the array by doing l3 = l3[l3 .!= nothing].
Presumably this is still clearer and no less efficient than the filter() approach.
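For reference, a small sketch of the two steps described above: building the two-dimensional result with `nothing` placeholders, then flattening it. It uses `filter(!isnothing, ...)` as an alternative to the broadcasted comparison, assuming a Julia version that provides `isnothing` (1.1 or later).

```julia
sum_is_large(pair) = pair[1] + pair[2] > 7

l1 = [1, 2, 3]
l2 = [4, 5, 6]

# 3x3 matrix: a pair where the condition holds, `nothing` elsewhere
grid = [sum_is_large((i, j)) ? (i, j) : nothing for i in l1, j in l2]

# Drop the placeholders; the result is a flat vector in column-major order
pairs_only = filter(!isnothing, grid)

println(size(grid))
println(pairs_only)
```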
You can use the @vcomp (vector comprehension) macro in VectorizedRoutines.jl to do Python-like comprehensions:
using VectorizedRoutines
Python.@vcomp Int[i^2 for i in 1:10] when i % 2 == 0 # Int[4, 16, 36, 64, 100]

Resources