Julia dictionary keys curiosity

Consider the following code:
dog = (a=5, b=6, c=7)
frog = Dict(pairs(dog))
frog.keys
returns:
16-element Vector{Symbol}:
:a
:b
#undef
#undef
#undef
#undef
#undef
#undef
#undef
:c
#undef
#undef
#undef
#undef
#undef
#undef
Now, I am well aware that to get the keys of a dictionary the standard way is keys(frog), which does do the right thing, but what is the keys attribute of a dictionary, and why is it so weird?

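Because Dict is implemented as a hash table, and frog.keys is one of its internal storage fields rather than part of the public API. Each key's slot is determined by its hash value modulo the table size, which is 16 in this case: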
julia> (((hash(:a) % Int) & (16-1)) + 1)
1
julia> (((hash(:b) % Int) & (16-1)) + 1)
2
julia> (((hash(:c) % Int) & (16-1)) + 1)
10
There is a special case though:
julia> dog = (a=5, b=6, f=7)
(a = 5, b = 6, f = 7)
julia> frog = Dict(pairs(dog))
Dict{Symbol, Int64} with 3 entries:
:a => 5
:b => 6
:f => 7
julia> frog.keys
16-element Vector{Symbol}:
:a
:b
:f
#undef
#undef
#undef
#undef
#undef
#undef
#undef
#undef
#undef
#undef
#undef
#undef
#undef
julia> (((hash(:f) % Int) & (16-1)) + 1)
1
As you can see, :a and :f now collide at index 1, so a new index has to be computed for :f; in the output above it ends up in slot 3, the next free slot after :a and :b (you can see the details of how this is done in the code that Ashlin Harris linked in the comment to your question).
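To make the collision handling concrete, here is a minimal sketch of an open-addressing table with linear probing. It is a toy model, not the actual Base.Dict code: toy_insert! is a hypothetical name, the table is assumed to have a power-of-two size and never to fill up, and the exact slot numbers printed depend on the hash values of your Julia version.

```julia
# Toy open-addressing table with linear probing. This is an illustrative
# sketch, not the actual Base.Dict code; `toy_insert!` is a hypothetical name.
function toy_insert!(slots::Vector{Union{Nothing,Symbol}}, key::Symbol)
    n = length(slots)                        # assumed to be a power of two
    idx = ((hash(key) % Int) & (n - 1)) + 1  # same formula as in the answer
    # Walk forward (wrapping around) until we find the key or a free slot.
    # Assumes the table is never completely full.
    while slots[idx] !== nothing && slots[idx] !== key
        idx = (idx & (n - 1)) + 1
    end
    slots[idx] = key
    return idx
end

slots = Vector{Union{Nothing,Symbol}}(nothing, 16)
for k in (:a, :b, :f)
    println(k, " stored in slot ", toy_insert!(slots, k))
end
```

If two keys hash to the same starting index, the second one simply moves on to the next free slot, which matches the layout of frog.keys shown above.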

Related

Destructure and process vector of hash-maps in Clojure

I have a vector of hash-maps, like this:
(def my-maps [{:a 1} {:b 2}])
I want to loop over each hash-map, give the key and value a more meaningful name within the loop, then process each hash-map differently depending on its key.
Without further ado, here is my best attempt:
(for [m my-maps]
(let [my-key-name (key m) my-val-name (val m)]
(case my-key-name
:a (println "Found key :a with value " my-val-name)
:b (println "Found key :b with value " my-val-name))))
This approach, however, produces a rather cryptic error:
; Error printing return value (ClassCastException) at clojure.core/key (core.clj:1569).
; class clojure.lang.PersistentArrayMap cannot be cast to class java.util.Map$Entry (clojure.lang.PersistentArrayMap is in unnamed module of loader 'app'; java.util.Map$Entry is in module java.base of loader 'bootstrap')
What am I doing wrong?
You can destructure inside for (or use doseq):
(for [[[k v] & _] [{:a 1} {:b 2}]]
(println "Found key" k "with value" v))
Found key :a with value 1
Found key :b with value 2
=> (nil nil)
For the sake of clarity, here is a more general answer broken down into individual steps:
(let [my-maps [{:a 1} {:b 2 :c 3}]]
(doseq [curr-map my-maps]
(newline)
(println "curr-map=" curr-map)
(let [map-entries (seq curr-map)]
(println "map-entries=" map-entries)
(doseq [curr-me map-entries]
(let [[k v] curr-me]
(println " curr-me=" curr-me " k=" k " v=" v))))))
With result
curr-map= {:a 1}
map-entries= ([:a 1])
curr-me= [:a 1] k= :a v= 1
curr-map= {:b 2, :c 3}
map-entries= ([:b 2] [:c 3])
curr-me= [:b 2] k= :b v= 2
curr-me= [:c 3] k= :c v= 3
A MapEntry object in Clojure can be treated as either a 2-element vector (accessed via first & second) or as a MapEntry accessed via the key and val functions. The destructuring form:
(let [[k v] curr-me]
treats the MapEntry object curr-me as a sequence and pulls out the first 2 elements into k and v. Even though it prints like a vector (e.g. [:a 1]), its type is clojure.lang.MapEntry.
The destructuring syntax & _ in the for expression of the original answer is a "rest args" destructuring. It causes the sequence of all MapEntry objects after the first one to be assigned to the variable _, which is then ignored in the rest of the code.

Julia convert NamedTuple to Dict

I would like to convert a NamedTuple to a Dict in Julia. Say I have the following NamedTuple:
julia> namedTuple = (a=1, b=2, c=3)
(a = 1, b = 2, c = 3)
I want the following:
julia> Dict(zip(keys(namedTuple), namedTuple))
Dict{Symbol, Int64} with 3 entries:
:a => 1
:b => 2
:c => 3
This works, however I would've hoped for a somewhat simpler solution - something like
julia> Dict(namedTuple)
ERROR: ArgumentError: Dict(kv): kv needs to be an iterator of tuples or pairs
would have been nice. Is there such a solution?
The simplest way to get an iterator of keys and values for any key-value collection is pairs:
julia> Dict(pairs(namedTuple))
Dict{Symbol, Int64} with 3 entries:
:a => 1
:b => 2
:c => 3
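As a small sketch of how general pairs is, the snippet below applies it to a NamedTuple and to a plain vector, and then splats the Dict back into a NamedTuple. The variable names are illustrative only, and the field order of the round-tripped NamedTuple follows the Dict's iteration order, which is not guaranteed.

```julia
nt = (a = 1, b = 2, c = 3)

# NamedTuple -> Dict via pairs (keys become Symbols)
d = Dict(pairs(nt))

# pairs also works on arrays, yielding index => value
byindex = Dict(pairs([10, 20, 30]))

# Dict with Symbol keys -> NamedTuple via keyword splatting;
# the field order follows the Dict's iteration order
back = (; d...)

println(d[:b])        # value stored under :b
println(byindex[3])   # value stored at index 3
println(back.c)       # field c of the round-tripped NamedTuple
```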

How can I distinguish arguments and local variables in slot of Julia codeinfo?

I am studying Julia static analysis, and I have the following function:
function f(x,y,z)
d=x+y
d=d*2*z
end
I use code_typed to analyze it:
julia> y=code_typed(f)
1-element Vector{Any}:
CodeInfo(
1 ─ %1 = (x + y)::Any
│ %2 = (%1 * 2)::Any
│ %3 = (%2 * z)::Any
└── return %3
) => Any
I can get its slots and slot types:
julia> y[1].first.slotnames
5-element Vector{Symbol}:
Symbol("#self#")
:x
:y
:z
:d
julia> y[1].first.slottypes
5-element Vector{Any}:
Core.Const(f)
Any
Any
Any
Any
But is there any way to tell which of the slots are arguments and which are local variables?
You can use Base.method_argnames (an internal, non-exported function) to find the arguments of your function:
julia> Base.method_argnames.(methods(f))
1-element Vector{Vector{Symbol}}:
[Symbol("#self#"), :x, :y, :z]
You can extract this from the CodeInfo object as well:
julia> Base.method_argnames(y[1].first.parent.def)
4-element Vector{Symbol}:
Symbol("#self#")
:x
:y
:z
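Building on that, one way to split the slots is to rely on the convention that argument slots come first, so anything after the first nargs entries is a local variable. This is a sketch under that assumption; it reads the nargs field of the Method (which counts #self#) and uses code_lowered instead of code_typed because lowered code reliably keeps all slot names. These are internals and may change between Julia versions.

```julia
function f(x, y, z)
    d = x + y
    d = d * 2 * z
end

m  = first(methods(f))         # the single Method of f
ci = first(code_lowered(f))    # lowered CodeInfo, which keeps all slot names
nargs = Int(m.nargs)           # number of arguments, including #self#

# Assumption: argument slots precede local-variable slots
args   = ci.slotnames[1:nargs]
locals = ci.slotnames[nargs+1:end]

println("arguments: ", args)
println("locals:    ", locals)
```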

Given a Huffman tree, how to compute Huffman code for each symbol?

As the title stated, I'm writing a function to compute Huffman codes for symbols in a tree, but I feel completely lost.
A branch looks like this:
{:kind :branch, :frequency frequency, :left child0, :right child1}
A leaf looks like this:
{:kind :leaf, :frequency frequency, :value symbol}
And the code itself is structured like this:
{:tree tree, :length length, :bits bits}
I have the main function already (looks like this):
(defn huffman-codes
"Given a Huffman tree, compute the Huffman codes for each symbol in it.
Returns a map mapping each symbol to a sequence of bits (0 or 1)."
[T]
(into {} (for [s (all-symbols T)] [s (find-in-tree s T '())])
)
)
all-symbols returns the set of all symbols in the tree, and I am to write a helper function find-in-tree that finds the bit string of a symbol.
EDIT:
I've tried this now and it gets me closer to what I want, but still not right (see error message below)
(defn find-in-tree
[s T l]
(if (isleaf? T)
{(:value T) l}
(merge (find-in-tree s (:left T) (concat l '(0)))
(find-in-tree s (:right T) (concat l '(1)))
)
)
)
ERROR -- got' {:c {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}, :b {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}, :d {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}, :a {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}} ', expected ' {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)} '
It gets all the correct bit strings but assigns the whole map to every value, and I don't know what's wrong.
Assuming that your Huffman tree is valid (meaning we can ignore :frequency), and that 0 means 'left' and 1 means 'right':
(defn code-map
"Given a Huffman tree, returns a map expressing each symbol's code"
[{:keys [kind left right value]} code]
(if (= kind :leaf)
{value code}
(merge (code-map left (str code "0"))
(code-map right (str code "1")))))
Demo:
;; sample tree
(def root
{:kind :branch
:left {:kind :branch
:left {:kind :leaf
:value "X"}
:right {:kind :leaf
:value "Y"}}
:right {:kind :leaf :value "Z"}})
;; make sure to pass it "" as second arg
(code-map root "")
;;=> {"X" "00", "Y" "01", "Z" "1"}
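To clean this up, you could move the "" arg into an inner helper function. Note that Clojure has no automatic tail-call optimization, so making the recursion stack-safe would mean rewriting it with loop/recur and an explicit stack.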

How to get the sequence of keys for each map in a list of maps in Clojure?

So I've got the following attempt to map over a list of maps and I'm trying to get the sequence of keys that the 'keys' function returns just fine whenever I pass it a single map.
(map #(keys %) ({:a-id 1 :b 3 :c 2} {:d-id 3 :e 9 :c 1} {:a-id 3 :d-id 5 :c 2}))
which returns me a
java.lang.ClassCastException: null
I'm supposing this has something to do with the return type on keys being a sequence and by mapping over I'm guessing it's expecting a map return value??? I'm really not sure exactly why it's doing this, all I know is that it'd be dern convenient if I could get it to do in mapping what it's doing for me when I do a single application of
(keys {:a-id 1 :b 3 :c 2})
which is -- (:a-id :b :c)
(map keys '({:a-id 1 :b 3 :c 2} {:d-id 3 :e 9 :c 1} {:a-id 3 :d-id 5 :c 2}))
({:a-id 1 :b 3 :c 2} {:d-id 3 :e 9 :c 1} {:a-id 3 :d-id 5 :c 2})
is a function call, not a list, because the first element of an unquoted list is treated as the function to call. You should use list or a vector, or quote the expression:
(list {:a-id 1 :b 3 :c 2} {:d-id 3 :e 9 :c 1} {:a-id 3 :d-id 5 :c 2})
Even better is to just not use lists unless you really want to create a function call. Your original code, converted to a vector of 3 maps, works great:
user=> (map #(keys %) [{:a-id 1 :b 3 :c 2} {:d-id 3 :e 9 :c 1} {:a-id 3 :d-id 5 :c 2}] )
((:a-id :c :b) (:e :c :d-id) (:a-id :c :d-id))
We leave the outer parentheses in place since (map ...) is intended to be a function call. We change the inner list to a vector, since this emphasizes that it is data (as opposed to a function call). Quoting the list also works, but is unnecessarily complex. It is like saying "I am making a function call, but don't evaluate it as a function call."
