How do I check for duplicates within a map in Clojure?

So I have a list like the following:
({:name "yellowtail", :quantity 2} {:name "tuna", :quantity 1}
{:name "albacore", :quantity 1} {:quantity 1, :name "tuna"})
My goal is to search the list of maps and find entries with a duplicate :name; if there are duplicates, then increment the quantity. So in the list I have two tuna elements. I want to remove one and just increment the quantity of the other. So the result should be:
({:name "yellowtail", :quantity 2} {:name "tuna", :quantity 2}
{:name "albacore", :quantity 1} )
With the :quantity of tuna incremented to 2. I have attempted to use recur to do this without success, and I'm not sure whether recur is a good direction to run with. Could someone point me in the right direction?

You can group-by :name your elements and then map through the grouped collection summing the values.
Something like this
(->> your-list
(group-by :name)
(map (fn [[k v]]
{:name k :quantity (apply + (map :quantity v))})))
P.S. I assume you need to sum quantity of elements, because it's not clear what exactly you need to increment.
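For reference, here is a minimal, self-contained sketch of the group-by approach applied to the data from the question. The names orders and merge-duplicates are made up for illustration, and it assumes that "increment" means summing :quantity across entries that share a :name.

```clojure
;; Sample data copied from the question.
(def orders
  [{:name "yellowtail" :quantity 2}
   {:name "tuna" :quantity 1}
   {:name "albacore" :quantity 1}
   {:quantity 1 :name "tuna"}])

;; Hypothetical helper: one output map per distinct :name,
;; with the quantities of all entries in that group added together.
(defn merge-duplicates [items]
  (->> items
       (group-by :name)   ; {"tuna" [{...} {...}], "yellowtail" [{...}], ...}
       (map (fn [[item-name group]]
              {:name item-name
               :quantity (apply + (map :quantity group))}))))

(merge-duplicates orders)
```

The order of the resulting maps follows the order of the map returned by group-by, so it is safer not to rely on it.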

This is a standard use case for map and reduce.
(->> data
(map (juxt :name :quantity identity))
(reduce (fn [m [key qty _]]
(update m key (fnil (partial + qty) 0)))
{})
(map #(hash-map :name (key %1) :quantity (val %1))))
I am using identity to return the element in case you wish to use other properties in the map to determine uniqueness. If the map only contains two fields, then you could simplify it down to
(->> data
(mapcat #(repeat (:quantity %1) (:name %1)))
(frequencies)
(map #(hash-map :name (key %1) :quantity (val %1))))

Why not just hold a map from name to quantity? Instead of
({:name "yellowtail", :quantity 2} {:name "tuna", :quantity 1}
{:name "albacore", :quantity 1} {:quantity 1, :name "tuna"})
... we have
{"yellowtail" 2, "tuna" 2, "albacore" 1}
We are using the map to represent a multiset. Several Clojure multiset implementations are available, but I haven't used them.
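A rough sketch of how that representation could be built and used with only core functions. The helper names ->multiset, add-item and ->items are invented for this example; nothing here depends on a multiset library.

```clojure
;; Sample data copied from the question.
(def orders
  [{:name "yellowtail" :quantity 2}
   {:name "tuna" :quantity 1}
   {:name "albacore" :quantity 1}
   {:quantity 1 :name "tuna"}])

;; Collapse the list into a name -> quantity map. merge-with + adds the
;; quantities whenever the same name shows up more than once.
(defn ->multiset [items]
  (apply merge-with + {}
         (map (fn [item] {(:name item) (:quantity item)}) items)))

;; Add qty of an item; fnil treats a missing entry as 0.
(defn add-item [multiset item-name qty]
  (update multiset item-name (fnil + 0) qty))

;; Convert back to the list-of-maps shape used in the question.
(defn ->items [multiset]
  (map (fn [[item-name qty]] {:name item-name :quantity qty}) multiset))

(->multiset orders)
```

With this shape, duplicates cannot occur in the first place, because a map holds at most one value per key.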

Related

How can I turn an ordered tree into a collection of named nodes in Clojure?

I think it's best to use an example. Let's say I have an ordered tree:
(def abcd [:a [:b :c] :d])
I want to build from it a collection of key-value maps, each map representing a node of this tree, with a random name and all relevant information, that is, its parent (nil for the root node), its index (0, 1, 2 ..) and, if it's a leaf node, its content (like ":a"). For instance, in this case it could be:
[{:name G__36654, :parent nil, :index 0}
{:name G__36655, :content :a, :parent G__36654, :index 0}
{:name G__36656, :parent G__36654, :index 1}
{:name G__36657, :content :b, :parent G__36656, :index 0}
{:name G__36658, :content :c, :parent G__36656, :index 1}
{:name G__36659, :content :d, :parent G__36654, :index 2}]
I defined a function that seems to do what I want, but it uses recursion by calling itself and I'm having trouble figuring out how to use loop-recur instead, and I believe there must be something better out there. Here's my attempt:
(defn mttrav "my tree traversal"
([ptree parent index]
(let [name (gensym)]
(cond
(not (coll? ptree)) [ {:name name :content ptree :parent parent :index index}]
:else (reduce into
[{:name name :parent parent :index index}]
(map-indexed #(mttrav %2 name %1) ptree)))))
([ptree]
(mttrav ptree nil 0)))
BTW, I don't know if a vector is the right collection to use, maybe a set would make more sense, but I'm using a vector for easier debugging, since it's more readable when the order in which nodes are generated is preserved, and if nodes are accidentally repeated I want to see it.
Thanks in advance!
Edit: just to clarify, it would also be acceptable for each node to have a list of :child nodes instead of a :parent node, and some other variations, as long as it's a flat collection of maps, each map representing a node, with a unique :name, and the position, content and parent-child relations of the nodes are captured in this structure. The intended input are hiccup parse trees coming typically from Instaparse, and the maps are meant to become records to insert in a Clara session.
When a tree resists tail recursion, another thing to try is a "zipper" from Clojure's standard library. Zippers shine for editing, but they're also pretty good at linearizing depth-first traversal while keeping structure context available. A typical zipper loop looks like this:
user> (require '[clojure.zip :as zip])
nil
user> (def abcd '(:a (:b :c) :d))
#'user/abcd
user> (loop [ret [], z (zip/seq-zip abcd)]
(if (zip/end? z)
ret
(let [o {:name 42, :content (zip/node z), :parent 42, :index 42}]
(recur (conj ret o) (zip/next z)))))
[{:name 42, :content (:a (:b :c) :d), :parent 42, :index 42}
{:name 42, :content :a, :parent 42, :index 42}
{:name 42, :content (:b :c), :parent 42, :index 42}
{:name 42, :content :b, :parent 42, :index 42}
{:name 42, :content :c, :parent 42, :index 42}
{:name 42, :content :d, :parent 42, :index 42}]
To fill in :parent and :index, you'll find zipper notation for looking "up" at parents, "left" for siblings, etc., in the official docs at https://clojure.github.io/clojure/clojure.zip-api.html.
I created the zip with seq-zip having modeled nodes as a list. Your specific case models nodes as vectors, which seq-zip does not recognize, so you would presumably use vector-zip or invent your own adapter. You can follow the "Source" link in the docs to see how seq-zip and vector-zip work.
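As one possible way to fill in :parent and :index, here is a sketch that uses vector-zip and names each node by its path of child indexes from the root. That naming scheme is an assumption made for this example (the question uses gensym names), and loc-path and flatten-tree are hypothetical helpers, not part of clojure.zip.

```clojure
(require '[clojure.zip :as zip])

;; Hypothetical helper: vector of child indexes leading from the root to loc.
;; The number of left siblings of a location is its index under its parent.
(defn loc-path [loc]
  (loop [loc loc, path ()]
    (if-let [parent (zip/up loc)]
      (recur parent (conj path (count (zip/lefts loc))))
      (vec path))))

;; Depth-first walk with zip/next, emitting one flat map per node.
(defn flatten-tree [tree]
  (loop [ret [], z (zip/vector-zip tree)]
    (if (zip/end? z)
      ret
      (let [path (loc-path z)
            node (cond-> {:name   path
                          :parent (when (seq path) (pop path)) ; nil for the root
                          :index  (or (peek path) 0)}
                   ;; only leaves carry :content
                   (not (zip/branch? z)) (assoc :content (zip/node z)))]
        (recur (conj ret node) (zip/next z))))))

(flatten-tree [:a [:b :c] :d])
```

Because a path is unique within the tree, it can stand in for a generated name; if opaque names are required, the paths could be mapped to gensyms in a second pass.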
Breadth-first traversal is what you need. So if you want to build the list of parents while you traverse the tree, you need to first uniquely identify all your leaf nodes. I'm not sure it can be done without doing that, unless you know for sure that your leaf nodes are unique. It's also getting really late/early here, so my brain is not working optimally. I'm sure my solution can be distilled down a lot.
So if you have a tree like [:a [:b :c] :d [:b :c]], [:b :c] is a parent of :b and :c, but then the last two leaf nodes are also :b and :c, so which parent do you choose?
So let's have a tree whose leaves have unique ids.
(require 'clojure.walk)
(defn attach-ids [tree]
(clojure.walk/postwalk (fn [node]
(if (coll? node) node
{:node node :id (gensym)}))
tree))
(def tree (attach-ids [:a [:b :c] :d]))
;; produces this
;; [{:node :a, :id G__21500}
;; [{:node :b, :id G__21501} {:node :c, :id G__21502}]
;; {:node :d, :id G__21503}]
Now for the rest of the solution
(defn add-parent [parent-map id branch]
(assoc parent-map id {:children-ids (set (map :id branch))
:child-nodes (map :node branch)}))
(defn find-parent-id [node parent-map]
(->> parent-map
(filter (fn [[parent-id {children-ids :children-ids}]]
(contains? children-ids (:id node))))
ffirst))
(defn find-index [node parent-map tree]
(if-let [parent-id (find-parent-id node parent-map)]
(let [children (:child-nodes (get parent-map parent-id))]
(.indexOf children (:node node)))
(.indexOf tree node)))
(defn bfs [tree]
(loop [queue tree
parent-map {}
ret []]
(if (not-empty queue)
(let [node (first queue)
rst (vec (rest queue))]
(cond
(map? node)
(recur rst
parent-map
(conj ret (assoc node :parent (find-parent-id node parent-map)
:index (find-index node parent-map tree))))
(vector? node)
(let [parent-id (gensym)]
(recur (into rst node)
(add-parent parent-map parent-id node)
(conj ret {:id parent-id
:index (find-index node parent-map tree)
:parent (find-parent-id node parent-map)})))))
ret)))
(def tree (attach-ids [:a [:b :c] :d]))
(bfs tree)
;; children with :parent nil value point to root
;;[{:node :a, :id G__21504, :parent nil, :index 0}
;; {:id G__21513, :index 1, :parent nil}
;; {:node :d, :id G__21507, :parent nil, :index 2}
;; {:node :b, :id G__21505, :parent G__21513, :index 0}
;; {:node :c, :id G__21506, :parent G__21513, :index 1}]
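A shorter breadth-first variant is possible if each queue entry carries its parent's name and its own index along with it, so no lookup is needed afterwards. This is only a sketch under the assumption that branches are vectors; bfs-nodes is a made-up name, and the :name/:content keys follow the question rather than the answer above.

```clojure
;; Breadth-first walk over a nested vector. Each queue entry remembers the
;; parent name and sibling index that were known when it was enqueued.
(defn bfs-nodes [tree]
  (loop [queue (conj clojure.lang.PersistentQueue/EMPTY
                     {:node tree :parent nil :index 0})
         ret   []]
    (if-let [{:keys [node parent index]} (peek queue)]
      (let [id   (gensym)
            base {:name id :parent parent :index index}]
        (if (vector? node)
          ;; branch: emit it, then enqueue its children behind everything else
          (recur (into (pop queue)
                       (map-indexed (fn [i child]
                                      {:node child :parent id :index i})
                                    node))
                 (conj ret base))
          ;; leaf: emit it together with its content
          (recur (pop queue) (conj ret (assoc base :content node)))))
      ret)))

(bfs-nodes [:a [:b :c] :d])
```

Since the parent name travels with each child, repeated leaves such as two :b nodes in different branches are not ambiguous.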

clojure: add index to vector of maps

I have a vector of maps. I want to associate an index with each element.
Example:
(append-index [{:name "foo"} {:name "bar"} {:name "baz"}])
should return
[{:name "foo" :index 1} {:name "bar" :index 2} {:name "baz" :index 3}]
What is the best way to implement the append-index function?
First of all, Clojure starts counting vector elements from 0, so you probably want to get
[{:index 0, :name "foo"} {:index 1, :name "bar"} {:index 2, :name "baz"}]
You could do it pretty easily with the map-indexed function:
(defn append-index [coll]
(map-indexed #(assoc %2 :index %1) coll))
Just adding some fun:
(defn append-index [items]
(map assoc items (repeat :index) (range)))
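If the 1-based numbering from the question is really what is wanted, and the result should stay a vector, a small variation could look like this (a sketch; only the inc and the vec differ from the map-indexed version above):

```clojure
;; 1-based variant: shift the zero-based index by one and
;; realize the lazy sequence back into a vector.
(defn append-index [coll]
  (vec (map-indexed (fn [i m] (assoc m :index (inc i))) coll)))

(append-index [{:name "foo"} {:name "bar"} {:name "baz"}])
```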

Converting vector to indexed map in Clojure?

Let's say I have the following vector of maps:
[{:name "Jack" :age 5}
{:name "Joe" :age 15}
{:name "Mare" :age 34}
{:name "William" :age 64}
{:name "Adolf" :age 34}]
I want to convert this to an indexed map, like:
{1 {:name "Jack" :age 5}
2 {:name "Joe" :age 15}
3 {:name "Mare" :age 34}
4 {:name "William" :age 64}
5 {:name "Adolf" :age 34}}
And at some point, when I have modified the indexed map, I want to convert it back to vector of maps.
How to do it?
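You can use map-indexed to associate each map with its index and then reduce the results into a hash map (here test is your vector of maps; map-indexed counts from 0, so use (inc %1) as the key if you want the 1-based keys shown in the question):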
(reduce into {} (map-indexed #(assoc {} %1 %2) test))
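If you want to go back to your first structure (keep in mind that a hash map does not guarantee key order, so sort by key first if the order matters):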
(vec (vals your-indexed-map))
zipmap combines a series of keys and values, so you could do:
(zipmap (iterate inc 1) data-vector)
(with data-vector being your vector of maps)
The reverse would basically be sorting by key, then taking all values, which can be written exactly like that:
(->> data-map
(sort-by key)
(map val))
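One way to sidestep the ordering question is to keep the indexed data in a sorted map, so that vals always comes back in key order. A sketch, with the hypothetical helper names vector->indexed and indexed->vector:

```clojure
;; Build a sorted map keyed 1, 2, 3, ... from a vector.
(defn vector->indexed [v]
  (into (sorted-map)
        (map-indexed (fn [i m] [(inc i) m]) v)))

;; A sorted map iterates in key order, so vals needs no extra sorting.
(defn indexed->vector [m]
  (vec (vals m)))

(def people
  [{:name "Jack" :age 5}
   {:name "Joe" :age 15}
   {:name "Mare" :age 34}])

(-> (vector->indexed people)
    (update 2 assoc :age 16)   ; modify entry 2 while it is indexed
    indexed->vector)
```

Operations such as assoc, update and dissoc on a sorted map return a sorted map again, so the ordering should survive typical modifications.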

clojure reduce a seq of maps to a map of vectors

I would like to reduce the following seq:
({0 "Billie Verpooten"}
{1 "10:00"}
{2 "17:00"}
{11 "11:10"}
{12 "19:20"})
to
{:name "Billie Verpooten"
:work {:1 ["10:00" "17:00"]
:11 ["11:10" "19:20"]}}
but I have no idea how to do this.
I was thinking about a recursive function that uses destructuring.
There's a function for reducing a sequence to something in the standard library, and it's called reduce. Though in your specific case, it seems appropriate to remove the special case key 0 first and partition the rest into the pairs of entries that they're meant to be.
The following function gives the result described in your question:
(defn build-map [maps]
(let [entries (map first maps)
key-zero? (comp zero? key)]
{:name (val (first (filter key-zero? entries)))
:work (reduce (fn [acc [[k1 v1] [k2 v2]]]
(assoc acc (keyword (str k1)) [v1 v2]))
{}
(partition 2 (remove key-zero? entries)))}))
Just for variety here is a different way of expressing an answer by threading sequence manipulation functions:
user> (def data '({0 "Billie Verpooten"}
{1 "10:00"}
{2 "17:00"}
{11 "11:10"}
{12 "19:20"}))
user> {:name (-> data first first val)
:work (as-> data x
(rest x)
(into {} x)
(zipmap (map first (partition 1 2 (keys x)))
(partition 2 (vals x))))}
The as-> threading macro was introduced in Clojure 1.5 and makes expressing this sort of function a bit more concise.
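For comparison, here is a destructuring-based sketch that produces exactly the shape asked for in the question (keyword keys and vector values). It assumes the first map always holds the name and that the remaining maps come in start/end pairs; build-schedule is a name invented for this example.

```clojure
(def data
  '({0 "Billie Verpooten"}
    {1 "10:00"}
    {2 "17:00"}
    {11 "11:10"}
    {12 "19:20"}))

(defn build-schedule [maps]
  ;; (map first maps) turns each single-entry map into its [key value] entry;
  ;; the first entry is the name, the rest are the times.
  (let [[[_ person] & times] (map first maps)]
    {:name person
     :work (into {}
                 ;; each pair is [[start-key start] [end-key end]]
                 (map (fn [[[k start] [_ end]]]
                        [(keyword (str k)) [start end]]))
                 (partition 2 times))}))

(build-schedule data)
```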

In Clojure how could I create an "add id to map" function?

Say I have a collection of maps:
(def coll #{{:name "foo"} {:name "bar"}})
I want a function that will add an id (a unique number is fine) to each map element in the collection. i.e.
#{{:id 1 :name "foo"} {:id 2 :name "bar"}}
The following DOES NOT WORK, but it's the line of thinking I currently have.
(defn add-unique-id [coll]
(map assoc :id (iterate inc 0) coll))
Thanks in advance...
If you want to be really, really sure the IDs are unique, use UUIDs.
(defn add-id [coll]
(map #(assoc % :id (str (java.util.UUID/randomUUID))) coll))
How about
(defn add-unique-id [coll]
(map #(assoc %1 :id %2) coll (range (count coll))))
Or
(defn add-unique-id [coll]
(map #(assoc %1 :id %2) coll (iterate inc 0)))
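The attempt in the question fails because map expects the function first and then one collection per function argument, so (map assoc :id ...) hands the keyword :id to map as if it were a collection. Also note that map returns a lazy sequence rather than a set. If the result should remain a set, as in the question, something like the following sketch could be used (ids start at 1 here to match the example output; which map gets which id depends on the set's iteration order):

```clojure
;; Number the elements while walking the set, then pour them back into a set.
(defn add-unique-id [coll]
  (set (map-indexed (fn [i m] (assoc m :id (inc i))) coll)))

(add-unique-id #{{:name "foo"} {:name "bar"}})
```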

Resources