Get edges from tree represented as vector in Clojure - graph

I've got a tree like so:
[:root
[:a [:b [:c [:g]]]]
[:d [:e [:f [:g]]]]]
How can I get the edges, i.e.:
[[:root :a] [:root :d] [:a :b] [:b :c] [:c :g] [:d :e] [:e :f] [:f :g]]

This is what I've come up with before I checked your answer. Seems a bit more idiomatic unless I'm missing something.
(defn vec->edges [v-tree]
(->> v-tree
(tree-seq vector? next)
(mapcat (fn [[a & children]]
(map (fn [[b]] [a b]) children)))))
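To see why this works, it can help to look at what tree-seq produces on its own. The following is a minimal sketch, assuming the example tree from the question (the names `tree` and `nodes` are only for illustration): tree-seq visits every sub-vector in depth-first pre-order, and each visited sub-vector contributes one edge per child.

```clojure
;; Example tree from the question; `tree` is an illustrative name.
(def tree
  [:root
   [:a [:b [:c [:g]]]]
   [:d [:e [:f [:g]]]]])

;; Every node is a vector whose head is the label and whose tail
;; (next) holds the child vectors; leaves like [:g] have no tail.
(def nodes (tree-seq vector? next tree))

;; Labels in visiting order (depth-first, pre-order).
(map first nodes)
;; => (:root :a :b :c :g :d :e :f :g)

;; Number of children per visited node; their sum is the edge count.
(map (comp count next) nodes)
;; => (2 1 1 1 0 1 1 1 0)
```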

This approach uses a basic loop (no need for extra libraries or recursion):
(defn get-edges [tree]
(loop [subtrees [tree]
output []]
(if (empty? subtrees)
output
(let [[[root & first-subtrees] & subtrees] subtrees]
(recur (into subtrees first-subtrees)
(into output (map #(-> [root (first %)])) first-subtrees))))))
Testing it on the example data:
(get-edges [:root
[:a [:b [:c [:g]]]]
[:d [:e [:f [:g]]]]])
;; => [[:root :a] [:root :d] [:d :e] [:e :f] [:f :g] [:a :b] [:b :c] [:c :g]]
Here is another approach based on lazy sequences:
(defn get-edges2 [tree]
(->> [tree]
(iterate #(into (rest %) (rest (first %))))
(take-while seq)
(mapcat (fn [subtrees]
(let [[[root & sub] & _] subtrees]
(map #(-> [root (first %)]) sub))))))

I really like the way you posted your question, Scott Klarenbach; it's really concise.
Here is a solution in plain Clojure. The tricky part was deciding where to place the recursive call and how to combine the results of the recursive calls.
(def data
[:root
[:a [:b [:c [:g]]]]
[:d [:e [:f [:g]]]]])
(defn get-edges [collection]
(let [root (first collection)
branches (rest collection)]
(if (empty? branches)
[]
(let [edges
(mapv (fn [branch] [root (first branch)]) branches)
sub-edges
(->> branches
(mapcat (fn [branch] (get-edges branch)))
vec)]
(if (empty? sub-edges)
edges
(vec (concat edges sub-edges)))))))
(get-edges data)
;; => [[:root :a] [:root :d] [:a :b] [:b :c] [:c :g] [:d :e] [:e :f] [:f :g]]

There is no simple built-in way to do this. The easiest way is to roll your own recursion.
Here is a solution with unique hiccup tags (no duplicates):
(ns tst.demo.core
(:use tupelo.core tupelo.test)
(:require
[schema.core :as s]))
(def result (atom []))
(s/defn walk-with-path-impl
[path :- [s/Any] ; parent path is a vector
user-fn
data :- [s/Any]] ; each data item is a hiccup vector
(let [tag (xfirst data)
path-next (append path tag)]
(user-fn path data)
(doseq [item (xrest data)]
(walk-with-path-impl path-next user-fn item))))
(defn walk-with-path!
[user-fn data]
(walk-with-path-impl [] user-fn data))
(s/defn print-edge
[path :- [s/Any] ; parent path is a vector
data :- [s/Any]] ; each data item is a hiccup vector
(when (not-empty? path)
(let [parent-node (xlast path)
tag (xfirst data)
edge [parent-node tag]]
(swap! result append edge))))
The unit test shows the input data & result
(dotest
(let [tree [:root
[:a
[:b
[:c
[:z]]]]
[:d
[:e
[:f
[:g]]]]]]
(reset! result [])
(walk-with-path! print-edge tree)
(is= @result
[[:root :a] [:a :b] [:b :c] [:c :z]
[:root :d] [:d :e] [:e :f] [:f :g]])))
Here are the docs for convenience functions like xfirst, append, etc

Related

Nested map to sequence of tuples representing edges in Clojure

How might one go about expressing the following transformation in idiomatic Clojure?
(def m
{:a {:b {:c nil
:d nil}
:e nil}})
(map->edges m) ; =>
([:a :b] [:b :c] [:b :d] [:c nil] [:d nil] [:a :e] [:e nil])
I don't care about the order in which vectors appear in the result, so either depth-first or breadth-first search strategies are fine.
You can express this fairly concisely using for and tree-seq:
(defn map->edges [m]
(for [entry m
[x m] (tree-seq some? val entry)
y (or (keys m) [m])]
[x y]))
Example:
(map->edges m)
;;=> ([:a :b] [:a :e] [:b :c] [:b :d] [:c nil] [:d nil] [:e nil])
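A sketch of the intermediate step, assuming the map m from the question (the name `entries` is illustrative): tree-seq is started from each top-level entry and treats every map entry as a node whose children are the entries of its value. The for comprehension then pairs each entry's key with the keys of its value, or with nil when the value is nil.

```clojure
(def m
  {:a {:b {:c nil
           :d nil}
       :e nil}})

;; Walk from the single top-level entry [:a {...}]. `some?` lets every
;; entry branch; `val` returns the nested map (whose seq is its entries)
;; or nil, which yields no children.
(def entries (tree-seq some? val (first m)))

;; Keys of the visited entries, parents before children.
(map key entries)
;; => (:a :b :c :d :e)

;; For a leaf entry such as [:c nil], (keys nil) is nil, so the
;; fallback [nil] produces the single edge [:c nil].
(or (keys nil) [nil])
;; => [nil]
```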

Clojure - walk with path

I am looking for a function similar to those in clojure.walk whose inner function takes as its argument not a key and a value (as is the case with clojure.walk/walk) but the vector of keys needed to reach a value from the top-level data structure, and which recursively traverses all the data.
Example :
;; not good since it takes `[k v]` as argument instead of `[path v]`, and is not recursive.
user=> (clojure.walk/walk (fn [[k v]] [k (* 10 v)]) identity {:a 1 :b {:c 2}})
;; {:a 10, :c 30, :b 20}
;; it should receive as arguments instead :
[[:a] 1]
[[:b :c] 2]
Note:
It should work with arrays too, using the keys 0, 1, 2... (just like in get-in).
I don't really care about the outer parameter, if dropping it simplifies the code.
Currently learning clojure, I tried this as an exercise.
I however found it quite tricky to implement it directly as a walk down the tree that applies the inner function as it goes.
To achieve the result you are looking for, I split the task in 2:
First transform the nested structure into a dictionary with the path as key, and the value,
Then map the inner function over, or reduce with the outer function.
My implementation:
;; Helper function to have vector's indexes work like for get-in
(defn- to-indexed-seqs [coll]
(if (map? coll)
coll
(map vector (range) coll)))
;; Flattening the tree to a dict of (path, value) pairs that I can map over
;; user> (flatten-path [] {:a {:k1 1 :k2 2} :b [1 2 3]})
;; {[:a :k1] 1, [:a :k2] 2, [:b 0] 1, [:b 1] 2, [:b 2] 3}
(defn- flatten-path [path step]
(if (coll? step)
(->> step
to-indexed-seqs
(map (fn [[k v]] (flatten-path (conj path k) v)))
(into {}))
[path step]))
;; Some final glue
(defn path-walk [f coll]
(->> coll
(flatten-path [])
(map #(apply f %))))
;; user> (println (clojure.string/join "\n" (path-walk #(str %1 " - " %2) {:a {:k1 1 :k2 2} :b [1 2 3]})))
;; [:a :k1] - 1
;; [:a :k2] - 2
;; [:b 0] - 1
;; [:b 1] - 2
;; [:b 2] - 3
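As a quick check against the question's own example, here is a sketch that passes vector as the inner function, so each call simply returns its [path value] pair. The definitions are repeated so the snippet runs on its own, and the result is turned into a set because the order of map entries is not something to rely on.

```clojure
;; Definitions repeated from the answer above.
(defn- to-indexed-seqs [coll]
  (if (map? coll)
    coll
    (map vector (range) coll)))

(defn- flatten-path [path step]
  (if (coll? step)
    (->> step
         to-indexed-seqs
         (map (fn [[k v]] (flatten-path (conj path k) v)))
         (into {}))
    [path step]))

(defn path-walk [f coll]
  (->> coll
       (flatten-path [])
       (map #(apply f %))))

;; The inner function receives the path and the leaf value.
(set (path-walk vector {:a 1 :b {:c 2}}))
;; => #{[[:a] 1] [[:b :c] 2]}

;; Each path can be handed straight back to get-in.
(get-in {:a 1 :b [10 20]} [:b 1])
;; => 20
```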
It turns out that Stuart Halloway published a gist that could be of some use (it uses a protocol, which makes it extensible as well) :
(ns user)
(def app
"Internal Helper"
(fnil conj []))
(defprotocol PathSeq
(path-seq* [form path] "Helper for path-seq"))
(extend-protocol PathSeq
java.util.List
(path-seq*
[form path]
(->> (map-indexed
(fn [idx item]
(path-seq* item (app path idx)))
form)
(mapcat identity)))
java.util.Map
(path-seq*
[form path]
(->> (map
(fn [[k v]]
(path-seq* v (app path k)))
form)
(mapcat identity)))
java.util.Set
(path-seq*
[form path]
(->> (map
(fn [v]
(path-seq* v (app path v)))
form)
(mapcat identity)))
java.lang.Object
(path-seq* [form path] [[form path]])
nil
(path-seq* [_ path] [[nil path]]))
(defn path-seq
"Returns a sequence of paths into a form, and the elements found at
those paths. Each item in the sequence is a map with :path
and :form keys. Paths are built based on collection type: lists
by position, maps by key, and sets by value, e.g.
(path-seq [:a [:b :c] {:d :e} #{:f}])
({:path [0], :form :a}
{:path [1 0], :form :b}
{:path [1 1], :form :c}
{:path [2 :d], :form :e}
{:path [3 :f], :form :f})
"
[form]
(map
#(let [[form path] %]
{:path path :form form})
(path-seq* form nil)))
(comment
(path-seq [:a [:b :c] {:d :e} #{:f}])
;; finding nils hiding in data structures:
(->> (path-seq [:a [:b nil] {:d :e} #{:f}])
(filter (comp nil? :form)))
;; finding a nil hiding in a Datomic transaction
(->> (path-seq {:db/id 100
:friends [{:firstName "John"}
{:firstName nil}]})
(filter (comp nil? :form)))
)
Note : in my case I could also have used Specter, so if you are reading this, you may want to check it out as well.
There is also https://github.com/levand/contextual/
(def node (:b (first (root :a))))
(= node {:c 1}) ;; => true
(c/context node) ;; => [:a 0 :b]

Using Clojure update-in with multiple keys

I'm trying to apply a function to all elements in a map that match a certain key.
(def mymap {:a "a" :b "b" :c "c"})
(update-in mymap [:a :b] #(str "X-" %))
I'm expecting
{:a "X-a", :c "c", :b "X-b"}
But I get
ClassCastException java.lang.String cannot be cast to clojure.lang.Associative clojure.lang.RT.assoc (RT.java:702)
Anyone can help me with this?
update-in updates a single key in the map, at a particular nesting level: [:a :b] means update key :b inside the map stored under key :a.
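To make the nesting behaviour concrete, here is a small sketch (the map `nested` is made up for illustration): the key vector is a path into nested maps, not a list of sibling keys.

```clojure
;; Hypothetical nested map: :b lives inside the map stored under :a.
(def nested {:a {:b "b"} :c "c"})

;; [:a :b] is a path: go to :a, then update :b inside it.
(update-in nested [:a :b] #(str "X-" %))
;; => {:a {:b "X-b"}, :c "c"}

;; On a flat map, each key needs its own one-element path.
(-> {:a "a" :b "b" :c "c"}
    (update-in [:a] #(str "X-" %))
    (update-in [:b] #(str "X-" %)))
;; => {:a "X-a", :b "X-b", :c "c"}
```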
What you want can be done using reduce:
(reduce #(assoc %1 %2 (str "X-" (%1 %2)))
mymap
[:a :b])
Here's a generalized function:
(defn update-each
"Updates each keyword listed in ks on associative structure m using fn."
[m ks fn]
(reduce #(update-in %1 [%2] fn) m ks))
(update-each mymap [:a :b] #(str "X-" %))
In the solution below, the hashmap is first filtered, then the remaining values are mapped through the str function, and finally the result is merged with the original hashmap:
(def m {:a "a" :b "b" :c "c"})
(def keys #{:a :b})
(->> m
(filter (fn [[k v]] (k keys)))
(map (fn [[k v]] [k (str "X-" v)]))
(into {})
(merge m))

Tree from adjacency map

I'm trying to make a function that builds a tree from an adjacency list of the form {node [children]}.
(def adjacency
{nil [:a]
:a [:b :c]
:b [:d :e]
:c [:f]})
which should result in
{nil {:a {:b {:d nil
:e nil}
:c {:f nil}}}}
However hard I tried, I couldn't get it to work. Recursion is a bit of a weak spot of mine, and most recursion examples I found only dealt with recursion over a list, not a tree.
Edited: Original dataset and result were unintentionally nested too deep, due to not having an editor and original source at time of posting. Sorry about that.
There is only one entry in every submap in adjacency. Is this necessary? The result tree has the same problem.
I think this form is clearer:
(def adjacency {:a [:b :c]
:b [:d :e]
:c [:f]})
So the solution is:
(defn tree [m root]
(letfn [(tree* [l]
(if (contains? m l)
{l (into {} (map tree* (m l)))}
[l nil]))]
(tree* root)))
Test:
(tree adjacency :a)
=> {:a {:b {:d nil
:e nil}
:c {:f nil}}}
Update: if you don't need the result tree as nested maps:
(defn tree [m root]
(letfn [(tree* [l]
(if (contains? m l)
(list l (map tree* (m l)))
(list l nil)))]
(tree* root)))
(tree adjacency :a)
=> (:a ((:b ((:d nil)
(:e nil)))
(:c ((:f nil)))))
I usually prefer to use clojure.walk when dealing with trees.
I am assuming that the root node is first in the adjacency vector.
(use 'clojure.walk)
(def adjacency
[{nil [:a]}
{:a [:b :c]}
{:b [:d :e]}
{:c [:f]}])
(prewalk
(fn [x]
(if (vector? x)
(let [[k v] x lookup (into {} adjacency)]
[k (into {} (map (fn [kk] [kk (lookup kk)]) v))])
x))
(first adjacency))
Result: {nil {:a {:b {:d {}, :e {}}, :c {:f {}}}}}
NOTE: Empty children are represented as {} rather than nil. Child elements are maps rather than vectors, since maps make the tree easier to navigate.

How do I map a vector to a map, pushing into it repeated key values?

This is my input data:
[[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]]
I would like to map this into the following:
{:a [[1 2] [3 4] [5 6]] :b [[\a \b] [\c \d] [\e \f]]}
This is what I have so far:
(defn- build-annotation-map [annotation & m]
(let [gff (first annotation)
remaining (rest annotation)
seqname (first gff)
current {seqname [(nth gff 3) (nth gff 4)]}]
(if (not (seq remaining))
m
(let [new-m (merge-maps current m)]
(apply build-annotation-map remaining new-m)))))
(defn- merge-maps [m & ms]
(apply merge-with conj
(when (first ms)
(reduce conj ;this is to avoid [1 2 [3 4 ... etc.
(map (fn [k] {k []}) (keys m))))
m ms))
The above produces:
{:a [[1 2] [[3 4] [5 6]]] :b [[\a \b] [[\c \d] [\e \f]]]}
It seems clear to me that the problem is in merge-maps, specifically with the function passed to merge-with (conj), but after banging my head for a while now, I'm about ready for someone to help me out.
I'm new to lisp in general, and clojure in particular, so I also appreciate comments not specifically addressing the problem, but also style, brain-dead constructs on my part, etc. Thanks!
Solution (close enough, anyway):
(group-by first [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
=> {:a [[:a 1 2] [:a 3 4] [:a 5 6]], :b [[:b \a \b] [:b \c \d] [:b \e \f]]}
(defn build-annotations [coll]
(reduce (fn [m [k & vs]]
(assoc m k (conj (m k []) (vec vs))))
{} coll))
Concerning your code, the most significant problem is naming. Firstly, I wouldn't, especially without first understanding your code, have any idea what is meant by annotation, gff, and seqname. current is pretty ambiguous too. In Clojure, remaining would generally be called more, depending on the context, and whether a more specific name should be used.
In your let bindings, gff (first annotation) and remaining (rest annotation), I'd probably take advantage of destructuring, like this:
(let [[first & more] annotation] ...)
If you would rather use (rest annotation) then I'd suggest using next instead, as it will return nil if it's empty, and allow you to write (if-not remaining ...) rather than (if-not (seq remaining) ...).
user> (next [])
nil
user> (rest [])
()
In Clojure, unlike other lisps, the empty list is truthy.
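A short sketch of what that means in practice: an empty collection counts as true in a conditional, which is why seq (or next) is used to test for emptiness.

```clojure
;; An empty list is a value like any other, so it is truthy.
(if '() :truthy :falsey)
;; => :truthy

;; seq returns nil for an empty collection, and nil is falsey.
(if (seq '()) :truthy :falsey)
;; => :falsey

;; next behaves like (seq (rest coll)), so it can be tested directly.
(if (next [1]) :more :done)
;; => :done
```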
This article shows the standard for idiomatic naming.
Works at least on the given data set.
(defn build-annotations [coll]
(reduce
(fn [result vec]
(let [key (first vec)
val (subvec vec 1)
old-val (get result key [])
conjoined-val (conj old-val val)]
(assoc
result
key
conjoined-val)))
{}
coll))
(build-annotations [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
I am sorry for not offering improvements on your code. I am just learning Clojure and it is easier to solve problems piece by piece instead of understanding a bigger piece of code and finding the problems in it.
Although I have no comments on your code yet, I tried it on my own and came up with this solution:
(defn build-annotations [coll]
(let [anmap (group-by first coll)]
(zipmap (keys anmap) (map #(vec (map (comp vec rest) %)) (vals anmap)))))
Here's my entry leveraging group-by, although several steps in here are really concerned with returning vectors rather than lists. If you drop that requirement, it gets a bit simpler:
(defn f [s]
(let [g (group-by first s)
k (keys g)
v (vals g)
cleaned-v (for [group v]
(into [] (map (comp #(into [] %) rest) group)))]
(zipmap k cleaned-v)))
Depending on what you actually want, you might even be able to get by with just doing group-by.
(defn build-annotations [coll]
(apply merge-with concat
(map (fn [[k & vals]] {k [vals]})
coll)))
So,
(map (fn [[k & vals]] {k [vals]})
coll)
takes a collection of [keys & values] and returns a list of {key [values]}
(apply merge-with concat ...list of maps...)
takes a list of maps, merges them together, and concats the values if a key already exists.
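The two steps can be looked at separately. Below is a minimal sketch on a shortened version of the input (the names `coll` and `singles` are illustrative). Note that the grouped values come back as sequences rather than vectors, although they still compare equal to the vectors in the desired result.

```clojure
(def coll [[:a 1 2] [:a 3 4] [:b \a \b]])

;; Step 1: one single-entry map per row, with the row's values
;; wrapped in a vector so that concat can join them later.
(def singles (map (fn [[k & vals]] {k [vals]}) coll))
;; => ({:a [(1 2)]} {:a [(3 4)]} {:b [(\a \b)]})

;; Step 2: merge the maps, concatenating values on key collisions.
(apply merge-with concat singles)
;; => {:a ((1 2) (3 4)), :b [(\a \b)]}

;; Sequential collections compare by content, so this holds:
(= (apply merge-with concat singles)
   {:a [[1 2] [3 4]] :b [[\a \b]]})
;; => true
```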

Resources