How might one go about expressing the following transformation in idiomatic Clojure?
(def m
  {:a {:b {:c nil
           :d nil}
       :e nil}})
(map->edges m) ; =>
([:a :b] [:b :c] [:b :d] [:c nil] [:d nil] [:a :e] [:e nil])
I don't care about the order in which vectors appear in the result, so either depth-first or breadth-first search strategies are fine.
You can express this fairly concisely using for and tree-seq:
(defn map->edges [m]
  (for [entry m
        [x m] (tree-seq some? val entry)
        y (or (keys m) [m])]
    [x y]))
Example:
(map->edges m)
;;=> ([:a :b] [:a :e] [:b :c] [:b :d] [:c nil] [:d nil] [:e nil])
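For comparison, the same traversal can be spelled out with explicit recursion. This is only a sketch: the name map->edges-rec is made up here, and it assumes every value is either a nested map or nil, as in the example. Since the order of the edges does not matter, the check compares the result as a set.

```clojure
(def m
  {:a {:b {:c nil
           :d nil}
       :e nil}})

;; Hypothetical name; assumes values are nested maps or nil.
(defn map->edges-rec [m]
  (mapcat (fn [[k v]]
            (if (map? v)
              ;; one edge per child key, then the edges below this node
              (concat (map (fn [child] [k child]) (keys v))
                      (map->edges-rec v))
              ;; a leaf: pair the key with its (nil) value
              [[k v]]))
          m))

(= (set (map->edges-rec m))
   #{[:a :b] [:a :e] [:b :c] [:b :d] [:c nil] [:d nil] [:e nil]})
;;=> true
```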
Related
I've got a tree like so:
[:root
 [:a [:b [:c [:g]]]]
 [:d [:e [:f [:g]]]]]
How can I get the edges, i.e.:
[[:root :a] [:root :d] [:a :b] [:b :c] [:c :g] [:d :e] [:e :f] [:f :g]]
This is what I came up with before I checked your answer. It seems a bit more idiomatic, unless I'm missing something.
(defn vec->edges [v-tree]
  (->> v-tree
       (tree-seq vector? next)
       (mapcat (fn [[a & children]]
                 (map (fn [[b]] [a b]) children)))))
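To make the role of tree-seq here more concrete, the sketch below first lists the tags in the order tree-seq visits the sub-vectors, then runs vec->edges (repeated so the snippet runs on its own) on the tree from the question. The var name tree is only an illustration.

```clojure
(def tree
  [:root
   [:a [:b [:c [:g]]]]
   [:d [:e [:f [:g]]]]])

;; tree-seq yields every sub-vector, each parent before its children (depth-first).
(map first (tree-seq vector? next tree))
;;=> (:root :a :b :c :g :d :e :f :g)

;; Same definition as above, repeated for self-containment.
(defn vec->edges [v-tree]
  (->> v-tree
       (tree-seq vector? next)
       (mapcat (fn [[a & children]]
                 (map (fn [[b]] [a b]) children)))))

(vec->edges tree)
;;=> ([:root :a] [:root :d] [:a :b] [:b :c] [:c :g] [:d :e] [:e :f] [:f :g])
```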
This approach uses a basic loop (no need for extra libraries or recursion):
(defn get-edges [tree]
  (loop [subtrees [tree]
         output []]
    (if (empty? subtrees)
      output
      (let [[[root & first-subtrees] & subtrees] subtrees]
        (recur (into subtrees first-subtrees)
               (into output (map #(-> [root (first %)])) first-subtrees))))))
Testing it on the example data:
(get-edges [:root
            [:a [:b [:c [:g]]]]
            [:d [:e [:f [:g]]]]])
;; => [[:root :a] [:root :d] [:d :e] [:e :f] [:f :g] [:a :b] [:b :c] [:c :g]]
Here is another approach based on lazy sequences:
(defn get-edges2 [tree]
  (->> [tree]
       (iterate #(into (rest %) (rest (first %))))
       (take-while seq)
       (mapcat (fn [subtrees]
                 (let [[[root & sub] & _] subtrees]
                   (map #(-> [root (first %)]) sub))))))
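To see how the lazy version walks the tree, it can help to look at the work list that iterate threads through. The sketch below reuses the same iterate step on the example tree and shows only the tag of each pending subtree at every step; the var name tree is an assumption for illustration.

```clojure
(def tree
  [:root
   [:a [:b [:c [:g]]]]
   [:d [:e [:f [:g]]]]])

;; Each step drops the first pending subtree and pushes its children
;; onto the front of the remaining ones (into on a list prepends).
(->> [tree]
     (iterate #(into (rest %) (rest (first %))))
     (take-while seq)
     (map #(map first %)))
;;=> ((:root) (:d :a) (:e :a) (:f :a) (:g :a) (:a) (:b) (:c) (:g))
```

The head of each work list is the node whose edges are emitted at that step, which is why the :d branch is finished before the :a branch.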
I really like the way you posed your question, Scott Klarenbach; it's really concise.
Here is a solution in plain Clojure. The tricky part was where to place the recursive call and how to combine the results of the recursive calls.
(def data
  [:root
   [:a [:b [:c [:g]]]]
   [:d [:e [:f [:g]]]]])
(defn get-edges [collection]
  (let [root (first collection)
        branches (rest collection)]
    (if (empty? branches)
      []
      (let [edges
            (mapv (fn [branch] [root (first branch)]) branches)
            sub-edges
            (->> branches
                 (mapcat (fn [branch] (get-edges branch)))
                 vec)]
        (if (empty? sub-edges)
          edges
          (vec (concat edges sub-edges)))))))
(get-edges data)
;; => [[:root :a] [:root :d] [:a :b] [:b :c] [:c :g] [:d :e] [:e :f] [:f :g]]
There is no simple built-in way to do this. The easiest way is to roll your own recursion.
Here is a solution with unique hiccup tags (no duplicates):
(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require
    [schema.core :as s]))
(def result (atom []))
(s/defn walk-with-path-impl
  [path :- [s/Any] ; parent path is a vector
   user-fn
   data :- [s/Any]] ; each data item is a hiccup vector
  (let [tag       (xfirst data)
        path-next (append path tag)]
    (user-fn path data)
    (doseq [item (xrest data)]
      (walk-with-path-impl path-next user-fn item))))
(defn walk-with-path!
  [user-fn data]
  (walk-with-path-impl [] user-fn data))
(s/defn print-edge
  [path :- [s/Any] ; parent path is a vector
   data :- [s/Any]] ; each data item is a hiccup vector
  (when (not-empty? path)
    (let [parent-node (xlast path)
          tag         (xfirst data)
          edge        [parent-node tag]]
      (swap! result append edge))))
The unit test shows the input data and the result:
(dotest
  (let [tree [:root
              [:a
               [:b
                [:c
                 [:z]]]]
              [:d
               [:e
                [:f
                 [:g]]]]]]
    (reset! result [])
    (walk-with-path! print-edge tree)
    (is= @result
      [[:root :a] [:a :b] [:b :c] [:c :z]
       [:root :d] [:d :e] [:e :f] [:f :g]])))
Here are the docs for convenience functions like xfirst, append, etc
Looking at the example:
(into [] (select-keys {:a 1 :b 2 :c 3} [:c :b])) => [[:c 3] [:b 2]]
Is it guaranteed that the returned result preserves the order given in the second parameter of select-keys?
select-keys returns a map, and maps are unordered, so you can't rely on it. Small maps happen to be implemented as array maps, which do keep insertion order, but that is an implementation detail and it breaks once the map grows, e.g.
(def m {:a 1 :b 2 :c 3 :d 4 :e 5 :f 6 :g 7 :h 8 :i 9 :j 10 :k 11 :l 12 :m 13})
(into [] (select-keys m [:a :b :c :d :e :f :g :h :i :j :k]))
=> [[:e 5] [:k 11] [:g 7] [:c 3] [:j 10] [:h 8] [:b 2] [:d 4] [:f 6] [:i 9] [:a 1]]
select-keys does not reliably preserve order, and neither do keys and vals, which do not follow the order in which you defined the map.
The most reliable way to get an ordered vector of key-value pairs is to build it yourself with loop/recur (or a reduce) over the keys in the order you want.
If you want ordered values only, you can use juxt.
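As a rough sketch of both suggestions: the helper name ordered-select below is hypothetical (it is not part of clojure.core), and it simply reduces over the requested keys so that the output order comes from the key vector rather than from the map.

```clojure
(def m {:a 1 :b 2 :c 3})

;; Hypothetical helper: collect entries in the order of `ks`.
;; `find` returns the map entry for a key, or nil when the key is absent.
(defn ordered-select [m ks]
  (reduce (fn [acc k]
            (if-let [entry (find m k)]
              (conj acc entry)
              acc))
          []
          ks))

(ordered-select m [:c :b])      ;;=> [[:c 3] [:b 2]]
(ordered-select m [:c :zz :a])  ;;=> [[:c 3] [:a 1]]

;; Ordered values only: keywords act as lookup functions, and juxt
;; returns their results in the order the keywords were given.
((juxt :c :b) m)                ;;=> [3 2]
```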
I would add that, pragmatically, hash maps are not meant to represent ordered data.
If the ordering is a fairly simple one, you can easily sort the map by using sorted-map-by:
(def m (select-keys {:a 1 :b 2 :c 3 :d 4} [:d :b]))
=> {:d 4, :b 2}
(def sm (sorted-map-by compare))
(into sm m)
=> {:b 2, :d 4}
(assoc *1 :c 1234)
=> {:b 2, :c 1234, :d 4} ; maintains sort when additional kvs are assoc'd
(into [] *1)
=> [[:b 2] [:c 1234] [:d 4]]
Even if the ordering is not simple, you can write a custom comparator. This is rife with pitfalls, but there is a guide that is very good. It's not what you were originally after, but it's relevant.
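As a small illustration of a custom comparator, the sketch below orders keys by an explicit ranking and falls back to natural order for keys that are not ranked. The names key-order, rank and by-rank are made up for this example, and it assumes all keys are mutually comparable (here, all keywords).

```clojure
;; Hypothetical ranking: :d first, then :b, then everything else.
(def key-order [:d :b])
(def rank (zipmap key-order (range)))

(defn by-rank [a b]
  ;; Unranked keys get the largest rank, so they sort last; including the
  ;; key itself breaks ties, so distinct keys never compare as equal
  ;; (a comparator that did so would make sorted-map drop entries).
  (compare [(get rank a Long/MAX_VALUE) a]
           [(get rank b Long/MAX_VALUE) b]))

(def sorted (into (sorted-map-by by-rank) {:a 1 :b 2 :c 3 :d 4}))

(into [] sorted)
;;=> [[:d 4] [:b 2] [:a 1] [:c 3]]
```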
I am looking for a function similar to those in clojure.walk, whose inner function takes as its argument:
not a key and a value, as is the case with the clojure.walk/walk function,
but the vector of keys needed to reach a value from the top-level data structure,
and which recursively traverses all the data.
Example :
;; not good: the inner function receives `[k v]` instead of `[path v]`, and walk is not recursive.
user=> (clojure.walk/walk (fn [[k v]] [k (* 10 v)]) identity {:a 1 :b 2 :c 3})
;; {:a 10, :b 20, :c 30}
;; for nested data such as {:a 1 :b {:c 2}}, it should receive as arguments instead :
[[:a] 1]
[[:b :c] 2]
Note:
It should work with vectors too, using the keys 0, 1, 2... (just like in get-in).
I don't really care about the outer parameter, if dropping it simplifies the code.
I am currently learning Clojure, so I tried this as an exercise.
However, I found it quite tricky to implement directly as a walk down the tree that applies the inner function as it goes.
To achieve the result you are looking for, I split the task in two:
First, transform the nested structure into a map with the path as the key and the leaf as the value,
Then map the inner function over it, or reduce with the outer function.
My implementation:
;; Helper function so that vector indexes work as they do for get-in
(defn- to-indexed-seqs [coll]
  (if (map? coll)
    coll
    (map vector (range) coll)))
;; Flattening the tree to a dict of (path, value) pairs that I can map over
;; user> (flatten-path [] {:a {:k1 1 :k2 2} :b [1 2 3]})
;; {[:a :k1] 1, [:a :k2] 2, [:b 0] 1, [:b 1] 2, [:b 2] 3}
(defn- flatten-path [path step]
  (if (coll? step)
    (->> step
         to-indexed-seqs
         (map (fn [[k v]] (flatten-path (conj path k) v)))
         (into {}))
    [path step]))
;; Some final glue
(defn path-walk [f coll]
  (->> coll
       (flatten-path [])
       (map #(apply f %))))
;; user> (println (clojure.string/join "\n" (path-walk #(str %1 " - " %2) {:a {:k1 1 :k2 2} :b [1 2 3]})))
;; [:a :k1] - 1
;; [:a :k2] - 2
;; [:b 0] - 1
;; [:b 1] - 2
;; [:b 2] - 3
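For what it's worth, the direct walk can also be sketched as a single recursive function that returns a lazy seq of [path leaf] pairs. The names path-pairs and path-walk-direct are made up here, and the sketch only descends into maps and vectors (lists and sets are treated as leaves), so that every path it produces works with get-in.

```clojure
;; Hypothetical helper: lazy seq of [path leaf] pairs.
(defn path-pairs
  ([form] (path-pairs [] form))
  ([path form]
   (cond
     ;; maps: extend the path with each key
     (map? form)    (mapcat (fn [[k v]] (path-pairs (conj path k) v)) form)
     ;; vectors: extend the path with each index, as get-in expects
     (vector? form) (mapcat (fn [i v] (path-pairs (conj path i) v)) (range) form)
     ;; anything else is a leaf
     :else          [[path form]])))

;; Hypothetical wrapper: call f with the path and the leaf value.
(defn path-walk-direct [f form]
  (map (fn [[path v]] (f path v)) (path-pairs form)))

(path-pairs {:a 1 :b {:c 2}})
;;=> ([[:a] 1] [[:b :c] 2])

(path-walk-direct #(str %1 " - " %2) {:a {:k1 1 :k2 2} :b [1 2 3]})
;;=> ("[:a :k1] - 1" "[:a :k2] - 2" "[:b 0] - 1" "[:b 1] - 2" "[:b 2] - 3")
```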
It turns out that Stuart Halloway published a gist that could be of some use (it uses a protocol, which makes it extensible as well) :
(ns user)
(def app
  "Internal helper"
  (fnil conj []))
(defprotocol PathSeq
  (path-seq* [form path] "Helper for path-seq"))
(extend-protocol PathSeq
  java.util.List
  (path-seq*
    [form path]
    (->> (map-indexed
           (fn [idx item]
             (path-seq* item (app path idx)))
           form)
         (mapcat identity)))
  java.util.Map
  (path-seq*
    [form path]
    (->> (map
           (fn [[k v]]
             (path-seq* v (app path k)))
           form)
         (mapcat identity)))
  java.util.Set
  (path-seq*
    [form path]
    (->> (map
           (fn [v]
             (path-seq* v (app path v)))
           form)
         (mapcat identity)))
  java.lang.Object
  (path-seq* [form path] [[form path]])
  nil
  (path-seq* [_ path] [[nil path]]))
(defn path-seq
  "Returns a sequence of paths into a form, and the elements found at
   those paths. Each item in the sequence is a map with :path
   and :form keys. Paths are built based on collection type: lists
   by position, maps by key, and sets by value, e.g.
   (path-seq [:a [:b :c] {:d :e} #{:f}])
   ({:path [0], :form :a}
    {:path [1 0], :form :b}
    {:path [1 1], :form :c}
    {:path [2 :d], :form :e}
    {:path [3 :f], :form :f})
   "
  [form]
  (map
    #(let [[form path] %]
       {:path path :form form})
    (path-seq* form nil)))
(comment
  (path-seq [:a [:b :c] {:d :e} #{:f}])
  ;; finding nils hiding in data structures:
  (->> (path-seq [:a [:b nil] {:d :e} #{:f}])
       (filter (comp nil? :form)))
  ;; finding a nil hiding in a Datomic transaction
  (->> (path-seq {:db/id 100
                  :friends [{:firstName "John"}
                            {:firstName nil}]})
       (filter (comp nil? :form)))
  )
Note : in my case I could also have used Specter, so if you are reading this, you may want to check it out as well.
There is also https://github.com/levand/contextual/
(def node (:b (first (root :a))))
(= node {:c 1}) ;; => true
(c/context node) ;; => [:a 0 :b]
I'm trying to make a function that builds a tree from an adjacency list of the form {node [children]}.
(def adjacency
  {nil [:a]
   :a [:b :c]
   :b [:d :e]
   :c [:f]})
which should result in
{nil {:a {:b {:d nil
              :e nil}
          :c {:f nil}}}}
However hard I tried, I couldn't get it to work. Recursion is a bit of a weak spot of mine, and most recursion examples I found only dealt with recursion over a list, not a tree.
Edited: Original dataset and result were unintentionally nested too deep, due to not having an editor and original source at time of posting. Sorry about that.
There is only one entry in every submap in adjacency. Is this necessary? The result tree has the same problem.
I hope this form is clearer:
(def adjacency {:a [:b :c]
                :b [:d :e]
                :c [:f]})
So the solution is:
(defn tree [m root]
  (letfn [(tree* [l]
            (if (contains? m l)
              {l (into {} (map tree* (m l)))}
              [l nil]))]
    (tree* root)))
Test:
(tree adjacency :a)
=> {:a {:b {:d nil
            :e nil}
        :c {:f nil}}}
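The same function also appears to handle the adjacency map from the question, where the root is listed under the nil key, since contains? and map lookup both work with nil as a key. A quick check, with the definition repeated so that the snippet runs on its own (the var name adjacency-with-nil-root is made up here):

```clojure
(defn tree [m root]
  (letfn [(tree* [l]
            (if (contains? m l)
              ;; inner node: merge the subtrees of all children into one map
              {l (into {} (map tree* (m l)))}
              ;; leaf: a [key nil] pair, which `into {}` accepts as an entry
              [l nil]))]
    (tree* root)))

;; Adjacency map in the shape used in the question.
(def adjacency-with-nil-root
  {nil [:a]
   :a  [:b :c]
   :b  [:d :e]
   :c  [:f]})

(tree adjacency-with-nil-root nil)
;;=> {nil {:a {:b {:d nil, :e nil}, :c {:f nil}}}}
```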
Update: if you don't need the result tree as nested maps:
(defn tree [m root]
  (letfn [(tree* [l]
            (if (contains? m l)
              (list l (map tree* (m l)))
              (list l nil)))]
    (tree* root)))
(tree adjacency :a)
=> (:a ((:b ((:d nil)
             (:e nil)))
        (:c ((:f nil)))))
I usually prefer to use clojure.walk when dealing with trees.
I am assuming that the root node is first in the adjacency vector.
(use 'clojure.walk)
(def adjacency
  [{nil [:a]}
   {:a [:b :c]}
   {:b [:d :e]}
   {:c [:f]}])
(prewalk
  (fn [x]
    (if (vector? x)
      (let [[k v] x lookup (into {} adjacency)]
        [k (into {} (map (fn [kk] [kk (lookup kk)]) v))])
      x))
  (first adjacency))
Result: {nil {:a {:b {:d {}, :e {}}, :c {:f {}}}}}
NOTE: Empty children are represented as {} rather than nil. Also, child elements are maps rather than vectors, since maps make the tree easier to navigate.
In a project I'm working on I came across an interesting problem that I'm curious about other solutions for. I'm in the middle of reading "The Little Schemer" so I'm trying out some recursion techniques. I'm wondering if there is another way to do this with recursion and also interested if there is an approach without using recursion.
The problem is to take a sequence and partition it into a seq of seqs by taking every nth element. For example this vector:
[ :a :b :c :d :e :f :g :h :i ]
when partitioned with n=3 would produce the seq
((:a :d :g) (:b :e :h) (:c :f :i))
and with n=4:
((:a :e :i) (:b :f) (:c :g) (:d :h))
and so on. I solved this using two functions. The first creates the inner seqs and the other pulls them together. Here are my functions:
(defn subseq-by-nth
  "Creates a subsequence of coll formed by starting with the kth element and selecting every nth element."
  [coll k n]
  (cond (empty? coll) nil
        (< (count coll) n) (seq (list (first coll)))
        :else (cons (nth coll k) (subseq-by-nth (drop (+ n k) coll) 0 n))))
(defn partition-by-nth
  ""
  ([coll n]
   (partition-by-nth coll n n))
  ([coll n i]
   (cond (empty? coll) nil
         (= 0 i) nil
         :else (cons (subseq-by-nth coll 0 n) (partition-by-nth (rest coll) n (dec i))))))
I'm not completely happy with the partition-by-nth function having multiple arity simply for the recursion, but couldn't see another way.
This seems to work just fine with all the test cases. Is this a decent approach? Is it too complicated? Is there a way to do this without recursion or maybe in a single recursive function?
Thanks for the suggestions. I'm new to both Clojure and Lisp, so am picking up the different techniques as I go.
I expect there is a simpler recursive definition that is more in the spirit of The Little Schemer. Since you said you were interested in alternative approaches, though, the following function using take-nth is quite a bit more compact:
(defn chop [coll n]
  (for [i (range n)]
    (take-nth n (drop i coll))))
which satisfies your examples:
(chop [:a :b :c :d :e :f :g :h :i ] 3)
;= ((:a :d :g) (:b :e :h) (:c :f :i))
(chop [:a :b :c :d :e :f :g :h :i ] 4)
;= ((:a :e :i) (:b :f) (:c :g) (:d :h))
In Clojure, the built-in libraries will get you surprisingly far; when that fails, use an explicitly recursive solution. This version is also lazy; in any "longhand" (explicitly recursive) version you would probably want lazy-seq or loop...recur to handle large datasets without blowing the stack.
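To illustrate what such a longhand version might look like, here is a sketch that replaces take-nth with an explicitly recursive function wrapped in lazy-seq. The names every-nth and chop-longhand are made up for this example; the lazy-seq wrapper is what keeps the recursion from consuming stack, because each step is only evaluated when the next element is requested.

```clojure
;; Hypothetical longhand stand-in for (take-nth n coll).
(defn every-nth [n coll]
  (lazy-seq
    (when-let [s (seq coll)]
      ;; keep the first element, then skip ahead n positions
      (cons (first s) (every-nth n (drop n s))))))

;; Hypothetical longhand counterpart of chop.
(defn chop-longhand [coll n]
  (map (fn [i] (every-nth n (drop i coll))) (range n)))

(chop-longhand [:a :b :c :d :e :f :g :h :i] 3)
;;=> ((:a :d :g) (:b :e :h) (:c :f :i))
(chop-longhand [:a :b :c :d :e :f :g :h :i] 4)
;;=> ((:a :e :i) (:b :f) (:c :g) (:d :h))
```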
I have to offer this Common Lisp loop:
(defun partition-by-nth (list n)
  (loop :with result := (make-array n :initial-element '())
        :for i :upfrom 0
        :and e :in list
        :do (push e (aref result (mod i n)))
        :finally (return (map 'list #'nreverse result))))
Edited because the original answer totally missed the point.
When I first saw this question I thought the clojure.core function partition applied (see the ClojureDocs page).
As Dave pointed out, partition only works on the elements in the original order. The take-nth solution is clearly better. Just for the sake of interest, a combination of map with multiple sequences derived from partition kind of works.
(defn ugly-solution [coll n]
  (apply map list (partition n n (repeat nil) coll)))
(ugly-solution [:a :b :c :d :e :f :g :h :i] 3)
;;=> ((:a :d :g) (:b :e :h) (:c :f :i))
(ugly-solution [:a :b :c :d :e :f :g :h :i] 4)
;;=> ((:a :e :i) (:b :f nil) (:c :g nil) (:d :h nil))
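The trailing nils come from the padding. One possible way around that, sketched below under the hypothetical name chop-via-partition, is to pad with a unique sentinel object and strip it afterwards, so that genuine nil elements in the input are left alone. Note that, unlike the take-nth version, this is not suitable for infinite input, since apply has to realize every partition.

```clojure
(defn chop-via-partition [coll n]
  (let [pad   (Object.)  ; unique sentinel; cannot collide with real elements
        parts (partition n n (repeat pad) coll)]
    ;; guard: (apply map list ()) would return a transducer, not a seq
    (when (seq parts)
      (->> (apply map list parts)
           ;; drop only the sentinel, comparing by identity
           (map (fn [xs] (remove #(identical? pad %) xs)))))))

(chop-via-partition [:a :b :c :d :e :f :g :h :i] 3)
;;=> ((:a :d :g) (:b :e :h) (:c :f :i))
(chop-via-partition [:a :b :c :d :e :f :g :h :i] 4)
;;=> ((:a :e :i) (:b :f) (:c :g) (:d :h))
```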