I have an atom fs that I'm updating inside a recursive function freq-seq that's the value that holds the results of my computation. I have another function mine-freq-seqs to start freq-seq and when mine-freq-seqs is done I would like to receive the last value of said atom. So I thought I would do it like so
(ns freq-seq-enum)
(def fs (atom #{}))
(defn locally-frequents
[sdb min-sup]
(let [uniq-sdb (map (comp frequencies set) sdb)
freqs (apply merge-with + uniq-sdb)]
(->> freqs
(filter #(<= min-sup (second %)))
(map #(vector (str (first %)) (second %))))))
(defn project-sdb
[sdb prefix]
(if (empty? prefix) sdb
(into [] (->> sdb
(filter #(re-find (re-pattern (str (last prefix))) %))
(map #(subs % (inc (.indexOf % (str (last prefix))))))
(remove empty?)))))
(defn freq-seq
[sdb prefix prefix-support min-sup frequent-seqs]
(if ((complement empty?) prefix) (swap! fs conj [prefix prefix-support]))
(let [lf (locally-frequents sdb min-sup)]
(if (empty? lf) nil
(for [[item sup] lf] (freq-seq (project-sdb sdb (str prefix item)) (str prefix item) sup min-sup #fs)))))
(defn mine-freq-seqs
[sdb min-sup]
(freq-seq sdb "" 0 min-sup #fs))
running it first
(mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2)
then deref-ing the atom
(deref fs)
yields
#{["B" 4]
["BC" 4]
["AB" 4]
["CA" 3]
["CAC" 2]
["AC" 4]
["ABC" 4]
["CAB" 2]
["A" 4]
["CABC" 2]
["ABB" 2]
["CC" 2]
["CB" 3]
["C" 4]
["BB" 2]
["CBC" 2]
["AA" 2]}
however (doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2) (deref fs))
just gives #{}
What I want is to let the freq-seq recurse to completion then get the value of the atom fs. So I can call mine-freq-seq and have my result returned in the REPL instead of having to manually deref it there.
First some alternate code without the atom then a look at why you get the empty return.
A more compact version where the sequences in a string are derived with a reduce rather than the recursion with regex and substr.
Then just do a frequencies on those results.
(defn local-seqs
[s]
(->> s
(reduce (fn [acc a] (into acc (map #(conj % a) acc))) #{[]})
(map #(apply str %))
(remove empty?)))
(defn freq-seqs
[sdb min-sup]
(->> (mapcat local-seqs sdb)
frequencies
(filter #(>= (second %) min-sup))
set))
That's the whole thing!
I haven't involved an atom because I didn't see a need but add it at the end if freq-seqs if you like.
For your original question: why the return that you see?
You are calling doall with 2 args, the result of your call and a collection. doall is a function and not a macro so the deref is performed immediately.
(defn doall
;; <snip>
([n coll] ;; you have passed #{} as coll
(dorun n coll) ;; and this line evals to nil
coll) ;; and #{} is returned
You have passed your result as the n arg and an empty set as the coll (from (deref fs))
Now when doall calls dorun, it encounters the following:
(defn dorun
;; <snip>
([n coll]
(when (and (seq coll) (pos? n)) ;; coll is #{} so the seq is falesy
(recur (dec n) (next coll)))) ;; and a nil is returned
Since the empty set from fs is the second arg (coll) and and is a macro, it will be falsey on (seq coll), return nil and then doall returns the empty set that was it's second arg.
Final note:
So that is something that works and why yours failed. As to how to make yours work, to fix the call above I tried:
(do (doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2))
(deref fs))
That is closer to working but with the recusion in your process, it only forces the eval one level deep. So you could push the doall deeper into your funcs but I have proposed a completely different internal structure so I will leave the rest to you if you really need that structure.
I changed it a bit to remove all of the lazy bits (this happens silently in the repl but can be confusing when it changes outside of the repl). Note the changes with vec, mapv, and doall. At least now I get your result:
(def fs (atom #{}))
(defn locally-frequents
[sdb min-sup]
(let [uniq-sdb (map (comp frequencies set) sdb)
freqs (apply merge-with + uniq-sdb)]
(->> freqs
(filter #(<= min-sup (second %)))
(mapv #(vector (str (first %)) (second %))))))
(defn project-sdb
[sdb prefix]
(if (empty? prefix)
sdb
(into [] (->> sdb
(filter #(re-find (re-pattern (str (last prefix))) %))
(map #(subs % (inc (.indexOf % (str (last prefix))))))
(remove empty?)))))
(defn freq-seq
[sdb prefix prefix-support min-sup frequent-seqs]
(if ((complement empty?) prefix) (swap! fs conj [prefix prefix-support]))
(let [lf (locally-frequents sdb min-sup)]
(if (empty? lf)
nil
(vec (for [[item sup] lf] (freq-seq (project-sdb sdb (str prefix item)) (str prefix item) sup min-sup #fs))))))
(defn mine-freq-seqs
[sdb min-sup]
(freq-seq sdb "" 0 min-sup #fs))
(doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2))
(deref fs) => #{["B" 4] ["BC" 4] ["AB" 4] ["CA" 3]
["CAC" 2] ["AC" 4] ["ABC" 4] ["CAB" 2]
["A" 4] ["CABC" 2] ["ABB" 2] ["CC" 2] ["CB" 3]
["C" 4] ["BB" 2] ["CBC" 2] ["AA" 2]}
I'm still not really sure what the goal is or how/why you get entries like "CABC".
Related
Consider this simple-minded recursive implementation of comp in Clojure:
(defn my-comp
([f]
(fn [& args]
(apply f args)))
([f & funcs]
(fn [& args]
(f (apply (apply my-comp funcs) args)))))
The right way to do this, I am told, is using recur, but I am unsure how recur works. In particular: is there a way to coax the code above into being recurable?
evaluation 1
First let's visualize the problem. my-comp as it is written in the question will create a deep stack of function calls, each waiting on the stack to resolve, blocked until the the deepest call returns -
((my-comp inc inc inc) 1)
((fn [& args]
(inc (apply (apply my-comp '(inc inc)) args))) 1)
(inc (apply (fn [& args]
(inc (apply (apply my-comp '(inc)) args))) '(1)))
(inc (inc (apply (apply my-comp '(inc)) '(1))))
(inc (inc (apply (fn [& args]
(apply inc args)) '(1))))
(inc (inc (apply inc '(1)))) ; ⚠️ deep in the hole we go...
(inc (inc 2))
(inc 3)
4
tail-recursive my-comp
Rather than creating a long sequence of functions, this my-comp is refactored to return a single function, which when called, runs a loop over the supplied input functions -
(defn my-comp [& fs]
(fn [init]
(loop [acc init [f & more] fs]
(if (nil? f)
acc
(recur (f acc) more))))) ; 🐍 tail recursion
((my-comp inc inc inc) 1)
;; 4
((apply my-comp (repeat 1000000 inc)) 1)
;; 1000001
evaluation 2
With my-comp rewritten to use loop and recur, we can see linear iterative evaluation of the composition -
((my-comp inc inc inc) 1)
(loop 1 (list inc inc inc))
(loop 2 (list inc inc))
(loop 3 (list inc))
(loop 4 nil)
4
multiple input args
Did you notice ten (10) apply calls at the beginning of this post? This is all in service to support multiple arguments for the first function in the my-comp sequence. It is a mistake to tangle this complexity with my-comp itself. The caller has control to do this if it is the desired behavior.
Without any additional changes to the refactored my-comp -
((my-comp #(apply * %) inc inc inc) '(3 4)) ; ✅ multiple input args
Which evaluates as -
(loop '(3 4) (list #(apply * %) inc inc inc))
(loop 12 (list inc inc inc))
(loop 13 (list inc inc))
(loop 14 (list inc))
(loop 15 nil)
15
right-to-left order
Above (my-comp a b c) will apply a first, then b, and finally c. If you want to reverse that order, a naive solution would be to call reverse at the loop call site -
(defn my-comp [& fs]
(fn [init]
(loop [acc init [f & more] (reverse fs)] ; ⚠️ naive
(if (nil? f)
acc
(recur (f acc) more)))))
Each time the returned function is called, (reverse fs) will be recomputed. To avoid this, use a let binding to compute the reversal just once -
(defn my-comp [& fs]
(let [fs (reverse fs)] ; ✅ reverse once
(fn [init]
(loop [acc init [f & more] fs]
(if (nil? f)
acc
(recur (f acc) more))))))
a way to do this, is to rearrange this code to pass some intermediate function back up to the definition with recur.
the model would be something like this:
(my-comp #(* 10 %) - +)
(my-comp (fn [& args] (#(* 10 %) (apply - args)))
+)
(my-comp (fn [& args]
((fn [& args] (#(* 10 %) (apply - args)))
(apply + args))))
the last my-comp would use the first my-comp overload (which is (my-comp [f])
here's how it could look like:
(defn my-comp
([f] f)
([f & funcs]
(if (seq funcs)
(recur (fn [& args]
(f (apply (first funcs) args)))
(rest funcs))
(my-comp f))))
notice that despite of not being the possible apply target, the recur form can still accept variadic params being passed as a sequence.
user> ((my-comp (partial repeat 3) #(* 10 %) - +) 1 2 3)
;;=> (-60 -60 -60)
notice, though, that in practice this implementation isn't really better than yours: while recur saves you from stack overflow on function creation, it would still overflow on application (somebody, correct me if i'm wrong):
(apply my-comp (repeat 1000000 inc)) ;; ok
((apply my-comp (repeat 1000000 inc)) 1) ;; stack overflow
so it would probably be better to use reduce or something else:
(defn my-comp-reduce [f & fs]
(let [[f & fs] (reverse (cons f fs))]
(fn [& args]
(reduce (fn [acc curr-f] (curr-f acc))
(apply f args)
fs))))
user> ((my-comp-reduce (partial repeat 3) #(* 10 %) - +) 1 2 3)
;;=> (-60 -60 -60)
user> ((apply my-comp-reduce (repeat 1000000 inc)) 1)
;;=> 1000001
There is already a good answer above, but I think the original suggestion to use recur may have been thinking of a more manual accumulation of the result. In case you haven't seen it, reduce is just a very specific usage of loop/recur:
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test))
(defn my-reduce
[step-fn init-val data-vec]
(loop [accum init-val
data data-vec]
(if (empty? data)
accum
(let [accum-next (step-fn accum (first data))
data-next (rest data)]
(recur accum-next data-next)))))
(dotest
(is= 10 (my-reduce + 0 (range 5))) ; 0..4
(is= 120 (my-reduce * 1 (range 1 6))) ; 1..5 )
In general, there can be any number of loop variables (not just 2 like for reduce). Using loop/recur gives you a more "functional" way of looping with accumulated state instead of using and atom and a doseq or something. As the name suggests, from the outside the effect is quite similar to a normal recursion w/o any stack size limits (i.e. tail-call optimization).
P.S. As this example shows, I like to use a let form to very explicitly name the values being generated for the next iteration.
P.P.S. While the compiler will allow you to type the following w/o confusion:
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test))
(defn my-reduce
[step-fn accum data]
(loop [accum accum
data data]
...))
it can be a bit confusing and/or sloppy to re-use variable names (esp. for people new to Clojure or your particular program).
Also
I would be remiss if I didn't point out that the function definition itself can be a recur target (i.e. you don't need to use loop). Consider this version of the factorial:
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test))
(defn fact-impl
[cum x]
(if (= x 1)
cum
(let [cum-next (* cum x)
x-next (dec x)]
(recur cum-next x-next))))
(defn fact [x] (fact-impl 1 x))
(dotest
(is= 6 (fact 3))
(is= 120 (fact 5)))
I am trying to write up a simple Markovian state space models, that, as the name suggests iteratively looks back one step to predict the next state.
Here is what is supposed to be a MWE, though it is not because I cannot quite figure out how I am supposed to place (recur ... ) in the below code.
;; helper function
(defn dur-call
[S D]
(if (< 1 D)
(- D 1)
(rand-int S)))
;; helper function
(defn trans-call
[S D]
(if (< 1 D)
S
(rand-int 3)))
;; state space model
(defn test-func
[t]
(loop
[S (rand-int 3)]
(if (<= t 0)
[S (rand-int (+ S 1))]
(let [pastS (first (test-func (- t 1)))
pastD (second (test-func (- t 1)))
S (trans-call pastS pastD)]
(recur ...?)
[S (dur-call S pastD)]))))
My target is to calculate some a state at say time t=5 say, in which case the model needs to look back and calculate states t=[0 1 2 3 4] as well. This should, in my mind, be done well with loop/recur but could also be done with reduce perhaps (not sure how, still new to Clojure). My problem is really that it would seemt have to use recur inside let but that should not work given how loop/recur are designed.
your task is really to generate the next item based on the previous one, starting with some seed. In clojure it can be fulfilled by using iterate function:
user> (take 10 (iterate #(+ 2 %) 1))
(1 3 5 7 9 11 13 15 17 19)
you just have to define the function to produce the next value. It could look like this (not sure about the correctness of the computation algorithm, just based on what is in the question):
(defn next-item [[prev-s prev-d :as prev-item]]
(let [s (trans-call prev-s prev-d)]
[s (dur-call s prev-d)]))
and now let's iterate with it, starting from some value:
user> (take 5 (iterate next-item [3 4]))
([3 4] [3 3] [3 2] [3 1] [0 0])
now your test function could be implemented this way:
(defn test-fn [t]
(when (not (neg? t))
(nth (iterate next-item
(let [s (rand-int 3)]
[s (rand-int (inc s))]))
t)))
you can also do it with loop (but it is still less idiomatic):
(defn test-fn-2 [t]
(when (not (neg? t))
(let [s (rand-int 3)
d (rand-int (inc s))]
(loop [results [[s d]]]
(if (< t (count results))
(peek results)
(recur (conj results (next-item (peek results)))))))))
here we pass all the accumulated results to the next iteration of the loop.
also you can introduce the loop's iteration index and just pass around the last result together with it:
(defn test-fn-3 [t]
(when (not (neg? t))
(let [s (rand-int 3)
d (rand-int (inc s))]
(loop [result [s d] i 0]
(if (= i t)
result
(recur (next-item result) (inc i)))))))
and one more example with reduce:
(defn test-fn-4 [t]
(when (not (neg? t))
(reduce (fn [prev _] (next-item prev))
(let [s (rand-int 3)
d (rand-int (inc s))]
[s d])
(range t))))
To try out the async library in Clojure, I translated the prime sieve example from Go. Running in the REPL, it successfully printed out the prime numbers up to 227 and then stopped. I hit Ctrl-C and tried running it again but it wouldn't print out any more numbers. Is there a way to get Clojure to handle this, or is the async library just not ready for it yet?
;; A concurrent prime sieve translated from
;; https://golang.org/doc/play/sieve.go
(require '[clojure.core.async :as async :refer [<!! >!! chan go]])
(defn generate
[ch]
"Sends the sequence 2, 3, 4, ... to channel 'ch'."
(doseq [i (drop 2 (range))]
(>!! ch i)))
(defn filter-multiples
[in-chan out-chan prime]
"Copies the values from 'in-chan' to 'out-chan', removing
multiples of 'prime'."
(while true
;; Receive value from 'in-chan'.
(let [i (<!! in-chan)]
(if (not= 0 (mod i prime))
;; Send 'i' to 'out-chan'.
(>!! out-chan i)))))
(defn main
[]
"The prime sieve: Daisy-chain filter-multiples processes."
(let [ch (chan)]
(go (generate ch))
(loop [ch ch]
(let [prime (<!! ch)]
(println prime)
(let [ch1 (chan)]
(go (filter-multiples ch ch1 prime))
(recur ch1))))))
go is a macro. If you want to take advantage of goroutine-like behaviour in go blocks you must use <! and >!, and they must be visible to the go macro (that is you mustn't extract these operations into separate functions).
This literal translation of the program at https://golang.org/doc/play/sieve.go seems to work fine, also with a larger i in the main loop:
(require '[clojure.core.async :refer [<! <!! >! chan go]])
(defn go-generate [ch]
(go (doseq [i (iterate inc 2)]
(>! ch i))))
(defn go-filter [in out prime]
(go (while true
(let [i (<! in)]
(if-not (zero? (rem i prime))
(>! out i))))))
(defn main []
(let [ch (chan)]
(go-generate ch)
(loop [i 10 ch ch]
(if (pos? i)
(let [prime (<!! ch)]
(println prime)
(let [ch1 (chan)]
(go-filter ch ch1 prime)
(recur (dec i) ch1)))))))
I'd like to take a tree-like structure like this:
{"foo" {"bar" "1" "baz" "2"}}
and recursively traverse while remembering the path from the root in order to produce something like this:
["foo/bar/1", "foo/baz/2"]
Any suggestions on how this can be done without zippers or clojure.walk?
As nberger does, we separate enumerating the paths from presenting them as strings.
Enumeration
The function
(defn paths [x]
(if (map? x)
(mapcat (fn [[k v]] (map #(cons k %) (paths v))) x)
[[x]]))
... returns the sequence of path-sequences of a nested map. For example,
(paths {"foo" {"bar" "1", "baz" "2"}})
;(("foo" "bar" "1") ("foo" "baz" "2"))
Presentation
The function
#(clojure.string/join \/ %)
... joins strings together with "/"s. For example,
(#(clojure.string/join \/ %) (list "foo" "bar" "1"))
;"foo/bar/1"
Compose these to get the function you want:
(def traverse (comp (partial map #(clojure.string/join \/ %)) paths))
... or simply
(defn traverse [x]
(->> x
paths
(map #(clojure.string/join \/ %))))
For example,
(traverse {"foo" {"bar" "1", "baz" "2"}})
;("foo/bar/1" "foo/baz/2")
You could entwine these as a single function: clearer and more useful to
separate them, I think.
The enumeration is not lazy, so it will run out of
stack space on deeply enough nested maps.
This is my attempt using tree-seq clojure core function.
(def tree {"foo" {"bar" "1" "baz" "2"}})
(defn paths [t]
(let [cf (fn [[k v]]
(if (map? v)
(->> v
(map (fn [[kv vv]]
[(str k "/" kv) vv]))
(into {}))
(str k "/" v)))]
(->> t
(tree-seq map? #(map cf %))
(remove map?)
vec)))
(paths tree) ; => ["foo/bar/1" "foo/baz/2"]
Map keys are used to accumulate paths.
I did something real quick using accumulator, but it isn't depth first.
(defn paths [separator tree]
(let [finished? (fn [[_ v]] ((complement map?) v))]
(loop [finished-paths nil
path-trees (seq tree)]
(let [new-paths (mapcat
(fn [[path children]]
(map
(fn [[k v]]
(vector (str path separator k) v))
children))
path-trees)
finished (->> (filter finished? new-paths)
(map
(fn [[k v]]
(str k separator v)))
(concat finished-paths))
remaining-paths (remove finished? new-paths)]
(if (seq remaining-paths)
(recur finished remaining-paths)
finished)))))
In the repl
(clojure-scratch.core/paths "/" {"foo" {"bar" {"bosh" "1" "bash" "3"} "baz" "2"}})
=> ("foo/baz/2" "foo/bar/bash/3" "foo/bar/bosh/1")
The following uses recursive depth first traversal:
(defn combine [k coll]
(mapv #(str k "/" %) coll))
(defn f-map [m]
(into []
(flatten
(mapv (fn [[k v]]
(if (map? v)
(combine k (f-map v))
(str k "/" v)))
m))))
(f-map {"foo" {"bar" "1" "baz" "2"}})
=> ["foo/bar/1" "foo/baz/2"]
Here's my take:
(defn traverse [t]
(letfn [(traverse- [path t]
(when (seq t)
(let [[x & xs] (seq t)
[k v] x]
(lazy-cat
(if (map? v)
(traverse- (conj path k) v)
[[(conj path k) v]])
(traverse- path xs)))))]
(traverse- [] t)))
(traverse {"foo" {"bar" "1" "baz" "2"}})
;=> [[["foo" "bar"] "1"] [["foo" "baz"] "2"]]
Traverse returns a lazy seq of path-leaf pairs. You can then apply any transformation to each path-leaf, for example to the "/path/to/leaf" fullpath form:
(def ->full-path #(->> (apply conj %) (clojure.string/join "/")))
(->> (traverse {"foo" {"bar" "1" "baz" "2"}})
(map ->full-path))
;=> ("foo/bar/1" "foo/baz/2")
(->> (traverse {"foo" {"bar" {"buzz" 4 "fizz" "fuzz"} "baz" "2"} "faa" "fee"})
(map ->full-path))
;=> ("foo/bar/buzz/4" "foo/bar/fizz/fuzz" "foo/baz/2" "faa/fee")
What function can I put as FOO here to yield true at the end? I played with hash-set (only correct for first 2 values), conj, and concat but I know I'm not handling the single-element vs set condition properly with just any of those.
(defn mergeMatches [propertyMapList]
"Take a list of maps and merges them combining values into a set"
(reduce #(merge-with FOO %1 %2) {} propertyMapList))
(def in
(list
{:a 1}
{:a 2}
{:a 3}
{:b 4}
{:b 5}
{:b 6} ))
(def out
{ :a #{ 1 2 3}
:b #{ 4 5 6} })
; this should return true
(= (mergeMatches in) out)
What is the most idiomatic way to handle this?
This'll do:
(let [set #(if (set? %) % #{%})]
#(clojure.set/union (set %) (set %2)))
Rewritten more directly for the example (Alex):
(defn to-set [s]
(if (set? s) s #{s}))
(defn set-union [s1 s2]
(clojure.set/union (to-set s1) (to-set s2)))
(defn mergeMatches [propertyMapList]
(reduce #(merge-with set-union %1 %2) {} propertyMapList))
I didn't write this but it was contributed by #amitrathore on Twitter:
(defn kv [bag [k v]]
(update-in bag [k] conj v))
(defn mergeMatches [propertyMapList]
(reduce #(reduce kv %1 %2) {} propertyMapList))
I wouldn't use merge-with for this,
(defn fnil [f not-found]
(fn [x y] (f (if (nil? x) not-found x) y)))
(defn conj-in [m map-entry]
(update-in m [(key map-entry)] (fnil conj #{}) (val map-entry)))
(defn merge-matches [property-map-list]
(reduce conj-in {} (apply concat property-map-list)))
user=> (merge-matches in)
{:b #{4 5 6}, :a #{1 2 3}}
fnil will be part of core soon so you can ignore the implementation... but it just creates a version of another function that can handle nil arguments. In this case conj will substitute #{} for nil.
So the reduction conjoining to a set for every key/value in the list of maps supplied.
Another solution contributed by #wmacgyver on Twitter based on multimaps:
(defn add
"Adds key-value pairs the multimap."
([mm k v]
(assoc mm k (conj (get mm k #{}) v)))
([mm k v & kvs]
(apply add (add mm k v) kvs)))
(defn mm-merge
"Merges the multimaps, taking the union of values."
[& mms]
(apply (partial merge-with union) mms))
(defn mergeMatches [property-map-list]
(reduce mm-merge (map #(add {} (key (first %)) (val (first %))) property-map-list)))
This seems to work:
(defn FOO [v1 v2]
(if (set? v1)
(apply hash-set v2 v1)
(hash-set v1 v2)))
Not super pretty but it works.
(defn mergeMatches [propertyMapList]
(for [k (set (for [pp propertyMapList] (key (first pp))))]
{k (set (remove nil? (for [pp propertyMapList] (k pp))))}))