go block hangs indefinitely with core.async

Here is the code:
(ns typed-clj-test.async
  (:require [clojure.core.async
             :as a
             :refer [>! <! >!! <!!
                     go chan buffer
                     close! thread
                     alts! alts!! timeout]]))

(def echo-buffer (chan 2))

(go (do (<! (timeout 5000))
        (println (<! echo-buffer))))

(>!! echo-buffer "msg1")
(>!! echo-buffer "msg2")
(>!! echo-buffer "msg3")
(>!! echo-buffer "msg4")
This hangs forever after printing msg1 in nREPL:
typed-clj-test.async=> (def echo-buffer (chan 2))
#'typed-clj-test.async/echo-buffer
typed-clj-test.async=> (go (do (<! (timeout 5000))
#_=> (println (<! echo-buffer))))
#<ManyToManyChannel clojure.core.async.impl.channels.ManyToManyChannel#6cc648a>
typed-clj-test.async=> (>!! echo-buffer "msg1")
true
typed-clj-test.async=> (>!! echo-buffer "msg2")
true
typed-clj-test.async=> (>!! echo-buffer "msg3")
msg1
true
typed-clj-test.async=> (>!! echo-buffer "msg4")

The go block only ever takes one message off echo-buffer. Since the buffer size is 2, the first two puts succeed immediately, the third succeeds once the go block takes msg1, and the fourth blocks until another value is removed from the buffer, which never happens.
In other words, you seem to expect that
(go (do (<! (timeout 5000))
        (println (<! echo-buffer))))
loops, but it won't.
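You can see the buffering behaviour directly with the non-blocking a/offer! helper (a small illustration, not part of the original answer, using the :as a alias from the ns form above):
(let [c (chan 2)]
  ;; with no pending takes, a size-2 buffer accepts exactly two puts;
  ;; the third offer! cannot complete and returns a falsey value
  [(a/offer! c :a) (a/offer! c :b) (a/offer! c :c)])
;; => [true true nil]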
Here is how to make it work:
(def echo-buffer (chan 2))

(go (loop []
      (<! (timeout (* 100 (rand-int 20))))
      (println (<! echo-buffer))
      (recur)))
(>!! echo-buffer "msg1")
(>!! echo-buffer "msg2")
(>!! echo-buffer "msg3")
(>!! echo-buffer "msg4")

Related

Can I use `recur` in this implementation of function composition in Clojure?

Consider this simple-minded recursive implementation of comp in Clojure:
(defn my-comp
  ([f]
   (fn [& args]
     (apply f args)))
  ([f & funcs]
   (fn [& args]
     (f (apply (apply my-comp funcs) args)))))
The right way to do this, I am told, is using recur, but I am unsure how recur works. In particular: is there a way to coax the code above into being recurable?
evaluation 1
First let's visualize the problem. my-comp as it is written in the question will create a deep stack of function calls, each waiting on the stack, blocked until the deepest call returns -
((my-comp inc inc inc) 1)
((fn [& args]
   (inc (apply (apply my-comp '(inc inc)) args))) 1)
(inc (apply (fn [& args]
              (inc (apply (apply my-comp '(inc)) args))) '(1)))
(inc (inc (apply (apply my-comp '(inc)) '(1))))
(inc (inc (apply (fn [& args]
                   (apply inc args)) '(1))))
(inc (inc (apply inc '(1)))) ; ⚠️ deep in the hole we go...
(inc (inc 2))
(inc 3)
4
tail-recursive my-comp
Rather than creating a long sequence of functions, this my-comp is refactored to return a single function, which when called, runs a loop over the supplied input functions -
(defn my-comp [& fs]
  (fn [init]
    (loop [acc init [f & more] fs]
      (if (nil? f)
        acc
        (recur (f acc) more))))) ; 🐍 tail recursion
((my-comp inc inc inc) 1)
;; 4
((apply my-comp (repeat 1000000 inc)) 1)
;; 1000001
evaluation 2
With my-comp rewritten to use loop and recur, we can see linear iterative evaluation of the composition -
((my-comp inc inc inc) 1)
(loop 1 (list inc inc inc))
(loop 2 (list inc inc))
(loop 3 (list inc))
(loop 4 nil)
4
multiple input args
Did you notice ten (10) apply calls at the beginning of this post? This is all in service of supporting multiple arguments for the first function in the my-comp sequence. It is a mistake to tangle this complexity into my-comp itself. The caller can take care of that at the call site if it is the desired behavior.
Without any additional changes to the refactored my-comp -
((my-comp #(apply * %) inc inc inc) '(3 4)) ; ✅ multiple input args
Which evaluates as -
(loop '(3 4) (list #(apply * %) inc inc inc))
(loop 12 (list inc inc inc))
(loop 13 (list inc inc))
(loop 14 (list inc))
(loop 15 nil)
15
right-to-left order
Above (my-comp a b c) will apply a first, then b, and finally c. If you want to reverse that order, a naive solution would be to call reverse at the loop call site -
(defn my-comp [& fs]
  (fn [init]
    (loop [acc init [f & more] (reverse fs)] ; ⚠️ naive
      (if (nil? f)
        acc
        (recur (f acc) more)))))
Each time the returned function is called, (reverse fs) will be recomputed. To avoid this, use a let binding to compute the reversal just once -
(defn my-comp [& fs]
  (let [fs (reverse fs)] ; ✅ reverse once
    (fn [init]
      (loop [acc init [f & more] fs]
        (if (nil? f)
          acc
          (recur (f acc) more))))))
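As a quick check (not from the original answer), the reversed version now matches the argument order of clojure.core/comp, where the rightmost function runs first:
((my-comp #(* 10 %) inc) 3)  ; inc first, then #(* 10 %)
;; 40

((comp #(* 10 %) inc) 3)     ; same result as clojure.core/comp
;; 40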
A way to do this is to rearrange the code so that an intermediate function is passed back up to the definition with recur.
The model would be something like this:
(my-comp #(* 10 %) - +)
(my-comp (fn [& args] (#(* 10 %) (apply - args)))
         +)
(my-comp (fn [& args]
           ((fn [& args] (#(* 10 %) (apply - args)))
            (apply + args))))
The last my-comp call would hit the single-argument overload, ([f] f).
Here's how it could look:
(defn my-comp
  ([f] f)
  ([f & funcs]
   (if (seq funcs)
     (recur (fn [& args]
              (f (apply (first funcs) args)))
            (rest funcs))
     (my-comp f))))
Notice that even though recur cannot itself be the target of apply, it can still take the variadic params, passed as a single sequence in the rest-arg position.
user> ((my-comp (partial repeat 3) #(* 10 %) - +) 1 2 3)
;;=> (-60 -60 -60)
Notice, though, that in practice this implementation isn't really better than yours: while recur saves you from a stack overflow on function creation, it would still overflow on application (somebody correct me if I'm wrong):
(apply my-comp (repeat 1000000 inc)) ;; ok
((apply my-comp (repeat 1000000 inc)) 1) ;; stack overflow
So it would probably be better to use reduce or something else:
(defn my-comp-reduce [f & fs]
  (let [[f & fs] (reverse (cons f fs))]
    (fn [& args]
      (reduce (fn [acc curr-f] (curr-f acc))
              (apply f args)
              fs))))
user> ((my-comp-reduce (partial repeat 3) #(* 10 %) - +) 1 2 3)
;;=> (-60 -60 -60)
user> ((apply my-comp-reduce (repeat 1000000 inc)) 1)
;;=> 1000001
There is already a good answer above, but I think the original suggestion to use recur may have been thinking of a more manual accumulation of the result. In case you haven't seen it, reduce is just a very specific usage of loop/recur:
(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test))

(defn my-reduce
  [step-fn init-val data-vec]
  (loop [accum init-val
         data  data-vec]
    (if (empty? data)
      accum
      (let [accum-next (step-fn accum (first data))
            data-next  (rest data)]
        (recur accum-next data-next)))))

(dotest
  (is= 10  (my-reduce + 0 (range 5)))    ; 0..4
  (is= 120 (my-reduce * 1 (range 1 6)))) ; 1..5
In general, there can be any number of loop variables (not just 2 as with reduce). Using loop/recur gives you a more "functional" way of looping with accumulated state, instead of using an atom and a doseq or something. As the name suggests, from the outside the effect is quite similar to a normal recursion but without any stack size limits (i.e. tail-call optimization).
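For example (a small sketch, not from the original post), a single loop can carry three pieces of accumulated state in one pass, which reduce alone does not express as directly:
;; walk a vector once, tracking the index, the running sum, and the
;; largest element seen so far -- three loop variables in one loop
(defn index-sum-max [xs]
  (loop [i   0
         sum 0
         mx  Long/MIN_VALUE]
    (if (< i (count xs))
      (recur (inc i)
             (+ sum (nth xs i))
             (max mx (nth xs i)))
      {:count i :sum sum :max mx})))

(index-sum-max [3 1 4 1 5])
;; => {:count 5, :sum 14, :max 5}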
P.S. As this example shows, I like to use a let form to very explicitly name the values being generated for the next iteration.
P.P.S. While the compiler will allow you to type the following w/o confusion:
(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test))

(defn my-reduce
  [step-fn accum data]
  (loop [accum accum
         data  data]
    ...))
it can be a bit confusing and/or sloppy to re-use variable names (esp. for people new to Clojure or your particular program).
Also
I would be remiss if I didn't point out that the function definition itself can be a recur target (i.e. you don't need to use loop). Consider this version of the factorial:
(ns tst.demo.core
  (:use demo.core tupelo.core tupelo.test))

(defn fact-impl
  [cum x]
  (if (= x 1)
    cum
    (let [cum-next (* cum x)
          x-next   (dec x)]
      (recur cum-next x-next))))

(defn fact [x] (fact-impl 1 x))

(dotest
  (is= 6   (fact 3))
  (is= 120 (fact 5)))

Idiomatic approach to stopping producer/consumer go-loops?

In an effort to get some good concurrent programming practice I'm trying to implement the producer/consumer pattern using Clojure's core.async library. All is working well but I wanted to be able to stop both the producer and consumer at some point in time.
My current code looks something like this...
(def c (a/chan 5))
(def alive (atom true))

(def producer (a/go-loop []
                (Thread/sleep 1000)
                (when @alive
                  (a/>! c 1)
                  (recur))))

(def consumer (a/go-loop []
                (Thread/sleep 3000)
                (when @alive
                  (println (a/<! c))
                  (recur))))

(do
  (reset! alive false)
  (a/<!! producer)
  (a/<!! consumer))
Unfortunately it appears that the 'do' block occasionally blocks indefinitely. I essentially want to be able to stop both go-loops from continuing and block until both loops have exited. The Thread/sleep code is there to simulate performing some unit of work.
I suspect that stopping the producer causes the consumer to park, hence the hanging, though I'm not sure of an alternative approach, any ideas?
Please see ClojureDocs for details. Example:
(let [c (chan 2)]
  (>!! c 1)
  (>!! c 2)
  (close! c)
  (println (<!! c)) ; 1
  (println (<!! c)) ; 2
  ;; since we closed the channel this will return false (we can no longer add values)
  (>!! c 1))
For your problem something like:
(let [c (a/chan 5)
      producer (a/go-loop [cnt 0]
                 (Thread/sleep 1000)
                 (let [put-result (a/>! c cnt)]
                   (println "put: " cnt put-result)
                   (when put-result
                     (recur (inc cnt)))))
      consumer (a/go-loop []
                 (Thread/sleep 3000)
                 (let [result (a/<! c)]
                   (when result
                     (println "take: " result)
                     (recur))))]
  (Thread/sleep 5000)
  (println "closing chan...")
  (a/close! c))
with this result:
put: 0 true
put: 1 true
take: 0
put: 2 true
put: 3 true
closing chan...
put: 4 false
take: 1
take: 2
take: 3
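Note how take: 1, take: 2, and take: 3 still arrive after the close. Closing a channel only stops new puts; values already in the buffer are still delivered, and takes return nil only once the channel is drained. A small illustration (not from the original answer, same a alias):
(let [c (a/chan 2)]
  (a/>!! c :x)
  (a/>!! c :y)
  (a/close! c)
  [(a/<!! c) (a/<!! c) (a/<!! c)])
;; => [:x :y nil]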

deref an atom after recursive function completes

I have an atom fs that I'm updating inside a recursive function freq-seq; it holds the results of my computation. I have another function, mine-freq-seqs, that starts off freq-seq, and when mine-freq-seqs is done I would like to read the final value of the atom. So I thought I would do it like so:
(ns freq-seq-enum)

(def fs (atom #{}))

(defn locally-frequents
  [sdb min-sup]
  (let [uniq-sdb (map (comp frequencies set) sdb)
        freqs    (apply merge-with + uniq-sdb)]
    (->> freqs
         (filter #(<= min-sup (second %)))
         (map #(vector (str (first %)) (second %))))))

(defn project-sdb
  [sdb prefix]
  (if (empty? prefix)
    sdb
    (into [] (->> sdb
                  (filter #(re-find (re-pattern (str (last prefix))) %))
                  (map #(subs % (inc (.indexOf % (str (last prefix))))))
                  (remove empty?)))))

(defn freq-seq
  [sdb prefix prefix-support min-sup frequent-seqs]
  (if ((complement empty?) prefix) (swap! fs conj [prefix prefix-support]))
  (let [lf (locally-frequents sdb min-sup)]
    (if (empty? lf)
      nil
      (for [[item sup] lf]
        (freq-seq (project-sdb sdb (str prefix item)) (str prefix item) sup min-sup @fs)))))

(defn mine-freq-seqs
  [sdb min-sup]
  (freq-seq sdb "" 0 min-sup @fs))
running it first
(mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2)
then deref-ing the atom
(deref fs)
yields
#{["B" 4]
["BC" 4]
["AB" 4]
["CA" 3]
["CAC" 2]
["AC" 4]
["ABC" 4]
["CAB" 2]
["A" 4]
["CABC" 2]
["ABB" 2]
["CC" 2]
["CB" 3]
["C" 4]
["BB" 2]
["CBC" 2]
["AA" 2]}
However, (doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2) (deref fs))
just gives #{}.
What I want is to let freq-seq recurse to completion and then get the value of the atom fs, so I can call mine-freq-seqs and have the result returned in the REPL instead of having to manually deref it there.
First, some alternate code without the atom, then a look at why you get the empty return.
Here is a more compact version where the subsequences of a string are derived with a reduce rather than the recursion with regex and subs.
Then just run frequencies on those results.
(defn local-seqs
  [s]
  (->> s
       (reduce (fn [acc a] (into acc (map #(conj % a) acc))) #{[]})
       (map #(apply str %))
       (remove empty?)))

(defn freq-seqs
  [sdb min-sup]
  (->> (mapcat local-seqs sdb)
       frequencies
       (filter #(>= (second %) min-sup))
       set))
That's the whole thing!
I haven't involved an atom because I didn't see a need, but you can add one at the end of freq-seqs if you like.
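As a quick check (not part of the original answer), calling it on the question's sample data should reproduce the same set the question got by deref-ing fs, with no atom involved:
(freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2)
;; => #{["B" 4] ["BC" 4] ["AB" 4] ["CA" 3] ...}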
Now for your original question: why the return value that you see?
You are calling doall with 2 args: the (still unrealized, lazy) result of your call and a collection. doall is a function, not a macro, so (deref fs) is evaluated right away, before anything has forced that lazy result.
(defn doall
  ;; <snip>
  ([n coll]        ;; you have passed #{} as coll
   (dorun n coll)  ;; and this line evals to nil
   coll))          ;; and #{} is returned
You have passed your result as the n arg and an empty set as the coll (from (deref fs))
Now when doall calls dorun, it encounters the following:
(defn dorun
  ;; <snip>
  ([n coll]
   (when (and (seq coll) (pos? n))  ;; coll is #{} so (seq coll) is falsey
     (recur (dec n) (next coll))))) ;; and nil is returned
Since the empty set from fs is the second arg (coll), and since and is a macro that short-circuits, (seq coll) is falsey, the when returns nil, and doall returns the empty set that was its second arg.
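You can reproduce the same behaviour with any value in the n position (a tiny illustration, not from the original answer):
(doall (range 5) #{})  ; n = (range 5), coll = #{}
;; => #{}              ; (seq #{}) is nil, so dorun does nothing and doall returns coll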
Final note:
So that is something that works, and why yours failed. As for how to make yours work, to fix the call above I tried:
(do (doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2))
    (deref fs))
That is closer to working, but with the recursion in your process it only forces the eval one level deep. So you could push the doall deeper into your funcs, but I have proposed a completely different internal structure, so I will leave the rest to you if you really need that structure.
I changed it a bit to remove all of the lazy bits (laziness gets forced silently in the REPL, which can be confusing when the behavior changes outside of the REPL). Note the changes with vec, mapv, and doall. At least now I get your result:
(def fs (atom #{}))

(defn locally-frequents
  [sdb min-sup]
  (let [uniq-sdb (map (comp frequencies set) sdb)
        freqs    (apply merge-with + uniq-sdb)]
    (->> freqs
         (filter #(<= min-sup (second %)))
         (mapv #(vector (str (first %)) (second %))))))

(defn project-sdb
  [sdb prefix]
  (if (empty? prefix)
    sdb
    (into [] (->> sdb
                  (filter #(re-find (re-pattern (str (last prefix))) %))
                  (map #(subs % (inc (.indexOf % (str (last prefix))))))
                  (remove empty?)))))

(defn freq-seq
  [sdb prefix prefix-support min-sup frequent-seqs]
  (if ((complement empty?) prefix) (swap! fs conj [prefix prefix-support]))
  (let [lf (locally-frequents sdb min-sup)]
    (if (empty? lf)
      nil
      (vec (for [[item sup] lf]
             (freq-seq (project-sdb sdb (str prefix item)) (str prefix item) sup min-sup @fs))))))

(defn mine-freq-seqs
  [sdb min-sup]
  (freq-seq sdb "" 0 min-sup @fs))
(doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2))
(deref fs) => #{["B" 4] ["BC" 4] ["AB" 4] ["CA" 3]
["CAC" 2] ["AC" 4] ["ABC" 4] ["CAB" 2]
["A" 4] ["CABC" 2] ["ABB" 2] ["CC" 2] ["CB" 3]
["C" 4] ["BB" 2] ["CBC" 2] ["AA" 2]}
I'm still not really sure what the goal is or how/why you get entries like "CABC".

Why can't Clojure's async library handle the Go prime sieve?

To try out the async library in Clojure, I translated the prime sieve example from Go. Running in the REPL, it successfully printed out the prime numbers up to 227 and then stopped. I hit Ctrl-C and tried running it again but it wouldn't print out any more numbers. Is there a way to get Clojure to handle this, or is the async library just not ready for it yet?
;; A concurrent prime sieve translated from
;; https://golang.org/doc/play/sieve.go
(require '[clojure.core.async :as async :refer [<!! >!! chan go]])

(defn generate
  "Sends the sequence 2, 3, 4, ... to channel 'ch'."
  [ch]
  (doseq [i (drop 2 (range))]
    (>!! ch i)))

(defn filter-multiples
  "Copies the values from 'in-chan' to 'out-chan', removing
  multiples of 'prime'."
  [in-chan out-chan prime]
  (while true
    ;; Receive value from 'in-chan'.
    (let [i (<!! in-chan)]
      (if (not= 0 (mod i prime))
        ;; Send 'i' to 'out-chan'.
        (>!! out-chan i)))))

(defn main
  "The prime sieve: Daisy-chain filter-multiples processes."
  []
  (let [ch (chan)]
    (go (generate ch))
    (loop [ch ch]
      (let [prime (<!! ch)]
        (println prime)
        (let [ch1 (chan)]
          (go (filter-multiples ch ch1 prime))
          (recur ch1))))))
go is a macro. If you want to take advantage of goroutine-like behaviour in go blocks you must use <! and >!, and they must be visible to the go macro (that is you mustn't extract these operations into separate functions).
This literal translation of the program at https://golang.org/doc/play/sieve.go seems to work fine, also with a larger i in the main loop:
(require '[clojure.core.async :refer [<! <!! >! chan go]])

(defn go-generate [ch]
  (go (doseq [i (iterate inc 2)]
        (>! ch i))))

(defn go-filter [in out prime]
  (go (while true
        (let [i (<! in)]
          (if-not (zero? (rem i prime))
            (>! out i))))))

(defn main []
  (let [ch (chan)]
    (go-generate ch)
    (loop [i 10 ch ch]
      (if (pos? i)
        (let [prime (<!! ch)]
          (println prime)
          (let [ch1 (chan)]
            (go-filter ch ch1 prime)
            (recur (dec i) ch1)))))))
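Running it in the REPL should print the first ten primes and then return, since the main loop counts i down from 10:
(main)
;; 2
;; 3
;; 5
;; 7
;; 11
;; 13
;; 17
;; 19
;; 23
;; 29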

How can I ensure all data is written to a file from Clojure's core.async channels?

I'm quite new to core.async, and I have been attempting to understand how best to make use of core.async with file IO. I have put together a test that fails to write the last file though printing to the console works. Any idea what I am missing?
First, some functions...
(defn thread-write-to-files [channel]
  (let [writer (atom nil)]
    (thread
      (loop []
        (when-some [value (<!! channel)]
          (if (and (map? value) (= :FILE (:type value)))
            (do (when @writer (.close ^Writer @writer))
                (reset! writer (io/writer (File. ^String (:name value))))
                (recur))
            (do (when @writer
                  (.write @writer value)
                  (println value))
                (recur)))))
      (when @writer
        (do (.flush @writer)
            (.close ^Writer @writer))))))
(defn add-line-number [channel-in channel-out]
  (go-loop [line-number 1]
    (when-some [value (<! channel-in)]
      (if (and (map? value) (= :FILE (:type value)))
        (do (>! channel-out value)
            (recur 1))
        (do (>! channel-out (str line-number ". " value))
            (recur (inc line-number)))))))
Now a test that makes use of them...
(deftest test-thread-write-to-file
  (let [input-coll ["This gets skipped"
                    {:type :FILE :name "foo.txt"}
                    "This is the first line of foo!\n"
                    "This is the second line of foo.\n"
                    {:type :FILE :name "bar.txt"}
                    "Bar me 1.\n"
                    "Bar me 2.\n"
                    "Bar me 3.\n"
                    {:type :FILE :name "baz.txt"}
                    "BBBBBBBBBBB\n"
                    "AAAAAAAAAAA\n"
                    "ZZZZZZZZZZZ\n"]
        input-channel (async/to-chan input-coll)
        output-channel (chan)
        foo (File. "foo.txt")
        bar (File. "bar.txt")
        baz (File. "baz.txt")]
    (when (.exists foo) (.delete foo))
    (when (.exists bar) (.delete bar))
    (when (.exists baz) (.delete baz))
    (add-line-number input-channel output-channel)
    (thread-write-to-files output-channel)
    (Thread/sleep 1000)
    (is (.exists foo))
    (is (.exists bar))
    (is (.exists baz))
    (is (> (.length foo) 0))
    (is (> (.length bar) 0))
    (is (> (.length baz) 0))))
The last condition tested fails. File baz.txt is created, but empty. My REPL prints out each line from the input, so I am confused as to why the file is still empty.
In thread-write-to-files you perform the flush and close of the final file only after the input channel is closed, i.e. when (when-some [value (<!! channel)] ...) gets a nil and the loop exits.
Your test never closes the channel, so this doesn't happen. Try using close! on output-channel, or maybe use onto-chan instead of to-chan to put the test data collection into the system.
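A minimal sketch of that first suggestion (assuming the same core.async refers as in the question): have add-line-number close channel-out once channel-in is exhausted, so the writer thread's loop also ends and the final flush/close runs:
(defn add-line-number [channel-in channel-out]
  (go-loop [line-number 1]
    (if-some [value (<! channel-in)]
      (if (and (map? value) (= :FILE (:type value)))
        (do (>! channel-out value)
            (recur 1))
        (do (>! channel-out (str line-number ". " value))
            (recur (inc line-number))))
      ;; upstream closed: propagate the close so the writer can finish
      (close! channel-out))))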
