Why can't Clojure's async library handle the Go prime sieve?

To try out the async library in Clojure, I translated the prime sieve example from Go. Running in the REPL, it successfully printed out the prime numbers up to 227 and then stopped. I hit Ctrl-C and tried running it again but it wouldn't print out any more numbers. Is there a way to get Clojure to handle this, or is the async library just not ready for it yet?
;; A concurrent prime sieve translated from
;; https://golang.org/doc/play/sieve.go
(require '[clojure.core.async :as async :refer [<!! >!! chan go]])
(defn generate
  [ch]
  "Sends the sequence 2, 3, 4, ... to channel 'ch'."
  (doseq [i (drop 2 (range))]
    (>!! ch i)))
(defn filter-multiples
  [in-chan out-chan prime]
  "Copies the values from 'in-chan' to 'out-chan', removing
  multiples of 'prime'."
  (while true
    ;; Receive value from 'in-chan'.
    (let [i (<!! in-chan)]
      (if (not= 0 (mod i prime))
        ;; Send 'i' to 'out-chan'.
        (>!! out-chan i)))))
(defn main
  []
  "The prime sieve: Daisy-chain filter-multiples processes."
  (let [ch (chan)]
    (go (generate ch))
    (loop [ch ch]
      (let [prime (<!! ch)]
        (println prime)
        (let [ch1 (chan)]
          (go (filter-multiples ch ch1 prime))
          (recur ch1))))))

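go is a macro. If you want to take advantage of goroutine-like behaviour in go blocks you must use <! and >!, and they must be visible to the go macro (that is, you mustn't extract these operations into separate functions). The blocking variants <!! and >!! each tie up a thread from the fixed-size pool that runs go blocks, so once every pool thread is blocked the sieve stalls, which is why the original stops after a few dozen primes.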
This literal translation of the program at https://golang.org/doc/play/sieve.go seems to work fine, also with a larger i in the main loop:
(require '[clojure.core.async :refer [<! <!! >! chan go]])
(defn go-generate [ch]
  (go (doseq [i (iterate inc 2)]
        (>! ch i))))
(defn go-filter [in out prime]
  (go (while true
        (let [i (<! in)]
          (if-not (zero? (rem i prime))
            (>! out i))))))
(defn main []
  (let [ch (chan)]
    (go-generate ch)
    (loop [i 10 ch ch]
      (if (pos? i)
        (let [prime (<!! ch)]
          (println prime)
          (let [ch1 (chan)]
            (go-filter ch ch1 prime)
            (recur (dec i) ch1)))))))

Related

Can I use `recur` in this implementation of function composition in Clojure?

Consider this simple-minded recursive implementation of comp in Clojure:
(defn my-comp
([f]
(fn [& args]
(apply f args)))
([f & funcs]
(fn [& args]
(f (apply (apply my-comp funcs) args)))))
The right way to do this, I am told, is using recur, but I am unsure how recur works. In particular: is there a way to coax the code above into being recurable?
evaluation 1
First let's visualize the problem. my-comp as it is written in the question will create a deep stack of function calls, each waiting on the stack to resolve, blocked until the deepest call returns -
((my-comp inc inc inc) 1)
((fn [& args]
(inc (apply (apply my-comp '(inc inc)) args))) 1)
(inc (apply (fn [& args]
(inc (apply (apply my-comp '(inc)) args))) '(1)))
(inc (inc (apply (apply my-comp '(inc)) '(1))))
(inc (inc (apply (fn [& args]
(apply inc args)) '(1))))
(inc (inc (apply inc '(1)))) ; ⚠️ deep in the hole we go...
(inc (inc 2))
(inc 3)
4
tail-recursive my-comp
Rather than creating a long sequence of functions, this my-comp is refactored to return a single function, which when called, runs a loop over the supplied input functions -
(defn my-comp [& fs]
(fn [init]
(loop [acc init [f & more] fs]
(if (nil? f)
acc
(recur (f acc) more))))) ; 🐍 tail recursion
((my-comp inc inc inc) 1)
;; 4
((apply my-comp (repeat 1000000 inc)) 1)
;; 1000001
evaluation 2
With my-comp rewritten to use loop and recur, we can see linear iterative evaluation of the composition -
((my-comp inc inc inc) 1)
(loop 1 (list inc inc inc))
(loop 2 (list inc inc))
(loop 3 (list inc))
(loop 4 nil)
4
multiple input args
Did you notice ten (10) apply calls at the beginning of this post? This is all in service to support multiple arguments for the first function in the my-comp sequence. It is a mistake to tangle this complexity with my-comp itself. The caller has control to do this if it is the desired behavior.
Without any additional changes to the refactored my-comp -
((my-comp #(apply * %) inc inc inc) '(3 4)) ; ✅ multiple input args
Which evaluates as -
(loop '(3 4) (list #(apply * %) inc inc inc))
(loop 12 (list inc inc inc))
(loop 13 (list inc inc))
(loop 14 (list inc))
(loop 15 nil)
15
right-to-left order
Above (my-comp a b c) will apply a first, then b, and finally c. If you want to reverse that order, a naive solution would be to call reverse at the loop call site -
(defn my-comp [& fs]
(fn [init]
(loop [acc init [f & more] (reverse fs)] ; ⚠️ naive
(if (nil? f)
acc
(recur (f acc) more)))))
Each time the returned function is called, (reverse fs) will be recomputed. To avoid this, use a let binding to compute the reversal just once -
(defn my-comp [& fs]
(let [fs (reverse fs)] ; ✅ reverse once
(fn [init]
(loop [acc init [f & more] fs]
(if (nil? f)
acc
(recur (f acc) more))))))
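As a quick sanity check of the right-to-left version, the sketch below repeats that definition and compares it with clojure.core/comp on single-argument functions. The name double-then-inc is made up for illustration, and the comparison only covers the one-argument case that this my-comp supports.

```clojure
;; Right-to-left my-comp: reverse once, then loop over the functions.
(defn my-comp [& fs]
  (let [fs (reverse fs)]
    (fn [init]
      (loop [acc init
             [f & more] fs]
        (if (nil? f)
          acc
          (recur (f acc) more))))))

;; Hypothetical example: the doubling runs first, then inc.
(def double-then-inc (my-comp inc #(* 2 %)))

(double-then-inc 5)          ; (inc (* 2 5))
((comp inc #(* 2 %)) 5)      ; clojure.core/comp applies in the same order

;; A long chain still runs in constant stack space.
((apply my-comp (repeat 100000 inc)) 0)
```

Because the loop walks a plain sequence of functions, the length of the chain affects running time but not stack depth.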
One way to do this is to rearrange the code so that recur passes an intermediate function back up to the function definition.
The model would be something like this:
(my-comp #(* 10 %) - +)
(my-comp (fn [& args] (#(* 10 %) (apply - args)))
+)
(my-comp (fn [& args]
((fn [& args] (#(* 10 %) (apply - args)))
(apply + args))))
The last my-comp call would use the first my-comp arity, (my-comp [f]).
Here's how it could look:
(defn my-comp
([f] f)
([f & funcs]
(if (seq funcs)
(recur (fn [& args]
(f (apply (first funcs) args)))
(rest funcs))
(my-comp f))))
Notice that although recur cannot be used with apply, the recur form still accepts the variadic params passed as a single sequence.
user> ((my-comp (partial repeat 3) #(* 10 %) - +) 1 2 3)
;;=> (-60 -60 -60)
notice, though, that in practice this implementation isn't really better than yours: while recur saves you from stack overflow on function creation, it would still overflow on application (somebody, correct me if i'm wrong):
(apply my-comp (repeat 1000000 inc)) ;; ok
((apply my-comp (repeat 1000000 inc)) 1) ;; stack overflow
so it would probably be better to use reduce or something else:
(defn my-comp-reduce [f & fs]
(let [[f & fs] (reverse (cons f fs))]
(fn [& args]
(reduce (fn [acc curr-f] (curr-f acc))
(apply f args)
fs))))
user> ((my-comp-reduce (partial repeat 3) #(* 10 %) - +) 1 2 3)
;;=> (-60 -60 -60)
user> ((apply my-comp-reduce (repeat 1000000 inc)) 1)
;;=> 1000001
There is already a good answer above, but I think the original suggestion to use recur may have been thinking of a more manual accumulation of the result. In case you haven't seen it, reduce is just a very specific usage of loop/recur:
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test))
(defn my-reduce
[step-fn init-val data-vec]
(loop [accum init-val
data data-vec]
(if (empty? data)
accum
(let [accum-next (step-fn accum (first data))
data-next (rest data)]
(recur accum-next data-next)))))
(dotest
  (is= 10 (my-reduce + 0 (range 5)))      ; 0..4
  (is= 120 (my-reduce * 1 (range 1 6))))  ; 1..5
In general, there can be any number of loop variables (not just 2 like for reduce). Using loop/recur gives you a more "functional" way of looping with accumulated state instead of using an atom and a doseq or something. As the name suggests, from the outside the effect is quite similar to a normal recursion without any stack size limits (i.e. tail-call optimization).
P.S. As this example shows, I like to use a let form to very explicitly name the values being generated for the next iteration.
P.P.S. While the compiler will allow you to type the following w/o confusion:
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test))
(defn my-reduce
[step-fn accum data]
(loop [accum accum
data data]
...))
it can be a bit confusing and/or sloppy to re-use variable names (esp. for people new to Clojure or your particular program).
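To illustrate the remark that loop can carry any number of loop variables, here is a small sketch with three of them. The function fib and its local names are made up for this example; it follows the same habit of naming the next-iteration values in a let, and starts from bigint literals so large inputs do not overflow.

```clojure
(defn fib
  "Returns the nth Fibonacci number (0-indexed: 0, 1, 1, 2, 3, 5, ...)."
  [n]
  (loop [a 0N          ; current Fibonacci number
         b 1N          ; next Fibonacci number
         remaining n]  ; steps still to take
    (if (zero? remaining)
      a
      (let [a-next         b
            b-next         (+ a b)
            remaining-next (dec remaining)]
        (recur a-next b-next remaining-next)))))

(fib 10)
```

Since recur reuses the same frame, a call such as (fib 10000) completes without growing the stack.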
Also
I would be remiss if I didn't point out that the function definition itself can be a recur target (i.e. you don't need to use loop). Consider this version of the factorial:
(ns tst.demo.core
(:use demo.core tupelo.core tupelo.test))
(defn fact-impl
[cum x]
(if (= x 1)
cum
(let [cum-next (* cum x)
x-next (dec x)]
(recur cum-next x-next))))
(defn fact [x] (fact-impl 1 x))
(dotest
(is= 6 (fact 3))
(is= 120 (fact 5)))

deref an atom after recursive function completes

I have an atom fs that I'm updating inside a recursive function freq-seq that's the value that holds the results of my computation. I have another function mine-freq-seqs to start freq-seq and when mine-freq-seqs is done I would like to receive the last value of said atom. So I thought I would do it like so
(ns freq-seq-enum)
(def fs (atom #{}))
(defn locally-frequents
[sdb min-sup]
(let [uniq-sdb (map (comp frequencies set) sdb)
freqs (apply merge-with + uniq-sdb)]
(->> freqs
(filter #(<= min-sup (second %)))
(map #(vector (str (first %)) (second %))))))
(defn project-sdb
[sdb prefix]
(if (empty? prefix) sdb
(into [] (->> sdb
(filter #(re-find (re-pattern (str (last prefix))) %))
(map #(subs % (inc (.indexOf % (str (last prefix))))))
(remove empty?)))))
(defn freq-seq
[sdb prefix prefix-support min-sup frequent-seqs]
(if ((complement empty?) prefix) (swap! fs conj [prefix prefix-support]))
(let [lf (locally-frequents sdb min-sup)]
(if (empty? lf) nil
(for [[item sup] lf] (freq-seq (project-sdb sdb (str prefix item)) (str prefix item) sup min-sup @fs)))))
(defn mine-freq-seqs
[sdb min-sup]
(freq-seq sdb "" 0 min-sup @fs))
running it first
(mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2)
then deref-ing the atom
(deref fs)
yields
#{["B" 4]
["BC" 4]
["AB" 4]
["CA" 3]
["CAC" 2]
["AC" 4]
["ABC" 4]
["CAB" 2]
["A" 4]
["CABC" 2]
["ABB" 2]
["CC" 2]
["CB" 3]
["C" 4]
["BB" 2]
["CBC" 2]
["AA" 2]}
however (doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2) (deref fs))
just gives #{}
What I want is to let freq-seq recurse to completion and then get the value of the atom fs, so that I can call mine-freq-seqs and have my result returned in the REPL instead of having to manually deref it there.
First, some alternative code without the atom; then a look at why you get the empty result.
Here is a more compact version in which the subsequences of each string are derived with a reduce rather than by recursing with a regex and subs.
Then just call frequencies on those results.
(defn local-seqs
[s]
(->> s
(reduce (fn [acc a] (into acc (map #(conj % a) acc))) #{[]})
(map #(apply str %))
(remove empty?)))
(defn freq-seqs
[sdb min-sup]
(->> (mapcat local-seqs sdb)
frequencies
(filter #(>= (second %) min-sup))
set))
That's the whole thing!
I haven't involved an atom because I didn't see a need, but add it at the end of freq-seqs if you like.
For your original question: why the return that you see?
You are calling doall with 2 args, the result of your call and a collection. doall is a function and not a macro so the deref is performed immediately.
(defn doall
;; <snip>
([n coll] ;; you have passed #{} as coll
(dorun n coll) ;; and this line evals to nil
coll) ;; and #{} is returned
You have passed your result as the n arg and an empty set as the coll (from (deref fs))
Now when doall calls dorun, it encounters the following:
(defn dorun
;; <snip>
([n coll]
(when (and (seq coll) (pos? n)) ;; coll is #{} so the seq is falsey
(recur (dec n) (next coll)))) ;; and a nil is returned
Since the empty set from fs is the second arg (coll), the and macro short-circuits on the falsey (seq coll) and returns nil; doall then returns the empty set that was its second arg.
Final note:
So that is something that works and why yours failed. As to how to make yours work, to fix the call above I tried:
(do (doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2))
(deref fs))
That is closer to working, but because of the recursion in your process it only forces evaluation one level deep. You could push the doall deeper into your functions, but since I have proposed a completely different internal structure I will leave the rest to you if you really need that structure.
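The "one level deep" behaviour can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the atoms hits and nested-hits and the function record-all are invented for the demonstration. It shows that a side effect inside for does not run until the sequence is realized, and that doall on an outer for leaves inner lazy sequences untouched.

```clojure
(def hits (atom []))

(defn record-all
  "Returns a lazy seq; each element is recorded in `hits` only when realized."
  [xs]
  (for [x xs]
    (swap! hits conj x)))

;; def does not realize the seq (printing it at the REPL would).
(def pending (record-all [1 2 3]))
(def hits-before @hits)            ; still empty: nothing realized yet

(dorun pending)                    ; walk the seq purely for its side effects
(def hits-after @hits)

;; doall forces only the outer level of a nested lazy seq.
(def nested-hits (atom []))
(def outer
  (doall (for [x [1 2]]
           (for [y [:a :b]]
             (swap! nested-hits conj [x y])))))
(def nested-before @nested-hits)   ; inner seqs are still unrealized

(run! dorun outer)                 ; force every inner seq as well
(def nested-after @nested-hits)
```

In a recursive function such as freq-seq the nesting depth is not fixed, so the forcing has to happen inside the recursion itself (for example with vec, mapv, or doseq) rather than once at the top.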
I changed it a bit to remove all of the lazy bits (this happens silently in the repl but can be confusing when it changes outside of the repl). Note the changes with vec, mapv, and doall. At least now I get your result:
(def fs (atom #{}))
(defn locally-frequents
[sdb min-sup]
(let [uniq-sdb (map (comp frequencies set) sdb)
freqs (apply merge-with + uniq-sdb)]
(->> freqs
(filter #(<= min-sup (second %)))
(mapv #(vector (str (first %)) (second %))))))
(defn project-sdb
[sdb prefix]
(if (empty? prefix)
sdb
(into [] (->> sdb
(filter #(re-find (re-pattern (str (last prefix))) %))
(map #(subs % (inc (.indexOf % (str (last prefix))))))
(remove empty?)))))
(defn freq-seq
[sdb prefix prefix-support min-sup frequent-seqs]
(if ((complement empty?) prefix) (swap! fs conj [prefix prefix-support]))
(let [lf (locally-frequents sdb min-sup)]
(if (empty? lf)
nil
(vec (for [[item sup] lf] (freq-seq (project-sdb sdb (str prefix item)) (str prefix item) sup min-sup @fs))))))
(defn mine-freq-seqs
[sdb min-sup]
(freq-seq sdb "" 0 min-sup @fs))
(doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2))
(deref fs) => #{["B" 4] ["BC" 4] ["AB" 4] ["CA" 3]
["CAC" 2] ["AC" 4] ["ABC" 4] ["CAB" 2]
["A" 4] ["CABC" 2] ["ABB" 2] ["CC" 2] ["CB" 3]
["C" 4] ["BB" 2] ["CBC" 2] ["AA" 2]}
I'm still not really sure what the goal is or how/why you get entries like "CABC".

If the only non-stack-consuming looping construct in Clojure is "recur", how does this lazy-seq work?

The ClojureDocs page for lazy-seq gives an example of generating a lazy-seq of all positive numbers:
(defn positive-numbers
([] (positive-numbers 1))
([n] (cons n (lazy-seq (positive-numbers (inc n))))))
This lazy-seq can be evaluated for pretty large indexes without throwing a StackOverflowError (unlike the sieve example on the same page):
user=> (nth (positive-numbers) 99999999)
100000000
If only recur can be used to avoid consuming stack frames in a recursive function, how is it possible this lazy-seq example can seemingly call itself without overflowing the stack?
A lazy sequence has the rest of the sequence generating calculation in a thunk. It is not immediately called. As each element (or chunk of elements as the case may be) is requested, a call to the next thunk is made to retrieve the value(s). That thunk may create another thunk to represent the tail of the sequence if it continues. The magic is that (1) these special thunks implement the sequence interface and can transparently be used as such and (2) each thunk is only called once -- its value is cached -- so the realized portion is a sequence of values.
Here is the general idea without the magic, just good ol' functions:
(defn my-thunk-seq
([] (my-thunk-seq 1))
([n] (list n #(my-thunk-seq (inc n)))))
(defn my-next [s] ((second s)))
(defn my-realize [s n]
(loop [a [], s s, n n]
(if (pos? n)
(recur (conj a (first s)) (my-next s) (dec n))
a)))
user=> (-> (my-thunk-seq) first)
1
user=> (-> (my-thunk-seq) my-next first)
2
user=> (my-realize (my-thunk-seq) 10)
[1 2 3 4 5 6 7 8 9 10]
user=> (count (my-realize (my-thunk-seq) 100000))
100000 ; Level stack consumption
The magic bits happen inside of clojure.lang.LazySeq defined in Java, but we can actually do the magic directly in Clojure (implementation that follows for example purposes), by implementing the interfaces on a type and using an atom to cache.
(deftype MyLazySeq [thunk-mem]
  clojure.lang.Seqable
  (seq [_]
    (if (fn? @thunk-mem)
      (swap! thunk-mem (fn [f] (seq (f)))))
    @thunk-mem)
  ;Implementing ISeq is necessary because cons calls seq
  ;on anyone who does not, which would force realization.
  clojure.lang.ISeq
  (first [this] (first (seq this)))
  (next [this] (next (seq this)))
  (more [this] (rest (seq this)))
  (cons [this x] (cons x (seq this))))
(defmacro my-lazy-seq [& body]
  `(MyLazySeq. (atom (fn [] ~@body))))
Now this already works with take, etc., but as take calls lazy-seq we'll make a my-take that uses my-lazy-seq instead to eliminate any confusion.
(defn my-take
[n coll]
(my-lazy-seq
(when (pos? n)
(when-let [s (seq coll)]
(cons (first s) (my-take (dec n) (rest s)))))))
Now let's make a slow infinite sequence to test the caching behavior.
(defn slow-inc [n] (Thread/sleep 1000) (inc n))
(defn slow-pos-nums
([] (slow-pos-nums 1))
([n] (cons n (my-lazy-seq (slow-pos-nums (slow-inc n))))))
And the REPL test
user=> (def nums (slow-pos-nums))
#'user/nums
user=> (time (doall (my-take 10 nums)))
"Elapsed time: 9000.384616 msecs"
(1 2 3 4 5 6 7 8 9 10)
user=> (time (doall (my-take 10 nums)))
"Elapsed time: 0.043146 msecs"
(1 2 3 4 5 6 7 8 9 10)
Keep in mind that lazy-seq is a macro, and therefore does not evaluate its body when your positive-numbers function is called. In that sense, positive-numbers isn't truly recursive. It returns immediately, and the inner "recursive" call to positive-numbers doesn't happen until the seq is consumed.
user=> (source lazy-seq)
(defmacro lazy-seq
"Takes a body of expressions that returns an ISeq or nil, and yields
a Seqable object that will invoke the body only the first time seq
is called, and will cache the result and return it on all subsequent
seq calls. See also - realized?"
{:added "1.0"}
[& body]
(list 'new 'clojure.lang.LazySeq (list* '^{:once true} fn* [] body)))
I think the trick is that the producer function (positive-numbers) isn't really being called recursively. It doesn't accumulate stack frames the way basic Little-Schemer-style recursion would, because LazySeq invokes it only as needed for each individual entry in the sequence. Each invocation returns before the next one starts, so its stack frame is already gone when the next element is requested, and once a closure has been evaluated for an entry it can be discarded. Entries that have already been consumed can be garbage-collected as the code churns through the sequence, as long as nothing holds on to the head.
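One way to see that the "recursive" call is deferred is to count how often the function body runs. The sketch below is an illustration under that assumption: counted-numbers and the calls atom are hypothetical names, and the function is simply positive-numbers with a counter added.

```clojure
(def calls (atom 0))

(defn counted-numbers
  "Like positive-numbers, but counts how many times the function body runs."
  [n]
  (swap! calls inc)
  (cons n (lazy-seq (counted-numbers (inc n)))))

(def s (counted-numbers 1))
(def calls-after-def @calls)      ; only the outermost call has run

(def fifth (nth s 4))
(def calls-after-nth @calls)      ; one call per element realized so far

(nth s 4)
(def calls-after-repeat @calls)   ; realized elements are cached: no new calls
```

Each call returns a cons cell immediately, so the calls happen one after another at constant stack depth rather than nested inside each other.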

In Clojure, is it possible to combine memoization and tail call optimization?

In clojure, I would like to write a tail-recursive function that memoizes its intermediate results for subsequent calls.
[EDIT: this question has been rewritten using gcd as an example instead of factorial.]
The memoized gcd (greatest common divisor) could be implemented like this:
(def gcd (memoize (fn [a b]
                    (if (zero? b)
                      a
                      (recur b (mod a b))))))
In this implementation, intermediate results are not memoized for subsequent calls. For example, in order to calculate gcd(9,6), gcd(6,3) is called as an intermediate result. However, gcd(6,3) is not stored in the cache of the memoized function because the recursion point of recur is the anonymous function that is not memoized.
Therefore, if after having called gcd(9,6), we call gcd(6,3) we won't benefit from the memoization.
The only solution I can think of is to use mundane recursion (explicitly calling gcd instead of recur), but then we will not benefit from tail call optimization.
Bottom Line
Is there a way to achieve both:
Tail call optimization
Memoization of intermediate results for subsequent calls
Remarks
This question is similar to Combine memoization and tail-recursion. But all the answers there are related to F#. Here, I am looking for an answer in clojure.
This question has been left as an exercise for the reader by The Joy of Clojure (chap 12.4). You can consult the relevant page of the book at http://bit.ly/HkQrio.
in your case it's hard to show memoize do anything with factorial because the intermediate calls are unique, so I'll rewrite a somewhat contrived example assuming the point is to explore ways to avoid blowing the stack:
(defn stack-popper [n i]
(if (< i n) (* i (stack-popper n (inc i))) 1))
which can then get something out of a memoize:
(def stack-popper
(memoize (fn [n i] (if (< i n) (* i (stack-popper n (inc i))) 1))))
the general approaches to not blowing the stack are:
use tail calls
(def stack-popper
(memoize (fn [n acc] (if (> n 1) (recur (dec n) (* acc (dec n))) acc))))
use trampolines
(def stack-popper
(memoize (fn [n acc]
(if (> n 1) #(stack-popper (dec n) (* acc (dec n))) acc))))
(trampoline (stack-popper 4 1))
use a lazy sequence
(reduce * (range 1 4))
None of these works all the time, though I have yet to hit a case where none of them works. I almost always go for the lazy ones first because I find them the most Clojure-like, then I head for tail calls with recur or trampolines.
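Trampolines are most useful where recur cannot reach, for example mutual recursion. Here is a minimal sketch using clojure.core/trampoline; the functions my-even? and my-odd? are made up for the illustration and are not part of the question.

```clojure
(declare my-odd?)

(defn my-even? [n]
  (if (zero? n)
    true
    #(my-odd? (dec n))))    ; return a thunk instead of calling directly

(defn my-odd? [n]
  (if (zero? n)
    false
    #(my-even? (dec n))))

;; trampoline keeps calling the returned thunks until it gets a non-function.
(trampoline my-even? 1000000)
```

Each step returns to trampoline before the next one starts, so the stack stays flat even for a million alternations.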
(defmacro memofn
  [name args & body]
  `(let [cache# (atom {})]
     (fn ~name [& args#]
       (let [update-cache!# (fn update-cache!# [state# args#]
                              (if-not (contains? state# args#)
                                (assoc state# args#
                                       (delay
                                         (let [~args args#]
                                           ~@body)))
                                state#))]
         (let [state# (swap! cache# update-cache!# args#)]
           (-> state# (get args#) deref))))))
This will allow a recursive definition of a memoized function, which also caches intermediate results. Usage:
(def fib (memofn fib [n]
(case n
1 1
0 1
(+ (fib (dec n)) (fib (- n 2))))))
(def gcd
  (let [cache (atom {})]
    (fn [a b]
      @(or (@cache [a b])
           (let [p (promise)]
             (deliver p
                      (loop [a a b b]
                        (if-let [p2 (@cache [a b])]
                          @p2
                          (do
                            (swap! cache assoc [a b] p)
                            (if (zero? b)
                              a
                              (recur b (mod a b))))))))))))
There are some concurrency issues (double evaluation, the same problem as with memoize, but worse because of the promises) which may be fixed using @kotarak's advice.
Turning the above code into a macro is left as an exercise to the reader. (Fogus's note was imo tongue-in-cheek.)
Turning this into a macro is really a simple exercise in macrology; note that the body (the last 3 lines) remains unchanged.
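For comparison, here is a simpler sketch of the same idea without promises. It is an alternative written for illustration, not the code above: cached-gcd and gcd-cache are hypothetical names, and the cache is a plain map in an atom. It relies on the fact that every pair visited on the way down has the same gcd, so all of them can be stored once the answer is known. Under concurrent use it may repeat work, but it never stores a wrong value.

```clojure
(def gcd-cache (atom {}))

(defn cached-gcd
  "Tail-recursive gcd that caches the result for every intermediate pair."
  [a b]
  (loop [a a
         b b
         seen []]                       ; pairs visited so far in this call
    (if-let [hit (get @gcd-cache [a b])]
      ;; Cache hit: all pairs seen on the way share this result.
      (do (swap! gcd-cache into (map (fn [pair] [pair hit]) seen))
          hit)
      (if (zero? b)
        ;; Base case: record the answer for the whole chain, including [a 0].
        (do (swap! gcd-cache into (map (fn [pair] [pair a]) (conj seen [a b])))
            a)
        (recur b (mod a b) (conj seen [a b]))))))

(cached-gcd 9 6)
(get @gcd-cache [6 3])   ; the intermediate pair is now cached
```

The trade-off is an extra vector of visited pairs per call, in exchange for keeping recur and caching intermediate results at the same time.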
Using Clojure's recur you can write factorial using an accumulator that has no stack growth, and just memoize it:
(defn fact
([n]
(fact n 1))
([n acc]
(if (= 1 n)
acc
(recur (dec n)
(* n acc)))))
This is a factorial function implemented with anonymous recursion, using a tail call and memoization of intermediate results. The memoization is integrated with the function, and a reference to the shared buffer (implemented with an atom) is captured by a lexical closure.
Since the factorial function operates on natural numbers and the arguments for successive results are incremental, a vector seems the better-suited data structure for storing buffered results.
Instead of passing the result of a previous computation as an argument (accumulator), we get it from the buffer.
(def !                                             ; global variable referring to a function
  (let [m (atom [1 1 2 6 24])]                     ; buffer of results
    (fn [n]                                        ; factorial function definition
      (let [m-count (count @m)]                    ; number of results in a buffer
        (if (< n m-count)                          ; do we have buffered result for n?
          (nth @m n)                               ; · yes: return it
          (loop [cur m-count]                      ; · no: compute it recursively
            (let [r (*' (nth @m (dec cur)) cur)]   ; new result
              (swap! m assoc cur r)                ; store the result
              (if (= n cur)                        ; termination condition:
                r                                  ; · base case
                (recur (inc cur))))))))))          ; · recursive case
(time (do (! 8000) nil)) ; => "Elapsed time: 154.280516 msecs"
(time (do (! 8001) nil)) ; => "Elapsed time: 0.100222 msecs"
(time (do (! 7999) nil)) ; => "Elapsed time: 0.090444 msecs"
(time (do (! 7999) nil)) ; => "Elapsed time: 0.055873 msecs"

How to return the output of a recursive function in Clojure

I'm new to functional languages and clojure, so please bear with me...
I'm trying to construct a list of functions, with either random parameters or constants. The function that constructs the list of functions is already working, though it doesn't return the function itself. I verified this using println.
(edit: Okay, it isn't working correctly yet after all)
(edit: Now it's working, but it cannot be "eval"-ed. it seems I need to recur at least two times, to ensure there are at least two children nodes. Is this possible?)
Here is the snippet:
(def operations (list #(- %1 %2) #(+ %1 %2) #(* %1 %2) #(/ %1 %2)))
(def parameters (list \u \v \w \x \y \z))
(def parameterlistcount 6)
(def paramcount 2)
(def opcount 4)
(defn generate-function
([] (generate-function 2 4 0.5 0.6 () parameters))
([pc maxdepth fp pp function-list params]
(if (and (pos? maxdepth) (< (rand) fp))
(let [function-list
(conj function-list
(nth operations
(rand-int (count operations))))]
(recur pc (dec maxdepth) fp pp function-list params))
(if (and (< (rand) pp) (pos? pc))
(let [ params (pop parameters)
function-list
(conj function-list
(nth params
(rand-int (count params))))]
(if (contains? (set operations) (last function-list) )
(recur (dec pc) maxdepth fp pp function-list params)
nil))
(let [function-list
(conj function-list
(rand-int 100))]
(if (or (pos? maxdepth) (pos? pc))
(if (contains? (set operations) (last function-list) )
(recur pc maxdepth fp pp function-list params)
nil)
function-list))))))
Any help will be appreciated, thanks!
Here's my shot at rewriting your function (see comments below):
(defn generate-function
([] (generate-function 2 4 0.5 0.6 ()))
([pc maxdepth fp pp function-list]
(if (and (pos? maxdepth) (< (rand) fp))
(let [function-list
(conj function-list
{:op
(nth operations
(rand-int (count operations)))})]
(recur pc (dec maxdepth) fp pp function-list))
(if (and (< (rand) pp) (pos? pc))
(let [function-list
(conj function-list
{:param
(nth parameters
(rand-int (count parameters)))})]
(recur (dec pc) maxdepth fp pp function-list))
(let [function-list
(conj function-list
{:const
(rand-int 100)})]
(if (or (pos? maxdepth) (pos? pc))
(recur pc maxdepth fp pp function-list)
function-list))))))
And some examples of use from my REPL...
user> (generate-function)
({:const 63} {:op #<user$fn__4557 user$fn__4557@6cbb2d>} {:const 77} {:param \w} {:op #<user$fn__4559 user$fn__4559@8e68bd>} {:const 3} {:param \v} {:const 1} {:const 8} {:op #<user$fn__4559 user$fn__4559@8e68bd>} {:op #<user$fn__4555 user$fn__4555@6f0962>})
user> (generate-function)
({:const 27} {:param \y} {:param \v} {:op #<user$fn__4561 user$fn__4561@10463c3>} {:op #<user$fn__4561 user$fn__4561@10463c3>} {:op #<user$fn__4561 user$fn__4561@10463c3>} {:op #<user$fn__4561 user$fn__4561@10463c3>} {:const 61})
A couple of things to keep in mind, in pretty random order:
I used recur in the above to avoid consuming stack in the recursive self-calls. However, you have this dotimes statement which makes me wonder if you might be interested in constructing a bunch of function-lists in parallel with one generate-function call. If so, tail-recursion with recur might not be an option with simplistic code like this, so you could either settle for the regular self-calls (but do consider the possibility of hitting the recursion limit; if you're positive that you'll only generate smallish functions and this won't be a problem, go ahead with the self-calls) or investigate continuation-passing style and rewrite your function in that style.
The (do (dec pc) ...) thing in your code does nothing to the value of pc in the next recursive call, or indeed to its current value. Local variables (or locals, as they are most often called in the community) in Clojure are immutable; this includes function parameters. If you want to pass along a decremented pc to some function, you'll have to do just that, like you did with maxdepth in an earlier branch of your code.
I renamed your function to generate-function, because camel case in function names is quite unusual in Clojure land. Also, I renamed the parameter which you called function to function-list (so maybe I should have used a name like generate-function-list for the function... hm), because that's what it is for now.
Note that there's no point to keeping a separate opcount Var around; Clojure's persistent lists (as created by the list function) carry their count around, so (count some-list) is a constant-time operation (and very fast). Also, it would be idiomatic to use vectors for operations and parameters (and you can switch to vectors without changing anything in the rest of the code!). E.g. [\u \v \w \x \y \z].
In Clojure 1.2, you'll be able to use (rand-nth coll) for (nth coll (rand-int (count coll))).
If you want to generate actual Clojure functions from trees of items representing ops, params and constants, you'll want to use eval. That's discouraged in most scenarios, but not for evolutionary programming and similar stuff where it's the only way to go.
Personally, I'd use a different format of the op/param/constant maps: something like {:category foo, :content bar} where foo is :op, :param or :const and bar is something appropriate in connection to any given foo.
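As a rough illustration of the rand-nth and eval remarks, the sketch below builds a random arithmetic form as plain data and turns it into a function. Everything here is an assumption made for the example: the names op-symbols, param-symbols, random-expr and expr->fn are hypothetical, symbols are used instead of function objects so the form can be passed to eval, division is left out to avoid dividing by zero, and the probabilities 0.5 and 0.6 simply mirror the defaults in the question. It uses ordinary recursion rather than recur because each operator has two children; the depth limit keeps the stack small.

```clojure
(def op-symbols '[+ - *])     ; assumed operator set for this sketch
(def param-symbols '[x y])    ; assumed parameter names

(defn random-expr
  "Builds a random arithmetic form with at most `depth` nested operations."
  [depth]
  (cond
    ;; Branch: pick an operator and generate two sub-expressions.
    (and (pos? depth) (< (rand) 0.5))
    (list (rand-nth op-symbols)
          (random-expr (dec depth))
          (random-expr (dec depth)))

    ;; Leaf: a parameter...
    (< (rand) 0.6)
    (rand-nth param-symbols)

    ;; ...or a constant.
    :else
    (rand-int 100)))

(defn expr->fn
  "Wraps a form over x and y in a two-argument function via eval."
  [expr]
  (eval `(fn [~'x ~'y] ~expr)))

((expr->fn '(+ x (* y 2))) 1 2)
((expr->fn (random-expr 3)) 1 2)
```

Keeping the tree as quoted data until the last moment also makes it easy to print, mutate, or cross over expressions before evaluating them.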
In general it is a better idea in Clojure to use (recur ...) for your recursive functions. From the docs: "Note that recur is the only non-stack-consuming looping construct in Clojure."
One other thing to note is that you might want to call the randomizer outside the recursive function, so you can define the stop-condition inside the function.
So like this:
(let [n (rand-int 10)]
(println "Let's define f with n =" n)
(defn f [x]
(if (> x n)
"result"
(do (println x)
(recur (inc x))))))
It prints:
Let's define f with n = 4
user> (f 0)
0
1
2
3
4
"result"
where 4 is of course a random number between 0 (inclusive) and 10 (exclusive).
So okay, I discovered I was going about this the wrong way.
A recursive definition of a tree is none other than defining vertices and tying everything together with them. So I came up with this in less than 15 minutes. >_<
(defn generate-branch
"Generate branches for tree"
([] (generate-branch 0.6 () (list \x \y \z)))
([pp branch params]
(loop [branch
(conj branch (nth operations (rand-int (count operations))))]
(if (= (count branch) 3)
branch
(if (and (< (rand) pp))
(recur (conj branch (nth params (rand-int (count params)))))
(recur (conj branch (rand-int 100))))))))
(defn build-vertex
"Generates a vertex from branches"
[]
(let [vertex (list (nth operations (rand-int (count operations))))]
(conj vertex (take 5 (repeatedly generate-branch)))))
THANKS EVERYONE!
