Adding values to vectors and sets

Here is what I have:
(def my-atom (atom []))
(defn add-to-my-atom! [x]
(swap! my-atom conj x))
How do I append a value to the vector only if it's not already present? I want to be able to supply a predicate for the test. For example, Common Lisp has pushnew:
pushnew item place &key key test test-not
Is there something similar in Clojure? Perhaps I should use sets instead of vectors. Fine, but how do you define the predicate that a set uses to compare its values? For example, if a set contains strings and differences in case should not affect set operations, how does Clojure deal with that?

Using a vector:
user> (def my-atom (atom []))
(defn push-new
[place item pred]
(if (pred place item)
(conj place item)
place))
(defn add-to-my-atom!
[x]
(swap! my-atom push-new x
(fn [place item]
(not (some #(= (.toLowerCase %)
(.toLowerCase item))
place)))))
#'user/add-to-my-atom!
user> (add-to-my-atom! "Hello World!")
["Hello World!"]
user> (add-to-my-atom! "hello world!")
["Hello World!"]
user> (add-to-my-atom! "ABCDE")
["Hello World!" "ABCDE"]
user> (add-to-my-atom! "abcde")
["Hello World!" "ABCDE"]
Using a set with a custom sorting comparator:
user> (def my-atom (atom (sorted-set-by (fn [a b] (compare (.toLowerCase a) (.toLowerCase b))))))
#'user/my-atom
user> (swap! my-atom conj "Hello")
#{"Hello"}
user> (swap! my-atom conj "hello")
#{"Hello"}
user> (swap! my-atom conj "abc")
#{"abc" "Hello"}
user> (swap! my-atom conj "Abc")
#{"abc" "Hello"}

If you were not working with atoms, the fn to add to a vector if it is not there would be:
(defn push-new [v value]
(if (some #{value} v)
v
(conj v value)))
Now you can easily use that fn to move from one value of the atom to the next:
(defn add-to-my-atom [the-atom x]
(swap! the-atom push-new x))
Depending on your use case, a set can be more appropriate. Clojure relies on the equals and hashCode implementations of the objects you put into the set, unless you use sorted-set-by, which compares elements with the comparator you supply. Alternatively, you can simply uppercase the strings before putting them into the set.
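For the case-insensitive string example, the "normalize before inserting" idea might look like the sketch below. The names seen-names and add-name! are hypothetical, and lower-casing rather than upper-casing is an arbitrary choice. Note that the original casing is lost this way, which the sorted-set-by approach avoids.

```clojure
(require '[clojure.string :as str])

;; Hypothetical atom holding a plain hash set of normalized strings.
(def seen-names (atom #{}))

(defn add-name!
  "Lower-cases s and adds it to seen-names; returns the new set."
  [s]
  (swap! seen-names conj (str/lower-case s)))

(add-name! "Hello")
(add-name! "HELLO")
@seen-names ;=> #{"hello"}
```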

Related

Clojure: Vector is not immutable

I'm running into a problem where immutability suddenly doesn't hold for my vectors. I was wondering if there was a way to create fresh, immutable vector copies of a given set.
Clojuredocs suggested "aclone" but that is giving me an error stating that there's no such method.
(defn stripSame [word wordList]
(def setVec (into #{} wordList))
(def wordSet word)
(def wordVec (into #{} [wordSet]))
(def diffSet (set/difference setVec wordVec))
(def diffVec (into [] diffSet))
diffVec)
(defn findInsOuts [word passList]
(def wordList (stripSame word passList))
(println word wordList)
(def endLetter (subs word (dec (count word))))
(def startLetter (subs word 0 1))
(println startLetter endLetter)
(def outs (filter (partial starts endLetter) wordList))
(def ins (filter (partial ends startLetter) wordList))
;(println ins outs)
(def indexes [ (count ins) (count outs)])
indexes)
(defn findAll [passList]
(def wordList (into [] passList) ))
(println wordList)
(loop [n 0 indexList []]
(println "In here" (get wordList n) wordList)
(if (< n (count wordList))
(do
(def testList wordList)
(def indexes (findInsOuts (get wordList n) testList))
(println (get wordList n) indexes)
(recur (inc n) (conj indexList [(get wordList n) indexes]))))))
passList is a list of words like so (lol at good) which is then cast into a vector.
So basically findAll calls findInsOuts which goes through every word in the list and sees how many other words start with its last letter but which first removes the search word from the vector before performing some function to prevent duplicates. The problem is that somehow this vector is actually mutable, so the copy of the vector in findAll also has that value permanently stripped.
When I try to create a new vector and then act on that vector the same thing still happens, which implies that they're aliased/sharing the same memory location.
How can I create a fresh vector for use that is actually immutable?
Any help is appreciated
I'm afraid your code is riddled with misunderstandings. For a start, don't use def within defn. Use let instead. This turns your first function into ...
(defn stripSame [word wordList]
(let [setVec (into #{} wordList)
wordSet word
wordVec (into #{} [wordSet])
diffSet (clojure.set/difference setVec wordVec)
diffVec (into [] diffSet)]
diffVec))
For example,
=> (stripSame 2 [1 2 :buckle-my-shoe])
[1 :buckle-my-shoe]
The function can be simplified to ...
(defn stripSame [word wordList]
(vec (disj (set wordList) word)))
... or, using a threading macro, to ...
(defn stripSame [word wordList]
(-> wordList set (disj word) vec))
I don't think the function does what you think it does, because it doesn't always preserve the order of elements in the vector.
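If the order of the words matters, one possible alternative (a sketch, not the code above) is to filter the vector directly instead of going through a set. The name strip-same is hypothetical. Unlike the set-based versions, it keeps duplicates of the other words. It also shows that the input vector is left untouched, since Clojure vectors are immutable.

```clojure
(defn strip-same
  "Returns a new vector with every occurrence of word removed, keeping order."
  [word word-list]
  (filterv #(not= word %) word-list))

(def words ["lol" "at" "good"])

(strip-same "at" words) ;=> ["lol" "good"]
words                   ;=> ["lol" "at" "good"]
```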
If I were you, I'd work my way through some of the community tutorials on this page. There are several good books referred to there too. Once you get to grips with the idioms of the language, you'll find the sort of thing you are trying to do here much clearer and easier.

deref an atom after recursive function completes

I have an atom fs, updated inside a recursive function freq-seq, that holds the results of my computation. Another function, mine-freq-seqs, starts freq-seq, and when mine-freq-seqs is done I would like to receive the last value of said atom. So I thought I would do it like so:
(ns freq-seq-enum)
(def fs (atom #{}))
(defn locally-frequents
[sdb min-sup]
(let [uniq-sdb (map (comp frequencies set) sdb)
freqs (apply merge-with + uniq-sdb)]
(->> freqs
(filter #(<= min-sup (second %)))
(map #(vector (str (first %)) (second %))))))
(defn project-sdb
[sdb prefix]
(if (empty? prefix) sdb
(into [] (->> sdb
(filter #(re-find (re-pattern (str (last prefix))) %))
(map #(subs % (inc (.indexOf % (str (last prefix))))))
(remove empty?)))))
(defn freq-seq
[sdb prefix prefix-support min-sup frequent-seqs]
(if ((complement empty?) prefix) (swap! fs conj [prefix prefix-support]))
(let [lf (locally-frequents sdb min-sup)]
(if (empty? lf) nil
(for [[item sup] lf] (freq-seq (project-sdb sdb (str prefix item)) (str prefix item) sup min-sup @fs)))))
(defn mine-freq-seqs
[sdb min-sup]
(freq-seq sdb "" 0 min-sup @fs))
running it first
(mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2)
then deref-ing the atom
(deref fs)
yields
#{["B" 4]
["BC" 4]
["AB" 4]
["CA" 3]
["CAC" 2]
["AC" 4]
["ABC" 4]
["CAB" 2]
["A" 4]
["CABC" 2]
["ABB" 2]
["CC" 2]
["CB" 3]
["C" 4]
["BB" 2]
["CBC" 2]
["AA" 2]}
however (doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2) (deref fs))
just gives #{}
What I want is to let freq-seq recurse to completion and then get the value of the atom fs, so I can call mine-freq-seqs and have my result returned in the REPL instead of having to deref it manually there.
First, some alternative code without the atom, then a look at why you get the empty return.
A more compact version derives the subsequences of each string with a reduce rather than with recursion, regex, and subs.
Then just run frequencies on those results.
(defn local-seqs
[s]
(->> s
(reduce (fn [acc a] (into acc (map #(conj % a) acc))) #{[]})
(map #(apply str %))
(remove empty?)))
(defn freq-seqs
[sdb min-sup]
(->> (mapcat local-seqs sdb)
frequencies
(filter #(>= (second %) min-sup))
set))
That's the whole thing!
I haven't involved an atom because I didn't see a need, but add it at the end of freq-seqs if you like.
For your original question: why the return that you see?
You are calling doall with 2 args, the result of your call and a collection. doall is a function and not a macro so the deref is performed immediately.
(defn doall
;; <snip>
([n coll] ;; you have passed #{} as coll
(dorun n coll) ;; and this line evals to nil
coll) ;; and #{} is returned
You have passed your result as the n arg and an empty set as the coll (from (deref fs))
Now when doall calls dorun, it encounters the following:
(defn dorun
;; <snip>
([n coll]
(when (and (seq coll) (pos? n)) ;; coll is #{} so the seq is falsey
(recur (dec n) (next coll)))) ;; and a nil is returned
Since the empty set from fs is the second arg (coll), and since and is a short-circuiting macro, the test is falsey at (seq coll). dorun therefore returns nil, and doall then returns the empty set that was its second arg.
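The same behaviour can be reproduced in isolation. The sketch below uses hypothetical names (seen, record-all) and is not the asker's code; it only mirrors the shape of the failing call, with a lazy seq in the n position and a deref'd atom as coll.

```clojure
(def seen (atom #{}))

(defn record-all
  "Returns a LAZY seq; the swap! calls only run when the seq is realized."
  [xs]
  (for [x xs] (swap! seen conj x)))

;; Both arguments are evaluated before doall runs, so @seen is still #{} here.
;; The lazy seq lands in the n position and is never realized.
(def result (doall (record-all [1 2 3]) @seen))

result ;=> #{}
@seen  ;=> #{}
```

Because (seq coll) is checked first and is nil for an empty set, (pos? n) is never called on the lazy seq, which is why no exception is thrown.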
Final note:
So that is something that works and why yours failed. As to how to make yours work, to fix the call above I tried:
(do (doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2))
(deref fs))
That is closer to working, but because of the recursion in your process, it only forces the evaluation one level deep. You could push the doall deeper into your functions, but I have proposed a completely different internal structure, so I will leave the rest to you if you really need that structure.
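To illustrate the "one level deep" point, here is a small sketch with hypothetical names (visited, walk-lazy, walk-eager, visited-after). Each call records its depth as a side effect and recurses inside a lazy for, much like freq-seq does. An outer doall realizes only the outermost seq; the nested seqs it contains stay unrealized unless each level forces its own.

```clojure
(def visited (atom []))

(defn walk-lazy
  "Records depth, then recurses inside a lazy for."
  [depth]
  (swap! visited conj depth)
  (when (pos? depth)
    (for [_ [:child]] (walk-lazy (dec depth)))))

(defn walk-eager
  "Same as walk-lazy, but forces the for at every level."
  [depth]
  (swap! visited conj depth)
  (when (pos? depth)
    (doall (for [_ [:child]] (walk-eager (dec depth))))))

(defn visited-after
  "Resets the log, runs (doall (walk depth)), and returns the depths recorded."
  [walk depth]
  (reset! visited [])
  (doall (walk depth))
  @visited)

(visited-after walk-lazy 2)  ;=> [2 1]
(visited-after walk-eager 2) ;=> [2 1 0]
```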
I changed it a bit to remove all of the lazy bits (this happens silently in the repl but can be confusing when it changes outside of the repl). Note the changes with vec, mapv, and doall. At least now I get your result:
(def fs (atom #{}))
(defn locally-frequents
[sdb min-sup]
(let [uniq-sdb (map (comp frequencies set) sdb)
freqs (apply merge-with + uniq-sdb)]
(->> freqs
(filter #(<= min-sup (second %)))
(mapv #(vector (str (first %)) (second %))))))
(defn project-sdb
[sdb prefix]
(if (empty? prefix)
sdb
(into [] (->> sdb
(filter #(re-find (re-pattern (str (last prefix))) %))
(map #(subs % (inc (.indexOf % (str (last prefix))))))
(remove empty?)))))
(defn freq-seq
[sdb prefix prefix-support min-sup frequent-seqs]
(if ((complement empty?) prefix) (swap! fs conj [prefix prefix-support]))
(let [lf (locally-frequents sdb min-sup)]
(if (empty? lf)
nil
(vec (for [[item sup] lf] (freq-seq (project-sdb sdb (str prefix item)) (str prefix item) sup min-sup @fs))))))
(defn mine-freq-seqs
[sdb min-sup]
(freq-seq sdb "" 0 min-sup @fs))
(doall (mine-freq-seqs ["CAABC" "ABCB" "CABC" "ABBCA"] 2))
(deref fs) => #{["B" 4] ["BC" 4] ["AB" 4] ["CA" 3]
["CAC" 2] ["AC" 4] ["ABC" 4] ["CAB" 2]
["A" 4] ["CABC" 2] ["ABB" 2] ["CC" 2] ["CB" 3]
["C" 4] ["BB" 2] ["CBC" 2] ["AA" 2]}
I'm still not really sure what the goal is or how/why you get entries like "CABC".

How do I map a vector to a map, pushing into it repeated key values?

This is my input data:
[[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]]
I would like to map this into the following:
{:a [[1 2] [3 4] [5 6]] :b [[\a \b] [\c \d] [\e \f]]}
This is what I have so far:
(defn- build-annotation-map [annotation & m]
(let [gff (first annotation)
remaining (rest annotation)
seqname (first gff)
current {seqname [(nth gff 3) (nth gff 4)]}]
(if (not (seq remaining))
m
(let [new-m (merge-maps current m)]
(apply build-annotation-map remaining new-m)))))
(defn- merge-maps [m & ms]
(apply merge-with conj
(when (first ms)
(reduce conj ;this is to avoid [1 2 [3 4 ... etc.
(map (fn [k] {k []}) (keys m))))
m ms))
The above produces:
{:a [[1 2] [[3 4] [5 6]]] :b [[\a \b] [[\c \d] [\e \f]]]}
It seems clear to me that the problem is in merge-maps, specifically with the function passed to merge-with (conj), but after banging my head for a while now, I'm about ready for someone to help me out.
I'm new to lisp in general, and clojure in particular, so I also appreciate comments not specifically addressing the problem, but also style, brain-dead constructs on my part, etc. Thanks!
Solution (close enough, anyway):
(group-by first [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
=> {:a [[:a 1 2] [:a 3 4] [:a 5 6]], :b [[:b \a \b] [:b \c \d] [:b \e \f]]}
(defn build-annotations [coll]
(reduce (fn [m [k & vs]]
(assoc m k (conj (m k []) (vec vs))))
{} coll))
Concerning your code, the most significant problem is naming. Firstly, I wouldn't, especially without first understanding your code, have any idea what is meant by annotation, gff, and seqname. current is pretty ambiguous too. In Clojure, remaining would generally be called more, depending on the context, and whether a more specific name should be used.
Within your let statement, for gff (first annotation) and remaining (rest annotation), I'd probably take advantage of destructuring, like this:
(let [[first & more] annotation] ...)
If you would rather use (rest annotation) then I'd suggest using next instead, as it will return nil if it's empty, and allow you to write (if-not remaining ...) rather than (if-not (seq remaining) ...).
user> (next [])
nil
user> (rest [])
()
In Clojure, unlike other lisps, the empty list is truthy.
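A quick way to check this at the REPL (a minimal sketch; the keywords :truthy and :falsey are just labels):

```clojure
(if '() :truthy :falsey)        ;=> :truthy
(if (seq '()) :truthy :falsey)  ;=> :falsey
(if (rest [1]) :truthy :falsey) ;=> :truthy, because (rest [1]) is ()
(if (next [1]) :truthy :falsey) ;=> :falsey, because (next [1]) is nil
```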
This article shows the standard for idiomatic naming.
Works at least on the given data set.
(defn build-annotations [coll]
(reduce
(fn [result vec]
(let [key (first vec)
val (subvec vec 1)
old-val (get result key [])
conjoined-val (conj old-val val)]
(assoc
result
key
conjoined-val)))
{}
coll))
(build-annotations [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
I am sorry for not offering improvements on your code. I am just learning Clojure and it is easier to solve problems piece by piece instead of understanding a bigger piece of code and finding the problems in it.
Although I have no comments to your code yet, I tried it for my own and came up with this solution:
(defn build-annotations [coll]
(let [anmap (group-by first coll)]
(zipmap (keys anmap) (map #(vec (map (comp vec rest) %)) (vals anmap)))))
Here's my entry leveraging group-by, although several steps in here are really concerned with returning vectors rather than lists. If you drop that requirement, it gets a bit simpler:
(defn f [s]
(let [g (group-by first s)
k (keys g)
v (vals g)
cleaned-v (for [group v]
(into [] (map (comp #(into [] %) rest) group)))]
(zipmap k cleaned-v)))
Depending what you actually want, you might even be able to get by with just doing group-by.
(defn build-annotations [coll]
(apply merge-with concat
(map (fn [[k & vals]] {k [vals]})
coll)))
So,
(map (fn [[k & vals]] {k [vals]})
coll)
takes a collection of [keys & values] and returns a list of {key [values]}
(apply merge-with concat ...list of maps...)
takes a list of maps, merges them together, and concats the values if a key already exists.
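One thing to be aware of: the concat version produces lazy sequences rather than vectors (they still compare equal to the desired vectors with =). If actual vectors are wanted, a possible variant is sketched below. The name build-annotations-vec is hypothetical, and update assumes Clojure 1.7 or later.

```clojure
(defn build-annotations-vec
  "Groups [k & vs] rows by k, collecting each (vec vs) into a vector."
  [coll]
  (reduce (fn [m [k & vs]]
            ;; (fnil conj []) starts a fresh vector the first time a key is seen
            (update m k (fnil conj []) (vec vs)))
          {}
          coll))

(build-annotations-vec [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
;=> {:a [[1 2] [3 4] [5 6]], :b [[\a \b] [\c \d] [\e \f]]}
```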

Merge list of maps and combine values to sets in Clojure

What function can I put as FOO here to yield true at the end? I played with hash-set (only correct for first 2 values), conj, and concat but I know I'm not handling the single-element vs set condition properly with just any of those.
(defn mergeMatches [propertyMapList]
"Take a list of maps and merges them combining values into a set"
(reduce #(merge-with FOO %1 %2) {} propertyMapList))
(def in
(list
{:a 1}
{:a 2}
{:a 3}
{:b 4}
{:b 5}
{:b 6} ))
(def out
{ :a #{ 1 2 3}
:b #{ 4 5 6} })
; this should return true
(= (mergeMatches in) out)
What is the most idiomatic way to handle this?
This'll do:
(let [set #(if (set? %) % #{%})]
#(clojure.set/union (set %) (set %2)))
Rewritten more directly for the example (Alex):
(defn to-set [s]
(if (set? s) s #{s}))
(defn set-union [s1 s2]
(clojure.set/union (to-set s1) (to-set s2)))
(defn mergeMatches [propertyMapList]
(reduce #(merge-with set-union %1 %2) {} propertyMapList))
I didn't write this but it was contributed by @amitrathore on Twitter:
(defn kv [bag [k v]]
(update-in bag [k] conj v))
(defn mergeMatches [propertyMapList]
(reduce #(reduce kv %1 %2) {} propertyMapList))
I wouldn't use merge-with for this,
(defn fnil [f not-found]
(fn [x y] (f (if (nil? x) not-found x) y)))
(defn conj-in [m map-entry]
(update-in m [(key map-entry)] (fnil conj #{}) (val map-entry)))
(defn merge-matches [property-map-list]
(reduce conj-in {} (apply concat property-map-list)))
user=> (merge-matches in)
{:b #{4 5 6}, :a #{1 2 3}}
fnil has since been added to clojure.core (in Clojure 1.2), so you can ignore the implementation here. It just creates a version of another function that can handle nil arguments. In this case, conj receives #{} in place of nil.
So the reduction conjoins each key/value pair from the supplied list of maps into the set stored under that key.
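The effect of wrapping conj this way can be checked directly. This sketch assumes a Clojure version where fnil is available in clojure.core (1.2 or later); conj-set is a hypothetical name.

```clojure
(def conj-set (fnil conj #{}))

(conj nil 1)                   ;=> (1), plain conj on nil builds a list
(conj-set nil 1)               ;=> #{1}
(conj-set #{1} 2)              ;=> #{1 2}
(update-in {} [:a] conj-set 1) ;=> {:a #{1}}
```

This is why the fnil-based version yields sets even for keys that are not yet in the map.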
Another solution contributed by @wmacgyver on Twitter based on multimaps:
(defn add
"Adds key-value pairs to the multimap."
([mm k v]
(assoc mm k (conj (get mm k #{}) v)))
([mm k v & kvs]
(apply add (add mm k v) kvs)))
(defn mm-merge
"Merges the multimaps, taking the union of values."
[& mms]
(apply (partial merge-with clojure.set/union) mms)) ; assumes clojure.set has been required
(defn mergeMatches [property-map-list]
(reduce mm-merge (map #(add {} (key (first %)) (val (first %))) property-map-list)))
This seems to work:
(defn FOO [v1 v2]
(if (set? v1)
(apply hash-set v2 v1)
(hash-set v1 v2)))
Not super pretty but it works.
(defn mergeMatches [propertyMapList]
(for [k (set (for [pp propertyMapList] (key (first pp))))]
{k (set (remove nil? (for [pp propertyMapList] (k pp))))}))

Clojure: How do I apply a function to a subset of the entries in a hash-map?

I am new to Clojure and am attempting to figure out how to do this.
I want to create a new hash-map that for a subset of the keys in the hash-map applies a function to the elements. What is the best way to do this?
(let
[my-map {:hello "World" :try "This" :foo "bar"}]
(println (doToMap my-map [:hello :foo] (fn [k] (.toUpperCase k)))))
This should then result in a map with something like
{:hello "WORLD" :try "This" :foo "BAR"}
(defn do-to-map [amap keyseq f]
(reduce #(assoc %1 %2 (f (%1 %2))) amap keyseq))
Breakdown:
It helps to look at it inside-out. In Clojure, hash-maps act like functions; if you call them like a function with a key as an argument, the value associated with that key is returned. So given a single key, the current value for that key can be obtained via:
(some-map some-key)
We want to take old values, and change them to new values by calling some function f on them. So given a single key, the new value will be:
(f (some-map some-key))
We want to associate this new value with this key in our hash-map, "replacing" the old value. This is what assoc does:
(assoc some-map some-key (f (some-map some-key)))
("Replace" is in scare-quotes because we're not mutating a single hash-map object; we're returning new, immutable, altered hash-map objects each time we call assoc. This is still fast and efficient in Clojure because hash-maps are persistent and share structure when you assoc them.)
We need to repeatedly assoc new values onto our map, one key at a time. So we need some kind of looping construct. What we want is to start with our original hash-map and a single key, and then "update" the value for that key. Then we take that new hash-map and the next key, and "update" the value for that next key. And we repeat this for every key, one at a time, and finally return the hash-map we've "accumulated". This is what reduce does.
The first argument to reduce is a function that takes two arguments: an "accumulator" value, which is the value we keep "updating" over and over; and a single argument used in one iteration to do some of the accumulating.
The second argument to reduce is the initial value passed as the first argument to this fn.
The third argument to reduce is a collection of arguments to be passed as the second argument to this fn, one at a time.
So:
(reduce fn-to-update-values-in-our-map
initial-value-of-our-map
collection-of-keys)
fn-to-update-values-in-our-map is just the assoc statement from above, wrapped in an anonymous function:
(fn [map-so-far some-key] (assoc map-so-far some-key (f (map-so-far some-key))))
So plugging it into reduce:
(reduce (fn [map-so-far some-key] (assoc map-so-far some-key (f (map-so-far some-key))))
amap
keyseq)
In Clojure, there's a shorthand for writing anonymous functions: #(...) is an anonymous fn consisting of a single form, in which %1 is bound to the first argument to the anonymous function, %2 to the second, etc. So our fn from above can be written equivalently as:
#(assoc %1 %2 (f (%1 %2)))
This gives us:
(reduce #(assoc %1 %2 (f (%1 %2))) amap keyseq)
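To watch the accumulator change one key at a time, reductions can stand in for reduce: it takes the same arguments but returns every intermediate value. The sketch below uses the map from the question; my-map and steps are hypothetical names, and clojure.string/upper-case plays the role of f.

```clojure
(require '[clojure.string :as str])

(def my-map {:hello "World" :try "This" :foo "bar"})

;; Same reducing function as above, with f fixed to str/upper-case.
(def steps
  (reductions #(assoc %1 %2 (str/upper-case (%1 %2)))
              my-map
              [:hello :foo]))

(first steps)  ;=> {:hello "World", :try "This", :foo "bar"}
(second steps) ;=> {:hello "WORLD", :try "This", :foo "bar"}
(last steps)   ;=> {:hello "WORLD", :try "This", :foo "BAR"}
```

The last element is what reduce itself would return, and my-map is unchanged throughout.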
(defn doto-map [m ks f & args]
(reduce #(apply update-in %1 [%2] f args) m ks))
Example call
user=> (doto-map {:a 1 :b 2 :c 3} [:a :c] + 2)
{:a 3, :b 2, :c 5}
Hope this helps.
The following seems to work:
(defn doto-map [ks f amap]
(into amap
(map (fn [[k v]] [k (f v)])
(filter (fn [[k v]] (ks k)) amap))))
user=> (doto-map #{:hello :foo} (fn [k] (.toUpperCase k)) {:hello "World" :try "This" :foo "bar"})
{:hello "WORLD", :try "This", :foo "BAR"}
There might be a better way to do this. Perhaps someone can come up with a nice one-liner :)
