Clojure - Remove items from vector inside loop - recursion

I just started learning Clojure and functional programming and I'm having a difficult time trying to implement the following task:
I have a vector of vectors like this: [[a b] [a c] [b c] [c d] [d b]]. I need to iterate through it and remove every item whose second-column value has already appeared in the second column of an earlier item. For example, [b c] and [d b] should be removed (because both c and b already appeared in the second column). I managed to write a function that removes one item at a time, but I need to iterate through the vector, checking and removing items as I go. How can I achieve that? I thought about using recursion, but every attempt ended in failure. Sorry if this is a trivial question, but I am stuck.
For example
Input:
[[a b] [a c] [b c] [c d] [a d] [b e]]
Output (Expected):
[[a b] [a c] [c d] [b e]]
Removed items:
[[b c] [a d]]
As you can see, both c and d had already appeared on the previous items [a c] and [c d] respectively, so I have to remove the items [b c] and [a d].
So far, I have the following code
This function returns a vector of items to be removed. In our scenario, it returns the vector [[b c] [a d]]
(defn find-invalid [collection-data item-to-check]
(subvec (vec (filter #(= (second %) item-to-check) collection-data)) 1))
This other function removes one item at a time from the original vector by a given index of the item
(defn remove-invalid [collection-data item-position]
(vec (concat (subvec collection-data 0 item-position) (subvec collection-data (inc item-position)))))
This last function is what I did to test this logic
(defn remove-invalid [original-collection ]
(dorun (for [item original-collection]
[
(dorun (for [invalid-item (find-invalid original-collection (second item))]
[
(cond (> (count invalid-item) 0)
(println (remove-invalid original-collection (.indexOf original-collection invalid-item)))
)
]))
])))
I think recursion could solve my problem, but I would appreciate any help to get that done :).
Thanks in advance.

One way to implement this would be to use reduce:
(first (reduce (fn [[result previous] [a b]]
[(if (contains? previous b)
result
(conj result [a b]))
(conj previous b)])
[[] #{}]
'[[a b] [a c] [b c] [c d] [d b]]))
;=> [[a b] [a c] [c d]]
We want to keep track of the result we've built up so far (result) and the set of items we've previously found in the second column (previous). For each new item [a b], then, we check whether previous contains the second item, b. If it does, we don't add anything to our result. Otherwise, we conj the new item [a b] onto the end of result. We also conj the second item, b, into previous. Since previous is a set, this won't do anything if previous already contained b. Finally, after the reduce completes, we take the first item from the result, which represents our final answer.
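Since the question also lists the removed items, here is a minimal sketch of the same reduce that tracks them as well. The name split-by-second and the :kept / :removed keys are my own choices for illustration, not something from the answer above.

```clojure
;; Hypothetical helper: same idea as the reduce above, but the accumulator
;; also carries the items that were dropped.
(defn split-by-second [pairs]
  (let [[kept removed _seen]
        (reduce (fn [[kept removed seen] [_ b :as pair]]
                  (if (contains? seen b)
                    ;; b was already seen in the second column: drop the pair
                    [kept (conj removed pair) seen]
                    ;; first time we see b: keep the pair and remember b
                    [(conj kept pair) removed (conj seen b)]))
                [[] [] #{}]
                pairs)]
    {:kept kept :removed removed}))

(split-by-second '[[a b] [a c] [b c] [c d] [a d] [b e]])
;=> {:kept [[a b] [a c] [c d] [b e]], :removed [[b c] [a d]]}
```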

If I understand your question correctly, this should do it:
(defn clear [v]
(loop [v v existing #{} acc []]
(if (empty? v)
acc
(recur (rest v)
(conj existing (second (first v)))
(if (contains? existing (second (first v))) acc (conj acc (first v)))))))
Solved with loop / recur. If I find some time, I will see if I can use something like reduce or whatever function is appropriate here.
This filters: [["a" "b"] ["a" "c"] ["b" "c"] ["c" "d"] ["d" "b"]] to [["a" "b"] ["a" "c"] ["c" "d"]].

If you can rely on the duplicates being successive as in the example, go with
(->> '[[a b] [a c] [b c] [c d] [a d] [b e]]
(partition-by second)
(map first))
;-> ([a b] [a c] [c d] [b e])
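To see why the "successive" condition matters, here is a small sketch that runs the same pipeline on the first example from the question, where the repeated b is not adjacent to the earlier one. The var names pairs and by-runs are only for illustration.

```clojure
;; [d b] repeats b, but it is not next to [a b], so partition-by
;; puts it in its own run and it survives.
(def pairs '[[a b] [a c] [b c] [c d] [d b]])

(def by-runs
  (->> pairs
       (partition-by second)
       (map first)))

by-runs
;-> ([a b] [a c] [c d] [d b])
```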
Otherwise implement a distinct-by transducer based on Clojure's distinct transducer.
(sequence (distinct-by second)
'[[a b] [a c] [b c] [c d] [a d] [b e]])
;-> ([a b] [a c] [c d] [b e])
Implementation
(defn distinct-by [f]
(fn [rf]
(let [seen (volatile! #{})]
(fn
([] (rf))
([result] (rf result))
([result input]
(let [vinput (f input)] ; virtual input as seen through f
(if (contains? @seen vinput)
result
(do (vswap! seen conj vinput)
(rf result input)))))))))

The following is similar to @Elogent's answer, but uses :as clauses to avoid reconstructing things:
(defn filtered [stuff]
(second
(reduce
(fn [[seconds ans :as sec-ans] [x y :as xy]]
(if (seconds y)
sec-ans
[(conj seconds y) (conj ans xy)]))
[#{} []]
stuff)))
For example,
(filtered '[[a b] [a c] [b c] [c d] [d b]])
;[[a b] [a c] [c d]]

just for fun:
these ones do not preserve the result's order, but if it is ok with you, they're quite expressive (the duplicates can be in any order, unlike the partition-by variant above):
one is to just group everything by second value, and take first item from every val:
(map (comp first val)
(group-by second '[[a b] [a c] [b c] [c d] [a d] [b e]]))
;; => ([a b] [a c] [c d] [b e])
there is also a nice way to do it, using sorted sets:
(into (sorted-set-by #(compare (second %1) (second %2)))
'[[a b] [a c] [b c] [c d] [a d] [b e]])
;; => #{[a b] [a c] [c d] [b e]}
and one more, also not preserving the order:
(vals (into {} (map (juxt second identity)
(rseq '[[a b] [a c] [b c] [c d] [a d] [b e]]))))
;; => ([b e] [c d] [a c] [a b])
but yes, loop/recur is always faster i guess:
(defn remove-dupes [v]
(loop [[[_ i2 :as pair] & xs :as v] v present #{} res []]
(cond (empty? v) res
(present i2) (recur xs present res)
:else (recur xs (conj present i2) (conj res pair)))))

Related

Clojure - Vector of vectors relationship between items

I am learning Clojure and Functional Programming and I am facing another problem that I am stuck and I have no idea how to deal with it. Here is the problem:
I have a vector of vectors:
[[a b][b c][c d][d e][e f][f g][f h][b i][d j][j l][l m][a n][a o][o p]]
And I need to establish a relationship between some of the items. The relationship rules are:
1 - Every item that has the same value as the first column has a direct relationship.
2 - If there is any item with the first column equals the second column from the rule 1, there is also a relationship, but an indirect one.
In our scenario the relationship would be:
Relationship for a (rule 1):
[[a b][a n][a o]]
Relationship for a (rule 2):
[[b c][o p]]
After that I also need to calculate, but I can't figure out how to do this the Functional Programming style with clojure. I have been working with O.O. Programming since 2008 and this is the first time I am learning functional programming.
Any ideas?
Thanks in advance.
ok. the first one is easy:
(def data '[[a b][b c][c d][d e][e f][f g][f h]
[b i][d j][j l][l m][a n][a o][o p]])
(defn rel1 [x data] (filter #(= (first %) x) data))
(rel1 'a data)
;; => ([a b] [a n] [a o])
you just keep all the pairs, whose first item is the one you need
the second one is slightly more complicated. You have to find first level relations for all the first level relations.
e.g: when the first level relations for a are [[a b][a n][a o]], you have to find first level relations for b, n, and o, and concatenate them:
(defn rel2 [x data]
(mapcat (fn [[_ k]] (rel1 k data))
(rel1 x data)))
(rel2 'a data)
;; => ([b c] [b i] [o p])
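If the data gets large, scanning it with filter for every lookup may become noticeable. A possible variation, sketched here under the assumption that the data is fixed up front, is to index it once with group-by. The names by-first, rel1-indexed and rel2-indexed are made up for this example.

```clojure
(def data '[[a b] [b c] [c d] [d e] [e f] [f g] [f h]
            [b i] [d j] [j l] [l m] [a n] [a o] [o p]])

;; Build the index once: first item -> vector of pairs starting with it.
(def by-first (group-by first data))

(defn rel1-indexed [x]
  (get by-first x []))

(defn rel2-indexed [x]
  ;; For every direct relation [x k], look up the direct relations of k.
  (mapcat (fn [[_ k]] (rel1-indexed k))
          (rel1-indexed x)))

(rel1-indexed 'a)
;; => [[a b] [a n] [a o]]
(rel2-indexed 'a)
;; => ([b c] [b i] [o p])
```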
as a bonus:
you can make up a function to compute any nth relation of this kind:
if you already have rel1:
(defn rel1 [x data] (filter #(= (first %) x) data))
(defn rel-n [x data n]
(when (pos? n)
(nth (iterate #(mapcat (fn [[_ k]] (rel1 k data)) %)
[[nil x]])
n)))
in repl:
user> (rel-n 'a data 0)
nil
user> (rel-n 'a data 1)
([a b] [a n] [a o])
user> (rel-n 'a data 2)
([b c] [b i] [o p])
user> (rel-n 'a data 3)
([c d])
user> (rel-n 'a data 4)
([d e] [d j])
user> (rel-n 'a data 5)
([e f] [j l])
user> (rel-n 'a data 6)
([f g] [f h] [l m])
user> (rel-n 'a data 7)
()
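Building on the same iterate idea, a rough sketch for collecting every level at once could look like this. It assumes the data has no cycles (with a cycle the sequence would never become empty), and rel-levels is a name I made up.

```clojure
(def data '[[a b] [b c] [c d] [d e] [e f] [f g] [f h]
            [b i] [d j] [j l] [l m] [a n] [a o] [o p]])

(defn rel1 [x data] (filter #(= (first %) x) data))

(defn rel-levels
  "Lazy seq of relation levels for x, stopping at the first empty level.
  Assumes the relation graph is acyclic."
  [x data]
  (->> [[nil x]]
       (iterate #(mapcat (fn [[_ k]] (rel1 k data)) %))
       rest               ; drop the artificial seed [[nil x]]
       (take-while seq))) ; stop once a level has no pairs

(rel-levels 'a data)
;; => (([a b] [a n] [a o]) ([b c] [b i] [o p]) ([c d])
;;     ([d e] [d j]) ([e f] [j l]) ([f g] [f h] [l m]))
```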

Clojure - Merge two vectors of vectors different sizes

Here I am again facing some problems with Clojure. I have two vectors of vectors.
[[a b c] [d e f] [g h i]]
and
[[a b] [d e] [g h] [j k]]
And I wanna merge these two in a way that the final vector would be something like this:
[[a b c] [d e f] [g h i] [j k l]]
In the output's last item, [j k l], the l is a constant value used when there is no value to merge (because the item has no corresponding item in the first vector).
How can I do such thing?
P.S.: I am new to Clojure and I appreciate a elaborated answer so that I could understand better. Also, sorry if this is a trivial question.
In general:
break the problem into separable parts
give things names
compose the parts
So in this case your problem can be broken down into:
splitting the lists into the overlapping and non-overlapping parts
choosing the best of each of the overlapping parts
padding the non-overlapping parts to the correct length
combining them back together.
So if I make a couple assumptions about your problem here is an example of breaking it down and building it back up:
user> (def a '[[a b c] [d e f] [g h i]])
#'user/a
user> (def b '[[a b] [d e] [g h] [j k]])
#'user/b
make a function to choose the correct pair of the overlapping parts. I chose length though you can merge these however you want:
user> (defn longer-list [x y]
(if (> (count x) (count y))
x
y))
#'user/longer-list
make a function to pad out a list that's too short
user> (defn pad-list [l min-len default-value]
(into l (take (- min-len (count l)) (repeat default-value))))
#'user/pad-list
Make a function that uses these two functions to split and then recombine the parts of the problem:
user> (defn process-list [a b]
(let [a-len (count a)
b-len (count b)
longer-input (if (> a-len b-len)
a
b)
shorter-input (if (< a-len b-len)
a
b)]
(concat (map longer-list longer-input shorter-input)
(map #(pad-list % 3 'l) (drop (count shorter-input) longer-input)))))
#'user/process-list
and then test it :-)
user> (process-list a b)
([a b c] [d e f] [g h i] [j k l])
There are more details to work out, like what happens when the lists-of-lists are the same length, and if they are not subsets of each other. (and yes you can smash this down to a "one liner" too)
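For the equal-length case mentioned above, one possible sketch is to walk both inputs by index instead of deciding which one is longer. This assumes both arguments are vectors of vectors (get needs indexed collections), and merge-by-index with its width parameter is a hypothetical helper, not part of the answer's code.

```clojure
(defn merge-by-index
  "For each index, take the longer of the two rows and pad it to width."
  [a b width default]
  (mapv (fn [i]
          (let [x (get a i [])   ; a missing row counts as empty
                y (get b i [])
                longer (if (>= (count x) (count y)) x y)]
            ;; repeat with a count of zero or less adds nothing
            (into longer (repeat (- width (count longer)) default))))
        (range (max (count a) (count b)))))

(merge-by-index '[[a b c] [d e f] [g h i]]
                '[[a b] [d e] [g h] [j k]]
                3 'l)
;; => [[a b c] [d e f] [g h i] [j k l]]
```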
I'd take a look at clojure.core.matrix (see here); It has some nice operations which could help you with this.
i would generally go with the following approach:
fill collections up to the size of the longest one
map both of them, filling every item of the collection up to the size of the longest, mapping items to select the resulting value.
It is better to illustrate it with code:
first of all let's make up some helper functions:
(defn max-count [coll1 coll2] (max (count coll1) (count coll2)))
its name speaks for itself.
(defn fill-up-to [coll size] (take size (concat coll (repeat nil))))
this one fills the collection with nils up to some size:
user> (fill-up-to [1 2 3] 10)
(1 2 3 nil nil nil nil nil nil nil)
now the merge function:
(defn merge-colls [v1 v2 default-val]
(let [coll-len (max-count v1 v2)
comp-len (max-count (first v1) (first v2))]
(mapv (fn [comp1 comp2]
(mapv #(or %1 %2 default-val)
(fill-up-to comp1 comp-len)
(fill-up-to comp2 comp-len)))
(fill-up-to v1 coll-len)
(fill-up-to v2 coll-len))))
the outer mapv operates on collections made from initial parameters filled up to the length of the longest one (coll-len), so in context of the question it will be:
(mapv some-fn '[[a b c] [d e f] [g h i] nil]
'[[a b] [d e] [g h] [j k]])
the inner mapv operates on inner vectors, filled up to the comp-len (3 in this case):
(mapv #(or %1 %2 default-val) '[a b c] '[a b nil])
...
(mapv #(or %1 %2 default-val) '[nil nil nil] '[j k nil])
let's test it:
user> (let [v1 '[[a b c] [d e f] [g h i]]
v2 '[[a b] [d e] [g h] [j k]]]
(merge-colls v1 v2 'l))
[[a b c] [d e f] [g h i] [j k l]]
ok it works just as we wanted.
now if you look at the merge-colls, you may notice the repetition of the pattern:
(mapv some-fn (fill-up-to coll1 size)
(fill-up-to coll2 size))
we can eliminate the duplication by moving this pattern out to a function:
(defn mapv-equalizing [map-fn size coll1 coll2]
(mapv map-fn (fill-up-to coll1 size) (fill-up-to coll2 size)))
and rewrite our merge:
(defn merge-colls [v1 v2 default-val]
(let [coll-len (max-count v1 v2)
comp-len (max-count (first v1) (first v2))]
(mapv-equalizing (fn [comp1 comp2]
(mapv-equalizing #(or %1 %2 default-val)
comp-len comp1 comp2))
coll-len v1 v2)))
test:
user> (let [v1 '[[a b c] [d e f] [g h i]]
v2 '[[a b] [d e] [g h] [j k]]]
(merge-colls v1 v2 'l))
[[a b c] [d e f] [g h i] [j k l]]
ok. now we can shorten it by removing collection size bindings, as we need these values just once:
(defn merge-colls [v1 v2 default-val]
(mapv-equalizing
(partial mapv-equalizing
#(or %1 %2 default-val)
(max-count (first v1) (first v2)))
(max-count v1 v2) v1 v2))
in repl:
user> (let [v1 '[[a b c] [d e f] [g h i]]
v2 '[[a b] [d e] [g h] [j k]]]
(merge-colls v1 v2 'l))
[[a b c] [d e f] [g h i] [j k l]]

transient map not updated as expected [duplicate]

I'm a bit lost with usage of transients in clojure. Any help will be appreciated.
The sample code:
(defn test-transient [v]
(let [b (transient [])]
(for [x v] (conj! b x))
(persistent! b)))
user> (test-transient [1 2 3])
[]
I tried to make it persistent before return and the result is:
(defn test-transient2 [v]
(let [b (transient [])]
(for [x v] (conj! b x))
(persistent! b)
b))
user> (test-transient2 [1 2 3])
#<TransientVector clojure.lang.PersistentVector$TransientVector@1dfde20>
But if I use conj! separately it seems work ok:
(defn test-transient3 [v]
(let [b (transient [])]
(conj! b 0)
(conj! b 1)
(conj! b 2)
(persistent! b)))
user> (test-transient3 [1 2 3])
[0 1 2]
Does for have some constraint? If so, how can I copy values from a persistent vector to a transient one?
Thank you.
Transients aren't supposed to be bashed in-place like that. Your last example only works due to implementation details which you shouldn't rely on.
The reason why for doesn't work is that it is lazy and the conj! calls are never executed, but that is beside the point, as you shouldn't work with transients that way anyway.
You should use conj! the same way as you would use the "regular" conj with immutable vectors - by using the return value.
What you are trying to do could be accomplished like this:
(defn test-transient [v]
(let [t (transient [])]
(persistent! (reduce conj! t v))))
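If you prefer an explicit loop, the same rule applies: thread the return value of conj! through each step. Here is a minimal sketch (copy-with-loop is just an illustrative name). For a plain copy, into does the job too, and it uses transients internally when the target collection supports them.

```clojure
(defn copy-with-loop [v]
  (loop [t (transient [])
         xs (seq v)]
    (if xs
      ;; always continue with the value returned by conj!,
      ;; never with the old binding
      (recur (conj! t (first xs)) (next xs))
      (persistent! t))))

(copy-with-loop [1 2 3])
;=> [1 2 3]

(into [] [1 2 3])
;=> [1 2 3]
```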

How do I map a vector to a map, pushing into it repeated key values?

This is my input data:
[[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]]
I would like to map this into the following:
{:a [[1 2] [3 4] [5 6]] :b [[\a \b] [\c \d] [\e \f]]}
This is what I have so far:
(defn- build-annotation-map [annotation & m]
(let [gff (first annotation)
remaining (rest annotation)
seqname (first gff)
current {seqname [(nth gff 3) (nth gff 4)]}]
(if (not (seq remaining))
m
(let [new-m (merge-maps current m)]
(apply build-annotation-map remaining new-m)))))
(defn- merge-maps [m & ms]
(apply merge-with conj
(when (first ms)
(reduce conj ;this is to avoid [1 2 [3 4 ... etc.
(map (fn [k] {k []}) (keys m))))
m ms))
The above produces:
{:a [[1 2] [[3 4] [5 6]]] :b [[\a \b] [[\c \d] [\e \f]]]}
It seems clear to me that the problem is in merge-maps, specifically with the function passed to merge-with (conj), but after banging my head for a while now, I'm about ready for someone to help me out.
I'm new to lisp in general, and clojure in particular, so I also appreciate comments not specifically addressing the problem, but also style, brain-dead constructs on my part, etc. Thanks!
Solution (close enough, anyway):
(group-by first [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
=> {:a [[:a 1 2] [:a 3 4] [:a 5 6]], :b [[:b \a \b] [:b \c \d] [:b \e \f]]}
(defn build-annotations [coll]
(reduce (fn [m [k & vs]]
(assoc m k (conj (m k []) (vec vs))))
{} coll))
Concerning your code, the most significant problem is naming. Firstly, I wouldn't, especially without first understanding your code, have any idea what is meant by annotation, gff, and seqname. current is pretty ambiguous too. In Clojure, remaining would generally be called more, depending on the context, and whether a more specific name should be used.
Within your let statement, gff (first annotation)
remaining (rest annotation), I'd probably take advantage of destructuring, like this:
(let [[first & more] annotation] ...)
If you would rather use (rest annotation) then I'd suggest using next instead, as it will return nil if it's empty, and allow you to write (if-not remaining ...) rather than (if-not (seq remaining) ...).
user> (next [])
nil
user> (rest [])
()
In Clojure, unlike other lisps, the empty list is truthy.
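A quick sketch of what that means for conditionals (describe is a throwaway name for this example):

```clojure
(defn describe [xs]
  {:rest-truthy?     (boolean (rest xs))        ; () is truthy
   :next-truthy?     (boolean (next xs))        ; nil is falsey
   :seq-rest-truthy? (boolean (seq (rest xs)))})

(describe [])
;=> {:rest-truthy? true, :next-truthy? false, :seq-rest-truthy? false}
(describe [1 2])
;=> {:rest-truthy? true, :next-truthy? true, :seq-rest-truthy? true}
```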
This article shows the standard for idiomatic naming.
Works at least on the given data set.
(defn build-annotations [coll]
(reduce
(fn [result vec]
(let [key (first vec)
val (subvec vec 1)
old-val (get result key [])
conjoined-val (conj old-val val)]
(assoc
result
key
conjoined-val)))
{}
coll))
(build-annotations [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
I am sorry for not offering improvements on your code. I am just learning Clojure and it is easier to solve problems piece by piece instead of understanding a bigger piece of code and finding the problems in it.
Although I have no comments to your code yet, I tried it for my own and came up with this solution:
(defn build-annotations [coll]
(let [anmap (group-by first coll)]
(zipmap (keys anmap) (map #(vec (map (comp vec rest) %)) (vals anmap)))))
Here's my entry leveraging group-by, although several steps in here are really concerned with returning vectors rather than lists. If you drop that requirement, it gets a bit simpler:
(defn f [s]
(let [g (group-by first s)
k (keys g)
v (vals g)
cleaned-v (for [group v]
(into [] (map (comp #(into [] %) rest) group)))]
(zipmap k cleaned-v)))
Depending what you actually want, you might even be able to get by with just doing group-by.
(defn build-annotations [coll]
(apply merge-with concat
(map (fn [[k & vals]] {k [vals]})
coll)))
So,
(map (fn [[k & vals]] {k [vals]})
coll)
takes a collection of [keys & values] and returns a list of {key [values]}
(apply merge-with concat ...list of maps...)
takes a list of maps, merges them together, and concats the values if a key already exists.
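One thing to be aware of: with concat the merged values are lazy sequences of seqs rather than vectors (they still compare equal to the vectors in the question, but vector? is false). If you need real vectors, a possible variation is to merge with into instead. build-annotations-vec is a name I picked for this sketch.

```clojure
(defn build-annotations-vec [coll]
  (apply merge-with into
         ;; each entry becomes {k [[v1 v2 ...]]}, so into appends whole rows
         (map (fn [[k & vs]] {k [(vec vs)]})
              coll)))

(build-annotations-vec [[:a 1 2] [:a 3 4] [:a 5 6] [:b \a \b] [:b \c \d] [:b \e \f]])
;=> {:a [[1 2] [3 4] [5 6]], :b [[\a \b] [\c \d] [\e \f]]}
```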

Merge list of maps and combine values to sets in Clojure

What function can I put as FOO here to yield true at the end? I played with hash-set (only correct for first 2 values), conj, and concat but I know I'm not handling the single-element vs set condition properly with just any of those.
(defn mergeMatches [propertyMapList]
"Take a list of maps and merges them combining values into a set"
(reduce #(merge-with FOO %1 %2) {} propertyMapList))
(def in
(list
{:a 1}
{:a 2}
{:a 3}
{:b 4}
{:b 5}
{:b 6} ))
(def out
{ :a #{ 1 2 3}
:b #{ 4 5 6} })
; this should return true
(= (mergeMatches in) out)
What is the most idiomatic way to handle this?
This'll do:
(let [set #(if (set? %) % #{%})]
#(clojure.set/union (set %) (set %2)))
Rewritten more directly for the example (Alex):
(defn to-set [s]
(if (set? s) s #{s}))
(defn set-union [s1 s2]
(clojure.set/union (to-set s1) (to-set s2)))
(defn mergeMatches [propertyMapList]
(reduce #(merge-with set-union %1 %2) {} propertyMapList))
I didn't write this, but it was contributed by @amitrathore on Twitter:
(defn kv [bag [k v]]
(update-in bag [k] (fnil conj #{}) v)) ; start each key from an empty set, not nil
(defn mergeMatches [propertyMapList]
(reduce #(reduce kv %1 %2) {} propertyMapList))
I wouldn't use merge-with for this,
(defn fnil [f not-found]
(fn [x y] (f (if (nil? x) not-found x) y)))
(defn conj-in [m map-entry]
(update-in m [(key map-entry)] (fnil conj #{}) (val map-entry)))
(defn merge-matches [property-map-list]
(reduce conj-in {} (apply concat property-map-list)))
user=> (merge-matches in)
{:b #{4 5 6}, :a #{1 2 3}}
fnil will be part of core soon so you can ignore the implementation... but it just creates a version of another function that can handle nil arguments. In this case conj will substitute #{} for nil.
So the reduction conjoins each key/value pair from the supplied list of maps into a set per key.
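With a current Clojure, where fnil and update are both in clojure.core, the same idea can be sketched without the helper definitions. merge-matches-core is a hypothetical name, used here to avoid clashing with the version above.

```clojure
(defn merge-matches-core [property-map-list]
  (reduce (fn [acc [k v]]
            ;; fnil supplies #{} the first time a key is seen
            (update acc k (fnil conj #{}) v))
          {}
          ;; flatten the list of maps into one seq of map entries
          (apply concat property-map-list)))

(merge-matches-core (list {:a 1} {:a 2} {:a 3} {:b 4} {:b 5} {:b 6}))
;=> {:a #{1 2 3}, :b #{4 5 6}}
```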
Another solution contributed by @wmacgyver on Twitter based on multimaps:
(defn add
"Adds key-value pairs to the multimap."
([mm k v]
(assoc mm k (conj (get mm k #{}) v)))
([mm k v & kvs]
(apply add (add mm k v) kvs)))
(defn mm-merge
"Merges the multimaps, taking the union of values."
[& mms]
(apply merge-with clojure.set/union mms))
(defn mergeMatches [property-map-list]
(reduce mm-merge (map #(add {} (key (first %)) (val (first %))) property-map-list)))
This seems to work:
(defn FOO [v1 v2]
(if (set? v1)
(apply hash-set v2 v1)
(hash-set v1 v2)))
Not super pretty but it works.
(defn mergeMatches [propertyMapList]
(into {} ; collect the per-key entries into one map instead of a seq of maps
(for [k (set (for [pp propertyMapList] (key (first pp))))]
[k (set (remove nil? (for [pp propertyMapList] (k pp))))])))
