Java 8: Reduce list of elements like SQL GROUP BY

After a stream().filter().map() on a List I have a data structure of type List<List<Object>> that looks like this:
[["1","a",20],
["1","b",10],
["2","a",10],
["2","b",30]]
What I want is to group by the value of the first element of the inner list, leave the middle element out and finally sum the last elements for each "group".
[["1", 30],
["2", 40]]
Sorry if this is obvious for some of you, but I have yet to find any example of how to achieve this. I assumed it could be done by Stream.reduce(U identity, BiFunction accumulator, BinaryOperator combiner) but so far I haven't succeeded. If someone could provide some example code for this, I believe it would be appreciated by many others too.

The following code may be of help:
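import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

List<List<Object>> originalList = Arrays.asList(
        Arrays.asList("1", "a", 20),
        Arrays.asList("1", "b", 10),
        Arrays.asList("2", "a", 10),
        Arrays.asList("2", "b", 30)
);
// Group by the first element of each inner list and sum the third element.
final Map<Object, Integer> collectedMap =
        originalList.stream()
                .collect(Collectors.groupingBy(
                        e -> e.get(0),
                        Collectors.summingInt(e -> (Integer) e.get(2))));
System.out.println(collectedMap);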
The output is:
{1=30, 2=40}
Basically, the code groups by the first value of each sublist (get(0)) and then sums the integers with summingInt. However, the result is collected into a Map; if some other collection is required, the stream must be changed slightly.
For example, to collect the result as a List:
final List<List<Object>> collectedList =
        collectedMap.entrySet()
                .stream()
                .map(e -> Arrays.asList(e.getKey(), e.getValue()))
                .collect(Collectors.toList());
System.out.println(collectedList);
Then, the output will be:
[[1, 30], [2, 40]]
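The two steps above can also be combined into one helper. The following is only a sketch: the class name GroupSumSketch and the method groupAndSum are made up for illustration, and the use of LinkedHashMap is an assumption that the groups should appear in order of first occurrence (the default groupingBy map makes no ordering guarantee).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer.
class GroupSumSketch {

    // Groups rows by their first element and sums the third element of each group.
    static List<List<Object>> groupAndSum(List<List<Object>> rows) {
        // LinkedHashMap keeps the groups in order of first occurrence (an assumption).
        Map<Object, Integer> sums = rows.stream()
                .collect(Collectors.groupingBy(
                        row -> row.get(0),
                        LinkedHashMap::new,
                        Collectors.summingInt(row -> (Integer) row.get(2))));

        // Turn each map entry back into a two-element list: [key, sum].
        return sums.entrySet().stream()
                .map(e -> Arrays.<Object>asList(e.getKey(), e.getValue()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Object>> rows = Arrays.asList(
                Arrays.<Object>asList("1", "a", 20),
                Arrays.<Object>asList("1", "b", 10),
                Arrays.<Object>asList("2", "a", 10),
                Arrays.<Object>asList("2", "b", 30));
        System.out.println(groupAndSum(rows)); // prints [[1, 30], [2, 40]]
    }
}
```

The cast to Integer will fail at runtime if the third element of any row is not an Integer, just as in the code above.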

Related

RxJava - Count events on the fly

I would like to count objects passing through an observable. I know there is a count operator, but it can't be used for infinite streams because it waits for completion.
What I want is something like Value -> operator -> Pair(Int, Value). I know there could be a problem with int (or long) overflow, and maybe that is the reason nothing like this exists, but I still have a feeling I've seen something like this before. One can implement this with the scan operator, but I thought there might be a simpler way.
Output would be like:
Observable.just(Event1, Event2, Event3) -> (1, Event1), (2, Event2), (3, Event3)
You can solve your problem using the RxJava scan operator:
Observable.just("a", "b", "c")
    .scan(new Pair<>(0, ""), (pair, s) -> new Pair<>(pair.first + 1, s))
    .skip(1)
new Pair<>(0, "") is your seed value.
The lambda (pair, s) -> new Pair<>(pair.first + 1, s) takes the previous pair (initially the seed) and the value emitted by the original observable, and produces the next Pair to be emitted and fed back into the lambda.
skip(1) is needed to avoid emitting the seed value.
Counting implies state, so you can use a "stateful" mapper class instead of an anonymous class.
For example:
class Mapper<T, R>(val mapper: (T) -> R) : Function<T, Pair<Int, R>> {
    private var counter = 0
    override fun apply(t: T): Pair<Int, R> {
        return Pair(counter++, mapper(t))
        // or ++counter if you want to start from 1 instead of zero
    }
}
// usage
val mapper = Mapper<String, String> { it.toUpperCase() }
Observable.just("a", "b", "c")
    .map(mapper)
    .subscribe {
        Log.d("test logger", it.toString())
    }
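For readers who prefer Java, the same stateful-mapper idea could be sketched roughly as follows. This is an illustration under assumptions, not RxJava code: CountingMapper is a made-up name, it implements java.util.function.Function rather than the RxJava Function interface, and a plain sequential java.util.stream.Stream stands in for the Observable so that the example runs without extra dependencies.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stateful mapper: pairs each value with a running count.
class CountingMapper<T> implements Function<T, Map.Entry<Integer, T>> {

    // Mutable state; not thread-safe, so only use it from a single thread.
    private int counter = 0;

    @Override
    public Map.Entry<Integer, T> apply(T value) {
        // ++counter starts counting at 1; use counter++ to start at 0.
        return new SimpleImmutableEntry<>(++counter, value);
    }

    public static void main(String[] args) {
        // A sequential Stream is used here only as a stand-in for an Observable.
        List<Map.Entry<Integer, String>> numbered = Stream.of("a", "b", "c")
                .map(new CountingMapper<String>())
                .collect(Collectors.toList());
        System.out.println(numbered); // prints [1=a, 2=b, 3=c]
    }
}
```

Because the counter lives in the mapper instance, reusing one instance for several subscriptions keeps counting where the previous one stopped; create a new instance per subscription if each should start from 1.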

Lua - writing iterator similar to ipairs, but selects indices

I'd like to write an iterator that behaves exactly like ipairs, except that it takes a second argument. The second argument would be a table of the indices that ipairs should loop over.
I'm wondering if my current approach is inefficient, and how I could improve it with closures.
I'm also open to other methods of accomplishing the same thing. But I like iterators because they're easy to use and debug.
I'll be making references to and using some of the terminology from Programming in Lua (PiL), especially the chapter on closures (chapter 7 in the link).
So I'd like to have this,
ary = {10,20,30,40}
for i,v in selpairs(ary, {1,3}) do
  ary[i] = v+5
  print(string.format("ary[%d] is now = %g", i, ary[i]))
end
which would output this:
ary[1] is now = 15
ary[3] is now = 35
My current approach is this (in order: iterator, factory, then generic for):
iter = function (t, s)
  s = s + 1
  local i = t.sel[s]
  local v = t.ary[i]
  if v then
    return s, i, v
  end
end
function selpairs (ary, sel)
  local t = {}
  t.ary = ary
  t.sel = sel
  return iter, t, 0
end
ary = {10,20,30,40}
for _,i,v in selpairs(ary, {1,3}) do
  ary[i] = v+5
  print(string.format("ary[%d] is now = %g", i, ary[i]))
end
-- same output as before
It works. sel is the array of 'selected' indices. ary is the array you want to perform the loop on. Inside iter, s indexes sel, and i indexes ary.
But there are a few glaring problems.
I must always discard the first returned value s (_ in the for loop). I never need s, but it has to be returned first since it is the "control variable".
The "invariant state" is actually two invariant states (ary and sel) packed into a single table. PiL says that this is more expensive, and recommends using closures. (Hence my writing this question.)
The rest of this can be ignored. I'm just providing more context for what I want to use selpairs for.
I'm mostly concerned with the second problem. I'm writing this for a library I'm making for generating music. Doing simple stuff like ary[i] = v+5 won't really be a problem. But when I do stuff like accessing object properties and checking bounds, then I get concerned that the 'invariant state as a table' approach may be creating unnecessary overhead. Should I be concerned about this?
If anything, I'd like to know how to write this with closures just for the knowledge.
Of course, I've tried using closures, but I'm failing to understand the scope of "locals in enclosing functions" and how it relates to a for loop calling an iterator.
As for the first problem, I imagine I could make the control variable a table of s, i, and v. And at the return in iter, unpack the table in the desired order.
But I'm guessing that this is inefficient too.
Eventually, I'd like to write an iterator which does this, except nested into itself. My main data structure is arrays of arrays, so I'd hope to make something like this:
ary_of_arys = {
  {10, 20, 30, 40},
  {5, 6, 7, 8},
  {0.9, 1, 1.1, 1.2},
}
for aoa,i,v in selpairs_inarrays(ary_of_arys, {1,3}, {2,3,4}) do
  ary_of_arys[aoa][i] = v+5
end
And this too, could use the table approach, but it'd be nice to know how to take advantage of closures.
I've actually done something similar: a function that basically does the same thing by taking a function as its fourth and final argument. It works just fine, but would this be less efficient than an iterator?
You can hide "control variable" in an upvalue:
local function selpairs(ary, sel)
  local s = 0
  return
    function()
      s = s + 1
      local i = sel[s]
      local v = ary[i]
      if v then
        return i, v
      end
    end
end
Usage:
local ary = {10,20,30,40}
for i, v in selpairs(ary, {1,3}) do
  ary[i] = v+5
  print(string.format("ary[%d] is now = %g", i, ary[i]))
end
Nested usage:
local ary_of_arys = {
  {10, 20, 30, 40},
  {5, 6, 7, 8},
  {0.9, 1, 1.1, 1.2},
}
local outer_indices = {1,3}
local inner_indices = {2,3,4}
for aoa, ary in selpairs(ary_of_arys, outer_indices) do
  for i, v in selpairs(ary, inner_indices) do
    ary[i] = v+5 -- This is the same as ary_of_arys[aoa][i] = v+5
  end
end
Not sure if I understand what you want to achieve, but why not simply write
local values = {"a", "b", "c", "d"}
for i,key in ipairs {3,4,1} do
print(values[key])
end
and so forth, instead of implementing all that iterator stuff? Your use case is rather simple, and this approach can easily be extended to more dimensions.
And here's a co-routine based possibility:
function selpairs(t,selected)
  return coroutine.wrap(function()
    for _,k in ipairs(selected) do
      coroutine.yield(k,t[k])
    end
  end)
end

How to fix this SML code to work as intended?

Right now I have an SML function:
method([1,1,1,1,2,2,2,3,3,3]);
returns:
val it = [[2,2,2],[3,3,3]] : int list list
but I need it to return:
val it = [[1,1,1,1],[2,2,2],[3,3,3]] : int list list
This is my current code:
- fun method2(L: int list) =
= if tl(L) = [] then [hd(L)] else
= if hd(tl(L)) = hd(L) then hd(L)::method(tl(L)) else [hd(L)];
- fun method(L: int list) =
= if tl(L) = [] then [] else
= if hd(tl(L)) = hd(L) then method(tl(L)) else
= method2(tl(L))::method(tl(L));
As you can see it misses the first method2 call. Any ideas on how I can fix this? I am completely stumped.
Your problem is here: if hd(tl(L)) = hd(L) then method(tl(L)) else. This says that if the head of the tail is equal to the head, processing continues but nothing is added to the result list, so the first contiguous chunk of equal values is skipped. I would suggest separating the duties of these functions a bit more. The way to do this is to have method2 strip off the next contiguous chunk of values and return a pair, where the first element is the chunk that was stripped off and the second element is the remaining list. For example, method2([1, 1, 1, 2, 2, 3, 3]) = ([1, 1, 1], [2, 2, 3, 3]) and method2([2, 2, 3, 3]) = ([2, 2], [3, 3]). Now you can just keep calling method2 until the second part of the pair is nil.
I'm not quite sure what you are trying to do with your code. I would recommend creating a tail recursive helper function which is passed three arguments:
1) The list of lists you are trying to build up
2) The current list you are building up
3) The list you are processing
In your example, a typical call somewhere in the middle of the computation would look like:
helper([[1,1,1,1]], [2,2],[2,3,3,3])
The recursion works by comparing the head of the last argument ([2,3,3,3]) with the head of the list currently being built up ([2,2]). Since they are the same, the 2 at the head of the last argument is moved to the list being built up:
helper([[1,1,1,1]], [2,2,2],[3,3,3])
In the next step of the recursion the heads are compared again and found to be different (2 != 3), so the helper function puts the middle list at the front of the list of lists:
helper([[2,2,2], [1,1,1,1]], [3],[3,3])
The middle list is re-initialized to [3] so that it starts growing again.
Eventually you reach something like this:
helper([[2,2,2], [1,1,1,1]], [3,3,3],[])
The [3,3,3] is then tacked onto the list of lists, and the reverse of this list is returned.
Once such a helper function is defined, the main method checks for an empty list and, if the list is not empty, makes the first call to the helper function. The following code fleshes out these ideas, using pattern-matching style rather than hd and tl (I am not a big fan of using those functions explicitly; it makes the code too Lisp-like). If this is homework, you should probably make sure you thoroughly understand how it works and then translate it into code involving hd and tl, since your professor would regard it as plagiarized if you use things you haven't yet studied and haven't made your own work:
fun helper (xs, ys, []) = rev (ys::xs)
  | helper (xs, y::ys, z::zs) =
      if y = z
      then helper(xs, z :: y :: ys, zs)
      else helper((y::ys)::xs,[z],zs);
fun method [] = []
  | method (x::xs) = helper([],[x],xs);
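For comparison, the accumulate-and-flush idea behind the helper can be sketched iteratively in Java. This is only an illustration, not a translation the original answer provides: the names RunGrouper and groupRuns are made up, and the loop appends to the end of the result instead of prepending and reversing as the SML version does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class illustrating the same idea as the SML helper.
class RunGrouper {

    // Splits a list into sublists of consecutive equal values.
    static List<List<Integer>> groupRuns(List<Integer> input) {
        List<List<Integer>> groups = new ArrayList<>(); // the list of lists being built up
        List<Integer> current = new ArrayList<>();      // the run currently being built up
        for (Integer value : input) {
            // A different value ends the current run: store it and start a new one.
            if (!current.isEmpty() && !current.get(0).equals(value)) {
                groups.add(current);
                current = new ArrayList<>();
            }
            current.add(value);
        }
        // Tack the final run onto the result, unless the input was empty.
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(groupRuns(Arrays.asList(1, 1, 1, 1, 2, 2, 2, 3, 3, 3)));
        // prints [[1, 1, 1, 1], [2, 2, 2], [3, 3, 3]]
    }
}
```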

getting values from nested lists in groovy

I have a 3-level nested list in Groovy like this:
rList = [[[12, name1],[22,name2],[49,name3]],[[33, name5],[22,name6],[21, name7]]]
How can I iterate over it to get the name values from the first sublist?
I want something like rsublist = [name1, name2, name3].
Thanks in advance.
rList = [[[12, 'name1'], [22, 'name2'], [49, 'name3']], [[33, 'name5'], [22, 'name6'], [21, 'name7']]]
names = rList[0]*.getAt(1)
assert names == ['name1', 'name2', 'name3']
First, rList[0] gives you the first sub-list, which is [[12, name1], [22, name2], [49, name3]]. Then, the spread operator, *., is used to apply the same method, getAt(1), to every element of that list, which will return a list with every second element of the sub-lists, which are the values you were looking for :)
You can also use rList[0].collect { it[1] } which is equivalent and might be more familiar if you are not used to the spread operator.
Based on @epidemian's answer, here's a more readable version:
rList.first()*.last()

Appending data to an AT Field using transmogrifier

I have a CSV file of data like this:
1, [a, b, c]
2, [a, b, d]
3, [a]
and some Plone objects which should be updated like this:
ID, LinesField
a, [1,2,3]
b, [1,2]
c, [1]
d, [2]
So, to clarify, the object with the id a is named on lines 1, 2 and 3 of the CSV, and thus the LinesField property of object a needs to have those line ids (the first number on the line) listed.
Ideally I'd like to use Transmogrifier to import this information (and avoid doing any manipulation in Excel beforehand), and I can see two ways, theoretically of doing this, but I can't work out how to do this in practice. I'd be grateful for some pointers to examples. I think that either I need to transform the entire pipeline so that the items reflect the structure of my Plone objects and then use the ATSchemaUpdater blueprint, but I can't see any examples on how to add items to the pipeline (do I need to write my own blueprint?) Or, alternatively I could loop through the items as they exist and append the value in the left column to the items in the list in the right. For that I need a way of appending values with ATSchemaUpdater rather than overwriting them - again, is there a blueprint for that anywhere?
Here are a few sample CSV lines:
"Name","Themes"
"Bessie Brown","cah;cab;cac"
"Fred Blogs","cah;cac"
"Dinah Washington","cah;cab"
The Plone object will be a theme and the lines field a list of names:
cah, ['Bessie Brown', 'Fred Blogs' etc etc]
I'm not sure you want to read the CSV file using transmogrifier, but I think you can create a section that inserts these values into the items in the pipeline, using a function like this:
def transpose(cvs):
    keys = []
    [keys.extend(v) for v in cvs.values()]
    keys = set(keys)
    d = {}
    for key in keys:
        values = [k for k, v in cvs.iteritems() if key in v]
        d[key] = values
    return d
In this context, cvs is {1: ['a', 'b', 'c'], 2: ['a', 'b', 'd'], 3: ['a']}; keys will contain all possible values set(['a', 'c', 'b', 'd']); and d will be what you want {'a': [1, 2, 3], 'c': [1], 'b': [1, 2], 'd': [2]}.
Probably there are better ways to do it, but I'm not a Python magician.
The insert section could look like this one:
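from zope.interface import classProvides, implements
from collective.transmogrifier.interfaces import ISectionBlueprint, ISection

class Insert(object):
    """Insert new keys into items.
    """
    classProvides(ISectionBlueprint)
    implements(ISection)

    def __init__(self, transmogrifier, name, options, previous):
        self.previous = previous
        # cvs is the mapping described above; it must be available here
        self.new_keys = transpose(cvs)

    def __iter__(self):
        for item in self.previous:
            item.update(self.new_keys)
            yield item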
After that you can use the SchemaUpdater section.
