complex reduce sample unclear how the reduce works - crossfilter

Starting with complex reduce sample
I have trimmed it down to a single chart and I am trying to understand how the reduce works
I have made comments in the code that were not in the example denoting what I think is happening based on how I read the docs.
function groupArrayAdd(keyfn) {
var bisect = d3.bisector(keyfn); //set the bisector value function
//elements is the group that we are reducing,item is the current item
//this is a the reduce function being supplied to the reduce call on the group runAvgGroup for add below
return function(elements, item) {
//get the position of the key value for this element in the sorted array and put it there
var pos = bisect.right(elements, keyfn(item));
elements.splice(pos, 0, item);
return elements;
};
}
function groupArrayRemove(keyfn) {
var bisect = d3.bisector(keyfn);//set the bisector value function
//elements is the group that we are reducing,item is the current item
//this is a the reduce function being supplied to the reduce call on the group runAvgGroup for remove below
return function(elements, item) {
//get the position of the key value for this element in the sorted array and splice it out
var pos = bisect.left(elements, keyfn(item));
if(keyfn(elements[pos])===keyfn(item))
elements.splice(pos, 1);
return elements;
};
}
function groupArrayInit() {
//for each key found by the key function return this array?
return []; //the result array for where the data is being inserted in sorted order?
}
I am not quite sure my perception of how this is working is quite right. Some of the magic isn't showing itself. Am I correct that elements is the group the reduce function is being called on ? also the array in groupArrayInit() how is it being indirectly populated?
Part of me feels that the functions supplied to the reduce call are really array.map functions not array.reduce functions but I just can't quite put my finger on why. having read the docs I am just not making a connection here.
Any help would be appreciated.
Also have I missed Pens/Fiddles that are created for all these examples? like this one
http://dc-js.github.io/dc.js/examples/complex-reduce.html which is where I started with this but had to download the csv and manually convert to Json.
--------------Update
I added some print statements to try to clarify how the add function is working
function groupArrayAdd(keyfn) {
var bisect = d3.bisector(keyfn); //set the bisector value function
//elements is the group that we are reducing,item is the current item
//this is a the reduce function being supplied to the reduce call on the group runAvgGroup for add below
return function(elements, item) {
console.log("---Start Elements and Item and keyfn(item)----")
console.log(elements) //elements grouped by run?
console.log(item) //not seeing the pattern on what this is on each run
console.log(keyfn(item))
console.log("---End----")
//get the position of the key value for this element in the sorted array and put it there
var pos = bisect.right(elements, keyfn(item));
elements.splice(pos, 0, item);
return elements;
};
}
and to print out the group's contents
console.log("RunAvgGroup")
console.log(runAvgGroup.top(Infinity))
which results in
Which appears to be incorrect b/c the values are not sorted by key (the run number)?
And looking at the results of the print statements doesn't seem to help either.

This looks basically right to me. The issues are just conceptual.
Crossfilter’s group.reduce is not exactly like either Array.reduce or Array.map. Group.reduce defines methods for handling adding new records to a group or removing records from a group. So it is conceptually similar to an incremental Array.reduce that supports an reversal operation. This allows filters to be applied and removed.
Group.top returns your list of groups. The value property of these groups should be the elements value that your reduce functions return. The key of the group is the value returned by your group accessor (defined in the dimension.group call that creates your group) or your dimension accessor if you didn’t define a group accessor. Reduce functions work only on the group values and do not have direct access to the group key.
So check those values in the group.top output and hopefully you’ll see the lists of elements you expect.

Related

Can a pure function change input arguments?

Is a function that changes the values of an input argument still a pure function?
My example (Kotlin):
data class Klicker(
var id: Long = 0,
var value: Int = 0
)
fun Klicker.increment() = this.value++
fun Klicker.decrement() = this.value--
fun Klicker.reset() {
this.value = 0
}
Wikipedia says a pure function has these two requirements:
The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change while program execution proceeds or between different executions of the program, nor can it depend on any external input from I/O devices.
Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.
From my understanding, all functions from my example comply with the first requirement.
My uncertainty starts with the second requirement. With the change of the input argument, I mutate an object (rule violation), but this object is not outside of the function scope, so maybe no rule violation?
Also, does a pure function always need to return a completely new value?
I presume, this function is considert 100% pure:
fun pureIncrement(klicker: Klicker): Klicker {
return klicker.copy(value = klicker.value++)
}
Be gentle, this is my first Stackoverflow question.
The increment and decrement functions fulfill neither of the requirements for a pure function. Their return value depends on the state of the Klicker class, which may change while program execution proceeds, so the first requirement is not fulfilled. The evaluation of the result mutates the mutable Klicker instance, so the second requirement is also not fulfilled. It doesn't matter in which scope the mutable data is; a pure function must not mutate any data at all.
The reset function violates only the second requirement.
The pureIncrement function can be made pure if you change it to:
fun pureIncrement(klicker: Klicker): Klicker {
return klicker.copy(value = klicker.value + 1)
}

How get random item from es6 Map or Set

I have a project that uses arrays of objects that I'm thinking of moving to es6 Sets or Maps.
I need to quickly get a random item from them (obviously trivial for my current arrays). How would I do this?
Maps and Sets are not well suited for random access. They are ordered and their length is known, but they are not indexed for access by an order index. As such, to get the Nth item in a Map or Set, you have to iterate through it to find that item.
The simple way to get a random item from a Set or Map would be to get the entire list of keys/items and then select a random one.
// get random item from a Set
function getRandomItem(set) {
let items = Array.from(set);
return items[Math.floor(Math.random() * items.length)];
}
You could make a version that would work with both a Set and a Map like this:
// returns random key from Set or Map
function getRandomKey(collection) {
let keys = Array.from(collection.keys());
return keys[Math.floor(Math.random() * keys.length)];
}
This is obviously not something that would perform well with a large Set or Map since it has to iterate all the keys and build a temporary array in order to select a random one.
Since both a Map and a Set have a known size, you could also select the random index based purely on the .size property and then you could iterate through the Map or Set until you got to the desired Nth item. For large collections, that might be a bit faster and would avoid creating the temporary array of keys at the expense of a little more code, though on average it would still be proportional to the size/2 of the collection.
// returns random key from Set or Map
function getRandomKey(collection) {
let index = Math.floor(Math.random() * collection.size);
let cntr = 0;
for (let key of collection.keys()) {
if (cntr++ === index) {
return key;
}
}
}
There's a short neat ES6+ version of the answer above:
const getRandomItem = iterable => iterable.get([...iterable.keys()][Math.floor(Math.random() * iterable.size)])
Works for Maps as well as for Sets (where keys() is an alias for value() method)
This is the short answer for Sets:
const getRandomItem = set => [...set][Math.floor(Math.random()*set.size)]

Using Rascal MAP

I am trying to create an empty map, that will be then populated within a for loop. Not sure how to proceed in Rascal. For testing purpose, I tried:
rascal>map[int, list[int]] x;
ok
Though, when I try to populate "x" using:
rascal>x += (1, [1,2,3])
>>>>>>>;
>>>>>>>;
^ Parse error here
I got a parse error.
To start, it would be best to assign it an initial value. You don't have to do this at the console, but this is required if you declare the variable inside a script. Also, if you are going to use +=, it has to already have an assigned value.
rascal>map[int,list[int]] x = ( );
map[int, list[int]]: ()
Then, when you are adding items into the map, the key and the value are separated by a :, not by a ,, so you want something like this instead:
rascal>x += ( 1 : [1,2,3]);
map[int, list[int]]: (1:[1,2,3])
rascal>x[1];
list[int]: [1,2,3]
An easier way to do this is to use similar notation to the lookup shown just above:
rascal>x[1] = [1,2,3];
map[int, list[int]]: (1:[1,2,3])
Generally, if you are just setting the value for one key, or are assigning keys inside a loop, x[key] = value is better, += is better if you are adding two existing maps together and saving the result into one of them.
I also like this solution sometimes, where you instead of joining maps just update the value of a certain key:
m = ();
for (...whatever...) {
m[key]?[] += [1,2,3];
}
In this code, when the key is not yet present in the map, then it starts with the [] empty list and then concatenates [1,2,3] to it, or if the key is present already, let's say it's already at [1,2,3], then this will create [1,2,3,1,2,3] at the specific key in the map.

How to filter empty groups from a reduction?

It appears to me that Crossfilter never excludes a group from the results of a reduction, even if the applied filters have excluded all the rows in that group. Groups that have had all of their rows filtered out simply return an aggregate value of 0 (or whatever reduceInitial returns).
The problem with this is that it makes it impossible to distinguish between groups that contain no rows and groups that do contain rows but just legitimately aggregate to a value of 0. Basically, there's no way (that I can see) to distinguish between a null value and a 0 aggregation.
Does anybody know of a built-in Crossfilter technique for achieving this? I did come up with a way to do this with my own custom reduceInitial/reduceAdd/reduceRemove method but it wasn't totally straight forward and it seemed to me that this is behavior that might/should be more native to Crossfilter's filtering semantics. So I'm wondering if there's a canonical way to achieve this.
I'll post my technique as an answer if it turns out that there is no built-in way to do this.
A simple way to accomplish this is to have both count and total be reduce attributes:
var dimGroup = dim.group().reduce(reduceAdd, reduceRemove, reduceInitial);
function reduceAdd(p, v) {
++p.count;
p.total += v.value;
return p;
}
function reduceRemove(p, v) {
--p.count;
p.total -= v.value;
return p;
}
function reduceInitial() {
return {count: 0, total: 0};
}
Empty groups will have zero counts, so retrieving only non-empty groups is easy:
dimGroup.top(Infinity).filter(function(d) { return d.value.count > 0; });
OK, there doesn't seem to be any obvious answer jumping out so I'll answer my own question and post the technique I used to solve this.
This example assumes that I've already created a dimension and grouping, which is passed in as groupDim. Because I want to be able to sum up any arbitrary numeric field, I also pass in fieldName so that it will be available in the closure scope of my the reduction functions.
One important characteristic of this technique is that it relies on there being a way to uniquely identify which group each row belongs to. Thinking in term of OLAP, this is essentially the "tuple" that defines a particular aggregation context. But it can be anything you want as long as it deterministically returns the same value for all data rows belonging to a given group.
The end result is that empty groups will have an aggregate value of "null" which can be easily detected for and filtered out after the fact. Any group with at least one row will have a numeric value (even if it happens to be zero).
Refinements or suggestions to this are more then welcome. Here's the code with comments inline:
function configureAggregateSum(groupDim, fieldName) {
function getGroupKey(datum) {
// Given datum return key corresponding to the group to which the datum belongs
}
// This object will keep track of the number of times each group had reduceAdd
// versus reduceRemove called. It is used to revert the running aggregate value
// back to "null" if the count hits zero. This is unfortunately necessary because
// Crossfilter filters as it is aggregating so reduceAdd can be called even if, in
// the end, all records in a group end up being filtered out.
//
var groupCount = {};
function reduceAdd(p, v) {
// Here's the code that keeps track of the invocation count per group
var groupKey = getGroupKey(v);
if (groupCount[groupKey] === undefined) { groupCount[groupKey] = 0; }
groupCount[groupKey]++;
// And here's the implementation of the add reduction (sum in my case)
// Note the check for null (our initial value)
var value = +v[fieldName];
return p === null ? value : p + value;
}
function reduceRemove(p, v) {
// This code keeps track of invocations of invocation count per group and, importantly,
// reverts value back to "null" if it hits 0 for the group. Essentially, if we detect
// that group has no records again we revert to the initial value.
var groupKey = getGroupKey(v);
groupCount[groupKey]--;
if (groupCount[groupKey] === 0) {
return null;
}
// And here's the code for the remove reduction (sum in my case)
var value = +v[fieldName];
return p - value;
}
function reduceInitial() {
return null;
}
// Once returned, can invoke all() or top() to get the values, which can then be filtered
// using a native Array.filter to remove the groups with null value.
return groupedDim.reduce(reduceAdd, reduceRemove, reduceInitial);
}

Sorting in AdvancedDatagrid in Flex 3

I am using AdvancedDatagrid in Flex 3. One column of AdvancedDatagrid contains numbers and alphabets. When I sort this column, numbers come before alphabets (Default behavior of internal sorting of AdvancedDatagrid). But I want alphabets to come before number when I sort.
I know I will have to write the custom sort function. But can anybody give some idea on how to proceed.
Thanks in advance.
Use sortCompareFunction
The AdvancedDataGrid control uses this function to sort the elements of the data provider collection. The function signature of the callback function takes two parameters and has the following form:
mySortCompareFunction(obj1:Object, obj2:Object):int
obj1 — A data element to compare.
obj2 — Another data element to compare with obj1.
The function should return a value based on the comparison of the objects:
-1 if obj1 should appear before obj2 in ascending order.
0 if obj1 = obj2.
1 if obj1 should appear after obj2 in ascending order.
<mx:AdvancedDataGridColumn sortCompareFunction="mySort"
dataField="colData"/>
Try the following sort compare function.
public function mySort(obj1:Object, obj2:Object):int
{
var s1:String = obj1.colData;
var s2:String = obj2.colData;
var result:Number = s1.localeCompare(s2);
if(result != 0)
result = result > 0 ? 1 : -1;
if(s1.match(/^\d/))
{
if(s2.match(/^\d/))
return result;
else
return 1;
}
else if(s2.match(/^\d/))
return -1;
else
return result;
}
It checks the first character of strings and pushes the ones that start with a digit downwards in the sort order. It uses localeCompare to compare two strings if they both start with letters or digits - otherwise it says the one starting with a letter should come before the one with digit. Thus abc will precede 123 but a12 will still come before abc.
If you want a totally different sort where letters always precede numbers irrespective of their position in the string, you would have to write one from the scratch - String::charCodeAt might be a good place to start.

Resources