Is there a cleaner way to check whether a value is present at a particular index, like list.getOrDefault(index, "defaultValue")? Or even to perform a default operation when the particular index is out of range of the list?
The normal way to do this is to check the size of the list before attempting the operation.
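For instance, the usual guard looks something like this (a minimal sketch, assuming a List<String> named list):

String value = index >= 0 && index < list.size() ? list.get(index) : "defaultValue";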
The default List interface does not have this functionality. There is Iterables.get in Guava:
Iterables.get(iterable, position, defaultValue);
Returns the element at the specified position in iterable or
defaultValue if iterable contains fewer than position + 1 elements.
Throws IndexOutOfBoundsException if position is negative.
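For illustration, here is how a hypothetical call would behave (assuming Guava is on the classpath):

List<String> list = Arrays.asList("a", "b", "c");
Iterables.get(list, 1, "fallback");  // "b"
Iterables.get(list, 7, "fallback");  // "fallback", since the list has fewer than 8 elements
// Iterables.get(list, -1, "fallback") throws IndexOutOfBoundsException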
If this is functionality you intend to use a lot and can't afford to depend on third-party libraries, you could write your own static method (here inspired by the Guava Lists class):
public class Lists {
    public static <E> E getOrDefault(int index, E defaultValue, List<E> list) {
        if (index < 0) {
            throw new IllegalArgumentException("index is less than 0: " + index);
        }
        return index <= list.size() - 1 ? list.get(index) : defaultValue;
    }
}
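A hypothetical call then looks like this:

List<String> names = Arrays.asList("alice", "bob");
Lists.getOrDefault(2, "defaultValue", names); // returns "defaultValue"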
Instead of using a for loop, how do I use the Java 8 Stream API on an array of booleans? How do I use methods such as forEach, reduce, etc.?
I want to get rid of the two variables totalRelevant and retrieved which I am using to maintain state, since a lambda expression can only reference (effectively) final variables from its lexical context.
import java.util.Arrays;
import java.util.List;

public class IRLab {
    public static void main(String[] args) {
        // predefined list of whether each document is relevant or not
        List<Boolean> documentRelivency = Arrays.asList(true, false, true, true, false);
        System.out.println("Precision\tRecall\tF-Measure");
        // variables for output
        double totalRelevant = 0.0;
        double retrieved = 0.0;
        for (int i = 0; i < documentRelivency.size(); ++i) {
            Boolean isRelevant = documentRelivency.get(i);
            // check if document is relevant
            if (isRelevant) totalRelevant += 1;
            // total number of retrieved documents will be equal to the
            // number of documents processed so far, i.e. retrieved = i + 1
            retrieved += 1;
            // computing values using the formulas
            double precision = totalRelevant / retrieved;
            double recall = totalRelevant / totalRelevant;
            double fmeasure = (2 * precision * recall) / (precision + recall);
            // printing the final calculated values
            System.out.format("%9.2f\t%.2f\t%.2f\t\n", precision, recall, fmeasure);
        }
    }
}
How do I convert the above code to functional code using the Java 8 Stream API and lambda expressions? I need to maintain state for the two variables as above.
Generally, converting imperative code to functional code will only be an improvement when you manage to get rid of the mutable state that causes the processing of one element to depend on the processing of the previous one.
There are workarounds that allow you to incorporate mutable state, but you should first try to find a different representation of your problem that works without it. In your example, the processing of each element depends on two values, totalRelevant and retrieved. The latter is just an ascending number and can therefore be represented as a range, e.g. IntStream.range(startValue, endValue). The former stems from your list of boolean values and is the number of true values inside the sublist (0, retrieved) (inclusive).
You could recalculate that value without needing the previous value, but reiterating the list in each step could turn out to be expensive. So instead, collect your list into a single int number representing a bitset first, i.e. [true, false, true, true, false] becomes 0b_10110. Then, you can get the number of one bits using intrinsic operations:
List<Boolean> documentRelivency = Arrays.asList(true, false, true, true, false);
int numBits = documentRelivency.size(), bitset = IntStream.range(0, numBits)
        .map(i -> documentRelivency.get(i) ? 1 << (numBits - i - 1) : 0)
        .reduce(0, (i, j) -> i | j);
System.out.println("Precision\tRecall\tF-Measure");
IntStream.rangeClosed(1, numBits)
         .mapToObj(retrieved -> {
             double totalRelevant = Integer.bitCount(bitset & (-1 << (numBits - retrieved)));
             return String.format("%9.2f\t%.2f\t%.2f",
                     totalRelevant / retrieved, 1f, 2 / (1 + retrieved / totalRelevant));
         })
         .forEach(System.out::println);
This way, you have expressed the entire operation in a functional way where the processing of one element does not depend on the previous one. It could even run in parallel, though this would offer no benefit here.
If the list size exceeds 32, you have to resort to long, or to java.util.BitSet for more than 64 elements.
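A minimal sketch of the BitSet variant (same documentRelivency data as above; here the bits are simply set in document order instead of reversed):

List<Boolean> documentRelivency = Arrays.asList(true, false, true, true, false);
int numDocs = documentRelivency.size();
BitSet relevant = new BitSet(numDocs);
IntStream.range(0, numDocs).filter(documentRelivency::get).forEach(relevant::set);
IntStream.rangeClosed(1, numDocs)
         .mapToObj(retrieved -> {
             // number of relevant documents among the first `retrieved` ones
             double totalRelevant = relevant.get(0, retrieved).cardinality();
             return String.format("%9.2f\t%.2f\t%.2f",
                     totalRelevant / retrieved, 1f, 2 / (1 + retrieved / totalRelevant));
         })
         .forEach(System.out::println);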
But the whole operation is more an example of how to change the thinking from “this is a number I increment in each iteration” to “I’m processing a continuous range of values” and from “this is a number I increment when the element is true” to “this is the count of true values in a range of this list”.
It's unclear why you need to change your code to lambdas. Currently it's quite short, and lambdas will not make it shorter or cleaner. However, if you really want to, you may encapsulate your shared state in a separate object:
static class Stats {
    private int totalRelevant, retrieved;

    public void add(boolean relevant) {
        if (relevant)
            totalRelevant++;
        retrieved++;
    }

    public double getPrecision() {
        return ((double) totalRelevant) / retrieved;
    }

    public double getRecall() {
        return 1.0; // ??? was totalRelevant/totalRelevant in original code
    }

    public double getFMeasure() {
        double precision = getPrecision();
        double recall = getRecall();
        return (2 * precision * recall) / (precision + recall);
    }
}
And use it with a lambda like this:
Stats stats = new Stats();
documentRelivency.forEach(relevant -> {
    stats.add(relevant);
    System.out.format("%9.2f\t%.2f\t%.2f\t\n", stats.getPrecision(),
            stats.getRecall(), stats.getFMeasure());
});
The lambda is here, but not the Stream API. It seems that involving the Stream API in such a problem is not a very good idea, as you need to output the intermediate states of a mutable container which must be mutated strictly in the given order. Well, if you desperately need the Stream API, replace .forEach with .stream().forEachOrdered.
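That replacement is mechanical; the same loop through the Stream API would look like this (sketch, reusing the Stats class above):

Stats stats = new Stats();
documentRelivency.stream().forEachOrdered(relevant -> {
    stats.add(relevant);
    System.out.format("%9.2f\t%.2f\t%.2f\t\n", stats.getPrecision(),
            stats.getRecall(), stats.getFMeasure());
});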
I have my_list, defined this way:
struct my_struct {
    comparator[2] : list of int(bits:16);
    something_else[2] : list of uint(bits:16);
};
...
my_list[10] : list of my_struct;
It is forbidden for comparators at the same index (0 or 1) to be the same anywhere in the list. When I constrain it this way (e.g. for index 0):
keep my_list.all_different(it.comparator[0]);
I get a compilation error:
*** Error: GEN_NO_GENERATABLE_NOTIF:
Constraint without any generatable element.
...
keep my_list.all_different(it.comparator[0]);
How can I generate them all different? I'd appreciate any help.
It also works in one go:
keep for each (elem) in my_list {
    elem.comparator[0] not in my_list[0..max(0, index-1)].apply(.comparator[0]);
    elem.comparator[1] not in my_list[0..max(0, index-1)].apply(.comparator[1]);
};
When you reference my_list.comparator, it doesn't do what you think it does. What happens is that it concatenates all the comparator lists into one big 20-element list. Try it out by removing your constraint and printing it:
extend sys {
    my_list[10] : list of my_struct;

    run() is also {
        print my_list.comparator;
    };
};
What you can do in this case is construct your own list of comparator[0] elements:
extend sys {
    comparators0 : list of int;
    keep comparators0.size() == my_list.size();
    keep for each (comp) in comparators0 {
        comp == my_list.comparator[index * 2];
    };
    keep comparators0.all_different(it);

    // just to make sure that we've sliced the appropriate elements
    run() is also {
        print my_list[0].comparator[0], comparators0[0];
        print my_list[1].comparator[0], comparators0[1];
        print my_list[2].comparator[0], comparators0[2];
    };
};
You can apply an all_different() constraint on this new list. To make sure it's working, adding the following constraint should cause a contradiction:
extend sys {
    // this constraint should cause a contradiction
    keep my_list[0].comparator[0] == my_list[1].comparator[0];
};
It appears to me that Crossfilter never excludes a group from the results of a reduction, even if the applied filters have excluded all the rows in that group. Groups that have had all of their rows filtered out simply return an aggregate value of 0 (or whatever reduceInitial returns).
The problem with this is that it makes it impossible to distinguish between groups that contain no rows and groups that do contain rows but just legitimately aggregate to a value of 0. Basically, there's no way (that I can see) to distinguish between a null value and a 0 aggregation.
Does anybody know of a built-in Crossfilter technique for achieving this? I did come up with a way to do this with my own custom reduceInitial/reduceAdd/reduceRemove methods, but it wasn't totally straightforward, and it seemed to me that this is behavior that might/should be more native to Crossfilter's filtering semantics. So I'm wondering if there's a canonical way to achieve this.
I'll post my technique as an answer if it turns out that there is no built-in way to do this.
A simple way to accomplish this is to have both count and total be reduce attributes:
var dimGroup = dim.group().reduce(reduceAdd, reduceRemove, reduceInitial);

function reduceAdd(p, v) {
    ++p.count;
    p.total += v.value;
    return p;
}

function reduceRemove(p, v) {
    --p.count;
    p.total -= v.value;
    return p;
}

function reduceInitial() {
    return {count: 0, total: 0};
}
Empty groups will have zero counts, so retrieving only non-empty groups is easy:
dimGroup.top(Infinity).filter(function(d) { return d.value.count > 0; });
OK, there doesn't seem to be any obvious answer jumping out, so I'll answer my own question and post the technique I used to solve this.
This example assumes that I've already created a dimension and grouping, which is passed in as groupDim. Because I want to be able to sum up any arbitrary numeric field, I also pass in fieldName so that it will be available in the closure scope of the reduction functions.
One important characteristic of this technique is that it relies on there being a way to uniquely identify which group each row belongs to. Thinking in terms of OLAP, this is essentially the "tuple" that defines a particular aggregation context. But it can be anything you want as long as it deterministically returns the same value for all data rows belonging to a given group.
The end result is that empty groups will have an aggregate value of "null" which can be easily detected for and filtered out after the fact. Any group with at least one row will have a numeric value (even if it happens to be zero).
Refinements or suggestions are more than welcome. Here's the code with comments inline:
function configureAggregateSum(groupDim, fieldName) {
    function getGroupKey(datum) {
        // Given a datum, return the key of the group to which the datum belongs
    }

    // This object will keep track of the number of times each group had reduceAdd
    // versus reduceRemove called. It is used to revert the running aggregate value
    // back to "null" if the count hits zero. This is unfortunately necessary because
    // Crossfilter filters as it is aggregating, so reduceAdd can be called even if, in
    // the end, all records in a group end up being filtered out.
    //
    var groupCount = {};

    function reduceAdd(p, v) {
        // Here's the code that keeps track of the invocation count per group
        var groupKey = getGroupKey(v);
        if (groupCount[groupKey] === undefined) { groupCount[groupKey] = 0; }
        groupCount[groupKey]++;

        // And here's the implementation of the add reduction (sum in my case)
        // Note the check for null (our initial value)
        var value = +v[fieldName];
        return p === null ? value : p + value;
    }

    function reduceRemove(p, v) {
        // This code keeps track of the invocation count per group and, importantly,
        // reverts the value back to "null" if it hits 0 for the group. Essentially,
        // if we detect that the group has no records again, we revert to the initial value.
        var groupKey = getGroupKey(v);
        groupCount[groupKey]--;
        if (groupCount[groupKey] === 0) {
            return null;
        }

        // And here's the code for the remove reduction (sum in my case)
        var value = +v[fieldName];
        return p - value;
    }

    function reduceInitial() {
        return null;
    }

    // Once returned, you can invoke all() or top() to get the values, which can then be
    // filtered using a native Array.filter to remove the groups with a null value.
    return groupDim.reduce(reduceAdd, reduceRemove, reduceInitial);
}
In an object, I have an array of const handles to objects of another specific class. In a method, I may want to return one of these handles as an inout parameter. Here is a simplified example:
class A {}

class B {
    const(A)[] a;

    this() {
        a = [new A(), new A(), new A()];
    }

    void assign_const(const(A)* value) const {
        // *value = a[0]; // fails with: Error: cannot modify const expression *value
    }
}

void main() {
    const(A) a;
    B b = new B();
    b.assign_const(&a);
    assert(a == b.a[0]); // fails .. obviously
}
I do not want to remove the const on the original array. Class B is meant as some kind of view onto a collection of constant A-items. I'm new to D, coming from C++. Have I messed up const-correctness the D way? I've tried several ways to get this to work but have no clue how to get it right.
What is the correct way to perform this lookup without "evil" casting?
Casting away const and modifying an element is undefined behavior in D. Don't do it. Once something is const, it's const. If the element of an array is const, then it can't be changed. So, if you have const(A)[], then you can append elements to the array (since it's the elements that are const, not the array itself), but you can't alter any of the elements in the array. It's the same with immutable. For instance, string is an alias for immutable(char)[], which is why you can append to a string, but you can't alter any of its elements.
If you want an array of const objects where you can alter the elements in the array, you need another level of indirection. In the case of structs, you could use a pointer:
const(S)*[] arr;
but that won't work with classes, because if C is a class, then C* points to a reference to a class object, not to the object itself. For classes, you need to do
Rebindable!(const C)[] arr;
Rebindable is in std.typecons.
I have n integers and I need a quick logic test to see that they are all different, and I don't want to compare every combination to find a match...any ideas on a nice and elegant approach?
I don't care what programming language your idea is in, I can convert!
Use a set data structure if your language supports it; you might also look at keeping a hash table of seen elements.
In Python you might try:
seen = {}
n_already_seen = n in seen
seen[n] = n
n_already_seen will be a boolean indicating if n has already been seen.
You don't have to check every ordered pair, thanks to commutativity and transitivity; you can simply go down the list and check each entry against every entry that comes after it. For example:
bool areElementsUnique( int[] arr ) {
    for( int i = 0; i < arr.Length - 1; i++ ) {
        for( int j = i + 1; j < arr.Length; j++ ) {
            if( arr[i] == arr[j] ) return false;
        }
    }
    return true;
}
Note that the inner loop doesn't start from the beginning, but from the next element (i+1).
You can use a hash table or a set type of data structure that uses hashing. Then you can insert all of the elements into the hash table or hash set and, as you insert, check whether the element is already in the table/set. If for some reason you don't want to check as you go, you can just insert all the numbers and then check whether the size of the structure is less than n. If it is less than n, there had to be repeated elements. Otherwise, they were all unique.
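A minimal sketch of that size-comparison variant (the method name is just for illustration):

import java.util.HashSet;
import java.util.Set;

public static boolean allUnique(int[] numbers) {
    Set<Integer> seen = new HashSet<>();
    for (int number : numbers)
        seen.add(number); // duplicates are silently dropped by the set
    // fewer entries than inputs means at least one repeated element
    return seen.size() == numbers.length;
}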
Here is a really compact Java solution. The time complexity is amortized O(n) and the space complexity is also O(n).
public boolean areAllElementsUnique(int[] list) {
    Set<Integer> set = new HashSet<Integer>();
    for (int number : list)
        if (set.contains(number))
            return false;
        else
            set.add(number);
    return true;
}
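As a side note, Set.add already reports whether the element was newly inserted, so the contains/add pair can be collapsed into one call (same behavior, slightly tighter):

public boolean areAllElementsUnique(int[] list) {
    Set<Integer> set = new HashSet<>();
    for (int number : list)
        if (!set.add(number)) // add() returns false if number was already present
            return false;
    return true;
}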