I have been practicing Java 8 streams and functional style for a while.
Sometimes I try to solve programming puzzles just using streams.
During this time I have found a class of tasks that I don't know how to solve with streams, only with the classical approach.
One example of this kind of task is:
Given an array of numbers, find the index of the element at which the running sum from the left first drops below zero.
e.g. for the array [1, 2, 3, -1, 3, -10, 9] the answer is 5
My first idea was to use IntStream.range(0, arr.length)... but then I don't know how to accumulate values and stay aware of the index at the same time.
So my questions are:
Is it possible to somehow accumulate a value over a stream and then make a conditional exit?
And what about parallel execution? It doesn't seem to fit a problem like finding indices, where we need to be aware of the element order.
I doubt your task is well suited for streams. What you are looking for is a typical scan-left operation, which is by nature a sequential operation.
For instance, imagine the following elements in the pipeline: [1, 2, -4, 5]. A parallel execution may split it into two subparts, namely [1, 2] and [-4, 5]. Then what would you do with them? You cannot sum them independently, because that yields [3] and [1], and you lose the fact that 1 + 2 - 4 < 0 was reached.
So even if you write a collector that keeps track of the index and the sum, it won't be able to perform well in parallel (I doubt you could even benefit from it), but you can imagine such a collector for sequential use:
public static Collector<Integer, ?, Integer> indexSumLeft(int limit) {
    return Collector.of(
        // state: [0] = index of the current element, [1] = running sum, [2] = done flag
        () -> new int[]{-1, 0, 0},
        (arr, elem) -> {
            if (arr[2] == 0) {
                arr[1] += elem;
                arr[0]++;
            }
            if (arr[1] < limit) {
                arr[2] = 1; // stop updating once the limit has been crossed
            }
        },
        (arr1, arr2) -> { throw new UnsupportedOperationException("Cannot run in parallel"); },
        arr -> arr[0]
    );
}
and a simple usage:
int index = IntStream.of(arr).boxed().collect(indexSumLeft(0));
This will still traverse all the elements of the pipeline, so it is not very efficient.
Also, you might consider using Arrays.parallelPrefix if the data source is an array. Just compute the partial sums over it, and then use a stream to find the first index where the sum is below the limit.
Arrays.parallelPrefix(arr, Integer::sum);
int index = IntStream.range(0, arr.length)
                     .filter(i -> arr[i] < limit)
                     .findFirst()
                     .orElse(-1);
Here also all the partial sums are computed (but in parallel).
In short, I would use a simple for-loop.
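For reference, a minimal sketch of that loop (assuming a limit of 0 and -1 as the "not found" result):

int index = -1;
int sum = 0;
for (int i = 0; i < arr.length; i++) {
    sum += arr[i];
    if (sum < 0) {
        index = i; // first position where the running sum drops below zero
        break;     // the conditional exit the stream version struggles to express
    }
}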
I can propose a solution using my StreamEx library (which provides additional functions on top of the Stream API), but I would not be very happy with such a solution:
int[] input = {1, 2, 3, -1, 3, -10, 9};
System.out.println(IntStreamEx.of(
        IntStreamEx.of(input).scanLeft(Integer::sum)).indexOf(x -> x < 0));
// prints OptionalLong[5]
It uses the IntStreamEx.scanLeft operation to compute the array of prefix sums, then searches over this array using the IntStreamEx.indexOf operation. While indexOf is short-circuiting, the scanLeft operation processes the whole input and creates an intermediate array of the same length as the input, which is completely unnecessary when solving the same problem in imperative style.
With the new headTail method in my StreamEx library it's possible to create a lazy solution which works well for very long or even infinite streams. First, we can define a new intermediate scanLeft operation:
public static <T> StreamEx<T> scanLeft(StreamEx<T> input, BinaryOperator<T> operator) {
    return input.headTail((head, tail) ->
        scanLeft(tail.mapFirst(cur -> operator.apply(head, cur)), operator)
            .prepend(head));
}
This defines a lazy scanLeft using headTail: it applies the given function to the head and the first element of the tail stream, then prepends the head. Now you can use this scanLeft:
scanLeft(StreamEx.of(1, 2, 3, -1, 3, -10, 9), Integer::sum).indexOf(x -> x < 0);
The same can be applied to the infinite stream (e.g. stream of random numbers):
StreamEx<Integer> ints = IntStreamEx.of(new Random(), -100, 100)
.peek(System.out::println).boxed();
int idx = scanLeft(ints, Integer::sum).indexOf(x -> x < 0);
This will run until the cumulative sum becomes negative and return the index of the corresponding element.
In a past intro-to-CS exam there was a question: calculate the space and time complexity of the function f1 as a function of n. Assume that the time complexity of malloc(n) is O(1) and its space complexity is O(n).
int f1(int n) {
    if (n < 3)
        return 1;
    int* arr = (int*) malloc(sizeof(int) * n);
    f1(f1(n - 3));
    free(arr);
    return n;
}
The official solution is: time complexity: O(2^(n/3)), space complexity: O(n^2)
I tried to solve it, but I didn't know how until I saw a note in my notebook that said: since the function returns n, we can treat f1(f1(n-3)) as f1(n-3) + f1(n-3), or as 2f1(n-3). In that case the question becomes very similar to this one: Space complexity of recursive function
I tried solving it this way and got the correct answer.
For the time complexity:
T(n) = 2T(n-3) + 1,  T(n) = 1 for n < 3
T(n-3) = 2T(n-3*2) + 1
T(n) = 2*2T(n-3*2) + 2 + 1
T(n-3*2) = 2T(n-3*3) + 1
T(n) = 2*2*2T(n-3*3) + 2*2 + 2 + 1
...
T(n) = (2^k)T(n-3k) + 2^(k-1) + ... + 2^2 + 2 + 1
Setting n - 3k = 0 gives k = n/3, so:
2^(n/3) + ... + 2^2 + 2 + 1 = 2^(n/3) * [1 + 1/2 + 1/2^2 + ...] = 2^(n/3) * constant
Thus I got O(2^(n/3)).
For the space complexity: the recursion depth is n/3, and each call on the path does a malloc of size O(n) that stays allocated while the nested calls run, so the peak usage is on the order of n * (n/3), thus O(n^2).
My questions:
Why can we treat f1(f1(n - 3)) as f1(n-3) + f1(n-3), or as 2f1(n-3)?
If the function didn't return n but changed it, for example return n/3 instead of return n, how do we solve it? Do we treat it as 2f1((n-3)/3)?
If we can't always treat f1(f1(n - 3)) as f1(n-3) + f1(n-3) or as 2f1(n-3), how do we draw the recursion tree, and how do we write and solve the recurrence T(n)?
Why can we treat f1(f1(n - 3)) as f1(n-3) + f1(n-3) or as 2f1(n-3)?
Because i) the nested f1 is evaluated first, and its return value is used to call the outer f1; so these nested calls are equivalent to:
int result = f1(n - 3);
f1(result);
... and ii) the return value of f1 is just its argument (except for the base case, but it doesn't matter asymptotically), so the above is further equivalent to:
f1(n - 3);
f1(n - 3); // result = n - 3
If the function didn't return n but changed it, for example return n/3 instead of return n, then how do we solve it? Do we treat it as 2f1((n-3)/3)?
Only the outer call is affected. Again, using the equivalent expression from before:
f1(n - 3); // returns (n - 3) / 3
f1((n - 3) / 3);
i.e. just f1(n - 3) + f1((n - 3) / 3) for your example.
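In recurrence form (a sketch for this hypothetical variant, counting the malloc, free and bookkeeping as O(1)):

T(n) = T(n - 3) + T((n - 3)/3) + O(1)

The two terms come from the two separated calls, so the recurrence no longer collapses into a single 2T(n-3) term.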
If we can't always treat f1(f1(n - 3)) as f1(n-3) + f1(n-3) or as 2f1(n-3), then how do we draw the recursion tree, and how do we write and solve the recurrence T(n)?
You can always separate them into two separate calls as above, and again remember that only the second call is affected by the return value. If that value is different from n - 3, then you need a recursion tree instead of the simple expansion above. It depends on the specific problem, needless to say.
/*
Given candidates [1, 2] and a target of 4,
find all combinations that add up to the target.
In this case:
[1,1,1,1]
[1,1,2]
[2,2]
*/
import "sort"
func combinationSum(candidates []int, target int) [][]int {
    sort.Ints(candidates)
    return combine(0, target, []int{}, candidates)
}

func combine(sum int, target int, curComb []int, candidates []int) [][]int {
    var tmp [][]int
    var result [][]int
    if sum == target {
        fmt.Println(curComb)
        return [][]int{curComb}
    } else if sum < target {
        for i, v := range candidates {
            tmp = combine(sum+v, target, append(curComb, v), candidates[i:])
            result = append(result, tmp...)
        }
    }
    return result
}
This is a problem from LeetCode, and I use recursion to solve it.
At the fmt.Println(curComb) call, I print every case where the sum equals the target.
The output is :
[1,1,1,1]
[1,1,2]
[2,2]
And that is the answer that I want!
But why is the final (two-dimensional) answer:
[[1,1,1,2],[1,1,2],[2,2]]
when the expected answer is [[1,1,1,1],[1,1,2],[2,2]]?
Please help me find the mistake in the code. Thanks for your time.
This happens because of the way slices work. A slice is a small header that refers to an underlying array: it holds a pointer to where the slice starts in that array, the slice's length, and the slice's capacity. The capacity is the number of elements from the start of the slice to the end of the array. When you append to a slice, if there is capacity left for the new element, it is written into the existing array. If there isn't, append allocates a new array and copies the elements over; the new array gets extra capacity so that an allocation isn't required on every append.
In your for loop, when curComb is [1, 1, 1], its capacity is 4. On successive iterations of the loop, you append 1 and then 2, neither of which causes a reallocation, because there's room in the array for the new element. When curComb is [1, 1, 1, 1], it is put on the results list, but on the next iteration of the for loop the append overwrites the last element with 2 (remember, it's the same underlying array), so that's what you see when you print the results at the end.
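The same class of bug, sketched in Java for comparison (a reference to a mutable list is stored in the results and mutated afterwards; the names are illustrative only):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SharedStateDemo {
    public static void main(String[] args) {
        List<List<Integer>> results = new ArrayList<>();
        List<Integer> cur = new ArrayList<>(Arrays.asList(1, 1, 1, 1));
        results.add(cur);            // stores a reference, not a snapshot
        cur.set(3, 2);               // a later step mutates the shared list
        System.out.println(results); // prints [[1, 1, 1, 2]], not [[1, 1, 1, 1]]
    }
}

In the Go code the sharing is hidden inside append's reuse of the underlying array, but the effect is the same.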
The solution to this is to return a copy of curComb when the sum equals the target:
if sum == target {
    fmt.Println(curComb)
    tmpCurComb := make([]int, len(curComb))
    copy(tmpCurComb, curComb)
    return [][]int{tmpCurComb}
}
This article gives a good explanation of how slices work.
I don't quite understand this piece of code. So if, for example, n = 5 and we have:
int array[5] = {13, 27, 78, 42, 69};
would someone explain, please?
All I understand is that if n == 1, the function returns the first element.
But when n = 5, do we take the element at index 4, compare it against the smallest of the elements before it, return the smaller of the two, then compare that against the element at index 3, and so on? I am confused.
int min(int a, int b)
{
    return (a < b) ? a : b;
}

// Recursively find the minimum element in an array; n is the length of the
// array, which you assume is at least 1.
int find_min(int *array, int n)
{
    if (n == 1)
        return array[0];
    return min(array[n - 1], find_min(array, n - 1));
}
Given your array:
1. initial call: find_min(array, 5)
   n != 1, therefore the if() doesn't trigger
2. return min(array[4], find_min(array, 4))
   n != 1, therefore the if() doesn't trigger
3. return min(array[3], find_min(array, 3))
   n != 1, therefore the if() doesn't trigger
4. return min(array[2], find_min(array, 2))
   n != 1, therefore the if() doesn't trigger
5. return min(array[1], find_min(array, 1))
   n == 1, so return array[0]
4. return min(array[1], array[0])
   = min(27, 13)
   = 13
3. return min(array[2], 13)
   = min(78, 13) = 13
etc...
It's quite simple. Run through the code using the example you gave.
On the first run through find_min(), it will return the minimum of the last element in the array (69) and the minimum of the rest of the array. To calculate the minimum of the rest of the array, it calls itself, i.e. it is recursive. This 2nd-level call will compare the number 42 (the new "last" element) with the minimum from the rest of the array, and so on. The final call to find_min() will have n=1 with the array "{13}", so it will return 13. The layer that called it will compare 13 with 27 and find that 13 is less so it will return it, and so on back up the chain.
The solution uses recursion to compute the minimum of the smallest possible comparison set and then compares that result against the next element, working backwards through the array, until the minimum value bubbles up to the top. Recursion looks tricky at first, but can be quite effective once you get familiar with it.
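For comparison, the same recursion transliterates almost one-for-one into Java (a sketch using Math.min, with the same contract of n >= 1):

// Recursively find the minimum of the first n elements of the array.
static int findMin(int[] array, int n) {
    if (n == 1)
        return array[0];
    return Math.min(array[n - 1], findMin(array, n - 1));
}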
I found an answer on SO that explained how to write a randomly weighted drop system for a game. I would prefer to write this code in a more functional-programming style, but I couldn't figure out a way to do that for this code. I'll inline the pseudocode here:
R = (some random int);
T = 0;
for o in os
    T = T + o.weight;
    if T > R
        return o;
How could this be written in a more functional style? I am using CoffeeScript and underscore.js, but I'd prefer the answer to be language-agnostic, because I'm having trouble thinking about this in a functional way.
Here are two more-functional versions, in Clojure and JavaScript; the ideas should work in any language that supports closures. Basically, we use recursion instead of iteration to accomplish the same thing, and instead of breaking out of the loop we just return a value and stop recursing.
Original pseudocode:
R = (some random int);
T = 0;
for o in os
    T = T + o.weight;
    if T > R
        return o;
Clojure version (objects are just treated as Clojure maps):
(defn recursive-version
  [r objects]
  (loop [t 0
         others objects]
    (let [obj (first others)
          new_t (+ t (:weight obj))]
      (if (> new_t r)
        obj
        (recur new_t (rest others))))))
JavaScript version (using underscore for convenience).
Be careful, because this could blow the stack.
It is conceptually the same as the Clojure version.
var js_recursive_version = function(objects, r) {
    var main_helper = function(t, others) {
        var obj = _.first(others);
        var new_t = t + obj.weight;
        if (new_t > r) {
            return obj;
        } else {
            return main_helper(new_t, _.rest(others));
        }
    };
    return main_helper(0, objects);
};
You can implement this with a fold (aka Array#reduce, or Underscore's _.reduce):
An SSCCE:
items = [
  {item: 'foo', weight: 50}
  {item: 'bar', weight: 35}
  {item: 'baz', weight: 15}
]

r = Math.random() * 100

{item} = items.reduce (memo, {item, weight}) ->
  if memo.sum > r
    memo
  else
    {item, sum: memo.sum + weight}
, {sum: 0}

console.log 'r:', r, 'item:', item
You can run it many times at coffeescript.org and see that the results make sense :)
That being said, I find the fold a bit contrived, as you have to remember both the selected item and the accumulated weight between iterations, and it doesn't short-circuit when the item is found.
Maybe a compromise between pure FP and the tedium of reimplementing a find algorithm can be considered (using _.find):
total = 0
{item} = _.find items, ({weight}) ->
  total += weight
  total > r
Runnable example.
I find (no pun intended) this algorithm much more accessible than the first one (and it should perform better, as it doesn't create intermediate objects, and it does short-circuit).
Update/side note: the second algorithm is not "pure", because the function passed to _.find is not referentially transparent (it has the side effect of modifying the external total variable), but the algorithm as a whole is referentially transparent. If you were to encapsulate it in a findItem = (items, r) -> function, that function would be pure and would always return the same output for the same input. That's a very important thing, because it means you can get the benefits of FP while using some non-FP constructs (for performance, readability, or whatever reason) under the hood :D
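A sketch of that encapsulation idea in Java, for comparison (the Item record and names are illustrative only; the mutable total is a local variable, so the function stays pure as seen from the outside):

import java.util.List;

public class WeightedFind {
    record Item(String name, double weight) {} // hypothetical item type

    // Pure from the caller's perspective: the same (items, r) always yields the same item.
    static Item findItem(List<Item> items, double r) {
        double total = 0; // local mutation, invisible to callers
        for (Item it : items) {
            total += it.weight;
            if (total > r)
                return it;
        }
        throw new IllegalArgumentException("r must be below the total weight");
    }
}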
I think the underlying task is randomly selecting 'events' (objects) from the array os, with frequencies defined by their respective weights. The approach is to map (i.e. search for) a uniformly distributed random number on the stairstep cumulative distribution function.
With positive weights, the cumulative sum increases from 0 up to the total (1, if the weights are normalized). The code you gave us simply searches linearly from the 0 end. To maximize speed over repeated calls, precalculate the sums, and order the events so the largest weights come first.
It really doesn't matter whether you search with iteration (looping) or recursion. Recursion is nice in a language that tries to be 'purely functional', but it doesn't help with understanding the underlying mathematical problem, and it doesn't help you package the task into a clean function. The underscore functions are another way of packaging the iteration, but they don't change the basic functionality. Only any and all exit early when the target is found.
For a small os array this simple search is sufficient, but with a large array a binary search will be faster. Looking in underscore, I find that sortedIndex uses this strategy. From Lo-Dash (an underscore drop-in): "Uses a binary search to determine the smallest index at which the value should be inserted into array in order to maintain the sort order of the sorted array".
The basic use of sortedIndex is:
os = [{name: 'one', weight: .7},
      {name: 'two', weight: .25},
      {name: 'three', weight: .05}]
t = 0; cumweights = (t += o.weight for o in os)
i = _.sortedIndex(cumweights, R)
os[i]
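For comparison, the same cumulative-sum-plus-binary-search strategy sketched in Java (the names are illustrative; Arrays.binarySearch returns -(insertion point) - 1 when the key is absent):

import java.util.Arrays;

public class WeightedBinarySearch {
    // Given positive weights and a uniform r in [0, totalWeight), pick an index.
    static int pickIndex(double[] weights, double r) {
        double[] cum = new double[weights.length];
        double t = 0;
        for (int i = 0; i < weights.length; i++) {
            t += weights[i];
            cum[i] = t; // stairstep cumulative sums
        }
        int i = Arrays.binarySearch(cum, r);
        // For an absent key, -i - 1 is the first index with cum[index] > r;
        // an exact hit (rare with doubles) belongs to the next bucket.
        return i >= 0 ? i + 1 : -i - 1;
    }
}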
You can hide the cumulative sum calculation with a nested function like:
osEventGen = (os) ->
  t = 0; xw = (t += y.weight for y in os)
  return (R) ->
    i = _.sortedIndex(xw, R)
    return os[i]
osEvent = osEventGen(os)
osEvent(.3)
# { name: 'one', weight: 0.7 }
osEvent(.8)
# { name: 'two', weight: 0.25 }
osEvent(.99)
# { name: 'three', weight: 0.05 }
In CoffeeScript, Jed Clinger's recursive search could be written like this:
foo = (x, r, t = 0) ->
  [y, x...] = x
  t += y
  return [y, t] if x.length == 0 or t > r
  return foo(x, r, t)
A loop version using the same basic idea is:
foo = (x, r) ->
  t = 0
  while x.length and t <= r
    [y, x...] = x   # the [first, rest] split
    t += y
  y
Tests on jsPerf (http://jsperf.com/sortedindex) suggest that sortedIndex is faster when os.length is around 1000, but slower than the simple loop when the length is more like 30.
There are articles and presentations about functional-style programming in D (e.g. http://www.drdobbs.com/architecture-and-design/component-programming-in-d/240008321). I have never used D before, but I'm interested in trying it. Is there a way to write code in D similar to this Python expression?
max(x*y for x in range(N) for y in range(x, N) if str(x*y) == str(x*y)[::-1])
Are there D constructs for generators or list (array) comprehensions?
Here's one possible solution, not particularly pretty:
iota(1, N)
    .map!(x =>
        iota(x, N)
        .map!(y => tuple(x, y)))
    .joiner
    .map!(xy => xy[0] * xy[1])
    .filter!(xy => equal(to!string(xy), to!string(xy).retro))
    .reduce!max;
So what this actually does is create a range from 1 to N, and map each element x to a range of tuples of the x, y values. This gives you a nested range ([[(1,1),(1,2)],[(2,2)]] for N = 3).
We then join this range to get a range of tuples ([(1,1),(1,2),(2,2)] for N = 3).
Next we map each tuple to x*y (D's map for some reason doesn't allow unpacked tuples, so we have to use indexing).
Penultimately we filter out non-palindromes, before finally reducing the range to its largest element.
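For comparison, a sketch of the same pipeline in Java 8 streams (with flatMap playing the role of joiner; N = 13 chosen to match the answer below):

import java.util.stream.IntStream;

public class PalindromeProduct {
    public static void main(String[] args) {
        final int N = 13;
        int best = IntStream.range(1, N)
            .boxed()
            .flatMap(x -> IntStream.range(x, N).mapToObj(y -> x * y))
            .filter(v -> {
                String s = Integer.toString(v);
                return new StringBuilder(s).reverse().toString().equals(s);
            })
            .mapToInt(Integer::intValue)
            .max()
            .orElse(0);
        System.out.println(best); // 121 (11 * 11)
    }
}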
Simple answer: no, D does not have generators or list comprehensions (AFAIK). However, you can create a generator using an InputRange. For that approach, see this related question: What is a "yield return" equivalent in the D programming language?
However, your code isn't using generators, so it could be translated as:
import std.algorithm : equal, max, reduce;
import std.conv : to;
import std.range : retro;

immutable N = 13;

void main() {
    int[] keep;
    foreach (x; 0 .. N) {
        foreach (y; x .. N) {
            auto val = x * y;
            auto s = to!string(val);
            if (equal(s, s.retro)) // reverse doesn't work on immutable ranges
                keep ~= val;       // don't use ~ if N gets large; use an appender instead
        }
    }
    reduce!max(keep); // returns 121 (11*11)
}
For me, this is much more readable than your list comprehension, because the comprehension has gotten quite large.
There may be a better solution out there, but this is how I'd implement it. An added bonus is that you get to see std.algorithm in all its glory.
However, for this particular piece of code, I wouldn't use the array; I'd store only the best value, to save on memory. Something like this:
import std.algorithm : equal, max;
import std.conv : to;
import std.range : retro;

immutable N = 13;

void main() {
    int best = 0;
    foreach (x; 0 .. N) {
        foreach (y; x .. N) {
            auto val = x * y;
            auto s = to!string(val);
            if (equal(s, s.retro))
                best = max(best, val); // keep only the largest palindrome so far
        }
    }
}