F# tree-building function causes stack overflow in Xamarin Studio - recursion

I'm trying to build up some rules in a tree structure, with logic gates (and, not, or) as well as conditions, e.g. property x equals value y. I wrote the most obvious recursive function first, which worked. I then tried to write a version in continuation-passing style that wouldn't cause a stack overflow, taking my cue from this post about generic tree folding and this answer on Stack Overflow.
It works for small trees (depth of approximately 1000), but unfortunately it causes a stack overflow on a large tree when I run it on my Mac with Xamarin Studio. Can anyone tell me whether I've misunderstood how F# treats tail-recursive code, or whether this code isn't tail-recursive?
The full sample is here.
let FoldTree andF orF notF leafV t data =
    let rec Loop t cont =
        match t with
        | AndGate (left, right)->
            Loop left (fun lacc ->
                Loop right (fun racc ->
                    cont (andF lacc racc)))
        | OrGate (left, right)->
            Loop left (fun lacc ->
                Loop right (fun racc ->
                    cont (orF lacc racc)))
        | NotGate exp ->
            Loop exp (fun acc -> cont (notF acc))
        | EqualsExpression(property,value) -> cont (leafV (property,value))
    Loop t id
let evaluateContinuationPassingStyle tree data =
    FoldTree (&&) (||) (not) (fun (prop,value) -> data |> Map.find prop |> ((=) value)) tree data

The code is tail-recursive, you got it right. The problem is with Mono. Mono is not as high-quality an implementation of .NET as the official one. In particular, it doesn't do tail call elimination. Like, at all.
For the simplest (and most prevalent) case of self-recursion this doesn't matter too much, because the compiler catches it earlier. The F# compiler is smart enough to spot that the function is calling itself, figure out under what conditions, and convert it into a neat while loop, so that the compiled code doesn't make any calls at all.
But when your tail call is to a function passed as a parameter, the compiler can't do that, because the actual function being called isn't known until runtime. In fact, even mutual recursion of two functions can't be converted into a loop reliably.
Possible solutions:
Switch to .NET Core.
Don't use recursive continuations; use an accumulator instead (might not be possible).
Use self-recursion and pass a manually maintained stack of continuations.
If all else fails, use a mutable stack.
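The idea behind the last two options (keep the pending work in your own stack rather than on the call stack) can be sketched as follows. This is a Python illustration rather than F#, and the names Equals, Not, And, Or and evaluate are stand-ins I made up for the question's EqualsExpression, NotGate, AndGate, OrGate and the fold. The only point is that pending work lives in an ordinary list on the heap, so tree depth no longer consumes call stack.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Equals:          # hypothetical stand-in for EqualsExpression
    prop: str
    value: object


@dataclass(frozen=True)
class Not:             # hypothetical stand-in for NotGate
    exp: "Rule"


@dataclass(frozen=True)
class And:             # hypothetical stand-in for AndGate
    left: "Rule"
    right: "Rule"


@dataclass(frozen=True)
class Or:              # hypothetical stand-in for OrGate
    left: "Rule"
    right: "Rule"


Rule = Union[Equals, Not, And, Or]


def evaluate(tree, data):
    """Post-order evaluation driven by an explicit work stack, no recursion."""
    work = [("visit", tree)]   # instructions still to be carried out
    values = []                # results of subtrees evaluated so far
    while work:
        tag, node = work.pop()
        if tag == "visit":
            if isinstance(node, Equals):
                values.append(data[node.prop] == node.value)
            elif isinstance(node, Not):
                work.append(("combine", node))
                work.append(("visit", node.exp))
            else:  # And / Or: evaluate left, then right, then combine
                work.append(("combine", node))
                work.append(("visit", node.right))
                work.append(("visit", node.left))
        elif isinstance(node, Not):
            values.append(not values.pop())
        else:
            right = values.pop()
            left = values.pop()
            if isinstance(node, And):
                values.append(left and right)
            else:
                values.append(left or right)
    return values.pop()


# A chain of 10,000 nested Not gates, far deeper than Python's default
# recursion limit, evaluates without touching the call stack.
deep = Equals("x", 1)
for _ in range(10_000):
    deep = Not(deep)
print(evaluate(deep, {"x": 1}))
```

Unlike the continuation-passing version, this does not depend on the runtime eliminating tail calls, at the cost of spelling out the evaluation order by hand.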

Related

Recursive Sequences in F#

Let's say I want to calculate the factorial of an integer. A simple approach to this in F# would be:
let rec fact (n: bigint) =
match n with
| x when x = 0I -> 1I
| _ -> n * fact (n-1I)
But, if my program needs dynamic programming, how could I sustain functional programming whilst using memoization?
One idea I had for this was making a sequence of lazy elements, but I ran into a problem. Assume that the following code was acceptable in F# (it is not):
let rec facts =
    seq {
        yield 1I
        for i in 1I..900I do
            yield lazy (i * (facts |> Seq.item ((i-1I) |> int)))
    }
Is there anything similar to this idea in F#?
(Note: I understand that I could use a .NET Dictionary but isn't invoking the ".Add()" method imperative style?)
Also, is there any way I could generalize this with a function? For example, could I create a sequence of the lengths computed by the collatz function defined as:
let rec collatz n i =
    if n = 0 || n = 1 then (i+1)
    elif n % 2 = 0 then collatz (n/2) (i+1)
    else collatz (3*n+1) (i+1)
If you want to do it lazily, this is a nice approach:
let factorials =
    Seq.initInfinite (fun n -> bigint n + 1I)
    |> Seq.scan ((*)) 1I
    |> Seq.cache
The Seq.cache means you won't repeatedly evaluate elements you've already enumerated.
You can then take a particular number of factorials using e.g. Seq.take n, or get a particular factorial using Seq.item n.
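For comparison, roughly the same pipeline can be sketched in Python (factorials is a name I chose for illustration). itertools.accumulate plays the role of Seq.scan and count(1) the role of Seq.initInfinite. There is no counterpart to Seq.cache in this sketch, so every call to factorials() starts over from the beginning.

```python
from itertools import accumulate, count, islice
import operator


def factorials():
    """Lazily yield 0!, 1!, 2!, ... as a running product of 1, 2, 3, ..."""
    return accumulate(count(1), operator.mul, initial=1)


# Take the first six factorials, like Seq.take 6.
print(list(islice(factorials(), 6)))        # [1, 1, 2, 6, 24, 120]

# Fetch a single element by index, like Seq.item 20.
print(next(islice(factorials(), 20, None)))
```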
First, I don't see what you mean by "dynamic programming" in your example.
Using memoization doesn't mean something is not "functional" or breaks immutability. The important point is not how something is implemented but how it behaves. A function that uses mutable memoization is still considered pure as long as it behaves like a pure function. Using mutable variables in a limited scope that is not visible to the caller is therefore still considered pure. If the implementation mattered, we would also have to consider tail recursion impure, as the compiler transforms it into a loop with mutable variables under the hood. Some List.xyz functions also use mutation internally purely for speed. Those functions are still considered pure because they still behave like pure functions.
A sequence is already lazy: it computes its elements only when you ask for them. So it doesn't make much sense to me to create a sequence that returns lazy elements.
If you want to speed up the computation, there are multiple ways to do it. Even in the recursive version you could use an accumulator that is passed to the next function call instead of doing deep recursion.
let fact n =
    let rec loop acc x =
        if x = n
        then acc * x
        else loop (acc*x) (x+1I)
    loop 1I 1I
That overall is the same as
let fact' n =
    let mutable acc = 1I
    let mutable x = 1I
    while x <= n do
        acc <- acc * x
        x <- x + 1I
    acc
While you are learning functional programming it is a good idea to get accustomed to the first version and to understand how looping and recursion relate to each other. But beyond learning, there is no reason to force yourself to always write the first version. In the end you should use what you consider more readable and easier to understand, not whether something uses a mutable variable in its implementation. Nobody really cares about the exact implementation. We should view functions as black boxes, so as long as a function behaves like a pure function, everything is fine.
The above uses an accumulator, so you don't need to repeatedly call a function again to get a value, and you don't need an internal mutable cache either. If you really have a slow recursive version and want to speed it up with caching, you can use something like this.
let fact x =
    let rec fact x =
        match x with
        | x when x = 1I -> 1I
        | x -> (fact (x-1I)) * x
    let cache = System.Collections.Generic.Dictionary<bigint,bigint>()
    match cache.TryGetValue x with
    | false,_ ->
        let value = fact x
        cache.Add(x,value)
        value
    | true,value ->
        value
But that would probably be slower than the versions with an accumulator. It also creates a fresh cache on every call, so nothing is ever reused between calls. If you want to cache calls to fact across your whole application, you need an external cache. You could create a Dictionary outside of fact and use a private variable for it. But you can also use a function with a closure and make the whole process generic.
let memoize (f:'a -> 'b) =
    let cache = System.Collections.Generic.Dictionary<'a,'b>()
    fun x ->
        match cache.TryGetValue x with
        | false,_ ->
            let value = f x
            cache.Add(x,value)
            value
        | true,value ->
            value
let rec fact x =
    match x with
    | x when x = 1I -> 1I
    | x -> (fact (x-1I)) * x
So now you can use something like that.
let fact = memoize fact
printfn "%A" (fact 100I)
printfn "%A" (fact 100I)
and create a memoized function out of any other function that takes one parameter.
Note that memoization doesn't automatically speed everything up. If you use the memoize function on fact, nothing gets faster; the first call is even slower than without memoization. You can add a printfn "Cache Hit" to the | true,value -> branch inside the memoize function. Calling fact 100I twice in a row will yield only a single "Cache Hit" line.
The problem is how the algorithm works. It starts from 100I and recurses down to 1I, but the recursive calls inside fact refer to the original, unmemoized fact, so none of the intermediate results (99I, 98I, ...) is ever looked up in or added to the cache. The first call only pays the extra cost of the cache lookup and then stores the single result for 100I. To really benefit from the cache you would need to change fact itself, so that it works from 1I up to 100I or routes its recursive calls through the cache. The current version even throws a StackOverflow for big inputs, even with the memoize function.
Only the second call benefits from the cache, which is why calling fact 100I twice will only ever print "Cache Hit" once.
This is just an example of how easy it is to get the behaviour wrong with caching/memoization. In general you should try to write a function so it is tail-recursive and uses accumulators instead. Don't write functions that rely on memoization to work properly.
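To make the difference visible, here is a small Python sketch (not F#; outer_only, make_memo_fact and the exposed cache attribute are names I invented for the demonstration). Python looks up a global name at call time, so rebinding fact itself would accidentally route the recursion through the cache. To mimic the F# shadowing behaviour, the wrapper is bound to a different name.

```python
def memoize(f):
    """Wrap a one-argument function with a dictionary cache."""
    cache = {}

    def wrapper(x):
        if x not in cache:
            cache[x] = f(x)
        return cache[x]

    wrapper.cache = cache   # exposed only so the example can inspect it
    return wrapper


def fact(n):
    return 1 if n <= 1 else n * fact(n - 1)


# Case 1: wrap from the outside. The recursive calls still go to plain
# fact, so only the top-level argument ever lands in the cache.
outer_only = memoize(fact)
outer_only(20)
print(sorted(outer_only.cache))        # [20]


# Case 2: route the recursion through the wrapper, so every
# intermediate result is cached as well.
def make_memo_fact():
    def raw(n):
        return 1 if n <= 1 else n * memo(n - 1)

    memo = memoize(raw)
    return memo


memo_fact = make_memo_fact()
memo_fact(20)
print(len(memo_fact.cache))            # 20
```

The second variant still recurses as deeply as the original, so it does nothing about the stack problem; it only changes what ends up in the cache.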
I would pick a solution with an accumulator. If you profile your application and find that this is still too slow, that fact is a bottleneck, and that caching it would help, then you can also just cache the results of fact directly, something like this. You could use dict or a Map for this.
let factCache = [1I..100I] |> List.map (fun x -> x,fact x) |> dict
let factCache = [1I..100I] |> List.map (fun x -> x,fact x) |> Map.ofList

What are real use cases of currying?

I've been reading lots of articles on currying, but almost all of them are misleading: they explain currying as partial function application, and the examples are usually about functions with an arity of 2, like an add function.
Also, many implementations of a curry function in JavaScript accept more than one argument per partial application (see lodash), while the Wikipedia article clearly states that currying is about:
translating the evaluation of a function that takes multiple arguments (or a tuple of arguments) into evaluating a sequence of functions, each with a single argument (partial application)
So basically currying is a series of partial applications each with a single argument. And I really want to know real uses of that, in any language.
Real use case of currying is partial application.
Currying by itself is not terribly interesting. What's interesting is if your programming language supports currying by default, as is the case in F# or Haskell.
You can define higher order functions for currying and partial application in any language that supports first class functions, but it's a far cry from the flexibility you get when every function you get is curried, and thus partially applicable without you having to do anything.
So if you see people conflating currying and partial application, that's because of how closely those concepts are tied there - since currying is ubiquitous, you don't really need other forms of partial application than applying curried functions to consecutive arguments.
It is useful to pass context.
Consider the 'map' function. It takes a function as argument:
map : (a -> b) -> [a] -> [b]
Given a function which uses some form of context:
f : SomeContext -> a -> b
This means you can elegantly use the map function without having to state the 'a'-argument:
map (f actualContext) [1,2,3]
Without currying, you would have to use a lambda:
map (\a -> f actualContext a) [1,2,3]
Notes:
map is a function which takes a function f and a list containing values of type a. It constructs a new list by taking each a and applying f to it, resulting in a list of b.
e.g. map (+1) [1,2,3] = [2,3,4]
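In a language without automatic currying the same "pass the context first" pattern needs an explicit helper. A minimal Python sketch, where apply_discount and the context dictionary are made-up names for illustration: functools.partial fixes the context argument and leaves a one-argument function that map can use, which is what a curried f actualContext gives you for free.

```python
from functools import partial


def apply_discount(context, price):
    """Hypothetical two-argument function: context first, value second."""
    return price * (100 - context["discount_percent"]) // 100


actual_context = {"discount_percent": 10}

# Partial application: fix the context, get a one-argument function back.
discounted = list(map(partial(apply_discount, actual_context), [100, 200, 300]))

# The lambda spelling that currying lets you avoid.
with_lambda = list(map(lambda p: apply_discount(actual_context, p), [100, 200, 300]))

print(discounted)    # [90, 180, 270]
```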
The bearing currying has on code can be divided into two sets of issues, syntactical and implementation-related (I use Haskell to illustrate).
Syntax Issue 1:
Currying allows greater code clarity in certain cases.
What does clarity mean? Reading the function provides clear indication of its functionality.
e.g. The map function.
map : (a -> b) -> ([a] -> [b])
Read in this way, we see that map is a higher order function that lifts a function transforming as to bs to a function that transforms [a] to [b].
This intuition is particularly useful when understanding such expressions.
map (map (+1))
The inner map has the type above [a] -> [b].
In order to figure out the type of the outer map, we recursively apply our intuition from above. The outer map thus lifts [a] -> [b] to [[a]] -> [[b]].
This intuition will carry you forward a LOT.
Once we generalize map over into fmap, a map over arbitrary containers, it becomes really easy to read expressions like so (Note I've monomorphised the type of each fmap to a different type for the sake of the example).
showInt : Int -> String
(fmap . fmap . fmap) showInt : Tree (Set [Int]) -> Tree (Set [String])
Hopefully the above illustrates that fmap provides this generalized notion of lifting vanilla functions into functions over some arbitrary container.
Syntax Issue 2:
Currying also allows us to express functions in point-free form.
import Data.List (sort)
nthSmallest :: Int -> [Int] -> Maybe Int
nthSmallest n = safeHead . drop n . sort
safeHead (x:_) = Just x
safeHead _ = Nothing
The above is usually considered good style as it illustrates thinking in terms of a pipeline of functions rather than the explicit manipulation of data.
Implementation:
In Haskell, point free style (through currying) can help us optimize functions. Writing a function in point free form will allow us to memoize it.
memoized_fib :: Int -> Integer
memoized_fib = (map fib [0 ..] !!)
  where fib 0 = 0
        fib 1 = 1
        fib n = memoized_fib (n-2) + memoized_fib (n-1)
not_memoized_fib :: Int -> Integer
not_memoized_fib x = map fib [0 ..] !! x
  where fib 0 = 0
        fib 1 = 1
        fib n = not_memoized_fib (n-2) + not_memoized_fib (n-1)
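Written point-free, as in the memoized version, memoized_fib is a constant whose list map fib [0 ..] is built once and shared between calls, which is what memoizes it. In not_memoized_fib the list sits under the explicit argument x, so it is typically rebuilt on every call.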

Why did I still get stackoverflow even if I used tail-recursion in OCaml?

I wrote a function which generates a list of randomized ints in OCaml.
let create_shuffled_int_list n =
  Random.self_init ();
  let rec create n' acc =
    if n' = 0 then acc
    else
      create (n'-1) (acc @ [Random.int (n/2)])
  in
  create n [];;
When I tried to generate 10000 integers, it gave the error Exception: RangeError: Maximum call stack size exceeded.
However, I believe I have used tail recursion in the function, so it should not give a stack overflow error, right?
Any idea?
From the core library documentation
val append : 'a list -> 'a list -> 'a list
Catenate two lists. Same function as the infix operator @. Not tail-recursive (length of the first argument). The @ operator is not tail-recursive either.
So it's not your function that's causing the overflow, it's the @ operator. Seeing as you only care about producing a shuffled list, however, there's no reason to be appending things onto the end of lists. Even if the @ operator were tail-recursive, list append is still O(n). List prepending, however, is O(1). So if you stick your new random numbers on the front of your list, you avoid the overflow (and make your function much, much faster):
let create_shuffled_int_list n =
  Random.self_init ();
  let rec create n' acc =
    if n' = 0 then acc
    else
      create (n'-1) (Random.int (n/2) :: acc)
  in
  create n [];;
If you care about the order (not sure why), then just stick a List.rev on the end:
List.rev (create n []);;
As an aside, you should not call Random.self_init in a function, since:
the user of your function may want to control the seed in order to obtain reproducible results (testing, sharing results...)
this may reset the seed with a not so random entropy source and you probably want to do this only once.
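Both points (prepend instead of append, and leave seeding to the caller) can be sketched together. This is a Python illustration, not OCaml. Python's built-in lists are arrays, so the cost argument does not carry over directly, and the sketch therefore models an OCaml-style list as nested (head, tail) pairs. Taking the generator as the rng parameter is my assumption about how one might expose seed control.

```python
import random


def create_shuffled_int_list(n, rng):
    """Build n random ints in [0, n // 2) by prepending to a cons list."""
    acc = None                               # None plays the role of []
    for _ in range(n):
        acc = (rng.randrange(n // 2), acc)   # O(1) prepend, like x :: acc
    # Flatten the cons list into a regular Python list, iteratively.
    out = []
    while acc is not None:
        head, acc = acc
        out.append(head)
    return out


# The caller owns the seed, so results are reproducible on demand.
first = create_shuffled_int_list(10_000, random.Random(42))
second = create_shuffled_int_list(10_000, random.Random(42))
print(first == second)    # True
```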

Folds versus recursion in Erlang

According to Learn you some Erlang :
Pretty much any function you can think of that reduces lists to 1 element can be expressed as a fold. [...]
This means fold is universal in the sense that you can implement pretty much any other recursive function on lists with a fold
My first thought when writing a function that takes a list and reduces it to 1 element is to use recursion.
What are the guidelines that should help me decide whether to use recursion or a fold?
Is this a stylistic consideration or are there other factors as well (performance, readability, etc.)?
I personally prefer recursion over fold in Erlang (contrary to other languages, e.g. Haskell). I don't find fold more readable than recursion. For example:
fsum(L) -> lists:foldl(fun(X,S) -> S+X end, 0, L).
or
fsum(L) ->
    F = fun(X,S) -> S+X end,
    lists:foldl(F, 0, L).
vs
rsum(L) -> rsum(L, 0).
rsum([], S) -> S;
rsum([H|T], S) -> rsum(T, H+S).
It seems like more code, but it is pretty straightforward and idiomatic Erlang. Using fold requires less code, but the difference becomes smaller and smaller with more payload. Imagine we want to filter odd values and map them to their squares.
lcfoo(L) -> [ X*X || X<-L, X band 1 =:= 1].
fmfoo(L) ->
    lists:map(fun(X) -> X*X end,
        lists:filter(fun(X) when X band 1 =:= 1 -> true; (_) -> false end, L)).
ffoo(L) -> lists:foldr(
    fun(X, A) when X band 1 =:= 1 -> [X*X|A];
       (_, A) -> A end,
    [], L).
rfoo([]) -> [];
rfoo([H|T]) when H band 1 =:= 1 -> [H*H | rfoo(T)];
rfoo([_|T]) -> rfoo(T).
Here the list comprehension wins, but the recursive function is in second place and the fold version is ugly and less readable.
And finally, it is not true that fold is faster than the recursive version, especially when compiled to native (HiPE) code.
Edit:
I've added a fold version with the fun in a variable, as requested:
ffoo2(L) ->
    F = fun(X, A) when X band 1 =:= 1 -> [X*X|A];
           (_, A) -> A
        end,
    lists:foldr(F, [], L).
I don't see how it is more readable than rfoo/1, and I find the accumulator manipulation in particular more complicated and less obvious than direct recursion. It is even longer code.
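For readers who don't know Erlang, the same four shapes can be sketched in Python (the function names simply mirror the Erlang ones). Note that functools.reduce is a left fold, unlike the lists:foldr used above, so the fold version here appends to its accumulator rather than prepending. The recursive version is limited by Python's recursion depth and copies the tail on each step, so treat it as an illustration of shape only.

```python
from functools import reduce


def lc_foo(xs):
    """List comprehension: keep odd values, square them."""
    return [x * x for x in xs if x % 2 == 1]


def fm_foo(xs):
    """filter followed by map."""
    return list(map(lambda x: x * x, filter(lambda x: x % 2 == 1, xs)))


def f_foo(xs):
    """Left fold with an explicit accumulator."""
    def step(acc, x):
        if x % 2 == 1:
            acc.append(x * x)
        return acc

    return reduce(step, xs, [])


def r_foo(xs):
    """Direct recursion on head and tail."""
    if not xs:
        return []
    head, tail = xs[0], xs[1:]
    rest = r_foo(tail)
    return [head * head] + rest if head % 2 == 1 else rest


print(lc_foo([1, 2, 3, 4, 5]))    # [1, 9, 25]
```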
Folds are usually both more readable (since everybody knows what they do) and faster due to optimized implementations in the runtime (especially foldl, which should always be tail-recursive). It's worth noting that they are only a constant factor faster, not on another order, so it's usually premature optimization if you find yourself considering one over the other for performance reasons.
Use standard recursion when you do fancy things, such as working on more than one element at a time, splitting into multiple processes and similar, and stick to higher-order functions (fold, map, ...) when they already do what you want.
I expect fold is done recursively, so you may want to look at trying to implement some of the various list functions, such as map or filter, with fold, and see how useful it can be.
Otherwise, if you are doing this recursively you may be re-implementing fold, basically.
Learn to use what comes with the language, is my thought.
This discussion on foldl and recursion is interesting:
Easy way to break foldl
If you look at the first paragraph in this introduction (you may want to read all of it), he states it better than I did.
http://www.cs.nott.ac.uk/~gmh/fold.pdf
Old thread but my experience is that fold works slower than a recursive function.

How do I implement graphs and graph algorithms in a functional programming language?

Basically, I know how to create graph data structures and use Dijkstra's algorithm in programming languages where side effects are allowed. Typically, graph algorithms use a structure to mark certain nodes as 'visited', but this has side effects, which I'm trying to avoid.
I can think of one way to implement this in a functional language, but it basically requires passing around large amounts of state to different functions, and I'm wondering if there is a more space-efficient solution.
You might check out how Martin Erwig's Haskell functional graph library does things. For instance, its shortest-path functions are all pure, and you can see the source code for how it's implemented.
Another option, like fmark mentioned, is to use an abstraction which allows you to implement pure functions in terms of state. He mentions the State monad (which is available in both lazy and strict varieties). If you're working in the GHC Haskell compiler/interpreter (or, I think, any Haskell implementation which supports rank-2 types), another option is the ST monad, which allows you to write pure functions which deal with mutable variables internally.
If you were using Haskell, the only functional language with which I am familiar, I would recommend using the State monad. The State monad is an abstraction for a function that takes a state and returns an intermediate value and some new state value. This is considered idiomatic Haskell for those situations where maintaining a large state is necessary.
It is a much nicer alternative to the naive "return state as a function result and pass it as a parameter" idiom that is emphasized in beginner functional programming tutorials. I imagine most functional programming languages have a similar construct.
I just keep the visited set as a set and pass it as a parameter. There are efficient log-time implementations of sets of any ordered type and extra-efficient sets of integers.
To represent a graph I use adjacency lists, or I'll use a finite map that maps each node to a list of its successors. It depends what I want to do.
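A minimal sketch of that approach, in Python for illustration: the graph is a map from each node to its list of successors, and the visited set is passed in and handed back rather than mutated in place. The name dfs and the sample graph are made up. Python's frozenset copies on every insertion, so this shows only the shape of the technique and lacks the log-time persistent sets mentioned above.

```python
def dfs(graph, node, visited=frozenset()):
    """Return (preorder list of newly reached nodes, updated visited set)."""
    if node in visited:
        return [], visited
    visited = visited | {node}        # new set; the caller's set is untouched
    order = [node]
    for succ in graph.get(node, []):
        reached, visited = dfs(graph, succ, visited)   # thread the state through
        order = order + reached
    return order, visited


# Adjacency-list graph with a cycle (d points back to a).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}
order, seen = dfs(graph, "a")
print(order)    # ['a', 'b', 'd', 'c']
```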
Rather than Abelson and Sussman, I recommend Chris Okasaki's Purely Functional Data Structures. I've linked to Chris's dissertation, but if you have the money, he expanded it into an excellent book.
Just for grins, here's a slightly scary reverse postorder depth-first search done in continuation-passing style in Haskell. This is straight out of the Hoopl optimizer library:
postorder_dfs_from_except :: forall block e . (NonLocal block, LabelsPtr e)
                          => LabelMap (block C C) -> e -> LabelSet -> [block C C]
postorder_dfs_from_except blocks b visited =
  vchildren (get_children b) (\acc _visited -> acc) [] visited
  where
    vnode :: block C C -> ([block C C] -> LabelSet -> a)
                       -> ([block C C] -> LabelSet -> a)
    vnode block cont acc visited =
        if setMember id visited then
            cont acc visited
        else
            let cont' acc visited = cont (block:acc) visited in
            vchildren (get_children block) cont' acc (setInsert id visited)
      where id = entryLabel block
    vchildren bs cont acc visited = next bs acc visited
      where next children acc visited =
                case children of []     -> cont acc visited
                                 (b:bs) -> vnode b (next bs) acc visited
    get_children block = foldr add_id [] $ targetLabels block
    add_id id rst = case lookupFact id blocks of
                      Just b -> b : rst
                      Nothing -> rst
Here is a Swift example. You might find this a bit more readable. The variables are actually descriptively named, unlike the super cryptic Haskell examples.
https://github.com/gistya/Functional-Swift-Graph
Most functional languages support inner functions. So you can just create your graph representation in the outermost layer and just reference it from the inner function.
This book covers it extensively http://www.amazon.com/gp/product/0262510871/ref=pd_lpo_k2_dp_sr_1?ie=UTF8&cloe_id=aa7c71b1-f0f7-4fca-8003-525e801b8d46&attrMsgId=LPWidget-A1&pf_rd_p=486539851&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0262011530&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=114DJE8K5BG75B86E1QS
I would love to hear about some really clever technique, but I think there are two fundamental approaches:
Modify some global state object. i.e. side-effects
Pass the graph as an argument to your functions with the return value being the modified graph. I assume this is your approach of "passing around large amounts of state"
That is what's done in functional programming. If the compiler/interpreter is any good, it will help manage memory for you. In particular, you'll want to make sure that you use tail recursion, if you happen to recurse in any of your functions.

Resources