Pure functional bottom up tree algorithm - functional-programming

Say I want to write an algorithm that takes an immutable tree data structure and a list of its leaves as input. It needs to return a new tree with changes made to the old tree, working upwards from those leaves.
My problem is that there seems to be no way to do this in a purely functional style without reconstructing the entire tree, checking at each leaf whether it is in the list, because an operation always has to return a complete new tree as its result and you can't mutate the existing tree.
Is this a fundamental problem in functional programming that can only be avoided by using a better-suited algorithm, or am I missing something?
Edit: I not only want to avoid recreating the entire tree; the functional algorithm should also have the same time complexity as the mutating variant.

The most promising I have seen so far (which admittedly is not very long...) is the Zipper data structure: It basically keeps a separate structure, a reverse path from the node to root, and does local edits on this separate structure.
It can do multiple local edits, most of which are constant time, and write them back to the tree (reconstructing the path to root, which are the only nodes that need to change) all in one go.
The Zipper ships with Clojure's standard library (see the heading Zippers - Functional Tree Editing).
And there's the original paper by Huet with an implementation in OCaml.
Disclaimer: I have been programming for a long time, but only started functional programming a couple of weeks ago, and had never even heard of the problem of functional editing of trees until last week, so there may very well be other solutions I'm unaware of.
Still, it looks like the Zipper does most of what one could wish for. If there are other alternatives at O(log n) or below, I'd like to hear them.
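To make the path-copying idea concrete, here is a minimal sketch in Python (the Node shape and update_leftmost helper are illustrative, not from any library). Updating one leaf allocates only the nodes on the path back to the root, the same O(depth) cost the mutating variant pays, and every untouched subtree is shared with the old tree:

from collections import namedtuple

Node = namedtuple("Node", "value left right")

def update_leftmost(node, new_value):
    # rebuild only the spine from the root down to the leftmost leaf;
    # every subtree off that spine is reused as-is, not copied
    if node.left is None:
        return node._replace(value=new_value)
    return node._replace(left=update_leftmost(node.left, new_value))

leaf = lambda v: Node(v, None, None)
tree = Node(1, Node(2, leaf(4), leaf(5)), leaf(3))
tree2 = update_leftmost(tree, 99)
print(tree2.left.left.value)      # 99
print(tree2.right is tree.right)  # True: the right subtree is shared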

You may enjoy reading
http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!248.entry

This depends on your functional programming language. In Haskell, for instance, which is a lazy functional programming language, results are calculated at the last possible moment: when they are actually needed.
In your example the assumption is that because your function creates a new tree, the whole tree must be processed, whereas in reality the function is just passed on to the next function and only executed when necessary.
A good example of lazy evaluation is the Sieve of Eratosthenes in Haskell, which generates the prime numbers by eliminating the multiples of the current number from the list of numbers. Note that the list of numbers is infinite. Taken from here:
primes :: [Integer]
primes = sieve [2..]
  where
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]
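For example, take 10 primes forces just enough of this infinite list to produce the first ten primes; nothing beyond what is actually demanded ever gets computed.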

I recently wrote an algorithm that does exactly what you described - https://medium.com/hibob-engineering/from-list-to-immutable-hierarchy-tree-with-scala-c9e16a63cb89
It works in two phases:
Sort the list of nodes by their depth in the hierarchy.
Construct the tree from the bottom up.
Some caveats:
No node mutation; the result is an immutable tree.
The complexity is O(n).
Cyclic references in the incoming list are ignored.
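Here is a rough Python sketch of those two phases, assuming the input arrives as (id, parent_id) pairs (the field names and tuple-shaped nodes are illustrative, not the article's Scala types):

def build_tree(rows):
    parents = dict(rows)
    children_ids = {}
    for node_id, parent in rows:
        children_ids.setdefault(parent, []).append(node_id)

    depth = {}
    def depth_of(node_id):
        # phase 1: compute each node's depth in the hierarchy
        if node_id not in depth:
            parent = parents[node_id]
            depth[node_id] = 0 if parent is None else depth_of(parent) + 1
        return depth[node_id]

    # phase 2: build bottom-up, deepest first, so every child already
    # exists as an immutable value when its parent is constructed
    built = {}
    for node_id in sorted(parents, key=depth_of, reverse=True):
        kids = tuple(built[c] for c in children_ids.get(node_id, []))
        built[node_id] = (node_id, kids)
    return tuple(built[r] for r in children_ids.get(None, []))

print(build_tree([(1, None), (2, 1), (3, 1), (4, 2)]))
# ((1, ((2, ((4, ()),)), (3, ()))),)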

Related

Struggling with building an intuition for recursion

Though I have studied recursion and am able to understand some programs that use it, I am still not able to obtain a solution using recursion as intuitively as I do using iteration. Is there any course or track available for building an intuition for recursion? How can one master the concept of recursion?
If you want to gain a thorough understanding of how recursion works, I highly recommend that you start with understanding mathematical induction, as the two are very closely related, if not arguably identical.
Recursion is a way of breaking down seemingly complicated problems into smaller bits. Consider the trivial example of the factorial function.
def factorial(n):
    if n < 2:
        return 1
    return n * factorial(n - 1)
To calculate factorial(100), for example, all you need is to calculate factorial(99) and multiply by 100. This follows from the familiar definition of the factorial.
Here are some tips for coming up with a recursive solution:
Assume you know the result returned by the immediately preceding recursive call (e.g. in calculating factorial(100), assume you already know the value of factorial(99). How do you go from there?)
Consider the base case (i.e. when should the recursion come to a halt?)
The first bullet point might seem rather abstract, but all it means is this: a large portion of the work has already been done. How do you go from there to complete the task? In the case of the factorial, factorial(99) constituted this large portion of work. In many cases, you will find that identifying this portion of work simply amounts to examining the argument to the function (e.g. n in factorial), and assuming that you already have the answer to func(n - 1).
Here's another example for concreteness. Let's say we want to reverse a string without using in-built functions. In using recursion, we might assume that string[:-1], or the substring until the very last character, has already been reversed. Then, all that is needed is to put the last remaining character in the front. Using this inspiration, we might come up with the following recursive solution:
def my_reverse(string):
    if not string:  # base case: empty string
        return string  # return empty string, nothing to reverse
    return string[-1] + my_reverse(string[:-1])
With all of this said, recursion is built on mathematical induction, and these two are inseparable ideas. In fact, one can easily prove that recursive algorithms work using induction. I highly recommend that you check out this lecture.

Why after pressing semicolon program is back in deep recursion?

I'm trying to understand the semicolon functionality.
I have this code:
del(X,[X|Rest],Rest).
del(X,[Y|Tail],[Y|Rest]) :-
    del(X,Tail,Rest).

permutation([],[]).
permutation(L,[X|P]) :- del(X,L,L1), permutation(L1,P).
It's the simple predicate to show all permutations of given list.
I used the built-in graphical debugger in SWI-Prolog because I wanted to understand how it works and I understand for the first case which returns the list given in argument. Here is the diagram which I made for better understanding.
But I don't get it for the other solutions. When I press the semicolon, it doesn't resume at the place where it ended; instead it starts in some deep recursion where L=[] (like in step 9). I don't get it: didn't the recursion end earlier? It had to come back out of the recursion to return the answer, and yet after the semicolon it's deep in the recursion again.
Could someone clarify that to me? Thanks in advance.
One analogy that I find useful in demystifying Prolog is that Backtracking is like Nested Loops, and when the innermost loop's variables' values are all found, the looping is suspended, the vars' values are reported, and then the looping is resumed.
As an example, let's write down simple generate-and-test program to find all pairs of natural numbers above 0 that sum up to a prime number. Let's assume is_prime/1 is already given to us.
We write this in Prolog as
above(0, N), between(1, N, M), Sum is M+N, is_prime(Sum).
We write this in an imperative pseudocode as
for N from 1 step 1:
    for M from 1 step 1 until N:
        Sum := M+N
        if is_prime(Sum):
            report_to_user_and_ask(Sum)
Now when report_to_user_and_ask is called, it prints Sum out and asks the user whether to abort or to continue. The loops are not exited, on the contrary, they are just suspended. Thus all the loop variables values that got us this far -- and there may be more tests up the loops chain that sometimes succeed and sometimes fail -- are preserved, i.e. the computation state is preserved, and the computation is ready to be resumed from that point, if the user presses ;.
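You can sketch that suspend/resume behavior with Python generators (the is_prime below is a naive stand-in): each yield suspends the nested loops with all their state intact, and asking for the next item resumes them exactly where they stopped, just like pressing ; at the Prolog prompt.

from itertools import count

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def prime_sums():
    for n in count(1):               # above(0, N)
        for m in range(1, n + 1):    # between(1, N, M)
            s = m + n                # Sum is M+N
            if is_prime(s):          # is_prime(Sum)
                yield m, n, s        # report_to_user_and_ask(Sum)

solutions = prime_sums()
print(next(solutions))  # (1, 1, 2) -- the loops are now suspended
print(next(solutions))  # ';' -- the loops resume, they do not restart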
I first saw this in Peter Norvig's AI book's implementation of Prolog in Common Lisp. He used mapping (Common Lisp's mapcan which is concatMap in Haskell or flatMap in many other languages) as a looping construct though, and it took me years to see that nested loops is what it is really all about.
Goals conjunction is expressed as the nesting of the loops; goals disjunction is expressed as the alternatives to loop through.
A further twist is that the nested loops' structure isn't fixed from the outset. It is fluid: the nested loops of a given loop can be created depending on the current state of that loop, i.e. depending on the current alternative being explored there; the loops are written as we go. In (most of) the languages where such dynamic creation of nested loops is impossible, it can be encoded with recursion, i.e. nested function invocations inside the loops. (Here's one example, with some pseudocode.)
If we keep all such loops (created for each of the alternatives) in memory even after they are finished with, what we get is the AND-OR tree (mentioned in the other answer) thus being created while the search space is being explored and the solutions are found.
(non-coincidentally this fluidity is also the essence of "monad"; nondeterminism is modeled by the list monad; and the essential operation of the list monad is the flatMap operation which we saw above. With fluid structure of loops it is "Monad"; with fixed structure it is "Applicative Functor"; simple loops with no structure (no nesting at all): simply "Functor" (the concepts used in Haskell and the like). Also helps to demystify those.)
So, the proper slogan could be Backtracking is like Nested Loops, either fixed, known from the outset, or dynamically-created as we go. It's a bit longer though. :)
Here's also a Prolog example, which "as if creates the code to be run first (N nested loops for a given value of N), and then runs it." (There's even a whole dedicated tag for it on SO, too, it turns out, recursive-backtracking.)
And here's one in Scheme ("creates nested loops with the solution being accessible in the innermost loop's body"), and a C++ example ("create n nested loops at run-time, in effect enumerating the binary encoding of 2n, and print the sums out from the innermost loop").
There is a big difference between recursion in functional/imperative programming languages and Prolog (and it really became clear to me only in the last 2 weeks or so):
In functional/imperative programming, you recurse down a call chain, then come back up, unwinding the stack, then output the result. It's over.
In Prolog, you recurse down an AND-OR tree (really, alternating AND and OR nodes), selecting a predicate to call on an OR node (the "choicepoint"), from left to right, and calling every predicate in turn on an AND node, also from left to right. An acceptable tree has exactly one predicate returning TRUE under each OR node, and all predicates returning TRUE under each AND node. Once an acceptable tree has been constructed, by the very search procedure, we are (i.e. the "search cursor" is) on the rightmost bottommost node.
Success in constructing an acceptable tree also means a solution to the query entered at the Prolog Toplevel (the REPL) has been found: The variable values are output, but the tree is kept (unless there are no choicepoints).
And this is also important: all variables are global in the sense that if a variable X has been passed all the way down the call chain from predicate to predicate to the rightmost bottommost node, then constrained at the last possible moment by unifying it with 2, for example, X = 2, then the Prolog Toplevel is aware of that without further ado: nothing needs to be passed up the call chain.
If you now press ;, search doesn't restart at the top of the tree, but at the bottom, i.e. at the current cursor position: the nearest parent OR node is asked for more solutions. This may result in much search until a new acceptable tree has been constructed, we are at a new rightmost bottommost node. The new variable values are output and you may again enter ;.
This process cycles until no acceptable tree can be constructed any longer, upon which false is output.
Note that having this AND-OR as an inspectable and modifiable data structure at runtime allows some magical tricks to be deployed.
There is bound to be a lot of power in debugging tools which record this tree to help the user who gets the dreaded sphinx-like false from a Prolog program that is supposed to work. There are now Time Traveling Debuggers for functional and imperative languages, after all...

The difference between MapReduce and the map-reduce combination in functional programming

I read about MapReduce at http://en.wikipedia.org/wiki/MapReduce and understood the example of how to get the count of a "word" across many "documents". However, I did not understand the following line:
Thus the MapReduce framework transforms a list of (key, value) pairs into a list of values. This behavior is different from the functional programming map and reduce combination, which accepts a list of arbitrary values and returns one single value that combines all the values returned by map.
Can someone elaborate on the difference again (MapReduce framework vs. the map and reduce combination)? Especially, what does the reduce of functional programming do?
Thanks a great deal.
The main difference would be that MapReduce is apparently patentable. (Couldn't help myself, sorry...)
On a more serious note, the MapReduce paper, as I remember it, describes a methodology of performing calculations in a massively parallelised fashion. This methodology builds upon the map / reduce construct which was well known for years before, but goes beyond into such matters as distributing the data etc. Also, some constraints are imposed on the structure of data being operated upon and returned by the functions used in the map-like and reduce-like parts of the computation (the thing about data coming in lists of key/value pairs), so you could say that MapReduce is a massive-parallelism-friendly specialisation of the map & reduce combination.
As for the Wikipedia comment on the function being mapped in the functional programming's map / reduce construct producing one value per input... Well, sure it does, but here there are no constraints at all on the type of said value. In particular, it could be a complex data structure like perhaps a list of things to which you would again apply a map / reduce transformation. Going back to the "counting words" example, you could very well have a function which, for a given portion of text, produces a data structure mapping words to occurrence counts, map that over your documents (or chunks of documents, as the case may be) and reduce the results.
In fact, that's exactly what happens in this article by Phil Hagelberg. It's a fun and supremely short example of a MapReduce-word-counting-like computation implemented in Clojure with map and something equivalent to reduce (the (apply + (merge-with ...)) bit -- merge-with is implemented in terms of reduce in clojure.core). The only difference between this and the Wikipedia example is that the objects being counted are URLs instead of arbitrary words -- other than that, you've got a counting words algorithm implemented with map and reduce, MapReduce-style, right there. The reason why it might not fully qualify as being an instance of MapReduce is that there's no complex distribution of workloads involved. It's all happening on a single box... albeit on all the CPUs the box provides.
For in-depth treatment of the reduce function -- also known as fold -- see Graham Hutton's A tutorial on the universality and expressiveness of fold. It's Haskell based, but should be readable even if you don't know the language, as long as you're willing to look up a Haskell thing or two as you go... Things like ++ = list concatenation, no deep Haskell magic.
Using the word count example, the original functional map() would take a set of documents, optionally distribute subsets of that set, and for each document emit a single value representing the number of words (or of a particular word's occurrences) in the document. A functional reduce() would then add up those per-document counts. So you get a total count (either of all words or of a particular word).
In MapReduce, the map would emit a (word, count) pair for each word in each document. A MapReduce reduce() would then add up the count of each word in each document without mixing them into a single pile. So you get a list of words paired with their counts.
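A toy contrast in Python (illustrative only; a real MapReduce framework would distribute the pairs across machines): the functional version collapses everything into one value, while the MapReduce-style version groups the (key, value) pairs and reduces each key separately.

from functools import reduce
from itertools import groupby

docs = ["a b a", "b c"]

# functional map/reduce: map emits one value per document,
# reduce combines them all into a single result
total = reduce(lambda x, y: x + y, map(lambda d: len(d.split()), docs))
print(total)  # 5

# MapReduce-style: map emits (word, count) pairs, the framework groups
# them by key (the shuffle/sort phase), and reduce runs once per word
pairs = sorted((w, 1) for d in docs for w in d.split())
counts = {w: sum(c for _, c in g)
          for w, g in groupby(pairs, key=lambda p: p[0])}
print(counts)  # {'a': 2, 'b': 2, 'c': 1}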
MapReduce is a framework built around splitting a computation into parallelizable mappers and reducers. It builds on the familiar idiom of map and reduce - if you can structure your tasks such that they can be performed by independent mappers and reducers, then you can write it in a way which takes advantage of a MapReduce framework.
Imagine a Python interpreter which recognized tasks which could be computed independently, and farmed them out to mapper or reducer nodes. If you wrote
reduce(lambda x, y: x+y, map(int, ['1', '2', '3']))
or
sum([int(x) for x in ['1', '2', '3']])
you would be using functional map and reduce methods in a MapReduce framework. With current MapReduce frameworks, there's a lot more plumbing involved, but it's the same concept.

What is (functional) reactive programming?

I've read the Wikipedia article on reactive programming. I've also read the small article on functional reactive programming. The descriptions are quite abstract.
What does functional reactive programming (FRP) mean in practice?
What does reactive programming (as opposed to non-reactive programming?) consist of?
My background is in imperative/OO languages, so an explanation that relates to this paradigm would be appreciated.
If you want to get a feel for FRP, you could start with the old Fran tutorial from 1998, which has animated illustrations. For papers, start with Functional Reactive Animation and then follow up on links on the publications link on my home page and the FRP link on the Haskell wiki.
Personally, I like to think about what FRP means before addressing how it might be implemented.
(Code without a specification is an answer without a question and thus "not even wrong".)
So I don't describe FRP in representation/implementation terms as Thomas K does in another answer (graphs, nodes, edges, firing, execution, etc).
There are many possible implementation styles, but no implementation says what FRP is.
I do resonate with Laurence G's simple description that FRP is about "datatypes that represent a value 'over time' ".
Conventional imperative programming captures these dynamic values only indirectly, through state and mutations.
The complete history (past, present, future) has no first class representation.
Moreover, only discretely evolving values can be (indirectly) captured, since the imperative paradigm is temporally discrete.
In contrast, FRP captures these evolving values directly and has no difficulty with continuously evolving values.
FRP is also unusual in that it is concurrent without running afoul of the theoretical & pragmatic rats' nest that plagues imperative concurrency.
Semantically, FRP's concurrency is fine-grained, determinate, and continuous.
(I'm talking about meaning, not implementation. An implementation may or may not involve concurrency or parallelism.)
Semantic determinacy is very important for reasoning, both rigorous and informal.
While concurrency adds enormous complexity to imperative programming (due to nondeterministic interleaving), it is effortless in FRP.
So, what is FRP?
You could have invented it yourself.
Start with these ideas:
Dynamic/evolving values (i.e., values "over time") are first class values in themselves. You can define them and combine them, pass them into & out of functions. I called these things "behaviors".
Behaviors are built up out of a few primitives, like constant (static) behaviors and time (like a clock), and then with sequential and parallel combination. n behaviors are combined by applying an n-ary function (on static values), "point-wise", i.e., continuously over time.
To account for discrete phenomena, have another type (family) of "events", each of which has a stream (finite or infinite) of occurrences. Each occurrence has an associated time and value.
To come up with the compositional vocabulary out of which all behaviors and events can be built, play with some examples. Keep deconstructing into pieces that are more general/simple.
So that you know you're on solid ground, give the whole model a compositional foundation, using the technique of denotational semantics, which just means that (a) each type has a corresponding simple & precise mathematical type of "meanings", and (b) each primitive and operator has a simple & precise meaning as a function of the meanings of the constituents.
Never, ever mix implementation considerations into your exploration process. If this description is gibberish to you, consult (a) Denotational design with type class morphisms, (b) Push-pull functional reactive programming (ignoring the implementation bits), and (c) the Denotational Semantics Haskell wikibooks page. Beware that denotational semantics has two parts, from its two founders Christopher Strachey and Dana Scott: the easier & more useful Strachey part and the harder and less useful (for software design) Scott part.
If you stick with these principles, I expect you'll get something more-or-less in the spirit of FRP.
Where did I get these principles? In software design, I always ask the same question: "what does it mean?".
Denotational semantics gave me a precise framework for this question, and one that fits my aesthetics (unlike operational or axiomatic semantics, both of which leave me unsatisfied).
So I asked myself what is behavior?
I soon realized that the temporally discrete nature of imperative computation is an accommodation to a particular style of machine, rather than a natural description of behavior itself.
The simplest precise description of behavior I can think of is simply "function of (continuous) time", so that's my model.
Delightfully, this model handles continuous, deterministic concurrency with ease and grace.
It's been quite a challenge to implement this model correctly and efficiently, but that's another story.
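As a minimal illustration of the "behavior = function of (continuous) time" model, here is a sketch in Python (the names constant and lift2 are mine, not from any FRP library): a behavior is literally a function of time, and n-ary combination is point-wise application.

import math

def constant(v):
    # a static behavior: the same value at every time
    return lambda t: v

def lift2(f, b1, b2):
    # combine two behaviors point-wise, i.e. continuously over time
    return lambda t: f(b1(t), b2(t))

wiggle = lambda t: math.sin(math.pi * t)  # a continuously evolving value

pos = lift2(lambda x, y: x + y, constant(10.0), wiggle)
print(pos(0.0))  # 10.0
print(pos(0.5))  # 11.0 -- the same behavior, sampled at another time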
In pure functional programming, there are no side-effects. For many types of software (for example, anything with user interaction) side-effects are necessary at some level.
One way to get side-effect-like behavior while still retaining a functional style is to use functional reactive programming. This is the combination of functional programming and reactive programming. (The Wikipedia article you linked to is about the latter.)
The basic idea behind reactive programming is that there are certain datatypes that represent a value "over time". Computations that involve these changing-over-time values will themselves have values that change over time.
For example, you could represent the mouse coordinates as a pair of integer-over-time values. Let's say we had something like (this is pseudo-code):
x = <mouse-x>;
y = <mouse-y>;
At any moment in time, x and y would have the coordinates of the mouse. Unlike non-reactive programming, we only need to make this assignment once, and the x and y variables will stay "up to date" automatically. This is why reactive programming and functional programming work so well together: reactive programming removes the need to mutate variables while still letting you do a lot of what you could accomplish with variable mutations.
If we then do some computations based on this the resulting values will also be values that change over time. For example:
minX = x - 16;
minY = y - 16;
maxX = x + 16;
maxY = y + 16;
In this example, minX will always be 16 less than the x coordinate of the mouse pointer. With reactive-aware libraries you could then say something like:
rectangle(minX, minY, maxX, maxY)
And a 32x32 box will be drawn around the mouse pointer and will track it wherever it moves.
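Here is a rough sketch of how such a reactive-aware library might be wired up underneath, in Python (Cell and derive are invented names for illustration): you make the assignment once, and later changes propagate automatically.

class Cell:
    # a value "over time": dependents re-run whenever it changes
    def __init__(self, value):
        self.value = value
        self.observers = []
    def set(self, value):
        self.value = value
        for update in self.observers:
            update()

def derive(f, source):
    # a cell whose value is recomputed from its source on every change
    out = Cell(f(source.value))
    source.observers.append(lambda: out.set(f(source.value)))
    return out

x = Cell(0)
minX = derive(lambda v: v - 16, x)  # "assign" once...
x.set(100)                          # ...then just move the mouse
print(minX.value)                   # 84: minX stayed 16 less than x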
Here is a pretty good paper on functional reactive programming.
An easy way of reaching a first intuition about what it's like is to imagine your program is a spreadsheet and all of your variables are cells. If any of the cells in a spreadsheet change, any cells that refer to that cell change as well. It's just the same with FRP. Now imagine that some of the cells change on their own (or rather, are taken from the outside world): in a GUI situation, the position of the mouse would be a good example.
That necessarily misses out rather a lot. The metaphor breaks down pretty fast when you actually use an FRP system. For one, there are usually attempts to model discrete events as well (e.g. the mouse being clicked). I'm only putting this here to give you an idea what it's like.
To me it is about 2 different meanings of symbol =:
In math x = sin(t) means, that x is different name for sin(t). So writing x + y is the same thing as sin(t) + y. Functional reactive programming is like math in this respect: if you write x + y, it is computed with whatever the value of t is at the time it's used.
In C-like programming languages (imperative languages), x = sin(t) is an assignment: it means that x stores the value of sin(t) taken at the time of the assignment.
OK, from background knowledge and from reading the Wikipedia page to which you pointed, it appears that reactive programming is something like dataflow computing but with specific external "stimuli" triggering a set of nodes to fire and perform their computations.
This is pretty well suited to UI design, for example, in which touching a user interface control (say, the volume control on a music playing application) might need to update various display items and the actual volume of audio output. When you modify the volume (a slider, let's say) that would correspond to modifying the value associated with a node in a directed graph.
Various nodes having edges from that "volume value" node would automatically be triggered and any necessary computations and updates would naturally ripple through the application. The application "reacts" to the user stimulus. Functional reactive programming would just be the implementation of this idea in a functional language, or generally within a functional programming paradigm.
For more on "dataflow computing", search for those two words on Wikipedia or using your favorite search engine. The general idea is this: the program is a directed graph of nodes, each performing some simple computation. These nodes are connected to each other by graph links that provide the outputs of some nodes to the inputs of others.
When a node fires or performs its calculation, the nodes connected to its outputs have their corresponding inputs "triggered" or "marked". Any node having all inputs triggered/marked/available automatically fires. The graph might be implicit or explicit depending on exactly how reactive programming is implemented.
Nodes can be looked at as firing in parallel, but often they are executed serially or with limited parallelism (for example, there may be a few threads executing them). A famous example was the Manchester Dataflow Machine, which (IIRC) used a tagged data architecture to schedule execution of nodes in the graph through one or more execution units. Dataflow computing is fairly well suited to situations in which triggering computations asynchronously giving rise to cascades of computations works better than trying to have execution be governed by a clock (or clocks).
Reactive programming imports this "cascade of execution" idea and seems to think of the program in a dataflow-like fashion, but with the proviso that some of the nodes are hooked to the "outside world" and the cascades of execution are triggered when these sensory-like nodes change. Program execution would then look like something analogous to a complex reflex arc. The program may or may not settle into a basically sessile state between stimuli.
"non-reactive" programming would be programming with a very different view of the flow of execution and relationship to external inputs. It's likely to be somewhat subjective, since people will likely be tempted to say anything that responds to external inputs "reacts" to them. But looking at the spirit of the thing, a program that polls an event queue at a fixed interval and dispatches any events found to functions (or threads) is less reactive (because it only attends to user input at a fixed interval). Again, it's the spirit of the thing here: one can imagine putting a polling implementation with a fast polling interval into a system at a very low level and program in a reactive fashion on top of it.
After reading many pages about FRP, I finally came across this enlightening piece of writing about FRP, and it finally made me understand what FRP really is all about.
I quote below Heinrich Apfelmus (author of reactive-banana).
What is the essence of functional reactive programming?
A common answer would be that “FRP is all about describing a system in terms of time-varying functions instead of mutable state”, and that would certainly not be wrong. This is the semantic viewpoint. But in my opinion, the deeper, more satisfying answer is given by the following purely syntactic criterion:
The essence of functional reactive programming is to specify the dynamic behavior of a value completely at the time of declaration.
For instance, take the example of a counter: you have two buttons labelled “Up” and “Down” which can be used to increment or decrement the counter. Imperatively, you would first specify an initial value and then change it whenever a button is pressed; something like this:
counter := 0 -- initial value
on buttonUp = (counter := counter + 1) -- change it later
on buttonDown = (counter := counter - 1)
The point is that at the time of declaration, only the initial value for the counter is specified; the dynamic behavior of counter is implicit in the rest of the program text. In contrast, functional reactive programming specifies the whole dynamic behavior at the time of declaration, like this:
counter :: Behavior Int
counter = accumulate ($) 0
    (fmap (+1) eventUp
     `union` fmap (subtract 1) eventDown)
Whenever you want to understand the dynamics of counter, you only have to look at its definition. Everything that can happen to it will appear on the right-hand side. This is very much in contrast to the imperative approach, where subsequent declarations can change the dynamic behavior of previously declared values.
So, in my understanding an FRP program is a set of equations of the form

x_i(t_{j+1}) = f_i(x_1(t_j), ..., x_n(t_j), t_j)

where:
j is discrete: 1, 2, 3, 4, ...
f_i depends on t, so this incorporates the possibility to model external stimuli.
All state of the program is encapsulated in the variables x_i.
The FRP library takes care of progressing time, in other words, taking j to j+1.
I explain these equations in much more detail in this video.
EDIT:
About 2 years after the original answer, recently I came to the conclusion that FRP implementations have another important aspect. They need to (and usually do) solve an important practical problem: cache invalidation.
The equations for x_i-s describe a dependency graph. When some of the x_i changes at time j then not all the other x_i' values at j+1 need to be updated, so not all the dependencies need to be recalculated because some x_i' might be independent from x_i.
Furthermore, x_i-s that do change can be incrementally updated. For example, let's consider a map operation f = g.map(_+1) in Scala, where f and g are Lists of Ints; f and g each play the role of one of the x_i-s. Now if I prepend an element to g, then it would be wasteful to carry out the map operation for all the elements in g. Some FRP implementations (for example reflex-frp) aim to solve this problem. This problem is also known as incremental computing.
In other words, behaviours (the x_i-s) in FRP can be thought of as cached computations. It is the task of the FRP engine to efficiently invalidate and recompute these caches (the x_i-s) if some of their inputs change.
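A small Python sketch of that cache-invalidation view (a dirty-flag design of my own, far cruder than what reflex-frp does): each behavior caches its value, a change to a source only marks its dependents dirty, and recomputation happens on demand.

class Node:
    # a cached computation, invalidated when any dependency changes
    def __init__(self, compute, deps=()):
        self.compute, self.deps = compute, deps
        self.dependents, self.cache, self.dirty = [], None, True
        for d in deps:
            d.dependents.append(self)
    def invalidate(self):
        self.dirty = True
        for d in self.dependents:
            d.invalidate()
    def value(self):
        if self.dirty:
            self.cache = self.compute(*(d.value() for d in self.deps))
            self.dirty = False
        return self.cache

class Source(Node):
    def __init__(self, v):
        super().__init__(lambda: v)
    def set(self, v):
        self.compute = lambda: v
        self.invalidate()

g = Source([1, 2, 3])
f = Node(lambda xs: [x + 1 for x in xs], (g,))
print(f.value())     # [2, 3, 4] -- computed and cached
print(f.value())     # served from the cache, nothing recomputed
g.set([0, 1, 2, 3])
print(f.value())     # recomputed only because g was invalidated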
The paper Simply efficient functional reactivity by Conal Elliott (direct PDF, 233 KB) is a fairly good introduction. The corresponding library also works.
The paper is now superseded by another paper, Push-pull functional reactive programming (direct PDF, 286 KB).
Disclaimer: my answer is in the context of rx.js - a 'reactive programming' library for Javascript.
In functional programming, instead of iterating through each item of a collection, you apply higher order functions (HoFs) to the collection itself. So the idea behind FRP is that instead of processing each individual event, you create a stream of events (implemented with an observable) and apply HoFs to that instead. This way you can visualize the system as data pipelines connecting publishers to subscribers.
The major advantages of using an observable are:
i) it abstracts away state from your code, e.g., if you want the event handler to get fired only for every 'n'th event, or stop firing after the first 'n' events, or start firing only after the first 'n' events, you can just use the HoFs (filter, takeUntil, skip respectively) instead of setting, updating and checking counters.
ii) it improves code locality - if you have 5 different event handlers changing the state of a component, you can merge their observables and define a single event handler on the merged observable instead, effectively combining 5 event handlers into 1. This makes it very easy to reason about what events in your entire system can affect a component, since it's all present in a single handler.
An Observable is the dual of an Iterable.
An Iterable is a lazily consumed sequence - each item is pulled by the iterator whenever it wants to use it, and hence the enumeration is driven by the consumer.
An observable is a lazily produced sequence - each item is pushed to the observer whenever it is added to the sequence, and hence the enumeration is driven by the producer.
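A bare-bones Python sketch of that duality (this Observable is a toy, not rx.js): with an iterable the consumer pulls each item, while with an observable the producer pushes items at its subscribers.

# pull: the consumer drives the enumeration
items = iter([1, 2, 3])
print(next(items))  # the consumer asks for an item when it wants one

# push: the producer drives the enumeration
class Observable:
    def __init__(self):
        self.subscribers = []
    def subscribe(self, on_next):
        self.subscribers.append(on_next)
    def emit(self, value):
        for on_next in self.subscribers:
            on_next(value)

clicks = Observable()
clicks.subscribe(lambda v: print("got", v))
clicks.emit(1)  # the producer decides when the observer receives items
clicks.emit(2)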
Dude, this is a freaking brilliant idea! Why didn't I find out about this back in 1998? Anyway, here's my interpretation of the Fran tutorial. Suggestions are most welcome, I am thinking about starting a game engine based on this.
import pygame
from pygame.surface import Surface
from pygame.sprite import Sprite, Group
from pygame.locals import *
from time import time as epoch_delta
from math import sin, pi
from copy import copy

pygame.init()
screen = pygame.display.set_mode((600, 400))
pygame.display.set_caption('Functional Reactive System Demo')

class Time:
    # the "now" behavior: converts to the current epoch time on demand
    def __float__(self):
        return epoch_delta()
time = Time()

class Function:
    # a time-varying value: func(var + phase) * scale + offset
    def __init__(self, var, func, phase = 0., scale = 1., offset = 0.):
        self.var = var
        self.func = func
        self.phase = phase
        self.scale = scale
        self.offset = offset
    def copy(self):
        return copy(self)
    def __float__(self):
        return self.func(float(self.var) + float(self.phase)) * float(self.scale) + float(self.offset)
    def __int__(self):
        return int(float(self))
    def __add__(self, n):
        result = self.copy()
        result.offset += n
        return result
    def __mul__(self, n):
        result = self.copy()
        result.scale *= n
        return result
    def __invert__(self):
        # ~f flips the sign of the scale
        result = self.copy()
        result.scale *= -1.
        return result
    def __abs__(self):
        return Function(self, abs)

def FuncTime(func, phase = 0., scale = 1., offset = 0.):
    global time
    return Function(time, func, phase, scale, offset)

def SinTime(phase = 0., scale = 1., offset = 0.):
    return FuncTime(sin, phase, scale, offset)
sin_time = SinTime()

def CosTime(phase = 0., scale = 1., offset = 0.):
    phase += pi / 2.
    return SinTime(phase, scale, offset)
cos_time = CosTime()

class Circle:
    def __init__(self, x, y, radius):
        self.x = x
        self.y = y
        self.radius = radius
    @property
    def size(self):
        return [self.radius * 2] * 2

# x and y are behaviors, so the circle's position tracks the clock
circle = Circle(
    x = cos_time * 200 + 250,
    y = abs(sin_time) * 200 + 50,
    radius = 50)

class CircleView(Sprite):
    def __init__(self, model, color = (255, 0, 0)):
        Sprite.__init__(self)
        self.color = color
        self.model = model
        self.image = Surface([model.radius * 2] * 2).convert_alpha()
        self.rect = self.image.get_rect()
        pygame.draw.ellipse(self.image, self.color, self.rect)
    def update(self):
        # sampling the model's behaviors here pulls the current time
        self.rect[:] = int(self.model.x), int(self.model.y), self.model.radius * 2, self.model.radius * 2

circle_view = CircleView(circle)
sprites = Group(circle_view)

running = True
while running:
    for event in pygame.event.get():
        if event.type == QUIT:
            running = False
        if event.type == KEYDOWN and event.key == K_ESCAPE:
            running = False
    screen.fill((0, 0, 0))
    sprites.update()
    sprites.draw(screen)
    pygame.display.flip()
pygame.quit()
In short: If every component can be treated like a number, the whole system can be treated like a math equation, right?
Paul Hudak's book, The Haskell School of Expression, is not only a fine introduction to Haskell, but it also spends a fair amount of time on FRP. If you're a beginner with FRP, I highly recommend it to give you a sense of how FRP works.
There is also what looks like a new rewrite of this book (released 2011, updated 2014), The Haskell School of Music.
According to the previous answers, it seems that mathematically, we simply think in a higher order. Instead of thinking a value x having type X, we think of a function x: T → X, where T is the type of time, be it the natural numbers, the integers or the continuum. Now when we write y := x + 1 in the programming language, we actually mean the equation y(t) = x(t) + 1.
Acts like a spreadsheet, as noted. Usually based on an event-driven framework.
As with all "paradigms", its newness is debatable.
From my experience of distributed flow networks of actors, it can easily fall prey to a general problem of state consistency across the network of nodes i.e. you end up with a lot of oscillation and trapping in strange loops.
This is hard to avoid as some semantics imply referential loops or broadcasting, and can be quite chaotic as the network of actors converges (or not) on some unpredictable state.
Similarly, some states may not be reached, despite having well-defined edges, because the global state steers away from the solution. 2+2 may or may not get to be 4 depending on when the 2's became 2, and whether they stayed that way. Spreadsheets have synchronous clocks and loop detection. Distributed actors generally don't.
All good fun :).
I found this nice video on the Clojure subreddit about FRP. It is pretty easy to understand even if you don't know Clojure.
Here's the video: http://www.youtube.com/watch?v=nket0K1RXU4
Here's the source the video refers to in the 2nd half: https://github.com/Cicayda/yolk-examples/blob/master/src/yolk_examples/client/autocomplete.cljs
This article by Andre Staltz is the best and clearest explanation I've seen so far.
Some quotes from the article:
Reactive programming is programming with asynchronous data streams.
On top of that, you are given an amazing toolbox of functions to combine, create and filter any of those streams.
(The article also features some fantastic diagrams of these streams.)
It is about mathematical data transformations over time (or ignoring time).
In code this means functional purity and declarative programming.
State bugs are a huge problem in the standard imperative paradigm. Various bits of code may change some shared state at different "times" in the program's execution. This is hard to deal with.
In FRP you describe (like in declarative programming) how data transforms from one state to another and what triggers it. This allows you to ignore time because your function is simply reacting to its inputs and using their current values to create a new one. This means that the state is contained in the graph (or tree) of transformation nodes and is functionally pure.
This massively reduces complexity and debugging time.
Think of the difference between A=B+C in math and A=B+C in a program.
In math you are describing a relationship that will never change. In a program, it says that "right now" A is B+C. But the next command might be B++, in which case A is no longer equal to B+C. In math or declarative programming, A will always be equal to B+C, no matter at what point in time you ask.
So by removing the complexities of shared state and of values changing over time, your program becomes much easier to reason about.
An EventStream is an EventStream plus some transformation function.
A Behaviour is an EventStream plus some value in memory.
When the event fires, the value is updated by running the transformation function. The value that this produces is stored in the behaviour's memory.
Behaviours can be composed to produce new behaviours that are a transformation of N other behaviours. This composed value will recalculate as the input events (behaviours) fire.
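A tiny Python sketch of those two definitions (the class names are invented for illustration): the behaviour holds a value in memory and updates it by running its transformation function whenever the event fires.

class EventStream:
    def __init__(self):
        self.listeners = []
    def fire(self, value):
        for listener in self.listeners:
            listener(value)

class Behaviour:
    # an EventStream plus a value in memory, updated by a transformation
    def __init__(self, stream, initial, transform):
        self.value = initial
        stream.listeners.append(
            lambda v: setattr(self, "value", transform(self.value, v)))

ups = EventStream()
counter = Behaviour(ups, 0, lambda acc, _event: acc + 1)
ups.fire(None)
ups.fire(None)
print(counter.value)  # 2 -- the whole dynamic behavior was declared once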
"Since observers are stateless, we often need several of them to simulate a state machine as in the drag example. We have to save the state where it is accessible to all involved observers such as in the variable path above."
Quote from - Deprecating The Observer Pattern
http://infoscience.epfl.ch/record/148043/files/DeprecatingObserversTR2010.pdf
A short and clear explanation of Reactive Programming appears on Cyclejs - Reactive Programming; it uses simple and visual samples.
A [module/Component/object] being reactive means it is fully responsible for managing its own state by reacting to external events.
What is the benefit of this approach? It is Inversion of Control, mainly because the [module/Component/object] is responsible for itself, improving encapsulation by using private methods over public ones.
It is a good starting point, not a complete source of knowledge. From there you can jump to more complex and deep papers.
Check out Rx, Reactive Extensions for .NET. They point out that with IEnumerable you are basically 'pulling' from a stream. Linq queries over IQueryable/IEnumerable are set operations that 'suck' the results out of a set. But with the same operators over IObservable you can write Linq queries that 'react'.
For example, you could write a Linq query like
(from m in MyObservableSetOfMouseMovements
 where m.X < 100 && m.Y < 100
 select new Point(m.X, m.Y))
and with the Rx extensions, that's it: You have UI code that reacts to the incoming stream of mouse movements and draws whenever you're in the 100,100 box...
FRP is a combination of functional programming (a programming paradigm built upon the idea that everything is a function) and the reactive programming paradigm (built upon the idea that everything is a stream; the observer and observable philosophy). It is supposed to be the best of both worlds.
Check out Andre Staltz post on reactive programming to start with.

Real-world examples of recursion [closed]

What are real-world problems where a recursive approach is the natural solution besides depth-first search (DFS)?
(I don't consider Tower of Hanoi, Fibonacci number, or factorial real-world problems. They are a bit contrived in my mind.)
How about anything involving a directory structure in the file system. Recursively finding files, deleting files, creating directories, etc.
Here is a Java implementation that recursively prints out the content of a directory and its sub-directories.
import java.io.File;

public class DirectoryContentAnalyserOne {

    private static StringBuilder indentation = new StringBuilder();

    public static void main(String[] args) {
        // Here you pass the path to the directory to be scanned
        getDirectoryContent("C:\\DirOne\\DirTwo\\AndSoOn");
    }

    private static void getDirectoryContent(String filePath) {
        File currentDirOrFile = new File(filePath);

        if (!currentDirOrFile.exists()) {
            return;
        } else if (currentDirOrFile.isFile()) {
            System.out.println(indentation + currentDirOrFile.getName());
            return;
        } else {
            System.out.println("\n" + indentation + "|_" + currentDirOrFile.getName());
            indentation.append("   ");
            for (String currentFileOrDirName : currentDirOrFile.list()) {
                // recurse into every entry of this directory
                getDirectoryContent(currentDirOrFile + "\\" + currentFileOrDirName);
            }
            if (indentation.length() >= 3) {
                indentation.delete(indentation.length() - 3, indentation.length());
            }
        }
    }
}
There are lots of mathy examples here, but you wanted a real world example, so with a bit of thinking, this is possibly the best I can offer:
You find a person who has contracted a given contagious infection, which is non-fatal and fixes itself quickly (type A), except for one in five people (we'll call these type B) who become permanently infected with it, show no symptoms, and merely act as spreaders.
This creates quite annoying waves of havoc whenever type B infects a multitude of type As.
Your task is to track down all the type Bs and immunise them to stop the backbone of the disease. Unfortunately though, you can't administer a nationwide cure to all, because the people who are type A are also deadly allergic to the cure that works for type B.
The way you would do this would be social discovery: given an infected person (type A), choose all their contacts in the last week, marking each contact on a heap. When you test that a person is infected, add them to the "follow up" queue. When a person is a type B, add them to the "follow up" queue at the head (because you want to stop this fast).
After processing a given person, select the person from the front of the queue and apply immunisation if needed. Get all their contacts previously unvisited, and then test to see if they're infected.
Repeat until the queue of infected people becomes empty, and then wait for another outbreak.
(OK, this is a bit iterative, but it's an iterative way of solving a recursive problem, in this case breadth-first traversal of a population base trying to discover likely paths to problems. Besides, iterative solutions are often faster and more effective, and I compulsively remove recursion everywhere so much it's become instinctive... dammit!)
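A compact Python sketch of that queue-driven discovery (the data shapes are invented for the example): type Bs jump to the head of the queue, everyone else goes to the back.

from collections import deque

def find_type_bs(patient_zero, contacts, infected, type_b):
    # breadth-first traversal of the contact graph
    queue, seen, found = deque([patient_zero]), {patient_zero}, []
    while queue:
        person = queue.popleft()
        if person in type_b:
            found.append(person)  # immunise this spreader
        for contact in contacts.get(person, []):
            if contact not in seen and contact in infected:
                seen.add(contact)
                if contact in type_b:
                    queue.appendleft(contact)  # follow up at the head
                else:
                    queue.append(contact)
    return found

contacts = {"ann": ["bob", "cid"], "bob": ["dee"], "cid": [], "dee": []}
infected = {"ann", "bob", "cid", "dee"}
print(find_type_bs("ann", contacts, infected, type_b={"bob"}))  # ['bob']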
Quicksort, merge sort, and most other N-log N sorts.
Matt Dillard's example is good. More generally, any walking of a tree can generally be handled by recursion very easily. For instance, compiling parse trees, walking over XML or HTML, etc.
Recursion is often used in implementations of the Backtracking algorithm. For a "real-world" application of this, how about a Sudoku solver?
Recursion is appropriate whenever a problem can be solved by dividing it into sub-problems, that can use the same algorithm for solving them. Algorithms on trees and sorted lists are a natural fit. Many problems in computational geometry (and 3D games) can be solved recursively using binary space partitioning (BSP) trees, fat subdivisions, or other ways of dividing the world into sub-parts.
Recursion is also appropriate when you are trying to guarantee the correctness of an algorithm. Given a function that takes immutable inputs and returns a result that is a combination of recursive and non-recursive calls on the inputs, it's usually easy to prove the function is correct (or not) using mathematical induction. It's often intractable to do this with an iterative function or with inputs that may mutate. This can be useful when dealing with financial calculations and other applications where correctness is very important.
Surely many compilers out there use recursion heavily. Computer languages are inherently recursive themselves (i.e., you can embed 'if' statements inside other 'if' statements, etc.).
Disabling/setting read-only for all children controls in a container control. I needed to do this because some of the children controls were containers themselves.
public static void SetReadOnly(Control ctrl, bool readOnly)
{
    // set the control read only
    SetControlReadOnly(ctrl, readOnly);

    if (ctrl.Controls != null && ctrl.Controls.Count > 0)
    {
        // recursively loop through all child controls
        foreach (Control c in ctrl.Controls)
            SetReadOnly(c, readOnly);
    }
}
People often sort stacks of documents using a recursive method. For example, imagine you are sorting 100 documents with names on them. First place documents into piles by the first letter, then sort each pile.
Looking up words in the dictionary is often performed by a binary-search-like technique, which is recursive.
In organizations, bosses often give commands to department heads, who in turn give commands to managers, and so on.
Famous Eval/Apply cycle from SICP
Here is the definition of eval:
(define (eval exp env)
  (cond ((self-evaluating? exp) exp)
        ((variable? exp) (lookup-variable-value exp env))
        ((quoted? exp) (text-of-quotation exp))
        ((assignment? exp) (eval-assignment exp env))
        ((definition? exp) (eval-definition exp env))
        ((if? exp) (eval-if exp env))
        ((lambda? exp)
         (make-procedure (lambda-parameters exp)
                         (lambda-body exp)
                         env))
        ((begin? exp)
         (eval-sequence (begin-actions exp) env))
        ((cond? exp) (eval (cond->if exp) env))
        ((application? exp)
         (apply (eval (operator exp) env)
                (list-of-values (operands exp) env)))
        (else
         (error "Unknown expression type - EVAL" exp))))
Here is the definition of apply:
(define (apply procedure arguments)
  (cond ((primitive-procedure? procedure)
         (apply-primitive-procedure procedure arguments))
        ((compound-procedure? procedure)
         (eval-sequence
          (procedure-body procedure)
          (extend-environment
           (procedure-parameters procedure)
           arguments
           (procedure-environment procedure))))
        (else
         (error
          "Unknown procedure type - APPLY" procedure))))
Here is the definition of eval-sequence:
(define (eval-sequence exps env)
  (cond ((last-exp? exps) (eval (first-exp exps) env))
        (else (eval (first-exp exps) env)
              (eval-sequence (rest-exps exps) env))))
eval -> apply -> eval-sequence -> eval
Recursion is used in things like BSP trees for collision detection in game development (and other similar areas).
Real world requirement I got recently:
Requirement A: Implement this feature after thoroughly understanding Requirement A.
Recursion is applied to problems (situations) where you can break them up (reduce them) into smaller parts, and each part looks similar to the original problem.
Good examples of things that contain smaller parts similar to themselves:
tree structure (a branch is like a tree)
lists (part of a list is still a list)
containers (Russian dolls)
sequences (part of a sequence looks like the next)
groups of objects (a subgroup is a still a group of objects)
Recursion is a technique of breaking the problem down into smaller and smaller pieces, until one of those pieces becomes small enough to be a piece of cake. Of course, after you break them up, you then have to "stitch" the results back together in the right order to form a total solution of your original problem.
Some recursive sorting algorithms, tree-walking algorithms, map/reduce algorithms, divide-and-conquer are all examples of this technique.
In computer programming, most stack-based call-return languages already have the capabilities built in for recursion:
break the problem down into smaller pieces ==> call itself on a smaller subset of the original data,
keep track of how the pieces are divided ==> the call stack,
stitch the results back together ==> stack-based return values
Feedback loops in a hierarchical organization.
Top boss tells top executives to collect feedback from everyone in the company.
Each executive gathers his/her direct reports and tells them to gather feedback from their direct reports.
And on down the line.
People with no direct reports -- the leaf nodes in the tree -- give their feedback.
The feedback travels back up the tree with each manager adding his/her own feedback.
Eventually all the feedback makes it back up to the top boss.
This is the natural solution because the recursive method allows filtering at each level -- the collating of duplicates and the removal of offensive feedback. The top boss could send a global email and have each employee report feedback directly back to him/her, but there are the "you can't handle the truth" and the "you're fired" problems, so recursion works best here.
Parsers and compilers may be written in a recursive-descent method. Not the best way to do it, as tools like lex/yacc generate faster and more efficient parsers, but conceptually simple and easy to implement, so they remain common.
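For flavor, here is a minimal recursive-descent arithmetic parser in Python (a sketch only, nothing a production compiler would ship): each grammar rule becomes a function, and nesting in the input turns into recursion between the rules.

import re

def parse(text):
    tokens = re.findall(r"\d+|[()+*]", text)
    pos = [0]

    def peek():
        return tokens[pos[0]] if pos[0] < len(tokens) else None

    def eat():
        pos[0] += 1
        return tokens[pos[0] - 1]

    def expr():    # expr := term ('+' term)*
        value = term()
        while peek() == "+":
            eat()
            value += term()
        return value

    def term():    # term := factor ('*' factor)*
        value = factor()
        while peek() == "*":
            eat()
            value *= factor()
        return value

    def factor():  # factor := NUMBER | '(' expr ')'
        if peek() == "(":
            eat()
            value = expr()  # parentheses in the input drive the recursion
            eat()           # consume ')'
            return value
        return int(eat())

    return expr()

print(parse("1+2*(3+4)"))  # 15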
I have a system that uses pure tail recursion in a few places to simulate a state machine.
Some great examples of recursion are found in functional programming languages. In functional programming languages (Erlang, Haskell, ML/OCaml/F#, etc.), it's very common to have any list processing use recursion.
When dealing with lists in typical imperative OOP-style languages, it's very common to see lists implemented as linked lists ([item1 -> item2 -> item3 -> item4]). However, in some functional programming languages, you find that lists themselves are implemented recursively, where the "head" of the list points to the first item in the list, and the "tail" points to a list containing the rest of the items ([item1 -> [item2 -> [item3 -> [item4 -> []]]]]). It's pretty creative in my opinion.
This handling of lists, when combined with pattern matching, is VERY powerful. Let's say I want to sum a list of numbers:
let rec Sum numbers =
    match numbers with
    | [] -> 0
    | head :: tail -> head + Sum tail
This essentially says "if we were called with an empty list, return 0" (allowing us to break the recursion), else return the value of head + the value of Sum called with the remaining items (hence, our recursion).
For example, I might have a list of URLs. I then break apart all the URLs each URL links to, and then I reduce the total number of links to/from all URLs to generate "values" for a page (an approach that Google takes with PageRank and that you can find defined in the original MapReduce paper). You can do this to generate word counts in a document, too. And many, many, many other things as well.
You can extend this functional pattern to any type of MapReduce code where you take a list of something, transform it, and return something else (whether another list, or some zip command on the list).
XML, or traversing anything that is a tree. Although, to be honest, I pretty much never use recursion in my job.
A "real-world" problem solved by recursion would be nesting dolls. Your function is OpenDoll().
Given a stack of them, you would recursively open the dolls, calling OpenDoll() if you will, until you've reached the innermost doll.
Parsing an XML file.
Efficient search in multi-dimensional spaces. E. g. quad-trees in 2D, oct-trees in 3D, kd-trees, etc.
Hierarchical clustering.
Come to think of it, traversing any hierarchical structure naturally lends itself to recursion.
Template metaprogramming in C++, where there are no loops and recursion is the only way.
Suppose you are building a CMS for a website, where your pages are in a tree structure, with say the root being the home-page.
Suppose also your {user|client|customer|boss} requests that you place a breadcrumb trail on every page to show where you are in the tree.
For any given page n, you may want to walk up to the parent of n, and its parent, and so on, recursively building a list of nodes back up to the root of the page tree.
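A rough sketch of that walk-up in Python (the page fields are illustrative):

def breadcrumb(page_id, pages):
    # recursively collect ancestors, root first
    page = pages[page_id]
    if page["parent"] is None:
        return [page["title"]]
    return breadcrumb(page["parent"], pages) + [page["title"]]

pages = {
    1: {"parent": None, "title": "Home"},
    2: {"parent": 1, "title": "Products"},
    3: {"parent": 2, "title": "Widgets"},
}
print(" > ".join(breadcrumb(3, pages)))  # Home > Products > Widgets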
Of course, you're hitting the db several times per page in that example, so you may want to use some SQL aliasing where you look up page-table as a, and page-table again as b, and join a.id with b.parent so you make the database do the recursive joins. It's been a while, so my syntax is probably not helpful.
Then again, you may just want to only calculate this once and store it with the page record, only updating it if you move the page. That'd probably be more efficient.
Anyway, that's my $.02
You have an organization tree that is N levels deep. Several of the nodes are checked, and you want to expand out to only those nodes that have been checked.
This is something that I actually coded.
It's nice and easy with recursion.
In my job we have a system with a generic data structure that can be described as a tree. That means that recursion is a very effective technique to work with the data.
Solving it without recursion would require a lot of unnecessary code. The problem with recursion is that it is not easy to follow what happens. You really have to concentrate when following the flow of execution. But when it works the code is elegant and effective.
Calculations for finance/physics, such as compound averages.
Parsing a tree of controls in Windows Forms or WebForms (.NET Windows Forms / ASP.NET).
The best example I know is quicksort; it is a lot simpler with recursion. Take a look at:
shop.oreilly.com/product/9780596510046.do
www.amazon.com/Beautiful-Code-Leading-Programmers-Practice/dp/0596510047
(Click on the first subtitle under the chapter 3: "The most beautiful code I ever wrote").
Phone and cable companies maintain a model of their wiring topology, which in effect is a large network or graph. Recursion is one way to traverse this model when you want to find all parent or all child elements.
Since recursion is expensive from a processing and memory perspective, this step is commonly only performed when the topology is changed and the result is stored in a modified pre-ordered list format.
Inductive reasoning, the process of concept-formation, is recursive in nature. Your brain does it all the time, in the real world.
Ditto the comment about compilers. The abstract syntax tree nodes naturally lend themselves to recursion. All recursive data structures (linked lists, trees, graphs, etc.) are also more easily handled with recursion. I do think that most of us don't get to use recursion a lot once we are out of school because of the types of real-world problems, but it's good to be aware of it as an option.
