Replacing functions with Table Lookups - functional-programming

I've been watching this MSDN video with Brian Beckman and I'd like to better understand something he says:
Every imperative programmer goes through this phase of learning that
functions can be replaced with table lookups
Now, I'm a C# programmer who never went to university, so perhaps somewhere along the line I missed out on something everyone else learned to understand.
What does Brian mean by:
functions can be replaced with table lookups
Are there practical examples of this being done and does it apply to all functions? He gives the example of the sin function, which I can make sense of, but how do I make sense of this in more general terms?

Brian just showed that functions are data too. A function in general is just a mapping of one set to another: y = f(x) maps the set {x} to the set {y}, i.e. f: X -> Y. A table is a mapping as well: [x1, x2, ..., xn] -> [y1, y2, ..., yn].
If a function operates on a finite set (which is the case in programming), then it can be replaced with a table that represents that mapping. As Brian mentioned, every imperative programmer goes through the phase of understanding that functions can be replaced with table lookups, purely for performance reasons.
But that doesn't mean all functions can easily (or should) be replaced with tables; it only means that you could, in theory, do it for every function. So the conclusion is that functions are data, because tables are (in the context of programming, of course).
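To make that concrete in C# (the language of the question), here is a minimal sketch; the names SlowSquare and squareTable are made up for illustration. The function and the precomputed table represent exactly the same finite mapping:
using System;
using System.Collections.Generic;

class TableLookupDemo
{
    // An ordinary function: a mapping from int to int.
    static int SlowSquare(int n) => n * n;

    static void Main()
    {
        // The same mapping precomputed as data: argument -> result.
        var squareTable = new Dictionary<int, int>();
        for (int n = 0; n <= 10; n++)
            squareTable[n] = SlowSquare(n);

        Console.WriteLine(SlowSquare(7));  // 49, computed
        Console.WriteLine(squareTable[7]); // 49, looked up
    }
}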

There is a lovely trick in Mathematica that creates a table as a side-effect of evaluating function-calls-as-rewrite-rules. Consider the classic slow-fibonacci
fib[1] = 1
fib[2] = 1
fib[n_] := fib[n-1] + fib[n-2]
The first two lines create table entries for the inputs 1 and 2. This is exactly the same as saying
fibTable = {};
fibTable[1] = 1;
fibTable[2] = 1;
in JavaScript. The third line of Mathematica says "please install a rewrite rule that will replace any occurrence of fib[n_], after substituting the pattern variable n_ with the actual argument of the occurrence, with fib[n-1] + fib[n-2]." The rewriter will iterate this procedure, and eventually produce the value of fib[n] after an exponential number of rewrites. This is just like the recursive function-call form that we get in JavaScript with
function fib(n) {
var result = fibTable[n] || ( fib(n-1) + fib(n-2) );
return result;
}
Notice it checks the table first for the two values we have explicitly stored before making the recursive calls. The Mathematica evaluator does this check automatically, because the order of presentation of the rules is important -- Mathematica checks the more specific rules first and the more general rules later. That's why Mathematica has two assignment forms, = and :=: the former is for specific rules whose right-hand sides can be evaluated at the time the rule is defined; the latter is for general rules whose right-hand sides must be evaluated when the rule is applied.
Now, in Mathematica, if we say
fib[4]
it gets rewritten to
fib[3] + fib[2]
then to
fib[2] + fib[1] + 1
then to
1 + 1 + 1
and finally to 3, which does not change on the next rewrite. You can imagine that if we say fib[35], we will generate enormous expressions, fill up memory, and melt the CPU. But the trick is to replace the final rewrite rule with the following:
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
This says "please replace every occurrence of fib[n_] with an expression that will install a new specific rule for the value of fib[n] and also produce the value." This one runs much faster because it expands the rule-base -- the table of values! -- at run time.
We can do likewise in JavaScript
function fib(n) {
var result = fibTable[n] || ( fib(n-1) + fib(n-2) );
fibTable[n] = result;
return result;
}
This runs MUCH faster than the prior definition of fib.
This is called "automemoization" [sic -- not "memorization" but "memoization" as in creating a memo for yourself].
Of course, in the real world, you must manage the sizes of the tables that get created. To inspect the tables in Mathematica, do
DownValues[fib]
To inspect them in JavaScript, do just
fibTable
in a REPL such as that supported by Node.js.
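Since the question came from a C# perspective, here is the same automemoization trick sketched in C#; the Fib class and fibTable dictionary are hypothetical names playing the role of the JavaScript object above:
using System;
using System.Collections.Generic;

static class Fib
{
    // Seed the table with the two specific "rules", like fib[1] = 1; fib[2] = 1.
    static readonly Dictionary<int, long> fibTable =
        new Dictionary<int, long> { { 1, 1 }, { 2, 1 } };

    public static long Of(int n)
    {
        long result;
        if (fibTable.TryGetValue(n, out result))
            return result;                // table lookup first
        result = Of(n - 1) + Of(n - 2);   // general rule
        fibTable[n] = result;             // install a new specific "rule" at run time
        return result;
    }
}
// Fib.Of(35) is now cheap; the table grows as a side effect of calling it.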

In the context of functional programming, there is the concept of referential transparency. A function that is referentially transparent can be replaced with its value for any given argument (or set of arguments), without changing the behaviour of the program.
Referential Transparency
For example, consider a function F that takes 1 argument, n. F is referentially transparent, so F(n) can be replaced with the value of F evaluated at n. It makes no difference to the program.
In C#, this would look like:
public class Square
{
public static int apply(int n)
{
return n * n;
}
public static void Main()
{
//Should print 4
Console.WriteLine(Square.apply(2));
}
}
(I'm not very familiar with C#, coming from a Java background, so you'll have to forgive me if this example isn't quite syntactically correct).
It's obvious here that the function apply cannot have any other value than 4 when called with an argument of 2, since it's just returning the square of its argument. The value of the function only depends on its argument, n; in other words, referential transparency.
I ask you, then, what the difference is between Console.WriteLine(Square.apply(2)) and Console.WriteLine(4). The answer is, there's no difference at all, for all intents and purposes. We could go through the entire program, replacing all instances of Square.apply(n) with the value returned by Square.apply(n), and the results would be exactly the same.
So what did Brian Beckman mean with his statement about replacing function calls with a table lookup? He was referring to this property of referentially transparent functions. If Square.apply(2) can be replaced with 4 with no impact on program behaviour, then why not just cache the value when the first call is made and put it in a table indexed by the function's arguments? A lookup table for values of Square.apply(n) would look somewhat like this:
n: 0 1 2 3 4 5 ...
Square.apply(n): 0 1 4 9 16 25 ...
And for any call to Square.apply(n), instead of calling the function, we can simply find the cached value for n in the table, and replace the function call with this value. It's fairly obvious that this will most likely bring about a large speed increase in the program.
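A minimal sketch of that caching idea in C#, extending the Square class above with a hypothetical Dictionary-backed table:
using System;
using System.Collections.Generic;

public class Square
{
    // Results computed so far, indexed by the argument.
    private static readonly Dictionary<int, int> table = new Dictionary<int, int>();

    public static int apply(int n)
    {
        int result;
        if (table.TryGetValue(n, out result))
            return result;    // replace the call with a lookup
        result = n * n;       // referentially transparent work
        table[n] = result;    // remember it for next time
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Square.apply(2)); // computes and caches 4
        Console.WriteLine(Square.apply(2)); // served straight from the table
    }
}
Because apply is referentially transparent, the cached value can never disagree with a fresh computation.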

Related

Perl 6 calculate the average of an int array at once using reduce

I'm trying to calculate the average of an integer array using the reduce function in one step. I can't do this:
say (reduce {($^a + $^b)}, <1 2 3>) / <1 2 3>.elems;
because it calculates the average in 2 separate pieces.
I need to do it like:
say reduce {($^a + $^b) / .elems}, <1 2 3>;
but it doesn't work of course.
How to do it in one step? (Using map or some other function is welcomed.)
TL;DR This answer starts with an idiomatic way to write equivalent code before discussing P6 flavored "tacit" programming and increasing brevity. I've also added "bonus" footnotes about the hyperoperation Håkon++ used in their first comment on your question.5
Perhaps not what you want, but an initial idiomatic solution
We'll start with a simple solution.1
P6 has built in routines2 that do what you're asking. Here's a way to do it using built in subs:
say { sum($_) / elems($_) }(<1 2 3>); # 2
And here it is using corresponding3 methods:
say { .sum / .elems }(<1 2 3>); # 2
What about "functional programming"?
First, let's replace .sum with an explicit reduction:
.reduce(&[+]) / .elems
When & is used at the start of an expression in P6 you know the expression refers to a Callable as a first class citizen.
A longhand way to refer to the infix + operator as a function value is &infix:<+>. The shorthand way is &[+].
As you clearly know, the reduce routine takes a binary operation as an argument and applies it to a list of values. In method form (invocant.reduce) the "invocant" is the list.
The above code calls two methods -- .reduce and .elems -- that have no explicit invocant. This is a form of "tacit" programming; methods written this way implicitly (or "tacitly") use $_ (aka "the topic" or simply "it") as their invocant.
Topicalizing (explicitly establishing what "it" is)
given binds a single value to $_ (aka "it") for a single statement or block.
(That's all given does. Many other keywords also topicalize but do something else too. For example, for binds a series of values to $_, not just one.)
Thus you could write:
say .reduce(&[+]) / .elems given <1 2 3>; # 2
Or:
$_ = <1 2 3>;
say .reduce(&[+]) / .elems; # 2
But given that your focus is FP, there's another way that you should know.
Blocks of code and "it"
First, wrap the code in a block:
{ .reduce(&[+]) / .elems }
The above is a Block, and thus a lambda. Lambdas without a signature get a default signature that accepts one optional argument.
Now we could again use given, for example:
say do { .reduce(&[+]) / .elems } given <1 2 3>; # 2
But we can also just use ordinary function call syntax:
say { .reduce(&[+]) / .elems }(<1 2 3>)
Because a postfix (...) calls the Callable on its left, and because in the above case one argument is passed in the parens to a block that expects one argument, the net result is the same as the do4 and the given in the prior line of code.
Brevity with built ins
Here's another way to write it:
<1 2 3>.&{.sum/.elems}.say; #2
This calls a block as if it were a method. Imo that's still eminently readable, especially if you know P6 basics.
Or you can start to get silly:
<1 2 3>.&{.sum/$_}.say; #2
This is still readable if you know P6. The / is a numeric (division) operator. Numeric operators coerce their operands to be numeric. In the above $_ is bound to <1 2 3> which is a list. And in Perls, a collection in numeric context is its number of elements.
Changing P6 to suit you
So far I've stuck with standard P6.
You can of course write subs or methods and name them using any Unicode letters. If you want single letter aliases for sum and elems then go ahead:
my (&s, &e) = &sum, &elems;
But you can also extend or change the language as you see fit. For example, you can create user defined operators:
#| LHS ⊛ RHS.
#| LHS is an arbitrary list of input values.
#| RHS is a list of reducer function, then functions to be reduced.
sub infix:<⊛> (@lhs, *@rhs (&reducer, *@fns where *.all ~~ Callable)) {
reduce &reducer, @fns».(@lhs)
}
say <1 2 3> ⊛ (&[/], &sum, &elems); # 2
I won't bother to explain this for now. (Feel free to ask questions in the comments.) My point is simply to highlight that you can introduce arbitrary (prefix, infix, circumfix, etc.) operators.
And if custom operators aren't enough you can change any of the rest of the syntax. cf "braid".
Footnotes
1 This is how I would normally write code to do the computation asked for in the question. #timotimo++'s comment nudged me to alter my presentation to start with that, and only then shift gears to focus on a more FPish solution.
2 In P6 all built in functions are referred to by the generic term "routine" and are instances of a sub-class of Routine -- typically a Sub or Method.
Not all built-in subroutines have correspondingly named method routines, and vice versa. Also, sometimes there are correspondingly named routines but they don't work exactly the same way (with the most common difference being whether or not the first argument to the sub is the same as the "invocant" in the method form). In addition, you can call a subroutine as if it were a method using the syntax .&foo for a named Sub or .&{ ... } for an anonymous Block, or call a method foo in a way that looks rather like a subroutine call using the syntax foo invocant: or foo invocant: arg2, arg3 if it has arguments beyond the invocant.
4 If a block is used where it should obviously be invoked then it is. If it's not invoked then you can use an explicit do statement prefix to invoke it.
5 Håkon's first comment on your question used "hyperoperation". With just one easy to recognize and remember "metaop" (for unary operations) or a pair of them (for binary operations), hyperoperations distribute an operation to all the "leaves"6 of a data structure (for an unary) or create a new one based on pairing up the "leaves" of a pair of data structures (for binary operations). NB. Hyperoperations are done in parallel7.
6 What is a "leaf" for a hyperoperation is determined by a combination of the operation being applied (see the is nodal trait) and whether a particular element is Iterable.
7 Hyperoperation is applied in parallel, at least semantically. Hyperoperation assumes8 that the operations on the "leaves" have no mutually interfering side-effects -- that is to say, that any side effect when applying the operation to one "leaf" can safely be ignored in respect to applying the operation to any another "leaf".
8 By using a hyperoperation the developer is declaring that the assumption of no meaningful side-effects is correct. The compiler will act on the basis it is, but will not check that it is true. In the safety sense it's like a loop with a condition. The compiler will follow the dev's instructions, even if the result is an infinite loop.
Here is an example using given and the reduction meta operator:
given <1 2 3> { say ([+] $_)/$_.elems } ;

Parameters of function in Julia

Does anyone know the reasons why Julia chose a design of functions where the parameters given as inputs cannot be modified? If we want that behaviour anyway, it forces us through a very artificial process of wrapping the data in a clumsy single-element array.
Ada, which had the same kind of limitation, abandoned it in its 2012 revision to the great satisfaction of its users. A small keyword (like out in Ada) could very well indicate that modifications to a parameter should be kept after the call.
From my experience in Julia it is useful to understand the difference between a value and a binding.
Values
Each value in Julia has a concrete type and a location in memory. A value can be mutable or immutable. In particular, when you define your own composite type you can decide whether objects of this type should be mutable (mutable struct) or immutable (struct).
Julia's built-in types follow the same split: some are mutable (e.g. arrays) and others are immutable (e.g. numbers, strings). Of course there are design trade-offs between them. From my perspective, two major benefits of immutable values are:
if a compiler works with immutable values it can perform many optimizations to speed up code;
a user can be sure that passing an immutable value to a function will not change it, and such encapsulation can simplify code analysis.
In particular, if you want to wrap an immutable value in a mutable container, the standard way to do it is to use Ref, like this:
julia> x = Ref(1)
Base.RefValue{Int64}(1)
julia> x[]
1
julia> x[] = 10
10
julia> x
Base.RefValue{Int64}(10)
julia> x[]
10
You can pass such values to a function and modify them inside. Of course Ref introduces a different type so method implementation has to be a bit different.
Variables
A variable is a name bound to a value. In general, except for some special cases like:
rebinding a variable from module A in module B;
redefining some constants, e.g. trying to reassign a function name with a non-function value;
rebinding a variable that has a specified type of allowed values with a value that cannot be converted to this type;
you can rebind a variable to point to any value you wish. Rebinding is performed most of the time using = or some special constructs (like in for, let or catch statements).
Now, getting to the point: a function is passed a value, not a binding. You can rebind a function parameter (that is, make the parameter name point to a different value), but this parameter is a fresh variable whose scope lies inside the function.
If, for instance, we wanted a call like:
x = 10
f(x)
to change the binding of the variable x, that is impossible, because f does not even know that x exists. It only gets passed x's value. In particular, as I noted above, adding such functionality would break the rule that module A cannot rebind variables from module B, since f might be defined in a different module than the one where x is defined.
What to do
From my experience, it is actually easy enough to work without this feature:
What I typically do is simply return a value from a function that I assign to a variable. In Julia it is very easy because of tuple unpacking syntax like e.g. x,y,z = f(x,y,z), where f can be defined e.g. as f(x,y,z) = 2x,3y,4z;
You can use macros, which get expanded before code execution and thus can modify the binding of a variable, e.g. macro plusone(x) return esc(:($x = $x+1)) end; now writing y=100; @plusone(y) will change the binding of y;
Finally you can use Ref as discussed above (or any other mutable wrapper - as you have noted in your question).
"Does anyone know the reasons why Julia chose a design of functions where the parameters given as inputs cannot be modified?" asked by Schemer
Your question rests on a wrong assumption:
Parameters are variables
When you pass things to a function, those things are often values, not variables.
for example:
function double(x::Int64)
2 * x
end
Now what happens when you call it using
double(4)
What would be the point of the function modifying its parameter x? It's pointless. Furthermore, the function has no idea how it is called.
Furthermore, Julia is built for speed.
A function that modifies its parameter will be hard to optimise because it causes side effects. A side effect is when a procedure/function changes objects/things outside of its scope.
If a function does not modify a variable that is part of its calling parameters, then you can be safe knowing that:
the variable will not have its value changed
the result of the function can be optimised to a constant
not calling the function will not break the program's behaviour
Those three factors above are what make FUNCTIONAL languages fast and NON-FUNCTIONAL languages slow.
Furthermore, when you move into parallel programming or multi-threaded programming, you absolutely DO NOT WANT a variable having its value changed without you (the programmer) knowing about it.
"How would you implement with your proposed macro, the function F(x) which returns a boolean value and modifies c by c:= c + 1. F can be used in the following piece of Ada code : c:= 0; While F(c) Loop ... End Loop;" asked by Schemer
I would write
function F(x)
boolean_result = perform_some_logic()
return (boolean_result,x+1)
end
flag = true
c = 0
(flag,c) = F(c)
while flag
do_stuff()
(flag,c) = F(c)
end
"Unfortunately no, because, and I should have said that, c has to take again the value 0 when F return the value False (c increases as long the Loop lives and return to 0 when it dies). " said Schemer
Then I would write
function F(x)
boolean_result = perform_some_logic()
if boolean_result == true
return (true,x+1)
else
return (false,0)
end
end
flag = true
c = 0
(flag,c) = F(c)
while flag
do_stuff()
(flag,c) = F(c)
end

How are functions curried?

I understand what the concept of currying is, and know how to use it. These are not my questions, rather I am curious as to how this is actually implemented at some lower level than, say, Haskell code.
For example, when (+) 2 4 is curried, is a pointer to the 2 maintained until the 4 is passed in? Does Gandalf bend space-time? What is this magic?
Short answer: yes a pointer is maintained to the 2 until the 4 is passed in.
Longer than necessary answer:
Conceptually, you're supposed to think of Haskell as being defined in terms of the lambda calculus and term rewriting. Let's say you have the following definition:
f x y = x + y
This definition for f comes out in lambda calculus as something like the following, where I've explicitly put parentheses around the lambda bodies:
\x -> (\y -> (x + y))
If you're not familiar with the lambda calculus, this basically says "a function of an argument x that returns (a function of an argument y that returns (x + y))". In the lambda calculus, when we apply a function like this to some value, we can replace the application of the function by a copy of the body of the function with the value substituted for the function's parameter.
So then the expression f 1 2 is evaluated by the following sequence of rewrites:
(\x -> (\y -> (x + y))) 1 2
(\y -> (1 + y)) 2 # substituted 1 for x
(1 + 2) # substituted 2 for y
3
So you can see here that if we'd only supplied a single argument to f, we would have stopped at \y -> (1 + y). So we've got a whole term that is just a function for adding 1 to something, entirely separate from our original term, which may still be in use somewhere (for other references to f).
The key point is that if we implement functions like this, every function has only one argument but some return functions (and some return functions which return functions which return ...). Every time we apply a function we create a new term that "hard-codes" the first argument into the body of the function (including the bodies of any functions this one returns). This is how you get currying and closures.
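The same one-argument-at-a-time shape can be spelled out with closures in, say, C#; this is only an illustration of the idea, not of how any Haskell implementation works, and the names are made up:
using System;

class CurryDemo
{
    // f x y = x + y, written as a function that returns a function.
    static readonly Func<int, Func<int, int>> f = x => y => x + y;

    static void Main()
    {
        Func<int, int> addOne = f(1);  // a closure that "hard-codes" x = 1
        Console.WriteLine(addOne(2));  // 3
        Console.WriteLine(f(1)(2));    // same thing, applied in one go
    }
}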
Now, that's not how Haskell is directly implemented, obviously. Once upon a time, Haskell (or possibly one of its predecessors; I'm not exactly sure on the history) was implemented by Graph reduction. This is a technique for doing something equivalent to the term reduction I described above, that automatically brings along lazy evaluation and a fair amount of data sharing.
In graph reduction, everything is references to nodes in a graph. I won't go into too much detail, but when the evaluation engine reduces the application of a function to a value, it copies the sub-graph corresponding to the body of the function, with the necessary substitution of the argument value for the function's parameter (but shares references to graph nodes where they are unaffected by the substitution). So essentially, yes, partially applying a function creates a new structure in memory that has a reference to the supplied argument (i.e. "a pointer to the 2"), and your program can pass around references to that structure (and even share it and apply it multiple times), until more arguments are supplied and it can actually be reduced. However, it's not like it's just remembering the function and accumulating arguments until it gets all of them; the evaluation engine actually does some of the work each time it's applied to a new argument. In fact the graph reduction engine can't even tell the difference between an application that returns a function and still needs more arguments, and one that has just got its last argument.
I can't tell you much more about the current implementation of Haskell. I believe it's a distant mutant descendant of graph reduction, with loads of clever short-cuts and go-faster stripes. But I might be wrong about that; maybe they've found a completely different execution strategy that isn't anything at all like graph reduction anymore. But I'm 90% sure it'll still end up passing around data structures that hold on to references to the partial arguments, and it probably still does something equivalent to factoring in the arguments partially, as it seems pretty essential to how lazy evaluation works. I'm also fairly sure it'll do lots of optimisations and short cuts, so if you straightforwardly call a function of 5 arguments like f 1 2 3 4 5 it won't go through all the hassle of copying the body of f 5 times with successively more "hard-coding".
Try it out with GHC:
ghc -C Test.hs
This will generate C code in Test.hc
I wrote the following function:
f = (+) 16777217
And GHC generated this:
R1.p[1] = (W_)Hp-4;
*R1.p = (W_)&stg_IND_STATIC_info;
Sp[-2] = (W_)&stg_upd_frame_info;
Sp[-1] = (W_)Hp-4;
R1.w = (W_)&integerzmgmp_GHCziInteger_smallInteger_closure;
Sp[-3] = 0x1000001U;
Sp=Sp-3;
JMP_((W_)&stg_ap_n_fast);
The thing to remember is that in Haskell, partially applying is not an unusual case. There's technically no "last argument" to any function. As you can see here, Haskell is jumping to stg_ap_n_fast which will expect an argument to be available in Sp.
The stg here stands for "Spineless Tagless G-Machine". There is a really good paper on it, by Simon Peyton-Jones. If you're curious about how the Haskell runtime is implemented, go read that first.

understanding referential transparency

Generally, I have a headache because something is wrong with my reasoning:
For one set of arguments, a referentially transparent function will always return one set of output values.
That means that such a function could be represented as a truth table (a table where one set of output values is specified for each set of arguments).
That makes the logic behind such functions combinational (as opposed to sequential).
That means that with a pure functional language (one that has only referentially transparent functions) it is possible to describe only combinational logic.
The last statement follows from this reasoning, but it is obviously false; that means there is an error in the reasoning. [Question: where is the error in this reasoning?]
UPD2: You guys are saying lots of interesting stuff, but not answering my question. I have now stated it more explicitly. Sorry for messing up the question definition!
Question: where is the error in this reasoning?
A referentially transparent function might require an infinite truth table to represent its behavior. You will be hard-pressed to design an infinite circuit in combinational logic.
Another error: the behavior of sequential logic can be represented purely functionally as a function from states to states. The fact that in the implementation these states occur sequentially in time does not prevent one from defining a purely referentially transparent function which describes how state evolves over time.
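As an illustration of that second point, here is a sketch (in C#, with made-up names) of sequential behaviour expressed as a pure state-to-state function; the only "sequence" is in how the caller threads the state through it:
using System;

class StateDemo
{
    // A pure transition function: given the current state, return the next state.
    static int Step(int counter) => counter + 1;

    static void Main()
    {
        int state = 0;
        for (int tick = 0; tick < 3; tick++)
            state = Step(state);   // the sequencing lives outside the pure function
        Console.WriteLine(state);  // 3
    }
}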
Edit: Although I apparently missed the bullseye on the actual question, I think my answer is pretty good, so I'm keeping it :-) (see below).
I guess a more concise way to phrase the question might be: can a purely functional language compute anything an imperative one can?
First of all, suppose you took an imperative language like C and made it so you can't alter variables after defining them. E.g.:
int i;
for (i = 0; // okay, that's one assignment
i < 10; // just looking, that's all
i++) // BUZZZ! Sorry, can't do that!
Well, there goes your for loop. Do we get to keep our while loop?
while (i < 10)
Sure, but it's not very useful. i can't change, so it's either going to run forever or not run at all.
How about recursion? Yes, you get to keep recursion, and it's still plenty useful:
int sum(int *items, unsigned int count)
{
if (count) {
// count the first item and sum the rest
return *items + sum(items + 1, count - 1);
} else {
// no items
return 0;
}
}
Now, with functions, we don't alter state, but variables can, well, vary. Once a variable passes into our function, it's locked in. However, we can call the function again (recursion), and it's like getting a brand new set of variables (the old ones stay the same). Although there are multiple instances of items and count, sum((int[]){1,2,3}, 3) will always evaluate to 6, so you can replace that expression with 6 if you like.
Can we still do anything we want? I'm not 100% sure, but I think the answer is "yes". You certainly can if you have closures, though.
You have it right. The idea is, once a variable is defined, it can't be redefined. A referentially transparent expression, given the same variables, always yields the same result value.
I recommend looking into Haskell, a purely functional language. Haskell doesn't have an "assignment" operator, strictly speaking. For instance:
my_sum numbers = ??? where
i = 0
total = 0
Here, you can't write a "for loop" that increments i and total as it goes along. All is not lost, though. Just use recursion to keep getting new values of i and total:
my_sum numbers = f 0 0 where
f i total =
if i < length numbers
then f i' total'
else total
where
i' = i+1
total' = total + (numbers !! i)
(Note that this is a stupid way to sum a list in Haskell, but it demonstrates a method of coping with single assignment.)
Now, consider this highly imperative-looking code:
main = do
a <- readLn
b <- readLn
print (a + b)
It's actually syntactic sugar for:
main =
readLn >>= (\a ->
readLn >>= (\b ->
print (a + b)))
The idea is, instead of main being a function consisting of a list of statements, main is an IO action that Haskell executes, and actions are defined and chained together with bind operations. Also, an action that does nothing, yielding an arbitrary value, can be defined with the return function.
Note that bind and return aren't specific to actions. They can be used with any type that calls itself a Monad to do all sorts of funky things.
To clarify, consider readLn. readLn is an action that, if executed, would read a line from standard input and yield its parsed value. To do something with that value, we can't store it in a variable because that would violate referential transparency:
a = readLn
If this were allowed, a's value would depend on the world and would be different every time we called readLn, meaning readLn wouldn't be referentially transparent.
Instead, we bind the readLn action to a function that deals with the action, yielding a new action, like so:
readLn >>= (\x -> print (x + 1))
The result of this expression is an action value. If Haskell got off the couch and performed this action, it would read an integer, increment it, and print it. By binding the result of an action to a function that does something with the result, we get to keep referential transparency while playing around in the world of state.
As far as I understand it, referential transparency just means: A given function will always yield the same result when invoked with the same arguments. So, the mathematical functions you learned about in school are referentially transparent.
A language you could check out in order to learn how things are done in a purely functional language would be Haskell. There are ways to use "updateable storage possibilities" like the Reader Monad, and the State Monad for example. If you're interested in purely functional data structures, Okasaki might be a good read.
And yes, you're right: the order of evaluation in a purely functional language like Haskell does not matter the way it does in non-functional languages, because if there are no side effects, there is no reason to do something before/after something else -- unless the input of one depends on the output of the other, or constructs like monads come into play.
I don't really know about the truth-table question.
Here's my stab at answering the question:
Any system can be described as a combinatorial function, large or small.
There's nothing wrong with the reasoning that pure functions can only deal with combinatorial logic -- it's true, just that functional languages hide that from you to some extent or another.
You could even describe, say, the workings of a game engine as a truth table or a combinatorial function.
You might have a deterministic function that takes in "the current state of the entire game" as the RAM occupied by the game engine and the keyboard input, and returns "the state of the game one frame later". The return value would be determined by the combinations of the bits in the input.
Of course, in any meaningful and sane function, the input is parsed down to blocks of integers, decimals and booleans, but the combinations of the bits in those values is still determining the output of your function.
Keep in mind also that basic digital logic can be described in truth tables. The only reason that that's not done for anything more than, say, arithmetic on 4-bit integers, is because the size of the truth table grows exponentially.
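For instance, a two-input XOR gate is small enough to store as a literal table (a C# sketch with made-up names); an adder over two 32-bit integers written the same way would need 2^64 rows, which is exactly the exponential blow-up mentioned above:
// Truth table for XOR, indexed by the two input bits.
static readonly bool[,] XorTable =
{
    { false, true },  // a = 0: results for b = 0 and b = 1
    { true, false }   // a = 1: results for b = 0 and b = 1
};

static bool Xor(bool a, bool b) => XorTable[a ? 1 : 0, b ? 1 : 0];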
The error in your reasoning is the following:
"that means that such function could be represented as a truth table".
You conclude that from a functional language's property of referential transparency. So far the conclusion sounds plausible, but you overlook that a function can accept collections as input and process them, in contrast to the fixed inputs of a logic gate.
Therefore a function does not equal a logic gate, but rather a construction plan for such a logic gate, depending on the actual input (which is determined at runtime)!
To comment on your comment: functional languages can, although stateless, implement a state machine by constructing the states from scratch each time they are accessed.

What is recursion and when should I use it?

One of the topics that seems to come up regularly on mailing lists and online discussions is the merits (or lack thereof) of doing a Computer Science Degree. An argument that seems to come up time and again for the negative party is that they have been coding for some number of years and they have never used recursion.
So the question is:
What is recursion?
When would I use recursion?
Why don't people use recursion?
There are a number of good explanations of recursion in this thread; this answer is about why you shouldn't use it in most languages.* In the majority of major imperative language implementations (i.e. every major implementation of C, C++, Basic, Python, Ruby, Java, and C#) iteration is vastly preferable to recursion.
To see why, walk through the steps that the above languages use to call a function:
space is carved out on the stack for the function's arguments and local variables
the function's arguments are copied into this new space
control jumps to the function
the function's code runs
the function's result is copied into a return value
the stack is rewound to its previous position
control jumps back to where the function was called
Doing all of these steps takes time, usually a little bit more than it takes to iterate through a loop. However, the real problem is in step #1. When many programs start, they allocate a single chunk of memory for their stack, and when they run out of that memory (often, but not always due to recursion), the program crashes due to a stack overflow.
So in these languages recursion is slower and it makes you vulnerable to crashing. There are still some arguments for using it though. In general, code written recursively is shorter and a bit more elegant, once you know how to read it.
There is a technique that language implementers can use called tail call optimization which can eliminate some classes of stack overflow. Put succinctly: if a function's return expression is simply the result of a function call, then you don't need to add a new level onto the stack, you can reuse the current one for the function being called. Regrettably, few imperative language-implementations have tail-call optimization built in.
* I love recursion. My favorite static language doesn't use loops at all; recursion is the only way to do something repeatedly. I just don't think that recursion is generally a good idea in languages that aren't tuned for it.
** By the way Mario, the typical name for your ArrangeString function is "join", and I'd be surprised if your language of choice doesn't already have an implementation of it.
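To make the tail-call point concrete, here is a sketch in C# (which, note, does not guarantee this optimization; the names are illustrative). The first version must keep its stack frame alive for the pending multiplication; the second returns the recursive call directly, so an implementation with tail-call optimization could reuse the frame:
// Not a tail call: after Fact(n - 1) returns, we still have to multiply by n.
static long Fact(long n) =>
    n < 2 ? 1 : n * Fact(n - 1);

// Tail call: the recursive call is the entire return expression;
// the running product is carried along in an accumulator instead.
static long FactAcc(long n, long acc = 1) =>
    n < 2 ? acc : FactAcc(n - 1, acc * n);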
Simple English example of recursion.
A child couldn't sleep, so her mother told her a story about a little frog,
who couldn't sleep, so the frog's mother told her a story about a little bear,
who couldn't sleep, so the bear's mother told her a story about a little weasel...
who fell asleep.
...and the little bear fell asleep;
...and the little frog fell asleep;
...and the child fell asleep.
In the most basic computer science sense, recursion is a function that calls itself. Say you have a linked list structure:
struct Node {
Node* next;
};
And you want to find out how long a linked list is you can do this with recursion:
int length(const Node* list) {
    if (!list) {
        // base case: the empty list has length 0
        return 0;
    } else {
        return 1 + length(list->next);
    }
}
(This could of course be done with a for loop as well, but is useful as an illustration of the concept)
Whenever a function calls itself, creating a loop, then that's recursion. As with anything there are good uses and bad uses for recursion.
The most simple example is tail recursion where the very last line of the function is a call to itself:
int FloorByTen(int num)
{
if (num % 10 == 0)
return num;
else
return FloorByTen(num-1);
}
However, this is a lame, almost pointless example because it can easily be replaced by more efficient iteration. After all, recursion suffers from function call overhead, which in the example above could be substantial compared to the operation inside the function itself.
So the whole reason to do recursion rather than iteration should be to take advantage of the call stack to do some clever stuff. For example, if you call a function multiple times with different parameters inside the same loop then that's a way to accomplish branching. A classic example is the Sierpinski triangle.
You can draw one of those very simply with recursion, where the call stack branches in 3 directions:
private void BuildVertices(double x, double y, double len)
{
if (len > 0.002)
{
mesh.Positions.Add(new Point3D(x, y + len, -len));
mesh.Positions.Add(new Point3D(x - len, y - len, -len));
mesh.Positions.Add(new Point3D(x + len, y - len, -len));
len *= 0.5;
BuildVertices(x, y + len, len);
BuildVertices(x - len, y - len, len);
BuildVertices(x + len, y - len, len);
}
}
If you attempt to do the same thing with iteration I think you'll find it takes a lot more code to accomplish.
Other common use cases might include traversing hierarchies, e.g. website crawlers, directory comparisons, etc.
Conclusion
In practical terms, recursion makes the most sense whenever you need iterative branching.
Recursion is a method of solving problems based on the divide and conquer mentality.
The basic idea is that you take the original problem and divide it into smaller (more easily solved) instances of itself, solve those smaller instances (usually by using the same algorithm again) and then reassemble them into the final solution.
The canonical example is a routine to generate the Factorial of n. The Factorial of n is calculated by multiplying all of the numbers between 1 and n. An iterative solution in C# looks like this:
public int Fact(int n)
{
int fact = 1;
for( int i = 2; i <= n; i++)
{
fact = fact * i;
}
return fact;
}
There's nothing surprising about the iterative solution and it should make sense to anyone familiar with C#.
The recursive solution is found by recognising that the nth Factorial is n * Fact(n-1). Or to put it another way, if you know what a particular Factorial number is you can calculate the next one. Here is the recursive solution in C#:
public int FactRec(int n)
{
if( n < 2 )
{
return 1;
}
return n * FactRec( n - 1 );
}
The first part of this function is known as a Base Case (or sometimes Guard Clause) and is what prevents the algorithm from running forever. It just returns the value 1 whenever the function is called with a value of 1 or less. The second part is more interesting and is known as the Recursive Step. Here we call the same method with a slightly modified parameter (we decrement it by 1) and then multiply the result with our copy of n.
When first encountered this can be kind of confusing so it's instructive to examine how it works when run. Imagine that we call FactRec(5). We enter the routine, are not picked up by the base case and so we end up like this:
// In FactRec(5)
return 5 * FactRec( 5 - 1 );
// which is
return 5 * FactRec(4);
If we re-enter the method with the parameter 4 we are again not stopped by the guard clause and so we end up at:
// In FactRec(4)
return 4 * FactRec(3);
If we substitute this return value into the return value above we get
// In FactRec(5)
return 5 * (4 * FactRec(3));
This should give you a clue as to how the final solution is arrived at so we'll fast track and show each step on the way down:
return 5 * (4 * FactRec(3));
return 5 * (4 * (3 * FactRec(2)));
return 5 * (4 * (3 * (2 * FactRec(1))));
return 5 * (4 * (3 * (2 * (1))));
That final substitution happens when the base case is triggered. At this point we have a simple algebraic formula to solve which equates directly to the definition of Factorials in the first place.
It's instructive to note that every call into the method results in either a base case being triggered or a call to the same method where the parameters are closer to a base case (often called a recursive call). If this is not the case then the method will run forever.
Recursion is solving a problem with a function that calls itself. A good example of this is a factorial function. Factorial is a math problem where factorial of 5, for example, is 5 * 4 * 3 * 2 * 1. This function solves this in C# for positive integers (not tested - there may be a bug).
public int Factorial(int n)
{
if (n <= 1)
return 1;
return n * Factorial(n - 1);
}
Recursion refers to a method which solves a problem by solving a smaller version of the problem and then using that result plus some other computation to formulate the answer to the original problem. Often times, in the process of solving the smaller version, the method will solve a yet smaller version of the problem, and so on, until it reaches a "base case" which is trivial to solve.
For instance, to calculate a factorial for the number X, one can represent it as X times the factorial of X-1. Thus, the method "recurses" to find the factorial of X-1, and then multiplies whatever it got by X to give a final answer. Of course, to find the factorial of X-1, it'll first calculate the factorial of X-2, and so on. The base case would be when X is 0 or 1, in which case it knows to return 1 since 0! = 1! = 1.
Consider an old, well known problem:
In mathematics, the greatest common divisor (gcd) … of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.
The definition of gcd is surprisingly simple:
gcd(m, 0) = m
gcd(m, n) = gcd(n, m mod n)    for n > 0
where mod is the modulo operator (that is, the remainder after integer division).
In English, this definition says the greatest common divisor of any number and zero is that number, and the greatest common divisor of two numbers m and n is the greatest common divisor of n and the remainder after dividing m by n.
If you'd like to know why this works, see the Wikipedia article on the Euclidean algorithm.
Let's compute gcd(10, 8) as an example. Each step is equal to the one just before it:
gcd(10, 8)
gcd(8, 10 mod 8)
gcd(8, 2)
gcd(2, 8 mod 2)
gcd(2, 0)
2
In the first step, 8 does not equal zero, so the second part of the definition applies. 10 mod 8 = 2 because 8 goes into 10 once with a remainder of 2. At step 3, the second part applies again, but this time 8 mod 2 = 0 because 2 divides 8 with no remainder. At step 5, the second argument is 0, so the answer is 2.
Did you notice that gcd appears on both the left and right sides of the equals sign? A mathematician would say this definition is recursive because the expression you're defining recurs inside its definition.
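Translated directly into code, the definition above becomes a two-line recursive function; a sketch in C#:
// gcd(m, 0) = m;  gcd(m, n) = gcd(n, m mod n)
static int Gcd(int m, int n) =>
    n == 0 ? m : Gcd(n, m % n);
// Gcd(10, 8) == 2, following exactly the steps traced above.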
Recursive definitions tend to be elegant. For example, a recursive definition for the sum of a list is
sum l =
if empty(l)
return 0
else
return head(l) + sum(tail(l))
where head is the first element in a list and tail is the rest of the list. Note that sum recurs inside its definition at the end.
Maybe you'd prefer the maximum value in a list instead:
max l =
if empty(l)
error
elsif length(l) = 1
return head(l)
else
tailmax = max(tail(l))
if head(l) > tailmax
return head(l)
else
return tailmax
You might define multiplication of non-negative integers recursively to turn it into a series of additions:
a * b =
if b = 0
return 0
else
return a + (a * (b - 1))
If that bit about transforming multiplication into a series of additions doesn't make sense, try expanding a few simple examples to see how it works.
Merge sort has a lovely recursive definition:
sort(l) =
if empty(l) or length(l) = 1
return l
else
(left,right) = split l
return merge(sort(left), sort(right))
Recursive definitions are all around if you know what to look for. Notice how all of these definitions have very simple base cases, e.g., gcd(m, 0) = m. The recursive cases whittle away at the problem to get down to the easy answers.
With this understanding, you can now appreciate the other algorithms in Wikipedia's article on recursion!
A function that calls itself
When a function can be (easily) decomposed into a simple operation plus the same function on some smaller portion of the problem. I should say, rather, that this makes it a good candidate for recursion.
They do!
The canonical example is the factorial which looks like:
int fact(int a)
{
if(a==1)
return 1;
return a*fact(a-1);
}
In general, recursion isn't necessarily fast (function call overhead tends to be high because recursive functions tend to be small, see above) and can suffer from some problems (stack overflow anyone?). Some say they tend to be hard to get 'right' in non-trivial cases but I don't really buy into that. In some situations, recursion makes the most sense and is the most elegant and clear way to write a particular function. It should be noted that some languages favor recursive solutions and optimize them much more (LISP comes to mind).
A recursive function is one which calls itself. The most common reason I've found to use it is traversing a tree structure. For example, if I have a TreeView with checkboxes (think installation of a new program, "choose features to install" page), I might want a "check all" button which would be something like this (pseudocode):
function cmdCheckAllClick {
checkRecursively(TreeView1.RootNode);
}
function checkRecursively(Node n) {
n.Checked = True;
foreach ( n.Children as child ) {
checkRecursively(child);
}
}
So you can see that the checkRecursively first checks the node which it is passed, then calls itself for each of that node's children.
You do need to be a bit careful with recursion. If you get into an infinite recursive loop, you will get a Stack Overflow exception :)
I can't think of a reason why people shouldn't use it, when appropriate. It is useful in some circumstances, and not in others.
I think that because it's an interesting technique, some coders perhaps end up using it more often than they should, without real justification. This has given recursion a bad name in some circles.
Recursion is an expression directly or indirectly referencing itself.
Consider recursive acronyms as a simple example:
GNU stands for GNU's Not Unix
PHP stands for PHP: Hypertext Preprocessor
YAML stands for YAML Ain't Markup Language
WINE stands for Wine Is Not an Emulator
VISA stands for Visa International Service Association
More examples on Wikipedia
Recursion works best with what I like to call "fractal problems", where you're dealing with a big thing that's made of smaller versions of that big thing, each of which is an even smaller version of the big thing, and so on. If you ever have to traverse or search through something like a tree or nested identical structures, you've got a problem that might be a good candidate for recursion.
People avoid recursion for a number of reasons:
Most people (myself included) cut their programming teeth on procedural or object-oriented programming as opposed to functional programming. To such people, the iterative approach (typically using loops) feels more natural.
Those of us who cut our programming teeth on procedural or object-oriented programming have often been told to avoid recursion because it's error prone.
We're often told that recursion is slow. Calling and returning from a routine repeatedly involves a lot of stack pushing and popping, which is slower than looping. I think some languages handle this better than others, and those languages are most likely not those where the dominant paradigm is procedural or object-oriented.
For at least a couple of programming languages I've used, I remember hearing recommendations not to use recursion if it gets beyond a certain depth because its stack isn't that deep.
A recursive statement is one in which you define the process of what to do next as a combination of the inputs and what you have already done.
For example, take factorial:
factorial(6) = 6*5*4*3*2*1
But it's easy to see factorial(6) also is:
6 * factorial(5) = 6*(5*4*3*2*1).
So generally:
factorial(n) = n*factorial(n-1)
Of course, the tricky thing about recursion is that if you want to define things in terms of what you have already done, there needs to be some place to start.
In this example, we just make a special case by defining factorial(1) = 1.
Now we see it from the bottom up:
factorial(6) = 6*factorial(5)
= 6*5*factorial(4)
= 6*5*4*factorial(3)
= 6*5*4*3*factorial(2)
= 6*5*4*3*2*factorial(1)
= 6*5*4*3*2*1
Since we defined factorial(1) = 1, we reach the "bottom".
Generally speaking, recursive procedures have two parts:
1) The recursive part, which defines some procedure in terms of new inputs combined with what you've "already done" via the same procedure. (i.e. factorial(n) = n*factorial(n-1))
2) A base part, which makes sure that the process doesn't repeat forever by giving it some place to start (i.e. factorial(1) = 1)
It can be a bit confusing to get your head around at first, but just look at a bunch of examples and it should all come together. If you want a much deeper understanding of the concept, study mathematical induction. Also, be aware that some languages optimize for recursive calls while others do not. It's pretty easy to make insanely slow recursive functions if you're not careful, but there are also techniques to make them performant in most cases.
Hope this helps...
I like this definition:
In recursion, a routine solves a small part of a problem itself, divides the problem into smaller pieces, and then calls itself to solve each of the smaller pieces.
I also like Steve McConnell's discussion of recursion in Code Complete, where he criticises the examples used in computer science books on recursion.
Don't use recursion for factorials or Fibonacci numbers
One problem with computer-science textbooks is that they present silly examples of recursion. The typical examples are computing a factorial or computing a Fibonacci sequence. Recursion is a powerful tool, and it's really dumb to use it in either of those cases. If a programmer who worked for me used recursion to compute a factorial, I'd hire someone else.
I thought this was a very interesting point to raise and may be a reason why recursion is often misunderstood.
EDIT:
This was not a dig at Dav's answer - I had not seen that reply when I posted this
1.)
A method is recursive if it can call itself; either directly:
void f() {
... f() ...
}
or indirectly:
void f() {
... g() ...
}
void g() {
... f() ...
}
2.) When to use recursion
Q: Does using recursion usually make your code faster?
A: No.
Q: Does using recursion usually use less memory?
A: No.
Q: Then why use recursion?
A: It sometimes makes your code much simpler!
3.) People use recursion only when it is very complex to write iterative code. For example, tree traversal techniques like preorder, postorder can be made both iterative and recursive. But usually we use recursive because of its simplicity.
Here's a simple example: how many elements in a set. (there are better ways to count things, but this is a nice simple recursive example.)
First, we need two rules:
if the set is empty, the count of items in the set is zero (duh!).
if the set is not empty, the count is one plus the number of items in the set after one item is removed.
Suppose you have a set like this: [x x x]. let's count how many items there are.
the set is [x x x] which is not empty, so we apply rule 2. the number of items is one plus the number of items in [x x] (i.e. we removed an item).
the set is [x x], so we apply rule 2 again: one + number of items in [x].
the set is [x], which still matches rule 2: one + number of items in [].
Now the set is [], which matches rule 1: the count is zero!
Now that we know the answer in step 4 (0), we can solve step 3 (1 + 0)
Likewise, now that we know the answer in step 3 (1), we can solve step 2 (1 + 1)
And finally now that we know the answer in step 2 (2), we can solve step 1 (1 + 2) and get the count of items in [x x x], which is 3. Hooray!
We can represent this as:
count of [x x x] = 1 + count of [x x]
= 1 + (1 + count of [x])
= 1 + (1 + (1 + count of []))
= 1 + (1 + (1 + 0))
= 1 + (1 + (1))
= 1 + (2)
= 3
When applying a recursive solution, you usually have at least 2 rules:
the basis, the simple case which states what happens when you have "used up" all of your data. This is usually some variation of "if you are out of data to process, your answer is X"
the recursive rule, which states what happens if you still have data. This is usually some kind of rule that says "do something to make your data set smaller, and reapply your rules to the smaller data set."
If we translate the above to pseudocode, we get:
numberOfItems(set)
if set is empty
return 0
else
remove 1 item from set
return 1 + numberOfItems(set)
There's a lot more useful examples (traversing a tree, for example) which I'm sure other people will cover.
Well, that's a pretty decent definition you have. And Wikipedia has a good definition too. So I'll add another (probably worse) definition for you.
When people refer to "recursion", they're usually talking about a function they've written which calls itself repeatedly until it is done with its work. Recursion can be helpful when traversing hierarchies in data structures.
An example: A recursive definition of a staircase is:
A staircase consists of:
- a single step and a staircase (recursion)
- or only a single step (termination)
To recurse on a solved problem: do nothing, you're done.
To recurse on an open problem: do the next step, then recurse on the rest.
In plain English:
Assume you can do 3 things:
Take one apple
Write down tally marks
Count tally marks
You have a lot of apples in front of you on a table and you want to know how many apples there are.
start
Is the table empty?
yes: Count the tally marks and cheer like it's your birthday!
no: Take 1 apple and put it aside
Write down a tally mark
goto start
The process of repeating the same thing till you are done is called recursion.
I hope this is the "plain english" answer you are looking for!
A recursive function is a function that contains a call to itself. A recursive struct is a struct that contains an instance of itself. You can combine the two as a recursive class. The key part of a recursive item is that it contains an instance/call of itself.
Consider two mirrors facing each other. We've seen the neat infinity effect they make. Each reflection is an instance of a mirror, which is contained within another instance of a mirror, etc. The mirror containing a reflection of itself is recursion.
A binary search tree is a good programming example of recursion. The structure is recursive with each Node containing 2 instances of a Node. Functions to work on a binary search tree are also recursive.
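A sketch of that recursive shape in C# (the Node class and Contains method are hypothetical names): the data type refers to itself, and so does the function that searches it:
class Node
{
    public int Value;
    public Node Left;   // each Node contains two (possibly null) Nodes
    public Node Right;
}

static bool Contains(Node root, int target)
{
    if (root == null) return false;        // base case: empty subtree
    if (target == root.Value) return true;
    return target < root.Value
        ? Contains(root.Left, target)      // recurse into the half that can hold it
        : Contains(root.Right, target);
}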
This is an old question, but I want to add an answer from a logistical point of view (i.e. not from an algorithm-correctness or performance point of view).
I use Java for work, and Java doesn't support nested functions. As such, if I want to do recursion, I might have to define an external function (which exists only because my code bumps against Java's bureaucratic rules), or I might have to refactor the code altogether (which I really hate to do).
Thus, I often avoid recursion and use stack operations instead, since recursion itself is essentially a stack operation.
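As a sketch of that trade-off (in C# rather than Java, reusing the hypothetical Node class from the previous answer's sketch), here is a pre-order tree traversal where an explicit Stack plays the role the call stack would play in the recursive version:
using System.Collections.Generic;

static IEnumerable<int> PreOrder(Node root)
{
    var stack = new Stack<Node>();
    if (root != null) stack.Push(root);
    while (stack.Count > 0)
    {
        var node = stack.Pop();
        yield return node.Value;
        // Push the right child first so the left subtree is visited first.
        if (node.Right != null) stack.Push(node.Right);
        if (node.Left != null) stack.Push(node.Left);
    }
}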
You want to use it anytime you have a tree structure. It is very useful in reading XML.
Recursion as it applies to programming is basically calling a function from inside its own definition (inside itself), with different parameters so as to accomplish a task.
"If I have a hammer, make everything look like a nail."
Recursion is a problem-solving strategy for huge problems, where at every step you just "turn 2 small things into one bigger thing," each time with the same hammer.
Example
Suppose your desk is covered with a disorganized mess of 1024 papers. How do you make one neat, clean stack of papers from the mess, using recursion?
Divide: Spread all the sheets out, so you have just one sheet in each "stack".
Conquer:
Go around, putting each sheet on top of one other sheet. You now have stacks of 2.
Go around, putting each 2-stack on top of another 2-stack. You now have stacks of 4.
Go around, putting each 4-stack on top of another 4-stack. You now have stacks of 8.
... on and on ...
You now have one huge stack of 1024 sheets!
Notice that this is pretty intuitive, aside from counting everything (which isn't strictly necessary). You might not go all the way down to 1-sheet stacks, in reality, but you could and it would still work. The important part is the hammer: With your arms, you can always put one stack on top of the other to make a bigger stack, and it doesn't matter (within reason) how big either stack is.
Recursion is the process where a method calls itself in order to perform a certain task. It reduces redundancy in code. Most recursive functions or methods must have a condition to break the recursive call, i.e. stop calling themselves once a condition is met; this prevents the creation of an infinite loop. Not all functions are suited to being used recursively.
Hey, sorry if my opinion agrees with someone else's; I'm just trying to explain recursion in plain English.
Suppose you have three managers: Jack, John and Morgan.
Jack manages 2 programmers, John manages 3, and Morgan manages 5.
You are going to give every manager $300 and want to know what it would cost.
The answer is obvious, but what if 2 of Morgan's employees are also managers?
HERE comes the recursion.
You start from the top of the hierarchy. The running total is $0.
You start with Jack,
then check whether he has any managers as employees. If he does, check whether they have any managers as employees, and so on. Add $300 to the running total every time you find a manager.
When you are finished with Jack, go to John and his employees, and then to Morgan.
You never know in advance how many cycles you will go through before getting an answer, though you do know how many managers you have and how much budget you can spend.
Recursion is a tree, with branches and leaves, called parents and children respectively.
When you use a recursive algorithm, you are more or less consciously building a tree from the data.
In plain English, recursion means to repeat something again and again.
In programming, one example is calling a function from within itself.
Look at the following example of calculating the factorial of a number:
public int fact(int n)
{
    if (n == 0) return 1;
    else return n * fact(n - 1);
}
Any algorithm exhibits structural recursion on a datatype if it basically consists of a switch statement with a case for each case of the datatype.
for example, when you are working on a type
tree = null
| leaf(value:integer)
| node(left: tree, right:tree)
a structural recursive algorithm would have the form
function computeSomething(x : tree) =
if x is null: base case
if x is leaf: do something with x.value
if x is node: do something with x.left,
do something with x.right,
combine the results
This is really the most obvious way to write any algorithm that works on a data structure.
now, when you look at the integers (well, the natural numbers) as defined using the Peano axioms
integer = 0 | succ(integer)
you see that a structural recursive algorithm on integers looks like this
function computeSomething(x : integer) =
if x is 0 : base case
if x is succ(prev) : do something with prev
The too-well-known factorial function is about the most trivial example of this form.
A function that calls itself or uses its own definition.

Resources