I'm referring to this article https://codeburst.io/a-beginner-friendly-intro-to-functional-programming-4f69aa109569. I've been confused about how functional programming differs from procedural programming for a while and want to understand it.
function getOdds2(arr){
return arr.filter(num => num % 2 !== 0)
}
"we simply define the outcome we want to see by using a method called filter, and allow the machine to take care of all the steps in between. This is a more declarative approach."
They say they define the outcome we want to see. How is that any different from procedural programming, where you call a function (a subroutine) and get the output you want to see? How is this "declarative approach" any different from just calling a function? The only difference I see between functional and procedural programming is the style.
Procedural programming uses procedures (i.e. it calls functions, aka subroutines) and potentially refers to, and/or changes in place ("mutates"), lots of global variables and/or data structures, i.e. "state".
Functional programming seeks to have all the relevant state in the function's arguments and return values, to minimize if not outright forbid the state external to the functions used (called) in a program.
So, it seeks to have few (or none at all) global variables and data structures, just have functions (subroutines) passing data along until the final one produces the result.
Both are calling functions / subroutines, yes. That's just structured programming, though.
"Pure" functional programming doesn't want its functions to have (and maintain) any internal state (between the calls) either. That way the programming functions / subroutines / procedures become much like mathematical functions, always producing the same output for the same input(s). The operations of such functions become reliable, allowing for the more declarative style of programming. (when it might do something different each time it's called, there's no name we can give it, no concept to call it for).
With functional programming, everything revolves around functions. In the example you gave above, you are defining and executing a function without even knowing it.
You're passing a lambda (anonymous function) to the filter function. With functional programming, that's really the point - defining everything in terms of functions, passing functions as arguments to functions, etc.
So yes, it is about style. It's about ease of expressing your ideas in code, and about being more terse.
Take this for example:
function getOdds(array) {
  var odds = [];
  for (var i = 0; i < array.length; i++) {
    if (isOdd(array[i])) odds.push(array[i]);
  }
  return odds;
}
function isOdd(n) {
  return n % 2 != 0;
}
it gets the job done but it's really verbose. Now compare it to:
function getOdds(array) {
  return array.filter(n => n % 2 != 0);
}
Here you're defining the isOdd function anonymously as a lambda and passing it directly to the filter function. Saves a lot of keystrokes.
I've recently been learning about functional languages and how many don't include for loops. While I don't personally view recursion as more difficult than a for loop (and often find it easier to reason about), I realized that many examples of recursion aren't tail recursive and therefore cannot use simple tail recursion optimization to avoid stack overflows. According to this question, all iterative loops can be translated into recursion, and those iterative loops can be transformed into tail recursion, so it confuses me when the answers to a question like this suggest that you have to explicitly manage the translation of your recursion into tail recursion yourself if you want to avoid stack overflows. It seems like it should be possible for a compiler to do all the translation from either recursion to tail recursion, or from recursion straight to an iterative loop, without stack overflows.
Are functional compilers able to avoid stack overflows in more general recursive cases? Are you really forced to transform your recursive code in order to avoid stack overflows yourself? If they aren't able to perform general recursive stack-safe compilation, why aren't they?
Any recursive function can be converted into a tail recursive one.
For instance, consider the transition function of a Turing machine, that is the mapping from a configuration to the next one. To simulate the Turing machine you just need to iterate the transition function until you reach a final state, which is easily expressed in tail recursive form. Similarly, a compiler typically translates a recursive program into an iterative one simply by adding a stack of activation records.
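As a rough illustration of that last point (a Python sketch, not how any particular compiler actually does it), a recursive factorial can be run iteratively by keeping the pending work on an explicit stack of "activation records":

def fact_iterative(n):
    stack = []               # plays the role of the activation records
    while n > 1:
        stack.append(n)      # "push a frame"
        n -= 1
    result = 1
    while stack:
        result *= stack.pop()  # "return" through the saved frames
    return result

print(fact_iterative(5))  # 120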
You can also give a translation into tail recursive form using continuation
passing style (CPS). To make a classical example, consider the fibonacci
function.
This can be expressed in CPS style in the following way, where the second
parameter is the continuation (essentially, a callback function):
def fibc(n, cont):
    if n <= 1:
        return cont(n)
    return fibc(n - 1, lambda a: fibc(n - 2, lambda b: cont(a + b)))
Again, you are simulating the recursion stack using a dynamic data structure:
in this case, lambda abstractions.
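If you want to try it, the recursion is kicked off by passing the identity function as the initial continuation:

print(fibc(10, lambda x: x))  # prints 55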
The use of dynamic structures (lists, stacks, functions, etc.) in all the previous examples is essential. That is to say, in order to simulate a generic recursive function iteratively, you cannot avoid dynamic memory allocation, and hence you cannot avoid stack overflow, in general.
So, memory consumption is not only related to the iterative/recursive nature of the program. On the other hand, if you prevent dynamic memory allocation, your programs are essentially finite state machines, with limited computational capabilities (it would be more interesting to parametrise memory according to the size of the inputs).
In general, in the same way that you cannot predict termination, you cannot predict unbounded memory consumption of your program: working with a Turing-complete language, at compile time you cannot avoid divergence, and you cannot avoid stack overflow.
Tail Call Optimization:
The natural way to handle arguments and calls is to set up a new frame for each call and sort out the cleaning up when exiting or returning.
For tail calls to work, you need to alter this so that the tail call inherits the current frame. Instead of making a new frame, it massages the current frame so that the next call returns to the current function's caller rather than to this function, which after a tail call would only clean up and return anyway.
Thus TCO is all about cleaning up before the last call.
Continuation Passing Style - make tail calls out of everything
A compiler can change the code so that it only performs primitive operations and passes each result to a continuation. Thus the stack usage gets moved onto the heap, since the computation to be continued is made into a function.
An example is:
function hypotenuse(k1, k2) {
return sqrt(add(square(k1), square(k2)))
}
becomes
function hypotenuse(k, k1, k2) {
  (function (sk1) {
    (function (sk2) {
      (function (ar) {
        k(sqrt(ar));
      }(add(sk1, sk2)));
    }(square(k2)));
  }(square(k1)));
}
Notice every function has exactly one call now and the order of evaluation is set.
According to this question, all iterative loops can be translated into recursion
"Translated" might be a bit of a stretch. The proof that for every iterative loop there is an equivalent recursive program is trivial if you understand Turing completeness: since a Turing machine can be implemented using strictly iterative structures and strictly recursive structures, every program that can be expressed in an iterative language can be expressed in a recursive language, and vice-versa. This means that for every iterative loop there is an equivalent recursive construct (and the other way around). However, that doesn't mean we have some automated way of transforming one into the other.
and those iterative loops can be transformed into tail recursion
Tail recursion can perhaps be easily transformed into an iterative loop, and the other way around. But not all recursion is tail recursion. Here's an example. Suppose we have some binary tree. It consists of nodes. Each node can have a left and a right child and a value. If a node has no children, then isLeaf returns true for it. We'll assume there's some function max that returns the maximum of two values, and if one of the values is null it returns the other one. Now we want to define a function that finds the maximum value among all the leaf nodes. Here it is in some pseudo-code I cooked up.
findmax(node) {
if (node == null) {
return null
}
if (node.isLeaf) {
return node.value
} else {
return max(findmax(node.left), findmax(node.right))
}
}
There are two recursive calls to findmax, so we can't optimize for tail recursion. We need the results of both before we can supply them to the max function and determine the result of the call for the current node.
Now, there may be a way of getting the same result, using recursion and only a single tail-recursive call. It is functionally equivalent, but it is a different algorithm. Compilers can do a lot of transformations to create a functionally equivalent program with lots of optimizations, but they're not quite clever enough to create functionally equivalent algorithms.
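For instance, one functionally equivalent but different algorithm replaces the call stack with an explicit worklist; a Python sketch, assuming nodes shaped like the pseudo-code above (left, right, value and an isLeaf flag):

def findmax_iterative(root):
    best = None
    stack = [root]                 # explicit stack instead of the call stack
    while stack:
        node = stack.pop()
        if node is None:
            continue
        if node.isLeaf:
            best = node.value if best is None else max(best, node.value)
        else:
            stack.append(node.left)
            stack.append(node.right)
    return best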
Even the transformation of a function that only calls itself recursively once into a tail-recursive version would be far from trivial. Such an adaptation usually employs some argument passed into the recursive invocation that is used as an "accumulator" for the current results.
Look at the next naive implementation for calculating a factorial of a number (e.g. fact(5) = 5*4*3*2*1):
fact(number) {
if (number == 1) {
return 1
} else {
return number * fact(number - 1)
}
}
It's not tail-recursive. But it can be made so in this way:
fact(number, acc) {
if (number == 1) {
return acc
} else {
return fact(number - 1, number * acc)
}
}
// Helper function
fact(number) {
return fact(number, 1)
}
This requires an interpretation of what is being done. Recognizing the case for stuff like this is easy enough, but what if you call a function instead of a multiplication? How will the compiler know that for the initial call the accumulator must be 1 and not, say, 0? How do you translate this program?
recsub(number) {
if (number == 1) {
return 1
} else {
return number - recsub(number - 1)
}
}
This is, as yet, outside the scope of the sort of compilers we have now, and may in fact always be.
Maybe it would be interesting to ask this on the computer science Stack Exchange to see if they know of some papers or proofs that investigate this more in-depth.
This is a purely academic question:
I have recently been working with languages that use tail recursion optimization. For practice I wrote two recursive implementations of a sum function in R, one of them tail recursive. I quickly realized there is no tail recursion optimization in R. I can live with that.
However, I also noticed a different level of allowed depth when using the local helper function for the tail recursion.
Here is the code:
## Recursive
sum <- function (i, end, fun){
if (i>=end) 0
else fun(i) + sum(i+1, end, fun)
}
## Tail recursive
sum_tail <- function (i, end, fun){
sum_helper<- function(i, acc){
if (i>=end) acc
else sum_helper(i+1, acc+fun(i))
}
sum_helper(i, 0)
}
## Simple example
harmonic <- function(k){
return(1/(k))
}
print(sum(1, 1200, harmonic)) # <- This works fine, but is close to the limit
# print(sum_tail(1, 1200, harmonic)) <- This will crash
print(sum_tail(1, 996, harmonic)) # <- This is the deepest allowed
I am fairly intrigued. Can someone explain this behavior or point me towards a document explaining how the allowed recursion depth is calculated?
I'm not sure of R's internal implementation of the call stack, but it's pretty obvious from here that there is a maximum stack depth. (Many languages have this for various reasons, mostly related to memory and detecting infinite recursion.) You can set it with options(), and the default setting seems to depend on the platform -- on my machine, I can do print(sum_tail(1, 996, harmonic)) without difficulty.
Sidebar: you really shouldn't name your naive implementation sum() because you wind up shadowing a builtin. I know you're just playing with recursion here, but you should also generally avoid writing your own implementation of sum() -- it's not provided just as a convenience function, but also because it's non-trivial to implement a numerically correct version of sum() with floating point.
In your naive implementation, the call to fun() returns before the recursive call -- this means that each recursive call increases the depth of the call stack by exactly 1. In the other case, you've got an additional function call that's waiting to be evaluated. For more details, you should look into how R handles closures and lazy / eager evaluation. If I recall correctly, R uses environments (roughly, R's notion of scope, and deeply related to closures) to wrap arguments in certain situations and delay their evaluation, thus effectively using lazy evaluation. There's a lot of information on R internals available online; see here for a quick overview of argument evaluation. I'm not sure how accurate I am on the details, but it seems that the arguments to the tail call are themselves getting placed on the call stack, thus increasing the depth of the call stack by more than 1.
Sidebar the Second: I don't recall well enough how R implements this, and I know placing helper functions in the body is common practice, but placing a helper function definition in the recursive call could lead to each recursive call defining the helper function anew. This could interact in various ways with the way environments and closures are handled, but I'm not sure.
The functions traceback() and trace() could be useful in exploring the call behavior if you're curious about more of the details.
Closures are poor man's objects and vice versa.
I have seen this statement in many places on the web (including SO), but I don't quite understand what it means. Could someone please explain what exactly it means?
If possible, please include examples in your answer.
Objects are poor man's closures.
Consider Java. Java is an object-oriented programming language with no language level support for real lexical closures. As a work-around Java programmers use anonymous inner classes that can close over the variables available in lexical scope (provided they're final). In this sense, objects are poor man's closures.
Closures are poor man's objects.
Consider Haskell. Haskell is a functional language with no language level support for real objects. However they can be modeled using closures, as described in this excellent paper by Oleg Kiselyov and Ralf Lammel. In this sense, closures are poor man's objects.
If you come from an OO background, you'll probably find thinking in terms of objects more natural, and may therefore think of them as a more fundamental concept than closures. If you come from a FP background, you might find thinking in terms of closures more natural, and may therefore think of them as a more fundamental concept than objects.
Moral of the story is that closures and objects are ideas that are expressible in terms of each other, and none is more fundamental than the other. That's all there is to the statement under consideration.
In philosophy, this is referred to as model dependent realism.
The point is that closures and objects accomplish the same goal: encapsulation of data and/or functionality in a single, logical unit.
For example, you might make a Python class that represents a dog like this:
class Dog(object):
    def __init__(self):
        self.breed = "Beagle"
        self.height = 12
        self.weight = 15
        self.age = 1

    def feed(self, amount):
        self.weight += amount / 5.0

    def grow(self):
        self.weight += 2
        self.height += .25

    def bark(self):
        print("Bark!")
And then I instantiate the class as an object
>>> Shaggy = Dog()
The Shaggy object has data and functionality built in. When I call Shaggy.feed(5), he gains a pound. That pound is stored in a variable that's an attribute of the object, which more or less means that it's in the object's internal scope.
If I was coding some Javascript, I'd do something similar:
var Shaggy = function() {
    var breed = "Beagle";
    var height = 12;
    var weight = 15;
    var age = 1;
    return {
        feed : function(amount){
            weight += amount / 5.0;
        },
        grow : function(){
            weight += 2;
            height += .25;
        },
        bark : function(){
            window.alert("Bark!");
        },
        stats : function(){
            window.alert(breed + "," + height + "," + weight + "," + age);
        }
    };
}();
Here, instead of creating a scope within an object, I've created a scope within a function and then called that function. The function returns a JavaScript object composed of some functions. Because those functions access data that was allocated in the local scope, the memory isn't reclaimed, allowing you to continue to use them through the interface provided by the closure.
An object, at its simplest, is just a collection of state and functions that operate on that state. A closure is also a collection of state and a function that operates on that state.
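A minimal Python sketch of that parallel (the names are invented for the example):

# State plus functions packaged as an object...
class Counter:
    def __init__(self):
        self.count = 0
    def increment(self):
        self.count += 1
        return self.count

# ...and the same state plus function packaged as a closure.
def make_counter():
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

c = Counter()
inc = make_counter()
print(c.increment(), inc())   # 1 1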
Let's say I call a function that takes a callback. In this callback, I need to operate on some state known before the function call. I can create an object that embodies this state ("fields") and contains a member function ("method") that performs as the callback. Or, I could take the quick and easy ("poor man's") route and create a closure.
As an object:
class CallbackState{
object state;
public CallbackState(object state){this.state = state;}
public void Callback(){
// do something with state
}
}
void Foo(){
object state = GenerateState();
CallbackState callback = new CallbackState(state);
PerformOperation(callback.Callback);
}
This is pseudo-C#, but is similar in concept to other OO languages. As you can see, there's a fair amount of boilerplate involved with the callback class to manage the state. This would be much simpler using a closure:
void Foo(){
object state = GenerateState();
PerformOperation(()=>{/*do something with state*/});
}
This is a lambda (again, in C# syntax, but the concept is similar in other languages that support closures) that gives us all the capabilities of the class, without having to write, use, and maintain a separate class.
You'll also hear the corollary: "objects are a poor man's closure". If I can't or won't take advantage of closures, then I am forced to do their work using objects, as in my first example. Although objects provide more functionality, closures are often a better choice where a closure will work, for the reasons already stated.
Hence, a poor man without objects can often get the job done with closures, and a poor man without closures can get the job done using objects. A rich man has both and uses the right one for each job.
EDITED: The title of the question does not include "vice versa" so I'll try not to assume the asker's intent.
The two common camps are functional vs imperative languages. Both are tools that can accomplish similar tasks in different ways with different sets of concerns.
Closures are poor man's objects.
Objects are poor man's closures.
Individually, each statement usually means the author has some bias, one way or another, usually rooted in their comfort with one language or class of languages and discomfort with another. If not bias, they may be constrained by one environment or the other. The authors I read who say this sort of thing are usually the zealot, purist or language-religious types. I avoid the language-religious types if possible.
Closures are poor man's objects. Objects are poor man's closures.
The author of that is a "pragmatist" and also pretty clever. It means the author appreciates both points of view and appreciates they are conceptually one and the same. This is my sort of fellow.
Just so much sugar, as closures hide anonymous objects under their skirts.
"Objects are a poor man's closures" isn't just a statement of some theoretical equivalence — it's a common Java idiom. It's very common to use anonymous classes to wrap up a function that captures the current state. Here's how it's used:
public void foo() {
final String message = "Hey ma, I'm closed over!";
SwingUtilities.invokeLater(new Runnable() {
public void run() {
System.out.println(message);
}
});
}
This even looks a lot like the equivalent code using a closure in another language. For example, using Objective-C blocks (since Objective-C is reasonably similar to Java):
void foo() {
NSString *message = @"Hey ma, I'm closed over!";
[[NSOperationQueue currentQueue] addOperationWithBlock:^{
printf("%s\n", [message UTF8String]);
}];
}
The only real difference is that the functionality is wrapped in the new Runnable() anonymous class instance in the Java version.
That objects can be used as a replacement for closures is quite easy to understand: you just place the captured state in the object and turn the call into a method. Indeed, C++ lambda closures, for example, are implemented as objects (things are sort of tricky for C++ because the language doesn't provide garbage collection, and true closures with mutable shared state are therefore hard to implement correctly because of the lifetime of the captured context).
The opposite (closures can be used as objects) is seen less often, but it's IMO a very powerful technique... consider for example (Python):
def P2d(x, y):
    def f(cmd, *args):
        nonlocal x, y
        if cmd == "x": return x
        if cmd == "y": return y
        if cmd == "set-x": x = args[0]
        if cmd == "set-y": y = args[0]
    return f
The function P2d returns a closure that has captured the two values x and y. The closure then provides access for reading and writing them using a command. For example
p = P2d(10, 20)
p("x") # --> 10
p("set-x", 99)
p("x") # --> 99
so the closure is behaving like an object; moreover, since every access goes through the command interface, it's very easy to implement delegation, inheritance, computed attributes, etc.
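For instance, delegation is only a few extra lines on top of the P2d above; a sketch (not production code):

def P3d(x, y, z):
    p2 = P2d(x, y)                    # reuse the existing closure for x/y
    def f(cmd, *args):
        nonlocal z
        if cmd == "z": return z
        if cmd == "set-z": z = args[0]
        else: return p2(cmd, *args)   # delegate everything else
    return f

p = P3d(1, 2, 30)
p("x")          # --> 1   (handled by the inner P2d)
p("set-z", 99)
p("z")          # --> 99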
The nice book "Let Over Lambda" builds on this idea using Lisp as the language, but any language that supports closures can use this technique (in Lisp the advantage is that you can also bend the syntax using macros and read macros to improve usability and automatically generate all the boilerplate code). The title of the book is exactly about this... a let wrapping a lambda:
(defun p2d (x y)
(let ((x x) (y y))
(lambda (cmd &rest args)
(cond
((eq cmd 'x) x)
((eq cmd 'y) y)
((eq cmd 'set-x) (setq x (first args)))
((eq cmd 'set-y) (setq y (first args)))))))
Actually I'm not sure I agree with the "poor" adjective in this approach.
I have to write, for academic purposes, an application that plots user-input expressions like: f(x) = 1 - exp(3^(5*ln(cosx)) + x)
The approach I've chosen to write the parser is to convert the expression to RPN with the Shunting-Yard algorithm, treating primitive functions like "cos" as unary operators. This means the function written above would be converted into a series of tokens like:
1, x, cos, ln, 5, *, 3, ^, exp, -
The problem is that to plot the function I have to evaluate it LOTS of times, so applying the stack evaluation algorithm for each input value would be very inefficient.
How can I solve this? Do I have to forget the RPN idea?
How much is "LOTS of times"? A million?
What kind of functions could be input? Can we assume they are continuous?
Did you try measuring how well your code performs?
(Sorry, started off with questions!)
You could try one of the two approaches (or both) described briefly below (there are probably many more):
1) Parse Trees.
You could create a Parse Tree. Then do what most compilers do to optimize expressions: constant folding, common subexpression elimination (which you could achieve by linking together the common expression subtrees and caching the result), etc.
Then you could use lazy evaluation techniques to avoid evaluating whole subtrees. For instance, if you have a tree
*
/ \
A B
where A evaluates to 0, you could completely avoid evaluating B as you know the result is 0. With RPN you would lose out on the lazy evaluation.
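A rough Python sketch of the idea (the node encoding here is invented just for illustration): multiplication checks its left operand first and skips the right subtree when it's zero:

def evaluate(node, x):
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "var":
        return x
    if kind == "mul":
        left = evaluate(node[1], x)
        if left == 0:
            return 0.0            # the whole right subtree is never visited
        return left * evaluate(node[2], x)
    if kind == "add":
        return evaluate(node[1], x) + evaluate(node[2], x)
    raise ValueError("unknown node kind: %r" % (kind,))

# evaluate(("mul", ("const", 0), some_huge_subtree), x) returns 0 immediately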
2) Interpolation
Assuming your function is continuous, you could approximate your function to a high degree of accuracy using Polynomial Interpolation. This way you can do the complicated calculation of the function a few times (based on the degree of polynomial you choose), and then do fast polynomial calculations for the rest of the time.
To create the initial set of data, you could just use approach 1 or just stick to using your RPN, as you would only be generating a few values.
So if you use Interpolation, you could keep your RPN...
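A minimal sketch of the interpolation route using NumPy's polyfit/polyval (the degree and sample count are arbitrary choices, and f stands for whatever your parser produces):

import numpy as np

def make_interpolant(f, lo, hi, degree=9, samples=50):
    xs = np.linspace(lo, hi, samples)
    ys = np.array([f(x) for x in xs])       # the few expensive evaluations
    coeffs = np.polyfit(xs, ys, degree)     # fit one polynomial to them
    return lambda x: np.polyval(coeffs, x)  # cheap to evaluate while plotting

approx = make_interpolant(np.cos, 0.0, 3.14)
print(approx(1.0))   # close to cos(1.0) ~ 0.5403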
Hope that helps!
Why reinvent the wheel? Use a fast scripting language instead.
Integrating something like lua into your code will take very little time and be very fast.
You'll usually be able to byte-compile your expression, and that should result in code that runs very fast, certainly fast enough for simple 1D graphs.
I recommend Lua as it's fast and integrates with C/C++ more easily than any other scripting language. Another good option would be Python, but while it's better known, I found it trickier to integrate.
Why not keep around a parse tree (I use "tree" loosely, in your case it's a sequence of operations), and mark input variables accordingly? (e.g. for inputs x, y, z, etc. annotate "x" with 0 to signify the first input variable, "y" with 1 to signify the 2nd input variable, etc.)
That way you can parse the expression once, keep the parse tree, take in an array of inputs, and apply the parse tree to evaluate.
If you're worrying about the performance aspects of the evaluation step (vs. the parsing step), I don't think you'd do much better unless you get into vectorizing (applying your parse tree on a vector of inputs at once) or hard-coding the operations into a fixed function.
What I do is use the shunting algorithm to produce the RPN. I then "compile" the RPN into a tokenised form that can be executed (interpretively) repeatedly without re-parsing the expression.
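Not the author's exact scheme, but roughly what that shape looks like in Python: the token list is produced once, and only the cheap stack loop runs per plotted point (the token format is invented for the example):

import math

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "/": lambda a, b: a / b,
       "^": lambda a, b: a ** b}
FUNCS = {"cos": math.cos, "ln": math.log, "exp": math.exp}

def eval_rpn(tokens, x):
    stack = []
    for tok in tokens:
        if tok == "x":
            stack.append(x)
        elif tok in FUNCS:
            stack.append(FUNCS[tok](stack.pop()))
        elif tok in OPS:
            b, a = stack.pop(), stack.pop()
            stack.append(OPS[tok](a, b))
        else:
            stack.append(float(tok))   # numeric literal
    return stack[0]

tokens = ["1", "2", "x", "*", "+"]     # parsed once from "1 + 2*x"
print(eval_rpn(tokens, 3.0))           # 7.0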
Michael Anderson suggested Lua. If you want to try Lua for just this task, see my ae library.
Inefficient in what sense? There's machine time and programmer time. Is there a standard for how fast it needs to run with a particular level of complexity? Is it more important to finish the assignment and move on to the next one (perfectionists sometimes never finish)?
All those steps have to happen for each input value. Yes, you could have a heuristic that scans the list of operations and cleans it up a bit. Yes, you could compile some of it down to assembly instead of calling +, * etc. as high level functions. You can compare vectorization (doing all the +'s then all the *'s etc, with a vector of values) to doing the whole procedure for one value at a time. But do you need to?
I mean, what do you think happens if you plot a function in gnuplot or Mathematica?
Your simple interpretation of RPN should work just fine, especially since it contains
math library functions like cos, exp, and ^(pow, involving logs)
symbol table lookup
Hopefully, your symbol table (with variables like x in it) will be short and simple.
The library functions will most likely be your biggest time-takers, so unless your interpreter is poorly written, it will not be a problem.
If, however, you really gotta go for speed, you could translate the expression into C code, compile and link it into a dll on-the-fly and load it (takes about a second). That, plus memoized versions of the math functions, could give you the best performance.
P.S. For parsing, your syntax is pretty vanilla, so a simple recursive-descent parser (about a page of code, O(n) same as shunting-yard) should work just fine. In fact, you might just be able to compute the result as you parse (if math functions are taking most of the time), and not bother with parse trees, RPN, any of that stuff.
I think this RPN based library can serve the purpose: http://expressionoasis.vedantatree.com/
I used it with one of my calculator projects and it works well. It is small and simple, but extensible.
One optimization would be to replace the stack with an array of values and implement the evaluator as a three-address machine where each operation loads from two (or one) locations and saves to a third. This can make for very tight code:
struct Op {
    enum {
        add, sub, mul, div,
        cos, sin, tan,
        //....
    } op;
    int a, b, d;
};

void go(Op* ops, int n, float* v) {
    for (int i = 0; i < n; i++) {
        switch (ops[i].op) {
            case Op::add: v[ops[i].d] = v[ops[i].a] + v[ops[i].b]; break;
            case Op::sub: v[ops[i].d] = v[ops[i].a] - v[ops[i].b]; break;
            case Op::mul: v[ops[i].d] = v[ops[i].a] * v[ops[i].b]; break;
            case Op::div: v[ops[i].d] = v[ops[i].a] / v[ops[i].b]; break;
            //...
        }
    }
}
The conversion from RPN to 3-address should be easy as 3-address is a generalization.
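That conversion can be sketched with a stack of value-slot indices (Python here purely for illustration; the slot allocation is deliberately simplistic, one new slot per result):

def rpn_to_3addr(tokens, binary_ops=("add", "sub", "mul", "div")):
    ops, stack, next_slot = [], [], 0
    for tok in tokens:
        if tok in binary_ops:
            b, a = stack.pop(), stack.pop()
            ops.append((tok, a, b, next_slot))          # op, src a, src b, dest
        else:
            ops.append(("load", tok, None, next_slot))  # constant or variable
        stack.append(next_slot)
        next_slot += 1
    return ops

# rpn_to_3addr(["x", "x", "mul", "1", "add"]) ->
# [('load', 'x', None, 0), ('load', 'x', None, 1),
#  ('mul', 0, 1, 2), ('load', '1', None, 3), ('add', 2, 3, 4)]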