How are functions curried? - pointers

I understand what the concept of currying is, and know how to use it. These are not my questions, rather I am curious as to how this is actually implemented at some lower level than, say, Haskell code.
For example, when (+) 2 4 is curried, is a pointer to the 2 maintained until the 4 is passed in? Does Gandalf bend space-time? What is this magic?

Short answer: yes a pointer is maintained to the 2 until the 4 is passed in.
Longer than necessary answer:
Conceptually, you're supposed to think about Haskell being defined in terms of the lambda calculus and term rewriting. Lets say you have the following definition:
f x y = x + y
This definition for f comes out in lambda calculus as something like the following, where I've explicitly put parentheses around the lambda bodies:
\x -> (\y -> (x + y))
If you're not familiar with the lambda calculus, this basically says "a function of an argument x that returns (a function of an argument y that returns (x + y))". In the lambda calculus, when we apply a function like this to some value, we can replace the application of the function by a copy of the body of the function with the value substituted for the function's parameter.
So then the expression f 1 2 is evaluated by the following sequence of rewrites:
(\x -> (\y -> (x + y))) 1 2
(\y -> (1 + y)) 2 # substituted 1 for x
(1 + 2) # substituted 2 for y
3
So you can see here that if we'd only supplied a single argument to f, we would have stopped at \y -> (1 + y). So we've got a whole term that is just a function for adding 1 to something, entirely separate from our original term, which may still be in use somewhere (for other references to f).
The key point is that if we implement functions like this, every function has only one argument but some return functions (and some return functions which return functions which return ...). Every time we apply a function we create a new term that "hard-codes" the first argument into the body of the function (including the bodies of any functions this one returns). This is how you get currying and closures.
Now, that's not how Haskell is directly implemented, obviously. Once upon a time, Haskell (or possibly one of its predecessors; I'm not exactly sure on the history) was implemented by Graph reduction. This is a technique for doing something equivalent to the term reduction I described above, that automatically brings along lazy evaluation and a fair amount of data sharing.
In graph reduction, everything is references to nodes in a graph. I won't go into too much detail, but when the evaluation engine reduces the application of a function to a value, it copies the sub-graph corresponding to the body of the function, with the necessary substitution of the argument value for the function's parameter (but shares references to graph nodes where they are unaffected by the substitution). So essentially, yes partially applying a function creates a new structure in memory that has a reference to the supplied argument (i.e. "a pointer to the 2), and your program can pass around references to that structure (and even share it and apply it multiple times), until more arguments are supplied and it can actually be reduced. However it's not like it's just remembering the function and accumulating arguments until it gets all of them; the evaluation engine actually does some of the work each time it's applied to a new argument. In fact the graph reduction engine can't even tell the difference between an application that returns a function and still needs more arguments, and one that has just got its last argument.
I can't tell you much more about the current implementation of Haskell. I believe it's a distant mutant descendant of graph reduction, with loads of clever short-cuts and go-faster stripes. But I might be wrong about that; maybe they've found a completely different execution strategy that isn't anything at all like graph reduction anymore. But I'm 90% sure it'll still end up passing around data structures that hold on to references to the partial arguments, and it probably still does something equivalent to factoring in the arguments partially, as it seems pretty essential to how lazy evaluation works. I'm also fairly sure it'll do lots of optimisations and short cuts, so if you straightforwardly call a function of 5 arguments like f 1 2 3 4 5 it won't go through all the hassle of copying the body of f 5 times with successively more "hard-coding".

Try it out with GHC:
ghc -C Test.hs
This will generate C code in Test.hc
I wrote the following function:
f = (+) 16777217
And GHC generated this:
R1.p[1] = (W_)Hp-4;
*R1.p = (W_)&stg_IND_STATIC_info;
Sp[-2] = (W_)&stg_upd_frame_info;
Sp[-1] = (W_)Hp-4;
R1.w = (W_)&integerzmgmp_GHCziInteger_smallInteger_closure;
Sp[-3] = 0x1000001U;
Sp=Sp-3;
JMP_((W_)&stg_ap_n_fast);
The thing to remember is that in Haskell, partially applying is not an unusual case. There's technically no "last argument" to any function. As you can see here, Haskell is jumping to stg_ap_n_fast which will expect an argument to be available in Sp.
The stg here stands for "Spineless Tagless G-Machine". There is a really good paper on it, by Simon Peyton-Jones. If you're curious about how the Haskell runtime is implemented, go read that first.

Related

Simple example of call-by-need

I'm trying to understand the theorem behind "call-by-need." I do understand the definition, but I'm a bit confused. I would like to see a simple example which shows how call-by-need works.
After reading some previous threads, I found out that Haskell uses this kind of evaluation. Are there any other programming languages which support this feature?
I read about the call-by-name of Scala, and I do understand that call-by-name and call-by-need are similar but different by the fact that call-by-need will keep the evaluated value. But I really would love to see a real-life example (it does not have to be in Haskell), which shows call-by-need.
The function
say_hello numbers = putStrLn "Hello!"
ignores its numbers argument. Under call-by-value semantics, even though an argument is ignored, the parameter at the function call site may need to be evaluated, perhaps because of side effects that the rest of the program depends on.
In Haskell, we might call say_hello as
say_hello [1..]
where [1..] is the infinite list of naturals. Under call-by-value semantics, the CPU would run off trying to build an infinite list and never get to the say_hello at all!
Haskell merely outputs
$ runghc cbn.hs
Hello!
For less dramatic examples, the first ten natural numbers are
ghci> take 10 [1..]
[1,2,3,4,5,6,7,8,9,10]
The first ten odds are
ghci> take 10 $ filter odd [1..]
[1,3,5,7,9,11,13,15,17,19]
Under call-by-need semantics, each value — even a conceptually infinite one as in the examples above — is evaluated only to the extent required and no more.
update: A simple example, as asked for:
ff 0 = 1
ff 1 = 1
ff n = go (ff (n-1))
where
go x = x + x
Under call-by-name, each invocation of go evaluates ff (n-1) twice, each for each appearance of x in its definition (because + is strict in both arguments, i.e. demands the values of the both of them).
Under call-by-need, go's argument is evaluated at most once. Specifically, here, x's value is found out only once, and reused for the second appearance of x in the expression x + x. If it weren't needed, x wouldn't be evaluated at all, just as with call-by-name.
Under call-by-value, go's argument is always evaluated exactly once, prior to entering the function's body, even if it isn't used anywhere in the function's body.
Here's my understanding of it, in the context of Haskell.
According to Wikipedia, "call by need is a memoized variant of call by name where, if the function argument is evaluated, that value is stored for subsequent uses."
Call by name:
take 10 . filter even $ [1..]
With one consumer the produced value disappears after being produced so it might as well be call-by-name.
Call by need:
import qualified Data.List.Ordered as O
h = 1 : map (2*) h <> map (3*) h <> map (5*) h
where
(<>) = O.union
The difference is, here the h list is reused by several consumers, at different tempos, so it is essential that the produced values are remembered. In a call-by-name language there'd be much replication of computational effort here because the computational expression for h would be substituted at each of its occurrences, causing separate calculation for each. In a call-by-need--capable language like Haskell the results of computing the elements of h are shared between each reference to h.
Another example is, most any data defined by fix is only possible under call-by-need. With call-by-value the most we can have is the Y combinator.
See: Sharing vs. non-sharing fixed-point combinator and its linked entries and comments (among them, this, and its links, like Can fold be used to create infinite lists?).

Prolog arithmetic in foreach

So I'm learning Prolog. One of the things I've found to be really obnoxious is demonstrated in the following example:
foreach(
between(1,10,X),
somePredicate(X,X+Y,Result)
).
This does not work. I am well aware that X+Y is not evaluated, here, and instead I'd have to do:
foreach(
between(1,10,X),
(
XPlusY is X + Y,
somePredicate(X, XPlusY, Result)
)
).
Except, that doesn't work, either. As near as I can tell, the scope of XPlusY extends outside of foreach - i.e., XPlusY is 1 + Y, XPlusY is 2 + Y, etc. must all be true AT ONCE, and there is no XPlusY for which that is the case. So I have to do the following:
innerCode(X, Result) :-
XPlusY is X + Y,
somePredicate(X, XPlusY, Result).
...
foreach(
between(1,10,X),
innerCode(X, Result)
).
This, finally, works. (At least, I think so. I haven't tried this exact code, but this was the path I took earlier from "not working" to "working".) That's fine and all, except that it's exceptionally obnoxious. If I had a way of evaluating arithmetic operations in-line, I could halve the lines of code, make it more readable, and NOT create a one-use clutter predicate.
Question: Is there a way to evaluate arithmetic operations in-line, without declaring a new variable?
Failing that, it would be acceptable (and, in some cases, still useful for other things) if there were a way to restrict the scope of new variables. Suppose, for instance, you could define a block within the foreach, where the variables visible from the outside were marked, and any other variables in the block were considered new for that execution of the block. (I realize my terminology may be incorrect, but hopefully it gets the point across.) For example, something resembling:
foreach(
between(1,10,X),
(X, Result){
XPlusY is X + Y,
somePredicate(X, XPlusY, Result)
}
).
A possible solution might be if we can declare a lambda in-line, and immediately call it. Summed up:
Alternate question: Is there a way to limit the scope of new variables within a predicate, while retaining the ability to perform lasting unifications on one or more existing variables?
(The second half I added as clarification in response to an answer about forall.)
A solution to both questions is preferred, but a solution to either will suffice.
library(yall) allows you to define lambda expressions. For instance
?- foreach(between(1,3,X),call([Y]>>(Z is Y+1,writeln(Z)),X)).
2
3
4
true.
Alternatively, library(lambda) provides the construct:
?- [library(lambda)].
true.
?- foreach(between(1,3,X),call(\Y^(Z is Y+1,writeln(Z)),X)).
2
3
4
true.
In SWI-Prolog, library(yall) is autoloaded, while to get library(lambda) you should install the related pack:
?- pack_install(lambda).
Use in alternative the forall/2 de facto standard predicate:
forall(
between(1,10,X),
somePredicate(X,X+Y,Result)
).
While the foreach/2 predicate is usually implemented in the way you describe, the forall/2 predicate is defined as:
% forall(#callable, #callable)
forall(Generate, Test) :-
\+ (Generate, \+ Test).
Note that the use of negation implies that no bindings will be returned when a call to the predicate succeeds.
Update
Lambda libraries allow the specification of both lambda global (aka lambda free) and lambda local variables (aka lambda parameters). Using Logtalk lambdas syntax (available also in SWI-Prolog in library(yall), you can write (reusing Carlo's example) e.g.
?- G = 2, foreach(between(1,3,X),call({G}/[Y]>>(Z is Y+G,writeln(Z)),X)).
3
4
5
G = 2.
?- G = 4, foreach(between(1,3,X),call({G}/[Y]>>(Z is Y+G,writeln(Z)),X)).
5
6
7
G = 4.
Thus, it's possible to use lambdas to limit the scope of some variables within a goal without also limiting the scope of every unification in the goal.

Functional "simultanity"?

At this link, functional programming is spoken of. Specifically, the author says this:
Simultaneity means that we assume a statement in lambda calculus is evaluated all at once. The trivial function:
λf(x) ::= x f(x)
defines an infinite sequence of whatever you plug in for x. The stepwise expansion looks like this:
0 - f(x)
1 - x f(x)
2 - x x f(x)
3 - x x x f(x)
The point is that we have to assume that the 'f()' and 'x' in step three million have the same meaning they did in step one.
At this point, those of you who know something about FP are muttering "referential transparency" under your collective breath. I know. I'll beat up on that in a minute. For now, just suspend your disbelief enough to admit that the constraint does exist, and the aardvark won't get hurt.
The problem with infinite expansions in a real-world computer is that.. well.. they're infinite. As in, "infinite loop" infinite. You can't evaluate every term of an infinite sequence before moving on to the next evaluation unless you're planning to take a really long coffee break while you wait for the answers.
Fortunately, theoretical logic comes to the rescue and tells us that preorder evaluation will always give us the same results as postorder evaluation.
More vocabulary.. need another function for this.. fortunately, it's a simple one:
λg(x) ::= x x
Now.. when we make the statement:
g(f(x))
Preorder evaluation says we have to expand f(x) completely before plugging it into g(). But that takes forever, which is.. inconvenient. Postorder evaluation says we can do this:
0 - g(f(x))
1 - f(x) f(x)
2 - x f(x) x f(x)
3 - x x f(x) x x f(x)
. . . could someone explain to me what is meant here? I haven't a clue what's being said. Maybe point me to a really good FP primer that would get me started.
(Warning, this answer is very long-winded. I thought it best to include general knowledge of lambda calculus because it is near impossible to find good explanations of it)
The author appears to be using the syntax λg(x) to mean a named function, rather than a traditional function in lambda calculus. The author also appears to be going on at length about how lambda calculus is not functional programming in the same way that a Turing machine isn't imperative programming. There's practicalities and ideals that exist with those abstractions that aren't present in the programming languages frequently used to represent them. But before getting into that, a primer on lambda calculus may help. In lambda calculus, all functions look like this:
λarg.body
That's it. There's a λ symbol (called "lambda", hence the name) followed by a named argument and only one named argument, then followed by a period, then followed by an expression that represents the body of the function. For instance, the identity function which takes anything and just returns it right back would look like this:
λx.x
And evaluating an expression is just a series of simple rules for swapping out functions and arguments with their body expressions. An expression has the form:
function-or-expression arg-or-expression
Reducing it usually has the rules "If the left thing is an expression, reduce it. Otherwise, it must be a function, so use arg-or-expression as the argument to the function, and replace this expression with the body of the function. It is very important to note that there is no requirement that the arg-or-expression be reduced before being used as an argument. That is, both of the following are equivalent and mathematically identical reductions of the expression λx.x (λy.y 0) (assuming you have some sort of definition for 0, because lambda calculus requires you define numbers as functions):
λx.x (λy.y 0)
=> λx.x 0
=> 0
λx.x (λy.y 0)
=> λy.y 0
=> 0
In the first reduction, the argument was reduced before being used in the λx.x function. In the second, the argument was merely substituted into the λx.x function body - it wasn't reduced before being used. When this concept is used in programming, it's called "lazy evaluation" - you don't actually evaluate (reduce) an expression until you need to. What's important to note is that in lambda calculus, it does not matter whether an argument is reduced or not before substitution. The mathematics of lambda calculus prove that you'll get the same result either way as long as both terminate. This is definitely not the case in programming languages, because all sorts of things (usually relating to a change in the program's state) can make lazy evaluation different from normal evaluation.
Lambda calculus needs some extensions to be useful however. There's no way to name things. Suppose we allowed that though. In particular, let's create our own definition of what a function looks like in lambda calculus:
λname(arg).body
We'll say this means that the function λarg.body is bound to name, and anywhere else in any accompanying lambda expressions we can replace name with λarg.body. So we could do this:
λidentity(x).x
And now when we write identity, we'll just replace it with λx.x. This introduces a problem however. What happens if a named function refers to itself?
λevil(x).(evil x)
Now we've got a problem. According to our rule, we should be able to replace the evil in the body with what the name is bound to. But since the name is bound to λx.(evil x), as soon as we try:
λevil(x).(evil x)
=> λevil(x).(λx.(evil x) x)
=> λevil(x).(λx.(λx.(evil x) x) x)
=> ...
We get an infinite loop. We can never evaluate this expression, because we have no way of turning it from our special named lambda form to a regular lambda expression. We can't go from the language with our special extension down to regular lambda calculus because we can't satisfy the rule of "replace evil with the function expression evil is bound to". There are some tricks for dealing with this, but we'll get to that in a minute.
An important point here is that this is completely different from a regular lambda calculus program that evaluates infinitely and never finishes. For instance, consider the self application function which takes something and applies it to itself:
λx.(x x)
If we evaluate this with the identity function, we get:
λx.(x x) λx.x
=> λx.x λx.x
=> λx.x
Using named functions and naming this function self:
self identity
=> identity identity
=> identity
But what happens if we pass self to itself?
λx.(x x) λx.(x x)
=> λx.(x x) λx.(x x)
=> λx.(x x) λx.(x x)
=> ...
We get an expression that loops into repeatedly reducing self self into self self over and over again. This is a plain old infinite loop you'd find in any (Turing-complete) programming language.
The difference between this and our problem with recursive definitions is that our names and definitions are not lambda calculus. They are shorthands which we can expand to lambda calculus by following some rules. But in the case of λevil(x).(evil x), we can't expand it to lambda calculus so we don't even get a lambda calculus expression to run. Our named function "fails to compile" in a sense, similar to when you send the programming language compiler into an infinite loop and your code never even starts as opposed to when the actual runtime loops. (Yes, it is entirely possible to make the compiler get caught in an infinite loop.)
There are some very clever ways to get around this problem, one of which is the infamous Y-combinator. The basic idea is you take our problematic evil function and change it to instead of accepting an argument and trying to be recursive, accepts an argument and returns another function that accepts an argument, so your body expression has two arguments to work with:
λevil(f).λy.(f y)
If we evaluate evil identity, we'll get a new function that takes an argument and just calls identity with it. The following evaluation shows first the name replacement using ->, then the reduction using =>:
(evil identity) 0
-> (λf.λy.(f y) identity) 0
-> (λf.λy.(f y) λx.x) 0
=> λy.(λx.x y) 0
=> λx.x 0
=> 0
Where things get interesting is if we pass evil to itself instead of identity:
(evil evil) 0
-> (λf.λy.(f y) λf.λy.(f y)) 0
=> λy.(λf.λy.(f y) y) 0
=> λf.λy.(f y) 0
=> λy.(0 y)
We ended up with a function that's complete nonsense, but we achieved something important - we created one level of recursion. If we were to evaluate (evil (evil evil)), we would get two levels. With (evil (evil (evil evil))), three. So what we need to do is instead of passing evil to itself, we need to pass a function that somehow accomplishes this recursion for us. In particular, it should be a function with some sort of self application. What we want is the Y-combinator:
λf.(λx.(f (x x)) λx.(f (x x)))
This function is pretty tricky to wrap your head around from the definition, so it's best to just call it Y and see what happens when we try and evaluate a few things with it:
Y evil
-> λf.(λx.(f (x x)) λx.(f (x x))) evil
=> λx.(evil (x x)) λx.(evil (x x))
=> evil (λx.(evil (x x))
λx.(evil (x x)))
=> evil (evil (λx.(evil (x x))
λx.(evil (x x))))
=> evil (evil (evil (λx.(evil (x x))
λx.(evil (x x)))))
And as we can see, this goes on infinitely. What we've done is taken evil, which accepts first one function and then accepts an argument and evaluates that argument using the function, and passed it a specially modified version of the evil function which expands to provide recursion. So we can create a "recursion point" in the evil function by reducing evil (Y evil). So now, whenever we see a named function using recursion like this:
λname(x).(.... some body containing (name arg) in it somewhere)
We can transform it to:
λname-rec(f).λx.(...... body with (name arg) replaced with (f arg))
λname(x).((name-rec (Y name-rec)) x)
We turn the function into a version that first accepts a function to use as a recursion point, then we provide the function Y name-rec as the function to use as the recursion point.
The reason this works, and getting waaaaay back to the original point of the author, is because the expression name-rec (Y name-rec) does not have to fully reduce Y name-rec before starting its own reduction. I cannot stress this enough. We've already seen that reducing Y name-rec results in an infinite loop, so the recursion works if there's some sort of condition in the name-rec function that means that the next step of Y name-rec might not need to be reduced.
This breaks down in many programming languages, including functional ones, because they do not support this kind of lazy evaluation. Additionally, almost all programming languages support mutation. That is, if you define a variable x = 3, later in the same code you can make x = 5 and all the old code that referred to x when it was 3 will now see x as being 5. This means your program could have completely different results if that old code is "delayed" with lazy evaluation and only calculated later on, because by then x could be 5. In a language where things can be arbitrarily executed in any order at any time, you have to completely eliminate your program's dependency on things like order of statements and time-changing values. If you don't, your program could calculate arbitrarily different results depending on what order your code gets run in.
However, writing code that has no sense of order in it whatsoever is extremely difficult. We saw how complicated lambda calculus got just trying to get our heads around trivial recursion. Therefore, most functional programming languages pick a model that systematically defines in what order things are evaluated in, and they never deviate from that model.
Racket, a dialect of Scheme, specifies that in the normal Racket language, all expressions are evaluated "eagerly" (no delaying) and all function arguments are evaluated eagerly from left to right, but the Racket program includes special forms that let you selectively make certain expressions lazy, such as (promise ...). Haskell does the opposite, with expressions defaulting to lazy evaluation and having the compiler run a "strictness analyser" to determine which expressions are needed by functions that are specially declared to need arguments to be eagerly evaluated.
The primary point being made seems to be that it's just too impractical to design a language that completely allows all expressions to be individually lazy or eager, because the limitations this poses on what tools you can use in the language are severe. Therefore, it's important to keep in mind what tools a functional language provides you for manipulating lazy expressions and eager expressions, because they are most certainly not equivalent in all practical functional programming languages.

Replacing functions with Table Lookups

I've been watching this MSDN video with Brian Beckman and I'd like to better understand something he says:
Every imperitive programmer goes through this phase of learning that
functions can be replaced with table lookups
Now, I'm a C# programmer who never went to university, so perhaps somewhere along the line I missed out on something everyone else learned to understand.
What does Brian mean by:
functions can be replaced with table lookups
Are there practical examples of this being done and does it apply to all functions? He gives the example of the sin function, which I can make sense of, but how do I make sense of this in more general terms?
Brian just showed that the functions are data too. Functions in general are just a mapping of one set to another: y = f(x) is mapping of set {x} to set {y}: f:X->Y. The tables are mappings as well: [x1, x2, ..., xn] -> [y1, y2, ..., yn].
If function operates on finite set (this is the case in programming) then it's can be replaced with a table which represents that mapping. As Brian mentioned, every imperative programmer goes through this phase of understanding that the functions can be replaced with the table lookups just for performance reason.
But it doesn't mean that all functions easily can or should be replaced with the tables. It only means that you theoretically can do that for every function. So the conclusion would be that the functions are data because tables are (in the context of programming of course).
There is a lovely trick in Mathematica that creates a table as a side-effect of evaluating function-calls-as-rewrite-rules. Consider the classic slow-fibonacci
fib[1] = 1
fib[2] = 1
fib[n_] := fib[n-1] + fib[n-2]
The first two lines create table entries for the inputs 1 and 2. This is exactly the same as saying
fibTable = {};
fibTable[1] = 1;
fibTable[2] = 1;
in JavaScript. The third line of Mathematica says "please install a rewrite rule that will replace any occurrence of fib[n_], after substituting the pattern variable n_ with the actual argument of the occurrence, with fib[n-1] + fib[n-2]." The rewriter will iterate this procedure, and eventually produce the value of fib[n] after an exponential number of rewrites. This is just like the recursive function-call form that we get in JavaScript with
function fib(n) {
var result = fibTable[n] || ( fib(n-1) + fib(n-2) );
return result;
}
Notice it checks the table first for the two values we have explicitly stored before making the recursive calls. The Mathematica evaluator does this check automatically, because the order of presentation of the rules is important -- Mathematica checks the more specific rules first and the more general rules later. That's why Mathematica has two assignment forms, = and :=: the former is for specific rules whose right-hand sides can be evaluated at the time the rule is defined; the latter is for general rules whose right-hand sides must be evaluated when the rule is applied.
Now, in Mathematica, if we say
fib[4]
it gets rewritten to
fib[3] + fib[2]
then to
fib[2] + fib[1] + 1
then to
1 + 1 + 1
and finally to 3, which does not change on the next rewrite. You can imagine that if we say fib[35], we will generate enormous expressions, fill up memory, and melt the CPU. But the trick is to replace the final rewrite rule with the following:
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
This says "please replace every occurrence of fib[n_] with an expression that will install a new specific rule for the value of fib[n] and also produce the value." This one runs much faster because it expands the rule-base -- the table of values! -- at run time.
We can do likewise in JavaScript
function fib(n) {
var result = fibTable[n] || ( fib(n-1) + fib(n-2) );
fibTable[n] = result;
return result;
}
This runs MUCH faster than the prior definition of fib.
This is called "automemoization" [sic -- not "memorization" but "memoization" as in creating a memo for yourself].
Of course, in the real world, you must manage the sizes of the tables that get created. To inspect the tables in Mathematica, do
DownValues[fib]
To inspect them in JavaScript, do just
fibTable
in a REPL such as that supported by Node.JS.
In the context of functional programming, there is the concept of referential transparency. A function that is referentially transparent can be replaced with its value for any given argument (or set of arguments), without changing the behaviour of the program.
Referential Transparency
For example, consider a function F that takes 1 argument, n. F is referentially transparent, so F(n) can be replaced with the value of F evaluated at n. It makes no difference to the program.
In C#, this would look like:
public class Square
{
public static int apply(int n)
{
return n * n;
}
public static void Main()
{
//Should print 4
Console.WriteLine(Square.apply(2));
}
}
(I'm not very familiar with C#, coming from a Java background, so you'll have to forgive me if this example isn't quite syntactically correct).
It's obvious here that the function apply cannot have any other value than 4 when called with an argument of 2, since it's just returning the square of its argument. The value of the function only depends on its argument, n; in other words, referential transparency.
I ask you, then, what the difference is between Console.WriteLine(Square.apply(2)) and Console.WriteLine(4). The answer is, there's no difference at all, for all intents are purposes. We could go through the entire program, replacing all instances of Square.apply(n) with the value returned by Square.apply(n), and the results would be the exact same.
So what did Brian Beckman mean with his statement about replacing function calls with a table lookup? He was referring to this property of referentially transparent functions. If Square.apply(2) can be replaced with 4 with no impact on program behaviour, then why not just cache the values when the first call is made, and put it in a table indexed by the arguments to the function. A lookup table for values of Square.apply(n) would look somewhat like this:
n: 0 1 2 3 4 5 ...
Square.apply(n): 0 1 4 9 16 25 ...
And for any call to Square.apply(n), instead of calling the function, we can simply find the cached value for n in the table, and replace the function call with this value. It's fairly obvious that this will most likely bring about a large speed increase in the program.

Understanding Lazy Evaluation in Haskell

I am trying to learn Haskell, but i am stuck in understanding lazy evaluation.
Can someone explain me lazy evaluation in detail and the output of the following 2 cases[with explaination] in relation to the below given
Pseudo Code:
x = keyboard input (5)
y = x + 3 (=8)
echo y (8)
x = keyboard input (2)
echo y
Case 1: Static binding, lazy evaluation
Case 2: Dynamic binding, lazy evaluation.
I need to know what will the last line (echo y) is going to print...in the above 2 cases.
Sorry this is way too long but...
I'm afraid the answer is going to depend a lot on the meaning of the words...
First, here's that code in Haskell (which uses static binding and lazy evaluation):
readInt :: String -> Int
readInt = read
main = do
x <- fmap readInt getLine
let y = x + 3
print y
x <- fmap readInt getLine
print y
It prints 8 and 8.
Now here's that code in R which uses lazy evaluation and what some people call
dynamic binding:
delayedAssign('x', as.numeric(readLines(n=1)))
delayedAssign('y', x + 3)
print(y)
delayedAssign('x', as.numeric(readLines(n=1)))
print(y)
It prints 8 and 8. Not so different!
Now in C++, which uses strict evaluation and static binding:
#include <iostream>
int main() {
int x;
std::cin >> x;
int y = x + 3;
std::cout << y << "\n";
std::cin >> x;
std::cout << y << "\n";
}
It prints 8 and 8.
Now let me tell you what I think the point of the question actually was ;)
"lazy evaluation" can mean many different things. In Haskell it has a very
particular meaning, which is that in nested expressions:
f (g (h x))
evaluation works as if f gets evaluated before g (h x), ie evaluation
goes "outside -> in". Practically this means that if f looks like
f x = 2
ie just throws away its argument, g (h x) never gets evaluated.
But I think that that is not where the question was going with "lazy
evaluation". The reason I think this is that:
+ always evaluates its arguments! + is the same whether you're using lazy
evaluation or not.
The only computation that could actually be delayed is keyboard input --
and that's not really computation, because it causes an action to occur;
that is, it reads from the user.
Haskell people would generally not call this "lazy evaluation" -- they would call
it lazy (or deferred) execution.
So what would lazy execution mean for your question? It would mean that the
action keyboard input gets delayed... until the value x is really really
needed. It looks to me like that happens here:
echo y
because at that point you must show the user a value, and so you must know what
x is! So what would happen with lazy execution and static binding?
x = keyboard input # nothing happens
y = x + 3 # still nothing happens!
echo y (8) # y becomes 8. 8 gets printed.
x = keyboard input (2) # nothing happens
echo y # y is still 8. 8 gets printed.
Now about this word "dynamic binding". It can mean different things:
Variable scope and lifetime is decided at run time. This is what languages
like R do that don't declare variables.
The formula for a computation (like the formula for y is x + 3) isn't
inspected until the variable is evaluated.
My guess is that that is what "dynamic binding" means in your question. Going
over the code again with dynamic binding (sense 2) and lazy execution:
x = keyboard input # nothing happens
y = x + 3 # still nothing happens!
echo y (8) # y becomes 8. 8 gets printed.
x = keyboard input (2) # nothing happens
echo y # y is already evaluated,
# so it uses the stored value and prints 8
I know of no language that would actually print 7 for the last line... but I
really think that's what the question was hoping would happen!
The key thing about lazy evaluation in Haskell is that it doesn't affect the output of your program at all. You can read it just as if everything were evaluated as soon as it is defined, and you'll still get the same result.
Lazy evaluation is just a strategy for figuring out the value of an expression in the program. There are many possible and they all give the same result[1]; any evaluation strategy that changes the meaning of the program wouldn't be a valid strategy!
So from a certain perspective, you don't have to understand lazy evaluation (yet) if it's giving you trouble. When you're learning Haskell, especially if it's your first functional and pure language, thinking about expressing yourself in this way is much more important. I would also rate training yourself to become comfortable with reading Haskell's (often quite dense) syntax as more important than fully "grokking" lazy evaluation. So don't worry about it too much if the concept gives you difficulty.
That said, my go at explaining it is below. I haven't used your examples, as they're not really affected by lazy evaluation, and Owen has talked more clearly than I can about dynamic binding and delayed execution wrt your example.
The most important difference between (valid) evaluation strategies is that some strategies can fail to return a result at all where another strategy might succeed. Lazy evaluation has the particular property that if any (valid) evaluation strategy can find a result, lazy evaluation will find it. In particular, programs that generate infinite data structures and then only use a finite amount of the data can terminate with lazy evaluation. In the strict evaluation you're probably used to, the program has to finish generating the infinite data structure before it can go on to use part of it, and of course it will.
The way lazy evaluation achieves this is by only evaluating something when it's needed to figure out what to do next. When you call a function that returns a list, it "returns" straight away and gives you a placeholder for the list. That placeholder can be passed to other functions, stored in other data structures, anything. Only when the program needs to know something about the list will it be actually evaluated, and only as far as needed.
Say the program now is going to do something different if the list is empty than if it is not. The the function call that originally returned the placeholder is evaluated a little bit further, to see if it returns an empty list or a list with a head element. Then the evaluation stops again, as the program now knows which way to go. If the rest of the list is never needed, it will never be evaluated.
But it's also not evaluated more times than needed. If the placeholder was passed into multiple functions (so it's now involved in other not-yet-evaluated function calls), or stored into several different data structures, Haskell still "knows" that they're all the same thing, and arranges for them all to "see" the effects of any further evaluation of the placeholder triggered from any of them. Eventually, if all of the list is needed somewhere, they'll all be pointing to an ordinary fully-evaluated data structure, and laziness has no further impact.
But the key thing to remember is that everything needed to produce that list is already determined and fixed when the placeholder was generated. It can't be affected by anything else that's happened in the program since. If that were not so, then Haskell would not be pure. And vice versa; impure languages can't have Haskell-style full laziness behind the scenes, because the results you would get could change dramatically depending on when in the future the results are needed. Instead, impure languages that support lazy evaluation tend to have it only for certain things explicitly declared by the programmer, with warnings in the manual saying "don't use laziness on something dependent on side effects".
[1] I lie a little here. Keep reading below the line to see why.
Lazy Evaluation in Haskell: Leftmost-Outermost + Graph Reduction
Square x = x * x
Square (Square 42)
(Square 42) * (Square 42) -> Square 42 will be computed only one time thanks to Graph Reduction
(42 * 42) * (Square 42)
(1764) * (Square 42) -> next is Graph Reduction
1764 * 1764
=3111696
Leftmost-innermost (Java, C++)
Square (Square 42)
square ( 42 * 42)
square ( 1764 )
1764 * 1764
=3111696

Resources