How to make Pre and Post conditions for recursive functions in SPARK? - recursion

I'm translating an exercise I made in Dafny into SPARK, where one verifies a tail recursive function against a recursive one. The Dafny source (censored, because it might still be used for classes):
function Sum(n:nat):nat
decreases n
{
if n==0 then n else n+Sum(n-1)
}
method ComputeSum(n:nat) returns (s:nat)
ensures s == Sum(n)
{
s := 0;
// ...censored...
}
What I got in SPARK so far:
function Sum (n : in Natural) return Natural
is
begin
if n = 0 then
return n;
else
return n + Sum(n - 1);
end if;
end Sum;
function ComputeSum(n : in Natural) return Natural
with
Post => ComputeSum'Result = Sum(n)
is
s : Natural := 0;
begin
-- ...censored...
return s;
end ComputeSum;
I cannot seem to figure out how to express the decreases n condition (which now that I think about it might be a little odd... but I got graded for it a few years back so who am I to judge, and the question remains how to get it done). As a result I get warnings of possible overflow and/or infinite recursion.
I'm guessing there is a pre or post condition to be added. Tried Pre => n <= 1 which obviously does not overflow, but I still get the warning. Adding Post => Sum'Result <= n**n on top of that makes the warning go away, but that condition gets a "postcondition might fail" warning, which isn't right, but guess the prover can't tell. Also not really the expression I should check against, but I cannot seem to figure what other Post I'm looking for. Possibly something very close to the recursive expression, but none of my attempts work. Must be missing out on some language construct...
So, how could I express the recursive constraints?
Edit 1:
Following links to this SO answer and this SPARK doc section, I tried this:
function Sum (n : in Natural) return Natural
is
(if n = 0 then 0 else n + Sum(n - 1))
with
Pre => (n in 0 .. 2),
Contract_Cases => (n = 0 => Sum'Result = 0,
n >= 1 => Sum'Result = n + Sum(n - 1)),
Subprogram_Variant => (Decreases => n);
However getting these warnings from SPARK:
spark.adb:32:30: medium: overflow check might fail [reason for check: result of addition must fit in a 32-bits machine integer][#0]
spark.adb:36:56: warning: call to "Sum" within its postcondition will lead to infinite recursion

If you want to prove that the result of some tail-recursive summation function equals the result of a given recursive summation function for some value N, then it should, in principle, suffice to only define the recursive function (as an expression function) without any post-condition. You then only need to mention the recursive (expression) function in the post-condition of the tail-recursive function (note that there was no post-condition (ensures) on the recursive function in Dafny either).
However, as one of SPARK's primary goal is to proof the absence of runtime errors, you must have to prove that overflow cannot occur and for this reason, you do need a post-condition on the recursive function. A reasonable choice for such a post-condition is, as #Jeffrey Carter already suggested in the comments, the explicit summation formula for arithmetic progression:
Sum (N) = N * (1 + N) / 2
The choice is actually very attractive as with this formula we can now also functionally validate the recursive function itself against a well-known mathematically explicit expression for computing the sum of a series of natural numbers.
Unfortunately, using this formula as-is will only bring you somewhere half-way. In SPARK (and Ada as well), pre- and post-conditions are optionally executable (see also RM 11.4.2 and section 5.11.1 in the SPARK Reference Guide) and must therefore themselves be free of any runtime errors. Therefore, using the formula as-is will only allow you to prove that no overflow occurs for any positive number up until
max N s.t. N * (1 + N) <= Integer'Last <-> N = 46340
as in the post-condition, the multiplication is not allowed to overflow either (note that Natural'Last = Integer'Last = 2**31 - 1).
To work around this, you'll need to make use of the big integers package that has been introduced in the Ada 202x standard library (see also RM A.5.6; this package is already included in GNAT CE 2021 and GNAT FSF 11.2). Big integers are unbounded and computations with these integers never overflow. Using these integers, one can proof that overflow will not occur for any positive number up until
max N s.t. N * (1 + N) / 2 <= Natural'Last <-> N = 65535 = 2**16 - 1
The usage of these integers in a post-condition is illustrated in the example below.
Some final notes:
The Subprogram_Variant aspect is only needed to prove that a recursive subprogram will eventually terminate. Such a proof of termination must be requested explicitly by adding an annotation to the function (also shown in the example below and as discussed in the SPARK documentation pointed out by #egilhh in the comments). The Subprogram_Variant aspect is, however, not needed for your initial purpose: proving that the result of some tail-recursive summation function equals the result of a given recursive summation function for some value N.
To compile a program that uses functions from the new Ada 202x standard library, use compiler option -gnat2020.
While I use a subtype to constrain the range of permissible values for N, you could also use a precondition. This should not make any difference. However, in SPARK (and Ada as well), it is in general considered to be a best practise to express contraints using (sub)types as much as possible.
Consider counterexamples as possible clues rather than facts. They may or may not make sense. Counterexamples are optionally generated by some solvers and may not make sense. See also the section 7.2.6 in the SPARK user’s guide.
main.adb
with Ada.Numerics.Big_Numbers.Big_Integers;
procedure Main with SPARK_Mode is
package BI renames Ada.Numerics.Big_Numbers.Big_Integers;
use type BI.Valid_Big_Integer;
-- Conversion functions.
function To_Big (Arg : Integer) return BI.Valid_Big_Integer renames BI.To_Big_Integer;
function To_Int (Arg : BI.Valid_Big_Integer) return Integer renames BI.To_Integer;
subtype Domain is Natural range 0 .. 2**16 - 1;
function Sum (N : Domain) return Natural is
(if N = 0 then 0 else N + Sum (N - 1))
with
Post => Sum'Result = To_Int (To_Big (N) * (1 + To_Big (N)) / 2),
Subprogram_Variant => (Decreases => N);
-- Request a proof that Sum will terminate for all possible values of N.
pragma Annotate (GNATprove, Terminating, Sum);
begin
null;
end Main;
output (gnatprove)
$ gnatprove -Pdefault.gpr --output=oneline --report=all --level=1 --prover=z3
Phase 1 of 2: generation of Global contracts ...
Phase 2 of 2: flow analysis and proof ...
main.adb:13:13: info: subprogram "Sum" will terminate, terminating annotation has been proved
main.adb:14:30: info: overflow check proved
main.adb:14:32: info: subprogram variant proved
main.adb:14:39: info: range check proved
main.adb:16:18: info: postcondition proved
main.adb:16:31: info: range check proved
main.adb:16:53: info: predicate check proved
main.adb:16:69: info: division check proved
main.adb:16:71: info: predicate check proved
Summary logged in [...]/gnatprove.out
ADDENDUM (in response to comment)
So you can add the post condition as a recursive function, but that does not help in proving the absence of overflow; you will still have to provide some upper bound on the function result in order to convince the prover that the expression N + Sum (N - 1) will not cause an overflow.
To check the absence of overflow during the addition, the prover will consider all possible values that Sum might return according to it's specification and see if at least one of those value might cause the addition to overflow. In the absence of an explicit bound in the post condition, Sum might, according to its return type, return any value in the range Natural'Range. That range includes Natural'Last and that value will definitely cause an overflow. Therefore, the prover will report that the addition might overflow. The fact that Sum never returns that value given its allowable input values is irrelevant here (that's why it reports might). Hence, a more precise upper bound on the return value is required.
If an exact upper bound is not available, then you'll typically fallback onto a more conservative bound like, in this case, N * N (or use saturation math as shown in the Fibonacci example from the SPARK user manual, section 5.2.7, but that approach does change your function which might not be desirable).
Here's an alternative example:
example.ads
package Example with SPARK_Mode is
subtype Domain is Natural range 0 .. 2**15;
function Sum (N : Domain) return Natural
with Post =>
Sum'Result = (if N = 0 then 0 else N + Sum (N - 1)) and
Sum'Result <= N * N; -- conservative upper bound if the closed form
-- solution to the recursive function would
-- not exist.
end Example;
example.adb
package body Example with SPARK_Mode is
function Sum (N : Domain) return Natural is
begin
if N = 0 then
return N;
else
return N + Sum (N - 1);
end if;
end Sum;
end Example;
output (gnatprove)
$ gnatprove -Pdefault.gpr --output=oneline --report=all
Phase 1 of 2: generation of Global contracts ...
Phase 2 of 2: flow analysis and proof ...
example.adb:8:19: info: overflow check proved
example.adb:8:28: info: range check proved
example.ads:7:08: info: postcondition proved
example.ads:7:45: info: overflow check proved
example.ads:7:54: info: range check proved
Summary logged in [...]/gnatprove.out

I landed in something that sometimes works, which I think is enough for closing the title question:
function Sum (n : in Natural) return Natural
is
(if n = 0 then 0 else n + Sum(n - 1))
with
Pre => (n in 0 .. 10), -- works with --prover=z3, not Default (CVC4)
-- Pre => (n in 0 .. 100), -- not working - "overflow check might fail, e.g. when n = 2"
Subprogram_Variant => (Decreases => n),
Post => ((n = 0 and then Sum'Result = 0)
or (n > 0 and then Sum'Result = n + Sum(n - 1)));
-- Contract_Cases => (n = 0 => Sum'Result = 0,
-- n > 0 => Sum'Result = n + Sum(n - 1)); -- warning: call to "Sum" within its postcondition will lead to infinite recursion
-- Contract_Cases => (n = 0 => Sum'Result = 0,
-- n > 0 => n + Sum(n - 1) = Sum'Result); -- works
-- Contract_Cases => (n = 0 => Sum'Result = 0,
-- n > 0 => Sum'Result = n * (n + 1) / 2); -- works and gives good overflow counterexamples for high n, but isn't really recursive
Command line invocation in GNAT Studio (Ctrl+Alt+F), --counterproof=on and --prover=z3 my additions to it:
gnatprove -P%PP -j0 %X --output=oneline --ide-progress-bar --level=0 -u %fp --counterexamples=on --prover=z3
Takeaways:
Subprogram_Variant => (Decreases => n) is required to tell the prover n decreases for each recursive invocation, just like the Dafny version.
Works inconsistently for similar contracts, see commented Contract_Cases.
Default prover (CVC4) fails, using Z3 succeeds.
Counterproof on fail makes no sense.
n = 2 presented as counterproof for range 0 .. 100, but not for 0 .. 10.
Possibly related to this mention in the SPARK user guide: However, note that since the counterexample is always generated only using CVC4 prover, it can just explain why this prover cannot prove the property.
Cleaning between changing options required, e.g. --prover.

Related

Can't prove seemingly trivial equality in Ada Spark

So I have those two files.
testing.ads
package Testing with
SPARK_Mode
is
function InefficientEuler1Sum2 (N: Natural) return Natural;
procedure LemmaForTesting with
Ghost,
Post => (InefficientEuler1Sum2(0) = 0);
end Testing;
and testing.adb
package body Testing with
SPARK_Mode
is
function InefficientEuler1Sum2 (N: Natural) return Natural is
Sum: Natural := 0;
begin
for I in 0..N loop
if I mod 3 = 0 then
Sum := Sum + I;
end if;
if I mod 5 = 0 then
Sum := Sum + I;
end if;
if I mod 15 = 0 then
Sum := Sum - I;
end if;
end loop;
return Sum;
end InefficientEuler1Sum2;
procedure LemmaForTesting
is
begin
null;
end LemmaForTesting;
end Testing;
When I run SPARK -> Prove File, I get such message:
GNATprove
E:\Ada\Testing SPARK\search\src\testing.ads
10:14 medium: postcondition might fail
cannot prove InefficientEuler1Sum2(0) = 0
Why is that so? What I have misunderstood or maybe done wrong?
Thanks in advance.
To prove the trivial equality, you need to make sure it is covered by the function's post-condition. If so, you can prove the equality using a simple Assert statement as shown in the example below. No lemma is needed at this point.
However, a post-condition is not enough to prove the absence of runtime errors (AoRTE): given the allowable input range of the function, the summation may, for some values of N, overflow. To mitigate the issue, you need to bound the input values of N and show the prover that the value for Sum remains bounded during the loop using a loop invariant (see here, here and here for some background information on loop invariants). For illustration purposes, I've chosen a conservative bound of (2 * I) * I, which will severely restrict the allowable range of input values, but does allow the prover to proof the absence of runtime errors in the example.
testing.ads
package Testing with SPARK_Mode is
-- Using the loop variant in the function body, one can guarantee that no
-- overflow will occur for all values of N in the range
--
-- 0 .. Sqrt (Natural'Last / 2) <=> 0 .. 32767
--
-- Of course, this bound is quite conservative, but it may be enough for a
-- given application.
--
-- The post-condition can be used to prove the trivial equality as stated
-- in your question.
subtype Domain is Natural range 0 .. 32767;
function Inefficient_Euler_1_Sum_2 (N : Domain) return Natural
with Post => (if N = 0 then Inefficient_Euler_1_Sum_2'Result = 0);
end Testing;
testing.adb
package body Testing with SPARK_Mode is
-------------------------------
-- Inefficient_Euler_1_Sum_2 --
-------------------------------
function Inefficient_Euler_1_Sum_2 (N : Domain) return Natural is
Sum: Natural := 0;
begin
for I in 0 .. N loop
if I mod 3 = 0 then
Sum := Sum + I;
end if;
if I mod 5 = 0 then
Sum := Sum + I;
end if;
if I mod 15 = 0 then
Sum := Sum - I;
end if;
-- Changed slightly since initial post, no effect on Domain.
pragma Loop_Invariant (Sum <= (2 * I) * I);
end loop;
return Sum;
end Inefficient_Euler_1_Sum_2;
end Testing;
main.adb
with Testing; use Testing;
procedure Main with SPARK_Mode is
begin
pragma Assert (Inefficient_Euler_1_Sum_2 (0) = 0);
end Main;
output
$ gnatprove -Pdefault.gpr -j0 --level=1 --report=all
Phase 1 of 2: generation of Global contracts ...
Phase 2 of 2: flow analysis and proof ...
main.adb:5:19: info: assertion proved
testing.adb:13:15: info: division check proved
testing.adb:14:24: info: overflow check proved
testing.adb:16:15: info: division check proved
testing.adb:17:24: info: overflow check proved
testing.adb:19:15: info: division check proved
testing.adb:20:24: info: overflow check proved
testing.adb:20:24: info: range check proved
testing.adb:23:33: info: loop invariant preservation proved
testing.adb:23:33: info: loop invariant initialization proved
testing.adb:23:42: info: overflow check proved
testing.adb:23:46: info: overflow check proved
testing.ads:17:19: info: postcondition proved
Summary logged in /obj/gnatprove/gnatprove.out

Update this statement for Julia 1.0

Could someone explain how to update this for Julia 1.0
function _encode_zigzag{T <: Integer}(n::T)
num_bits = sizeof(T) * 8
(n << 1) ⊻ (n >> (num_bits - 1))
end
And also what is the difference with:
function _encode_zigzag(n::Integer)
num_bits = sizeof(T) * 8
(n << 1) ⊻ (n >> (num_bits - 1))
end
Firstly, in Julia 1.x the subtype constraints on type parameters are stated after the parameters and followed by the reserved word where.
function _encode_zigzag(n::T) where {T <: Integer}
num_bits = sizeof(T) * 8
(n << 1) ⊻ (n >> (num_bits - 1))
end
The curly braces are unnecessary when there is only one type parameter but it is recommended to keep for clarity.
Now for the second question. In the version of your method where n is an Integer, sizeof will not work since the size of an abstract type is undefined. In this case, establishing the subtype constraint helps make sure that the given argument will have a defined size while still giving flexibility for different types. Julia will compile different versions of the function; one for each Integer subtype that gets passed.
This is more efficient than declaring the function with n having a concrete type like Int64, since this would mean the argument would have to be converted to the same type before executing the function.
You can read more of this in the Julia documentation.

Time and space complexity of a "nested" recursive function

In one of the previous intro to cs exams there was a question: calculate the space and time complexity of the function f1 as a function of n, assume that the time complexity of malloc(n) is O(1) and its space complexity is O(n).
int f1(int n) {
if(n < 3)
return 1;
int* arr = (int*) malloc(sizeof(int) * n);
f1(f1(n – 3));
free(arr);
return n;
}
The official solution is: time complexity: O(2^(n/3)), space complexity: O(n^2)
I tried to solve it but i didn't know how until i saw a note in my notebook that said: since the function returns n then we can treat f(f(n-3)) as f(n-3)+f(n-3) or as 2f(n-3). In this case the question becomes very similar to this one: Space complexity of recursive function
I tried solving it this way and i got the correct answer.
For the time complexity:
T(n)=2T(n-3)+1 , T(0)=1
T(n-3)=2T(n-3*2)+1
T(n)=2*2T(n-3*2)+2+1
T(n-3*2)=2T(n-3*3)+1
T(n)=2*2*2T(n-3*3)+2*2+2+1
...
T(n)=(2^k)T(n-3*k)+2^(k-1)+...+2^2+2+1
n-3*k=0
k=n/3
===> 2^(n/3)+...+2^2+2+1=2^(n/3)[1+(1/2)+(1/2^2)+...]=2^(n/3)*constant
Thus I got O(2^(n/3))
For the space complexity: the tree depth is n/3 and each time we do malloc so we get (n/3)^2 thus O(n^2).
My question:
why can we treat f1(f1(n – 3)) as f1(n-3)+f1(n-3) or as 2f1(n-3)?
if the function didn't return n but changed it, for example: return n/3 instead of return n, then how do we solve it? do we treat it as 2f1((n-3)/3)?
if we can't always treat f1(f1(n – 3)) as f1(n-3)+f1(n-3) or as 2f1(n-3) then how do we draw the recursion tree and how do we write and solve it using the induction method T(n)?
Why can we treat f1(f1(n – 3)) as f1(n-3)+f1(n-3) or as 2f1(n-3)?
Because i) the nested f1 is evaluated first, and its return value is used to call the outer f1; so these nested calls are equivalent to:
int result = f1(n - 3);
f1(result);
... and ii) the return value of f1 is just its argument (except for the base case, but it doesn't matter asymptotically), so the above is further equivalent to:
f1(n - 3);
f1(n - 3); // result = n - 3
If the function didn't return n but changed it, for example: return n/3 instead of return n, then how do we solve it? do we treat it as 2f1((n-3)/3)?
Only the outer call is affected. Again, using the equivalent expression from before:
f1(n - 3); // = (n - 3) / 3
f1((n - 3) / 3);
i.e. just f1(n - 3) + f1((n - 3) / 3) for your example.
If we can't always treat f1(f1(n – 3)) as f1(n-3)+f1(n-3) or as 2f1(n-3) then how do we draw the recursion tree and how do we write and solve it using the induction method T(n)?
You can always separate them into two separate calls as above, and again remember that only the second call is affected by the return result. If this is different to n - 3 then you would need a recursion tree instead of simple expansion. Depends on the specific problem, needless to say.

How can this imperative code be rewritten to be more functional?

I found an answer on SO that explained how to write a randomly weighted drop system for a game. I would prefer to write this code in a more functional-programming style but I couldn't figure out a way to do that for this code. I'll inline the pseudo code here:
R = (some random int);
T = 0;
for o in os
T = T + o.weight;
if T > R
return o;
How could this be written in a style that's more functional? I am using CoffeeScript and underscore.js, but I'd prefer this answer to be language agnostic because I'm having trouble thinking about this in a functional way.
Here are two more functional versions in Clojure and JavaScript, but the ideas here should work in any language that supports closures. Basically, we use recursion instead of iteration to accomplish the same thing, and instead of breaking in the middle we just return a value and stop recursing.
Original pseudo code:
R = (some random int);
T = 0;
for o in os
T = T + o.weight;
if T > R
return o;
Clojure version (objects are just treated as clojure maps):
(defn recursive-version
[r objects]
(loop [t 0
others objects]
(let [obj (first others)
new_t (+ t (:weight obj))]
(if (> new_t r)
obj
(recur new_t (rest others))))))
JavaScript version (using underscore for convenience).
Be careful, because this could blow out the stack.
This is conceptually the same as the clojure version.
var js_recursive_version = function(objects, r) {
var main_helper = function(t, others) {
var obj = _.first(others);
var new_t = t + obj.weight;
if (new_t > r) {
return obj;
} else {
return main_helper(new_t, _.rest(others));
}
};
return main_helper(0, objects);
};
You can implement this with a fold (aka Array#reduce, or Underscore's _.reduce):
An SSCCE:
items = [
{item: 'foo', weight: 50}
{item: 'bar', weight: 35}
{item: 'baz', weight: 15}
]
r = Math.random() * 100
{item} = items.reduce (memo, {item, weight}) ->
if memo.sum > r
memo
else
{item, sum: memo.sum + weight}
, {sum: 0}
console.log 'r:', r, 'item:', item
You can run it many times at coffeescript.org and see that the results make sense :)
That being said, i find the fold a bit contrived, as you have to remember both the selected item and the accumulated weight between iterations, and it doesn't short-circuit when the item is found.
Maybe a compromise solution between pure FP and the tedium of reimplementing a find algorithm can be considered (using _.find):
total = 0
{item} = _.find items, ({weight}) ->
total += weight
total > r
Runnable example.
I find (no pun intended) this algorithm much more accessible than the first one (and it should perform better, as it doesn't create intermediate objects, and it does short-circuiting).
Update/side-note: the second algorithm is not "pure" because the function passed to _.find is not referentially transparent (it has the side effect of modifying the external total variable), but the whole of the algorithm is referentially transparent. If you were to encapsulate it in a findItem = (items, r) -> function, the function will be pure and will always return the same output for the same input. That's a very important thing, because it means that you can get the benefits of FP while using some non-FP constructs (for performance, readability, or whatever reason) under the hoods :D
I think the underlying task is randomly selecting 'events' (objects) from array os with a frequency defined by their respective weights. The approach is to map (i.e. search) a random number (with uniform distribution) onto the stairstep cumulative probability distribution function.
With positive weights, their cumulative sum is increasing from 0 to 1. The code you gave us simply searches starting at the 0 end. To maximize speed with repeated calls, pre calculate sums, and order the events so the largest weights are first.
It really doesn't matter whether you search with iteration (looping) or recursion. Recursion is nice in a language that tries to be 'purely functional' but doesn't help understanding the underlying mathematical problem. And it doesn't help you package the task into a clean function. The underscore functions are another way of packaging the iterations, but don't change the basic functionality. Only any and all exit early when the target is found.
For small os array this simple search is sufficient. But with a large array, a binary search will be faster. Looking in underscore I find that sortedIndex uses this strategy. From Lo-Dash (an underscore dropin), "Uses a binary search to determine the smallest index at which the value should be inserted into array in order to maintain the sort order of the sorted array"
The basic use of sortedIndex is:
os = [{name:'one',weight:.7},
{name:'two',weight:.25},
{name:'three',weight:.05}]
t=0; cumweights = (t+=o.weight for o in os)
i = _.sortedIndex(cumweights, R)
os[i]
You can hide the cumulative sum calculation with a nested function like:
osEventGen = (os)->
t=0; xw = (t+=y.weight for y in os)
return (R) ->
i = __.sortedIndex(xw, R)
return os[i]
osEvent = osEventGen(os)
osEvent(.3)
# { name: 'one', weight: 0.7 }
osEvent(.8)
# { name: 'two', weight: 0.25 }
osEvent(.99)
# { name: 'three', weight: 0.05 }
In coffeescript, Jed Clinger's recursive search could be written like this:
foo = (x, r, t=0)->
[y, x...] = x
t += y
return [y, t] if x.length==0 or t>r
return foo(x, r, t)
An loop version using the same basic idea is:
foo=(x,r)->
t=0
while x.length and t<=r
[y,x...]=x # the [first, rest] split
t+=y
y
Tests on jsPerf http://jsperf.com/sortedindex
suggest that sortedIndex is faster when os.length is around 1000, but slower than the simple loop when the length is more like 30.

Which languages support *recursive* function literals / anonymous functions?

It seems quite a few mainstream languages support function literals these days. They are also called anonymous functions, but I don't care if they have a name. The important thing is that a function literal is an expression which yields a function which hasn't already been defined elsewhere, so for example in C, &printf doesn't count.
EDIT to add: if you have a genuine function literal expression <exp>, you should be able to pass it to a function f(<exp>) or immediately apply it to an argument, ie. <exp>(5).
I'm curious which languages let you write function literals which are recursive. Wikipedia's "anonymous recursion" article doesn't give any programming examples.
Let's use the recursive factorial function as the example.
Here are the ones I know:
JavaScript / ECMAScript can do it with callee:
function(n){if (n<2) {return 1;} else {return n * arguments.callee(n-1);}}
it's easy in languages with letrec, eg Haskell (which calls it let):
let fac x = if x<2 then 1 else fac (x-1) * x in fac
and there are equivalents in Lisp and Scheme. Note that the binding of fac is local to the expression, so the whole expression is in fact an anonymous function.
Are there any others?
Most languages support it through use of the Y combinator. Here's an example in Python (from the cookbook):
# Define Y combinator...come on Gudio, put it in functools!
Y = lambda g: (lambda f: g(lambda arg: f(f)(arg))) (lambda f: g(lambda arg: f(f)(arg)))
# Define anonymous recursive factorial function
fac = Y(lambda f: lambda n: (1 if n<2 else n*f(n-1)))
assert fac(7) == 5040
C#
Reading Wes Dyer's blog, you will see that #Jon Skeet's answer is not totally correct. I am no genius on languages but there is a difference between a recursive anonymous function and the "fib function really just invokes the delegate that the local variable fib references" to quote from the blog.
The actual C# answer would look something like this:
delegate Func<A, R> Recursive<A, R>(Recursive<A, R> r);
static Func<A, R> Y<A, R>(Func<Func<A, R>, Func<A, R>> f)
{
Recursive<A, R> rec = r => a => f(r(r))(a);
return rec(rec);
}
static void Main(string[] args)
{
Func<int,int> fib = Y<int,int>(f => n => n > 1 ? f(n - 1) + f(n - 2) : n);
Func<int, int> fact = Y<int, int>(f => n => n > 1 ? n * f(n - 1) : 1);
Console.WriteLine(fib(6)); // displays 8
Console.WriteLine(fact(6));
Console.ReadLine();
}
You can do it in Perl:
my $factorial = do {
my $fac;
$fac = sub {
my $n = shift;
if ($n < 2) { 1 } else { $n * $fac->($n-1) }
};
};
print $factorial->(4);
The do block isn't strictly necessary; I included it to emphasize that the result is a true anonymous function.
Well, apart from Common Lisp (labels) and Scheme (letrec) which you've already mentioned, JavaScript also allows you to name an anonymous function:
var foo = {"bar": function baz() {return baz() + 1;}};
which can be handier than using callee. (This is different from function in top-level; the latter would cause the name to appear in global scope too, whereas in the former case, the name appears only in the scope of the function itself.)
In Perl 6:
my $f = -> $n { if ($n <= 1) {1} else {$n * &?BLOCK($n - 1)} }
$f(42); # ==> 1405006117752879898543142606244511569936384000000000
F# has "let rec"
You've mixed up some terminology here, function literals don't have to be anonymous.
In javascript the difference depends on whether the function is written as a statement or an expression. There's some discussion about the distinction in the answers to this question.
Lets say you are passing your example to a function:
foo(function(n){if (n<2) {return 1;} else {return n * arguments.callee(n-1);}});
This could also be written:
foo(function fac(n){if (n<2) {return 1;} else {return n * fac(n-1);}});
In both cases it's a function literal. But note that in the second example the name is not added to the surrounding scope - which can be confusing. But this isn't widely used as some javascript implementations don't support this or have a buggy implementation. I've also read that it's slower.
Anonymous recursion is something different again, it's when a function recurses without having a reference to itself, the Y Combinator has already been mentioned. In most languages, it isn't necessary as better methods are available. Here's a link to a javascript implementation.
In C# you need to declare a variable to hold the delegate, and assign null to it to make sure it's definitely assigned, then you can call it from within a lambda expression which you assign to it:
Func<int, int> fac = null;
fac = n => n < 2 ? 1 : n * fac(n-1);
Console.WriteLine(fac(7));
I think I heard rumours that the C# team was considering changing the rules on definite assignment to make the separate declaration/initialization unnecessary, but I wouldn't swear to it.
One important question for each of these languages / runtime environments is whether they support tail calls. In C#, as far as I'm aware the MS compiler doesn't use the tail. IL opcode, but the JIT may optimise it anyway, in certain circumstances. Obviously this can very easily make the difference between a working program and stack overflow. (It would be nice to have more control over this and/or guarantees about when it will occur. Otherwise a program which works on one machine may fail on another in a hard-to-fathom manner.)
Edit: as FryHard pointed out, this is only pseudo-recursion. Simple enough to get the job done, but the Y-combinator is a purer approach. There's one other caveat with the code I posted above: if you change the value of fac, anything which tries to use the old value will start to fail, because the lambda expression has captured the fac variable itself. (Which it has to in order to work properly at all, of course...)
You can do this in Matlab using an anonymous function which uses the dbstack() introspection to get the function literal of itself and then evaluating it. (I admit this is cheating because dbstack should probably be considered extralinguistic, but it is available in all Matlabs.)
f = #(x) ~x || feval(str2func(getfield(dbstack, 'name')), x-1)
This is an anonymous function that counts down from x and then returns 1. It's not very useful because Matlab lacks the ?: operator and disallows if-blocks inside anonymous functions, so it's hard to construct the base case/recursive step form.
You can demonstrate that it is recursive by calling f(-1); it will count down to infinity and eventually throw a max recursion error.
>> f(-1)
??? Maximum recursion limit of 500 reached. Use set(0,'RecursionLimit',N)
to change the limit. Be aware that exceeding your available stack space can
crash MATLAB and/or your computer.
And you can invoke the anonymous function directly, without binding it to any variable, by passing it directly to feval.
>> feval(#(x) ~x || feval(str2func(getfield(dbstack, 'name')), x-1), -1)
??? Maximum recursion limit of 500 reached. Use set(0,'RecursionLimit',N)
to change the limit. Be aware that exceeding your available stack space can
crash MATLAB and/or your computer.
Error in ==> create#(x)~x||feval(str2func(getfield(dbstack,'name')),x-1)
To make something useful out of it, you can create a separate function which implements the recursive step logic, using "if" to protect the recursive case against evaluation.
function out = basecase_or_feval(cond, baseval, fcn, args, accumfcn)
%BASECASE_OR_FEVAL Return base case value, or evaluate next step
if cond
out = baseval;
else
out = feval(accumfcn, feval(fcn, args{:}));
end
Given that, here's factorial.
recursive_factorial = #(x) basecase_or_feval(x < 2,...
1,...
str2func(getfield(dbstack, 'name')),...
{x-1},...
#(z)x*z);
And you can call it without binding.
>> feval( #(x) basecase_or_feval(x < 2, 1, str2func(getfield(dbstack, 'name')), {x-1}, #(z)x*z), 5)
ans =
120
It also seems Mathematica lets you define recursive functions using #0 to denote the function itself, as:
(expression[#0]) &
e.g. a factorial:
fac = Piecewise[{{1, #1 == 0}, {#1 * #0[#1 - 1], True}}] &;
This is in keeping with the notation #i to refer to the ith parameter, and the shell-scripting convention that a script is its own 0th parameter.
I think this may not be exactly what you're looking for, but in Lisp 'labels' can be used to dynamically declare functions that can be called recursively.
(labels ((factorial (x) ;define name and params
; body of function addrec
(if (= x 1)
(return 1)
(+ (factorial (- x 1))))) ;should not close out labels
;call factorial inside labels function
(factorial 5)) ;this would return 15 from labels
Delphi includes the anonymous functions with version 2009.
Example from http://blogs.codegear.com/davidi/2008/07/23/38915/
type
// method reference
TProc = reference to procedure(x: Integer);
procedure Call(const proc: TProc);
begin
proc(42);
end;
Use:
var
proc: TProc;
begin
// anonymous method
proc := procedure(a: Integer)
begin
Writeln(a);
end;
Call(proc);
readln
end.
Because I was curious, I actually tried to come up with a way to do this in MATLAB. It can be done, but it looks a little Rube-Goldberg-esque:
>> fact = #(val,branchFcns) val*branchFcns{(val <= 1)+1}(val-1,branchFcns);
>> returnOne = #(val,branchFcns) 1;
>> branchFcns = {fact returnOne};
>> fact(4,branchFcns)
ans =
24
>> fact(5,branchFcns)
ans =
120
Anonymous functions exist in C++0x with lambda, and they may be recursive, although I'm not sure about anonymously.
auto kek = [](){kek();}
'Tseems you've got the idea of anonymous functions wrong, it's not just about runtime creation, it's also about scope. Consider this Scheme macro:
(define-syntax lambdarec
(syntax-rules ()
((lambdarec (tag . params) . body)
((lambda ()
(define (tag . params) . body)
tag)))))
Such that:
(lambdarec (f n) (if (<= n 0) 1 (* n (f (- n 1)))))
Evaluates to a true anonymous recursive factorial function that can for instance be used like:
(let ;no letrec used
((factorial (lambdarec (f n) (if (<= n 0) 1 (* n (f (- n 1)))))))
(factorial 4)) ; ===> 24
However, the true reason that makes it anonymous is that if I do:
((lambdarec (f n) (if (<= n 0) 1 (* n (f (- n 1))))) 4)
The function is afterwards cleared from memory and has no scope, thus after this:
(f 4)
Will either signal an error, or will be bound to whatever f was bound to before.
In Haskell, an ad hoc way to achieve same would be:
\n -> let fac x = if x<2 then 1 else fac (x-1) * x
in fac n
The difference again being that this function has no scope, if I don't use it, with Haskell being Lazy the effect is the same as an empty line of code, it is truly literal as it has the same effect as the C code:
3;
A literal number. And even if I use it immediately afterwards it will go away. This is what literal functions are about, not creation at runtime per se.
Clojure can do it, as fn takes an optional name specifically for this purpose (the name doesn't escape the definition scope):
> (def fac (fn self [n] (if (< n 2) 1 (* n (self (dec n))))))
#'sandbox17083/fac
> (fac 5)
120
> self
java.lang.RuntimeException: Unable to resolve symbol: self in this context
If it happens to be tail recursion, then recur is a much more efficient method:
> (def fac (fn [n] (loop [count n result 1]
(if (zero? count)
result
(recur (dec count) (* result count))))))

Resources