How to test if two functions are the same? - functional-programming

I found a code snippet somewhere online:
(letrec
([id (lambda (v) v)]
[ctx0 (lambda (v) `(k ,v))]
.....
.....
(if (memq ctx (list ctx0 id)) <---- condition always return false
.....
where ctx is also a function:
However I could never make the test-statement return true.
Then I have the following test:
(define ctx0 (lambda (v) `(k ,v)))
(define ctx1 (lambda (v) `(k ,v)))
(eq? ctx0 ctx1)
=> #f
(eqv? ctx0 ctx1)
=> #f
(equal? ctx0 ctx1)
=> #f
Which make me suspect that two function are always different since they have different memory location.
But if functions can be compared against other functions, how can I test if two function are the same? and what if they have different variable name? for example:
(lambda (x) (+ x 1)) and (lambda (y) (+ y 1))
P.S. I use DrRacket to test the code.

You can’t. Functions are treated as opaque values: they are only compared by identity, nothing more. This is by design.
But why? Couldn’t languages implement meaningful ways to compare functions that might sometimes be useful? Well, not really, but sometimes it’s hard to see why without elaboration. Let’s consider your example from your question—these two functions seem equivalent:
(define ctx0 (lambda (v) `(k ,v)))
(define ctx1 (lambda (v) `(k ,v)))
And indeed, they are. But what would comparing these functions for equality accomplish? After all, we could just as easily implement another function:
(define ctx2 (lambda (w) `(k ,w)))
This function is, for all intents and purposes, identical to the previous two, but it would fail a naïve equality check!
In order to decide whether or not two values are equivalent, we must define some algorithm that defines equality. Given the examples I’ve provided thus far, such an algorithm seems obvious: two functions should be considered equal if (and only if) they are α-equivalent. With this in hand, we can now meaningfully check if two functions are equal!
...right?
(define ctx3 (lambda (v) (list 'k v)))
Uh, oh. This function does exactly the same thing, but it’s not implemented exactly the same way, so it fails our equality check. Surely, though, we can fix this. Quasiquotation and using the list constructor are pretty much the same, so we can define them to be equivalent in most circumstances.
(define ctx4 (lambda (v) (reverse (list v 'k))))
Gah! That’s also operationally equivalent, but it still fails our equivalence algorithm. How can we possibly make this work?
Turns out we can’t, really. Functions are units of abstraction—by their nature, we are not supposed to need to know how they are implemented, only what they do. This means that function equality can really only be correctly defined in terms of operational equivalence; that is, the implementation doesn’t matter, only the behavior does.
This is an undecidable problem in any nontrivial language. It’s impossible to determine if any two functions are operationally equivalent because, if we could, we could solve the halting problem.
Programming languages could theoretically provide a best-effort algorithm to determine function equivalency, perhaps using α-equivalency or some other sort of metric. Unfortunately, this really wouldn’t be useful—depending on the implementation of a function rather than its behavior to determine the semantics of a program breaks a fundamental law of functional abstraction, and as such any program that depended on such a system would be an antipattern.
Function equality is a very tempting problem to want to solve when the simple cases seem so easy, but most languages take the right approach and don’t even try. That’s not to say it isn’t a useful idea: if it were possible, it would be incredibly useful! But since it isn’t, you’ll have to use a different tool for the job.

Semantically, two function f and g are equal if they agree for every input, i.e. if for all x, we have (= (f x) (g x)). Of course, there's no way to test that for every possible value of x.
If all you want to do is be reasonably confident that (lambda (x) (+ x 1)) and (lambda (y) (+ y 1)) are the same, then you might try asserting that
(map (lambda (x) (+ x 1)) [(-5) (-4) (-3) (-2) (-1) 0 1 2 3 4 5])
and
(map (lambda (y) (+ y 1)) [(-5) (-4) (-3) (-2) (-1) 0 1 2 3 4 5])
are the same in your unit tests.

Related

Reversing list vs non tail recursion when traversing lists

I wonder how do you, experienced lispers / functional programmers usually make decision what to use. Compare:
(define (my-map1 f lst)
(reverse
(let loop ([lst lst] [acc '()])
(if (empty? lst)
acc
(loop (cdr lst) (cons (f (car lst)) acc))))))
and
(define (my-map2 f lst)
(if (empty? lst)
'()
(cons (f (car lst)) (my-map2 f (cdr lst)))))
The problem can be described in the following way: whenever we have to traverse a list, should we collect results in accumulator, which preserves tail recursion, but requires list reversion in the end? Or should we use unoptimized recursion, but then we don't have to reverse anything?
It seems to me the first solution is always better. Indeed, there's additional complexity (O(n)) there. However, it uses much less memory, let alone calling a function isn't done instantly.
Yet I've seen different examples where the second approach was used. Either I'm missing something or these examples were only educational. Are there situations where unoptimized recursion is better?
When possible, I use higher-order functions like map which build a list under the hood. In Common Lisp I also tend to use loop a lot, which has a collect keyword for building list in a forward way (I also use the series library which also implements it transparently).
I sometimes use recursive functions that are not tail-recursive because they better express what I want and because the size of the list is going to be relatively small; in particular, when writing a macro, the code being manipulated is not usually very large.
For more complex problems I don't collect into lists, I generally accept a callback function that is being called for each solution. This ensures that the work is more clearly separated between how the data is produced and how it is used.
This approach is to me the most flexible of all, because no assumption is made about how the data should be processed or collected. But it also means that the callback function is likely to perform side-effects or non-local returns (see example below). I don't think it is particularly a problem as long the the scope of the side-effects is small (local to a function).
For example, if I want to have a function that generates all natural numbers between 0 and N-1, I write:
(defun range (n f)
(dotimes (i n)
(funcall f i)))
The implementation here iterates over all values from 0 below N and calls F with the value I.
If I wanted to collect them in a list, I'd write:
(defun range-list (N)
(let ((list nil))
(range N (lambda (v) (push v list)))
(nreverse list)))
But, I can also avoid the whole push/nreverse idiom by using a queue. A queue in Lisp can be implemented as a pair (first . last) that keeps track of the first and last cons cells of the underlying linked-list collection. This allows to append elements in constant time to the end, because there is no need to iterate over the list (see Implementing queues in Lisp by P. Norvig, 1991).
(defun queue ()
(let ((list (list nil)))
(cons list list)))
(defun qpush (queue element)
(setf (cdr queue)
(setf (cddr queue)
(list element))))
(defun qlist (queue)
(cdar queue))
And so, the alternative version of the function would be:
(defun range-list (n)
(let ((q (queue)))
(range N (lambda (v) (qpush q v)))
(qlist q)))
The generator/callback approach is also useful when you don't want to build all the elements; it is a bit like the lazy model of evaluation (e.g. Haskell) where you only use the items you need.
Imagine you want to use range to find the first empty slot in a vector, you could do this:
(defun empty-index (vector)
(block nil
(range (length vector)
(lambda (d)
(when (null (aref vector d))
(return d))))))
Here, the block of lexical name nil allows the anonymous function to call return to exit the block with a return value.
In other languages, the same behaviour is often reversed inside-out: we use iterator objects with a cursor and next operations. I tend to think it is simpler to write the iteration plainly and call a callback function, but this would be another interesting approach too.
Tail recursion with accumulator
Traverses the list twice
Constructs two lists
Constant stack space
Can crash with malloc errors
Naive recursion
Traverses list twice (once building up the stack, once tearing down the stack).
Constructs one list
Linear stack space
Can crash with stack overflow (unlikely in racket), or malloc errors
It seems to me the first solution is always better
Allocations are generally more time-expensive than extra stack frames, so I think the latter one will be faster (you'll have to benchmark it to know for sure though).
Are there situations where unoptimized recursion is better?
Yes, if you are creating a lazily evaluated structure, in haskell, you need the cons-cell as the evaluation boundary, and you can't lazily evaluate a tail recursive call.
Benchmarking is the only way to know for sure, racket has deep stack frames, so you should be able to get away with both versions.
The stdlib version is quite horrific, which shows that you can usually squeeze out some performance if you're willing to sacrifice readability.
Given two implementations of the same function, with the same O notation, I will choose the simpler version 95% of the time.
There are many ways to make recursion preserving iterative process.
I usually do continuation passing style directly. This is my "natural" way to do it.
One takes into account the type of the function. Sometimes you need to connect your function with the functions around it and depending on their type you can choose another way to do recursion.
You should start by solving "the little schemer" to gain a strong foundation about it. In the "little typer" you can discover another type of doing recursion, founded on other computational philosophy, used in languages like agda, coq.
In scheme you can write code that is actually haskell sometimes (you can write monadic code that would be generated by a haskell compiler as intermediate language). In that case the way to do recursion is also different that "usual" way, etc.
false dichotomy
You have other options available to you. Here we can preserve tail-recursion and map over the list with a single traversal. The technique used here is called continuation-passing style -
(define (map f lst (return identity))
(if (null? lst)
(return null)
(map f
(cdr lst)
(lambda (r) (return (cons (f (car lst)) r))))))
(define (square x)
(* x x))
(map square '(1 2 3 4))
'(1 4 9 16)
This question is tagged with racket, which has built-in support for delimited continuations. We can accomplish map using a single traversal, but this time without using recursion. Enjoy -
(require racket/control)
(define (yield x)
(shift return (cons x (return (void)))))
(define (map f lst)
(reset (begin
(for ((x lst))
(yield (f x)))
null)))
(define (square x)
(* x x))
(map square '(1 2 3 4))
'(1 4 9 16)
It's my intention that this post will show you the detriment of pigeonholing your mind into a particular construct. The beauty of Scheme/Racket, I have come to learn, is that any implementation you can dream of is available to you.
I would highly recommend Beautiful Racket by Matthew Butterick. This easy-to-approach and freely-available ebook shatters the glass ceiling in your mind and shows you how to think about your solutions in a language-oriented way.

some strategies to refactor my Common Lisp code

I'm Haruo. My pleasure is solving SPOJ in Common Lisp(CLISP). Today I solved Classical/Balk! but in SBCL not CLISP. My CLISP submit failed due to runtime error (NZEC).
I hope my code becomes more sophisticated. Today's problem is just a chance. Please the following my code and tell me your refactoring strategy. I trust you.
https://github.com/haruo-wakakusa/SPOJ-ClispAnswers/blob/0978813be14b536bc3402f8238f9336a54a04346/20040508_adrian_b.lisp
Haruo
Take for example get-x-depth-for-yz-grid.
(defun get-x-depth-for-yz-grid (planes//yz-plane grid)
(let ((planes (get-planes-including-yz-grid-in planes//yz-plane grid)))
(unless (evenp (length planes))
(error "error in get-x-depth-for-yz-grid"))
(sort planes (lambda (p1 p2) (< (caar p1) (caar p2))))
(do* ((rest planes (cddr rest)) (res 0))
((null rest) res)
(incf res (- (caar (second rest)) (caar (first rest)))))))
style -> ERROR can be replaced by ASSERT.
possible bug -> SORT is possibly destructive -> make sure you have a fresh list consed!. If it is already fresh allocated by get-planes-including-yz-grid-in, then we don't need that.
bug -> SORT returns a sorted list. The sorted list is possibly not a side-effect. -> use the returned value
style -> DO replaced with LOOP.
style -> meaning of CAAR unclear. Find better naming or use other data structures.
(defun get-x-depth-for-yz-grid (planes//yz-plane grid)
(let ((planes (get-planes-including-yz-grid-in planes//yz-plane grid)))
(assert (evenp (length planes)) (planes)
"error in get-x-depth-for-yz-grid")
(setf planes (sort (copy-list planes) #'< :key #'caar))
(loop for (p1 p2) on planes by #'cddr
sum (- (caar p2) (caar p1)))))
Some documentation makes a bigger improvement than refactoring.
Your -> macro will confuse sbcl’s type inference. You should have (-> x) expand into x, and (-> x y...) into (let (($ x)) (-> y...))
You should learn to use loop and use it in more places. dolist with extra mutation is not great
In a lot of places you should use destructuring-bind instead of eg (rest (rest )). You’re also inconsistent as sometimes you’d write (cddr...) for that instead.
Your block* suffers from many problems:
It uses (let (foo) (setf foo...)) which trips up sbcl type inference.
The name block* implies that the various bindings are scoped in a way that they may refer to those previously defined things but actually all initial value may refer to any variable or function name and if that variable has not been initialised then it evaluates to nil.
The style of defining lots of functions inside another function when they can be outside is more typical of scheme (which has syntax for it) than Common Lisp.
get-x-y-and-z-ranges really needs to use loop. I think it’s wrong too: the lists are different lengths.
You need to define some accessor functions instead of using first, etc. Maybe even a struct(!)
(sort foo) might destroy foo. You need to do (setf foo (sort foo)).
There’s basically no reason to use do. Use loop.
You should probably use :key in a few places.
You write defvar but I think you mean defparameter
*t* is a stupid name
Most names are bad and don’t seem to tell me what is going on.
I may be an idiot but I can’t tell at all what your program is doing. It could probably do with a lot of work

What are the merits of letrec as a programming language feature

I've looked at everything I can find about letrec, and I still don't understand what it brings to a language as a feature. It seems like everything expressible with letrec could just as easily be written as a recursive function. But are there any reasons to expose letrec as a feature of a programming language, if the language already supports recursive functions? Why do several languages expose both?
I get that letrec might be used to implement other features including recursive functions, but that's not relevant to why it should itself be a feature. I've also read that some people find it more readable than recursive functions in some lisps, but again this is not relevant, because the designer of the language can make an effort to make recursive functions readable enough to not need another feature. Finally, I've been told that letrec makes it possible to express some kinds of recursive values more succinctly, but I have yet to find a motivating example.
TL;DR: define is letrec. This is what enables us to write recursive defintions in the first place.
Consider
let fact = fun (n => (n==0 -> 1 ; n * fact (n-1)))
To what entity does the name fact inside the body of this definiton refer? With let foo = val, val is defined in terms of already known entities, so it can't refer to foo which is not defined yet. In terms of scope this can be said (and usually is) that the RHS of the let equation is defined in the outer scope.
The only way for the inner fact to actually point at the one being defined, is to use letrec, where the entity being defined is allowed to refer to the scope in which it is being defined. So while causing evaluation of an entity while its definition is in progress is an error, storing a reference to its (future, at this point in time) value is fine -- in the case of using letrec that is.
The define you refer to, is just letrec under another name. In Scheme as well.
Without the ability of an entity being defined to refer to itself, i.e. in languages with non-recursive let, to have recursion one has to resort to the use of arcane devices such as the y-combinator. Which is cumbersome and usually inefficient. Another way is the definitions like
let fact = (fun (f => f f)) (fun (r => n => (n==0 -> 1 ; n * r r (n-1))))
So letrec brings to the table the efficiency of implementation, and convenience for a programmer.
The quesion then becomes, why expose the non-recursive let? Haskell indeed does not. Scheme has both letrec and let. One reason might be for completeness. Another might be a simpler implementation for let, with less self-referential run-time structures in memory making it easier on the garbage collector.
You ask for a motivational example. Consider defining Fibonacci numbers as a self-referential lazy list:
letrec fibs = {0} + {1} + add fibs (tail fibs)
With non-recursive let another copy of the list fibs will be defined, to be used as the input to the element-wise addition function add. Which will cause the definition of another copy of fibs for this one to be defined in its terms. And so on; accessing the nth Fibonacci number will cause a chain of n-1 lists to be created and maintained at run-time! Not a pretty picture.
And that's assuming the same fibs was used for tail fibs as well. If not, all bets are off.
What is needed is that fibs uses itself, refers to itself, so only one copy of the list is maintained.
NB: Although this is not a Scheme specific problem I'm using Scheme to demonstrate the differences. Hope you can read a little lisp code
A letrec is just a special let where the bindings themselves are defined before the expressions that represent their values are evaluated. Imagine this:
(define (fib n)
(let ((fib (lambda (n a b)
(if (zero? n)
a
(fib (- n 1) b (+ a b))))))
(fib n))
This code fails since while fib does exist in the body of the let it does exist in the closure it defines since the binding didn't exist when the lambda was evaluated. To fix this letrec comes to the rescue:
(define (fib n)
(letrec ((fib (lambda (n a b)
(if (zero? n)
a
(fib (- n 1) b (+ a b))))))
(fib n))
That letrec is just syntax that does something like this:
(define (fib n)
(let ((fib 'undefined))
(let ((tmp (lambda (n a b)
(if (zero? n)
a
(fib (- n 1) b (+ a b))))))
(set! fib tmp))
(fib n)))
So here you clearly see fib exists when the lambda gets evaluated and the binding is later set to the closure itself. The binding is the same, only it's pointer has changed. It's circular reference 101..
So what happens when you make a global function? Clearly if it is to recurse it needs to exist before the lambda is evaluated or the environment has to be mutated. It needs to fix the same problem here too.
In a functional language implementation where mutation is not ok you can solve this problem with a Y (or Z) combinator.
If you are interested in how languages are implemented I suggest you start at Matt Mights articles.

out-mode parameters in scheme recursive function

Reader beware: Clueless about functional programming and even more clueless about Scheme.
I have a recursive function in Scheme. In the non-base-case portion, the function calls itself twice, comparing the two calls in an if statement. I need to return the result that is greater. So... what I'm currently doing is:
(if (> (recursive-call a b-1) (recursive-call a-1 b))
(recursive-call a b-1)
(recursive-call a-1 b))
Which obviously requires that I make 3 recursive calls instead of 2.
Is there a way to reference the value of the recursive calls from the if statement? I am not allowed to define additional functions or use let. I'm thinking it has to do with out-mode parameters but don't know how to use, assign, or access the value of an out-mode parameter. When I asked the professor, I was pointed to parameter passing methods in functional languages as well as the general process of a function return. It wasn't helpful in the least. I can't post the full code as it's an assignment for a class. Any chance this is enough for someone to point me in the right direction?
Note: The only constructs we're allowed are null?, car, cdr, else, lcm, +, >, if, parameters to the recursive function (which must be a list and numbers only), integer literals, and parenthesis. No use of max, define, or let, unfortunately.
Note: b-1 and a-1 are names for variables. If you wanted a subtraction you'd use (- a 1). I'll use (- a 1) in my answer, but if it really was a variable you can just replace it.
The obvious way to do this particular logic without restrictions is:
;; return the largest of the two
(max (recursive-call a (- b 1))
(recursive-call (- a 1) b))
The standard Scheme way would be to bind values you use more than once in variables using let so that you don't do the computation more than you need to:
;; cache computed values in local bindings
(let ((a (recursive-call a (- b 1)))
(b (recursive-call (- a 1) b)))
(if (> a b) a b))
Since you are restricted to not use either of those you can rewrite the let version to its primitive form. A let can be rewritten like this:
(let ((ba va) (bb vb))
...)
; ===
((lambda (ba bb)
...)
va
vb)
I guess you should be able to figure it out from here.

Why is it legal in a function definition to make self-call but illegal for a value?

Structure and Interpretation of Computer Programs (SICP) 3.5.2 introduces infinite streams:
(define ones
(cons-stream 1 ones))
This code doesn't work in DrRacket, with the error:
ones: undefined; cannot reference an identifier before its definition
Other code like this:
(define (integers-starting-from n)
(cons-stream n
(integers-starting-from (+ n 1))))
(define integers (integers-starting-from 1))
produce error:
Interactions disabled
(fall in infinite loop?)
So far as I read(SICP), the key point of implementing an infinite stream is delayed evaluation:
(define (delay exp)
(lambda () exp))
(define (cons-stream a b)
(cons a
(delay b)))
With this used in cons-stream, infinite stream still illegal.
Such data structure reminds me of recursive function in whose definition self-call is legal (in compiling) within or without an actual exit.
Why is it illegal for a value to reference itself? Even the reference has been delayed?
Could infinite stream supported by other programming language?
If not, is it about the way processor deal with assembly language? The data stack stuff?
When you make a procedure the arguments are already evaluated when you start the body of the procedure. Thus delay would not do anything since it's already computed at that time. cons-stream need to be a macro.
DrRacket is not one language implementation. It's an IDE that supports lots of languages. One of them is a SICP compatibility language. I mananged to run your code without errors using this code in DrRacket:
#!planet neil/sicp
(define ones
(cons-stream 1 ones))
(define (integers-starting-from n)
(cons-stream n
(integers-starting-from (+ n 1))))
(define integers (integers-starting-from 1))
It works like a charm. The default language in DrRacket, #!racket, has streams too, but the names are different:
#!racket
(define ones
(stream-cons 1 ones))
(define (integers-starting-from n)
(stream-cons n
(integers-starting-from (+ n 1))))
(define integers (integers-starting-from 1))
However for Scheme you should use SRFI-41 that you can use from both #!racket (using (require srfi/41)) and #!r6rs (and R7RS-large when it's finished)
(import (rnrs)
(srfi :41))
(define ones
(stream-cons 1 ones))
(define (integers-starting-from n)
(stream-cons n
(integers-starting-from (+ n 1))))
(define integers (integers-starting-from 1))
To roll your own SICP streams in both #!racket and #!r6rs you can use define-syntax
(define-syntax stream-cons
(syntax-rules ()
((_ a d) (cons a (delay d)))))
Note that RSFI-41 and rackets own stream library for #!racket delays both values in a stream-cons and not just the tail as in SICP version here.
It's because the expression in a define is evaluated before the name is bound to a value. It tries to evaluate (cons-stream 1 ones) before ones is defined, causing an error.
The reason this works fine for functions is that the body of a function is not evaluated when the function is. That is, to evaluate (lambda (x) (f x)), the language returns a function without looking at its body. Since
(define (f x) (f x))
is syntax sugar for defining a lambda, the same logic applies.
Did you define cons-stream yourself as above? If you make cons-stream a normal function it won't work properly. Since Scheme is strict by default, arguments are evaluated before the function gets called. If cons-stream is a normal function, b would get evaluated completely before being passed to delay, making you hit an infinite loop.
cons-stream in SICP is a "special form" rather than a function, which means that it can control how its arguments are evaluated.
If you use the stream-cons and other stream- operations built into Racket, you'd get the behavior you want.
Finally, some other languages do allow values to reference themselves. A great example is Haskell, where this works because everything is lazy by default. Here's a Haskell snippet defining ones to be an infinite list of ones. (Since everything is lazy, Haskell lists behave just like Scheme streams):
ones = 1 : ones
This definition works in Racket:
(define ones
(cons-stream 1 ones))
…As long as you provide a delayed implementation of cons-stream as a special form, which is the whole point of section 3.5 in SICP:
(define-syntax cons-stream
(syntax-rules ()
((_ head tail)
(cons head (delay tail)))))
Adding to the answers above, to make sure the code run, you should also define delay like this:
(define-syntax delay
(syntax-rules ()
((_ exp)
(lambda () exp))))
delay as well as cons-stream has to be defined as a macro.
In another option, you could just call the delay predefined in Racket rather than building a new one.
But don't do this:
(define (delay exp)
(lambda () exp))
The compiler would take your definition, thus the program crashes in evaluation:
Interactions disabled

Resources