some strategies to refactor my Common Lisp code - common-lisp

I'm Haruo. I enjoy solving SPOJ problems in Common Lisp (CLISP). Today I solved Classical/Balk!, but with SBCL rather than CLISP: my CLISP submission failed with a runtime error (NZEC).
I would like my code to become more sophisticated, and today's problem is a good opportunity. Please look at the following code and tell me your refactoring strategy. I trust you.
https://github.com/haruo-wakakusa/SPOJ-ClispAnswers/blob/0978813be14b536bc3402f8238f9336a54a04346/20040508_adrian_b.lisp
Haruo

Take for example get-x-depth-for-yz-grid.
(defun get-x-depth-for-yz-grid (planes//yz-plane grid)
  (let ((planes (get-planes-including-yz-grid-in planes//yz-plane grid)))
    (unless (evenp (length planes))
      (error "error in get-x-depth-for-yz-grid"))
    (sort planes (lambda (p1 p2) (< (caar p1) (caar p2))))
    (do* ((rest planes (cddr rest)) (res 0))
         ((null rest) res)
      (incf res (- (caar (second rest)) (caar (first rest)))))))
style -> ERROR can be replaced by ASSERT.
possible bug -> SORT is destructive and may modify its argument -> make sure you pass it a freshly consed list. If get-planes-including-yz-grid-in already returns a fresh list, the copy is not needed.
bug -> SORT returns the sorted list; the variable you passed in is not guaranteed to hold the sorted result afterwards -> use the returned value.
style -> DO replaced with LOOP.
style -> meaning of CAAR unclear. Find better naming or use other data structures.
(defun get-x-depth-for-yz-grid (planes//yz-plane grid)
  (let ((planes (get-planes-including-yz-grid-in planes//yz-plane grid)))
    (assert (evenp (length planes)) (planes)
            "error in get-x-depth-for-yz-grid")
    (setf planes (sort (copy-list planes) #'< :key #'caar))
    (loop for (p1 p2) on planes by #'cddr
          sum (- (caar p2) (caar p1)))))
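To try the review points in isolation, here is a small self-contained sketch. The plane representation (a list whose first element is a cons with the x coordinate in its car) is only guessed from the use of CAAR above, and plane-x, total-x-depth and *sample-planes* are hypothetical names that do not appear in the original program.

```lisp
;; Hypothetical accessor: gives CAAR a readable name, assuming each plane
;; looks like ((x . anything) ...).
(defun plane-x (plane)
  (caar plane))

;; Sum the x distance between consecutive pairs of planes, sorted by x.
;; COPY-LIST protects the caller's list from the destructive SORT, and the
;; value returned by SORT is the one that gets used.
(defun total-x-depth (planes)
  (let ((sorted (sort (copy-list planes) #'< :key #'plane-x)))
    (loop for (near far) on sorted by #'cddr
          sum (- (plane-x far) (plane-x near)))))

(defparameter *sample-planes*
  (list '((5 . a)) '((1 . b)) '((9 . c)) '((2 . d))))

;; Sorted x values are 1 2 5 9, paired up as (1 2) and (5 9).
(total-x-depth *sample-planes*)
```

Because the function sorts a copy, *sample-planes* keeps its original order after the call.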

Some documentation makes a bigger improvement than refactoring.
Your -> macro will confuse SBCL's type inference. You should have (-> x) expand into x, and (-> x y ...) into (let (($ x)) (-> y ...)).
You should learn to use loop and use it in more places. dolist with extra mutation is not great.
In a lot of places you should use destructuring-bind instead of e.g. (rest (rest ...)). You're also inconsistent, as you sometimes write (cddr ...) for that instead.
Your block* suffers from many problems:
It uses (let (foo) (setf foo ...)), which trips up SBCL's type inference.
The name block* implies that the bindings are scoped sequentially, so that each one may refer to those defined before it. In fact any initial value may refer to any variable or function name, and a variable that has not been initialised yet evaluates to nil.
The style of defining lots of functions inside another function when they could be outside is more typical of Scheme (which has syntax for it) than of Common Lisp.
get-x-y-and-z-ranges really needs to use loop. I think it’s wrong too: the lists are different lengths.
You need to define some accessor functions instead of using first, etc. Maybe even a struct(!)
(sort foo ...) might destroy foo. You need to write (setf foo (sort foo ...)).
There’s basically no reason to use do. Use loop.
You should probably use :key in a few places.
You write defvar but I think you mean defparameter
*t* is a stupid name
Most names are bad and don’t seem to tell me what is going on.
I may be an idiot but I can’t tell at all what your program is doing. It could probably do with a lot of work
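The expansion suggested for -> above can be sketched as a recursive macro. This is only a guess at the intended behaviour: the original macro is not shown here, and $ is assumed to be the placeholder each step uses to refer to the previous value. The ignorable declaration is an addition that is not part of the suggested expansion.

```lisp
;; Sketch of the suggested expansion:
;;   (-> x)        => x
;;   (-> x y ...)  => (let (($ x)) (-> y ...))
;; $ is captured on purpose so that later forms can refer to the
;; previous result. The IGNORABLE declaration only silences warnings
;; for steps that do not mention $.
(defmacro -> (first-form &rest more-forms)
  (if (null more-forms)
      first-form
      `(let (($ ,first-form))
         (declare (ignorable $))
         (-> ,@more-forms))))

;; 3, then (* 3 2), then (+ 6 1)
(-> 3 (* $ 2) (+ $ 1))
```

Each step gets a fresh let binding with an initial value, instead of one variable that is created empty and then assigned with setf.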

Related

Reversing list vs non tail recursion when traversing lists

I wonder how do you, experienced lispers / functional programmers usually make decision what to use. Compare:
(define (my-map1 f lst)
  (reverse
   (let loop ([lst lst] [acc '()])
     (if (empty? lst)
         acc
         (loop (cdr lst) (cons (f (car lst)) acc))))))
and
(define (my-map2 f lst)
  (if (empty? lst)
      '()
      (cons (f (car lst)) (my-map2 f (cdr lst)))))
The problem can be described in the following way: whenever we have to traverse a list, should we collect results in an accumulator, which preserves tail recursion but requires reversing the list at the end? Or should we use unoptimized recursion, in which case we don't have to reverse anything?
It seems to me the first solution is always better. Admittedly, the final reversal adds an extra O(n) pass. However, it uses much less memory, not to mention that calling a function is not free.
Yet I've seen different examples where the second approach was used. Either I'm missing something or these examples were only educational. Are there situations where unoptimized recursion is better?
When possible, I use higher-order functions like map, which build a list under the hood. In Common Lisp I also tend to use loop a lot, which has a collect keyword for building a list in forward order (I also use the series library, which implements this transparently).
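As a minimal illustration of those two options in Common Lisp (the function names are made up for this sketch), both versions build the result in forward order without an explicit reverse:

```lisp
;; MAPCAR builds the result list in order; no explicit reverse is needed.
(defun squares-with-mapcar (numbers)
  (mapcar (lambda (x) (* x x)) numbers))

;; LOOP's COLLECT clause adds each value at the end of the result list.
(defun squares-with-loop (numbers)
  (loop for x in numbers
        collect (* x x)))
```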
I sometimes use recursive functions that are not tail-recursive because they better express what I want and because the size of the list is going to be relatively small; in particular, when writing a macro, the code being manipulated is not usually very large.
For more complex problems I don't collect into lists, I generally accept a callback function that is being called for each solution. This ensures that the work is more clearly separated between how the data is produced and how it is used.
To me this approach is the most flexible of all, because no assumption is made about how the data should be processed or collected. But it also means that the callback function is likely to perform side-effects or non-local returns (see example below). I don't think this is particularly a problem as long as the scope of the side-effects is small (local to a function).
For example, if I want to have a function that generates all natural numbers between 0 and N-1, I write:
(defun range (n f)
  (dotimes (i n)
    (funcall f i)))
The implementation here iterates over all values from 0 below N and calls F with the value I.
If I wanted to collect them in a list, I'd write:
(defun range-list (N)
  (let ((list nil))
    (range N (lambda (v) (push v list)))
    (nreverse list)))
But I can also avoid the whole push/nreverse idiom by using a queue. A queue in Lisp can be implemented as a pair (first . last) that keeps track of the first and last cons cells of the underlying linked list. This makes it possible to append elements to the end in constant time, because there is no need to iterate over the list (see Implementing queues in Lisp by P. Norvig, 1991).
(defun queue ()
  (let ((list (list nil)))
    (cons list list)))
(defun qpush (queue element)
  (setf (cdr queue)
        (setf (cddr queue)
              (list element))))
(defun qlist (queue)
  (cdar queue))
And so, the alternative version of the function would be:
(defun range-list (n)
  (let ((q (queue)))
    (range N (lambda (v) (qpush q v)))
    (qlist q)))
The generator/callback approach is also useful when you don't want to build all the elements; it is a bit like the lazy model of evaluation (e.g. Haskell) where you only use the items you need.
Imagine you want to use range to find the first empty slot in a vector, you could do this:
(defun empty-index (vector)
  (block nil
    (range (length vector)
           (lambda (d)
             (when (null (aref vector d))
               (return d))))))
Here, the block of lexical name nil allows the anonymous function to call return to exit the block with a return value.
In other languages, the same behaviour is often turned inside out: we use iterator objects with a cursor and a next operation. I tend to think it is simpler to write the iteration plainly and call a callback function, but iterators would be another interesting approach too.
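For comparison, the iterator style can be sketched in Common Lisp with a closure that holds the cursor. The names make-range-iterator and empty-index-with-iterator are made up for this sketch, as is the two-value protocol for signalling the end of the range.

```lisp
;; Returns a closure acting as an iterator over 0 .. n-1.
;; Each call returns two values: the next index and T, or NIL and NIL
;; once the range is exhausted.
(defun make-range-iterator (n)
  (let ((cursor 0))
    (lambda ()
      (if (< cursor n)
          (let ((value cursor))
            (incf cursor)
            (values value t))
          (values nil nil)))))

;; Same job as empty-index above, but here the consumer drives the
;; iteration and decides when to stop asking for more values.
(defun empty-index-with-iterator (vector)
  (let ((next (make-range-iterator (length vector))))
    (loop
      (multiple-value-bind (index more-p) (funcall next)
        (cond ((not more-p) (return nil))
              ((null (aref vector index)) (return index)))))))
```

The callback version needs a non-local return to stop early, while the iterator version simply stops calling the closure.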
Tail recursion with accumulator:
  Traverses the list twice
  Constructs two lists
  Constant stack space
  Can crash with malloc errors
Naive recursion:
  Traverses the list twice (once building up the stack, once tearing down the stack)
  Constructs one list
  Linear stack space
  Can crash with stack overflow (unlikely in Racket) or malloc errors
"It seems to me the first solution is always better"
Allocations are generally more time-expensive than extra stack frames, so I think the latter one will be faster (you'll have to benchmark it to know for sure though).
"Are there situations where unoptimized recursion is better?"
Yes, if you are creating a lazily evaluated structure. In Haskell, you need the cons cell as the evaluation boundary, and you can't lazily evaluate a tail-recursive call.
Benchmarking is the only way to know for sure. Racket has deep stack frames, so you should be able to get away with both versions.
The stdlib version is quite horrific, which shows that you can usually squeeze out some performance if you're willing to sacrifice readability.
Given two implementations of the same function, with the same O notation, I will choose the simpler version 95% of the time.
There are many ways to write recursion that preserves an iterative process.
I usually write continuation-passing style directly. This is my "natural" way to do it.
One takes into account the type of the function. Sometimes you need to connect your function with the functions around it, and depending on their types you can choose another way to do recursion.
You should start by working through "The Little Schemer" to gain a strong foundation. In "The Little Typer" you can discover another kind of recursion, founded on a different computational philosophy and used in languages like Agda and Coq.
In Scheme you can sometimes write code that is actually Haskell (you can write the monadic code that a Haskell compiler would generate as an intermediate language). In that case the way to do recursion also differs from the "usual" way.
false dichotomy
You have other options available to you. Here we can preserve tail-recursion and map over the list with a single traversal. The technique used here is called continuation-passing style -
(define (map f lst (return identity))
  (if (null? lst)
      (return null)
      (map f
           (cdr lst)
           (lambda (r) (return (cons (f (car lst)) r))))))
(define (square x)
  (* x x))
(map square '(1 2 3 4))
'(1 4 9 16)
This question is tagged with racket, which has built-in support for delimited continuations. We can accomplish map using a single traversal, but this time without using recursion. Enjoy -
(require racket/control)
(define (yield x)
  (shift return (cons x (return (void)))))
(define (map f lst)
  (reset (begin
           (for ((x lst))
             (yield (f x)))
           null)))
(define (square x)
  (* x x))
(map square '(1 2 3 4))
'(1 4 9 16)
It's my intention that this post will show you the detriment of pigeonholing your mind into a particular construct. The beauty of Scheme/Racket, I have come to learn, is that any implementation you can dream of is available to you.
I would highly recommend Beautiful Racket by Matthew Butterick. This easy-to-approach and freely-available ebook shatters the glass ceiling in your mind and shows you how to think about your solutions in a language-oriented way.

Named let in Scheme

I am attempting to write a loop in Scheme using named let. I would like to be able to break out of the iteration early based on various criteria, rather than always looping right at the end. Effectively, I would like to have while, break and continue. I am constrained to use Guile 1.8 for strong reasons, and Guile 1.8 does not implement the R6RS while construct. My question is: does recursion using named let have to be tail recursive, and why can't you restart the loop before the end? [Does this need a code example?] When I attempt to recurse with an early exit at several points while doing IO operations, I invariably read past EOF and get unpredictable corruption of data.
(let name ((iter iter-expr) (arg1 expr1) (arg2 expr2))
  (cond
    (continue-predicate (name (next iter) arg1 arg2))
    (break-predicate break-expression-value)
    (else (name (next iter) next-arg1-expression next-arg2-expression))))
A continue is just calling again using most of the same arguments unchanged except the iterated parts which will change to the next thing.
A break is a base case.
So what is a while? It is a named let with a break predicate and a default case.
Scheme doesn't really have a while construct. If you read the report you'll see that it is just syntactic sugar (a macro) that turns into something like a named let.
You need it to be tail recursive if you want to be able to exit without the pending calculations of all the previous calls still having to be done. You can also use call/cc to supply an exit continuation, which basically has Scheme do it for you under the hood. call/cc is usually quite far out for beginners and takes some time to master, so making your procedures tail recursive is easier to understand than doing something like this:
(define (cars lists-of-pair)
  (call/cc (lambda (exit)
             (fold (lambda (e a)
                     (if (pair? e)
                         (cons (car e) a)
                         (exit '()))) ; abandon the pending fold and return () as the result
                   '()
                   lists-of-pair))))
(cars '((1 2) (a b))) ; ==> (a 1), since fold accumulates in reverse order
(cars '((1 2) ())) ; ==> ()

How to get all instances of a class in common lisp?

Imagine I have a class:
(defclass person () ())
And then I make some instances:
(setf anna (make-instance 'person))
(setf lisa (make-instance 'person))
How can I get either the objects themselves or the symbol names they were assigned to?
I want to be able to say something like (find-instances 'person) and get something like (anna lisa) or at least (#<PERSON {100700E793}> #<PERSON {100700E793}>).
What I am searching for is the equivalent of each_object in Ruby.
I very much want to be able to do it without an external library.
There is nothing like that built-in in Common Lisp.
Recording instances
To find all instances of a class, one would usually arrange for each instance to be recorded when it is created. One can imagine various mechanisms for that. Sometimes one would still want instances to be garbage collected; then one needs some kind of non-standard weak data structure. I would expect that there are some libraries which implement similar things for CLOS instances.
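One possible mechanism, as a minimal sketch in standard CLOS: an :after method on initialize-instance pushes each new object onto a list. The names *person-instances* and find-person-instances are hypothetical, and the list holds strong references, so recorded instances are never garbage collected.

```lisp
;; Hypothetical registry: every PERSON instance is pushed here on creation.
;; The list holds strong references, so the instances stay alive.
(defvar *person-instances* '())

(defclass person () ())

;; Runs after the standard initialization of every new PERSON.
(defmethod initialize-instance :after ((instance person) &key)
  (push instance *person-instances*))

;; Returns the recorded instances in creation order.
(defun find-person-instances ()
  (reverse *person-instances*))

(defvar *anna* (make-instance 'person))
(defvar *lisa* (make-instance 'person))
```

This returns the objects themselves, not the names of the variables that hold them.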
Iterating over symbols of a package
If you would like to know which symbols of some or all packages have CLOS instances as a value, you could iterate over them (DO-SYMBOLS, DO-ALL-SYMBOLS, ...) and check whether they have a symbol value and whether that symbol value is an instance of a certain class.
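A minimal sketch of that idea follows. The class animal, the variable *rex* and the function symbols-holding-instances are made up for the demonstration. It only finds global (special) variables, never lexical ones.

```lisp
;; Hypothetical class and global variable for the demonstration.
(defclass animal () ())
(defvar *rex* (make-instance 'animal))

;; Collect the symbols accessible in PKG whose global value is an
;; instance of CLASS-NAME. DO-SYMBOLS may visit a symbol more than once,
;; hence PUSHNEW.
(defun symbols-holding-instances (class-name &optional (pkg *package*))
  (let ((found '()))
    (do-symbols (sym pkg)
      (when (and (boundp sym)
                 (typep (symbol-value sym) class-name))
        (pushnew sym found)))
    found))
```

Calling (symbols-holding-instances 'animal) in the package where *rex* was defined should include the symbol *rex* in its result.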
There is no portable solution for this, as far as I know. If you are working on CCL, then map-heap-objects may do what you are looking for:
(defclass foo () ())
(defvar *x* (make-instance 'foo))
(defvar *y* (list (make-instance 'foo)))
(defun find-instances (n class)
  (let ((buffer (make-array n :fill-pointer 0 :initial-element nil)))
    (ccl:map-heap-objects (lambda (x)
                            (when (and (typep x class) (< (fill-pointer buffer) n))
                              (setf (aref buffer (fill-pointer buffer)) x)
                              (incf (fill-pointer buffer)))))
    buffer))
(find-instances 2 'foo)
==> (#<FOO #x30200126F40D> #<FOO #x30200126634D>)
Similar solutions may exist for other Common Lisp implementations. Note that you need an initial hunch as to how many instances the traversal may find. The reason is that (as Rainer Joswig noted) the callback function should avoid consing. To achieve that, this implementation allocates a buffer up front and collects at most that many instances.

find free variables in lambda expression

Does anyone know how I can figure out the free variables in a lambda expression? Free variables are the variables that aren't part of the lambda parameters.
My current method (which is getting me nowhere) is to simply use car and cdr to go through the expression. My main problem is figuring out if a value is a variable or if it's one of the scheme primitives. Is there a way to test if something evaluates to one of scheme's built-in functions? For example:
(is-scheme-primitive? 'and)
;Value: #t
I'm using MIT scheme.
For arbitrary MIT Scheme programs, there isn't any way to do this. One problem is that the function you describe just can't work. For example, this doesn't use the 'scheme primitive' and:
(let ((and 7)) (+ and 1))
but it certainly uses the symbol 'and.
Another problem is that lots of things, like and, are special forms that are implemented with macros. You need to know what all of the macros in your program expand into to figure out even what variables are used in your program.
To make this work, you need to restrict the set of programs that you accept as input. The best choice is to restrict it to "fully expanded" programs. In other words, you want to make sure that there aren't any uses of macros left in the input to your free-variables function.
To do this, you can use the expand function provided by many Scheme systems. Unfortunately, from the online documentation, it doesn't look like MIT Scheme provides this function. If you're able to use a different system, Racket provides the expand function as well as local-expand which works correctly inside macros.
Racket actually also provides an implementation of the free-variables function that you ask for, which, as I described, requires fully expanded programs as input (such as the output of expand or local-expand). You can see the source code as well.
For a detailed discussion of the issues involved with full expansion of source code, see this upcoming paper by Flatt, Culpepper, Darais and Findler.
[EDIT 4] Disclaimer; or, looking back a year later:
This is actually a really bad way to go about solving this problem. It works as a very quick and dirty method that accomplishes the basic goal of the OP, but does not stand up to any 'real life' use cases. Please see the discussion in the comments on this answer as well as the other answer to see why.
[/EDIT]
This solution is probably less than ideal, but it will work for any lambda form you want to give it in the REPL environment of mit-scheme (see edits). Documentation for the procedures I used is found at the mit.edu doc site. get-vars takes a quoted lambda and returns a list of pairs. The first element of each pair is the symbol and the second is the value returned by environment-reference-type.
(define (flatten lst)
  (cond ((null? lst) ())
        ((pair? (car lst)) (append (flatten (car lst)) (flatten (cdr lst))))
        (else
         (cons (car lst) (flatten (cdr lst))))))
(define (get-vars proc-form)
  (let ((env (ge (eval proc-form user-initial-environment))))
    (let loop ((pf (flatten proc-form))
               (out ()))
      (cond ((null? pf) out)
            ((symbol? (car pf))
             (loop (cdr pf) (cons (cons (car pf) (environment-reference-type env (car pf))) out)))
            (else
             (loop (cdr pf) out))))))
EDIT: Example usage:
(define a 100)
(get-vars '(lambda (x) (* x a g)))
=> ((g . unbound) (a . normal) (x . unbound) (* . normal) (x . unbound) (lambda . macro))
EDIT 2: Changed the code to guard against environment-reference-type being called with something other than a symbol.
EDIT 3: As Sam has pointed out in the comments, this will not see the symbols bound in a let under the lambda as having any value. I'm not sure there is an easy fix for this. So my statement about this taking any lambda is wrong, and should have read more like "any simple lambda that doesn't contain new binding forms"... oh well.

How do I tell if the value of a variable is a symbol bound to a procedure in Scheme?

I am familiar with Common Lisp and trying to learn some Scheme, so I have been trying to understand how I'd use Scheme for things I usually code in Common Lisp.
In Common Lisp there's fboundp, which tells me if a symbol (the value of a variable) is bound to a function. So, I would do this:
(let ((s (read)))
  (if (fboundp s)
      (apply (symbol-function s) args)
      (error ...)))
Is that possible in Scheme? I've been trying to find this in the R6RS spec but couldn't find anything similar.
This way?
check if it is a symbol
evaluate the symbol using EVAL to get its value
check if the result is a procedure with PROCEDURE?
In Scheme, functions are not tied to symbols like they are in Common Lisp. If you need to know whether a value is actually a procedure, you can use the procedure? predicate:
(if (procedure? s) (do-something-with s) (do-something-else))
There is no direct way in portable Scheme to achieve what your example code wants to do, as symbols in Scheme are simply a kind of unified string, lacking Common Lisp's value/function/plist slots.
You could try something like:
(define function-table (list `(car ,car) `(cdr ,cdr) `(cons ,cons) `(display ,display)))
(let* ((s (read))
       (f (cond ((assq s function-table) => cadr)
                (else (error "undefined function")))))
  (apply f args))
i.e., defining your own mapping of "good" functions. This has the advantage that you can limit the set of functions to only the "safe" ones, or to whatever subset you like.

Resources