Common Lisp SXHASH and nested lists - common-lisp

The standard says that
(equal x y) implies (= (sxhash x) (sxhash y)). Let us check it:
(defun sxhash-test ()
  (let ((obj1 (list 1 2 (list 1 1)))
        (obj2 (list 1 2 (list 1 2))))
    (format t "are objects equal?: ~a~%" (equal obj1 obj2)) ;; => NIL
    (format t "are their hashes equal?: ~a~%" (= (sxhash obj1) (sxhash obj2))))) ;; => T
The function equal works as expected but sxhash doesn't. Could you please explain what I am doing wrong? I use SBCL 2.1.9.
Thank you.

sxhash has to satisfy four requirements:
objects which are equal (and hence objects which are eql or eq, but not necessarily those which are equalp) will have the same sxhash value;
the sxhash value of an object must not change during the life of single image unless the object is changed in such a way as to not make it equal to a copy of it before the change;
for objects of various types which have a well-defined notion of similarity between images, the sxhash value must be the same in each image;
computation of sxhash must always terminate.
(There is another vague requirement of 'being a good hash code').
(1) means that two objects which are not equal may or may not have the same code, but two objects which are equal must have the same code. A terrible but possible implementation of sxhash would be:
(defun sxhash/terrible (it)
  (declare (ignore it))
  0)
This fails the 'being a good hash code' test, but that's not something that can really be enforced.
What you are seeing is that two objects which are not equal do have the same sxhash value: that's fine.
Indeed, (1) together with (4) mean that if an implementation is going to compute sxhash on conses in such a way that it walks the graph, then it has to be pretty careful about that: it either needs an occurs check or it needs to only go so deep.
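To make the "only go so deep" idea concrete, here is a minimal sketch of a hash that walks cons trees down to a fixed depth and no further. The name toy-tree-hash, the default depth of 3 and the mixing arithmetic are all made up for illustration; this is not how SBCL or any other implementation actually computes sxhash.

```lisp
;; Hypothetical illustration only: a depth-limited hash over cons trees.
(defun toy-tree-hash (object &optional (depth 3))
  (cond ((zerop depth) 0)                      ; too deep: stop looking
        ((consp object)
         (logand most-positive-fixnum          ; keep the result a non-negative fixnum
                 (+ (* 31 (toy-tree-hash (car object) (1- depth)))
                    (toy-tree-hash (cdr object) (1- depth)))))
        (t (sxhash object))))                  ; atoms: defer to the real sxhash

;; Equal lists hash alike, as requirement (1) demands ...
(= (toy-tree-hash (list 1 2 (list 1 1)))
   (toy-tree-hash (list 1 2 (list 1 1))))      ; => T
;; ... and so do these unequal ones, because the difference lies below the cutoff.
(= (toy-tree-hash (list 1 2 (list 1 1)))
   (toy-tree-hash (list 1 2 (list 1 2))))      ; => T
;; The cutoff also guarantees termination on circular structure, as in (4).
(let ((circle (list 1 2)))
  (setf (cdr (last circle)) circle)
  (integerp (toy-tree-hash circle)))           ; => T
```

With a cutoff like this, the collision in the question is exactly what you would expect: the only difference between the two lists sits deeper than the hash is willing to look.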
However it is quite possible that sxhash does descend into cons trees. As an example here is LispWorks doing just that:
> (sxhash '(1 2 3))
11890816076270616
> (sxhash '(1 2 3 4))
369102953153702944
> (sxhash '(1 2 3 (4)))
740958182301008344
> (sxhash '(1 2 3 (5)))
740958455027237144
> (sxhash '(1 2 3 (5 6)))
741326672350173760
> (sxhash '(1 2 3 (5 (6))))
925006242171775434
Equally it is quite plausible that sxhash treats all instances of a given structure class (or of a given instance of standard-class) as having the same value, because the address of such an object is not constant and there's no obvious place to store the hash code without burning memory. But that's in no way a requirement.

The reason why this effect is observed is that two things are required for the values to be equal:
The two values are the same, meaning that they have the same hash.
The two values have the same address
The two lists tested have the same hash because SBCL's sxhash does not descend all the way into nested lists. Similarly, structure instances can all have the same hash.
(sxhash (list 1 2 3)); => 3971322300187561939
(sxhash (list 1 2 3)); => 3971322300187561939 (so, repeatable)
(sxhash (list 1 2 3 (list 4))) ; => 3180777146619076709
(sxhash (list 1 2 3 (list 5))) ; => 3180777146619076709 (ok ...)
See also the related question: Why does `sxhash` return a constant for all structs?
As for equal addresses: if I create two values like 'a, they in fact turn out to be one item with one address, which is stored only the first time it is seen. Whereas (list 1 2 (list 1 1)) and (list 1 2 (list 1 2)) are different things and are stored at separate addresses.
(sb-kernel:get-lisp-obj-address 'a) ; => 68772678703
(sb-kernel:get-lisp-obj-address 'a) ; => 68772678703 (same...)
(sb-kernel:get-lisp-obj-address (list 1 2 3 (list 4))) ; => 68772925863
(sb-kernel:get-lisp-obj-address (list 1 2 3 (list 5))) ; => 68772805335
So these two lists have the same sxhash value, but they are distinct objects at different addresses, and since their contents differ they are not equal.
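Note that equal itself compares structure and contents, not addresses; it is eq that compares object identity. A small sketch of the difference (the function name identity-vs-equality-demo is made up for this example):

```lisp
;; Hypothetical demo function: returns the results of four comparisons.
(defun identity-vs-equality-demo ()
  (let ((x (list 1 2 (list 1 1)))
        (y (list 1 2 (list 1 1)))
        (z (list 1 2 (list 1 2))))
    (list (eq x y)        ; NIL: two separately consed lists are distinct objects
          (equal x y)     ; T: same structure and contents
          (equal x z)     ; NIL: contents differ
          (eq 'a 'a))))   ; T: a symbol read twice is the same object

(identity-vs-equality-demo)   ; => (NIL T NIL T)
```

So two lists at different addresses can still be equal, and only then does the standard require their sxhash values to match.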

Related

How does the scramble function work? (Chapter 1 of The Seasoned Schemer)

According to the book, this is what the function definition is,
The function scramble takes a non-empty tuple in which no argument is greater than its own index and returns a tuple of same length. Each number in the argument is treated as a backward index from its own position to a point earlier in tuple. The result at each position is obtained by counting backward from the current position according to this index.
And these are some examples,
; Examples of scramble
(scramble '(1 1 1 3 4 2 1 1 9 2)) ; '(1 1 1 1 1 4 1 1 1 9)
(scramble '(1 2 3 4 5 6 7 8 9)) ; '(1 1 1 1 1 1 1 1 1)
(scramble '(1 2 3 1 2 3 4 1 8 2 10)) ; '(1 1 1 1 1 1 1 1 2 8 2)
Here is the implementation,
(define pick
  (λ (i lat)
    (cond
      ((eq? i 1) (car lat))
      (else (pick (sub1 i)
                  (cdr lat))))))
(define scramble-b
  (lambda (tup rev-pre)
    (cond
      ((null? tup) '())
      (else
       (cons (pick (car tup) (cons (car tup) rev-pre))
             (scramble-b (cdr tup)
                         (cons (car tup) rev-pre)))))))
(define scramble
  (lambda (tup)
    (scramble-b tup '())))
This is a case where using a very minimal version of the language makes the code verbose enough that the algorithm is hard to see.
One way of dealing with this problem is to write the program in a much richer language, and then work out how the algorithm, which is now obvious, is implemented in the minimal version. Let's pick Racket as the rich language.
Racket has a function (as does Scheme) called list-ref: (list-ref l i) returns the ith element of l, zero-based.
It also has a nice notion of 'sequences', which are pretty much 'things you can iterate over', and a bunch of constructs for iterating over sequences whose names begin with for. There are two functions which make sequences we care about:
in-naturals makes an infinite sequence of the natural numbers, which by default starts from 0, but (in-naturals n) starts from n.
in-list makes a sequence from a list (a list is already a sequence in fact, but in-list makes things clearer and, rumour has it, is also faster).
And the iteration construct we care about is for/list which iterates over some sequences and collects the result from its body into a list.
Given these, then the algorithm is almost trivial: we want to iterate along the list, keeping track of the current index and then do the appropriate subtraction to pick a value further back along the list. The only non-trivial bit is dealing with zero- vs one-based indexing.
(define (scramble l)
  (for/list ([index (in-naturals)]
             [element (in-list l)])
    (list-ref l (+ (- index element) 1))))
And in fact if we cause in-naturals to count from 1 we can avoid the awkward adding-1:
(define (scramble l)
  (for/list ([index (in-naturals 1)]
             [element (in-list l)])
    (list-ref l (- index element))))
Now looking at this code, even if you don't know Racket, the algorithm is very clear, and you can check it gives the answers in the book:
> (scramble '(1 1 1 3 4 2 1 1 9 2))
'(1 1 1 1 1 4 1 1 1 9)
Now it remains to work out how the code in the book implements the same algorithm. That's fiddly, but once you know what the algorithm is it should be straightforward.
If the verbal description looks vague and hard to follow, we can try following the code itself, turning it into a more visual pseudocode as we go:
pick i [x, ...ys] =
case i {
1 --> x ;
pick (i-1) ys }
==>
pick i xs = nth1 i xs
(* 1 <= i <= |xs| *)
scramble xs =
scramble2 xs []
scramble2 xs revPre =
case xs {
[] --> [] ;
[x, ...ys] -->
[ pick x [x, ...revPre],
...scramble2 ys
[x, ...revPre]] }
Thus,
scramble [x,y,z,w, ...]
=
[ nth1 x [x] (*x=1..1*)
, nth1 y [y,x] (*y=1..2*)
, nth1 z [z,y,x] (*z=1..3*)
, nth1 w [w,z,y,x] (*w=1..4*)
, ... ]
Thus each element in the input list is used as an index into the reversed prefix of that list, up to and including that element. In other words, an index into the prefix while counting backwards, i.e. from the element to the left, i.e. towards the list's start.
So we have now visualized what the code is doing, and have also discovered requirements for its input list's elements.
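The same reading can be written down directly. Below is a sketch in Common Lisp rather than the book's Scheme; the name scramble-cl is made up here, and it assumes, as the book does, that no number exceeds its own one-based index.

```lisp
;; Hypothetical Common Lisp rendering of the book's scramble.
(defun scramble-cl (tup)
  (loop with rev-pre = '()              ; reversed prefix seen so far
        for n in tup
        do (push n rev-pre)             ; prefix now includes the current element
        collect (nth (1- n) rev-pre)))  ; nth is zero-based, the book's pick is one-based

(scramble-cl '(1 1 1 3 4 2 1 1 9 2))    ; => (1 1 1 1 1 4 1 1 1 9)
(scramble-cl '(1 2 3 4 5 6 7 8 9))      ; => (1 1 1 1 1 1 1 1 1)
```

Pushing each element onto rev-pre plays the role of (cons (car tup) rev-pre) in scramble-b, and (nth (1- n) rev-pre) plays the role of pick.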

Distinguishing an integer from a string vector

I am trying to dispatch on the type of an array. Here's a test case:
(defun column-summary2 (column)
  (typecase column
    (simple-double-float-vector (format t "Column is a simple-double-float-vector~%"))
    ;; (simple-integer-vector (format t "Column is a simple-integer-vector~%"))
    ;; (simple-string-vector (format t "Column is a simple-string-vector~%"))
    ((simple-array string (*)) (format t "~A Column is a string-array~%" column))
    ((simple-array float (*)) (format t "~A is a simple-float-array~%" column))
    ((simple-array integer (*)) (format t "~A is a simple-integer-array~%" column))
    (bit-vector (make-bit-vector-summary :length (length column) :count (count 1 column)))))
This works as expected for the built in type, bit-vector, and with my own simple-double-float-vector type:
(deftype simple-double-float-vector (&optional (length '*))
  "Simple vector of double-float elements."
  `(simple-array double-float (,length)))
but fails for string and integer:
LS-USER> (df::column-summary2 #("foo" "bar" "baz"))
#(foo bar baz) Column is a string-array
NIL
LS-USER> (df::column-summary2 #(1 2 3))
#(1 2 3) Column is a string-array
I tried defining types for these two:
(deftype simple-integer-vector (&optional (length '*))
  "Simple vector of integer elements."
  `(simple-array integer (,length)))
(deftype simple-string-vector (&optional (length '*))
  "Simple vector of string elements."
  `(simple-array string (,length)))
Edit: Coerce also seems to fail:
CL-USER> (type-of (coerce #(4 4 1 1 2 1 4 2 2 4 4 3 3 3 4 4 4 1 2 1 1 2 2 4 2 1 2 2 4 6 8 2) '(simple-array integer (32))))
(SIMPLE-VECTOR 32)
CL-USER> (type-of (coerce #("foo" "bar" "baz") '(simple-array string (3))))
(SIMPLE-VECTOR 3)
Neither of these helps. It seems that integer and string are always conflated. Can anyone see why?
typecase can only distinguish types that are implementationally distinct, and it is very unlikely that arrays of integers and strings are. You can test this by, for instance:
(eq (upgraded-array-element-type 'integer)
    (upgraded-array-element-type 'string))
Which will very likely return t. And in fact it's likely that upgraded-array-element-type on both these types is itself t: the most specialised array that can store a general string is the same as the one that can store a general integer, since both of these types really require the elements of the array to be general pointers.
The thing here is that when typecase sees an array, all it can dispatch on is the implementational type of the array. For arrays of integers and arrays of strings that implementational type is often the same, even though the two are conceptually different.
An array's type can only be the type given to make-array as its :element-type, see type simple-array in the spec. If you use array literals, that is likely not the case.
It does not check at runtime the type of each element.
The word can is a hint that this is also influenced by the upgrading of array element types: there is only a fixed set (implementation defined) of array types, mostly determined by whether there is a specialized representation. The actual array element type is the most specialized of that set that fits the declared type.
If you need the exact information at runtime, you need to wrap and tag yourself.
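One alternative to wrapping and tagging, if a linear scan is acceptable, is to inspect the elements themselves at run time. The helper below is a sketch under that assumption; the name classify-column and the keywords it returns are made up for illustration.

```lisp
;; Hypothetical helper: classify a vector by inspecting its elements at run time.
(defun classify-column (column)
  (cond ((zerop (length column)) :empty)      ; EVERY is true of an empty vector, so check first
        ((every #'stringp column) :strings)
        ((every #'integerp column) :integers)
        ((every #'floatp column) :floats)
        (t :mixed)))

(classify-column #("foo" "bar" "baz"))   ; => :STRINGS
(classify-column #(1 2 3))               ; => :INTEGERS
(classify-column (vector 1.0 2.5d0))     ; => :FLOATS
(classify-column (vector 1 "two"))       ; => :MIXED

;; By contrast, an array created with an explicit :element-type does satisfy
;; the matching array type specifier, so typecase can dispatch on it.
(typep (make-array 3 :element-type 'double-float :initial-element 0d0)
       '(simple-array double-float (*)))  ; => T
```

The scan costs time proportional to the length of the vector on every call, which is the price of not recording the intended element type anywhere.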

Average using &rest in lisp

I was asked to write a function in Lisp that calculates the average of any given numbers, using the &rest parameter. I came up with this:
(defun average (a &rest b)
  (cond ((null a) nil)
        ((null b) a)
        (t (+ (car b) (average a (cdr b))))))
Now I know this is incorrect, because (cdr b) returns a list with a list inside, so when I do (car b) it never returns an atom and the addition (+) never happens.
And that is my first question:
How can I call the cdr of a &rest parameter and get only one list instead of a list inside a list?
Now there is another thing:
When I run this function and give values to the &rest parameter, say (average 1 2 3 4 5), it gives me a stack overflow error. I traced the function and saw that it was stuck in a loop, always calling the function with the (cdr b), which is null, and so it loops there.
My question is:
If I have a stopping condition, ((null b) a), shouldn't the program stop when b is null and add a to the + operation? Why does it start an infinite loop?
EDIT: I know the function only does the + operation and that I have to divide by the length of the b list + 1, but since I got this error I'd like to solve it first.
(defun average (a &rest b)
; ...
)
When you call this with (average 1 2 3 4) then inside the function the symbol a will be bound to 1 and the symbol b to the proper list (2 3 4).
So, inside average, (car b) will give you the first of the rest parameters, and (cdr b) will give you the rest of the rest parameters.
But when you then recursively call (average a (cdr b)), you call it with only two arguments, no matter how many parameters were given to the function in the first place. In our example, it's the same as (average 1 '(3 4)).
More importantly, the second argument is now a list. Thus, in the second call to average, the symbols will be bound as follows:
a = 1
b = ((3 4))
b is a list with only a single element: Another list. This is why you'll get an error when passing (car b) as argument to +.
Now there is another thing: When I run this function and give values to the &rest parameter, say (average 1 2 3 4 5), it gives me a stack overflow error. I traced the function and saw that it was stuck in a loop, always calling the function with the (cdr b), which is null, and so it loops there. My question is:
If I have a stopping condition, ((null b) a), shouldn't the program stop when b is null and add a to the + operation? Why does it start an infinite loop?
(null b) will only be truthy when b is the empty list. But when you call (average a '()), then b will be bound to (()), that is a list containing the empty list.
The problem that you pass exactly two arguments on the recursive calls can be solved with apply, which takes a function and a list of arguments to call it with: (apply #'average (cons a (cdr b)))
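For illustration, here is how the recursion from the question might look once the recursive call goes through apply. This is only a sketch; the names sum-args and average-recursive are made up here to keep the summing and the dividing apart.

```lisp
;; Hypothetical names; the structure follows the question's own recursion.
(defun sum-args (a &rest b)
  (if (null b)
      a                                      ; only A is left: the recursion stops here
      (+ (first b)
         (apply #'sum-args a (rest b)))))    ; spread the remaining numbers again

(defun average-recursive (a &rest b)
  (/ (apply #'sum-args a b)
     (1+ (length b))))                       ; A plus however many rest arguments

(sum-args 1 2 3 4 5)            ; => 15
(average-recursive 1 2 3 4 5)   ; => 3
```

Because apply spreads (rest b) back into separate arguments, b shrinks by one element per call and eventually really is the empty list, so the (null b) test can fire.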
Now tackling your original goal of writing an average function: Computing the average consists of two tasks:
Compute the sum of all elements.
Divide that by the number of elements.
You could write your own function to recursively add all elements to solve the first part (do it!), but there's already such a function:
(+ 1 2) ; Sum of two elements
(+ 1 2 3) ; Sum of three elements
(apply #'+ '(1 2 3)) ; same as above
(apply #'+ some-list) ; Summing up all elements from some-list
Thus your average is simply
(defun average (&rest parameters)
  (if parameters ; don't divide by 0 on empty list
      (/ (apply #'+ parameters) (length parameters))
      0))
As a final note: You shouldn't use car and cdr when working with lists. Better use the more descriptive names first and rest.
If performance is critical to you, it's probably best to fold the parameters (using reduce which might be optimized):
(defun average (&rest parameters)
  (if parameters
      (let ((accum
              (reduce #'(lambda (state value)
                          (list (+ (first state) value) ;; using setf is probably even better, performance wise.
                                (1+ (second state))))
                      parameters
                      :initial-value (list 0 0))))
        (/ (first accum) (second accum)))
      0))
#' is a reader macro, specifically one of the standard dispatching macro characters, and as such an abbreviation for (function ...)
Just define average*, which calls the usual average function.
(defun average* (&rest numbers)
(average numbers))
I think that Rainer Joswig's answer is pretty good advice: it's easier to first define a version that takes a simple list argument, and then define the &rest version in terms of it. This is a nice opportunity to mention spreadable arglists, though. They're a nice technique that can make your library code more convenient to use.
In most common form, the Common Lisp function apply takes a function designator and a list of arguments. You can do, for instance,
(apply 'cons '(1 2))
;;=> (1 . 2)
If you check the docs, though, apply actually accepts a spreadable arglist designator as an &rest argument. That's a list whose last element must be a list, and that represents a list of all the elements of the list except the last followed by all the elements in that final list. E.g.,
(apply 'cons 1 '(2))
;;=> (1 . 2)
because the spreadable arglist is (1 (2)), so the actual arguments are (1 2). It's easy to write a utility to unspread a spreadable arglist designator:
(defun unspread-arglist (spread-arglist)
  (reduce 'cons spread-arglist :from-end t))
(unspread-arglist '(1 2 3 (4 5 6)))
;;=> (1 2 3 4 5 6)
(unspread-arglist '((1 2 3)))
;;=> (1 2 3)
Now you can write an average* function that takes one of those (which, among other things, gets you the behavior, just like with apply, that you can pass a plain list):
(defun %average (args)
  "Returns the average of a list of numbers."
  (do ((sum 0 (+ sum (pop args)))
       (length 0 (1+ length)))
      ((endp args) (/ sum length))))
(defun average* (&rest spreadable-arglist)
  (%average (unspread-arglist spreadable-arglist)))
(float (average* 1 2 '(5 5)))
;;=> 3.25
(float (average* '(1 2 5)))
;;=> 2.66..
Now you can write average as a function that takes a &rest argument and just passes it to average*:
(defun average (&rest args)
(average* args))
(float (average 1 2 5 5))
;;=> 3.25
(float (average 1 2 5))
;;=> 2.66..

Lisp list-contains program

how can I make a Lisp program that checks if a character, string or number is in a list?
(list-contains '(1 a 2 d 2 5) 'a) => T
(list-contains '(1 a 2 d 2 5) 'x) => NIL
You can use (find x the-list) which returns x if x is in the list or NIL if it is not.
(find 'a '(1 a 2 d 2 5)) ; A
(find 'x '(1 a 2 d 2 5)) ; NIL
Since this is homework, your professor would probably like to see you implement an algorithm. Try this:
Take the car of the list and compare it against the input symbol.
If it's the same, return true; you're done.
If it's empty, return false; you're done.
Recurse back to #1, using the cdr of the list. (Here, implied that the car was not empty and was not the comparison symbol)
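Written out, those steps might look like the sketch below. The :test keyword is an addition not mentioned above: the default eql handles symbols, numbers and characters, while strings need something like equal.

```lisp
;; A sketch of the recursive search described above.
;; The :TEST keyword is an assumption added here so that strings can be compared.
(defun list-contains (list item &key (test #'eql))
  (cond ((endp list) nil)                               ; ran out of list: not found
        ((funcall test (first list) item) t)            ; the car matches: found
        (t (list-contains (rest list) item :test test)))) ; otherwise look in the cdr

(list-contains '(1 a 2 d 2 5) 'a)                    ; => T
(list-contains '(1 a 2 d 2 5) 'x)                    ; => NIL
(list-contains '("foo" "bar") "bar" :test #'equal)   ; => T
```

The empty-list check comes first here so that the function never has to look at the car of an empty list.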
Greg's solution is what you should implement. But I want to add that, in case you hadn't heard of it, The Little Schemer is a great introduction to this sort of thing. Try to get a copy, or even just open the preview up in Google Books and search for "member?". They do what you'd expect (that is, check if car is equal, recur on cdr if it isn't) but they trace it and ask you questions at each step.
It's not a very long or expensive book, but once you read it, you will have a natural feel for how to approach this sort of problem. They all boil down to the same thing, which for lists amounts to asking if we've hit the empty list yet, and if not, either doing something with car or recurring on cdr.
I recommend the position function. It returns the position of the element in the list (the first position is 0) or NIL if it is not present.
(position 'a '(1 a 2 d 2 5)) ; 1
(position 'x '(1 a 2 d 2 5)) ; NIL
position has an advantage over find: you can tell whether the symbol NIL is in a list.
(position 'NIL '(1 a NIL d 2 5)) ; 2
(position 'NIL '(1 a 2 d 2 5)) ; NIL
However,
(find 'NIL '(1 a NIL d 2 5)) ; NIL
(find 'NIL '(1 a 2 d 2 5)) ; NIL
So with find there is no way to distinguish one case from the other.
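For completeness, member is another standard function that copes with NIL as an element: it returns the tail of the list starting at the match, which is non-NIL even when the match itself is NIL. A short sketch:

```lisp
(member nil '(1 a nil d 2 5))   ; => (NIL D 2 5), a non-empty list, hence true
(member nil '(1 a 2 d 2 5))     ; => NIL
(member 'a '(1 a 2 d 2 5))      ; => (A 2 D 2 5)
```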

List operations in Lisp

I have been searching everywhere for the following functionality in Lisp, and have gotten nowhere:
find the index of something in a list. example:
(index-of item InThisList)
replace something at a specific spot in a list. example:
(replace item InThisList AtThisIndex) ;i think this can be done with 'setf'?
return an item at a specific index. example:
(return InThisList ItemAtThisIndex)
Up until this point, I've been faking it with my own functions. I'm wondering if I'm just creating more work for myself.
This is how I've been faking number 1:
(defun my-index (findMe mylist)
  (let ((counter 0) (found 1))
    (dolist (item mylist)
      (cond
        ((eq item findMe) ;this works because 'eq' checks place in memory,
                          ;and as long as 'findMe' was from the original list, this will work.
         (setq found nil))
        (found (incf counter))))
    counter))
You can use setf and nth to replace and retrieve values by index.
(let ((myList (list 1 2 3 4 5 6))) ; a fresh list: modifying a quoted literal is undefined
  (setf (nth 4 myList) 101); <----
  myList)
(1 2 3 4 101 6)
To find the index of an item you can use the position function.
(let ((myList (list 1 2 3 4 5 6))) ; a fresh list: modifying a quoted literal is undefined
  (setf (nth 4 myList) 101)
  (list myList (position 101 myList)))
((1 2 3 4 101 6) 4)
I found these all in this index of functions.
find the index of something in a list.
In Emacs Lisp and Common Lisp, you have the position function:
> (setq numbers (list 1 2 3 4))
(1 2 3 4)
> (position 3 numbers)
2
In Scheme, here's a tail recursive implementation from DrScheme's doc:
(define list-position
  (lambda (o l)
    (let loop ((i 0) (l l))
      (if (null? l) #f
          (if (eqv? (car l) o) i
              (loop (+ i 1) (cdr l)))))))
----------------------------------------------------
> (define numbers (list 1 2 3 4))
> (list-position 3 numbers)
2
>
But if you're using a list as a collection of slots to store structured data, maybe you should have a look at defstruct or even some kind of Lisp Object System like CLOS.
If you're learning Lisp, make sure you have a look at Practical Common Lisp and / or The Little Schemer.
Cheers!
Answers:
(position item sequence &key from-end (start 0) end key test test-not)
http://lispdoc.com/?q=position&search=Basic+search
(setf (elt sequence index) value)
(elt sequence index)
http://lispdoc.com/?q=elt&search=Basic+search
NOTE: elt is preferable to nth because elt works on any sequence, not just lists
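As a quick sketch of that point, the same elt-based code works unchanged on a list and on a vector. The helper name replace-at is made up for this example.

```lisp
;; Hypothetical helper showing that ELT works on lists and vectors alike.
(defun replace-at (sequence index value)
  "Destructively store VALUE at INDEX in SEQUENCE and return SEQUENCE."
  (setf (elt sequence index) value)
  sequence)

(replace-at (list 1 2 3 4 5 6) 4 101)     ; => (1 2 3 4 101 6)
(replace-at (vector 1 2 3 4 5 6) 4 101)   ; => #(1 2 3 4 101 6)
(position 101 (vector 1 2 3 4 101 6))     ; => 4
```

On a list elt still has to walk the cells up to the index, so the generality does not change the cost of indexed access.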
Jeremy's answers should work; but that said, if you find yourself writing code like
(setf (nth i my-list) new-elt)
you're probably using the wrong datastructure. Lists are simply linked lists, so they're O(N) to access by index. You might be better off using arrays.
Or maybe you're using lists as tuples. In that case, they should be fine. But you probably want to name accessors so someone reading your code doesn't have to remember what "nth 4" is supposed to mean. Something like
(defun my-attr (list)
  (nth 4 list))
(defun (setf my-attr) (new list)
  (setf (nth 4 list) new))
+2 for "Practical Common Lisp". It is a mixture of a Common Lisp Cookbook and a quality Teach Yourself Lisp book.
There's also "Successful Common Lisp" (http://www.psg.com/~dlamkins/sl/cover.html and http://www.psg.com/~dlamkins/sl/contents.html) which seemed to fill a few gaps / extend things in "Practical Common Lisp".
I've also read Paul Graham's "ANSI Common Lisp" which is more about the basics of the language, but a bit more of a reference manual.
I have to agree with Thomas. If you use lists like arrays then that's just going to be slow (and possibly awkward). So you should either use arrays or stick with the functions you've written but move them "up" in a way so that you can easily replace the slow lists with arrays later.

Resources