Why nil cannot be matched in case/ecase? - common-lisp

The following form
(let ((foo nil))
(ecase foo
(:bar 1)
(:baz 2)
(nil 3)))
signals the error "NIL fell through ECASE expression". The workaround seems to be to wrap the nil case in parentheses, like this:
(let ((foo nil))
(ecase foo
(:bar 1)
(:baz 2)
((nil) 3))) ;; => 3
But why doesn't the first example work? Does an unwrapped nil case have some special meaning?

Each clause in case can match a single item or a list of items, e.g. you can write:
(ecase foo
(:bar 1) ;; match a specific symbol
((:baz :quux) 2)) ;; match either of these symbols
NIL is also the empty list in Lisp, and when it's used as the test in case this is how it's treated, so it never matches anything. Similarly, T and OTHERWISE are used to specify the default case, so you can't match them as single items. To match any of these, you need to put them in a list.
More technically, the specification says this about the keys in a clause:
keys---a designator for a list of objects. In the case of case, the symbols t and otherwise may not be used as the keys designator. To refer to these symbols by themselves as keys, the designators (t) and (otherwise), respectively, must be used instead.
and the definition of a list designator says:
a designator for a list of objects; that is, an object that denotes a list and that is one of: a non-nil atom (denoting a singleton list whose element is that non-nil atom) or a proper list (denoting itself).
Notice that the case where a single atom is treated as a singleton list says "non-nil atom". That permits NIL to fall into the second case, where a proper list denotes itself. Otherwise, there would be no way to create a list designator for an empty list, since NIL would denote (NIL).
You might argue that there's not much point in having an empty list of keys in CASE, since it will never match anything and the whole clause could just be omitted. This degenerate case is there for the benefit of automated code generation (e.g. other macros that expand into CASE), since they might produce empty lists, and these should be treated consistently with other lists. It would be harder for them to make the generation of the clause conditional on whether there are any keys.
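A small sketch of the rules above may help. The function name classify is made up for illustration; the comments show what each call is expected to return under the key rules quoted from the specification.

```lisp
;; Hypothetical helper, used only to illustrate how CASE reads its keys.
(defun classify (x)
  (case x
    (nil         :never-reached)      ; NIL here is the empty key list, so it matches nothing
    ((nil)       :matched-nil)        ; a one-element list containing NIL matches NIL
    ((t)         :matched-t)          ; likewise for the symbol T
    ((otherwise) :matched-otherwise)  ; and for the symbol OTHERWISE
    (otherwise   :default)))          ; a bare OTHERWISE is the default clause

(classify nil)         ; => :MATCHED-NIL
(classify t)           ; => :MATCHED-T
(classify 'otherwise)  ; => :MATCHED-OTHERWISE
(classify 42)          ; => :DEFAULT
```

The first clause is legal but dead code, which is exactly the degenerate empty-key-list case that a code-generating macro might emit.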

Related

How come (let ((x 'huh?)) (cons (boundp 'x) x)) evaluates to (NIL . HUH?)?

I do not understand this:
CL-USER> (let ((x 'huh?)) (cons (boundp 'x) x))
(NIL . HUH?)
I had expected that inside the let expression above, x would be bound, and therefore that the whole expression would have evaluated to (t . huh?). Or else, if (contrary to my expectation) x was not bound in the let's body, then at least that the evaluation of the expression above would have resulted in an error (on account of my having passed an unbound variable as the second argument to cons).
To add to my confusion, the Common Lisp HyperSpec's description for boundp says:
Returns true if symbol is bound; otherwise, returns false.
...where the word "bound" is hyperlinked to this glossary definition (my emphasis)1:
bound adj., v.t. 1. adj. having an associated denotation in a binding. ``The variables named by a let are bound within its body.'' See unbound. 2. adj. having a local binding which shadows[2] another. ``The variable *print-escape* is bound while in the princ function.'' 3. v.t. the past tense of bind.
Furthermore, the CLHS's documentation for let says the following (my emphasis):
...all of the variables varj are bound to the corresponding values; ...
Granted, the HyperSpec's page for boundp (which I already linked to earlier) also has the following example:
(let ((x 2)) (boundp 'x)) => false
...which indeed would justify the assertion that what I observed is in fact "officially documented behavior", but this narrowly correct justification is little comfort in light of everything else I've cited above.
Could someone please resolve for me this whopping (and hopefully only apparent) contradiction?
1 I realize that the highlighted phrase above is just an example of how the word "bound" would be used in a sentence, but it would be a truly perverse example if what it stated was exactly the opposite of what is actually the case for Common Lisp.
This can be a bit confusing; the spec for BOUNDP does, however, say:
The function bound [sic (it should be boundp)] determines only whether a symbol has a value in the global environment; any lexical bindings are ignored.
So it only tells you whether a given symbol is bound in the global environment, which happens if the variable has its value cell set to a value (see SYMBOL-VALUE), or if the variable is declared special and is currently bound by an enclosing let form. This second case applies notably to variables declared with defvar and defparameter, but also to any variable you declare as special:
(let ((%my-var% 0))
(declare (special %my-var%))
...)
Note that each time you want to use %my-var% you need to repeat that declaration, unless you declaimed it special globally.
(defun use-my-var (input)
(declare (special %my-var%))
(print `(:my-var ,%my-var% :input ,input)))
When you write the use-my-var function, you normally have no trouble seeing that input is bound; in fact, a compiler would warn you if it were not. For lexical scopes, a boundp-style check would compile down to a constant value, T or NIL. It is more interesting to check whether the symbol-value of a symbol is globally or dynamically bound.
Here above, since %my-var% is a special variable, it can be bound or not in different calling contexts:
(let ((%my-var% 0))
(declare (special %my-var%))
(use-my-var 1))
=> (:my-var 0 :input 1)
(use-my-var 0)
;; ERROR: The variable %MY-VAR% is unbound.
boundp is for determining whether symbols are bound in the global environment. Note the following two examples from the HyperSpec:
(let ((x 2)) (boundp 'x)) ;=> false
(let ((x 2)) (declare (special x)) (boundp 'x)) ;=> true
The notes at the bottom of the page say:
The function bound determines only whether a symbol has a value in the global environment; any lexical bindings are ignored.
The appearance of bound in the note instead of boundp seems to be a typo. In any case, CLTL2 was a bit more specific about this:
boundp is true if the dynamic (special) variable named by symbol has a value; otherwise, it returns nil.
Note that fboundp has a similar restriction; here is an example from the HyperSpec:
(flet ((my-function (x) x))
(fboundp 'my-function)) ;=> false
There isn't much point in boundp handling lexical variables. From the HyperSpec 3.1.2.1.1.1 Lexical Variables:
A lexical variable always has a value. There is no operator that introduces a binding for a lexical variable without giving it an initial value, nor is there any operator that can make a lexical variable be unbound.
This is to say that lexical variables are always bound in their lexical environments. But dynamic variables may be either bound or unbound, and which of the two is the case can depend upon the environment in which the question is asked:
The value part of the binding for a dynamic variable might be empty; in this case, the dynamic variable is said to have no value, or to be unbound. A dynamic variable can be made unbound by using makunbound....
A dynamic variable is unbound unless and until explicitly assigned a value, except for those variables whose initial value is defined in this specification or by an implementation.
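The distinction can be tried at the REPL with a sketch like the following. The names *demo-var* and lexical-only are made up for this example, and it assumes neither symbol has a global value beforehand.

```lisp
;; *DEMO-VAR* and LEXICAL-ONLY are made-up names for this sketch.
(defvar *demo-var* 10)                ; globally special, with a value

(boundp '*demo-var*)                  ; => true

(let ((lexical-only 2))
  (declare (ignorable lexical-only))
  (boundp 'lexical-only))             ; => false, the lexical binding is ignored

(makunbound '*demo-var*)              ; remove the global value
(boundp '*demo-var*)                  ; => false

(let ((*demo-var* 1))                 ; still special, so this is a dynamic binding
  (boundp '*demo-var*))               ; => true
```

Note that makunbound removes the value but not the special proclamation made by defvar, which is why the final let creates a dynamic binding that boundp can see.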

If a string in lisp is a vector, why can't I access the first element using svref?

So, I'm trying to learn Lisp, and I've come across a problem in the definition of what a String is.
I'm reading the ANSI Common Lisp by Paul Graham, and in this book it states that a String is a vector, or one-dimensional Array.
So, I create a String:
(defvar *my-string* "abc")
And then I can access the first value of my-string this way:
(aref *my-string* 0)
But if it is a vector, why can't I access that element this way:
(svref *my-string* 0)
I mean, when I create a vector this way:
(defvar my-vec (make-array 4 :initial-element 1))
I can access first element using svref:
(svref my-vec 0) ; returns 1
I forgot to add the error I get when I try svref on the string:
"The value "abc" is not of type (SIMPLE-ARRAY T (*))."
A string is a vector, but it isn't a simple-vector. svref takes a simple-vector as its first argument.
You can check it by calling:
(vectorp *my-string*)
which returns true
Unlike:
(simple-vector-p *my-string*)
which returns false.
Notice that (simple-vector-p my-vec) returns true, which confirms that this call to make-array created a simple-vector.
soulcheck's answer is absolutely right, but it's worth the time to become comfortable with the HyperSpec. For instance, if you start at the page for svref, there's a note at the bottom:
Notes:
svref is identical to aref except that it requires its first argument to be a simple vector.
The glossary entry for simple vector (linked above) says:
simple vector n. a vector of type simple-vector, sometimes called a "simple general vector." Not all vectors that are simple are simple vectors—only those that have element type t.
15.2 The Arrays Dictionary is also helpful here, as is 15. Arrays as a whole.
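The predicates mentioned in these answers can be compared side by side. This is only a sketch; the variable names *text* and *general* are made up for illustration.

```lisp
;; Made-up variable names, for illustration only.
(defparameter *text* "abc")
(defparameter *general* (make-array 4 :initial-element 1))

(vectorp *text*)              ; => true, a string is a vector
(simple-vector-p *text*)      ; => false, its element type is not T
(simple-string-p *text*)      ; => true
(aref *text* 0)               ; => #\a
(char *text* 0)               ; => #\a, the string-specific accessor

(simple-vector-p *general*)   ; => true, default element type is T
(svref *general* 0)           ; => 1
```

In other words, aref works on any array, char and schar are the string-specific accessors, and svref is reserved for simple vectors whose element type is t.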

What's the difference between a sequence and a collection in Clojure

I am a Java programmer and am new to Clojure. From different places, I saw sequence and collection are used in different cases. However, I have no idea what the exact difference is between them.
For some examples:
1) In Clojure's documentation for Sequence:
The Seq interface
(first coll)
Returns the first item in the collection.
Calls seq on its argument. If coll is nil, returns nil.
(rest coll)
Returns a sequence of the items after the first. Calls seq on its argument.
If there are no more items, returns a logical sequence for which seq returns nil.
(cons item seq)
Returns a new seq where item is the first element and seq is the rest.
As you can see, when describing the Seq interface, the first two functions (first/rest) use coll, which seems to indicate this is a collection, while the cons function uses seq, which seems to indicate this is a sequence.
2) There are functions called coll? and seq? that can be used to test if a value is a collection or a sequence. Clearly, collection and sequence are different.
3) In Clojure's documentation about 'Collections', it is said:
Because collections support the seq function, all of the sequence
functions can be used with any collection
Does this mean all collections are sequences?
(coll? [1 2 3]) ; => true
(seq? [1 2 3]) ; => false
The code above tells me that is not the case, because [1 2 3] is a collection but is not a sequence.
I think this is a pretty basic question for Clojure, but I am not able to find a place that clearly explains what the difference is and which one I should use in different cases. Any comment is appreciated.
Any object supporting the core first and rest functions is a sequence.
Many objects satisfy this interface and every Clojure collection provides at least one kind of seq object for walking through its contents using the seq function.
So:
user> (seq [1 2 3])
(1 2 3)
And you can create a sequence object from a map too
user> (seq {:a 1 :b 2})
([:a 1] [:b 2])
That's why you can use filter, map, for, etc. on maps, sets, and so on.
So you can treat many collection-like objects as sequences.
That's also why many sequence handling functions such as filter call seq on the input:
(defn filter
"Returns a lazy sequence of the items in coll for which
(pred item) returns true. pred must be free of side-effects."
{:added "1.0"
:static true}
([pred coll]
(lazy-seq
(when-let [s (seq coll)]
If you call (filter pred 5), you get:
Don't know how to create ISeq from: java.lang.Long
RT.java:505 clojure.lang.RT.seqFrom
RT.java:486 clojure.lang.RT.seq
core.clj:133 clojure.core/seq
core.clj:2523 clojure.core/filter[fn]
You can see that the seq call is the "is this object a sequence?" validation.
Most of this stuff is in Joy of Clojure chapter 5 if you want to go deeper.
Here are a few points that will help in understanding the difference between collection and sequence.
"Collection" and "Sequence" are abstractions, not a property that can be determined from a given value.
Collections are bags of values.
Sequence is a data structure (subset of collection) that is expected to be accessed in a sequential (linear) manner.
The figure below best describes the relation between them:
You can read more about it here.
Every sequence is a collection, but not every collection is a sequence.
The seq function makes it possible to convert a collection into a sequence. E.g. for a map you get a list of its entries. That list of entries is different from the map itself, though.
In Clojure for the brave and true the author sums it up in a really understandable way:
The collection abstraction is closely related to the sequence
abstraction. All of Clojure's core data structures — vectors, maps,
lists and sets — take part in both abstractions.
The abstractions differ in that the sequence abstraction is "about"
operating on members individually while the collection abstraction is
"about" the data structure as a whole. For example, the collection
functions count, empty?, and every? aren't about any individual
element; they're about the whole.
I have just been through Chapter 5 - "Collection Types" of "The Joy of Clojure", which is a bit confusing (i.e. the next version of that book needs a review). In Chapter 5, on page 86, there is a table which I am not fully happy with:
So here's my take (fully updated after coming back to this after a month of reflection).
collection
It's a "thing", a collection of other things.
This is based on the function coll?.
The function coll? can be used to test for this.
Conversely, anything for which coll? returns true is a collection.
The coll? docstring says:
Returns true if x implements IPersistentCollection
Things that are collections are grouped into three separate classes. Things in different classes are never equal.
Maps: test using (map? foo)
  Map (two actual implementations with slightly differing behaviours)
  Sorted map. Note: (sequential? (sorted-map :a 1)) ;=> false
Sets: test using (set? foo)
  Set
  Sorted set. Note: (sequential? (sorted-set :a :b)) ;=> false
Sequential collections: test using (sequential? foo)
  List
  Vector
  Queue
  Seq: (sequential? (seq [1 2 3])) ;=> true
  Lazy-Seq: (sequential? (lazy-seq (seq [1 2 3]))) ;=> true
The Java interop stuff is outside of this:
(coll? (to-array [1 2 3])) ;=> false
(map? (doto (new java.util.HashMap) (.put "a" 1) (.put "b" 2))) ;=> false
sequential collection (a "chain")
It's a "thing", a collection holding other things according to a specific, stable ordering.
This is based on the function sequential?.
The function sequential? can be used to test for this.
Conversely, anything for which sequential? returns true is a sequential collection.
The sequential? docstring says:
Returns true if coll implements Sequential
Note: "sequential" is an adjective! In "The Joy of Clojure", the adjective is used as a noun and this is really, really, really confusing:
"Clojure classifies each collection data type into one of three
logical categories or partitions: sequentials, maps, and sets."
Instead of "sequential" one should use a "sequential thing" or a "sequential collection" (as used above). On the other hand, in mathematics the following words already exist: "chain", "totally ordered set", "simply ordered set", "linearly ordered set". "chain" sounds excellent but no-one uses that word. Shame!
"Joy of Clojure" also has this to say:
Beware type-based predicates!
Clojure includes a few predicates with names like the words just
defined. Although they’re not frequently used, it seems worth
mentioning that they may not mean exactly what the definitions here
might suggest. For example, every object for which sequential? returns
true is a sequential collection, but it returns false for some that
are also sequential [better: "that can be considered sequential
collections"]. This is because of implementation details that may be
improved in a future version of Clojure [and maybe this has already been
done?]
sequence (also "sequence abstraction")
This is more a concept than a thing: a series of values (thus ordered) which may or may not exist yet (i.e. a stream). If you say that a thing is a sequence, is that thing also necessarily a Clojure collection, even a sequential collection? I suppose so.
That sequential collection may have been completely computed and be completely available. Or it may be a "machine" to generate values on need (by computation - likely in a "pure" fashion - or by querying external "impure", "oracular" sources: keyboard, databases)
seq
This is a thing: something that can be processed by the functions first, rest, next, cons (and possibly others?), i.e. something that implements the Java interface clojure.lang.ISeq (an interface rather than a Clojure protocol).
This is based on the function seq?.
The function seq? can be used to test for this
Conversely, a seq is anything for which seq? returns true.
Docstring for seq?:
Return true if x implements ISeq
Docstring for first:
Returns the first item in the collection. Calls seq on its argument.
If coll is nil, returns nil.
Docstring for rest:
Returns a possibly empty seq of the items after the first. Calls seq
on its argument.
Docstring for next:
Returns a seq of the items after the first. Calls seq on its argument.
If there are no more items, returns nil.
You call next on the seq to generate the next element and a new seq. Repeat until nil is obtained.
Joy of Clojure calls this a "simple API for navigating collections" and says "a seq is any object that implements the seq API" - which is correct if "the API" is the ensemble of the "thing" (of a certain type) and the functions which work on that thing. It depends on suitable shift in the concept of API.
A note on the special case of the empty seq:
(def empty-seq (rest (seq [:x])))
(type empty-seq) ;=> clojure.lang.PersistentList$EmptyList
(nil? empty-seq) ;=> false ... empty seq is not nil
(some? empty-seq) ;=> true ("true if x is not nil, false otherwise.")
(first empty-seq) ;=> nil ... first of empty seq is nil ("does not exist"); beware confusing this with a nil in a nonempty list!
(next empty-seq) ;=> nil ... "next" of empty seq is nil
(rest empty-seq) ;=> () ... "rest" of empty seq is the empty seq
(type (rest empty-seq)) ;=> clojure.lang.PersistentList$EmptyList
(seq? (rest empty-seq)) ;=> true
(= (rest empty-seq) empty-seq) ;=> true
(count empty-seq) ;=> 0
(empty? empty-seq) ;=> true
Addenda
The function seq
If you apply the function seq to a thing for which that makes sense (generally a sequential collection), you get a seq representing/generating the members of that collection.
The docstring says:
Returns a seq on the collection. If the collection is empty, returns
nil. (seq nil) returns nil. seq also works on Strings, native Java
arrays (of reference types) and any objects that implement Iterable.
Note that seqs cache values, thus seq should not be used on any
Iterable whose iterator repeatedly returns the same mutable object.
After applying seq, you may get objects of various actual classes:
clojure.lang.Cons - try (class (seq (map #(* % 2) '( 1 2 3))))
clojure.lang.PersistentList
clojure.lang.APersistentMap$KeySeq
clojure.lang.PersistentList$EmptyList
clojure.lang.PersistentHashMap$NodeSeq
clojure.lang.PersistentQueue$Seq
clojure.lang.PersistentVector$ChunkedSeq
If you apply seq to a sequence, the actual class of the thing returned may be different from the actual class of the thing passed in. It will still be a sequence.
What the "elements" in the sequence are depends on the collection. For example, for maps, they are key-value pairs which look like 2-element vectors (but their actual class is not really a vector).
The function lazy-seq
Creates a thing to generate more things lazily (a suspended machine, a suspended stream, a thunk)
The docstring says:
Takes a body of expressions that returns an ISeq or nil, and yields a
Seqable object that will invoke the body only the first time seq is
called, and will cache the result and return it on all subsequent seq
calls. See also - realized?
A note on "functions" and "things" ... and "objects"
In the Clojure Universe, I like to talk about "functions" and "things", but not about "objects", which is a term heavily laden with Java-ness and other badness. Mention of objects feels like shards poking up from the underlying Java universe.
What is the difference between function and thing?
It's fluid! Some stuff is pure function, some stuff is pure thing, some is in between (can be used as function and has attributes of a thing)
In particular, Clojure allows contexts where one considers keywords (things) as functions (to look up values in maps) or where one interprets maps (things) as functions, or shorthand for functions (which take a key and return the value associated with that key in the map).
Evidently, functions are things as they are "first-class citizens".
It's also contextual! In some contexts, a function becomes a thing, or a thing becomes a function.
There are nasty mentions of objects ... these are shards poking up from the underlying Java universe.
For presentation purposes, a diagram of Collections
For seq?:
Return true if x implements ISeq
For coll?:
Returns true if x implements IPersistentCollection
And I found that the ISeq interface extends IPersistentCollection in the Clojure source code, so, as Rörd said, every sequence is a collection.

Why does OCaml only have first::rest on list, but not rest::last for list?

In OCaml, for lists, we always write first::rest. It is convenient for getting the first element out of a list, or for inserting an element at the front of a list.
But why does OCaml not have rest::last? Without the List module's functions, we can't easily get the last element of a list or insert an element at the end of a list.
The list datatype is not a magic builtin, only a regular recursive datatype with some syntactic sugar. You could implement it yourself, using Nil instead of [] and Cons(first,rest) instead of first::rest, in the following way:
type 'a mylist =
| Nil
| Cons of 'a * 'a mylist
I'm not sure if you will see the definition above as an answer to your question, but it really is: when you write first::rest, you're not calling a function, you're just using a datatype constructor that builds a new value (in constant time and space).
This definition is simple and has clear algorithmic properties: lists are immutable, accessing the first element of the list is O(1), accessing the k-th element is O(k), concatenation of two lists li1 and li2 is O(length(li1)), etc. In particular, accessing the last element or adding something at the end of a list li would be O(length(li)); we're not eager to expose this as a convenient operation because it is costly.
If you want to add elements at the end of a sequence, lists are not the right data structure. You may want to use a queue (if you follow a first-in, first-out access discipline), a deque, etc. There is a (mutable) Queue structure in the standard library, and the two third-party overlays Core and Batteries have a deque module (persistent in Batteries, mutable in Core).
Because lists are simply plain data types defined as
type 'a list = Nil | Cons of 'a * 'a list
except that you spell Nil as [] and Cons as infix ::. In other words, lists are "singly-linked" if you want. There is no magic involved except for the syntax of the constructors. Obviously, to get to the last element, or append one, you need some auxiliary functions then.

Does REMOVE ever return the same sequence, in practice?

Does REMOVE ever return the same sequence in any real implementations of Common Lisp? The spec suggests that it is allowed:
The result of remove may share with
sequence; the result may be identical
to the input sequence if no elements
need to be removed.
SBCL does not seem to do this, for example, but I only did a crude (and possibly insufficient) test, and I'm wondering what other implementations do.
CL-USER> (defparameter *str* "bbb")
*STR*
CL-USER> *str*
"bbb"
CL-USER> (defparameter *str2* (remove #\a *str*))
*STR2*
CL-USER> (eq *str* *str2*)
NIL
CL-USER> *str*
"bbb"
CL-USER> *str2*
"bbb"
Returning the original string could be useful. In case no element of a string gets removed, returning the original sequence prevents allocation of a new sequence. Even if a new sequence has been allocated internally, this new sequence could be turned into garbage as soon as possible.
CLISP for example returns the original string.
[1]> (let ((a "abc")) (eq a (remove #\d a)))
T
I suspect it mostly depends on the implementation. On the whole, I suspect it's not that common: the typical case is that something does get removed when REMOVE is called, so a space optimisation for the nothing-removed case would incur a run-time penalty without necessarily saving any space, since for strings and arrays you'd want to allocate space for the return value, and you would need either to construct a list as you go or to do a two-pass operation.
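If code relies on getting the identical sequence back when nothing is removed, it cannot depend on REMOVE for that. A minimal sketch of a wrapper that guarantees it is shown below; remove-or-same is a hypothetical name, not a standard function, and it only handles the default EQL test.

```lisp
;; REMOVE-OR-SAME is a hypothetical wrapper, not part of the standard.
(defun remove-or-same (item sequence)
  "Like (REMOVE ITEM SEQUENCE) with the default test, but return SEQUENCE
itself when ITEM does not occur in it."
  (if (position item sequence)        ; POSITION, so that a NIL item is handled too
      (remove item sequence)
      sequence))

(let ((s (copy-seq "bbb")))
  (eq s (remove-or-same #\a s)))      ; => T in any implementation

(remove-or-same #\b "abc")            ; => "ac"
```

This is the two-pass approach mentioned above: the extra POSITION scan is the price paid for avoiding an allocation in the nothing-removed case.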
