The relationship between quotation, reification and reflection

I recently got confused about quotation, reification and reflection. Could someone offer a good explanation of their relationship and differences (if any)?

Quoting
This is probably the easiest one. Consider what happens when you type the following into the REPL:
(+ a 1)
REPL stands for Read Eval Print Loop, so first it Reads this in. This is a list, so after reading we have a list of 3 elements containing: <the symbol "+"> <the symbol "a"> <the number 1>
The next step is evaluation. Evaluating a list in Common Lisp involves looking up the function (or macro) bound to the first item in the list. Since + is bound to a function and not a macro, it then evaluates each subsequent element in the list. Numbers evaluate to themselves, and "a" will evaluate to whatever it is bound to. Now that the arguments are evaluated, the function "+" is called with the results of the evaluation.
We then Print the result and Loop back to the Read step.
So this is great, but what if we want something that, when evaluated, will end up as a list of 3 elements containing <the symbol "+"> <the symbol "a"> <the number 1>? The solution to this is quoting. Lisps in general have a special form called "quote" that takes a single argument, and the result is that argument, unevaluated. So
(quote (+ a 1))
will evaluate to that list. As some syntactic sugar, ' is treated the same as (quote ), so we can just write '(+ a 1).
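The steps above can be tried out with a small sketch. It assumes, purely for illustration, that a has been given the value 2 (the original leaves a unspecified); the name *form* is likewise just a placeholder.

```lisp
;; Assumption for illustration: A holds 2. It is defined without earmuffs
;; only to match the name used in the text.
(defparameter a 2)

;; QUOTE returns its argument unevaluated: a three-element list.
(defparameter *form* (quote (+ a 1)))

(length *form*)           ; three elements: the symbol +, the symbol A, the number 1
(first *form*)            ; the symbol +, not the addition function
(equal *form* '(+ a 1))   ; 'x is read as (quote x), so both spellings give equal lists

;; The unquoted form is evaluated: + is called on the values of its arguments.
(+ a 1)

;; EVAL treats the stored list as code and evaluates it in the global environment.
(eval *form*)
```

The quoted list is ordinary data until something such as eval decides to treat it as code.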
Reification
Reification is a generic term that roughly means "make an abstract concept concrete". Specific to programming, when something is reified, it roughly means you can treat it as data (a "first-class object"). An example of this in Lisp is functions: the lambda expression gives you the ability to create a concrete, first-class object that represents the abstract concept of a function call. Another example is a CLOS class, which is itself a CLOS object that represents the abstract concept of a class.
Reflection
To a certain degree, reflection is the opposite of reification. Given something concrete, you want some information about its abstract representation. Consider a Common Lisp package object, which is a reification of the concept of a package, which is just a mapping from symbol names to symbols. You can use do-symbols to iterate over all of the symbols in a package, thus getting that information out at runtime.
Also, remember how I said lambdas were a reification of functions? Well, "function-lambda-expression" is the reflection on functions.
The metaobject protocol (MOP) is a semi-standard way of doing all sorts of things with classes and objects. Among other things, it allows reflection on classes and objects.
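A minimal sketch of both directions, using only standard Common Lisp operators. The names *add-one*, point and count-symbols are made up for this illustration, and the exact results of the reflective calls depend on the implementation.

```lisp
;; Reification: a function as a first-class object stored in a variable.
(defparameter *add-one* (lambda (x) (+ x 1)))
(funcall *add-one* 41)

;; A class is itself an object that can be looked up and inspected.
(defclass point ()
  ((x :initarg :x)
   (y :initarg :y)))
(class-name (find-class 'point))   ; the symbol POINT
(class-of (find-class 'point))     ; the metaclass, normally STANDARD-CLASS

;; Reflection: ask a package object which symbols are accessible in it.
(defun count-symbols (package)
  "Count the symbols DO-SYMBOLS visits in PACKAGE (a symbol may be visited twice)."
  (let ((n 0))
    (do-symbols (s package n)
      (declare (ignorable s))
      (incf n))))
(count-symbols :common-lisp)

;; FUNCTION-LAMBDA-EXPRESSION may legitimately return NIL as its first value
;; when the implementation did not keep the source of the function.
(function-lambda-expression *add-one*)
```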

Related

Knowing when what you're looking at must be a macro

I know there is macro-function, explained here, which allows you to check, but is it also possible in simply reading lisp source to sometimes infer of what you're looking at "that must be a macro"? (assuming of course you have never seen the function/macro before).
I'm fairly sure the answer is yes, but as this seems so fundamental, I thought worth asking, especially because any nuances on this may be valuable & interesting to know about.
In Paul Graham's ANSI Common Lisp, p70, he is describing how to use defstruct.
When I see (defstruct point x y), were I to know absolutely nothing about what defstruct was, this could just as well be a function.
But when I see
(defstruct polemic
  (subject "foo")
  (effect "bar"))
I know that must be a macro because (let's assume), I also know that subject and effect are undefined functions. (I know that because they error with undefined function when called 'at the top level'(?)) (if that's the right term).
If the two list arguments to defstruct above were quoted, it would not be so simple. Because they're not quoted, it must be a macro.
Is it as simple as that?
I've changed the field names slightly from those used in the book to make this question clearer.
Finally, Graham writes:
"We can specify default values for structure fields by enclosing the field name and a default expression in a list in the original definition"
What I'm noticing is that that's true but it is not a (quoted) list. Would any readers of this post have phrased the above sentence at all differently (given that macros haven't been introduced in the book yet (though I have a basic awareness of what they are)).
My feeling is it's not a "data list" those default expressions are enclosed in. (apologies for bad terminology) - seeking how rightly to conceptualise here.
In general, you're right: if there's some nesting inside the call and you are sure that the cars of the nested lists aren't functions, it's a macro.
Also, almost always, def-something and with-something are macros.
But there's no guarantee. The question is, what are you trying to accomplish? Some code walking/transformation or external processing (like in an editor)? For the latter, you should keep in mind that full control is possible only if you perform code evaluation, although heuristics (like in Emacs) can take you pretty far. Or do you just want to develop your intuition for faster code reading?
There is a set of conventions that identify quite clearly what forms are supposed to be macros, simply by mimicking the syntax of existing macros or special operators of CL.
For example, the following is a mix of various imaginary macros, but even without knowing their definition, the code shouldn't be too hard to figure out:
(defun/typed example ((id (integer 0 10)))
  (with-connection (connection (connect id))
    (do-events (event connection)
      (event-case event
        (:quit (&optional code) (return code))))))
The usual advice about macros is to avoid them if possible, so if you spot something that doesn't make sense as a lisp expression, it probably is, or is enclosed in, a macro.
(defstruct point x y)
[...] were I to know absolutely nothing about what defstruct was, this could just as well be a function.
There are various hints that this is not a function. First of all, the name starts with def. Then, if defstruct was a function, then point, x and y would all be evaluated before calling the function, and that means the code would be relying on global variables, even though they are not wearing earmuffs (e.g. *point*, *x*, *y*), and you probably won't find any definition for them in the preceding forms (or later in the same compilation unit). Also, if it was a function, the result would be discarded directly since it is not used (this is a toplevel form). That only indicates the probable presence of side-effects, but still, this would be unusual.
A top-level function with side-effects would look like this instead, with quoted data:
(register-struct 'point '(x y))
Finally, there are cases where you cannot easily guess if you are using a macro or a function:
(my-get object :slot)
This could be a function call, or you could have a macro that turns the above to (aref object 0) (assuming :slot is the zeroth slot in object, because all your objects are assumed to be of a certain custom type backed by a vector). You could also have compiler macros. In case of doubt, try to macroexpand it and look at the documentation.
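As a rough sketch of checking at the REPL rather than guessing: macro-function, special-operator-p and macroexpand-1 are standard operators, while the my-get macro below is a hypothetical definition written only to mirror the example above (the slot layout is an assumption).

```lisp
;; MACRO-FUNCTION returns the expander for a macro name, otherwise NIL.
(macro-function 'defstruct)   ; non-NIL: DEFSTRUCT is a macro
(macro-function 'car)         ; NIL: CAR is a function
(special-operator-p 'setq)    ; true: SETQ is a special operator

;; Hypothetical macro: assumes :SLOT is stored at index 0 of a vector.
(defmacro my-get (object slot)
  (ecase slot
    (:slot `(aref ,object 0))))

;; MACROEXPAND-1 shows what the form turns into without evaluating it.
(macroexpand-1 '(my-get object :slot))   ; primary value: the list (AREF OBJECT 0)
```

For an ordinary function call, macroexpand-1 returns the form unchanged together with a second value of NIL, which is another way to tell the two apart.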

Percent sign in defun and defstruct

I am teaching myself Common Lisp. I have been looking at an example of Conway's game of life, and there is a piece of syntax I do not understand.
The complete code is available here. The part in particular I am having trouble with is as follows:
(defstruct (world (:constructor %make-world))
  current
  next)
(defun make-world (width height)
  (flet ((make-plane (width height)
           (make-array (list width height)
                       :element-type 'bit
                       :initial-element 0)))
    (%make-world
     :current (make-plane width height)
     :next (make-plane width height))))
I am wondering, first, what is the significance of the percent-sign in %make-world? Second, why does the constructor specify two different names? (make-world and %make-world) I have seen this syntax in use before, but the names are always the same. It seems like there is some deeper functionality, but it is escaping me.
There are several naming conventions in the Lisp world when it comes to identifiers. For an overview see: http://www.cliki.net/Naming+conventions
Making objects or structures can be done with system generated functions. DEFSTRUCT will create a MAKE-FOO function with init values for the slots as keyword arguments.
Sometimes people prefer functions with normal positional arguments - it's shorter to write and the arguments have to be given when calling the function - you can't omit them.
In this case there is a need to name the DEFSTRUCT-generated function in such a way that it does not collide with the name the user should use. So %MAKE-FOO says that it is an internal helper function of the library and is expected NOT to be called by user-level code.
My Lisp is a little rusty, but I believe it goes like this:
The % sign has no special meaning. I've seen it used for internal functions (e.g. defined by labels), but nothing will stop you from calling it normally. If you look at the defstruct documentation, you'll see that (:constructor %make-world) defines a named constructor %make-world (by default the constructor would be called make-world). This constructor can be used to create world structs, initializing fields using named parameters.
The function make-world exists to make creating these structs easier. The thing is, current and next should be 2-dimensional arrays, but it's more convenient if, instead of passing these arrays to the constructor, you could just say what the dimensions are and a function would create those arrays for you. Which is exactly what make-world does here. It first defines an internal function make-plane, which can create an array, then uses it to create 2 arrays and pass them to the constructor %make-world.
In line with the usual usage of the % character (again, this is just a convention), it tells you that, as a programmer wishing to use the world struct, you should not use the %make-world constructor, but the make-world function instead.
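To make the two entry points concrete, here is a short usage sketch built on the definitions from the question. The variable *w* and the dimensions 3 and 4 are arbitrary choices for illustration.

```lisp
(defstruct (world (:constructor %make-world))
  current
  next)

(defun make-world (width height)
  (flet ((make-plane (width height)
           (make-array (list width height)
                       :element-type 'bit
                       :initial-element 0)))
    (%make-world
     :current (make-plane width height)
     :next (make-plane width height))))

;; Intended entry point: positional arguments, arrays built for you.
(defparameter *w* (make-world 3 4))
(array-dimensions (world-current *w*))      ; (3 4)
(eq (world-current *w*) (world-next *w*))   ; NIL: two distinct arrays

;; The keyword constructor is still callable; the % is only a convention.
(%make-world :current nil :next nil)
```

Because the :constructor option renames the generated constructor, defstruct does not define its own make-world here, which leaves that name free for the hand-written function.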

Why does OCaml only have first::rest on list, but not rest::last for list?

In OCaml, for lists, we always write first::rest. It is convenient for getting the first element out of a list, or for inserting an element at the front of a list.
But why does OCaml not have rest::last? Without the List functions, we can't easily get the last element of a list or insert an element at the end of a list.
The list datatype is not a magic builtin, only a regular recursive datatype with some syntactic sugar. You could implement it yourself, using Nil instead of [] and Cons(first,rest) instead of first::rest, in the following way:
type 'a mylist =
| Nil
| Cons of 'a * 'a mylist
I'm not sure if you will see the definition above as an answer to your question, but it really is: when you write first::rest, you're not calling a function, you're just using a datatype constructor that builds a new value (in constant time and space).
This definition is simple and has clear algorithmic properties: lists are immutable, accessing the first element of the list is O(1), accessing the k-th element is O(k), concatenation of two lists li1 and li2 is O(length(li1)), etc. In particular, accessing the last element or adding something at the end of a list li would be O(length(li)); we're not eager to expose this as a convenient operation because it is costly.
If you want to add elements at the end of a sequence, lists are not the right data structure. You may want to use a queue (if you follow a first-in, first-out access discipline), a deque, etc. There is a (mutable) Queue structure in the standard library, and the two third-party overlays Core and Batteries have a deque module (persistent in Batteries, mutable in Core).
Because lists are simply plain data types defined as
type 'a list = Nil | Cons of 'a * 'a list
except that you spell Nil as [] and Cons as infix ::. In other words, lists are "singly-linked" if you want. There is no magic involved except for the syntax of the constructors. Obviously, to get to the last element, or append one, you need some auxiliary functions then.

Haskell "collections" language design

Why is the Haskell implementation so focused on linked lists?
For example, I know Data.Sequence is more efficient
with most of the list operations (except for the cons operation), and is used a lot;
syntactically, though, it is "hardly supported". Haskell has put a lot of effort into functional abstractions, such as the Functor and the Foldable class, but their syntax is not compatible with that of the default list.
If, in a project I want to optimize and replace my lists with sequences - or if I suddenly want support for infinite collections, and replace my sequences with lists - the resulting code changes are abhorrent.
So I guess my wondering can be made concrete in questions such as:
Why isn't the type of map equal to (Functor f) => (a -> b) -> f a -> f b?
Why can't the [] and (:) functions be used for, for example, the type in Data.Sequence?
I am really hoping there is some explanation for this, that doesn't include the words "backwards compatibility" or "it just grew that way", though if you think there isn't, please let me know. Any relevant language extensions are welcome as well.
Before getting into why, here's a summary of the problem and what you can do about it. The constructors [] and (:) are reserved for lists and cannot be redefined. If you plan to use the same code with multiple data types, then define or choose a type class representing the interface you want to support, and use methods from that class.
Here are some generalized functions that work on both lists and sequences. I don't know of a generalization of (:), but you could write your own.
fmap instead of map
mempty instead of []
mappend instead of (++)
If you plan to do a one-off data type replacement, then you can define your own names for things, and redefine them later.
-- For now, use lists
type List a = [a]
nil = []
cons x xs = x : xs
{- Switch to Seq in the future
-- type List a = Seq a
-- nil = empty
-- cons x xs = x <| xs
-}
Note that [] and (:) are constructors: you can also use them for pattern matching. Pattern matching is specific to one type constructor, so you can't extend a pattern to work on a new data type without rewriting the pattern-matching code.
Why there's so much list-specific stuff in Haskell
Lists are commonly used to represent sequential computations, rather than data. In an imperative language, you might build a Set with a loop that creates elements and inserts them into the set one by one. In Haskell, you do the same thing by creating a list and then passing the list to Set.fromList. Since lists so closely match this abstraction of computation, they have a place that's unlikely to ever be superseded by another data structure.
The fact remains that some functions are list-specific when they could have been generic. Some common functions like map were made list-specific so that new users would have less to learn. In particular, they provide simpler and (it was decided) more understandable error messages. Since it's possible to use generic functions instead, the problem is really just a syntactic inconvenience. It's worth noting that Haskell language implementations have very little list-specific code, so new data structures and methods can be just as efficient as the "built-in" ones.
There are several classes that are useful generalizations of lists:
Functor supplies fmap, a generalization of map.
Monoid supplies methods useful for collections with list-like structure. The empty list [] is generalized to other containers by mempty, and list concatenation (++) is generalized to other containers by mappend.
Applicative and Monad supply methods that are useful for interpreting collections as computations.
Traversable and Foldable supply useful methods for running computations over collections.
Of these, only Functor and Monad were in the influential Haskell 98 spec, so the others have been overlooked to varying degrees by library writers, depending on when the library was written and how actively it was maintained. The core libraries have been good about supporting new interfaces.
I remember reading somewhere that map is for lists by default since newcomers to Haskell would be put off if they made a mistake and saw a complex error about "Functors", which they have no idea about. Therefore, they have both map and fmap instead of just map.
EDIT: That "somewhere" is the Monad Reader Issue 13, page 20, footnote 3:
3 You might ask why we need a separate map function. Why not just do away with the current
list-only map function, and rename fmap to map instead? Well, that’s a good question. The
usual argument is that someone just learning Haskell, when using map incorrectly, would much
rather see an error about lists than about Functors.
For (:), the (<|) function seems to be a replacement. I have no idea about [].
A nitpick, Data.Sequence isn't more efficient for "list operations", it is more efficient for sequence operations. That said, a lot of the functions in Data.List are really sequence operations. The finger tree inside Data.Sequence has to do quite a bit more work for a cons (<|) equivalent to list (:), and its memory representation is also somewhat larger than a list as it is made from two data types a FingerTree and a Deep.
The extra syntax for lists is fine, it hits the sweet spot at what lists are good at - cons (:) and pattern-matching from the left. Whether or not sequences should have extra syntax is further debate, but as you can get a very long way with lists, and lists are inherently simple, having good syntax is a must.
List isn't an ideal representation for Strings - the memory layout is inefficient as each Char is wrapped with a constructor. This is why ByteStrings were introduced. Although they are laid out as an array ByteStrings have to do a bit of administrative work - [Char] can still be competitive if you are using short strings. In GHC there are language extensions to give ByteStrings more String-like syntax.
The other major lazy functional language, Clean, has always represented strings as byte arrays, but its type system made this more practical - I believe the ByteString library uses unsafePerformIO under the hood.
With version 7.8, GHC supports overloading list literals (the OverloadedLists extension); compare the manual. For example, given appropriate IsList instances, you can write
['0' .. '9'] :: Set Char
[1 .. 10] :: Vector Int
[("default",0), (k1,v1)] :: Map String Int
['a' .. 'z'] :: Text
(quoted from the documentation).
I am pretty sure this won't be an answer to your question, but still.
I wish Haskell had more liberal function names(mixfix!) a la Agda. Then, the syntax for list constructors (:,[]) wouldn't have been magic; allowing us to at least hide the list type and use the same tokens for our own types.
The amount of code change while migrating between list and custom sequence types would be minimal then.
About map, you are a bit luckier. You can always hide map, and set it equal to fmap yourself.
import Prelude hiding(map)
map :: (Functor f) => (a -> b) -> f a -> f b
map = fmap
Prelude is great, but it isn't the best part of Haskell.

What is the difference between a variable and a symbol in LISP?

In terms of scope? Actual implementation in memory? The syntax? For example, in (let a 1), is 'a' a variable or a symbol?
Jörg's answer points in the right direction. Let me add a bit to it.
I'll talk about Lisps that are similar to Common Lisp.
Symbols as a data structure
A symbol is a real data structure in Lisp. You can create symbols, you can use symbols, you can store symbols, you can pass symbols around and symbols can be part of larger data structures, for example lists of symbols. A symbol has a name, can have a value and can have a function value.
So you can take a symbol and set its value.
(setf (symbol-value 'foo) 42)
Usually one would write (setq foo 42), or (set 'foo 42) or (setf foo 42).
Symbols in code denoting variables
But!
(defun foo (a)
  (setq a 42))
or
(let ((a 10))
  (setq a 42))
In both forms above in the source code there are symbols and a is written like a symbol and using the function READ to read that source returns a symbol a in some list. But the setq operation does NOT set the symbol value of a to 42. Here the LET and the DEFUN introduce a VARIABLE a that we write with a symbol. Thus the SETQ operation then sets the variable value to 42.
Lexical binding
So, if we look at:
(defvar foo nil)
(defun bar (baz)
  (setq foo 3)
  (setq baz 3))
We introduce a global variable FOO.
In bar the first SETQ sets the symbol value of the global variable FOO. The second SETQ sets the local variable BAZ to 3. In both cases we use the same SETQ and we write the variable as a symbol, but in the first case FOO denotes a global variable, and those store their values in the symbol value. In the second case BAZ denotes a local variable, and how its value gets stored, we don't know. All we can do is access the variable to get its value. In Common Lisp there is no way to take a symbol BAZ and get the local variable value. We don't have access to the local variable bindings and their values using symbols. That's a part of how lexical binding of local variables works in Common Lisp.
This leads for example to the observation, that in compiled code with no debugging information recorded, the symbol BAZ is gone. It can be a register in your processor or implemented some other way. The symbol FOO is still there, because we use it as a global variable.
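The difference between a symbol's value and a lexical variable written with the same symbol can be observed directly. In this sketch, probe and shadow are hypothetical names; probe is deliberately given a global symbol value without being declared special, so that LET creates an ordinary lexical variable.

```lisp
;; Give the symbol PROBE a value in its value cell. No DEFVAR is used,
;; so PROBE is not special and LET below binds it lexically.
(setf (symbol-value 'probe) :global)

(defun shadow ()
  (let ((probe :local))
    (setq probe :changed)            ; assigns the lexical variable
    (list probe                      ; value of the lexical variable
          (symbol-value 'probe))))   ; value cell of the symbol, untouched

(shadow)   ; => (:CHANGED :GLOBAL)
```

Had probe been introduced with DEFVAR, the LET would have rebound the special variable instead, and both elements of the list would be :CHANGED.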
Various uses of symbols
A symbol is a data type, a data structure in Lisp.
A variable is a conceptual thing. Global variables are based on symbols. Local lexical variables are not.
In source code we write all kinds of names for functions, classes and variables using symbols.
There is some conceptual overlap:
(defun foo (bar) (setq bar 'baz))
In the above SOURCE code, defun, foo, bar, setq and baz are all symbols.
DEFUN is a symbol providing a macro.
FOO is a symbol providing a function.
SETQ is a symbol providing a special operator.
BAZ is a symbol used as data. Thus the quote before BAZ.
BAR is a variable. In compiled code its symbol is no longer needed.
Quoting from the Common Lisp HyperSpec:
symbol n. an object of type symbol.
variable n. a binding in the “variable” namespace.
binding n. an association between a name and that which the name denotes. (…)
Explanation time.
What Lisp calls symbols is fairly close to what many languages call variables. In a first approximation, symbols have values; when you evaluate the expression x, the value of the expression is the value of the symbol x; when you write (setq x 3), you assign a new value to x. In Lisp terminology, (setq x 3) binds the value 3 to the symbol x.
A feature of Lisp that most languages don't have is that symbols are ordinary objects (symbols are first-class objects, in programming language terminology). When you write (setq x y), the value of x becomes whatever the value of y was at the time of the assignment. But you can write (setq x 'y), in which case the value of x is the symbol y.
Conceptually speaking, there is an environment which is an association table from symbols to values. Evaluating a symbol means looking it up in the current environment. (Environments are first-class objects too, but this is beyond the scope of this answer.) A binding refers to a particular entry in an environment. However, there's an additional complication.
Most Lisp dialects have multiple namespaces, at least a namespace of variables and a namespace of functions. An environment can in fact contain multiple entries for one symbol, one entry for each namespace. A variable, strictly speaking, is an entry in an environment in the namespace of variables. In everyday Lisp terminology, a symbol is often referred to as a variable when its binding as a variable is what you're interested in.
For example, in (setq a 1) or (let ((a 1)) ...), a is a symbol. But since the constructs act on the variable binding for the symbol a, it's common to refer to a as a variable in this context.
On the other hand, in (defun a (...) ...) or (flet ((a (x) ...)) ...), a is also a symbol, but these constructs act on its function binding, so a would not be considered a variable.
In most cases, when a symbol appears unquoted in an expression, it is evaluated by looking up its variable binding. The main exception is that in a function call (foo arg1 arg2 ...), the function binding for foo is used. The value of a quoted symbol 'x or (quote x) is itself, as with any quoted expression. Of course, there are plenty of special forms where you don't need to quote a symbol, including setq, let, flet, defun, etc.
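A small sketch of the two namespaces in Common Lisp, where one symbol carries a function binding and a variable binding at the same time. The name double is chosen only for this illustration.

```lisp
;; DOUBLE gets a global function binding.
(defun double (x) (* 2 x))

(let ((double 21))                      ; DOUBLE also gets a variable binding
  ;; Operator position uses the function binding,
  ;; argument position uses the variable binding.
  (double double))                      ; => 42

(fboundp 'double)                       ; true: the symbol names a global function
(funcall (symbol-function 'double) 5)   ; => 10
(symbolp 'double)                       ; => T: quoted, it is just a symbol object
```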
A symbol is a name for a thing. A variable is a mutable pointer to a mutable storage location.
In the code snippet you showed, both let and a are symbols. Within the scope of the let block, the symbol a denotes a variable which is currently bound to the value 1.
But the name of the thing is not the thing itself. The symbol a is not a variable. It is a name for a variable. But only in this specific context. In a different context, the name a can refer to a completely different thing.
Example: the symbol jaguar may, depending on context, denote
OSX 10.2
a gaming console
a car manufacturer
a ground attack military jet airplane
another military jet airplane
a supercomputer
an electric guitar
and a whole lot of other things
oh, did I forget something?
Lisp uses environments which are similar to maps (key -> value) but with extra built-in mechanisms for chaining environments and controlling bindings.
Now, symbols are pretty much keys (except special form symbols), and point to a value, i.e. a function, integer, list, etc.
Since Common Lisp gives you a way to alter the values, e.g. with setq, symbols in some contexts (your example) are also variables.
A symbol is a Lisp data object. A Lisp "form" means a Lisp object that is intended to be evaluated. When a symbol itself is used as a Lisp form, i.e. when you evaluate a symbol, the result is a value that is associated with that symbol. The way values are associated with symbols is a deep part of the Lisp language. Whether the symbol has been declared to be "special" or not greatly changes the way evaluation works.
Lexical values are denoted by symbols, but you can't manipulate those symbols as objects yourself. In my opinion, explaining anything in Lisp in terms of "pointers" or "locations" is not the best way.
Adding a side note to the above answers:
Newcomers to Lisp often are not sure exactly what symbols are for, besides being the names of variables. I think the best answer is that they are like enumeration constants, except that you don't have to declare them before using them. Of course, as others have explained, they are also objects. (This shouldn't seem strange to users of Java, in which enumeration constants are objects too.)
Symbols and variables are two different things.
As in mathematics, a symbol is a value, and a variable has the same meaning as in mathematics.
But your confusion comes from the fact that a symbol is the meta-representation of a variable.
That is if you do
(setq a 42)
You just defined a variable a. Incidentally, the way Common Lisp stores it is through the structure of a symbol.
In Common Lisp, a symbol is a structure with different properties. Each one can be accessed with functions like symbol-name, symbol-function...
In the case of a variable, you can access its value via symbol-value:
? (symbol-value 'a)
42
This is not the common case of getting the value of a.
? a
42
Note that a quoted symbol evaluates to the symbol itself: if you quote a symbol, you get the symbol, not its symbol-value.
? 'a
A

Resources