Percent sign in defun and defstruct - functional-programming

I am teaching myself Common Lisp. I have been looking at an example of Conway's game of life, and there is a piece of syntax I do not understand.
The complete code is available here. The part in particular I am having trouble with is as follows:
(defstruct (world (:constructor %make-world))
current
next)
(defun make-world (width height)
(flet ((make-plane (width height)
(make-array (list width height)
:element-type 'bit
:initial-element 0)))
(%make-world
:current (make-plane width height)
:next (make-plane width height))))
I am wondering, first, what is the significance of the percent-sign in %make-world? Second, why does the constructor specify two different names? (make-world and %make-world) I have seen this syntax in use before, but the names are always the same. It seems like there is some deeper functionality, but it is escaping me.

There are several naming conventions is the Lisp world when it comes to identifiers. For an overview see: http://www.cliki.net/Naming+conventions
Making objects or structures can be done with system generated functions. DEFSTRUCT will create a MAKE-FOO function with init values for the slots as keyword arguments.
Sometimes people prefer functions with normal positional arguments - it's shorter to write and the arguments have to be given when calling the function - you can't omit them.
Here in this case there is the need to name the DEFSTRUCT generated function in such a way that it does not collide with the name, which the user should use. So %MAKE-FOO says that is an internal helper function to the library and is expected to NOT be called by user-level code.

My Lisp is a little rusty, but I believe it goes like this:
The % sign has no special meaning. I've seen it used for internal functions (e.g. defined by labels), but nothing wil stop you from calling it normally. If you look at defstruct documetation, you'll see that (:constructor %make-world) defines a named constructor %make-world (by default the constructor would be called make-world. This contructor can be used to create world structs, initializing fields using named parameters.
The function make-world exists to make creating these structs easier. The thing is, current and next should be 2-dimensional arrays, but it's more convenient if, instead of passing these arrays to the constructor, you could just say what the dimensions are and a function would create those arrays for you. Which is exactly what make-world does here. It first defines an internal function make-plane, which can create an array, then uses it to create 2 arrays and pass them to the constructor %make-plane.
In line with the usual usage of the % character (again, this is just a convention), it tells you that, as a programmer wishing to use the world struct, you should not use the %make-world constructor, but the make-world function instead.

Related

Using functions as parameter or use them directly in clojure?

i have this two functions and i use share-card-to-player in share-cards. Should i not rather pass share-card-to-player as argument or use it as anonymous function?
In the way, i am using it, i doubt, that i am using it in a functional way, because i am basically referencing to a global variable namely the function share-card-to-player. Is that assumption correct?
(defn share-card-to-player [game [player cards]]
(assoc-in game [:players player :cards ]
cards))
(defn share-cards [{players :players cards :cards :as game}]
(reduce share-card-to-player game
(map vector
(keys players)
(->> cards
(partition (/ (count cards)
(count players)))))))
i am basically referencing to a global variable namely the function share-card-to-player
assoc-in is a global variable. / is a global variable. count is a global variable. Should you avoid them all? It will be tough to get anything done! The problem with globals is mutation, and functions are not mutable.
You can define share-card-to-player as an anonymous local function if you want. But it's fine as is. Extracting it as a function parameter would make sense only if you expect you would want to pass another function to it at another use site.
Passing in the function as a parameter would only make sense to me if you plan to pass in different functions in other parts of your program to change the behavior of share-cards.
In your particular example, I'd even consider factoring out the call to map if you can give a good name to that thing because it takes longer to read and understand that code.
To summarize:
Functions as parameter should change the behavior.
Factoring out things to their own functions, even if only used in one place of your program, makes sense if you can give a good name to what the code does.

How is it possible that a function can call itself

I know about recursion, but I don't know how it's possible. I'll use the fallowing example to further explain my question.
(def (pow (x, y))
(cond ((y = 0) 1))
(x * (pow (x , y-1))))
The program above is in the Lisp language. I'm not sure if the syntax is correct since I came up with it in my head, but it will do. In the program, I am defining the function pow, and in pow it calls itself. I don't understand how it's able to do this. From what I know the computer has to completely analyze a function before it can be defined. If this is the case, then the computer should give an undefined message when I use pow because I used it before it was defined. The principle I'm describing is the one at play when you use an x in x = x + 1, when x was not defined previously.
Compilers are much smarter than you think.
A compiler can turn the recursive call in this definition:
(defun pow (x y)
(cond ((zerop y) 1)
(t (* x (pow x (1- y))))))
into a goto intruction to re-start the function from scratch:
Disassembly of function POW
(CONST 0) = 1
2 required arguments
0 optional arguments
No rest parameter
No keyword parameters
12 byte-code instructions:
0 L0
0 (LOAD&PUSH 1)
1 (CALLS2&JMPIF 172 L15) ; ZEROP
4 (LOAD&PUSH 2)
5 (LOAD&PUSH 3)
6 (LOAD&DEC&PUSH 3)
8 (JSR&PUSH L0)
10 (CALLSR 2 57) ; *
13 (SKIP&RET 3)
15 L15
15 (CONST 0) ; 1
16 (SKIP&RET 3)
If this were a more complicated recursive function that a compiler cannot unroll into a loop, it would merely call the function again.
From what I know the computer has to completely analyze a function before it can be defined.
When the compiler sees that one defines a function POW, then it tells itself: now we are defining function POW. If it then inside the definition sees a call to POW, then the compiler says to itself: oh, this seems to be a call to the function that I'm currently compiling and it can then create code to make a recursive call.
A function is just a block of code. It's name is just help so you don't have to calculate the exact address it will end up in. The programming language will turn the names into where the program is to go to execute.
How one function call another is by storing the address of the next command in this function on the stack, perhaps add arguments to the stack and then jump to the address location of the function. The function itself jumps to the return address it finds so that control goes back to the callee. There are several calling conventions implemented by the language on which side do what. CPUs don't really have function support so just like there is nothing called a while loop in CPUs functions are emulated.
Just like functions have names, arguments have names too, however they are mere pointers just like the return address. When calling itself it just adds a new return address and arguments onto the stack and jump to itself. The top of the stack will be different and thus the same variable names are unique addresses to the call so x and y in the previous call is somewhere else than the current x and y. In fact there is no special treatment needed for calling itself than calling anything else.
Historically the first high level language, Fortran, did not support recursion. It would call itself but when it returned it returned to the original callee without doing the rest of the function after the self call. Fortran itself would have been impossible to write without recursion so while itself used recursion it did not offer it to the programmer that used it. This limitation is the reason why John McCarthy discovered Lisp.
I think to see how this can work in general, and in particular in cases where recursive calls can't be turned into loops, it's worth thinking about how a general compiled language might work, because the problems are not different.
Let's imagine how a compiler might turn this function into machine code:
(defun foo (x)
(+ x (bar x)))
And let's assume that it does not know anything about bar at the time of compilation. Well, it has two options.
It can compile foo in such a way that the call to bar is translated a set of instructions which say, 'look up the function definition stored under the name bar, whatever it currently is, and arrange to call that function with the right arguments'.
It can compile foo in such a way that there is a machine-level function call to a function but the address of that function is left as a placeholder of some kind. And it can then attach some metadata to foo which says: 'before this function is called you need to find the function named bar, find its address, splice it into the code in the right place, and remove this metadata.
Both of these mechanisms allow foo to be defined before it's known what bar is. And note that instead of bar I could have written foo: these mechanisms deal with recursive calls too. They differ apart from that, however.
The first mechanism means that, every time foo is called it needs to do some kind of dynamic lookup for bar which will involve some overhead (but this overhead can be pretty small):
as a consequence of this the first mechanism will be slightly slower than it might be;
but, also as a consequence of this, if bar gets redefined, then the new definition will get picked up, which is a very desirable thing for an interactive language, which Lisp implementations usually are.
The second mechanism means that, after foo has all its references to other functions linked in to it, then the calls happen at the machine level:
this means they will be quick;
but that redefinition will be, at best, more complicated or, at worst, not possible at all.
The second of these implementations is close to how traditional compilers compile code: they compile code leaving a bunch of placeholders with associated metadata saying what names those placeholders correspond to. A linker, (sometimes known as a link-loader, or loader) then grovels over all the files produced by the compiler as well as other libraries of code and resolves all these references, resulting in a bit of code which can actually be run.
A very simple-minded Lisp system might work entirely by the first mechanism (I am pretty sure that this is how Python works, for instance). A more advanced compiler will probably work by some combination of the first and second mechanism. As an example of this, CL allows the compiler to make assumptions that apparent self-calls in functions really are self-calls, and so the compiler may well compile them as direct calls (essentially it will compile the function and then link it on the fly). But when compiling code in general, it might call 'through the name' of the function.
There are also more-or-less heroic strategies which things could do: for instance at the first call of a function link it, on the fly, to all the things it refers to, and note in their definitions that if they change then this thing needs to be unlinked as well so it all happens again. These kind of tricks once seemed implausible, but compilers for languages like JavaScript do things at least as hairy as this all the time now.
Note that compilers and linkers for modern systems actually do something more complicated than I've described, because of shared libraries &c: what I described is more-or-less what happened pre shared-library.

Knowing when what you're looking at must be a macro

I know there is macro-function, explained here, which allows you to check, but is it also possible in simply reading lisp source to sometimes infer of what you're looking at "that must be a macro"? (assuming of course you have never seen the function/macro before).
I'm fairly sure the answer is yes, but as this seems so fundamental, I thought worth asking, especially because any nuances on this may be valuable & interesting to know about.
In Paul Graham's ANSI Common Lisp, p70, he is describing how to use defstruct.
When I see (defstruct point x y), were I to know absolutely nothing about what defstruct was, this could just as well be a function.
But when I see
(defstruct polemic
(subject "foo")
(effect "bar"))
I know that must be a macro because (let's assume), I also know that subject and effect are undefined functions. (I know that because they error with undefined function when called 'at the top level'(?)) (if that's the right term).
If the two list arguments to defstruct above were quoted, it would not be so simple. Because they're not quoted, it must be a macro.
Is it as simple as that?
I've changed the field names slightly from those used on the book to make this question clearer.
Finally, Graham writes:
"We can specify default values for structure fields by enclosing the field name and a default expression in a list in the original definition"
What I'm noticing is that that's true but it is not a (quoted) list. Would any readers of this post have phrased the above sentence at all differently (given that macros haven't been introduced in the book yet (though I have a basic awareness of what they are)).
My feeling is it's not a "data list" those default expressions are enclosed in. (apologies for bad terminology) - seeking how rightly to conceptualise here.
In general, you're right: if there's some nesting inside the call and you are sure that the car's of the nested lists aren't functions - it's a macro.
Also, almost always, def-something and with-something are macros.
But there's no guarantee. The question is, what are you trying to accomplish? Some code walking/transformation or external processing (like in an editor). For the latter, you should keep in mind that full control is possible only if you perform code evaluation, although heuristics (like in Emacs) can take you pretty far. Or you just want to develop your intuition for faster code reading...
There is a set of conventions that identify quite cleary what forms are supposed to be macros, simply by mimicking the syntax of existing macros or special operators of CL.
For example, the following is a mix of various imaginary macros, but even without knowing their definition, the code shouldn't be too hard to figure out:
(defun/typed example ((id (integer 0 10)))
(with-connection (connection (connect id))
(do-events (event connection)
(event-case event
(:quit (&optional code) (return code))))))
The usual advice about macros is to avoid them if possible, so if you spot something that doesn't make sense as a lisp expression, it probably is, or is enclosed in, a macro.
(defstruct point x y)
[...] were I to know absolutely nothing about what defstruct was, this could just as well be a function.
There are various hints that this is not a function. First of all, the name starts with def. Then, if defstruct was a function, then point, x and y would all be evaluated before calling the function, and that means the code would be relying on global variables, even though they are not wearing earmuffs (e.g. *point*, *x*, *y*), and you probably won't find any definition for them in the preceding forms (or later in the same compilation unit). Also, if it was a function, the result would be discarded directly since it is not used (this is a toplevel form). That only indicates the probable presence of side-effects, but still, this would be unusual.
A top-level function with side-effects would look like this instead, with quoted data:
(register-struct 'point '(x y))
Finally, there are cases where you cannot easily guess if you are using a macro or a function:
(my-get object :slot)
This could be a function call, or you could have a macro that turns the above to (aref object 0) (assuming :slot is the zeroth slot in object, because all your objects are assumed to be of a certain custom type backed by a vector). You could also have compiler macros. In case of doubt, try to macroexpand it and look at the documentation.

What are the typical use-cases of (defun (setf …)) defsetf and define-setf-expander

When developing with Common Lisp, we have three possibilities to define new setf-forms:
We can define a function whose name is a list of two symbols, the first one being setf, e.g. (defun (setf some-observable) (…)).
We can use the short form of defsetf.
We can use the long form of defsetf.
We can use define-setf-expander.
I am not sure what is the right or intended use-case for each of these possibilities.
A response to this question could hint at the most generic solution and outline contexts where other solutions are superior.
define-setf-expander is the most general of these. All of setf's functionality is encompassed by it.
Defining a setf function works fine for most accessors. It is also valid to use a generic function, so polymorphism is insufficient to require using something else. Controlling evaluation either for correctness or performance is the main reason to not use a setf function.
For correctness, many forms of destructuring are not possible to do with a setf function (e.g. (setf (values ...) ...)). Similarly I've seen an example that makes functional data structures behave locally like a mutable one by changing (setf (dict-get key some-dict) 2) to assign a new dictionary to some-dict.
For performance, consider the silly case of (incf (nth 10000 list)) which if the nth writer were implemented as a function would require traversing 10k list nodes twice, but in a setf expander can be done with a single traversal.

The relationship between quotation, reification and reflection

I recently get confused with quotation, reification and reflection. Someone could offer a good explanation about their relationship and differences (if any)?
Quoting
This is probably the easiest one. Consider what happens when you type the following into the REPL:
(+ a 1)
REPL stands for Read Eval Print Loop, so first it Reads this in. This is a list, so after reading we have a list of 3 elements containing: <the symbol "+"> <the symbol "a"> <the number 1>
The next step is evaluation. Evaluating a list in Common Lisp involves looking up the function (or macro) bound to the first item in the list. Since + is bound to a function and not a macro, it then evaluates each subsequent element in the list. Numbers evaluate to themselves, and "a" will evaluate to whatever it is bound to. Now that the arguments are evaluated, the function "+" is called with the results of the evaluation.
We then Print the result and Loop back to the Read step
So this is great, but what if we want something that, when evaluated, will end up as a list of 3 elements containing <the symbol "+"> <the symbol "a"> <the number 1>? The solution to this is quoting. Lisps in general have a special form called "quote" that takes a single argument, and the result is that argument, unevaluated. So
(quote (+ a 1))
will evaluate to that list. As some syntactic sugar, ' is treated the same as (quote ), so we can just write '(+ a 1).
Reification
Reification is a generic term that roughly means "Make an abstract concept" concrete. Specific to programming, when something is reified, it roughly means you can treat it as data (or "First-class object"). An example of this in lisp is functions the lambda expression gives you the ability to create a concrete, first-class object that represents the abstract concept of a function call. Another example is a CLOS class, which is itself a CLOS object, that represents the abstract concept of a class.
Reflection
To a certain degree, reflection is the opposite of Reification. Given something concrete, you want some information about it's abstract representation. Consider a Common Lisp Package object, which is a reification of the concept of a package, which is just a mapping from symbol names to symbols. You can use do-symbols to iterate over all of the symbols in a package, thus getting that information out at runtime.
Also, remember how I said lambda's were a reification of functions? Well "function-lambda-expression" is the reflection on functions.
The metaobject protocol (MOP) is a semi-standard way of doing all sorts of things with classes and objects. Among other things, it allows reflection on classes and objects.

Resources