Is the X in (LET ((x ...) a fully fleshed symbol? - common-lisp

Or in other words: Is it possible for a variable in CL not to be (part of) a symbol?
I think I may have a profound misconception about variables in CL.
I always thought CL has no variables, only symbols, and symbols have (among other properties) a name and a value cell (which is the variable).
And when someone said "variable x has the value 42" I thought it was short for "the value cell of the symbol named x stores the value 42".
But this is probably wrong.
When I type
> (let ((a 42))
(type-of 'a))
SYMBOL
; caught STYLE-WARNING:
; The variable A is defined but never used.
is the lexical variable a in this example a fully fleshed symbol whose value cell has been set to 42?
Because the warning The variable A is defined but never used suggests otherwise and it appears that the lexical variable is not the same thing as the symbol a in the following form (type-of 'a).

Common Lisp has two data types which have a special meaning for evaluation:
cons cells / lists -> used in Lisp source code, lists are Lisp forms
symbols -> used as names for various purposes
If you want to use them as data in Lisp code, then you have to quote them.
Both are used in the Lisp source code, but once you compile code, they may disappear.
Variables are written as symbols in the source code. But in compiled code they may go away - when they are lexical variables.
Example using SBCL:
a file with
(defun test (foo)
(+ foo foo))
Now we do:
CL-USER> (proclaim '(optimize (debug 0))) ; the compiler saves no debug info
; No value
CL-USER> (compile-file "/tmp/test.lisp")
; compiling file "/private/tmp/test.lisp" (written 23 MAY 2017 09:06:51 PM):
; compiling (DEFUN TEST ...)
; /tmp/test.fasl written
; compilation finished in 0:00:00.013
#P"/private/tmp/test.fasl"
NIL
NIL
CL-USER> (find-symbol "FOO")
FOO
:INTERNAL
The compiler has read the source code and created a compiled FASL file. We see that the symbol FOO is now in the current package. FOO names the variable in our source code.
Now quit SBCL and restart it.
Let's load the machine code:
CL-USER> (load "/tmp/test")
T
CL-USER> (find-symbol "FOO")
NIL
NIL
There is no symbol FOO anymore. It's also not possible to retrieve the lexical value of the variable FOO using the symbol FOO. There is no mapping (like some kind of explicit lexical environment) from symbols to lexical values.

The value cell is used for dynamic (AKA "special") variables, not lexical variables. Lexical variables are symbols in the source code, but they don't have any runtime relationship to the symbol (except for internal use by the debugger).
So if you wrote:
(let ((a 42))
(declare (special a))
(print (symbol-value 'a)))
it would work because the declaration makes it a dynamic variable, and then you can access the value in the function cell.

You are not checking the type of the bound variable a or its value but that of a literal constant symbol that happens to have the same name as the variable in your let form:
(let ((a 42))
(type-of 'literal-symbol))
; ==> symbol (since 'literal-symbol evaluates to a symbol, just like 'a does)
To check the type of the value of the binding a you do it without the literal quote:
(let ((a 42))
(type-of a))
; ==> (integer 0 281474976710655)
Here you actually check the type of the let bound value and it's an integer. Surprised that 42 is a number and not a symbol?
(let ((a 10) (b 'a))
(list a b))
; ==> (10 a)
The variable a and the quoted literal 'a are not the same. They just happen to look the same when displayed but 'a is data and a is code. In CL a compiler might use lists and symbols internally but what it is when its executing is entirely up to the implementation and in most implementations they stack allocate when they can and the code that evaluate a stack allocated variable would be replaced by something that picks the value at the index from the stack. CL has a disassemble function and if you check the output in SBCL from something you'll see it's more similar to the output of a C compiler than the original lisp source.

Related

make-symbol creates a symbol with a sharpsign, getf fails in accessing it [duplicate]

There is something I can't understand about Common lisp.
Assume I'm writing a macro similar to this:
(defmacro test-macro ()
(let ((result (gensym)))
`(let ((,result 1))
(print (incf ,result)))))
Than I can do
> (test-macro)
2
2
Now I want to see how it expands
> (macroexpand-1 '(test-macro))
(LET ((#:G4315 1)) (PRINT (INCF #:G4315))) ;
T
Ok. There are unique symbols generated with gensym that were printed as uninterned.
So as far as I know the uninterned symbols are the symbols for which the evaluator does't create symbol-data binding internally.
So, if we macro-expand to that form there should be an error on (incf #:G4315).
To test this we can just evaluate that form in the REPL:
> (LET ((#:G4315 1)) (PRINT (INCF #:G4315)))
*** - SETQ: variable #:G4315 has no value
So why does the macro that expands to this string works perfectly and the form itself does not?
Symbols can be interned in a package or not. A symbol interned in a package can be looked up and found. An uninterned symbol can't be looked up in a package. Only one symbol of a certain name can be in a package. There is only one symbol CL-USER::FRED.
You write:
So as far as I know the uninterned symbols are the symbols for which the evaluator does't create symbol-data binding internally.
That's wrong. Uninterned symbols are symbols which are not interned in any package. Otherwise they are perfectly fine. interned means registered in the package's registry for its symbols.
The s-expression reader does use the symbol name and the package to identify symbols during reading. If there is no such symbol, it is interned. If there is one, then this one is returned.
The reader does look up symbols by their name, here in the current package:
(read-from-string "FOO") -> symbol `FOO`
a second time:
(read-from-string "FOO") -> symbol `FOO`
it is always the same symbol FOO.
(eq (read-from-string "FOO") (read-from-string "FOO")) -> T
#:FOO is the syntax for an uninterned symbol with the name FOO. It is not interned in any package. If the reader sees this syntax, it creates a new uninterned symbol.
(read-from-string "#:FOO") -> new symbol `FOO`
a second time:
(read-from-string "#:FOO") -> new symbol `FOO`
Both symbols are different. They have the same name, but they are different data objects. There is no other registry for symbols, than the packages.
(eq (read-from-string "#:FOO") (read-from-string "#:FOO")) -> NIL
Thus in your case (LET ((#:G4315 1)) (PRINT (INCF #:G4315))), the uninterned symbols are different objects. The second one then is a different variable.
Common Lisp has a way to print data, so that the identity is preserved during printing/reading:
CL-USER 59 > (macroexpand-1 '(test-macro))
(LET ((#:G1996 1)) (PRINT (INCF #:G1996)))
T
CL-USER 60 > (setf *print-circle* t)
T
CL-USER 61 > (macroexpand-1 '(test-macro))
(LET ((#1=#:G1998 1)) (PRINT (INCF #1#)))
T
Now you see that the printed s-expression has a label #1= for the first symbol. It then later references the same variable. This can be read back and the symbol identities are preserved - even though the reader can't identify the symbol by looking at the package.
Thus the macro creates a form, where there is only one symbol generated. When we print that form and want to read it back, we need to make sure that the identity of uninterned symbols is preserved. Printing with *print-circle* set to T helps to do that.
Q: Why do we use uninterned generated symbols in macros by using GENSYM (generate symbol)?
That way we can have unique new symbols which do not clash with other symbols in the code. They get a name by the function gensym- usually with a counted number at the end. Since they are fresh new symbols not interned in any package, there can't be any naming conflict.
CL-USER 66 > (gensym)
#:G1999
CL-USER 67 > (gensym)
#:G2000
CL-USER 68 > (gensym "VAR")
#:VAR2001
CL-USER 69 > (gensym "PERSON")
#:PERSON2002
CL-USER 70 > (gensym)
#:G2003
CL-USER 71 > (describe *)
#:G2003 is a SYMBOL
NAME "G2003"
VALUE #<unbound value>
FUNCTION #<unbound function>
PLIST NIL
PACKAGE NIL <------- no package
gensym generate a symbol and when you print it you get the "string" representation of that symbol which isn't the same thing as "reader" representation i.e the code representation of the symbol.

Symbol names undefined variable

I have a question concerning the relation between symbols and global variables.
The hyperspec states for the value attribute of a symbol:
"If a symbol has a value attribute, it is said to be bound, and that fact can be detected by the function boundp. The object contained in the value cell of a bound symbol is the value of the global variable named by that symbol, and can be accessed by the function symbol-value."
If I apply the following steps:
CL-USER> (intern "*X*")
*X*
NIL
CL-USER> (boundp '*x*)
NIL
CL-USER> (setf (symbol-value '*x*) 1)
1
CL-USER> (boundp '*x*)
T
as I understand the above cited conditions are fulfilled. There should be a global variable named by the symbol and the value of the variable is the symbol-value. But this is wrong.
CL-USER> (describe '*x*)
COMMON-LISP-USER::*X*
[symbol]
*X* names an undefined variable:
Value: 1
; No value
CL-USER>
It has to be proclaimed special.
CL-USER> (proclaim '(special *x*))
; No value
CL-USER> (describe '*x*)
COMMON-LISP-USER::*X*
[symbol]
*X* names a special variable:
Value: 1
; No value
CL-USER>
Can you please explain this behaviour. What means an "undefined variable", I did not find this term in the hyperspec.
(I use SBCL 1.3.15.)
Thank you for your answers.
Edit:
(Since this comment applies to both answers below (user Svante and coredump), I write it as an Edit and not as a comment to both answers).
I agree with the answers that * x* is a global variable.
The hyperspec states for global variable:
"global variable n. a dynamic variable or a constant variable."
Therefore I think now, the reason why SBCL says it is "undefined" is not whether is special, but whether it is a dynamic (special) variable or a constant variable (hyperspec:"constant variable n. a variable, the value of which can never change").
The third definition which is mentioned in the below answers (maybe I understand the answers wrong), that it is a global variable which is not special (and not a constant) does not exist according to the hyperspec.
Can you agree with that?
Edit 2:
Ok, in summary, I thought, since the hyperspec does not define undefined global variables, they do not exist.
But the correct answer is, they do exist and are undefined, that means it is implementation dependent how it is dealt with them.
Thank you for your answers, I accept all three of them, but I can only mark one.
There is a global variable named by the symbol, and its value is the symbol-value. That is what the output tells you. The thing that is undefined is the status of the variable: whether it is special. I agree that the wording of the output is a bit special.
If you set the value of a variable without creating it first (which is what a naked setq would do as well), it is undefined whether it becomes special or not.
Conventionally, one does not use global variables that are not special. That's why you should use defvar, defparameter etc..
As said in the other answer(s), *X* is not declare special (dynamic). SBCL also gives you a warning if your lexically bind the symbol:
FUN> (let ((*X* 30)) (list *X* (symbol-value '*X*)))
; in: LET ((*X* 30))
; (LET ((FUN::*X* 30))
; (LIST FUN::*X* (SYMBOL-VALUE 'FUN::*X*)))
;
; caught STYLE-WARNING:
; using the lexical binding of the symbol (FUN::*X*), not the
; dynamic binding, even though the name follows
; the usual naming convention (names like *FOO*) for special variables
;
; compilation unit finished
; caught 1 STYLE-WARNING condition
(30 10)
Note also what happens if *X* is locally declared as special:
FUN> (let ((*X* 30)) (declare (special *X*)) (list *X* (symbol-value '*X*)))
(30 30)
The symbol-value accessor retrieves the binding from the dynamic environment.
global variable which is not special (and not a constant) does not exist according to the hyperspec.
The actual behaviour is undefined in the standard, but in implementations it might work in some way.
This example in LispWorks:
CL-USER 46 > (boundp 'foo)
NIL
So FOO is unbound.
CL-USER 47 > (defun baz (bar) (* foo bar))
BAZ
The above defines a function baz in the LispWorks interpreter - it is not compiled. There is no warning.
Now we set this symbol foo:
CL-USER 48 > (setq foo 20)
20
CL-USER 49 > (baz 22)
440
We have successfully called it, even though FOO was not declared as a global function.
Let's check, if it is declared as special:
CL-USER 50 > (SYSTEM:DECLARED-SPECIAL-P 'foo)
NIL
Now we compile the function from above:
CL-USER 51 > (compile 'baz)
;;;*** Warning in BAZ: FOO assumed special
BAZ
The compiler says that it does not know of FOO and assumes that it is special.
This behaviour is undefined and implementations differ:
an interpreter might just use the global symbol value and not complain at all - see the LispWorks example above. That's relatively common in implementations.
a compiler might assume that the undefined variable is a special variable and warns. This is also relatively common in implementations.
a compiler might assume that the undefined variable is a special variable and also declare the variable to be special. This is not so common - the CMUCL did (does?) that by default. This behaviour is not common and not liked, since there is no standard way to undo the declaration.

LISP - Make new unique symbol

In a Common Lisp program, I want to find a way to generate a new symbol that is not in use in the program at the time. I am aware of the (gensym) function, but this makes symbols that may already be present in the program. I have some understanding that I need to intern the symbol, so I tried this:
(defun new-symbol () (intern (symbol-name (gensym))))
Which seems to get halfway to the answer. For instance,
[1]> (new-symbol)
G3069
NIL
[2]> (new-symbol)
G3070
NIL
[3]> (defvar a 'G3071)
A
[4]> (new-symbol)
G3071
:INTERNAL
As you can see, the function seems to recognize that the symbol 'G3071' is already in use elsewhere, but I don't know how to get it to generate a new symbol if that is the case.
I am aware of the (gensym) function, but this makes symbols that may already be present in the program.
No, it doesn't. It creates symbols that aren't interned in any package. The documentation says (emphasis added):
Function GENSYM
Syntax:
gensym &optional x &Rightarrow; new-symbol
Arguments and Values:
x—a string or a non-negative integer. Complicated defaulting
behavior; see below.
new-symbol—a fresh, uninterned symbol.
You can also use make-symbol to create a symbol that's not interned in any package. It's documentation summary (pretty much identical to gensym's):
Function MAKE-SYMBOL
Syntax:
make-symbol name &Rightarrow; new-symbol
Arguments and Values:
name—a string.
new-symbol—a fresh, uninterned symbol.
Regarding your second point:
I have some understanding that I need to intern the symbol, so I tried this:
No, if you want a fresh symbol, then you probably don't want to be interning anywhere. Interning is the process whereby you take a string (not a symbol) and get the symbol with the given name within a particular package. If you call intern with the same symbol name and package twice, you'll get back the same symbol, which is what you're trying to avoid.
CL-USER> (defparameter *a* (intern "A"))
*A*
CL-USER> (eq *a* (intern "A"))
T
Since gensym generates the name of the fresh symbol using *gensym-counter*, if you take the name from the gensym symbol and intern it somewhere, you could get the same symbol if someone modifies the value of *gensym-counter*. E.g.:
(let ((counter *gensym-counter*) ; save counter value
(s1 (gensym))) ; create a gensym (s1)
(setf *gensym-counter* counter) ; reset counter value
(let ((s2 (gensym))) ; create a gensym (s2)
(list s1
s2
(eq s1 s2)
(symbol-name s1)
(symbol-name s2)
(string= (symbol-name s1)
(symbol-name s2)))))
; (#:G1037 #:G1037 NIL ; different symbols
; "G1037" "G1037" T) ; same name

In Common Lisp, how to test if variable is special?

I thought I would be able to find this through Google, SO, or the books I'm reading, but it is proving elusive.
In the implementation I'm learning with, I can do the following at the top-level:
(defvar *foo* 4)
(set 'bar 3)
If I then call (describe '*foo*) and (describe 'bar), I get a description saying that *foo* is special and bar is non-special (among other details).
Is there a function that takes a symbol variable as an argument and returns true or false if it is special? If so, is describe probably implemented in part by calling it?
Context: I'm learning Common Lisp, but at work I have a system with a dialect of Lisp similar to Common Lisp, but the describe function is unimplemented. There's sort of an XY thing going on here, but I'm also trying to grok Lisp and CL.
Many Common Lisp implementations provide the function variable-information in some system dependent package.
Here in SBCL:
* (require :sb-cltl2)
NIL
* (sb-cltl2:variable-information '*standard-output*)
:SPECIAL
NIL
((TYPE . STREAM))
This function was proposed as part of some other functionality to be included into ANSI CL, but didn't make it into the standard. Still many implementations have it. For documentation see: https://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node102.html
A non-special variable's environment will be captured when you create a closure over it:
(let ((x 1))
(let ((f (lambda () x)))
(let ((x 2))
(eql 2 (funcall f)))))
;;=> NIL
A special variable's lexical environment will not:
(defvar *x*) ; *x* is special
(let ((*x* 1))
(let ((f (lambda () *x*)))
(let ((*x* 2))
(eql 2 (funcall f)))))
;;=> T
Using this approach, you could easily define a macro that will expand to code like the previous that will let you determine whether a symbol is globally proclaimed special:
(defmacro specialp (symbol)
(let ((f (gensym "FUNC-")))
`(let ((,symbol 1))
(let ((,f (lambda () ,symbol)))
(let ((,symbol 2))
(eql 2 (funcall ,f)))))))
(specialp x) ;=> NIL
(specialp *x*) ;=> T
Note that this isn't a function, it's a macro. That means that the macro function for specialp is getting called with the symbols X and *X*. This is important, because we have to construct code that uses these symbols. You can't do this with a function, because there'd be no (portable) way to take a symbol and create a lexical environment that has a lexical variable with that name and a lambda function that refers to it.
This also has some risks if you try to use it with certain symbols. For instance, in SBCL, if you try to bind, e.g., *standard-output* to something that isn't a stream or a stream designator, you'll get an error:
CL-USER> (specialp *standard-output*)
; in: SPECIALP *STANDARD-OUTPUT*
; (LET ((*STANDARD-OUTPUT* 1))
; (LET ((#:FUNC-1038 (LAMBDA # *STANDARD-OUTPUT*)))
; (LET ((*STANDARD-OUTPUT* 2))
; (EQL 2 (FUNCALL #:FUNC-1038)))))
;
; caught WARNING:
; Constant 1 conflicts with its asserted type STREAM.
; See also:
; The SBCL Manual, Node "Handling of Types"
;
; compilation unit finished
; caught 1 WARNING condition
Defining globals with set or setq is not supported. There are 2 common ways to define globals:
(defparameter *par* 20) ; notice the earmuffs in the name!
(defvar *var* 30) ; notice the earmuffs in the name!
All global variables are special. Lexically scoped variables (not special) are not possible to get described. E.g.
(let ((x 10))
(describe 'x)) ; ==> X is the symbol X
It describes not the lexical variable but the symbol representation. It really doesn't matter since you probably never need to know in run time since you know this when you're writing if it's a bound lexical variable or global special by conforming to the earmuffs naming convention for global variables.
I believe the only way to get this information at run time* is by either using an extension to CL, as Rainer noted, or to use eval.
(defun specialp (x)
(or (boundp x)
(eval `(let (,x)
(declare (ignorable ,x))
(boundp ',x)))))
(Defect warning: If the variable is unbound but declared to be a type incompatible with nil, this could raise an error. Thanks Joshua for pointing it out in his answer.)
* The macro approach determines which symbol it is checking at macro expansion time, and whether that symbol is lexical or special at compile time. That's fine for checking the status of a variable at the repl. If you wanted to e.g. print all of the special variables exported by a package, though, you would find that to use the macro version you would end up having to use eval at the call site:
(loop for s being the external-symbols of :cl-ppcre
when (eval `(specialp-macro ,s)) do (print s))

Why is SET deprecated?

I'm curious to learn the reason for this since set seems unique. For instance:
(set 'nm 3) ;; set evaluates its first argument, a symbol, has to be quoted
nm ;; ==> evaluates to 3
(set 'nm 'nn) ;; assigns nn to the value cell of nm
nm ;; ==> evaluates to nn
nn ;; ==> ERROR. no value
(set nm 3) ;; since nm evaluates to nn ...
nm ;; evaluates to nn
nn ;; evaluates to 3
To achieve similar behavior, I've only been able to use setf:
(setq tu 'ty) ;;
(symbol-value 'tu) ;; returns ty
(setq (symbol-value 'tu) 5) ;; ERROR. setq expects a symbol
(setf (symbol-value tu) 5) ;; has to be unquoted to access the value cell
tu ;; ==> evaluates to ty
ty ;; ==> evaluates to 3
In other programming languages the reason(s) for demotion are pretty clear: inefficient, bug prone, or insecure come to mind. I wonder what the criteria for deprecation for set was at the time. All I've been able to glean from the web is this, which is laughable. Thanks.
The main reason set is deprecated is that its use can lead to errors when it is used on bound variables (e.g., in functions):
(set 'a 10)
==> 10
a
==> 10
(let ((a 1)) ; new lexical binding
(set 'a 15) ; modification of the value slot of the global symbol, not the lexical variable
a) ; access the lexical variable
==> 1 ; nope, not 15!
a
==> 15
set is a legacy function from the times when Lisp was "the language of symbols and lists". Lisp has matured since then; direct operations on symbol slots are relatively rare, and there is no reason to use set instead of the more explicit (setf symbol-value).
In your example:
(set nm 'nn) ;; assigns nn to the value cell of nm
nm ;; ==> evaluates to nn
This is completely wrong and the main reason why it's deprecated. It's the symbol you get when evaluating nm that is bound to nn. Unfortunately in your example that is the number 3 and it will signal an error at run time since you cannot use numbers as variables. If you were to write (setq 3 'nn) the error can be seen at compile time.
With set there is an additional reason. It's very difficult to compile when you don't know what symbol is to be bound and the compiler cannot optimize it.
Scheme didn't have automatic quoting either in it's first version and they didn't even have a quoted macro like setq. Instead it stated that set' would suffice. It's obvious that it didn't since Scheme no longer have set but only define.
I personally disagree that it should be removed (deprecation results in removal eventually) but like eval it should be avoided and only used as a last resort.

Resources