Understanding type specifiers in Common Lisp


I've found a great example of type checking in the LispWorks HyperSpec, but the "type specifier" link leads only to a glossary entry, not to a definition, and I got a little confused by the syntax.
In (check-type n (integer 0 *) "a positive integer") what does (integer 0 *) mean? I assume it means inclusive range from 0 to infinity, but is this so?

Yes, you can use type specifiers in Common Lisp, and they can be very powerful if your compiler chooses to use them. While you may find uses for check-type, the most common kind of type specification comes in the form of declarations.
The declare expression is not only used for types, however; it has a number of declaration identifiers, and Common Lisp implementations are free to add their own.
The bit you are interested in though is 'types' and more specifically than that 'Type Specifiers'. That page will give you the low-down on a variety of ways to specify types, including the way you mentioned in your question.
Again, be aware that your implementation doesn’t have to use the declarations; it could just ignore them! Here is some more info on that.
And for some example code, here is the example that got me understanding the basics of how this works. Here and more here.
From section 4.2.3, Type Specifiers:
If a type specifier is a list, the car of the list is a symbol, and
the rest of the list is subsidiary type information. Such a type
specifier is called a compound type specifier. Except as explicitly
stated otherwise, the subsidiary items can be unspecified. The
unspecified subsidiary items are indicated by writing *. For example,
to completely specify a vector, the type of the elements and the
length of the vector must be present.
(vector double-float 100)
The following leaves the length unspecified:
(vector double-float *)
The following leaves the element type unspecified:
(vector * 100)
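To answer the question directly: (integer 0 *) denotes the integers from 0 upward, with 0 included and no upper bound. A minimal sketch of how that behaves (the function name require-non-negative is made up for illustration):

```lisp
;; (integer LOW HIGH): both bounds are inclusive; * means "no bound".
(typep 0 '(integer 0 *))     ; true, 0 is included
(typep 42 '(integer 0 *))    ; true
(typep -1 '(integer 0 *))    ; false
(typep 1.0 '(integer 0 *))   ; false, not an integer at all

;; Wrapping a bound in a list makes it exclusive, so this excludes 0.
(typep 0 '(integer (0) *))   ; false

;; Hypothetical helper: CHECK-TYPE signals a correctable TYPE-ERROR
;; when N is not of the given type, and otherwise does nothing.
(defun require-non-negative (n)
  (check-type n (integer 0 *) "a non-negative integer")
  n)
```

Since 0 passes the check, "a non-negative integer" describes the type more accurately than "a positive integer" does.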

Related

how to specify type for a constant?

I have a bunch of constants, which I want to be of type (unsigned-byte 8).
(declaim (type (unsigned-byte 8) +c0+ +c1+))
(defconstant +c0+ #x0)
(defconstant +c1+ #x10)
But the declaim does not seem to do the trick, given when I type (type-of +c0+) it returns BIT (or integer, depending on the value), which is clearly not what I want.
So, how can I specify the type for constants?
Update
As it turns out, the question - while still a question - was not the root cause for my problems. In the (make-array '(2) ... part which caused the errors about "incompatible types", I entered for the initial-contents a quoted list where I should have put a "listed" list. WRONG: '(+c0+ +c1+), RIGHT: (list +c0+ +c1+).
Given I still associate variables instead of values with types in my mind, I could not interpret the meaning of the error messages coming from this.
So, basically I would delete the question if the system let me.

Types in Common Lisp are really just sets of values. Any value can be of an infinite number of types.
For example, the number 1 is of the type bit (which is an alias for (integer 0 1)). It is also of type (integer 0 2), or (integer -47 234). It is even of the type (or string null (integer 0 277)). So, when you ask (type-of 1), what should be the answer?
Lisp implementations know about some types that are built in. They will usually return the most restricted of those types that they know contains the value. If your Lisp implementation had special handling of numbers in base 5, it might return (integer 0 5) (or an alias of that) for 2.
So that's why the CLHS says that it returns a type specifier, not the type. It also specifies that it has to return something sensible (take a look there).
Your declaration is about the constant named +c0+, but the type-of call doesn't see that constant; it sees only the value coming out of it (think about the evaluation steps). So that declaration cannot have an effect here.
If you want to restrict the type of a value on the fly, you could use the special operator the, or check-type.

(type-of +c0+) does not return a declared type of the variable or constant. It returns the dynamic type of the object which is the value of +c0+.
With your type declaration you declared the type of a constant identifier. The effect of that is implementation-specific. SBCL will use that type declaration.
But you need to understand the difference between static type declarations using DECLAIM, DECLARE, etc. and dynamic types like the types used in TYPE-OF, TYPEP, the element type option of MAKE-ARRAY, ...
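A short sketch of the distinction, reusing the constants from the question (the variable name *bytes* is made up for illustration). It also shows the quoted-list mistake described in the question's update:

```lisp
(defconstant +c0+ #x0)
(defconstant +c1+ #x10)

;; TYPE-OF inspects the value, not the name it came from.
;; The exact answer is implementation-dependent.
(type-of +c0+)

;; TYPEP asks whether a value belongs to a given type.
(typep +c0+ '(unsigned-byte 8))        ; true
(typep +c1+ '(unsigned-byte 8))        ; true
(typep 256 '(unsigned-byte 8))         ; false

;; A quoted list holds the symbols +C0+ and +C1+, not their values.
(typep (first '(+c0+ +c1+)) 'symbol)   ; true

;; LIST evaluates its arguments, so the array receives 0 and 16.
(defparameter *bytes*
  (make-array 2 :element-type '(unsigned-byte 8)
                :initial-contents (list +c0+ +c1+)))
```

Passing '(+c0+ +c1+) as :initial-contents would try to store two symbols in an array of bytes, which is where the "incompatible types" errors come from.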

Check whether a symbol is bound

In Emacs Lisp (boundp 'symbol) returns t if symbol is bound to some value, nil otherwise. Is there an equivalent procedure in Guile Scheme?

Scheme avoids leaking implementation into the specification and speaks of 'identifiers' rather than of binding an interned symbol to a value - see §2.1 of R7RS. In Scheme, an 'identifier' is just a name.
An identifier name is treated as identifying a variable unless it identifies a macro (syntax) or it is in a context requiring it to be treated as identifying a symbol, such as by quotation. In particular, §2.1 of R7RS states that "When an identifier appears as a literal or within a literal (see section 4.1.2), it is being used to denote a symbol (see section 6.5)". You can test whether an identifier identifies a symbol with the symbol? procedure.
Guile Scheme does in fact implement identifiers by interning symbols, and you can query whether a symbol is bound using defined?:
(defined? 'num)
=> #f
(define num 1)
(defined? 'num)
=> #t
This is a Guile implementation matter and not portable Scheme.
Edit: Note that defined? only works with top-level variables defined with define. It does not work with let and cognates.

Why is there no generic operators for Common Lisp?

In CL, we have many operators to check for equality that depend on the data type: =, string-equal, char=, then equal, eql and so on for other data types, and the same goes for comparison operators (edit: don't forget to answer about these please :) do we have generic <, > etc.? can we make them work for other objects?)
However, the language has mechanisms to make them generic, for example generic functions (defgeneric, defmethod) as described in Practical Common Lisp. I can very well imagine a single == operator that works on integers, strings and characters, at least!
There has been work in that direction: https://common-lisp.net/project/cdr/document/8/cleqcmp.html
I see this as a major frustration, and even a wall, for beginners (of which I am one), especially those of us who come from other languages like Python, where we use one equality operator (==) for every equality check (with the help of objects to make it work on custom types).
I read a blog post (not a monad tutorial, great series) today pointing this out. The author moved to Clojure, for other reasons too of course, where there is one (or two?) operators.
So why is it so? Are there any good reasons? I can't even find a third-party library, not even in CL21. edit: cl21 has this sort of generic operators, of course.
On other SO questions I read about performance. First, this won't apply to the little code I'll write, so I don't care, and if you think it matters, do you have figures to make your point?
edit: despite the tone of the answers, it looks like there is not ;) We discuss in comments.

Kent Pitman has written an interesting article that tackles this subject: The Best of intentions, EQUAL rights — and wrongs — in Lisp.
And also note that EQUAL does work on integers, strings and characters. EQUALP also works for lists, vectors, hash tables and other Common Lisp types, but not objects… For some definition of work. The note at the end of the EQUALP page has a nice answer to your question:
Object equality is not a concept for which there is a uniquely determined correct algorithm. The appropriateness of an equality predicate can be judged only in the context of the needs of some particular program. Although these functions take any type of argument and their names sound very generic, equal and equalp are not appropriate for every application.
Specifically note that there is a trick in my last “works” definition.

A newer library adds generic interfaces to standard Common Lisp functions: https://github.com/alex-gutev/generic-cl/
GENERIC-CL provides a generic function wrapper over various functions in the Common Lisp standard, such as equality predicates and sequence operations. The goal of the wrapper is to provide a standard interface to common operations, such as testing for the equality of two objects, which is extensible to user-defined types.
It does this for equality, comparison, arithmetic, objects, iterators, sequences, hash-tables, math functions,…
So one can define his own + operator for example.
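As a rough illustration of the idea, here is a minimal sketch of an extensible equality predicate built with plain defgeneric and defmethod. This is not GENERIC-CL's actual interface; the names same-p, point, point-x and point-y are made up for this example:

```lisp
;; Hypothetical generic equality predicate.
(defgeneric same-p (a b)
  (:documentation "True if A and B are considered equal.")
  ;; Fallback for any pair of objects without a more specific method.
  (:method (a b) (eql a b)))

;; Methods that delegate to the type-specific standard predicates.
(defmethod same-p ((a number) (b number)) (= a b))
(defmethod same-p ((a string) (b string)) (string= a b))
(defmethod same-p ((a character) (b character)) (char= a b))

;; A user-defined class joins in by adding its own method.
(defclass point ()
  ((x :initarg :x :reader point-x)
   (y :initarg :y :reader point-y)))

(defmethod same-p ((a point) (b point))
  (and (same-p (point-x a) (point-x b))
       (same-p (point-y a) (point-y b))))

(same-p 1 1.0)     ; true, numbers are compared with =
(same-p "ab" "AB") ; false, STRING= is case-sensitive
(same-p (make-instance 'point :x 1 :y 2)
        (make-instance 'point :x 1 :y 2)) ; true
```

Each method has to commit to one notion of equality (here = for numbers and string= for strings), which is exactly the design decision the note on the EQUALP page warns about.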

Yes we have! eq works with all values and it works all the time. It does not depend on the data type at all. It is exactly what you are looking for. It's like the is operator in python. It must be exactly what you were looking for? All the other ones agree with eq when it's t, however they tend to be t for totally different values that have various levels of similarities.
(defparameter *a* "this is a string")
(defparameter *b* *a*)
(defparameter *c* "this is a string")
(defparameter *d* "THIS IS A STRING")
All of these are equalp since they carry the same meaning. equalp is perhaps the sloppiest of the equality functions. I don't think 2 and 2.0 are the same, but equalp does. In my mind 2 is 2, while 2.0 is somewhere between 1.95 and 2.04. You see, they are not the same.
equal understands me. (equal *c* *d*) is definitely nil, and that is good. However, it returns t for (equal *a* *c*) as well. Both are arrays of characters and each character has the same value, however the two strings are not the same object. They just happen to look the same.
Notice I'm using strings here for every single one of them. We have 4 equality functions that tell you if two values have something in common, but only eq tells you if they are the same.
None of these are type specific. They work on all types, however they are not generics since they were around long before that was added in the language. You could perhaps make 3-4 generic equal functions but would they really be any better than the ones we already have?
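The comparisons described above can be tried out as follows. This is a sketch that repeats the four definitions so it runs on its own; *c* is built with copy-seq (an addition for this example) because a compiler is allowed to merge identical string literals, which would make the eq test misleading:

```lisp
(defparameter *a* "this is a string")
(defparameter *b* *a*)
(defparameter *c* (copy-seq *a*))   ; guaranteed to be a fresh string
(defparameter *d* "THIS IS A STRING")

(eq *a* *b*)       ; true, the very same object
(eq *a* *c*)       ; false, a distinct copy
(equal *a* *c*)    ; true, same characters in the same case
(equal *c* *d*)    ; false, EQUAL is case-sensitive for strings
(equalp *c* *d*)   ; true, EQUALP ignores case
(equal 2 2.0)      ; false, EQUAL compares numbers with EQL
(equalp 2 2.0)     ; true, EQUALP compares numbers with =
```

One caveat: for numbers and characters the result of eq is implementation-dependent, which is why eql exists.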

Fortunately CL21 introduces (more) generic operators, particularly for sequences it defines length, append, setf, first, rest, subseq, replace, take, drop, fill, take-while, drop-while, last, butlast, find-if, search, remove-if, delete-if, reverse, reduce, sort, split, join, remove-duplicates, every, some, map, sum (and some more). Unfortunately the doc isn't great, it's best to look at the sources. Those should work at least for strings, lists, vectors and define methods of the new abstract-sequence.
see also
https://github.com/cl21/cl21/wiki
https://lispcookbook.github.io/cl-cookbook/cl21.html

What characters are allowed in common lisp symbols?

What characters are allowed in Common Lisp symbols? Can you give a regular expression to match them (or are they beyond the capability of regular grammars to describe)?
I have tried looking for information on this, but all I can find are some examples in CLHS, but no concrete definition of what exactly a legal symbol is.
Edit:
So, common lisp symbols can legally contain any character.
However, the parser doesn't just accept any character as it reads lisp code. What are the rules for parsable symbols? E.g. symbols that can be supplied as 'quoted symbols or inside of '(quoted lists).
I am interested in generating and reading non-bar-delimited symbols, from a non-lisp language. It should suffice, for my application, to use [a-zA-Z0-9:&-]+, but I tend to prefer to be as accurate as possible, which is why I am trying to determine if there is a regex that can match symbols. Matching the |delimited syntax| would be a bonus, but non-delimited symbols would suffice.
This needs to be symbols that would be loaded legally when using (read). The answer is not that symbols can contain any character:
[1]> (read t)
#
*** - READ from #<IO TERMINAL-STREAM>: objects printed as # in view of *PRINT-LEVEL* cannot be read back in
I want to know the rules, or a regex, for what is a valid symbol here, without delimiting it with |.

As sds mentioned, symbol names can contain any characters. Given any string, you can create a symbol with that name. However, based on your comments, it sounds like you're wondering what, under fairly default settings, will be read as a symbol. The answer is still "pretty much anything", with a few exceptions.
The relevant sections in the HyperSpec begin with 2.2 Reader Algorithm, which describes the tokenization process. It describes the process in detail, but perhaps the most important part is:
When dealing with tokens, the reader's basic function is to
distinguish representations of symbols from those of numbers. When a
token is accumulated, it is assumed to represent a number if it
satisfies the syntax for numbers listed in Figure 2-9. If it does not
represent a number, it is then assumed to be a potential number if it
satisfies the rules governing the syntax for a potential number. If a
valid token is neither a representation of a number nor a potential
number, it represents a symbol.
Figure 2-9, mentioned in that excerpt, is in section 2.3.1 Numbers as Tokens, which says:
When a token is read, it is interpreted as a number or symbol. The token is interpreted as a number if it satisfies the syntax for numbers specified in the next figure.
So, the process is really "tokenize the stream, and for each token, check if it's a number, and if it's not a number, then it's a symbol." I realize this doesn't provide a nice clean grammar for symbols, but that's just the way the language is defined. If you sit down to the task of writing a tokenizer and reader for a Lisp, you may find that this is a pretty convenient way of going about it. You pretty much just need to recognize which characters terminate a symbol, which characters start and end lists, what gets eliminated as whitespace, and what your escape characters are. Then you read nested lists of tokens, turning each token into a number or a symbol (or a string, etc.).
Perhaps one of the easiest ways to see why you have to do this in terms of tokenization and then checking for numbers is the fact that Common Lisp has a *read-base* variable that controls the base. Depending on the value of *read-base*, some things are numbers or symbols, and you can't know until you know what the complete token is, and what the current state of the runtime is.
CL-USER> 'beef
BEEF
CL-USER> (setf *read-base* 16)
16
CL-USER> 'beef
48879
CL-USER> (setf *read-base* a) ; set it back to 10, which is now a
10
CL-USER> (setf *read-base* 36)
36
CL-USER> 'hello ; a number
29234652
CL-USER> 'hello\ world ; a symbol
|HELLO WORLD|
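The same experiment can be done without changing the global *read-base*. The following is a minimal sketch; the helper name token-kind is made up for illustration:

```lisp
;; Hypothetical helper: read one object from STRING under the given
;; *READ-BASE* and report what kind of object the reader produced.
(defun token-kind (string &optional (base 10))
  (with-standard-io-syntax            ; start from the default readtable
    (let ((*read-base* base)
          (*read-eval* nil))          ; never evaluate #. forms from input
      (typecase (read-from-string string)
        (number :number)
        (symbol :symbol)
        (t :other)))))

(token-kind "beef")               ; :SYMBOL
(token-kind "beef" 16)            ; :NUMBER
(token-kind "hello" 36)           ; :NUMBER
(token-kind "hello\\ world" 36)   ; :SYMBOL, the escaped space is not a digit
```

For generating symbols from another language, this suggests that a fixed regex only works if the reading side keeps *read-base* (and the readtable) at known values.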

Any character can be in a symbol. E.g.:
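;; CHAR-CODE-LIMIT is an exclusive upper bound, hence BELOW.
(length (loop for i below char-code-limit
              collect (intern (string (code-char i)))))
==> 1114112 ; the value of char-code-limit on this implementation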

How do I create a synonym for a type class name?

I want to create a synonym for a type class name. Here's how I'm doing it now:
class fooC = linordered_idom
instance int :: fooC
proof qed
definition foof :: "'a::fooC ⇒ 'a" where
"foof x = x"
term "foof (x::int)"
value "foof (x::int)"
This works fine if there's not a better way to do it. The disadvantage is that I have to instantiate int, and the class command takes time to implement itself.
Update 140314
This update is to clarify for Makarius what it is I want, to explain my purpose in wanting it, and give a list of commands that I'm familiar with for creating notation, abbreviations, and synonyms, but commands which I couldn't get to work for what I want.
My initial choice of "abbreviation" rather than "synonym"
I guess "synonym" would have been a better word, but I chose "abbreviation" because it describes what I want, which is to be able to create a shorter name for a type class, like renaming linordered_semidom to losdC. Though Isar abbreviation has some of the attributes of definition, it also just defines syntax. So, because "abbreviate" describes what I want, and abbreviation just defines syntax, I chose "abbreviation" instead of "synonym" or "alias".
Synonym/alias, Isar commands I couldn't get to work for that
"Alias" would describe what I want. As to the sentence "If you just want to save typing in the editor, you could use some abbreviations there," here are the commands I've experimented with to try and rename linordered_idom, but I couldn't get them to work for me:
type_notation
type_synonym
notation
abbreviation
syntax
Rather than explain what I've tried, and try to remember what I tried, I just list them. I did searches on "class" and only found the Isar commands class and classes. I thought maybe locale commands might be applicable, but I didn't find anything.
What I want is simple, like how type_synonym is used to define synonyms for types.
The purpose
There is my general desire to shorten type class names such as linordered_idom, because eventually, I plan on using the algebra type classes extensively.
However, there is a second reason, and that is to rename something like linordered_semidom to be part of a naming scheme of three types.
For any algebraic type class, such as linordered_semidom, I can use that type class, along with quotient_type, to create what I'll call a number system, such as how nat is used to define int.
Using Int.thy as a template, I did that with linordered_semidom, and then instantiated it as comm_ring_1, which is as far as I have time to go these days.
Additionally, with typedef, for any algebraic type class which has the dependencies of zero and one (and others such as ord), I can define a type of all elements greater than or equal to zero, and another one for all elements greater than zero. I did that for linordered_idom, but then I figured out that I actually needed to go the quotient_type route, to get things that model rat.
That's the long explanation. Eventually, I'll start working with numerous algebraic type classes, and from one type class, I'll get two more. If I do that for 20 type classes, and also use them, then long, descriptive names don't work, and renaming type classes will help me in knowing what type classes go together.
Here would be the scheme for linordered_semidom, where I don't know how this will actually work out, until I'm able to try it all:
linordered_semidom is the base class. I rename it to losdC. It's the numbers greater than or equal to zero for these three types.
losdQ is defined from losdC using quotient_type. It gives me the negative numbers, and the ability to coerce losdC to losdQ.
losd1 is defined using typedef, and is the numbers greater than zero.
I need a consistent naming scheme, to keep it all straight: losdC, losdQ, and losd1.
Finally, eventually even 4 types instead of 3 types
I haven't completely worked and thought things out (I'm not even close), but analogously, it's all related to implementing, for algebra type classes, the basic relationship between nat, int, and rat, where real might eventually come into play. Additionally, it's about getting a type, from these types, of the non-negative or positive members, if those don't come by default.
There is nat used for int, and int used for rat.
With nat used for int, we get the non-negative integers by default, which is nat.
With int used for rat, we don't get the non-negative members of rat, we get fractions. (Again, I'm talking about a type of non-negatives and positives, not a set of non-negatives and positives.)
So, if I use linordered_idom and quotient_type to define fractions, then I have to use typedef twice to get the non-negative and positive members of those fractions, which means I would have 4 types to keep track of, liodC, liodQ, liod0, and liod1.
If there's a simple solution to renaming type classes, then I've unnecessarily said about 600 words.

A definition is not an abbreviation; it introduces a separate term that is logically equal. That works for term constants.
A type class is semantically a predicate over types, and thus connected to some predicate (term constant), but in practice you rarely access that.
So what exactly does it mean to "abbreviate a type class"?
For example, you might want to manipulate the class name space to get an alias for it, which is in principle possible. But what is the purpose?
If you just want to save typing in the editor, you could use some abbreviations there.
Another possibility, within the formal system, is to introduce genuine aliases in the name space. Isabelle provides some facilities for that, which are not much advertised, because there is a real danger of obscuring libraries and preventing anyone else from understanding them if names are changed too much.
This is how it works, using some friendly Isabelle/ML within the theory source:
class foobar = ord + fixes foobar :: 'a
setup {* Sign.class_alias @{binding f} @{class foobar} *}
typ "'a::f"
instantiation nat :: f
begin
definition foobar_nat :: nat where "foobar_nat = 0"
instance ..
end
Note that Sign.class_alias only refers to the type class name space in the narrow sense. A class is many things at the same time: locale, const (the predicate), type class. You can see this in the following examples where the class is used as "target" for local definitions and theorems:
definition (in foobar) "fuzz = foobar"
theorem (in foobar) "fuzz = foobar" by (simp add: fuzz_def)
Technically, the locale name space used above could support aliases as well, but this is not done. Only basic Sign.class_alias, Sign.type_alias, Sign.const_alias are exposed for unusual situations, to address problems with legacy libraries.
