What is the effect of ftype declarations on built-in functions in SBCL? - common-lisp

I'm building on some old Common Lisp code written by others, which includes lines such as the following at the start of a few functions:
(declare (ftype (function (&rest float) float) + - * min max))
My understanding is that the purpose of this is to tell the compiler that the five functions listed at the end of the form will only be passed floats. The compiler may use this information to create more efficient code.
Some Lisps do not complain about this declaration (ABCL, CCL, ECL, LispWorks, CLISP), but SBCL will not accept this declaration in the default configuration. SBCL can be made to accept it by placing
(sb-ext:unlock-package 'common-lisp)
in the .sbclrc initialization file. That's what I've been doing for the last year or so. I assume that this is needed because +, -, etc. are in that package, and the code alters these functions' declarations.
My question is: Can declaring the function type of built-in functions such as + and min have a beneficial effect on compiled code in SBCL? (If it can, then why does SBCL complain about these declarations by default?) Would I be better off removing such ftype declarations, and then getting rid of the unlock-package line in .sbclrc?
Thanks.

My understanding is that the purpose of this is to tell the compiler that the five functions listed at the end of the form will only be passed floats. The compiler may use this information to create more efficient code.
Also, they will only return floats. With certain optimization settings, a Common Lisp compiler does not generate runtime checks and may generate code only for float computations. SBCL may also show compile-time warnings in cases where it detects that code violates type declarations.
It's also a source of errors, since from then on (within the scope of the declaration) basic functions like + and - are declared not to work on other number types (integer, complex, ...).
So, what is the purpose of these declarations? Since this is portable code (and most implementations don't do compile-time type checking), it can only be for optimization purposes. Some of that might not be necessary in SBCL, since it uses type inference.
Why does SBCL not allow altering the built-in functionality by default? It is to prevent you from shooting yourself in the foot: you are altering the base language, and basic numeric operations may now lead to errors.
Ways to deal with that:
use only local declarations, don't alter the language globally. You indicate that these are only locally declared - that's good.
declare the values of variables instead
write special functions for the float case and declare them inline (see the sketch after this list).
only unlock the package CL during compilation of these few functions; keep it locked afterwards.
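For example, a minimal sketch of the second and third options, assuming double-floats; float+ and fast-sum are hypothetical names, not part of the original code:
(declaim (inline float+))
(defun float+ (a b)
  ;; a small helper for the float-only case, requested to be inlined
  (declare (double-float a b)
           (optimize (speed 3)))
  (+ a b))

(defun fast-sum (x y z)
  ;; declare the types of the variables instead of redeclaring CL:+ itself
  (declare (double-float x y z)
           (optimize (speed 3)))
  (float+ (float+ x y) z))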
My question is: Can declaring the function type of built-in functions such as + and min have a beneficial effect on compiled code in SBCL?
You can check that by looking at the disassembled code and also by profiling. Make sure that you compile the function with the right optimization setting. In Common Lisp the function DISASSEMBLE should show you machine code in a readable way. The SBCL compiler should also tell you if it can't optimize the compiled code.
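For instance, a hedged sketch of such a check (the exact output of DISASSEMBLE is implementation- and platform-specific):
(defun add3 (x y z)
  (declare (double-float x y z)
           (optimize (speed 3) (safety 0)))
  (+ x y z))

(disassemble #'add3)
;; prints the compiled machine code; with the declarations in place you would
;; expect unboxed float additions rather than calls to a generic + routine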

Related

thy_goal_defn and keywords in Isabelle/HOL

I'm trying to get a good understanding of how math is built up in Isabelle. For whatever reason, all the tutorials/manuals hide a lot of the implementation details of basic types, such as how the natural numbers, integers, rationals, and reals are constructed. When looking at the src/HOL directory and examining the .thy files, I've encountered code blocks such as:
keywords
"print_quotmapsQ3" "print_quotientsQ3" "print_quotconsts" :: diag and
"quotient_type" :: thy_goal_defn and "/" and
"quotient_definition" :: thy_goal_defn
begin
in Quotient.thy. Here, keywords is being used so that later you can define a type as:
quotient_type rat = "int * int" / partial: "ratrel"
and other related definitions. I haven't been able to figure out how the "keywords" feature works. It's not particularly obvious from the code, and the only documentation I can find is in the Isabelle/Isar Reference Manual where the following is written:
"The keywords specification declares outer syntax (chapter 3) that is introduced in this theory later on (rare in end-user applications). Both minor keywords and major keywords of the Isar command lan- guage need to be specified, in order to make parsing of proof docu- ments work properly. Command keywords need to be classified ac- cording to their structural role in the formal text. Examples may be seen in Isabelle/HOL sources itself, such as keywords "typedef" :: thy_goal_defn or keywords "datatype" :: thy_defn for theory-level definitions with and without proof, respectively." (p. 91)
This raises the question of what a theory-level definition is, which I haven't been able to figure out.
Isabelle's surface language, Isar, is extensible in multiple dimensions. In particular, a significant chunk of the keywords you'd usually use in day-to-day formalizations are defined in userspace. This sets Isar apart from many other programming languages, where syntax is fixed.
Roughly speaking, a theory file in Isabelle consists of two parts:
The header, which can be parsed statically, i.e. without running any custom code.
The contents, where e.g. logical definitions (types, constants, ...) and proofs can be made.
Parsing of the contents happens in two phases:
First, the command structure is being parsed. This can be done by looking at the table of all the keywords that exist (those are declared in the header). There are various different types of keywords (as the manual points out). Command keywords start a new atomic chunk in the theory. (Consequently, theory contents can be seen as a sequence of commands.)
Second, the commands themselves are parsed, by using custom parsing code specified by whoever declared the corresponding keyword. This will execute any action, e.g. actually defining a type in the theory when the keyword typedef is encountered.
Commands can be classified according to the context in which they can appear. Top-level commands may only appear – well – on the top-level of a theory. Other commands may be freely nested in local contexts. Yet other commands do not modify the theory, but only print diagnostic output (diag). Theory processing in Isabelle takes that into account when e.g. parallelizing execution of a theory.
The example you mentioned, thy_goal_defn, is a keyword that adds some definitions to the theory and also enters proof mode, because quotient_type requires some proofs about the wellformedness of the definition.

Why use macros in Julia?

I was reading up on the documentation of macros and ran into the following in the "Hold up: why macros?" section. The reasoning given to use macros is as follows:
Macros are necessary because they execute when code is parsed,
therefore, macros allow the programmer to generate and include
fragments of customized code before the full program is run
This leads me to wonder why someone would want to use "generate and include fragments of customized code before the full program is run". Can someone provide context as to why this would be beneficial and/or other good use cases for macros?
Let me give you my view on macros.
A macro basically is a code -> code function. It takes code (a Julia expression) as input and spits out code (a different Julia expression).
Why is this useful? It has multiple purposes:
compile-time copy-and-paste: You don't have to write the same piece of code multiple times but instead can define a short macro that writes it for you wherever you put it. (example)
domain specific language (DSL): You can create special syntax that after the macro's code -> code transform is replaced by pure Julia constructs. This is used in many packages to define special syntax, for example here and here.
code generation: Imagine you want to write a really long piece of code which, although being long, is very simple because it has some kind of pattern that repeats itself rather trivially. Writing that code by hand can be a pain (or even practically impossible). A macro can programmatically generate the code for you. One example is for-loop unrolling (see here and here). But even the @time macro isn't doing much more than just putting a bunch of Base.time_ns() function calls around the provided Julia expression.
special string parsing: If you type the literal 3.2 in Julia it will be parsed and interpreted as a Float64. Now, imagine you want to supply a number literal that goes beyond Float64 precision but would fit into a BigFloat. Typing big(3.123124812498124812498) won't work, because the literal number is first interpreted as a Float64 and then handed to the big function. Instead you need a way to tell Julia at parse time that this should become a BigFloat. This is handled by the @big_str macro which (for convenience) can also be written as big"3.2". The latter is just syntactic sugar.
There might be many more applications of macros, but those are the most important to me.
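To make the code -> code idea concrete, here is a hypothetical, stripped-down timing macro (not the real implementation of @time), showing how a macro wraps the expression it receives in extra code before anything runs:
macro mytime(ex)
    # the macro receives the expression unevaluated and returns new code
    return quote
        t0 = Base.time_ns()          # record the start time
        result = $(esc(ex))          # splice in the user's expression
        println("elapsed: ", (Base.time_ns() - t0) / 1e9, " s")
        result
    end
end

@mytime sum(rand(1_000_000))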
Let me end by referencing Steven G. Johnson's great talk at JuliaCon 2019:
Most of the time, don't do metaprogramming :)

Knowing when what you're looking at must be a macro

I know there is macro-function, explained here, which allows you to check, but is it also possible in simply reading lisp source to sometimes infer of what you're looking at "that must be a macro"? (assuming of course you have never seen the function/macro before).
I'm fairly sure the answer is yes, but as this seems so fundamental, I thought worth asking, especially because any nuances on this may be valuable & interesting to know about.
In Paul Graham's ANSI Common Lisp, p70, he is describing how to use defstruct.
When I see (defstruct point x y), were I to know absolutely nothing about what defstruct was, this could just as well be a function.
But when I see
(defstruct polemic
  (subject "foo")
  (effect "bar"))
I know that must be a macro because (let's assume) I also know that subject and effect are not defined as functions. (I know that because they signal an undefined-function error when called "at the top level", if that's the right term.)
If the two list arguments to defstruct above were quoted, it would not be so simple. Because they're not quoted, it must be a macro.
Is it as simple as that?
I've changed the field names slightly from those used in the book to make this question clearer.
Finally, Graham writes:
"We can specify default values for structure fields by enclosing the field name and a default expression in a list in the original definition"
What I'm noticing is that that's true, but it is not a (quoted) list. Would any readers of this post have phrased the above sentence at all differently, given that macros haven't been introduced in the book yet (though I have a basic awareness of what they are)?
My feeling is that it's not a "data list" those default expressions are enclosed in (apologies for the bad terminology); I'm seeking how to conceptualise this correctly.
In general, you're right: if there's some nesting inside the call and you are sure that the cars of the nested lists aren't functions - it's a macro.
Also, almost always, def-something and with-something are macros.
But there's no guarantee. The question is, what are you trying to accomplish? Some code walking/transformation, or external processing (like in an editor)? For the latter, you should keep in mind that full control is possible only if you perform code evaluation, although heuristics (like in Emacs) can take you pretty far. Or perhaps you just want to develop your intuition for faster code reading...
There is a set of conventions that identify quite cleary what forms are supposed to be macros, simply by mimicking the syntax of existing macros or special operators of CL.
For example, the following is a mix of various imaginary macros, but even without knowing their definition, the code shouldn't be too hard to figure out:
(defun/typed example ((id (integer 0 10)))
  (with-connection (connection (connect id))
    (do-events (event connection)
      (event-case event
        (:quit (&optional code) (return code))))))
The usual advice about macros is to avoid them if possible, so if you spot something that doesn't make sense as a lisp expression, it probably is, or is enclosed in, a macro.
(defstruct point x y)
[...] were I to know absolutely nothing about what defstruct was, this could just as well be a function.
There are various hints that this is not a function. First of all, the name starts with def. Then, if defstruct was a function, then point, x and y would all be evaluated before calling the function, and that means the code would be relying on global variables, even though they are not wearing earmuffs (e.g. *point*, *x*, *y*), and you probably won't find any definition for them in the preceding forms (or later in the same compilation unit). Also, if it was a function, the result would be discarded directly since it is not used (this is a toplevel form). That only indicates the probable presence of side-effects, but still, this would be unusual.
A top-level function with side-effects would look like this instead, with quoted data:
(register-struct 'point '(x y))
Finally, there are cases where you cannot easily guess if you are using a macro or a function:
(my-get object :slot)
This could be a function call, or you could have a macro that turns the above to (aref object 0) (assuming :slot is the zeroth slot in object, because all your objects are assumed to be of a certain custom type backed by a vector). You could also have compiler macros. In case of doubt, try to macroexpand it and look at the documentation.
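When in doubt at the REPL, a quick check might look like this (the exact expansion of defstruct is long and implementation-specific, so the output below is only indicative):
(macro-function 'defstruct)   ; => a function object, so DEFSTRUCT names a macro
(macro-function 'list)        ; => NIL, LIST is an ordinary function
(macroexpand-1 '(defstruct point x y))
;; => a large PROGN form (implementation-specific), and T as the second value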

Replacing an ordinary function with a generic function

I'd like to use names such as elt, nth and mapcar with a new data structure that I am prototyping, but these names designate ordinary functions and so, I think, would need to be redefined as generic functions.
Presumably it's bad form to redefine these names?
Is there a way to tell defgeneric not to generate a program error and to go ahead and replace the function binding?
Is there a good reason for these not being generic functions or is just historic?
What's the considered wisdom and best practice here please?
If you are using SBCL or ABCL, and aren't concerned with ANSI compliance, you could investigate Extensible Sequences:
http://www.sbcl.org/manual/#Extensible-Sequences
http://www.doc.gold.ac.uk/~mas01cr/papers/ilc2007/sequences-20070301.pdf
...you can't redefine functions in the COMMON-LISP package, but you could create a new package and shadow the imports of the functions you want to redefine.
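A minimal sketch of the shadowing approach, assuming a hypothetical MY-SEQUENCES package (the names are made up for illustration):
(defpackage :my-sequences
  (:use :common-lisp)
  (:shadow #:elt)                ; our own ELT symbol shadows CL:ELT
  (:export #:elt))

(in-package :my-sequences)

(defgeneric elt (sequence index)
  (:documentation "Generic ELT, extensible for new data structures."))

;; fall back to the standard function for built-in sequences
(defmethod elt ((sequence sequence) index)
  (cl:elt sequence index))

;; methods for your own prototype data structure would be added here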
Is there a good reason for these not being generic functions or is just historic?
Common Lisp has some layers of language in some of its areas. Higher-level parts of the software might need to be built on lower-level constructs.
One of its goals was being fast enough for a range of applications.
Common Lisp also introduced the idea of sequences, the abstraction over lists and vectors, at a time, when the language didn't have an object-system. CLOS came several years after the initial Common Lisp design.
Take for example something like equality - for numbers.
Lisp has =:
(= a b)
That's the fastest way to compare numbers. = is also defined only for numbers.
Then there are eql, equal and equalp. Those work for numbers, but also for some other data types.
Now, if you need more speed, you can declare the types and tell the compiler to generate faster code:
(locally
  (declare (fixnum a b)
           (optimize (speed 3) (safety 0)))
  (= a b))
So, why is = not a CLOS generic function?
a) it was introduced when CLOS did not exist
but equally important:
b) in Common Lisp it wasn't known (and it still isn't) how to make a CLOS generic function = as fast as a non-generic function for typical usage scenarios - while preserving dynamic typing and extensibility
CLOS generic functions simply have a speed penalty: the runtime dispatch costs.
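For contrast, a hypothetical generic counterpart of = might look like the sketch below; every call dispatches on the classes of both arguments at run time, which is where the overhead comes from, but it is also what makes it extensible:
(defgeneric gen= (a b)
  (:documentation "A made-up, extensible equality; not part of Common Lisp."))

(defmethod gen= ((a number) (b number))
  (= a b))

;; users can extend it to their own types without touching existing code
(defmethod gen= ((a string) (b string))
  (string= a b))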
CLOS is best used for higher level code, which then really benefits from features like extensibility, multi-dispatch, inheritance/combinations. Generic functions should be used for defined generic behavior - not as collections of similar methods.
With better implementation technology, implementation-specific language enhancements, etc. it might be possible to increase the range of code which can be written in a performant way using CLOS. This has been tried with programming languages like Dylan and Julia.
Presumably it's bad form to redefine these names?
Common Lisp implementations don't let you replace them just so. Be aware that your replacement functions should be implemented in a way which works consistently with the old functions. Also, old versions could be inlined in some way and not be replaceable everywhere.
Is there a way to tell defgeneric not to generate a program error and to go ahead and replace the function binding?
You would need to make sure that the replacement works while you are replacing it. The code that replaces functions might itself use the functions you are replacing.
Still, implementations allow you to replace CL functions - but this is implementation specific. For example LispWorks provides the variables lispworks:*packages-for-warn-on-redefinition* and lispworks:*handle-warn-on-redefinition*. One can bind them or change them globally.
What's the considered wisdom and best practice here please?
There are two approaches:
use implementation specific ways to replace standard Common Lisp functions
This can be dangerous. Plus you need to support it for all implementations of CL you want to use...
use a language package, where you define your new language. Here this would be standard Common Lisp plus your extensions/changes. Export everything the user would use. In your software use this package instead of CL.

What does it mean to "open code" something in Common Lisp?

In the SBCL user manual there are several references to the term "open code". Common Lisp hackers also use this term when referring to optimizing code.
Could you please explain what it means to "open code" something and give an example of how it works?
What It Is
Open-coding, AKA inlining, means replacing function calls with inline assembly.
The idea is that funcall and apply are expensive (they require saving and restoring the stack &c) and replacing them with the few operations which constitute the function can be beneficial.
E.g., the function 1+ is a single instruction when the argument is a fixnum (which it usually is in practice), so turning the funcall into the two parallel branches (fixnum and otherwise) would be a win.
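One way to see this in practice (output is implementation-specific; the function name is made up):
(defun bump (n)
  (declare (fixnum n)
           (optimize (speed 3) (safety 0)))
  (1+ n))

(disassemble #'bump)
;; on most implementations this shows an inline increment instruction,
;; not a call to the function 1+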
How to Control it
Declarations
The user can control this optimization explicitly via the inline declaration (which implementations are free to ignore).
The user can also influence this optimization by the optimize declaration.
Both will affect inlining the code of a function defined just as a function (see below).
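A minimal sketch of the declaration route, using the last1f function that also appears below:
(declaim (inline last1f))        ; ask the compiler to inline LAST1F (may be ignored)
(defun last1f (list)
  (car (last list)))

(defun demo (list)
  (declare (optimize (speed 3)))
  (last1f list))                 ; may be open-coded here instead of a full call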
Macros
The "old" way is to implement the function as a macro. E.g., instead of
(defun last1f (list)
  (car (last list)))
write
(defmacro last1m (list)
  `(car (last ,list)))
and last1m will always be open-coded. The problem with this approach is that you cannot use last1m as a function - you cannot pass it to, say, mapcar.
Thus Common Lisp has an alternative way - compiler macros, which tell the compiler how to transform the form before compiling it:
(define-compiler-macro last1f (list)
  ;; use in conjunction with (defun last1f ...)
  `(car (last ,list)))
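Unlike the macro version, last1f remains a genuine function, so it can still be passed around, while direct calls written out in source code can get the open-coded expansion:
(mapcar #'last1f '((1 2 3) (4 5 6)))   ; => (3 6), using the ordinary function object
;; a direct call like (last1f x) inside compiled code may be rewritten
;; by the compiler macro into (car (last x))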
See also the excellent examples in the aforelinked CLHS page.
Its Effects on Optimization
A comment asked about the effects of inlining, i.e., what optimizations result from it. E.g., constant propagation in addition to eliminating a function call.
The answer to this question is left to implementations.
IOW, the CL standard does not specify what optimizations must be done.
However, Minimal Compilation implies that if an implementation does something (e.g. constant folding), it will be done for compiler macros too.
For more details, you should compare the results of disassemble with and without the declarations and whatnot and see the effects.
For explanations, you should ask the vendor (e.g., by using the appropriate tag here - e.g., sbcl, clisp, &c).

Resources