Upon using (make-package 'test) (in-package test) in SBCL and CCL implementations, I've noticed that SBCL requires (cl:defun foo () (...)) or (cl:describe <symbol name here>) while CCL does not require any colons or double colons to use built-in symbols. It is my understanding that external symbols must be accessed with one colon, even if they are built-in. However CCL seems to work differently in this respect.
This confuses me somewhat regarding the use of external symbols. Are external symbols supposed to be available without any colons or is CCL just using/importing/inheriting automatically for convenience-sake?
Also, are there any more of these small but significant differences between the implementations regarding symbols and packages?
The ANSI CL standard does not define which packages are used when you create a package.
At some point in time, SBCL deviated from common practice, but still follows ANSI CL standards.
Using packages
Using packages in some other package means making their symbols available in that package.
You can get a package's use list by calling the function package-use-list.
It is undefined in the ANSI Common Lisp standard which packages a new package uses by default and also if it uses any at all.
Different common practice in implementations
There are now two common practices in implementations:
Use COMMON-LISP and some implementation-specific packages. CCL does that.
Example in CCL:
? (package-use-list (make-package "FOOBAR"))
(#<Package "CCL"> #<Package "COMMON-LISP">)
LispWorks:
CL-USER 17 > (package-use-list (make-package "FOOBAR"))
(#<The COMMON-LISP package, 0/4 internal, 978/1024 external>
#<The HARLEQUIN-COMMON-LISP package, 0/4 internal, 365/512 external>
#<The LISPWORKS package, 0/4 internal, 226/256 external>)
Use no package. SBCL does that. If you want a new package to use the package COMMON-LISP, then you have to explicitly request that.
Example in SBCL:
* (package-use-list (make-package "FOOBAR"))
NIL
ABCL:
CL-USER(1): (package-use-list (make-package "FOOBAR"))
NIL
Writing portable code
Thus in SBCL, and therefore in portable Common Lisp, you need to tell Lisp explicitly which packages should be used. To have a new package use the COMMON-LISP package and only that package, write:
(make-package "FOO" :use '("COMMON-LISP"))
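The same explicit request can be made with DEFPACKAGE. The following is a minimal sketch, not taken from the original answer; the package name PORTABLE-DEMO and the function GREET are hypothetical and exist only for illustration. Because the use list is stated explicitly, it should behave the same way in SBCL, CCL and other conforming implementations.

```lisp
;; "PORTABLE-DEMO" and GREET are hypothetical names used only for this sketch.
(defpackage "PORTABLE-DEMO"
  (:use "COMMON-LISP")          ; request COMMON-LISP explicitly, nothing else
  (:export "GREET"))

(in-package "PORTABLE-DEMO")

;; DEFUN and FORMAT need no CL: prefix here, because COMMON-LISP is used.
(defun greet (name)
  (format nil "Hello, ~A!" name))

(in-package "COMMON-LISP-USER")

;; The use list contains the COMMON-LISP package and nothing else,
;; regardless of the implementation's default.
(print (package-use-list (find-package "PORTABLE-DEMO")))

;; From another package, the exported symbol is reached with one colon.
(print (portable-demo:greet "world"))
```

Stating the :use option every time removes the dependence on the implementation's default, which is the point of the advice above.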
Background
The original idea in the first Common Lisp was that one could write (in-package "FOO") at the REPL and the package was created with sensible defaults and one was directly in that package. The defaults were usually the package for the language (at that time called "LISP") and packages for common extensions (for example with CLOS+MOP, threads, ...).
Later, Common Lisp was changed so that IN-PACKAGE no longer creates a package, and the standard now leaves open which packages a newly created package uses, or whether it uses any at all. The SBCL maintainers then thought: instead of supporting common practice (which is not mentioned in the standard), provide a more neutral and predictable behaviour of using no package upon package creation.
Other differences
Most other differences in package systems in Common Lisp are around extensions to the standard. Examples:
hierarchical/nested packages
package prefixes to whole forms (not just symbols)
A bigger and incompatible change provided by some implementations as an option:
lowercase of all existing symbols and lowercasing reader. The standard defines symbols to be internally uppercase by default.
Undefined:
garbage collection of otherwise unreferenced, but interned, symbols
Related
I just started learning Common Lisp a few weeks ago, so sorry if this is an obvious question. How can I load modules programmatically? I have a directory of "tasks", each a Lisp program, and would like to import each and run a specific function that they all contain.
I found out a way to iterate over a directory (via UIOP:DIRECTORY-FILES). But I'm stuck trying to figure out a way to load a module "as" a specific name, as in Python. That would allow me to load "module-1.lisp" as mod and then load "module-2.lisp" as mod in a loop.
Pseudocode:
for path in directory
(load path as mod)
(mod:function)
If there is a better way to achieve what I am trying to do, feel free to say so! Thanks in advance for any help!
Quick summary of packages and systems
"Loading as" is not something meaningful in Common Lisp, because loading is not the same as defining a new system or package. In Python, a file is a module etc. but not in Lisp.
Lisp has a concept of packages, which are namespaces. They are used to organize symbols (symbols are first-class values). In other words, they use, export or maybe shadow symbols from other packages and that's it. When you evaluate (in-package pack), you can write all accessible symbols from pack directly, like my-fun, otherwise you need to fully qualify symbols, as some-package:their-func.
There are extensions that allow you to import a package under a different name, but this is not standard (see https://github.com/phoe/trivial-package-local-nicknames).
Loading a Lisp script is not necessarily the same as defining a package; it depends on what is in the file (whether it has defpackage forms).
When you want to organize your source code in Lisp, you define a system, using ASDF. A system lists all its dependencies, its components (files) and the dependencies between those components. That's how you describe in which order files should be loaded, compiled, tested, etc.
Packages and systems are independent, but in small systems you often have the same name for a system and the single package it defines. For larger systems there might be multiple packages.
See chapter 21 of Practical Common Lisp, "Programming in the Large: Packages and Symbols".
Your question
Each of your files needs to define a package, to avoid polluting a single namespace, for example:
(defpackage :common-name.01
  (:use :cl :utils)
  (:export #:run-me))
(in-package :common-name.01)
;; some code
But once they are all loaded, either using your approach or by defining a proper ASDF system, you want to be able to access all the RUN-ME functions, in all the packages.
You can write some introspective code that lists packages, etc., but I think a better approach would be to have a way for each of your files to declare that its RUN-ME function should be registered in your framework, like test frameworks do.
For example, at the end of your files, you could write:
(provide-function 'run-me)
This assumes that e.g. your utils package defines and exports a function named provide-function that stores values in a central registry.
For example, in utils.lisp:
(defpackage :utils
  (:use :cl)
  (:export #:provide-function
           #:run-all))
(in-package :utils)
(defvar *all-interesting-functions* nil)
(defun provide-function (f)
  (pushnew f *all-interesting-functions* :test #'eql))
And when you want to run all the functions, you can iterate over this list:
(defun run-all ()
  (mapcar #'funcall *all-interesting-functions*))
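Put together, the registry approach might look like the sketch below. It is an illustration under assumptions, not the definitive layout: the package UTILS-DEMO, the function LOAD-TASKS and the second task package COMMON-NAME.02 are hypothetical additions, and the two task packages are defined inline here, whereas in practice each would live in its own file. LOAD-TASKS uses the standard DIRECTORY function; UIOP:DIRECTORY-FILES from the question would serve the same purpose.

```lisp
;; UTILS-DEMO is a hypothetical stand-in for the utils package above.
(defpackage :utils-demo
  (:use :cl)
  (:export #:provide-function #:run-all #:load-tasks))
(in-package :utils-demo)

(defvar *all-interesting-functions* nil
  "Function-naming symbols registered by the task files.")

(defun provide-function (f)
  ;; RUN-ME in one package and RUN-ME in another are different symbols,
  ;; so PUSHNEW keeps one entry per task package.
  (pushnew f *all-interesting-functions*))

(defun run-all ()
  ;; PUSHNEW adds to the front, so reverse to run in registration order.
  (mapcar #'funcall (reverse *all-interesting-functions*)))

(defun load-tasks (directory)
  "Load every .lisp file directly inside DIRECTORY (a directory pathname)."
  (dolist (path (directory (merge-pathnames "*.lisp" directory)))
    (load path)))

;; What a first task file would contain:
(defpackage :common-name.01
  (:use :cl :utils-demo)
  (:export #:run-me))
(in-package :common-name.01)
(defun run-me () :result-from-01)
(provide-function 'run-me)

;; What a second task file would contain (hypothetical):
(defpackage :common-name.02
  (:use :cl :utils-demo)
  (:export #:run-me))
(in-package :common-name.02)
(defun run-me () :result-from-02)
(provide-function 'run-me)

(in-package :cl-user)

;; Runs both RUN-ME functions without naming their packages.
(print (utils-demo:run-all))
```

With real task files, the driver would call (utils-demo:load-tasks #p"/some/dir/") first, where the directory is a placeholder, and then (utils-demo:run-all). Registering symbols rather than function objects means a task that is reloaded and redefined is still picked up through the same registry entry.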
I have a function foo defined in a package my-package:
(in-package :my-package)
(defun foo (a)
  (if (eql a 'value1)
      (do-this)
      (do-the-other)))
When I call this function from a different package I have to qualify the parameter with the package name:
(in-package :cl-user)
(my-package:foo 'my-package::value1)
but this is rather ugly. I want to share the symbol value1 with all other packages.
I found one workaround which is to import the symbol value1, but this only works if it has been already defined in the other package.
Another possibility is to pass strings, "value1", but again, this is just a patch.
What is the best way to share symbols across packages?
Thanks for your help.
Use a keyword symbol, which you can always write without naming its package (KEYWORD):
(foo:bar :value1)
Keyword symbols live in the KEYWORD package, evaluate to themselves, are automatically exported, and can be written without the package name.
Since a keyword symbol evaluates to itself, you don't even have to quote it, but you can:
(foo:bar ':value1)
Alternative: short package names
Sometimes it might be useful to have a symbol in a specific package. Then I would use a short package name, which you can also define as a nickname. See the options on DEFPACKAGE. For example the package color-graphics could have the nickname cg.
Then one would write:
(foo:bar 'cg:green)
Since it is a normal symbol, you have to quote it, otherwise it would be a variable.
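As a small illustration of why the keyword works from any package, here is a sketch that rewrites the function from the question to compare against :value1. The package name MY-PACKAGE-DEMO and the return values are hypothetical stand-ins for the original my-package, do-this and do-the-other.

```lisp
;; MY-PACKAGE-DEMO is a hypothetical stand-in for my-package in the question.
(defpackage :my-package-demo
  (:use :cl)
  (:export #:foo))
(in-package :my-package-demo)

(defun foo (a)
  ;; :VALUE1 is the same symbol no matter which package the caller is in.
  (if (eql a :value1)
      :did-this
      :did-the-other))

(in-package :cl-user)

;; The keyword needs neither a package prefix nor a quote.
(print (my-package-demo:foo :value1))

;; An unqualified, quoted VALUE1 is read as CL-USER::VALUE1,
;; which is a different symbol, so the other branch is taken.
(print (my-package-demo:foo 'value1))
```

The second call shows the original problem in miniature: a plain symbol belongs to whichever package was current when it was read, while a keyword always belongs to KEYWORD.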
I've set up Quicklisp to run whenever SBCL runs, and added the following line to the top of my file that I'm trying to use the priority-queue library in (as suggested in the answer to my earlier question, Priority queue for Common Lisp?). However, when I try to use it, I get errors from SBCL, saying that the functions from priority-queue are not defined! What am I missing?
For reference, I tried to write something like this:
(ql:quickload "priority-queue")
(defparameter *heap* (make-pqueue #'<))
And I get an error saying that make-pqueue is not defined.
In Common Lisp, anything that's named (a variable, a function, a macro) is attached to a symbol. In this case, you have a function which is attached to the symbol make-pqueue. Symbols are separated from each other using packages. This keeps collisions to a minimum and also allows for things like internal variables/functions that aren't exported by the package.
Sounds like you need to do one of three things:
Use the package name before the function: (priority-queue:make-pqueue #'<). This method is good if you want people reading your source to know exactly what code is being run. However, it can get cumbersome if you call the package's functions many times.
Use the priority-queue package in the current package you're in:
(use-package :priority-queue)
(make-pqueue #'<)
What this does is make every exported symbol of the priority-queue package accessible in the current package (most likely cl-user) without a prefix. While this is good for testing, you generally want to create your own package. See the next item.
Define your own package that uses priority-queue:
(defpackage :queue-test (:use :cl :priority-queue))
(in-package :queue-test)
(make-pqueue #'<)
Defining your own packages seems like a lot of work at first, but you'll start to like the separation you get, especially if you start integrating different pieces of your code together.
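To see the mechanics without installing anything, the sketch below replaces the library with a tiny stand-in package. TOY-QUEUE, MAKE-TOY-QUEUE and QUEUE-TEST-DEMO are hypothetical names invented for this example; they are not the real priority-queue API.

```lisp
;; Stand-in for a library package. The names are hypothetical and are
;; NOT the real priority-queue API.
(defpackage :toy-queue
  (:use :cl)
  (:export #:make-toy-queue))
(in-package :toy-queue)

(defun make-toy-queue (predicate)
  "Return a trivial queue representation: a list holding the predicate."
  (list :toy-queue predicate))

;; A package of our own that uses both COMMON-LISP and the library package.
(defpackage :queue-test-demo
  (:use :cl :toy-queue)
  (:export #:*heap*))
(in-package :queue-test-demo)

;; MAKE-TOY-QUEUE is inherited from TOY-QUEUE, so no prefix is needed here.
(defparameter *heap* (make-toy-queue #'<))

(in-package :cl-user)

;; From a package that does not use TOY-QUEUE, qualify the symbol instead.
(defparameter *other-heap* (toy-queue:make-toy-queue #'>))

;; FIND-SYMBOL reports how the symbol is accessible in QUEUE-TEST-DEMO.
(print (multiple-value-list
        (find-symbol "MAKE-TOY-QUEUE" :queue-test-demo)))
```

The second value returned by FIND-SYMBOL is :INHERITED, which reflects that using a package makes its external symbols accessible rather than copying them.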
The "Writing R Extensions" manual provides the following guidance on when to use Imports or Depends:
The general rules are
Packages whose namespace only is needed to load the package using library(pkgname) must be listed in the ‘Imports’ field and not in the ‘Depends’ field.
Packages that need to be attached to successfully load the package using library(pkgname) must be listed in the ‘Depends’ field, only.
Can someone provide a bit more clarity on this? How do I know when my package only needs namespaces loaded versus when I need a package to be attached? What are examples of both? I think the typical package is just a collection of functions that sometimes call functions in other packages (where some bit of work has already been coded-up). Is this scenario 1 or 2 above?
Edit
I wrote a blog post with a section on this specific topic (search for 'Imports v Depends'). The visuals make it a lot easier to understand.
"Imports" is safer than "Depends" (and also makes a package using it a 'better citizen' with respect to other packages that do use "Depends").
A "Depends" directive attempts to ensure that a function from another package is available by attaching the other package to the main search path (i.e. the list of environments returned by search()). This strategy can, however, be thwarted if another package, loaded later, places an identically named function earlier on the search path. Chambers (in SoDA) uses the example of the function "gam", which is found in both the gam and mgcv packages. If two other packages were loaded, one of them depending on gam and one depending on mgcv, the function found by calls to gam() would depend on the order in which those two packages were attached. Not good.
An "Imports" directive should be used for any supporting package whose functions are to be placed in <imports:packageName> (searched immediately after <namespace:packageName>), instead of on the regular search path. If either one of the packages in the example above used the "Imports" mechanism (which also requires import or importFrom directives in the NAMESPACE file), matters would be improved in two ways. (1) The package would itself gain control over which mgcv function is used. (2) By keeping the main search path clear of the imported objects, it would not even potentially break the other package's dependency on the other mgcv function.
This is why using namespaces is such a good practice, why it is now enforced by CRAN, and (in particular) why using "Imports" is safer than using "Depends".
Edited to add an important caveat:
There is one unfortunately common exception to the advice above: if your package relies on a package A which itself "Depends" on another package B, your package will likely need to attach A with a "Depends" directive.
This is because the functions in package A were written with the expectation that package B and its functions would be attached to the search() path.
A "Depends" directive will load and attach package A, at which point package A's own "Depends" directive will, in a chain reaction, cause package B to be loaded and attached as well. Functions in package A will then be able to find the functions in package B on which they rely.
An "Imports" directive will load but not attach package A and will neither load nor attach package B. ("Imports", after all, expects that package writers are using the namespace mechanism, and that package A will be using "Imports" to point to any functions in B that it needs access to.) Calls by your functions to any functions in package A which rely on functions in package B will consequently fail.
The only two solutions are to either:
Have your package attach package A using a "Depends" directive.
Better in the long run, contact the maintainer of package A and ask them to do a more careful job of constructing their namespace (in the words of Martin Morgan in this related answer).
Hadley Wickham gives an easy explanation (http://r-pkgs.had.co.nz/namespace.html):
Listing a package in either Depends or Imports ensures that it’s
installed when needed. The main difference is that where Imports just
loads the package, Depends attaches it. There are no other
differences. [...]
Unless there is a good reason otherwise, you should always list
packages in Imports not Depends. That’s because a good package is
self-contained, and minimises changes to the global environment
(including the search path). The only exception is if your package is
designed to be used in conjunction with another package. For example,
the analogue package builds on top of vegan. It’s not useful without
vegan, so it has vegan in Depends instead of Imports. Similarly,
ggplot2 should really Depend on scales, rather than Importing it.
Chambers, in SoDA (Software for Data Analysis), says to use 'Imports' when the package uses a 'namespace' mechanism; since all packages are now required to have one, the answer might now be to always use 'Imports'. In the past, packages could be loaded without actually having namespaces, and in that case you would have needed to use 'Depends'.
Here is a simple question to help you decide which to use:
Does your package require the end user to have direct access to the functions of another package?
NO -> Imports (most common answer)
YES -> Depends
The only time you should use 'Depends' is when your package is an add-on or companion to another package, where your end user will be using functions from both your package and the 'Depends' package in their code. If your end user will only be interfacing with your functions, and the other package will only be doing work behind the scenes, then use 'Imports' instead.
The caveat is that if you add a package to 'Imports', as you usually should, your code will need to refer to functions from that package using the full namespace syntax, e.g. dplyr::mutate(), instead of just mutate(), unless you also import them in your NAMESPACE file with import() or importFrom() directives. The qualified form makes the code a little clunkier to read, but it’s a small price to pay for better package hygiene.