In Common Lisp there is a famous built-in function called remove-if-not.
I could not find it in Racket's documentation.
Did I miss something? Does Racket offer this function with a different name?
This function is available in Racket under the much more standard name, filter. Its inverse, the equivalent of CL’s remove-if, is available as filter-not.
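For example, a quick sketch: the Common Lisp calls are shown as code, with their Racket counterparts in the comments.

(remove-if-not #'evenp '(1 2 3 4 5))   ; => (2 4)
(remove-if     #'evenp '(1 2 3 4 5))   ; => (1 3 5)

;; Racket equivalents:
;;   (filter     even? '(1 2 3 4 5))   ; => '(2 4)
;;   (filter-not even? '(1 2 3 4 5))   ; => '(1 3 5)   (filter-not comes from racket/list)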
Nowadays flatMap is the most widely used name for the corresponding operation on monad-like objects.
But I can't find where it first appeared or what popularized it.
The oldest appearance I know of is in Scala.
In Haskell it is called bind.
In category theory Greek notation is used.
Partial answer, which hopefully provides some useful "seed nodes" to start a more thorough search. My best guess:
1958 for map used for list processing,
1988 for flatten used in the context of monads,
2004 for flatMap used as an important method backing for-comprehensions in Scala.
The function/method name flatMap seems to be a portmanteau of flatten and map. This makes sense, because whenever M is some monad, A, B some types, and a: M[A] a value and f: A => M[B] a function, then the implementations of map, flatMap and flatten should satisfy
a.flatMap(f) = a.map(f).flatten
(in Scala-syntax).
Let's first consider the two components, map and flatten, separately.
Map
The map-function seems to have been used to map over lists since time immemorial. My best guess would be that it came from Lisp (around 1958), and then spread to all other languages that had anything resembling higher-order functions.
Flatten
Given how many things are represented by lists in Lisp, I assume that flatten has also been used there for list processing.
The usage of flatten in the context of monads must be much more recent, because monads themselves were introduced into programming quite a bit later. If we are looking for uses of the word "flatten" in the context of monadic computations, we should probably at least check the papers by Eugenio Moggi. Indeed, in "Computational Lambda-Calculus and Monads" from 1988, he uses the formulation:
Remark 2.2: Intuitively eta_A: A -> TA gives the inclusion of values into computations, while mu_A: T^2 A -> TA flatten a computation of a computation into a computation.
(typesetting changed by me, emphasis mine, text in italic as in original). I think it's interesting that Moggi talks about flattening computations, and not just lists.
Math notation / "Greek"
On the Greek used in mathematical notation: in category theory, the more common way to introduce monads is through the natural transformations that correspond to pure and flatten; the morphisms corresponding to flatMap are de-emphasized. However, nobody calls it "flatten". For example, Mac Lane calls the natural transformation corresponding to the method pure "unit" (not to be confused with the method unit), and flatten is usually called "multiplication", in analogy with monoids. One might investigate further whether it was different when the "triple" terminology was more prevalent.
flatMap
To find the origins of the flatMap portmanteau word, I'd propose to start with the most prominent popularizer today, and then try to backtrack from there. Apparently, flatMap is a Scala meme, so it seems reasonable to start from Scala. One might check the standard libraries (especially the List data structure) of the usual suspects: the languages that influenced Scala. These "roots" are named in Chapter 1, section 1.4 in Odersky's "Programming in Scala":
C, C++ and C# are probably not where it came from.
In Java it was the other way around: flatMap came from Scala into version 1.8 of Java.
I can't say anything about Smalltalk.
Ruby definitely has flat_map on Enumerable, but I don't know anything about Ruby, and I don't want to dig into the source code to find out when it was introduced.
Algol and Simula: definitely not.
Strangely enough, ML (SML) seems to get by without flatMap; it only has concat (which is essentially the same as flatten). OCaml's lists also seem to have flatten, but no flatMap.
As you've already mentioned, Haskell had all this long ago, but in Haskell it is called bind and written as the operator >>=.
Erlang has flatmap on lists, but I'm not sure whether this is the origin or whether it was introduced later. The problem with Erlang is that it is from 1986; back then, there was no GitHub.
I can't say anything about Iswim, Beta and gbeta.
I think it would be fair to say that flatMap has been popularized by Scala, for two reasons:
flatMap took a prominent role in the design of Scala's collection library, and a few years later it turned out to generalize nicely to huge distributed collections (Apache Spark and similar tools).
flatMap became the favorite toy of everyone who decided to do functional programming on the JVM properly (Scalaz and libraries inspired by Scalaz, like Scala Cats).
To sum it up: the "flatten" terminology has been used in the context of monads since the very beginning. Later, it was combined with map into flatMap, and popularized by Scala, or more specifically by frameworks such as Apache Spark and Scalaz.
flatmap was introduced in Section 2.2.3, "Sequences as Conventional Interfaces", of "Structure and Interpretation of Computer Programs" as
(define (flatmap proc seq)
  (accumulate append nil (map proc seq)))
The first edition of the book appeared in 1985.
I'd like to use names such as elt, nth and mapcar with a new data structure that I am prototyping, but these names designate ordinary functions and so, I think, would need to be redefined as generic functions.
Presumably it's bad form to redefine these names?
Is there a way to tell defgeneric not to generate a program error and to go ahead and replace the function binding?
Is there a good reason for these not being generic functions, or is it just historical?
What's the considered wisdom and best practice here please?
If you are using SBCL or ABCL, and aren't concerned with ANSI compliance, you could investigate Extensible Sequences:
http://www.sbcl.org/manual/#Extensible-Sequences
http://www.doc.gold.ac.uk/~mas01cr/papers/ilc2007/sequences-20070301.pdf
...you can't redefine functions in the COMMON-LISP package, but you could create a new package and shadow the imports of the functions you want to redefine.
Is there a good reason for these not being generic functions, or is it just historical?
Common Lisp has several layers of language in some of its areas. Higher-level parts of the software might need to be built on lower-level constructs.
One of its goals was being fast enough for a range of applications.
Common Lisp also introduced the idea of sequences, the abstraction over lists and vectors, at a time when the language didn't have an object system. CLOS came several years after the initial Common Lisp design.
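For example, the standard sequence functions accept both representations (they dispatch on the type inside the function, not via CLOS):

(elt '(a b c) 1)                      ; => B
(elt #(a b c) 1)                      ; => B
(remove-if-not #'evenp '(1 2 3 4))    ; => (2 4)
(remove-if-not #'evenp #(1 2 3 4))    ; => #(2 4)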
Take for example something like equality - for numbers.
Lisp has =:
(= a b)
That's the fastest way to compare numbers. = is also defined only for numbers.
Then there are eql, equal and equalp. Those work for numbers, but also for some other data types.
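A few quick examples of how they differ:

(eql 3 3)             ; => T    (same number, same type)
(eql 1 1.0)           ; => NIL  (different numeric types)
(equal "abc" "abc")   ; => T    (also descends into conses and strings)
(equalp "ABC" "abc")  ; => T    (case-insensitive for strings)
(equalp 1 1.0)        ; => T    (compares numbers as = does)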
Now, if you need more speed, you can declare the types and tell the compiler to generate faster code:
(locally
  (declare (fixnum a b)
           (optimize (speed 3) (safety 0)))
  (= a b))
So, why is = not a CLOS generic function?
a) it was introduced when CLOS did not exist
but equally important:
b) in Common Lisp it wasn't known (and it still isn't) how to make a CLOS generic function = as fast as a non-generic function for typical usage scenarios - while preserving dynamic typing and extensibility
CLOS generic functions simply have a speed penalty: the runtime dispatch costs time.
CLOS is best used for higher-level code, which then really benefits from features like extensibility, multiple dispatch, and inheritance/method combinations. Generic functions should be used for defined generic behavior - not as collections of similar methods.
With better implementation technology, implementation-specific language enhancements, etc. it might be possible to increase the range of code which can be written in a performant way using CLOS. This has been tried with programming languages like Dylan and Julia.
Presumably it's bad form to redefine these names?
Common Lisp implementations don't let you replace them just so. Be aware that your replacement functions should be implemented in a way that works consistently with the old functions. Also, the old versions could be inlined in some places and then not be replaceable everywhere.
Is there a way to tell defgeneric not to generate a program error and to go ahead and replace the function binding?
You would need to make sure that the replacement works while you are replacing it: the code that replaces functions might itself use the very functions you are replacing.
Still, implementations allow you to replace CL functions - but this is implementation specific. For example LispWorks provides the variables lispworks:*packages-for-warn-on-redefinition* and lispworks:*handle-warn-on-redefinition*. One can bind them or change them globally.
What's the considered wisdom and best practice here please?
There are two approaches:
use implementation-specific ways to replace standard Common Lisp functions
This can be dangerous. Plus you need to support it for all implementations of CL you want to use...
use a language package, where you define your new language. Here this would be standard Common Lisp plus your extensions/changes. Export everything the user would use. In your software use this package instead of CL.
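A minimal sketch of what such a language package could look like (the package name MY-LISP and the fallback method are made up for illustration):

(defpackage :my-lisp
  (:use :common-lisp)
  (:shadow #:elt #:nth #:mapcar)    ; fresh symbols instead of CL:ELT etc.
  (:export #:elt #:nth #:mapcar))   ; in practice, re-export the rest of CL too

(in-package :my-lisp)

;; ELT now names a fresh symbol, so DEFGENERIC signals no error.
(defgeneric elt (sequence index))

;; Fall back to the standard function for built-in sequences ...
(defmethod elt ((sequence sequence) index)
  (cl:elt sequence index))

;; ... and add methods for your own data structures as needed.
;; Client packages then (:use :my-lisp) instead of (:use :cl).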
What are all Racket procedures that mutate state?
I'm trying to create a program with as few side effects as possible.
So, I'd do something like:
#lang racket/base
(provide (except-out (all-from-out racket/base) set! …more here…))
What else should I exclude besides set!?
Is there a complete list somewhere of all impure functions?
Oh, and the program also uses #lang racket/gui (which is mostly impure, from what I could gather). So that may be tricky...
Thank you.
There is no pre-built list of non-pure functions in Racket.
If you just refrain from using anything that has a ! in its name, you will get close.
Note that you can use mutable data structures and still be programming in a purely functional way - as long as you don't mutate them.
In the SBCL user manual there are several references to the term "open code". Common Lisp hackers also use this term when referring to optimizing code.
Could you please explain what it means to "open code" something and give an example of how it works?
What It Is
Open-coding, AKA inlining, means replacing a function call with the function's compiled code, emitted directly at the call site.
The idea is that funcall and apply are expensive (they require saving and restoring the stack &c) and replacing them with the few operations which constitute the function can be beneficial.
E.g., the function 1+ is a single instruction when the argument is a fixnum (which it usually is in practice), so turning the function call into two parallel branches (one for fixnums, one for everything else) would be a win.
How to Control it
Declarations
The user can control this optimization explicitly with the inline declaration (which implementations are free to ignore).
The user can also influence this optimization with the optimize declaration.
Both affect the inlining of a function defined just as an ordinary function (see below).
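A minimal sketch of both declarations, using the same toy function as the macro example below (implementations are free to ignore them):

(declaim (inline last1f))     ; request that calls to LAST1F be open-coded

(defun last1f (list)
  (car (last list)))

(defun caller (list)
  (declare (optimize (speed 3) (safety 1)))
  ;; At high SPEED the compiler is more likely to open-code this call,
  ;; compiling it as CAR of LAST instead of a full function call.
  (last1f list))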
Macros
The "old" way is to implement the function as a macro. E.g., instead of
(defun last1f (list)
  (car (last list)))
write
(defmacro last1m (list)
  `(car (last ,list)))
and last1m will always be open-coded. The problem with this approach is that you cannot use last1m as a function - you cannot pass it to, say, mapcar.
Thus Common Lisp has an alternative way - compiler macros, which tell the compiler how to transform the form before compiling it:
(define-compiler-macro last1f (list)
  ;; use in conjunction with (defun last1f ...)
  `(car (last ,list)))
See also the excellent examples on the CLHS page for define-compiler-macro.
Its Effects on Optimization
A comment asked about the effects of inlining, i.e., what optimizations result from it. E.g., constant propagation in addition to eliminating a function call.
The answer to this question is left to implementations.
IOW, the CL standard does not specify what optimizations must be done.
However, Minimal Compilation implies that if an implementation does something (e.g., constant folding), it will be done for compiler macros too.
For more details, you should compare the results of disassemble with and without the declarations and whatnot, and see the effects.
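Such a comparison could look like this (a sketch; the exact output is implementation specific):

(defun generic-bump (x)
  (1+ x))

(defun fixnum-bump (x)
  (declare (fixnum x)
           (optimize (speed 3) (safety 0)))
  (the fixnum (1+ x)))

(disassemble 'generic-bump)   ; general path: handles any kind of number
(disassemble 'fixnum-bump)    ; typically a single open-coded machine ADD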
For explanations, you should ask the vendor (e.g., by using the appropriate tag here - e.g., sbcl, clisp, &c).
Are there real-world examples of functions of unusually high order? For instance, mask in Haskell is of type ((forall a . IO a -> IO a) -> IO b) -> IO b. What is the purpose of such a function? Any language with a notion of a higher-order function is welcome.
For the sake of exactness, include only functions which are defined in public libraries or are in use in live code.
Okasaki exhibits a 6th order function:
Even Higher-Order Functions for Parsing, or Why Would Anyone Ever Want To Use a Sixth-Order Function? (Journal of Functional Programming, 1998)
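To make the counting concrete, here is a quick sketch in Common Lisp (the function names are made up): roughly, a first-order function takes only plain data, a second-order function takes a first-order function as an argument, and so on.

;; first-order: only data arguments
(defun add1 (x)
  (1+ x))

;; second-order: takes a first-order function
(defun apply-twice (f x)
  (funcall f (funcall f x)))

;; third-order: takes a second-order function as its argument
(defun run-with-apply-twice (combinator)
  (funcall combinator #'add1 10))

(run-with-apply-twice #'apply-twice)  ; => 12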