What kind of object is `...`? - r

I recently thought about the ... argument for a function and noticed that R does not allow to check the class of the object.
f <- function(...) {
class(...)
}
f(1, 2, 3)
## Error in class(...) : 3 arguments passed to 'class' which requires 1
Now with the quote
“To understand computations in R, two slogans are helpful:
• Everything that exists is an object. • Everything that happens is a
function call."
— John Chambers
in my head I'm wondering: What kind of object is ...?

What an interesting question!
Dot-dot-dot ... is an object (John Chambers is right!) and it's a type of pairlist. Well, I searched the documentation, so I'd like to share it with you:
R Language Definition document says:
The ‘...’ object type is stored as a type of pairlist. The components of ‘...’ can be accessed in the usual pairlist manner from C code, but is not easily accessed as an object in interpreted code. The object can be captured as a list.
Another chapter defines pairlists in detail:
Pairlist objects are similar to Lisp’s dotted-pair lists.
Pairlists are handled in the R language in exactly the same way as generic vectors (“lists”).
Help on Generic and Dotted Pairs says:
Almost all lists in R internally are Generic Vectors, whereas traditional dotted pair lists (as in LISP) remain available but rarely seen by users (except as formals of functions).
And a nice summary is here at Stack Overflow!

Related

Where is the complete documentation of set-dispatch-macro-character?

I am trying to find the complete documentation of set-dispatch-macro-character. I have tried the HyperSpec, but it seems to be incomplete; the exact specification of the new-function argument is not mentioned. It only says that new-function is a "function designator" without stating how many arguments the function designator takes and what those arguments are.
Since the HyperSpec is incomplete, what are some canonical reference materials that Common Lisp programmers use to get the exact specifications of Common Lisp built-ins?
The documentation of set-dispatch-macro-character contains the following cross-reference:
For more information about how the new-function is invoked, see Section 2.1.4.4 (Macro Characters).
The details you ask for are in that section. In particular, it says:
The reader macro function associated with the sub-character C2 is invoked with three arguments: the stream, the sub-character C2, and the infix parameter P.

Expressible vs denotable values

What programming language has a value that is expressible but not denotable. Also what would this imply?
I don't really understand the difference. At the moment I think it means a functional language because then you can't give variables values only point to them?
Is this completely wrong?
According to these lecture notes by David Schmidt:
Expressible values are values that can be produced by expressions in code, like strings, numbers, lambdas/anonymous functions (in languages that support them), etc.
Denotable values are values that can be named (bound to an identifier) and referred to later, like values of variables or named functions.
For example, a language can have syntax for declaring named functions, but no expression syntax for anonymous functions. So (if I understand that correctly) in this language, functions would be denotable but not expressible.
The only example I could find of values that are expressible but not denotable are error values (in some theoretical languages p.11), which can be produced by an expression (like 1/0) but cannot be bound to an identifier (saved in a variable).
(This assumes that the assignment statement propagates the error instead of simply storing the error value in the variable.)
Anonymous types are also somewhat similar. For example in C#, you can define an anonymous object, which has an anonymous type that cannot be bound to an identifier (is not denotable):
// anonymous objects can only be saved into a variable by using type inference
var obj = new { Name = "Farah", Kind = "Human" };

Reflecting on a Type parameter

I am trying to create a function
import Language.Reflection
foo : Type -> TT
I tried it by using the reflect tactic:
foo = proof
{
intro t
reflect t
}
but this reflects on the variable t itself:
*SOQuestion> foo
\t => P Bound (UN "t") (TType (UVar 41)) : Type -> TT
Reflection in Idris is a purely syntactic, compile-time only feature. To predict how it will work, you need to know about how Idris converts your program to its core language. Importantly, you won't be able to get ahold of reflected terms at runtime and reconstruct them like you would with Lisp. Here's how your program is compiled:
Internally, Idris creates a hole that will expect something of type Type -> TT.
It runs the proof script for foo in this state. We start with no assumptions and a goal of type Type -> TT. That is, there's a term being constructed which looks like ?rhs : Type => TT . rhs. The ?foo : ty => body syntax shows that there's a hole called foo whose eventual value will be available inside of body.
The step intro t creates a function whose argument is t : Type - this means that we now have a term like ?foo_body : TT . \t : Type => foo_body.
The reflect t step then fills the current hole by taking the term on its right-hand side and converting it to a TT. That term is in fact just a reference to the argument of the function, so you get the variable t. reflect, like all other proof script steps, only has access to the information that is available directly at compile time. Thus, the result of filling in foo_body with the reflection of the term t is P Bound (UN "t") (TType (UVar (-1))).
If you could do what you are wanting here, it would have major consequences both for understanding Idris code and for running it efficiently.
The loss in understanding would come from the inability to use parametricity to reason about the behavior of functions based on their types. All functions would effectively become potentially ad-hoc polymorphic, because they could (say) run differently on lists of strings than on lists of ints.
The loss in performance would come from representing enough type information to do the reflection. After Idris code is compiled, there is no type information left in it (unlike in a system such as the JVM or .NET or a dynamically typed system such as Python, where types have a runtime representation that code can access). In Idris, types can be very large, because they can contain arbitrary programs - this means that far more information would have to be maintained, and computation occurring at the type level would also have to be preserved and repeated at runtime.
If you're wanting to reflect on the structure of a type for further proof automation at compile time, take a look at the applyTactic tactic. Its argument should be a function that takes a reflected context and goal and gives back a new reflected tactic script. An example can be seen in the Data.Vect source.
So I suppose the summary is that Idris can't do what you want, and it probably never will be able to, but you might be able to make progress another way.

The relationship between quotation, reification and reflection

I recently get confused with quotation, reification and reflection. Someone could offer a good explanation about their relationship and differences (if any)?
Quoting
This is probably the easiest one. Consider what happens when you type the following into the REPL:
(+ a 1)
REPL stands for Read Eval Print Loop, so first it Reads this in. This is a list, so after reading we have a list of 3 elements containing: <the symbol "+"> <the symbol "a"> <the number 1>
The next step is evaluation. Evaluating a list in Common Lisp involves looking up the function (or macro) bound to the first item in the list. Since + is bound to a function and not a macro, it then evaluates each subsequent element in the list. Numbers evaluate to themselves, and "a" will evaluate to whatever it is bound to. Now that the arguments are evaluated, the function "+" is called with the results of the evaluation.
We then Print the result and Loop back to the Read step
So this is great, but what if we want something that, when evaluated, will end up as a list of 3 elements containing <the symbol "+"> <the symbol "a"> <the number 1>? The solution to this is quoting. Lisps in general have a special form called "quote" that takes a single argument, and the result is that argument, unevaluated. So
(quote (+ a 1))
will evaluate to that list. As some syntactic sugar, ' is treated the same as (quote ), so we can just write '(+ a 1).
Reification
Reification is a generic term that roughly means "Make an abstract concept" concrete. Specific to programming, when something is reified, it roughly means you can treat it as data (or "First-class object"). An example of this in lisp is functions the lambda expression gives you the ability to create a concrete, first-class object that represents the abstract concept of a function call. Another example is a CLOS class, which is itself a CLOS object, that represents the abstract concept of a class.
Reflection
To a certain degree, reflection is the opposite of Reification. Given something concrete, you want some information about it's abstract representation. Consider a Common Lisp Package object, which is a reification of the concept of a package, which is just a mapping from symbol names to symbols. You can use do-symbols to iterate over all of the symbols in a package, thus getting that information out at runtime.
Also, remember how I said lambda's were a reification of functions? Well "function-lambda-expression" is the reflection on functions.
The metaobject protocol (MOP) is a semi-standard way of doing all sorts of things with classes and objects. Among other things, it allows reflection on classes and objects.

Haskell "collections" language design

Why is the Haskell implementation so focused on linked lists?
For example, I know Data.Sequence is more efficient
with most of the list operations (except for the cons operation), and is used a lot;
syntactically, though, it is "hardly supported". Haskell has put a lot of effort into functional abstractions, such as the Functor and the Foldable class, but their syntax is not compatible with that of the default list.
If, in a project I want to optimize and replace my lists with sequences - or if I suddenly want support for infinite collections, and replace my sequences with lists - the resulting code changes are abhorrent.
So I guess my wondering can be made concrete in questions such as:
Why isn't the type of map equal to (Functor f) => (a -> b) -> f a -> f b?
Why can't the [] and (:) functions be used for, for example, the type in Data.Sequence?
I am really hoping there is some explanation for this, that doesn't include the words "backwards compatibility" or "it just grew that way", though if you think there isn't, please let me know. Any relevant language extensions are welcome as well.
Before getting into why, here's a summary of the problem and what you can do about it. The constructors [] and (:) are reserved for lists and cannot be redefined. If you plan to use the same code with multiple data types, then define or choose a type class representing the interface you want to support, and use methods from that class.
Here are some generalized functions that work on both lists and sequences. I don't know of a generalization of (:), but you could write your own.
fmap instead of map
mempty instead of []
mappend instead of (++)
If you plan to do a one-off data type replacement, then you can define your own names for things, and redefine them later.
-- For now, use lists
type List a = [a]
nil = []
cons x xs = x : xs
{- Switch to Seq in the future
-- type List a = Seq a
-- nil = empty
-- cons x xs = x <| xs
-}
Note that [] and (:) are constructors: you can also use them for pattern matching. Pattern matching is specific to one type constructor, so you can't extend a pattern to work on a new data type without rewriting the pattern-matchign code.
Why there's so much list-specific stuff in Haskell
Lists are commonly used to represent sequential computations, rather than data. In an imperative language, you might build a Set with a loop that creates elements and inserts them into the set one by one. In Haskell, you do the same thing by creating a list and then passing the list to Set.fromList. Since lists so closely match this abstraction of computation, they have a place that's unlikely to ever be superseded by another data structure.
The fact remains that some functions are list-specific when they could have been generic. Some common functions like map were made list-specific so that new users would have less to learn. In particular, they provide simpler and (it was decided) more understandable error messages. Since it's possible to use generic functions instead, the problem is really just a syntactic inconvenience. It's worth noting that Haskell language implementations have very little list-speficic code, so new data structures and methods can be just as efficient as the "built-in" ones.
There are several classes that are useful generalizations of lists:
Functor supplies fmap, a generalization of map.
Monoid supplies methods useful for collections with list-like structure. The empty list [] is generalized to other containers by mempty, and list concatenation (++) is generalized to other containers by mappend.
Applicative and Monad supply methods that are useful for interpreting collections as computations.
Traversable and Foldable supply useful methods for running computations over collections.
Of these, only Functor and Monad were in the influential Haskell 98 spec, so the others have been overlooked to varying degrees by library writers, depending on when the library was written and how actively it was maintained. The core libraries have been good about supporting new interfaces.
I remember reading somewhere that map is for lists by default since newcomers to Haskell would be put off if they made a mistake and saw a complex error about "Functors", which they have no idea about. Therefore, they have both map and fmap instead of just map.
EDIT: That "somewhere" is the Monad Reader Issue 13, page 20, footnote 3:
3You might ask why we need a separate map function. Why not just do away with the current
list-only map function, and rename fmap to map instead? Well, that’s a good question. The
usual argument is that someone just learning Haskell, when using map incorrectly, would much
rather see an error about lists than about Functors.
For (:), the (<|) function seems to be a replacement. I have no idea about [].
A nitpick, Data.Sequence isn't more efficient for "list operations", it is more efficient for sequence operations. That said, a lot of the functions in Data.List are really sequence operations. The finger tree inside Data.Sequence has to do quite a bit more work for a cons (<|) equivalent to list (:), and its memory representation is also somewhat larger than a list as it is made from two data types a FingerTree and a Deep.
The extra syntax for lists is fine, it hits the sweet spot at what lists are good at - cons (:) and pattern-matching from the left. Whether or not sequences should have extra syntax is further debate, but as you can get a very long way with lists, and lists are inherently simple, having good syntax is a must.
List isn't an ideal representation for Strings - the memory layout is inefficient as each Char is wrapped with a constructor. This is why ByteStrings were introduced. Although they are laid out as an array ByteStrings have to do a bit of administrative work - [Char] can still be competitive if you are using short strings. In GHC there are language extensions to give ByteStrings more String-like syntax.
The other major lazy functional Clean has always represented strings as byte arrays, but its type system made this more practical - I believe the ByteString library uses unsafePerfomIO under the hood.
With version 7.8, ghc supports overloading list literals, compare the manual. For example, given appropriate IsList instances, you can write
['0' .. '9'] :: Set Char
[1 .. 10] :: Vector Int
[("default",0), (k1,v1)] :: Map String Int
['a' .. 'z'] :: Text
(quoted from the documentation).
I am pretty sure this won't be an answer to your question, but still.
I wish Haskell had more liberal function names(mixfix!) a la Agda. Then, the syntax for list constructors (:,[]) wouldn't have been magic; allowing us to at least hide the list type and use the same tokens for our own types.
The amount of code change while migrating between list and custom sequence types would be minimal then.
About map, you are a bit luckier. You can always hide map, and set it equal to fmap yourself.
import Prelude hiding(map)
map :: (Functor f) => (a -> b) -> f a -> f b
map = fmap
Prelude is great, but it isn't the best part of Haskell.

Resources