While I somewhat understand the differences between the two methods of evaluation, I am trying to understand the meaning inherent in their names. Why name it 'normal' order? What norm is being adhered to (if the question makes any sense)? The term 'applicative' makes sense in that it describes a practical method for evaluating an expression, as opposed to the 'normal' one, but who would think of using normal order first? References to the person or people who named these terms and their rationale would be greatly appreciated.
I've been reading a bit from http://c2.com/cgi/wiki?ImplementingMultipleDispatch
I've been having some trouble finding reference on how Julia implements multimethods. What's the runtime complexity of dispatch, and how does it achieve it?
Dr. Bezanson's thesis is certainly the best source for descriptions of Julia's internals right now:
4.3 Dispatch system
Julia’s dispatch system strongly resembles the multimethod systems found in some object-oriented languages [17, 40, 110, 31, 32, 33]. However we prefer the term type-based dispatch, since our system actually works by dispatching a single tuple type of arguments. The difference is subtle and in many cases not noticeable, but has important conceptual implications. It means that methods are not necessarily restricted to specifying a type for each argument “slot”. For example a method signature could be Union{Tuple{Any,Int}, Tuple{Int,Any}}, which matches calls where either, but not necessarily both, of two arguments is an Int.
The section continues with descriptions of the type and method caches, sorting by specificity, parametric dispatch, and ambiguities. Note that tuple types are covariant (unlike all of Julia's other types) to match the covariant behavior of method dispatch.
The biggest key here is that the method definitions are sorted by specificity, so lookup is just a linear search that checks whether the type of the argument tuple is a subtype of each signature. So that's just O(n), right? The trouble is that checking subtypes with full generality (including Unions and TypeVars, etc.) is hard. Very hard, in fact: it's estimated to be Π2P (the second level of the polynomial hierarchy), a class believed to be strictly harder than NP-complete problems, and it might even be PSPACE-hard or worse.
Of course, the best source for how it actually works is the implementation in JuliaLang/julia/src/gf.c (gf = generic function). There's a rather useful comment there:
Method caches are divided into three parts: one for signatures where
the first argument is a singleton kind (Type{Foo}), one indexed by the
UID of the first argument's type in normal cases, and a fallback
table of everything else.
So the answer to your question about the complexity of method lookup is: "it depends." The first time a method is called with a new set of argument types, it must go through that linear search, looking for a subtype match. If it finds one, it specializes that method for the particular arguments and places it into one of those caches. This means that before embarking upon the hard subtype search, Julia can perform a quick equality check against the already-seen methods… and the number of methods it needs to check is further reduced, since the caches are stored as hashtables based upon the first argument.
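Just to illustrate the strategy described above, here is a toy Haskell model of it: a cheap cache lookup on the exact argument type first, and only on a miss a linear scan of the (already specificity-sorted) method table using the expensive subtype test. This is a sketch of the idea only, with made-up names; it is not Julia's actual code.

import qualified Data.Map.Strict as Map

type Sig    = String   -- stand-in for a tuple type of arguments
type Method = String   -- stand-in for a (specialized) method

lookupMethod :: (Sig -> Sig -> Bool)   -- subtype test (the expensive part)
             -> [(Sig, Method)]        -- method table, sorted most to least specific
             -> Map.Map Sig Method     -- cache of already-seen argument types
             -> Sig                    -- tuple type of this call's arguments
             -> (Maybe Method, Map.Map Sig Method)
lookupMethod isSubtypeOf methods cache args =
  case Map.lookup args cache of
    Just m  -> (Just m, cache)                      -- fast path: plain equality via the cache
    Nothing ->                                      -- slow path: linear search by specificity
      case [m | (sig, m) <- methods, args `isSubtypeOf` sig] of
        (m:_) -> (Just m, Map.insert args m cache)  -- remember the winner for next time
        []    -> (Nothing, cache)                   -- no applicable method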
But, really, your question was about the runtime complexity of dispatch. In that case, the answer is quite often "what dispatch?", because it has been eliminated entirely! Julia uses LLVM as a just-barely-ahead-of-time compiler, with methods compiled on demand as they're needed. In performant Julia code, the types are concretely inferred at compile time, so dispatch can also be performed at compile time. This completely removes the runtime dispatch overhead, and may even inline the found method directly into the caller's body (if it's small) to remove all function call overhead and allow for further optimizations downstream. If the types aren't concretely inferred, there are other performance pitfalls, and I've not profiled to see how much time is typically spent in dispatch. There are ways to optimize this case further, but there are likely bigger fish to fry first, and for now it's typically easiest to just make the hot loops type-stable in the first place.
I have always thought that both of these are defined as functions that take other functions as arguments. I understand that the domain of each is different, but what are their defining characteristics?
Well, let me try to kind of derive their defining characteristics from their different domains ;)
First of all, in their usual context combinators are higher order functions. But as it turns out, context is an important thing to keep in mind when talking about the differences between these two terms:
Higher Order Functions
When we think of higher order functions, the first thing usually mentioned is "oh, they (also) take at least one function as an argument" (thinking of fold, etc)... as if they were something special because of that. Which - depending on context - they are.
Typical context: functional programming, Haskell, or any other (usually typed) language where functions are first class citizens (like when LINQ made C# even more awesome)
Focus: let the caller specify/customize some functionality of this function
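To make that focus concrete, a minimal (hypothetical) Haskell example, where the caller supplies the behaviour as a function argument:

-- applyTwice is a higher order function: its first argument is itself a function.
applyTwice :: (a -> a) -> a -> a
applyTwice f = f . f

-- e.g. applyTwice (+3) 1 == 7, and applyTwice reverse "abc" == "abc"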
Combinators
Combinators are somewhat special functions: primitive ones do not even mind what they are given as arguments (the argument type often does not matter, so passing functions as arguments is no problem at all). So can the identity combinator also be called a "higher order function"? Formally: no, it does not need a function as an argument! But hold on... in which context would you ever encounter/use combinators (like I, K, etc.) instead of just implementing the desired functionality "directly"? Answer: well, in a purely functional context!
This is not a law or anything, but I really cannot think of a situation where you would see actual combinators in a context where you suddenly pass pointers, hash tables, etc. to a combinator... again, you can do that, but in such scenarios there should really be a better way than using combinators.
So based on this "weak" law of common sense - that you will work with combinators only in a purely functional context - they inherently are higher order functions. What else would you have available to pass as arguments? ;)
Combining combinators (by application only, of course, if you take it seriously) always gives new combinators, which are therefore also higher order functions. Primitive combinators usually just represent some basic behaviour or operation (think of the S, K, I, Y combinators) that you want to apply to something without using abstractions. But of course the definition of combinators does not limit them to that purpose!
Typical context: (untyped) lambda calculus, combinatory logic (surprise)
Focus: (structurally) combine existing combinators/"building blocks" into something new (e.g. using the Y-combinator to "add recursion" to something that is not yet recursive)
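To make the "building blocks" idea concrete, here are I, K and S written as ordinary Haskell functions (just a sketch; the lowercase names are mine, and Y is left out because it needs extra machinery in a typed language). Each body uses nothing but its own arguments:

i :: a -> a
i x = x

k :: a -> b -> a
k x _ = x

s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g x = f x (g x)

-- Combining them by application yields new combinators: s k k behaves like i.
skk :: a -> a
skk = s k k
-- skk 42 == 42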
Summary
Yes, as you can see, it might be more of a contextual/philosophical thing, or about what you want to express: I would never call the K-combinator (definition: K = \a -> \b -> a) a "higher order function", although it is very likely that you will never see K being called with anything other than functions, therefore "making" it a higher order function.
I hope this sort of answered your question: formally they certainly are not the same, but their defining characteristics are pretty similar. Personally, I think of combinators as functions used as higher order functions in their typical context (which usually is somewhere between special and weird).
EDIT: I have adjusted my answer a little bit since, as it turned out, it was slightly "biased" by personal experience/impression. :) To get an even better idea about correctly distinguishing combinators from HOFs, read the comments below!
EDIT2: A look at the HaskellWiki also turns up a technical definition of combinators that is pretty far away from HOFs!
For a while I was thinking that you just need a map to a monoid, and then reduce would do the reduction according to the monoid's multiplication.
First, this is not exactly how monoids work, and second, this is not exactly how map/reduce works in practice.
Namely, take the ubiquitous "count" example. If there's nothing to count, any map/reduce engine will return an empty dataset, not a neutral element. Bummer.
Besides, in a monoid, an operation is defined for two elements. We can easily extend it to finite sequences, or, due to associativity, to finite ordered sets. But there's no way to extend it to arbitrary "collections" unless we actually have a σ-algebra.
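Concretely, the reading I had in mind looks like this in Haskell (just a sketch using the standard Sum monoid; the function name is mine):

import Data.Monoid (Sum(..))

-- "map each element to a monoid, then reduce with the monoid's operation".
-- On an empty list this returns the neutral element 0, which is exactly where
-- real map/reduce engines differ: they return an empty dataset instead.
countViaMonoid :: [a] -> Int
countViaMonoid = getSum . foldMap (const (Sum 1))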
So, what's the theory? I tried to figure it out, but I could not; and I tried Googling it, but found nothing.
I think the right way to think about map-reduce is not as a computational paradigm in its own right, but rather as a control flow construct similar to a while loop. You can view while as a program constructor with two arguments, a predicate function and an arbitrary program. Similarly, the map-reduce construct has two arguments, named map and reduce, each a function. So, analogously to while, the useful questions to ask are about proving correctness of constructed programs relative to given preconditions and postconditions. And as usual, those questions involve (a) termination and run-time performance and (b) maintenance of invariants.
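As a minimal sketch of that two-argument view for plain lists (the name mapReduce and its signature are mine, not any particular engine's API): the construct is parameterised by the caller's map function and an associative reduce operation with its unit, just as while is parameterised by a predicate and a body.

mapReduce :: (a -> b) -> (b -> b -> b) -> b -> [a] -> b
mapReduce f combine unit = foldr combine unit . map f

-- e.g. the "count" example: mapReduce (const 1) (+) 0 ["a", "b", "c"] == 3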
I see a lot of documentation and questions about what the currying technique is, but I have found very little information on why one would use it in practice. My question is, what are the advantages of currying? Perhaps you can provide a trivial example where currying would be preferable to conventional method invocation.
I work in C++ while the sun is up, so to date I have had little exposure to currying beyond out-of-work tinkering with languages.
First, it's very common to mistake partial function application for currying. See this for example (I'm sure there are better resources describing it, but this was the first one I found). I've almost never seen anyone use currying in practice (except in languages like Haskell, where every function is curried by the language itself, so to speak, but even that is in order to enable simple partial function application). Partial function application, on the other hand, is quite useful in many languages.
Anyway, assuming you're talking about partial function application (since that's what most people mean when they ask about currying), the concept is not quite as natural in C++ as in a (purely) functional language such as Haskell.
For example, here we define a function sum that takes a list of numbers, list, and sums all the numbers together. If you're unfamiliar with the concept of fold (or reduce or inject, as it is sometimes called), read this. Anyway, it would look like this:
sum list = foldl (+) 0 list
But wait a minute. We could shorten it by using partial function application! Instead of supplying an argument, we just say that sum is a function that is equal to foldl, with + and 0 partially applied.
sum = foldl (+) 0
Which one is easier to read? A matter of preference, probably, but the latter emphasizes the relation between sum and foldl more clearly, in my opinion. And please take into account that this is a very simple example. I honestly don't know how to write a good example in C++, so you'll have to excuse me there. In any case, what is the practical advantage? Readability. Clearer intent. Shorter code.
Disclaimer: If you actually wanted to know the advantages of currying (as opposed to partial function application), I'm sorry to have made you read all this. But on the other hand, if you understand the difference between the two, you will also understand that currying is a great way to implement partial function application.
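To make that distinction concrete, here is a small Haskell sketch (the names add, addPair and increment are mine): currying is what turns the pair-taking form into the one-argument-at-a-time form, and partial application is what you then do with it.

add :: Int -> Int -> Int        -- curried by default in Haskell
add x y = x + y

addPair :: (Int, Int) -> Int    -- the uncurried form: takes a single pair
addPair = uncurry add

add' :: Int -> Int -> Int       -- currying recovers the one-at-a-time form
add' = curry addPair

increment :: Int -> Int         -- partial application of the curried form
increment = add 1
-- increment 41 == 42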
I've been using Scheme and Common Lisp for a while and there is one thing about nomenclature that I never got:
I know that combinators are procedures with no free variables, but I seldom see them being called "combinators", except for those that deal with lists and other sequences.
Is my perception correct? Or is there some other definition of "combinator" that I'm missing?
If you have any function that deals with lists, then it's no longer a real combinator, since it needs to use list functions. A "real" combinator is one that really uses no free identifiers, not even cons etc. (But the term can sometimes be used more loosely.)
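For example, the same point in a Haskell sketch (the names flip' and prepend are mine; (:) plays the role of cons):

flip' :: (a -> b -> c) -> b -> a -> c
flip' f x y = f y x        -- only its own arguments appear in the body: a combinator

prepend :: a -> [a] -> [a]
prepend x xs = x : xs      -- (:) is a free identifier here, so not a "real" combinator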