Explanation of combinators for the working man - functional-programming

What is a combinator??
Is it "a function or definition with no free variables" (as defined on SO)?
Or how about this: according to John Hughes in his well-known paper on Arrows, "a combinator is a function which builds program fragments from program fragments", which is advantageous because "... the programmer using combinators constructs much of the desired program automatically, rather than writing every detail by hand". He goes on to say that map and filter are two common examples of such combinators.
Some combinators which match the first definition:
S
K
Y
others from To Mock a Mockingbird (I may be wrong -- I haven't read this book)
Some combinators which match the second definition:
map
filter
fold/reduce (presumably)
any of >>=, compose, fmap ?????
I'm not interested in the first definition -- those would not help me to write a real program (+1 if you convince me I'm wrong). Please help me understand the second definition. I think map, filter, and reduce are useful: they allow me to program at a higher level -- fewer mistakes, shorter and clearer code. Here are some of my specific questions about combinators:
What are more examples of combinators such as map, filter?
What combinators do programming languages often implement?
How can combinators help me design a better API?
How do I design effective combinators?
What are combinators similar to in a non-functional language (say, Java), or what do these languages use in place of combinators?
Update
Thanks to #C. A. McCann, I now have a somewhat better understanding of combinators. But one question is still a sticking point for me:
What is the difference between a functional program written with, and one written without, heavy use of combinators?
I suspect the answer is that the combinator-heavy version is shorter, clearer, more general, but I would appreciate a more in-depth discussion, if possible.
I'm also looking for more examples and explanations of complex combinators (i.e. more complex than fold) in common programming languages.

I'm not interested in the first definition -- those would not help me to write a real program (+1 if you convince me I'm wrong). Please help me understand the second definition. I think map, filter, and reduce are useful: they allow me to program at a higher level -- fewer mistakes, shorter and clearer code.
The two definitions are basically the same thing. The first is based on the formal definition and the examples you give are primitive combinators--the smallest building blocks possible. They can help you to write a real program insofar as, with them, you can build more sophisticated combinators. Think of combinators like S and K as the machine language of a hypothetical "combinatory computer". Actual computers don't work that way, of course, so in practice you'll usually have higher-level operations implemented behind the scenes in other ways, but the conceptual foundation is still a useful tool for understanding the meaning of those higher-level operations.
The second definition you give is more informal and about using more sophisticated combinators, in the form of higher-order functions that combine other functions in various ways. Note that if the basic building blocks are the primitive combinators above, everything built from them is a higher-order function and a combinator as well. In a language where other primitives exist, however, you have a distinction between things that are or are not functions, in which case a combinator is typically defined as a function that manipulates other functions in a general way, rather than operating on any non-function things directly.
What are more examples of combinators such as map, filter?
Far too many to list! Both of those transform a function that describes behavior on a single value into a function that describes behavior on an entire collection. You can also have functions that transform only other functions, such as composing them end-to-end, or splitting and recombining arguments. You can have combinators that turn single-step operations into recursive operations that produce or consume collections. Or all kinds of other things, really.
What combinators do programming languages often implement?
That's going to vary quite a bit. There're relatively few completely generic combinators--mostly the primitive ones mentioned above--so in most cases combinators will have some awareness of any data structures being used (even if those data structures are built out of other combinators anyway), in which case there are typically a handful of "fully generic" combinators and then whatever various specialized forms someone decided to provide. There are a ridiculous number of cases where (suitably generalized versions of) map, fold, and unfold are enough to do almost everything you might want.
How can combinators help me design a better API?
Exactly as you said, by thinking in terms of high-level operations, and the way those interact, instead of low-level details.
Think about the popularity of "for each"-style loops over collections, which let you abstract over the details of enumerating a collection. These are just map/fold operations in most cases, and by making that a combinator (rather than built-in syntax) you can do things such as take two existing loops and directly combine them in multiple ways--nest one inside the other, do one after the other, and so on--by just applying a combinator, rather than juggling a whole bunch of code around.
How do I design effective combinators?
First, think about what operations make sense on whatever data your program uses. Then think about how those operations can be meaningfully combined in generic ways, as well as how operations can be broken down into smaller pieces that are connected back together. The main thing is to work with transformations and operations, not direct actions. When you have a function that just does some complicated bit of functionality in an opaque way and only spits out some sort of pre-digested result, there's not much you can do with that. Leave the final results to the code that uses the combinators--you want things that take you from point A to point B, not things that expect to be the beginning or end of a process.
What are combinators similar to in a non-functional language (say, Java), or what do these languages use in place of combinators?
Ahahahaha. Funny you should ask, because objects are really higher-order thingies in the first place--they have some data, but they also carry around a bunch of operations, and quite a lot of what constitutes good OOP design boils down to "objects should usually act like combinators, not data structures".
So probably the best answer here is that instead of combinator-like things, they use classes with lots of getter and setter methods or public fields, and logic that mostly consists of doing some opaque, predefined action.

Related

Functional Programming: why pair as a basic constructed unit?

Basic cons-cell sticks together two arbitrary things and is the basic unit that allows a construction of linked lists and arbitrary data objects. Question: is there a reason to stick with this simplistic language design descision (for instance, in all lisp families)?
Why not use fixed length arrays for this purpose (or some nested stacks)? I can't foresee any problems with that, but there are clear advantages of a more "packed" memory, less pointer resolution and less "dead-weight" cons-cells to define hierarchy of the data.
You have titled your question “Functional Programming: why pair as a basic constructed unit?”, but this title does not reflect correctly the fact that many important and well known functional languages (e.g. Haskell, F#, Scala, SML, Clojure etc.) have either algebraic data types or different collection of data structures, in which the pair is just one of the different type of constructors, if even available. The situation is similar for other multiparadigm languages, that have support for functional programming, like C++, Java, Objective-C, Swift, etc.
In all these cases the pair, if present, is exactly “basic” as an array, a record, or list, or any other type of data constructor.
What is left is the family of Lisp languages, notably Common Lisp and Scheme, that, beside having a rich set of data structures, like those cited in the comment of Rainer Joswig, use the pair for an important task: as basic data constructor to represent programs.
The fact that Lisp code is a s-expression (that is a list of lists and atoms) has foundamental consequences, the most notable of all being the rising of macro systems, that allow programmers to create easily new syntax, or even new domain-specific languages.
Renzo's answer about other structures in functional programming is spot on. Function programming is about aligning programming with logic and mathematics, where expressions denote values, and there's no such thing as a side effect. (Of course, in practice, we need side effects for I/O, etc.) Functional programming doesn't require the singly linked list as a fundamental construct.
Lists are one of the things that make Lisps lispy, though.
One of the reasons that pairs are so common in the Lisp family of languages may be that ordered pairs are very easy to implement in the lambda calculus that Lisps are inspired by. (I say "inspired by" rather than "based on" because after the syntax and use of lambda to denote anonymous functions, there's plenty of difference, and it's best not to assume that things about one apply to the other.) See the answer to Use of lambda for cons/car/cdr definition in SICP for a quick lesson in how cons, car, and cdr can implemented using nothing but lexical closures.

What is the difference between a combinator and a higher order function?

I have always thought the definition of both of these were functions that take other functions as arguments. I understand the domain of each is different, but what are their defining characteristics?
Well, let me try to kind of derive their defining characteristics from their different domains ;)
First of all, in their usual context combinators are higher order functions. But as it turns out, context is an important thing to keep in mind when talking about differences of these two terms:
Higher Order Functions
When we think of higher order functions, the first thing usually mentioned is "oh, they (also) take at least one function as an argument" (thinking of fold, etc)... as if they were something special because of that. Which - depending on context - they are.
Typical context: functional programming, haskell, any other (usually typed) language where functions are first class citizens (like when LINQ made C# even more awesome)
Focus: let the caller specify/customize some functionality of this function
Combinators
Combinators are somewhat special functions, primitive ones do not even mind what they are given as arguments (argument type often does not matter at all, so passing functions as arguments is not a problem at all). So can the identity-combinator also be called "higher order function"??? Formally: No, it does not need a function as argument! But hold on... in which context would you ever encounter/use combinators (like I, K, etc) instead of just implementing desired functionality "directly"? Answer: Well, in purely functional context!
This is not a law or something, but I can really not think of a situation where you would see actual combinators in a context where you suddenly pass pointers, hash-tables, etc. to a combinator... again, you can do that, but in such scenarios there should really be a better way than using combinators.
So based on this "weak" law of common sense - that you will work with combinators only in a purely functional context - they inherently are higher order functions. What else would you have available to pass as arguments? ;)
Combining combinators (by application only, of course - if you take it seriously) always gives new combinators that therefore also are higher order functions, again. Primitive combinators usually just represent some basic behaviour or operation (thinking of S, K, I, Y combinators) that you want to apply to something without using abstractions. But of course the definition of combinators does not limit them to that purpose!
Typical context: (untyped) lambda calculus, combinatory logic (surprise)
Focus: (structurally) combine existing combinators/"building blocks" to something new (e.g. using the Y-combinator to "add recursion" to something that is not recursive, yet)
Summary
Yes, as you can see, it might be more of a contextual/philosophical thing or about what you want to express: I would never call the K-combinator (definition: K = \a -> \b -> a) "higher order function" - although it is very likely that you will never see K being called with something else than functions, therefore "making" it a higher order function.
I hope this sort of answered your question - formally they certainly are not the same, but their defining characteristics are pretty similar - personally I think of combinators as functions used as higher order functions in their typical context (which usually is somewhere between special an weird).
EDIT: I have adjusted my answer a little bit since - as it turned out - it was slightly "biased" by personal experience/imression. :) To get an even better idea about correctly distinguishing combinators from HOFs, read the comments below!
EDIT2: Taking a look at HaskellWiki also gives a technical definition for combinators that is pretty far away from HOFs!

Difference between a procedure and a combinator?

I've been using Scheme and Common Lisp for a while and there is one thing about nomenclature that I never got:
I know that combinators are procedures with no free variables, but I seldom see them being called "combinators", except for those that deal with lists and other sequences.
Is my perception correct? Or is there some other definition of "combination" that I'm missing?
If you have any function that deals with lists, then it's no longer a real combinator, since it needs to use list functions. A "real" combinator is one that really uses no free identifiers, not even cons etc. (But the term can sometime be used more loosely.)

Mathematical notation of programming concepts

There are many methods for representing structure of a program (like UML class diagrams etc.). I am interested if there is a convention which describes programs in a strict, mathematical way. I am especially interested in the use of mathematical notation for this purpose.
An example: Classes are represented as sets (fields, properties) and functions (operating on the elements of sets). A parent class' fields are a subset of child class'. Functions are described in pseudocode which has to look like this and that...
I know that Z Notation has been used to some extent in the formal verification of software, such as the Tokeneer project.
Z Notation
Z Reference Manual
http://www.amazon.com/Concrete-Mathematics-Foundation-Computer-Science/dp/0201558025
Yes, there is, Floyd-Hoare Logic.
There are a lot of way, but i think most of them are inconvenient for expressing the structure since the structure is often not expressable in default mathematical concepts. The main exception is of course functional programing languages. Think about folds (catamorphisme), groups, algebra's etc.
For imperative programming I know of the existence of Z, which uses (pure and extended) lambda calculus set theory and (first order) predicate logic. However, i dont think it's very convenient. The only upside of using mathematics to express structure is the fact that you can prove stuff about it. But if you want to do that, take a look at JML, Spec# or Eiffel.
Depends on what you're trying to accomplish, but going down this road with specific languages can get you into trouble.
For example, see the circle-ellipse discussion on C++ FAQ Lite.
This book applies the deductive method
to programming by affiliating programs
with the abstract mathematical
theories that enable them work. [...]
I believe that Elements of Programming by Alexander Stepanov and Paul McJones, is pretty close to what you are looking for.
Concepts
A concept is a description of
requirements on one or more types
stated in terms of the existence and
properties of procedures, type
attributes, and type functions defined
on the types.
Z, which has already been mentioned, is pretty much what you describe. There are some variants of it for object-oriented modelling, but I think you can get quite far with "standard Z's" schemas if you wish to model classes.
There's also Alloy, which is newer and inspired by Z. Its notation is perhaps a bit closer to object-orientation. It is also analysable, i.e. you can check the models you create whether they fulfill certain conditions, but it cannot prove that properties hold, just attempt to refute within a finite scope.
The article Dependable Software by Design is a nice introduction to Alloy and its ilk, along with a table of available similar tools.
You are looking for functional programming. There are several functional programming languages, and they are all based on a fundamental mathematical theory called the Lambda calculus. Programs written in a functional programming language such as LISP are a mathematical representation of themselves. ;-)
There is a mathematical language which actually describes a program or rather it's operations. You take the initial state and then transform this state until you reach the desired target state. The transformations yield the program code which must be executed.
See the Wikipedia article about Hoare logic.
The basic idea is that for every function (no matter if you put that into a class or into an old style function), you have a pre- and a post-condition. For example, the precondition can be that you have an array which has >= 0 elements. the post-condition is that every element[i] must by <= element[j] for every i <= j.
The usual description would be "the function sorts the array". But the mathematical terms allow you to transform the input (which must match the precondition) into the output (which must match the postcondition).
It's a bit unwieldy to use, especially for more complex programs but some of the examples are pretty impressive. Often, you get really compact code as the result which looks quite complex but works at first try.
I'd like to suggest Algebra of Programming. It's a calculational approach to programs, using Relational Algebra, and Galois Connections.
If you have further interest on this topic, you can find an amazing paper, here, by Shin-Cheng Mu, and José Nuno Oliveira (slides).
Using Relational Algebra and First-Order Logic, also has a nice synergy with Alloy, Functional Programming, and Design by Contract (easily applied to Object-Oriented Programming).

Defining point of functional programming

I can enumerate many features of functional programming, but when my friend asked me Could you define functional programming for me? I couldn't.
I would say that the defining point of pure functional programming is that all computation is done in functions with no side effects. That is, functions take inputs and return values, but do not change any hidden state, In this paradigm, functions more closely model their mathematical cousins.
This was nailed down for me when I started playing with Erlang, a language with a write-once stack. However, it should be clarified that there is a difference between a programming paradigm, and a programming language. Languages that are generally referred to as functional provide a number of features that encourage or enforce the functional paradigm (e.g., Erlang with it's write-once stack, higher order functions, closures, etc.). However the functional programming paradigm can be applied in many languages (with varying degrees of pain).
A lot of the definitions so far have emphasized purity, but there are many languages that are considered functional that are not at all pure (e.g., ML, Scheme). I think the key properties that make a language "functional" are:
Higher-order functions. Functions are a built-in datatype no different from integers and booleans. Anonymous functions are easy to create and idiomatic (e.g., lambdas).
Everything is an expression. In imperative languages, a distinction is made between statements, which mutate state and affect control flow, and expressions, which yield values. In functional languages (even impure functional languages), expression evaluation is the fundamental unit of execution.
Given these two properties, you naturally get the behavior we think of as functional (e.g., expressing computations in terms of folds and maps). Eliminating mutable state is a way to make things even more functional.
From wikipedia:
In computer science, functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast with the imperative programming style that emphasizes changes in state.
Using a functional approach gives the following benefits:
Concurrent programming is much easier in functional languages.
Functions in FP can never cause side effects - this makes unit testing much easier.
Hot Code Deployment in production environments is much easier.
Functional languages can be reasoned about mathematically.
Lazy evaluation provides potential for performance optimizations.
More expressive - closures, pattern matching, advanced type systems etc. allow programmers to 'say what they mean' more readily.
Brevity - for some classes of program a functional solution is significantly more concise.
There is a great article with more detail here.
Being able to enumerate the features is more useful than trying to define the term itself, as people will use the term "functional programming" in a variety of contexts with many shades of meaning across a continuum, whereas the individual features have individually crisper definitions that are more universally agreed upon.
Below are the features that come to mind. Most people use the term "functional programming" to refer to some subset of those features (the most common/important ones being "purity" and "higher-order functions").
FP features:
Purity (a.k.a. immutability, eschewing side-effects, referential transparency)
Higher-order functions (e.g. pass a function as a parameter, return it as a result, define anonymous function on the fly as a lambda expression)
Laziness (a.k.a. non-strict evaluation, most useful/usable when coupled with purity)
Algebraic data types and pattern matching
Closures
Currying / partial application
Parametric polymorphism (a.k.a. generics)
Recursion (more prominent as a result of purity)
Programming with expressions rather than statements (again, from purity)
...
The more features from the above list you are using, the more likely someone will label what you are doing "functional programming" (and the first two features--purity and higher-order functions--are probably worth the most extra bonus points towards your "FP score").
I have to add that functional programming tends to also abstract control structures of your program as well as the domain - e.g., you no longer do a 'for loop' on some list of things, but you 'map' it with some function to produce the output.
i think functional programming is a state of mind as well as the definition given above.
There are two separate definitions:
The older definition (first-class functions) has been given by Chris Conway.
The newer definition (avoiding side effects like mutation) has been given by John Stauffer. This is more generally known as purely functional programming.
This is a source of much confusion...
It's like drawing a picture by using vectors instead of bitmaps - tell the painter how to change the picture instead of what the picture looks like at each step.
It's application of functions as opposed to changing the state.
I think John Stauffer mostly has the definition. I would also add that you need to be able to pass functions around. Essentially you need high order functions, meaning you can pass functions around easily (although passing blocks is good enough).
For example a very popular functional call is map. It is basically equivalent to
list is some list of items
OutList is some empty list
foreach item in list
OutList.append(function(item))
return OutList
so that code is expressed as map(function, list). The revolutionary concept is that function is a function. Javascript is a great example of a language with high order functions. Basically functions can be treated like a variable and passed into functions or returned from functions. C++ and C have function pointers which can be used similarly. .NET delegates can also be used similarly.
then you can think of all sorts of cool abstractions...
Do you have a function AddItemsInList, MultiplyItemsInList, etc..?
Each function takes (List) and returns a single result
You could create (note, many languages do not allow you to pass + around as a function but it seems the clearest way to express the concept)....
AggregateItemsInList(List, combinefunction, StepFunction)
Increment functions work on indexes...better would be to make them work on list using list operations like next and for incTwo next next if it exists....
function incNormal(x) {
return x + 1
}
function incTwo(x) {
return x + 2
}
AggregateItemsInList(List, +, incNormal)
Want to do every other item?
AggegateItemsInList(List, +, incTwo)
Want to multiply?
AggregateItemsInList(List, *, incNormal)
Want to add exam scores together?
function AddScores (studenta, studentb) {
return studenta.score + studentb.score
}
AggregateItemsInList(ListOfStudents, AddScores, incOne)
High order functions are a very powerful abstraction. Instead of having to write custom methods for numbers, strings, students, etc.. you have one aggregate method that loops through a list of anything and you just have to create the addition operation for each data type.

Resources