What is so special about Monads? - math

A monad is a mathematical structure which is heavily used in (pure) functional programming, basically Haskell. However, there are many other mathematical structures available, like for example applicative functors, strong monads, or monoids. Some have more specific, some are more generic. Yet, monads are much more popular. Why is that?
One explanation I came up with, is that they are a sweet spot between genericity and specificity. This means monads capture enough assumptions about the data to apply the algorithms we typically use and the data we usually have fulfills the monadic laws.
Another explanation could be that Haskell provides syntax for monads (do-notation), but not for other structures, which means Haskell programmers (and thus functional programming researchers) are intuitively drawn towards monads, where a more generic or specific (efficient) function would work as well.

I suspect that the disproportionately large attention given to this one particular type class (Monad) over the many others is mainly a historical fluke. People often associate IO with Monad, although the two are independently useful ideas (as are list reversal and bananas). Because IO is magical (having an implementation but no denotation) and Monad is often associated with IO, it's easy to fall into magical thinking about Monad.
(Aside: it's questionable whether IO even is a monad. Do the monad laws hold? What do the laws even mean for IO, i.e., what does equality mean? Note the problematic association with the state monad.)

If a type m :: * -> * has a Monad instance, you get Turing-complete composition of functions with type a -> m b. This is a fantastically useful property. You get the ability to abstract various Turing-complete control flows away from specific meanings. It's a minimal composition pattern that supports abstracting any control flow for working with types that support it.
Compare this to Applicative, for instance. There, you get only composition patterns with computational power equivalent to a push-down automaton. Of course, it's true that more types support composition with more limited power. And it's true that when you limit the power available, you can do additional optimizations. These two reasons are why the Applicative class exists and is useful. But things that can be instances of Monad usually are, so that users of the type can perform the most general operations possible with the type.
Edit:
By popular demand, here are some functions using the Monad class:
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM c x y = c >>= \z -> if z then x else y
whileM :: Monad m => (a -> m Bool) -> (a -> m a) -> a -> m a
whileM p step x = ifM (p x) (step x >>= whileM p step) (return x)
(*&&) :: Monad m => m Bool -> m Bool -> m Bool
x *&& y = ifM x y (return False)
(*||) :: Monad m => m Bool -> m Bool -> m Bool
x *|| y = ifM x (return True) y
notM :: Monad m => m Bool -> m Bool
notM x = x >>= return . not
Combining those with do syntax (or the raw >>= operator) gives you name binding, indefinite looping, and complete boolean logic. That's a well-known set of primitives sufficient to give Turing completeness. Note how all the functions have been lifted to work on monadic values, rather than simple values. All monadic effects are bound only when necessary - only the effects from the chosen branch of ifM are bound into its final value. Both *&& and *|| ignore their second argument when possible. And so on..
Now, those type signatures may not involve functions for every monadic operand, but that's just a cognitive simplification. There would be no semantic difference, ignoring bottoms, if all the non-function arguments and results were changed to () -> m a. It's just friendlier to users to optimize that cognitive overhead out.
Now, let's look at what happens to those functions with the Applicative interface.
ifA :: Applicative f => f Bool -> f a -> f a -> f a
ifA c x y = (\c' x' y' -> if c' then x' else y') <$> c <*> x <*> y
Well, uh. It got the same type signature. But there's a really big problem here already. The effects of both x and y are bound into the composed structure, regardless of which one's value is selected.
whileA :: Applicative f => (a -> f Bool) -> (a -> f a) -> a -> f a
whileA p step x = ifA (p x) (whileA p step <$> step x) (pure x)
Well, ok, that seems like it'd be ok, except for the fact that it's an infinite loop because ifA will always execute both branches... Except it's not even that close. pure x has the type f a. whileA p step <$> step x has the type f (f a). This isn't even an infinite loop. It's a compile error. Let's try again..
whileA :: Applicative f => (a -> f Bool) -> (a -> f a) -> a -> f a
whileA p step x = ifA (p x) (whileA p step <*> step x) (pure x)
Well shoot. Don't even get that far. whileA p step has the type a -> f a. If you try to use it as the first argument to <*>, it grabs the Applicative instance for the top type constructor, which is (->), not f. Yeah, this isn't gonna work either.
In fact, the only function from my Monad examples that would work with the Applicative interface is notM. That particular function works just fine with only a Functor interface, in fact. The rest? They fail.
Of course it's to be expected that you can write code using the Monad interface that you can't with the Applicative interface. It is strictly more powerful, after all. But what's interesting is what you lose. You lose the ability to compose functions that change what effects they have based on their input. That is, you lose the ability to write certain control-flow patterns that compose functions with types a -> f b.
Turing-complete composition is exactly what makes the Monad interface interesting. If it didn't allow Turing-complete composition, it would be impossible for you, the programmer, to compose together IO actions in any particular control flow that wasn't nicely prepackaged for you. It was the fact that you can use the Monad primitives to express any control flow that made the IO type a feasible way to manage the IO problem in Haskell.
Many more types than just IO have semantically valid Monad interfaces. And it happens that Haskell has the language facilities to abstract over the entire interface. Due to those factors, Monad is a valuable class to provide instances for, when possible. Doing so gets you access to all the existing abstract functionality provided for working with monadic types, regardless of what the concrete type is.
So if Haskell programmers seem to always care about Monad instances for a type, it's because it's the most generically-useful instance that can be provided.

First, I think that it is not quite true that monads are much more popular than anything else; both Functor and Monoid have many instances that are not monads. But they are both very specific; Functor provides mapping, Monoid concatenation. Applicative is the one class that I can think of that is probably underused given its considerable power, due largely to its being a relatively recent addition to the language.
But yes, monads are extremely popular. Part of that is the do notation; a lot of Monoids provide Monad instances that merely append values to a running accumulator (essentially an implicit writer). The blaze-html library is a good example. The reason, I think, is the power of the type signature (>>=) :: Monad m => m a -> (a -> m b) -> m b. While fmap and mappend are useful, what they can do is fairly narrowly constrained. bind, however, can express a wide variety of things. It is, of course, canonized in the IO monad, perhaps the best pure functional approach to IO before streams and FRP (and still useful beside them for simple tasks and defining components). But it also provides implicit state (Reader/Writer/ST), which can avoid some very tedious variable passing. The various state monads, especially, are important because they provide a guarantee that state is single threaded, allowing mutable structures in pure (non-IO) code before fusion. But bind has some more exotic uses, such as flattening nested data structures (the List and Set monads), both of which are quite useful in their place (and I usually see them used desugared, calling liftM or (>>=) explicitly, so it is not a matter of do notation). So while Functor and Monoid (and the somewhat rarer Foldable, Alternative, Traversable, and others) provide a standardized interface to a fairly straightforward function, Monad's bind is considerably more flexibility.
In short, I think that all your reasons have some role; the popularity of monads is due to a combination of historical accident (do notation and the late definition of Applicative) and their combination of power and generality (relative to functors, monoids, and the like) and understandability (relative to arrows).

Well, first let me explain what the role of monads is: Monads are very powerful, but in a certain sense: You can pretty much express anything using a monad. Haskell as a language doesn't have things like action loops, exceptions, mutation, goto, etc. Monads can be expressed within the language (so they are not special) and make all of these reachable.
There is a positive and a negative side to this: It's positive that you can express all those control structures you know from imperative programming and a whole bunch of them you don't. I have just recently developed a monad that lets you reenter a computation somewhere in the middle with a slightly changed context. That way you can run a computation, and if it fails, you just try again with slightly adjusted values. Furthermore monadic actions are first class, and that's how you build things like loops or exception handling. While while is primitive in C in Haskell it's actually just a regular function.
The negative side is that monads give you pretty much no guarantees whatsoever. They are so powerful that you are allowed to do whatever you want, to put it simply. In other words just like you know from imperative languages it can be hard to reason about code by just looking at it.
The more general abstractions are more general in the sense that they allow some concepts to be expressed which you can't express as monads. But that's only part of the story. Even for monads you can use a style known as applicative style, in which you use the applicative interface to compose your program from small isolated parts. The benefit of this is that you can reason about code by just looking at it and you can develop components without having to pay attention to the rest of your system.

What is so special about monads?
The monadic interface's main claim to fame in Haskell is its role in the replacement of the original and unwieldy dialogue-based I/O mechanism.
As for their status in a formal investigative context...it is merely an iteration of a seemingly-cyclic endeavour which is now (2021 Oct) approximately one half-century old:
During the 1960s, several researchers began work on proving things about programs. Efforts were
made to prove that:
A program was correct.
Two programs with different code computed the same answers when given the
same inputs.
One program was faster than another.
A given program would always terminate.
While these are abstract goals, they are all, really, the same as the practical goal of "getting the
program debugged".
Several difficult problems emerged from this work. One was the problem of specification: before
one can prove that a program is correct, one must specify the meaning of "correct", formally and
unambiguously. Formal systems for specifying the meaning of a program were developed, and they
looked suspiciously like programming languages.
The Anatomy of Programming Languages, Alice E. Fischer and Frances S. Grodzinsky.
(emphasis by me.)
...back when "programming languages" - apart from an intrepid few - were most definitely imperative.
Anyone for elevating this mystery to the rank of Millenium problem? Solving it would definitely advance the science of computing and the engineering of software, one way or the other...

Monads are special because of do notation, which lets you write imperative programs in a functional language. Monad is the abstraction that allows you to splice together imperative programs from smaller, reusable components (which are themselves imperative programs). Monad transformers are special because they represent enhancing an imperative language with new features.

Related

Isabelle/HOL foundations

I have seen a lot of documentation about Isabelle's syntax and proof strategies. However, little have I found about its foundations. I have a few questions that I would be very grateful if someone could take the time to answer:
Why doesn't Isabelle/HOL admit functions that do not terminate? Many other languages such as Haskell do admit non-terminating functions.
What symbols are part of Isabelle's meta-language? I read that there are symbols in the meta-language for Universal Quantification (/\) and for implication (==>). However, these symbols have their counterpart in the object-level language (∀ and -->). I understand that --> is an object-level function of type bool => bool => bool. However, how are ∀ and ∃ defined? Are they object-level Boolean functions? If so, they are not computable (considering infinite domains). I noticed that I am able to write Boolean functions in therms of ∀ and ∃, but they are not computable. So what are ∀ and ∃? Are they part of the object-level? If so, how are they defined?
Are Isabelle theorems just Boolean expressions? Then Booleans are part of the meta-language?
As far as I know, Isabelle is a strict programming language. How can I use infinite objects? Let's say, infinite lists. Is it possible in Isabelle/HOL?
Sorry if these questions are very basic. I do not seem to find a good tutorial on Isabelle's meta-theory. I would love if someone could recommend me a good tutorial on these topics.
Thank you very much.
You can define non-terminating (i.e. partial) functions in Isabelle (cf. Function package manual (section 8)). However, partial functions are more difficult to reason about, because whenever you want to use its definition equations (the psimps rules, which replace the simps rules of a normal function), you have to show that the function terminates on that particular input first.
In general, things like non-definedness and non-termination are always problematic in a logic – consider, for instance, the function ‘definition’ f x = f x + 1. If we were to take this as an equation on ℤ (integers), we could subtract f x from both sides and get 0 = 1. In Haskell, this problem is ‘solved’ by saying that this is not an equation on ℤ, but rather on ℤ ∪ {⊥} (the integers plus bottom) and the non-terminating function f evaluates to ⊥, and ‘⊥ + 1 = ⊥’, so everything works out fine.
However, if every single expression in your logic could potentially evaluate to ⊥ instead of a ‘proper‘ value, reasoning in this logic will become very tedious. This is why Isabelle/HOL chooses to restrict itself to total functions; things like partiality have to be emulated with things like undefined (which is an arbitrary value that you know nothing about) or option types.
I'm not an expert on Isabelle/Pure (the meta logic), but the most important symbols are definitely
⋀ (the universal meta quantifier)
⟹ (meta implication)
≡ (meta equality)
&&& (meta conjunction, defined in terms of ⟹)
Pure.term, Pure.prop, Pure.type, Pure.dummy_pattern, Pure.sort_constraint, which fulfil certain internal functions that I don't know much about.
You can find some information on this in the Isabelle/Isar Reference Manual in section 2.1, and probably more elsewhere in the manual.
Everything else (that includes ∀ and ∃, which indeed operate on boolean expressions) is defined in the object logic (HOL, usually). You can find the definitions, of rather the axiomatisations, in ~~/src/HOL/HOL.thy (where ~~ denotes the Isabelle root directory):
All_def: "All P ≡ (P = (λx. True))"
Ex_def: "Ex P ≡ ∀Q. (∀x. P x ⟶ Q) ⟶ Q"
Also note that many, if not most Isabelle functions are typically not computable. Isabelle is not a programming language, although it does have a code generator that allows exporting Isabelle functions as code to programming languages as long as you can give code equations for all the functions involved.
3)
Isabelle theorems are a complex datatype (cf. ~~/src/Pure/thm.ML) containing a lot of information, but the most important part, of course, is the proposition. A proposition is something from Isabelle/Pure, which in fact only has propositions and functions. (and itself and dummy, but you can ignore those).
Propositions are not booleans – in fact, there isn't even a way to state that a proposition does not hold in Isabelle/Pure.
HOL then defines (or rather axiomatises) booleans and also axiomatises a coercion from booleans to propositions: Trueprop :: bool ⇒ prop
Isabelle is not a programming language, and apart from that, totality does not mean you have to restrict yourself to finite structures. Even in a total programming language, you can have infinite lists. (cf. Idris's codata)
Isabelle is a theorem prover, and logically, infinite objects can be treated by axiomatising them and then reasoning about them using the axioms and rules that you have.
For instance, HOL assumes the existence of an infinite type and defines the natural numbers on that. That already gives you access to functions nat ⇒ 'a, which are essentially infinite lists.
You can also define infinite lists and other infinite data structures as codatatypes with the (co-)datatype package, which is based on bounded natural functors.
Let me add some points to two of your questions.
1) Why doesn't Isabelle/HOL admit functions that do not terminate? Many other languages such as Haskell do admit non-terminating functions.
In short: Isabelle/HOL does not require termination, but totality (i.e., there is a specific result for each input to the function) of functions. Totality does not mean that a function is actually terminating when transcribed to a (functional) programming language or even that it is computable at all.
Therefore, talking about termination is somewhat misleading, even though it is encouraged by the fact that Isabelle/HOL's function package uses the keyword termination for proving some property P about which I will have to say a little more below.
On the one hand the term "termination" might sound more intuitive to a wider audience. On the other hand, a more precise description of P would be well-foundedness of the function's call graph.
Don't get me wrong, termination is not really a bad name for the property P, it is even justified by the fact that many techniques that are implemented in the function package are very close to termination techniques from term rewriting or functional programming (like the size-change principle, dependency pairs, lexicographic orders, etc.).
I'm just saying that it can be misleading. The answer to why that is the case also touches on question 4 of the OP.
4) As far as I know Isabelle is a strict programming language. How can I use infinite objects? Let's say, infinite lists. Is it possible in Isabelle/HOL?
Isabelle/HOL is not a programming language and it specifically does not have any evaluation strategy (we could alternatively say: it has any evaluation strategy you like).
And here is why the word termination is misleading (drum roll): if there is no evaluation strategy and we have termination of a function f, people might expect f to terminate independent of the used strategy. But this is not the case. A termination proof of a function rather ensures that f is well-defined. Even if f is computable a proof of P merely ensures that there is an evaluation strategy for which f terminates.
(As an aside: what I call "strategy" here, is typically influenced by so called cong-rules (i.e., congruence rules) in Isabelle/HOL.)
As an example, it is trivial to prove that the function (see Section 10.1 Congruence rules and evaluation order in the documentation of the function package):
fun f' :: "nat ⇒ bool"
where
"f' n ⟷ f' (n - 1) ∨ n = 0"
terminates (in the sense defined by termination) after adding the cong-rule:
lemma [fundef_cong]:
"Q = Q' ⟹ (¬ Q' ⟹ P = P') ⟹ (P ∨ Q) = (P' ∨ Q')"
by auto
Which essentially states that logical-or should be "evaluated" from right to left. However, if you write the same function e.g. in OCaml it causes a stack overflow ...
EDIT: this answer is not really correct, check out Lars' comment below.
Unfortunately I don't have enough reputation to post this as a comment, so here is my go at an answer (please bear in mind I am no expert in Isabelle, but I also had similar questions once):
1) The idea is to prove statements about the defined functions. I am not sure how familiar you are with Computability Theory, but think about the Halting Problem and the fact most undeciability problems stem from it (such as Acceptance Problem). Imagine defining a function which you can't prove it terminates. How could you then still prove it returns the number 42 when given input "ABC" and it doesn't go in an infinite loop?
If instead you limit yourself to terminating functions, you can prove much more about them, essentially making a trade-off (or at least this is how I see it).
These ideas stem from Constructivism and Intuitionism and I recommend you check out Robert Harper's very interesting lecture series: https://www.youtube.com/watch?v=9SnefrwBIDc&list=PLGCr8P_YncjXRzdGq2SjKv5F2J8HUFeqN on Type Theory
You should check out especially the part about the absence of the Law of Excluded middle: http://youtu.be/3JHTb6b1to8?t=15m34s
2) See Manuel's answer.
3,4) Again see Manuel's answer keeping in mind Intuitionistic logic: "the fundamental entity is not the boolean, but rather the proof that something is true".
For me it took a long time to get adjusted to this way of thinking and I'm still not sure I understand it. I think the key though is to understand it is a more-or-less completely different way of thinking.

In pure functional languages, is data (strings, ints, floats.. ) also just functions?

I was thinking about pure Object Oriented Languages like Ruby, where everything, including numbers, int, floats, and strings are themselves objects. Is this the same thing with pure functional languages? For example, in Haskell, are Numbers and Strings also functions?
I know Haskell is based on lambda calculus which represents everything, including data and operations, as functions. It would seem logical to me that a "purely functional language" would model everything as a function, as well as keep with the definition that a function most always returns the same output with the same inputs and has no state.
It's okay to think about that theoretically, but...
Just like in Ruby not everything is an object (argument lists, for instance, are not objects), not everything in Haskell is a function.
For more reference, check out this neat post: http://conal.net/blog/posts/everything-is-a-function-in-haskell
#wrhall gives a good answer. However you are somewhat correct that in the pure lambda calculus it is consistent for everything to be a function, and the language is Turing-complete (capable of expressing any pure computation that Haskell, etc. is).
That gives you some very strange things, since the only thing you can do to anything is to apply it to something else. When do you ever get to observe something? You have some value f and want to know something about it, your only choice is to apply it some value x to get f x, which is another function and the only choice is to apply it to another value y, to get f x y and so on.
Often I interpret the pure lambda calculus as talking about transformations on things that are not functions, but only capable of expressing functions itself. That is, I can make a function (with a bit of Haskelly syntax sugar for recursion & let):
purePlus = \zero succ natCase ->
let plus = \m n -> natCase m n (\m' -> plus m' n)
in plus (succ (succ zero)) (succ (succ zero))
Here I have expressed the computation 2+2 without needing to know that there are such things as non-functions. I simply took what I needed as arguments to the function I was defining, and the values of those arguments could be church encodings or they could be "real" numbers (whatever that means) -- my definition does not care.
And you could think the same thing of Haskell. There is no particular reason to think that there are things which are not functions, nor is there a particular reason to think that everything is a function. But Haskell's type system at least prevents you from applying an argument to a number (anybody thinking about fromInteger right now needs to hold their tongue! :-). In the above interpretation, it is because numbers are not necessarily modeled as functions, so you can't necessarily apply arguments to them.
In case it isn't clear by now, this whole answer has been somewhat of a technical/philosophical digression, and the easy answer to your question is "no, not everything is a function in functional languages". Functions are the things you can apply arguments to, that's all.
The "pure" in "pure functional" refers to the "freedom from side effects" kind of purity. It has little relation to the meaning of "pure" being used when people talk about a "pure object-oriented language", which simply means that the language manipulates purely (only) in objects.
The reason is that pure-as-in-only is a reasonable distinction to use to classify object-oriented languages, because there are languages like Java and C++, which clearly have values that don't have all that much in common with objects, and there are also languages like Python and Ruby, for which it can be argued that every value is an object1
Whereas for functional languages, there are no practical languages which are "pure functional" in the sense that every value the language can manipulate is a function. It's certainly possible to program in such a language. The most basic versions of the lambda calculus don't have any notion of things that are not functions, but you can still do arbitrary computation with them by coming up with ways of representing the things you want to compute on as functions.2
But while the simplicity and minimalism of the lambda calculus tends to be great for proving things about programming, actually writing substantial programs in such a "raw" programming language is awkward. The function representation of basic things like numbers also tends to be very inefficient to implement on actual physical machines.
But there is a very important distinction between languages that encourage a functional style but allow untracked side effects anywhere, and ones that actually enforce that your functions are "pure" functions (similar to mathematical functions). Object-oriented programming is very strongly wed to the use of impure computations3, so there are no practical object-oriented programming languages that are pure in this sense.
So the "pure" in "pure functional language" means something very different from the "pure" in "pure object-oriented language".4 In each case the "pure vs not pure" distinction is one that is completely uninteresting applied to the other kind of language, so there's no very strong motive to standardise the use of the term.
1 There are corner cases to pick at in all "pure object-oriented" languages that I know of, but that's not really very interesting. It's clear that the object metaphor goes much further in languages in which 1 is an instance of some class, and that class can be sub-classed, than it does in languages in which 1 is something else than an object.
2 All computation is about representation anyway. Computers don't know anything about numbers or anything else. They just have bit-patterns that we use to represent numbers, and operations on bit-patterns that happen to correspond to operations on numbers (because we designed them so that they would).
3 This isn't fundamental either. You could design a "pure" object-oriented language that was pure in this sense. I tend to write most of my OO code to be pure anyway.
4 If this seems obtuse, you might reflect that the terms "functional", "object", and "language" have vastly different meanings in other contexts also.
A very different angle on this question: all sorts of data in Haskell can be represented as functions, using a technique called Church encodings. This is a form of inversion of control: instead of passing data to functions that consume it, you hide the data inside a set of closures, and to consume it you pass in callbacks describing what to do with this data.
Any program that uses lists, for example, can be translated into a program that uses functions instead of lists:
-- | A list corresponds to a function of this type:
type ChurchList a r = (a -> r -> r) --^ how to handle a cons cell
-> r --^ how to handle the empty list
-> r --^ result of processing the list
listToCPS :: [a] -> ChurchList a r
listToCPS xs = \f z -> foldr f z xs
That function is taking a concrete list as its starting point, but that's not necessary. You can build up ChurchList functions out of just pure functions:
-- | The empty 'ChurchList'.
nil :: ChurchList a r
nil = \f z -> z
-- | Add an element at the front of a 'ChurchList'.
cons :: a -> ChurchList a r -> ChurchList a r
cons x xs = \f z -> f z (xs f z)
foldChurchList :: (a -> r -> r) -> r -> ChurchList a r -> r
foldChurchList f z xs = xs f z
mapChurchList :: (a -> b) -> ChurchList a r -> ChurchList b r
mapChurchList f = foldChurchList step nil
where step x = cons (f x)
filterChurchList :: (a -> Bool) -> ChurchList a r -> ChurchList a r
filterChurchList pred = foldChurchList step nil
where step x xs = if pred x then cons x xs else xs
That last function uses Bool, but of course we can replace Bool with functions as well:
-- | A Bool can be represented as a function that chooses between two
-- given alternatives.
type ChurchBool r = r -> r -> r
true, false :: ChurchBool r
true a _ = a
false _ b = b
filterChurchList' :: (a -> ChurchBool r) -> ChurchList a r -> ChurchList a r
filterChurchList' pred = foldChurchList step nil
where step x xs = pred x (cons x xs) xs
This sort of transformation can be done for basically any type, so in theory, you could get rid of all "value" types in Haskell, and keep only the () type, the (->) and IO type constructors, return and >>= for IO, and a suitable set of IO primitives. This would obviously be hella impractical—and it would perform worse (try writing tailChurchList :: ChurchList a r -> ChurchList a r for a taste).
Is getChar :: IO Char a function or not? Haskell Report doesn't provide us with a definition. But it states that getChar is a function (see here). (Well, at least we can say that it is a function.)
So I think the answer is YES.
I don't think there can be correct definition of "function" except "everything is a function". (What is "correct definition"? Good question...) Consider the next example:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Control.Applicative
f :: Applicative f => f Int
f = pure 1
g1 :: Maybe Int
g1 = f
g2 :: Int -> Int
g2 = f
Is f a function or datatype? It depends.

What are the most interesting equivalences arising from the Curry-Howard Isomorphism?

I came upon the Curry-Howard Isomorphism relatively late in my programming life, and perhaps this contributes to my being utterly fascinated by it. It implies that for every programming concept there exists a precise analogue in formal logic, and vice versa. Here's a "basic" list of such analogies, off the top of my head:
program/definition | proof
type/declaration | proposition
inhabited type | theorem/lemma
function | implication
function argument | hypothesis/antecedent
function result | conclusion/consequent
function application | modus ponens
recursion | induction
identity function | tautology
non-terminating function | absurdity/contradiction
tuple | conjunction (and)
disjoint union | disjunction (or) -- corrected by Antal S-Z
parametric polymorphism | universal quantification
So, to my question: what are some of the more interesting/obscure implications of this isomorphism? I'm no logician so I'm sure I've only scratched the surface with this list.
For example, here are some programming notions for which I'm unaware of pithy names in logic:
currying | "((a & b) => c) iff (a => (b => c))"
scope | "known theory + hypotheses"
And here are some logical concepts which I haven't quite pinned down in programming terms:
primitive type? | axiom
set of valid programs? | theory
Edit:
Here are some more equivalences collected from the responses:
function composition | syllogism -- from Apocalisp
continuation-passing | double negation -- from camccann
Since you explicitly asked for the most interesting and obscure ones:
You can extend C-H to many interesting logics and formulations of logics to obtain a really wide variety of correspondences. Here I've tried to focus on some of the more interesting ones rather than on the obscure, plus a couple of fundamental ones that haven't come up yet.
evaluation | proof normalisation/cut-elimination
variable | assumption
S K combinators | axiomatic formulation of logic
pattern matching | left-sequent rules
subtyping | implicit entailment (not reflected in expressions)
intersection types | implicit conjunction
union types | implicit disjunction
open code | temporal next
closed code | necessity
effects | possibility
reachable state | possible world
monadic metalanguage | lax logic
non-termination | truth in an unobservable possible world
distributed programs | modal logic S5/Hybrid logic
meta variables | modal assumptions
explicit substitutions | contextual modal necessity
pi-calculus | linear logic
EDIT: A reference I'd recommend to anyone interested in learning more about extensions of C-H:
"A Judgmental Reconstruction of Modal Logic" http://www.cs.cmu.edu/~fp/papers/mscs00.pdf - this is a great place to start because it starts from first principles and much of it is aimed to be accessible to non-logicians/language theorists. (I'm the second author though, so I'm biased.)
You're muddying things a little bit regarding nontermination. Falsity is represented by uninhabited types, which by definition can't be non-terminating because there's nothing of that type to evaluate in the first place.
Non-termination represents contradiction--an inconsistent logic. An inconsistent logic will of course allow you to prove anything, including falsity, however.
Ignoring inconsistencies, type systems typically correspond to an intuitionistic logic, and are by necessity constructivist, which means certain pieces of classical logic can't be expressed directly, if at all. On the other hand this is useful, because if a type is a valid constructive proof, then a term of that type is a means of constructing whatever you've proven the existence of.
A major feature of the constructivist flavor is that double negation is not equivalent to non-negation. In fact, negation is rarely a primitive in a type system, so instead we can represent it as implying falsehood, e.g., not P becomes P -> Falsity. Double negation would thus be a function with type (P -> Falsity) -> Falsity, which clearly is not equivalent to something of just type P.
However, there's an interesting twist on this! In a language with parametric polymorphism, type variables range over all possible types, including uninhabited ones, so a fully polymorphic type such as ∀a. a is, in some sense, almost-false. So what if we write double almost-negation by using polymorphism? We get a type that looks like this: ∀a. (P -> a) -> a. Is that equivalent to something of type P? Indeed it is, merely apply it to the identity function.
But what's the point? Why write a type like that? Does it mean anything in programming terms? Well, you can think of it as a function that already has something of type P somewhere, and needs you to give it a function that takes P as an argument, with the whole thing being polymorphic in the final result type. In a sense, it represents a suspended computation, waiting for the rest to be provided. In this sense, these suspended computations can be composed together, passed around, invoked, whatever. This should begin to sound familiar to fans of some languages, like Scheme or Ruby--because what it means is that double-negation corresponds to continuation-passing style, and in fact the type I gave above is exactly the continuation monad in Haskell.
Your chart is not quite right; in many cases you have confused types with terms.
function type implication
function proof of implication
function argument proof of hypothesis
function result proof of conclusion
function application RULE modus ponens
recursion n/a [1]
structural induction fold (foldr for lists)
mathematical induction fold for naturals (data N = Z | S N)
identity function proof of A -> A, for all A
non-terminating function n/a [2]
tuple normal proof of conjunction
sum disjunction
n/a [3] first-order universal quantification
parametric polymorphism second-order universal quantification
currying (A,B) -> C -||- A -> (B -> C), for all A,B,C
primitive type axiom
types of typeable terms theory
function composition syllogism
substitution cut rule
value normal proof
[1] The logic for a Turing-complete functional language is inconsistent. Recursion has no correspondence in consistent theories. In an inconsistent logic/unsound proof theory you could call it a rule which causes inconsistency/unsoundness.
[2] Again, this is a consequence of completeness. This would be a proof of an anti-theorem if the logic were consistent -- thus, it can't exist.
[3] Doesn't exist in functional languages, since they elide first-order logical features: all quantification and parametrization is done over formulae. If you had first-order features, there would be a kind other than *, * -> *, etc.; the kind of elements of the domain of discourse. For example, in Father(X,Y) :- Parent(X,Y), Male(X), X and Y range over the domain of discourse (call it Dom), and Male :: Dom -> *.
function composition | syllogism
I really like this question. I don't know a whole lot, but I do have a few things (assisted by the Wikipedia article, which has some neat tables and such itself):
I think that sum types/union types (e.g. data Either a b = Left a | Right b) are equivalent to inclusive disjunction. And, though I'm not very well acquainted with Curry-Howard, I think this demonstrates it. Consider the following function:
andImpliesOr :: (a,b) -> Either a b
andImpliesOr (a,_) = Left a
If I understand things correctly, the type says that (a ∧ b) → (a ★ b) and the definition says that this is true, where ★ is either inclusive or exclusive or, whichever Either represents. You have Either representing exclusive or, ⊕; however, (a ∧ b) ↛ (a ⊕ b). For instance, ⊤ ∧ ⊤ ≡ ⊤, but ⊤ ⊕ ⊥ ≡ ⊥, and ⊤ ↛ ⊥. In other words, if both a and b are true, then the hypothesis is true but the conclusion is false, and so this implication must be false. However, clearly, (a ∧ b) → (a ∨ b), since if both a and b are true, then at least one is true. Thus, if discriminated unions are some form of disjunction, they must be the inclusive variety. I think this holds as a proof, but feel more than free to disabuse me of this notion.
Similarly, your definitions for tautology and absurdity as the identity function and non-terminating functions, respectively, are a bit off. The true formula is represented by the unit type, which is the type which has only one element (data ⊤ = ⊤; often spelled () and/or Unit in functional programming languages). This makes sense: since that type is guaranteed to be inhabited, and since there's only one possible inhabitant, it must be true. The identity function just represents the particular tautology that a → a.
Your comment about non-terminating functions is, depending on what precisely you meant, more off. Curry-Howard functions on the type system, but non-termination is not encoded there. According to Wikipedia, dealing with non-termination is an issue, as adding it produces inconsistent logics (e.g., I can define wrong :: a -> b by wrong x = wrong x, and thus “prove” that a → b for any a and b). If this is what you meant by “absurdity”, then you're exactly correct. If instead you meant the false statement, then what you want instead is any uninhabited type, e.g. something defined by data ⊥—that is, a data type without any way to construct it. This ensures that it has no values at all, and so it must be uninhabited, which is equivalent to false. I think you could probably also use a -> b, since if we forbid non-terminating functions, then this is also uninhabited, but I'm not 100% sure.
Wikipedia says that axioms are encoded in two different ways, depending on how you interpret Curry-Howard: either in the combinators or in the variables. I think the combinator view means that the primitive functions we are given encode the things we can say by default (similar to the way that modus ponens is an axiom because function application is primitive). And I think that the variable view may actually mean the same thing—combinators, after all, are just global variables which are particular functions. As for primitive types: if I'm thinking about this correctly, then I think that primitive types are the entities—the primitive objects that we're trying to prove things about.
According to my logic and semantics class, the fact that (a ∧ b) → c ≡ a → (b → c) (and also that b → (a → c)) is called the exportation equivalence law, at least in natural deduction proofs. I didn't notice at the time that it was just currying—I wish I had, because that's cool!
While we now have a way to represent inclusive disjunction, we don't have a way to represent the exclusive variety. We should be able to use the definition of exclusive disjunction to represent it: a ⊕ b ≡ (a ∨ b) ∧ ¬(a ∧ b). I don't know how to write negation, but I do know that ¬p ≡ p → ⊥, and both implication and falsehood are easy. We should thus able to represent exclusive disjunction by:
data ⊥
data Xor a b = Xor (Either a b) ((a,b) -> ⊥)
This defines ⊥ to be the empty type with no values, which corresponds to falsity; Xor is then defined to contain both (and) Either an a or a b (or) and a function (implication) from (a,b) (and) to the bottom type (false). However, I have no idea what this means. (Edit 1: Now I do, see the next paragraph!) Since there are no values of type (a,b) -> ⊥ (are there?), I can't fathom what this would mean in a program. Does anyone know a better way to think about either this definition or another one? (Edit 1: Yes, camccann.)
Edit 1: Thanks to camccann's answer (more particularly, the comments he left on it to help me out), I think I see what's going on here. To construct a value of type Xor a b, you need to provide two things. First, a witness to the existence of an element of either a or b as the first argument; that is, a Left a or a Right b. And second, a proof that there are not elements of both types a and b—in other words, a proof that (a,b) is uninhabited—as the second argument. Since you'll only be able to write a function from (a,b) -> ⊥ if (a,b) is uninhabited, what does it mean for that to be the case? That would mean that some part of an object of type (a,b) could not be constructed; in other words, that at least one, and possibly both, of a and b are uninhabited as well! In this case, if we're thinking about pattern matching, you couldn't possibly pattern-match on such a tuple: supposing that b is uninhabited, what would we write that could match the second part of that tuple? Thus, we cannot pattern match against it, which may help you see why this makes it uninhabited. Now, the only way to have a total function which takes no arguments (as this one must, since (a,b) is uninhabited) is for the result to be of an uninhabited type too—if we're thinking about this from a pattern-matching perspective, this means that even though the function has no cases, there's no possible body it could have either, and so everything's OK.
A lot of this is me thinking aloud/proving (hopefully) things on the fly, but I hope it's useful. I really recommend the Wikipedia article; I haven't read through it in any sort of detail, but its tables are a really nice summary, and it's very thorough.
Here's a slightly obscure one that I'm surprised wasn't brought up earlier: "classical" functional reactive programming corresponds to temporal logic.
Of course, unless you're a philosopher, mathematician or obsessive functional programmer, this probably brings up several more questions.
So, first off: what is functional reactive programming? It's a declarative way to work with time-varying values. This is useful for writing things like user interfaces because inputs from the user are values that vary over time. "Classical" FRP has two basic data types: events and behaviors.
Events represent values which only exist at discrete times. Keystrokes are a great example: you can think of the inputs from the keyboard as a character at a given time. Each keypress is then just a pair with the character of the key and the time it was pressed.
Behaviors are values that exist constantly but can be changing continuously. The mouse position is a great example: it is just a behavior of x, y coordinates. After all, the mouse always has a position and, conceptually, this position changes continually as you move the mouse. After all, moving the mouse is a single protracted action, not a bunch of discrete steps.
And what is temporal logic? Appropriately enough, it's a set of logical rules for dealing with propositions quantified over time. Essentially, it extends normal first-order logic with two quantifiers: □ and ◇. The first means "always": read □φ as "φ always holds". The second is "eventually": ◇φ means that "φ will eventually hold". This is a particular kind of modal logic. The following two laws relate the quantifiers:
□φ ⇔ ¬◇¬φ
◇φ ⇔ ¬□¬φ
So □ and ◇ are dual to each other in the same way as ∀ and ∃.
These two quantifiers correspond to the two types in FRP. In particular, □ corresponds to behaviors and ◇ corresponds to events. If we think about how these types are inhabited, this should make sense: a behavior is inhabited at every possible time, while an event only happens once.
Related to the relationship between continuations and double negation, the type of call/cc is Peirce's law http://en.wikipedia.org/wiki/Call-with-current-continuation
C-H is usually stated as correspondence between intuitionistic logic and programs. However if we add the call-with-current-continuation (callCC) operator (whose type corresponds to Peirce's law), we get a correspondence between classical logic and programs with callCC.
2-continuation | Sheffer stoke
n-continuation language | Existential graph
Recursion | Mathematical Induction
One thing that is important, but have not yet being investigated is the relationship of 2-continuation (continuations that takes 2 parameters) and Sheffer stroke. In classic logic, Sheffer stroke can form a complete logic system by itself (plus some non-operator concepts). Which means the familiar and, or, not can be implemented using only the Sheffer stoke or nand.
This is an important fact of its programming type correspondence because it prompts that a single type combinator can be used to form all other types.
The type signature of a 2-continuation is (a,b) -> Void. By this implementation we can define 1-continuation (normal continuations) as (a,a) -> Void, product type as ((a,b)->Void,(a,b)->Void)->Void, sum type as ((a,a)->Void,(b,b)->Void)->Void. This gives us an impressive of its power of expressiveness.
If we dig further, we will find out that Piece's existential graph is equivalent to a language with the only data type is n-continuation, but I didn't see any existing languages is in this form. So inventing one could be interesting, I think.
While it's not a simple isomorphism, this discussion of constructive LEM is a very interesting result. In particular, in the conclusion section, Oleg Kiselyov discusses how the use of monads to get double-negation elimination in a constructive logic is analogous to distinguishing computationally decidable propositions (for which LEM is valid in a constructive setting) from all propositions. The notion that monads capture computational effects is an old one, but this instance of the Curry--Howard isomorphism helps put it in perspective and helps get at what double-negation really "means".
First-class continuations support allows you to express $P \lor \neg P$.
The trick is based on the fact that not calling the continuation and exiting with some expression is equivalent to calling the continuation with that same expression.
For more detailed view please see: http://www.cs.cmu.edu/~rwh/courses/logic/www-old/handouts/callcc.pdf

How do I implement graphs and graph algorithms in a functional programming language?

Basically, I know how to create graph data structures and use Dijkstra's algorithm in programming languages where side effects are allowed. Typically, graph algorithms use a structure to mark certain nodes as 'visited', but this has side effects, which I'm trying to avoid.
I can think of one way to implement this in a functional language, but it basically requires passing around large amounts of state to different functions, and I'm wondering if there is a more space-efficient solution.
You might check out how Martin Erwig's Haskell functional graph library does things. For instance, its shortest-path functions are all pure, and you can see the source code for how it's implemented.
Another option, like fmark mentioned, is to use an abstraction which allows you to implement pure functions in terms of state. He mentions the State monad (which is available in both lazy and strict varieties). Another option, if you're working in the GHC Haskell compiler/interpreter (or, I think, any Haskell implementation which supports rank-2 types), another option is the ST monad, which allows you to write pure functions which deal with mutable variables internally.
If you were using haskell, the only functional language with which I am familiar, I would recommend using the State monad. The State monad is an abstraction for a function that takes a state and returns an intermediate value and some new state value. This is considered idiomatic haskell for those situations where maintaining a large state is necessary.
It is a much nicer alternative to the naive "return state as a function result and pass it as a parameter" idiom that is emphasized in beginner functional programming tutorials. I imagine most functional programming languages have a similar construct.
I just keep the visited set as a set and pass it as a parameter. There are efficient log-time implementations of sets of any ordered type and extra-efficient sets of integers.
To represent a graph I use adjacency lists, or I'll use a finite map that maps each node to a list of its successors. It depends what I want to do.
Rather than Abelson and Sussman, I recommend Chris Okasaki's Purely Functional Data Structures. I've linked to Chris's dissertation, but if you have the money, he expanded it into an excellent book.
Just for grins, here's a slightly scary reverse postorder depth-first search done in continuation-passing style in Haskell. This is straight out of the Hoopl optimizer library:
postorder_dfs_from_except :: forall block e . (NonLocal block, LabelsPtr e)
=> LabelMap (block C C) -> e -> LabelSet -> [block C C]
postorder_dfs_from_except blocks b visited =
vchildren (get_children b) (\acc _visited -> acc) [] visited
where
vnode :: block C C -> ([block C C] -> LabelSet -> a)
-> ([block C C] -> LabelSet -> a)
vnode block cont acc visited =
if setMember id visited then
cont acc visited
else
let cont' acc visited = cont (block:acc) visited in
vchildren (get_children block) cont' acc (setInsert id visited)
where id = entryLabel block
vchildren bs cont acc visited = next bs acc visited
where next children acc visited =
case children of [] -> cont acc visited
(b:bs) -> vnode b (next bs) acc visited
get_children block = foldr add_id [] $ targetLabels bloc
add_id id rst = case lookupFact id blocks of
Just b -> b : rst
Nothing -> rst
Here is a Swift example. You might find this a bit more readable. The variables are actually descriptively named, unlike the super cryptic Haskell examples.
https://github.com/gistya/Functional-Swift-Graph
Most functional languages support inner functions. So you can just create your graph representation in the outermost layer and just reference it from the inner function.
This book covers it extensively http://www.amazon.com/gp/product/0262510871/ref=pd_lpo_k2_dp_sr_1?ie=UTF8&cloe_id=aa7c71b1-f0f7-4fca-8003-525e801b8d46&attrMsgId=LPWidget-A1&pf_rd_p=486539851&pf_rd_s=lpo-top-stripe-1&pf_rd_t=201&pf_rd_i=0262011530&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=114DJE8K5BG75B86E1QS
I would love to hear about some really clever technique, but I think there are two fundamental approaches:
Modify some global state object. i.e. side-effects
Pass the graph as an argument to your functions with the return value being the modified graph. I assume this is your approach of "passing around large amounts of state"
That is what's done in functional programming. If the compiler/interpreter is any good, it will help manage memory for you. In particular, you'll want to make sure that you use tail recursion, if you happen to recurse in any of your functions.

Functional Programming for Basic Algorithms

How good is 'pure' functional programming for basic routine implementations, e.g. list sorting, string matching etc.?
It's common to implement such basic functions within the base interpreter of any functional language, which means that they will be written in an imperative language (c/c++). Although there are many exceptions..
At least, I wish to ask: How difficult is it to emulate imperative style while coding in 'pure' functional language?
How good is 'pure' functional
programming for basic routine
implementations, e.g. list sorting,
string matching etc.?
Very. I'll do your problems in Haskell, and I'll be slightly verbose about it. My aim is not to convince you that the problem can be done in 5 characters (it probably can in J!), but rather to give you an idea of the constructs.
import Data.List -- for `sort`
stdlistsorter :: (Ord a) => [a] -> [a]
stdlistsorter list = sort list
Sorting a list using the sort function from Data.List
import Data.List -- for `delete`
selectionsort :: (Ord a) => [a] -> [a]
selectionsort [] = []
selectionsort list = minimum list : (selectionsort . delete (minimum list) $ list)
Selection sort implementation.
quicksort :: (Ord a) => [a] -> [a]
quicksort [] = []
quicksort (x:xs) =
let smallerSorted = quicksort [a | a <- xs, a <= x]
biggerSorted = quicksort [a | a <- xs, a > x]
in smallerSorted ++ [x] ++ biggerSorted
Quick sort implementation.
import Data.List -- for `isInfixOf`
stdstringmatch :: (Eq a) => [a] -> [a] -> Bool
stdstringmatch list1 list2 = list1 `isInfixOf` list2
String matching using isInfixOf function from Data.list
It's common to implement such basic
functions within the base interpreter
of any functional language, which
means that they will be written in an
imperative language (c/c++). Although
there are many exceptions..
Depends. Some functions are more naturally expressed imperatively. However, I hope I have convinced you that some algorithms are also expressed naturally in a functional way.
At least, I wish to ask: How difficult
is it to emulate imperative style
while coding in 'pure' functional
language?
It depends on how hard you find Monads in Haskell. Personally, I find it quite difficult to grasp.
1) Good by what standard? What properties do you desire?
List sorting? Easy. Let's do Quicksort in Haskell:
sort [] = []
sort (x:xs) = sort (filter (< x) xs) ++ [x] ++ sort (filter (>= x) xs)
This code has the advantage of being extremely easy to understand. If the list is empty, it's sorted. Otherwise, call the first element x, find elements less than x and sort them, find elements greater than x and sort those. Then concatenate the sorted lists with x in the middle. Try making that look comprehensible in C++.
Of course, Mergesort is much faster for sorting linked lists, but the code is also 6 times longer.
2) It's extremely easy to implement imperative style while staying purely functional. The essence of imperative style is sequencing of actions. Actions are sequenced in a pure setting by using monads. The essence of monads is the binding function:
(Monad m) => (>>=) :: m a -> (a -> m b) -> m b
This function exists in C++, and it's called ;.
A sequence of actions in Haskell, for example, is written thusly:
putStrLn "What's your name?" >>=
const (getLine >>= \name -> putStrLn ("Hello, " ++ name))
Some syntax sugar is available to make this look more imperative (but note that this is the exact same code):
do {
putStrLn "What's your name?";
name <- getLine;
putStrLn ("Hello, " ++ name);
}
Nearly all functional programming languages have some construct to allow for imperative coding (like do in Haskell). There are many problem domains that can't be solved with "pure" functional programming. One of those is network protocols, for example where you need a series of commands in the right order. And such things don't lend themselves well to pure functional programming.
I have to agree with Lothar, though, that list sorting and string matching are not really examples you need to solve imperatively. There are well-known algorithms for such things and they can be implemented efficiently in functional languages already.
I think that 'algorithms' (e.g. method bodies and basic data structures) are where functional programming is best. Assuming nothing completely IO/state-dependent, functional programming excels are authoring algorithms and data structures, often resulting in shorter/simpler/cleaner code than you'd get with an imperative solution. (Don't emulate imperative style, FP style is better for most of these kinds of tasks.)
You want imperative stuff sometimes to deal with IO or low-level performance, and you want OOP for partitioning the high-level design and architecture of a large program, but "in the small" where you write most of your code, FP is a win.
See also
How does functional programming affect the structure of your code?
It works pretty well the other way round emulating functional with imperative style.
Remember that the internal of an interpreter or VM ware so close to metal and performance critical that you should even consider going to assember level and count the the clock cycles for each instruction (like Smalltalk Dophin is just doing it and the results are impressive).
CPU's are imperative.
But there is no problem to do all the basic algorithm implementation - the one you mention are NOT low level - they are basics.
I don't know about list sorting, but you'd be hard pressed to bootstrapp a language without some kind of string matching in the compiler or runtime. So you need that routine to create the language. As there isn't a great deal of point writing the same code twice, when you create the library for matching strings within the language, you call the code written earlier. The degree to which this happens in successive releases will depend on how self hosting the language is, but unless that's a strong design goal there won't be any reason to change it.

Resources