Why is `join` also known as `mu`?

I noticed in the Idris documentation that join is also known as flatten and mu.
Idris> :doc join
Prelude.Monad.join : Monad m => m (m a) -> m a
Also called flatten or mu
The function is Total
IIRC, mu (or μ) is used as a binder for recursive data types. I've not seen mu in this context before. What's the background on that?

The very short answer: category theory.
The medium length answer:
If you look at the formal definition of a Monad on Wikipedia (sorry, not copying it over since there is no LaTeX support on SO), you'll see "mu" and "eta" used as the two "natural transformations", where mu is the one from T² to T (which makes sense if you look at the type signature, which goes from m (m a) to m a).
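To make the correspondence concrete, here is a small Haskell sketch (mine, not part of the answer) showing that join is the component of mu at a particular type: it collapses T², i.e. m (m a), down to T, i.e. m a:

import Control.Monad (join)
-- mu at the list monad: m (m a) = [[a]] collapses to m a = [a]
muList :: [[a]] -> [a]
muList = join               -- for lists, join is concat
-- mu at the Maybe monad: Maybe (Maybe a) collapses to Maybe a
muMaybe :: Maybe (Maybe a) -> Maybe a
muMaybe = join              -- Just (Just x) becomes Just x, anything else Nothing
main :: IO ()
main = do
  print (muList [[1, 2], [3]])        -- [1,2,3]
  print (muMaybe (Just (Just 'x')))   -- Just 'x'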
The in-depth answer: Monads Made Difficult (see "Natural Transformations")

Decidable equality in Agda with less than n^2 cases?

I've got a datatype for the AST of a programming language that I'd like to reason about, but there are about 10 different constructors for the AST.
data Term : Set where
UnitTerm : Term
VarTerm : Var -> Term
...
SeqTerm : Term -> Term -> Term
I'm trying to write a function that has decidable equality for syntax trees of this language. In theory this is straightforward: there's nothing too complicated, it's just simple data being stored in the AST.
The problem is that writing such a function seems to require about 100 cases: for each constructor, there are 10 cases.
eqDecide : (x : Term) -> (y : Term) -> Dec (x ≡ y)
eqDecide UnitTerm UnitTerm = yes refl
eqDecide UnitTerm (VarTerm x) = Generic.no (λ ())
...
eqDecide UnitTerm (SeqTerm t1 t2) = Generic.no (λ ())
eqDecide (VarTerm x) UnitTerm = Generic.no (λ ())
...
The problem is, there are a bunch of redundant cases. After the first pattern match where the constructors match, ideally I could match with underscore, since no possible other constructor could unify, but it doesn't appear that I can do that.
I've tried and failed to use this library to derive the equality: I'm running into problems with strict positivity, as well as getting some general errors that are pretty hard to debug. The Agda Prelude also has some facility for this, but it looks pretty immature and is lacking some things that I need from the standard library.
How do people do decidable equality in practice? Do they suck it up and just write all 100 cases, or is there a trick I'm missing? Is this just a place where the newness of the language is showing through?
If you want to avoid using reflection and still prove decidable equality in a linear number of cases, you could try the following approach:
Define a function embed : Term → Nat (or to some other type for which decidable equality is easier to prove such as labelled trees).
Prove that your function is indeed injective.
Make use of the fact that your function is injective together with decidable equality on the result type to conclude decidable equality on Terms (see for example via-injection in the module Relation.Nullary.Decidable in the standard library).
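For what it's worth, here is a very rough Haskell analogy of that approach (the types and names are mine, and of course the Haskell version cannot carry the injectivity proof that the Agda one needs): equality on Term is reduced to equality on an encoding, so only one case per constructor has to be written.

-- Hypothetical miniature AST standing in for the 10-constructor Term
data Term = UnitTerm | VarTerm Int | SeqTerm Term Term
  deriving Show
-- A labelled tree with easy structural equality, playing the role of the
-- embedding's target type
data Tree = Node Int [Tree]
  deriving Eq
-- One case per constructor; in Agda you would additionally prove this injective
encode :: Term -> Tree
encode UnitTerm      = Node 0 []
encode (VarTerm v)   = Node 1 [Node v []]
encode (SeqTerm a b) = Node 2 [encode a, encode b]
-- Equality on Term via equality on the encoding
eqTerm :: Term -> Term -> Bool
eqTerm x y = encode x == encode y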

Convergence and vectors theories

Is there a convergence theory in Isabelle/HOL? I need to define ∥x(t)∥ ⟶ 0 as t ⟶ ∞.
Also, I'm looking for a theory of vectors; I found a matrix theory, but I couldn't find one for vectors. Does such a theory exist in Isabelle/HOL?
Cheers.
Convergence etc. are expressed with filters in Isabelle. (See the corresponding documentation)
In your case, that would be something like
filterlim (λt. norm (x t)) (nhds 0) at_top
or, using the tendsto abbreviation,
((λt. norm (x t)) ⤏ 0) at_top
where ⤏ is the Isabelle symbol \<longlongrightarrow>, which can be input using the abbreviation --->.
As a side note, I am wondering why you are writing it that way in the first place, seeing as it is equivalent to
filterlim x (nhds 0) at_top
or, with the tendsto syntax:
(x ⤏ 0) at_top
Reasoning with these filters can be tricky at first, but it has the advantage of providing a unified framework for limits and other topological concepts, and once you get the hang of it, it is very elegant.
As for vectors, just import ~~/src/HOL/Analysis/Analysis. That should have everything you need. Ideally, build the HOL-Analysis session image by starting Isabelle/jEdit with isabelle jedit -l HOL-Analysis. Then you won't have to process all of Isabelle's analysis library every time you start the system.
I assume that by ‘vectors’ you mean concrete finite-dimensional real vector spaces like ℝn. This is provided by ~~/src/HOL/Analysis/Finite_Cartesian_Product.thy, which is part of HOL-Analysis. This provides the vec type, which takes two parameters: the component type (probably real in your case) and the index type, which specifies the dimension of the vector space.
There is also a pre-defined type n for every positive integer n, so that you can write e.g. (real, 3) vec for the vector space ℝ³. There is also type syntax so that you can write 'a ^ 'n for ('a, 'n) vec.

Isabelle/HOL foundations

I have seen a lot of documentation about Isabelle's syntax and proof strategies. However, little have I found about its foundations. I have a few questions that I would be very grateful if someone could take the time to answer:
1) Why doesn't Isabelle/HOL admit functions that do not terminate? Many other languages such as Haskell do admit non-terminating functions.
2) What symbols are part of Isabelle's meta-language? I read that there are symbols in the meta-language for Universal Quantification (/\) and for implication (==>). However, these symbols have their counterpart in the object-level language (∀ and -->). I understand that --> is an object-level function of type bool => bool => bool. However, how are ∀ and ∃ defined? Are they object-level Boolean functions? If so, they are not computable (considering infinite domains). I noticed that I am able to write Boolean functions in terms of ∀ and ∃, but they are not computable. So what are ∀ and ∃? Are they part of the object-level? If so, how are they defined?
3) Are Isabelle theorems just Boolean expressions? Then Booleans are part of the meta-language?
4) As far as I know, Isabelle is a strict programming language. How can I use infinite objects? Let's say, infinite lists. Is it possible in Isabelle/HOL?
Sorry if these questions are very basic. I do not seem to find a good tutorial on Isabelle's meta-theory. I would love if someone could recommend me a good tutorial on these topics.
Thank you very much.
1) You can define non-terminating (i.e. partial) functions in Isabelle (cf. Function package manual (section 8)). However, partial functions are more difficult to reason about, because whenever you want to use its definition equations (the psimps rules, which replace the simps rules of a normal function), you have to show that the function terminates on that particular input first.
In general, things like non-definedness and non-termination are always problematic in a logic – consider, for instance, the function ‘definition’ f x = f x + 1. If we were to take this as an equation on ℤ (integers), we could subtract f x from both sides and get 0 = 1. In Haskell, this problem is ‘solved’ by saying that this is not an equation on ℤ, but rather on ℤ ∪ {⊥} (the integers plus bottom) and the non-terminating function f evaluates to ⊥, and ‘⊥ + 1 = ⊥’, so everything works out fine.
However, if every single expression in your logic could potentially evaluate to ⊥ instead of a ‘proper’ value, reasoning in this logic will become very tedious. This is why Isabelle/HOL chooses to restrict itself to total functions; things like partiality have to be emulated with things like undefined (which is an arbitrary value that you know nothing about) or option types.
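To see this concretely in Haskell, here is a minimal sketch (mine): GHC accepts the ‘definition’ from above, because the equation is read over ℤ ∪ {⊥} and f simply denotes ⊥.

-- Accepted by GHC: f x denotes bottom, and bottom + 1 is still bottom,
-- so the equation is consistent, but any call to f diverges.
f :: Integer -> Integer
f x = f x + 1
main :: IO ()
main = print (f 0)   -- typechecks, never produces output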
2) I'm not an expert on Isabelle/Pure (the meta logic), but the most important symbols are definitely
⋀ (the universal meta quantifier)
⟹ (meta implication)
≡ (meta equality)
&&& (meta conjunction, defined in terms of ⟹)
Pure.term, Pure.prop, Pure.type, Pure.dummy_pattern, Pure.sort_constraint, which fulfil certain internal functions that I don't know much about.
You can find some information on this in the Isabelle/Isar Reference Manual in section 2.1, and probably more elsewhere in the manual.
Everything else (that includes ∀ and ∃, which indeed operate on boolean expressions) is defined in the object logic (HOL, usually). You can find the definitions, or rather the axiomatisations, in ~~/src/HOL/HOL.thy (where ~~ denotes the Isabelle root directory):
All_def: "All P ≡ (P = (λx. True))"
Ex_def: "Ex P ≡ ∀Q. (∀x. P x ⟶ Q) ⟶ Q"
Also note that many, if not most Isabelle functions are typically not computable. Isabelle is not a programming language, although it does have a code generator that allows exporting Isabelle functions as code to programming languages as long as you can give code equations for all the functions involved.
3) Isabelle theorems are a complex datatype (cf. ~~/src/Pure/thm.ML) containing a lot of information, but the most important part, of course, is the proposition. A proposition is something from Isabelle/Pure, which in fact only has propositions and functions (and itself and dummy, but you can ignore those).
Propositions are not booleans – in fact, there isn't even a way to state that a proposition does not hold in Isabelle/Pure.
HOL then defines (or rather axiomatises) booleans and also axiomatises a coercion from booleans to propositions: Trueprop :: bool ⇒ prop
4) Isabelle is not a programming language, and apart from that, totality does not mean you have to restrict yourself to finite structures. Even in a total programming language, you can have infinite lists (cf. Idris's codata).
Isabelle is a theorem prover, and logically, infinite objects can be treated by axiomatising them and then reasoning about them using the axioms and rules that you have.
For instance, HOL assumes the existence of an infinite type and defines the natural numbers on that. That already gives you access to functions nat ⇒ 'a, which are essentially infinite lists.
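To illustrate the remark that functions nat ⇒ 'a are essentially infinite lists, here is a small Haskell sketch (names mine; Haskell, being lazy, also has genuine infinite lists built in):

import Numeric.Natural (Natural)
-- An "infinite list" given as a function from indices, like nat => 'a
evens :: Natural -> Natural
evens n = 2 * n
-- The same object as an ordinary lazy, infinite Haskell list
evensList :: [Natural]
evensList = map evens [0 ..]
main :: IO ()
main = print (take 5 evensList)   -- [0,2,4,6,8]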
You can also define infinite lists and other infinite data structures as codatatypes with the (co-)datatype package, which is based on bounded natural functors.
Let me add some points to two of your questions.
1) Why doesn't Isabelle/HOL admit functions that do not terminate? Many other languages such as Haskell do admit non-terminating functions.
In short: Isabelle/HOL does not require termination, but totality (i.e., there is a specific result for each input to the function) of functions. Totality does not mean that a function is actually terminating when transcribed to a (functional) programming language or even that it is computable at all.
Therefore, talking about termination is somewhat misleading, even though it is encouraged by the fact that Isabelle/HOL's function package uses the keyword termination for proving some property P about which I will have to say a little more below.
On the one hand the term "termination" might sound more intuitive to a wider audience. On the other hand, a more precise description of P would be well-foundedness of the function's call graph.
Don't get me wrong, termination is not really a bad name for the property P, it is even justified by the fact that many techniques that are implemented in the function package are very close to termination techniques from term rewriting or functional programming (like the size-change principle, dependency pairs, lexicographic orders, etc.).
I'm just saying that it can be misleading. The answer to why that is the case also touches on question 4 of the OP.
4) As far as I know Isabelle is a strict programming language. How can I use infinite objects? Let's say, infinite lists. Is it possible in Isabelle/HOL?
Isabelle/HOL is not a programming language and it specifically does not have any evaluation strategy (we could alternatively say: it has any evaluation strategy you like).
And here is why the word termination is misleading (drum roll): if there is no evaluation strategy and we have termination of a function f, people might expect f to terminate independent of the used strategy. But this is not the case. A termination proof of a function rather ensures that f is well-defined. Even if f is computable a proof of P merely ensures that there is an evaluation strategy for which f terminates.
(As an aside: what I call "strategy" here, is typically influenced by so called cong-rules (i.e., congruence rules) in Isabelle/HOL.)
As an example, it is trivial to prove that the function (see Section 10.1 Congruence rules and evaluation order in the documentation of the function package):
fun f' :: "nat ⇒ bool"
where
"f' n ⟷ f' (n - 1) ∨ n = 0"
terminates (in the sense defined by termination) after adding the cong-rule:
lemma [fundef_cong]:
"Q = Q' ⟹ (¬ Q' ⟹ P = P') ⟹ (P ∨ Q) = (P' ∨ Q')"
by auto
This rule essentially states that logical-or should be "evaluated" from right to left. However, if you write the same function in, e.g., OCaml, it causes a stack overflow ...
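To see the evaluation-order point in an actual programming language, here is a rough Haskell transcription (mine, not from the answer). Haskell's (||) short-circuits left to right, so with the recursive call on the left the function loops just like the OCaml version, while flipping the disjuncts (the effect the right-to-left cong-rule has in Isabelle) makes it terminate for non-negative inputs:

-- Recursive call evaluated first: never terminates
fBad :: Integer -> Bool
fBad n = fBad (n - 1) || n == 0
-- Test evaluated first: terminates for every n >= 0
fGood :: Integer -> Bool
fGood n = n == 0 || fGood (n - 1)
main :: IO ()
main = print (fGood 5)   -- True; replacing fGood with fBad here would loop forever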
EDIT: this answer is not really correct, check out Lars' comment below.
Unfortunately I don't have enough reputation to post this as a comment, so here is my go at an answer (please bear in mind I am no expert in Isabelle, but I also had similar questions once):
1) The idea is to prove statements about the defined functions. I am not sure how familiar you are with Computability Theory, but think about the Halting Problem and the fact that most undecidability problems stem from it (such as the Acceptance Problem). Imagine defining a function which you can't prove terminates. How could you then still prove that it returns the number 42 when given the input "ABC" and does not go into an infinite loop?
If instead you limit yourself to terminating functions, you can prove much more about them, essentially making a trade-off (or at least this is how I see it).
These ideas stem from Constructivism and Intuitionism, and I recommend you check out Robert Harper's very interesting lecture series on Type Theory: https://www.youtube.com/watch?v=9SnefrwBIDc&list=PLGCr8P_YncjXRzdGq2SjKv5F2J8HUFeqN
You should especially check out the part about the absence of the Law of Excluded Middle: http://youtu.be/3JHTb6b1to8?t=15m34s
2) See Manuel's answer.
3,4) Again see Manuel's answer keeping in mind Intuitionistic logic: "the fundamental entity is not the boolean, but rather the proof that something is true".
For me it took a long time to get adjusted to this way of thinking and I'm still not sure I understand it. I think the key though is to understand it is a more-or-less completely different way of thinking.

Idris vectors vs linked lists

Does Idris do any kind of optimization under the hood for vectors? Because from the looks of it, an Idris vector is just a linked list whose length is known at compile time. In fact, in general it seems like you could express the following equivalence (I'm guessing a bit at the syntax):
Vector : Nat -> Type -> Type
Vector n t = (l: List t ** length l = n)
So while this is nice in the sense of preventing range errors, the real advantage of vectors (in the traditional usage of the term) is in terms of performance; in particular, O(1) random access. It seems that the Idris vector would not support this (how would you write the indexing function to have this performance?).
Assuming that there's no wizardry going on under the hood (as happens with Nat) to reconfigure Vectors, is there a random-access data type in Idris?
How would be/is such a thing defined in an algebraic type system? Certainly it seems like it would be impossible to define it inductively.
Is it possible, within a type system like that of Idris, to create a data type which supports O(1) random access, and is aware of its length such that all access is provably valid? (Haskell has array-style Vectors, but their concrete implementation is opaque to the average user, including me)
It doesn't do anything to optimise Vector lookups (at the time of writing this answer, at least).
This isn't because of any difficulty in doing it, really, but more because I would rather have some kind of general framework for writing this kind of optimisation than hard coding lots of them. Admittedly, we already have hard coded optimisations for Nat, but I still would prefer not to add loads more in an ad-hoc fashion.
Depending on what you actually want it for, it might be that the experimental uniqueness type system will help, in that you could have a low level mutable thing under the hood and still have safe and efficient access and update in a pure style in the high level language. We'll see...
Edwin has the definitive answers on what Idris currently does. However, if you are looking for something which might be natural to optimize into constant-time lookup in some cases, the following might be a step in the right direction.
For compile-time fixed-size vectors (i.e., not under a lambda, not parameterized by length at top-level), the following structure gives you vectors and lookup functions that, for any fixed concrete length, can be compile-time normalized to functions that should be somewhat straightforwardly optimizable into constant-time functions. (Sorry, code is in Coq; I don't have a working version of Idris at the moment, and don't know it well. I'm happy to replace this with Idris code if someone suggests the right syntax, e.g., in a comment.)
Fixpoint vector (n : nat) (A : Type) :=
  match n return Type with
  | 0 => unit
  | S n' => (A * vector n' A)%type
  end.
Definition nil {A} : vector 0 A := tt.
Definition cons {n} {A} (x : A) (xs : vector n A) : vector (S n) A
  := (x, xs).
Fixpoint get {n} {A} (m : nat) (default : A) (v : vector n A) {struct n} : A
  := match n as n return vector n A -> A with
     | 0 => fun _ => default
     | S n' => match m with
               | 0 => fun v => fst v
               | S m' => fun v => @get n' A m' default (snd v)
               end
     end v.
The idea is that, for any fixed n, the normal form of get is non-recursive, so the compiler could, hypothetically, compile it into a function whose runtime is independent of what n happens to be.

What is so special about Monads?

A monad is a mathematical structure which is heavily used in (pure) functional programming, basically Haskell. However, there are many other mathematical structures available, like for example applicative functors, strong monads, or monoids. Some are more specific, some are more generic. Yet, monads are much more popular. Why is that?
One explanation I came up with is that they are a sweet spot between genericity and specificity. This means monads capture enough assumptions about the data to apply the algorithms we typically use, and the data we usually have fulfills the monadic laws.
Another explanation could be that Haskell provides syntax for monads (do-notation), but not for other structures, which means Haskell programmers (and thus functional programming researchers) are intuitively drawn towards monads, where a more generic or specific (efficient) function would work as well.
I suspect that the disproportionately large attention given to this one particular type class (Monad) over the many others is mainly a historical fluke. People often associate IO with Monad, although the two are independently useful ideas (as are list reversal and bananas). Because IO is magical (having an implementation but no denotation) and Monad is often associated with IO, it's easy to fall into magical thinking about Monad.
(Aside: it's questionable whether IO even is a monad. Do the monad laws hold? What do the laws even mean for IO, i.e., what does equality mean? Note the problematic association with the state monad.)
If a type m :: * -> * has a Monad instance, you get Turing-complete composition of functions with type a -> m b. This is a fantastically useful property. You get the ability to abstract various Turing-complete control flows away from specific meanings. It's a minimal composition pattern that supports abstracting any control flow for working with types that support it.
Compare this to Applicative, for instance. There, you get only composition patterns with computational power equivalent to a push-down automaton. Of course, it's true that more types support composition with more limited power. And it's true that when you limit the power available, you can do additional optimizations. These two reasons are why the Applicative class exists and is useful. But things that can be instances of Monad usually are, so that users of the type can perform the most general operations possible with the type.
Edit:
By popular demand, here are some functions using the Monad class:
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM c x y = c >>= \z -> if z then x else y
whileM :: Monad m => (a -> m Bool) -> (a -> m a) -> a -> m a
whileM p step x = ifM (p x) (step x >>= whileM p step) (return x)
(*&&) :: Monad m => m Bool -> m Bool -> m Bool
x *&& y = ifM x y (return False)
(*||) :: Monad m => m Bool -> m Bool -> m Bool
x *|| y = ifM x (return True) y
notM :: Monad m => m Bool -> m Bool
notM x = x >>= return . not
Combining those with do syntax (or the raw >>= operator) gives you name binding, indefinite looping, and complete boolean logic. That's a well-known set of primitives sufficient to give Turing completeness. Note how all the functions have been lifted to work on monadic values, rather than simple values. All monadic effects are bound only when necessary - only the effects from the chosen branch of ifM are bound into its final value. Both *&& and *|| ignore their second argument when possible. And so on.
Now, those type signatures may not involve functions for every monadic operand, but that's just a cognitive simplification. There would be no semantic difference, ignoring bottoms, if all the non-function arguments and results were changed to () -> m a. It's just friendlier to users to optimize that cognitive overhead out.
Now, let's look at what happens to those functions with the Applicative interface.
ifA :: Applicative f => f Bool -> f a -> f a -> f a
ifA c x y = (\c' x' y' -> if c' then x' else y') <$> c <*> x <*> y
Well, uh. It got the same type signature. But there's a really big problem here already. The effects of both x and y are bound into the composed structure, regardless of which one's value is selected.
whileA :: Applicative f => (a -> f Bool) -> (a -> f a) -> a -> f a
whileA p step x = ifA (p x) (whileA p step <$> step x) (pure x)
Well, ok, that seems like it'd be ok, except for the fact that it's an infinite loop because ifA will always execute both branches... Except it's not even that close. pure x has the type f a. whileA p step <$> step x has the type f (f a). This isn't even an infinite loop. It's a compile error. Let's try again..
whileA :: Applicative f => (a -> f Bool) -> (a -> f a) -> a -> f a
whileA p step x = ifA (p x) (whileA p step <*> step x) (pure x)
Well shoot. Don't even get that far. whileA p step has the type a -> f a. If you try to use it as the first argument to <*>, it grabs the Applicative instance for the top type constructor, which is (->), not f. Yeah, this isn't gonna work either.
In fact, the only function from my Monad examples that would work with the Applicative interface is notM. That particular function works just fine with only a Functor interface, in fact. The rest? They fail.
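For completeness, a one-line sketch of that Functor version (name mine):

notF :: Functor f => f Bool -> f Bool
notF = fmap not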
Of course it's to be expected that you can write code using the Monad interface that you can't with the Applicative interface. It is strictly more powerful, after all. But what's interesting is what you lose. You lose the ability to compose functions that change what effects they have based on their input. That is, you lose the ability to write certain control-flow patterns that compose functions with types a -> f b.
Turing-complete composition is exactly what makes the Monad interface interesting. If it didn't allow Turing-complete composition, it would be impossible for you, the programmer, to compose together IO actions in any particular control flow that wasn't nicely prepackaged for you. It was the fact that you can use the Monad primitives to express any control flow that made the IO type a feasible way to manage the IO problem in Haskell.
Many more types than just IO have semantically valid Monad interfaces. And it happens that Haskell has the language facilities to abstract over the entire interface. Due to those factors, Monad is a valuable class to provide instances for, when possible. Doing so gets you access to all the existing abstract functionality provided for working with monadic types, regardless of what the concrete type is.
So if Haskell programmers seem to always care about Monad instances for a type, it's because it's the most generically-useful instance that can be provided.
First, I think that it is not quite true that monads are much more popular than anything else; both Functor and Monoid have many instances that are not monads. But they are both very specific; Functor provides mapping, Monoid concatenation. Applicative is the one class that I can think of that is probably underused given its considerable power, due largely to its being a relatively recent addition to the language.
But yes, monads are extremely popular. Part of that is the do notation; a lot of Monoids provide Monad instances that merely append values to a running accumulator (essentially an implicit writer). The blaze-html library is a good example. The reason, I think, is the power of the type signature (>>=) :: Monad m => m a -> (a -> m b) -> m b. While fmap and mappend are useful, what they can do is fairly narrowly constrained. bind, however, can express a wide variety of things. It is, of course, canonized in the IO monad, perhaps the best pure functional approach to IO before streams and FRP (and still useful beside them for simple tasks and defining components). But it also provides implicit state (Reader/Writer/ST), which can avoid some very tedious variable passing. The various state monads, especially, are important because they provide a guarantee that state is single threaded, allowing mutable structures in pure (non-IO) code before fusion. But bind has some more exotic uses, such as flattening nested data structures (the List and Set monads), both of which are quite useful in their place (and I usually see them used desugared, calling liftM or (>>=) explicitly, so it is not a matter of do notation). So while Functor and Monoid (and the somewhat rarer Foldable, Alternative, Traversable, and others) provide a standardized interface to a fairly straightforward function, Monad's bind offers considerably more flexibility.
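As a tiny illustration of the 'flattening nested data structures' use of bind mentioned above (example mine, using the List monad):

-- The list monad's bind maps and then flattens, so the nesting disappears
withTenfold :: [Int]
withTenfold = [1, 2, 3] >>= \x -> [x, 10 * x]
-- withTenfold == [1,10,2,20,3,30]
main :: IO ()
main = print withTenfold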
In short, I think that all your reasons have some role; the popularity of monads is due to a combination of historical accident (do notation and the late definition of Applicative) and their combination of power and generality (relative to functors, monoids, and the like) and understandability (relative to arrows).
Well, first let me explain what the role of monads is: monads are very powerful, in a certain sense: you can pretty much express anything using a monad. Haskell as a language doesn't have things like action loops, exceptions, mutation, goto, etc. Monads can be expressed within the language (so they are not special) and make all of these reachable.
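For instance, 'exceptions' need nothing beyond an ordinary library type; here is a minimal sketch using the Either monad (example mine):

-- Division that "throws" by returning Left; the Either monad threads the
-- failure, so the remaining steps are skipped automatically.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)
calc :: Int -> Int -> Int -> Either String Int
calc a b c = do
  q <- safeDiv a b
  safeDiv q c
main :: IO ()
main = do
  print (calc 100 5 2)   -- Right 10
  print (calc 100 0 2)   -- Left "division by zero"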
There is a positive and a negative side to this: It's positive that you can express all those control structures you know from imperative programming and a whole bunch of them you don't. I have just recently developed a monad that lets you reenter a computation somewhere in the middle with a slightly changed context. That way you can run a computation, and if it fails, you just try again with slightly adjusted values. Furthermore, monadic actions are first class, and that's how you build things like loops or exception handling. While while is primitive in C, in Haskell it's actually just a regular function.
The negative side is that monads give you pretty much no guarantees whatsoever. They are so powerful that you are allowed to do whatever you want, to put it simply. In other words, just like you know from imperative languages, it can be hard to reason about code by just looking at it.
The more general abstractions are more general in the sense that they allow some concepts to be expressed which you can't express as monads. But that's only part of the story. Even for monads you can use a style known as applicative style, in which you use the applicative interface to compose your program from small isolated parts. The benefit of this is that you can reason about code by just looking at it and you can develop components without having to pay attention to the rest of your system.
What is so special about monads?
The monadic interface's main claim to fame in Haskell is its role in the replacement of the original and unwieldy dialogue-based I/O mechanism.
As for their status in a formal investigative context...it is merely an iteration of a seemingly-cyclic endeavour which is now (2021 Oct) approximately one half-century old:
During the 1960s, several researchers began work on proving things about programs. Efforts were made to prove that:
A program was correct.
Two programs with different code computed the same answers when given the same inputs.
One program was faster than another.
A given program would always terminate.
While these are abstract goals, they are all, really, the same as the practical goal of "getting the program debugged".
Several difficult problems emerged from this work. One was the problem of specification: before one can prove that a program is correct, one must specify the meaning of "correct", formally and unambiguously. Formal systems for specifying the meaning of a program were developed, and they looked suspiciously like programming languages.
The Anatomy of Programming Languages, Alice E. Fischer and Frances S. Grodzinsky.
(emphasis by me.)
...back when "programming languages" - apart from an intrepid few - were most definitely imperative.
Anyone for elevating this mystery to the rank of Millennium problem? Solving it would definitely advance the science of computing and the engineering of software, one way or the other...
Monads are special because of do notation, which lets you write imperative programs in a functional language. Monad is the abstraction that allows you to splice together imperative programs from smaller, reusable components (which are themselves imperative programs). Monad transformers are special because they represent enhancing an imperative language with new features.
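To make the do-notation point concrete, here is a small sketch (mine) of an 'imperative' program and its desugaring into plain >>= and >>:

-- Written with do-notation ...
greet :: IO ()
greet = do
  putStrLn "What is your name?"
  name <- getLine
  putStrLn ("Hello, " ++ name)
-- ... and the same program after desugaring
greet' :: IO ()
greet' =
  putStrLn "What is your name?" >>
  getLine >>= \name ->
  putStrLn ("Hello, " ++ name)
main :: IO ()
main = greet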
