Difference between Definition and Let in Coq

What is the difference between a Definition and a 'Let' in Coq? Why do some definitions require proofs?
For example, this is a piece of code from g1.v in Group theory.
Definition exp : Z -> U -> U.
Proof.
intros n a.
elim n; clear n.
exact e.
intro n.
elim n; clear n.
exact a.
intros n valrec.
exact (star a valrec).
intro n; elim n; clear n.
exact (inv a).
intros n valrec.
exact (star (inv a) valrec).
Defined.
What is the aim of this proof?

I think what you're asking isn't really related to the difference between the Definition and Let commands in Coq. Instead, you seem to be wondering about why some definitions in Coq contain proof scripts.
One interesting feature of Coq is that the language that one uses for writing proofs and programs is actually the same. This language is known as Gallina, which is the programming language people work with when using Coq. When you write something like fun x => x + 5, that is a program in Gallina.
When doing proofs, however, people usually use another language, called Ltac. This is the language that appears in your exp example. This could lead you to believe that proofs in Coq are represented in a different language, but this is not true: what Ltac scripts do is to actually build proof terms in Gallina. You can see that by using the Print command, e.g.
Print exp.
The reason for having a separate language for writing proofs, even if proofs and programs are written in the same language, is that Gallina is a bit hard to use directly when writing proofs. Try using the Print command directly over a complicated theorem to see how hard that can be.
Now, even though Ltac is mostly meant for writing proofs, nothing forbids you from using it to write normal programs, since the end product is the same: a Gallina term. Usually, people prefer to use Gallina when writing programs because it is easier to read. However, people might resort to Ltac for writing programs when doing it directly in Gallina would be too cumbersome. I personally would prefer to use Gallina directly for writing functions such as exp in your example, although that's arguably a matter of taste.
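For comparison, here is roughly what the term built by that script computes, written as an ordinary recursive function. It is in Haskell rather than Gallina, and it is only a sketch: I am assuming that Z in g1.v is a three-constructor type (zero, pos n standing for n+1, neg n standing for -(n+1)), which is what the shape of the script suggests. The names Group, expG and intAdd are made up for the illustration.

```haskell
-- Unary naturals and a three-constructor integer type (assumed shape of Z).
data Nat = O | S Nat
data Z = OZ | Pos Nat | Neg Nat

-- A hypothetical record bundling the group operations e, star and inv.
data Group u = Group { e :: u, star :: u -> u -> u, inv :: u -> u }

-- expG g n a is "a to the power n" in the group g.
-- The comments show which tactic call each branch corresponds to.
expG :: Group u -> Z -> u -> u
expG g OZ      _ = e g                      -- exact e.
expG g (Pos n) a = go n
  where
    go O     = a                            -- exact a.
    go (S m) = star g a (go m)              -- exact (star a valrec).
expG g (Neg n) a = go n
  where
    go O     = inv g a                      -- exact (inv a).
    go (S m) = star g (inv g a) (go m)      -- exact (star (inv a) valrec).

-- Example group: the integers under addition.
intAdd :: Group Integer
intAdd = Group { e = 0, star = (+), inv = negate }
```

In the additive group of integers, "a to the power 3" is a + a + a, so expG intAdd (Pos (S (S O))) 5 evaluates to 15. Each elim in the script corresponds to one level of pattern matching here, and valrec is the result of the recursive call.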

How do you lookup the definition or implementation of Coq proof tactics?

I am looking at this:
Theorem eq_add_1 : forall n m,
n + m == 1 -> n == 1 /\ m == 0 \/ n == 0 /\ m == 1.
Proof.
intros n m. rewrite one_succ. intro H.
assert (H1 : exists p, n + m == S p) by now exists 0.
apply eq_add_succ in H1. destruct H1 as [[n' H1] | [m' H1]].
left. rewrite H1 in H; rewrite add_succ_l in H; apply succ_inj in H.
apply eq_add_0 in H. destruct H as [H2 H3]; rewrite H2 in H1; now split.
right. rewrite H1 in H; rewrite add_succ_r in H; apply succ_inj in H.
apply eq_add_0 in H. destruct H as [H2 H3]; rewrite H3 in H1; now split.
Qed.
How do I find what the thing like intros or destruct mean exactly, like looking up an implementation (or if not possible, a definition)? What is the way to do this efficiently?
The answer differs for primitive and user-defined tactics. However, the proof script you show uses almost no user-defined tactics, except now, which is a notation for the easy tactic.
If you're not sure if a tactic is primitive, try both; checking the manual might be the simplest first step.
For user-defined tactics:
For tactics defined as Ltac foo args := body., you can use Print Ltac foo (e.g. Print Ltac easy.). AFAIK, that does not work for tactics defined by Tactic Notation. In both cases, I prefer to look at the sources (which I find via grep).
For primitive tactics:
There is the Coq reference manual (https://coq.inria.fr/distrib/current/refman/coq-tacindex.html), which does not have complete specification but is usually the closest approximation. It’s not very accessible, so you should first refer to one of the many Coq tutorials or introductions, like Software Foundations.
There is the actual Coq implementation, but that’s not very helpful unless you’re a Coq implementer.
As Blaisorblade mentioned, it can be difficult to understand exactly what tactics are doing, and it is easier to look at the reference manual to find out how to use them. However, at a conceptual level, tactics are not that complicated. Via the Curry-Howard correspondence, Coq proofs are represented using the same functional language you use to write regular programs. Tactics like intros or destruct are just metaprograms that write programs in this language.
For instance, suppose that you need to prove A -> B. This means that you need to write a program with a function type A -> B. When you write intros H., Coq builds a partial proof fun (H : A) => ?, where the ? denotes a hole that should have type B. Similarly, destruct adds a match expression to your proof to do case analysis, and asks you to produce proofs for each match branch. As you add more tactics, you keep filling in these holes until you have a complete proof.
The language of Coq is quite minimal, so there is only so much that tactics can do to build a proof: function application and abstraction, constructors, pattern matching, etc. In principle, it would be possible to have only a handful of tactics, one for each syntactic construct in Coq, and this would allow you to build any proof! The reason we don't do this is that using these core constructs directly is too low level, and tactics use automated proof search, unification and other features to simplify the process of writing a proof.
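The same idea can be sketched in Haskell, reading types as propositions and functions as proofs. This is only an analogy (Haskell is not a sound logic, since every type is inhabited by a non-terminating term), and the names orComm and andElim are invented for the example. The comments show which tactic would produce each piece of the term.

```haskell
-- A "proof" of (A \/ B) -> (B \/ A), reading Either as disjunction.
--   intros H      ~ the lambda that binds h
--   destruct H    ~ the case expression, one branch per constructor
--   right / left  ~ choosing the constructor Right or Left
orComm :: Either a b -> Either b a
orComm = \h -> case h of
  Left  x -> Right x
  Right y -> Left y

-- A "proof" of A /\ B -> A, reading pairs as conjunction.
andElim :: (a, b) -> a
andElim = \h -> case h of
  (x, _) -> x
```

After intros H the term is a lambda with a hole for its body; destruct H fills that hole with a case expression that has one new hole per branch, and so on until no holes remain.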

Is there a prover just for propositional logic

I tried to implement LTL logic syntactically using the axiomatization command, with the purpose of automatically finding proofs for theorems (motivation of proving program properties).
However the automatic provers such as (cvc4, z3, e, etc) all use quantifiers of some sort. For example using FOL one could prove F(p)-->G(p) which is obviously false.
My question is if there exists a prover, just like the ones mentioned, but that is made for propositional logic, i.e. only has access to MP and the propositional logic axioms.
I am rather new to Isabelle, so there might be an easier way of doing this that I'm not seeing.
EDIT: I am looking for a Hilbert-style deduction prover and not a SAT solver, as that would defeat the purpose of implementing it axiomatically.
I think the sat method only uses propositional logic.
However, I would recommend not to use axiomatizations and just define the syntax of LTL using datatypes and the semantics using functions. Maybe you can reuse the formalization from https://www.isa-afp.org/entries/LTL.html
Without axiomatizations you are then free to use any method.
What you want is a SAT solver, such as minisat.
However the automatic provers such as (cvc4, z3, e, etc) all use quantifiers of some sort. For example using FOL one could prove F(p)-->G(p) which is obviously false.
This is not correct. No first-order theorem prover, such as iProver, E, or Vampire, will prove forall X. f(X) => g(X).

Isabelle/HOL foundations

I have seen a lot of documentation about Isabelle's syntax and proof strategies. However, little have I found about its foundations. I have a few questions that I would be very grateful if someone could take the time to answer:
Why doesn't Isabelle/HOL admit functions that do not terminate? Many other languages such as Haskell do admit non-terminating functions.
What symbols are part of Isabelle's meta-language? I read that there are symbols in the meta-language for universal quantification (/\) and for implication (==>). However, these symbols have their counterparts in the object-level language (∀ and -->). I understand that --> is an object-level function of type bool => bool => bool. However, how are ∀ and ∃ defined? Are they object-level Boolean functions? If so, they are not computable (considering infinite domains). I noticed that I am able to write Boolean functions in terms of ∀ and ∃, but they are not computable. So what are ∀ and ∃? Are they part of the object-level? If so, how are they defined?
Are Isabelle theorems just Boolean expressions? Then Booleans are part of the meta-language?
As far as I know, Isabelle is a strict programming language. How can I use infinite objects? Let's say, infinite lists. Is it possible in Isabelle/HOL?
Sorry if these questions are very basic. I do not seem to find a good tutorial on Isabelle's meta-theory. I would love if someone could recommend me a good tutorial on these topics.
Thank you very much.
1) You can define non-terminating (i.e. partial) functions in Isabelle (cf. Function package manual (section 8)). However, partial functions are more difficult to reason about, because whenever you want to use such a function's defining equations (the psimps rules, which replace the simps rules of a normal function), you have to show that the function terminates on that particular input first.
In general, things like non-definedness and non-termination are always problematic in a logic – consider, for instance, the function ‘definition’ f x = f x + 1. If we were to take this as an equation on ℤ (integers), we could subtract f x from both sides and get 0 = 1. In Haskell, this problem is ‘solved’ by saying that this is not an equation on ℤ, but rather on ℤ ∪ {⊥} (the integers plus bottom) and the non-terminating function f evaluates to ⊥, and ‘⊥ + 1 = ⊥’, so everything works out fine.
However, if every single expression in your logic could potentially evaluate to ⊥ instead of a ‘proper’ value, reasoning in this logic will become very tedious. This is why Isabelle/HOL chooses to restrict itself to total functions; things like partiality have to be emulated with things like undefined (which is an arbitrary value that you know nothing about) or option types.
2) I'm not an expert on Isabelle/Pure (the meta logic), but the most important symbols are definitely
⋀ (the universal meta quantifier)
⟹ (meta implication)
≡ (meta equality)
&&& (meta conjunction, defined in terms of ⟹)
Pure.term, Pure.prop, Pure.type, Pure.dummy_pattern, Pure.sort_constraint, which fulfil certain internal functions that I don't know much about.
You can find some information on this in the Isabelle/Isar Reference Manual in section 2.1, and probably more elsewhere in the manual.
Everything else (that includes ∀ and ∃, which indeed operate on boolean expressions) is defined in the object logic (HOL, usually). You can find the definitions, or rather the axiomatisations, in ~~/src/HOL/HOL.thy (where ~~ denotes the Isabelle root directory):
All_def: "All P ≡ (P = (λx. True))"
Ex_def: "Ex P ≡ ∀Q. (∀x. P x ⟶ Q) ⟶ Q"
Also note that many, if not most Isabelle functions are typically not computable. Isabelle is not a programming language, although it does have a code generator that allows exporting Isabelle functions as code to programming languages as long as you can give code equations for all the functions involved.
3) Isabelle theorems are a complex datatype (cf. ~~/src/Pure/thm.ML) containing a lot of information, but the most important part, of course, is the proposition. A proposition is something from Isabelle/Pure, which in fact only has propositions and functions (and itself and dummy, but you can ignore those).
Propositions are not booleans – in fact, there isn't even a way to state that a proposition does not hold in Isabelle/Pure.
HOL then defines (or rather axiomatises) booleans and also axiomatises a coercion from booleans to propositions: Trueprop :: bool ⇒ prop
4) Isabelle is not a programming language, and apart from that, totality does not mean you have to restrict yourself to finite structures. Even in a total programming language, you can have infinite lists. (cf. Idris's codata)
Isabelle is a theorem prover, and logically, infinite objects can be treated by axiomatising them and then reasoning about them using the axioms and rules that you have.
For instance, HOL assumes the existence of an infinite type and defines the natural numbers on that. That already gives you access to functions nat ⇒ 'a, which are essentially infinite lists.
You can also define infinite lists and other infinite data structures as codatatypes with the (co-)datatype package, which is based on bounded natural functors.
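To illustrate the remark that functions nat ⇒ 'a are essentially infinite lists, here is a small Haskell sketch. It is an analogy only, not Isabelle code, and the names Stream, squares, prefix and squaresList are invented for the example.

```haskell
import Numeric.Natural (Natural)

-- An "infinite list" represented as a total function from indices to elements.
type Stream a = Natural -> a

-- The stream 0, 1, 4, 9, ... given by its n-th element.
squares :: Stream Natural
squares n = n * n

-- Observe a finite prefix of a stream.
prefix :: Int -> Stream a -> [a]
prefix k s = map s (take k [0 ..])

-- The same data as a lazy list, which is closer in spirit to a codatatype.
squaresList :: [Natural]
squaresList = map squares [0 ..]
```

Nothing infinite is ever constructed in the first representation: the whole object is described by a rule that gives the element at each index, which is why totality and infinite objects do not conflict.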
Let me add some points to two of your questions.
1) Why doesn't Isabelle/HOL admit functions that do not terminate? Many other languages such as Haskell do admit non-terminating functions.
In short: Isabelle/HOL does not require termination, but totality (i.e., there is a specific result for each input to the function) of functions. Totality does not mean that a function is actually terminating when transcribed to a (functional) programming language or even that it is computable at all.
Therefore, talking about termination is somewhat misleading, even though it is encouraged by the fact that Isabelle/HOL's function package uses the keyword termination for proving some property P about which I will have to say a little more below.
On the one hand the term "termination" might sound more intuitive to a wider audience. On the other hand, a more precise description of P would be well-foundedness of the function's call graph.
Don't get me wrong, termination is not really a bad name for the property P, it is even justified by the fact that many techniques that are implemented in the function package are very close to termination techniques from term rewriting or functional programming (like the size-change principle, dependency pairs, lexicographic orders, etc.).
I'm just saying that it can be misleading. The answer to why that is the case also touches on question 4 of the OP.
4) As far as I know Isabelle is a strict programming language. How can I use infinite objects? Let's say, infinite lists. Is it possible in Isabelle/HOL?
Isabelle/HOL is not a programming language and it specifically does not have any evaluation strategy (we could alternatively say: it has any evaluation strategy you like).
And here is why the word termination is misleading (drum roll): if there is no evaluation strategy and we have termination of a function f, people might expect f to terminate independently of the strategy used. But this is not the case. A termination proof of a function rather ensures that f is well-defined. Even if f is computable, a proof of P merely ensures that there is an evaluation strategy for which f terminates.
(As an aside: what I call "strategy" here, is typically influenced by so called cong-rules (i.e., congruence rules) in Isabelle/HOL.)
As an example, it is trivial to prove that the function (see Section 10.1 Congruence rules and evaluation order in the documentation of the function package):
fun f' :: "nat ⇒ bool"
where
"f' n ⟷ f' (n - 1) ∨ n = 0"
terminates (in the sense defined by termination) after adding the cong-rule:
lemma [fundef_cong]:
"Q = Q' ⟹ (¬ Q' ⟹ P = P') ⟹ (P ∨ Q) = (P' ∨ Q')"
by auto
Which essentially states that logical-or should be "evaluated" from right to left. However, if you write the same function e.g. in OCaml it causes a stack overflow ...
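The dependence on evaluation order is easy to reproduce in a lazy language too. The following Haskell sketch is my own transcription, not taken from the function package documentation: predNat mimics truncated subtraction on nat (where 0 - 1 = 0), inputs are assumed to be non-negative, and the names fRight and fLeft are invented.

```haskell
-- Truncated predecessor, mimicking subtraction on Isabelle's nat (0 - 1 = 0).
predNat :: Integer -> Integer
predNat n = max 0 (n - 1)

-- Test n == 0 first: (||) does not evaluate its second argument when the
-- first is True, so the recursion stops as soon as n reaches 0.
fRight :: Integer -> Bool
fRight n = n == 0 || fRight (predNat n)

-- Same equation with the disjuncts in the order of the Isabelle definition:
-- the recursive call is evaluated first, so this never returns a value.
fLeft :: Integer -> Bool
fLeft n = fLeft (predNat n) || n == 0
```

Both definitions satisfy the same equation and describe the same total function (constantly True), but only fRight computes it. That matches the point above: the termination proof establishes well-definedness, not termination under every strategy.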
EDIT: this answer is not really correct, check out Lars' comment below.
Unfortunately I don't have enough reputation to post this as a comment, so here is my go at an answer (please bear in mind I am no expert in Isabelle, but I also had similar questions once):
1) The idea is to prove statements about the defined functions. I am not sure how familiar you are with Computability Theory, but think about the Halting Problem and the fact that most undecidability problems stem from it (such as the Acceptance Problem). Imagine defining a function that you can't prove terminates. How could you then still prove that it returns the number 42 when given the input "ABC", rather than going into an infinite loop?
If instead you limit yourself to terminating functions, you can prove much more about them, essentially making a trade-off (or at least this is how I see it).
These ideas stem from Constructivism and Intuitionism and I recommend you check out Robert Harper's very interesting lecture series: https://www.youtube.com/watch?v=9SnefrwBIDc&list=PLGCr8P_YncjXRzdGq2SjKv5F2J8HUFeqN on Type Theory
You should check out especially the part about the absence of the Law of Excluded middle: http://youtu.be/3JHTb6b1to8?t=15m34s
2) See Manuel's answer.
3,4) Again see Manuel's answer keeping in mind Intuitionistic logic: "the fundamental entity is not the boolean, but rather the proof that something is true".
For me it took a long time to get adjusted to this way of thinking and I'm still not sure I understand it. I think the key though is to understand it is a more-or-less completely different way of thinking.

Define natural numbers in functional languages like Ada subtypes

In Ada to define natural numbers you can write this:
subtype Natural is Integer range 0 .. Integer'Last;
This is type-safe and it is checked at compile-time. It is simple (one-line of code) and efficient (it does not use recursion to define natural numbers as many functional languages do). Is there any functional language that can provide similar functionality?
Thanks
This is type-safe and it is checked at compile-time.
As you already pointed out in the comments to your question, it is not checked at compile time. Neither is equivalent functionality in Modula-2 or any other production-ready, general-purpose programming language.
The ability to check constraints like this at compile time is something that requires dependent types, refinement types or similar constructs. You can find those kinds of features in theorem provers like Coq or Agda or in experimental/academic languages like ATS or Liquid Haskell.
Of the languages I mentioned, Coq and Agda define their Nat types recursively, so they are not what you want, and ATS is an imperative language. That leaves Liquid Haskell (plus languages that I didn't mention, of course). Liquid Haskell is Haskell with extra type annotations, so it's definitely a functional language. In Liquid Haskell you can define a MyNat type (a type named Nat is already defined in the standard library) like this:
{-@ type MyNat = {n:Integer | n >= 0} @-}
And then use it like this:
{-@ fac :: MyNat -> MyNat @-}
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n-1)
If you then try to call fac with a negative number as the argument, you'll get a compilation error. You will also get a compilation error if you call it with user input as the argument unless you specifically check that the input was non-negative. You would also get a compilation error if you removed the fac 0 = 1 line because now n on the next line could be 0, making n-1 negative when you call fac (n-1), so the compiler would reject that.
It should be noted that even with state-of-the-art type inference techniques non-trivial programs in languages like this will end up having quite complicated type signatures and you'll spend a lot of time and effort chasing type errors through an increasingly complex jungle of type signatures having only incomprehensible type errors to guide you. So there's a price for the safety that features like these offer you. It should also be pointed out that, in a Turing complete language, you will occasionally have to write runtime checks for cases that you know can't happen as the compiler can't prove everything even when you think it should.
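For contrast, the runtime-checked approach mentioned in the last sentence can be sketched in plain Haskell with a smart constructor. This is not Liquid Haskell and nothing here is verified statically; the names MyNat, mkNat and fac are just for illustration, and in real code you would hide the MyNat constructor behind a module export list.

```haskell
-- A natural number wrapper whose invariant (n >= 0) is checked at runtime.
newtype MyNat = MyNat Integer
  deriving (Eq, Show)

-- Smart constructor: the only intended way to build a MyNat from raw input.
mkNat :: Integer -> Maybe MyNat
mkNat n
  | n >= 0    = Just (MyNat n)
  | otherwise = Nothing

-- Factorial on MyNat; the recursion only ever sees non-negative values,
-- provided the argument was built with mkNat.
fac :: MyNat -> MyNat
fac (MyNat 0) = MyNat 1
fac (MyNat n) = let MyNat r = fac (MyNat (n - 1)) in MyNat (n * r)
```

Here a negative input is rejected when the value is constructed (mkNat (-1) is Nothing) rather than at compile time, which is the trade-off being discussed.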
Typed Racket, a typed dialect of Racket, has a rich set of numeric subtypes and it knows about a fair number of closure properties (eg, the sum of two nonnegative numbers is nonnegative, the sum of two exact integers is an exact integer, etc). Here's a simple example:
#lang typed/racket
(: f : (Nonnegative-Integer Nonnegative-Integer -> Positive-Integer))
(define (f x y)
(+ x y 1))
Type checking is done statically, but of course the typechecker is not able to prove every true fact about numeric subtypes. For example, the following function in fact only returns values of type Nonnegative-Integer, but the type rules for subtraction only allow TR to conclude the result type of Integer.
> (lambda: ([x : Nonnegative-Integer] [y : Nonnegative-Integer])
(- x (- x y)))
- : (Nonnegative-Integer Nonnegative-Integer -> Integer)
#<procedure>
The Typed Racket approach to numbers is described in Typing the Numeric Tower by St-Amour et al (appeared at PADL 2012). There's usually a link to the paper here, but the link seems to be broken at the moment. Google has a cached rendering of the PDF as HTML, if you search for the title.

What is so special about Monads?

A monad is a mathematical structure which is heavily used in (pure) functional programming, basically Haskell. However, there are many other mathematical structures available, like for example applicative functors, strong monads, or monoids. Some are more specific, some are more generic. Yet, monads are much more popular. Why is that?
One explanation I came up with, is that they are a sweet spot between genericity and specificity. This means monads capture enough assumptions about the data to apply the algorithms we typically use and the data we usually have fulfills the monadic laws.
Another explanation could be that Haskell provides syntax for monads (do-notation), but not for other structures, which means Haskell programmers (and thus functional programming researchers) are intuitively drawn towards monads, where a more generic or specific (efficient) function would work as well.
I suspect that the disproportionately large attention given to this one particular type class (Monad) over the many others is mainly a historical fluke. People often associate IO with Monad, although the two are independently useful ideas (as are list reversal and bananas). Because IO is magical (having an implementation but no denotation) and Monad is often associated with IO, it's easy to fall into magical thinking about Monad.
(Aside: it's questionable whether IO even is a monad. Do the monad laws hold? What do the laws even mean for IO, i.e., what does equality mean? Note the problematic association with the state monad.)
If a type m :: * -> * has a Monad instance, you get Turing-complete composition of functions with type a -> m b. This is a fantastically useful property. You get the ability to abstract various Turing-complete control flows away from specific meanings. It's a minimal composition pattern that supports abstracting any control flow for working with types that support it.
Compare this to Applicative, for instance. There, you get only composition patterns with computational power equivalent to a push-down automaton. Of course, it's true that more types support composition with more limited power. And it's true that when you limit the power available, you can do additional optimizations. These two reasons are why the Applicative class exists and is useful. But things that can be instances of Monad usually are, so that users of the type can perform the most general operations possible with the type.
Edit:
By popular demand, here are some functions using the Monad class:
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM c x y = c >>= \z -> if z then x else y
whileM :: Monad m => (a -> m Bool) -> (a -> m a) -> a -> m a
whileM p step x = ifM (p x) (step x >>= whileM p step) (return x)
(*&&) :: Monad m => m Bool -> m Bool -> m Bool
x *&& y = ifM x y (return False)
(*||) :: Monad m => m Bool -> m Bool -> m Bool
x *|| y = ifM x (return True) y
notM :: Monad m => m Bool -> m Bool
notM x = x >>= return . not
Combining those with do syntax (or the raw >>= operator) gives you name binding, indefinite looping, and complete boolean logic. That's a well-known set of primitives sufficient to give Turing completeness. Note how all the functions have been lifted to work on monadic values, rather than simple values. All monadic effects are bound only when necessary - only the effects from the chosen branch of ifM are bound into its final value. Both *&& and *|| ignore their second argument when possible. And so on..
Now, those type signatures may not involve functions for every monadic operand, but that's just a cognitive simplification. There would be no semantic difference, ignoring bottoms, if all the non-function arguments and results were changed to () -> m a. It's just friendlier to users to optimize that cognitive overhead out.
Now, let's look at what happens to those functions with the Applicative interface.
ifA :: Applicative f => f Bool -> f a -> f a -> f a
ifA c x y = (\c' x' y' -> if c' then x' else y') <$> c <*> x <*> y
Well, uh. It got the same type signature. But there's a really big problem here already. The effects of both x and y are bound into the composed structure, regardless of which one's value is selected.
whileA :: Applicative f => (a -> f Bool) -> (a -> f a) -> a -> f a
whileA p step x = ifA (p x) (whileA p step <$> step x) (pure x)
Well, ok, that seems like it'd be ok, except for the fact that it's an infinite loop because ifA will always execute both branches... Except it's not even that close. pure x has the type f a. whileA p step <$> step x has the type f (f a). This isn't even an infinite loop. It's a compile error. Let's try again..
whileA :: Applicative f => (a -> f Bool) -> (a -> f a) -> a -> f a
whileA p step x = ifA (p x) (whileA p step <*> step x) (pure x)
Well shoot. Don't even get that far. whileA p step has the type a -> f a. If you try to use it as the first argument to <*>, it grabs the Applicative instance for the top type constructor, which is (->), not f. Yeah, this isn't gonna work either.
In fact, the only function from my Monad examples that would work with the Applicative interface is notM. That particular function works just fine with only a Functor interface, in fact. The rest? They fail.
Of course it's to be expected that you can write code using the Monad interface that you can't with the Applicative interface. It is strictly more powerful, after all. But what's interesting is what you lose. You lose the ability to compose functions that change what effects they have based on their input. That is, you lose the ability to write certain control-flow patterns that compose functions with types a -> f b.
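To make that loss concrete, here is a small self-contained sketch that runs ifM and ifA (repeated from above) in the Maybe type, where Nothing plays the role of a failure effect. The names monadic and applicative are invented for the example.

```haskell
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM c x y = c >>= \z -> if z then x else y

ifA :: Applicative f => f Bool -> f a -> f a -> f a
ifA c x y = (\c' x' y' -> if c' then x' else y') <$> c <*> x <*> y

-- The condition is true, so only the "then" branch is bound:
-- the failing "else" branch is never run.
monadic :: Maybe Int
monadic = ifM (Just True) (Just 1) Nothing

-- The effects of both branches are combined before the condition is
-- inspected, so the failure in the unused branch still wins.
applicative :: Maybe Int
applicative = ifA (Just True) (Just 1) Nothing
```

Here monadic is Just 1 while applicative is Nothing: with the Applicative interface, the value of the condition can select a result but cannot select which effects happen.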
Turing-complete composition is exactly what makes the Monad interface interesting. If it didn't allow Turing-complete composition, it would be impossible for you, the programmer, to compose together IO actions in any particular control flow that wasn't nicely prepackaged for you. It was the fact that you can use the Monad primitives to express any control flow that made the IO type a feasible way to manage the IO problem in Haskell.
Many more types than just IO have semantically valid Monad interfaces. And it happens that Haskell has the language facilities to abstract over the entire interface. Due to those factors, Monad is a valuable class to provide instances for, when possible. Doing so gets you access to all the existing abstract functionality provided for working with monadic types, regardless of what the concrete type is.
So if Haskell programmers seem to always care about Monad instances for a type, it's because it's the most generically-useful instance that can be provided.
First, I think that it is not quite true that monads are much more popular than anything else; both Functor and Monoid have many instances that are not monads. But they are both very specific; Functor provides mapping, Monoid concatenation. Applicative is the one class that I can think of that is probably underused given its considerable power, due largely to its being a relatively recent addition to the language.
But yes, monads are extremely popular. Part of that is the do notation; a lot of Monoids provide Monad instances that merely append values to a running accumulator (essentially an implicit writer). The blaze-html library is a good example. The reason, I think, is the power of the type signature (>>=) :: Monad m => m a -> (a -> m b) -> m b. While fmap and mappend are useful, what they can do is fairly narrowly constrained. bind, however, can express a wide variety of things. It is, of course, canonized in the IO monad, perhaps the best pure functional approach to IO before streams and FRP (and still useful beside them for simple tasks and defining components). But it also provides implicit state (Reader/Writer/ST), which can avoid some very tedious variable passing. The various state monads, especially, are important because they provide a guarantee that state is single threaded, allowing mutable structures in pure (non-IO) code before fusion. But bind has some more exotic uses, such as flattening nested data structures (the List and Set monads), both of which are quite useful in their place (and I usually see them used desugared, calling liftM or (>>=) explicitly, so it is not a matter of do notation). So while Functor and Monoid (and the somewhat rarer Foldable, Alternative, Traversable, and others) provide a standardized interface to a fairly straightforward function, Monad's bind is considerably more flexible.
In short, I think that all your reasons have some role; the popularity of monads is due to a combination of historical accident (do notation and the late definition of Applicative) and their combination of power and generality (relative to functors, monoids, and the like) and understandability (relative to arrows).
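As a small illustration of the flattening use of bind mentioned above, here is a sketch with the list monad. The names expand and pairs are made up for the example.

```haskell
-- Each element is replaced by a small list, and (>>=) flattens the result.
expand :: [Int] -> [Int]
expand xs = xs >>= \x -> [x, x * 10]

-- The same mechanism in do notation: every combination of n and c.
pairs :: [(Int, Char)]
pairs = do
  n <- [1, 2]
  c <- "ab"
  return (n, c)
```

With these definitions, expand [1, 2, 3] is [1, 10, 2, 20, 3, 30], and pairs is [(1,'a'), (1,'b'), (2,'a'), (2,'b')]. An fmap alone would leave a list of lists in the first case; it is bind that collapses the nesting.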
Well, first let me explain what the role of monads is: Monads are very powerful, but in a certain sense: You can pretty much express anything using a monad. Haskell as a language doesn't have things like action loops, exceptions, mutation, goto, etc. Monads can be expressed within the language (so they are not special) and make all of these reachable.
There is a positive and a negative side to this: It's positive that you can express all those control structures you know from imperative programming and a whole bunch of them you don't. I have just recently developed a monad that lets you reenter a computation somewhere in the middle with a slightly changed context. That way you can run a computation, and if it fails, you just try again with slightly adjusted values. Furthermore, monadic actions are first class, and that's how you build things like loops or exception handling. While while is primitive in C, in Haskell it's actually just a regular function.
The negative side is that monads give you pretty much no guarantees whatsoever. They are so powerful that you are allowed to do whatever you want, to put it simply. In other words just like you know from imperative languages it can be hard to reason about code by just looking at it.
The more general abstractions are more general in the sense that they allow some concepts to be expressed which you can't express as monads. But that's only part of the story. Even for monads you can use a style known as applicative style, in which you use the applicative interface to compose your program from small isolated parts. The benefit of this is that you can reason about code by just looking at it and you can develop components without having to pay attention to the rest of your system.
What is so special about monads?
The monadic interface's main claim to fame in Haskell is its role in the replacement of the original and unwieldy dialogue-based I/O mechanism.
As for their status in a formal investigative context...it is merely an iteration of a seemingly-cyclic endeavour which is now (2021 Oct) approximately one half-century old:
During the 1960s, several researchers began work on proving things about programs. Efforts were made to prove that:
A program was correct.
Two programs with different code computed the same answers when given the same inputs.
One program was faster than another.
A given program would always terminate.
While these are abstract goals, they are all, really, the same as the practical goal of "getting the program debugged".
Several difficult problems emerged from this work. One was the problem of specification: before one can prove that a program is correct, one must specify the meaning of "correct", formally and unambiguously. Formal systems for specifying the meaning of a program were developed, and they looked suspiciously like programming languages.
The Anatomy of Programming Languages, Alice E. Fischer and Frances S. Grodzinsky.
(emphasis by me.)
...back when "programming languages" - apart from an intrepid few - were most definitely imperative.
Anyone for elevating this mystery to the rank of Millennium problem? Solving it would definitely advance the science of computing and the engineering of software, one way or the other...
Monads are special because of do notation, which lets you write imperative programs in a functional language. Monad is the abstraction that allows you to splice together imperative programs from smaller, reusable components (which are themselves imperative programs). Monad transformers are special because they represent enhancing an imperative language with new features.
