Correct way of writing recursive functions in CLP(R) with Prolog - recursion

I am very confused in how CLP works in Prolog. Not only do I find it hard to see the benefits (I do see it in specific cases but find it hard to generalise those) but more importantly, I can hardly make up how to correctly write a recursive predicate. Which of the following would be the correct form in a CLP(R) way?
factorial(0, 1).
factorial(N, F):- {
N > 0,
PrevN = N - 1,
factorial(PrevN, NewF),
F = N * NewF}.
or
factorial(0, 1).
factorial(N, F):- {
N > 0,
PrevN = N - 1,
F = N * NewF},
factorial(PrevN, NewF).
In other words, I am not sure when I should write code outside the constraints. To me, the first case would seem more logical, because PrevN and NewF belong to the constraints. But if that's true, I am curious to see in which cases it is useful to use predicates outside the constraints in a recursive function.

There are several overlapping questions and issues in your post, probably too many to coherently address to your complete satisfaction in a single post.
Therefore, I would like to state a few general principles first, and then—based on that—make a few specific comments about the code you posted.
First, I would like to address what I think is most important in your case:
LP ⊆ CLP
This means simply that CLP can be regarded as a superset of logic programming (LP). Whether it is to be considered a proper superset or if, in fact, it makes even more sense to regard them as denoting the same concept is somewhat debatable. In my personal view, logic programming without constraints is much harder to understand and much less usable than with constraints. Given that also even the very first Prolog systems had a constraint like dif/2 and also that essential built-in predicates like (=)/2 perfectly fit the notion of "constraint", the boundaries, if they exist at all, seem at least somewhat artificial to me, suggesting that:
LP ≈ CLP
Be that as it may, the key concept when working with CLP (of any kind) is that the constraints are available as predicates, and used in Prolog programs like all other predicates.
Therefore, whether you have the goal factorial(N, F) or { N > 0 } is, at least in principle, the same concept: Both mean that something holds.
Note the syntax: The CLP(ℛ) constraints have the form { C }, which is {}(C) in prefix notation.
Note that the goal factorial(N, F) is not a CLP(ℛ) constraint! Neither is the following:
?- { factorial(N, F) }.
ERROR: Unhandled exception: type_error({factorial(_3958,_3960)},...)
Thus, { factorial(N, F) } is not a CLP(ℛ) constraint either!
Your first example therefore cannot work for this reason alone already. (In addition, you have a syntax error in the clause head: factorial (, so it also does not compile at all.)
When you learn working with a constraint solver, check out the predicates it provides. For example, CLP(ℛ) provides {}/1 and a few other predicates, and has a dedicated syntax for stating relations that hold about floating point numbers (in this case).
Other constraint solver provide their own predicates for describing the entities of their respective domains. For example, CLP(FD) provides (#=)/2 and a few other predicates to reason about integers. dif/2 lets you reason about any Prolog term. And so on.
From the programmer's perspective, this is exactly the same as using any other predicate of your Prolog system, whether it is built-in or stems from a library. In principle, it's all the same:
A goal like list_length(Ls, L) can be read as: "The length of the list Ls is L."
A goal like { X = A + B } can be read as: The number X is equal to the sum of A and B. For example, if you are using CLP(Q), it is clear that we are talking about rational numbers in this case.
In your second example, the body of the clause is a conjunction of the form (A, B), where A is a CLP(ℛ) constraint, and B is a goal of the form factorial(PrevN, NewF).
The point is: The CLP(ℛ) constraint is also a goal! Check it out:
?- write_canonical({a,b,c}).
{','(a,','(b,c))}
true.
So, you are simply using {}/1 from library(clpr), which is one of the predicates it exports.
You are right that PrevN and NewF belong to the constraints. However, factorial(PrevN, NewF) is not part of the mini-language that CLP(ℛ) implements for reasoning over floating point numbers. Therefore, you cannot pull this goal into the CLP(ℛ)-specific part.
From a programmer's perspective, a major attraction of CLP is that it blends in completely seamlessly into "normal" logic programming, to the point that it can in fact hardly be distinguished at all from it: The constraints are simply predicates, and written down like all other goals.
Whether you label a library predicate a "constraint" or not hardly makes any difference: All predicates can be regarded as constraints, since they can only constrain answers, never relax them.
Note that both examples you post are recursive! That's perfectly OK. In fact, recursive predicates will likely be the majority of situations in which you use constraints in the future.
However, for the concrete case of factorial, your Prolog system's CLP(FD) constraints are likely a better fit, since they are completely dedicated to reasoning about integers.

Related

Moral of the story from SICP Ex. 1.20?

In this exercise we are asked to trace Euclid's algorithm using first normal order then applicative order evaluation.
(define (gcd a b)
(if (= b 0)
a
(gcd b (remainder a b))))
I've done the manual trace, and checked it with the several solutions available on the internet. I'm curious here to consolidate the moral of the exercise.
In gcd above, note that b is re-used three times in the function body, plus this function is recursive. This being what gives rise to 18 calls to remainder for normal order, in contrast to only 4 for applicative order.
So, it seems that when a function uses an argument more than once in its body, (and perhaps recursively! as here), then not evaluating it when the function is called (i.e. applicative order), will lead to redundant recomputation of the same thing.
Note that the question is at pains to point out that the special form if does not change its behaviour when normal order is used; that is, if will always run first; if this didn't happen, there could be no termination in this example.
I'm curious regarding delayed evaluation we are seeing here.
As a plus, it might allow us to handle infinite things, like streams.
As a minus, if we have a function like here, it can cause great inefficiency.
To fix the latter it seems like there are two conceptual options. One, wrap it in some data structure that caches its result to avoid recomputation. Two, selectively force the argument to realise when you know it will otherwise cause repeated recomputation.
The thing is, neither of those options seem very nice, because both represent additional "levers" the programmer must know how to use and choose when to use.
Is all of this dealt with more thoroughly later in the book?
Is there any straightforward consolidation of these points which would be worth making clear at this point (without perhaps going into all the detail that is to come).
The short answer is yes, this is covered extensively later in the book.
It is covered in most detail in Chapter 4 where we implement eval and apply, and so get the opportunity to modify their behaviour. For example Exercise 4.31 suggests the following syntax:
(define (f a (b lazy) c (d lazy-memo))
As you can see this identifies three different behaviours.
Parameters a and c have applicative order (they are evaluated before they are passed to the f).
b is normal or lazy, it is evaluated everytime it is used (but only if it is used).
d lazy but it's value it memoized so it is evaluated at most once.
Although this syntax is possible it isn't used. I think the philosopy is that the the language has an expected behaviour (applicative order) and that normal order is only used by default when necessary (e.g., the consequent and alternative of an if statement, and in creating streams). When it is necssary (or desirable) to have a parameter with normal evaluation then we can use delay and force, and if necessary memo-proc (e.g. Exercise 3.77).
So at this point the authors are introducing the ideas of normal and applicative order so that we have some familiarity with them by the time we get into the nitty gritty later on.
In a sense this a recurring theme, applicative order is probably more intuitive, but sometimes we need normal order. Recursive functions are simpler to write, but sometimes we need the performance of iterative functions. Programs where we can use the substition model are easier to reason about, but sometimes we need environmental model because we need mutable state.

"Efficient" least- and greatest fixpoint computations?

I am trying to compute two finite sets of some enumerable type (let's say char) using a least- and greatest- fixpoint computation, respectively. I want my definitions to be extractable to SML, and to be "semi-efficient" when executed. What are my options?
From exploring the HOL library and playing around with code generation, I have the following observations:
I could use the complete_lattice.lfp and complete_lattice.gfp constants with a pair of additional monotone functions to compute my sets, which in fact I currently am doing. Code generation does work with these constants, but the code produced is horribly inefficient, and if I understand the generated SML code correctly is performing an exhaustive search over every possible set in the powerset of characters. Any use, no matter how simple, of these two constants at type char therefore causes a divergence when executed.
I could try to make use of the iterative fixpoint described by the Kleene fixpoint theorem in directed complete partial orders. From exploring, there's a ccpo_class.fixp constant in the theory Complete_Partial_Order, but the underlying iterates constant that this is defined in terms of has no associated code equations, and so code cannot be extracted.
Are there any existing fixpoint combinators hiding somewhere, suitable for use with finite sets, that produce semi-efficient code with code generation that I have missed?
None of the general fixpoint combinators in Isabelle's standard library is meant to used directly for code extraction because their construction is too general to be usable in practice. (There is another one in the theory ~~/src/HOL/Library/Bourbaki_Witt_Fixpoint.) But the theory ~~/src/HOL/Library/While_Combinator connects the lfp and gfp fixpoints to the iterative implementation you are looking for, see theorems lfp_while_lattice and gfp_while_lattice. These characterisations have the precondition that the function is monotone, so they cannot be used as code equations directly. So you have two options:
Use the while combinator instead of lfp/gfp in your code equations and/or definitions.
Tell the code preprocessor to use lfp_while_lattice as a [code_unfold] equation. This works if you also add all the rules that the preprocessor needs to prove the assumptions of these equations for the instances at which it should apply. Hence, I recommend that you also add as [code_unfold] the monotonicity statement of your function and the theorem to prove the finiteness of char set, i.e., finite_class.finite.

How to quantitatively measure how simplified a mathematical expression is

I am looking for a simple method to assign a number to a mathematical expression, say between 0 and 1, that conveys how simplified that expression is (being 1 as fully simplified). For example:
eval('x+1') should return 1.
eval('1+x+1+x+x-5') should returns some value less than 1, because it is far from being simple (i.e., it can be further simplified).
The parameter of eval() could be either a string or an abstract syntax tree (AST).
A simple idea that occurred to me was to count the number of operators (?)
EDIT: Let simplified be equivalent to how close a system is to the solution of a problem. E.g., given an algebra problem (i.e. limit, derivative, integral, etc), it should assign a number to tell how close it is to the solution.
The closest metaphor I can come up with it how a maths professor would look at an incomplete problem and mentally assess it in order to tell how close the student is to the solution. Like in a math exam, were the student didn't finished a problem worth 20 points, but the professor assigns 8 out of 20. Why would he come up with 8/20, and can we program such thing?
I'm going to break a stack-overflow rule and post this as an answer instead of a comment, because not only I'm pretty sure the answer is you can't (at least, not the way you imagine), but also because I believe it can be educational up to a certain degree.
Let's assume that a criteria of simplicity can be established (akin to a normal form). It seems to me that you are very close to trying to solve an analogous to entscheidungsproblem or the halting problem. I doubt that in a complex rule system required for typical algebra, you can find a method that gives a correct and definitive answer to the number of steps of a series of term reductions (ipso facto an arbitrary-length computation) without actually performing it. Such answer would imply knowing in advance if such computation could terminate, and so contradict the fact that automatic theorem proving is, for any sufficiently powerful logic capable of representing arithmetic, an undecidable problem.
In the given example, the teacher is actually either performing that computation mentally (going step by step, applying his own sequence of rules), or gives an estimation based on his experience. But, there's no generic algorithm that guarantees his sequence of steps are the simplest possible, nor that his resulting expression is the simplest one (except for trivial expressions), and hence any quantification of "distance" to a solution is meaningless.
Wouldn't all this be true, your problem would be simple: you know the number of steps, you know how many steps you've taken so far, you divide the latter by the former ;-)
Now, returning to the criteria of simplicity, I also advice you to take a look on Hilbert's 24th problem, that specifically looked for a "Criteria of simplicity, or proof of the greatest simplicity of certain proofs.", and the slightly related proof compression. If you are philosophically inclined to further understand these subjects, I would suggest reading the classic Gödel, Escher, Bach.
Further notes: To understand why, consider a well-known mathematical artefact called the Mandelbrot fractal set. Each pixel color is calculated by determining if the solution to the equation z(n+1) = z(n)^2 + c for any specific c is bounded, that is, "a complex number c is part of the Mandelbrot set if, when starting with z(0) = 0 and applying the iteration repeatedly, the absolute value of z(n) remains bounded however large n gets." Despite the equation being extremely simple (you know, square a number and sum a constant), there's absolutely no way to know if it will remain bounded or not without actually performing an infinite number of iterations or until a cycle is found (disregarding complex heuristics). In this sense, every fractal out there is a rough approximation that typically usages an escape time algorithm as an heuristic to provide an educated guess whether the solution will be bounded or not.

A Functional-Imperative Hybrid

Pure functional programming languages do not allow mutable data, but some computations are more naturally/intuitively expressed in an imperative way -- or an imperative version of an algorithm may be more efficient. I am aware that most functional languages are not pure, and let you assign/reassign variables and do imperative things but generally discourage it.
My question is, why not allow local state to be manipulated in local variables, but require that functions can only access their own locals and global constants (or just constants defined in an outer scope)? That way, all functions maintain referential transparency (they always give the same return value given the same arguments), but within a function, a computation can be expressed in imperative terms (like, say, a while loop).
IO and such could still be accomplished in the normal functional ways - through monads or passing around a "world" or "universe" token.
My question is, why not allow local state to be manipulated in local variables, but require that functions can only access their own locals and global constants (or just constants defined in an outer scope)?
Good question. I think the answer is that mutable locals are of limited practical value but mutable heap-allocated data structures (primarily arrays) are enormously valuable and form the backbone of many important collections including efficient stacks, queues, sets and dictionaries. So restricting mutation to locals only would not give an otherwise purely functional language any of the important benefits of mutation.
On a related note, communicating sequential processes exchanging purely functional data structures offer many of the benefits of both worlds because the sequential processes can use mutation internally, e.g. mutable message queues are ~10x faster than any purely functional queues. For example, this is idiomatic in F# where the code in a MailboxProcessor uses mutable data structures but the messages communicated between them are immutable.
Sorting is a good case study in this context. Sedgewick's quicksort in C is short and simple and hundreds of times faster than the fastest purely functional sort in any language. The reason is that quicksort mutates the array in-place. Mutable locals would not help. Same story for most graph algorithms.
The short answer is: there are systems to allow what you want. For example, you can do it using the ST monad in Haskell (as referenced in the comments).
The ST monad approach is from Haskell's Control.Monad.ST. Code written in the ST monad can use references (STRef) where convenient. The nice part is that you can even use the results of the ST monad in pure code, as it is essentially self-contained (this is basically what you were wanting in the question).
The proof of this self-contained property is done through the type-system. The ST monad carries a state-thread parameter, usually denoted with a type-variable s. When you have such a computation you'll have monadic result, with a type like:
foo :: ST s Int
To actually turn this into a pure result, you have to use
runST :: (forall s . ST s a) -> a
You can read this type like: give me a computation where the s type parameter doesn't matter, and I can give you back the result of the computation, without the ST baggage. This basically keeps the mutable ST variables from escaping, as they would carry the s with them, which would be caught by the type system.
This can be used to good effect on pure structures that are implemented with underlying mutable structures (like the vector package). One can cast off the immutability for a limited time to do something that mutates the underlying array in place. For example, one could combine the immutable Vector with an impure algorithms package to keep the most of the performance characteristics of the in place sorting algorithms and still get purity.
In this case it would look something like:
pureSort :: Ord a => Vector a -> Vector a
pureSort vector = runST $ do
mutableVector <- thaw vector
sort mutableVector
freeze mutableVector
The thaw and freeze functions are linear-time copying, but this won't disrupt the overall O(n lg n) running time. You can even use unsafeFreeze to avoid another linear traversal, as the mutable vector isn't used again.
"Pure functional programming languages do not allow mutable data" ... actually it does, you just simply have to recognize where it lies hidden and see it for what it is.
Mutability is where two things have the same name and mutually exclusive times of existence so that they may be treated as "the same thing at different times". But as every Zen philosopher knows, there is no such thing as "same thing at different times". Everything ceases to exist in an instant and is inherited by its successor in possibly changed form, in a (possibly) uncountably-infinite succession of instants.
In the lambda calculus, mutability thus takes the form illustrated by the following example: (λx (λx f(x)) (x+1)) (x+1), which may also be rendered as "let x = x + 1 in let x = x + 1 in f(x)" or just "x = x + 1, x = x + 1, f(x)" in a more C-like notation.
In other words, "name clash" of the "lambda calculus" is actually "update" of imperative programming, in disguise. They are one and the same - in the eyes of the Zen (who is always right).
So, let's refer to each instant and state of the variable as the Zen Scope of an object. One ordinary scope with a mutable object equals many Zen Scopes with constant, unmutable objects that either get initialized if they are the first, or inherit from their predecessor if they are not.
When people say "mutability" they're misidentifying and confusing the issue. Mutability (as we've just seen here) is a complete red herring. What they actually mean (even unbeknonwst to themselves) is infinite mutability; i.e. the kind which occurs in cyclic control flow structures. In other words, what they're actually referring to - as being specifically "imperative" and not "functional" - is not mutability at all, but cyclic control flow structures along with the infinite nesting of Zen Scopes that this entails.
The key feature that lies absent in the lambda calculus is, thus, seen not as something that may be remedied by the inclusion of an overwrought and overthought "solution" like monads (though that doesn't exclude the possibility of it getting the job done) but as infinitary terms.
A control flow structure is the wrapping of an unwrapped (possibility infinite) decision tree structure. Branches may re-converge. In the corresponding unwrapped structure, they appear as replicated, but separate, branches or subtrees. Goto's are direct links to subtrees. A goto or branch that back-branches to an earlier part of a control flow structure (the very genesis of the "cycling" of a cyclic control flow structure) is a link to an identically-shaped copy of the entire structure being linked to. Corresponding to each structure is its Universally Unrolled decision tree.
More precisely, we may think of a control-flow structure as a statement that precedes an actual expression that conditions the value of that expression. The archetypical case in point is Landin's original case, itself (in his 1960's paper, where he tried to lambda-ize imperative languages): let x = 1 in f(x). The "x = 1" part is the statement, the "f(x)" is the value being conditioned by the statement. In C-like form, we could write this as x = 1, f(x).
More generally, corresponding to each statement S and expression Q is an expression S[Q] which represents the result Q after S is applied. Thus, (x = 1)[f(x)] is just λx f(x) (x + 1). The S wraps around the Q. If S contains cyclic control flow structures, the wrapping will be infinitary.
When Landin tried to work out this strategy, he hit a hard wall when he got to the while loop and went "Oops. Never mind." and fell back into what become an overwrought and overthought solution, while this simple (and in retrospect, obvious) answer eluded his notice.
A while loop "while (x < n) x = x + 1;" - which has the "infinite mutability" mentioned above, may itself be treated as an infinitary wrapper, "if (x < n) { x = x + 1; if (x < 1) { x = x + 1; if (x < 1) { x = x + 1; ... } } }". So, when it wraps around an expression Q, the result is (in C-like notation) "x < n? (x = x + 1, x < n? (x = x + 1, x < n? (x = x + 1, ...): Q): Q): Q", which may be directly rendered in lambda form as "x < n? (λx x < n (λx x < n? (λx·...) (x + 1): Q) (x + 1): Q) (x + 1): Q". This shows directly the connection between cyclicity and infinitariness.
This is an infinitary expression that, despite being infinite, has only a finite number of distinct subexpressions. Just as we can think of there being a Universally Unrolled form to this expression - which is similar to what's shown above (an infinite decision tree) - we can also think of there being a Maximally Rolled form, which could be obtained by labelling each of the distinct subexpressions and referring to the labels, instead. The key subexpressions would then be:
A: x < n? goto B: Q
B: x = x + 1, goto A
The subexpression labels, here, are "A:" and "B:", while the references to the subexpressions so labelled as "goto A" and "goto B", respectively. So, by magic, the very essence of Imperativitity emerges directly out of the infinitary lambda calculus, without any need to posit it separately or anew.
This way of viewing things applies even down to the level of binary files. Every interpretation of every byte (whether it be a part of an opcode of an instruction that starts 0, 1, 2 or more bytes back, or as part of a data structure) can be treated as being there in tandem, so that the binary file is a rolling up of a much larger universally unrolled structure whose physical byte code representation overlaps extensively with itself.
Thus, emerges the imperative programming language paradigm automatically out of the pure lambda calculus, itself, when the calculus is extended to include infinitary terms. The control flow structure is directly embodied in the very structure of the infinitary expression, itself; and thus requires no additional hacks (like Landin's or later descendants, like monads) - as it's already there.
This synthesis of the imperative and functional paradigms arose in the late 1980's via the USENET, but has not (yet) been published. Part of it was already implicit in the treatment (dating from around the same time) given to languages, like Prolog-II, and the much earlier treatment of cyclic recursive structures by infinitary expressions by Irene Guessarian LNCS 99 "Algebraic Semantics".
Now, earlier I said that the magma-based formulation might get you to the same place, or to an approximation thereof. I believe there is a kind of universal representation theorem of some sort, which asserts that the infinitary based formulation provides a purely syntactic representation, and that the semantics that arise from the monad-based representation factors through this as "monad-based semantics" = "infinitary lambda calculus" + "semantics of infinitary languages".
Likewise, we may think of the "Q" expressions above as being continuations; so there may also be a universal representation theorem for continuation semantics, which similarly rolls this formulation back into the infinitary lambda calculus.
At this point, I've said nothing about non-rational infinitary terms (i.e. infinitary terms which possess an infinite number of distinct subterms and no finite Minimal Rolling) - particularly in relation to interprocedural control flow semantics. Rational terms suffice to account for loops and branches, and so provide a platform for intraprocedural control flow semantics; but not as much so for the call-return semantics that are the essential core element of interprocedural control flow semantics, if you consider subprograms to be directly represented as embellished, glorified macros.
There may be something similar to the Chomsky hierarchy for infinitary term languages; so that type 3 corresponds to rational terms, type 2 to "algebraic terms" (those that can be rolled up into a finite set of "goto" references and "macro" definitions), and type 0 for "transcendental terms". That is, for me, an unresolved loose end, as well.

Algebraic structure and programming

May anyone give me an example how we can improve our code reusability using algebraic structures like groups, monoids and rings? (or how can i make use of these kind of structures in programming, knowing at least that i didn't learn all that theory in highschool for nothing).
I heard this is possible but i can't figure out a way applying them in programming and genereally applying hardcore mathematics in programming.
It is not really the mathematical stuff that helps as is the mathematical thinking. Abstraction is the key in programming. Transforming real live concepts into numbers and relations is what we do every day. Algebra is the mother of all, algebra is the set of rules that defines correctness, it is the highest level of abstraction, so, understanding algebra means you can think more clear, more faster, more efficient. Commencing from Sets theory to Category Theory, Domain Theory etc everything comes from practical challenges, abstraction and generalization requirements.
In common practice you will not need to actually know these, although if you are thinking of developing stuff like AI Agents, programming languages, fundamental concepts and tools then they are a must.
In functional programming, esp. Haskell, it's common to structure programs that transform states as monads. Doing so means you can reuse generic algorithms on monads in very different programs.
The C++ standard template library features the concept of a monoid. The idea is again that generic algorithms may require an operation to satisfies the axioms of monoids for their correctness.
E.g., if we can prove the type T we're operating on (numbers, string, whatever) is closed under the operation, we know we won't have to check for certain errors; we always get a valid T back. If we can prove that the operation is associative (x * (y * z) = (x * y) * z), then we can reuse the fork-join architecture; a simple but way of parallel programming implemented in various libraries.
Computer science seems to get a lot of milage out of category theory these days. You get monads, monoids, functors -- an entire bestiary of mathematical entities that are being used to improve code reusability, harnessing the abstraction of abstract mathematics.
Lists are free monoids with one generator, binary trees are groups. You have either the finite or infinite variant.
Starting points:
http://en.wikipedia.org/wiki/Algebraic_data_type
http://en.wikipedia.org/wiki/Initial_algebra
http://en.wikipedia.org/wiki/F-algebra
You may want to learn category theory, and the way category theory approaches algebraic structures: it is exactly the way functional programming languages approach data structures, at least shapewise.
Example: the type Tree A is
Tree A = () | Tree A | Tree A * Tree A
which reads as the existence of a isomorphism (*) (I set G = Tree A)
1 + G + G x G -> G
which is the same as a group structure
phi : 1 + G + G x G -> G
() € 1 -> e
x € G -> x^(-1)
(x, y) € G x G -> x * y
Indeed, binary trees can represent expressions, and they form an algebraic structure. An element of G reads as either the identity, an inverse of an element or the product of two elements. A binary tree is either a leaf, a single tree or a pair of trees. Note the similarity in shape.
(*) as well as a universal property, but they are two of them (finite trees or infinite lazy trees) so I won't dwelve into details.
As I had no idea this stuff existed in the computer science world, please disregard this answer ;)
I don't think the two fields (no pun intended) have any overlap. Rings/fields/groups deal with mathematical objects. Consider a part of the definition of a field:
For every a in F, there exists an element −a in F, such that a + (−a) = 0. Similarly, for any a in F other than 0, there exists an element a^−1 in F, such that a · a^−1 = 1. (The elements a + (−b) and a · b^−1 are also denoted a − b and a/b, respectively.) In other words, subtraction and division operations exist.
What the heck does this mean in terms of programming? I surely can't have an additive inverse of a list object in Python (well, I could just destroy the object, but that is like the multiplicative inverse. I guess you could get somewhere trying to define a Python-ring, but it just won't work out in the end). Don't even think about dividing lists...
As for code readability, I have absolutely no idea how this can even be applied, so this application is irrelevant.
This is my interpretation, but being a mathematics major probably makes me blind to other terminology from different fields (you know which one I'm talking about).
Monoids are ubiquitous in programming. In some programming languages, eg. Haskell, we can make monoids explicit http://blog.sigfpe.com/2009/01/haskell-monoids-and-their-uses.html

Resources