I am trying to self-learn SML. Though I can write some SML code, I have not yet had the "aha" moment.
In:
val x = 5;
how does binding a value of 5 to the name x differ from assigning a value of 5 to the memory location/variable x in imperative programming?
How does the above expression elucidate "a style that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data"?
What do I have to throw away about imperative programming to catch on FP quickly?
Please be gentle with me.
How does binding a value of 5 to the name x differ from assigning a value of 5 to the memory location/variable x in imperative programming?
The key difference between variables in functional programming and variables in imperative programming is that variables in functional programming cannot be modified.
Then why are they called “variables”?
Variables in functional programming languages don't vary in the sense that you can modify their value. However, they do vary in the mathematical sense in that they represent some unknown constant value. For example, consider the following quadratic expression:

2x^2 + 7x + 6
This is a quadratic expression in one variable. The variable x varies in the sense that you can select any value for x. However, once you select a certain value for x you cannot change that value.
Now, if you have a quadratic equation then the choice of x is no longer arbitrary. For example, consider the following quadratic equation:

2x^2 + 7x + 6 = 0
The only choices for x are those which satisfy the equation (i.e. x = -2 or x = -1.5).
A mathematical function, on the other hand, is a relation between two sets, called the domain (input set) and the codomain (output set). For example, consider the following function:

f : R -> R, f(x) = 2x^2 + 7x + 6
This function relates every x belonging to the set of real numbers to its corresponding 2x^2 + 7x + 6, also belonging to the set of real numbers. Again, we are free to choose any value for x. However, once we choose some value for x we are not allowed to modify it.
The point is that they are called variables because they vary in the mathematical sense that they can assume different values. Hence, they vary with instance. However, they do not vary with time.
This immutability of variables is important in functional programming because it makes your program referentially transparent (i.e. invoking a function can be thought of as simply replacing the function call with the function body).
If variables were allowed to vary with time then you wouldn't be able to simply substitute the function call with the function body because the substituted variable could change over time. Hence, your substituted function body could become invalid over time.
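As a tiny illustration of that substitution (in Haskell rather than SML, with names made up for this example): because x can never change after it is bound, a call such as double 5 can always be replaced by the function body with 5 substituted for x.

double :: Int -> Int
double x = x + x

result :: Int
result = double 5    -- safe to rewrite as 5 + 5, i.e. 10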
How does the above expression elucidate "a style that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data"?
The expression val x = 5; could be thought of as a function application (fn x => y) 5 where y is the rest of the program. Functional programming is all about functions, and immutability in the sense that variables only vary with instance and not with time.
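To make that concrete, here is a hedged sketch of the same idea in Haskell, where restOfProgram stands in for "the rest of the program":

restOfProgram :: Int -> Int
restOfProgram x = x * x + 1

viaLet, viaApply :: Int
viaLet   = let x = 5 in restOfProgram x   -- binding x to 5 ...
viaApply = (\x -> restOfProgram x) 5      -- ... is the same as applying a function to 5

Both definitions evaluate to 26; the binding form and the explicit function application are interchangeable.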
In particular, mutability is considered bad because it is implicit. Anything that is implicit could potentially cause unforeseeable errors in your program. Hence, the Zen of Python explicitly states that:
Explicit is better than implicit.
In fact, mutable state is the primary cause of software complexity. Hence, functional programming tries to eschew mutability as much as possible. However, that doesn't mean that we only use immutable structures. We can use mutable structures. We just need to be explicit about it.
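For instance, here is a small Haskell sketch of what "being explicit about it" can look like: the IORef type and the IO monad make the mutable cell visible in the types (the function itself is made up for illustration).

import Data.IORef (modifyIORef', newIORef, readIORef)

countTo :: Int -> IO Int
countTo n = do
  counter <- newIORef 0                              -- an explicit mutable cell
  mapM_ (\_ -> modifyIORef' counter (+ 1)) [1 .. n]  -- explicit, visible updates
  readIORef counter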
What do I have to throw away about imperative programming to catch on FP quickly?
Nothing. Functional programming and imperative programming are two ways of thinking about computation. Computation is a very abstract idea. What exactly is computation? How do we know which problems can be computed? These were some of the questions that mathematicians struggled with in the early nineteen hundreds.
To think about computation we need to have a formal model of computation (i.e. a formal system that captures the notion of computation). There are various models of computation. However, the most famous are the lambda calculus (which gave rise to functional programming) and Turing machines (which gave rise to imperative programming).
Both the lambda calculus and the Turing machine are equivalent in power. They both capture the notion of computation, and every problem that can be computed can be expressed as either a lambda calculus expression or an equivalent Turing machine. The only difference is the way in which you express your problem.
You don't have to throw away anything that you learned about imperative programming to understand functional programming. Neither one is better than the other. They are both just different ways of expressing the same problem. However, you do need to start thinking like a functional programmer to write good functional programs.
I am trying to model linear feedback shift registers in haskell. These can be modeled by polynomials over finite fields, so I am using numeric-prelude to get type classes that resemble mathematical algebraic structures more closely than those in the normal prelude.
I am by no means an expert in abstract algebra, so I have become a little confused about the IntegralDomain type class. The problem is that my book on abstract algebra (A book of abstract algebra by Charles C. Pinter) and the type classes seem to be conflicting with each other.
According to the book, the ring of polynomials over an integral domain is itself an integral domain. Also, the ring of polynomials over a field is still only an integral domain, but with the special property (the book points out that it is special) that the division algorithm holds.
That is, if F[x] is the ring of polynomials over a field F, then for a in F[x] and b != 0 in F[x], there exist q, r in F[x] such that b*q + r = a, and the degree of r is less than that of b.
The fact that this property is special to polynomials over a field implies, to me, that it does not hold over an arbitrary integral domain.
On the other hand, according to the type classes of numeric-prelude, a polynomial over a field (that is zeroTestable) is also an IntegralDomain. But according to the documentation, there are several laws of IntegralDomain, one of them being:
(a `div` b) * b + (a `mod` b) === a
http://hackage.haskell.org/packages/archive/numeric-prelude/0.4.0.1/doc/html/Algebra-IntegralDomain.html#t:C
This looks to me like the division algorithm, but then the division algorithm would hold in any integral domain, including a ring of polynomials over an integral domain, contradicting my book. It is also worth noting that a polynomial over an integral domain does not have an instance of IntegralDomain in numeric-prelude (not that I can see, at least; the fact that every type class is simply called C makes the documentation a little hard to read). So maybe the IntegralDomain in numeric-prelude is an integral domain with the extra property that the division algorithm holds?
So is the IntegralDomain in numeric-prelude really an integral domain?
rubber duck debugging post script: While writing this question I got an idea for part of a possible explanation. Is it the requirement that "the degree of r is less than that of b." which makes the whole difference? That requirement is not in the numeric-prelude IntegralDomain. Then again, some of the other laws might imply this fact...
According to the book, a polynomial over an integral domain, is itself an integral domain.
That's not correctly phrased. The ring of polynomials over an integral domain is again an integral domain.
The ring of polynomials in one indeterminate over a field is even a principal ideal domain, as witnessed by the division algorithm, since every polynomial of degree 0 is then a unit.
In a general integral domain R, you have nonzero non-units, and if a is one, then in R[X] you cannot write
X = q*a + r
with the degree of r smaller than that of a (which is 0). For example, in Z[X] with a = 2, there are no q and r with X = q*2 + r and deg r < 0.
Is it the requirement that "the degree of r is less than that of b." which makes the whole difference?
Precisely. That requirement guarantees that the division algorithm terminates. In a general integral domain, you can have a "canonical" choice of remainders modulo any fixed ring element, but the canonical remainder need not be "smaller" in any meaningful way, so an attempt to use the division algorithm need not terminate.
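To see why the degree requirement is exactly what makes the algorithm terminate, here is a hedged, self-contained Haskell sketch of polynomial long division over a field (plain Haskell, not numeric-prelude; the representation and names are mine). The crucial step is head a / head b: over a field the divisor's leading coefficient is always invertible, so each step strictly lowers the degree of the running remainder; over Z that inverse generally does not exist, which is where the algorithm breaks down.

-- Coefficients from highest degree down; [] is the zero polynomial.
type Poly = [Rational]

degree :: Poly -> Int
degree p = length p - 1    -- [] gets -1, standing in for "no degree"

-- polyDivMod a b returns (q, r) with a = q*b + r and degree r < degree b.
-- Precondition: b is not the zero polynomial.
polyDivMod :: Poly -> Poly -> (Poly, Poly)
polyDivMod a b
  | degree a < degree b = ([], a)
  | otherwise           = (lead : replicate (shift - length q') 0 ++ q', r)
  where
    lead     = head a / head b    -- field inverse of b's leading coefficient
    shift    = degree a - degree b
    stripped = dropWhile (== 0) (zipWith (-) a (map (* lead) b ++ replicate shift 0))
    (q', r)  = polyDivMod stripped b

-- polyDivMod [1, 3, 2] [1, 1] == ([1, 2], [])   -- x^2 + 3x + 2 = (x + 1)(x + 2) + 0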
Then again, some of the other laws might imply this fact
None of the laws in Algebra.IntegralDomain imply that.
The law
(a+k*b) `mod` b === a `mod` b
is, I believe, hard to implement for a completely general integral domain, which could somewhat restrict the actual instances, but for something like Z[X] or R[X,Y] which are not PIDs, an instance is possible.
I'm struggling with whether or not this is decidable:
A = {x is an element of the set of Natural Numbers | for every y greater than x, 2y is the sum of two primes}
I'm inclined to think that this is decidable, given the fact that when fed into a Turing machine it will never reach an accept state and will loop forever unless it rejects. However, I also know that for a language to be decidable, there must only exist an algorithm to decide it; we don't necessarily have to know how it's done. With this, part of me thinks that it is decidable. Does anyone know how to prove it either way?
This language is decidable, though the proof is a bit evil.
For starters, let's think about the properties of this language. Clearly, if n is a natural number contained in the language, then every number greater than n is also in the language. Thus there are three possible forms this language can take:
1. This language contains all natural numbers, or
2. This language contains no natural numbers, or
3. This language contains all natural numbers greater than some natural number n.
Languages (1) and (2) are, respectively, {0, 1}* and the empty language, both of which are decidable (so there are TMs that always halt that accept those languages). Every language of form (3) is also decidable, because for any n we can easily write a TM with n hardcoded into it that simply checks whether the input is at least n. Consequently, no matter which case is true (either 1, 2, or 3), there exists some TM that always halts whose language is the language you've provided, so your language is decidable.
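To make the case analysis concrete, here is an illustrative Haskell sketch of the three families of always-halting deciders, with plain functions on numbers standing in for Turing machines (the names are mine):

acceptAll, acceptNone :: Integer -> Bool
acceptAll _  = True     -- decides a language of form (1)
acceptNone _ = False    -- decides a language of form (2)

acceptFrom :: Integer -> Integer -> Bool
acceptFrom n x = x >= n -- a form-(3) decider with n hardcoded

Each of these halts on every input, so whichever case holds, a decider exists.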
But that said, this proof is nonconstructive. We can show that the language has to be decidable, but we can't actually find the TM that always halts that accepts it! In fact, no one knows which TM it is, because Goldbach's Conjecture (whether every even number greater than two is the sum of two primes) is an open problem in mathematics.
Hope this helps!
Pure functional programming languages do not allow mutable data, but some computations are more naturally/intuitively expressed in an imperative way -- or an imperative version of an algorithm may be more efficient. I am aware that most functional languages are not pure, and let you assign/reassign variables and do imperative things but generally discourage it.
My question is, why not allow local state to be manipulated in local variables, but require that functions can only access their own locals and global constants (or just constants defined in an outer scope)? That way, all functions maintain referential transparency (they always give the same return value given the same arguments), but within a function, a computation can be expressed in imperative terms (like, say, a while loop).
IO and such could still be accomplished in the normal functional ways - through monads or passing around a "world" or "universe" token.
My question is, why not allow local state to be manipulated in local variables, but require that functions can only access their own locals and global constants (or just constants defined in an outer scope)?
Good question. I think the answer is that mutable locals are of limited practical value but mutable heap-allocated data structures (primarily arrays) are enormously valuable and form the backbone of many important collections including efficient stacks, queues, sets and dictionaries. So restricting mutation to locals only would not give an otherwise purely functional language any of the important benefits of mutation.
On a related note, communicating sequential processes exchanging purely functional data structures offer many of the benefits of both worlds because the sequential processes can use mutation internally, e.g. mutable message queues are ~10x faster than any purely functional queues. For example, this is idiomatic in F# where the code in a MailboxProcessor uses mutable data structures but the messages communicated between them are immutable.
Sorting is a good case study in this context. Sedgewick's quicksort in C is short and simple and hundreds of times faster than the fastest purely functional sort in any language. The reason is that quicksort mutates the array in-place. Mutable locals would not help. Same story for most graph algorithms.
The short answer is: there are systems to allow what you want. For example, you can do it using the ST monad in Haskell (as referenced in the comments).
The ST monad approach is from Haskell's Control.Monad.ST. Code written in the ST monad can use references (STRef) where convenient. The nice part is that you can even use the results of the ST monad in pure code, as it is essentially self-contained (this is basically what you were wanting in the question).
The proof of this self-contained property is done through the type system. The ST monad carries a state-thread parameter, usually denoted with a type variable s. When you have such a computation you'll have a monadic result, with a type like:
foo :: ST s Int
To actually turn this into a pure result, you have to use
runST :: (forall s . ST s a) -> a
You can read this type like: give me a computation where the s type parameter doesn't matter, and I can give you back the result of the computation, without the ST baggage. This basically keeps the mutable ST variables from escaping, as they would carry the s with them, which would be caught by the type system.
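As a small, hedged example of the pattern (names are mine): a sum computed with a local mutable accumulator, wrapped up with runST so the caller sees an ordinary pure function.

import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

sumST :: Num a => [a] -> a
sumST xs = runST $ do
  acc <- newSTRef 0                          -- local mutable state
  mapM_ (\x -> modifySTRef' acc (+ x)) xs    -- imperative-style accumulation
  readSTRef acc                              -- the value escapes; the STRef does not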
This can be used to good effect on pure structures that are implemented with underlying mutable structures (like the vector package). One can cast off the immutability for a limited time to do something that mutates the underlying array in place. For example, one could combine the immutable Vector with an impure algorithms package to keep most of the performance characteristics of the in-place sorting algorithms and still get purity.
In this case it would look something like:
import Control.Monad.ST (runST)
import Data.Vector (Vector, freeze, thaw)
import qualified Data.Vector.Algorithms.Intro as Intro  -- one possible in-place sort (vector-algorithms)

pureSort :: Ord a => Vector a -> Vector a
pureSort vector = runST $ do
  mutableVector <- thaw vector   -- copy into a mutable vector
  Intro.sort mutableVector       -- sort the mutable copy in place
  freeze mutableVector           -- copy back into an immutable Vector
The thaw and freeze functions are linear-time copying, but this won't disrupt the overall O(n lg n) running time. You can even use unsafeFreeze to avoid another linear traversal, as the mutable vector isn't used again.
"Pure functional programming languages do not allow mutable data" ... actually it does, you just simply have to recognize where it lies hidden and see it for what it is.
Mutability is where two things have the same name and mutually exclusive times of existence so that they may be treated as "the same thing at different times". But as every Zen philosopher knows, there is no such thing as "same thing at different times". Everything ceases to exist in an instant and is inherited by its successor in possibly changed form, in a (possibly) uncountably-infinite succession of instants.
In the lambda calculus, mutability thus takes the form illustrated by the following example: (λx (λx f(x)) (x+1)) (x+1), which may also be rendered as "let x = x + 1 in let x = x + 1 in f(x)" or just "x = x + 1, x = x + 1, f(x)" in a more C-like notation.
In other words, "name clash" of the "lambda calculus" is actually "update" of imperative programming, in disguise. They are one and the same - in the eyes of the Zen (who is always right).
So, let's refer to each instant and state of the variable as the Zen Scope of an object. One ordinary scope with a mutable object equals many Zen Scopes with constant, immutable objects that either get initialized if they are the first, or inherit from their predecessor if they are not.
When people say "mutability" they're misidentifying and confusing the issue. Mutability (as we've just seen here) is a complete red herring. What they actually mean (even unbeknonwst to themselves) is infinite mutability; i.e. the kind which occurs in cyclic control flow structures. In other words, what they're actually referring to - as being specifically "imperative" and not "functional" - is not mutability at all, but cyclic control flow structures along with the infinite nesting of Zen Scopes that this entails.
The key feature that is absent from the lambda calculus is thus seen not as something that may be remedied by the inclusion of an overwrought and overthought "solution" like monads (though that doesn't exclude the possibility of monads getting the job done), but as infinitary terms.
A control flow structure is the wrapping of an unwrapped (possibly infinite) decision tree structure. Branches may re-converge. In the corresponding unwrapped structure, they appear as replicated, but separate, branches or subtrees. Gotos are direct links to subtrees. A goto or branch that back-branches to an earlier part of a control flow structure (the very genesis of the "cycling" of a cyclic control flow structure) is a link to an identically-shaped copy of the entire structure being linked to. Corresponding to each structure is its Universally Unrolled decision tree.
More precisely, we may think of a control-flow structure as a statement that precedes an actual expression that conditions the value of that expression. The archetypical case in point is Landin's original case, itself (in his 1960's paper, where he tried to lambda-ize imperative languages): let x = 1 in f(x). The "x = 1" part is the statement, the "f(x)" is the value being conditioned by the statement. In C-like form, we could write this as x = 1, f(x).
More generally, corresponding to each statement S and expression Q is an expression S[Q] which represents the result of Q after S is applied. Thus, (x = 1)[f(x)] is just (λx f(x)) 1. The S wraps around the Q. If S contains cyclic control flow structures, the wrapping will be infinitary.
When Landin tried to work out this strategy, he hit a hard wall when he got to the while loop and went "Oops. Never mind." and fell back into what became an overwrought and overthought solution, while this simple (and in retrospect, obvious) answer eluded his notice.
A while loop "while (x < n) x = x + 1;" - which has the "infinite mutability" mentioned above, may itself be treated as an infinitary wrapper, "if (x < n) { x = x + 1; if (x < 1) { x = x + 1; if (x < 1) { x = x + 1; ... } } }". So, when it wraps around an expression Q, the result is (in C-like notation) "x < n? (x = x + 1, x < n? (x = x + 1, x < n? (x = x + 1, ...): Q): Q): Q", which may be directly rendered in lambda form as "x < n? (λx x < n (λx x < n? (λx·...) (x + 1): Q) (x + 1): Q) (x + 1): Q". This shows directly the connection between cyclicity and infinitariness.
This is an infinitary expression that, despite being infinite, has only a finite number of distinct subexpressions. Just as we can think of there being a Universally Unrolled form to this expression - which is similar to what's shown above (an infinite decision tree) - we can also think of there being a Maximally Rolled form, which could be obtained by labelling each of the distinct subexpressions and referring to the labels, instead. The key subexpressions would then be:
A: x < n? goto B: Q
B: x = x + 1, goto A
The subexpression labels, here, are "A:" and "B:", while the references to the subexpressions so labelled are "goto A" and "goto B", respectively. So, by magic, the very essence of imperativity emerges directly out of the infinitary lambda calculus, without any need to posit it separately or anew.
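One way to see the rolled form in code: the labelled subexpressions A and B correspond to a pair of mutually recursive definitions, with the continuation Q passed in as a parameter. This is only a sketch of the idea, in Haskell, with names chosen for illustration.

whileLoop :: Int -> (Int -> r) -> Int -> r
whileLoop n q = a
  where
    a x = if x < n then b x else q x   -- A: x < n ? goto B : Q
    b x = a (x + 1)                    -- B: x = x + 1, goto A

-- whileLoop 10 id 3 == 10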
This way of viewing things applies even down to the level of binary files. Every interpretation of every byte (whether it be a part of an opcode of an instruction that starts 0, 1, 2 or more bytes back, or as part of a data structure) can be treated as being there in tandem, so that the binary file is a rolling up of a much larger universally unrolled structure whose physical byte code representation overlaps extensively with itself.
Thus, emerges the imperative programming language paradigm automatically out of the pure lambda calculus, itself, when the calculus is extended to include infinitary terms. The control flow structure is directly embodied in the very structure of the infinitary expression, itself; and thus requires no additional hacks (like Landin's or later descendants, like monads) - as it's already there.
This synthesis of the imperative and functional paradigms arose in the late 1980's via the USENET, but has not (yet) been published. Part of it was already implicit in the treatment (dating from around the same time) given to languages, like Prolog-II, and the much earlier treatment of cyclic recursive structures by infinitary expressions by Irene Guessarian LNCS 99 "Algebraic Semantics".
Now, earlier I said that the monad-based formulation might get you to the same place, or to an approximation thereof. I believe there is a kind of universal representation theorem of some sort, which asserts that the infinitary-based formulation provides a purely syntactic representation, and that the semantics arising from the monad-based representation factor through this as "monad-based semantics" = "infinitary lambda calculus" + "semantics of infinitary languages".
Likewise, we may think of the "Q" expressions above as being continuations; so there may also be a universal representation theorem for continuation semantics, which similarly rolls this formulation back into the infinitary lambda calculus.
At this point, I've said nothing about non-rational infinitary terms (i.e. infinitary terms which possess an infinite number of distinct subterms and no finite Minimal Rolling) - particularly in relation to interprocedural control flow semantics. Rational terms suffice to account for loops and branches, and so provide a platform for intraprocedural control flow semantics; but not as much so for the call-return semantics that are the essential core element of interprocedural control flow semantics, if you consider subprograms to be directly represented as embellished, glorified macros.
There may be something similar to the Chomsky hierarchy for infinitary term languages; so that type 3 corresponds to rational terms, type 2 to "algebraic terms" (those that can be rolled up into a finite set of "goto" references and "macro" definitions), and type 0 for "transcendental terms". That is, for me, an unresolved loose end, as well.
Can anyone give me an example of how we can improve code reusability using algebraic structures like groups, monoids and rings? (Or how I can make use of these kinds of structures in programming, so that I at least know I didn't learn all that theory in high school for nothing.)
I've heard this is possible, but I can't figure out a way to apply them in programming, or more generally how to apply hardcore mathematics in programming.
It is not really the mathematics itself that helps so much as the mathematical thinking. Abstraction is the key in programming. Transforming real-life concepts into numbers and relations is what we do every day. Algebra is the mother of it all: it is the set of rules that defines correctness, the highest level of abstraction, so understanding algebra means you can think more clearly, faster, and more efficiently. From set theory to category theory, domain theory, etc., everything comes from practical challenges and from the need for abstraction and generalization.
In common practice you will not need to actually know these, although if you are thinking of developing things like AI agents, programming languages, or fundamental concepts and tools, then they are a must.
In functional programming, esp. Haskell, it's common to structure programs that transform states as monads. Doing so means you can reuse generic algorithms on monads in very different programs.
The C++ standard template library features the concept of a monoid. The idea is again that generic algorithms may require an operation to satisfy the axioms of a monoid for their correctness.
E.g., if we can prove the type T we're operating on (numbers, strings, whatever) is closed under the operation, we know we won't have to check for certain errors; we always get a valid T back. If we can prove that the operation is associative (x * (y * z) = (x * y) * z), then we can reuse the fork-join architecture: a simple but effective way of parallel programming implemented in various libraries.
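A minimal Haskell illustration of that reuse (the chunking here is sequential; a real fork-join library would evaluate the chunks in parallel):

import Data.Monoid (Sum (..))

-- Because (<>) on Sum is associative, the reduction may be split into chunks
-- and the partial results combined in any grouping.
total :: [Int] -> Int
total xs = getSum (foldMap Sum left <> foldMap Sum right)
  where
    (left, right) = splitAt (length xs `div` 2) xs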
Computer science seems to get a lot of mileage out of category theory these days. You get monads, monoids, functors -- an entire bestiary of mathematical entities that are being used to improve code reusability, harnessing the abstraction of abstract mathematics.
Lists are free monoids with one generator, binary trees are groups. You have either the finite or infinite variant.
Starting points:
http://en.wikipedia.org/wiki/Algebraic_data_type
http://en.wikipedia.org/wiki/Initial_algebra
http://en.wikipedia.org/wiki/F-algebra
You may want to learn category theory, and the way category theory approaches algebraic structures: it is exactly the way functional programming languages approach data structures, at least shapewise.
Example: the type Tree A is
Tree A = () | Tree A | Tree A * Tree A
which reads as the existence of an isomorphism (*) (I set G = Tree A)
1 + G + G x G -> G
which is the same as a group structure
phi : 1 + G + G x G -> G
() ∈ 1 -> e
x ∈ G -> x^(-1)
(x, y) ∈ G x G -> x * y
Indeed, binary trees can represent expressions, and they form an algebraic structure. An element of G reads as either the identity, an inverse of an element or the product of two elements. A binary tree is either a leaf, a single tree or a pair of trees. Note the similarity in shape.
(*) as well as a universal property, but there are two of them (finite trees or infinite lazy trees) so I won't delve into details.
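A hedged Haskell sketch of the shape described above, evaluated into one concrete group (the integers under addition); the constructor and function names are mine.

data Tree = Leaf | Single Tree | Pair Tree Tree

evalZ :: Tree -> Integer
evalZ Leaf       = 0                    -- () in 1 maps to the identity e
evalZ (Single t) = negate (evalZ t)     -- a single subtree maps to an inverse
evalZ (Pair l r) = evalZ l + evalZ r    -- a pair of subtrees maps to a product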
As I had no idea this stuff existed in the computer science world, please disregard this answer ;)
I don't think the two fields (no pun intended) have any overlap. Rings/fields/groups deal with mathematical objects. Consider a part of the definition of a field:
For every a in F, there exists an element −a in F, such that a + (−a) = 0. Similarly, for any a in F other than 0, there exists an element a^−1 in F, such that a · a^−1 = 1. (The elements a + (−b) and a · b^−1 are also denoted a − b and a/b, respectively.) In other words, subtraction and division operations exist.
What the heck does this mean in terms of programming? I surely can't have an additive inverse of a list object in Python (well, I could just destroy the object, but that is like the multiplicative inverse. I guess you could get somewhere trying to define a Python-ring, but it just won't work out in the end). Don't even think about dividing lists...
As for code readability, I have absolutely no idea how this can even be applied, so this application is irrelevant.
This is my interpretation, but being a mathematics major probably makes me blind to other terminology from different fields (you know which one I'm talking about).
Monoids are ubiquitous in programming. In some programming languages, e.g. Haskell, we can make monoids explicit: http://blog.sigfpe.com/2009/01/haskell-monoids-and-their-uses.html
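For example, a minimal sketch of an explicit monoid in Haskell (a made-up newtype whose combining operation is max); any generic function written against Monoid, such as mconcat, then works for it immediately.

newtype MaxInt = MaxInt { getMaxInt :: Int }

instance Semigroup MaxInt where
  MaxInt a <> MaxInt b = MaxInt (max a b)

instance Monoid MaxInt where
  mempty = MaxInt minBound

-- getMaxInt (mconcat (map MaxInt [3, 1, 4, 1, 5])) == 5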