Why Erlang functions are not memoized by default? - functional-programming

One of the properties of functional languages is that functions do not have side effects, thus the same input should always yield the same output. It seems that those languages could easily and heavily benefit from memoization.
But, at least regarding Erlang, there is no default memoization of the functions' calls. Is there any particular reason that Erlang (and other functional languages as far as I know) do memoize by default (or with a simple trigger), or at least have explicit, good support of memoization?
Is there something inherently wrong with memoization?
One reason I can imagine is that with memoization your memory footprint can grow quickly. Right, but Erlang already runs on VM and manages memory, so I guess it could tame caches and prevent them from growing quite easily.
Related:
Memoization in Erlang
How does one use cached data in a functional language such as Erlang?
Simple example for Erlang memoization
Edit:
What is an idempotent operation?
Too big too fail :)

There are a number of mistaken assumptions in this question.
One of the properties of functional languages is that functions do not have side effects
Incorrect, only "purely functional languages" have such a constraint. Erlang is not purely functional.
It permits arbitrary side effects in functions.
Is there any particular reason that Erlang (and other functional languages as far as I know) do memoize by default (or with a simple trigger), or at least have explicit, good support of memoization?
No languages (at least non-toy ones) implement by-default memoization of all function calls. Why? Massive space leaks would ensue.

Related

Are there any modern languages not supporting recursion?

There is a lot of talk about loop-less (functional) programming languages, but I haven't heard of any modern recursive-less programming languages. I know COBOL doesn't (or didn't) support recursion and I imagine it's the same for FORTRAN (or at the very least they will have a weird behaviour), but are there any modern languages?
There are several reasons I am wondering about this...
From what I understand there is no recursive algorithm for which no iterative implementation exists that is at least as performant (as you can simply "simulate" recursion iteratively, when managing the stack within the loop). So there is a) no performance gain and b) no capability gain from using recursion.
Recursion can have worse performance than iteration for things like naiive, recursive implementations of fibonacci numbers. People like to point out that recursion can be just as fast when compilers are optimized and it comes to tail recursion, but from my understanding tail recursion algos always are trivial to convert to loops. So there is c) potential performance loss from recursion.
Academics seem to have a thing for recursion, but then again most of them can't/don't really write any productive code. C++ (or object orientation in general) is the living proof how academic ideas can make their way into the business world/companies, while beeing mostly impractical. I wonder if recursion is only another example of this. (I know this is an opinionated topic, but the majority of accomplished software engineers I heard talk agree on this at least to some extent).
I have written recursive algorithms in productive systems before, but I can't think of a time, when I really benefited from using recursion. Recursive algorithms d) may or may not provide code readability advantages in some cases. I am not sure about this...
Most (all?) modern operating systems rely on specified call stack conventions. So if you want to call any external library from within your program, your programming language/compiler has to comply with the system's call stack convention. This makes it an easy step to adopt recursion for programming languages/compilers. However if you would create a recursion-less language you wouldn't need a call stack at all, which would have interesting effects in terms of memory management...
So I am not hating on recursion... I am just wondering if the recursion-less language experiment has been done in the recent past and what the results have been, as it seems like a practical experiment to me?

2 questions at the end of a functional programming course

Here seems to be the two biggest things I can take from the How to Design Programs (simplified Racket) course I just finished, straight from the lecture notes of the course:
1) Tail call optimization, and the lack thereof in non-functional languages:
Sadly, most other languages do not support TAIL CALL
OPTIMIZATION. Put another way, they do build up a stack
even for tail calls.
Tail call optimization was invented in the mid 70s, long
after the main elements of most languages were developed.
Because they do not have tail call optimization, these
languages provide a fixed set of LOOPING CONSTRUCTS that
make it possible to traverse arbitrary sized data.
a) What are the equivalents to this type of optimization in procedural languages that don't feature it?
b) Do using those equivalents mean we avoid building up a stack in similar situations in languages that don't have it?
2) Mutation and multicore processors
This mechanism is fundamental in almost any other language you
program in. We have delayed introducing it until now for
several reasons:
despite being fundamental, it is surprisingly complex
overuse of it leads to programs that are not amenable
to parallelization (running on multiple processors).
Since multi-core computers are now common, the ability
to use mutation only when needed is becoming more and
more important
overuse of mutation can also make it difficult to
understand programs, and difficult to test them well
But mutable variables are important, and learning this mechanism
will give you more preparation to work with Java, Python and many
other languages. Even in such languages, you want to use a style
called "mostly functional programming".
I learned some Java, Python and C++ before taking this course, so came to take mutation for granted. Now that has been all thrown in the air by the above statement. My questions are:
a) where could I find more detailed information regarding what is suggested in the 2nd bullet, and what to do about it, and
b) what kind of patterns would emerge from a "mostly functional programming" style, as opposed to a more careless style I probably would have had had I continued on with those other languages instead of taking this course?
As Leppie points out, looping constructs manage to recover the space savings of proper tail calling, for the particular kinds of loops that they support. The only problem with looping constructs is that the ones you have are never enough, unless you just hurl the ball into the user's court and force them to model the stack explicitly.
To take an example, suppose you're traversing a binary tree using a loop. It works... but you need to explicitly keep track of the "ones to come back to." A recursive traversal in a tail-calling language allows you to have your cake and eat it too, by not wasting space when not required, and not forcing you to keep track of the stack yourself.
Your question on parallelism and concurrency is much more wide-open, and the best pointers are probably to areas of research, rather than existing solutions. I think that most would agree that there's a crisis going on in the computing world; how do we adapt our mutation-heavy programming skills to the new multi-core world?
Simply switching to a functional paradigm isn't a silver bullet here, either; we still don't know how to write high-level code and generate blazing fast non-mutating run-concurrently code. Lots of folks are working on this, though!
To expand on the "mutability makes parallelism hard" concept, when you have multiple cores going, you have to use synchronisation if you want to modify something from one core and have it be seen consistently by all the other cores.
Getting synchronisation right is hard. If you over-synchronise, you have deadlocks, slow (serial rather than parallel) performance, etc. If you under-synchronise, you have partially-observed changes (where another core sees only a portion of the changes you made from a different core), leaving your objects observed in an invalid "halfway changed" state.
It is for that reason that many functional programming languages encourage a message-queue concept instead of a shared state concept. In that case, the only shared state is the message queue, and managing synchronisation in a message queue is a solved problem.
a) What are the equivalents to this type of optimization in procedural languages that don't feature it? b) Do using those equivalents mean we avoid building up a stack in similar situations in languages that don't have it?
Well, the significance of a tail call is that it can evaluate another function without adding to the call stack, so anything that builds up the stack can't really be called an equivalent.
A tail call behaves essentially like a jump to the new code, using the language trappings of a function call and all the appropriate detail management. So in languages without this optimization, you'd use a jump within a single function. Loops, conditional blocks, or even arbitrary goto statements if nothing else works.
a) where could I find more detailed information regarding what is suggested in the 2nd bullet, and what to do about it
The second bullet sounds like an oversimplification. There are many ways to make parallelization more difficult than it needs to be, and overuse of mutation is just one.
However, note that parallelization (splitting a task into pieces that can be done simultaneously) is not entirely the same thing as concurrency (having multiple tasks executed simultaneously that may interact), though there's certainly overlap. Avoiding mutation is incredibly helpful in writing concurrent programs, since immutable data avoids a lot of race conditions and resource contention that would otherwise be possible.
b) what kind of patterns would emerge from a "mostly functional programming" style, as opposed to a more careless style I probably would have had had I continued on with those other languages instead of taking this course?
Have you looked at Haskell or Clojure? Both are heavily inclined to a very functional style emphasizing controlled mutation. Haskell is more rigorous about it but has a lot of tools for working with limited forms of mutability, while Clojure is a bit more informal and might be more familiar to you since it's another Lisp dialect.

Do purely functional languages really guarantee immutability?

In a purely functional language, couldn't one still define an "assignment" operator, say, "<-", such that the command, say, "i <- 3", instead of directly assigning the immutable variable i, would create a copy of the entire current call stack, except replacing i with 3 in the new call stack, and executing the new call stack from that point onward? Given that no data actually changed, wouldn't that still be considered "purely functional" by definition? Of course the compiler would simply make the optimization to simply assign 3 to i, in which case what's the difference between imperative and purely functional?
Purely functional languages, such as Haskell, have ways of modelling imperative languages, and they are not shy about admitting it either. :)
See http://www.haskell.org/tutorial/io.html, in particular 7.5:
So, in the end, has Haskell simply
re-invented the imperative wheel?
In some sense, yes. The I/O monad
constitutes a small imperative
sub-language inside Haskell, and thus
the I/O component of a program may
appear similar to ordinary imperative
code. But there is one important
difference: There is no special
semantics that the user needs to deal
with. In particular, equational
reasoning in Haskell is not
compromised. The imperative feel of
the monadic code in a program does not
detract from the functional aspect of
Haskell. An experienced functional
programmer should be able to minimize
the imperative component of the
program, only using the I/O monad for
a minimal amount of top-level
sequencing. The monad cleanly
separates the functional and
imperative program components. In
contrast, imperative languages with
functional subsets do not generally
have any well-defined barrier between
the purely functional and imperative
worlds.
So the value of functional languages is not that they make state mutation impossible, but that they provide a way to allow you to keep the purely functional parts of your program separate from the state-mutating parts.
Of course, you can ignore this and write your entire program in the imperative style, but then you won't be taking advantage of the facilities of the language, so why use it?
Update
Your idea is not as flawed as you assume. Firstly, if someone familiar only with imperative languages wanted to loop through a range of integers, they might wonder how this could be achieved without a way to increment a counter.
But of course instead you just write a function that acts as the body of the loop, and then make it call itself. Each invocation of the function corresponds to an "iteration step". And in the scope of each invocation the parameter has a different value, acting like an incrementing variable. Finally, the runtime can note that the recursive call appears at the end of the invocation, and so it can reuse the top of the function-call stack instead of growing it (tail call). Even this simple pattern has almost all of the flavour of your idea - including the compiler/runtime quietly stepping in and actually making mutation occur (overwriting the top of the stack). Not only is it logically equivalent to a loop with a mutating counter, but in fact it makes the CPU and memory do the same thing physically.
You mention a GetStack that would return the current stack as a data structure. That would indeed be a violation of functional purity, given that it would necessarily return something different each time it was called (with no arguments). But how about a function CallWithStack, to which you pass a function of your own, and it calls back to your function and passes it the current stack as a parameter? That would be perfectly okay. CallCC works a bit like that.
Haskell doesn't readily give you ways to introspect or "execute" call stacks, so I wouldn't worry too much about that particular bizarre scheme. However in general it is true that one can subvert the type system using unsafe "functions" such as unsafePerformIO :: IO a -> a. The idea is to make it difficult, not impossible, to violate purity.
Indeed, in many situations, such as when making Haskell bindings for a C library, these mechanisms are quite necessary... by using them you are removing the burden of proof of purity from the compiler and taking it upon yourself.
There is a proposal to actually guarantee safety by outlawing such subversions of the type system; I'm not too familiar with it, but you can read about it here.
Immutability is a property of the language, not of the implementation.
An operation a <- expr that copies data is still an imperative operation, if values that refer to the location a appear to have changed from the programmers point of view.
Likewise, a purely functional language implementation may overwrite and reuse variables to its heart's content, as long as each modification is invisible to the programmer. For example, the map function can in principle overwrite a list instead of creating a new, whenever the language implementation can deduce that the old list won't be needed anywhere.

Is Erlang really a functional language? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I hear all the time that Erlang is a functional language, yet it is easy to call databases or non side-effect free code from a function, and commands are easily ordered by using "," commas between them just like Ruby or another language, so where is the "functional" part of Erlang?
The central idea is that each process is a functional program over an input stream of messages. The result from the functional program is an output stream of messages to others. From this perspective, Erlang is a rather clean functional language; there are no destructive updates to data structures (like setcar in Lisp and most Schemes).
With few exceptions, all built-in functions such as operations on ETS tables also follow this model: apart from efficiency issues, those BIFs could actually have been implemented with pure Erlang processes and message passing.
So yes, the Erlang language is functional, but a collection of interacting Erlang processes is a different thing. Each process is an ongoing computation, and as such it has a current state, which can change in relation to the other processes. Even a database is just another process in this respect.
In my mind, this is one of the most important things about Erlang: outside the process, there could be a storm raging, but inside, things are calm, letting you focus on what that process should do - and only that.
There's a meme that functional languages must have immutable values and be side-effect free, but I say that any language with first-class functions is a functional programming language.
It is useful and powerful to have strong controls over value mutability and side effects, but I believe these are peripheral to functional programming. Nice to have, but not essential. Even in languages that do have these properties, there is always a way to escape the purity of the paradigm.1
There is a grayscale continuum between truly pure FP languages that you can't actually use for anything practical and languages that are really quite impure but still have some of the FP nature to them:
Book FP: Introductory books on FP languages frequently show only a subset of the language, with all examples done within the language's REPL, so that you never get to see the purely functional paradigm get broken. A good example of this is Scheme as presented in The Little Schemer.
You can come away from reading such a book with the false impression that FP languages can't actually do anything useful.
Haskell: The creators of Haskell went to uncommon lengths to wall off impurity via the famous I/O monad. Everything on one side of the wall is purely functional, so that the compiler can reason confidently about the code.
But the important thing is, despite the existence of this wall, you have to use the I/O monad to get anything useful done in Haskell.2 In that sense, Haskell isn't as "pure" as some would like you to believe. The I/O monad allows you to build any sort of "impure" software you like in Haskell: database clients, highly stateful GUIs, etc.
Erlang: Has immutable values and first-class functions, but lacks a strong wall between the core language and the impure bits.
Erlang ships with Mnesia, a disk-backed in-memory DBMS, which is about as impure as software gets. It's scarcely different in practice from a global variable store. Erlang also has great support for communicating with external programs via ports, sockets, etc.
Erlang doesn't impose any kind of purity policy on you, the programmer, at the language level. It just gives you the tools and lets you decide how to use them.
OCaml and F#: These closely-related multiparadigm languages include both purely functional elements as well as imperative and object-oriented characteristics.
The imperative programming bits allow you to do things like write a traditional for loop with a mutable counter variable, whereas a pure FP program would probably try to recurse over a list instead to accomplish the same result.
The OO parts are pretty much useless without the mutable keyword, which turns a value definition into a variable, so that the compiler won't complain if you change the variable's value. Mutable variables in OCaml and F# have some limitations, but you can escape even those limitations with the ref keyword.
If you're using F# with .NET, you're going to be mutating values all the time, because most of .NET is mutable, in one way or another. Any .NET object with a settable property is mutable, for example, and all the GUI stuff inherently has side-effects. The only immutable part of .NET that immediately comes to mind is System.String.
Despite all this, no one would argue that OCaml and F# are not functional programming languages.
JavaScript, R, Lua, Perl...: There are many languages even less pure than OCaml which can still be considered functional in some sense. Such languages have first-class functions, but values are mutable by default.
Foototes:
Any truly pure FP language is a toy language or someone's research project.
That is, unless your idea of "useful" is to keep everything in the ghci REPL. You can use Haskell like a glorified calculator, if you like, so it's pure FP.
Yes, it's a functional language. It's not a pure functional language like Haskell, but then again, neither is LISP (and nobody really argues that LISP isn't functional).
The message-passing/process handling of Erlang is an implementation of the Actor model. You could argue that Erlang is an Actor language, with a functional language used for the individual Actors.
The functional part is that you tend to pass around functions. Most langauges can be used both as a functional language, and as an imperative language, even C (it's quite possible to make a program consisting of only function pointers and constants).
I guess the distinguishing factor is usually the lack of mutable variables in functional languages.

Do functional languages cope well with complexity?

I am curious how functional languages compare (in general) to more "traditional" languages such as C# and Java for large programs. Does program flow become difficult to follow more quickly than if a non-functional language is used? Are there other issues or things to consider when writing a large software project using a functional language?
Thanks!
Functional programming aims to reduce the complexity of large systems, by isolating each operation from others. When you program without side-effects, you know that you can look at each function individually - yes, understanding that one function may well involve understanding other functions too, but at least you know it won't interfere with some other piece of system state elsewhere.
Of course this is assuming completely pure functional programming - which certainly isn't always the case. You can use more traditional languages in a functional way too, avoiding side-effects where possible. But the principle is an important one: avoiding side-effects leads to more maintainable, understandable and testable code.
Does program flow become difficult to follow more quickly than if a >non-functional language is used?
"Program flow" is probably the wrong concept to analyze a large functional program. Control flow can become baroque because there are higher-order functions, but these are generally easy to understand because there is rarely any shared mutable state to worry about, so you can just think about arguments and results. Certainly my experience is that I find it much easier to follow an aggressively functional program than an aggressively object-oriented program where parts of the implementation are smeared out over many classes. And I find it easier to follow a program written with higher-order functions than with dynamic dispatch. I also observe that my students, who are more representative of programmers as a whole, have difficulties with both inheritance and dynamic dispatch. They do not have comparable difficulties with higher-order functions.
Are there other issues or things to consider when writing a large
software project using a functional language?
The critical thing is a good module system. Here is some commentary.
The most powerful module system I know of the unit system of PLT Scheme designed by Matthew Flatt and Matthias Felleisen. This very powerful system unfortunately lacks static types, which I find a great aid to programming.
The next most powerful system is the Standard ML module system. Unfortunately Standard ML, while very expressive, also permits a great many questionable constructs, so it is easy for an amateur to make a real mess. Also, many programmers find it difficult to use Standard ML modules effectively.
The Objective Caml module system is very similar, but there are some differences which tend to mitigate the worst excesses of Standard ML. The languages are actually very similar, but the styles and idioms of Objective Caml make it significantly less likely that beginners will write insane programs.
The least powerful/expressive module system for a functional langauge is the Haskell module system. This system has a grave defect that there are no explicit interfaces, so most of the cognitive benefit of having modules is lost. Another sad outcome is that while the Haskell module system gives users a hierarchical name space, use of this name space (import qualified, in case you're an insider) is often deprecated, and many Haskell programmers write code as if everything were in one big, flat namespace. This practice amounts to abandoning another of the big benefits of modules.
If I had to write a big system in a functional language and had to be sure that other people understood it, I'd probably pick Standard ML, and I'd establish very stringent programming conventions for use of the module system. (E.g., explicit signatures everywhere, opague ascription with :>, and no use of open anywhere, ever.) For me the simplicity of the Standard ML core language (as compared with OCaml) and the more functional nature of the Standard ML Basis Library (as compared with OCaml) are more valuable than the superior aspects of the OCaml module system.
I've worked on just one really big Haskell program, and while I found (and continue to find) working in Haskell very enjoyable, I really missed not having explicit signatures.
Do functional languages cope well with complexity?
Some do. I've found ML modules and module types (both the Standard ML and Objective Caml) flavors invaluable tools for managing complexity, understanding complexity, and placing unbreachable firewalls between different parts of large programs. I have had less good experiences with Haskell
Final note: these aren't really new issues. Decomposing systems into modules with separate interfaces checked by the compiler has been an issue in Ada, C, C++, CLU, Modula-3, and I'm sure many other languages. The main benefit of a system like Standard ML or Caml is the that you get explicit signatures and modular type checking (something that the C++ community is currently struggling with around templates and concepts). I suspect that these issues are timeless and are going to be important for any large system, no matter the language of implementation.
I'd say the opposite. It is easier to reason about programs written in functional languages due to the lack of side-effects.
Usually it is not a matter of "functional" vs "procedural"; it is rather a matter of lazy evaluation.
Lazy evaluation is when you can handle values without actually computing them yet; rather, the value is attached to an expression which should yield the value if it is needed. The main example of a language with lazy evaluation is Haskell. Lazy evaluation allows the definition and processing of conceptually infinite data structures, so this is quite cool, but it also makes it somewhat more difficult for a human programmer to closely follow, in his mind, the sequence of things which will really happen on his computer.
For mostly historical reasons, most languages with lazy evaluation are "functional". I mean that these language have good syntaxic support for constructions which are typically functional.
Without lazy evaluation, functional and procedural languages allow the expression of the same algorithms, with the same complexity and similar "readability". Functional languages tend to value "pure functions", i.e. functions which have no side-effect. Order of evaluation for pure function is irrelevant: in that sense, pure functions help the programmer in knowing what happens by simply flagging parts for which knowing what happens in what order is not important. But that is an indirect benefit and pure functions also appear in procedural languages.
From what I can say, here are the key advantages of functional languages to cope with complexity :
Functional programming hates side-effects.
You can really black-box the different layers
and you won't be afraid of parallel processing
(actor model like in Erlang is really easier to use
than locks and threads).
Culturally, functional programmer
are used to design a DSL to express
and solve a problem. Identifying the fundamental
primitives of a problem is a radically
different approach than rushing to the brand
new trendy framework.
Historically, this field has been led by very smart people :
garbage collection, object oriented, metaprogramming...
All those concepts were first implemented on functional platform.
There is plenty of literature.
But the downside of those languages is that they lack support and experience in the industry. Having portability, performance and interoperability may be a real challenge where on other platform like Java, all of this seems obvious. That said, a language based on the JVM like Scala could be a really nice fit to benefit from both sides.
Does program flow become difficult to
follow more quickly than if a
non-functional language is used?
This may be the case, in that functional style encourages the programmer to prefer thinking in terms of abstract, logical transformations, mapping inputs to outputs. Thinking in terms of "program flow" presumes a sequential, stateful mode of operation--and while a functional program may have sequential state "under the hood", it usually isn't structured around that.
The difference in perspective can be easily seen by comparing imperative vs. functional approaches to "process a collection of data". The former tends to use structured iteration, like a for or while loop, telling the program "do this sequence of tasks, then move to the next one and repeat, until done". The latter tends to use abstracted recursion, like a fold or map function, telling the program "here's a function to combine/transform elements--now use it". It isn't necessary to follow the recursive program flow through a function like map; because it's a stateless abstraction, it's sufficient to think in terms of what it means, not what it's doing.
It's perhaps somewhat telling that the functional approach has been slowly creeping into non-functional languages--consider foreach loops, Python's list comprehensions...

Resources