I come from an imperative background, and have recently begun to delve into functional programming. I'm confused by one concept of pure functions. From what I understand, a pure function is a function that always evaluates to the same result given the same input, and also is a function that does not have side effects.
My confusion stems from using functions inside functions. If I have two functions (pseudocode):
function foo(x) { return x+1; }
function bar(x) { return foo(x); }
bar relies on an outside function, foo, to compute its result. Does this render bar impure? If so, how can anyone write a program with only pure functions? Does one have to pass a set of utility functions as a parameter (such as count() or map() etc)? I feel like I'm drastically misunderstanding an important aspect of functional programming.
Thanks!
Not really. What makes a function impure is side effects: a function that leaks information anywhere other than to its caller, or retrieves information from anywhere other than its arguments.
As examples of side effects you have reading from the keyboard or a file, and printing or storing to a file. Another one is storing state inside the function, e.g. a counter that counts how many times you have called it. It doesn't matter whether that state is local or global: mutating any variable is a side effect. Any of these would make a function impure.
Usually you can never make your programs 100% functional. You need small impure portions that take input somehow (or else you are calculating exactly the same thing every time you run it) and some way to display the result (or else why run it in the first place, unless you use your program to heat your apartment, in which case the heat is a side effect and the program is again impure).
Considering the common definition of pure functions:
the function return values are identical for identical arguments (no variation with local static variables, non-local variables, mutable reference arguments or input streams)
the function application has no side effects (no mutation of local static variables, non-local variables, mutable reference arguments or input/output streams).
What are the implications of using (only) functions that satisfy 2 but not 1, in the sense that they can read the current value of (some) immutable state but cannot modify (any) state? Does such a pattern have a name, and is it useful, or is it an anti-pattern?
An example would be using a global way to get a read-only immutable version of a state, or passing as an argument a function that returns the current immutable value of a state.
(Rationale: I have been trying to structure my C# code in a more functional way, using pure functions where possible (as static members of static classes).
It quickly became obvious how complex and tedious it is to pass state values to these pure functions, even when they only need to read the value. I need to know the relevant state value at the point of calling, which often means threading it through parts of the code that have no other need to know it.
However, if I instead initialize such static classes with an internal member function that returns the current immutable value of the state, the other members can use it rather than having that value passed to them. This pattern has greatly simplified my code where I used it, and it feels like I still get most of the benefits of isolating state changes, etc.)
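The pattern described above can be sketched in a few lines of Python (the module-level getter and the function names are hypothetical, standing in for the C# static-class members):

```python
# A module-level, injected getter for the current immutable state.
# The state itself is only ever replaced wholesale, never mutated in place.
_get_state = None

def init(get_state):
    # Install the function that returns the current immutable state.
    global _get_state
    _get_state = get_state

def describe_user():
    # Reads (but never modifies) the state through the injected getter,
    # instead of requiring every caller to pass the state in.
    state = _get_state()
    return "user=" + state["user"]
```

For example, after `init(lambda: {"user": "alice"})`, calling `describe_user()` yields `"user=alice"` without the caller ever handling the state value itself. The function is no longer pure (it is not deterministic across state changes), but it still cannot cause side effects.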
A potentially impure operation without side effects fits the description of a Query in Command-Query Separation (CQS), a decades-old object-oriented design principle.
CQS explicitly distinguishes between operations with side effects (Commands) and operations that return data (Queries). According to that principle, a Query must not have a side effect.
On the other hand, CQS says nothing about determinism, so a Query is allowed to be non-deterministic.
In C#, a fundamental example of a Query is DateTime.Now. This is essentially a method that takes no arguments, which is equivalent to taking unit as an input argument. Thus, we can think of DateTime.Now as a Query from unit to DateTime: () -> DateTime.
DateTime.Now is (in)famously non-deterministic, so it's clearly not a pure function. It is, however, a Query.
All pure functions are Queries, but not all Queries are pure functions.
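In Python terms (substituting `datetime.now()` for `DateTime.Now`; the helper names are just for illustration), the three categories look like this:

```python
from datetime import datetime

def add(a, b):
    # Pure function: deterministic and side-effect free,
    # so it is also a Query in CQS terms.
    return a + b

def current_time():
    # Query: it changes nothing observable (no side effects),
    # but it is non-deterministic, so it is not a pure function.
    return datetime.now()

def log(message):
    # Command: performed for its side effect (printing);
    # per CQS it returns no data.
    print(message)
```

`add` is both a pure function and a Query; `current_time` is a Query but not pure; `log` is a Command.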
CQS is a nice design principle, but it's not Functional Programming (FP). It's a move in the right direction, but you should attempt to have as few non-deterministic Queries as possible.
People often tend to focus on avoiding side-effects when learning FP, but it's just as important to avoid non-determinism.
I was learning functional programming and came across the term referential transparency.
After some research on that, I found out that RT is useful
When we want to make our code easier to reason about and read, since our function is predictable, AND
When our function is predictable, it will be of great help to the JIT compiler, allowing it to replace the function call with its return value (does it replace the function with its value as long as the function is hot?).
Are both the above statements true?
Referential transparency means that an expression (such as a function call with particular arguments) can be replaced by its resulting value without changing the program's behavior. For a function call, this requires that the function always returns the same result for the same arguments and that it has no side effects.
Of course, one of the benefits of this is that the code is easier to reason about: the same call always produces the same value, so you can replace the call to the function with the result it returns.
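The substitution described above can be shown in a tiny Python sketch (names are illustrative):

```python
def square(x):
    # Pure, hence referentially transparent.
    return x * x

# Anywhere `square(4)` appears, it can be replaced by its value, 16,
# without changing the program's behavior:
a = square(4) + square(4)
b = 16 + 16
```

Both `a` and `b` evaluate to 32; this equivalence is exactly what a compiler relies on when it caches or inlines a referentially transparent call.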
I suppose this property is used by many compilers to speed up execution by making that substitution, but this depends on the language and the compiler used to translate to byte code; it has little to do with functional programming per se.
OK, let me try to rephrase my question.
Actually I wanted to know how is Function Application implemented in FP.
Is it done like a function call in imperative languages, where a stack frame is added for each call and removed on each return?
Or is it like inline functions, where the function call is replaced by the function definition?
Also, in terms of the implementation of function application, what is the significance of the statement that functions in FP are mappings between domains and corresponding ranges? It is obviously not possible to maintain a mapping for each domain-range pair, so what exactly does the statement imply?
This question is broad enough that I can't answer it completely, since I don't know every single functional programming language. But I can tell you how it's done in one language, F#. You asked whether function application is done like a function call in imperative languages (another stack frame added for each call) or whether it's done as inline functions... and in F# the answer is both.

The F# compiler is allowed to choose whether to create a stack-frame-using function call, or whether to inline the function at the call site; generally the choice is made based on the size of the compiled function. If the function compiles down to fewer than N bytes of compiled code (I can't tell you the exact number, but knowing it doesn't actually matter), the compiler will usually inline that function call; if it takes more than N bytes, the call will use a stack frame. (Except in the case of tail-recursive calls, which are compiled to the equivalent of a goto and don't use a stack frame.)
P.S. You can force the compiler's hand by using the inline keyword, which forces that function to be inlined at the call site every time. Most F# programmers don't recommend doing that on a regular basis, because the compiler is smart enough that it's usually not a good idea to override its decisions. (Also, the inline keyword means that the types of the function's parameters must be resolvable at compile time, so there are some functions for which that changes the semantics, but that's a little off-topic for the question you asked so I won't go into it. Except to say that in F#, statically-resolved type parameters or SRTPs are a very complicated subject, and you can do some very advanced things with them if you understand them.)
In a Functional Programming book, the author lists the following as side effects:
Modifying a variable
Modifying a data structure in place
Setting a field on an object
Throwing an exception or halting with an error
Printing to the console or reading user input
Reading from or writing to a file
Drawing on the screen
I am just wondering how it is possible to write a pure functional program without reading from or writing to a file, if those are side effects. If it is possible, what would be the common approach in the functional world to achieve this?
Thanks,
Mohamed
Properly answering this question would likely require an entire book (not too long a one). The point is that functional programming is meant to separate the description/representation of logic from its actual runtime interpretation. Your functional code only represents (doesn't run) the effects of your program as values, giving you back a kind of abstract syntax tree that describes your computation. A different part of your code (usually called the interpreter) takes those values and actually runs the effects. That part is not functional.
How is it possible to write a pure functional program that is useful in any way? It is not possible: a pure functional program only heats up the CPU. It needs an impure part (the interpreter) that actually writes to disk or to the network. There are several important advantages to doing it that way. The pure functional part is easy to test (testing pure functions is easy), and the referentially transparent nature of pure functions makes it easy to reason about your code locally, making the development process as a whole less buggy and more productive. It also offers elegant ways to deal with traditionally obfuscated defensive code.
So what is the common approach in the functional world to achieving side effects? As said: represent them as values, and then write the code that interprets those values. A really good explanation of the whole process can be found in this blog post series.
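A toy version of this separation can be sketched in Python (all names are hypothetical; real effect systems are far richer): the pure part builds a description of the effects as plain data, and a small impure interpreter performs them.

```python
# Pure part: describe the program's effects as plain data
# (a stand-in for an abstract syntax tree of effects).
def program(name):
    return [
        ("print", "Hello, " + name + "!"),
        ("print", "Goodbye."),
    ]

# Impure part: the interpreter actually performs the effects.
def run(effects, out=print):
    for kind, payload in effects:
        if kind == "print":
            out(payload)
```

Note that `program` can be tested without doing any I/O at all: you simply inspect the list of effect descriptions it returns, while `run` is the only place where printing actually happens.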
For the sake of brevity, let me (over)simplify and make the long story short:
To deal with "side effects" in purely functional programming, you (programmers) write pure functions from the input to the output, and the system causes the side effects by applying those pure functions to the "real world".
For example, to read an integer x and write x+1, you (roughly speaking) write a function f(x) = x+1, and the system applies it to the real input and outputs its return value.
For another example, instead of raising an exception as a side effect, your pure function returns a special value representing the exception. Various "monads", such as IO in Haskell, generalize these ideas; that is, they represent side effects by pure functions (actual implementations are more complicated, of course).
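The exception-as-value idea can be sketched in Python with a simple tagged result (a toy stand-in for Haskell's `Either`; the function name is illustrative):

```python
def safe_div(a, b):
    # Instead of raising ZeroDivisionError as a side effect,
    # return a value that represents success or failure.
    if b == 0:
        return ("error", "division by zero")
    return ("ok", a / b)
```

The caller inspects the tag instead of catching an exception, so `safe_div` stays a pure function: same inputs, same outputs, nothing thrown.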
Is it unwise to return a var bound using let?
(let [pipeline (Channels/pipeline)]
  (.addLast pipeline "codec" (HttpClientCodec.))
  ;; several more lines like this
  pipeline)
Is the binding here just about lexical scope (as opposed to def), and thus not unsafe to pass around?
Update
In writing this question I realised the above was ugly. And if something is ugly in Clojure you are probably doing it wrong.
I think this is probably the more idiomatic way of handling the above (which makes the question moot, btw, but still handy knowledge).
(doto (Channels/pipeline)
  (.addLast "codec" (HttpClientCodec.)))
let is purely lexically scoped and doesn't create a var. The locals created by let (or loop) behave exactly like function arguments. So yeah, it's safe to use as many let/loop-defined locals as you like, close over them, etc. Returning a local from the function simply returns its value, not the internal representation (which is actually on the stack, unless closed over). let/loop bindings are therefore also reentrancy/thread-safe.
By the way, for your specific code example with its many Java interop calls, you may want to consider using doto instead, or in addition: http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/doto