clojure functions, let & return values - functional-programming

Is it unwise to return a var bound using let?
(let [pipeline (Channels/pipeline)]
  (.addLast pipeline "codec" (HttpClientCodec.))
  ;; several more lines like this
  pipeline)
Is the binding here just about the lexical scope (as opposed to def) and not unsafe to pass around?
Update
In writing this question I realised the above was ugly. And if something is ugly in Clojure you are probably doing it wrong.
I think this is probably the more idiomatic way of handling the above (which makes the question moot, btw, but still handy knowledge).
(doto (Channels/pipeline)
  (.addLast "codec" (HttpClientCodec.)))

let is purely lexically scoped and doesn't create a var. The locals created by let (or loop) behave exactly like function arguments. So yeah, it's safe to use as many let/loop-defined locals as you like, close over them, etc. Returning a local from the function simply returns its value, not the internal representation (which is actually on the stack, unless closed over). let/loop bindings are therefore also reentrancy/thread-safe.
By the way, for your specific code example with lots of java calls, you may want to consider using doto instead or additionally. http://clojure.github.com/clojure/clojure.core-api.html#clojure.core/doto

Related

Julia functions: making mutable types immutable

Coming from Wolfram Mathematica, I like the idea that whenever I pass a variable to a function I am effectively creating a copy of that variable. On the other hand, I am learning that in Julia there are the notions of mutable and immutable types, with the former passed by reference and the latter passed by value. Can somebody explain to me the advantage of such a distinction? Why are arrays passed by reference? Naively, I see this as a drawback, since it creates side effects and ruins the possibility of writing purely functional code. Where am I wrong in my reasoning? Is there a way to make an array immutable, so that when it is passed to a function it is effectively passed by value?
Here is an example:
# x is an Int and so is immutable: it is passed by value
x = 10
function change_value(x)
    x = 17
end
change_value(x)
println(x)
# arrays are mutable: they are passed by reference
arr = [1, 2, 3]
function change_array!(A)
    A[1] = 20
end
change_array!(arr)
println(arr)
The second call does indeed modify the array arr.
There is a fair bit to respond to here.
First, Julia does not pass-by-reference or pass-by-value. Rather it employs a paradigm known as pass-by-sharing. Quoting the docs:
Function arguments themselves act as new variable bindings (new
locations that can refer to values), but the values they refer to are
identical to the passed values.
Second, you appear to be asking why Julia does not copy arrays when passing them into functions. This is a simple one to answer: Performance. Julia is a performance oriented language. Making a copy every time you pass an array into a function is bad for performance. Every copy operation takes time.
This has some interesting side-effects. For example, you'll notice that a lot of the mature Julia packages (as well as the Base code) consist of many short functions. This code structure is a direct consequence of the near-zero overhead of function calls. Languages like Mathematica and MATLAB, on the other hand, tend towards long functions. I have no desire to start a flame war here, so I'll merely state that personally I prefer the Julia style of many short functions.
Third, you are wondering about the potential negative implications of pass-by-sharing. In theory you are correct that this can result in problems when users are unsure whether a function will modify its inputs. There were long discussions about this in the early days of the language, and based on your question, you appear to have worked out that the convention is that functions that modify their arguments have a trailing ! in the function name. Interestingly, this standard is not compulsory so yes, it is in theory possible to end up with a wild-west type scenario where users live in a constant state of uncertainty. In practice this has never been a problem (to my knowledge). The convention of using ! is enforced in Base Julia, and in fact I have never encountered a package that does not adhere to this convention. In summary, yes, it is possible to run into issues when pass-by-sharing, but in practice it has never been a problem, and the performance benefits far outweigh the cost.
Fourth (and finally), you ask whether there is a way to make an array immutable. First things first, I would strongly recommend against hacks to attempt to make native arrays immutable. For example, you could attempt to disable the setindex! function for arrays... but please don't do this. It will break so many things.
As was mentioned in the comments on the question, you could use StaticArrays. However, as Simeon notes in the comments on this answer, there are performance penalties for using static arrays for really big datasets. More than 100 elements and you can run into compilation issues. The main benefit of static arrays really is the optimizations that can be implemented for smaller static arrays.
Another package-based option, suggested by phipsgabler in the comments below, is FunctionalCollections. This appears to do what you want, although it looks to be only sporadically maintained. Of course, that isn't always a bad thing.
A simpler approach is just to copy arrays in your own code whenever you want to implement pass-by-value. For example:
f!(copy(x))
Just be sure you understand the difference between copy and deepcopy, and when you may need to use the latter. If you're only working with arrays of numbers, you'll never need the latter, and in fact using it will probably drastically slow down your code.
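To make the copy versus deepcopy distinction concrete, here is a minimal sketch in Python rather than Julia. Python's copy module draws the same shallow/deep line, so it serves as an analogy only. The function name zero_first and the sample data are made up for illustration.

```python
import copy


def zero_first(values):
    """Mutate the argument in place (what a trailing ! would signal in Julia)."""
    values[0] = 0
    return values


# A flat list of numbers: a shallow copy is enough to protect the caller's data.
flat = [1, 2, 3]
zero_first(copy.copy(flat))

# A nested list: a shallow copy duplicates only the outer container,
# so the inner lists are still shared with the original.
nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)
shallow[0][0] = 99   # also visible through nested

# A deep copy duplicates the inner lists as well.
deep = copy.deepcopy(nested)
deep[1][0] = -1      # not visible through nested
```

After running this, flat is untouched, while nested has picked up the 99 written through the shallow copy but not the -1 written through the deep copy.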
If you wanted to do a bit of work, then you could also build your own array type in the spirit of static arrays, but without all the bells and whistles that static arrays entail. For example:
struct MyImmutableArray{T,N}
    x::Array{T,N}
end
Base.getindex(y::MyImmutableArray, inds...) = getindex(y.x, inds...)
and similarly you could add any other functions you wanted to this type, while excluding functions like setindex!.
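The same wrapper idea can be sketched in Python, purely as an analogy to the Julia struct above: expose reads, leave out writes. The class name ReadOnlyView is hypothetical and not taken from any library.

```python
class ReadOnlyView:
    """Wrap a list and expose indexing and length, but no item assignment."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, index):
        return self._data[index]

    def __len__(self):
        return len(self._data)


view = ReadOnlyView([1, 2, 3])
first = view[0]

# No __setitem__ is defined, so item assignment raises TypeError.
try:
    view[0] = 20
    blocked = False
except TypeError:
    blocked = True
```

As with the Julia version, this only guards the wrapper's own interface. Anyone who still holds the underlying container can mutate it, so copy on construction if that matters.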

What is the mechanism behind Function Application in Functional Programming

OK, let me try to rephrase my question.
Actually, I wanted to know how function application is implemented in FP.
Is it done like a function call in imperative languages, where a stack frame is added for each call and removed on each return?
Or is it like inline functions, where the function call statement is replaced by the function definition?
Also, in terms of the implementation of function application, what is the significance of the statement "functions in FP are mappings between domains and corresponding ranges"? It is obviously not possible to maintain a mapping for each domain-range entry pair, so what exactly does the statement imply?
This question is broad enough that I can't answer it completely, since I don't know every single functional programming language. But I can tell you how it's done in one language, F#. You asked whether function application is done like a function call in imperative languages (another stack frame added for each call) or whether it's done as inline functions... and in F# the answer is both. The F# compiler is allowed to choose whether to create a stack-frame-using function call, or whether to inline the function at the call site; generally the choice is made based on the size of the compiled function. If the function compiles down to fewer than N bytes of compiled code (I can't tell you the exact number, but knowing the exact number doesn't actually matter) then the compiler will usually inline that function call; if it takes more than N bytes then the function call will use a stack frame. (Except in the case of tail-recursive calls, which are compiled to the equivalent of a goto and don't use a stack frame).
P.S. You can force the compiler's hand by using the inline keyword, which forces that function to be inlined at the call site every time. Most F# programmers don't recommend doing that on a regular basis, because the compiler is smart enough that it's usually not a good idea to override its decisions. (Also, the inline keyword means that the types of the function's parameters must be resolvable at compile time, so there are some functions for which that changes the semantics, but that's a little off-topic for the question you asked so I won't go into it. Except to say that in F#, statically-resolved type parameters or SRTPs are a very complicated subject, and you can do some very advanced things with them if you understand them.)

Is the "define" primitive of Scheme an imperative language feature? Why or why not?

(define hypot
  (lambda (a b)
    (sqrt (+ (* a a) (* b b)))))
This is the Scheme programming language.
"define" creates a variable and a global binding.
lambda creates a procedure.
I would like to know whether "define" would be considered an imperative language feature. As far as I know, static scoping is an imperative feature. I think "define" is an imperative feature, since it creates a global binding, and static scoping looks at the global binding for any variable definition, whereas dynamic scoping looks at the most recent active binding.
Please help me find the correct answer! I would also like to know why or why not.
In a Scheme program, the (define var expr) form is both a declaration and an initialization. Declarations introduce a new name into the scope. Declarations and initializations are present in both imperative and declarative languages.
However, if the same variable is defined twice, then define behaves as an assignment, which belongs to the imperative paradigm.
You've put your finger on a subtle and contentious issue. There have long been two informal camps on how define should work, which I would label (very imperfectly, and very controversially!) as the static vs. dynamic camps.
The static camp sees define as a non-side-effecting top-level declaration—it's a syntax that simply defines a name in a top-level scope, just like let is a syntax that defines a name in a local scope. A bit more precisely, this camp tends to see the top-level environment as equivalent to a big letrec with all the defines as the bindings, and all "loose" top-level expressions as the body. This is, incidentally, similar to the way that simple compilers work—read the whole program from one or more files, figure out all of the top-level bindings and generate code with knowledge of the whole program's source text.
The dynamic camp, on the other hand, tends to conceive of the top-level environment as a mutable data structure to which bindings can be added at runtime, and define is then an operation that modifies the top-level environment. This is, incidentally, similar to how simple interactive interpreters work—read definitions interactively from input, one at a time, and incorporate them into the environment as the user provides them.
To give one example, the SLIB library is one that I recall has been criticized for being much too firmly in the "dynamic" camp. If you read Section 1.1 on "features", you see this right from the beginning:
SLIB maintains a list of features supported by a Scheme session. The set of features provided by a session may change during that session.
The documentation for the require form that you use in SLIB to "load" modules continues with this:
Procedure: require feature
If (provided? feature) is true, then require just returns.
Otherwise, if feature is found in the catalog, then the corresponding files will be loaded and (provided? feature) will henceforth return #t. That feature is thereafter provided.
Otherwise (feature not found in the catalog), an error is signaled.
If you read this carefully, you will be struck that it's framing the whole thing as modules being "loaded" at runtime—and not as compile-time linking, which is foreign to the design.
So a "session" is a set of bindings whose keys, not just their values, change during the runtime of the program. Programs are able to mutate the session with provide and require. They are able to directly observe the mutation with provided?. And it is implied that they can indirectly observe the set of identifiers bound in the top-level environment change as a result of require: a procedure invocation that would have resulted in a runtime error before a call to require no longer does so afterwards.
So we can't help but conclude that going by the philosophy of the people who designed this library, define is imperative. But not every Scheme user or implementer shares this philosophy.
First off, Scheme is lexically scoped. Define usually is not limited to top-level bindings like it is in Racket. It can create bindings within other procedure bodies.
In some implementations define can manipulate state, but only for top-level definitions. Otherwise it acts like let and binds a variable in the local scope. To actually take advantage of top-level rebinding programmatically is difficult.
So define doesn't introduce an imperative style into Scheme code. Compare define to set! and its relatives, which modify the variable in whatever environment it is bound, thereby allowing an imperative style in Scheme code.

Recursion of internal functions in Erlang

Playing with Erlang, I've got a process-looping function like:
process_loop(...A long list of parameters here...) ->
    receive
        ...Message processing logic involving the function parameters...
    end,
    process_loop(...Same long list of parameters...).
It looks quite ugly, so I tried a refactoring like this:
process_loop(...A long list of parameters...) ->
    Loop = fun() ->
        receive
            ...Message processing logic...
        end,
        Loop()
    end,
    Loop().
But it turned out to be incorrect, as the Loop variable is unbound inside the Loop function. So I've arranged a workaround:
process_loop(...A long list of parameters...) ->
    Loop = fun(Next) ->
        receive
            ...Message processing logic...
        end,
        Next(Next)
    end,
    Loop(Loop).
I have two questions:
1. Is there a way to achieve the idea of snippet #2, but without such "Next(Next)" workarounds?
2. Do snippets #1 and #3 differ significantly in terms of performance, or are they equivalent?
No. Unfortunately, anonymous functions are just that: anonymous, unless you give them a name.
Snippet #3 is a little bit more expensive. Given that you do pattern matching on messages in the body, I wouldn't worry about it. Optimise for readability in this case. The difference is a very small constant factor.
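For readers who do not write Erlang, the self-passing workaround from snippet #3 can be sketched in Python as an analogy. The function and argument names here are hypothetical. The anonymous function has no name to call, so it receives itself as its first argument.

```python
# The lambda cannot refer to itself by name, so it is handed to itself
# explicitly, mirroring Loop(Loop) / Next(Next) in snippet #3.
loop = lambda next_step, remaining, seen: (
    seen
    if remaining == 0
    else next_step(next_step, remaining - 1, seen + [remaining])
)

# Count down from 3, collecting the values visited along the way.
result = loop(loop, 3, [])
```

Here result ends up as [3, 2, 1]. Unlike Erlang, Python does not eliminate tail calls, so this form is only suitable for shallow recursion. It is shown to illustrate the extra argument, not as a pattern to copy.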
You might use tuples/records as named parameters instead of passing lots of parameters. You can just reuse the single parameter that the function is going to take.
I guess (but I'm not sure) that this syntax isn't supported by proper tail recursion. If you refactor to use a single parameter, I think that you will be on the right track again.
The more conventional way of avoiding repeating the list of parameters in snippet #1 is to put all or most of them in a record that holds the loop state. Then you only have one or a few variables to pass around in the loop. That's easier to read and harder to screw up than playing around with recursive funs.
I must say that in all cases where I do this type of recursion I don't think I have ever come across the case where exactly the same set of variables is passed around in the recursion. Usually variables will change reflecting state change in the process loop. It cannot be otherwise as you have to handle state explicitly. I usually group related parameters into records which cuts down the number of arguments and adds clarity.
You can of course use your solution and have some parameters implicit in the fun and some explicit in the recursive calls but I don't think this would improve clarity.
The same answer applies to "normal" recursion where you are stepping over data structures.

More explanation on Lexical Binding in Closures?

There are many SO posts related to this, but I am asking this again with a different purpose
I am trying to understand why closures are important and useful. One of the things that I've read in other SO posts related to this is that when you pass a variable to a closure, the closure remembers this value from then onwards. Is this the entire technical aspect of it, or is there more to what happens there?
What I wonder then is what would happen when the variable used inside the closure gets modified from outside. Should they be constants only?
In Clojure, I can do the following, but since values there are immutable, this issue does not arise. What about other languages, and what is the proper technical definition of a closure?
(defn make-greeter [greeting-prefix]
  (fn [username] (str greeting-prefix ", " username)))
((make-greeter "Hello") "World")
This is not the sort of answer that appears to get up-votes around here, but I would heartily urge you to discover the answer to your question by reading Shriram Krishnamurthi's (free!) (online!) textbook, Programming Languages: Application and Interpretation.
I will paraphrase the book very, very briefly, by summarizing the development of the teeny tiny interpreters that it leads you through:
an arithmetic expression language (AE)
an arithmetic expression language with named expressions (WAE); implementing this involves developing a substitution function that can replace names with values
a language that adds first-order functions (F1WAE): using a function involves substituting values for each of the parameter names
the same language, without substitution: it turns out that "environments" allow you to avoid the overhead of pre-emptive substitution
a language that eliminates the separation between functions and expressions by allowing functions to be defined at arbitrary locations (FWAE)
This is the key point: you implement this, and then you discover that with substitution it works fine, but with environments it's broken. In particular, in order to fix it up, you must be sure to associate with an evaluated function definition the environment that was in place when it was evaluated. This pair (fundef + environment-of-definition) is what's called a "closure".
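The "fundef + environment-of-definition" pair can be sketched with a very small interpreter. This is written in Python rather than the language the book uses, and the tuple-based expression encoding, the Closure class and the lexical flag are all assumptions made for illustration, not PLAI's own code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Closure:
    """A function definition paired with the environment it was defined in."""
    param: str
    body: object
    env: dict


def evaluate(expr, env, lexical=True):
    """Evaluate a tiny expression language represented as nested tuples.

    With lexical=False the function body runs in the caller's environment,
    which reproduces the breakage described above.
    """
    if isinstance(expr, int):
        return expr                      # numeric literal
    if isinstance(expr, str):
        return env[expr]                 # variable reference
    op = expr[0]
    if op == "+":
        return evaluate(expr[1], env, lexical) + evaluate(expr[2], env, lexical)
    if op == "with":                     # ("with", name, named_expr, body)
        _, name, named_expr, body = expr
        value = evaluate(named_expr, env, lexical)
        return evaluate(body, {**env, name: value}, lexical)
    if op == "fun":                      # ("fun", param, body)
        _, param, body = expr
        return Closure(param, body, env)
    if op == "app":                      # ("app", fun_expr, arg_expr)
        _, fun_expr, arg_expr = expr
        closure = evaluate(fun_expr, env, lexical)
        argument = evaluate(arg_expr, env, lexical)
        base_env = closure.env if lexical else env
        return evaluate(closure.body, {**base_env, closure.param: argument}, lexical)
    raise ValueError(f"unknown expression: {expr!r}")


# x is 3 where f is defined, but 100 where f is called.
program = ("with", "x", 3,
           ("with", "f", ("fun", "y", ("+", "x", "y")),
            ("with", "x", 100,
             ("app", "f", 4))))

with_closures = evaluate(program, {})                    # uses x = 3
without_closures = evaluate(program, {}, lexical=False)  # uses x = 100
```

With the saved environment the call yields 7. Without it the body sees the caller's x and yields 104, which is the bug that attaching the environment to the function value fixes.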
Whew!
Okay, what happens when we add mutable bindings to the picture? If you try this yourself, you'll see that the natural implementation replaces an environment that associates names with values with an environment that associates names with bindings. This is orthogonal to the notion of closures; since closures capture environments, and since environments now map names to bindings, you get the behavior you describe, whereby mutation of a variable captured in an environment is visible and persistent.
Again, I would very much urge you to take a look at PLAI.
A closure is really a data structure used by the compiler to make sure that a function will always have access to the data that it needs to operate. Here is an example of a function that records when it was defined.
;; get-time-of-day is a placeholder helper, not a clojure.core function
(defn outer []
  (let [foo (get-time-of-day)]
    (fn inner []
      (str "then:" foo " now:" (get-time-of-day)))))
(def then-and-now (outer))
(then-and-now) ==> "then:1:02:03 now:2:30:01"
....
(then-and-now) ==> "then:1:02:03 now:2:31:02"
When this function is created, a class is generated and a small structure (a closure) is allocated on the heap to store the value of foo. The class has a pointer to it (or contains it; I'm not sure). If you call outer again, a second closure is allocated to hold that other foo. When we say "this function closes over foo", we mean that it holds a reference to a structure/class/whatever that stores the state of foo at the time the closure was created. The reason you need to close over something is that the function that contains it goes away before the data is used. In this case outer (which contains the value of foo) will have returned long before foo is used, so nobody will be around to modify foo. Of course, a reference to foo could be passed to somebody who could then modify it.
A lexical closure is one in which the enclosed variables (e.g. greeting-prefix in your example) are enclosed by reference. The closure created does not simply get the value of greeting-prefix at the time it is created, but gets a reference. If greeting-prefix is modified after the closure is created, then its new value will be used by the closure every time it is called.
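Capture by reference can be observed directly in a language with mutable locals. The sketch below uses Python as a stand-in. The function make_greeter_pair and its setter are hypothetical, introduced only to give code outside the closure a way to rebind the captured variable.

```python
def make_greeter_pair(prefix):
    """Return a greeter plus a setter that rebinds the variable it closed over."""

    def greet(username):
        # prefix is looked up each time greet runs, not copied at creation time.
        return f"{prefix}, {username}"

    def set_prefix(new_prefix):
        nonlocal prefix          # rebind the enclosing variable itself
        prefix = new_prefix

    return greet, set_prefix


greet, set_prefix = make_greeter_pair("Hello")
before = greet("World")
set_prefix("Goodbye")
after = greet("World")
```

Here before is "Hello, World" and after is "Goodbye, World": the greeter sees the rebinding because it holds the variable, not a snapshot of its value.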
In pure functional languages this isn't much of a distinction, because values are never changed. So it doesn't matter if the value of greeting-prefix is copied into the closure: there's no possible difference in behaviour that could arise from referring to the original versus its copy.
In "imperative-languages-with-closures", such as C# and Java (via anonymous classes), some decision has to be made about whether the enclosed variable is enclosed by value or by reference. In Java this decision is pre-empted by only allowing final variables to be enclosed, effectively mimicking a functional language as far as that variable is concerned. In C# I believe it is a different matter.
Enclosing by value simplifies the implementation: the variable to be enclosed will often exist on the stack and hence will be destroyed when the function constructing the closure returns -- that means it can't be enclosed by reference. If you need enclosure by reference, a workaround is to identify such variables and keep them in an object allocated each time that function is called. This object is then kept as part of the closure's environment and must remain live as long as all closures using it are live. (I do not know if any compiled languages directly use this technique.)
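As far as I can tell, CPython is one implementation that works roughly like the workaround just described: each captured variable lives in a heap-allocated "cell" object that every closure over it references. The sketch below inspects those cells. The function names are hypothetical, and __closure__ and cell_contents are CPython introspection attributes rather than something to rely on in ordinary code.

```python
def make_pair():
    total = 0                    # captured below, so it is stored in a cell

    def add(n):
        nonlocal total
        total += n

    def get():
        return total

    return add, get


add, get = make_pair()

# Each closure carries a tuple of cells, one per captured variable.
add_cell = add.__closure__[0]
get_cell = get.__closure__[0]
shares_cell = add_cell is get_cell   # both closures point at the same cell

add(5)
seen_through_cell = get_cell.cell_contents
seen_through_get = get()
```

The cell outlives the call to make_pair and stays alive for as long as either closure does, which matches the lifetime requirement described above.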
For more descriptions see for example:
Common Lisp HyperSpec, 3.1.4 Closures and Lexical Binding
and
Common Lisp the Language, 2nd Edition, Chapter 3., Scope and Extent
You can think of a closure as an "environment", in which names are bound to values. Those names are entirely private to the closure, which is why we say that it "closes over" its environment. So your question isn't meaningful, in that the "outside" cannot affect the closed-over environment. Yes, a closure can refer to a name in a global environment (in other words, if it uses a name that is not bound in its private, closed-over environment), but that's a different story.
If you like, you can think of an environment as a dictionary, or hash table. A closure gets its own little dictionary where names are looked up.
You might enjoy reading On lambdas, capture, and mutability, which describes how this works in C# and F#, for comparison.
Have a look at this blog post: ADTs in Clojure. It shows a nice application of closures to the problem of locking up data so that it is accessible exclusively through a particular interface (rendering the data type opaque).
The main idea behind this type of locking is more simply illustrated with the counter example, which huaiyuan posted in Common Lisp while I was composing this answer. Actually, the Clojure version is interesting in that it shows that the issue of a closed-over variable changing its value does arise in Clojure if the variable happens to hold an instance of one of the reference types.
(defn create-counter []
  (let [counter (atom 0)
        inc-counter! #(swap! counter inc)
        get-counter (fn [] @counter)]
    [inc-counter! get-counter]))
As for the original make-greeter example, you could rewrite it thus (note the deref/@), with greeting-prefix now expected to be a reference type such as an atom:
(defn make-greeter [greeting-prefix]
  (fn [username] (str @greeting-prefix ", " username)))
Then you can use it to render personalised greetings from the different operators of various sections of a website. :-)
((make-greeter (atom "Hello from Gizmos Dept")) "John")
((make-greeter (atom "Hello from Gadgets Dept")) "Jack")
Quoting an earlier answer:
You can think of a closure as an "environment", in which names are bound to values. Those names are entirely private to the closure, which is why we say that it "closes over" its environment. So your question isn't meaningful, in that the "outside" cannot affect the closed-over environment. Yes, a closure can refer to a name in a global environment (in other words, if it uses a name that is not bound in its private, closed-over environment), but that's a different story.
I suppose that the question was if things like these are possible in languages which allow mutation of local variables:
CL-USER> (let ((x (list 1 2 3)))
           (prog1
               (let ((y x))
                 (lambda () y))
             (rplaca x 2)))
#<COMPILED-LEXICAL-CLOSURE #x9FEC77E>
CL-USER> (funcall *)
(2 2 3)
And -- since they are obviously possible -- I think the question is legitimate.
