What does "immutable variable" mean in functional programming? - functional-programming

I'm new to functional programming, and I don't understand the concept of immutability; e.g. an immutable variable.
For example, in Standard ML (SML):
val a = 3
val a = a + 1
The second line does not "change" the value of variable a; however, afterwards, a equals 4. Can someone please explain this?
Also, what is the benefit of "no mutation" (immutability)?

When we say a variable is immutable, we mean that its own value cannot be changed. What you're showing there with
val a = 3
val a = a+1
is: the new value of a is simply "shadowing" the old value of a. a is just a name that is bound to 3, and in the second line, it's bound to 4. The old value of a still exists, it's just inaccessible.
This can be seen more apparently if you're using some sort of data structures. There's no mutator methods like you see in many other languages. For example, if you have a list val L = [1,2,3], then there's no way to change the first value in L. You would have to shadow L altogether, and create a new list to shadow the old one.
So, every time you bind a new value declaration it creates a new environment with all the current name/value bindings. None of these bindings can be changed, they're simply shadowed.

Related

Elixir allow assign to variable two times [duplicate]

In Dave Thomas's book Programming Elixir he states "Elixir enforces immutable data" and goes on to say:
In Elixir, once a variable references a list such as [1,2,3], you know it will always reference those same values (until you rebind the variable).
This sounds like "it won't ever change unless you change it" so I'm confused as to what the difference between mutability and rebinding is. An example highlighting the differences would be really helpful.
Don't think of "variables" in Elixir as variables in imperative languages, "spaces for values". Rather look at them as "labels for values".
Maybe you would better understand it when you look at how variables ("labels") work in Erlang. Whenever you bind a "label" to a value, it remains bound to it forever (scope rules apply here of course).
In Erlang you cannot write this:
v = 1, % value "1" is now "labelled" "v"
% wherever you write "1", you can write "v" and vice versa
% the "label" and its value are interchangeable
v = v+1, % you can not change the label (rebind it)
v = v*10, % you can not change the label (rebind it)
instead you must write this:
v1 = 1, % value "1" is now labelled "v1"
v2 = v1+1, % value "2" is now labelled "v2"
v3 = v2*10, % value "20" is now labelled "v3"
As you can see this is very inconvenient, mainly for code refactoring. If you want to insert a new line after the first line, you would have to renumber all the v* or write something like "v1a = ..."
So in Elixir you can rebind variables (change the meaning of the "label"), mainly for your convenience:
v = 1 # value "1" is now labelled "v"
v = v+1 # label "v" is changed: now "2" is labelled "v"
v = v*10 # value "20" is now labelled "v"
Summary: In imperative languages, variables are like named suitcases: you have a suitcase named "v". At first you put sandwich in it. Than you put an apple in it (the sandwich is lost and perhaps eaten by the garbage collector). In Erlang and Elixir, the variable is not a place to put something in. It's just a name/label for a value. In Elixir you can change a meaning of the label. In Erlang you cannot. That's the reason why it doesn't make sense to "allocate memory for a variable" in either Erlang or Elixir, because variables do not occupy space. Values do. Now perhaps you see the difference clearly.
If you want to dig deeper:
1) Look at how "unbound" and "bound" variables work in Prolog. This is the source of this maybe slightly strange Erlang concept of "variables which do not vary".
2) Note that "=" in Erlang really is not an assignment operator, it's just a match operator! When matching an unbound variable with a value, you bind the variable to that value. Matching a bound variable is just like matching a value it's bound to. So this will yield a match error:
v = 1,
v = 2, % in fact this is matching: 1 = 2
3) It's not the case in Elixir. So in Elixir there must be a special syntax to force matching:
v = 1
v = 2 # rebinding variable to 2
^v = 3 # matching: 2 = 3 -> error
Immutability means that data structures don't change. For example the function HashSet.new returns an empty set and as long as you hold on to the reference to that set it will never become non-empty. What you can do in Elixir though is to throw away a variable reference to something and rebind it to a new reference. For example:
s = HashSet.new
s = HashSet.put(s, :element)
s # => #HashSet<[:element]>
What cannot happen is the value under that reference changing without you explicitly rebinding it:
s = HashSet.new
ImpossibleModule.impossible_function(s)
s # => #HashSet<[:element]> will never be returned, instead you always get #HashSet<[]>
Contrast this with Ruby, where you can do something like the following:
s = Set.new
s.add(:element)
s # => #<Set: {:element}>
Erlang and obviously Elixir that is built on top of it, embraces immutability.
They simply don’t allow values in a certain memory location to change. Never Until the variable gets garbage collected or is out of scope.
Variables aren't the immutable thing. The data they point to is the immutable thing. That's why changing a variable is referred to as rebinding.
You're point it at something else, not changing the thing it points to.
x = 1 followed by x = 2 doesn't change the data stored in computer memory where the 1 was to a 2. It puts a 2 in a new place and points x at it.
x is only accessible by one process at a time so this has no impact on concurrency and concurrency is the main place to even care if something is immutable anyway.
Rebinding doesn’t change the state of an object at all, the value is still in the same memory location, but it’s label (variable) now points to another memory location, so immutability is preserved. Rebinding is not available in Erlang, but while it is in Elixir this is not braking any constraint imposed by the Erlang VM, thanks to its implementation.
The reasons behind this choice are well explained by Josè Valim in this gist .
Let's say you had a list
l = [1, 2, 3]
and you had another process that was taking lists and then performing "stuff" against them repeatedly and changing them during this process would be bad. You might send that list like
send(worker, {:dostuff, l})
Now, your next bit of code might want to update l with more values for further work that's unrelated to what that other process is doing.
l = l ++ [4, 5, 6]
Oh no, now that first process is going to have undefined behavior because you changed the list right? Wrong.
That original list remains unchanged. What you really did was make a new list based on the old one and rebind l to that new list.
The separate process never has access to l. The data l originally pointed at is unchanged and the other process (presumably, unless it ignored it) has its own separate reference to that original list.
What matters is you can't share data across processes and then change it while another process is looking at it. In a language like Java where you have some mutable types (all primitive types plus references themselves) it would be possible to share a structure/object that contained say an int and change that int from one thread while another was reading it.
In fact, it's possible to change a large integer type in java partially while it's read by another thread. Or at least, it used to be, not sure if they clamped that aspect of things down with the 64 bit transition. Anyway, point is, you can pull the rug out from under other processes/threads by changing data in a place that both are looking at simultaneously.
That's not possible in Erlang and by extension Elixir. That's what immutability means here.
To be a bit more specific, in Erlang (the original language for the VM Elixir runs on) everything was single-assignment immutable variables and Elixir is hiding a pattern Erlang programmers developed to work around this.
In Erlang, if a=3 then that was what a was going to be its value for the duration of that variable's existence until it dropped out of scope and was garbage collected.
This was useful at times (nothing changes after assignment or pattern match so it is easy to reason about what a function is doing) but also a bit cumbersome if you were doing multiple things to a variable or collection over the course executing a function.
Code would often look like this:
A=input,
A1=do_something(A),
A2=do_something_else(A1),
A3=more_of_the_same(A2)
This was a bit clunky and made refactoring more difficult than it needed to be. Elixir is doing this behind the scenes, but hiding it from the programmer via macros and code transforms performed by the compiler.
Great discussion here
immutability-in-elixir
The variables really are immutable in sense, every new rebinding (assignment) is only visible to access that come after that. All previous access, still refer to old value(s) at the time of their call.
foo = 1
call_1 = fn -> IO.puts(foo) end
foo = 2
call_2 = fn -> IO.puts(foo) end
foo = 3
foo = foo + 1
call_3 = fn -> IO.puts(foo) end
call_1.() #prints 1
call_2.() #prints 2
call_3.() #prints 4
To make it a very simple
variables in elixir are not like container where you keep adding and removing or modifying items from the container.
Instead they are like Labels attached to a container, when you reassign a variable is as simple a you pick a label from one container and place it on a new container with expected data in it.

bulk adding to a map, in F#

I've a simple type:
type Token =
{
Symbol: string
Address: string
Decimals: int
}
and a memory cache (they're in a db):
let mutable private tokenCache : Map<string, Token> = Map.empty
part of the Tokens module.
Sometimes I get a few new entries to add, in the form of a Token array, and I want to update the cache.
It happens very rarely (less than once per million reads).
When I update the database with the new batch, I want to update the cache map as well and I just wrote this:
tokenCache <- tokens |> Seq.fold (fun m i -> m.Add(i.Symbol, i)) tokenCache
Since this is happening rarely, I don't really care about the performance so this question is out of curiosity:
When I do this, the map will be recreated once per entry in the tokens array: 10 new tokens, 10 map re-creation. I assumed this was the most 'F#' way to deal with this. It got me thinking: wouldn't converting the map to a list of KVP, getting the output of distinct and re-creating a map be more efficient? or is there another method I haven't thought about?
This is not an answer to the question as stated, but a clarification to something you asked in the comments.
This premise that you have expressed is incorrect:
the map will be recreated once per entry in the tokens array
The map doesn't actually get completely recreated for every insertion. But at the same time, another hypothesis that you have expressed in the comments is also incorrect:
so the immutability is from the language's perspective, the compiler doesn't recreate the object behind the scenes?
Immutability is real. But the map also doesn't get recreated every time. Sometimes it does, but not every time.
I'm not going to describe exactly how Map works, because that's too involved. Instead, I'll illustrate the principle on a list.
F# lists are "singly linked lists", which means each list consists of two things: (1) first element (called "head") and (2) a reference (pointer) to the rest of elements (called "tail"). The crucial thing to note here is that the "rest of elements" part is also itself a list.
So if you declare a list like this:
let x = [1; 2; 3]
It would be represented in memory something like this:
x -> 1 -> 2 -> 3 -> []
The name x is a reference to the first element, and then each element has a reference to the next one, and the last one - to empty list. So far so good.
Now let's see what happens if you add a new element to this list:
let y = 42 :: x
Now the list y will be represented like this:
y -> 42 -> 1 -> 2 -> 3 -> []
But this picture is missing half the picture. If we look at the memory in a wider scope than just y, we'll see this:
x -> 1 -> 2 -> 3 -> []
^
|
/
y -> 42
So you see that the y list consists of two things (as all lists do): first element 42 and a reference to the rest of the elements 1->2->3. But the "rest of the elements" bit is not exclusive to y, it has its own name x.
And so it is that you have two lists x and y, 3 and 4 elements respectively, but together they occupy just 4 cells of memory, not 7.
And another thing to note is that when I created the y list, I did not have to recreate the whole list from scratch, I did not have to copy 1, 2, and 3 from x to y. Those cells stayed right where they are, and y only got a reference to them.
And a third thing to note is that this means that prepending an element to a list is an O(1) operation. No copying of the list involved.
And a fourth (and hopefully final) thing to note is that this approach is only possible because of immutability. It is only because I know that the x list will never change that I can take a reference to it. If it was subject to change, I would be forced to copy it just in case.
This sort of arrangement, where each iteration of a data structure is built "on top of" the previous one is called "persistent data structure" (well, to be more precise, it's one kind of a persistent data structure).
The way it works is very easy to see for linked lists, but it also works for more involved data structures, including maps (which are represented as trees).

Parameters of function in Julia

Does anyone know the reasons why Julia chose a design of functions where the parameters given as inputs cannot be modified?  This requires, if we want to use it anyway, to go through a very artificial process, by representing these data in the form of a ridiculous single element table.
Ada, which had the same kind of limitation, abandoned it in its 2012 redesign to the great satisfaction of its users. A small keyword (like out in Ada) could very well indicate that the possibility of keeping the modifications of a parameter at the output is required.
From my experience in Julia it is useful to understand the difference between a value and a binding.
Values
Each value in Julia has a concrete type and location in memory. Value can be mutable or immutable. In particular when you define your own composite type you can decide if objects of this type should be mutable (mutable struct) or immutable (struct).
Of course Julia has in-built types and some of them are mutable (e.g. arrays) and other are immutable (e.g. numbers, strings). Of course there are design trade-offs between them. From my perspective two major benefits of immutable values are:
if a compiler works with immutable values it can perform many optimizations to speed up code;
a user is can be sure that passing an immutable to a function will not change it and such encapsulation can simplify code analysis.
However, in particular, if you want to wrap an immutable value in a mutable wrapper a standard way to do it is to use Ref like this:
julia> x = Ref(1)
Base.RefValue{Int64}(1)
julia> x[]
1
julia> x[] = 10
10
julia> x
Base.RefValue{Int64}(10)
julia> x[]
10
You can pass such values to a function and modify them inside. Of course Ref introduces a different type so method implementation has to be a bit different.
Variables
A variable is a name bound to a value. In general, except for some special cases like:
rebinding a variable from module A in module B;
redefining some constants, e.g. trying to reassign a function name with a non-function value;
rebinding a variable that has a specified type of allowed values with a value that cannot be converted to this type;
you can rebind a variable to point to any value you wish. Rebinding is performed most of the time using = or some special constructs (like in for, let or catch statements).
Now - getting to the point - function is passed a value not a binding. You can modify a binding of a function parameter (in other words: you can rebind a value that a parameter is pointing to), but this parameter is a fresh variable whose scope lies inside a function.
If, for instance, we wanted a call like:
x = 10
f(x)
change a binding of variable x it is impossible because f does not even know of existence of x. It only gets passed its value. In particular - as I have noted above - adding such a functionality would break the rule that module A cannot rebind variables form module B, as f might be defined in a module different than where x is defined.
What to do
Actually it is easy enough to work without this feature from my experience:
What I typically do is simply return a value from a function that I assign to a variable. In Julia it is very easy because of tuple unpacking syntax like e.g. x,y,z = f(x,y,z), where f can be defined e.g. as f(x,y,z) = 2x,3y,4z;
You can use macros which get expanded before code execution and thus can have an effect modifying a binding of a variable, e.g. macro plusone(x) return esc(:($x = $x+1)) end and now writing y=100; #plusone(y) will change the binding of y;
Finally you can use Ref as discussed above (or any other mutable wrapper - as you have noted in your question).
"Does anyone know the reasons why Julia chose a design of functions where the parameters given as inputs cannot be modified?" asked by Schemer
Your question is wrong because you assume the wrong things.
Parameters are variables
When you pass things to a function, often those things are values and not variables.
for example:
function double(x::Int64)
2 * x
end
Now what happens when you call it using
double(4)
What is the point of the function modifying it's parameter x , it's pointless. Furthermore the function has no idea how it is called.
Furthermore, Julia is built for speed.
A function that modifies its parameter will be hard to optimise because it causes side effects. A side effect is when a procedure/function changes objects/things outside of it's scope.
If a function does not modifies a variable that is part of its calling parameter then you can be safe knowing.
the variable will not have its value changed
the result of the function can be optimised to a constant
not calling the function will not break the program's behaviour
Those above three factors are what makes FUNCTIONAL language fast and NON FUNCTIONAL language slow.
Furthermore when you move into Parallel programming or Multi Threaded programming, you absolutely DO NOT WANT a variable having it's value changed without you (The programmer) knowing about it.
"How would you implement with your proposed macro, the function F(x) which returns a boolean value and modifies c by c:= c + 1. F can be used in the following piece of Ada code : c:= 0; While F(c) Loop ... End Loop;" asked by Schemer
I would write
function F(x)
boolean_result = perform_some_logic()
return (boolean_result,x+1)
end
flag = true
c = 0
(flag,c) = F(c)
while flag
do_stuff()
(flag,c) = F(c)
end
"Unfortunately no, because, and I should have said that, c has to take again the value 0 when F return the value False (c increases as long the Loop lives and return to 0 when it dies). " said Schemer
Then I would write
function F(x)
boolean_result = perform_some_logic()
if boolean_result == true
return (true,x+1)
else
return (false,0)
end
end
flag = true
c = 0
(flag,c) = F(c)
while flag
do_stuff()
(flag,c) = F(c)
end

Creating copies in Julia with = operator

When I create some array A and assign it to B
A = [1:10]
B = A
I can modify A and the change reflects in B
A[1] = 42
# B[1] is now 42
But if I do that with scalar variables, the change doesn't propagate:
a = 1
b = a
a = 2
# b remains being 1
I can even mix the things up and transform the vector to a scalar, and the change doesn't propagate:
A = [1:10]
B = A
A = 0
# B remains being 1,2,...,10
What exactly does the = operator does? When I want to copy variables and modify the old ones preserving the integrity of the new variables, when should I use b = copy(a) over just b=a?
The confusion stems from this: assignment and mutation are not the same thing.
Assignment. Assignment looks like x = ... – what's left of the = is an identifier, i.e. a variable name. Assignment changes which object the variable x refers to (this is called a variable binding). It does not mutate any objects at all.
Mutation. There are two typical ways to mutate something in Julia:
x.f = ... – what's left of the = is a field access expression;
x[i] = ... – what's left of the = is an indexing expression. Currently, field mutation is fundamental – that syntax can only mean that you are mutating a structure by changing its field. This may change. Array mutation syntax is not fundamental – x[i] = y means setindex!(x, y, i) and you can either add methods to setindex! or locally change which generic function setindex!. Actual array assignment is a builtin – a function implemented in C (and for which we know how to generate corresponding LLVM code).
Mutation changes the values of objects; it doesn't change any variable bindings. After doing either of the above, the variable x still refers to the same object it did before; that object may have different contents, however. In particular, if that object is accessible from some other scope – say the function that called one doing the mutation – then the changed value will be visible there. But no bindings have changed – all bindings in all scopes still refer to the same objects.
You'll note that in this explanation I never once talked about mutability or immutability. That's because it has nothing to do with any of this – mutable and immutable objects have exactly the same semantics when it comes to assignment, argument passing, etc. The only difference is that if you try to do x.f = ... when x is immutable, you will get an error.
This behavior is similar to Java. A and B are variables that can hold either a "plain" data type, such as an integer, float etc, or a references (aka pointers) to a more complex data structure. In contrast to Java, Julia handles many non-abstract types as "plain" data.
You can test with isbits(A) whether your variable A holds a bit value, or contains a reference to another data object. In the first case B=A will copy every bit from A to a new memory allocation for B, otherwise, only the reference to the object will be copied.
Also play around with pointer_from_objref(A).

In OCaml, how do I re-assign a global variable inside a function

My program has the following global variable:
let a = (0.0,0.0);;
And the following, where eval e1 returns a string_of_float and somefunc e2 returns a tuple.
let rec output_expr = function
Binop(e1, op, e2) ->
let onDist = float_of_string(eval e1) and onDir = somefunc e2 in
let newA = onDir in (
fprintf oc "\n\t%s" ("blah");
fprintf oc "\n\t%s" ("blah");
fprintf oc "\n\t%s" ("blah");
let a = newA
)
Now, the code above gives me the following error:
Error: This expression has type bool
but an expression was expected of type unit
Command exited with code 2.
I want let a = newA to change the value of the global variable a. How can I do that?
To do it you need to make the value a reference,
let a = ref (0.0, 0.0)
then later that state can change by,
a := (1.0, 2.0);
In a functional world you would not want to have this global state. Sometimes it is very helpful, but in this particular case that is doubtful. You should pass the value a into your function and return a new value (a') that can be used subsequently; note that the value never changes, but new values take the place and are used in further computation.
In your particular case, I think you need to ask yourself why a function named output_expr modifies some global state, or returns anything but unit. But maybe this is a toy example for our consumption, so I will leave it at that.
You cannot assign to a variable (local or global is the same) in OCaml. There's simply no syntax in the language for it. In other words, variables in OCaml are what other languages call "constants" -- they get a value once in initialization, and that's it.
However, you can use a mutable data structure, which offers ways to modify its contents. Data structures are reference types, you can hold a reference to the data structure in a variable, and modify the contents, without needing to assign to the variable.
nlucaroni mentioned such a data structure, ref, which is a simple mutable cell holding a value of the desired type. There are other mutable data structures, like arrays, strings, and any record with mutable fields. Each has its own way of modifying the contents.
However, mutable state can mostly be avoided in functional programming, and if you are relying on mutable state, it may be an indication that you are not doing it the functional way.
In OCaml, values are immutable. You can't change the content of a value and should reorganize your code so that you don't need to.
Here your function output_expr should return the newA and this value should be used instead of a after that.
Actually you can have mutable variables using references but you should only use them if you know what you do and think they are better suited for a particular use case, never because you don't understand immutability.

Resources