Extract nth element of a tuple - functional-programming

For a list, you can do pattern matching and iterate until the nth element, but for a tuple, how would you grab the nth element?

TL;DR; Stop trying to access directly the n-th element of a t-uple and use a record or an array as they allow random access.
You can grab the n-th element by unpacking the t-uple with value deconstruction, either by a let construct, a match construct or a function definition:
let ivuple = (5, 2, 1, 1)
let squared_sum_let =
let (a,b,c,d) = ivuple in
a*a + b*b + c*c + d*d
let squared_sum_match =
match ivuple with (a,b,c,d) -> a*a + b*b + c*c + d*d
let squared_sum_fun (a,b,c,d) =
a*a + b*b + c*c + d*d
The match-construct has here no virtue over the let-construct, it is just included for the sake of completeness.
Do not use t-uples, Don¹
There are only a few cases where using t-uples to represent a type is the right thing to do. Most of the times, we pick a t-uple because we are too lazy to define a type and we should interpret the problem of accessing the n-th field of a t-uple or iterating over the fields of a t-uple as a serious signal that it is time to switch to a proper type.
There are two natural replacements to t-uples: records and arrays.
When to use records
We can see a record as a t-uple whose entries are labelled; as such, they are definitely the most natural replacement to t-uples if we want to access them directly.
type ivuple = {
a: int;
b: int;
c: int;
d: int;
}
We then access directly the field a of a value x of type ivuple by writing x.a. Note that records are easily copied with modifications, as in let y = { x with d = 0 }. There is no natural way to iterate over the fields of a record, mostly because a record do not need to be homogeneous.
When to use arrays
A large² homogeneous collection of values is adequately represented by an array, which allows direct access, iterating and folding. A possible inconvenience is that the size of an array is not part of its type, but for arrays of fixed size, this is easily circumvented by introducing a private type — or even an abstract type. I described an example of this technique in my answer to the question “OCaml compiler check for vector lengths”.
Note on float boxing
When using floats in t-uples, in records containing only floats and in arrays, these are unboxed. We should therefore not notice any performance modification when changing from one type to the other in our numeric computations.
¹ See the TeXbook.
² Large starts near 4.

Since the length of OCaml tuples is part of the type and hence known (and fixed) at compile time, you get the n-th item by straightforward pattern matching on the tuple. For the same reason, the problem of extracting the n-th element of an "arbitrary-length tuple" cannot occur in practice - such a "tuple" cannot be expressed in OCaml's type system.
You might still not want to write out a pattern every time you need to project a tuple, and nothing prevents you from generating the functions get_1_1...get_i_j... that extract the i-th element from a j-tuple for any possible combination of i and j occuring in your code, e.g.
let get_1_1 (a) = a
let get_1_2 (a,_) = a
let get_2_2 (_,a) = a
let get_1_3 (a,_,_) = a
let get_2_3 (_,a,_) = a
...
Not necessarily pretty, but possible.
Note: Previously I had claimed that OCaml tuples can have at most length 255 and you can simply generate all possible tuple projections once and for all. As #Virgile pointed out in the comments, this is incorrect - tuples can be huge. This means that it is impractical to generate all possible tuple projection functions upfront, hence the restriction "occurring in your code" above.

It's not possible to write such a function in full generality in OCaml. One way to see this is to think about what type the function would have. There are two problems. First, each size of tuple is a different type. So you can't write a function that accesses elements of tuples of different sizes. The second problem is that different elements of a tuple can have different types. Lists don't have either of these problems, which is why you can have List.nth.
If you're willing to work with a fixed size tuple whose elements are all the same type, you can write a function as shown by #user2361830.
Update
If you really have collections of values of the same type that you want to access by index, you should probably be using an array.

here is a function wich return you the string of the ocaml function you need to do that ;) very helpful I use it frequently.
let tup len n =
if n>=0 && n<len then
let rec rep str nn = match nn<1 with
|true ->""
|_->str ^ (rep str (nn-1))in
let txt1 ="let t"^(string_of_int len)^"_"^(string_of_int n)^" tup = match tup with |" ^ (rep "_," n) ^ "a" and
txt2 =","^(rep "_," (len-n-2)) and
txt3 ="->a" in
if n = len-1 then
print_string (txt1^txt3)
else
print_string (txt1^txt2^"_"^txt3)
else raise (Failure "Error") ;;
For example:
tup 8 6;;
return:
let t8_6 tup = match tup with |_,_,_,_,_,_,a,_->a
and of course:
val t8_6 : 'a * 'b * 'c * 'd * 'e * 'f * 'g * 'h -> 'g = <fun>

Related

Can a julia struct be defined with persistent requirements on field dimensions?

If I define a new struct as
mutable struct myStruct
data::AbstractMatrix
labels::Vector{String}
end
and I want to throw an error if the length of labels is not equal to the number of columns of data, I know that I can write a constructor that enforces this condition like
myStruct(data, labels) = length(labels) != size(data)[2] ? error("Labels incorrect length") : new(data,labels)
However, once the struct is initialized, the labels field can be set to the incorrect length:
m = myStruct(randn(2,2), ["a", "b"])
m.labels = ["a"]
Is there a way to throw an error if the labels field is ever set to length not equal to the number of columns in data?
You could use StaticArrays.jl to fix the matrix and vector's sizes to begin with:
using StaticArrays
mutable struct MatVec{R, C, RC, VT, MT}
data::MMatrix{R, C, MT, RC} # RC should be R*C
labels::MVector{C, VT}
end
but there's the downside of having to compile for every concrete type with a unique permutation of type parameters R,C,MT,VT. StaticArrays also does not scale as well as normal Arrays.
If you don't restrict dimensions in the type parameters (with all those downsides) and want to throw an error at runtime, you got good and bad news.
The good news is you can control whatever mutation happens to your type. m.labels = v would call the method setproperty!(object::myStruct, name::Symbol, v), which you can define with all the safeguards you like.
The bad news is that you can't control mutation to the fields' types. push!(m.labels, 1) mutates in the push!(a::Vector{T}, item) method. The myStruct instance itself doesn't actually change; it still points to the same Vector. If you can't guarantee that you won't do something like x = m.labels; push!(x, "whoops") , then you really do need runtime checks, like iscorrect(m::myStruct) = length(m.labels) == size(m.data)[2]
A good option is to not access the fields of your struct directly. Instead, do it using a function. Eg:
mutable struct MyStruct
data::AbstractMatrix
labels::Vector{String}
end
function modify_labels(s::MyStruct, new_labels::Vector{String})
# do all checks and modifications
end
You should check chapter 8 from "Hands-On Design Patterns and Best Practices with Julia: Proven solutions to common problems in software design for Julia 1.x"

What is an alternative to Kotlin's `reduce` operation in Rust?

I encountered this competitive programming problem:
nums is a vector of integers (length n)
ops is a vector of strings containing + and - (length n-1)
It can be solved with the reduce operation in Kotlin like this:
val op_iter = ops.iterator();
nums.reduce {a, b ->
when (op_iter.next()) {
"+" -> a+b
"-" -> a-b
else -> throw Exception()
}
}
reduce is described as:
Accumulates value starting with the first element and applying operation from left to right to current accumulator value and each element.
It looks like Rust vectors do not have a reduce method. How would you achieve this task?
Edited: since Rust version 1.51.0, this function is called reduce
Be aware of similar function which is called fold. The difference is that reduce will produce None if iterator is empty while fold accepts accumulator and will produce accumulator's value if iterator is empty.
Outdated answer is left to capture the history of this function debating how to name it:
There is no reduce in Rust 1.48. In many cases you can simulate it with fold but be aware that the semantics of the these functions are different. If the iterator is empty, fold will return the initial value whereas reduce returns None. If you want to perform multiplication operation on all elements, for example, getting result 1 for empty set is not too logical.
Rust does have a fold_first function which is equivalent to Kotlin's reduce, but it is not stable yet. The main discussion is about naming it. It is a safe bet to use it if you are ok with nightly Rust because there is little chance the function will be removed. In the worst case, the name will be changed. If you need stable Rust, then use fold if you are Ok with an illogical result for empty sets. If not, then you'll have to implement it, or find a crate such as reduce.
Kotlin's reduce takes the first item of the iterator for the starting point while Rust's fold and try_fold let you specify a custom starting point.
Here is an equivalent of the Kotlin code:
let mut it = nums.iter().cloned();
let start = it.next().unwrap();
it.zip(ops.iter()).try_fold(start, |a, (b, op)| match op {
'+' => Ok(a + b),
'-' => Ok(a - b),
_ => Err(()),
})
Playground
Or since we're starting from a vector, which can be indexed:
nums[1..]
.iter()
.zip(ops.iter())
.try_fold(nums[0], |a, (b, op)| match op {
'+' => Ok(a + b),
'-' => Ok(a - b),
_ => Err(()),
});
Playground

Ocaml - global vs local variable

I wanted to create a global variable called result that uses 5 string concatenations to create a string containing 9 times the string start, separated by commas.
I have two pieces of code, only the second one declares a global variable.
For some reason it's not registering easily in my brain... Is it just that i used a let in so result in the first piece of code is a local variable? Is there a more detailed explanation for this?
let start = "ab";;
let result = start ^ "," in
let result = result ^ result in
let result = result ^ result in
let result = result ^ result in
let result = result ^ start in
result;;
- : string = "ab,ab,ab,ab,ab,ab,ab,ab,ab"
let result =
let result = start ^ "," in
let result = result ^ result in
let result = result ^ result in
let result = result ^ result in
let result = result ^ start in
result;;
val result : string = "ab,ab,ab,ab,ab,ab,ab,ab,ab"
Let me to be a little bit boring person. There are no local and global variables in OCaml. This concept came from languages with different scoping rules. Also, the word "variable" itself should be taken with care. Its meaning was perverted by C-like languages. The original, mathematical, meaning of this word corresponds to a name of some mathematical object, that is used inside a formula, that represent a range of such values. In C-like languages, a variable is confused with the memory cell, that can change in time. So, to avoid the confusion let's use a more accurate terminology. Let's use word name instead of variable. Since, variables... sorry names are not memory cells, there is nothing to create. When you're using one of the let syntaxes, you're actually creating a binding, i.e., an association between a name and a value. The let <name> = <expr-1> in <expr-2> binds a value of the in the scope of the <expr-2> expression. The let <name> = <expr-1> in <expr-2> is by itself is also an expression, so, for example <expr-2> can also contain let ... in ... constructs inside, e.g.,
let a = 1 in
let b = a + 1 in
let c = b + 1 in
a + b + c
I especially, indented the code in non-idiomatic way to highlight the syntactic structure of the expression. OCaml also allows to use a name, that is already bound in the scope. The new binding will hide the existing one (that is not allowed in C, for example), e.g.,
let a = a + 1 in
let a = a + 1 in
let a = a + 1 in
a + a + a
Finally, the top-level (aka module level) let-binding (called definition in OCaml parlance), has the syntax: let <name> = <expr>, note that there is no in here. The definition binds the <name> to a result of the evaluation of <expr> in the lexical scope that extends form the point of definition to the end of the enclosing module. When you're implementing a module, you must use let <name> = <expr> to bind your code to names (you may omit name by using _). It is a little bit different from the interactive toplevel (interactive ocaml program), that actually accepts an expression, and evaluates it. For example,
let result = start ^ "," in
let result = result ^ result in
let result = result ^ result in
let result = result ^ result in
let result = result ^ start in
result
Is not a valid OCaml program (something that can be put into an ml file and compiled). Because it is an expression, not a module definition.
Is it just that i used a let in so result in the first piece of code is a local variable?
Pretty much. The syntax to define a global variable is let variable = expression without an in. The syntax to define a local variable is let variable = expression in expression which will define variable local to the expression after the in.
When you have let ... in, you're declaring a local variable. When you have just let by itself (at the top level of a module), you're declaring a global name of the module. (That is, a name that can be exported from the module.)
Your first example consists entirely of let ... in. So there is no top-level name declared.
Your second example has one let by itself, followed by several occurrences of let ... in. So it declares a top-level name result.

How to implement a dictionary as a function in OCaml?

I am learning Jason Hickey's Introduction to Objective Caml.
Here is an exercise I don't have any clue
First of all, what does it mean to implement a dictionary as a function? How can I image that?
Do we need any array or something like that? Apparently, we can't have array in this exercise, because array hasn't been introduced yet in Chapter 3. But How do I do it without some storage?
So I don't know how to do it, I wish some hints and guides.
I think the point of this exercise is to get you to use closures. For example, consider the following pair of OCaml functions in a file fun-dict.ml:
let empty (_ : string) : int = 0
let add d k v = fun k' -> if k = k' then v else d k'
Then at the OCaml prompt you can do:
# #use "fun-dict.ml";;
val empty : string -> int =
val add : ('a -> 'b) -> 'a -> 'b -> 'a -> 'b =
# let d = add empty "foo" 10;;
val d : string -> int =
# d "bar";; (* Since our dictionary is a function we simply call with a
string to look up a value *)
- : int = 0 (* We never added "bar" so we get 0 *)
# d "foo";;
- : int = 10 (* We added "foo" -> 10 *)
In this example the dictionary is a function on a string key to an int value. The empty function is a dictionary that maps all keys to 0. The add function creates a closure which takes one argument, a key. Remember that our definition of a dictionary here is function from key to values so this closure is a dictionary. It checks to see if k' (the closure parameter) is = k where k is the key just added. If it is it returns the new value, otherwise it calls the old dictionary.
You effectively have a list of closures which are chained not by cons cells by by closing over the next dictionary(function) in the chain).
Extra exercise, how would you remove a key from this dictionary?
Edit: What is a closure?
A closure is a function which references variables (names) from the scope it was created in. So what does that mean?
Consider our add function. It returns a function
fun k' -> if k = k' then v else d k
If you only look at that function there are three names that aren't defined, d, k, and v. To figure out what they are we have to look in the enclosing scope, i.e. the scope of add. Where we find
let add d k v = ...
So even after add has returned a new function that function still references the arguments to add. So a closure is a function which must be closed over by some outer scope in order to be meaningful.
In OCaml you can use an actual function to represent a dictionary. Non-FP languages usually don't support functions as first-class objects, so if you're used to them you might have trouble thinking that way at first.
A dictionary is a map, which is a function. Imagine you have a function d that takes a string and gives back a number. It gives back different numbers for different strings but always the same number for the same string. This is a dictionary. The string is the thing you're looking up, and the number you get back is the associated entry in the dictionary.
You don't need an array (or a list). Your add function can construct a function that does what's necessary without any (explicit) data structure. Note that the add function takes a dictionary (a function) and returns a dictionary (a new function).
To get started thinking about higher-order functions, here's an example. The function bump takes a function (f: int -> int) and an int (k: int). It returns a new function that returns a value that's k bigger than what f returns for the same input.
let bump f k = fun n -> k + f n
(The point is that bump, like add, takes a function and some data and returns a new function based on these values.)
I thought it might be worth to add that functions in OCaml are not just pieces of code (unlike in C, C++, Java etc.). In those non-functional languages, functions don't have any state associated with them, it would be kind of rediculous to talk about such a thing. But this is not the case with functions in functional languages, you should start to think of them as a kind of objects; a weird kind of objects, yes.
So how can we "make" these objects? Let's take Jeffrey's example:
let bump f k =
fun n ->
k + f n
Now what does bump actually do? It might help you to think of bump as a constructor that you may already be familiar with. What does it construct? it constructs a function object (very losely speaking here). So what state does that resulting object has? it has two instance variables (sort of) which are f and k. These two instance variables are bound to the resulting function-object when you invoke bump f k. You can see that the returned function-object:
fun n ->
k + f n
Utilizes these instance variables f and k in it's body. Once this function-object is returned, you can only invoke it, there's no other way for you to access f or k (so this is encapsulation).
It's very uncommon to use the term function-object, they are called just functions, but you have to keep in mind that they can "enclose" state as well. These function-objects (also called closures) are not far separated from the "real" objects in object-oriented programming languages, a very interesting discussion can be found here.
I'm also struggling with this problem. Here's my solution and it works for the cases listed in the textbook...
An empty dictionary simply returns 0:
let empty (k:string) = 0
Find calls the dictionary's function on the key. This function is trivial:
let find (d: string -> int) k = d k
Add extends the function of the dictionary to have another conditional branching. We return a new dictionary that takes a key k' and matches it against k (the key we need to add). If it matches, we return v (the corresponding value). If it doesn't match we return the old (smaller) dictionary:
let add (d: string -> int) k v =
fun k' ->
if k' = k then
v
else
d k'
You could alter add to have a remove function. Also, I added a condition to make sure we don't remove a non-exisiting key. This is just for practice. This implementation of a dictionary is bad anyways:
let remove (d: string -> int) k =
if find d k = 0 then
d
else
fun k' ->
if k' = k then
0
else
d k'
I'm not good with the terminology as I'm still learning functional programming. So, feel free to correct me.

Implementing quicksort in a functional language

I need to implement quicksort in SML for a homework assignment, and I'm lost. I was previously unfamiliar with how quicksort was implemented, so I read up on that, but every implementation I read about was in an imperative one. These don't look too difficult, but I had no idea how to implement quicksort functionally.
Wikipedia happens to have quicksort code in Standard ML (which is the language required for my assignment), but I don't understand how it's working.
Wikipedia code:
val filt = List.filter
fun quicksort << xs = let
fun qs [] = []
| qs [x] = [x]
| qs (p::xs) = let
val lessThanP = (fn x => << (x, p))
in
qs (filt lessThanP xs) # p :: (qs (filt (not o lessThanP) xs))
end
in
qs xs
end
In particular, I don't understand this line: qs (filt lessThanP xs) # p :: (qs (filt (not o lessThanP) xs)). filt will return a list of everything in xs less than p*, which is concatenated with p, which is cons-ed onto everything >= p.*
*assuming the << (x, p) function returns true when x < p. Of course it doesn't have to be that.
Actually, typing this out is helping me understand what's going on a bit. Anyways, I'm trying to compare that SML function to wiki's quicksort pseudocode, which follows.
function quicksort(array, 'left', 'right')
// If the list has 2 or more items
if 'left' < 'right'
// See "Choice of pivot" section below for possible choices
choose any 'pivotIndex' such that 'left' ≤ 'pivotIndex' ≤ 'right'
// Get lists of bigger and smaller items and final position of pivot
'pivotNewIndex' := partition(array, 'left', 'right', 'pivotIndex')
// Recursively sort elements smaller than the pivot
quicksort(array, 'left', 'pivotNewIndex' - 1)
// Recursively sort elements at least as big as the pivot
quicksort(array, 'pivotNewIndex' + 1, 'right')
Where partition is defined as
// left is the index of the leftmost element of the array
// right is the index of the rightmost element of the array (inclusive)
// number of elements in subarray = right-left+1
function partition(array, 'left', 'right', 'pivotIndex')
'pivotValue' := array['pivotIndex']
swap array['pivotIndex'] and array['right'] // Move pivot to end
'storeIndex' := 'left'
for 'i' from 'left' to 'right' - 1 // left ≤ i < right
if array['i'] < 'pivotValue'
swap array['i'] and array['storeIndex']
'storeIndex' := 'storeIndex' + 1
swap array['storeIndex'] and array['right'] // Move pivot to its final place
return 'storeIndex'
So, where exactly is the partitioning happening? Or am I thinking about SMLs quicksort wrongly?
So, where exactly is the partitioning happening? Or am I thinking about SMLs quicksort wrongly?
A purely functional implementation of quicksort works by structural recursion on the input list (IMO, this is worth mentioning). Moreover, as you see, the two calls to "filt" allow you to partition the input list into two sublists (say A and B), which then can be treated individually. What is important here is that:
all elements of A are less than or equal to the pivot element ("p" in the code)
all elements of B are greater than the pivot element
An imperative implementation works in-place, by swapping elements in the same array. In the pseudocode you've provided, the post-invariant of the "partition" function is that you have two subarrays, one starting at 'left' of input array (and ending at 'pivotIndex'), and another starting right after 'pivotIndex' and ending at 'right'. What is important here is that the two subarrays can be seen as representations of the sublists A and B.
I think that by now, you have the idea where the partitioning step is happening (or conversely, how the imperative and functional are related).
You said this:
filt will return a list of everything in xs less than p*, which is concatenated with p, which is cons-ed onto everything >= p.*
That's not quite accurate. filt will return a list of everything in xs less than p, but that new list isn't immediately concatenated with p. The new list is in fact passed to qs (recursively), and whatever qs returns is concatenated with p.
In the pseudocode version, the partitioning happens in-place in the array variable. That's why you see swap in the partition loop. Doing the partitioning in-place is much better for performance than making a copy.

Resources