Compare two lists in Ocaml - functional-programming

Seems simple enough, but I'm very new to this language and having some trouble. Given two lists, what would be the best way to write a function to determine which list is "larger."
For example: [1:2:3] and [1:3:2] would return [1:3:2]
These lists don't have to be the same length: [1:2] and [1:2:3] would return [1:2:3]
Thanks.

The predefined function max will do this for you:
# max [1;2;3] [1;3;2];;
- : int list = [1; 3; 2]
# max [1;2] [1;2;3];;
- : int list = [1; 2; 3]
Of course, it depends on what you mean by "larger". The built-in comparisons of OCaml use lexicographic order. If you wanted to use some other order, you would actually have to write your own function.
Or maybe you want to write your own function from scratch just for the practice. In that case, a good way to go with lists in OCaml is to use recursion. Try out some patterns of recursion and (if you still need help), update your question to show what you've tried.

Related

Prime factorization in a functional language

I'm writing in OCaml, trying to give a (somewhat) efficient implementation of prime factorization. I figure the best representation of a number 2 or more is in a list of exponents. For simplicity with consing I'll do it in decreasing order of primes. So 2 would be [1] and 3 would be [1;0] and 4 would be [2], and 5 [1;0;0].
I was thinking of using the sieve idea to take a number n and look for all possible divisors between 2 and sqrt(n). Then divide by any divisor and recurse. However, every implementation that I can think of seems to involve repeatedly searching over a list and that seems just unnecessarily inefficient. The outline of my solution is best stated in this code
let rec pf n =
if (n=2) then ([1], 0)
else let sq = int_of_float ( (float_of_int n) ** 0.5 ) in
let primes = getPrimes sq in
match earliestDiv n primes with
| None -> n::(zero_list (n-1))
| Some (x, i) -> let subproblem pf (n/x) in
increment subproblem i
The helper functions here would be:
getPrimes which takes an int and returns a list of all prime numbers less-than-or-equal to it.
earliestDiv which takes an int n and list of ints lst, returns an int*int option corresponding to the earliest number in lst which divides n. That will be the first coordinate of the tuple; the second coordinate will return the index of this prime x in the list of primes.
increment will take an int list and index, and increase by 1 the number located at the index.
All of these helper functions keep making lists, and passing through lists, and so on. And in fact, I often feel like I'm doing this in functional programming. I often have the sense that I'm unnecessarily iterating over lists whereas in imperative languages I would be writing code that is more efficient. Perhaps it's just in my head, and when writing in imperative languages I less often notice how many resources are going into some of the list operations I use. But if I'm missing some important technique that could prevent repeatedly scanning lists, I'd be curious to hear it.
The question: Is it necessary to repeatedly make and iterate over lists in order to write this function?
If you end up indexing a list, or filling a list with default elements up to a fixed size, lists are most probably the wrong data structure. For prime factorisation, you probably want an implementation of a sparse array. Maps would be a better (if not optimal) implementation than fixed-size lists.

Define a new method with only a few changes

I want to write a version that accepts a supplementary argument. The difference with the initial version only resides in a few lines of codes, potentially within loops. A typical example is to user a vector of weight w.
One solution is to completely rewrite a new function
function f(Vector::a)
...
for x in a
...
s += x[i]
...
end
...
end
function f(a::Vector, w::Vector)
...
for x in a
...
s += x[i] * w[i]
...
end
...
end
This solution duplicates code and therefore makes the program harder to maintain.
I could split ... into different helper functions, which are called by both functions, but the resulting code would be hard to follow
Another solution is to write only one function and use a ? : structure for each line that should be changed
function f(a, w::Union(Nothing, Vector) = nothing)
....
for x in a
...
s += (w == nothing)? x[i] : x[i] * w[i]
...
end
....
end
This code requires to check a condition at every step in a loop, which does not sound efficient, compared to the first version.
I'm sure there is a better solution, maybe using macros. What would be a good way to deal with this?
There are lots of ways to do this sort of thing, ranging from optional arguments to custom types to metaprogramming with #eval'ed code generation (this would splice in the changes for each new method as you loop over a list of possibilities).
I think in this case I'd use a combination of the approaches suggested by #ColinTBowers and #GnimucKey.
It's fairly simple to define a custom array type that is all ones:
immutable Ones{N} <: AbstractArray{Int,N}
dims::NTuple{N, Int}
end
Base.size(O::Ones) = O.dims
Base.getindex(O::Ones, I::Int...) = (checkbounds(O, I...); 1)
I've chosen to use an Int as the element type since it tends to promote well. Now all you need is to be a bit more flexible in your argument list and you're good to go:
function f(a::Vector, w::AbstractVector=Ones(size(a))
…
This should have a lower overhead than either of the other proposed solutions; getindex should inline nicely as a bounds check and the number 1, there's no type instability, and you don't need to rewrite your algorithm. If you're sure that all your accesses are in-bounds, you could even remove the bounds checking as an additional optimization. Or on a recent 0.4, you could define and use Base.unsafe_getindex(O::Ones, I::Int...) = 1 (that won't quite work on 0.3 since it's not guaranteed to be defined for all AbstractArrays).
In this case, using Optional Arguments may play the trick.
Just make the w argument default to ones().
I've come up against this problem a few times. If you want to avoid the conditional if statement inside the loop, one possibility is to use multiple dispatch over some dummy types. For example:
abstract MyFuncTypes
type FuncWithNoWeight <: MyFuncTypes; end
evaluate(x::Vector, i::Int, ::FuncWithNoWeight) = x[i]
type FuncWithWeight{T} <: MyFuncTypes
w::Vector{T}
end
evaluate(x::Vector, i::Int, wT::FuncWithWeight) = x[i] * wT.w[i]
function f(a, w::MyFuncTypes=FuncWithNoWeight())
....
for x in a
...
s += evaluate(x, i, w)
...
end
....
end
I extend the evaluate method over FuncWithNoWeight and FuncWithWeight in order to get the appropriate behaviour. I also nest these types within an abstract type MyFuncTypes, which is the second input to f (with default value of FuncWithNoWeight). From here, multiple dispatch and Julia's type system takes care of the rest.
One neat thing about this approach is that if you decide later on you want to add a third type of behaviour inside the loop (not necessarily even weighting, pretty much any type of transformation will be possible), it is as simple as defining a new type, nesting it under MyFuncTypes, and extending the evaluate method to the new type.
UPDATE: As Matt B. has pointed out, the first version of my answer accidentally introduced type instability into the function with my solution. As a general rule I typically find that if Matt posts something it is worth paying close attention (hint, hint, check out his answer). I'm still learning a lot about Julia (and am answering questions on StackOverflow to facilitate that learning). I've updated my answer to remove the type instability pointed out by Matt.

Rust - Vector with size defined at runtime

How do you create an array, in rust, whose size is defined at run time?
Basically, how do you convert in rust the following code:
void f(int n){ return std::vector<int>(n); }
?
This is not possible in rust:
let n = 15;
let board: [int, ..n];
Note: I saw that it was impossible to do this in a simple manner, here, but I refuse to accept that such a simple thing is impossible :p
Thanks a lot!
Never-mind, I found it the way:
let n = 15; // number of items
let val = 17; // value to replicate
let v = std::vec::from_elem(val, n);
The proper way in modern Rust is vec![value; size].
Values are cloned, which is quite a relief compared to other languages that casually hand back a vector of references to the same object. E.g. vec![vec![]; 2] creates a vector where both elements are independent vectors, 3 vectors in total. Python's [[]] * 2 creates a vector of length 2 where both elements are (references to) the same nested vector.

New to OCaml: How would I go about implementing Gaussian Elimination?

I'm new to OCaml, and I'd like to implement Gaussian Elimination as an exercise. I can easily do it with a stateful algorithm, meaning keep a matrix in memory and recursively operating on it by passing around a reference to it.
This statefulness, however, smacks of imperative programming. I know there are capabilities in OCaml to do this, but I'd like to ask if there is some clever functional way I haven't thought of first.
OCaml arrays are mutable, and it's hard to avoid treating them just like arrays in an imperative language.
Haskell has immutable arrays, but from my (limited) experience with Haskell, you end up switching to monadic, mutable arrays in most cases. Immutable arrays are probably amazing for certain specific purposes. I've always imagined you could write a beautiful implementation of dynamic programming in Haskell, where the dependencies among array entries are defined entirely by the expressions in them. The key is that you really only need to specify the contents of each array entry one time. I don't think Gaussian elimination follows this pattern, and so it seems it might not be a good fit for immutable arrays. It would be interesting to see how it works out, however.
You can use a Map to emulate a matrix. The key would be a pair of integers referencing the row and column. You'll want to use your own get x y function to ensure x < n and y < n though, instead of accessing the Map directly. (edit) You can use the compare function in Pervasives directly.
module OrderedPairs = struct
type t = int * int
let compare = Pervasives.compare
end
module Pairs = Map.Make (OrderedPairs)
let get_ n set x y =
assert( x < n && y < n );
Pairs.find (x,y) set
let set_ n set x y v =
assert( x < n && y < n );
Pairs.add (x,y) set v
Actually, having a general set of functions (get x y and set x y at a minimum), without specifying the implementation, would be an even better option. The functions then can be passed to the function, or be implemented in a module through a functor (a better solution, but having a set of functions just doing what you need would be a first step since you're new to OCaml). In this way you can use a Map, Array, Hashtbl, or a set of functions to access a file on the hard-drive to implement the matrix if you wanted. This is the really important aspect of functional programming; that you trust the interface over exploiting the side-effects, and not worry about the underlying implementation --since it's presumed to be pure.
The answers so far are using/emulating mutable data-types, but what does a functional approach look like?
To see, let's decompose the problem into some functional components:
Gaussian elimination involves a sequence of row operations, so it is useful first to define a function taking 2 rows and scaling factors, and returning the resultant row operation result.
The row operations we want should eliminate a variable (column) from a particular row, so lets define a function which takes a pair of rows and a column index and uses the previously defined row operation to return the modified row with that column entry zero.
Then we define two functions, one to convert a matrix into triangular form, and another to back-substitute a triangular matrix to the diagonal form (using the previously defined functions) by eliminating each column in turn. We could iterate or recurse over the columns, and the matrix could be defined as a list, vector or array of lists, vectors or arrays. The input is not changed, but a modified matrix is returned, so we can finally do:
let out_matrix = to_diagonal (to_triangular in_matrix);
What makes it functional is not whether the data-types (array or list) are mutable, but how they they are used. This approach may not be particularly 'clever' or be the most efficient way to do Gaussian eliminations in OCaml, but using pure functions lets you express the algorithm cleanly.

Confused over behavior of List.mapi in F#

I am building some equations in F#, and when working on my polynomial class I found some odd behavior using List.mapi
Basically, each polynomial has an array, so 3*x^2 + 5*x + 6 would be [|6, 5, 3|] in the array, so, when adding polynomials, if one array is longer than the other, then I just need to append the extra elements to the result, and that is where I ran into a problem.
Later I want to generalize it to not always use a float, but that will be after I get more working.
So, the problem is that I expected List.mapi to return a List not individual elements, but, in order to put the lists together I had to put [] around my use of mapi, and I am curious why that is the case.
This is more complicated than I expected, I thought I should be able to just tell it to make a new List starting at a certain index, but I can't find any function for that.
type Polynomial() =
let mutable coefficients:float [] = Array.empty
member self.Coefficients with get() = coefficients
static member (+) (v1:Polynomial, v2:Polynomial) =
let ret = List.map2(fun c p -> c + p) (List.ofArray v1.Coefficients) (List.ofArray v2.Coefficients)
let a = List.mapi(fun i x -> x)
match v1.Coefficients.Length - v2.Coefficients.Length with
| x when x < 0 ->
ret :: [((List.ofArray v1.Coefficients) |> a)]
| x when x > 0 ->
ret :: [((List.ofArray v2.Coefficients) |> a)]
| _ -> [ret]
I think that a straightforward implementation using lists and recursion would be simpler in this case. An alternative implementation of the Polynomial class might look roughly like this:
// The type is immutable and takes initial list as constructor argument
type Polynomial(coeffs:float list) =
// Local recursive function implementing the addition using lists
let rec add l1 l2 =
match l1, l2 with
| x::xs, y::ys -> (x+y) :: (add xs ys)
| rest, [] | [], rest -> rest
member self.Coefficients = coeffs
static member (+) (v1:Polynomial, v2:Polynomial) =
// Add lists using local function
let newList = add v1.Coefficients v2.Coefficients
// Wrap result into new polynomial
Polynomial(newList)
It is worth noting that you don't really need mutable field in the class, since the + operator creates and returns a new instance of the type, so the type is fully immutable (as you'd usually want in F#).
The nice thing in the add function is that after processing all elements that are available in both lists, you can simply return the tail of the non-empty list as the rest.
If you wanted to implement the same functionality using arrays, then it may be better to use a simple for loop (since arrays are, in principle, imperative, the usual imperative patterns are usually the best option for dealing with them). However, I don't think there is any particular reason for preferring arrays (maybe performance, but that would have to be evaluated later during the development).
As Pavel points out, :: operator appends a single element to the front of a list (see the add function above, which demonstrates that). You could write what you wanted using # which concatenates lists, or using Array.concat (which concatenates a sequence of arrays).
An implementation using higher-order functions and arrays is also possible - the best version I can come up with would look like this:
let add (a1:_[]) (a2:_[]) =
// Add parts where both arrays have elements
let l = min a1.Length a2.Length
let both = Array.map2 (+) a1.[0 .. l-1] a2.[0 .. l-1]
// Take the rest of the longer array
let rest =
if a1.Length > a2.Length
then a1.[l .. a1.Length - 1]
else a2.[l .. a2.Length - 1]
// Concatenate them
Array.concat [ both; rest ]
add [| 6; 5; 3 |] [| 7 |]
This uses slices (e.g. a.[0 .. l]) which give you a part of an array - you can use these to take the parts where both arrays have elements and the remaining part of the longer array.
I think you're misunderstanding what operator :: does. It's not used to concatenate two lists. It's used to prepend a single element to the list. Consequently, it's type is:
'a -> 'a list -> 'a list
In your case, you're giving ret as a first argument, and ret is itself a float list. Consequently, it expects the second argument to be of type float list list - hence why you need to add an extra [] to the second argument to make it to compile - and that will also be the result type of your operator +, which is probably not what you want.
You can use List.concat to concatenate two (or more) lists, but that is inefficient. In your example, I don't see the point of using lists at all - all this converting back & forth is going to be costly. For arrays, you can use Array.append, which is better.
By the way, it's not clear what is the purpose of mapi in your code at all. It's exactly the same as map, except for the index argument, but you're not using it, and your mapping is the identity function, so it's effectively a no-op. What's it about?

Resources