Beginner to SML / NJ. How to find largest Value in list

We want to find the largest value in a given nonempty list of integers. Then we have to compare elements in the list. Since data
values are given as a sequence, we can do comparisons from the
beginning or from the end of the list. Define in both ways. a)
comparison from the beginning b) comparison from the end (How can we
do this when data values are in a list?) No auxiliary functions.
I've been playing around a lot with recursive functions, but can't seem to figure out how to compare two values in the list.
fun listCompare [] = 0
| listCompare [x] = x
| listCompare (x::xs) = listCompare(xs)
This will break the list down to the last element, but how do I start comparing and composing the list back up?

You could compare the first two elements of a given list and keep the larger element in the list and drop the other. Once the list has only one element, then you have the maximum. In functional pseudocode for a) it looks roughly like so:
lmax [] = error "empty list"
lmax [x] = x
lmax (x::y::xs) =
if x > y then lmax (x::xs)
else lmax (y::xs)
For b) you could reverse the list first.
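The pseudocode above can be turned into SML roughly as follows. This is only a sketch: the names lmaxFront and lmaxBack are made up for illustration, and raising the Basis Library's Empty exception on an empty list is an assumption (the exercise only promises nonempty input). The second function shows one way to do b) without reversing: recurse down to the last element first, then compare on the way back up.

```sml
(* a) Compare from the beginning: keep the larger of the first two
   elements and recurse on the shortened list. *)
fun lmaxFront [] = raise Empty
  | lmaxFront [x] = x
  | lmaxFront (x :: y :: rest) =
      if x > y then lmaxFront (x :: rest) else lmaxFront (y :: rest)

(* b) Compare from the end: find the maximum of the tail first, then
   compare it with the head while returning from the recursion. *)
fun lmaxBack [] = raise Empty
  | lmaxBack [x] = x
  | lmaxBack (x :: rest) =
      let val m = lmaxBack rest
      in if x > m then x else m end
```

Neither function calls an auxiliary function; the let binding in lmaxBack only names the result of the recursive call.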

This is what the foldl (or foldr) function in the SML list library is for:
foldl : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
You can simply pass an anonymous function that compares the current element against the accumulator:
fun lMax l =
  foldl (fn (x,y) => if x > y then x else y) (List.nth (l, 0)) l
The List.nth function takes a pair of the int list l and the index 0 and returns the first element of the list (it raises Subscript if the list is empty). As lists in SML are built recursively as h :: t, retrieving the first element is an O(1) operation, and using foldl makes the code considerably more elegant. The whole point of having a functional language is to define abstractions that take anonymous functions as arguments (higher-order functions) and to re-use those abstractions with different concrete functions.
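As a variation on the fold above, the first element can be taken by pattern matching instead of by index, and the comparison can be delegated to Int.max from the Basis Library. This is a sketch under the assumption that raising Empty on an empty list is acceptable; the name lMax is reused from the answer.

```sml
(* Seed the fold with the head, then fold Int.max over the tail. *)
fun lMax [] = raise Empty
  | lMax (x :: xs) = foldl Int.max x xs

val largest = lMax [3, 1, 4, 1, 5]   (* 5 *)
```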

Related

Is it possible to create a "generic" function in Standard ML?

I would like to create a function remove_duplicates that takes a list of any type (e.g. can be an int list or a bool list or a int list list or a whatever list) and returns the same list without duplicates, is this possible in Standard ML?
Is a function that takes a list of any type and returns the list without duplicates possible in Standard ML?
No.
To determine if one element is a duplicate of another, their values must be comparable. "Any type", or 'a in Standard ML, is not comparable for equality. So while you cannot have a val nub : 'a list -> 'a list that removes duplicates, here are four alternative options:
What @qouify suggests, the built-in equality type ''a, so anything you can use = on:
val nub : ''a list -> ''a list
What @kopecs suggests, a function that takes an equality operator as parameter:
val nub : ('a * 'a -> bool) -> 'a list -> 'a list
Which is a generalisation of 1., since here, nub op= : ''a list -> ''a list. This solution is kind of neat since it lets you remove not only duplicates, but also redundant representatives of arbitrary equivalence classes, e.g. nub (fn (x, y) => (x mod 3) = (y mod 3)) will only preserve integers that are distinct modulo 3. But its complexity is O(n²). (-_- )ノ⌒┻━┻
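A minimal sketch of a nub with this signature might look like the following. The implementation is an assumption (the answer only gives the type); it keeps the first representative of each class and filters the rest, which is where the quadratic cost comes from.

```sml
(* nub : ('a * 'a -> bool) -> 'a list -> 'a list
   Keep x, then drop everything in the tail that eq considers equal to x. *)
fun nub eq [] = []
  | nub eq (x :: xs) =
      x :: nub eq (List.filter (fn y => not (eq (x, y))) xs)

val plain = nub op= [1, 1, 2, 1, 3]                               (* [1, 2, 3] *)
val mod3  = nub (fn (x, y) => x mod 3 = y mod 3) [1, 2, 3, 4, 5, 6] (* [1, 2, 3] *)
```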
Because it is O(n²), nub is considered harmful.
As the article also suggests, the alternative is to use ordering rather than equality to reduce the complexity to O(n log n). While in Haskell this means only changing the type class constraint:
nub :: Eq a => [a] -> [a]
nubOrd :: Ord a => [a] -> [a]
and adjusting the algorithm, it gets a little more complicated to express this constraint in SML. While we do have ''a to represent Eq a => a (that we can use = on our input), we don't have a similar special syntax support for elements that can be compared as less/equal/greater, and we also don't have type classes. We do have the following built-in order type:
datatype order = LESS | EQUAL | GREATER
so if you like kopecs' solution, a variation with a better running time is:
val nubOrd : ('a * 'a -> order) -> 'a list -> 'a list
since it can use something like a mathematical set of previously seen elements, implemented using some kind of balanced search tree; n inserts each of complexity O(log n) takes a total of O(n log n) steps.
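To make the idea concrete, here is a hedged sketch of such a nubOrd that takes the comparison function directly. The tree type and the helper names member, insert and go are illustrative assumptions, and the tree is a plain unbalanced binary search tree, so the O(n log n) bound only holds when the tree stays reasonably balanced; a self-balancing tree would be needed to guarantee it.

```sml
datatype 'a tree = Leaf | Branch of 'a tree * 'a * 'a tree

(* nubOrd : ('a * 'a -> order) -> 'a list -> 'a list *)
fun nubOrd cmp xs =
  let
    fun member (_, Leaf) = false
      | member (x, Branch (l, y, r)) =
          (case cmp (x, y) of
               EQUAL => true
             | LESS => member (x, l)
             | GREATER => member (x, r))
    fun insert (x, Leaf) = Branch (Leaf, x, Leaf)
      | insert (x, t as Branch (l, y, r)) =
          (case cmp (x, y) of
               EQUAL => t
             | LESS => Branch (insert (x, l), y, r)
             | GREATER => Branch (l, y, insert (x, r)))
    (* Accumulate unseen elements in reverse, together with the seen-set. *)
    fun go (x, (ys, seen)) =
      if member (x, seen) then (ys, seen) else (x :: ys, insert (x, seen))
  in
    rev (#1 (foldl go ([], Leaf) xs))
  end

val ints = nubOrd Int.compare [1, 1, 2, 1, 3, 1, 2]   (* [1, 2, 3] *)
```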
One of SML's winner features is its composable module system. Instead of using parametric polymorphism and feeding the function nubOrd with an order comparison function, you can create a module that takes another module as a parameter (a functor).
First, let's define a signature for modules that represent ordering of types:
signature ORD =
sig
type t
val compare : t * t -> order
end
(Notice that there isn't a ' in front of t.)
This means that anyone could make a struct ... end : ORD by specifying a t and a corresponding compare function for ts. Many built-in types have pre-defined compare functions: int has Int.compare and real has Real.compare.
Then, define a tree-based set data structure; I've used a binary search tree, and I've skipped most functions but the ones strictly necessary to perform this feat. Ideally you might extend the interface and use a better tree type, such as a self-balancing tree. (Unfortunately, since you've tagged this Q&A both as SML/NJ and Moscow ML, I wasn't sure which module to use, since they extend the standard library in different ways when it comes to balanced trees.)
functor TreeSet (X : ORD) =
struct
  type t = X.t
  datatype 'a tree = Leaf | Branch of 'a tree * 'a * 'a tree
  val empty = Leaf
  fun member (x, Leaf) = false
    | member (x, Branch (left, y, right)) =
        case X.compare (x, y) of
             EQUAL => true
           | LESS => member (x, left)
           | GREATER => member (x, right)
  fun insert (x, Leaf) = Branch (Leaf, x, Leaf)
    | insert (x, Branch (left, y, right)) =
        case X.compare (x, y) of
             EQUAL => Branch (left, y, right)
           | LESS => Branch (insert (x, left), y, right)
           | GREATER => Branch (left, y, insert (x, right))
end
Lastly, the ListUtils functor contains the nubOrd utility function. The functor takes a structure X : ORD just like the TreeSet functor does. It creates an XSet structure by specialising the TreeSet functor using the same ordering module. It then uses this XSet to efficiently keep a record of the elements it has seen before.
functor ListUtils (X : ORD) =
struct
  structure XSet = TreeSet(X)
  fun nubOrd (xs : X.t list) =
    let
      val init = ([], XSet.empty)
      fun go (x, (ys, seen)) =
        if XSet.member (x, seen)
        then (ys, seen)
        else (x::ys, XSet.insert (x, seen))
    in rev (#1 (foldl go init xs))
    end
end
Using this functor to remove duplicates in an int list:
structure IntListUtils = ListUtils(struct
type t = int
val compare = Int.compare
end)
val example = IntListUtils.nubOrd [1,1,2,1,3,1,2,1,3,3,2,1,4,3,2,1,5,4,3,2,1]
(* [1, 2, 3, 4, 5] *)
The purpose of all that mess is a nubOrd without a direct extra function parameter.
Unfortunately, in order for this to extend to int list list, you need to create the compare function for that type, since unlike Int.compare, there isn't a generic one available in the standard library either. (This is where Haskell is a lot more ergonomic.)
So you might go and write a generic, lexicographical list compare function: If you know how to compare two elements of type 'a, you know how to compare two lists of those, no matter what the element type is:
fun listCompare _ ([], []) = EQUAL (* empty lists are equal *)
  | listCompare _ ([], ys) = LESS (* empty is always smaller than non-empty *)
  | listCompare _ (xs, []) = GREATER (* non-empty is always greater than empty *)
  | listCompare compare (x::xs, y::ys) =
      case compare (x, y) of
           EQUAL => listCompare compare (xs, ys)
         | LESS => LESS
         | GREATER => GREATER
And now,
structure IntListListUtils = ListUtils(struct
type t = int list
val compare = listCompare Int.compare
end)
val example2 = IntListListUtils.nubOrd [[1,2,3],[1,2,3,2],[1,2,3]]
(* [[1,2,3],[1,2,3,2]] *)
So even though [1,2,3] and [1,2,3,2] share elements (and the latter repeats 2 internally), they are not EQUAL when you compare them as whole lists. But the third element is EQUAL to the first one, and so it gets removed as a duplicate.
Some last observations:
You may consider that even though each compare is only run O(log n) times, a single compare for some complex data structure, such as a (whatever * int) list list may still be expensive. So another improvement you can make here is to cache the result of every compare output, which is actually what Haskell's nubOrdOn operator does. ┳━┳ ヽ(ಠل͜ಠ)ノ
The functor approach is used extensively in Jane Street's OCaml Base library. The quick solution was to pass an 'a * 'a -> order function around every single time you nub something. One moral, though, is that while the module system does add verbosity, if you provide enough of this machinery in a standard library, it will become quite convenient.
If you think the improvement from O(n²) to O(n log n) is not enough, consider Fritz Henglein's Generic top-down discrimination for sorting and partitioning in linear time (2012) and Edward Kmett's Haskell discrimination package's nub for an O(n) nub.
Yes. This is possible in SML through use of parametric polymorphism. You want a function of most general type 'a list -> 'a list where 'a is a type variable (i.e., variable that ranges over types) that would be read as alpha.
For some more concrete examples of how you might apply this (the explicit type variable after fun is optional):
fun 'a id (x : 'a) : 'a = x
Here we have the identity function with type 'a -> 'a.
We can declare similar functions with some degree of specialisation of the types, for instance
fun map _ [] = []
| map f (x::xs) = f x :: map f xs
Here map has the most general type ('a -> 'b) -> 'a list -> 'b list, i.e., it takes two curried arguments, one of some function type and another of some list type (whose element type agrees with the function's domain), and returns a new list whose element type is the codomain of the function.
For your specific problem you'll probably also want to take an equality function in order to determine what is a "duplicate" or you'll probably restrict yourself to "equality types" (types that can be compared with op=, represented by type variables with two leading apostrophes, e.g., ''a).
Yes, SML provides polymorphism for such things. In many cases you don't actually care about the type of the items in your lists (or other structures). For instance, this function checks whether an item occurs in a list (the List structure already provides a similar List.exists, which takes a predicate rather than an item):
fun exists _ [] = false
| exists x (y :: l) = x = y orelse exists x l
Such a function works for any type of list as long as the equality operator is defined for the item type (such a type is called an equality type). You can do the same for remove_duplicates. To work with lists of items of non-equality types, you will have to give remove_duplicates an additional function that checks whether two items are equal.
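For equality types, the idea could be sketched as below. The name removeDuplicates and the filter-based approach are assumptions for illustration; the inferred type is ''a list -> ''a list, so the same function applies to int list, bool list or int list list, but not to lists of functions or reals.

```sml
(* Keep the first occurrence of each item and filter later copies out. *)
fun removeDuplicates [] = []
  | removeDuplicates (x :: xs) =
      x :: removeDuplicates (List.filter (fn y => y <> x) xs)

val ints   = removeDuplicates [1, 2, 1, 3, 2]          (* [1, 2, 3] *)
val bools  = removeDuplicates [true, false, true]      (* [true, false] *)
val nested = removeDuplicates [[1, 2], [3], [1, 2]]    (* [[1, 2], [3]] *)
```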

SMLNJ powerset function

I am trying to print the size of a list created from below power set function
fun add x ys = x :: ys;
fun powerset ([]) = [[]]
  | powerset (x::xr) = powerset xr @ map (add x) (powerset xr) ;
val it = [[],[3],[2],[2,3],[1],[1,3],[1,2],[1,2,3]] : int list list;
I have the list size function
fun size xs = (foldr op+ 0 o map (fn x => 1)) xs;
I haven't been able to combine these two functions to get a result like this:
[(0,[]),(1,[3]),(1,[2]),(2,[2,3]),(1,[1]),(2,[1,3]),(2,[1,2]),(3,[1,2,3])]
Could anyone please help me with this?
You can get the length of a list using the built-in List.length.
You seem to forget to mention that you have the constraint that you can only use higher-order functions. (I am guessing you have this constraint because others these days are asking how to write powerset functions with this constraint, and using foldr to count, like you do, seems a little constructed.)
Your example indicates that you are trying to count each list in a list of lists, and not just the length of one list. For that you'd want to map the counting function across your list of lists. But that'd just give you a list of lengths, and your desired output seems to be a list of tuples containing both the length and the actual list.
Here are some hints:
You might as well use foldl rather than foldr since addition is associative and commutative, so the order in which the elements are visited does not matter.
You don't need to first map (fn x => 1) - this adds an unnecessary iteration of the list. You're probably doing this because folding seems complicated and you only just managed to write foldr op+ 0. This is symptomatic of not having understood the first argument of fold.
Try, instead of op+, to write the fold expression using an anonymous function:
fun size L = foldl (fn (x, acc) => ...) 0 L
Compare this to op+ which, if written like an anonymous function, would look like:
fn (x, y) => x + y
Folding with op+ carries some very implicit uses of the + operator: You want to discard one operand (since not its value but its presence counts) and use the other one as an accumulating variable (which is better understood by calling it acc rather than y).
If you're unsure what I mean about accumulating variable, consider this recursive version of size:
fun size L =
let fun sizeHelper ([], acc) = acc
| sizeHelper (x::xs, acc) = sizeHelper (xs, 1+acc)
in sizeHelper (L, 0) end
Its helper function has an extra argument for carrying a result through recursive calls. This makes the function tail-recursive, and folding is one generalisation of this technique; the second argument to fold's helper function (given as an argument) is the accumulating variable. (The first argument to fold's helper function is a single argument rather than a list, unlike the explicitly recursive version of size above.)
Given your size function (aka List.length), you're only a third of the way, since
size [[],[3],[2],[2,3],[1],[1,3],[1,2],[1,2,3]]
gives you 8 and not [(0,[]),(1,[3]),(1,[2]),(2,[2,3]),...]
So you need to write another function that (a) applies size to each element, which would give you [0,1,1,2,...], and (b) somehow combine that with the input list [[],[3],[2],[2,3],...]. You could do that either in two steps using zip/map, or in one step using only foldr.
Try and write a foldr expression that does nothing to an input list L:
foldr (fn (x, acc) => ...) [] L
(Like with op+, doing op:: instead of writing an anonymous function would be cheating.)
Then think of each x as a list.
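For comparison, the map-based route mentioned above can be sketched in one line. It deliberately sidesteps the fold exercise by using the built-in length; the name withSizes is made up for illustration.

```sml
(* Pair every inner list with its length. *)
fun withSizes xss = map (fn xs => (length xs, xs)) xss

val sized = withSizes [[], [3], [2], [2, 3]]
(* [(0, []), (1, [3]), (1, [2]), (2, [2, 3])] *)
```

A fold-only version has the same shape as the identity fold hinted at above, with each x replaced by a pair.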

Sublists of N length function in Erlang style

I've been learning Erlang and tried completing some practise functions. I struggled making one function in particular and think it might be due to me not thinking "Erlang" enough.
The function in question takes a list and a sublist size, then produces a list of tuples containing the number of elements before the sublist, the sublist itself and the number of elements after the sublist. For example
sublists(1,[a,b,c])=:=[{0,[a],2}, {1,[b],1}, {2,[c],0}].
sublists(2,[a,b,c])=:=[{0,[a,b],1}, {1,[b,c],0}].
My working solution was
sublists(SubListSize, [H | T]) ->
Length = length(1, T),
sublists(SubListSize, Length, Length-SubListSize, [H|T], []).
sublists(_, _, -1, _, Acc) -> lists:reverse(Acc);
sublists(SubSize, Length, Count, [H|T], Acc) ->
Sub = {Length-SubSize-Count, grab(SubSize, [H|T],[]),Count},
sublists(SubSize, Length, Count-1, T, [Sub|Acc]).
length(N, []) -> N;
length(N, [_|T]) -> length(N+1, T).
grab(0, _, Acc) -> lists:reverse(Acc);
grab(N, [H|T], Acc) -> grab(N-1, T, [H|Acc]).
but it doesn't feel right and I wondered if there was a better way?
There was an extension that asked for the sublists function to be re-implemented using a list comprehension. My failed attempt was
sublist_lc(SubSize, L) ->
Length = length(0, L),
Indexed = lists:zip(L, lists:seq(0, Length-1)),
[{I, X, Length-1-SubSize} || {X,I} <- Indexed, I =< Length-SubSize].
As I understand it, list comprehensions can't see ahead, so I was unable to use my grab function from earlier. This again makes me think there must be a better way of solving this problem.
I show a few approaches below. All protect against the case where the requested sublist length is greater than the list length. All use functions from the standard lists module.
The first one uses lists:split/2 to capture each sublist and the length of the remaining tail list, and uses a counter C to keep track of how many elements precede the sublist. The length of the remaining tail list, named Rest, gives the number of elements that follow each sublist.
sublists(N,L) when N =< length(L) ->
sublists(N,L,[],0).
sublists(N,L,Acc,C) when N == length(L) ->
lists:reverse([{C,L,0}|Acc]);
sublists(N,[_|T]=L,Acc,C) ->
{SL,Rest} = lists:split(N,L),
sublists(N,T,[{C,SL,length(Rest)}|Acc],C+1).
The next one uses two lists of counters, one indicating how many elements precede the sublist and the other indicating how many follow it. The first is easily calculated by simply counting from 0 to the length of the input list minus the length of each sublist, and the second list of counters is just the reverse of the first. These counter lists are also used to control recursion; we stop when each contains only a single element, indicating we've reached the final sublist and can end the recursion. This approach uses the lists:sublist/2 call to obtain all but the final sublist.
sublists(N,L) when N =< length(L) ->
Up = lists:seq(0,length(L)-N),
Down = lists:reverse(Up),
sublists(N,L,[],{Up,Down}).
sublists(_,L,Acc,{[U],[D]}) ->
lists:reverse([{U,L,D}|Acc]);
sublists(N,[_|T]=L,Acc,{[U|UT],[D|DT]}) ->
sublists(N,T,[{U,lists:sublist(L,N),D}|Acc],{UT,DT}).
And finally, here's a solution based on a list comprehension. It's similar to the previous solution in that it uses two lists of counters to control iteration. It also makes use of lists:nthtail/2 and lists:sublist/2 to obtain each sublist, which admittedly isn't very efficient; no doubt it can be improved.
sublists(N,L) when N =< length(L) ->
Up = lists:seq(0,length(L)-N),
Down = lists:reverse(Up),
[{U,lists:sublist(lists:nthtail(U,L),N),D} || {U,D} <- lists:zip(Up,Down)].
Oh, and a word of caution: your code implements a function named length/2, which is somewhat confusing because it has the same name as the standard length/1 function. I recommend avoiding naming your functions the same as such commonly-used standard functions.

Simple functions for SML/NJ

I was required to write a set of functions for problems in class. I think the way I wrote them was a bit more complicated than they needed to be. I had to implement all the functions myself, without using any pre-defined ones. I'd like to know if there are any quick and easy "one line" versions of these answers?
Sets can be represented as lists. The members of a set may appear in any order on the list, but there shouldn't be more than one
occurrence of an element on the list.
(a) Define dif(A, B) to
compute the set difference of A and B, A-B.
(b) Define cartesian(A,
B) to compute the Cartesian product of set A and set B, { (a, b) |
a∈A, b∈B }.
(c) Consider the mathematical-induction proof of the
following: If a set A has n elements, then the powerset of A has 2^n
elements. Following the proof, define powerset(A) to compute the
powerset of set A, { B | B ⊆ A }.
(d) Define a function which, given
a set A and a natural number k, returns the set of all the subsets of
A of size k.
(* Takes in an element and a list and compares to see if element is in list*)
fun helperMem(x,[]) = false
| helperMem(x,n::y) =
if x=n then true
else helperMem(x,y);
(* Takes in two lists and gives back a single list containing unique elements of each*)
fun helperUnion([],y) = y
| helperUnion(a::x,y) =
if helperMem(a,y) then helperUnion(x,y)
else a::helperUnion(x,y);
(* Takes in an element and a list. Attaches new element to list or list of lists*)
fun helperAttach(a,[]) = []
  | helperAttach(a,b::y) = helperUnion([a],b)::helperAttach(a,y);
(* Problem 1-a *)
fun myDifference([],y) = []
  | myDifference(a::x,y) =
      if helperMem(a,y) then myDifference(x,y)
      else a::myDifference(x,y);
(* Problem 1-b *)
fun myCartesian(xs, ys) =
  let fun first(x,[]) = []
        | first(x, y::ys) = (x,y)::first(x,ys)
      fun second([], ys) = []
        | second(x::xs, ys) = first(x, ys) @ second(xs,ys)
  in second(xs,ys)
  end;
(* Problem 1-c *)
fun power([]) = [[]]
| power(a::y) = union(power(y),insert(a,power(y)));
I never got to problem 1-d, as these took me a while to get. Any suggestions on cutting these shorter? There was another problem that I didn't get, but I'd like to know how to solve it for future tests.
(staircase problem) You want to go up a staircase of n (>0) steps. At one time, you can go by one step, two steps, or three steps. But,
for example, if there is one step left to go, you can go only by one
step, not by two or three steps. How many different ways are there to
go up the staircase? Solve this problem with sml. (a) Solve it
recursively. (b) Solve it iteratively.
Any help on how to solve this?
Your set functions seem nice. I would not change anything principal about them except perhaps their formatting and naming:
fun member (x, []) = false
| member (x, y::ys) = x = y orelse member (x, ys)
fun dif ([], B) = []
| dif (a::A, B) = if member (a, B) then dif (A, B) else a::dif(A, B)
fun union ([], B) = B
| union (a::A, B) = if member (a, B) then union (A, B) else a::union(A, B)
(* Your cartesian looks nice as it is. Here is how you could do it using map: *)
local val concat = List.concat
val map = List.map
in fun cartesian (A, B) = concat (map (fn a => map (fn b => (a,b)) B) A) end
Your power is also very neat. If you call your function insert, it deserves a comment about inserting something into many lists. Perhaps insertEach or similar is a better name.
On your last task, since this is a counting problem, you don't need to generate the actual combinations of steps (e.g. as lists of steps), only count them. Using the recursive approach, try and write the base cases down as they are in the problem description.
I.e., make a function steps : int -> int where the numbers of ways to take 0, 1 and 2 steps are pre-calculated. For n steps, n > 2, every combination begins with either 1, 2 or 3 steps, so the number of combinations is the sum of the numbers of ways to take n-1, n-2 and n-3 steps respectively.
Using the iterative approach, start from the bottom and use parameterised counting variables. (Sorry for the vague hint here.)
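One possible reading of these hints, as a sketch: the base cases 1, 1 and 2 for 0, 1 and 2 steps are assumptions derived from the problem statement (there is one way to climb zero steps: do nothing), and the names steps and stepsIter are illustrative. Both assume n >= 0.

```sml
(* (a) Recursive: the first move covers 1, 2 or 3 steps. *)
fun steps 0 = 1
  | steps 1 = 1
  | steps 2 = 2
  | steps n = steps (n - 1) + steps (n - 2) + steps (n - 3)

(* (b) Iterative: a tail-recursive loop counting up from the bottom.
   a, b and c hold the counts for three consecutive staircase sizes. *)
fun stepsIter n =
  let
    fun loop (0, a, _, _) = a
      | loop (i, a, b, c) = loop (i - 1, b, c, a + b + c)
  in
    loop (n, 1, 1, 2)
  end
```

For example, both give 4 for n = 3 (1+1+1, 1+2, 2+1, 3) and 7 for n = 4. The recursive version recomputes subproblems and takes exponential time, while the loop runs in linear time.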

New to SML / NJ. Making a custom insert function

Define a function that, given a list L, an object x, and a positive
integer k, returns a copy of L with x inserted at the k-th position.
For example, if L is [a1, a2, a3] and k=2, then [a1, x, a2, a3] is
returned. If the length of L is less than k, insert at the end. For
this kind of problems, you are supposed not to use, for example, the
length function. Think about how the function computes the length. No
'if-then-else' or any auxiliary function.
I've figured out how to make a function to find the length of a list
fun mylength ([]) = 0
| mylength (x::xs) = 1+ mylength(xs)
But, as the questions states, I can't use this as an auxiliary function in the insert function. Also, i'm lost as to how to go about the insert function? Any help or guidance would be appreciated!
Here's how to do this. On each recursive call, you pass the tail of the list and k - 1, which is the position of the new element within that tail. When the list is empty, you construct a single-element list containing x; when k is 0, you prepend your element to what's left of the list. On the way back, you prepend all the heads of the list that you unwrapped before.
fun kinsert [] x k = [x]
| kinsert ls x 0 = x::ls
| kinsert (l::ls) x k = l::(kinsert ls x (k - 1))
I used a 0-indexed list; if you want 1-indexed, just replace 0 with 1.
As you can see, it's almost the same as your mylength function. The difference is that there are two base cases for recursion and your operation on the way back is not +, but ::.
Edit
You can call it like this
kinsert [1,2,3,4,5,6] 10 3;
It has 3 arguments; unlike your length function, it does not wrap arguments in a tuple.
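Since the exercise counts positions from 1, the 1-indexed variant mentioned above might look like this. The name kinsert1 is made up to keep it apart from the 0-indexed version; it assumes k is positive, as the exercise states.

```sml
(* Insert x at the k-th position (1-indexed); past the end, append. *)
fun kinsert1 [] x _ = [x]
  | kinsert1 ls x 1 = x :: ls
  | kinsert1 (l :: ls) x k = l :: kinsert1 ls x (k - 1)

val middle = kinsert1 [1, 2, 3] 0 2    (* [1, 0, 2, 3] *)
val atEnd  = kinsert1 [1, 2, 3] 0 10   (* [1, 2, 3, 0] *)
```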
Here's how I'd approach it. The following assumes that list positions are counted from zero.
fun mylength (lst,obj,pos) =
  case (lst,obj,pos) of
       ([],ob,po) => [ob]
     | (xs::ys,ob,0) => ob::lst
     | (xs::ys,ob,po) => xs::mylength(ys,obj,pos-1)
