Standard ML style: is variable shadowing good style? - functional-programming

In Standard ML, is variable shadowing acceptable in general, and in particular when pattern matching? For this toy example:
fun sum(xs) =
  case xs of
    [] => 0
  | x::xs => x + sum xs
Is the following better style?
fun sum(xs) =
  case xs of
    [] => 0
  | x::xs' => x + sum xs'
Without shadowing, one has to pick different names, which clutters the code, especially when nested patterns, let function bindings, and other language constructs are being used.
Thank you!

is variable shadowing good style?
No.
But xs and xs' aren't good either: they have the same type, so it is very easy to accidentally use one instead of the other. In your case this probably leads to infinite recursion, and that gets detected soon enough. But in other cases it might lead to more subtle bugs. This advice is not particular to functional programming.
Edit: For completeness, I'm including molbdnilo's suggestion of y::ys:
fun sum xs =
case xs of
  [] => 0
| y::ys => y + sum ys
An alternative is to only pattern match and bind the values you actually need. In your sum example you don't actually need the full input for anything other than splitting apart. So you might write it like
fun sum [] = 0
| sum (x::xs) = x + sum xs
Or with even more implicit pattern matching:
val sum = foldl op+ 0
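For readers who want to experiment, here is a rough OCaml transcription of the three shapes discussed above. The names sum_match, sum_clauses and sum_fold are made up for this sketch, and OCaml is used only as a close relative of Standard ML, not as the language of the question.

```ocaml
(* Explicit match with fresh names for head and tail, so the input [xs]
   and the tail [ys] cannot be mistaken for one another. *)
let rec sum_match xs =
  match xs with
  | [] -> 0
  | y :: ys -> y + sum_match ys

(* Clausal style: the full input list is never given a name at all. *)
let rec sum_clauses = function
  | [] -> 0
  | x :: xs -> x + sum_clauses xs

(* Fold: no pattern matching appears in user code. *)
let sum_fold = List.fold_left ( + ) 0
```

All three should agree on every input; the difference is only in how many names are in scope at once.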
Another example, Exercism's Bob exercise, can be solved by first sanitizing the input and then classifying it:
datatype diction = Yelling | Asking | YellingAsking | Whatever
fun strip message = ...
fun classify message = ...
fun answer diction = ...
val response = answer o classify o strip
Here the message of the function strip will contain unstripped whitespace and the message of the classify function won't. So instead of having multiple messages, one having whitespace and the other one not, you put them in separate scopes of functions that do separate things.

Related

F# Discriminated union with optional recursive component

I'm working with F# and I struggle building my business model.
Let's say I have a list of float and two types of transformations I can apply on it.
For example:
type Transformation =
| SMA of period:Period
| P2MA of period:Period
Then I have defined a function let compute Transformation list to compute any kind of transformation over a list.
With the above Transformation type, I can create a SMA(3) or a P2SMA(5), for example.
I would like to be able to nest the transformations so that I can write SMA(3, P2SMA(5, SMA(10))), for example. But I would also like to still be able to write just SMA(2).
I tried using options, but I think writing SMA(3, None) or SMA(3, Some(P2SMA(5))) is too verbose.
Is there any way to do that? Maybe my approach is wrong, as I'm new in F#, I may tackle the problem by the wrong way?
Thanks a lot for the help
Try my answer here.
It's not possible to overload discriminated union cases in exactly the way you want. But if you'll accept a very slightly different syntax, you could do this instead:
type Period = int
type SmaTransform =
| Sma of Period
| Sma' of Period * Transform
and P2smaTransform =
| P2sma of Period
| P2sma' of Period * Transform
and Transform =
| OfSma of SmaTransform
| OfP2Sma of P2smaTransform
let SMA(period) =
    Sma(period) |> OfSma
let SMA'(period, transform) =
    Sma'(period, transform) |> OfSma
let P2SMA(period) =
    P2sma(period) |> OfP2Sma
let P2SMA'(period, transform) =
    P2sma'(period, transform) |> OfP2Sma
let transforms =
    [|
        SMA(3)
        P2SMA(5)
        SMA'(3, P2SMA'(5, SMA(10)))
    |]
for transform in transforms do
    printfn "%A" transform
The only difference from your desired syntax is the apostrophe that denotes a nested transform.
I tried using options, but I think writing SMA(3, None) or SMA(3, Some(P2SMA(5))) is too verbose.
You can use a static member with an optional argument:
type Transformation =
    | [<EditorBrowsable(EditorBrowsableState.Never)>]
      SMAInternal of period:int * inner: Transformation option
    ...
    static member SMA(period:int, ?t:Transformation) =
        SMAInternal(period, t)
You can then write: Transformation.SMA(3) or Transformation.SMA(3, Transformation.P2SMA(5)). This takes more characters but has fewer constructs. You may or may not regard it as more concise.
I'm new in F#, I may tackle the problem by the wrong way?
If you are going to be defining hundreds of these things in a code file then using the above approach and shortening the name Transformation may be a good idea. Otherwise just use the Somes and Nones. Verbosity is a negligible consideration and if you start to worry about it, horrible things start happening.
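The optional-argument idea can be tried out in OCaml as well, which has optional labelled parameters. This is only a sketch in a different language, not the F# API: the type transformation and the helper functions sma and p2sma are hypothetical names chosen for illustration.

```ocaml
(* Each case carries a period and an optional inner transformation. *)
type transformation =
  | SMA of int * transformation option
  | P2SMA of int * transformation option

(* Smart constructors: [?inner] is None when the caller omits it. *)
let sma ?inner period = SMA (period, inner)
let p2sma ?inner period = P2SMA (period, inner)

(* No Some/None at the call site. *)
let plain = sma 2
let nested = sma ~inner:(p2sma ~inner:(sma 10) 5) 3
```

The data model still stores an option; only the call sites get shorter.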
I don't understand quite what you're trying to model, but here is something hopefully similar.
My Transformations either multiply a number or add to it
type Trans =
| Mult of period: int
| Add of period: int
and I can now write an interpret function, that given a number and a transformation, I can interpret it
let interpret x trans =
    match trans with
    | Mult p -> p * x
    | Add p -> p + x
so we can now do simple
let x = interpret 1 (Mult 2)
but you want to chain transformations?
so lets allow that..
let interprets xs x =
    List.fold (fun state trans ->
        interpret state trans) x xs
and we can go...
let foo = [ Mult 3; Add 2 ]
let bar = interprets foo 1
OK, so IF you really want to deal with these compositions of lists of transformations uniformly, which may be nice (a bit like function composition).
Then I would be tempted to go (note I'm trying to follow your coding style)
(there's quite a lot to take in here, so maybe stick with the above approach until you're happy you understand F# a bit better).
type Trans =
| Mult of period: int
| Add of period: int
| Compose of chain: List<Trans>
let rec interpret x trans =
    let interprets xs x =
        List.fold (fun state trans ->
            interpret state trans) x xs
    match trans with
    | Mult p -> p * x
    | Add p -> p + x
    | Compose ps ->
        interprets ps x
let two = interpret 1 (Mult 2)
let three = interpret 1 (Compose [ Mult 2; Add 1 ])
Now I think you have a data model that "works", and is pretty simple.
I wouldn't then try to change the data model to make your code convenient; I'd create utility functions (smart constructors) to do that.
e.g.
let multThen x trans = Compose [ Mult x; trans ]
let addThen x trans = Compose [ Add x; trans ]
The advice though is to make the data model model the data in the simplest manner, and then use functions to make your code elegant, and map in and out of that model, often the two things look quite different.
Caveat: I haven't tested some of this code
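Since the code above is flagged as partly untested, here is a small OCaml transcription that can be run to check the idea. It is a sketch in a sibling language, not the F# code itself; the names mult_then and add_then mirror the smart constructors above, and six is an extra example value added for this sketch.

```ocaml
type trans =
  | Mult of int
  | Add of int
  | Compose of trans list

(* A Compose node threads the running value through its chain, left to right. *)
let rec interpret x trans =
  match trans with
  | Mult p -> p * x
  | Add p -> p + x
  | Compose ps -> List.fold_left interpret x ps

(* Smart constructors: the data model stays simple, call sites stay short. *)
let mult_then x trans = Compose [ Mult x; trans ]
let add_then x trans = Compose [ Add x; trans ]

let two = interpret 1 (Mult 2)
let three = interpret 1 (Compose [ Mult 2; Add 1 ])
let six = interpret 1 (mult_then 2 (add_then 1 (Mult 2)))
```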

How to recursively make a list of lists in SML/NJ

I'm brand new to SML/NJ and I'm trying to make a recursive function
that makes a listOfLists. Ex: listOf([1,2,3,4]) will output
[[1],[2],[3],[4]]. I've found a recursive merge in SML/NJ, and I'm
trying to use it as kind of an outline:
- fun merge(xs,nil) = xs
= | merge(nil,ys) = ys
= | merge(x::xs, y::ys) =
= if (x < y) then x::merge(xs, y::ys) else y::merge(x::xs,ys);
- fun listOf(xs) = xs
= | listOf(x::xs) = [x]::listOf(xs);
I'm trying to use pattern match and I'm a little confused on it. I'm
pretty sure x is the head and then xs is the tail, but I could be
wrong. So what I'm trying to do is use the head of the list, make it a
list, and then add it to the rest of the list. But when trying to do
this function, I get the error:
stdIn:15.19-15.34 Error: operator and operand don't agree [circularity]
operator domain: 'Z list * 'Z list list
operand: 'Z list * 'Z list
in expression:
(x :: nil) :: listOf xs
This error is foreign to me because I don't have really any experience
with sml/nj. How can I fix my listOf function?
You are fairly close. The problem is that in pattern-matching, a pattern like xs (just a variable) can match anything. The fact that you end it with s doesn't mean that the pattern can only match a tail of a list. Using s in that way is just a programmer convention in SML.
Thus, in your definition:
fun listOf(xs) = xs
| listOf(x::xs) = [x]::listOf(xs);
The first line tells SML to return all values unchanged, which is clearly not your intent. SML detects that this is inconsistent with the second line where you are trying to change a value after all.
You need to change that first line so that it doesn't match everything. Looking at that merge function as a template, you need something which matches a basis case. The natural basis case is nil (which can also be written as []). Note the role that nil plays in the definition of merge. If you use nil instead of xs for the pattern in the first line of your function definition, your second line does exactly what you want and the function will work as intended:
fun listOf(nil) = nil
| listOf(x::xs) = [x]::listOf(xs);
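The same rule holds in OCaml, where a bare variable pattern also matches anything, so the corrected function carries over directly. The name list_of is just this sketch's spelling of listOf.

```ocaml
(* The base case matches only the empty list; the second case wraps the
   head in a singleton list and recurses on the tail. *)
let rec list_of = function
  | [] -> []
  | x :: xs -> [ x ] :: list_of xs
```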

What is a good way to define a finite multiplication table in Isar?

Suppose I have a binary operator f :: "sT => sT => sT". I want to define f so that it implements a 4x4 multiplication table for the Klein four group, shown here on the Wiki:
http://en.wikipedia.org/wiki/Klein_four-group
Here, all I'm attempting to do is create a table with 16 entries. First, I define four constants like this:
consts
k_1::sT
k_a::sT
k_b::sT
k_ab::sT
Then I define my function to implement the 16 entries in the table:
k_1 * k_1 = k_1
k_1 * k_a = k_a
...
k_ab * k_ab = k_1
I don't know how to do any normal-like programming in Isar, and I've seen on the Isabelle user's list where it was said that (certain) programming-like constructs have been intentionally de-emphasized in the language.
The other day, I was trying to create a simple, contrived function, and after finding the use of if, then, else in a source file, I couldn't find a reference to those commands in isar-ref.pdf.
In looking at the tutorials, I see definition for defining functions in a straightforward way, and other than that, I only see information on recursive and inductive functions, which require datatype, and my situation is more simple than that.
If left to my own devices, I guess I would try and define a datatype for those 4 constants shown above, and then create some conversion functions so that I end up with a binary operator f :: sT => sT => sT. I messed around a little with trying to use fun, but it wasn't turning out to be a simple deal.
I had done a little experimenting with fun and inductive
UPDATE: I add some material here in response to the comment telling me that Programming and Proving is where I'll find the answers. It seems I might be going astray of the ideal Stackoverflow format.
I had done some basic experimenting, mainly with fun, but also with inductive. I gave up on inductive fairly fast. Here's the type of error I got from simple examples:
consts
k1::sT
inductive k4gI :: "sT => sT => sT" where
"k4gI k1 k1 = k1"
--"OUTPUT ERROR:"
--{*Proofs for inductive predicate(s) "k4gI"
Ill-formed introduction rule ""
((k4gI k1 k1) = k1)
Conclusion of introduction rule must be an inductive predicate
*}
My multiplication table isn't inductive, so I didn't see that inductive was what I should spend my time chasing.
"Pattern matching" seems a key idea here, so I experimented with fun. Here's some really messed up code trying to use fun with only a standard function type:
consts
k1::sT
fun k4gF :: "sT => sT => sT" where
"k4gF k1 k1 = k1"
--"OUTPUT ERROR:"
--"Malformed definition:
Non-constructor pattern not allowed in sequential mode.
((k4gF k1 k1) = k1)"
I got that kind of error, and I had read things like this in Programming and Proving:
"Recursive functions are defined with fun by pattern matching over datatype constructors."
That all gives a novice the impression that fun requires datatype. As for its big brother, function, I don't know about that.
It seems here, all I need is a recursive function with 16 base cases, and that would define my multiplication table.
Is function the answer?
In editing this question, I remembered function from the past, and here's function at work:
consts
k1::sT
function k4gF :: "sT => sT => sT" where
"k4gF k1 k1 = k1"
try
The output of try is telling me it can be proved (Update: I think it's actually telling me that only 1 of the proof steps can be proved.):
Trying "solve_direct", "quickcheck", "try0", "sledgehammer", and "nitpick"...
Timestamp: 00:47:27.
solve_direct: (((k1, k1) = (k1, k1)) ⟹ (k1 = k1)) can be solved directly with
HOL.arg_cong: ((?x = ?y) ⟹ ((?f ?x) = (?f ?y))) [name "HOL.arg_cong", kind "lemma"]
HOL.refl: (?t = ?t) [name "HOL.refl"]
MFZ.HOL⇣'eq⇣'is⇣'reflexive: (?r = ?r) [name "MFZ.HOL⇣'eq⇣'is⇣'reflexive", kind "theorem"]
MFZ.HOL_eq_is_reflexive: (?r = ?r) [name "MFZ.HOL_eq_is_reflexive", kind "lemma"]
Product_Type.Pair_inject:
(⟦((?a, ?b) = (?a', ?b')); (⟦(?a = ?a'); (?b = ?b')⟧ ⟹ ?R)⟧ ⟹ ?R)
[name "Product_Type.Pair_inject", kind "lemma"]
I don't know what that means. I only know about function because of trying to prove an inconsistency. I only know it doesn't complain as much. If using function like this is how I define my multiplication table, then I'm happy.
Still, being an argumentative type, I didn't learn about function in a tutorial. I learned about it several months ago in a reference manual, and I still don't know much about how to use it.
I have a function which I prove with auto, but the function is probably no good, fortunately. That adds to the function's mystery. There's information on function in Defining Recursive Functions in Isabelle/HOL, and it compares fun and function.
However, I haven't seen one example of fun or function that doesn't use a recursive datatype, such as nat or 'a list. Maybe I didn't look hard enough.
Sorry for being verbose and this not ending up as a direct question, but there's no tutorial with Isabelle that takes a person directly from A to B.
Below, I don't adhere to an "only answer the question" format, but I am responding to my own question, and so everything I say will be of interest to the original poster.
(2nd update begin)
This should be my last update. To be content with "unsophisticated methods", it helps to be able to make comparisons to see the "low tech" way can be the best way.
I finally quit trying to make my main type work with the new type, and I just made me a Klein four-group out of a datatype like this, where the proof of associativity is at the end:
datatype AT4k = e4kt | a4kt | b4kt | c4kt
fun AOP4k :: "AT4k => AT4k => AT4k" where
"AOP4k e4kt y = y"
| "AOP4k x e4kt = x"
| "AOP4k a4kt a4kt = e4kt"
| "AOP4k a4kt b4kt = c4kt"
| "AOP4k a4kt c4kt = b4kt"
| "AOP4k b4kt a4kt = c4kt"
| "AOP4k b4kt b4kt = e4kt"
| "AOP4k b4kt c4kt = a4kt"
| "AOP4k c4kt a4kt = b4kt"
| "AOP4k c4kt b4kt = a4kt"
| "AOP4k c4kt c4kt = e4kt"
notation
AOP4k ("AOP4k") and
AOP4k (infixl "*" 70)
theorem k4o_assoc2:
"(x * y) * z = x * (y * z)"
by(smt AOP4k.simps(1) AOP4k.simps(10) AOP4k.simps(11) AOP4k.simps(12)
AOP4k.simps(13) AOP4k.simps(2) AOP4k.simps(3) AOP4k.simps(4) AOP4k.simps(5)
AOP4k.simps(6) AOP4k.simps(7) AOP4k.simps(8) AOP4k.simps(9) AT4k.exhaust)
The consequence is that I am now content with my if-then-else multiplication function. Why? Because the if-then-else function is very conducive to simp magic. This pattern matching doesn't work any magic in and of itself, not to mention that I would still have to work out the coercive subtyping part of it.
Here's the if-then-else function for the 4x4 multiplication table:
definition AO4k :: "sT => sT => sT" where
"AO4k x y =
(if x = e4k then y else
(if y = e4k then x else
(if x = y then e4k else
(if x = a4k & y = c4k then b4k else
(if x = b4k & y = c4k then a4k else
(if x = c4k & y = a4k then b4k else
(if x = c4k & y = b4k then a4k else
c4k)))))))"
Because of the one nested if-then-else statement, when I run auto, it produces 64 goals. I made 16 simp rules, one for every value in the multiplication table, so when I run auto, with all the other simp rules, the auto proof takes about 90ms.
Low tech is the way to go sometimes; it's a RISC vs. CISC thing, somewhat.
A small thing like a multiplication table can be important for testing things, but it can't be useful if it's gonna slow my THY down because it's in some big loop that takes forever to finish.
(2nd update end)
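Outside Isabelle, the datatype-plus-pattern-matching version of the table is easy to sanity-check by brute force. The following OCaml sketch is an illustration only: the type k4, its constructors E, A, B, C and the function mul are names invented here, and an exhaustive run over the 64 triples is a test of this OCaml function, not a substitute for the Isabelle proof.

```ocaml
(* The four elements of the Klein four-group. *)
type k4 = E | A | B | C

(* The 16-entry multiplication table, written with or-patterns. *)
let mul x y =
  match x, y with
  | E, z | z, E -> z
  | A, A | B, B | C, C -> E
  | A, B | B, A -> C
  | A, C | C, A -> B
  | B, C | C, B -> A

let elements = [ E; A; B; C ]

(* Check (x * y) * z = x * (y * z) for every triple. *)
let associative =
  List.for_all
    (fun x ->
      List.for_all
        (fun y ->
          List.for_all
            (fun z -> mul (mul x y) z = mul x (mul y z))
            elements)
        elements)
    elements
```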
(Update begin)
(UPDATE: My question above falls under the category "How do I do basic programming in Isabelle, like with other programming languages?" Here, I go beyond the specific question some, but I try to keep my comments about the challenge to a beginner who is trying to learn Isabelle when the docs are at the intermediate level, at least, in my opinion they are.
Specific to my question, though, is that I have need for a case statement, which is a very basic feature of many, many programming languages.
In looking for a case statement today, I thought I hit gold after doing one more search in the docs, this time in Isabelle - A Proof Assistant for
Higher-Order Logic.
On page 5 it documents a case statement, but on page 18, it clarifies that it's only good for datatype, and I seem to confirm that with an error like this:
definition k4oC :: "kT => kT => kT" (infixl "**" 70) where
"k4oC x y = (case x y of k1 k1 => k1)"
--{*Error in case expression:
Not a datatype constructor: "i130429a.k1"
In clause
((k1 k1) ⇒ k1)*}
This is an example that a person, whether expert or beginner, has a need for a tutorial to run through the basic programming features of Isabelle.
If you say, "There are tutorials that do that." I say, "No, there aren't, not in my opinion".
The tutorials emphasize the important, sophisticated features of Isabelle that separate it from the crowd.
That's commentary, but it's commentary meant to tie into the question, "How do I learn Isabelle?", and which my original question above is related to.
The way you learn Isabelle without being a PhD graduate student at Cambridge, TUM, or NICTA, is you struggle for 6 to 12 months or more. If during that time you don't abandon, you can be at a level that will allow you to appreciate the intermediate level instruction available. Experience may vary.
For me, the 3 books that will take me to the next level of proving, weaning me off of auto and metis, when I find time to go through them, are
Isabelle - A Proof Assistant for Higher-Order Logic
Programming and Proving in Isabelle/HOL
Isabelle/Isar --- a versatile environment for human-readable formal proof documents
If someone says, "You've abused the Stackoverflow answer format by engaging in long-winded commentary and opinion."
I say, "Well, I asked for a good way to do some basic programming in Isabelle, where I was hoping for something more sophisticated than a big if-then-else statement. No one provided anything close to what I asked for. In fact, I am who provided a pattern matching function, and what I needed to do it is not even close to being documented. Pattern matching is a simple concept, but not necessarily in Isabelle, due to the proof requirements for recursive functions. (If there's a simple way to do it to replace my if-then-else function below, or even a case statement way, I'd sure like to know.)
Having said that, I am inclined to take some liberties, and there are, at this time, only 36 views for this page anyway, of which probably, at least 10 come from my browser.
Isabelle/HOL is a powerful language. I'm not complaining. It only sounds like it.)
(Update end)
It can count for a lot just to know that something is true or false, in this case being told that function can work with non-inductive types. However, how I end up using function below is not a result of anything I've seen in any one Isabelle document, and I had need for this former SO question on coercive subtyping:
What is an Isabelle/HOL subtype? What Isar commands produce subtypes?
I end up with two ways that I completed a 2x2 part of my multiplication table. I link here to the theory: as ASCII friendly A_i130429a.thy, jEdit friendly i130429a.thy, the PDF, and folder.
The two ways are:
The clumsy but fast and simp friendly if-then-else way. The definition takes 0ms, and the proof takes 155ms.
The pattern matching way using function. Here I could think aloud in public for a long time about this way of doing things, but I won't. I know I'll use what I've learned here, but it's definitely not an elegant solution for a simple multiplication table function, and it's far from obvious that a person would have to do all that to create a basic function that uses pattern matching. Of course, maybe I don't have to do all that. The definition takes 391ms, and the proof takes 317ms.
As to having to resort to using if-then-else, either Isabelle/HOL is not feature rich when it comes to basic programming statements, or these basic statements aren't documented. The if-then-else statement is not even in the Isar Reference Manual index. I think, "If it's not documented, maybe there's a nice, undocumented case statement like Haskell has". Still, I'd take Isabelle over Haskell any day.
Below, I explain the different sections of A_i130429a.thy. It's sort of trivial, but not completely, since I haven't seen an example to teach me how to do that.
I start with a type and four constants, which remain undefined.
typedecl kT
consts
k1::kT
ka::kT
kb::kT
kab::kT
Of note is that the constants remain undefined. That I'm leaving a lot of things undefined is part of why I have problems finding good examples in docs and sources to use as templates for myself.
I do a test to try and intelligently use function on a non-inductive datatype, but it doesn't work. With my if-then-else function, after I figure out I'm not restricting my function domain, I then see that the problem with this function was also with the domain. The function k4f0 is wanting x to be k1 or ka for every x, which obviously is not true.
function k4f0 :: "kT => kT" where
"k4f0 k1 = k1"
| "k4f0 ka = ka"
apply(auto)
apply(atomize_elim)
--"goal (1 subgoal):
1. (!! (x::sT). ((x = k1) | (x = ka)))"
I give up and define me an ugly function with if-then-else.
definition k4o :: "kT => kT => kT" (infixl "**" 70) where
"k4o x y =
(if x = k1 & y = k1 then k1 else
(if x = k1 & y = ka then ka else
(if x = ka & y = k1 then ka else
(if x = ka & y = ka then k1 else (k1)
))))"
declare k4o_def [simp add]
The hard part becomes trying to prove associativity of my function k4o. But that's only because I'm not restricting the domain. I put in an implication into the statement, and the auto magic kicks in, the fastforce magic is there also, and faster, so I use it.
abbreviation k4g :: "kT set" where
"k4g == {k1, ka}"
theorem
"(x \<in> k4g & y \<in> k4g & z \<in> k4g) --> (x ** y) ** z = x ** (y ** z)"
by(fastforce)(*155ms*)
The magic makes me happy, and I'm then motivated to try and get it done with function and pattern matching. Because of the recent SO answer on coercive subtyping, linked to above, I figure out how to fix the domain with typedef. I don't think it's the perfect solution, but I definitely learned something.
typedef kTD = "{x::kT. x = k1 | x = ka}"
by(auto)
declare [[coercion_enabled]]
declare [[coercion Abs_kTD]]
function k4f :: "kTD => kTD => kT" (infixl "***" 70) where
"k4f k1 k1 = k1"
| "k4f k1 ka = ka"
| "k4f ka k1 = ka"
| "k4f ka ka = k1"
by((auto),(*391ms*)
(atomize_elim),
(metis (lifting, full_types) Abs_kTD_cases mem_Collect_eq),
(metis (lifting, full_types) Rep_kTD_cases Rep_kTD_inverse mem_Collect_eq),
(metis (lifting, full_types) Rep_kTD_cases Rep_kTD_inverse mem_Collect_eq),
(metis (lifting, full_types) Rep_kTD_cases Rep_kTD_inverse mem_Collect_eq),
(metis (lifting, full_types) Rep_kTD_cases Rep_kTD_inverse mem_Collect_eq))
termination
by(metis "termination" wf_measure)
theorem
"(x *** y) *** z = x *** (y *** z)"
by(smt
Abs_kTD_cases
k4f.simps(1)
k4f.simps(2)
k4f.simps(3)
k4f.simps(4)
mem_Collect_eq)(*317ms*)
A more or less convenient syntax for defining a "finite" function is the function update syntax: For a function f, f(x := y) represents the function %z. if z = x then y else f z. If you want to update more than one value, separate them with commas: f(x1 := y1, x2 := y2).
So, for example, a function which is addition on 0 and 1 and undefined elsewhere could be written as:
undefined (0 := undefined(0 := 0, 1 := 1),
1 := undefined(0 := 1, 1 := 2))
Another possibility to define a finite function is to generate it from a list of pairs; for example with map_of. With f xs y z = the (map_of xs (y,z)), then the above function could be written as
f [((0,0),0), ((0,1),1), ((1,0),1), ((1,1),2)]
(Actually, it is not quite the same function, as it might behave differently outside the defined domain).
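Both ideas, function update and a table of pairs, have simple analogues in an ordinary functional language. The OCaml sketch below is only an analogy: update, nowhere and of_pairs are hypothetical helpers, and raising an exception is merely a stand-in for Isabelle's undefined, which is an unspecified value rather than an error.

```ocaml
(* [update f x y] plays the role of f(x := y). *)
let update f x y = fun z -> if z = x then y else f z

(* Stand-in for "undefined": every call fails. *)
let nowhere _ = failwith "undefined"

(* Addition on 0 and 1, built row by row from updates. *)
let add01_update =
  let row0 = update (update nowhere 0 0) 1 1 in
  let row1 = update (update nowhere 0 1) 1 2 in
  update (update nowhere 0 row0) 1 row1

(* The same function from a list of pairs; List.assoc raises Not_found
   outside the table. *)
let of_pairs table y z = List.assoc (y, z) table

let add01_table =
  of_pairs [ ((0, 0), 0); ((0, 1), 1); ((1, 0), 1); ((1, 1), 2) ]
```

As in the Isabelle case, the two definitions agree on the table but differ in how they fail outside it.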

Keeping a counter at each recursive call in OCaml

I am trying to write a function that returns the index of the passed value v in a given list x; -1 if not found. My attempt at the solution:
let rec index (x, v) =
let i = 0 in
match x with
[] -> -1
| (curr::rest) -> if(curr == v) then
i
else
succ i; (* i++ *)
index(rest, v)
;;
This is obviously wrong to me (it will return -1 every time) because it redefines i at each pass. I have some obscure ways of doing it with separate functions in my head, none which I can write down at the moment. I know this is a common pattern in all programming, so my question is, what's the best way to do this in OCaml?
Mutation is not a common way to solve problems in OCaml. For this task, you should use recursion and accumulate results by changing the index i on certain conditions:
let index(x, v) =
  let rec loop x i =
    match x with
    | [] -> -1
    | h::t when h = v -> i
    | _::t -> loop t (i+1)
  in loop x 0
Another thing is that using -1 as an exceptional case is not a good idea. You may forget this assumption somewhere and treat it like any other index. In OCaml, it's better to handle this case using the option type, so the compiler forces you to take care of None every time:
let index(x, v) =
  let rec loop x i =
    match x with
    | [] -> None
    | h::t when h = v -> Some i
    | _::t -> loop t (i+1)
  in loop x 0
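To illustrate what "forces you to take care of None" means in practice, here is a small usage sketch. The function describe is a hypothetical caller added for this example; index is repeated so that the block is self-contained.

```ocaml
let index (x, v) =
  let rec loop x i =
    match x with
    | [] -> None
    | h :: _ when h = v -> Some i
    | _ :: t -> loop t (i + 1)
  in
  loop x 0

(* The result cannot be used as an int without handling both cases. *)
let describe xs v =
  match index (xs, v) with
  | Some i -> "found at index " ^ string_of_int i
  | None -> "not found"
```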
This is pretty clearly a homework problem, so I'll just make two comments.
First, values like i are immutable in OCaml. Their values don't change. So succ i doesn't do what your comment says. It doesn't change the value of i. It just returns a value that's one bigger than i. It's equivalent to i + 1, not to i++.
Second the essence of recursion is to imagine how you would solve the problem if you already had a function that solves the problem! The only trick is that you're only allowed to pass this other function a smaller version of the problem. In your case, a smaller version of the problem is one where the list is shorter.
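The first point can be checked in a couple of lines. This is a minimal sketch; the names i and j are arbitrary.

```ocaml
let i = 0

(* succ returns a new value; the binding i is left as it was. *)
let j = succ i
```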
You can't mutate variables in OCaml (well, there is a way but you really shouldn't for simple things like this)
A basic trick you can do is create a helper function that receives extra arguments corresponding to the variables you want to "mutate". Note how I added an extra parameter for the i and also "mutate" the current list head in a similar way.
let rec index_helper (x, vs, i) =
  match vs with
    [] -> -1
  | (curr::rest) ->
      if curr = x then (* structural equality; == is physical equality *)
        i
      else
        index_helper (x, rest, i+1)
;;
let index (x, vs) = index_helper (x, vs, 0) ;;
This kind of tail-recursive transformation is a way to translate loops to functional programming but to be honest it is kind of low level (you have full power but the manual recursion looks like programming with gotos...).
For some particular patterns what you can instead try to do is take advantage of reusable higher order functions, such as map or folds.
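As one possible illustration of the fold route, the sketch below threads a pair of (current position, result so far) through List.fold_left. It returns an option rather than -1, which is a choice made for this example, and unlike the explicit recursion it always walks the whole list.

```ocaml
(* The accumulator is (next position, first match found so far). *)
let index (xs, v) =
  let _, found =
    List.fold_left
      (fun (i, found) x ->
        match found with
        | Some _ -> (i, found)
        | None -> if x = v then (i, Some i) else (i + 1, None))
      (0, None) xs
  in
  found
```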

Could someone explain these Haskell functions to me?

I've dabbled with Haskell in the past, and recently got back into it seriously, and I'm reading Real World Haskell. Some of the examples they've shown, I've yet to understand. Such as this one:
myLength [] = 0
myLength (x:xs) = 1 + myLength (xs)
I don't see how this works; what is 1 really being added to? How is the recursion returning something that can be added to? I don't get it.
And here we have this one:
splitLines [] = []
splitLines cs =
let (pre, suf) = break isLineTerminator cs
in pre : case suf of
('\r':'\n':rest) -> splitLines rest
('\r':rest) -> splitLines rest
('\n':rest) -> splitLines rest
_ -> []
isLineTerminator c = c == '\r' || c == '\n'
How does this work; what is pre really being attached to? I don't see how the result of the case expression is something that pre can be concatenated to. Maybe I just need someone to explain the evaluation of these functions in detail. I must be missing something very vital.
Thanks in advance!
EDIT: I know, it was a copy-paste fail. Sorry.
EDIT 2: It seems my confusion was with what these functions were actually /returning/ I have it all worked out now. Thanks for the answers guys, it finally clicked! I appreciate it!
As for the first one, it's a very basic form of recursion. However, it seems to be missing a part:
myLength [] = 0
It works by peeling off one element at a time from the list and adding one to the result. To visualise, consider the call
myLength [1,2,3]
which will evaluate to:
1 + myLength [2,3]
1 + 1 + myLength [3]
1 + 1 + 1 + myLength []
1 + 1 + 1 + 0
which is 3.
As for the second one, well, you have already split the string at the next line break into two parts: pre and suf. Now, suf will start with either a \n, or a \r, or a \r\n. We want to remove these. So we use pattern matching. See how the rest variable is essentially the suf variable minus the initial line break character(s).
So we have pre, which is the first line, and rest, which is the rest of the text. So in order to continue splitting rest into lines we call splitLines on it recursively and concatenate the result to pre.
To visualize, say you have the string "foo\nbar\r\nbaz".
So, when calling, the result will be:
[ pre => foo, suf => \nbar\r\nbaz, rest => bar\r\nbaz ]
foo : splitLines bar\r\nbaz
then splitLines is called again, and the result is expanded into:
[ pre => bar, suf => \r\nbaz, rest => baz ]
foo : bar : splitLines baz
then once again:
[ pre => baz, suf => [] ]
foo : bar : baz : []
which is the final result.
I think the definition of myLength misses the case where the list is empty:
myLength [] = 0
myLength (x:xs) = 1 + myLength (xs)
With this definition, the myLength of an empty list is 0. The (x:xs) pattern unpacks a list into the first item, x, and a list with the rest of the items, xs. If the list has one item, then xs is an empty list, so the result is 1 + 0. And so on.
Recursion is easiest to understand when you look at the base case first, and then see how each level of recursion builds on the result. (The base case is the case where the function does not call itself. If a recursive function doesn't have a base case, the recursion never terminates.)
In the second example, the base case (the last case in the case expression) also yields an empty list. So pre will always be prepended to a list, which will yield a new, longer list.
Re: myLength (x:xs) = 1 + myLength (xs) -- this is "half" of the definition of myLength, it says, by pattern match, that if the argument has a head and a tail, then the result is one more than the recursive tail call on the tail -- there needs to be another half to say that the result is 0 when the argument cannot match x:xs, i.e., when the argument is an empty list.
In the second case, the possibility of different patterns matching is just made a bit more explicit via case.
BTW, laziness is not a key issue here -- ML, with eager evaluation but pattern matching much like Haskell, would work very similarly. Looks like pattern matching is what you really need to brush up on.
First of all the first example should be like this (edit: it looks like you fixed it now):
myLength [] = 0
myLength (x:xs) = 1 + myLength (xs)
It works like this: say I give it a list with three items, it returns one plus the length of the tail (which is one plus the length of the tail (which is one plus the length of the tail, (which is [] at this point), which is 1), which is 2), which is 3 (the final answer). Maybe nested parentheses will help you understand it. ;-)
It is instructive to look at what the type signatures of the functions would be. They could be:
myLength :: [a] -> Int
In myLength, 1 is being added to the result of the recursive call to myLength, which is an Int, which in turn results in an Int.
splitLines :: [Char] -> [[Char]]
In splitLines, pre (a [Char]) is being prepended to the result of the case statement, which, from looking at the cases, is either the result of a recursive call to splitLines, which is [[Char]]; or an empty list. In both cases, prepending pre (a [Char]) will result in a [[Char]] in turn.
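The same type-driven reading works in an eager ML-family language. The OCaml sketch below transcribes both functions; break, explode and implode are helpers written here for illustration (OCaml's standard library has no break on lists), and the string conversions assume OCaml 4.07 or later. Here my_length has type 'a list -> int and split_lines has type char list -> char list list.

```ocaml
(* 1 is added to the int returned by the recursive call. *)
let rec my_length : 'a list -> int = function
  | [] -> 0
  | _ :: xs -> 1 + my_length xs

let is_line_terminator c = c = '\r' || c = '\n'

(* Split a list at the first element satisfying p. *)
let rec break p = function
  | [] -> ([], [])
  | x :: rest as xs ->
    if p x then ([], xs)
    else
      let pre, suf = break p rest in
      (x :: pre, suf)

(* pre (a char list) is consed onto a char list list in every branch. *)
let rec split_lines = function
  | [] -> []
  | cs ->
    let pre, suf = break is_line_terminator cs in
    pre
    :: (match suf with
        | '\r' :: '\n' :: rest -> split_lines rest
        | '\r' :: rest -> split_lines rest
        | '\n' :: rest -> split_lines rest
        | _ -> [])

(* Conversions between string and char list, for trying it out. *)
let explode s = List.of_seq (String.to_seq s)
let implode cs = String.of_seq (List.to_seq cs)
```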
