How do you prove termination of a recursive list length? - recursion

Suppose we have a list:
List = nil | Cons(car cdr:List).
Note that I am talking about modifiable lists!
And a trivial recursive length function:
recursive Length(List l) = match l with
| nil => 0
| Cons(car cdr) => 1 + Length cdr
end.
Naturally, it terminates only when the list is non-circular:
inductive NonCircular(List l) = {
empty: NonCircular(nil) |
\forall head, tail: NonCircular(tail) => NonCircular (Cons(head tail))
}
Note that this predicate, being implemented as a recursive function, also does not terminate on a circular list.
Usually I see proofs of list traversal termination that use list length as a bounded decreasing factor. They suppose that Length is non-negative. But, as I see it, this fact (Length l >= 0) follows from the termination of Length on the first place.
How do you prove, that the Length terminates and is non-negative on NonCircular (or an equivalent, better defined predicate) lists?
Am I missing an important concept here?

Unless the length function has cycle detection there is no guarantee it will halt!
For a singly linked list one uses the Tortoise and hare algorithm to determine the length where there is a chance there might be circles in the cdr.
It's just two cursors, the tortoise starts at first element and the hare starts at the second. Tortoise moves one pointer at a time while the hare moves two (if it can). The hare will eventually either be the same as the tortoise, which indicates a cycle, or it will terminate knowing the length is 2*steps or 2*steps+1.
Compared to finding cycles in a tree this is very cheap and performs just as well on terminating lists as a function that does not have cycle detection.

The definition of List that you have on top doesn't seem to permit circular lists. Each call to the "constructor" Cons will create a new pointer, and you aren't allowed to modify the pointer later to create the circularity.
You need a more sophisticated definition of List if you want to handle circularity. You probably need to define a Cell containing data value and an address, and a Node which contains a Cell and an address pointing to the previous node, and then you'll need to define the dereferencing operator to go back from addresses to Cells. You can also try to define non-circular on this object.
My gut feeling is that you will also need to define an injective function from the "simple" list definition you have above to the sophisticated one that I've outlined and then finally you'll be able to prove your result.
One other thing, the definition of NonCircular doesn't need to terminate. It isn't a program, it is a proof. If it holds, then you can examine the proof to see why it holds and use this in other proofs.
Edit: Thanks to Necto for pointing out I'm wrong.

Related

Why after pressing semicolon program is back in deep recursion?

I'm trying to understand the semicolon functionality.
I have this code:
del(X,[X|Rest],Rest).
del(X,[Y|Tail],[Y|Rest]) :-
del(X,Tail,Rest).
permutation([],[]).
permutation(L,[X|P]) :- del(X,L,L1), permutation(L1,P).
It's the simple predicate to show all permutations of given list.
I used the built-in graphical debugger in SWI-Prolog because I wanted to understand how it works and I understand for the first case which returns the list given in argument. Here is the diagram which I made for better understanding.
But I don't get it for the another solution. When I press the semicolon it doesn't start in the place where it ended instead it's starting with some deep recursion where L=[] (like in step 9). I don't get it, didn't the recursion end earlier? It had to go out of the recursions to return the answer and after semicolon it's again deep in recursion.
Could someone clarify that to me? Thanks in advance.
One analogy that I find useful in demystifying Prolog is that Backtracking is like Nested Loops, and when the innermost loop's variables' values are all found, the looping is suspended, the vars' values are reported, and then the looping is resumed.
As an example, let's write down simple generate-and-test program to find all pairs of natural numbers above 0 that sum up to a prime number. Let's assume is_prime/1 is already given to us.
We write this in Prolog as
above(0, N), between(1, N, M), Sum is M+N, is_prime(Sum).
We write this in an imperative pseudocode as
for N from 1 step 1:
for M from 1 step 1 until N:
Sum := M+N
if is_prime(Sum):
report_to_user_and_ask(Sum)
Now when report_to_user_and_ask is called, it prints Sum out and asks the user whether to abort or to continue. The loops are not exited, on the contrary, they are just suspended. Thus all the loop variables values that got us this far -- and there may be more tests up the loops chain that sometimes succeed and sometimes fail -- are preserved, i.e. the computation state is preserved, and the computation is ready to be resumed from that point, if the user presses ;.
I first saw this in Peter Norvig's AI book's implementation of Prolog in Common Lisp. He used mapping (Common Lisp's mapcan which is concatMap in Haskell or flatMap in many other languages) as a looping construct though, and it took me years to see that nested loops is what it is really all about.
Goals conjunction is expressed as the nesting of the loops; goals disjunction is expressed as the alternatives to loop through.
Further twist is that the nested loops' structure isn't fixed from the outset. It is fluid, the nested loops of a given loop can be created depending on the current state of that loop, i.e. depending on the current alternative being explored there; the loops are written as we go. In (most of the) languages where such dynamic creation of nested loops is impossible, it can be encoded with nested recursion / function invocation / inside the loops. (Here's one example, with some pseudocode.)
If we keep all such loops (created for each of the alternatives) in memory even after they are finished with, what we get is the AND-OR tree (mentioned in the other answer) thus being created while the search space is being explored and the solutions are found.
(non-coincidentally this fluidity is also the essence of "monad"; nondeterminism is modeled by the list monad; and the essential operation of the list monad is the flatMap operation which we saw above. With fluid structure of loops it is "Monad"; with fixed structure it is "Applicative Functor"; simple loops with no structure (no nesting at all): simply "Functor" (the concepts used in Haskell and the like). Also helps to demystify those.)
So, the proper slogan could be Backtracking is like Nested Loops, either fixed, known from the outset, or dynamically-created as we go. It's a bit longer though. :)
Here's also a Prolog example, which "as if creates the code to be run first (N nested loops for a given value of N), and then runs it." (There's even a whole dedicated tag for it on SO, too, it turns out, recursive-backtracking.)
And here's one in Scheme ("creates nested loops with the solution being accessible in the innermost loop's body"), and a C++ example ("create n nested loops at run-time, in effect enumerating the binary encoding of 2n, and print the sums out from the innermost loop").
There is a big difference between recursion in functional/imperative programming languages and Prolog (and it really became clear to me only in the last 2 weeks or so):
In functional/imperative programming, you recurse down a call chain, then come back up, unwinding the stack, then output the result. It's over.
In Prolog, you recurse down an AND-OR tree (really, alternating AND and OR nodes), selecting a predicate to call on an OR node (the "choicepoint"), from left to right, and calling every predicate in turn on an AND node, also from left to right. An acceptable tree has exactly one predicate returning TRUE under each OR node, and all predicates returning TRUE under each AND node. Once an acceptable tree has been constructed, by the very search procedure, we are (i.e. the "search cursor" is) on a rightmost bottommost node .
Success in constructing an acceptable tree also means a solution to the query entered at the Prolog Toplevel (the REPL) has been found: The variable values are output, but the tree is kept (unless there are no choicepoints).
And this is also important: all variables are global in the sense that if a variable X as been passed all the way down the call chain from predicate to predicate to the rightmost bottommost node, then constrained at the last possible moment by unifying it with 2 for example, X = 2, then the Prolog Toplevel is aware of that without further ado: nothing needs to be passed up the call chain.
If you now press ;, search doesn't restart at the top of the tree, but at the bottom, i.e. at the current cursor position: the nearest parent OR node is asked for more solutions. This may result in much search until a new acceptable tree has been constructed, we are at a new rightmost bottommost node. The new variable values are output and you may again enter ;.
This process cycles until no acceptable tree can be constructed any longer, upon which false is output.
Note that having this AND-OR as an inspectable and modifiable data structure at runtime allows some magical tricks to be deployed.
There is bound to be a lot of power in debugging tools which record this tree to help the user who gets the dreaded sphynxian false from a Prolog program that is supposed to work. There are now Time Traveling Debuggers for functional and imperative languages, after all...

In-order traversal in BST-Ocaml

I'm working with a polymorphic binary search tree with the standard following type definition:
type tree =
Empty
| Node of int * tree * tree (*value, left sub tree, right sub tree*);;
I want to do an in order traversal of this tree and add the values to a list, let's say. I tried this:
let rec in_order tree =
match tree with
Empty -> []
| Node(v,l,r) -> let empty = [] in in_order r#empty;
v::empty;
in_order l#empty
;;
But it keeps returning an empty list every time. I don't see why it is doing that.
When you're working with recursion you need to always reason as follows:
How do I solve the easiest version of the problem?
Supposing I have a solution to an easier problem, how can I modify it to solve a harder problem?
You've done the first part correctly, but the second part is a mess.
Part of the problem is that you've not implemented the thing you said you want to implement. You said you want to do a traversal and add the values to a list. OK, so then the method should take a list somewhere -- the list you are adding to. But it doesn't. So let's suppose it does take such a parameter and see if that helps. Such a list is traditionally called an accumulator for reasons which will become obvious.
As always, get the signature right first:
let rec in_order tree accumulator =
OK, what's the easy solution? If the tree is empty then adding the tree contents to the accumulator is simply the identity:
match tree with
| Empty -> accumulator
Now, what's the recursive case? We suppose that we have a solution to some smaller problems. For instance, we have a solution to the problem of "add everything on one side to the accumulator with the value":
| Node (value, left, right) ->
let acc_with_right = in_order right accumulator in
let acc_with_value = value :: acc_with_right in
OK, we now have the accumulator with all the elements from one side added. We can then use that to add to it all the elements from the other side:
in_order left acc_with_value
And now we can make the whole thing implement the function you tried to write in the first place:
let in_order tree =
let rec aux tree accumulator =
match tree with
| Empty -> accumulator
| Node (value, left, right) ->
let acc_with_right = aux right accumulator in
let acc_with_value = value :: acc_with_right in
aux left acc_with_value in
aux tree []
And we're done.
Does that all make sense? You have to (1) actually implement the exact thing you say you're going to implement, (2) solve the base case, and (3) assume you can solve smaller problems and combine them into solutions to larger problems. That's the pattern you use for all recursive problem solving.
I think your problem boils down to this. The # operator returns a new list that is the concatenation of two other lists. It doesn't modify the other lists. In fact, nothing ever modifies a list in OCaml. Lists are immutable.
So, this expression:
r # empty
Has no effect on the value named empty. It will remain an empty list. In fact, the value empty can never be changed either. Variables in OCaml are also immutable.
You need to imagine constructing and returning your value without modifying lists or variables.
When you figure it out, it won't involve the ; operator. What this operator does is to evaluate two expressions (to the left and right), then return the value of the expression at the right. It doesn't combine values, it performs an action and discards its result. As such, it's not useful when working with lists. (It is used for imperative constructs, like printing values.)
If you thought about using # where you're now using ;, you'd be a lot closer to a solution.

Erlang Recursive end loop

I just started learning Erlang and since I found out there is no for loop I tried recreating one with recursion:
display(Rooms, In) ->
Room = array:get(In, Rooms)
io:format("~w", [Room]),
if
In < 59 -> display(Rooms, In + 1);
true -> true
end.
With this code i need to display the content (false or true) of each array in Rooms till the number 59 is reached. However this creates a weird code which displays all of Rooms contents about 60 times (?). When I drop the if statement and only put in the recursive code it is working except for a exception error: Bad Argument.
So basically my question is how do I put a proper end to my "for loop".
Thanks in advance!
Hmm, this code is rewritten and not pasted. It is missing colon after Room = array:get(In, Rooms). The Bad argument error is probably this:
exception error: bad argument
in function array:get/2 (array.erl, line 633)
in call from your_module_name:display/2
This means, that you called array:get/2 with bad arguments: either Rooms is not an array or you used index out of range. The second one is more likely the cause. You are checking if:
In < 59
and then calling display again, so it will get to 58, evaluate to true and call:
display(Rooms, 59)
which is too much.
There is also couple of other things:
In io:format/2 it is usually better to use ~p instead of ~w. It does exactly the same, but with pretty printing, so it is easier to read.
In Erlang if is unnatural, because it evaluates guards and one of them has to match or you get error... It is just really weird.
case is much more readable:
case In < 59 of
false -> do_something();
true -> ok
end
In case you usually write something, that always matches:
case Something of
{One, Two} -> do_stuff(One, Two);
[Head, RestOfList] -> do_other_stuff(Head, RestOfList);
_ -> none_of_the_previous_matched()
end
The underscore is really useful in pattern matching.
In functional languages you should never worry about details like indexes! Array module has map function, which takes function and array as arguments and calls the given function on each array element.
So you can write your code this way:
display(Rooms) ->
DisplayRoom = fun(Index, Room) -> io:format("~p ~p~n", [Index, Room]) end,
array:map(DisplayRoom, Rooms).
This isn't perfect though, because apart from calling the io:format/2 and displaying the contents, it will also construct new array. io:format returns atom ok after completion, so you will get array of 58 ok atoms. There is also array:foldl/3, which doesn't have that problem.
If you don't have to have random access, it would be best to simply use lists.
Rooms = lists:duplicate(58, false),
DisplayRoom = fun(Room) -> io:format("~p~n", [Room]) end,
lists:foreach(DisplayRoom, Rooms)
If you are not comfortable with higher order functions. Lists allow you to easily write recursive algorithms with function clauses:
display([]) -> % always start with base case, where you don't need recursion
ok; % you have to return something
display([Room | RestRooms]) -> % pattern match on list splitting it to first element and tail
io:format("~p~n", [Room]), % do something with first element
display(RestRooms). % recursive call on rest (RestRooms is quite funny name :D)
To summarize - don't write forloops in Erlang :)
This is a general misunderstanding of recursive loop definitions. What you are trying to check for is called the "base condition" or "base case". This is easiest to deal with by matching:
display(0, _) ->
ok;
display(In, Rooms) ->
Room = array:get(In, Rooms)
io:format("~w~n", [Room]),
display(In - 1, Rooms).
This is, however, rather unidiomatic. Instead of using a hand-made recursive function, something like a fold or map is more common.
Going a step beyond that, though, most folks would probably have chosen to represent the rooms as a set or list, and iterated over it using list operations. When hand-written the "base case" would be an empty list instead of a 0:
display([]) ->
ok;
display([Room | Rooms]) ->
io:format("~w~n", [Room]),
display(Rooms).
Which would have been avoided in favor, once again, of a list operation like foreach:
display(Rooms) ->
lists:foreach(fun(Room) -> io:format("~w~n", [Room]) end, Rooms).
Some folks really dislike reading lambdas in-line this way. (In this case I find it readable, but the larger they get the more likely the are to become genuinely distracting.) An alternative representation of the exact same function:
display(Rooms) ->
Display = fun(Room) -> io:format("~w~n", [Room]) end,
lists:foreach(Display, Rooms).
Which might itself be passed up in favor of using a list comprehension as a shorthand for iteration:
_ = [io:format("~w~n", [Room]) | Room <- Rooms].
When only trying to get a side effect, though, I really think that lists:foreach/2 is the best choice for semantic reasons.
I think part of the difficulty you are experiencing is that you have chosen to use a rather unusual structure as your base data for your first Erlang program that does anything (arrays are not used very often, and are not very idiomatic in functional languages). Try working with lists a bit first -- its not scary -- and some of the idioms and other code examples and general discussions about list processing and functional programming will make more sense.
Wait! There's more...
I didn't deal with the case where you have an irregular room layout. The assumption was always that everything was laid out in a nice even grid -- which is never the case when you get into the really interesting stuff (either because the map is irregular or because the topology is interesting).
The main difference here is that instead of simply carrying a list of [Room] where each Room value is a single value representing the Room's state, you would wrap the state value of the room in a tuple which also contained some extra data about that state such as its location or coordinates, name, etc. (You know, "metadata" -- which is such an overloaded, buzz-laden term today that I hate saying it.)
Let's say we need to maintain coordinates in a three-dimensional space in which the rooms reside, and that each room has a list of occupants. In the case of the array we would have divided the array by the dimensions of the layout. A 10*10*10 space would have an array index from 0 to 999, and each location would be found by an operation similar to
locate({X, Y, Z}) -> (1 * X) + (10 * Y) + (100 * Z).
and the value of each Room would be [Occupant1, occupant2, ...].
It would be a real annoyance to define such an array and then mark arbitrarily large regions of it as "unusable" to give the impression of irregular layout, and then work around that trying to simulate a 3D universe.
Instead we could use a list (or something like a list) to represent the set of rooms, but the Room value would now be a tuple: Room = {{X, Y, Z}, [Occupants]}. You may have an additional element (or ten!), like the "name" of the room or some other status information or whatever, but the coordinates are the most certain real identity you're likely to get. To get the room status you would do the same as before, but mark what element you are looking at:
display(Rooms) ->
Display =
fun({ID, Occupants}) ->
io:format("ID ~p: Occupants ~p~n", [ID, Occupants])
end,
lists:foreach(Display, Rooms).
To do anything more interesting than printing sequentially, you could replace the internals of Display with a function that uses the coordinates to plot the room on a chart, check for empty or full lists of Occupants (use pattern matching, don't do it procedurally!), or whatever else you might dream up.

Explanation of lists:fold function

I learning more and more about Erlang language and have recently faced some problem. I read about foldl(Fun, Acc0, List) -> Acc1 function. I used learnyousomeerlang.com tutorial and there was an example (example is about Reverse Polish Notation Calculator in Erlang):
%function that deletes all whitspaces and also execute
rpn(L) when is_list(L) ->
[Res] = lists:foldl(fun rpn/2, [], string:tokens(L," ")),
Res.
%function that converts string to integer or floating poitn value
read(N) ->
case string:to_float(N) of
%returning {error, no_float} where there is no float avaiable
{error,no_float} -> list_to_integer(N);
{F,_} -> F
end.
%rpn managing all actions
rpn("+",[N1,N2|S]) -> [N2+N1|S];
rpn("-", [N1,N2|S]) -> [N2-N1|S];
rpn("*", [N1,N2|S]) -> [N2*N1|S];
rpn("/", [N1,N2|S]) -> [N2/N1|S];
rpn("^", [N1,N2|S]) -> [math:pow(N2,N1)|S];
rpn("ln", [N|S]) -> [math:log(N)|S];
rpn("log10", [N|S]) -> [math:log10(N)|S];
rpn(X, Stack) -> [read(X) | Stack].
As far as I understand lists:foldl executes rpn/2 on every element on list. But this is as far as I can understand this function. I read the documentation but it does not help me a lot. Can someone explain me how lists:foldl works?
Let's say we want to add a list of numbers together:
1 + 2 + 3 + 4.
This is a pretty normal way to write it. But I wrote "add a list of numbers together", not "write numbers with pluses between them". There is something fundamentally different between the way I expressed the operation in prose and the mathematical notation I used. We do this because we know it is an equivalent notation for addition (because it is commutative), and in our heads it reduces immediately to:
3 + 7.
and then
10.
So what's the big deal? The problem is that we have no way of understanding the idea of summation from this example. What if instead I had written "Start with 0, then take one element from the list at a time and add it to the starting value as a running sum"? This is actually what summation is about, and it's not arbitrarily deciding which two things to add first until the equation is reduced.
sum(List) -> sum(List, 0).
sum([], A) -> A;
sum([H|T], A) -> sum(T, H + A).
If you're with me so far, then you're ready to understand folds.
There is a problem with the function above; it is too specific. It braids three ideas together without specifying any independently:
iteration
accumulation
addition
It is easy to miss the difference between iteration and accumulation because most of the time we never give this a second thought. Most languages accidentally encourage us to miss the difference, actually, by having the same storage location change its value each iteration of a similar function.
It is easy to miss the independence of addition merely because of the way it is written in this example because "+" looks like an "operation", not a function.
What if I had said "Start with 1, then take one element from the list at a time and multiply it by the running value"? We would still be doing the list processing in exactly the same way, but with two examples to compare it is pretty clear that multiplication and addition are the only difference between the two:
prod(List) -> prod(List, 1).
prod([], A) -> A;
prod([H|T], A) -> prod(T, H * A).
This is exactly the same flow of execution but for the inner operation and the starting value of the accumulator.
So let's make the addition and multiplication bits into functions, so we can pull that part of the pattern out:
add(A, B) -> A + B.
mult(A, B) -> A * B.
How could we write the list operation on its own? We need to pass a function in -- addition or multiplication -- and have it operate over the values. Also, we have to pay attention to the identity of the type and operation of things we are operating on or else we will screw up the magic that is value aggregation. "add(0, X)" always returns X, so this idea (0 + Foo) is the addition identity operation. In multiplication the identity operation is to multiply by 1. So we must start our accumulator at 0 for addition and 1 for multiplication (and for building lists an empty list, and so on). So we can't write the function with an accumulator value built-in, because it will only be correct for some type+operation pairs.
So this means to write a fold we need to have a list argument, a function to do things argument, and an accumulator argument, like so:
fold([], _, Accumulator) ->
Accumulator;
fold([H|T], Operation, Accumulator) ->
fold(T, Operation, Operation(H, Accumulator)).
With this definition we can now write sum/1 using this more general pattern:
fsum(List) -> fold(List, fun add/2, 0).
And prod/1 also:
fprod(List) -> fold(List, fun prod/2, 1).
And they are functionally identical to the one we wrote above, but the notation is more clear and we don't have to write a bunch of recursive details that tangle the idea of iteration with the idea of accumulation with the idea of some specific operation like multiplication or addition.
In the case of the RPN calculator the idea of aggregate list operations is combined with the concept of selective dispatch (picking an operation to perform based on what symbol is encountered/matched). The RPN example is relatively simple and small (you can fit all the code in your head at once, it's just a few lines), but until you get used to functional paradigms the process it manifests can make your head hurt. In functional programming a tiny amount of code can create an arbitrarily complex process of unpredictable (or even evolving!) behavior, based just on list operations and selective dispatch; this is very different from the conditional checks, input validation and procedural checking techniques used in other paradigms more common today. Analyzing such behavior is greatly assisted by single assignment and recursive notation, because each iteration is a conceptually independent slice of time which can be contemplated in isolation of the rest of the system. I'm talking a little ahead of the basic question, but this is a core idea you may wish to contemplate as you consider why we like to use operations like folds and recursive notations instead of procedural, multiple-assignment loops.
I hope this helped more than confused.
First, you have to remember haw works rpn. If you want to execute the following operation: 2 * (3 + 5), you will feed the function with the input: "3 5 + 2 *". This was useful at a time where you had 25 step to enter a program :o)
the first function called simply split this character list into element:
1> string:tokens("3 5 + 2 *"," ").
["3","5","+","2","*"]
2>
then it processes the lists:foldl/3. for each element of this list, rpn/2 is called with the head of the input list and the current accumulator, and return a new accumulator. lets go step by step:
Step head accumulator matched rpn/2 return value
1 "3" [] rpn(X, Stack) -> [read(X) | Stack]. [3]
2 "5" [3] rpn(X, Stack) -> [read(X) | Stack]. [5,3]
3 "+" [5,3] rpn("+", [N1,N2|S]) -> [N2+N1|S]; [8]
4 "2" [8] rpn(X, Stack) -> [read(X) | Stack]. [2,8]
5 "*" [2,8] rpn("*",[N1,N2|S]) -> [N2*N1|S]; [16]
At the end, lists:foldl/3 returns [16] which matches to [R], and though rpn/1 returns R = 16

Choosing unique items from a list, using recursion

As follow up to yesterday's question Erlang: choosing unique items from a list, using recursion
In Erlang, say I wanted choose all unique items from a given list, e.g.
List = [foo, bar, buzz, foo].
and I had used your code examples resulting in
NewList = [bar, buzz].
How would I further manipulate NewList in Erlang?
For example, say I not only wanted to choose all unique items from List, but also count the total number of characters of all resulting items from NewList?
In functional programming we have patterns that occur so frequently they deserve their own names and support functions. Two of the most widely used ones are map and fold (sometimes reduce). These two form basic building blocks for list manipulation, often obviating the need to write dedicated recursive functions.
Map
The map function iterates over a list in order, generating a new list where each element is the result of applying a function to the corresponding element in the original list. Here's how a typical map might be implemented:
map(Fun, [H|T]) -> % recursive case
[Fun(H)|map(Fun, T)];
map(_Fun, []) -> % base case
[].
This is a perfect introductory example to recursive functions; roughly speaking, the function clauses are either recursive cases (result in a call to iself with a smaller problem instance) or base cases (no recursive calls made).
So how do you use map? Notice that the first argument, Fun, is supposed to be a function. In Erlang, it's possible to declare anonymous functions (sometimes called lambdas) inline. For example, to square each number in a list, generating a list of squares:
map(fun(X) -> X*X end, [1,2,3]). % => [1,4,9]
This is an example of Higher-order programming.
Note that map is part of the Erlang standard library as lists:map/2.
Fold
Whereas map creates a 1:1 element mapping between one list and another, the purpose of fold is to apply some function to each element of a list while accumulating a single result, such as a sum. The right fold (it helps to think of it as "going to the right") might look like so:
foldr(Fun, Acc, [H|T]) -> % recursive case
foldr(Fun, Fun(H, Acc), T);
foldr(_Fun, Acc, []) -> % base case
Acc.
Using this function, we can sum the elements of a list:
foldr(fun(X, Sum) -> Sum + X, 0, [1,2,3,4,5]). %% => 15
Note that foldr and foldl are both part of the Erlang standard library, in the lists module.
While it may not be immediately obvious, a very large class of common list-manipulation problems can be solved using map and fold alone.
Thinking recursively
Writing recursive algorithms might seem daunting at first, but as you get used to it, it turns out to be quite natural. When encountering a problem, you should identify two things:
How can I decompose the problem into smaller instances? In order for recursion to be useful, the recursive call must take a smaller problem as its argument, or the function will never terminate.
What's the base case, i.e. the termination criterion?
As for 1), consider the problem of counting the elements of a list. How could this possibly be decomposed into smaller subproblems? Well, think of it this way: Given a non-empty list whose first element (head) is X and whose remainder (tail) is Y, its length is 1 + the length of Y. Since Y is smaller than the list [X|Y], we've successfully reduced the problem.
Continuing the list example, when do we stop? Well, eventually, the tail will be empty. We fall back to the base case, which is the definition that the length of the empty list is zero. You'll find that writing function clauses for the various cases is very much like writing definitions for a dictionary:
%% Definition:
%% The length of a list whose head is H and whose tail is T is
%% 1 + the length of T.
length([H|T]) ->
1 + length(T);
%% Definition: The length of the empty list ([]) is zero.
length([]) ->
0.
You could use a fold to recurse over the resulting list. For simplicity I turned your atoms into strings (you could do this with list_to_atom/1):
1> NewList = ["bar", "buzz"].
["bar","buzz"]
2> L = lists:foldl(fun (W, Acc) -> [{W, length(W)}|Acc] end, [], NewList).
[{"buzz",4},{"bar",3}]
This returns a proplist you can access like so:
3> proplists:get_value("buzz", L).
4
If you want to build the recursion yourself for didactic purposes instead of using lists:
count_char_in_list([], Count) ->
Count;
count_char_in_list([Head | Tail], Count) ->
count_char_in_list(Tail, Count + length(Head)). % a string is just a list of numbers
And then:
1> test:count_char_in_list(["bar", "buzz"], 0).
7

Resources