If we have two lists l1 and l2 and we want to concatenate them, we can use @ or List.append, which is O(n1) where n1 is the length of l1. Or we can use List.rev_append, which according to the doc is:
equivalent to List.rev l1 @ l2, but rev_append is tail-recursive and more efficient.
So is rev_append more efficient than @, or is it more efficient than List.rev + @? And is it better to use it instead of @ and append when we don't care about the order?
OCaml lists are immutable. The second list doesn't need to be changed, but the first list has to be copied so the copy can point to the second list. Hence you're going to have to traverse the first list somehow. Nothing you can do will change the big-O time complexity of the append.
Since you can only add new elements at the beginning of a list, you need to traverse the first list in reverse order if you want the result to preserve the order of the first list.
The most obvious way to do this is to call recursively until you're at the end of the first list, then do the prefixing as you return from each recursive call. However this isn't tail-recursive. I.e., it will consume stack space proportional to the length of the first list. When the first list is long, you can run out of stack space (aka stack overflow).
This is the way that @ works. It takes time and stack space proportional to the length of the first list.
Another idea is to give up on maintaining the order of the first list. If you prefix the first list in reverse order, you can easily make the operation tail-recursive. That's the purpose of List.rev_append. It takes constant stack space.
If you want to maintain the original list orders, but also use constant stack space you can reverse the first list (with List.rev), then use List.rev_append.
Plain List.rev_append is faster than @ because it doesn't have to make internal function calls--it can just be a loop. It's also obviously faster than List.rev plus List.rev_append.
In summary, if you don't care about the final order, then yes, List.rev_append is faster than @. It also won't overflow the stack. It's not going to be a gigantic amount faster, because the time complexity is basically the same.
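To make the difference concrete, here is a minimal sketch in Python (used only for illustration). It models an immutable list as nested (head, tail) tuples with None as the empty list; the helper names from_py, to_py, rev_append and append are hypothetical stand-ins, not the OCaml standard library. It shows that rev_append can be a plain loop that shares the second list, and that an order-preserving append can be built from two such loops.

```python
# Model an immutable singly linked list as nested (head, tail) tuples;
# None plays the role of the empty list. Illustrative helpers only.

def from_py(items):
    """Build a cons list from a Python sequence."""
    result = None
    for x in reversed(items):
        result = (x, result)
    return result

def to_py(lst):
    """Convert a cons list back to a Python list."""
    out = []
    while lst is not None:
        head, lst = lst
        out.append(head)
    return out

def rev_append(l1, l2):
    """Prefix the elements of l1, in reverse order, onto l2.

    A simple loop: constant stack space, one new cell per element
    of l1, and l2 is shared rather than copied.
    """
    while l1 is not None:
        head, l1 = l1
        l2 = (head, l2)
    return l2

def append(l1, l2):
    """Order-preserving append: reverse l1 first, then rev_append."""
    return rev_append(rev_append(l1, None), l2)

a = from_py([1, 2, 3])
b = from_py([4, 5])
print(to_py(rev_append(a, b)))  # [3, 2, 1, 4, 5]
print(to_py(append(a, b)))      # [1, 2, 3, 4, 5]
```

Both functions walk the first list once per loop, so the cost stays linear in its length; only the constant factor and the stack usage differ.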
When accumulating a collection (just collection, not list) of values into a single value, there are two options.
reduce(). Which takes a List<T>, and a function (T, T) -> T, and applies that function iteratively until the whole list is reduced into a single value.
fold(). Which takes a List<T>, an initial value V, and a function (V, T) -> V, and applies that function iteratively until the whole list is folded into a single value.
I know that both of them have their own use cases. For example, reduce() can be used to find the maximum value in a list, and fold() can be used to find the sum of all values in a list.
But, in that example, instead of using fold(), you can add(0), and then reduce(). Another use case of fold is to join all elements into a string. But this can also be done without using fold, by map |> toString() followed by reduce().
Just out of curiosity, the question is, can every use case of fold() be avoided given functions map(), filter(), reduce() and add()? (also remove() if required.)
It's the other way around. reduce(L,f) = fold(first(L), rest(L), f), so there's no special need for reduce -- it's just a short form for a common fold pattern.
fold has lots of use cases of its own, though.
The example you gave for string concatenation is one of them -- you can fold items into a special string accumulator much more efficiently than you can build strings by incremental accumulation. (exactly how depends on the language, but it's true pretty much everywhere).
Applying a list of incremental changes to a target object is a pretty common pattern. Adding files to a folder, drawing shapes on a canvas, turning a list into a set, crossing off completed items in a to-do list, etc., are all examples of this pattern.
Also map(L,f) = fold(newMap(), L, (M,v) -> add(M,f(v))), so map is also just a common fold pattern. Similarly, filter(L,f) = fold(newList(), L, (L,v) -> f(v) ? add(L,v) : L).
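The identities above can be sketched in a few lines of Python (chosen only for illustration). The names fold, reduce_via_fold, map_via_fold and filter_via_fold are hypothetical; the accumulator is copied on every step for clarity, not efficiency.

```python
def fold(init, items, f):
    """Left fold: thread an accumulator through items, starting from init."""
    acc = init
    for x in items:
        acc = f(acc, x)
    return acc

def reduce_via_fold(items, f):
    """reduce(L, f) = fold(first(L), rest(L), f); needs a non-empty list."""
    return fold(items[0], items[1:], f)

def map_via_fold(items, f):
    """map as a fold whose accumulator is the output list."""
    return fold([], items, lambda acc, v: acc + [f(v)])

def filter_via_fold(items, keep):
    """filter as a fold that only extends the accumulator when keep(v) holds."""
    return fold([], items, lambda acc, v: acc + [v] if keep(v) else acc)

print(reduce_via_fold([3, 9, 4], max))                          # 9
print(map_via_fold([1, 2, 3], lambda x: x * x))                 # [1, 4, 9]
print(filter_via_fold([1, 2, 3, 4], lambda x: x % 2 == 0))      # [2, 4]
print(fold("", ["a", "b", "c"], lambda acc, s: acc + s))        # abc
```

Note that reduce_via_fold has no sensible result for an empty list, which is exactly the case where fold's explicit initial value is needed.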
polarbear([],H,[H]).
polarbear([H|T],Y,[H|Z]):- polarbear(T,Y,Z).
This is the Prolog code. When I enter ?- polarbear([1,2], 6, P). I get P = [1,2,6].
The thing is I just don't understand how it's working and I've been trying to work out how Prolog is doing what it's doing.
I have some experience with Prolog, but I don't understand this, so any guidance as to how it does what it does in order to help me understand Prolog would be greatly appreciated.
The second clause, polarbear([H|T],Y,[H|Z]) :- polarbear(T,Y,Z), states that the first argument is a list with head H and tail T and the third argument is a list with head H and tail Z. So it forces (by unification) the heads of the two lists to be the same. Recursively, the two lists become identical except that the third list has one more element at the end (element Y), which is defined by the first clause. Note that the second clause only applies to lists with one or more elements. So at the base of the recursion, when we reach the empty list, the first clause makes the remaining tail of the third list contain just one element, Y.
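As a rough analogy, the two clauses can be read as a recursive function. The sketch below uses Python purely for illustration and only models the one direction used in the query (first two arguments known, third computed); a real Prolog predicate is a relation and can also run in other directions.

```python
def polarbear(xs, y):
    """Return xs with y added at the end, mirroring the two Prolog clauses."""
    if not xs:                 # clause 1: polarbear([], H, [H]).
        return [y]
    h, t = xs[0], xs[1:]       # clause 2 head: first argument matches [H|T]
    z = polarbear(t, y)        # clause 2 body: polarbear(T, Y, Z)
    return [h] + z             # clause 2 head: third argument is [H|Z]

# Trace for polarbear([1, 2], 6):
#   polarbear([1, 2], 6) -> [1] + polarbear([2], 6)
#   polarbear([2], 6)    -> [2] + polarbear([], 6)
#   polarbear([], 6)     -> [6]
print(polarbear([1, 2], 6))    # [1, 2, 6]
```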
Suppose we have a list:
List = nil | Cons(car cdr:List).
Note that I am talking about modifiable lists!
And a trivial recursive length function:
recursive Length(List l) = match l with
| nil => 0
| Cons(car cdr) => 1 + Length cdr
end.
Naturally, it terminates only when the list is non-circular:
inductive NonCircular(List l) = {
empty: NonCircular(nil) |
\forall head, tail: NonCircular(tail) => NonCircular (Cons(head tail))
}
Note that this predicate, being implemented as a recursive function, also does not terminate on a circular list.
Usually I see proofs of list traversal termination that use list length as a bounded decreasing factor. They suppose that Length is non-negative. But, as I see it, this fact (Length l >= 0) follows from the termination of Length in the first place.
How do you prove that Length terminates and is non-negative on NonCircular (or an equivalent, better defined predicate) lists?
Am I missing an important concept here?
Unless the length function has cycle detection there is no guarantee it will halt!
For a singly linked list one uses the Tortoise and hare algorithm to determine the length where there is a chance there might be circles in the cdr.
It's just two cursors, the tortoise starts at first element and the hare starts at the second. Tortoise moves one pointer at a time while the hare moves two (if it can). The hare will eventually either be the same as the tortoise, which indicates a cycle, or it will terminate knowing the length is 2*steps or 2*steps+1.
Compared to finding cycles in a tree this is very cheap and performs just as well on terminating lists as a function that does not have cycle detection.
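A minimal sketch of that two-cursor idea, in Python for illustration. The Cons class and safe_length function are hypothetical names; returning None for a circular list is an assumption of this sketch, and the step counter here counts completed tortoise moves, so the two exits are 2*steps+1 and 2*steps+2.

```python
class Cons:
    """A mutable list cell; cdr is another Cons or None (nil)."""
    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr

def safe_length(lst):
    """Return the length of lst, or None if the cdr chain is circular."""
    if lst is None:
        return 0
    tortoise = lst          # sits at index `steps`
    hare = lst.cdr          # sits at index 2*steps + 1 (or None)
    steps = 0
    while True:
        if hare is None:                 # ran off the end after an odd count
            return 2 * steps + 1
        if hare is tortoise:             # hare caught the tortoise: a cycle
            return None
        if hare.cdr is None:             # ran off the end after an even count
            return 2 * steps + 2
        hare = hare.cdr.cdr              # hare moves two cells
        tortoise = tortoise.cdr          # tortoise moves one cell
        steps += 1

straight = Cons(1, Cons(2, Cons(3)))
print(safe_length(straight))             # 3

looped = Cons(1, Cons(2, Cons(3)))
looped.cdr.cdr.cdr = looped.cdr          # last cell points back to the second
print(safe_length(looped))               # None
```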
The definition of List that you have on top doesn't seem to permit circular lists. Each call to the "constructor" Cons will create a new pointer, and you aren't allowed to modify the pointer later to create the circularity.
You need a more sophisticated definition of List if you want to handle circularity. You probably need to define a Cell containing data value and an address, and a Node which contains a Cell and an address pointing to the previous node, and then you'll need to define the dereferencing operator to go back from addresses to Cells. You can also try to define non-circular on this object.
My gut feeling is that you will also need to define an injective function from the "simple" list definition you have above to the sophisticated one that I've outlined and then finally you'll be able to prove your result.
One other thing, the definition of NonCircular doesn't need to terminate. It isn't a program, it is a proof. If it holds, then you can examine the proof to see why it holds and use this in other proofs.
Edit: Thanks to Necto for pointing out I'm wrong.
As follow up to yesterday's question Erlang: choosing unique items from a list, using recursion
In Erlang, say I wanted to choose all unique items from a given list, e.g.
List = [foo, bar, buzz, foo].
and I had used your code examples resulting in
NewList = [bar, buzz].
How would I further manipulate NewList in Erlang?
For example, say I not only wanted to choose all unique items from List, but also count the total number of characters of all resulting items from NewList?
In functional programming we have patterns that occur so frequently they deserve their own names and support functions. Two of the most widely used ones are map and fold (sometimes reduce). These two form basic building blocks for list manipulation, often obviating the need to write dedicated recursive functions.
Map
The map function iterates over a list in order, generating a new list where each element is the result of applying a function to the corresponding element in the original list. Here's how a typical map might be implemented:
map(Fun, [H|T]) ->   % recursive case
    [Fun(H)|map(Fun, T)];
map(_Fun, []) ->     % base case
    [].
This is a perfect introductory example of a recursive function; roughly speaking, the function clauses are either recursive cases (which result in a call to itself with a smaller problem instance) or base cases (no recursive calls made).
So how do you use map? Notice that the first argument, Fun, is supposed to be a function. In Erlang, it's possible to declare anonymous functions (sometimes called lambdas) inline. For example, to square each number in a list, generating a list of squares:
map(fun(X) -> X*X end, [1,2,3]). % => [1,4,9]
This is an example of Higher-order programming.
Note that map is part of the Erlang standard library as lists:map/2.
Fold
Whereas map creates a 1:1 element mapping between one list and another, the purpose of fold is to apply some function to each element of a list while accumulating a single result, such as a sum. The left fold (it helps to think of it as starting at the left end and going to the right) might look like so:
foldl(Fun, Acc, [H|T]) ->   % recursive case
    foldl(Fun, Fun(H, Acc), T);
foldl(_Fun, Acc, []) ->     % base case
    Acc.
Using this function, we can sum the elements of a list:
foldl(fun(X, Sum) -> Sum + X end, 0, [1,2,3,4,5]). %% => 15
Note that foldr and foldl are both part of the Erlang standard library, in the lists module.
While it may not be immediately obvious, a very large class of common list-manipulation problems can be solved using map and fold alone.
Thinking recursively
Writing recursive algorithms might seem daunting at first, but as you get used to it, it turns out to be quite natural. When encountering a problem, you should identify two things:
1) How can I decompose the problem into smaller instances? In order for recursion to be useful, the recursive call must take a smaller problem as its argument, or the function will never terminate.
2) What's the base case, i.e. the termination criterion?
As for 1), consider the problem of counting the elements of a list. How could this possibly be decomposed into smaller subproblems? Well, think of it this way: Given a non-empty list whose first element (head) is X and whose remainder (tail) is Y, its length is 1 + the length of Y. Since Y is smaller than the list [X|Y], we've successfully reduced the problem.
Continuing the list example, when do we stop? Well, eventually, the tail will be empty. We fall back to the base case, which is the definition that the length of the empty list is zero. You'll find that writing function clauses for the various cases is very much like writing definitions for a dictionary:
%% Definition:
%% The length of a list whose head is H and whose tail is T is
%% 1 + the length of T.
length([H|T]) ->
    1 + length(T);
%% Definition: The length of the empty list ([]) is zero.
length([]) ->
    0.
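Applied to the original question, the map and fold building blocks combine as follows. This is a sketch in Python for illustration only; it uses strings in place of atoms and assumes "unique" means "occurs exactly once", as in the question's example.

```python
from collections import Counter
from functools import reduce

items = ["foo", "bar", "buzz", "foo"]

# Keep only the items that occur exactly once in the input.
counts = Counter(items)
unique = [w for w in items if counts[w] == 1]

# map: one length per remaining item.
lengths = list(map(len, unique))

# fold: accumulate the lengths into a single total, starting from 0.
total = reduce(lambda acc, n: acc + n, lengths, 0)

print(unique)   # ['bar', 'buzz']
print(lengths)  # [3, 4]
print(total)    # 7
```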
You could use a fold to recurse over the resulting list. For simplicity I turned your atoms into strings (you could do this with atom_to_list/1):
1> NewList = ["bar", "buzz"].
["bar","buzz"]
2> L = lists:foldl(fun (W, Acc) -> [{W, length(W)}|Acc] end, [], NewList).
[{"buzz",4},{"bar",3}]
This returns a proplist you can access like so:
3> proplists:get_value("buzz", L).
4
If you want to build the recursion yourself for didactic purposes instead of using lists:
count_char_in_list([], Count) ->
    Count;
count_char_in_list([Head | Tail], Count) ->
    count_char_in_list(Tail, Count + length(Head)). % a string is just a list of numbers
And then:
1> test:count_char_in_list(["bar", "buzz"], 0).
7
I am writing a predicate that should give me a list of all possible elements. In each iteration it gives me the latest answer, but after the recursion I only get the last answer back. How can I make it give back every single answer?
Thank you.
The problem is that I am trying to find all possible distributions of a list into other lists. The code:
addIn(_,[],Result,Result).
addIn(C,[Element|Rest],[F|R],Result):-
member( Members , [F|R]),
sumlist( Members, Sum),
sumlist([Element],ElementLength),
Cap is Sum + ElementLength,
(Cap =< Ca,
append([Element], Members,New)....
By calling test I get back the list of all possible answers. Now if I try something that will fail, like
bp(3,11,[8,2,4,6,1,8,4],Answer).
it just enters an endless loop. Moreover, if I change the
bp(NB,C,OL,A):-
addIn(C,OL,[[],[],[]],A);
bp(NB,C,_,A).
to use AND instead of OR, I get the error:
ERROR: is/2: Arguments are not sufficiently instantiated
I appreciate the help.
Thanks a lot @hardmath
It sounds like you are trying to write your own version of findall/3, perhaps limited to a special case of an underlying goal. Doing it generally (constructing a list of all solutions to a given goal) in a user-defined Prolog predicate is not possible without resorting to side-effects with assert/retract.
However a number of useful special cases can be implemented without such "tricks". So it would be helpful to know what predicate defines your "all possible elements". [It may also be helpful to state which Prolog implementation you are using, if only so that responses may include links to documentation for that version.]
One important special case is where the "universe" of potential candidates already exists as a list. In that case we are really asking to find the sublist of "all possible elements" that satisfy a particular goal.
findSublist([ ],_,[ ]).
findSublist([H|T],Goal,[H|S]) :-
    call(Goal,H),   % Goal is the name of a one-argument predicate
    !,
    findSublist(T,Goal,S).
findSublist([_|T],Goal,S) :-
    findSublist(T,Goal,S).
Many Prologs will allow you to pass the name of a predicate Goal around as an "atom", but if you have a specific goal in mind, you can leave out the middle argument and just hardcode your particular condition into the middle clause of a similar implementation.
Added in response to code posted:
I think I have a glimmer of what you are trying to do. It's hard to grasp because you are not going about it in the right way. Your predicate bp/4 has a single recursive clause, variously attempted using either AND or OR syntax to relate a call to addIn/4 to a call to bp/4 itself.
Apparently you expect wrapping bp/4 around addIn/4 in this way will somehow cause addIn/4 to accumulate or iterate over its solutions. It won't. It might help you to see this if we analyze what happens to the arguments of bp/4.
You are calling the formal arguments bp(NB,C,OL,A) with simple integers bound to NB and C, with a list of integers bound to OL, and with A as an unbound "output" Answer. Note that nothing is ever done with the value NB, as it is not passed to addIn/4 and is passed unchanged to the recursive call to bp/4.
Based on the variable names used by addIn/4 and supporting predicate insert/4, my guess is that NB was intended to mean "number of bins". For one thing you set NB = 3 in your test/0 clause, and later you "hardcode" three empty lists in the third argument in calling addIn/4. Whatever Answer you get from bp/4 comes from what addIn/4 is able to do with its first two arguments passed in, C and OL, from bp/4. As we noted, C is an integer and OL a list of integers (at least in the way test/0 calls bp/4).
So let's try to state just what addIn/4 is supposed to do with those arguments. Superficially addIn/4 seems to be structured for self-recursion in a sensible way. Its first clause is a simple termination condition that when the second argument becomes an empty list, unify the third and fourth arguments and that gives "answer" A to its caller.
The second clause for addIn/4 seems to coordinate with that approach. As written it takes the "head" Element off the list in the second argument and tries to find a "bin" in the third argument that Element can be inserted into while keeping the sum of that bin under the "cap" given by C. If everything goes well, eventually all the numbers from OL get assigned to a bin, all the bins have totals under the cap C, and the answer A gets passed back to the caller. The way addIn/4 is written leaves a lot of room for improvement just in basic clarity, but it may be doing what you need it to do.
Which brings us back to the question of how you should collect the answers produced by addIn/4. Perhaps you are happy to print them out one at a time. Perhaps you meant to collect all the solutions produced by addIn/4 into a single list. To finish up the exercise I'll need you to clarify what you really want to do with the Answers from addIn/4.
Let's say you want to print them all out and then stop, with a special case being to print nothing if the arguments being passed in don't allow a solution. Then you'd probably want something of this nature:
newtest :-
    addIn(12,[7, 3, 5, 4, 6, 4, 5, 2],[[],[],[]],Answer),
    format("Answer = ~w\n",[Answer]),
    fail.
newtest.
This is a standard way of getting predicate addIn/4 to try all possible solutions, and then stop with the "fall-through" success of the second clause of newtest/0.
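For comparison, the same "enumerate every solution, then stop" behaviour can be sketched with a backtracking generator. This is Python used only for illustration, and the function name distributions is hypothetical. It is a guess at what addIn/4 is meant to do (place every item into some bin without any bin's sum exceeding the cap), not a translation of the posted code.

```python
def distributions(cap, items, bins):
    """Yield every way to place all items into bins with each bin sum <= cap."""
    if not items:
        # Like the first clause of addIn/4: nothing left, the bins are the answer.
        yield [list(b) for b in bins]
        return
    element, rest = items[0], items[1:]
    for i, b in enumerate(bins):           # the single choice point: which bin
        if sum(b) + element <= cap:
            new_bins = bins[:i] + [[element] + b] + bins[i + 1:]
            yield from distributions(cap, rest, new_bins)

# Print every answer one at a time, as the failure-driven loop does.
for answer in distributions(3, [1, 2], [[], []]):
    print("Answer =", answer)

# Collect all answers into one list; an unsolvable input simply gives [].
print(list(distributions(11, [8, 2, 4, 6, 1, 8, 4], [[], [], []])))  # []
```

Because the generator is exhausted once every branch has been tried, an input with no solution terminates with an empty result instead of looping.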
(Added) Suggestions about coding addIn/4:
It will make the code more readable and maintainable if the variable names are clear. I'd suggest using Cap instead of C as the first argument to addIn/4 and BinSum when you take the sum of items assigned to a "bin". Likewise Bin would be better where you used Members. In the third argument to addIn/4 (in the head of the second clause) you don't need an explicit list structure [F|R] since you never refer to either part F or R by itself. So there I'd use Bins.
Some of your predicate calls don't accomplish much that you cannot do more easily. For example, your second call to sumlist/2 involves a list with one item. Thus the sum is just the same as that item, i.e. ElementLength is the same as Element. Here you could just replace both calls to sumlist/2 with one such call:
sumlist([Element|Bin],BinSum)
and then do your test comparing BinSum with Cap. Similarly, your call to append/3 just adjoins the single item Element to the front of the list I'm calling Bin, so you could just replace what you have called New with [Element|Bin].
You have used an extra pair of parentheses around the last four subgoals (in the second clause for addIn/4). Since AND is implied for all the subgoals of this clause, using the extra pair of parentheses is unnecessary.
The code for insert/4 isn't shown now, but it could be a source of some unintended "backtracking" in special cases. The better approach would be to have the first call (currently to member/2) be your only point of indeterminacy, i.e. when you choose one of the bins, do it by replacing it with a free variable that gets unified with [Element|Bin] at the next to last step.