How to create a Prolog predicate that removes 2nd to last element? - recursion

I need help creating a predicate that removes the 2nd to last element of a list and returns that list written in Prolog. So far I have
remove([],[]).
remove([X],[X]).
remove([X,Y],[Y]).
That is as far as I've gotten. I need to figure out a way to recursively go through the list until it is only two elements long and then reassemble the list to be returned. Help with explanation if you can.

Your definition so far is perfect! It is a little bit too specialized, so we will have to extend it. But your program is a solid foundation.
You "only" need to extend it.
remove([],[]).
remove([X],[X]).
remove([_,X],[X]).
remove([X,_,Y], [X,Y]).
remove([X,Y,_,Z], [X,Y,Z]).
remove([X,Y,Z,_,Z2], [X,Y,Z,Z2]).
...
OK, you see how to continue. Now, let us identify common cases:
...
remove([X,Y,_,Z], [X,Y,Z]).
% ^^^ ^^^
remove([X,Y,Z,_,Z2], [X,Y,Z,Z2]).
% ^^^^^ ^^^^^
...
So, we have a common list prefix. We could say:
Whenever we have a list and its removed list, we can conclude that by adding one element on both sides, we get a longer list of that kind.
remove([X|Xs], [X|Ys]) :-
remove(Xs,Ys).
Please note that the :- is really an arrow. It means: Provided what is true on the right-hand side, also what is found on the left-hand side will be true.
H-h-hold a minute! Is this really the case? How to test this? (If you test just for positive cases, you will always get a "yes".) We don't have the time to conjure up some test cases, do we? So let us let Prolog do the hard work for us! So, Prolog, fill in the blanks!
remove([],[]).
remove([X],[X]).
remove([_,X],[X]).
remove([X|Xs], [X|Ys]) :-
remove(Xs,Ys).
?- remove(Xs,Ys). % most general goal
Xs = [], Ys = []
; Xs = [A], Ys = [A]
; Xs = [_,A], Ys = [A]
; Xs = [A], Ys = [A] % redundant, but OK
; Xs = [A,B], Ys = [A,B], unexpected % WRONG
; Xs = [A,_,B], Ys = [A,B]
; Xs = [A,B], Ys = [A,B], unexpected % WRONG again!
; Xs = [A,B,C], Ys = [A,B,C], unexpected % WRONG
; Xs = [A,B,_,C], Ys = [A,B,C]
; ... .
It is tempting to reject everything and start again from scratch.
But in Prolog you can do better than that, so let's calm down to estimate the actual damage:
Some answers are incorrect. And some answers are correct.
It could be that our current definition is just a little bit too general.
To better understand the situation, I will look at the unexpected success remove([1,2],[1,2]) in detail. Who is the culprit for it?
Even the following program slice/fragment succeeds.
remove([],[]).
remove([X],[X]) :- false.
remove([_,X],[X]) :- false.
remove([X|Xs], [X|Ys]) :-
remove(Xs,Ys).
Although this is a specialization of our program, it says that remove/2 holds for any list paired with itself. That can't be true! To fix the problem we have to change something in the remaining visible part, and we have to specialize it. What is problematic here is that the recursive rule also holds for:
remove([1,2], [1,2]) :-
remove([2], [2]).
remove([2], [2]) :-
remove([], []).
That kind of conclusion must be avoided. We need to restrict the rule to those cases where the list has at least two further elements by adding another goal (=)/2.
remove([X|Xs], [X|Ys]) :-
Xs = [_,_|_],
remove(Xs, Ys).
So what was our error? In the informal statement
Whenever we have a list and its removed list, ...
the term "removed list" was ambiguous. It could mean that we are referring here to the relation remove/2 (which is incorrect, because remove([],[]) holds, but still nothing is removed), or we are referring here to a list with an element removed. Such errors inevitably happen in programming since you want to keep your intuitions afresh by using a less formal language than Prolog itself.
For reference, here again (and for comparison with other definitions) is the final definition:
remove([],[]).
remove([X],[X]).
remove([_,X],[X]).
remove([X|Xs], [X|Ys]) :-
Xs = [_,_|_],
remove(Xs,Ys).
There are more efficient ways to do this, but this is the most straightforward way.
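For readers who find the clause structure easier to follow in another language, here is a minimal Python sketch of the same four cases. The name `remove_second_last` is made up for illustration, and Python only computes in one direction, so unlike the Prolog relation it cannot enumerate answers for the most general query.

```python
def remove_second_last(xs):
    """Return a copy of xs without its second-to-last element."""
    if len(xs) <= 1:          # remove([],[]) and remove([X],[X])
        return list(xs)
    if len(xs) == 2:          # remove([_,X],[X])
        return [xs[1]]
    # Recursive rule: only reached when at least two elements follow the
    # head, which is what the goal Xs = [_,_|_] guarantees in Prolog.
    return [xs[0]] + remove_second_last(xs[1:])
```

The explicit length checks play the role of the clause heads; without the last restriction the function would copy the list unchanged, which is exactly the unexpected answer seen above.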

I will try to provide another solution which is easier to construct if you only consider the meaning of "second last element", and describe each possible case explicitly:
rem_2nd_last([], []).
rem_2nd_last([First|Rest], R) :-
rem_2nd_last_2(Rest, First, R). % "Lag" the list once
rem_2nd_last_2([], First, [First]).
rem_2nd_last_2([Second|Rest], First, R) :-
rem_2nd_last_3(Rest, Second, First, R). % "Lag" the list twice
rem_2nd_last_3([], Last, _SecondLast, [Last]). % End of list: drop second last
rem_2nd_last_3([This|Rest], Prev, PrevPrev, [PrevPrev|R]) :-
rem_2nd_last_3(Rest, This, Prev, R). % Rest of list
The explanation is hiding in plain view in the definition of the three predicates.
"Lagging" is a way to reach back from the end of the list but keep the predicate always deterministic. You just grab one element and pass the rest of the list as the first argument of a helper predicate. One way, for example, to define last/2, is:
last([H|T], Last) :-
last_1(T, H, Last).
last_1([], Last, Last).
last_1([H|T], _, Last) :-
last_1(T, H, Last).
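The lagging idea carries over to languages without pattern matching. The following is a rough Python sketch, not a translation of the predicates above: it reads the input once and holds back two elements, so it always knows which element is second to last when the input ends. The variable names mirror the Prolog ones; everything else is an assumption made for illustration.

```python
def rem_2nd_last(items):
    """Drop the second-to-last element while reading items only once."""
    it = iter(items)
    try:
        prev_prev = next(it)      # "lag" once
    except StopIteration:
        return []                 # empty input
    try:
        prev = next(it)           # "lag" twice
    except StopIteration:
        return [prev_prev]        # single element: nothing to drop
    result = []
    for this in it:
        result.append(prev_prev)  # prev_prev can no longer be second to last
        prev_prev, prev = prev, this
    result.append(prev)           # end of input: keep last, drop prev_prev
    return result
```

Because it only uses an iterator, the same function also works on inputs whose length is not known in advance.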

Related

Avoiding infinite recursion but still using unbound parameter passing only

I have the following working program: (It can be tested on this site: http://swish.swi-prolog.org, I've removed the direct link to a saved program, because I noticed that anybody can edit it.)
It searches for a path between two points in an undirected graph. The important part is that the result is returned in the scope of the "main" predicate. (In the Track variable)
edge(a, b).
edge(b, c).
edge(d, b).
edge(d, e).
edge(v, w).
connected(Y, X) :-
(
edge(X, Y);
edge(Y, X)
).
path(X, X, _, []) :-
connected(X, _).
path(X, Y, _, [X, Y]) :-
connected(Y, X).
path(X, Z, Visited, [X|Track]) :-
connected(X, Y),
not(member(X, Visited)),
path(Y, Z, [X|Visited], Track).
main(X, Y) :-
path(X, Y, [], Track),
print(Track),
!.
Results:
?- main(a, e).
[a, b, d, e]
true
?- main(c, c).
[]
true
?- main(b, w).
false
My questions:
The list of visited nodes is passed down to the predicates in 2 different ways. In the bound Visited variable and in the unbound Track variable. What are the names of these 2 different forms of parameter passing?
Normally I only wanted to use the unbound parameter passing (Track variable), to have the results in the scope of the main predicate. But I had to add the Visited variable too, because the member checking didn't work on the Track variable (I don't know why). Is it possible to make it work with only passing the Track in an unbound way? (without the Visited variable)
Many thanks!
The short answer: no, you cannot avoid the extra argument without making everything much messier. This is because this particular algorithm for finding a path needs to keep a state; basically, your extra argument is your state.
There might be other ways to keep a state, like using a global, mutable variable, or dynamically changing the Prolog data base, but both are more difficult to get right and will involve more code.
This extra argument is often called an accumulator, because it accumulates something as you go down the proof tree. The simplest example would be traversing a list:
foo([]).
foo([X|Xs]) :-
foo(Xs).
This is fine, unless you need to know what elements you have already seen before getting here:
bar(List) :-
bar_(List, []).
bar_([], _).
bar_([X|Xs], Acc) :-
/* Acc is a list of all elements so far */
bar_(Xs, [X|Acc]).
This is about the same as what you are doing in your code. And if you look at this in particular:
path(X, Z, Visited, /* here */[X|Track]) :-
connected(X, Y),
not(member(X, Visited)),
path(Y, Z, [X|Visited], /* and here */Track).
The last argument of path/4 has one element more at a depth of one less in the proof tree! And, of course, the third argument is one longer (it grows as you go down the proof tree).
For example, you can reverse a list by adding another argument to the silly bar predicate above:
list_reverse(L, R) :-
list_reverse_(L, [], R).
list_reverse_([], R, R).
list_reverse_([X|Xs], R0, R) :-
list_reverse_(Xs, [X|R0], R).
I am not aware of any special name for the last argument, the one that is free at the beginning and holds the solution at the end. In some cases it could be an output argument, because it is meant to capture the output, after transforming the input somehow. There are many cases where it is better to avoid thinking about arguments as strictly input or output arguments. For example, length/2:
?- length([a,b], N).
N = 2.
?- length(L, 3).
L = [_2092, _2098, _2104].
?- length(L, N).
L = [],
N = 0 ;
L = [_2122],
N = 1 ;
L = [_2122, _2128],
N = 2 . % and so on
Note: there are quite a few minor issues with your code that are not critical, and giving that much advice is not a good idea on Stackoverflow. If you want you could submit this as a question on Code Review.
Edit: you should definitely study this question.
I also provided a somewhat simpler solution here. Note the use of term_expansion/2 for making directed edges from undirected edges at compile time. More important: you don't need the main, just call the predicate you want from the top level. When you drop the cut, you will get all possible solutions when one or both of your From and To arguments are free variables.
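To make the role of the extra argument concrete, here is a small Python sketch of the same search. It is not the code from the question: the names `EDGES`, `neighbours` and `path` are illustrative, and it returns `[c]` rather than `[]` for a path from a node to itself. The `visited` tuple is the state that grows on the way down, while the returned list is built on the way back up.

```python
EDGES = [("a", "b"), ("b", "c"), ("d", "b"), ("d", "e"), ("v", "w")]

def neighbours(node):
    """Nodes joined to `node` by an edge in either direction."""
    out = []
    for x, y in EDGES:
        if x == node:
            out.append(y)
        elif y == node:
            out.append(x)
    return out

def path(start, goal, visited=()):
    """Return one path from start to goal as a list, or None."""
    if start == goal:
        return [start]
    if start in visited:          # the accumulator prevents cycles
        return None
    for nxt in neighbours(start):
        found = path(nxt, goal, visited + (start,))
        if found is not None:
            return [start] + found
    return None
```

Dropping the `visited` parameter would make the search loop forever between `a` and `b`, which is the Python counterpart of why the third argument of path/4 is needed.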

In-order traversal in BST-Ocaml

I'm working with a polymorphic binary search tree with the standard following type definition:
type tree =
Empty
| Node of int * tree * tree (*value, left sub tree, right sub tree*);;
I want to do an in order traversal of this tree and add the values to a list, let's say. I tried this:
let rec in_order tree =
match tree with
Empty -> []
| Node(v,l,r) -> let empty = [] in in_order r@empty;
v::empty;
in_order l@empty
;;
But it keeps returning an empty list every time. I don't see why it is doing that.
When you're working with recursion you need to always reason as follows:
How do I solve the easiest version of the problem?
Supposing I have a solution to an easier problem, how can I modify it to solve a harder problem?
You've done the first part correctly, but the second part is a mess.
Part of the problem is that you've not implemented the thing you said you want to implement. You said you want to do a traversal and add the values to a list. OK, so then the method should take a list somewhere -- the list you are adding to. But it doesn't. So let's suppose it does take such a parameter and see if that helps. Such a list is traditionally called an accumulator for reasons which will become obvious.
As always, get the signature right first:
let rec in_order tree accumulator =
OK, what's the easy solution? If the tree is empty then adding the tree contents to the accumulator is simply the identity:
match tree with
| Empty -> accumulator
Now, what's the recursive case? We suppose that we have a solution to some smaller problems. For instance, we have a solution to the problem of "add everything on one side to the accumulator with the value":
| Node (value, left, right) ->
let acc_with_right = in_order right accumulator in
let acc_with_value = value :: acc_with_right in
OK, we now have the accumulator with all the elements from one side added. We can then use that to add to it all the elements from the other side:
in_order left acc_with_value
And now we can make the whole thing implement the function you tried to write in the first place:
let in_order tree =
let rec aux tree accumulator =
match tree with
| Empty -> accumulator
| Node (value, left, right) ->
let acc_with_right = aux right accumulator in
let acc_with_value = value :: acc_with_right in
aux left acc_with_value in
aux tree []
And we're done.
Does that all make sense? You have to (1) actually implement the exact thing you say you're going to implement, (2) solve the base case, and (3) assume you can solve smaller problems and combine them into solutions to larger problems. That's the pattern you use for all recursive problem solving.
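The same three steps can be sketched in Python, in case seeing the accumulator outside OCaml helps. This is only an illustration under some assumptions: the `Node` class is made up here, `None` stands in for `Empty`, and `[value] + acc` copies the list, whereas `::` in OCaml does not.

```python
from typing import NamedTuple, Optional

class Node(NamedTuple):
    value: int
    left: Optional["Node"]    # None plays the role of Empty
    right: Optional["Node"]

def in_order(tree):
    def aux(tree, accumulator):
        if tree is None:                      # base case: nothing to add
            return accumulator
        acc_with_right = aux(tree.right, accumulator)
        acc_with_value = [tree.value] + acc_with_right
        return aux(tree.left, acc_with_value)
    return aux(tree, [])
```

Since elements are added at the front, the right subtree has to be processed first for the final list to come out in ascending order.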
I think your problem boils down to this. The @ operator returns a new list that is the concatenation of two other lists. It doesn't modify the other lists. In fact, nothing ever modifies a list in OCaml. Lists are immutable.
So, this expression:
r @ empty
Has no effect on the value named empty. It will remain an empty list. In fact, the value empty can never be changed either. Variables in OCaml are also immutable.
You need to imagine constructing and returning your value without modifying lists or variables.
When you figure it out, it won't involve the ; operator. What this operator does is to evaluate two expressions (to the left and right), then return the value of the expression at the right. It doesn't combine values, it performs an action and discards its result. As such, it's not useful when working with lists. (It is used for imperative constructs, like printing values.)
If you thought about using @ where you're now using ;, you'd be a lot closer to a solution.
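As a hedged illustration of "constructing and returning" a value instead of modifying one, here is how the traversal might look in Python with plain list concatenation. The tuple representation of the tree is an assumption made to keep the sketch short.

```python
def in_order(tree):
    """tree is None (empty) or a tuple (value, left, right)."""
    if tree is None:
        return []
    value, left, right = tree
    # Concatenation builds a new list; none of the operands is modified.
    return in_order(left) + [value] + in_order(right)
```

Each call returns its result to the caller, and the caller combines the results; nothing is discarded along the way.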

How to write a Prolog predicate to split a list into a list of paired elements?

This was a question on a sample exam I did.
Give the definition of a Prolog predicate split_into_pairs that takes as arguments a list and returns as a result a list which consists of paired elements. For example, split_into_pairs([1,2,3,4,5,6],X) would return as a result X=[[1,2],[3,4],[5,6]]. Similarly, split_into_pairs([a,2,3,4,a,a,a,a],X) would return as result X=[[a,2],[3,4],[a,a],[a,a]] while split_into_pairs([1,2,3],X) would return No.
It's not meant to be done using built-in predicates I believe, but it shouldn't need to be too complicated either as it was only worth 8/120 marks.
I'm not sure what it should do for a list of two elements, so I guess that would either be not specified so that it returns no, or split_into_pairs([A,B],[[A,B]]).
My main issue is how to do the recursive call properly, without having extra brackets, not ending up as something like X=[[A,B],[[C,D],[[E,F]]]]?.
My most recent attempts have been variations of the code below, but obviously this is incorrect.
split_into_pairs([A,B],[A,B])
split_into_pairs([A,B|T], X) :- split_into_pairs(T, XX), X is [A,B|XX]
This is a relatively straightforward recursion:
split_into_pairs([], []).
split_into_pairs([First, Second | Tail], [[First, Second] | Rest]) :-
split_into_pairs(Tail, Rest).
The first rule says that an empty list is already split into pairs; the second requires that the source list has at least two items, pairs them up, and inserts the result of pairing up the tail list behind them.
Here is a demo on ideone.
Your solution could be fixed as well by adding square brackets in the result, and moving the second part of the rule into the header, like this:
split_into_pairs([A,B],[[A,B]]).
split_into_pairs([A,B|T], [[A,B]|XX]) :- split_into_pairs(T, XX).
Note that this solution does not consider an empty list a list of pairs, so split_into_pairs([], X) would fail.
Your code is almost correct. It has obvious syntax issues, and several substantive issues:
split_into_pairs([A,B], [ [ A,B ] ] ):- !.
split_into_pairs([A,B|T], X) :- split_into_pairs(T, XX),
X = [ [ A,B ] | XX ] .
Now it is correct: = is used instead of is (which is normally used with arithmetic operations), both clauses are properly terminated by dots, and the first one has a cut added into it, to make the predicate deterministic, to produce only one result. The correct structure is produced by enclosing each pair of elements into a list of their own, with brackets.
This is inefficient though, because it describes a recursive process - it constructs the result on the way back from the base case.
The efficient definition works on the way forward from the starting case:
split_into_pairs([A,B],[[A,B]]):- !.
split_into_pairs([A,B|T], X) :- X = [[A,B]|XX], split_into_pairs(T, XX).
This is the essence of the tail recursion modulo cons optimization technique, which turns recursive processes into iterative ones, such that they are able to run in constant stack space. It is very similar to the tail-recursion with accumulator technique.
The cut had to be introduced because the two clauses are not mutually exclusive: a term unifying with [A,B] could also be unifiable with [A,B|T], in case T=[]. We can get rid of the cut by making the two clauses mutually exclusive:
split_into_pairs([], [] ).
split_into_pairs([A,B|T], [[A,B]|XX]):- split_into_pairs(T, XX).
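For comparison, the two cut-free clauses can be mirrored in Python roughly as follows. This is a sketch only: returning `None` is an assumption standing in for Prolog's failure on lists of odd length.

```python
def split_into_pairs(items):
    """Return a list of two-element lists, or None if the length is odd."""
    if not items:                      # split_into_pairs([], []).
        return []
    if len(items) < 2:                 # no clause matches a one-element list
        return None
    rest = split_into_pairs(items[2:])
    if rest is None:                   # failure propagates upwards
        return None
    return [[items[0], items[1]]] + rest
```

Each pair is wrapped in its own list before being put in front of the result for the tail, which is what avoids the nested brackets from the question.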

Miranda going through lists

Is there an easy way to go through a list?
Let's say I wanted to access the 5th item in the list, not knowing it was a B:
["A","A","A","A","B","A","A","A","A"]
Is there a way I can do it without having to sort through the list?
I do not know Miranda that well, but I expect the functions skip and take are available.
You can address the 5th element by making a function out of skip and take. When skip and take are not available, it is easy to create them yourself.
skip: skips the y number of elements in a list, when y is greater than the number of items in the list, it will return an empty list
take: takes the first y number of elements in a list, when y is greater than the number of items in the list, the full list will be returned.
skip y [] = []
skip 0 xs = xs
skip y (x:xs) = skip (y-1) xs
take y [] = []
take 0 xs = []
take y (x:xs) = x : take (y-1) xs
elementAt x xs = take 1 (skip x xs)
Lists are inductive datatypes. This means that functions defined over lists - for instance, accessing the nth element - are defined by recursion. The data structure you are looking for appears to be an array, which allows constant time lookup. The easiest way to find the element at an index in a list is directly:
lookup :: Int -> [a] -> Maybe a
lookup n [] = Nothing
lookup 0 (x:xs) = Just x
lookup n (x:xs) = lookup (n - 1) xs
Another way to do this would be to use the ! operator. Let's say you have a program with defined data in the list, such as:
plist = [A,A,A,A,B,A,A,A,A]
then executing plist!4 will give you the 5th element of that list. (4 being the 5th unit if you include 0,1,2,3,4)
So plist!4 returns B.
Lists are not arrays.
You can only access elements beginning from first. Think of lists as streams (like a song playing in radio). Lists may be of infinite length (as radio never stops).
Most programmers use "syntactic sugar", which hides the nature of lists behind an easier syntax.
Miranda automatically loads a default library named stdenv.m, which you can study.
Now, let's think about your problem:
You want to ignore ("drop") all elements before the 5th and then get the first element from the rest of the remaining list.
This is expressed in Miranda as:
nth :: num -> [*] -> *
nth n = hd . drop (n-1)
The explicit type declaration shows that the function works with any list (the elements are of the wildcard type *).
Sample:
plist :: [[char]]
plist = ["A","A","A","A","B","A","A","A","A"]
result :: [char]
result = nth 5 plist
If you want to code your functions with error handling, you need techniques to catch that there is no 5th element in your list.
As seen above, one technique is "Maybe". Another is continuations.
A bad technique is to check the length of the list first, because computing the length never terminates on an infinite list.
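The "drop, then take the head" idea, together with the warning about infinite lists, can be sketched in Python with `itertools.islice`. The function name `nth` and the `default` parameter are choices made for this example; returning a default instead of raising is one possible way to handle a missing element.

```python
from itertools import count, islice

def nth(n, items, default=None):
    """Return the n-th element (counting from 1), or default if absent."""
    # islice skips the first n-1 elements and yields at most one more,
    # so only n elements are ever consumed, even from an endless stream.
    return next(islice(items, n - 1, n), default)
```

For instance, `nth(3, count(10))` looks at just three elements of an endless counter, whereas asking for its length first would never return.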

Please walk me through this "Erlang Programming" recursive sample

From page 90 of Erlang Programming by Cesarini and Thomson, there is an example that has no detailed discussion. I'm quite the newbie to functional programming and recursive thinking, so I'm not familiar in solving problems this way.
"For example, the following function merges two lists (of the same length) by interleaving their values:"
merge(Xs,Ys) -> lists:reverse(mergeL(Xs,Ys,[])).
mergeL([X|Xs],Ys,Zs) -> mergeR(Xs,Ys,[X|Zs]);
mergeL([],[],Zs) -> Zs.
mergeR(Xs,[Y|Ys],Zs) -> mergeL(Xs,Ys,[Y|Zs]);
mergeR([],[],Zs) -> Zs.
How does this work? Thanks!
Step through it:
merge([1,2],[3,4])
reverse(mergeL([1,2],[3,4],[]))
reverse(mergeR([2],[3,4],[1]))
reverse(mergeL([2],[4],[3,1]))
reverse(mergeR([], [4], [2,3,1]))
reverse(mergeL([], [], [4,2,3,1]))
reverse([4,2,3,1])
[1,3,2,4]
It's always good to work these functions by hand on a piece of paper with a small input where you're trying to figure it. You'll quickly see how it works.
This function is called first:
merge(Xs,Ys) -> lists:reverse(mergeL(Xs,Ys,[])).
The empty list [] passed to mergeL is the accumulator - this is where the answer will come from. Note that the first function calls mergeL - the left merge.
Let us pretend that this function is called as so:
merge([1, 2, 3], [a, b, c])
Two lists of the same length. This first function then calls mergeL:
mergeL([X|Xs],Ys,Zs) -> mergeR(Xs,Ys,[X|Zs]);
mergeL([],[],Zs) -> Zs.
There are 2 clauses in left merge. The call to mergeL with arguments will match these clauses in top down order.
The second of these clauses has three parameters, the first two of which are empty lists []. However, the first time mergeL is called these two lists aren't empty (they are the lists Xs and Ys), so the first clause matches.
Let's break out the matches. This is the call to mergeL:
mergeL([1, 2, 3], [a, b, c], [])
and it matches the first clause in the following fashion:
X = 1
Xs = [2, 3]
Ys = [a, b, c]
Zs = []
This is because of the special form of the list:
[X | Xs]
This means match X to the head of the list (an individual item) and make Xs the tail of the list (a list).
We then build up the new function call. We can add the value X to the start of the list Zs the same way we pattern matched it out so we get the first mergeR call:
mergeR([2, 3], [a, b, c], [1])
The final argument is a one-item list caused by adding an item at the head of an empty list.
This then zips through until the end.
Actually the final clause of mergeR is redundant when both lists have the same length. In that case the recursion always ends in the final clause of mergeL (but I will leave that as an exercise for the reader).
What the example does is define a few states that the recursion will go through. There are 3 'functions' that are defined:
merge, mergeL and mergeR.
The lists to merge are Xs and Ys, whereas the Zs are the result of the merge.
The merge will start with calling 'merge' and supplying two lists. The first step is to call mergeL with the two lists to merge, and an empty resultset.
[X|Xs] takes the first element of the list (very much like array_shift would). This element is added to the head of the resultset ([X|Zs] does this). This resultset (containing one element now) is then passed to the next call, mergeR. mergeR does the same thing, only it takes an element from the second list. This behaviour will continue as long as the lists fed to mergeL or mergeR are not empty.
When mergeL or mergeR is called with two empty lists ([]) and a resultset (Zs), it will return the resultset (and not do another run, thus stopping the recursion).
Summary:
The start of the recursion is the first line, which defines 'merge'. This start will set the whole thing in motion by calling the first mergeL.
The body of the recursion is lines 2 and 4, which define the behaviour of mergeL and mergeR, which both call each other.
The stop of the recursion is defined by lines 3 and 5, which basically tell the whole thing what to do when there are no more elements in the array.
Hope this helps!
I always look for those functions that will terminate the recursion first, in this case:
mergeL([],[],Zs) -> Zs.
and
mergeR([],[],Zs) -> Zs.
both of those will basically finish the "merging" when the first two parameters are empty lists.
So then I look at the first call of the function:
merge(Xs,Ys) -> lists:reverse(mergeL(Xs,Ys,[])).
Ignoring the reverse for a second, you will see that the last parameter is an empty list. So I'd expect the various mergeL and mergeR functions to move the elements of that array into the final parameter - and when they are all moved the function will basically terminate (although finally calling the reverse function of course)
And that is exactly what the remaining functions do:
mergeL([X|Xs],Ys,Zs) -> mergeR(Xs,Ys,[X|Zs]);
takes the first element of X and puts it into the Z array, and
mergeR(Xs,[Y|Ys],Zs) -> mergeL(Xs,Ys,[Y|Zs]);
takes the first element of Y and puts it into the Z array. The calling of the mergeR from mergeL and vice versa is doing the interleave part.
What's interesting to see (and easy to fix) is that the arrays X and Y must be of the same length or you'll end up calling mergeL or mergeR with an empty array in X or Y - and that won't match either [ X | Xs] or [ Y | Ys].
And the reason for the reverse is simply the relative efficiency of prepending with [ X | Zs] versus appending with Zs ++ [X]. The former is much more efficient.
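If it helps to see the ping-pong between the two functions outside Erlang, here is a rough Python rendering. It is a sketch, not the book's code: the names `merge_l` and `merge_r` are adapted to Python style, and the `ValueError` is an assumption standing in for the error Erlang raises when no clause matches.

```python
def merge(xs, ys):
    """Interleave two equal-length lists, starting with xs."""
    return list(reversed(merge_l(xs, ys, [])))

def merge_l(xs, ys, zs):
    if xs:                                  # mergeL([X|Xs],Ys,Zs)
        return merge_r(xs[1:], ys, [xs[0]] + zs)
    if not ys:                              # mergeL([],[],Zs)
        return zs
    raise ValueError("lists differ in length")  # no clause matches

def merge_r(xs, ys, zs):
    if ys:                                  # mergeR(Xs,[Y|Ys],Zs)
        return merge_l(xs, ys[1:], [ys[0]] + zs)
    if not xs:                              # mergeR([],[],Zs)
        return zs
    raise ValueError("lists differ in length")
```

As in the Erlang version, each step puts one element at the front of `zs`, so the accumulated result is backwards until `merge` reverses it at the very end.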
