How does Gremlin interpret double repeats such as the one described here for the Lowest Common Ancestor algorithm? http://tinkerpop.apache.org/docs/3.2.3-SNAPSHOT/recipes/#_lowest_common_ancestor
It appears this is interpreted as a nested loop inside another loop, i.e. O(n^2), rather than as two independent loops. I would like to verify this behavior. Could I have a detailed explanation of the semantics here?
If this is the behavior, is there a way to break the outer loop on a condition of the inner loop?
That's not a nested repeat() (i.e. one repeat() inside another): the first repeat() ends at the first emit(), and then a new repeat() begins. The traversal is thus saying that it will first traverse out(), emitting every vertex it comes across, and each of those vertices will then traverse in(), emitting only the "D" vertex.
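For illustration only, here is a minimal sketch of that sequential shape (the starting vertex a, the target vertex id d, and the bare out()/in() steps are assumptions for the example, not the recipe's actual steps):

g.V(a).
  repeat(out()).emit().             // first loop: ends at this emit(), emitting every vertex it reaches
  repeat(__.in()).emit(hasId(d))    // second, independent loop: emits only the target vertex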
Nested loops were not supported in 3.2.3 and are actually only possible in the soon-to-be-released 3.4.0, which you can read about here and which looks like this:
gremlin> g.V().repeat(__.in('traverses').repeat(__.in('develops')).emit()).emit().values('name')
==>stephen
==>matthias
==>marko
Here you can see a repeat() actually nested inside another repeat().
I'm trying to understand the semicolon functionality.
I have this code:
del(X,[X|Rest],Rest).
del(X,[Y|Tail],[Y|Rest]) :-
del(X,Tail,Rest).
permutation([],[]).
permutation(L,[X|P]) :- del(X,L,L1), permutation(L1,P).
It's the simple predicate to show all permutations of given list.
I used the built-in graphical debugger in SWI-Prolog because I wanted to understand how it works, and I do understand the first case, which returns the list given as the argument. Here is the diagram I made for better understanding.
But I don't get it for the other solutions. When I press the semicolon it doesn't start at the place where it ended; instead it starts with some deep recursion where L=[] (like in step 9). I don't get it: didn't the recursion end earlier? It had to come back out of the recursion to return the answer, and after the semicolon it's deep in the recursion again.
Could someone clarify that to me? Thanks in advance.
One analogy that I find useful in demystifying Prolog is that Backtracking is like Nested Loops, and when the innermost loop's variables' values are all found, the looping is suspended, the vars' values are reported, and then the looping is resumed.
As an example, let's write down a simple generate-and-test program to find all pairs of natural numbers above 0 that sum up to a prime number. Let's assume is_prime/1 is already given to us.
We write this in Prolog as
above(0, N), between(1, N, M), Sum is M+N, is_prime(Sum).
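(above/2 is not a built-in; a minimal sketch of it, enumerating on backtracking every integer greater than its first argument, could be:)

% above(X, N): N is an integer greater than X, produced as X+1, X+2, ... on backtracking
above(X, N) :- N is X + 1.
above(X, N) :- X1 is X + 1, above(X1, N).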
We write this in an imperative pseudocode as
for N from 1 step 1:
    for M from 1 step 1 until N:
        Sum := M+N
        if is_prime(Sum):
            report_to_user_and_ask(Sum)
Now when report_to_user_and_ask is called, it prints Sum out and asks the user whether to abort or to continue. The loops are not exited; on the contrary, they are just suspended. Thus all the loop variables' values that got us this far -- and there may be more tests up the loops chain that sometimes succeed and sometimes fail -- are preserved, i.e. the computation state is preserved, and the computation is ready to be resumed from that point, if the user presses ;.
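Assuming is_prime/1 and the above/2 sketch given earlier are defined, the toplevel interaction would look roughly like this, with each ; resuming the suspended loops exactly where they left off:

?- above(0, N), between(1, N, M), Sum is M+N, is_prime(Sum).
N = 1, M = 1, Sum = 2 ;
N = 2, M = 1, Sum = 3 ;
N = 3, M = 2, Sum = 5 ;
N = 4, M = 1, Sum = 5 ;
...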
I first saw this in Peter Norvig's AI book's implementation of Prolog in Common Lisp. He used mapping (Common Lisp's mapcan which is concatMap in Haskell or flatMap in many other languages) as a looping construct though, and it took me years to see that nested loops is what it is really all about.
Conjunction of goals is expressed as the nesting of the loops; disjunction of goals is expressed as the alternatives to loop through.
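A toy illustration of the two (the facts are made up for this example):

color(red).  color(green).
size(small). size(big).

item(C, S) :- color(C), size(S).    % conjunction: a loop over sizes nested inside a loop over colors
pick(X)    :- color(X) ; size(X).   % disjunction: first loop through the colors, then through the sizes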
A further twist is that the nested loops' structure isn't fixed from the outset. It is fluid: the nested loops of a given loop can be created depending on the current state of that loop, i.e. depending on the current alternative being explored there; the loops are written as we go. In (most of) the languages where such dynamic creation of nested loops is impossible, it can be encoded with nested recursion / function invocation inside the loops. (Here's one example, with some pseudocode.)
If we keep all such loops (created for each of the alternatives) in memory even after they are finished with, what we get is the AND-OR tree (mentioned in the other answer) thus being created while the search space is being explored and the solutions are found.
(Non-coincidentally, this fluidity is also the essence of "monad"; nondeterminism is modeled by the list monad, and the essential operation of the list monad is the flatMap operation which we saw above. With a fluid structure of loops it is "Monad"; with a fixed structure it is "Applicative Functor"; simple loops with no structure (no nesting at all) are simply "Functor" (the concepts used in Haskell and the like). This also helps to demystify those.)
So, the proper slogan could be Backtracking is like Nested Loops, either fixed, known from the outset, or dynamically-created as we go. It's a bit longer though. :)
Here's also a Prolog example, which "as if creates the code to be run first (N nested loops for a given value of N), and then runs it." (There's even a whole dedicated tag for it on SO, too, it turns out, recursive-backtracking.)
And here's one in Scheme ("creates nested loops with the solution being accessible in the innermost loop's body"), and a C++ example ("create n nested loops at run-time, in effect enumerating the binary encoding of 2^n, and print the sums out from the innermost loop").
There is a big difference between recursion in functional/imperative programming languages and recursion in Prolog (and it really became clear to me only in the last two weeks or so):
In functional/imperative programming, you recurse down a call chain, then come back up, unwinding the stack, then output the result. It's over.
In Prolog, you recurse down an AND-OR tree (really, alternating AND and OR nodes), selecting a predicate to call on an OR node (the "choicepoint"), from left to right, and calling every predicate in turn on an AND node, also from left to right. An acceptable tree has exactly one predicate returning TRUE under each OR node, and all predicates returning TRUE under each AND node. Once an acceptable tree has been constructed, by the very search procedure, we are (i.e. the "search cursor" is) at the rightmost, bottommost node.
Success in constructing an acceptable tree also means a solution to the query entered at the Prolog Toplevel (the REPL) has been found: The variable values are output, but the tree is kept (unless there are no choicepoints).
And this is also important: all variables are global in the sense that if a variable X has been passed all the way down the call chain from predicate to predicate to the rightmost bottommost node, and then constrained at the last possible moment by unifying it with 2, for example, X = 2, then the Prolog Toplevel is aware of that without further ado: nothing needs to be passed back up the call chain.
If you now press ;, the search doesn't restart at the top of the tree, but at the bottom, i.e. at the current cursor position: the nearest parent OR node is asked for more solutions. This may result in much search until a new acceptable tree has been constructed, and we are at a new rightmost, bottommost node. The new variable values are output and you may again enter ;.
This process cycles until no acceptable tree can be constructed any longer, upon which false is output.
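Tying this back to the permutation/2 program from the question: each answer below corresponds to one acceptable tree, and each ; resumes the search at the deepest remaining choicepoint (inside del/3), not at the top of the query:

?- permutation([1,2], P).
P = [1, 2] ;
P = [2, 1] ;
false.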
Note that having this AND-OR tree as an inspectable and modifiable data structure at runtime allows some magical tricks to be deployed.
There is bound to be a lot of power in debugging tools which record this tree to help the user who gets the dreaded sphynxian false from a Prolog program that is supposed to work. There are now Time Traveling Debuggers for functional and imperative languages, after all...
I have vertices [song1, song2, song3, user].
I want to add 'listened' edges from user to the songs.
I have the following:
g.V().is(within(song1, song2, song3)).addE('listened').from(user)
However, I'm getting the following error:
No signature of method: org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal.from() is applicable for argument types: (org.janusgraph.graphdb.vertices.CacheVertex) values: [v[4344]]
Possible solutions: sort(), drop(int), sum(), find(), grep(), sort(groovy.lang.Closure)
Of course, I can iterate through them one at a time instead, but a single query would be nice:
user.addEdge('listened', song1)
user.addEdge('listened', song2)
user.addEdge('listened', song3)
The from() modulator accepts two things:
a step label or
a traversal
A single vertex or a list of vertices can easily be turned into a traversal by wrapping it in V(). Also, note that g.V().is(within(...)) will most likely end up being a full scan over all vertices; it pretty much depends on the provider implementation, but you should prefer to use g.V(<list of vertices>) instead. Thus your traversal should look more like any of these:
g.V().is(within(song1, song2, song3)).
  addE('listened').from(V(user))      // actually bad, as it's potentially a full scan

g.V(song1, song2, song3).
  addE('listened').from(V(user))

g.V(user).as('u').
  V(song1, song2, song3).
  addE('listened').from('u')
polarbear([],H,[H]).
polarbear([H|T],Y,[H|Z]):- polarbear(T,Y,Z).
This is the Prolog code. When entering ?- polarbear([1,2], 6, P). I get P = [1,2,6].
The thing is, I just don't understand how it works, and I've been trying to work out how Prolog is doing what it's doing.
I have some experience with Prolog, but I don't understand this, so any guidance as to how it does what it does, in order to help me understand Prolog, would be greatly appreciated.
The second clause states that the first argument is a list with head H and tail T, and the third argument is a list with head H and tail Z. So it forces (by using unification) the heads of the two lists to be the same. Recursively, the two lists become identical, except that the third-argument list has one more element at the end (the element Y), and that element is supplied by the first clause. Note that the second clause only works for lists with one or more elements. So, as the base of the recursion, when we reach the empty list, the third list, thanks to the first clause, contains just one more element: the element Y.
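One way to see it is to unfold the query step by step; each line below applies one clause, and Z1, Z2 are the fresh variables introduced along the way:

polarbear([1,2], 6, P)        % second clause: P  = [1|Z1]
  polarbear([2], 6, Z1)       % second clause: Z1 = [2|Z2]
    polarbear([], 6, Z2)      % first clause:  Z2 = [6]

% hence P = [1|[2|[6]]] = [1,2,6]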
I have to query a database to find the siblings of children who have at least two other siblings, as well as print the names of their parents.
This is what I've got so far:
queryQuestion3(Year,FatherName,MotherName):-
    family(person(FatherName,_,_,_),
           person(MotherName,_,_,_),
           [person(Name,Surname,date(_,_,Year),_),_,_|_])
    ;   family(person(FatherName,_,_,_),
               person(MotherName,_,_,_),
               [_,person(Name,Surname,date(_,_,Year),_),_|_])
    ;   family(person(FatherName,_,_,_),
               person(MotherName,_,_,_),
               [_,_,person(Name,Surname,date(_,_,Year),_)|_]).
This works, and it gives me the parents' names, but it only covers the first three siblings, and I have to deal with families larger than that without hardcoding.
I can imagine that the answer will use recursion, starting from the first sibling and iterating over them until the last one (the base case) and then moving on to the next family, but I'm new to Prolog and not very sure how to use head and tail recursion effectively to achieve this.
Update #2
Thanks for the reply.
Here is the code now.
queryQuestion3(Year,FatherName,MotherName):-
    family(person(FatherName,_,_,_), person(MotherName,_,_,_), Children),
    member(person(_,_,date(_,_,Year),_), Children).
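If the original "at least two other siblings" requirement still has to hold, one possible refinement (a sketch, assuming the same family/3 structure) is to also check the number of children:

queryQuestion3(Year,FatherName,MotherName):-
    family(person(FatherName,_,_,_), person(MotherName,_,_,_), Children),
    length(Children, N),
    N >= 3,    % the child has at least two other siblings
    member(person(_,_,date(_,_,Year),_), Children).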
After obtaining "natural loops" from a control flow graph of basic blocks, how can these loops be ordered from innermost to outermost, i.e. such that the innermost loop contains no other loops?
I obtained the loops using the dominator method, see the slide titled "Identifying Natural Loops with Dominators" here: http://www.cs.colostate.edu/~mstrout/CS553Fall07/Slides/lecture15-control.pdf
Additionally, what algorithm should be used to traverse the control flow graph such that writing out each node would yield the correct output code?
In a well structured program (i.e. no gotos), the beginning of a loop must dominate the contents of the loop.
Every node which has incoming backedges must be the head of a loop. However, you have some freedom in the actual loop contents thanks to the ability to specify explicit continues. The minimal set of nodes that must be in the loop is all blocks that have a backedge to the head, plus all blocks which are reverse-reachable from them and dominated by the head. The maximal set of nodes that can be in the loop is of course just all nodes dominated by the head.
Nesting is determined by whether the head of one loop is in the contents of another loop. In some cases, you have the freedom to decide whether to place the loop inside the outer loop or not.
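As a sketch (in Python, and not taken from the slides): once every natural loop is represented as its header plus the set of blocks in its body, nesting and an innermost-first order fall out of simple set containment, since a nested loop's body is a strict subset of its parent's body:

def order_loops_innermost_first(loops):
    """loops: dict mapping each loop header to the set of blocks in its body
    (header included). Sorting by body size lists inner loops before the
    loops that contain them."""
    return sorted(loops, key=lambda header: len(loops[header]))

def nests_inside(inner, outer, loops):
    """True if the loop headed by `inner` lies inside the loop headed by `outer`."""
    return inner != outer and inner in loops[outer]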