I'm looking over interview questions, and I came across "How do you find out if a linked-list has an end? (i.e. the list is not a cycle)." It gives a solution (traverse it one and two nodes at a time, and see if the pointers are ever equal).
Couldn't we just keep the pointer that we start at and see if while traversing it, we ever hit that pointer again? Or will that not work?
That will not work: the linked list may contain a cycle that does not include the first pointer.
Keep in mind that a node in a linked list can be linked to by more than one other node!
Couldn't we just keep the pointer that we start at and see if
while traversing it, we ever hit that pointer again?
No. Consider a list whose last node links back to some node after the head: you will just go around the loop forever without ever hitting the start node of the list.
Another way to find if there is a loop:
If you reverse the list and remember the initial node, you will know that there is a cycle if you get back to the first node. While efficient, this solution modifies the list and is not suited for multithreaded applications.
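A minimal sketch of the reversal idea in Python, assuming a simple `Node` class with a `next` field (the class and function names here are illustrative, not from the original answer). It reverses the list a second time to restore the links, which is my addition, and it is still unsafe if another thread reads the list in the meantime.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def reverse(head):
    """Reverse the links in place and return the last node visited."""
    prev = None
    cur = head
    while cur is not None:
        nxt = cur.next
        cur.next = prev
        prev = cur
        cur = nxt
    return prev


def has_cycle_by_reversal(head):
    # An empty list or a single node pointing nowhere cannot contain a cycle.
    if head is None or head.next is None:
        return False
    # With a cycle, the reversal walks around the loop and back out to the
    # head; without one, it stops at the old tail.
    last = reverse(head)
    # Reverse again from where we ended up to put the links back.
    reverse(last)
    return last is head
```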
Take two pointers: one should traverse the list one node at a time, and the other should traverse the list two nodes at a time.
If at any point they meet (both pointers refer to the same node), the list contains a cycle and does not have an end. If the faster pointer reaches the end of the list first, there is no cycle.
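The two-pointer check could look roughly like this in Python. The `Node` class and the `build` helper are assumptions made for illustration only.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def has_cycle(head):
    """Return True if following .next from head never reaches the end."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next          # one node at a time
        fast = fast.next.next     # two nodes at a time
        if slow is fast:
            return True           # the pointers met inside a loop
    return False                  # fast fell off the end: the list terminates


def build(values, loop_to=None):
    """Hypothetical helper: build a list; optionally link the tail to index loop_to."""
    nodes = [Node(v) for v in values]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    if loop_to is not None and nodes:
        nodes[-1].next = nodes[loop_to]
    return nodes[0] if nodes else None
```

Note that `build([1, 2, 3, 4], loop_to=1)` creates exactly the case where the cycle does not include the first node, so comparing against the start pointer would never terminate there, while `has_cycle` still detects the loop.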
I'm experiencing issues querying a large graph involving repeat steps that aim at making "hops" across vertices and edges. My intention is to infer indirect relationships between objects. Consider the following:
John--livesIn-->Paris
Paris--isIn-->France
What I expect to come up with is that John is based in France. Simple enough, and this works great with a small data set.
The query that I use is the following, where I make no more than 2 hops:
g.V().has('name','John')
.emit(loops().is(lt(2)))
.repeat(__.bothE().bothV().simplePath())
.inE('isIn').outV().path()
This works as expected until I apply it to a graph made of about 1000 vertices and 3000 edges. Then, after a few minutes, I get various kinds of errors (over the REST API) with no clear logic:
Error: Error encountered evaluating script
Error: 504 Gateway Time-out
Error: Java heap space
Error
I suspect that I am doing something wrong in my query. For example, when setting the number of "hops" to 1 (direct relationship) with .emit(loops().is(lt(1))), I would expect the results to be delivered swiftly since it would not go into the repeat loop. However, this triggers the same issue.
Many thanks for your help!
Olivier
So it looks like you have a few things going on here. First let me take a shot at answering your question then let's look at why your traversal may be taking a long time to complete.
Based on your description of wanting to return John and France the following traversal should get your data:
g.V().has('name','John').as('person')
.out('livesIn')
.out('isIn').as('country').select('person', 'country')
That will select all countries that a person named 'John' lives in.
Now to understand why your traversal was taking a long time. First, you are using several steps that are very memory and resource intensive, such as bothE and bothV. Each of these steps navigates the relationship in both directions. Since you know that the direction of the edge you are trying to traverse is out in both cases, it is much quicker and less resource intensive to just use an out step, as this will traverse the specified edge label (if supplied) and end on the adjacent vertex. Additionally, the simplePath step is another resource (specifically memory) intensive step, as it must track the path for each traverser until that path contains a repeated object, at which point the traverser is dropped. This, combined with the extra traversers created by the usage of loops, bothE and bothV, is likely the cause of the slow query. I suspect that the query above will perform significantly better.
If you would like to see exactly what your query is doing, I would suggest taking a look at the explain and profile steps, which provide detailed information on your query's performance.
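To get a feel for why direction matters, here is a toy model in plain Python (not Gremlin, and not how a graph database executes traversals). The edge list, the extra "Mary" and "Lyon" rows, and the helper names are all made up for illustration. It compares a directed hop with a hop that follows every incident edge and emits both endpoints.

```python
# (source, label, destination); the last two rows are hypothetical extra data.
EDGES = [
    ("John", "livesIn", "Paris"),
    ("Paris", "isIn", "France"),
    ("Mary", "livesIn", "Paris"),
    ("Lyon", "isIn", "France"),
]


def out(vertices, label):
    """Follow only outgoing edges with the given label."""
    return [dst for v in vertices
            for (src, lbl, dst) in EDGES
            if src == v and lbl == label]


def both_e_both_v(vertices):
    """Follow every incident edge and emit both of its endpoints."""
    result = []
    for v in vertices:
        for (src, _lbl, dst) in EDGES:
            if v in (src, dst):
                result.extend([src, dst])
    return result


directed = out(out(["John"], "livesIn"), "isIn")
undirected = both_e_both_v(both_e_both_v(["John"]))
print(directed, len(undirected))
```

Even on four edges the undirected version produces several times as many intermediate results after two hops, and the gap grows quickly with vertex degree.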
I (almost) fully understand the Zipper data structure for trees. However, in some publications I saw hints that it is also possible to use the Zipper idea to create immutable functional data structure for arbitrary graphs (which might have cycles as well).
What's the way to do it?
As soon as we have cycles, it means that any node can be reached via several paths. Hence, if I focus on a node, do some change to it, and move the focus away, I might later on come back to the same node via a different path, which means that it would be an 'old' version of the node, prior to the change made.
The only solution I came up with is to include to the context the list of changes to any node. Every time before the focus is changed to node X, it should be checked whether X is the member of the list of changes, and if so, it should be taken as the focused node.
If we also track the number of times N node X was copied from the list of changes, we can remove X from the list of changes, as soon as N = number of edges, inward to X.
Is there any better way to do it?
I'm thinking it's probably possible if I have a count variable that keeps track of the number of records in the list, but even then I can't just jump to element[count] at the tail the way I could in an array.
This is really befuddling me; any help would be appreciated.
One thought is to keep track of a tail pointer as you build the linked list. The other way is to allow building the linked list only by inserting nodes at the beginning; this way the first node you insert always remains the tail, so you can keep track of it.
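A minimal sketch of the tail-pointer approach in Python, assuming a singly linked list. The class and method names are illustrative.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None   # always points at the last node, or None if empty
        self.count = 0

    def append(self, value):
        """Add at the end in O(1) by using the stored tail pointer."""
        node = Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            self.tail.next = node
            self.tail = node
        self.count += 1

    def prepend(self, value):
        """Add at the front; the tail only changes if the list was empty."""
        node = Node(value)
        node.next = self.head
        self.head = node
        if self.tail is None:
            self.tail = node
        self.count += 1
```

With this layout, reaching the last element is just `lst.tail`, with no traversal and no need to index by `count`.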
I am trying to implement a Minesweeper solver in Lisp. I know this is not a rare problem, but I didn't find any article that can help me with it. At the start I have a minefield as input, with numbers on the uncovered fields. The algorithm should finish when all mines are found. So, in every step I have to check which fields I can put in my list of mined fields, and choose one field from my list of non-mined fields and open it. Afterwards I check whether my list of mined fields is complete, and if so the algorithm is done. I would appreciate any help. I am not asking for source code, but I need good ideas. I am not experienced with this kind of problem.
I HAVE to use the A* algorithm. And I don't need to open all unopened fields... I need to find the positions of all mined fields. And of course it has to be the SHORTEST path to do that. When I find the positions of all mined fields, the algorithm is finished. So, once more, I need to find all mined fields with the optimal number of opened fields. And of course I need a heuristic for my algorithm that will help to choose one of all the safe unopened fields.
And that list of safe unopened fields needs to be determined after every opening. So I need to call the main function; that function will check whether I have found all mined fields and, if not, all safe adjacent unopened fields need to be added to the list of paths. Then the path with the best heuristic will be chosen.
I did implement a minesweeper solver in my first year at the University so I can give you some tips. (This is not using A* algorithm)
Important - Not all positions are solvable.
Backtracking over the whole minefield is a bit complicated for advanced difficulties (complicated = takes some time; consider all the possibilities for placing 100 mines in a 30x30 field).
You can solve everything locally, in the same way a human solves the minesweeper. The potential of this is to give the users a hint how to continue instead of solving everything.
Example:
Have a separate mine field where you do the solving
Find all the unsolved cells that have a solved (number or known mine) cell close enough (within a distance of 2 cells).
For every such cell, take a 5x5 neighborhood with the cell in the center, find every possibility (by backtracking) and check whether the possibilities have something in common (mines or non-mines). If they do, you can flag the mines and uncover the non-mines.
Repeat while you can uncover something.
When you cannot uncover anything and the number of remaining mines is small enough, you can try backtracking over the whole field.
I hope I remember it correctly, I did some proofs why the 5x5 area is enough to check but it was almost 10 years ago.
You do not need the A* algorithm; its purpose is to find the shortest path in a graph (such as the shortest path between two places in a map, or the smallest amount of moves that will solve a puzzle). You will probably want to use a technique that is known as backtracking.
As long as there are unopened fields, pick an unopened field that is next to an open field, and tentatively flag it as a mine. Then, look at an unopened field that is adjacent to the previous one as well as to an opened field, and flag that one as a mine too, if this doesn't contradict the adjacent numbers - if it does, flag it as safe instead. Continue. Eventually, you will have looked at all unopened fields that surround the current area and have found one possible way of flagging the fields as safe or unsafe. However, this was based on several guesses, so now you need to go back to the last field where you made a guess and then make the opposite guess and then move forwards again to get another possible flag combination. Then, go even further back, revise your guesses, and so on. This can be implemented quite neatly with recursion. Eventually, you will have a collection of possible flag combinations. If you can find a field that is safe in all possible flag combinations, open that field. Otherwise, pick a field that is safe in as many flag combinations as possible.
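The recursive guessing described above might be sketched like this in Python. It is a simplified illustration under stated assumptions: `numbers` maps each opened cell to its displayed count, `frontier` is assumed to contain every unopened neighbour of those cells, and all function names are made up here.

```python
from itertools import product

OFFSETS = [d for d in product((-1, 0, 1), repeat=2) if d != (0, 0)]


def neighbors(cell):
    r, c = cell
    return [(r + dr, c + dc) for dr, dc in OFFSETS]


def flag_combinations(numbers, frontier):
    """Enumerate every mine/safe labelling of frontier that agrees with numbers."""
    frontier = list(frontier)
    frontier_set = set(frontier)
    found = []

    def still_possible(assign):
        for cell, n in numbers.items():
            adj = [x for x in neighbors(cell) if x in frontier_set]
            mines = sum(1 for x in adj if assign.get(x) is True)
            undecided = sum(1 for x in adj if x not in assign)
            # Too many mines already, or too few cells left to reach n.
            if mines > n or mines + undecided < n:
                return False
        return True

    def search(i, assign):
        if not still_possible(assign):
            return                      # contradiction: go back to the last guess
        if i == len(frontier):
            found.append(dict(assign))  # one complete, consistent combination
            return
        cell = frontier[i]
        for guess in (True, False):     # guess "mine", then the opposite guess
            assign[cell] = guess
            search(i + 1, assign)
            del assign[cell]

    search(0, {})
    return found


def always_safe(combos, frontier):
    """Cells that are safe in every consistent combination."""
    return [c for c in frontier
            if combos and all(not combo[c] for combo in combos)]
```

For the classic 1-2-1 pattern, `flag_combinations({(1, 0): 1, (1, 1): 2, (1, 2): 1}, [(0, 0), (0, 1), (0, 2)])` leaves a single combination, and `always_safe` reports the middle cell `(0, 1)` as safe to open. When no cell is safe in every combination, counting how often each cell is safe across `combos` gives the "safe in as many combinations as possible" choice.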
We want to use Riak's Links to create a doubly linked list.
The algorithm for it is quite simple, I believe:
Let 'N0' be the new element to insert
Get the head of the list, including its 'next' link (N1)
Set the 'previous' of N1 to be N0.
Set the 'next' of N0 to be N1
Set the 'next' of the head of the list to be N0.
The problem that we have is that there is an obvious race condition here, because if 2 concurrent clients get the head of the list, one of the items will likely be 'lost'. Any way to avoid that?
Riak is an eventually consistent system when talking about CAP theorem.
Provided you set the bucket property allow_mult=true, if two concurrent clients get the head of the list and then write, you will have sibling records. On your next read you'll receive multiple values (siblings) and will then have to resolve the conflict and write the result. Given that we don't have any sort of atomicity, this will possibly lead to additional conflicts under heavy write concurrency as you attempt to update the linked objects. Not impossible to resolve, but definitely tricky.
You're probably better off simply serializing the entire list into a single object. This makes your conflict resolution much simpler.
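As a rough illustration of why a single serialized object is easier to reconcile, here is a Python sketch that merges sibling values. It deliberately uses no Riak client calls: the siblings are just JSON strings, and `resolve_siblings` and its merge policy (union, first-seen order, no handling of deletions) are assumptions you would replace with your own rules.

```python
import json


def resolve_siblings(sibling_payloads):
    """Merge JSON-encoded lists into one list, keeping first-seen order."""
    merged = []
    seen = set()
    for payload in sibling_payloads:
        for item in json.loads(payload):
            if item not in seen:   # items are assumed hashable, e.g. string keys
                seen.add(item)
                merged.append(item)
    return json.dumps(merged)


# Two clients each inserted a new element at the head of the same ["a", "b"] list.
siblings = ['["x", "a", "b"]', '["y", "a", "b"]']
print(resolve_siblings(siblings))
```

Neither insert is lost after the merge, and the resolution touches one object instead of several linked ones. The resulting order of concurrently inserted items depends on the policy you choose.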