neo4j: extraction of random graphs - graph

We have a big graph database built with Neo4j which has two types of relationships, "E" and "I".
We would like to extract two graphs from it with a starting node called n0.
The first graph Gxi, based on the "I" relationship, must be obtained randomly.
The following query is wrong, but it shows the idea we want to implement. Here, 10 neighbors are randomly chosen for each node found in the previous step:
MATCH r1:(n0)-[:I]-(n1)
WITH random(n1) LIMIT 10
MATCH r2:(n1)-[:I]-(n2)
WITH random(n2) LIMIT 10*10
MATCH r3:(n2)-[:I]-(n3)
WITH random(n3) LIMIT 10*10*10
MATCH r4:(n3)-[:I]-(n4)
WITH random(n4) LIMIT 10*10*10*10
RETURN r1+r2+r3+r4
Then we would like to create the second graph Gxe based on the relationships "E" and the nodes of Gxi.
Thank you for your help.

APOC Procedures may be able to help here. There are collection functions that can be used to choose random items from a collection, and you can get slices of the collection rather than having to use LIMIT.
The trickier part will actually be collecting the subpaths along the way.
// assume already matched to start node n
MATCH r = (n)-[:I]-()
WITH apoc.coll.randomItems(collect(r), 10) as r1
UNWIND r1 as r
WITH r1, last(nodes(r)) as n
MATCH r = (n)-[:I]-()
WITH r1, apoc.coll.randomItems(collect(r), 10) as r2
UNWIND r2 as r
WITH r1, r2, last(nodes(r)) as n
MATCH r = (n)-[:I]-()
WITH r1, r2, apoc.coll.randomItems(collect(r), 10) as r3
UNWIND r3 as r
WITH r1, r2, r3, last(nodes(r)) as n
MATCH r = (n)-[:I]-()
WITH r1, r2, r3, apoc.coll.randomItems(collect(r), 10) as r4
RETURN r1 + r2 + r3 + r4
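The sampling idea described in the question (pick up to 10 random "I" neighbors per node, level by level, then keep the "E" relationships between the sampled nodes) can be sketched outside the database as follows. This is only an illustration in plain Python, not Cypher and not part of APOC: the adjacency dictionary, the function names `sample_subgraph` and `induced_edges`, and the parameters `fanout` and `depth` are all assumptions made for the example.

```python
import random


def sample_subgraph(adj_i, start, fanout=10, depth=4, rng=None):
    """Random level-by-level expansion over the "I" relationships.

    adj_i maps each node to an iterable of its "I" neighbours (hypothetical
    in-memory stand-in for the graph database).
    Returns (nodes, edges) of the sampled graph Gxi.
    """
    rng = rng or random.Random()
    frontier = [start]
    nodes = {start}
    edges = set()
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            neighbours = sorted(adj_i.get(node, ()))
            # pick at most `fanout` neighbours of this node at random
            chosen = rng.sample(neighbours, min(fanout, len(neighbours)))
            for nb in chosen:
                edges.add(frozenset((node, nb)))  # undirected edge
                nodes.add(nb)
                next_frontier.append(nb)
        frontier = next_frontier
    return nodes, edges


def induced_edges(edge_list, nodes):
    """Keep only the "E" edges whose two endpoints were both sampled (Gxe)."""
    return [(u, v) for (u, v) in edge_list if u in nodes and v in nodes]


if __name__ == "__main__":
    adj_i = {"n0": ["a", "b"], "a": ["n0", "c"], "b": ["n0"], "c": ["a"]}
    e_edges = [("a", "b"), ("c", "z")]
    gxi_nodes, gxi_edges = sample_subgraph(adj_i, "n0", fanout=10, depth=2)
    print(sorted(gxi_nodes))
    print(induced_edges(e_edges, gxi_nodes))
```

Note that the sample is taken per node here, so each level can hold up to 10 times as many nodes as the previous one, which is the behaviour the question asks for.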


The Eight-Queen Puzzle in Programming in Lua Fourth Edition

I'm currently reading Programming in Lua Fourth Edition and I'm already stuck on the first exercise of "Chapter 2. Interlude: The Eight-Queen Puzzle."
The example code is as follows:
N = 8 -- board size

-- check whether position (n, c) is free from attacks
function isplaceok (a, n, c)
  for i = 1, n - 1 do -- for each queen already placed
    if (a[i] == c) or -- same column?
       (a[i] - i == c - n) or -- same diagonal?
       (a[i] + i == c + n) then -- same diagonal?
      return false -- place can be attacked
    end
  end
  return true -- no attacks; place is OK
end

-- print a board
function printsolution (a)
  for i = 1, N do -- for each row
    for j = 1, N do -- and for each column
      -- write "X" or "-" plus a space
      io.write(a[i] == j and "X" or "-", " ")
    end
    io.write("\n")
  end
  io.write("\n")
end

-- add to board 'a' all queens from 'n' to 'N'
function addqueen (a, n)
  if n > N then -- all queens have been placed?
    printsolution(a)
  else -- try to place n-th queen
    for c = 1, N do
      if isplaceok(a, n, c) then
        a[n] = c -- place n-th queen at column 'c'
        addqueen(a, n + 1)
      end
    end
  end
end

-- run the program
addqueen({}, 1)
The code's quite commented and the book's quite explicit, but I can't answer the first question:
Exercise 2.1: Modify the eight-queen program so that it stops after
printing the first solution.
At the end of this program, a contains all possible solutions; I can't figure out whether addqueen (a, n) should be modified so that a contains only one possible solution, or whether printsolution (a) should be modified so that it only prints the first possible solution.
I'm not sure I fully understand backtracking, but I tried to implement both hypotheses without success, so any help would be much appreciated.
"At the end of this program, a contains all possible solutions"
As far as I understand the solution, a never contains all possible solutions; it holds either one complete solution or one incomplete/incorrect one that the algorithm is still working on. The algorithm simply enumerates possible solutions, skipping those that generate conflicts as early as possible (for example, if the first and second queens attack each other, the second queen is moved without checking positions for the other queens, as they wouldn't satisfy the solution anyway).
So, to stop after printing the first solution, you can simply add os.exit() after the printsolution(a) line.
Listing 1 is an alternative way to meet the requirement. The three lines marked (1), (2), and (3) are the modifications to the original implementation from the book, as listed in the question. With these modifications, addqueen returns true when a solution was found, and a then contains that solution.
-- Listing 1
function addqueen (a, n)
  if n > N then -- all queens have been placed?
    return true -- (1)
  else -- try to place n-th queen
    for c = 1, N do
      if isplaceok(a, n, c) then
        a[n] = c -- place n-th queen at column 'c'
        if addqueen(a, n + 1) then return true end -- (2)
      end
    end
    return false -- (3)
  end
end

-- run the program
a = {1}
if not addqueen(a, 2) then print("failed") end
printsolution(a)
a = {1, 4}
if not addqueen(a, 3) then print("failed") end
printsolution(a)
Let me start from Exercise 2.2 in the book, which, based on my experience explaining "backtracking" algorithms to other people, may help to better understand the original implementation and my modifications.
Exercise 2.2 requires generating all possible permutations first. A straightforward and intuitive solution is in Listing 2, which uses nested for-loops to generate all permutations and validates them one by one in the innermost loop. Although it fulfills the requirement of Exercise 2.2, the code does look awkward. It is also hard-coded to solve an 8x8 board.
-- Listing 2
local function allsolutions (a)
  -- generate all possible permutations
  for c1 = 1, N do
    a[1] = c1
    for c2 = 1, N do
      a[2] = c2
      for c3 = 1, N do
        a[3] = c3
        for c4 = 1, N do
          a[4] = c4
          for c5 = 1, N do
            a[5] = c5
            for c6 = 1, N do
              a[6] = c6
              for c7 = 1, N do
                a[7] = c7
                for c8 = 1, N do
                  a[8] = c8
                  -- validate the permutation
                  local valid
                  for r = 2, N do -- start from 2nd row
                    valid = isplaceok(a, r, a[r])
                    if not valid then break end
                  end
                  if valid then printsolution(a) end
                end
              end
            end
          end
        end
      end
    end
  end
end

-- run the program
allsolutions({})
Listing 3 is equivalent to Listing 2 when N = 8. The for-loop in the else branch does what the whole set of nested for-loops in Listing 2 does. Using a recursive call makes the code not only compact but also flexible: it can solve an NxN board and boards with pre-set rows. However, recursive calls can be confusing, so I hope the code in Listing 2 helps.
-- Listing 3
local function addqueen (a, n)
  n = n or 1
  if n > N then
    -- verify the permutation
    local valid
    for r = 2, N do -- start from 2nd row
      valid = isplaceok(a, r, a[r])
      if not valid then break end
    end
    if valid then printsolution(a) end
  else
    -- generate all possible permutations
    for c = 1, N do
      a[n] = c
      addqueen(a, n + 1)
    end
  end
end

-- run the program
addqueen({}) -- empty board, equivalent to allsolutions({})
addqueen({1}, 2) -- a queen in 1st row and 1st column
Comparing the code in Listing 3 with the original implementation, the difference is that Listing 3 validates only after all eight queens have been placed on the board, while the original implementation validates every time a queen is added and does not go on to the next row if the newly added queen causes a conflict. This is all that "backtracking" is about: it does a "brute-force" search, but it abandons a search branch as soon as it finds a node that cannot lead to a solution, and it must reach a leaf of the search tree to confirm a valid solution.
Back to the modifications in Listing 1.
(1) When the function hits this point, it has reached a leaf of the search tree and found a valid solution, so it returns true to signal success.
(2) This is the point where the function stops searching further. In the original implementation, the for-loop continues regardless of what happened in the recursive call. With modification (1) in place, the recursive call returns true if a solution was found; in that case the function must stop and propagate the success signal back to its caller. Otherwise, it continues the for-loop, searching for other possible solutions.
(3) This is the point the function reaches after finishing the for-loop. With modifications (1) and (2) in place, reaching this point means that it failed to find a solution, so it explicitly returns false to signal failure.
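For readers who find it easier to follow outside Lua, the three modifications above can be sketched in Python as follows. This is a rough translation under stated assumptions, not code from the book: rows and columns are 0-based here, the board size is passed as a parameter instead of the global N, and the helper name `first_solution` is made up for the example.

```python
def is_place_ok(a, n, c):
    """Check whether a queen at row n, column c is attacked by rows 0..n-1."""
    for i in range(n):
        if a[i] == c or a[i] - i == c - n or a[i] + i == c + n:
            return False
    return True


def add_queen(a, n, size):
    """Place queens from row n onward; return True as soon as one solution is found."""
    if n >= size:
        return True  # (1) reached a leaf: a holds a complete solution
    for c in range(size):
        if is_place_ok(a, n, c):
            a[n] = c
            if add_queen(a, n + 1, size):
                return True  # (2) stop searching and propagate success
    return False  # (3) no column worked for this row


def first_solution(size=8):
    """Hypothetical convenience wrapper: return the first solution or None."""
    a = [None] * size
    return a if add_queen(a, 0, size) else None


if __name__ == "__main__":
    print(first_solution(8))
```

Because the function returns instead of calling an exit routine, the caller stays in control and can print the board, or try another starting position, after the search finishes.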

MPI convention for index of rows and columns

I am using MPI to solve a PDE. To do this, I break the 2D domain down into cells. Each cell has size "xcell, ycell", with xcell = size_x_domain/(number of X subdomains) and ycell = size_y_domain/(number of Y subdomains).
So I run the code with number of processes = (number of X subdomains)*(number of Y subdomains).
Compared with the sequential version, each subdomain is handled by its own process, and the processes communicate with one another.
Here is a figure illustrating my decomposition with 8 processes (2 subdomains in X and 4 in Y):
(xs,xe) represent x_start and x_end of the cell,
(ys,ye) represent y_start and y_end of the cell
I would like to know whether, in the x(i,j) array, I have to use i as the row index and j as the column index.
Is it a general rule to put the row index first and the column index second (for example in C, Fortran, and Matlab, or maybe other languages)?
Thanks for your help.
I'm not sure, but maybe try having a different flag for all 4 of the communications.
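The decomposition described in the question can be sketched as follows, to make the meaning of (xs,xe) and (ys,ye) concrete. This is only an illustration in plain Python without MPI: the function name `cell_bounds`, the 0-based inclusive bounds, the rank numbering (along y first, then x), and the assumption that the domain sizes divide evenly are all choices made for the example, not something stated in the question.

```python
def cell_bounds(size_x, size_y, x_domains, y_domains):
    """Return {rank: (xs, xe, ys, ye)} for a regular 2D decomposition.

    Assumes size_x is divisible by x_domains and size_y by y_domains.
    """
    xcell = size_x // x_domains  # cell width, as defined in the question
    ycell = size_y // y_domains  # cell height, as defined in the question
    bounds = {}
    for rank in range(x_domains * y_domains):
        # Assumption: ranks are numbered along y first, then along x.
        ix, iy = divmod(rank, y_domains)
        xs, ys = ix * xcell, iy * ycell
        bounds[rank] = (xs, xs + xcell - 1, ys, ys + ycell - 1)
    return bounds


if __name__ == "__main__":
    # 8 processes: 2 subdomains in X and 4 in Y, as in the question
    for rank, b in cell_bounds(8, 16, 2, 4).items():
        print(rank, b)
```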

Read After Write(RAW) HAZARD

I am confused about finding RAW dependencies: do we have to look only at adjacent instructions, or at non-adjacent ones as well?
Consider the following assembly code:
I1: ADD R1, R2, R2;
I2: ADD R3, R2, R1;
I3: SUB R4, R1, R5;
I4: ADD R3, R3, R4;
Find the number of read-after-write (RAW) dependencies in the above code.
Assume ADD x, y, z means x <- y + z.
I am getting 2 dependencies: I2-I1 and I4-I3.
Let us say that after an instruction enters the pipeline, it takes x stages before any register write by that instruction becomes visible to the following instructions.
Then you have to take care of the RAW dependencies among every set of x consecutive instructions. In the worst case, you can take x to be the maximum number of stages in the pipeline.
Now, the case in the question looks like a homework problem, and since the pipeline structure is not defined, you have to look at the RAW dependencies over all the instructions, which in this case are:
I2 and I1 over R1
I3 and I1 over R1
I4 and I2 over R3
I4 and I3 over R4
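The enumeration above can be reproduced mechanically. The following is a minimal sketch, assuming every instruction has the form (label, destination, source1, source2) and that a read depends on the most recent earlier write to the same register; the function name `raw_dependencies` and the optional `max_distance` parameter (a stand-in for the pipeline window discussed above) are hypothetical.

```python
def raw_dependencies(instrs, max_distance=None):
    """List RAW dependencies as (reader, writer, register) tuples.

    instrs: list of (label, dest, src1, src2) tuples in program order.
    max_distance: if given, only consider writers at most that many
    instructions before the reader (hypothetical pipeline window).
    """
    deps = []
    for j, (reader, _dest, *sources) in enumerate(instrs):
        for reg in sorted(set(sources)):
            # walk backwards to the most recent earlier write of reg
            for i in range(j - 1, -1, -1):
                if max_distance is not None and j - i > max_distance:
                    break
                writer, written = instrs[i][0], instrs[i][1]
                if written == reg:
                    deps.append((reader, writer, reg))
                    break
    return deps


if __name__ == "__main__":
    program = [
        ("I1", "R1", "R2", "R2"),
        ("I2", "R3", "R2", "R1"),
        ("I3", "R4", "R1", "R5"),
        ("I4", "R3", "R3", "R4"),
    ]
    for dep in raw_dependencies(program):
        print(dep)
```

With `max_distance=1` the function only reports dependencies between adjacent instructions, which gives the two pairs found in the question.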

Prolog infinite loop

I'm fairly new to Prolog, and I hope this question hasn't already been asked and answered. If it has, I apologize; I can't make sense of any of the other similar questions and answers.
My problem is that I have 3 towns connected by roads. Most roads are one-way, but two of the towns are connected by a two-way street, i.e.
facts:
road(a, b, 1).
road(b, a, 1).
road(b, c, 3).
where a, b and c are towns, and the numbers are the distances.
I need to be able to go from town a to c without getting stuck between a and b.
Up to here I can solve it with the following predicates (where R is the list of towns on the route):
route(A, B, R, N) :-
    road(A, B, N),
    R1 = [B],
    R = [A|R1],
    !.
route(A, B, R, N) :-
    road(A, C, N1),
    route(C, B, R1, N2),
    \+ member(A, R1),
    R = [A | R1],
    N is N1+N2.
However, if I add a town d like so
facts:
road(b, d, 10).
I can't get Prolog to recognize that this is a second possible route. I know that this is because I have used a cut, but without the cut it doesn't stop and ends in a stack overflow.
Furthermore, I will then need to write a new predicate that returns true when R is given as the shortest route between a and c.
Sorry for the long description. I hope someone can help me!
This is a graph traversal problem. I think your problem is that you've got a cyclic graph: you find the leg a-->b, and the next leg you find is b-->a, where it again finds the leg a-->b and ... well, you get the picture.
I would approach the problem like this, using a helper predicate with accumulators to build the route and compute the total distance. Something like this:
% ===========================================================================
% route/4: find the route(s) from Origin to Destination and compute the total distance
%
% This predicate simply invokes the helper predicate with the
% accumulator variables properly seeded.
% ===========================================================================
route(Origin,Destination,Route,Distance) :-
    route(Origin,Destination,[],0,Route,Distance).

% ------------------------------------------------
% route/6: helper predicate that does all the work
% ------------------------------------------------
route(D,D,V,L,R,L) :-                  % special case: you're where you want to be.
    reverse([D|V],R).                  % - reverse the visited list since it gets built in reverse order
                                       % - and unify the length accumulator with the final value.
route(O,D,V,L,Route,Length) :-         % direct connection
    road(O,D,N),                       % - a segment exists connecting origin and destination directly
    L1 is L+N,                         % - increment the length accumulator
    V1 = [O|V],                        % - prepend the current origin to the visited accumulator
    route(D,D,V1,L1,Route,Length).     % - recurse down, indicating that we've arrived at our destination
route(O,D,V,L,Route,Length) :-         % indirect connection
    road(O,X,N),                       % - a segment exists from the current origin to some destination
    X \= D,                            % - that destination is other than the desired destination
    \+ member(X,V),                    % - and we've not yet visited that destination
    L1 is L+N,                         % - increment the length accumulator
    V1 = [O|V],                        % - prepend the current origin to the visited accumulator
    route(X,D,V1,L1,Route,Length).     % - recurse down using the current destination as the new origin.
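The same idea (carry the visited towns along so the search never walks back into the a/b cycle, then pick the shortest of the routes found) can be sketched in Python. This is an illustration only and not a translation of the Prolog above: the dictionary of roads and the function names `routes` and `shortest_route` are assumptions made for the example.

```python
def routes(roads, origin, destination, visited=None, length=0):
    """Yield (route, distance) for every cycle-free route from origin to destination.

    roads maps (from_town, to_town) to a distance; roads are one-way.
    """
    visited = (visited or []) + [origin]
    if origin == destination:
        yield visited, length
        return
    for (start, end), dist in roads.items():
        # only follow roads that leave the current town and avoid visited towns
        if start == origin and end not in visited:
            yield from routes(roads, end, destination, visited, length + dist)


def shortest_route(roads, origin, destination):
    """Return the (route, distance) pair with the smallest distance, or None."""
    found = list(routes(roads, origin, destination))
    return min(found, key=lambda pair: pair[1]) if found else None


if __name__ == "__main__":
    roads = {("a", "b"): 1, ("b", "a"): 1, ("b", "c"): 3, ("b", "d"): 10}
    print(list(routes(roads, "a", "c")))
    print(shortest_route(roads, "a", "d"))
```

Enumerating every route and taking the minimum is fine for a handful of towns; for larger maps a dedicated shortest-path algorithm would be the usual choice.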

Counting and listing motifs in SAGE

The question was correctly answered in http://ask.sagemath.org/question/2612/motifs-and-subgraphs
I'm counting the number of 3-motifs (isomorphism classes of connected 3-node subgraphs) in a random directed network. There are 13 of these. One is, for example, S1={1 -> 2, 2 -> 3} and another is S2={1 -> 2, 2 -> 3, 1 -> 3}: they are two distinct motifs, and I don't want to count an S1 when what I actually found is an S2. The problem is that S1 is contained in S2, hence subgraph_search() finds an S1 in each S2, and all related functions inherit the problem (wrong counting, wrong iterator...).
Any idea how to resolve this issue? Similar things would happen for 4-node motifs and so on... I could remove the occurrences of S2 from the graph after having counted them, but that would be a really awful trick (and dangerous if I also wanted to count 4-node motifs).
The code I used goes like:
import numpy
M1 = DiGraph(numpy.array([[0,1,0],[0,0,1],[0,0,0]])) #first motif
M5 = DiGraph(numpy.array([[0,1,1],[0,0,1],[0,0,0]])) #second motif
g = digraphs.RandomDirectedGNP(20,0.1) #a random network
l1 = []
for p in g.subgraph_search_iterator(M1): #search first motif
    l1.append(p) #make a list of its occurrences
l5 = []
for p in g.subgraph_search_iterator(M5): #the same for the second motif
    l5.append(p)
The trick was to include the option induced=True in the subgraph_search() function, as correctly answered in http://ask.sagemath.org/question/2612/motifs-and-subgraphs .
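To see why an induced search fixes the counting, here is a small brute-force sketch in plain Python that does not use Sage. It is only meant to illustrate the difference between a plain and an induced subgraph match; the function name `find_motif` and its parameters are made up for the example and say nothing about how Sage implements its search.

```python
from itertools import permutations


def find_motif(nodes, edges, motif_edges, k, induced):
    """Return all ordered k-tuples of nodes that match the motif.

    motif_edges uses vertex indices 0..k-1; motif vertex i maps to combo[i].
    With induced=True, the chosen nodes must have exactly the motif's edges
    between them and no extra ones.
    """
    edge_set = set(edges)
    matches = []
    for combo in permutations(nodes, k):
        mapped = {(combo[u], combo[v]) for (u, v) in motif_edges}
        if not mapped <= edge_set:
            continue  # some motif edge is missing in the graph
        if induced:
            present = {(a, b) for a in combo for b in combo
                       if a != b and (a, b) in edge_set}
            if present != mapped:
                continue  # extra edges: this is a different motif
        matches.append(combo)
    return matches


if __name__ == "__main__":
    m1 = [(0, 1), (1, 2)]            # S1: 1 -> 2, 2 -> 3
    m5 = [(0, 1), (0, 2), (1, 2)]    # S2: 1 -> 2, 2 -> 3, 1 -> 3
    nodes, edges = [1, 2, 3], [(1, 2), (2, 3), (1, 3)]
    print(find_motif(nodes, edges, m1, 3, induced=False))
    print(find_motif(nodes, edges, m1, 3, induced=True))
    print(find_motif(nodes, edges, m5, 3, induced=True))
```

On a graph that is exactly one copy of S2, the plain search still reports S1, while the induced search reports S1 zero times and S2 once.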
