Neo4j: create a recursive query/function

☼ Hello !
I want to compute the critical path of an activity list in Neo4j.
For this, I need the Earliest Times (Start and Finish). The Earliest Start of an activity equals the greatest Earliest Finish of its predecessors, and so on.
I already have something that "works", but my problem is that I need the query to call itself recursively. I can write out each level by hand, but I can't do that indefinitely...
[Image: the activity list]
Here is my code :
// LEVEL 1
/****** collect (start.successors) as startSuccessors *****/
MATCH (project:Project)-[:CONTAINS]->(:Activity{tag:'Start'})-[s:ENABLES]->(:Activity)
WHERE ID(project)=toInteger(322)
WITH collect(endNode(s)) AS startSuccessors
/**** foreach node in startSuccessors ****/
UNWIND startSuccessors AS node
/**** collect (node.predecessors) as nodePredecessors ****/
MATCH (activity:Activity)-[p:ENABLES]->(node)
WITH collect(startNode(p)) AS nodePredecessors, node, startSuccessors
/**** foreach activity in nodePredecessors ****/
UNWIND nodePredecessors AS activity
/**** IF (node.ES is null OR node.ES < activity.EF) ****/
WITH node, activity, startSuccessors,(node.ES = 0) AS cond1, (node.ES < activity.EF) AS cond2
MERGE (activity)-[:ENABLES]->(node)
ON MATCH SET node.ES =
CASE
/**if**/ WHEN cond1 OR cond2
/**node.ES = activity.EF**/ THEN activity.EF
END
ON MATCH SET node.EF = node.ES + node.ET
// LEVEL 2
/**T.O.D.O. : loop for each node in startSuccessors and their nodes **/
WITH startSuccessors
UNWIND startSuccessors AS node
MERGE (node)-[s2:ENABLES]->(successor:Activity)
WITH collect(successor) AS nodeSuccessors,node
UNWIND nodeSuccessors AS successor
CREATE UNIQUE (act:Activity)-[p2:ENABLES]->(successor)
WITH successor, node,act, (successor.ES = 0) AS cond3, (successor.ES < act.EF) AS cond4
MERGE (act)-[p2:ENABLES]->(successor)
ON MATCH SET successor.ES =
CASE
/**if**/ WHEN cond3 OR cond4
/**node.ES = activity.EF**/ THEN act.EF
END
ON MATCH SET successor.EF = successor.ES + successor.ET
Here is the result:
[Image: Earliest Times query result]
The second problem is that if I rerun the query, the ES and EF properties disappear (proof below).
[Image: problem when rerunning the query]
To repair this, I have to run this query:
MATCH (p:Project) WHERE ID(p)=322
MATCH (p)-[:CONTAINS]->(one:Activity{tag:'one'}),(p)-[:CONTAINS]->(zrht:Activity{tag:'zrht'}),(p)-[:CONTAINS]->(ore:Activity{tag:'ore'}),(p)-[:CONTAINS]->(bam:Activity{tag:'bam'}),(p)-[:CONTAINS]->(two:Activity{tag:'two'})
SET one.EF = 0,one.ES = 0,one.LF=0,one.LS=0,zrht.EF = 0,zrht.ES = 0,zrht.LF=0,zrht.LS=0,ore.EF = 0,ore.ES = 0,ore.LF=0,ore.LS=0,bam.EF = 0,bam.ES = 0,bam.LF=0,bam.LS=0,two.EF = 0,two.ES = 0,two.LF=0,two.LS=0
This JavaScript code does what I want to achieve.
Thank you very much for your help.

☼ I finally found what I was looking for: Project Management with Neo4j
I hope this helps others find it more quickly ;)
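For readers who want to see the rule from the question outside the database, here is a minimal sketch of the forward pass in plain JavaScript. It is not the linked solution and not Cypher. It assumes the activities form an acyclic graph, and the function name `earliestTimes` and the sample data are hypothetical. Activities are processed in topological order, so every predecessor's EF is known before a successor's ES is computed, which removes the need to write out each level by hand.

```javascript
// Forward pass of the critical path method on a small in-memory graph.
// `activities` maps a tag to its duration (ET); `edges` lists
// [predecessor, successor] pairs (the ENABLES relationships).
function earliestTimes(activities, edges) {
  const preds = {};
  const succs = {};
  const remaining = {}; // number of unprocessed predecessors per activity
  for (const tag of Object.keys(activities)) {
    preds[tag] = [];
    succs[tag] = [];
    remaining[tag] = 0;
  }
  for (const [from, to] of edges) {
    preds[to].push(from);
    succs[from].push(to);
    remaining[to] += 1;
  }
  const result = {};
  // Begin with activities that have no predecessors (Kahn's topological order).
  const queue = Object.keys(activities).filter((tag) => remaining[tag] === 0);
  while (queue.length > 0) {
    const tag = queue.shift();
    // ES = greatest EF among the predecessors, or 0 when there are none.
    const es = preds[tag].reduce((max, p) => Math.max(max, result[p].EF), 0);
    result[tag] = { ES: es, EF: es + activities[tag] };
    for (const next of succs[tag]) {
      remaining[next] -= 1;
      if (remaining[next] === 0) queue.push(next);
    }
  }
  return result;
}

// Hypothetical sample data, not the project from the question.
const times = earliestTimes(
  { Start: 0, a: 3, b: 2, c: 4, End: 0 },
  [["Start", "a"], ["Start", "b"], ["a", "c"], ["b", "c"], ["c", "End"]]
);
console.log(times);
```

The same ordering idea carries over to a graph database: a node's ES can only be set once all of its predecessors have their EF.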

Related

Cannot get Realm result for objects filtered by the latest (nsdate) value of a property of a collection property swift (the example is clearer)

I have the following model:
class Process: Object {
@objc dynamic var processID: Int = 1
let steps = List<Step>()
}
class Step: Object {
@objc private dynamic var stepCode: Int = 0
@objc dynamic var stepDateUTC: Date? = nil
var stepType: ProcessStepType {
get {
return ProcessStepType(rawValue: stepCode) ?? .created
}
set {
stepCode = newValue.rawValue
}
}
}
enum ProcessStepType: Int { // to review - real value
case created = 0
case scheduled = 1
case processing = 2
case paused = 3
case finished = 4
}
A process can start, be processing, be paused, resume (returning to the processing step), pause, resume again, and so on. The current step is the one with the latest stepDateUTC.
I am trying to get all Processes whose last step has stepType processing, i.e. where stepCode is 2 for the latest stepDateUTC.
I came up with the following predicate, which doesn't work. Any idea of the right way to perform such a query?
My best attempt is the one below. Is it possible to get this result with a single Realm query?
let processes = realm.objects(Process.self).filter(NSPredicate(format: "ANY steps.stepCode = 2 AND NOT (ANY steps.stepCode = 4)"))
let ongoingprocesses = processes.filter(){$0.steps.sorted(byKeyPath: "stepDateUTC", ascending: false).first!.stepType == .processing}
What I hoped would work:
NSPredicate(format: "steps[LAST].stepCode = \(ProcessStepType.processing.rawValue)")
I understand [LAST] is not supported by Realm (as per the cheatsheet), but is there any way I could achieve my goal through a Realm query?
There are a few ways to approach this and it doesn't appear the date property is relevant because lists are stored in sequential order (as long as they are not altered), so the last element in the List was added last.
This first piece of code will filter for processes where the last element is 'processing'. I coded this long-handed so the flow is more understandable.
let results = realm.objects(Process.self).filter { p in
    // A process without any steps cannot be 'processing'
    guard let step = p.steps.last else {
        return false
    }
    let type = step.stepType
    if type == .processing {
        return true
    }
    return false
}
Note that Realm objects are lazily loaded, which means thousands of objects have a low memory impact. Filtering with a Swift closure, however, loads the objects into memory, so the impact is more significant.
The second piece of code is what I would suggest as it makes filtering much simpler, but would require a slight change to the Process model.
class Process: Object {
    @objc dynamic var processID: Int = 1
    let stepHistory = List<Step>() //RENAMED: the history of the steps
    @objc dynamic var name = ""
    //ADDED: new property tracks current step
    @objc dynamic var current_step = ProcessStepType.created.index
}
My thought here is that the Process model keeps a 'history' of steps that have occurred so far, and then what the current_step is.
I also modified the ProcessStepType enum to make it more filter-friendly.
enum ProcessStepType: Int { // to review - real value
case created = 0
case scheduled = 1
case processing = 2
case paused = 3
case finished = 4
//this is used when filtering
var index: Int {
switch self {
case .created:
return 0
case .scheduled:
return 1
case .processing:
return 2
case .paused:
return 3
case .finished:
return 4
}
}
}
Then, to return all processes where the last step in the list is 'processing', here's the filter:
let results2 = realm.objects(Process.self).filter("current_step == %@", ProcessStepType.processing.index)
The final thought is to add some code to the Process model so when a step is added to the list, the current_step var is also updated. Coding that is left to the OP.
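As a rough illustration of that last idea, here is a plain JavaScript sketch (deliberately not Realm or Swift code) of a model where a single method appends to the history and updates the current step, so the two cannot drift apart. The method name `addStep` and the property name `currentStep` are hypothetical.

```javascript
// Plain JavaScript sketch of the pattern, not Realm code.
const ProcessStepType = Object.freeze({
  created: 0,
  scheduled: 1,
  processing: 2,
  paused: 3,
  finished: 4
});

class Process {
  constructor(processID) {
    this.processID = processID;
    this.stepHistory = []; // every step that has occurred so far
    this.currentStep = ProcessStepType.created; // mirrors the latest step
  }

  // Hypothetical helper: the only place where steps are added.
  addStep(stepCode, stepDateUTC = new Date()) {
    this.stepHistory.push({ stepCode, stepDateUTC });
    this.currentStep = stepCode;
  }
}

const p = new Process(1);
p.addStep(ProcessStepType.processing);
p.addStep(ProcessStepType.paused);
p.addStep(ProcessStepType.processing);

// Filtering on the single property replaces the "last element" lookup.
const ongoing = [p].filter((proc) => proc.currentStep === ProcessStepType.processing);
console.log(ongoing.length);
```

In Realm, the equivalent update of the list and the property would have to happen inside the same write transaction.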

Gremlin Scala Neo4j: Search for node, add new node and edge between

I am unable to find a node via a key and then add a new node and an edge between them. With the movie nodes already in the graph, I use:
case class GraphBuilder(movieId: Int, personId: Int)
//
val Ident = Key[String]("personId")
val ItemId = Key[String]("movieId")
//
def applyToGraph(it: GraphBuilder): Unit = {
val thisPerson = graph + ("Person", Ident -> it.personId.asInstanceOf[String])
val movies = graph.V.hasLabel("Movie").has(ItemId, it.movieId)
movies.headOption match {
case Some(v) =>
v --- "likedBy" --> thisPerson // tested with println("yay" + v)
case None => println("youre a failure")
}
graph.tx.commit()
}
But each time I run this programmatically, it correctly adds the person to the graph via thisPerson val, correctly finds the movie vertex based on the movieId, but does not create the "likedBy" edge. I have also tried without pattern matching on the option but that does not work either.
What is the best way to find a node, add a node, and add an edge between the added node and the found one?
I'm a bit confused by the syntax in your snippet, but since you have to have the identifiers for both vertices, the following query should do the trick:
g.V().has("Movie", "movieId", movieId).as("m").
V().has("Person", "personId", personId).
addE("likedBy").to("m").iterate()
If the person vertex doesn't exist yet:
g.V().has("Movie", "movieId", movieId).as("m").
addV("Person").property("personId", personId).
addE("likedBy").to("m").iterate()
And if you don't know whether the person vertex already exists or not:
g.V().has("Movie", "movieId", movieId).as("m").
coalesce(
V().has("Person", "personId", personId),
addV("Person").property("personId", personId)).
addE("likedBy").to("m").iterate()

Crossfilter grouping filtered keys

I have some JSON, for example:
data = [
{"name":"Bob","age":"20"},
{"name":"Jo","age":"21"},
{"name":"Jo","age":"22"},
{"name":"Nick","age":"23"}
]
Next, I use crossfilter, create dimension and filter it:
let ndx = crossfilter(data);
let dim = ndx.dimension(d => d.name).filter(d => d !== "Jo");
// try to get filtered values
let filtered = dim.top(Infinity); // -> returns 2 values where 'name' != 'Jo'
//"name":"Bob","age":"20"
//"name":"Nick","age":"23"
let myGroup = dim.group(d => {
if(d === 'Jo') {
// Why do we get here? These values should already be filtered out
}
})
How can I filter my dimension so that these values don't reach 'dim.group'?
Not sure what version you are using, but in the current version of Crossfilter, when a new group is created all records are first added to the group and then filtered records are removed. So the group accessor will be run at least once for all records.
Why do we do this? Because for certain types of grouping logic, it is important for the group to "see" a full picture of all records that are in scope.
It is possible that the group accessor is run over all records (even filtered ones) anyway in order to build the group index, but I don't remember.
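If the goal is only to keep certain keys out of the grouped output, one possible workaround is to wrap the group and drop those keys when reading it, instead of trying to stop the accessor from seeing them. The sketch below assumes the group exposes all() returning {key, value} objects, as Crossfilter groups do. The helper name `removeKeys` is hypothetical, and a stub with made-up values stands in for a real group so that the snippet runs on its own.

```javascript
// Wraps a group-like object and hides entries whose key fails `keep`.
function removeKeys(sourceGroup, keep) {
  return {
    all: function () {
      return sourceGroup.all().filter(function (d) {
        return keep(d.key);
      });
    }
  };
}

// Stub standing in for a real Crossfilter group (illustration only).
const stubGroup = {
  all: () => [
    { key: "Bob", value: 1 },
    { key: "Jo", value: 2 },
    { key: "Nick", value: 1 }
  ]
};

const withoutJo = removeKeys(stubGroup, (key) => key !== "Jo");
console.log(withoutJo.all().map((d) => d.key));
```

The wrapper leaves the underlying group untouched, so it still "sees" every record, and only the code that reads the group works with the reduced list.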

Remove nodes which are single or have 2nd degree visjs

I have a network graph.
I have some connected nodes and, as you can see, most of them have only one connected node, that is, their degree is 1. I'd like to remove such nodes to clear the clutter. I have been unable to find out how for the last 2 days, and there are no such helper functions in the visjs documentation. I would appreciate help.
I believe the algorithm suggested by the 1st answer -by macramole- (before updates) would actually hide the non-connected nodes (degree 0), instead of the ones with degree 1.
I would probably just iterate over all the edges in the network while keeping 'degree' counters for each node that is an endpoint in the edge you are visiting (you can obtain these nodes by grabbing the edge.from and edge.to values, as shown above). You would increment the degree counter for a node, whenever the node is 'hit' in this search through the edges.
Eventually you'll end up with the degree value for each node in the network, at which point you can decide which ones to hide.
Updating this answer now to include my suggested code (note: nodes and edges are vis DataSet instances):
Example code:
var nodeToDegrees = {}; // maps node ids to degrees
// Count one degree for each endpoint of every edge
edges.forEach(function (edge) {
    nodeToDegrees[edge.from] = (nodeToDegrees[edge.from] || 0) + 1;
    nodeToDegrees[edge.to] = (nodeToDegrees[edge.to] || 0) + 1;
});
// Collect the degree-1 nodes first, then hide them in a single update
var updates = [];
nodes.forEach(function (node) {
    if (nodeToDegrees[node.id] === 1) updates.push({id: node.id, hidden: true});
});
nodes.update(updates);
This might work:
var DEGREES_HIDDEN = 1;
for ( var node of nodes ) {
node.cantLinks = 0;
for ( var link of links ) {
if ( link.from == node.id || link.to == node.id ) {
node.cantLinks++;
}
}
}
for ( var node of nodes ) {
if ( node.cantLinks <= DEGREES_HIDDEN ) {
node.hidden = true;
}
}
Nodes and links are arrays, not vis.DataSet instances; I create the latter after doing that.
It doesn't look very nice performance-wise, but it does get the job done. Hope you find it useful.

Why is my query so slow?

I am trying to tune my query, but I have no idea what I can change:
A screenshot of both tables: http://abload.de/image.php?img=1plkyg.jpg
The relation is: 1 UserPM (a private message) has 1 sender (User, SenderID -> User.UserID) and 1 recipient (User, RecieverID -> User.UserID), and 1 User has X UserPMs as recipient and X UserPMs as sender.
The initial load takes around 200 ms; it only takes the first 20 rows and displays them. After these are displayed, a JavaScript PageMethod calls the GetAllPMsAsReciepient method and loads the rest of the data.
This GetAllPMsAsReciepient method takes around 4.5 to 5.0 seconds on each run for around 250 rows.
My code:
public static List<UserPM> GetAllPMsAsReciepient(Guid userID)
{
using (RPGDataContext dc = new RPGDataContext())
{
DateTime dt = DateTime.Now;
DataLoadOptions options = new DataLoadOptions();
//options.LoadWith<UserPM>(a => a.User);
options.LoadWith<UserPM>(a => a.User1);
dc.LoadOptions = options;
List<UserPM> pm = (
from a in dc.UserPMs
where a.RecieverID == userID
&& !a.IsDeletedRec
orderby a.Timestamp descending select a
).ToList();
TimeSpan ts = DateTime.Now - dt;
System.Diagnostics.Debug.WriteLine(ts.TotalMilliseconds + " ms");
return pm;
}
}
I have no idea how to tune this query. 250 PMs are nothing at all; other inboxes on other websites hold around 5000 messages and don't even need a second to load...
I tried to set an index on Timestamp to reduce the ORDER BY time, but nothing has changed so far.
Any ideas here?
EDIT
I tried to reproduce it in LINQPad:
Without the DataLoadOptions, the query needs 300 ms in LINQPad; with DataLoadOptions, around 1 second.
So, that means:
I could save around 60% of the time if I could avoid loading the User table within this query, but how?
Why does LINQPad need only 1 second on the same connection, from the same computer, where my code needs 4.5-5.0 seconds?
Here is the execution plan: http://abload.de/image.php?img=54rjwq.jpg
Here is the SQL LINQPad gives me:
SELECT [t0].[PMID], [t0].[Text], [t0].[RecieverID], [t0].[SenderID], [t0].[Title], [t0].[Timestamp], [t0].[IsDeletedRec], [t0].[IsRead], [t0].[IsDeletedSender], [t0].[IsAnswered], [t1].[UserID], [t1].[Username], [t1].[Password], [t1].[Email], [t1].[RegisterDate], [t1].[LastLogin], [t1].[RegisterIP], [t1].[RefreshPing], [t1].[Admin], [t1].[IsDeleted], [t1].[DeletedFrom], [t1].[IsBanned], [t1].[BannedReason], [t1].[BannedFrom], [t1].[BannedAt], [t1].[NowPlay], [t1].[AcceptAGB], [t1].[AcceptRules], [t1].[MainProfile], [t1].[SetShowHTMLEditorInRPGPosts], [t1].[Age], [t1].[SetIsAgePublic], [t1].[City], [t1].[SetIsCityShown], [t1].[Verified], [t1].[Design], [t1].[SetRPGCountPublic], [t1].[SetLastLoginPublic], [t1].[SetRegisterDatePublic], [t1].[SetGBActive], [t1].[Gender], [t1].[IsGenderVisible], [t1].[OnlinelistHidden], [t1].[Birthday], [t1].[SetIsMenuHideable], [t1].[SetColorButtons], [t1].[SetIsAboutMePublic], [t1].[Name], [t1].[SetIsNamePublic], [t1].[ContactAnimexx], [t1].[ContactRPGLand], [t1].[ContactSkype], [t1].[ContactICQ], [t1].[ContactDeviantArt], [t1].[ContactFacebook], [t1].[ContactTwitter], [t1].[ContactTumblr], [t1].[IsContactAnimexxPublic], [t1].[IsContactRPGLandPublic], [t1].[IsContactSkypePublic], [t1].[IsContactICQPublic], [t1].[IsContactDeviantArtPublic], [t1].[IsContactFacebookPublic], [t1].[IsContactTwitterPublic], [t1].[IsContactTumblrPublic], [t1].[IsAdult], [t1].[IsShoutboxVisible], [t1].[Notification], [t1].[ShowTutorial], [t1].[MainProfilePreview], [t1].[SetSound], [t1].[EmailNotification], [t1].[UsernameOld], [t1].[UsernameChangeDate]
FROM [UserPM] AS [t0]
INNER JOIN [User] AS [t1] ON [t1].[UserID] = [t0].[RecieverID]
WHERE ([t0].[RecieverID] = @p0) AND (NOT ([t0].[IsDeletedRec] = 1))
ORDER BY [t0].[Timestamp] DESC
If you want to get rid of the LoadWith, you can select your fields explicitly:
public static List<Tuple<UserPM, User> > GetAllPMsAsReciepient(Guid userID)
{
using (var dataContext = new RPGDataContext())
{
return (
from a in dataContext.UserPMs
where a.RecieverID == userID
&& !a.IsDeletedRec
orderby a.Timestamp descending
select Tuple.Create(a, a.User1)
).ToList();
}
}
I found a solution:
First, something seems to be wrong with the DataLoadOptions; second, it is not clever to load a table with 30 columns when you only need 1.
To solve this, I created a view that covers all the necessary fields and, of course, the join.
It reduced the time from 5.0 seconds to 230 ms!
