Drools, graph traversal, query to find root nodes - graph

I have a Java-side class with essential behaviour like:
declare Datum
description: String
broader: List <Datum>
narrower: List <Datum>
end
I want to write
query rootDatumsFor(Datum datum)
that provides a list of the root datums - that is, work "up" the broader property and return a list of each datum that has an empty broader list.
I am getting totally confused how to write this - mainly because of the negation involved.
I think I want something like
query rootDatumsFor( Datum datum )
not Datum() from $datum.broader
or
rootDatumsFor( $datum.broader )
end
But I am getting confused on both parts. If there are no broader terms, which the not should detect, how do I "return" the current value of $datum? I feel each part wants a $result and I want to do a $result: $datum but that isn't valid.
And I'm not certain how to do the recursion. Should I have rootDatumsFor(datum, result) and do it via binding?
I've seen examples that do things like Datum( this == $datum ) but that doesn't seem to be accepted when I try it.
Any assistance, whilst I keep re-reading the documentation to find a little clue how to proceed, would be much appreciated.

To find all Datum facts with an empty broader list, all you have to do is
query rootDatumsFor( Datum $datum )
$datum: Datum( broader.size() == 0 )
end
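The question also asks for the roots reachable from one particular datum by working "up" the broader property. That walk can be sketched outside Drools; the following is plain Python rather than DRL, the Datum class only mirrors the declared fields, and root_datums_for plus the sample data are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # keep identity-based equality; avoids recursive comparisons
class Datum:
    description: str
    broader: List["Datum"] = field(default_factory=list)
    narrower: List["Datum"] = field(default_factory=list)


def root_datums_for(datum: Datum) -> List[Datum]:
    """Walk 'up' the broader links and collect every datum whose broader list is empty."""
    roots: List[Datum] = []
    seen = set()          # ids of datums already visited (guards against cycles)
    stack = [datum]
    while stack:
        current = stack.pop()
        if id(current) in seen:
            continue
        seen.add(id(current))
        if not current.broader:
            roots.append(current)          # no broader terms: this is a root
        else:
            stack.extend(current.broader)  # keep climbing
    return roots


# Hypothetical sample data: puppy -> dog -> {animal, pet}
animal = Datum("animal")
pet = Datum("pet")
dog = Datum("dog", broader=[animal, pet])
puppy = Datum("puppy", broader=[dog])
```

A datum with an empty broader list is its own root here, which matches the "or" in the attempted query: either there is nothing broader, or the search recurses on each broader term.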

Gremlin continue traversal only if 2 vertices are not the same

I have a query which looks at 2 different vertices and I want to stop traversing if they don't both roll up to the same root ancestor via a path of "contains" edges.
g.V('node1')
.until(hasLabel('root')).repeat(in('contains')).as('node1Root')
.V('node2')
.until(hasLabel('root')).repeat(in('contains')).as('node2Root')
//FILTER|WHERE clause
I'd like to confirm that node1Root and node2root are the same vertex before continuing the traversal, but for the life of me I cannot figure out how to do this.
I've tried the following:
g.V('node1')
.until(hasLabel('root')).repeat(in('contains')).as('node1Root')
.V('node2')
.until(hasLabel('root')).repeat(in('contains')).as('node2Root')
//.where('node1Root', P.eq('node2Root'))
//.where(select("node1Root").is(P.eq("node2Root")))
//.where(select("node1Root").is("node2Root"))
What's interesting is that the following query does work to filter appropriately.
g.V('node1').as('1')
.V('node2').as('2')
.where('1', P.eq('2'))
I'm not sure if there's something up with the until/repeat that screws it up or if I'm just doing something blatantly wrong. Any help would be much appreciated.
Thanks!
I found "How to check equality with nodes from an earlier part of query in Gremlin?"
and it seems that you use "as" with the same key as the previous "as", and if they match it is considered equal.
So here's the winner (I think):
g.V('node1')
.until(hasLabel('root')).repeat(in('contains')).as('node1Root')
.V('node2')
.until(hasLabel('root')).repeat(in('contains')).as('node2Root')
.where(select('node1Root').as('node2Root'))
//.not(select('node1Root').as('node2Root')) //OR this to determine they aren't the same
//continue traversal
I also found that my original issue was that the .until().repeat() steps could return a LIST, but in my case I know that my graph model will always return a single 'root' so to make it work, I can use 'unfold'
g.V('node1')
.until(hasLabel('root')).repeat(in('contains')).unfold().as('node1Root')
.V('node2')
.until(hasLabel('root')).repeat(in('contains')).unfold().as('node2Root')
.where('node1Root', P.eq('node2Root'))
I think I'll be going with the second solution because I'm much more confident in it, unless I hear otherwise.
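For anyone who wants to see the intended check without the Gremlin step semantics in the way, here is a rough Python sketch of the same idea: climb the "contains" edges from each vertex until a root is reached, then compare the two roots. The contained_by map, the vertex ids and the helper names are all made up for the illustration, and it assumes every vertex has at most one container.

```python
from typing import Dict, Optional

# Hypothetical graph: each vertex id maps to the vertex that 'contains' it;
# a root has no container and maps to None.
contained_by: Dict[str, Optional[str]] = {
    "root1": None,
    "root2": None,
    "folderA": "root1",
    "node1": "folderA",
    "node2": "root1",
    "node3": "root2",
}


def root_of(vertex: str) -> str:
    """Follow 'contains' edges upward until a vertex with no container is reached."""
    while contained_by[vertex] is not None:
        vertex = contained_by[vertex]
    return vertex


def same_root(a: str, b: str) -> bool:
    """True when both vertices roll up to the same root ancestor."""
    return root_of(a) == root_of(b)
```

With this data, node1 and node2 share root1 while node3 sits under root2, so only the first pair would be allowed to continue.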
You can try this gremlin query
g.V(node1-id)
.map(until(hasLabel('root')).repeat(in().aggregate('x')).cap('x')).as("array")
.V(node2-id)
.until(
as("i").select("array").unfold().as("j")
.where("i", eq("j"))
).repeat(in())
Here we first put all the vertices on the path from node1 to the root into an array, and then check whether each vertex on the way up from node2 exists in that array.
This query only works for a traversal with a single iteration, because the aggregate step collects into a variable that is global to the traversal, which means the same array is shared by every iteration. To fix this, if you are running on the JVM, use lambdas/Groovy closures:
g.V(node-start-id-1,node-start-id-2)
.map(
{ x->
var v = x.get()
var g = getGraph().get().traversal();
g.V(v.id()).until(hasLabel('root')).repeat(in().aggregate('x')).cap('x').next()
}
)
.as("array")
.V(node2-id)
.until(
as("i").select("array").unfold().as("j")
.where("i", eq("j"))
).repeat(in())
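The logic of that query (collect the vertices above the first node, then climb from the second node until a collected vertex is hit) can be sketched in Python as below. This is only an illustration of the idea, not Gremlin; parent_of, the vertex ids and the function names are hypothetical, and each vertex is assumed to have a single parent.

```python
from typing import Dict, Optional, Set

# Hypothetical graph: child id -> parent id (None for a root).
parent_of: Dict[str, Optional[str]] = {
    "root": None,
    "a": "root",
    "b": "a",
    "node1": "b",
    "node2": "a",
    "other_root": None,
    "node3": "other_root",
}


def ancestors(vertex: str) -> Set[str]:
    """Collect every vertex on the path from `vertex` up to its root (excluding itself)."""
    found: Set[str] = set()
    current = parent_of.get(vertex)
    while current is not None:
        found.add(current)
        current = parent_of.get(current)
    return found


def first_shared_ancestor(first: str, second: str) -> Optional[str]:
    """Climb from `second` until a vertex collected from `first` is reached."""
    collected = ancestors(first)
    current: Optional[str] = second
    while current is not None:
        if current in collected:
            return current
        current = parent_of.get(current)
    return None  # the two vertices do not share any ancestor
```

Because the collected set is built per call rather than shared, this version does not have the global-variable problem described for aggregate.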

Ordering of nodes after in memory update with mem:insert-child() is inconsistent

I am using Ryan Dew's xquery module (https://github.com/ryanjdew/XQuery-XML-Memory-Operations) with MarkLogic v8.0-5 to perform in-memory updates to a document.
The following code sample replicates the problem I experience when I use the insert-child() operation. Elements inserted with insert-child() should reflect the insertion order, but I find they do not. In other words, the elements should be ordered one, two, three, four, five in the output. Instead, the order appears to be random.
let $x := <root><original-container><one/><two/><three/><four/><five/></original-container><new-container/></root>
let $tx := mem:copy($x)
let $_ :=
for $item in $x/original-container/*
return (
xdmp:set( $tx, mem:delete($tx, $item) ) ,
xdmp:set( $tx, mem:insert-child($tx, $x/new-container, $item) )
)
return mem:execute( $tx )
I may be misusing the code library--in particular, I am creating a transaction map and then resetting it with xdmp:set() after each insert-child() operation. Perhaps this is not correct?
I apologize, as this was a bug in my code. An expression was returning the nodes in "document order" and across documents that can appear random. I've committed a fix and more details can be found at the ticket on GitHub.

Filter using edges / vertices

I am trying to filter nodes:
user = g.v(42);
g.idx('comparisons')[[id:Neo4jTokens.QUERY_HEADER + '*']]
.filter{
if (it.out('COMPARED_VALUE1').in('VOTED').in('VOTES').next().equals(user))
{
return true;
}
return false;
}.count();
I don't really understand how pipes work, but I understand that the next() breaks something in the filter "loop".
I should get 2 results, but I get none.
Regards,
I might need to amend my answer as I could require more specifics on what you are trying to achieve (as @Michael also requested), but if you think your problem is with next(), then consider the following:
user = g.v(42);
g.idx('comparisons')[[id:Neo4jTokens.QUERY_HEADER + '*']]
.filter{it.out('COMPARED_VALUE1').in('VOTED').in('VOTES').next().equals(user)}.count();
First, note above that your filter closure can immediately reduce to that (which will yield the same error, of course). Given that filter closure you are assuming that a user vertex will come out of the pipeline when you next(). That may not be the case. As such, I would re-write the filter closure as:
user = g.v(42);
g.idx('comparisons')[[id:Neo4jTokens.QUERY_HEADER + '*']].filter{
def p = it.out('COMPARED_VALUE1').in('VOTED').in('VOTES')
p.hasNext() ? p.next().equals(user) : false
}.count();
That should likely solve your problem right there, given the assumption that you only need to evaluate the first item in the pipeline p, which is effectively what you were doing before. I wonder if you couldn't simply use the except/retain pattern here to get your answer, as it is a bit less convoluted:
user = g.v(42);
g.idx('comparisons')[[id:Neo4jTokens.QUERY_HEADER + '*']]
.out('COMPARED_VALUE1').in('VOTED').in('VOTES').retain([user])
.count();
Hopefully something here puts you on the right track to your answer.
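The "check before you call next()" idea is not specific to Gremlin. As a loose analogy in Python (not Gremlin code; the function names are invented for the example), an iterator plays the role of the pipeline and next() with a default plays the role of the hasNext() guard:

```python
from typing import Iterable, Iterator


def first_equals(pipeline: Iterator[str], target: str) -> bool:
    """Compare only the first emitted item; an empty pipeline counts as 'no match'."""
    first = next(pipeline, None)  # the default avoids StopIteration when nothing is emitted
    return first is not None and first == target


def any_equals(pipeline: Iterable[str], target: str) -> bool:
    """Looser check in the spirit of retain: match if any emitted item equals the target."""
    return any(item == target for item in pipeline)
```

The two helpers differ when the target is not the first item: first_equals(iter(["x", "u"]), "u") is False, whereas any_equals(["x", "u"], "u") is True, which is worth keeping in mind when choosing between the guarded filter and the retain version.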
What do you want to achieve?
Sorry my gremlin knowledge is close to zero these days.
In cypher it would probably look like this
START user=node(42), comp=node:comparisons("id:*")
MATCH comp-[:COMPARED_VALUE1]->()<-[:VOTED*2]-(user)
RETURN count(*)

How to suppress function return

Suppose I have a function that has multiple returned values (shown below). However, this output is not informative as users do not know what each value stands for unless they look up the function definition. So I would like to use println() to print the results with appropriate names to the screen, while suppressing the actual returned values from being printed on the screen. In R, the function invisible() does that, but how do you do the same thing in Julia?
function trimci(x::Array; tr=0.2, alpha=0.05, nullvalue=0)
se=sqrt(winvar(x,tr=tr))./((1-2.*tr)*sqrt(length(x)))
ci=cell(2)
df=length(x)-2.*floor(tr.*length(x))-1
ci=[tmean(x, tr=tr)-qt(1-alpha./2, df).*se, tmean(x, tr=tr)+qt(1-alpha./2, df).*se]
test=(tmean(x,tr=tr)-nullvalue)./se
sig=2.*(1-pt(abs(test),df))
return ci, tmean(x, tr=tr), test, se, sig
end
In addition to what Harlan and Stefan said, let me share an example from the ODBC.jl package (source here).
One of my favorite features of Julia over other languages is how dead simple it is to create custom types (and without performance issues either!). Here's a custom type, Metadata, that simply holds several fields of data that describe an executed query. This doesn't necessarily need its own type, but it makes it more convenient passing all this data between functions as well as allowing custom formatting of its output by overloading the Base.show() function.
type Metadata
querystring::String
cols::Int
rows::Int
colnames::Array{ASCIIString}
coltypes::Array{(String,Int16)}
colsizes::Array{Int}
coldigits::Array{Int16}
colnulls::Array{Int16}
end
function show(io::IO,meta::Metadata)
if meta == null_meta
print(io,"No metadata")
else
println(io,"Resultset metadata for executed query")
println(io,"------------------------------------")
println(io,"Columns: $(meta.cols)")
println(io,"Rows: $(meta.rows)")
println(io,"Column Names: $(meta.colnames)")
println(io,"Column Types: $(meta.coltypes)")
println(io,"Column Sizes: $(meta.colsizes)")
println(io,"Column Digits: $(meta.coldigits)")
println(io,"Column Nullable: $(meta.colnulls)")
print(io,"Query: $(meta.querystring)")
end
end
Again, nothing fancy, but illustrates how easy it really is to define a custom type and produce custom output along with it.
Cheers.
One thing you could do would be to define a new type for the return value for this function, call it TrimCIResult or something. Then you could define appropriate methods to show that object in the REPL. Or you may be able to generalize that solution with a type hierarchy that could be used for storing the results from and displaying any statistical test.
The value nothing is how you return a value that won't print: the REPL specifically checks for the value nothing and prints nothing if that's the value returned by an expression. What you're looking to do is to return a bunch of values and not print them, which strikes me as rather odd. If a function returns some stuff, I want to know about it – having the REPL lie to users seems like a bad idea. Harlan's suggestion would work though: define a type for this value with the values you don't want to expose to the user as fields and customize its printing so that the fields you don't want to show people aren't printed.
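The "return a result type and customize how it prints" pattern carries over to other languages. Here is a small sketch of it in Python rather than Julia, just to show the shape of the idea; TrimCIResult follows the name suggested above, the field names follow the values returned by trimci, and the numbers are made up.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrimCIResult:
    ci: Tuple[float, float]  # confidence interval (lower, upper)
    tmean: float             # trimmed mean
    test: float              # test statistic
    se: float                # standard error
    sig: float               # p-value

    def __repr__(self) -> str:
        # Labelled output, shown whenever the object is echoed at the prompt.
        low, high = self.ci
        return (
            "Trimmed mean test\n"
            f"  ci:    [{low:.3f}, {high:.3f}]\n"
            f"  tmean: {self.tmean:.3f}\n"
            f"  test:  {self.test:.3f}\n"
            f"  se:    {self.se:.3f}\n"
            f"  sig:   {self.sig:.3f}"
        )


# Made-up values, only to show the labelled display.
result = TrimCIResult(ci=(1.0, 2.0), tmean=1.5, test=3.0, se=0.5, sig=0.01)
print(result)
```

The caller still gets every value as a named field (result.se, result.ci and so on), so nothing is hidden; only the display changes.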

Get nth element of a collection in Cypher

Using Cypher 1.8, there are some functions working on collections and returning a single element:
HEAD( expression ):
START a=node(2)
RETURN a.array, head(a.array)
LAST( expression ):
START a=node(2)
RETURN a.array, last(a.array)
However, I could not find a function to return the nth element of a collection. What am I missing?
There's no good way to do that at the moment. Please submit a feature request at https://github.com/neo4j/neo4j
I've seen people do head(tail(tail(tail(coll)))), and while it's probably acceptably fast, it still makes me a little ill to see in a query, especially if you're talking about the 17th element or worse.
Example:
http://console.neo4j.org/r/bbo6o4
Update:
Here's a way to do it using reduce and range. It makes it so you can give a parameter for nth at least, even though it still makes me cringe:
start n=node(*)
with collect(n) as allnodes
return head(reduce(acc=allnodes, x in range(1,3): tail(acc)));
http://console.neo4j.org/r/8erfup
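For readers who find the reduce trick hard to parse, this is the same idea written in Python (an illustration only, not Cypher; tail and nth are names picked for the example): apply tail n times, then take the head.

```python
from functools import reduce
from typing import List, TypeVar

T = TypeVar("T")


def tail(items: List[T]) -> List[T]:
    """Everything except the first element."""
    return items[1:]


def nth(items: List[T], n: int) -> T:
    """Drop the head n times, then return the head of what is left (n is 0-based)."""
    remaining = reduce(lambda acc, _: tail(acc), range(n), items)
    return remaining[0]
```

The Cypher query uses range(1,3), which is inclusive at both ends, so tail is applied three times there; that corresponds to nth(items, 3) in this sketch.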
Update 2 (8/31/2013):
The new collection syntax is now merged into 2.0 and will theoretically be a part of M05! So, you'll be able to do:
start n=node(*)
with collect(n) as allnodes
return allnodes[3]; // or slices, like [1..3]
I'll add a link to the snapshot documentation when it gets updated.
I've just come across this old question, and for the benefit of anyone else recently coming across it... it seems the list support has improved.
From the Cypher 4 list docs:
Cypher has comprehensive support for lists.
^ Sidenote: I think that's list comprehensions pun? ;-)
They go on to give an example showing how you'd access the n'th element of a list:
To access individual elements in the list, we use the square brackets again. This will extract from the start index and up to but not including the end index.
... we’ll use the range function. It gives you a list containing all numbers between given start and end numbers. Range is inclusive in both ends.
RETURN range(0, 10)[3]
^ returns "3"
Currently, with the release of APOC Procedures 3.3.0.2 you can use aggregation functions.
This way, you can do things like:
create (:Node {node_id : 1}),
(:Node {node_id : 2}),
(:Node {node_id : 3});
match(n:Node)
with n order by n.node_id
// returns {"node_id":2}
return apoc.agg.nth(n, 1);
or:
match(n:Node)
with n order by n.node_id
// returns {"node_id":1}
// you can also use apoc.agg.last
return apoc.agg.first(n);
To work with lists, UNWIND the list first:
with ['first', 'second', 'third'] as list
unwind list as value
// returns 'second'
return apoc.agg.nth(value, 1);
