Neo4j: How to use APOC apoc.algo.cover procedure?

Hi,
I am trying to use the "cover" procedure from APOC like this:
WITH ["f1","f2",...] as list1
MATCH (n:Frag)
WHERE n.frag in list1
WITH COLLECT(ID(n)) as nodeIds
CALL apoc.algo.cover(nodeIds)
YIELD rel
RETURN rel
It works, but it is very slow the first time. If I run it again, it becomes much quicker! What does that mean?

Your issue is probably not related to apoc.algo.cover itself, but to the WHERE clause of your query. You can try to improve performance by adding an index on the Frag.frag property.
CREATE INDEX ON :Frag(frag)
After creating the index, run your query again. Note that the index is not available immediately; it is built in the background.
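As a rough analogy only (plain Python, not Neo4j internals), filtering on n.frag without an index resembles checking every node, while an index resembles a prebuilt mapping from value to node ids. The names nodes, scan_lookup, build_index, and index_lookup are hypothetical and exist only for this sketch.

```python
# Toy analogy: label scan vs. index lookup. This is not Neo4j code.
from collections import defaultdict

# Hypothetical stand-in for (:Frag) nodes: (node id, frag value) pairs.
nodes = [(i, f"f{i}") for i in range(100_000)]


def scan_lookup(nodes, wanted):
    """Check every node, like a label scan followed by a filter."""
    wanted = set(wanted)
    return [node_id for node_id, frag in nodes if frag in wanted]


def build_index(nodes):
    """Build a frag -> node ids mapping once, like creating an index."""
    index = defaultdict(list)
    for node_id, frag in nodes:
        index[frag].append(node_id)
    return index


def index_lookup(index, wanted):
    """Touch only the entries for the requested values."""
    return [node_id for frag in wanted for node_id in index.get(frag, [])]


index = build_index(nodes)
wanted = ["f1", "f2", "f99999"]
# Both strategies return the same ids; only the amount of work differs.
same_result = sorted(scan_lookup(nodes, wanted)) == sorted(index_lookup(index, wanted))
```

The scan touches all 100,000 entries per query, while the lookup touches three. Whether Neo4j actually uses the index for a given query can be checked with EXPLAIN or PROFILE.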

Related

Gremlin Java replace old vertex properties with new property

I am trying to use Gremlin Java to replace a vertex's properties like this:
graph.V(1).properties().drop().property("foo", "bar").iterate()
However, this only removes the properties; it doesn't add the new property.
I think iterate() should probably come immediately after drop(), but I am connecting to a remote graph database, so I want to reduce the number of queries.
I hope there is some way to achieve this:
graph.V(1).properties().drop().property("foo", "bar")
.V(2).properties().drop().property("foo", "bar").....iterate()
Yes, you can definitely do that, check out the live example: https://gremlify.com/27
You can select the vertex and see its properties, or run g.V().valueMap(true), and then run:
g.V().sideEffect(properties().drop()).property("foo", "boo");
to see the new properties applied to it.
(For simplicity, I assumed a single vertex in the graph, but this applies to any vertex.)
The answer by @gremlify is correct. Alternatively, you can use the union() step:
g.V().union(
properties().drop(),
property('foo', 'boo')
);
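Why the original chain adds nothing can be illustrated with a toy model in Python. This is only a sketch of traverser flow, not TinkerPop code: it assumes that drop() removes what it receives and passes nothing downstream, and that sideEffect() runs its inner traversal and then passes the original vertex through. All function names here are hypothetical.

```python
# Toy model of traverser flow with generators. This is not TinkerPop code.

def properties(vertices):
    """Emit one (vertex, key) traverser per property."""
    for v in vertices:
        for key in list(v["props"]):  # snapshot keys so deletion is safe
            yield v, key


def drop(prop_traversers):
    """Remove each incoming property and emit nothing downstream."""
    for v, key in prop_traversers:
        del v["props"][key]
    yield from ()  # no traversers survive this step


def set_property(traversers, key, value):
    """Set a property on every vertex that reaches this step."""
    for v in traversers:
        v["props"][key] = value
        yield v


def side_effect(vertices, inner):
    """Run `inner` for each vertex, then pass the vertex through unchanged."""
    for v in vertices:
        for _ in inner([v]):
            pass
        yield v


# properties().drop().property(...): nothing reaches set_property.
v1 = {"id": 1, "props": {"old": "x"}}
chained = list(set_property(drop(properties([v1])), "foo", "bar"))

# sideEffect(properties().drop()).property(...): the vertex passes through.
v2 = {"id": 2, "props": {"old": "x"}}
wrapped = list(
    set_property(side_effect([v2], lambda vs: drop(properties(vs))), "foo", "bar")
)
```

In this model v1 ends up with no properties at all, while v2 ends up with only the new one.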

Xquery optimization

I have the following XQuery:
declare variable $i := doc()/some-element/modifier[empty(modifier-value)];
$i[1]/../..;
I need to run this query in MarkLogic's Query Console, where we have 721170811 records. Since that is a huge number of records, I am getting a timeout error. Is there any way I can optimize this query to get the result?
P.S. I cannot ask the admin to increase the timeout.
Try creating an element range index (or a path range index if the target element is not unique) and using a cts:values() lexicon lookup.
That way, the request can read the values from the range index instead of having to read each document.
See:
http://docs.marklogic.com/guide/search-dev/lexicon
You could use xdmp:spawn. Create a library module that runs the query, gets the documents, and iterates over the result, collecting 1000 documents per iteration and calling another xdmp:spawn to process the information from each batch. I would suggest summarizing the result so that it returns only the information you need, to avoid crashing the browser. In the end it should look something like this:
xdmp:spawn("process.xqy")
and in the library process.xqy:
declare function local:start-process() {
  let $docs := (....)
  (: $start and $end are the bounds of the current batch; they are not defined in this sketch :)
  let $temp :=
    for $x in $docs[$start to $end]
    return local:process-dataset($x) (: Could use spawn here too if you want :)
  return xdmp:spawn("collect.xqy", $temp)
};
local:start-process()
The compact-data function should create a file or a set of files with your data. This way the server runs the whole process, and in a few minutes you will be able to see your data without problems.
You don't want to run something like doc() or xdmp:directory; it just returns a result set that will kill you every time. You need to shrink your result set by a lot.
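The batching idea (process 1000 items at a time and keep only a small summary of each batch) can be sketched in Python, independent of MarkLogic. The helpers batches and summarize are hypothetical, and range(2500) merely stands in for the documents.

```python
# Sketch of fixed-size batching with per-batch summaries. Not MarkLogic code.
from itertools import islice


def batches(items, size=1000):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def summarize(batch):
    """Hypothetical reduction: keep only the few values that are needed."""
    return {"count": len(batch), "first": batch[0], "last": batch[-1]}


# Only one batch is held in memory at a time; the summaries stay small.
summaries = [summarize(b) for b in batches(range(2500), size=1000)]
```

In the spawn-based approach each batch would be handed to a separate task rather than processed in a loop, but the slicing logic is the same.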
A few thoughts:
You want as much work as possible done in MarkLogic's d-node and as little as possible in the e-node. This is a big over-generalization, but for the most part I think of d-node work as data, indexes, lexicons, and so on, while the e-node handles XQuery and such. In your example, you're definitely working the e-node harder than you need to.
You're going to want to use cts:search, as it uses indexes, not XPath, to resolve your query. So, something like this:
declare variable $i := cts:search(fn:collection(),
cts:element-query(xs:QName("some-element"),
cts:element-value-query(xs:QName("modifier"), "", "exact")
)
)[1];
This will return document nodes, which looks like what you wanted with $i[1]/../.. in your query. It searches within some-element for a modifier that is empty.
Create an element range index and an attribute range index, and use cts:search. If you are familiar with MarkLogic, the query will be easy to write.

In tensorflow, can you define your own collection name?

I searched all the resources in TensorFlow's API documentation and couldn't find any indication.
It seems that when using get_variable(), I can pass a specific name for the collections argument, like:
x=tf.get_variable('x',[2,2],collections='my_scope')
but I get only an empty list when doing:
tf.get_collection('my_scope')
The collections argument takes a list of collection names:
>>> x = tf.get_variable('x', [2,2], collections=['my_scope'])
>>> tf.get_collection('my_scope')
[<tensorflow.python.ops.variables.Variable at 0x10d8e1590>]
Watch out: using it can have side effects on other operations. For example, tf.all_variables() will not return your variable, and thus tf.initialize_all_variables() will not see it either. One way to fix this is to specify the default collection too:
>>> x = tf.get_variable('x', [2,2], collections=['my_scope', tf.GraphKeys.VARIABLES])
but things start to get tedious.
Actually, you can use tf.get_collection to create a new collection:
tf.get_collection('my_collection')
var = tf.get_variable('var', [2, 2], initializer=tf.constant_initializer())
tf.add_to_collection('my_collection', var)

How to dynamically search/replace text with update in XQuery (exist-db)

My intention is to somehow clean source files automatically. How to do that in XQuery? (I am not interested in reconstructing the document in memory and storing it as a new one.) It is quite easy to do something similar in case of short and simple elements addressed directly, however, I can’t figure out how to do that dynamically for all the text nodes, if possible.
I would expect something like this could work:
update replace $div[contains(., 'chapter')] with replace(., 'chapter', 'Chapter')
This throws err:XPDY0002 Undefined context sequence for 'self::node()' [source: String]
Apparently, there is a problem in addressing the context with . in the replacing function. But maybe I don’t understand the update thing in general. I am only inspired by the bottom of this article.
The expression to the right of with is evaluated independently of the expression to the left, so an explicit node/context is needed on both sides:
update replace $div[contains(., 'chapter')] with replace($div, 'chapter', 'Chapter')

avoiding XDMP-EXPNTREECACHEFULL and loading document

I am using MarkLogic 4 and have some 15,000 documents (each around 10 KB). I want to load all of the content as one document (converting all the documents into a single CSV file and writing it to the HTTP output stream for download). I load the documents this way:
let $uri := cts:uri-match('products/documents/*.xml')
let $doc := fn:doc ($uri)
The URI pattern matches some 15,000 XML files, so fn:doc throws the error XDMP-EXPNTREECACHEFULL.
Is there any workaround for this? I cannot increase the tree cache size in the admin console, because the number of XML files in products/documents/*.xml may increase.
Thanks.
When you want to export large quantities of XML from MarkLogic, the best technique is to write the query so that results can stream, avoiding the expanded tree cache entirely. It is a very different style of coding, though: you'll have to avoid strong typing of any kind, and refactor your code to remove FLWOR expressions. You won't be able to test any of the code in cq or qconsole, either.
Take a look at http://blakeley.com/blogofile/2012/03/19/let-free-style-and-streaming/ for some tips on how to get there. At a minimum the code sample you posted would have to become:
doc(cts:uri-match('products/documents/*.xml'))
In passing I would try to rework that to avoid the *.xml part, because it will be slower than needed. Maybe something like this?
cts:search(
collection(),
cts:directory-query('products/documents/', 'infinity'))
If you need to test for something more than the directory, you could add a cts:and-query with some cts:element-query test.
For general information about this error, see the MarkLogic knowledge base article on XDMP-EXPNTREECACHEFULL
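The general idea of streaming an export, as opposed to materializing every document first, can be sketched in Python. This is not MarkLogic code: iter_documents is a hypothetical stand-in for a lazily evaluated query, the two CSV columns are invented for illustration, and io.StringIO stands in for the HTTP output stream (a real stream would not keep the output in memory).

```python
# Sketch of a streaming CSV export: rows are written as documents arrive.
import csv
import io


def iter_documents(n):
    """Hypothetical lazy source: yields one document at a time."""
    for i in range(n):
        yield {"uri": f"products/documents/{i}.xml", "name": f"product-{i}"}


def write_csv(docs, out):
    """Write one row per document without ever building a full list."""
    writer = csv.writer(out)
    writer.writerow(["uri", "name"])
    rows = 0
    for doc in docs:
        writer.writerow([doc["uri"], doc["name"]])
        rows += 1
    return rows


out = io.StringIO()  # stand-in for the HTTP output stream
row_count = write_csv(iter_documents(15000), out)
```

Only one document is alive at a time on the producer side, which is the property the streaming style in the answer above is after.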
