XQuery: too much information returned from the function distinctive-terms

I'm trying to find the distinctive terms of a text using XQuery in MarkLogic. When I call cts:distinctive-terms I get results like the following:
<cts:term id="146927407734894632" val="1142" score="292352" confidence="0.990467" fitness="1">
<cts:element-word-query>
<cts:element xmlns:rec="http://xxx.xxx/xxx">rec:Text</cts:element>
<cts:text xml:lang="en">ram</cts:text>
<cts:option>case-insensitive</cts:option>
<cts:option>diacritic-insensitive</cts:option>
<cts:option>stemmed</cts:option>
<cts:option>unwildcarded</cts:option>
</cts:element-word-query>
</cts:term>
This is too much information for me. I'm only interested in <cts:text xml:lang="en">ram</cts:text>. Is there a way to just get the results of this line?

You can pull the cts:text out from the results with XPath:
cts:distinctive-terms(...)//cts:text
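If the serialized result is post-processed outside MarkLogic, the same descendant selection can be sketched in Python. This is only an illustration under stated assumptions: the sample XML is trimmed from the question, the cts prefix is assumed to be bound to http://marklogic.com/cts, and extract_terms is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: the cts prefix is bound to the MarkLogic cts namespace.
NS = {"cts": "http://marklogic.com/cts"}

# Trimmed copy of the result shown in the question.
SAMPLE = """
<cts:term xmlns:cts="http://marklogic.com/cts" id="146927407734894632" val="1142">
  <cts:element-word-query>
    <cts:element xmlns:rec="http://xxx.xxx/xxx">rec:Text</cts:element>
    <cts:text xml:lang="en">ram</cts:text>
    <cts:option>stemmed</cts:option>
  </cts:element-word-query>
</cts:term>
"""

def extract_terms(xml_text):
    """Return the text of every cts:text descendant (the //cts:text idea)."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.iterfind(".//cts:text", NS)]

print(extract_terms(SAMPLE))  # ['ram']
```

The function returns plain strings rather than elements, which is usually what is wanted when only the term itself matters.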

Gremlin/Tinkerpop - is there a way to add metadata to a union step so I know which query the resulting traversal came from?

This is a little strange, but I have a situation where it'd be beneficial for me to know which traversal an element came from.
For a simple example, something like this:
.union(
select('parent').out('contains'), //traversal 1
select('parent2').out('contains') //traversal 2
)
.dedup()
.project('id','traversal')
.by(id())
.by( //any way to determine which traversal it came from? or if it was in both? )
Edit: One thing I found is that I can use Map with Group/By to get partly there:
.union(
select('parent').out('contains')
.map(group().by(identity()).by(constant('t1'))),
select('parent2').out('contains')
.map(group().by(identity()).by(constant('t2')))
)
.dedup() //Dedup isn't gonna work here because each hashmap will be different.
.project('id','traversal')
.by( //here I can't figure out how to read a value from the hashmap inline )
The above query without the project/by piece returns this:
[{v[199272505353083909]: 't1'}, {v[199272515180338177]: 't2'}]
Or is there a better way to do this?
Thanks!
One simple approach might be to just fold the results. If you get back an empty list you will know you did not find any on that "branch":
g.V('44').
union(out('route').fold().as('a').project('res','branch').by().by(constant('b1')),
out('none').fold().as('b').project('res','branch').by().by(constant('b2')))
which yields
{'res': [v[8], v[13], v[20], v[31]], 'branch': 'b1'}
{'res': [], 'branch': 'b2'}
UPDATED after discussion in the comments to include an alternative approach that uses nested union steps to avoid the project step inside the union. I still prefer the project approach unless its measured performance turns out to be poor.
g.V('44').
union(local(union(out('route').fold(),constant('b1')).fold()),
local(union(out('none').fold(),constant('b2')).fold()))
which yields
[[v[8], v[13], v[20], v[31]], 'b1']
[[], 'b2']
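The question also asked how to tell when an element came from both traversals. One option is to merge the per-branch lists on the client into a map from element to branch labels. The following is a plain-Python sketch of that post-processing, not Gremlin: the row shape mirrors the project('res','branch') output, the ids are made up, and branches_by_element is a hypothetical name.

```python
from collections import defaultdict

def branches_by_element(rows):
    """Map each element to the sorted list of branches that returned it."""
    seen = defaultdict(set)
    for row in rows:
        for element in row["res"]:
            seen[element].add(row["branch"])
    return {element: sorted(labels) for element, labels in seen.items()}

# Hypothetical results: integer ids stand in for the vertices.
rows = [
    {"res": [8, 13, 20], "branch": "b1"},
    {"res": [13, 31], "branch": "b2"},
]

print(branches_by_element(rows))
# {8: ['b1'], 13: ['b1', 'b2'], 20: ['b1'], 31: ['b2']}
```

An element whose list holds more than one label was reached by more than one branch, which also takes care of deduplication.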

Adobe AEM Querybuilder Debugger - Multiple Paths and Multiple Nodenames

I am using the QueryBuilder debugger and want to run a search where "nodename=*.pdf OR nodename=*.doc*" and "path=/content/dam/1 OR path=/content/dam/2".
I have been looking for an example on the web, but with no luck. What I have below is not quite right, and I am wondering what I am missing.
The query does work, but it takes far longer to run than the same query with a single nodename instead of two.
Thanks in advance,
Jerry
type=dam:asset
mainasset=true
1_group.p.or=true
1_group.1.nodename=*.pdf
1_group.2.nodename=*.doc*
2_group.p.or=true
2_group.1_path=/content/dam/1
2_group.2_path=/content/dam/2
p.limit=-1
orderby=path
I thought maybe something as simple as this might work but no luck....
type=dam:asset
mainasset=true
group.p.or=true
group.1_nodename=*.doc*
group.1_path=/content/dam/1
group.2_nodename=*.doc*
group.2_path=/content/dam/2
group.3_nodename=*.pdf
group.3_path=/content/dam/1
group.4_nodename=*.pdf
group.4_path=/content/dam/2
p.limit=-1
orderby=path
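Try splitting it into two queries, one per path, if that won't affect the behaviour you're trying to achieve. Note that nested predicates are written as group.N_nodename, and that the group needs p.or=true for the two nodename patterns to be ORed rather than ANDed.
First query:
path=/content/dam/1
type=dam:asset
mainasset=true
group.p.or=true
group.1_nodename=*.pdf
group.2_nodename=*.doc*
p.limit=-1
orderby=path
Second query:
path=/content/dam/2
type=dam:asset
mainasset=true
group.p.or=true
group.1_nodename=*.pdf
group.2_nodename=*.doc*
p.limit=-1
orderby=path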
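If the split queries are issued from a script rather than the debugger, the two parameter sets can be generated from one template and the hits merged afterwards. The sketch below only builds the query strings and makes no request. It assumes the /bin/querybuilder.json servlet path and the group.N_nodename form with group.p.or=true, and build_query and merge_sorted are hypothetical helper names.

```python
from urllib.parse import urlencode

def build_query(path):
    """Build the QueryBuilder parameters for one DAM path."""
    return {
        "path": path,
        "type": "dam:asset",
        "mainasset": "true",
        "group.p.or": "true",          # OR the two nodename patterns
        "group.1_nodename": "*.pdf",
        "group.2_nodename": "*.doc*",
        "p.limit": "-1",
        "orderby": "path",
    }

def merge_sorted(*result_lists):
    """Combine per-path hit lists, drop duplicates, and sort by path."""
    return sorted(set().union(*result_lists))

for path in ("/content/dam/1", "/content/dam/2"):
    # Assumption: the instance exposes the default QueryBuilder JSON servlet.
    print("/bin/querybuilder.json?" + urlencode(build_query(path)))
```

Sorting the merged hits on the client restores the single ordering that orderby=path would have given for one combined query.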

Assign query using 'match()' to subgraph

I have a JanusGraph database with a graph structure as follows:
(Paper)<-[AuthorOf]-(Author)
I want to use Gremlin's match() step to query the data and assign the results to a subgraph. This is what I have so far:
g.V().match(
__.as('a').has('Paper','paperTitle', 'The name of my paper'),
__.as('a').in('AuthorOf').as('b')).
select('b').values()
This query returns what I want: the authors of the paper I'm searching for. However, I want to assign the results to a subgraph so I can export it using:
sg.io(IoCore.graphml()).writeGraph("/home/ubuntu/myresults.graphml")
Previously, I've achieved this with a different query structure like this:
sg = g.V().has('paperTitle', 'The name of my paper').
inE('AuthorOf').subgraph('sg1').
outV().
cap('sg1').
next()
Is there a way to achieve the same results using the 'match()' statement?
After a little trial and error I was able to create a working solution:
sg = g.V().match(
__.as('a').has('Paper','paperTitle', 'ladle pouring guide'),
__.as('a').inE('AuthorOf').subgraph('sg').outV().as('b')).
cap('sg').next()
At first, I was trying to use the 'select' statement to isolate the subgraph. After reviewing the documentation on 'subgraph' and learning more about side effects in Gremlin, I realized it wasn't necessary: subgraph('sg') stores the edges in a side effect, and cap('sg') retrieves it.

Gremlin order on Map results

I have the below query:
g.V('1')
.union(
out('R1')
.project('timestamp', 'data')
.by('completionDate')
.by(valueMap().by(unfold())),
out('R2')
.project('timestamp', 'data')
.by('endDate')
.by(valueMap().by(unfold()))
)
How can I order the UNION results by timestamp?
I've tried using ".order().by('timestamp')" but this only works on traversals and UNION returns a MAP object.
Here are a couple of ways to approach it. First, you could just use your code as-is and then order() by the "timestamp":
g.V('1').
union(out('R1').
project('timestamp', 'data').
by('completionDate').
by(valueMap().by(unfold())),
out('R2').
project('timestamp', 'data').
by('endDate').
by(valueMap().by(unfold()))).
order().by(select('timestamp'))
Note that the difference is the use of select() to pick the key of the Map that you want to sort on. In versions after 3.4.5 this works more as you would expect, and you can simply use by('timestamp') for a Map as well as for an Element.
I think a more readable approach, however, would be this:
g.V('1').
out('R1','R2').
project('timestamp', 'data').
by(coalesce(values('endDate'), values('completionDate'))).
by(valueMap().by(unfold())).
order().by(select('timestamp'))
You might need to enhance the by(coalesce(...)) depending on the nature of your schema, but hopefully you get the idea of what I'm trying to do there.
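To make the coalesce-then-order logic concrete, here is a plain-Python model of it. It is not Gremlin: the records are invented, dates are assumed to be ISO-formatted strings so that they sort correctly as text, and coalesce and project_and_order are hypothetical helper names.

```python
def coalesce(record, *keys):
    """Return the first present, non-None value among the given keys."""
    for key in keys:
        value = record.get(key)
        if value is not None:
            return value
    return None

def project_and_order(records):
    """Project each record to timestamp/data, then sort by timestamp."""
    rows = [
        {"timestamp": coalesce(r, "endDate", "completionDate"), "data": r}
        for r in records
    ]
    # Assumption: every record has one of the two dates; a None timestamp
    # would make this comparison fail.
    return sorted(rows, key=lambda row: row["timestamp"])

# Invented records: some carry completionDate, others endDate.
records = [
    {"id": "a", "completionDate": "2020-03-01"},
    {"id": "b", "endDate": "2020-01-15"},
    {"id": "c", "completionDate": "2020-02-10"},
]

print([row["data"]["id"] for row in project_and_order(records)])  # ['b', 'c', 'a']
```

A record with neither property is the case the by(coalesce(...)) would need to be extended for, for example with a constant fallback value.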

getting the PDG graph

I'm developing a plugin in Frama-C and I want to get the Pdg graph from the C program.
I tried Db.Pdg.get, which takes a "kernel_function", while what I have is a "fundec". I don't know how to fix that.
Any help?
You can use Globals.Functions.get fdec.Cil_types.svar to convert a fundec fdec into a kernel_function. Calling Db.Pdg.get is then the right way to go.
