Restrict the scope of a cts:element-value-query search - MarkLogic - XQuery

I am using cts:element-value-query to search documents based on a parameter. The query works fine, but I want it to search only elements that are immediate children of the root.
For example, my document tree looks like this:
document1.xml
<Person>
<FirstName>Johnson</FirstName>
<LastName>W</LastName>
<EmailAddress>john@abc.com</EmailAddress>
<Neighbour>
<FirstName>Mathew</FirstName>
<LastName>Long</LastName>
</Neighbour>
</Person>
document2.xml
<Person>
<FirstName>Mathew</FirstName>
<LastName>W</LastName>
<EmailAddress>john@abc.com</EmailAddress>
<Neighbour>
<FirstName>Anderson</FirstName>
<LastName>Long</LastName>
</Neighbour>
</Person>
and my query is
cts:search(
/Person,
cts:and-query((
cts:element-value-query(xs:QName("FirstName"), concat('*', "son", '*'),
("wildcarded", "case-insensitive", "whitespace-sensitive", "punctuation-sensitive"))
)),
()
)
This returns both the documents because for the first document it matches <FirstName>Johnson</FirstName>
and for the second document it matches
<FirstName>Anderson</FirstName>
which is at the lower level.
I do not want the second result and want the query to search at level 1 only.
Any help is appreciated.

You can scope sub-queries to particular container elements or properties, using cts:element-query, and cts:json-property-scope-query. Those will trim down sub-query matches to a particular ancestor.
cts:element-query(xs:QName('Person'), cts:element-value-query(xs:QName('FirstName'), ...)) will not be enough however, as Neighbour/FirstName is also a descendant of Person.
The simplest option is to use a path range index on Person/FirstName. That is by far the most straightforward solution here.
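A rough sketch of how the path range index could be used for the wildcard case from the question. It assumes a string path range index has already been configured on /Person/FirstName with the default collation (that index is an assumption, not something shown in the question). Because range queries do not take wildcard patterns, the sketch first looks up matching values in the index with cts:value-match and then feeds them to cts:path-range-query:

```xquery
xquery version "1.0-ml";

(: Assumption: a string path range index exists on /Person/FirstName
   with the default collation. Adjust the reference options otherwise. :)
let $ref := cts:path-reference("/Person/FirstName")

(: Look up indexed first names that contain "son", ignoring case. :)
let $names := cts:value-match($ref, "*son*", ("case-insensitive"))

return
  (: Guard against an empty value list before building the range query. :)
  if (fn:empty($names)) then ()
  else
    cts:search(
      /Person,
      cts:path-range-query("/Person/FirstName", "=", $names)
    )
```

Since the path is rooted at /Person, Neighbour/FirstName values are not in that index, so document2.xml should no longer match on "Anderson".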
HTH!

Related

Custom sorting issue in MarkLogic?

xquery version "1.0-ml";
declare function local:sortit(){
for $i in ('a','e','f','b','d','c')
order by $i
return
element Result{
element N{1},
element File{$i}
}
};
local:sortit()
The above code is a sample; I need the data in this format. This sorting function is used in multiple places: in some places I need only the N element data, and in others only the File element data.
But the moment I use local:sortit()//File, the sort order is lost and the output comes back in random order. Please let me know the best way to handle this.
All these data in File element is calculated and comes from multiple files, after doing all the joins and calculation, it will be formed as XML with many elements in it. So sorting using index and all is not possible here. Only order by clause can be used.
The results of XPath path expressions are always returned in document order.
You lose the sorting when you apply an XPath to the sequence returned from that function call.
If you want to select only the File in sorted order, try using the simple mapping operator !, plucking the File element from each item as you map over the sequence:
local:sortit() ! File
Or, if you like typing, you can use a FLWOR to iterate over the sequence and return the File:
for $result in local:sortit()
return $result/File
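For illustration, here is a self-contained sketch that reuses the sample function from the question and extracts only the File values with the mapping operator. The final fn:string(.) step is an addition for readability, not part of the original answer:

```xquery
xquery version "1.0-ml";

declare function local:sortit() {
  for $i in ('a','e','f','b','d','c')
  order by $i
  return
    element Result {
      element N {1},
      element File {$i}
    }
};

(: Map over the sorted Result elements one at a time, so the
   sequence order produced by "order by" is preserved. :)
local:sortit() ! File ! fn:string(.)
```

With the sample input this should yield the values a through f in sorted order. Replacing File with N gives the N elements in the same way.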

Search for value in element with specific attribute

<elemA>
<elemZ mytype ="1">
<myval>100</myval>
</elemZ>
<elemZ mytype ="2">
<myval>200</myval>
</elemZ>
</elemA>
Using cts:queries, I want to find a myval of 100 in an elemZ with mytype = "1". I do not see any cts query that combines cts:element-query with filtering on an attribute. Even a cts:and-query does not appear helpful.
Without attribute constraint, element-value-query and two element-queries would work easy.
cts:search(doc(), (some cts query?))
First try this simple XPath. Validate that it works, and check whether it really is not performant enough for you.
//elemZ[@mytype=1]/myval[. = "100" ]
That should return myval element children of elemZ with mytype=1 and myval text content = "100"
To do better (with cts:query) will need those 'dreaded' other cts:queries and possibly some range indexes.
Roughly : (untested)
cts:search(doc(),
cts:element-query(xs:QName("elemZ"),
cts:and-query((
cts:element-attribute-value-query(xs:QName("elemZ"), xs:QName("mytype"), "1"),
cts:element-value-query(xs:QName("myval"), "100") )) ) )
Recommend you start with the simplest expression that does anything then one by one add constraints.
In your case, it's conceivable that the query optimizer will optimize the simple xpath into the appropriate cts query. Worth trying and measuring performance. I personally like to start with a basic xpath and then only work my way up to a cts:query as needed.

Numbers in cts:word query in Marklogic

I have a cts:word-query which is having number as the text value.
cts:search(fn:doc(),cts:word-query("226"))
This query fetches only documents that match 226 exactly. But I also need the documents that contain 00226.
Example:
This is abc.xml
<a>
<b>00226</b>
</a>
This is abc1.xml
<a>
<b>226</b>
</a>
If I give the query as cts:search(fn:doc(),cts:word-query("226")), it will fetch only abc1.xml and if the query is cts:search(fn:doc(),cts:word-query("00226")), it will fetch only abc.xml.
But I need to get both the documents, irrespective of leading zeros.
Simplest way would be to use a wild card character (*) and add the wildcarded option
cts:search(fn:doc(),cts:word-query("*226", ('wildcarded')))
EDIT:
Although this matches the example documents, as Kishan points out in the comments, the wildcard also matches unwanted documents (e.g. containing "226226").
Since range indexes are not an option in this case because the data is mixed, here is an alternative hack:
cts:search(
fn:doc(),
cts:word-query(
for $lead in ('', '0', '00', '000')
return $lead || "226"))
Obviously, this depends on how many leading zeros there can be and will only work if this is known and limited.
You can add an element range index on the element <b> in the database with scalar type int or long, then you do the following query, it should return both documents:
let $query := cts:element-range-query(xs:QName("b"),"=",00226)
return cts:search(fn:doc(),$query)

XQuery Create where clause based on xml structure as a kind of dynamic where clause

this is about XQuery - I am using MarkLogic as Database.
I have data as in the following example:
<instrument name="myTest1" id="test1">
<daten>
<daily>
<day date="2016-02-05">
<screener>
<column name="i1">
<value>1</value>
<bg>red</bg>
</column>
<column name="i2">
<value>1</value>
<fg>lime</fg>
</column>
<column name="i4">
<fg>black</fg>
</column>
</screener>
</day>
</daily>
</daten>
</instrument>
I have many instruments, and each one has an entry for each day in the daily element. Inside screener there can be many columns, all with different names. Some screeners include more columns than others. Each column can include a value element, a bg element and an fg element.
I want to search for instruments that fulfill specific criteria about which columns have children with specific values. Example: I want a sequence of all instruments that, for a given day, have a value of 1 for column i1 and an fg of black for column i2.
Since I have many different conditions like this, I would rather not hardcode them in XQuery where clauses. I did that for a few and it works, but the code gets heavily duplicated and is hard to maintain.
My question is: is it possible to build a where clause in a FLWOR statement programmatically, meaning based on another XML structure, which could look like this:
<searchpatterns>
<pattern name="test1">
<c>
<name>i1</name>
<element>value</element>
<value>1</value>
</c>
<c>
<name>i2</name>
<element>fg</element>
<value>red</value>
<modifier>not</modifier>
</c>
</pattern>
</searchpatterns>
which would find those instruments, where the screener has a column i1 which itself has a value of 1, and also it must not have column i2 with a fg of red.
When I do it the normal way I query my data like this:
for $res in doc()/instrument
where $res/daten/daily/day[@date="2016-02-05"]/screener/column[@name="i1"]/value/text()="1"
and $res/daten/daily/day[@date="2016-02-05"]/screener/column[@name="i2"]/fg/text()!="red"
return $res
This kind of where clause I want to generate based on an XML structure.
I did some research of the MarkLogic inbuilt cts:search function and a lot of stuff around it but it seems to be for something else (more user interactive searching)
If you have a hint to point me in the right direction, or can tell me whether what I want is even possible, I would very much appreciate it. Thanks!
The doc()/instrument XPath asks for every document with an instrument element and then filters those documents.
Where possible, it's usually better in MarkLogic to model the documents so you can use the indexes to retrieve as few documents as possible. It's also usually better to use cts:search() instead of XPath to generate the sequence so you are working directly with the indexes.
In this case, you might consider using the values of the name attribute as elements instead of the generic "column." You could then generate a cts:element-query that matches the name containing a cts:element-value-query that matches the value within the name.
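A minimal sketch of what that generation could look like, driven by the searchpatterns structure from the question. It assumes the documents have been remodelled so that each column name is an element (for example <i1><value>1</value></i1>), that none of the elements are in a namespace, and that "not" is the only modifier. The function name local:pattern-query is hypothetical:

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: turn one <pattern> into a cts:query. :)
declare function local:pattern-query($pattern as element(pattern)) as cts:query
{
  cts:and-query(
    for $c in $pattern/c
    (: Match <element>value</element>, e.g. <value>1</value>. :)
    let $inner :=
      cts:element-value-query(
        fn:QName("", fn:string($c/element)),
        fn:string($c/value))
    (: Scope that match to the element named after the column. :)
    let $q := cts:element-query(fn:QName("", fn:string($c/name)), $inner)
    return
      if ($c/modifier = "not") then cts:not-query($q) else $q
  )
};

let $searchpatterns :=
  <searchpatterns>
    <pattern name="test1">
      <c><name>i1</name><element>value</element><value>1</value></c>
      <c><name>i2</name><element>fg</element><value>red</value><modifier>not</modifier></c>
    </pattern>
  </searchpatterns>
return
  cts:search(fn:doc(), local:pattern-query($searchpatterns/pattern[@name = "test1"]))
```

Note that this matches at the document level and does not constrain the day. A date constraint would need an additional sub-query, and the accuracy of cts:element-query in unfiltered searches depends on the position indexes that are enabled.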
Hoping that helps,
Yes, this can be achieved programmatically. If you want to check whether an element satisfies a test for every item in a sequence, the every ... satisfies construct comes to mind. So in this case it could be:
for $res in doc()/instrument
where every $pattern in $searchpatterns/pattern/c satisfies (
let $equal := $res/daten/daily/day[@date="2016-02-05"]/screener/column[@name = $pattern/name]/*[name() = $pattern/element] = $pattern/value
return if ($pattern/modifier = "not") then not($equal) else $equal
)
return $res
So every $pattern will be checked. I assume the modifier element is supposed to modify the equality test. So we first check whether the element satisfies the equality condition, and then we check whether the modifier element is equal to not. Of course, the same idea could be used to implement other modifiers as well.

An XDMP-NOTANODE error using xquery in marklogic

I'm getting the XDMP-NOTANODE error when I try to run an XQuery in MarkLogic. When I loaded my xml documents I loaded meta data files with them. I'm a student and I don't have experience in XQuery.
error:
[1.0-ml] XDMP-NOTANODE: (err:XPTY0019) $article/article/front/article-meta/title-group/article-title -- xs:untypedAtomic("
") is not a node
Stack Trace
At line 3 column 77:
In xdmp:eval("(for $article in fn:distinct-values(/article/text()) &#1...", (), <options xmlns="xdmp:eval"><database>4206169969988859108</database> <root>C:\mls-projects\pu...</options>)
$article := xs:untypedAtomic("
")
1. (for $article in fn:distinct-values(/article/text())
2.
3. return (fn:distinct-values($article/article/front/article-meta/title-group/article-title)
4.
5.
Code:
(
for $article in fn:distinct-values(/article/text())
return (
fn:distinct-values($article/article/front/article-meta/title-group/article-title/text())
)
)
Every $article is bound to an atomic value (fn:distinct-values() returns a sequence of atomic values). Then you try to apply a path expression (using the / operator) to $article. That is forbidden, as the path operator requires its left-hand operand to be a sequence of nodes.
I am afraid your code does not make sense enough for me to suggest you an actual solution. I can only pinpoint where the error is.
Furthermore, using text() at the end of a path is most of the time a bad idea. And if /article is a complex document, it is certainly not what you want. One of the text nodes you select (most likely the first one) is simply one single newline character.
What do you want to achieve?
Your $article variable is bound to an atomic value, not a node() from the article document. You can only use an XPath axis on a node.
When you apply the function distinct-values() in the for statement, it returns simple string values, not the article document or nodes from it.
You can probably make things work by using the values in a predicate filter like this:
for $article-text in fn:distinct-values(/article/text())
return
fn:distinct-values(/article[text()=$article-text]/front/article-meta/title-group/article-title/text())
Note: The above XQuery should avoid the XDMP-NOTANODE error, but there are likely easier (and more efficient) solutions for achieving your goal. If you were to post a sample of your document and describe what you are trying to achieve, we could suggest alternatives.
Bit of a wild guess, but you have two distinct-values in your code. That makes me think you want a unique list of articles, and then finally a unique list of article-title's. I would hope you already have unique articles in your database, unless you are explicitly attempting to de-duplicate them.
In case you just want the overall unique list of article titles, I would do something like:
distinct-values(
for $article in collection()/article
return
$article/front/article-meta/title-group/article-title
)
HTH!
