Need URI list in MarkLogic database using XQuery

Executing the XQuery below returns the whole content of the XML documents, but my objective is to get only the list of URIs.
let $i := cts:search(//root,
cts:element-value-query(
xs:QName("no"),
"123"))
return ($i)

If all you want is the URI, use cts:uris(). The 3rd parameter lets you define a query that will filter the URIs list.
So with your example this would work:
cts:uris(
(),
(),
cts:element-value-query(
xs:QName("no"),
"123")
)


How to convert string to XPATH in BaseX

How can I convert a string into an XPath expression? Below is the code:
let $ti := "item/title"
let $tiValue := "Welcome to America"
return db:open('test')/*[ $tiValue = $ti]/base-uri()
Here is one way to solve it:
let $ti := "item/title"
let $tiValue := "Welcome to America"
let $input := db:open('test')
let $steps := tokenize($ti, '/')
let $process-step := function($input, $step) { $input/*[name() = $step] }
let $output := fold-left($steps, $input, $process-step)
let $test := $output[. = $tiValue]
return $test/base-uri()
The path string is split into single steps (item, title). With fold-left, all child nodes of the current input (initially db:open('test')) will be matched against the current step (initially, item). The result will be used as new input and matched against the next step (title), and so on. Finally, only those nodes with $tiValue as text value will be returned.
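To see the folding in isolation, here is a minimal sketch that runs in any XQuery 3.0 processor. The in-memory document is a hypothetical stand-in for db:open('test'), and the variable name $step-into is just illustrative. It relies on the standard argument order fold-left($seq, $zero, $f), so the steps are the sequence and the input node is the start value.

```xquery
(: Hypothetical in-memory document standing in for db:open('test') :)
let $input := document { <item><title>Welcome to America</title></item> }
(: Split the path string into its single steps: "item", "title" :)
let $steps := tokenize("item/title", "/")
(: For each step, descend to the children whose name matches that step :)
let $step-into := function($nodes, $step) { $nodes/*[name() = $step] }
return fold-left($steps, $input, $step-into)
```

This returns the title element. Note that the approach only handles simple child steps, not predicates or other axes.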
Your question is very unclear - the basic problem is that you've shown us some code that doesn't do what you want, and you're asking us to work out what you want by guessing what was going on in your head when you wrote the incorrect code.
I suspect -- I may be wrong -- that you were hoping this might somehow give you the result of
db:open('test')/*[item/title = $tiValue]/base-uri()
and presumably $ti might hold different path expressions on different occasions.
XQuery 3.0/3.1 doesn't have any standard way to evaluate an XPath expression supplied dynamically as a string (unless you count the rather devious approach of using fn:transform() to invoke an XSLT transformation that uses the xsl:evaluate instruction).
BaseX, however, has an xquery:eval() function that will do the job for you. See https://docs.basex.org/wiki/XQuery_Module
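A rough sketch of how that could look with BaseX's xquery:eval() function from the XQuery Module linked above, assuming the database name and variables from the question. The value is passed as an external variable binding (the name $value is my own choice), and the context item is bound with the empty string key.

```xquery
let $ti := "item/title"
let $tiValue := "Welcome to America"
(: Evaluate the dynamically built path once per document in the database :)
for $doc in db:open('test')
return xquery:eval(
  (: $value is declared external and bound below; $ti is spliced in as text :)
  "declare variable $value external; *[" || $ti || " = $value]/base-uri()",
  map { '': $doc, 'value': $tiValue }
)
```

Since $ti is concatenated into the query string, only do this with path strings you trust; the compared value itself is safe because it is bound rather than concatenated.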

How do I get output for a XQuery in MarkLogic in a one line output?

To elaborate: when I execute the following query:
let $value := xdmp:forest-status(
xdmp:forest-open-replica(
xdmp:database-forests(xdmp:database("Documents"))))
return $value
The query above returns a lot of information about the forest of the "Documents" database, such as forest-id, host-id, etc.
I only need it to return the "state" of my forest. How do I do that?
Use XPath to select what you want to return.
let $value := xdmp:forest-status(
xdmp:forest-open-replica(
xdmp:database-forests(xdmp:database("Documents"))))
return $value/*:state/text()
Also, there is no need for a FLWOR; you could make it a one-liner:
xdmp:forest-status(
xdmp:forest-open-replica(
xdmp:database-forests(xdmp:database("Documents"))))/*:state/text()
Or you may find that the arrow operator makes things easier to read than nested function calls and the many parentheses wrapping them:
(xdmp:database("Documents")
=> xdmp:database-forests()
=> xdmp:forest-open-replica()
=> xdmp:forest-status()
)/*:state/text()
The XML elements in the response are in the http://marklogic.com/xdmp/status/forest namespace. So, you would either need to declare the namespace (i.e. declare namespace f = "http://marklogic.com/xdmp/status/forest";) and use the prefix in your XPath (i.e. f:state), or just use the wildcard as I have done *:state
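For completeness, a sketch of the variant with a declared namespace instead of the wildcard. The prefix f is arbitrary; the namespace URI is the one given above.

```xquery
xquery version "1.0-ml";
(: Bind a prefix to the forest status namespace :)
declare namespace f = "http://marklogic.com/xdmp/status/forest";

xdmp:forest-status(
  xdmp:forest-open-replica(
    xdmp:database-forests(xdmp:database("Documents"))))/f:state/text()
```

The result should be the same as with *:state; the explicit prefix just avoids accidentally matching a state element from another namespace.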

Searching in multiple collections joined by common fields in XQuery MarkLogic

I have two collections ('A' and 'B') with millions of transport insurance documents. The two collections have four elements in common (customer-no, date-of-insurance, insurance-no, accident-number), and one element (license-no) exists only in collection 'A'. I want to extract all the documents that are present in both collections and also have the element specific to collection 'A'. I am able to retrieve all the customer-nos from 'A' with cts:search. Then I loop through each of these customer-nos to look for license-no in 'A'. It returns an empty sequence, but I know that cannot be right. Could someone guide me with appropriate search logic?
let $col-A := cts:search(
doc(),
cts:and-query((
cts:collection-query('col-A'),
cts:element-value-query(xs:QName('abc:Acusno'), '*', (("wildcarded")))
)))
for $each in $col-A
let $col-B := cts:search(doc(),
cts:and-query((cts:collection-query('col-B'),
cts:element-value-query(xs:QName('abc:Bcusno'), $each)
)))
return $col-B
returns empty sequence
Your first cts:search is returning entire documents, which you are then passing in as argument into the value-query. You probably want to pass in just the value of abc:Acusno. You could do that with something like $each//abc:Acusno.
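A minimal sketch of that fix, keeping the structure of the original loop. The namespace URI bound to abc below is a placeholder assumption, since the question does not show it; replace it with the real one.

```xquery
xquery version "1.0-ml";
(: Placeholder namespace URI, an assumption; use the one from your data :)
declare namespace abc = "http://example.com/abc";

let $col-A := cts:search(
  doc(),
  cts:and-query((
    cts:collection-query('col-A'),
    cts:element-value-query(xs:QName('abc:Acusno'), '*', ("wildcarded"))
  )))
for $each in $col-A
return cts:search(
  doc(),
  cts:and-query((
    cts:collection-query('col-B'),
    (: pass the customer number value(s), not the whole document :)
    cts:element-value-query(xs:QName('abc:Bcusno'), $each//abc:Acusno)
  )))
```

This still runs one search per document in 'A', so it is only meant to show where the original went wrong.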
Your code is not using a very efficient approach though, and what if certain Acusno values occur multiple times?
I would recommend putting a range index on abc:Acusno, and using cts:values to pull up the unique values that match a given query. Then feed that entire list as one argument without any looping to a query against abc:Bcusno. You don't have to use a range index, and range query on Bcusno, but it could be useful to have that index anyhow. The code would then look something like this:
let $query :=
cts:and-query((
cts:collection-query('col-A'),
cts:element-query(xs:QName('abc:Acusno'), cts:true-query())
))
let $customerNrs :=
cts:values(
cts:element-reference(xs:QName("abc:Acusno")),
(),
(),
$query
)
return cts:search(
collection(),
cts:and-query((
cts:collection-query('col-B'),
cts:element-range-query(xs:QName('abc:Bcusno'), '=', $customerNrs)
))
)
Note: be careful when returning full search lists like this. You might want to paginate the response.
HTH!
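On the pagination remark, one possible sketch using fn:subsequence. The variables $page and $page-size are hypothetical parameters, and the query is reduced to the collection constraint to keep the example short.

```xquery
xquery version "1.0-ml";
(: Hypothetical paging parameters, not part of the original answer :)
let $page := 1
let $page-size := 10
(: Position of the first result on the requested page :)
let $start := ($page - 1) * $page-size + 1
return fn:subsequence(
  cts:search(collection(), cts:collection-query('col-B')),
  $start,
  $page-size
)
```

In practice you would put the full and-query from above in place of the plain collection query.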

Combined search query for a few xml documents

I have in each books directory /books/{book_id}/ a couple of xml documents.
/books/{book_id}/basic.xml and /books/{book_id}/formats.xml.
First one is
<document book_id="{book_id}">
<title>The book</title>
</document>
and the second is
<document book_id="{book_id}">
<format>a</format>
<format>b</format>
<format>c</format>
</document>
How can I find all books in the /books/ directory with format eq 'a' and title eq *'book'* with one query? I have one variant where I first find all books by format with cts:search() and then filter the result in a "for loop" by checking the title in the basic.xml file.
Thank you!
This question is tagged MarkLogic as well as XQuery. For completeness, I have included a MarkLogic solution that is a single statement:
let $res := cts:search(doc(), cts:and-query((
(: wildcarded match on the title in basic.xml :)
cts:element-word-query(xs:QName("title"), '*book*', ('wildcarded')),
(: join on book_id: ids of documents that have format 'a' :)
cts:element-attribute-range-query(xs:QName("document"), xs:QName("book_id"), '=',
cts:element-attribute-values(xs:QName("document"), xs:QName("book_id"), (), (),
cts:element-value-query(xs:QName("format"), 'a')))
)))
return $res
OK. Now let's break this down and have a look.
Note: This sample requires a single range index on the attribute book_id.
I took advantage of the fact that you have the same attribute in the same namespace in both types of documents. This allowed the following:
I could use a single index
Then I used element-attribute-values for the list of book_ids
-- This was constrained by the 'format' element
The list of book_ids above was used to filter the books (range query)
Which was then further filtered by the title
This approach joins the two documents using a range index which is super-fast - especially on the integer value of the book_id
It should be noted that in this particular case, I was able to isolate the proper documents because title elements only exist in one type of document.
Now, let's look at a cleaner example of the same query.
(: I used a word-query so that I could do wildcarded searches for documents with 'book' in the title. Your sample has the title 'The book', yet you search for 'book', so I can only conclude that you meant to do wildcard searches :)
let $title-constraint := "*book*"
(: This could also be a sequence :)
let $format-constraint := "a"
(: used for the right-side of the element-range-query :)
let $format-filter := cts:element-attribute-values(xs:QName("document"), xs:QName("book_id"), (), (), cts:element-value-query(xs:QName("format"), $format-constraint))
(: final results :)
let $res := cts:search(doc(), cts:and-query((
cts:element-word-query(xs:QName("title"), $title-constraint, ('wildcarded'))
,
cts:element-attribute-range-query(xs:QName("document"), xs:QName("book_id"), '=', $format-filter)
)
) )
return $res
Maybe stating the obvious, the best approach would be to change the model so the format is in the same document as the title and can be matched by a single query.
If that's not possible, one alternative would be to turn on the uri lexicon in the database configuration (if it's not enabled already).
Assuming that the title is more selective than the format, something along the following lines might work.
let $title-uris := cts:uris((), (), cts:and-query((
cts:directory-query("/books/", "infinity"),
cts:element-word-query(xs:QName("title"), "book")
)))
let $title-dirs :=
for $uri in $title-uris
return fn:replace($uri, "/basic\.xml$", "/")
let $format-uris := cts:uris((), (), cts:and-query((
cts:directory-query($title-dirs),
cts:element-value-query(xs:QName("format"), "a")
)))
let $book-docs :=
for $uri in $format-uris
return fn:replace($uri, "/formats\.xml$", "/basic.xml")
for $doc in fn:doc($book-docs)
return ... do something with the basic document ...
The extra cost beyond the document reads consists of two lookups in the uri lexicon and the string manipulation. The benefit is in reading only the documents that match.
In general, it's better at scale to use the indexes to match the relevant documents instead of reading the documents into memory and filtering out the irrelevant documents. The cts:uris() and cts:search() functions always match using the indexes first (cts:search() then filters the candidates by default, unless the "unfiltered" option is specified). XPaths optimize by matching with the indexes when possible but have to fall back to filtering for some predicates. Unless you're careful, it's usually better to limit XPaths to navigation of nodes in memory.
Hoping that helps,
How can I find all books in /books/ directory with format eq 'a' and title eq 'book' by one query?
Try:
doc('basic.xml')/document[@book_id='X']/title[contains(., 'book')]
[doc('formats.xml')/document[@book_id='X'][format = 'a']]
If the last predicate evaluates to empty, the title will not be returned. If it matches, the title will be returned.
You should, of course, replace X with your ID. And you can set the relative path to include the ID. If you have a set of IDs you want to go over, you can do this:
for $id in ('{book_id1}', '{book_id2}')
return
doc(concat($id, '/basic.xml'))/document[@book_id=$id]/title[contains(., 'book')]
[doc(concat($id, '/formats.xml'))/document[@book_id=$id][format = 'a']]
You'll get the drift ;)
PS: I'm not sure if {...} is a legal URI pathpart, but I assume you'll replace it with something sensible. Otherwise, escape it with the appropriate percent-encoding.
I think I found a better solution:
let $book_ids := cts:values(
cts:element-attribute-reference(xs:QName("document"), xs:QName("book_id") ),
(),
("map"),
cts:and-query((
cts:directory-query(("/books/"), "infinity"),
cts:element-word-query(xs:QName("title"), "book")
))
)
return
cts:search(
/,
cts:and-query((
cts:element-attribute-value-query(xs:QName("document"), xs:QName("book_id"), map:keys($book_ids)),
cts:element-value-query(xs:QName("format"), "a")
))
)

How to dynamically create a search query based on a set of quoted strings in MarkLogic

I have the following query, where I want to build a string of values from a list and use that comma-separated string as an or-query, but it does not give any result. However, when I return just the concatenated string, it gives the exact value needed for the query.
The query is as follows:
xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";
declare variable $docURI as xs:string external ;
declare variable $orQuery as xs:string external ;
let $tags :=
<tags>
<tag>"credit"</tag>
<tag>"bank"</tag>
<tag>"private banking"</tag>
</tags>
let $docURI := "/2012-10-22_CSGN.VX_(Citi)_Credit_Suisse_(CSGN.VX)__Model_Update.61198869.xml"
let $orQuery := (string-join($tags/tag, ','))
for $x in cts:search(doc($docURI)/doc/Content/Section/Paragraph, cts:or-query(($orQuery)))
let $r := cts:highlight($x, cts:or-query($orQuery), <b>{$cts:text}</b>)
return <result>{$r}</result>
The exact query that I want to run is:
cts:search(doc($docURI)/doc/Content/Section/Paragraph, cts:or-query(("credit","bank","private banking")))
and when I do
return (string-join($tags/tag, ','))
it gives me exactly what I require:
"credit","bank","private banking"
But why does it not return any result in or-query?
The string-join step should not be a string-join at all: it passes in one literal string. In XQuery, sequences are your friend.
I think you want to do something like this:
let $tags-to-search := ($tags/tag/text()!replace(., '^"|"$', '') ) (: a sequence of tags :)
return cts:search(doc($docURI)/doc/Content/Section/Paragraph, cts:word-query($tags-to-search))
cts:word-query is the default query used for the second parameter of cts:search if you pass in a string. cts:word-query also matches any of the items in a sequence if given one.
https://docs.marklogic.com/cts:word-query
EDIT: Added the replace step for the quotes as suggested by Abel. This is specific to the data as presented by the original question. The overall approach remains the same.
Maybe you need something like this:
let $orQuery := for $tag in $tags/tag return cts:word-query($tag)
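As a sketch of how that sequence of word queries could be used, here it is wrapped in a cts:or-query with the tags from the question. Stripping the surrounding quotes and searching all documents with fn:doc() are my own simplifications, not part of the original code.

```xquery
xquery version "1.0-ml";
let $tags :=
  <tags>
    <tag>"credit"</tag>
    <tag>"bank"</tag>
    <tag>"private banking"</tag>
  </tags>
(: One word query per tag, with the literal quote characters removed :)
let $orQuery := cts:or-query(
  for $tag in $tags/tag
  return cts:word-query(fn:replace($tag, '^"|"$', ''))
)
(: Assumption: search all documents; narrow the first argument as needed :)
return cts:search(fn:doc(), $orQuery)
```

The same $orQuery value could then be passed to cts:highlight, as in the question.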
I used fn:tokenize instead, and it worked perfectly for my use case.
I needed this because I was passing these arguments from Java using the XCC API, and it would not return anything with string values.
xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";
declare variable $docURI as xs:string external ;
declare variable $orQuery as xs:string external ;
let $input := "credit,bank"
let $tokens := fn:tokenize($input, ",")
let $docURI := "2012-11-19 0005.HK (Citi) HSBC Holdings Plc (0005.HK)_ Model Update.61503613.pdf"
for $x in cts:search(fn:doc($docURI), cts:or-query(($tokens)))
let $r := cts:highlight($x, cts:or-query(($tokens)), <b>{$cts:text}</b>)
return <result>{$r}</result>
