XQuery get data based on distinct values - xquery

My XML looks like this (full XML at inputxml):
<course>
<reg_num>10616</reg_num>
<subj>BIOL</subj>
<crse>361</crse>
<sect>F</sect>
<title>Genetics and MolecularBiology</title>
<units>1.0</units>
<instructor>Russell</instructor>
<days>M-W-F</days>
<time>
<start_time>11:00AM</start_time>
<end_time>11:50</end_time>
</time>
<place>
<building>PSYCH</building>
<room>105</room>
</place>
</course>
I need to take distinct values for courses and return instructors that teach those courses.
This is my current code:
let $x := doc("reed.xml")/root/course
for $y in distinct-values($x/title)
let $z := $y/instructor
return ( {data($y)} ,{data($z)})
What am I doing wrong?

In the code you posted, the for loop iterates over the sequence of distinct title values, which are atomic strings, not nodes.
You can't apply a path expression such as $y/instructor to those strings.
Instead, filter the $x courses: in each iteration, use $y to select the course elements whose title equals $y, then select their instructor.
let $x := doc("reed.xml")/root/course
for $y in distinct-values($x/title)
let $z := $x[title = $y]/instructor
return ($y, data($z))

In XQuery 3.1 you can do this with group by.
for $course in doc("reed.xml")/root/course
group by $title := $course/title
return ( <title>{$title}</title> ,
<instructors>{$course/instructor}</instructors> )
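To make the grouping behaviour easier to see, here is a minimal self-contained sketch that uses inline data instead of reed.xml. The sample courses are cut down to two fields, the instructors Smith and Jones and the Ecology course are made up for illustration, and the course-group element and its title attribute are hypothetical output names. After group by, $course is bound to every course in the current group, so $course/instructor yields all instructors for that title.

```xquery
xquery version "3.1";

(: Inline sample data standing in for doc("reed.xml"); values are illustrative. :)
let $courses :=
  <root>
    <course><title>Genetics</title><instructor>Russell</instructor></course>
    <course><title>Genetics</title><instructor>Smith</instructor></course>
    <course><title>Ecology</title><instructor>Jones</instructor></course>
  </root>
for $course in $courses/course
(: Group on the string value of the title. :)
group by $title := string($course/title)
return
  (: Hypothetical output shape: one element per distinct title. :)
  <course-group title="{$title}">{
    (: $course now holds all courses in the group; de-duplicate the names. :)
    for $name in distinct-values($course/instructor)
    return <instructor>{$name}</instructor>
  }</course-group>
```

The order in which the groups are returned is implementation-dependent unless an order by clause is added.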

Related

How to count equal values of the same element name [xQuery]

Here is an example:
<bracketQualifier>
<bracketSequenceNumber>1</bracketSequenceNumber>
</bracketQualifier>
<bracketQualifier>
<bracketSequenceNumber>1</bracketSequenceNumber>
</bracketQualifier>
<bracketQualifier>
<bracketSequenceNumber>1</bracketSequenceNumber>
</bracketQualifier>
I need to throw an exception if any bracketSequenceNumber elements hold the same value.
The number of elements is N; there can be more than 3. How can I achieve this using XQuery?
I tried something like this without success, and I can't say I understand XQuery completely:
let $count := ( for $bracketSequenceNumber in $bracketQualifier/bracketSequenceNumber return count(bracketQualifier[@bracketSequenceNumber = $bracketSequenceNumber ])) return
if($GDSN_PriceSyncPriceSegmentTM/value ='250' and $count >= 1) then something
You can use
if (count(//bracketSequenceNumber)
!= count(distinct-values(//bracketSequenceNumber))) then ...
If you actually want to find the duplicates, use group by in XQuery 3.1 to process each group of equal values and test whether the group size is 2 or more.
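A minimal sketch of the group by approach, assuming XQuery 3.0 or later. The brackets wrapper element, the changed third value, and the duplicate output element are hypothetical and only there to make the example self-contained.

```xquery
xquery version "3.1";

(: Inline sample data; the <brackets> wrapper is a hypothetical container. :)
let $doc :=
  <brackets>
    <bracketQualifier><bracketSequenceNumber>1</bracketSequenceNumber></bracketQualifier>
    <bracketQualifier><bracketSequenceNumber>1</bracketSequenceNumber></bracketQualifier>
    <bracketQualifier><bracketSequenceNumber>2</bracketSequenceNumber></bracketQualifier>
  </brackets>
for $n in $doc//bracketSequenceNumber
(: Put elements with equal string values into the same group. :)
group by $value := string($n)
(: Keep only groups with two or more members, i.e. the duplicates. :)
where count($n) ge 2
return <duplicate value="{$value}" count="{count($n)}"/>
```

With this sample data the query reports the value 1 as occurring twice. To raise an error instead of reporting, the return clause could call fn:error with an error code of your choosing.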

limit the result using cts:collection-match in marklogic

I have 2 collections.
/test/1
This has 10 documents with id
test1_1
test1_2
.....
test1_10
/test/2
This has 20 documents with id as follows
test2_1
test2_2
.....
test2_20
Query:
let $result := cts:collection-match("/test/*")
let $id :=(
fn:distinct-values(
for $doc in fn:collection($result)
order by $doc//timestamp descending
return $doc//timestamp/text()
)[1 to 5]
)
return $id
I want to return the top 5 documents from each collection in descending order of timestamp, but the query returns only 5 documents in total rather than 10 (the top 5 from each collection).
When $result is a sequence of greater than one item, writing for $doc in fn:collection($result) aggregates all of the documents from multiple collections into a single sequence. You need to iterate over collections first, then iterate over the values in each collection, ordered and limited.
let $collections := cts:collection-match("/test/*")
let $id :=
for $collection in $collections
return
fn:distinct-values(
for $doc in fn:collection($collection)
order by $doc//timestamp descending
return $doc//timestamp/string()
)[1 to 5]
return $id

How to get the difference URI between 2 collection in marklogic

I have two collections.
I need to get the URIs that differ between the two collections, compared by file name.
Example Scenario :
Collection 1:
/data/1.xml
/data/2.xml
/data/3.xml
collection 2:
/test/1.xml
/test/2.xml
/test/3.xml
/test/4.xml
/test/5.xml
output:
/data/1.xml
/data/2.xml
/data/3.xml
/test/4.xml
/test/5.xml
Using set delta as David suggests is correct, but you will first need to generate filename keys for the URIs. Maps are very helpful here because they make it easy to keep a filename key associated with its original URI.
First generate two maps with filename keys and URI values. Then, using set delta on the map keys, generate a sequence of diff filenames. Finally, get the URIs for those filenames from their source map:
let $x := (
"/data/1.xml",
"/data/2.xml",
"/data/3.xml")
let $y := (
"/test/1.xml",
"/test/2.xml",
"/test/3.xml",
"/test/4.xml",
"/test/5.xml")
let $map-x := map:new($x ! map:entry(tokenize(., '/')[last()], .))
let $map-y := map:new($y ! map:entry(tokenize(., '/')[last()], .))
let $keys-diff-y := map:keys($map-y)[not(. = map:keys($map-x))]
let $diff-y := map:get($map-y, $keys-diff-y)
return ($x, $diff-y)
Two alternative solutions:
First approach: put each item in the map under a consistent key (the substring after the last slash), then select the first item in the map for each key:
let $x := (
"/data/1.xml",
"/data/2.xml",
"/data/3.xml")
let $y := (
"/test/1.xml",
"/test/2.xml",
"/test/3.xml",
"/test/4.xml",
"/test/5.xml")
let $intersection := map:map()
let $_ := ($x, $y) ! (
let $key := tokenize(., "/")[last()]
return
map:put($intersection, $key, (map:get($intersection, $key), .))
)
return
for $key in map:keys($intersection)
for $uri in map:get($intersection, $key)[1]
order by number(replace($uri, ".*/(\d+).xml", '$1'))
return $uri
Second approach, ensure that only the first item is set for a given key:
let $x := (
"/data/1.xml",
"/data/2.xml",
"/data/3.xml")
let $y := (
"/test/1.xml",
"/test/2.xml",
"/test/3.xml",
"/test/4.xml",
"/test/5.xml")
let $intersection := map:map()
let $_ := ($x, $y) ! (
let $key := tokenize(., "/")[last()]
return
if (fn:exists(map:get($intersection, $key))) then ()
else map:put($intersection, $key, .)
)
return
for $uri in map:get($intersection, map:keys($intersection))
order by number(replace($uri, ".*/(\d+).xml", '$1'))
return $uri
The order by is optional, but with maps you may not get consistent ordering of the keys. Customize it for what you need (e.g. /data/ URIs first, then /test/ URIs), or remove it if you don't care about the order of the URIs.
Set notation:
Delta: (Yields 'a')
let $c1 := ('a', 'b', 'c')
let $c2 := ('b', 'c', 'd')
return $c1[fn:not(.= $c2)]
Intersection: (Yields b,c)
let $c1 := ('a', 'b', 'c')
let $c2 := ('b', 'c', 'd')
return $c1[.= $c2]
Reverse c1 and c2 for the other two permutations.
For a good read, check out this post from Dave Cassel
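The set-delta predicate can also be applied directly to the filename part of each URI. The following is a minimal sketch in standard XQuery 3.1 that does not rely on the MarkLogic map functions; local:filename is a hypothetical helper introduced only for this example, and the URI lists are the ones from the question.

```xquery
xquery version "3.1";

(: Hypothetical helper: the part of a URI after the last slash. :)
declare function local:filename($uri as xs:string) as xs:string {
  tokenize($uri, "/")[last()]
};

let $x := ("/data/1.xml", "/data/2.xml", "/data/3.xml")
let $y := ("/test/1.xml", "/test/2.xml", "/test/3.xml",
           "/test/4.xml", "/test/5.xml")
(: Filenames already present in the first collection. :)
let $names-x := $x ! local:filename(.)
(: Keep everything from $x, plus the $y URIs whose filename is not in $x. :)
return ($x, $y[not(local:filename(.) = $names-x)])
```

For the sample data this returns the three /data/ URIs followed by /test/4.xml and /test/5.xml. The general comparison is evaluated against the whole $names-x sequence for each item, so for very large collections a map-based lookup as in the answers above may scale better.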

Xquery where clause in nested for loop

I have 2 xml documents that I need to reference in a nested FLWOR statement.
MJH source.xml
<data-set>
<record>
<table>CONTACT</table>
<column>AGGREGATE_REPORT</column>
<change>Dropped</change>
<logic>Related to agCenter</logic>
</record>
<record>
<table>QNR_DESIGN_TEMPLATE</table>
<column>LOGO1</column>
<change>Dropped</change>
<logic>Related to agCenter</logic>
</record>
</data-set>
This is the outer document and I need to extract the 'table' and 'column' values per row.
MVReport - a collection of report documents that contain SQL statements.
....
<xml-property name="queryText"><![CDATA[select v.*
from poi_visit_details_v v
where v.site_id=?
and v.std_code_id in ('15', 'TelephoneMonitoringVisit')
and to_char(v.actual_visit_date, 'yyyy-mm-dd') = substr(?, 0, 10)
and (reports_access_pk.site_access(v.SITE_ID, ?) = 'Y'
or reports_access_pk.superuser_access(?) = 'Y')
]]></xml-property>
<xml-property name="queryText"><![CDATA[select v.*
from poi_visit_details_v v
where v.site_id=?
and v.std_code_id in ('15', 'TelephoneMonitoringVisit')
and to_char(v.actual_visit_date, 'yyyy-mm-dd') = substr(?, 0, 10)
and (reports_access_pk.site_access(v.SITE_ID, ?) = 'Y'
or reports_access_pk.superuser_access(?) = 'Y')
]]></xml-property>
....
This collection is the inner loop where I need to find all SQL statements that contain the word "select" and the 'table', 'column' values returned from the outer loop.
Here is my xquery.
The issue I have is how to reference the outer 'table','column' values in the inner loop's "where ... and ..." clauses?
for $target in doc("H:\MJH source.xml")/data-set/record
let $tTable := $target/table
let $tColumn := $target/column
for $sql in collection("MVReport") //*:xml-property[@name = "queryText"]
where $sql contains text "select"
and $sql contains $tTable
and $sql contains $tColumn
let $report := base-uri()
return <report><name>{$sql/base-uri()}</name><sql>{$sql}</sql></report>
Thanks for your help.

Selecting element attributes using boolean operators (and, or, not) in XQuery

I want to filter the results of the following XQuery:
for $units in $data//*[@id = $ids and (@xref = $a or @xref = $b)]/@id
How do I select the elements with a matching @id value and an @xref attribute that matches either $a or $b, but not both $a and $b?
Both $a and $b are node sets with tokenized values, which both act as identifiers. The wanted identifier may be stored in either $a or $b.
My intention is that if $a matches the @xref attribute, the query does not check for $b.
The best option would be xor, but XQuery has no xor operator.
Comparing the two boolean results with != or ne has the same effect:
for $units in $data//*[@id = $ids and ((@xref = $a) ne (@xref = $b))]/@id
If $a and $b each hold a single value, using eq instead of = should be faster. Note that eq raises a type error when an operand is a sequence of more than one item:
for $units in $data//*[@id = $ids and ((@xref eq $a) ne (@xref eq $b))]/@id
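The following minimal sketch shows the exclusive-or behaviour of the ne comparison on inline data. The units and unit elements and all the values are made up for illustration; $a and $b are sequences here, so the general comparison = is used.

```xquery
xquery version "3.1";

(: Hypothetical sample data: one element per combination of matches. :)
let $data :=
  <units>
    <unit id="u1" xref="a1"/>    <!-- matches $a only -->
    <unit id="u2" xref="b1"/>    <!-- matches $b only -->
    <unit id="u3" xref="both"/>  <!-- matches $a and $b -->
    <unit id="u4" xref="none"/>  <!-- matches neither -->
  </units>
let $ids := ("u1", "u2", "u3", "u4")
let $a := ("a1", "both")
let $b := ("b1", "both")
(: (p ne q) on two booleans is true exactly when one of them is true. :)
return $data//*[@id = $ids and ((@xref = $a) ne (@xref = $b))]/string(@id)
```

This returns "u1" and "u2": u3 is excluded because both comparisons are true, and u4 because both are false.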

Resources