Prioritizing data while sorting in DbXML - xquery

Using XQuery in DbXML, I want to prioritize some elements depending on whether multiple nodes are set to certain values.
I want to be able to show three of these elements at the top and the rest below.
<properties>
<property>
<zip_code>5550</zip_code>
<agency>ABC</agency>
</property>
<property>
<zip_code>5550</zip_code>
<agency>DEF</agency>
</property>
<property>
<zip_code>5550</zip_code>
<agency>DEF</agency>
</property>
<property>
<zip_code>XYZ</zip_code>
<agency>ABC</agency>
</property>
</properties>
We get this XML in a property search page. Real search results will contain hundreds of records, but we only take the first 10 records to display on the first page. Here we need to apply a sort order that always shows properties of the "ABC" agency followed by zip code "XYZ" on top. If the total result set does not contain these agencies, we can show the results in the normal sort order.

XQuery's FLWOR expressions support order by, which can sort by arbitrary values, including computed ones. Use an expression that decides whether some product is a "top product" or not (resulting in a boolean value).
Afterwards, split up the result sequence to highlight only a certain number of results and to limit the total number of results.
let $highlighted := 3
let $total := 10
let $sorted :=
for $p in //property
(: order by highlighting predicate :)
order by $p/agency eq "ABC" and $p/zip_code eq "XYZ" descending
return $p
return (
(: first $highlighted elements as defined by predicates above :)
$sorted[ position() = (1 to $highlighted) ],
(: the other elements, `/.` forces sorting back to document order :)
$sorted[ position() = ($highlighted + 1 to $total) ]/.
)
The boolean expression can get arbitrarily complex to be more precise about top products, for example by limiting them to TVs or requiring some minimum price.
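If the requirement is read as two tiers (agency "ABC" first, then zip code "XYZ", then everything else), a single boolean is not enough. The following is only a sketch under that assumed reading: the $properties variable, the $rank variable and the three rank values are illustrative names that do not appear in the original, and the inline data is copied from the question.

```xquery
(: Inline copy of the sample data from the question :)
let $properties :=
  <properties>
    <property><zip_code>5550</zip_code><agency>ABC</agency></property>
    <property><zip_code>5550</zip_code><agency>DEF</agency></property>
    <property><zip_code>5550</zip_code><agency>DEF</agency></property>
    <property><zip_code>XYZ</zip_code><agency>ABC</agency></property>
  </properties>
for $p in $properties/property
(: assumed tiers: 1 = agency ABC, 2 = zip code XYZ only, 3 = everything else :)
let $rank :=
  if ($p/agency = "ABC") then 1
  else if ($p/zip_code = "XYZ") then 2
  else 3
(: "stable" keeps the incoming order among properties with the same rank :)
stable order by $rank ascending
return $p
```

With this data both "ABC" properties come first and the two "DEF" properties follow in their original order. The result can then be cut to the first page in the same way as $sorted above.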

Related

XQuery comparing variables as numbers

I want to produce an XQuery that outputs the price of three CDs, less the price of the cheapest CD. I have produced a user defined function sumtwomax() which takes three integer parameters and produces the sum of the largest two numbers. This works when I supply it with numbers.
But I have a problem when supplying it with variables from a FLWOR expression. Could someone please help me with this?
Here is my XML code:
<items>
<item>
<code>c002</code>
<price>10</price>
<rating>5</rating>
</item>
<item>
<code>c006</code>
<price>15</price>
<rating>3</rating>
</item>
<item>
<code>c004</code>
<price>12</price>
<rating>3</rating>
</item>
<item>
<code>c001</code>
<price>7</price>
<rating>5</rating>
</item>
<item>
<code>c003</code>
<price>10</price>
<rating>4</rating>
</item>
<item>
<code>c005</code>
<price>8</price>
<rating>4</rating>
</item>
</items>
And here is my XQuery:
declare namespace myfn = "http://www.brookes.ac.uk/P00601/xquery/functions";
declare function myfn:sumtwomax( $first, $sec, $third) { sum(($first, $sec, $third)) - min (($first, $sec, $third))};
for $d in doc("shop.xml") //item
let $price1 := xs:integer($d/price/data()[$d/code="c002"])
let $price2 := xs:integer($d/price/data()[$d/code="c004"])
let $price3 := xs:integer($d/price/data()[$d/code="c006"])
return
myfn:sumtwomax($price1, $price2, $price3)
This produces '0 0 0' as a result, instead of the desired value '15'. Could someone please help with this?
A FLWOR expression doesn't make any sense in this context: It's only evaluating a single item at a time (and returning one result per item), but the expressions you run inside that loop only return useful results if they can search through all the items, to be able to fill out all three variables (as opposed to only the variable associated with the single item being iterated over by the loop at that time).
Consider instead iterating over items elements (of which there's only one), if you really want to make this a FLWOR:
declare namespace myfn = "http://www.brookes.ac.uk/P00601/xquery/functions";
declare function myfn:sumtwomax( $first, $sec, $third) { sum(($first, $sec, $third)) - min (($first, $sec, $third))};
for $d in doc("shop.xml")//items
let $price1 := xs:integer($d/item[code="c002"]/price)
let $price2 := xs:integer($d/item[code="c004"]/price)
let $price3 := xs:integer($d/item[code="c006"]/price)
return myfn:sumtwomax($price1, $price2, $price3)
...or, removing the needless for entirely:
declare namespace myfn = "http://www.brookes.ac.uk/P00601/xquery/functions";
declare function myfn:sumtwomax( $first, $sec, $third) { sum(($first, $sec, $third)) - min (($first, $sec, $third))};
let $price1 := xs:integer(doc("shop.xml")//item[code="c002"]/price)
let $price2 := xs:integer(doc("shop.xml")//item[code="c004"]/price)
let $price3 := xs:integer(doc("shop.xml")//item[code="c006"]/price)
return myfn:sumtwomax($price1, $price2, $price3)
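The same idea also works without three separate variables. This is a minimal sketch, not the asker's function: $items, $codes and $prices are names introduced here for illustration, and the inline data is a shortened copy of the three relevant items from shop.xml.

```xquery
(: Shortened inline copy of the three items being priced :)
let $items :=
  <items>
    <item><code>c002</code><price>10</price></item>
    <item><code>c006</code><price>15</price></item>
    <item><code>c004</code><price>12</price></item>
  </items>
let $codes := ("c002", "c004", "c006")
(: look up every price against the whole item list, not one item at a time :)
let $prices :=
  for $code in $codes
  return xs:integer($items/item[code = $code]/price)
(: total of all selected prices, less the cheapest one :)
return sum($prices) - min($prices)
```

For the prices 10, 12 and 15 this evaluates to 27, that is 37 minus the cheapest price of 10.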

XQuery – counting parent’s preceding siblings

I would like to count the preceding siblings of the highest-level div in an ePub (for a footnote). I need to pass the value to an attribute before passing the notes through XSLT.
for $note in doc('/db/custom_jh/bukwor.xml')//tei:note[@place='bottom']
let $parent := count($note[preceding-sibling::tei:div[@n='1']])
let $update := update insert attribute att2 {$parent} into $note
return $note
Attempts with $note[preceding-sibling::tei:div[@n='1']] or $note[ancestor-or-self::tei:div[@n='1']] return just 0 or the total number of all the divs.
I am looking for something like <xsl:number level="any" select="tei:div[@n='1']"/> from XSLT, if possible.
UPDATE
The very minimal code for counting (still not working: it returns 1 six times, but at least one result should be 2):
for $note at $count in doc('/db/custom_jh/bukwor.xml')//tei:note[@place='bottom']
let $parent := count($note[ancestor-or-self::*/tei:div[@n='1']])
return $parent
I don't know the ePub XML format, and no sample XML is provided, so the requirement isn't clear, at least to me. But according to the title, you might want something like this:
let $parent := count($note/parent::*/preceding-sibling::tei:div[@n='1'])
This counts the tei:div elements that are preceding siblings of the parent element of the current $note and whose n attribute equals 1.
The whole example was slightly bad. Finally, I restructured the whole thing. At the moment, I do it this way:
let $chaps :=
(
let $countAll := count($doc//tei:note)
for $chapter at $count in $doc//tei:div[@n='1']
let $countPreceding := count($chapter/preceding::tei:div[@n='1']//tei:note[@place='bottom'])
let $params :=
<parameters>
<param name="footnoteNo" value="{$countPreceding}"/>
</parameters>
return
<entry name="OEBPS/chapter-{$count}.xhtml" type="xml">
{
transform:transform($chapter, doc("/db/custom_jh/xslt/style-web.xsl"), $params)
}
</entry>
)
The count($chapter/preceding::tei:div[@n='1']//tei:note[@place='bottom']) does the trick for me. (I need to collect all footnotes in one file and make backlinks to locations of their indexes in different files).

How to find the lowest common ancestor of two nodes in XQuery?

Suppose the input XML is
<root>
<entry>
<title>Test</title>
<author>Me</author>
</entry>
</root>
I would like to find the lowest common ancestor of title and author.
I tried the following code in BaseX:
let $p := doc('t.xq')//title,
$q := doc('t.xq')//author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())
return
$cla
But it returns nothing (blank output).
Your code works totally fine for me, apart from returning all common ancestors.
The Last Common Ancestor
Since they're returned in document order and the last common ancestor must also be the last node, simply extend with a [last()] predicate.
declare context item := document {
<root>
<entry>
<title>Test</title>
<author>Me</author>
</entry>
</root>
};
let $p := //title,
$q := //author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())[last()]
return
$cla
Files and Databases
If the query you posted does not return anything, you might be working on a file t.xq. intersect requires all compared nodes to be in the same database, and each invocation of doc(...) on a file creates a new in-memory database. Either create a database in BaseX with the contents, or do something like
declare variable $doc := doc('t.xq');
and replace subsequent doc(...) calls by $doc (which now references a single in-memory database created for the file).
This is one possible way:
let $db := doc('t.xq'),
$q := $db//*[.//title and .//author][not(.//*[.//title and .//author])]
return
$q
Brief explanation:
[.//title and .//author]: the first predicate selects elements that have both a title descendant and an author descendant.
[not(.//*[.//title and .//author])]: the second predicate then rejects any element that has a descendant element also matching the first predicate, so overall we only accept the innermost elements matching the first predicate.
Output:
<entry>
<title>Test</title>
<author>Me</author>
</entry>
I replaced the doc('t.xq') calls in front of the variables $p and $q with a single variable $db, as follows. Now it works (plus, I used last() to get the last, i.e. lowest, common ancestor).
let
$db := doc('t.xq'),
$p := $db//title,
$q := $db//author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())[last()]
return $cla

Find the total number of child elements?

<?xml version="1.0" encoding="UTF-8"?>
<root>
<author>
<name>A</name>
<book>Book1</book>
<book>Book2</book>
</author>
<author>
<name>B</name>
<age>45</age>
<book>Book3</book>
</author>
</root>
How do I write an XQuery to display the total number of books by an author?
One approach is:
let $max-books = max(/root/author/count(book))
return /root/author[count(book) = $max-books]
which will return all authors who have authored a maximum number of books.
As a one liner, this can be simplified to:
/root/author[count(book) = max(/root/author/count(book))]
Another way to do this is:
(for $author in /root/author
order by count($author/book) descending
return $author/name)[1]
which will return an author with the maximum number of books.
@Fox: should you not tag this question with "homework"? ;-)
@Oliver Hallam: IMHO, @Fox wants to list each author together with the number of books by that author, not the author with the highest number of books.
Your first query
let $max-books = max(/root/author/count(book))
return /root/author[count(book) = $max-books]
Contains a syntax error. You should use ":=" instead of only "=". Furthermore
@Fox: to find the solution, have a look at FLWOR expressions. You can use the "for" part to select each of the book nodes with an XPath expression and bind each node to a $book variable. Then use the "let" part to define two variables ($authorName and $amountOfBooks) using the $book as "starting point". Last, use the "return" part to define the output format you need for the resulting XML.
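A minimal sketch of such a FLWOR expression might look like the following. It deviates from the hint above in one respect: it iterates over the author elements rather than the book nodes, which yields one result per author. The $root variable and the output element with its name and books attributes are assumptions chosen for illustration; the inline data is the XML from the question.

```xquery
(: Inline copy of the sample data from the question :)
let $root :=
  <root>
    <author><name>A</name><book>Book1</book><book>Book2</book></author>
    <author><name>B</name><age>45</age><book>Book3</book></author>
  </root>
(: one iteration, and therefore one result, per author :)
for $author in $root/author
let $authorName := string($author/name)
let $amountOfBooks := count($author/book)
(: hypothetical output format: one element per author :)
return <author name="{$authorName}" books="{$amountOfBooks}"/>
```

For the sample data this returns one element for author A with books="2" and one for author B with books="1".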

Xquery Top Function

How can I achieve something similar to the TOP function in SQL using XQuery? In other words, how can I select the top 5 elements of something with ties? This should be simple but I'm having trouble finding it with Google.
An example of some data I might want to format looks like this:
<?xml version="1.0"?>
<root>
<value>
<a>first</a>
<b>1</b>
</value>
<value>
<a>third</a>
<b>3</b>
</value>
<value>
<a>second</a>
<b>2</b>
</value>
<value>
<a>2nd</a>
<b>2</b>
</value>
</root>
I want to sort by b for all of the values and return a. To illustrate my problem, say I want to return the top two values with ties.
Thanks
For the provided source XML document:
<root>
<value>
<a>first</a>
<b>1</b>
</value>
<value>
<a>third</a>
<b>3</b>
</value>
<value>
<a>second</a>
<b>2</b>
</value>
<value>
<a>2nd</a>
<b>2</b>
</value>
</root>
To get the first two results "with ties" use:
let $vals :=
for $k in distinct-values(/*/*/b/xs:integer(.))
order by $k
return $k
return
for $a in /*/value[index-of($vals,xs:integer(b)) le 2]/a
order by $a/../b/xs:integer(.)
return $a
When this expression is evaluated, the wanted, correct result is produced:
<a>first</a>
<a>second</a>
<a>2nd</a>
Explanation:
We specify in $vals the sorted sequence of all distinct values of /*/*/b, used as integers. This is necessary, because the function distinct-values() is not guaranteed to produce its result sequence in any predefined order. Also, if we do not convert the values to xs:integer before sorting, they would be sorted as strings and this would generally produce incorrect results.
Then we select only those /*/value/a whose b-sibling's index in the sorted sequence of distinct integer b-values is less than or equal to 2.
Finally, we need to sort the results by their b-sibling's integer values, because otherwise they would be selected in document order.
Do note:
Only this solution at present produces correctly sorted results for any integer values of /*/*/b.
To filter a sequence to the first 5 items you use the fn:position() function:
$sequence[position() le 5]
Do note that when the sequence to filter is a node set resulting from a / step operation, the predicate applies to the last step only, so it is evaluated separately for each parent node. You may therefore need to wrap the path expression in parentheses.
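The difference can be seen in a small sketch using the sample data from the question. The $root variable and the per-parent and whole-sequence wrapper elements are names made up here to label the two results.

```xquery
(: Inline copy of the sample data from the question :)
let $root :=
  <root>
    <value><a>first</a><b>1</b></value>
    <value><a>third</a><b>3</b></value>
    <value><a>second</a><b>2</b></value>
    <value><a>2nd</a><b>2</b></value>
  </root>
return (
  (: predicate bound to the last step: position is counted within each value,
     so every a is at position 1 and all four are returned :)
  <per-parent>{ $root/value/a[position() le 2] }</per-parent>,
  (: parenthesized path: position is counted over the whole sequence,
     so only the first two a elements in document order are returned :)
  <whole-sequence>{ ($root/value/a)[position() le 2] }</whole-sequence>
)
```

Here per-parent contains all four a elements, while whole-sequence contains only "first" and "third", the first two in document order.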
But, to filter a "calculated sequence" (like sorting or tuples filter conditions), you need to use the full power of the FLWOR expression.
This XQuery:
(for $value in /root/value
order by xs:integer($value/b)
return $value/a)[position() le 2]
Output:
<a>first</a><a>second</a>
Note: This is a simple sort. The filter is the outermost expression because this allows lazy evaluation.
This XQuery:
for $key in (for $val in distinct-values(/root/value/b)
order by xs:integer($val)
return $val)[position() le 2]
return /root/value[b=$key]/a
Output:
<a>first</a><a>second</a><a>2nd</a>
Note: This orders the keys first and then returns all the results for the first two keys.
Edit: Added explicit integer casting.
You can use an XPath predicate on the nodes, with the top limit given as a range of positions:
<one>
<two>a</two>
<two>b</two>
<two>c</two>
<two>d</two>
<two>e</two>
<two>f</two>
<two>g</two>
<two>h</two>
</one>
Then
$xml_data/one/two[ position() = (1 to 5) ]
$xml_data/one/two[ position() = ($some_number to last()) ]
With sample input
<one>
<two>a</two>
<two>b</two>
<two>c</two>
<two>d</two>
<two>e</two>
<two>f</two>
<two>g</two>
<two>h</two>
</one>
The following is one way to get the first five rows:
for $two at $index in /one/two
where $index <= 5
return $two
In the MySQL dialect it's not called TOP, but LIMIT. When you google for "limit xquery" you will find:
http://osdir.com/ml/text.xml.exist/2004-02/msg00214.html
and
http://osdir.com/ml/text.xml.exist/2004-08/msg00115.html

Resources