Distinct attribute names - xquery

With XQuery I want to select a special value from every article within a product.
What I currently have:
Input XML (extract):
<product type="product" id="2246091">
<product type="article">
<attribute identifier="EXAMPLE1" type="BOOLEAN">0</attribute>
<attribute identifier="EXAMPLE2" type="BOOLEAN">1</attribute>
</product>
<product type="article">
<attribute identifier="EXAMPLE1" type="BOOLEAN">1</attribute>
<attribute identifier="EXAMPLE2" type="BOOLEAN">1</attribute>
</product>
<product type="article">
<attribute identifier="EXAMPLE1" type="BOOLEAN">0</attribute>
<attribute identifier="EXAMPLE2" type="BOOLEAN">1</attribute>
</product>
</product>
XQuery:
for $i in //product
[@type = 'product'
and @id = '2246091']
//attribute
[@type='BOOLEAN'
and @identifier= ('EXAMPLE1', 'EXAMPLE2') ]
where $i = '1'
return $i
This returns every attribute element from every article under the product whose content is '1' and whose identifier is EXAMPLE1 or EXAMPLE2.
The same attribute identifier (e.g. EXAMPLE1) can occur in article 1 and again in article 2.
What I get:
<?xml version="1.0" encoding="UTF-8"?>
<attribute identifier="EXAMPLE2" type="BOOLEAN">1</attribute>
<attribute identifier="EXAMPLE1" type="BOOLEAN">1</attribute>
<attribute identifier="EXAMPLE2" type="BOOLEAN">1</attribute>
<attribute identifier="EXAMPLE2" type="BOOLEAN">1</attribute>
I tried wrapping my for loop in distinct-values(), but that returns only '1'.
What I would like is to get every attribute only once:
<attribute identifier="EXAMPLE2" type="BOOLEAN">1</attribute>
<attribute identifier="EXAMPLE1" type="BOOLEAN">1</attribute>

It sounds as if what you want is to see one attribute element for each distinct value of the identifier attribute found among the attribute elements whose content is 1. (Or, slightly more challengingly, one attribute element for each set of equivalent attribute elements, where equivalence is defined by deep-equal().)
The distinct-values() function isn't helping you here, because it coerces any input nodes into simple values (here, 1).
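A small illustration of that atomization, using two elements copied from the question as literal input (a sketch, not part of the original query):

```xquery
(: Both elements atomize to the same value, so only a single
   atomic "1" survives; the elements and their attributes are gone. :)
distinct-values((
  <attribute identifier="EXAMPLE1" type="BOOLEAN">1</attribute>,
  <attribute identifier="EXAMPLE2" type="BOOLEAN">1</attribute>
))
```

The result is one atomic value, not two elements, which is why wrapping the loop in distinct-values() loses the identifier information.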
If matching on the identifier attribute suffices
If the identifier attribute suffices to establish equivalence among the elements, then something like the following should suffice (not tested):
let $ones := //product[@type = 'product'
               and @id = '2246091']
             //attribute[@type='BOOLEAN'
               and @identifier =
               ('EXAMPLE1', 'EXAMPLE2') ]
             [. = '1'],
    $ids := distinct-values($ones/@identifier)
for $id in $ids
return ($ones[@identifier = $id])[1]
If a more general equivalence test is needed
If @identifier does not suffice to establish equivalence for your purposes, you will have to do something more complicated. In the general case, one way to do it is to write a function of two arguments (I'll call it local:equivalent()) which returns true iff the two arguments are equivalent for your purposes. Then write a second function that accepts a sequence of items and removes duplicates from the sequence (where 'being a duplicate' means 'returning true on local:equivalent()'). Something like this might work as a first approximation (not tested):
(: dedup#1: remove duplicates from a sequence :)
declare function local:dedup(
  $items as item()*
) as item()* {
  local:dedup($items, ())
};
(: dedup#2: work through the input sequence one
   by one, skipping duplicates and accumulating
   non-duplicates. Cost is roughly n^2 / 2 comparisons. :)
declare function local:dedup(
  $in as item()*,
  $out as item()*
) as item()* {
  if (empty($in))
  then $out
  else let $car := head($in)
       return if (some $i in $out
                  satisfies
                  local:equivalent($i, $car))
              then local:dedup(tail($in), $out)
              else local:dedup(tail($in), ($out, $car))
};
(: equivalent#2: true iff arguments are equivalent :)
declare function local:equivalent(
  $x as item(), $y as item()
) as xs:boolean {
  (: determine application-specific equivalence
     however you like ... :)
  deep-equal($x, $y)
};
(: Now do the work :)
let $ones := //product[@type = 'product'
               and @id = '2246091']
             //attribute[@type='BOOLEAN'
               and @identifier =
               ('EXAMPLE1', 'EXAMPLE2') ]
             [. = '1']
return local:dedup($ones)
Those comfortable with higher-order functions will want to go a step further and remove the dependency on having a function named local:equivalent by allowing both local:dedup functions to accept an additional argument providing the equivalence function.
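A rough sketch of that higher-order variant, assuming an XQuery 3.0 (or later) processor. The parameter name $eq and the sample input sequence are made up for illustration, and the code is as untested as the version above:

```xquery
xquery version "3.0";

(: dedup#2: entry point; $eq is the caller-supplied equivalence test
   (the name $eq is an assumption made for this sketch) :)
declare function local:dedup(
  $items as item()*,
  $eq as function(item(), item()) as xs:boolean
) as item()* {
  local:dedup($items, (), $eq)
};

(: dedup#3: keep an item only if nothing already kept is equivalent to it :)
declare function local:dedup(
  $in as item()*,
  $out as item()*,
  $eq as function(item(), item()) as xs:boolean
) as item()* {
  if (empty($in))
  then $out
  else
    let $car := head($in)
    return
      if (some $i in $out satisfies $eq($i, $car))
      then local:dedup(tail($in), $out, $eq)
      else local:dedup(tail($in), ($out, $car), $eq)
};

(: hypothetical input: the third element duplicates the first :)
local:dedup((<a>1</a>, <b>2</b>, <a>1</a>), deep-equal#2)
```

Here the built-in deep-equal is passed as a named function reference. An inline function such as function($x, $y) { $x/@identifier = $y/@identifier } could be passed instead to compare on the identifier attribute only.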

Related

Why does the first xquery statement work but the second doesn't

I'm new to XQuery. Why does the first XQuery statement work but the second doesn't? The first has multiple XML elements at the second level, and the second has multiple at the top level.
let $payload := <root><foo>bar</foo></root>
return
<root>
{
if (exists($payload/foo)) then
<prop>
<key>mykey</key>
<value>bar</value>
</prop>
else
""
}
</root>
and this doesn't
let $payload := <root><foo>bar</foo></root>
return
<root>
{
if (exists($payload/foo)) then
<key>mykey</key>
<value>bar</value>
else
""
}
</root>
You will need to wrap your elements in parentheses and separate them with commas, as there is no enclosing root element:
if (exists($payload/foo)) then (
<key>mykey</key>,
<value>bar</value>
) else (
""
)
A single element constructor is a valid expression:
<key>mykey</key>
A sequence of two element constructors (with no separator) is not:
<key>mykey</key>
<value>bar</value>
Note that this differs from XSLT, where such element constructors (called literal result elements) always appear as part of a "sequence constructor", and a sequence constructor allows multiple elements to appear.
Since you are starting to learn XQuery, it might interest you that you can return an empty sequence instead of an empty string.
let $payload := <root><foo>bar</foo></root>
return
<root>
{
if (exists($payload/foo))
then (
<key>mykey</key>,
<value>bar</value>
)
else ()
}
</root>

XQuery comparing variables as numbers

I want to produce an XQuery that outputs the price of three CDs, less the price of the cheapest CD. I have produced a user defined function sumtwomax() which takes three integer parameters and produces the sum of the largest two numbers. This works when I supply it with numbers.
But I have a problem when supplying it with variables from a FLWOR expression. Could someone please help me with this?
Here is my XML code:
<items>
<item>
<code>c002</code>
<price>10</price>
<rating>5</rating>
</item>
<item>
<code>c006</code>
<price>15</price>
<rating>3</rating>
</item>
<item>
<code>c004</code>
<price>12</price>
<rating>3</rating>
</item>
<item>
<code>c001</code>
<price>7</price>
<rating>5</rating>
</item>
<item>
<code>c003</code>
<price>10</price>
<rating>4</rating>
</item>
<item>
<code>c005</code>
<price>8</price>
<rating>4</rating>
</item>
</items>
And here is my XQuery:
declare namespace myfn = "http://www.brookes.ac.uk/P00601/xquery/functions";
declare function myfn:sumtwomax( $first, $sec, $third) { sum(($first, $sec, $third)) - min (($first, $sec, $third))};
for $d in doc("shop.xml") //item
let $price1 := xs:integer($d/price/data()[$d/code="c002"])
let $price2 := xs:integer($d/price/data()[$d/code="c004"])
let $price3 := xs:integer($d/price/data()[$d/code="c006"])
return
myfn:sumtwomax($price1, $price2, $price3)
This produces '0 0 0' as a result, instead of the desired value '15'. Could someone please help with this?
A FLWOR expression doesn't make any sense in this context: It's only evaluating a single item at a time (and returning one result per item), but the expressions you run inside that loop only return useful results if they can search through all the items, to be able to fill out all three variables (as opposed to only the variable associated with the single item being iterated over by the loop at that time).
Consider instead iterating over items elements (of which there's only one), if you really want to make this a FLWOR:
declare namespace myfn = "http://www.brookes.ac.uk/P00601/xquery/functions";
declare function myfn:sumtwomax( $first, $sec, $third) { sum(($first, $sec, $third)) - min (($first, $sec, $third))};
for $d in //items
let $price1 := xs:integer($d/item[code="c002"]/price)
let $price2 := xs:integer($d/item[code="c004"]/price)
let $price3 := xs:integer($d/item[code="c006"]/price)
return myfn:sumtwomax($price1, $price2, $price3)
...or, removing the needless for entirely:
declare namespace myfn = "http://www.brookes.ac.uk/P00601/xquery/functions";
declare function myfn:sumtwomax( $first, $sec, $third) { sum(($first, $sec, $third)) - min (($first, $sec, $third))};
let $price1 := xs:integer(//item[code="c002"]/price)
let $price2 := xs:integer(//item[code="c004"]/price)
let $price3 := xs:integer(//item[code="c006"]/price)
return myfn:sumtwomax($price1, $price2, $price3)
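To see where the zeros in the original query come from, it may help to run a cut-down version of the per-item loop. This is only a sketch: the two-item inline document and the reduction to two price variables are simplifications made for illustration, and the data() step assumes an XQuery 3.0 processor.

```xquery
xquery version "3.0";

(: cut-down copy of the sample data, inlined so the sketch is self-contained :)
let $items :=
  <items>
    <item><code>c002</code><price>10</price></item>
    <item><code>c001</code><price>7</price></item>
  </items>
for $d in $items/item
(: each variable is non-empty only while $d is the item with that code :)
let $price1 := xs:integer($d/price/data()[$d/code = "c002"])
let $price2 := xs:integer($d/price/data()[$d/code = "c004"])
return sum(($price1, $price2)) - min(($price1, $price2))
```

For the c002 item only $price1 is set, so the sum and the minimum are both 10 and the difference is 0. For the c001 item both variables are empty, so sum(()) is 0 but min(()) is the empty sequence, and the subtraction yields nothing at all. With the full data, the three matching items each contribute one 0, which gives the '0 0 0' output.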

XQuery – counting parent’s preceding siblings

I would like to count the preceding siblings of the highest-level div in an ePub (for footnote numbering). I need to store the value in an attribute before passing the notes through XSLT.
for $note in doc('/db/custom_jh/bukwor.xml')//tei:note[@place='bottom']
let $parent := count($note[preceding-sibling::tei:div[@n='1']])
let $update := update insert attribute att2 {$parent} into $note
return $note
Attempts with $note[preceding-sibling::tei:div[@n='1']] or $note[ancestor-or-self::tei:div[@n='1']] return just 0 or the total count of all the divs.
Something like <xsl:number level="any" select="tei:div[@n='1']"/> from XSLT, if possible.
UPDATE
The most minimal code for counting (still not working; it returns only 6 × 1, but should give at least one 2):
for $note at $count in doc('/db/custom_jh/bukwor.xml')//tei:note[@place='bottom']
let $parent := count($note[ancestor-or-self::*/tei:div[@n='1']])
return $parent
I don't know the ePub XML format, and no sample XML is provided, so the requirement isn't entirely clear to me. Going by the title, though, you might want something like this:
let $parent := count($note/parent::*/preceding-sibling::tei:div[@n='1'])
This counts the tei:div elements with an n attribute equal to 1 that are preceding siblings of the parent element of the current $note.
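A self-contained sketch of that expression follows. The question gives no sample XML, so the inline document is entirely hypothetical: it assumes each note sits directly inside a tei:div that has n="1", and that those divs are siblings.

```xquery
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: hypothetical input: two sibling chapters, one bottom note in each :)
let $doc :=
  <tei:body>
    <tei:div n="1"><tei:note place="bottom">first note</tei:note></tei:div>
    <tei:div n="1"><tei:note place="bottom">second note</tei:note></tei:div>
  </tei:body>
for $note in $doc//tei:note[@place = 'bottom']
(: step up to the parent div, then count the matching divs before it :)
return count($note/parent::*/preceding-sibling::tei:div[@n = '1'])
```

Under these assumptions the first note has no preceding chapter and the second has one, so the counts differ per note. If the notes are nested more deeply, the parent::* step would need to be replaced by a step that reaches the enclosing chapter div.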
My original example was poorly chosen, so in the end I restructured the whole thing. At the moment, I do it this way:
let $chaps :=
(
let $countAll := count($doc//tei:note)
for $chapter at $count in $doc//tei:div[@n='1']
let $countPreceding := count($chapter/preceding::tei:div[@n='1']//tei:note[@place='bottom'])
let $params :=
<parameters>
<param name="footnoteNo" value="{$countPreceding}"/>
</parameters>
return
<entry name="OEBPS/chapter-{$count}.xhtml" type="xml">
{
transform:transform($chapter, doc("/db/custom_jh/xslt/style-web.xsl"), $params)
}
</entry>
)
The count($chapter/preceding::tei:div[@n='1']//tei:note[@place='bottom']) does the trick for me. (I need to collect all footnotes in one file and make backlinks to locations of their indexes in different files).

How to find the lowest common ancestor of two nodes in XQuery?

Suppose the input XML is
<root>
<entry>
<title>Test</title>
<author>Me</author>
</entry>
</root>
I would like to find the lowest common ancestor of title and author.
I tried the following code in BaseX:
let $p := doc('t.xq')//title,
$q := doc('t.xq')//author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())
return
$cla
But it returns nothing (blank output).
Your code works totally fine for me, apart from returning all common ancestors.
The Last Common Ancestor
Since they're returned in document order and the last common ancestor must also be the last node, simply extend with a [last()] predicate.
declare context item := document {
<root>
<entry>
<title>Test</title>
<author>Me</author>
</entry>
</root>
};
let $p := //title,
$q := //author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())[last()]
return
$cla
Files and Databases
If the query you posted does not return anything, you might be working on a file t.xq. intersect requires all nodes being compared to belong to the same database, and each invocation of doc(...) on a file creates a new in-memory database. Either create a database in BaseX with the contents, or do something like
declare variable $doc := doc('t.xq');
and replace subsequent doc(...) calls by $doc (which now references a single in-memory database created for the file).
This is one possible way :
let $db := doc('t.xq'),
$q := $db//*[.//title and .//author][not(.//*[.//title and .//author])]
return
$q
Brief explanation:
[.//title and .//author] : The first predicate selects elements that have both a title and an author descendant.
[not(.//*[.//title and .//author])] : The second predicate then rejects any element that has a descendant meeting the same criteria, so overall only the innermost elements matching the first predicate are kept.
output :
<entry>
<title>Test</title>
<author>Me</author>
</entry>
I replaced the repeated doc('t.xq') calls in $p and $q with a single variable $db, as follows. Now it works (plus, I used last() to get the last, i.e. lowest, common ancestor).
let
$db := doc('t.xq'),
$p := $db//title,
$q := $db//author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())[last()]
return $cla

How do I get BaseX to return multiple elements in a nested XQuery?

BaseX is complaining about a nested query of mine. I do not understand why it cannot return multiple lines like it did in the first query. The error says, "Expecting }, found >" and the > it is referring to is the > after name under trips. It works fine if the } is after the close-bracket for id, but obviously, that's not what I want. Here is the query:
for $u in doc("export.xml")/database/USERS/tuple
return
<user>
<login>{$u/USERNAME/text()}</login>
<email></email>
<name></name>
<affiliation></affiliation>
<friend></friend>
<trip>
{for $t in doc("export.xml")/database/TRIPS/tuple
where $t/ADMIN/text() = $u/USERNAME/text()
return
<id> {$t/ID/text()} </id>
<name> {$t/NAME/text()} </name> (: Error is here with <name> :)
<feature> {$t/FEATURE/text()} </feature>
<privacyFlag> {$t/PRIVACY/text() </privacyFlag>)
}
</trip>
</user>
If you want to return multiple items, you need to wrap them in a sequence ($item1, $item2, ..., $itemN). In your case:
for $t in doc("export.xml")/database/TRIPS/tuple
where $t/ADMIN/text() = $u/USERNAME/text()
return (
<id> {$t/ID/text()} </id>,
<name> {$t/NAME/text()} </name>,
<feature> {$t/FEATURE/text()} </feature>,
<privacyFlag> {$t/PRIVACY/text()} </privacyFlag>
)
But I'm unsure whether this does what you expect, or whether you actually want to create one trip element per trip. In that case each iteration returns a single trip element, so no sequence is needed (the same holds for the outer FLWOR loop, where the <user/> element wraps everything in a single element):
for $t in doc("export.xml")/database/TRIPS/tuple
where $t/ADMIN/text() = $u/USERNAME/text()
return
<trip>
<id> {$t/ID/text()} </id>
<name> {$t/NAME/text()} </name>
<feature> {$t/FEATURE/text()} </feature>
<privacyFlag> {$t/PRIVACY/text()} </privacyFlag>
</trip>

Resources