Aggregation of XML elements (sum, grouping and unique values) - xquery

I am new to XQuery. Can you suggest a query that produces the output XML below?
Input XML:
<elem count="4" name="ABC">
<elem count="3" name="VAl_1"/>
<elem count="1" name="VAl_2"/>
</elem>
<elem count="5" name="PQR">
<elem count="2" name="VAl_1"/>
<elem count="3" name="Val_3"/>
</elem>
Output XML:
<elem count="9" name="ABC,PQR">
<elem count="5" name="VAl_1"/>
<elem count="1" name="VAl_2"/>
<elem count="3" name="VAl_3"/>
</elem>
I have a parent node with count and name as attributes. When I combine the parent nodes, the count values should be summed for the parents, and also for any children that have the same name.
This should work recursively for any number of parents.

I made a few assumptions, but the following solves at least the problem as stated:
declare variable $input :=
<input>
<elem count="4" name="ABC">
<elem count="3" name="VAl_1"/>
<elem count="1" name="VAl_2"/>
</elem>
<elem count="5" name="PQR">
<elem count="2" name="VAl_1"/>
<elem count="3" name="Val_3"/>
</elem>
</input>/*;
<elem count="{ sum($input/@count) }"
name="{ string-join(distinct-values($input/@name), ',') }">
{
for $name in distinct-values($input/elem/@name)
let $grp := $input/elem[@name eq $name]
return
<elem count="{ sum($grp/@count) }" name="{ $name }"/>
}
</elem>
The first half of the code just declares a variable holding the example input. The query itself is an element template that uses simple sum() and distinct-values() calls to populate both attributes.
The interesting part is the for loop. By looping over the distinct values of @name, and then selecting all elem children with the same value for each of them, it effectively groups the elems by the key @name.
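If your processor supports XQuery 3.0, the same grouping can also be written with a group by clause. This is only a sketch under that assumption, using the same example input; the variable name $child is mine. Note that the order in which the groups come out is implementation-dependent, so add an order by if you need a fixed order.

```xquery
xquery version "3.0";

declare variable $input :=
  <input>
    <elem count="4" name="ABC">
      <elem count="3" name="VAl_1"/>
      <elem count="1" name="VAl_2"/>
    </elem>
    <elem count="5" name="PQR">
      <elem count="2" name="VAl_1"/>
      <elem count="3" name="Val_3"/>
    </elem>
  </input>/*;

<elem count="{ sum($input/@count) }"
      name="{ string-join(distinct-values($input/@name), ',') }">
{
  (: after "group by", $child holds all children that share the key $name :)
  for $child in $input/elem
  group by $name := $child/@name
  return
    <elem count="{ sum($child/@count) }" name="{ $name }"/>
}
</elem>
```

Like the distinct-values() version, this only merges one level of children; it does not recurse into deeper levels.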

Related

Why does the first xquery statement work but the second doesn't

I'm new to XQuery. Why does the first XQuery statement work but the second doesn't? The first has multiple XML elements at the second level, and the second has multiple elements at the top level.
let $payload := <root><foo>bar</foo></root>
return
<root>
{
if (exists($payload/foo)) then
<prop>
<key>mykey</key>
<value>bar</value>
</prop>
else
""
}
</root>
and this doesn't
let $payload := <root><foo>bar</foo></root>
return
<root>
{
if (exists($payload/foo)) then
<key>mykey</key>
<value>bar</value>
else
""
}
</root>
You need to wrap your elements in parentheses and separate them with commas, as there is no enclosing root element:
if (exists($payload/foo)) then (
<key>mykey</key>,
<value>bar</value>
) else (
""
)
A single element constructor is a valid expression:
<key>mykey</key>
A sequence of two element constructors (with no separator) is not:
<key>mykey</key>
<value>bar</value>
Note that this differs from XSLT, where such element constructors (called literal result elements) always appear as part of a "sequence constructor", and a sequence constructor allows multiple elements to appear.
Since you are starting to learn XQuery, you may be interested to know that you can return an empty sequence instead of an empty string:
let $payload := <root><foo>bar</foo></root>
return
<root>
{
if (exists($payload/foo))
then (
<key>mykey</key>,
<value>bar</value>
)
else ()
}
</root>
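To see the difference between the two, here is a small sketch (the variable names are mine). An empty string is still one item, while the empty sequence is no item at all. Inside element content both should end up producing an empty element, because zero-length text nodes are dropped during construction, but the empty sequence states the intent more clearly.

```xquery
let $with-string := <root>{ "" }</root>
let $with-empty  := <root>{ () }</root>
return (
  count(""),   (: 1, an empty string is still an item :)
  count(()),   (: 0, the empty sequence contains nothing :)
  (: compares the two constructed elements :)
  deep-equal($with-string, $with-empty)
)
```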

What “Attribute node cannot follow non-attribute node in element content” tells me

one-attr.xml
<requestConfirmation xmlns="http://example/confirmation">
<trade>
<amount>
<currency id="settlementCurrency">USD</currency>
<referenceAmount>StandardISDA</referenceAmount>
<cashSettlement>true</cashSettlement>
</amount>
</trade>
</requestConfirmation>
two-attr.xml
<requestConfirmation xmlns="http://example/confirmation">
<trade>
<cal>
<c>PRECEDING</c>
<bcs id="businessCenters">
<bc>USNY</bc>
<bc>GBLO</bc>
</bcs>
</cal>
<amount>
<currency id="settlementCurrency" currencyScheme="http://example/iso4">USD</currency>
<referenceAmount>StandardISDA</referenceAmount>
<cashSettlement>true</cashSettlement>
</amount>
</trade>
</requestConfirmation>
I use XQuery to transform the id attribute into an element. Only two documents out of 70K look like two-attr.xml.
Apparently, the currency element already has the value USD. I get the error below in the MarkLogic QConsole when transforming two-attr.xml, and a very similar error in Oxygen.
XDMP-ATTRSEQ: (err:XQTY0024) $node/@*[fn:local-name(.) = $attr] -- Attribute node cannot follow non-attribute node in element content
My XQuery module:
declare namespace hof = "http://fc.fasset/function";
declare function hof:remove-attr-except
( $node as node()* ,
$newNs as xs:string ,
$keepAttr as xs:string*
) as node()*
{
for $attr in $node/@*
return
if (local-name($attr) = $keepAttr)
then (element {fn:QName ($newNs, name($attr))} {data($attr)})
else
$node/@*[name() = $keepAttr], hof:transform-ns-root-flatten($node/node(), $newNs, $keepAttr)
};
declare function hof:transform-ns-root-flatten
( $nodes as node()* ,
$newNs as xs:string ,
$keepAttr as xs:string*
) as node()*
{
for $node in $nodes
return
typeswitch($node)
case $node as element()
return (element { fn:QName ($newNs, local-name($node)) }
{ hof:remove-attr-except($node, $newNs, $keepAttr) }
)
case $node as document-node()
return hof:transform-ns-root-flatten($node/node(), $newNs, fn:normalize-space($keepAttr))
default return $node
};
(: let $inXML := doc("/fasset/bug/two-attr.xml") :)
let $inXML :=
<requestConfirmation xmlns="http://example/confirmation">
<trade>
<cal>
<c>PRECEDING</c>
<bcs id="businessCenters">
<bc>USNY</bc>
<bc>GBLO</bc>
</bcs>
</cal>
<amount>
<currency id="settlementCurrency" currencyScheme="http://example/iso4">USD</currency>
<referenceAmount>StandardISDA</referenceAmount>
<cashSettlement>true</cashSettlement>
</amount>
</trade>
</requestConfirmation>
let $input := $inXML/*[name() = name($inXML/*)]/*
let $ns := "schema://fc.fasset/execution"
let $root := "executionReport"
let $keep := "id"
return
element { fn:QName ($ns, $root) }
{ hof:transform-ns-root-flatten($input, $ns, $keep) }
Then I switched to XSLT to transform two-attr.xml. Surprisingly, the XSLT transform succeeds.
<xsl:param name="ns" as="xs:string">schema://fc.fasset/product</xsl:param>
<xsl:param name="attr" static="yes" as="xs:string*" select="'href', 'id'"/>
=================================
<xsl:template match="@*">
<xsl:choose>
<xsl:when test="local-name() = $attr">
<xsl:element name="{local-name()}" namespace="{$ns}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:when>
</xsl:choose>
</xsl:template>
The one transform that succeeds everywhere is the one against the one-attr.xml model. The Java/ML API, Oxygen, and XSLT all return the same result:
<amount>
<currency>
<id>settlementCurrency</id>USD</currency>
<referenceAmount>StandardISDA</referenceAmount>
<cashSettlement>true</cashSettlement>
</amount>
Now here is the rub: it doesn't look like valid XML. Although I can still get the currency text value with
doc("/product/eqd/a7c1db2d.xml")//prod:trade//prod:amount/prod:currency/text()
I expect the result below, to make things easier for the search engine:
<executionReport xmlns="schema://fc.fasset/execution">
<trade>
<cal>
<c>PRECEDING</c>
<bcs>
<id>businessCenters</id>
<bc>USNY</bc>
<bc>GBLO</bc>
</bcs>
</cal>
<amount>
<currency>USD</currency>
<id>settlementCurrency</id>
<referenceAmount>StandardISDA</referenceAmount>
<cashSettlement>true</cashSettlement>
</amount>
</trade>
</executionReport>
Of the solutions suggested below, the latest one gives this result:
<executionReport xmlns="schema://fc.fasset/execution">
<cal>
<c>PRECEDING</c>
<bcs>
<bc>USNY</bc>
<bc>GBLO</bc>
</bcs>
<!-- Line9: id is outside the <bcs> element and its context is completely lost! -->
<id>businessCenters</id>
</cal>
<amount>
<currency>USD</currency>
<!-- Line14: id is in the correct position! -->
<id>settlementCurrency</id>
<referenceAmount>StandardISDA</referenceAmount>
<cashSettlement>true</cashSettlement>
</amount>
</executionReport>
How can I get my XQuery and XSLT modules to work?
You can't create attributes after you have started creating child nodes. So, if you are transforming the @id into <id> then you have to do that AFTER you have copied the other attributes.
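A minimal way to see the rule in isolation, as a sketch assuming an XQuery 3.0 processor with standard try/catch (the element and attribute names are just borrowed from the question). A processor that does static typing may report the error before running the query, in which case the catch never gets a chance.

```xquery
xquery version "3.0";
declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: attribute first, then content: allowed :)
let $ok :=
  element currency { attribute currencyScheme { "http://example/iso4" }, "USD" }
(: content first, then attribute: this is the XQTY0024 case :)
let $bad :=
  try {
    element currency { "USD", attribute currencyScheme { "http://example/iso4" } }
  } catch err:XQTY0024 {
    "attribute came after non-attribute content"
  }
return ($ok, $bad)
```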
The shortest and easiest way to avoid the problem is to sort the attributes, ensuring that the ones that will be copied forward are processed first, then the ones that will be converted to elements.
You could achieve that by sorting the sequence of items returned from the hof:remove-attr-except() function, ensuring that the sequence has attributes and then the elements:
element { fn:QName ($newNs, local-name($node)) }
{ for $item in (hof:remove-attr-except($node, $newNs, $keepAttr))
order by $item instance of attribute() descending
return $item }
You could also use two separate FLWOR expressions, each with a where clause: the first handles the attributes that are not in $keepAttr, and the second converts those that are in $keepAttr into elements:
declare function hof:remove-attr-except
( $node as node()* ,
$newNs as xs:string ,
$keepAttr as xs:string*
) as node()*
{
for $attr in $node/@*
where not(local-name($attr) = $keepAttr)
return
$node/@*[name() = $keepAttr], hof:transform-ns-root-flatten($node/node(), $newNs, $keepAttr)
,
for $attr in $node/@*
where local-name($attr) = $keepAttr
return
element {fn:QName ($newNs, name($attr))} {data($attr)}
};
But if you want those new elements to be outside of the original element, and you don't want to retain the attributes then I would change the processing of the element in your typeswitch, so that you call the function that converts those attributes into elements outside of the element constructor:
declare namespace hof = "http://fc.fasset/function";
declare function hof:attr-to-element
( $node as node()* ,
$newNs as xs:string ,
$keepAttr as xs:string*
) as node()*
{
for $attr in $node/@*
where local-name($attr) = $keepAttr
return
element {fn:QName ($newNs, name($attr))} {data($attr)}
};
declare function hof:transform-ns-root-flatten
( $nodes as node()* ,
$newNs as xs:string ,
$keepAttr as xs:string*
) as node()*
{
for $node in $nodes
return
typeswitch($node)
case $node as element()
return (element { fn:QName ($newNs, local-name($node)) }
{ hof:transform-ns-root-flatten($node/node(), $newNs, $keepAttr) }
,
hof:attr-to-element($node, $newNs, $keepAttr)
)
case $node as document-node()
return hof:transform-ns-root-flatten($node/node(), $newNs, fn:normalize-space($keepAttr))
default return $node
};
The code above produces the following output from the provided input XML:
<executionReport xmlns="schema://fc.fasset/execution">
<amount>
<currency>USD</currency>
<id>settlementCurrency</id>
<referenceAmount>StandardISDA</referenceAmount>
<cashSettlement>true</cashSettlement>
</amount>
</executionReport>

Cannot wrap element around XQuery output

Returning to XQuery after a long hiatus.
let $root := <a:b xmlns:a="ans" xmlns:c="cns"/>
for $prefix in in-scope-prefixes($root)[not(. = ('xml', 'xsi'))]
return
namespace-uri-for-prefix($prefix,$root) !
<param name="{$prefix}" value="{.}"/>
gives the expected
<param name="a" value="ans"/>
<param name="c" value="cns"/>
But if I try to wrap an element around that output, as below, nothing is returned:
<parameters>{
let $root := <a:b xmlns:a="ans" xmlns:c="cns"/>
for $prefix in in-scope-prefixes($root)[not(. = ('xml', 'xsi'))]
return
namespace-uri-for-prefix($prefix,$root) !
<param name="{$prefix}" value="{.}"/>
}</parameters>
So what is wrong and how do I wrap the output in a parameters element?
Try binding the sequence of <param> elements to a variable, and return a <parameters> element that references that variable instead of putting the FLWOR inline.
You shouldn't have to do that: I was able to generate the desired output in MarkLogic with your original code. It does seem to be necessary in eXist, though.
xquery version "3.0" encoding "UTF-8";
let $root := <a:b xmlns:a="ans" xmlns:c="cns"></a:b>
let $params :=
for $prefix in in-scope-prefixes($root)[not(. = ('xml', 'xsi'))]
return
namespace-uri-for-prefix($prefix,$root) !
<param name="{$prefix}" value="{.}"/>
return
<parameters>{ $params }</parameters>

How to find the lowest common ancestor of two nodes in XQuery?

Suppose the input XML is
<root>
<entry>
<title>Test</title>
<author>Me</author>
</entry>
</root>
I would like to find the lowest common ancestor of title and author.
I tried the following code in BaseX:
let $p := doc('t.xq')//title,
$q := doc('t.xq')//author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())
return
$cla
But it returns nothing (blank output).
Your code works totally fine for me, apart from returning all common ancestors.
The Last Common Ancestor
Since they're returned in document order and the last common ancestor must also be the last node, simply extend with a [last()] predicate.
declare context item := document {
<root>
<entry>
<title>Test</title>
<author>Me</author>
</entry>
</root>
};
let $p := //title,
$q := //author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())[last()]
return
$cla
Files and Databases
If the query you posted does not return anything, you might be working on a file t.xq rather than a database. intersect requires all the nodes being compared to be in the same database, and each invocation of doc(...) on a file creates a new in-memory database. Either create a database in BaseX with the contents, or do something like
declare variable $doc := doc('t.xq');
and replace subsequent doc(...) calls by $doc (which now references a single in-memory database created for the file).
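The underlying reason is that intersect compares nodes by identity, not by content. A small sketch with constructed elements (not your file; the variable names are mine) shows the effect: two separately built copies look the same but never intersect.

```xquery
let $first  := <entry><title>Test</title></entry>
let $second := <entry><title>Test</title></entry>
return (
  $first is $first,                               (: true, same node :)
  $first is $second,                              (: false, two distinct nodes :)
  deep-equal($first, $second),                    (: true, same content :)
  count($first//title intersect $first//title),   (: 1 :)
  count($first//title intersect $second//title)   (: 0, no shared node identity :)
)
```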
This is one possible way:
let $db := doc('t.xq'),
$q := $db//*[.//title and .//author][not(.//*[.//title and .//author])]
return
$q
Brief explanation:
[.//title and .//author]: the first predicate selects elements that have both a title and an author descendant.
[not(.//*[.//title and .//author])]: the second predicate applies the opposite criterion to the descendant elements, so overall we accept only the innermost elements matching the first predicate.
Output:
<entry>
<title>Test</title>
<author>Me</author>
</entry>
I replaced doc('t.xq') in the bindings of $p and $q with the variable $db, as follows. Now it works (I also used last() to get the last, i.e. lowest, common ancestor).
let
$db := doc('t.xq'),
$p := $db//title,
$q := $db//author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())[last()]
return $cla

How do I get BaseX to return multiple elements in a nested XQuery?

BaseX is complaining about a nested query of mine. I do not understand why it cannot return multiple lines like it did in the first query. The error says, "Expecting }, found >" and the > it is referring to is the > after name under trips. It works fine if the } is after the close-bracket for id, but obviously, that's not what I want. Here is the query:
for $u in doc("export.xml")/database/USERS/tuple
return
<user>
<login>{$u/USERNAME/text()}</login>
<email></email>
<name></name>
<affiliation></affiliation>
<friend></friend>
<trip>
{for $t in doc("export.xml")/database/TRIPS/tuple
where $t/ADMIN/text() = $u/USERNAME/text()
return
<id> {$t/ID/text()} </id>
<name> {$t/NAME/text()} </name> (: Error is here with <name> :)
<feature> {$t/FEATURE/text()} </feature>
<privacyFlag> {$t/PRIVACY/text() </privacyFlag>)
}
</trip>
</user>
If you want to return multiple items, you need to wrap them in a sequence ($item1, $item2, ..., $itemN). In your case:
for $t in doc("export.xml")/database/TRIPS/tuple
where $t/ADMIN/text() = $u/USERNAME/text()
return (
<id> {$t/ID/text()} </id>,
<name> {$t/NAME/text()} </name>,
<feature> {$t/FEATURE/text()} </feature>,
<privacyFlag> {$t/PRIVACY/text()} </privacyFlag>
)
But I'm not sure this does what you expect; you may actually want to create one set of elements per trip. In that case, emit a single trip element per result, and you no longer need to return a sequence (this is also what happens in the outer FLWOR loop, where the <user/> element wraps everything into a single element):
for $t in doc("export.xml")/database/TRIPS/tuple
where $t/ADMIN/text() = $u/USERNAME/text()
return
<trip>
<id> {$t/ID/text()} </id>
<name> {$t/NAME/text()} </name>
<feature> {$t/FEATURE/text()} </feature>
<privacyFlag> {$t/PRIVACY/text()} </privacyFlag>
</trip>
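Put together, the nested query could look roughly like the sketch below. I don't have your export.xml, so the inline $db element and its values are made-up sample data that only mirrors the paths used in your query; swap in doc("export.xml")/database for real use.

```xquery
(: hypothetical sample data standing in for doc("export.xml")/database :)
let $db :=
  <database>
    <USERS>
      <tuple><USERNAME>alice</USERNAME></tuple>
    </USERS>
    <TRIPS>
      <tuple><ID>1</ID><NAME>Alps</NAME><FEATURE>hiking</FEATURE><PRIVACY>public</PRIVACY><ADMIN>alice</ADMIN></tuple>
      <tuple><ID>2</ID><NAME>Coast</NAME><FEATURE>sailing</FEATURE><PRIVACY>private</PRIVACY><ADMIN>alice</ADMIN></tuple>
    </TRIPS>
  </database>
for $u in $db/USERS/tuple
return
  <user>
    <login>{ $u/USERNAME/text() }</login>
    {
      (: one <trip> element per matching trip, so no sequence parentheses are needed :)
      for $t in $db/TRIPS/tuple
      where $t/ADMIN/text() = $u/USERNAME/text()
      return
        <trip>
          <id>{ $t/ID/text() }</id>
          <name>{ $t/NAME/text() }</name>
          <feature>{ $t/FEATURE/text() }</feature>
          <privacyFlag>{ $t/PRIVACY/text() }</privacyFlag>
        </trip>
    }
  </user>
```

With this sample data the single user should come back with two trip elements, each a self-contained unit.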
