XQuery – counting parent’s preceding siblings

I would like to count the preceding siblings of the highest-level div in an ePub (for a footnote). I need to pass the value to an attribute before passing the notes through XSLT.
for $note in doc('/db/custom_jh/bukwor.xml')//tei:note[@place='bottom']
let $parent := count($note[preceding-sibling::tei:div[@n='1']])
let $update := update insert attribute att2 {$parent} into $note
return $note
Attempts with $note[preceding-sibling::tei:div[@n='1']] or $note[ancestor-or-self::tei:div[@n='1']] return just 0 or the total sum of all the divs.
I am looking for something like <xsl:number level="any" select="tei:div[@n='1']"/> from XSLT, if possible.
UPDATE
The minimal code for counting (still not working: it returns only 6 × 1, but there should be at least one 2):
for $note at $count in doc('/db/custom_jh/bukwor.xml')//tei:note[@place='bottom']
let $parent := count($note[ancestor-or-self::*/tei:div[@n='1']])
return $parent

I don't know the ePub XML format, and no sample XML is provided, so the requirement isn't clear, at least to me. But according to the title, you might want something like this:
let $parent := count($note/parent::*/preceding-sibling::tei:div[@n='1'])
This counts the preceding-sibling tei:div elements of the parent element of the current $note, where the tei:div has an n attribute equal to 1.
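If the note is nested more deeply than one level below the div, the parent step alone will not reach the div. A minimal sketch of a variant that climbs to the outermost matching div first; the inline sample document is an assumption, since the question provides no XML:

```xquery
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: hypothetical sample data: two top-level divs, one footnote each :)
let $doc :=
  <tei:body>
    <tei:div n="1"><tei:p>one <tei:note place="bottom">a</tei:note></tei:p></tei:div>
    <tei:div n="1"><tei:p>two <tei:note place="bottom">b</tei:note></tei:p></tei:div>
  </tei:body>
for $note in $doc//tei:note[@place = 'bottom']
(: ancestor:: is a reverse axis, so [last()] picks the outermost matching div :)
let $chapter := $note/ancestor::tei:div[@n = '1'][last()]
return count($chapter/preceding-sibling::tei:div[@n = '1'])
```

For this sample the two notes should yield 0 and 1 respectively; add 1 if a one-based chapter number is wanted.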

The whole example was slightly bad. Finally, I restructured the whole thing. At the moment, I do it this way:
let $chaps :=
(
let $countAll := count($doc//tei:note)
for $chapter at $count in $doc//tei:div[@n='1']
let $countPreceding := count($chapter/preceding::tei:div[@n='1']//tei:note[@place='bottom'])
let $params :=
<parameters>
<param name="footnoteNo" value="{$countPreceding}"/>
</parameters>
return
<entry name="OEBPS/chapter-{$count}.xhtml" type="xml">
{
transform:transform($chapter, doc("/db/custom_jh/xslt/style-web.xsl"), $params)
}
</entry>
)
The count($chapter/preceding::tei:div[@n='1']//tei:note[@place='bottom']) does the trick for me. (I need to collect all footnotes in one file and make backlinks to the locations of their indexes in different files.)

Related

How to manipulate file-paths

I know this seems like a duplicate, and I am sure it more or less is ...
However, it really bugs me, and I cannot make anything of the posts before:
I am building a digital edition, utilizing TEI, XML, XSLT (and probably existDB; maybe I will switch to node/javascript).
I built a php-function that should transform each file in a specified directory to HTML. (My xsl-file works well.)
declare function app:XMLtoHTML-forAll ($node as node(), $model as map(*), $query as xs:string?){
let $ref := xs:string(request:get-parameter("document", ""))
let $xml := doc(concat("/db/apps/BookOfOrders/data/edition/",$ref))
let $xsl := doc("/db/apps/BookOfOrders/resources/xslt/xmlToHtml.xsl")
let $params :=
<parameters>
{for $p in request:get-parameter-names()
let $val := request:get-parameter($p,())
where not($p = ("document","directory","stylesheet"))
return
<param name="{$p}" value="{$val}"/>
}
</parameters>
return
transform:transform($xml, $xsl, $params)
};
There is a list of files in apps/BookofOrders/data/edition/ named FolioX.html, where X is the page number. (I'll probably change the names to [FolioNumber].xml, but that's not the issue.)
I am trying to make a text slider (so that when I open the page, a page is presented and further buttons are created, and I can slide to the right and read the rest of the pages).
I have a table of content, that is linked to the transformed files:
declare function app:toc($node as node(), $model as map(*)) {
for $doc in collection("/db/apps/BookOfOrders/data/edition")/tei:TEI
return
<li>{document-uri(root($doc))}</li>
};
I guess I am wondering how to change the link inside from, for example, Folio29 to Folio30.
Can I take a part of the provided link and make the destination of a link flexible, similar but not identical to what I did in the toc-function above?
I'd be really happy if anyone could point me in the right direction.
Given an expression like document-uri(root($doc)) (perhaps more simply util:document-name($doc), since you're using eXist) that returns the path to (or filename of) the document ending in "FolioX", you just need to isolate X, then cast it as an integer so you can perform addition/subtraction on the value:
document-uri(root($doc)) => substring-after("Folio") => xs:integer()
util:document-name($doc) => substring-after("Folio") => xs:integer()
Then add 1, and you've got your next document. Subtract 1, and you've got the previous one.
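As a minimal sketch of that arithmetic, assuming the document name carries an .xml extension (the literal "Folio29.xml" is only an illustrative value, not taken from the question's data):

```xquery
(: hypothetical file name; in the app it would come from util:document-name($doc) :)
let $name := "Folio29.xml"
(: strip the "Folio" prefix and the extension, then cast to a number :)
let $n := $name => substring-after("Folio") => substring-before(".") => xs:integer()
return
  <nav>
    <prev>{ "Folio" || $n - 1 }</prev>
    <next>{ "Folio" || $n + 1 }</next>
  </nav>
```

The substring-before(".") step matters because casting a string such as "29.xml" to xs:integer raises an error.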
However, this could lead to broken links: Folio0 or Folio98 (assuming there are only 97). To avoid this, you might want to determine the complete list of Folios, find the current position, and then never hit 0 or 98:
let $this-folio := $doc => util:document-name()
let $collection := $doc => util:collection-name()
let $all-folios := xmldb:get-child-resources($collection)
(: sort the filenames using UCA Numeric collation to ensure Folio2 < Folio10.
: see https://www.w3.org/TR/xpath-functions-31/#uca-collations :)
let $sorted-folios := $all-folios => sort("?numeric=yes")
(: look up the position in the sorted list, not in the unsorted one :)
let $this-folio-n := index-of($sorted-folios, $this-folio)
let $prev-folio := if ($this-folio-n gt 1) then "Folio" || $this-folio-n - 1 else ()
let $next-folio := if ($this-folio-n lt count($all-folios)) then "Folio" || $this-folio-n + 1 else ()
return
<nav>
<prev>{$prev-folio}</prev>
<this>{"Folio" || $this-folio-n}</this>
<next>{$next-folio}</next>
</nav>

How to tidy-up Processing Instructions in Marklogic

I have content in my legacy database that is neither valid HTML nor valid XML. Since it would be difficult to clean up the legacy data, I want to tidy it in MarkLogic using xdmp:tidy. I am currently using ML-8.
<sub>
<p>
<???†?>
</p>
</sub>
I'm passing this content to the tidy functionality as follows:
declare variable $xml as node() :=
<content>
<![CDATA[<p><???†?></p>]]>
</content>;
xdmp:tidy(xdmp:quote($xml//text()),
<options xmlns="xdmp:tidy">
<assume-xml-procins>yes</assume-xml-procins>
<quiet>yes</quiet>
<tidy-mark>no</tidy-mark>
<enclose-text>yes</enclose-text>
<indent>yes</indent>
</options>)
As a result, it returns:
<p>
<? ?†?>
</p>
This result is not well-formed XML (I checked it with an XML validator), so when I try to insert it into MarkLogic, it throws an error saying 'MALFORMED BODY | Invalid Processing Instruction names'.
I did some investigation around PIs, without much luck. I could try saving the content without the PI, but this is not a valid PI in the first place.
That is because what you think is a PI is in fact not a PI.
From W3C:
2.6 Processing Instructions
[Definition: Processing instructions (PIs) allow documents to contain instructions for applications.]
Processing Instructions
[16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
[17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
So the PI name cannot start with ?, as it does in your sample ??†.
You probably want to clean up the content before you pass it to tidy, like below:
declare variable $xml as node() :=
<content><![CDATA[<p>Hello <???†?>world</p>]]></content>;
declare function local:copy($input as item()*) as item()* {
for $node in $input
return
typeswitch($node)
case text()
return fn:replace($node,"<\?[^>]+\?>","")
case element()
return
element {name($node)} {
(: output each attribute in this element :)
for $att in $node/@*
return
attribute {name($att)} {$att}
,
(: output all the sub-elements of this element recursively :)
for $child in $node
return local:copy($child/node())
}
(: otherwise pass it through. Used for comments, PIs, and any other node kind :)
default return $node
};
xdmp:tidy(local:copy($xml),
<options xmlns="xdmp:tidy">
<assume-xml-procins>no</assume-xml-procins>
<quiet>yes</quiet>
<tidy-mark>no</tidy-mark>
<enclose-text>yes</enclose-text>
<indent>yes</indent>
</options>)
This would do the trick and get rid of all PIs (real and fake ones).
Regards,
Peter

Access variable from within itself in XQuery

I'm wondering whether in XQuery it is possible to access some elements in a variable from within the variable itself.
For instance, if you have a variable with several numbers and you want to sum them all up inside the variable itself. Can you do that with only one variable? Consider something like this:
let $my_variable :=
<my_variable_root>
<number>5</number>
<number>10</number>
<sum>{sum (??)}</sum>
</my_variable_root>
return $my_variable
Can you put some XPath expression inside sum() to access the value of the preceding number elements? I've tried $my_variable//number/number(text()), //number/number(text()), and preceding-sibling::number/number(text()) - but nothing worked for me.
You cannot do that. The variable is not created until everything in it has been constructed.
But you can have temporary variables inside the variable, like this:
let $my_variable :=
<my_variable_root>{
let $numbers := (
<number>5</number>,
<number>10</number>
)
return ($numbers, <sum>{sum ($numbers)}</sum>)
} </my_variable_root>
Or (XQuery 3):
let $my_variable :=
<my_variable_root>{
let $numbers := (5,10)
return (
$numbers ! <number>{.}</number>,
<sum>{sum ($numbers)}</sum>)
} </my_variable_root>
This is not possible, neither by using the variable name (it is not defined yet), nor using the preceding-sibling axis (no context item bound).
Construct the variable's contents in a flwor-expression instead:
let $my_variable :=
let $numbers := (
<number>5</number>,
<number>10</number>
)
return
<my_variable_root>
{ $numbers }
<sum>{ sum( $numbers) }</sum>
</my_variable_root>
return $my_variable
If you have similar patterns multiple times, consider writing a function; using XQuery Update might also be an alternative (but does not seem to be the most reasonable one to me, both in terms of readability and probably performance).
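A minimal sketch of the function approach mentioned above; the name local:with-sum and its parameter type are assumptions made for illustration, not part of the original answers:

```xquery
(: hypothetical helper: wraps each number in <number> and appends their sum :)
declare function local:with-sum($numbers as xs:decimal*) as element(my_variable_root) {
  <my_variable_root>{
    $numbers ! <number>{ . }</number>,
    <sum>{ sum($numbers) }</sum>
  }</my_variable_root>
};

let $my_variable := local:with-sum((5, 10))
return $my_variable
```

The sum is computed from the function argument rather than from the element under construction, which sidesteps the problem of referring to a variable before it exists.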

How to find the lowest common ancestor of two nodes in XQuery?

Suppose the input XML is
<root>
<entry>
<title>Test</title>
<author>Me</author>
</entry>
</root>
I would like to find the lowest common ancestor of title and author.
I tried the following code in BaseX:
let $p := doc('t.xq')//title,
$q := doc('t.xq')//author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())
return
$cla
But it returns nothing (blank output).
Your code works totally fine for me, apart from returning all common ancestors.
The Last Common Ancestor
Since they're returned in document order and the last common ancestor must also be the last node, simply extend with a [last()] predicate.
declare context item := document {
<root>
<entry>
<title>Test</title>
<author>Me</author>
</entry>
</root>
};
let $p := //title,
$q := //author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())[last()]
return
$cla
Files and Databases
If the query you posted does not return anything, you might be working on a file t.xq. The intersect operator requires all compared nodes to be in the same database, and each invocation of doc(...) on a file creates a new in-memory database. Either create a database in BaseX with the contents, or do something like
declare variable $doc := doc('t.xq');
and replace subsequent doc(...) calls by $doc (which now references a single in-memory database created for the file).
This is one possible way:
let $db := doc('t.xq'),
$q := $db//*[.//title and .//author][not(.//*[.//title and .//author])]
return
$q
Brief explanation:
[.//title and .//author] : The first predicate selects elements that have both a title and an author descendant.
[not(.//*[.//title and .//author])] : The second predicate applies the opposite criterion to the descendant elements, so overall we only accept the innermost elements matching the first predicate.
Output:
<entry>
<title>Test</title>
<author>Me</author>
</entry>
I replaced doc('t.xq') in the definitions of $p and $q with the variable $db, as follows. Now it works (plus, I used last() to get the last, i.e. lowest, common ancestor).
let
$db := doc('t.xq'),
$p := $db//title,
$q := $db//author,
$cla := ($p/ancestor-or-self::node() intersect $q/ancestor-or-self::node())[last()]
return $cla

Updating counter in XQuery

I want to create a counter in xquery. My initial attempt looked like the following:
let $count := 0
for $prod in $collection
let $count := $count + 1
return
<counter>{$count }</counter>
Expected result:
<counter>1</counter>
<counter>2</counter>
<counter>3</counter>
Actual result:
<counter>1</counter>
<counter>1</counter>
<counter>1</counter>
The $count variable is either failing to update or being reset. Why can't I reassign an existing variable? What would be a better way to get the desired result?
Try using 'at':
for $d at $p in $collection
return
element counter { $p }
This will give you the position of each '$d'. If you want to use this together with the order by clause, this won't work since the position is based on the initial order, not on the sort result. To overcome this, just save the sorted result of the FLWOR expression in a variable, and use the at clause in a second FLWOR that just iterates over the first, sorted result.
let $sortResult := for $item in $collection
order by $item/id
return $item
for $sortItem at $position in $sortResult
return <item position="{$position}"> ... </item>
As @Ranon said, all XQuery values are immutable, so you can't update a variable. But if you really need an updateable number (which shouldn't be too often), you can use recursion:
declare function local:loop($seq, $count) {
if(empty($seq)) then ()
else
let $prod := $seq[1],
$count := $count + 1
return (
<count>{ $count }</count>,
local:loop($seq[position() > 1], $count)
)
};
local:loop($collection, 0)
This behaves exactly as you intended with your example.
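In XQuery 3.0 a more general version of this function is even defined in the standard library: fn:fold-left($seq, $zero, $f) (argument order as in the final specification), which threads an accumulator through the sequence from left to right just like local:loop does.
That said, in your example you should definitely use at $count as shown by @tohuwawohu.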
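For comparison, the running counter can also be written as a fold. This is a minimal sketch assuming an XQuery 3.0 processor; the three-item $collection is a stand-in for the sequence from the question:

```xquery
(: hypothetical stand-in for the question's $collection :)
let $collection := ("a", "b", "c")
return
  fold-left($collection, (), function($acc, $item) {
    (: $acc holds the counters built so far, so its size gives the next number :)
    ($acc, <counter>{ count($acc) + 1 }</counter>)
  })
```

No variable is ever reassigned here: each step returns a new accumulator, which is how the fold keeps state without mutation.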
Immutable variables
XQuery is a functional programming language, which involves amongst others immutable variables, so you cannot change the value of a variable. On the other hand, a powerful collection of functions is available to you, which solves lots of daily programming problems.
let $count := 0
for $prod in $collection
let $count := $count + 1
return
<counter>{$count }</counter>
The let $count in line 1 defines the variable for its whole scope, which here is all following lines. The let $count in line 3 defines a new $count, equal to 0+1, which shadows the first one and is only valid for the rest of the current iteration. So you do add one to $count three times, but each result is discarded immediately.
BaseX' query info shows the optimized version of this query which is
for $prod in $collection
return element { "counter" } { 1 }
The solution
To get the total number of elements in $collection, you can just use
return count($collection)
For a list of XQuery functions, you could have a look at the XQuery part of functx which contains both a list of XQuery functions and also some other helpful functions which can be included as a module.
Specific to MarkLogic you can also use xdmp:set. But this breaks functional language assumptions, so use it conservatively.
http://docs.marklogic.com/5.0doc/docapp.xqy#display.xqy?fname=http://pubs/5.0doc/apidoc/ExsltBuiltins.xml&category=Extension&function=xdmp:set
For an example of xdmp:set in real-world code, the search parser https://github.com/mblakele/xqysp/blob/master/src/xqysp.xqy might be helpful.
All the solutions above are valid, but I would like to mention that you can use the XQuery Scripting extension to set variable values:
variable $count := 0;
for $prod in (1 to 10)
return {
$count := $count + 1;
<counter>{$count}</counter>
}
You can try this example live at http://www.zorba-xquery.com/html/demo#twh+3sJfRpHhZR8pHhOdsmqOTvQ=
Use xdmp:set, as in the query below:
let $count := 0
for $prod in (1 to 4)
return ( xdmp:set($count, number($count + 1)), <counter>{$count}</counter> )
I think you are looking for something like:
XQUERY:
for $x in (1 to 10)
return
<counter>{$x}</counter>
OUTPUT:
<counter>1</counter>
<counter>2</counter>
<counter>3</counter>
<counter>4</counter>
<counter>5</counter>
<counter>6</counter>
<counter>7</counter>
<counter>8</counter>
<counter>9</counter>
<counter>10</counter>
