Below is the XML structure. It's a sample of my original structure, not the exact data.
<Docs>
<Doc>
<Para>
<P n="1"><B>Constants : T</B>he value of pi is 3.14</P>
<P n="2">pi is a geometric term.</P>
</Para>
</Doc>
<Doc>
<Para>
<P n="1"><B>Constants : T</B>he value of g is 9.81 m/sqr of sec</P>
<P n="2">g is a acceleration due to gravity.</P>
</Para>
</Doc>
<Doc>
<Para>
<P n="1"><B>Constants : T</B>he value of c is 3.00 x 10 power 8 m/sec</P>
<P n="2">c is a speed of light in vacuum.</P>
</Para>
</Doc>
</Docs>
I have generated XML files programmatically. The B node contains Constants : T, whereas it should contain only Constants :. I have written an XQuery to make the necessary changes, but it's not working as expected.
Below is the XQuery, version 1:
for $x in doc('doc1')//Doc
where $x/Para/P[@n="1"]/B/text()="Constants : T"
return
let $p := $x/Para/P[@n="1"]
let $pText := concat("T", $p/text())
let $tag := <P n="1">{$pText}</P>
return
(
delete node $p,
insert node $tag as first into $x/Para,
insert node <B>Constants :</B> as first into $x/Para/P[@n="1"]
)
Version - 2 (Smaller, sweeter but not working !!!)
let $b := <B> Constants :</B>
for $x in doc('doc1')//Doc/Para[P[@n="1"]/B/text()="Constants : T"]/P[@n="1"]
return
(
replace value of node $x with concat("T", $x/text()),
insert node $b/node() as first into $x
)
Neither query inserts <B>Constants : </B>. Can anybody help me with this?
The problem you are facing has to do with the nature of XQuery Update. It collects all updates in a pending update list and applies them at the end of the query. The order in which the update operations are applied is well defined and is therefore independent of the order in which you write them in your update statement. See https://docs.basex.org/wiki/Updates#Pending_Update_List for more information.
In your case the insert is applied before the replace, so the replace overwrites the node you have just inserted.
To resolve this, I would replace the text value and replace the B node separately. The two operations are then independent of one another, so the order in which they are executed no longer matters.
let $b := <B> Constants :</B>
for $x in doc('doc1')//Doc/Para[P[@n="1"]/B/text()="Constants : T"]/P[@n="1"]
return
(
replace value of node $x/text() with concat("T", $x/text()),
replace node $x/B with $b
)
This is my xml-file:
<?xml version="1.0" encoding="UTF-8"?>
<QQ:Envelope xmlns:QQ="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header>
<RR:ABCInfo xmlns:RR="http://abc.test.de/abc/SOAP-Header/1.0">
<RR:Version>2.2.2.2</RR:Version>
<RR:BuildRevision>3333</RR:BuildRevision>
<RR:BuildTimestamp>2019-01-01T00:00:00.000+02:00</RR:BuildTimestamp>
<RR:Start>2019-01-01T10:10:10.101+02:00</RR:Start>
<RR:End>2019-01-01T11:11:11.111+02:00</RR:End>
<RR:Something>2.222 sek.</RR:Something>
<RR:Anything/>
</RR:ABCInfo>
<work:WorkContext xmlns:work="http://test.com/">1234567890abcdefghijklmnopqrstuvwxyz</work:WorkContext>
</SOAP-ENV:Header>
<QQ:Body>
<TT:testA xmlns:TT="http://abc.test.de/XYZ/2.0.1" xmlns:RR="http://abc.test.de/abc/abcdefgh/1.0">
<TT:testB>
<TT:testC>
<TT:testD>
<TT:testE id="1234567" quellID="09876543">
<TT:data>urn:de:abc:test:whatever</TT:data>
<TT:changeDate>2019-02-02T02:02:02.020+02:00</TT:changeDate>
<TT:part1 listURI="urn:de:abc:codeliste:555" listVersionID="V12">
<code>555_777</code>
<name>Fischers Fritze</name>
</TT:part1>
<TT:piece2>Frische Fische fischen</TT:piece2>
<TT:begin>
<TT:date>20191231</TT:date>
</TT:begin>
</TT:testE>
</TT:testD>
</TT:testC>
</TT:testB>
</TT:testA>
</QQ:Body>
</QQ:Envelope>
I have an XQuery that has to return XML. The first element in the returned XML is "result". The other elements in the returned XML should be created dynamically.
I get two sequences from outside, though I have hard-coded two fixed sequences in the following example to test it.
Sequence 1 contains the names for the other elements.
Sequence 2 contains the path that belongs to each element name in sequence 1.
I open the XML file and read a path (there might be several matching elements, though in my example there is only one).
Then I want to process this result in a loop and return the dynamic elements.
If I access the path with a fixed value (variable $c in the following code) I get the correct value, but then I must know the elements in sequence 1 and the paths in sequence 2 in advance.
If I concatenate the path, I get the values of all elements instead.
This is my XQuery Code:
declare namespace TT="http://abc.test.de/XYZ/2.0.1";
declare namespace QQ="http://schemas.xmlsoap.org/soap/envelope/";
declare function local:getValue($path) as xs:string {
if (fn:exists($path)) then
(
data($path)
) else (
""
)
};
let $a := ('part1', 'piece2', 'beginDate')
let $b := ('TT:part1/name','TT:piece2', 'TT:begin/TT:date')
for $x in doc("Test.XML")/QQ:Envelope/QQ:Body/TT:testA/TT:testB/TT:testC/TT:testD/TT:testE
return <result>
{
for $item at $ind in $a
let $c := local:getValue($x/TT:part1/name)
let $d := local:getValue($x || concat("/", $b[$ind]))
return element { $item } {$c, " --- ", $d}
}
</result>
Is there a possibility to access the path dynamically?
Thank you in advance.
http://www.xqueryfunctions.com/xq/functx_dynamic-path.html could help; at least it helped me ;)
The functx:dynamic-path function dynamically evaluates a simple path expression. The function only supports element names and attribute names preceded by @, separated by single slashes. The names can optionally be prefixed, but they must use the same prefix that is used in the input document. It does not support predicates, other axes, or other node kinds. Note that most processors have an extension function that evaluates path expressions dynamically in a much more complete way.
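To make the idea concrete, here is a minimal sketch of such a step-by-step evaluator, applied to a cut-down copy of the TT:testE element from the question. local:dynamic-path is a hypothetical helper written for illustration, not the functx implementation. Like functx it matches on name(), so the prefixes in the path must be the ones used in the document. It assumes an XQuery 3.0 or later processor (for fold-left and inline functions).

```xquery
xquery version "3.0";

(: Hypothetical helper, not part of functx: walks $path one step at a time,
   starting from $start. Supports only element names and @attribute names. :)
declare function local:dynamic-path(
  $start as node()*,
  $path  as xs:string
) as node()* {
  fold-left(
    tokenize($path, '/'),
    $start,
    function($nodes, $step) {
      if (starts-with($step, '@'))
      then $nodes/@*[name() = substring($step, 2)]
      else $nodes/*[name() = $step]
    }
  )
};

let $names := ('part1', 'piece2', 'beginDate')
let $paths := ('TT:part1/name', 'TT:piece2', 'TT:begin/TT:date')
(: Cut-down inline copy of the TT:testE element from the question. :)
let $x :=
  <TT:testE xmlns:TT="http://abc.test.de/XYZ/2.0.1">
    <TT:part1><code>555_777</code><name>Fischers Fritze</name></TT:part1>
    <TT:piece2>Frische Fische fischen</TT:piece2>
    <TT:begin><TT:date>20191231</TT:date></TT:begin>
  </TT:testE>
return
  <result>{
    for $name at $i in $names
    return element { $name } {
      string-join(local:dynamic-path($x, $paths[$i]), ' ')
    }
  }</result>
```

Each element in the result is named after an entry in $names and holds the value found under the matching entry in $paths. If your processor offers a real dynamic evaluation function, that will usually be the more complete option.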
This is the first time I've run into the XQuery (3.1) error "Content for update is empty", and a search on Google returns nothing useful.
If I run this simple query to identify nested /tei:p/tei:p:
for $x in $mycollection//tei:p/tei:p
return $x
I get XML fragments like the following:
<p xmlns="http://www.tei-c.org/ns/1.0"/>
<p xmlns="http://www.tei-c.org/ns/1.0">Histoires qui sont maintenant du passé (Konjaku monogatari shū). Traduction, introduction et
commentaires de Bernard Frank, Paris, Gallimard/UNESCO, 1987 [1re éd. 1968] (Connaissance de
l'Orient, Série japonaise, 17), p. 323. </p>
<p xmlns="http://www.tei-c.org/ns/1.0">Ed. Chavannes, Cinq cents contes et apologues extraits du Tripitaka chinois, Paris, t. 4,
1934, Notes complémentaires..., p. 147.</p>
<p xmlns="http://www.tei-c.org/ns/1.0"/>
<p xmlns="http://www.tei-c.org/ns/1.0">Ed. Chavannes, Cinq cents contes et apologues extraits du Tripitaka chinois, Paris, t. 4,
1934, Notes complémentaires..., p. 129.</p>
i.e. some with text() and others empty
I am trying to "de-duplicate" the /tei:p/tei:p, but the following attempts return the same aforementioned error:
for $x in $mycollection//tei:p/tei:p
return update replace $x with $x/(text()|*)
for $x in $mycollection//tei:p/tei:p
let $y := $x/(text()|*)
return update replace $x with $y
I don't understand what the error is trying to tell me in order to correct the query.
Many, many thanks.
edit:
for $x in $mycollection//tei:p[tei:p and count(node()) eq 1]
let $y := $x/tei:p
return update replace $x with $y
I also tried this, replacing parent with self axis, which resulted in a very ambiguous error exerr:ERROR node not found:
for $x in $mycollection//tei:p/tei:p
let $y := $x/self::*
return update replace $x/parent::* with $y
solution:
for $x in $local:COLLECTIONS//tei:p/tei:p
return if ($x/(text()|*))
then update replace $x with $x/(text()|*)
else update delete $x
The error message indicates that $y is an empty sequence. The XQuery Update documentation describes the replace statement as follows:
update replace expr with exprSingle
Replaces the nodes returned by expr with the nodes in exprSingle. expr must evaluate to a single element, attribute, or text node. If it is an element, exprSingle must contain a single element node...
In certain cases, as shown in your sample data above, $y would be an empty sequence, which violates the rule that exprSingle must contain a single element node.
To work around such cases, you can add a conditional expression, with an else clause of either an empty sequence () or a delete statement:
if ($y instance of element()) then
update replace $x with $y
else
update delete $x
If your goal is not simply to work around the error, but to arrive at a more direct solution for replacing "double-nested" elements such as:
<p><p>Mixed <hi>content</hi>.</p></p>
.... with:
<p>Mixed <hi>content</hi>.</p>
... I'd suggest this query, which takes care not to inadvertently delete nodes that might somehow have slipped in between the two nested <p> elements:
xquery version "3.1";
declare namespace tei="http://www.tei-c.org/ns/1.0";
for $x in $mycollection//tei:p[tei:p and count(node()) eq 1]
let $y := $x/tei:p
return
update replace $x with $y
Given a $mycollection such as this:
<text xmlns="http://www.tei-c.org/ns/1.0">
<p>Hello</p>
<p><p>Hello there</p></p>
<p>Hello <p>there</p></p>
</text>
The query will transform the collection to be as follows:
<text xmlns="http://www.tei-c.org/ns/1.0">
<p>Hello</p>
<p>Hello there</p>
<p>Hello <p>there</p></p>
</text>
This is the expected result for the query, because only the 2nd <p> element had the nested <p> that could be cleanly stripped off. Obviously, if you can assume your content meets simpler patterns, you can remove the and count(node()) eq 1 condition.
My attempt to ask this before was apparently too convoluted, trying again!
I am composing a search in XQuery. In one of the fields (title) it should be possible to enter multiple keywords. At the moment only ONE keyword works. When there is more than one, I get the error ERROR XPTY0004: The actual cardinality for parameter 1 does not match the cardinality declared in the function's signature: concat($atomizable-values as xs:anyAtomicType?, ...) xs:string?. Expected cardinality: zero or one, got 2.
In my XQuery I am trying to tokenize the keywords by \s and then match them individually. I think this method is probably wrong, but I am not sure what other method to use. I am obviously a beginner!!
Here is the example XML to be searched:
<files>
<file>
<identifier>
<institution>name1</institution>
<idno>signature</idno>
</identifier>
<title>Math is fun</title>
</file>
<file>
<identifier>
<institution>name1</institution>
<idno>signature1</idno>
</identifier>
<title>philosophy of math</title>
</file>
<file>
<identifier>
<institution>name2</institution>
<idno>signature2</idno>
</identifier>
<title>i like cupcakes</title>
</file>
</files>
Here is the Xquery with example input 'math' for the search field title and 'name1' for the search field institution. This works, the search output are the titles 'math is fun' and 'philosophy of math'. What doesn't work is if you change the input ($title) to 'math fun'. Then you get the error message. The desired output is the title 'math is fun'.
xquery version "3.0";
let $institution := 'name1'
let $title := 'math' (:change to 'math fun' and doesn't work anymore, only a single word works:)
let $title-predicate :=
if ($title)
then
if (contains($title, '"'))
then concat("[contains(lower-case(title), '", replace($title, '["]', ''), "')]") (:This works fine:)
else
for $title2 in tokenize($title, '\s') (:HERE IS THE PROBLEM, this only works when the input is a single word, for instance 'math' not 'math fun':)
return
concat("[matches(lower-case(title), '", $title2, "')]")
else ()
let $institution-predicate := if ($institution) then concat('[lower-case(string-join(identifier/institution))', " = '", $institution, "']") else ()
let $eval-string := concat
("doc('/db/Unbenannt.xml')//file",
$institution-predicate,
$title-predicate
)
let $records := util:eval($eval-string)
let $test := count($records)
let $content :=
<inner_container>
<div>
<h2>Search Results</h2>
<ul>
{
for $record in $records
return
<li id="searchList">
<span>{$record//institution/text()}</span> <br/>
<span>{$record//title/text()}</span>
</li>
}
</ul>
</div>
</inner_container>
return
$content
You have to wrap your FLWOR expression with string-join():
string-join(
for $title2 in tokenize($title, '\s')
return
concat("[matches(lower-case(title), '", $title2, "')]")
)
If tokenize($title) returns a sequence of strings, then
for $title2 in tokenize($title, '\s')
return concat("[matches(lower-case(title), '", $title2, "')]")
will also return a sequence of strings.
Therefore $title-predicate will be a sequence of strings, and you can't supply a sequence of strings as one of the arguments to concat().
So it's clear what's wrong, but fixing it requires a deeper understanding of your query than I have time to acquire.
I find it hard to believe that the approach of generating a query as a string and then doing dynamic evaluation of that query is really necessary.
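For what it's worth, here is a rough sketch of the same search written with ordinary predicates instead of a generated query string. It is not a drop-in replacement: the sample data is inlined from the question, it uses contains() rather than matches() so that regex metacharacters in the input are harmless, and it leaves out the quoted-phrase branch.

```xquery
xquery version "3.0";

(: Sample data inlined from the question; in practice this would be
   doc('/db/Unbenannt.xml')/files. :)
let $files :=
  <files>
    <file>
      <identifier><institution>name1</institution><idno>signature</idno></identifier>
      <title>Math is fun</title>
    </file>
    <file>
      <identifier><institution>name1</institution><idno>signature1</idno></identifier>
      <title>philosophy of math</title>
    </file>
    <file>
      <identifier><institution>name2</institution><idno>signature2</idno></identifier>
      <title>i like cupcakes</title>
    </file>
  </files>
let $institution := 'name1'
let $title := 'math fun'
(: One token per keyword; an empty $title gives an empty sequence. :)
let $tokens := tokenize(lower-case(normalize-space($title)), ' ')
return
  $files/file
    (: Skip the institution filter when no institution was entered. :)
    [not($institution)
     or lower-case(string-join(identifier/institution, '')) = lower-case($institution)]
    (: Every keyword must occur in the title; trivially true for no keywords. :)
    [every $t in $tokens satisfies contains(lower-case(title), $t)]
    /title/string()
```

With the input 'math fun' this returns only the title Math is fun, and with 'math' it returns both titles of name1.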
I would like to count the preceding siblings of the highest div in an ePub (for a footnote). I need to write the value to an attribute before passing the notes through XSLT.
for $note in doc('/db/custom_jh/bukwor.xml')//tei:note[@place='bottom']
let $parent := count($note[preceding-sibling::tei:div[@n='1']])
let $update := update insert attribute att2 {$parent} into $note
return $note
Attempts with $note[preceding-sibling::tei:div[@n='1']] or $note[ancestor-or-self::tei:div[@n='1']] return just 0 or the total number of all the divs.
I am after something like <xsl:number level="any" select="tei:div[@n='1']"/> from XSLT, if possible.
UPDATE
The most minimal code for counting (still not working: it returns only 6 × 1, but at least one value should be 2):
for $note at $count in doc('/db/custom_jh/bukwor.xml')//tei:note[@place='bottom']
let $parent := count($note[ancestor-or-self::*/tei:div[@n='1']])
return $parent
I don't know the ePub XML format and no sample XML is provided, so the requirement isn't clear, at least to me. But according to the title, you might want something like this:
let $parent := count($note/parent::*/preceding-sibling::tei:div[@n='1'])
This counts the preceding-sibling tei:div elements of the parent of the current $note, keeping only those tei:div elements whose n attribute equals 1.
My original example was rather poor, so in the end I restructured the whole thing. At the moment, I do it this way:
let $chaps :=
(
let $countAll := count($doc//tei:note)
for $chapter at $count in $doc//tei:div[@n='1']
let $countPreceding := count($chapter/preceding::tei:div[@n='1']//tei:note[@place='bottom'])
let $params :=
<parameters>
<param name="footnoteNo" value="{$countPreceding}"/>
</parameters>
return
<entry name="OEBPS/chapter-{$count}.xhtml" type="xml">
{
transform:transform($chapter, doc("/db/custom_jh/xslt/style-web.xsl"), $params)
}
</entry>
)
The count($chapter/preceding::tei:div[@n='1']//tei:note[@place='bottom']) does the trick for me. (I need to collect all footnotes in one file and make backlinks to the locations of their indexes in different files.)
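To show what that expression counts, here is a small self-contained sketch. The TEI fragment is made up for illustration and is not taken from bukwor.xml, and the chapter element is a hypothetical output wrapper.

```xquery
xquery version "3.0";
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Made-up sample: three chapters holding 1, 2 and 0 bottom notes. :)
let $doc :=
  <text xmlns="http://www.tei-c.org/ns/1.0">
    <div n="1"><p>a<note place="bottom">first</note></p></div>
    <div n="1"><p>b<note place="bottom">second</note><note place="bottom">third</note></p></div>
    <div n="1"><p>c</p></div>
  </text>
for $chapter at $count in $doc//tei:div[@n='1']
(: Number of bottom notes in all chapters before the current one. :)
let $countPreceding :=
  count($chapter/preceding::tei:div[@n='1']//tei:note[@place='bottom'])
return
  <chapter index="{$count}" footnoteNo="{$countPreceding}"/>
```

For this sample the footnoteNo values are 0, 1 and 3, i.e. the offset at which each chapter's footnote numbering has to start.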
I have content in my legacy database that is neither valid HTML nor valid XML. Since it would be difficult to clean up the legacy data, I want to tidy it up in MarkLogic using xdmp:tidy. I am currently using MarkLogic 8.
<sub>
<p>
<???†?>
</p>
</sub>
I'm passing this content to the tidy function like this:
declare variable $xml as node() :=
<content>
<![CDATA[<p><???†?></p>]]>
</content>;
xdmp:tidy(xdmp:quote($xml//text()),
<options xmlns="xdmp:tidy">
<assume-xml-procins>yes</assume-xml-procins>
<quiet>yes</quiet>
<tidy-mark>no</tidy-mark>
<enclose-text>yes</enclose-text>
<indent>yes</indent>
</options>)
As a result it returns :
<p>
<? ?†?>
</p>
This result is not well-formed XML (I checked it with an XML validator), so when I try to insert it into MarkLogic it throws an error saying 'MALFORMED BODY | Invalid Processing Instruction names'.
I investigated PIs without much luck. I could have tried saving the content without the PI, but this is not a valid PI in the first place.
That is because what you think is a PI is in fact not a PI.
From W3C:
2.6 Processing Instructions
[Definition: Processing instructions (PIs) allow documents to contain
instructions for applications.]
Processing Instructions
[16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
[17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
So the PI name cannot start with ? as in your sample ??†
You probably want to clean up the content before you pass it to tidy.
Like below:
declare variable $xml as node() :=
<content><![CDATA[<p>Hello <???†?>world</p>]]></content>;
declare function local:copy($input as item()*) as item()* {
for $node in $input
return
typeswitch($node)
case text()
return fn:replace($node,"<\?[^>]+\?>","")
case element()
return
element {name($node)} {
(: output each attribute in this element :)
for $att in $node/@*
return
attribute {name($att)} {$att}
,
(: output all the sub-elements of this element recursively :)
for $child in $node
return local:copy($child/node())
}
(: otherwise pass it through. Used for comments and real PIs; text() is handled above :)
default return $node
};
xdmp:tidy(local:copy($xml),
<options xmlns="xdmp:tidy">
<assume-xml-procins>no</assume-xml-procins>
<quiet>yes</quiet>
<tidy-mark>no</tidy-mark>
<enclose-text>yes</enclose-text>
<indent>yes</indent>
</options>)
This should do the trick and get rid of all PIs (real and fake ones).
Regards,
Peter