XQuery - what is wrong here

I would like the node replace to act on the $person variable. What do I need to change?
The following code should change the name of the people in the sequence to X.
declare function local:ChangeName($person)
{
xdmp:node-replace($person//Name/text, text { "X" } )
<p>{$person}</p>
};
let $b := <Person>
<Name>B</Name>
<IsAnnoying>No</IsAnnoying>
</Person>
let $j := <Person>
<Name>J</Name>
<IsAnnoying>No</IsAnnoying>
</Person>
let $people := ($b, $j)
return $people ! local:ChangeName(.)

xdmp:node-replace() only operates on persisted documents, not on in-memory documents.
Your local:ChangeName() function could instead construct new Person and Name elements while copying the IsAnnoying element, as in:
declare function local:ChangeName($person)
{
<p>
<Person>
<Name>X</Name>
{$person//IsAnnoying}
</Person>
</p>
};
For more complex transformations, consider a recursive typeswitch or XSLT transform.
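The recursive typeswitch approach mentioned above could look roughly like the following. This is only a minimal sketch: the function name local:change-name is hypothetical, and it assumes that every Name element anywhere in the tree should have its content replaced with "X". It builds a modified copy in memory rather than updating the original nodes.

```xquery
(: Hypothetical helper: returns a copy of the tree in which the
   content of every Name element is replaced with the text "X". :)
declare function local:change-name($node as node()) as node()
{
  typeswitch ($node)
    (: Rebuild Name with its attributes but with new text content :)
    case element(Name) return
      element { fn:node-name($node) } { $node/@*, text { "X" } }
    (: Rebuild any other element and recurse into its children :)
    case element() return
      element { fn:node-name($node) } {
        $node/@*,
        for $child in $node/node()
        return local:change-name($child)
      }
    (: Text, comments and PIs are copied through unchanged :)
    default return $node
};

let $b := <Person><Name>B</Name><IsAnnoying>No</IsAnnoying></Person>
let $j := <Person><Name>J</Name><IsAnnoying>No</IsAnnoying></Person>
for $person in ($b, $j)
return <p>{ local:change-name($person) }</p>
```

Because the function only copies and never calls an xdmp update function, it works on in-memory nodes such as $b and $j.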

As Erik noted above, you can't apply the xdmp update functions to in-memory nodes. If you have a strong need to do in-memory node updates, Ryan Grimm's memupdate library can come in handy. It basically does the grunt work of the recursive traversal for you. But beware: in-memory updates do not scale to large documents, because each update requires making a copy.
https://github.com/marklogic/commons/tree/master/memupdate

I don't have an ML instance running, so I couldn't test it (maybe there are more issues), but it has to be
$person//Name/text()
instead of
$person//Name/text

The XPath:
$person//Name/text
should be:
$person//Name/text()
Without the parentheses, the XPath selects a child element named text, which doesn't exist, rather than the text node you want to replace.

Also note that the results of xdmp:node-replace will not be visible in the same transaction. To see the results you need to start a new transaction.

Related

eXist-db serialize is expand-xincludes=no ignored?

In eXist-db 4.4, XQuery 3.1, I am compressing a number of XML files to a .zip in a directory. The compression process uses serialize().
The XML files have some large xincludes which according to the documentation are automatically processed in serializing. I have attempted to 'turn off' the xinclude serialization in two places in the code (prologue declare and map), but the serializer is still outputting all xincludes:
declare option exist:serialize "expand-xincludes=no";
declare function zip:get-entries-for-zip()
{
(: get documents prefixed by 'MS609' :)
let $pref := "MS609"
(: get list of document names :)
let $doclist := xmldb:get-child-resources($globalvar:URIdata)[starts-with(., $pref)]
(: output serialized entries :)
let $entries :=
for $n in $doclist
return
<entry name="{$n}" type='text' method='store'>
{serialize(doc(concat($globalvar:URIdata, "/", $n)), map { "method": "xml", "expand-xincludes": "no"})}
</entry>
return $entries
};
The XML data with xincludes to reproduce this problem can be found here http://medieval-inquisition.huma-num.fr/downloads under the description "BM MS609 Edition (tei-xml)".
Many thanks in advance.
The expand-xincludes serialization parameter is specific to eXist and, as such (or at least at present), cannot be set using the fn:serialize() function. Instead, use the util:serialize() function:
util:serialize($document, "expand-xincludes=no")
Alternatively, since you're ultimately interested in zipping the contents of a collection, you can skip the explicit serialization step, declare your serialization options in the query's prolog (or set them inline using util:declare-option()), and simply pass the compression:zip() function the URI path(s) of the collections/documents you want to zip. For example:
xquery version "3.1";
declare option exist:serialize "expand-xincludes=no";
let $sources := "/db/apps/my-app/my-data" (: or a sequence of paths to individual docs:) ! xs:anyURI(.)
let $preserve-collection-structure := false()
let $zip := compression:zip($sources, $preserve-collection-structure)
return
xmldb:store("/db", "my-data.zip", $zip)
For more on serialization options in eXist, see my earlier answer to a similar question: https://stackoverflow.com/a/49290616/659732.

MarkLogic - How to insert element into XML

How do I insert a node into XML?
let $a := <a><b>bbb</b></a>)
return
xdmp:node-insert-after(doc("/example.xml")/a/b, <c>ccc</c>);
Expected Output:
<a><c>ccc</c><b>bbb</b></a>
Please help to get the output.
I believe you should be using xdmp:node-insert-before, in the following way:
xdmp:document-insert('/example.xml', <a><b>bbb</b></a>);
xdmp:node-insert-before(fn:doc('/example.xml')/a/b, <c>ccc</c>);
fn:doc('/example.xml');
(: returns <a><c>ccc</c><b>bbb</b></a> :)
Nodes are immutable, so in-memory mutation can only be done by creating a new copy.
The copy can use the unmodified contained nodes from the original:
declare function local:insert-after(
$prior as node(),
$inserted as node()+
) as element()
{
let $container := $prior/parent::element()
return element {fn:node-name($container)} {
$container/namespace::*,
$container/attribute(),
$prior/preceding-sibling::node(),
$prior,
$inserted,
$prior/following-sibling::node()
}
};
let $a := <a><b>bbb</b></a>
return local:insert-after($a//b, <c>ccc</c>)
Creating a copy in memory and then inserting the copy is faster than inserting and modifying a document in the database.
Depending on how many documents are inserted, the difference could be significant.
There are community libraries for copying with changes, but sometimes it's as easy to write a quick function (recursive where necessary).
You can use the code below to insert the element into the XML:
xdmp:node-insert-child(fn:doc('directory URI'),element {fn:QName('http://yournamespace','elementName') }{$elementValue})
Here we use fn:QName to put the new element in the target namespace, which avoids xmlns="" being added to the inserted node.

How to tidy-up Processing Instructions in Marklogic

I have content in my legacy database that is neither valid HTML nor valid XML. Since cleaning up the legacy data would be difficult, I want to tidy it up in MarkLogic using xdmp:tidy. I am currently using ML-8.
<sub>
<p>
<???†?>
</p>
</sub>
I'm passing this content to xdmp:tidy as follows:
declare variable $xml as node() :=
<content>
<![CDATA[<p><???†?></p>]]>
</content>;
xdmp:tidy(xdmp:quote($xml//text()),
<options xmlns="xdmp:tidy">
<assume-xml-procins>yes</assume-xml-procins>
<quiet>yes</quiet>
<tidy-mark>no</tidy-mark>
<enclose-text>yes</enclose-text>
<indent>yes</indent>
</options>)
As a result it returns:
<p>
<? ?†?>
</p>
This result is not well-formed XML (I checked it with an XML validator), so when I try to insert it into MarkLogic it throws an error saying 'MALFORMED BODY | Invalid Processing Instruction names'.
I did some investigation around PIs without much luck. I could have tried saving the content without the PI, but this is not a valid PI either.
That is because what you think is a PI is in fact not a PI.
From W3C:
2.6 Processing Instructions
[Definition: Processing instructions (PIs) allow documents to contain instructions for applications.]
Processing Instructions
[16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
[17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
So the PI target must be a valid XML Name, which cannot start with ? as it does in your sample (??†).
You probably want to clean up the content before you pass it to tidy, like below:
declare variable $xml as node() :=
<content><![CDATA[<p>Hello <???†?>world</p>]]></content>;
declare function local:copy($input as item()*) as item()* {
for $node in $input
return
typeswitch($node)
case text()
return fn:replace($node,"<\?[^>]+\?>","")
case element()
return
element {name($node)} {
(: output each attribute in this element :)
for $att in $node/@*
return
attribute {name($att)} {$att}
,
(: output all the sub-elements of this element recursively :)
for $child in $node
return local:copy($child/node())
}
(: otherwise pass it through. Used for comments and PIs; text() is handled above :)
default return $node
};
xdmp:tidy(local:copy($xml),
<options xmlns="xdmp:tidy">
<assume-xml-procins>no</assume-xml-procins>
<quiet>yes</quiet>
<tidy-mark>no</tidy-mark>
<enclose-text>yes</enclose-text>
<indent>yes</indent>
</options>)
This would do the trick to get rid of all PIs (real and fake PIs)
Regards,
Peter

How to set the value of one field to another field using XQuery

I'm very new to XQuery and MarkLogic. What is the XQuery version of the following statement?
update all_the_records
set B_field = A_field
where B_field is null and A_field is not null
Something like this might get you started. But remember that you're working with trees, not tables. Things are generally more complicated because of that extra dimension.
for $doc in collection()/doc[not(b)][a]
let $a as element() := $doc/a
return xdmp:node-insert-child($doc, element b { $a/@*, $a/node() })

how to change the XML structure using XQuery

I have an XML file containing employees' names and the jobs done by them.
The structure of the XML file is:
<Employee>AAA#A#B#C#D</Employee>
<Employee>BBB#A#B#C#D</Employee>
<Employee>CCC#A#B#C#D</Employee>
<Employee>DDD#A#B#C#D</Employee>
There are thousands of records and I have to change the structure to:
<Employee>
<Name>AAA</Name>
<Jobs>
<Job>A</Job>
<Job>B</Job>
<Job>C</Job>
<Job>D</Job>
</Jobs>
</Employee>
How can I get this done using XQuery in BaseX?
Three XQuery functions, substring-before, substring-after and tokenize, are used to get the required output.
substring-before is used to get the Name.
Similarly, substring-after is used to get the Jobs portion.
Then the tokenize function is used to split the Jobs.
let $data :=
<E>
<Employee>AAA#A#B#C#D</Employee>
<Employee>BBB#A#B#C#D</Employee>
<Employee>CCC#A#B#C#D</Employee>
<Employee>DDD#A#B#C#D</Employee>
</E>
for $x in $data/Employee
return
<Employee>
{<Name>{substring-before($x,"#")}</Name>}
{<Jobs>{
for $tag in tokenize(substring-after($x,"#"),'#')
return
<Job>{$tag}</Job>
}</Jobs>
}</Employee>
HTH...
Tokenizing the whole string is probably easier and faster. tokenize($string, $pattern) splits $string using the regular expression $pattern; head($seq) returns the first value of a sequence and tail($seq) returns all but the first. You could also use positional predicates of course, but these functions are easier to read.
for $employee in //Employee
let $tokens := tokenize($employee, '#')
return element Employee {
element Name { head($tokens) },
element Jobs {
for $job in tail($tokens)
return element Job { $job }
}
}
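For comparison, the positional-predicate variant mentioned above might look like this. It is a sketch only, assuming the same #-delimited sample data wrapped in an E element as in the earlier answer; $tokens[1] takes the place of head() and $tokens[position() gt 1] the place of tail().

```xquery
let $data :=
  <E>
    <Employee>AAA#A#B#C#D</Employee>
    <Employee>BBB#A#B#C#D</Employee>
  </E>
for $employee in $data/Employee
(: Split "AAA#A#B#C#D" into ("AAA", "A", "B", "C", "D") :)
let $tokens := tokenize($employee, '#')
return element Employee {
  (: The first token is the employee name :)
  element Name { $tokens[1] },
  element Jobs {
    (: Every remaining token is one job :)
    for $job in $tokens[position() gt 1]
    return element Job { $job }
  }
}
```

This form also runs on processors that lack head() and tail(), which were only added in XQuery 3.0.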
