Not picking up modifications to an XML document in the database - xquery

I have only just started with MarkLogic and XQuery. I am having a really tough time in modifying the content of one of my XML documents. I just cannot seem to get a change to an element to pick up. Here's my process (I have had to take things back as basic as I could just to try and get it working):
In query console I have one tab open which queries for the contents of one XML doc:
xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";
xdmp:document-get("C:/Users/Paul/Documents/MarkLogic/xml/ppl/ppl/jdbc_ppl_3790.xml")
This brings back the document as below
false
...
3790
Victoria Wilson
</ppl_name>
I now want to update the element using XQuery but it's just not happening. Here's the XQuery:
xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";
let $docxml :=
xdmp:document-get("C:/Users/Paul/Documents/MarkLogic/xml/ppl/ppl/jdbc_ppl_3065.xml")/document/meta/ppl_name
return
for $node in $docxml/*
let $target := xdmp:document-get("C:/Users/Paul/Documents/MarkLogic/xml/ppl/ppl/jdbc_ppl_3790.xml")/document/meta/*[fn:name() = fn:name($node)]
return
xdmp:node-replace($target, $node)
I am basically looking to replace the ppl_name element in the target (3790) with the ppl_name element from the source (3065).
I run the XQuery - it completes without error (making me thing it has worked) - return value reads your query returned an empty sequence.
I then go back to the same tab as I used in step 1 and re-run the XQuery used in step 1. The doc (3790) comes back but it STILL has Victoria Wilson as the ppl_name.

The node returned by xdmp:document-get is an in-memory node from a document on the filesystem. It isn't coming from the database. You can't use xdmp:node-replace on in-memory nodes. That's only for database-resident nodes.
You can insert it using xdmp:document-insert. Then it's in the database, and you can access it using doc and update it using xdmp:node-replace. Or you can use in-memory operations to construct a new version with the changes you want.
See What are in memory elements in marklogic? for previous answers to a similar question, and more tips.

Here the node returned by xdmp:document-get is an in-memory node
If your working with in memory elements import the following module
import module namespace mem = "http://xqdev.com/in-mem-update" at "/MarkLogic/appservices/utils/in-mem-update.xqy";
Instead of using xdmp:node-replace you can use mem:node-replace(<x/>, <y/>)

Related

how to write query where the input file is passed from the command line (Saxon)

I'm very new to this.
I have a query and an xml file.
I can write a query over that specific file
for $x in doc("file:///C:/Users/Foo/IdeaProjects/XQuery/src/books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title
I have a basic xml file, with books in it, works nicely in intellij.
but if I wanted to run this query against some file defined on the command line, then how do I do it?
the command line for running the above is (as much for other peoples reference)
java -cp C:\Users\Foo\.IdeaIC2019.2\config\plugins\xquery-intellij-plugin\lib\Saxon-HE-9.9.1-7.jar net.sf.saxon.Query -t -q:"C:\Users\Foo\IdeaProjects\XQuery\src\w3schools.com.xqy"
and that also works nicely.
the saxon documentation
https://www.saxonica.com/html/documentation/using-xquery/commandline.html
implies that I can specify an input file, using "-d"
and "The document node of the document is made available to the query as the context item"
but this doesnt really make any sense to my 1 day old XQuery skills.
how do I specify the document is sent from the command line in the query? what is the context item? and how do I reference it?
(I can do a bit of XSLT 1.0, so I understand the notion of a context).
I think the option is named -s (for source) so you can use -s:books.xml and inside your XQuery main expression any path is evaluated with that document as the context item so you can just use e.g.
for $x in /bookstore/book
where $x/price>30
order by $x/title
return $x/title
and the answer is to drop the doc() function
for $x in bookstore/book
i.e. the same notion as xslt.

File Exists and Exception Handling in U-Sql

Two questions
How to check file exists or not before EXTRACT?
we have scenario where new inputs file is generated every day for catalog data. we need to merge new input with d-1 file. before merge we what to make sure that new input file exists at source location
does u-sql supports try...catch block?
Regarding checking if a file exists. We recently released a compile-time IF statement that indeed can check for partition existence (and other objects such as files and tables are on the roadmap).
Once that feature is released (still one or two refreshs out at the time of this answer) it may look something like (syntax subject to change):
IF FILE.EXISTS("/mydir/myfile.csv") THEN
#data = EXTRACT ... FROM "/mydir/myfile.csv" USING ...;
...
#jobstate = SELECT * FROM (VALUES("job completed")) AS T(status);
ELSE
#jobstate = SELECT * FROM (VALUES("file not ready. Job not executed.")) AS T(status);
END;
OUTPUT #jobstate TO "/jobs/myjobstate.csv" USING Outputters.Csv();
You will be able to provide the name as a parameter as well. Please let me know if that will work for your scenario.
An other alternative is to use the file set syntax, especially if you want to use a dynamic value to determine the process. That would simply create an empty rowset:
#data = EXTRACT ..., date DateTime
FROM "/mydir/{date:yyyy}/{date:MM}/{date:dd}/data.csv"
USING ...;
#data = SELECT * FROM #data WHERE date == DateTime.Now.AddDays(-1);
... // continue processing #data that is empty if yesterday's file is not yet there
Having said that, you may want to check of your job orchestration framework (such as ADF) may be a better place to check for existence before submitting the job in the first place.
As to the try catch block: U-SQL itself is a script-level optimizable, declarative language where the plan gets generated and optimized at runtime over the whole script. Thus providing a dynamic TRY-CATCH is currently not available, since it would severely impact the ability to optimize the script (e.g., you cannot move predicates or column pruning outside of a try-catch block). Also TRY/CATCH can lead to some very hard to understand and debug code, especially if it is used to mimic procedural workflows in an otherwise declarative environment.
However, you can use try/catch inside your C# functions without problems if you need to catch C# runtime errors.
FILE.EXISTS() always returns True when executed locally. However, it works when executing against Azure Data Lake.
Tried MSDN example and the following returns True, True
DECLARE #filepath_good = "/Samples/Data/SearchLog.tsv";
DECLARE #filepath_bad = "/Samples/Data/zzz.tsv";
#result =
SELECT FILE.EXISTS(#filepath_good) AS exists_good,
FILE.EXISTS(#filepath_bad) AS exists_bad
FROM (VALUES (1)) AS T(dummy);
OUTPUT #result
TO "/Output/FileExists.txt"
USING Outputters.Csv();
I have Microsoft Azure Data Lake Tools for Visual Studio version 2.2.5000.0

Updating embedded triples using xquery in MarkLogic

I tried to update embedded triples in marklogic using xquery but it seems to be not working for embedded triples however the same query is working for other triples
can you tell me if there is some other option which needs to specified while performing an update on embedded triples.
The code i used is
xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics"
at "/Marklogic/semantics.xqy";
let $triples := cts:triples(sem:iri("http://smartlogic.com/document#2012-10-26_DNB.OL_(Citi)_DNB_ASA_(DNB.OL)__Model_Update.61259187.xml"),()())
for $triple in $triples
let $node := sem:database-nodes($triple)
let $replace :=
<sem:triple>
<sem:subject>http://www.example.com/products/1001_Test
</sem:subject>
{$node/sem:predicate, $node/sem:object}
</sem:triple>
return $node ! xdmp:node-replace(., $replace)
My document contains the following triple
<sem:triples xmlns:sem="http://marklogic.com/semantics">
<sem:triple>
<sem:subject>http://smartlogic.com/document#2012-10-26_DNB.OL_(Citi)_DNB_ASA_(DNB.OL)__Model_Update.61259187.xml</sem:subject>
<sem:predicate>http://www.smartlogic.com/schemas/docinfo.rdf#cik</sem:predicate>
<sem:object>datatype="http://www.w3.org/2001/XMLSchema#string</sem:object>
</sem:triple>
</sem:triples>
and i want this particular subject to change into something like this
<sem:subject>http://www.example.com/products/1001_Test</sem:subject>
But when i use the xquery to update it , it does not alter anything, the embedded triple in the documents remains the same.
Because when i tried to see if any of the results have changed to the subject i specified it returned me no results.
I used the following query to test.
SELECT *
WHERE {
<http://www.example.com/products/1001_Test> ?predicate ?object
}
You need to add the option 'all' when you ask for the database nodes backing the triple: sem:database-nodes($triple, 'all').
To be perfectly honest, I am not 100% sure why, but I think this is because your sem:triples element is not the root element of the document it appears on.

How to rename a document in MarkLogic?

I have simple task to do but unable to find the exact solutions for this.I have saved a file as abc.xml in MarkLogic.How can i rename the file as some example.xml using XQuery?
Code which I tried:
xquery version "1.0-ml";
xdmp:document-rename ("/aaa.xml","/final.xml");
This is showing an error.
There is no way, that I know of, to change the document URI of an existing document. The only way I can think of is to create a new document with the same content and the new URI, and delete the existing one, in the same transaction.
Where it gets tricky is to make sure to preserve the ownership, the permissions, all the properties, the property document, make sure that the old URI is not used anywhere to link to the existing document, etc.
But usually, the document URI is never really used. You should first considering whether you really need to rename the document, and why.
(Note that saying "this is showing an error" is rarely useful on SO or on mailing lists, if you do not show what the error is.)
Florent is correct, a true 'rename' is not possible, or perhaps not even meaningful. ( analogy - rename a file from one disk to another )
"Move" however is meaningful (copy then delete in a transaction).
Defining "Move" is use case dependent - i.e. what metatdata also needs to 'move' ? permissions? collections ? document properties ? inherited permissions ?
xmlsh (http://www.xmlsh.org) implements a 'rename' (http://www.xmlsh.org/MarkLogicRename) command for the marklogic extension which is really a 'move', with the implemenation borrowed from postings on markmail (http://markmail.org/)
The implementation is the following XQuery - it doesnt do everything you might want and it might do more then you want. YMMV
https://github.com/DALDEI/xmlsh/blob/master/extensions/marklogic/src/org/xmlsh/marklogic/resources/rename.xquery
( it was also written long ago - it is likely to benefit from improvement )
I have working example this works for me.
xquery version "1.0-ml";
declare function local:document-rename(
$old-uri as xs:string, $new-uri as xs:string)
as empty-sequence()
{
xdmp:document-delete($old-uri),
let $permissions := xdmp:document-get-permissions($old-uri)
let $collections := xdmp:document-get-collections($old-uri)
return xdmp:document-insert(
$new-uri, doc($old-uri),
if ($permissions) then $permissions
else xdmp:default-permissions(),
if ($collections) then $collections
else xdmp:default-collections(),
xdmp:document-get-quality($old-uri)
)
,
let $prop-ns := namespace-uri(<prop:properties/>)
let $properties :=
xdmp:document-properties($old-uri)/node()
[ namespace-uri(.) ne $prop-ns ]
return xdmp:document-set-properties($new-uri, $properties)
};
(: function call :)
local:document-rename ("/opt/backup/x.xml","y.xml");
MarkLogic has a tutorial up addressing file renaming (moving):
https://developer.marklogic.com/recipe/move-a-document/
Importantly, it uses the function xdmp:lock-for-update() to prevent modifications to the source file while it is being copied to the target location.
Also, if you are doing a batch renaming you'll want to make sure that each file URI you rename corresponds to a document in the database or you'll get runtime errors.

XQuery Update: insert or replace depending if node exists not possible?

I am trying to build simple XML database (inside BaseX or eXist-db), but I have trouble figuring how to modify values in document :
content is simple as this for test :
<p>
<pl>
<id>6></id>
</pl>
</p>
I am trying to build something like function which would insert element into <pl> if that element is not present or replace it if it is present. But XQuery is giving me troubles yet :
When I try it head-on with if-then-else logic :
if (exists(/p/pl[id=6]/name)=false)
then insert node <name>thenname</name> into /p/pl[id=6]
else replace value of node /p/pl[id=6]/name with 'elsename'
I get error Error: [XUDY0027] Replace target must not be empty. Clearly I am confused, why the else part is evaluated in both cases, thus the error.
When I empty out the else part :
if (exists(/p/pl[id=6]/name)=true)
then insert node <name>thenname</name> into /p/pl[id=6]
else <dummy/>
Then I get Error: [XUST0001] If expression: no updating expression allowed.
When I try through declaring updating function, even then it reports error :
declare namespace testa='test';
declare updating function testa:bid($a, $b)
{
if (exists(/p/pl[id=6]/name)=true)
then insert node <name>thenname</name> into /p/pl[id=6]
else <dummy/>
};
testa:bid(0,0)
Error: [XUST0001] If expression: no updating expression allowed.
I've got these errors from BaseX 6.5.1 package.
So how can I modify values in a simple fashion if possible ?
If I call insert straight, the there could be multiple elements of same value.
If I call replace it will fail when the node does not exist.
If I delete the node before insert/replace then I could destroy sub-nodes which I don't want.
In most SQL databases, these are quite simple task (like MYSQL 'replace' command).
#Qiqi: #Alejandro is right. Your if expression is incorrect XQuery syntax:
if (exists(/p/pl[id=6]/name))
then insert node <name>thenname</name> into /p/pl[id=6]
else replace value of node /p/pl[id=6]/name with 'elsename'
Note that eXist-db's XQuery Update functionality is currently an eXist-specific implementation, so in eXist (currently) 1.4.x and 1.5dev, you'll want:
if (exists(/p/pl[id=6]/name))
then update insert <name>thenname</name> into /p/pl[id=6]
else update value /p/pl[id=6]/name with 'elsename'
This eXist-specific XQuery Update syntax is documented on http://exist-db.org/update_ext.html. This syntax was developed before the W3C XQuery Update spec had reached its current state. The eXist team plans to make eXist fully compliant with the W3C spec soon, but in the meantime the docs above should help you achieve what you need to if you use eXist.
Note too that your example code contains a typo inside the pl and id elements. The valid XML version would be:
<p>
<pl>
<id>6</id>
</pl>
</p>

Resources