I have written an XQuery that runs while an incremental backup is in progress. I know the backup status returns three possible values:
completed, in-progress and failed. I'm not sure of the exact value of the last one, but anyway, this is my XQuery:
xquery version "1.0-ml";
declare function local:escape-for-regex
( $arg as xs:string? ) as xs:string {
replace($arg,
'(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))','\\$1')
} ;
declare function local:substring-before-last
( $arg as xs:string? ,
$delim as xs:string ) as xs:string {
if (matches($arg, local:escape-for-regex($delim)))
then replace($arg,
concat('^(.*)', local:escape-for-regex($delim),'.*'),
'$1')
else ''
} ;
let $server-info := doc("/config/server-info.xml")
let $content-database :="xyzzy"
let $backup-directory:=$server-info/configuration/server-info/backup-directory/text()
let $backup-latest-dateTime := xdmp:filesystem-directory(fn:concat( $backup-directory,'/',$content-database))/dir:entry[1]/dir:filename/text()
let $backup-latest-date := fn:substring-before($backup-latest-dateTime,"-")
let $backup-info := cts:search(/,cts:element-value-query(xs:QName("directory-name"),$backup-latest-date))
let $new-backup := if($backup-info)
then fn:false()
else fn:true()
let $db-bkp-status := if($new-backup)
then (xdmp:database-backup-status(())[./*:forest/*:backup-path[fn:contains(., $backup-latest-dateTime)]][./*:forest/*:incremental-backup eq "false"]/*:status)
else (xdmp:database-backup-status(())[./*:forest/*:backup-path[fn:contains(., $backup-latest-dateTime)]][./*:forest/*:incremental-backup eq "true"][./*:forest/*:incremental-backup-path[fn:contains(., fn:replace(local:substring-before-last(xs:string(fn:current-date()), "-"), "-", ""))]]/*:status)
return $db-bkp-status
We maintain a configuration file that stores backup status. On a new full backup day, $backup-info returns nothing. On a daily incremental backup day, it returns the config. I'm using it just to check whether today's backup is a new full backup or an incremental one. On an incremental day $new-backup is false, so the else branch on the last line runs. But this doesn't return anything for incremental backups: neither completed nor in-progress. I wonder how MarkLogic picks up the timestamp. Please assist with this.
Feel free to provide your own xquery from scratch. I can update mine.
I even took the job id and searched for it in the output of xdmp:database-backup-status(()), but that job id doesn't exist in the result set either.
MarkLogic's Admin modules already provide much of the information you are attempting to get via other methods. The Admin UI modules (typically found in /opt/MarkLogic/Modules/MarkLogic/Admin/Lib) contain a lot of helpful code that can be adapted to get these sorts of details. In this case I would refer to database-status-form.xqy:
define function db-mount-state(
$fstats as node()*,
$fcounts as node()*,
$dbid as xs:unsignedLong)
{
let $times := $fstats/fs:last-state-change,
$ls := max($times),
$since :=
if (not(empty($ls)))
then concat(" since ", longDate($ls), " ", longTimeSecs($ls))
else ""
return concat(database-status($dbid,$fstats,$fcounts),$since)
}
define function backup-recov-state($fstats as node()*)
{
if(empty($fstats/fs:backups/fs:backup)
and
empty($fstats/fs:restore))
then
"No backup or restore in progress"
else
if(empty($fstats/fs:backups/fs:backup))
then
"Restore in progress (see below for details)"
else
"Backup in progress (see below for details)"
}
... Call the functions against your database, then pull the details from the elements you want:
let $last-full-backup := max($fstats/fs:last-backup)
let $last-incremental-backup := max($fstats/fs:last-incr-backup)
return ($last-full-backup, $last-incremental-backup)
These are just sample code snippets, not executable as-is, but they should get you moving in the right direction.
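One way to obtain the $fstats value those snippets expect is sketched below. It is a minimal sketch, not the Admin UI's own code: it assumes the database name "xyzzy" from the question, and it takes the element names fs:last-backup, fs:last-incr-backup and fs:backups/fs:backup from the snippets above, so check them against the forest status output of your MarkLogic version.

```xquery
xquery version "1.0-ml";
declare namespace fs = "http://marklogic.com/xdmp/status/forest";

(: "xyzzy" is the database name used in the question :)
let $dbid := xdmp:database("xyzzy")
(: one forest-status element per forest attached to the database :)
let $fstats := xdmp:forest-status(xdmp:database-forests($dbid))
return (
  (: true while any forest reports a running backup :)
  fn:exists($fstats/fs:backups/fs:backup),
  (: most recent full and incremental backup times across the forests :)
  fn:max($fstats/fs:last-backup),
  fn:max($fstats/fs:last-incr-backup)
)
```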
I am trying to write an XQuery result to a CSV file, see the attached code (resulting in at least 1.6 million lines, and it will probably become a lot more).
However, several minutes into execution the program fails with an 'out of main memory' error. I am using a laptop with 4 GB of memory. I would have thought that writing to a file would prevent memory bottlenecks. Also, I am already using the copynode-false pragma.
I might have gone about the code the wrong way, since this is my first XQuery/BaseX program. Or this might not be solvable without extra hardware. (Current database SIZE: 3092 MB; NODES: 142477344.) Any assistance would be much appreciated!
let $params :=
<output:serialization-parameters xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization">
<output:method value="csv"/>
<output:csv value="header=yes, separator=semicolon"/>
</output:serialization-parameters>
return file:write(
'/tmp/output.csv',
(# db:copynode false #){<csv>{
for $stand in //*:stand
return <record>{$stand//*:kenmerk}</record>
(: {$stand//*:identificatieVanVerblijfsobject}
{$stand//*:inOnderzoek}
{$stand//*:documentdatum}
{$stand//*:documentnummer} :)
}</csv>},
$params
)
It’s a good idea to use the copynode pragma to save memory. In the given case, it’s probably the total amount of newly created element nodes that will simply consume too much memory before the data can be written to disk.
If you have large data sets, the xquery serialization format may be the better choice. Maps and arrays consume less memory than XML nodes:
let $params := map {
'format': 'xquery',
'header': true(),
'separator': 'semicolon'
}
let $data := map {
'names': [
'kenmerk', 'inOnderzoek'
],
'records': (
for $stand in //*:stand
return [
string($stand//*:kenmerk),
string($stand//*:inOnderzoek)
]
)
}
return file:write-text(
'/tmp/output.csv',
csv:serialize($data, $params)
)
Another approach is to use the window clause and write the results in chunks:
for tumbling window $stands in //*:stand
start at $s when true()
end at $e when $e - $s eq 99999
let $first := $s = 1
let $path := '/tmp/output.csv'
let $csv := <csv>{
for $stand in $stands
return <record>{
$stand//*:kenmerk,
$stand//*:inOnderzoek
}</record>
}</csv>
let $params := map {
'method': 'csv',
'csv': map {
'separator': 'semicolon',
'header': $first
}
}
return if ($first) then (
file:write($path, $csv, $params)
) else (
file:append($path, $csv, $params)
)
After the first write operation, subsequent table rows will be appended to the original file. The chunk size (here: 100000 rows per loop) can be freely adjusted. As in your original code, the serialization parameters can also be specified as XML, and it’s of course also possible to use the xquery serialization format in the second example.
I know this seems like a duplicate, and I am sure it more or less is ...
However, it really bugs me, and I cannot make anything of the earlier posts:
I am building a digital edition, utilizing TEI, XML, XSLT (and probably eXist-db; maybe I will switch to Node/JavaScript).
I built a function that should transform each file in a specified directory to HTML. (My XSL file works well.)
declare function app:XMLtoHTML-forAll ($node as node(), $model as map(*), $query as xs:string?){
let $ref := xs:string(request:get-parameter("document", ""))
let $xml := doc(concat("/db/apps/BookOfOrders/data/edition/",$ref))
let $xsl := doc("/db/apps/BookOfOrders/resources/xslt/xmlToHtml.xsl")
let $params :=
<parameters>
{for $p in request:get-parameter-names()
let $val := request:get-parameter($p,())
where not($p = ("document","directory","stylesheet"))
return
<param name="{$p}" value="{$val}"/>
}
</parameters>
return
transform:transform($xml, $xsl, $params)
};
There is a list of files in apps/BookofOrders/data/edition/ named FolioX.html, where X is the page number. (I'll probably change the names to [FolioNumber].xml, but that's not the issue.)
I am trying to make a text slider (so that when I open the page, a page is presented and further buttons are created, and I can slide to the right and read the rest of the pages).
I have a table of content, that is linked to the transformed files:
declare function app:toc($node as node(), $model as map(*)) {
for $doc in collection("/db/apps/BookOfOrders/data/edition")/tei:TEI
return
<li>{document-uri(root($doc))}</li>
};
I guess I am wondering how to change the link inside from, for example, Folio29 to Folio30.
Can I take a part of the provided link and make the destination of a link flexible, similar but not identical to what I did in the toc-function above?
I'd be really happy if anyone could point me in the right direction.
Given an expression like document-uri(root($doc)) (perhaps more simply util:document-name($doc), since you're using eXist) that returns the path to (or filename of) the document ending in "FolioX", you just need to isolate X, then cast it as an integer so you can perform addition/subtraction on the value:
document-uri(root($doc)) => substring-after("Folio") => xs:integer()
util:document-name($doc) => substring-after("Folio") => xs:integer()
Then add 1, and you've got your next document. Subtract 1, and you've got the previous one.
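A minimal sketch of that arithmetic in plain XQuery 3.1. The helper local:folio-number and the sample name "Folio29.xml" are hypothetical; the regular expression is used instead of substring-after so that a file extension after the number does not break the integer cast.

```xquery
xquery version "3.1";

(: Hypothetical helper: extract the page number from a name such as "Folio29.xml" :)
declare function local:folio-number($name as xs:string) as xs:integer {
  xs:integer(replace($name, "^.*Folio(\d+).*$", "$1"))
};

let $current := "Folio29.xml"  (: assumed example file name :)
let $n := local:folio-number($current)
return
  <nav>
    <prev>{"Folio" || ($n - 1)}</prev>
    <next>{"Folio" || ($n + 1)}</next>
  </nav>
```

For the sample name this yields Folio28 as the previous and Folio30 as the next document.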
However, this could lead to broken links: Folio0 or Folio98 (assuming there are only 97). To avoid this, you might want to determine the complete list of Folios, find the current position, and then never hit 0 or 98:
let $this-folio := $doc => util:document-name()
let $collection := $doc => util:collection-name()
let $all-folios := xmldb:get-child-resources($collection)
(: sort the filenames using UCA Numeric collation to ensure Folio2 < Folio10.
: see https://www.w3.org/TR/xpath-functions-31/#uca-collations :)
let $sorted-folios := $all-folios => sort("?numeric=yes")
(: look up the position and the neighbours in the sorted list, so that
 : gaps in the numbering cannot produce broken links :)
let $this-folio-n := index-of($sorted-folios, $this-folio)
let $prev-folio := if ($this-folio-n gt 1) then $sorted-folios[$this-folio-n - 1] else ()
let $next-folio := if ($this-folio-n lt count($sorted-folios)) then $sorted-folios[$this-folio-n + 1] else ()
return
<nav>
<prev>{$prev-folio}</prev>
<this>{$this-folio}</this>
<next>{$next-folio}</next>
</nav>
I am trying to update a document in a different database than my current one, but it gives me the error below:
XDMP-UPEXTNODES: xdmp:node-replace(fn:doc("/C:/Users/Downloads/abc.csv-0-2")/*:envelope/*:root/*:Status, <Status>1000</Status>) -- Cannot update external nodes
I am using the below code-
let $temp :=
for $i in $result
let $error := $i/*:envelope/*:ErrorMessage
let $status := $i/*:envelope/*:Status
return
if(fn:exists($i) eq fn:true()) then (
xdmp:invoke-function(
function() {
xdmp:node-replace($status,<Status>1000</Status>),
xdmp:node-replace($error,<ErrorMessage>Change Error in other Database-2</ErrorMessage>)
},
<options xmlns="xdmp:eval">
<database>{xdmp:database("DATABASE-2")}</database>
</options>))
else ()
I want to update the ErrorMessage and Status nodes in Database-2.
$result is the document I fetched from Database-2.
I am running this code from Database-1.
Any suggestions?
You cannot pass database nodes as variables for updating purposes like that. Instead you should pass the document URI through, and get a fresh copy of the element you'd like to update inside the invoked function. Maybe you can push a bit more logic inside the invoked function to make that easier. Something like:
for $i in $result
let $uri := xdmp:node-uri($i)
return xdmp:invoke-function(function() {
let $doc := fn:doc($uri)
let $error := $doc/*:envelope/*:ErrorMessage
let $status := $doc/*:envelope/*:Status
return if(fn:exists($doc) eq fn:true()) then (
xdmp:node-replace($status, <Status>1000</Status>),
xdmp:node-replace($error, <ErrorMessage>Change Error in other Database-2</ErrorMessage>)
) else ()
}, map:entry("database", xdmp:database("DATABASE-2")))
Be careful though. It sounds like $i is pointing to the actual document in Database-2 as well, and that could easily result in deadlocks: the invoking query could be holding a read lock on $i, leaving the invoked function unable to update it.
HTH!
I have to copy an entire project folder into the MarkLogic server, and instead of doing it manually I decided to do it with a recursive function, but it is becoming the worst idea I have ever had. I'm having problems with the transactions and with the syntax, and being new I can't find a proper way to solve it. Here's my code, thank you for the help!
import module namespace dls = "http://marklogic.com/xdmp/dls" at "/MarkLogic/dls.xqy";
declare option xdmp:set-transaction-mode "update";
declare function local:recursive-copy($filesystem as xs:string, $uri as xs:string)
{
for $e in xdmp:filesystem-directory($filesystem)/dir:entry
return
if($e/dir:type/text() = "file")
then dls:document-insert-and-manage($e/dir:filename, fn:false(), $e/dir:pathname)
else
(
xdmp:directory-create(concat(concat($uri, data($e/dir:filename)), "/")),
local:recursive-copy($e/dir:pathname, $uri)
)
};
let $filesystemfolder := 'C:\Users\WB523152\Downloads\expath-ml-console-0.4.0\src'
let $uri := "/expath_console/"
return local:recursive-copy($filesystemfolder, $uri)
MLCP would have been nice to use. However, here is my version:
declare option xdmp:set-transaction-mode "update";
declare variable $prefix-replace := ('C:/', '/expath_console/');
declare function local:recursive-copy($filesystem as xs:string){
for $e in xdmp:filesystem-directory($filesystem)/dir:entry
return
if($e/dir:type/text() = "file")
then
let $source := $e/dir:pathname/text()
let $dest := fn:replace($source, $prefix-replace[1], $prefix-replace[2])
let $_ := xdmp:document-load($source,
<options xmlns="xdmp:document-load">
<uri>{$dest}</uri>
</options>)
return <record>
<from>{$source}</from>
<to>{$dest}</to>
</record>
else
local:recursive-copy($e/dir:pathname)
};
let $filesystemfolder := 'C:\Temp'
return <results>{local:recursive-copy($filesystemfolder)}</results>
Please note the following:
I changed my sample to the C:\Temp dir
The output is XML only because by convention I try to do this in case I want to analyze results. It is actually how I found the error related to conflicting updates.
I chose to define a simple prefix replace on the URIs
I saw no need for DLS in your description
I saw no need for the explicit creation of directories in your use case
The reason you were getting conflicting updates is that you were using just the filename as the URI. Across the whole directory structure, these names were not unique, hence the conflicting update on double inserts of the same URI.
This is not solid code:
You would have to ensure that a URI is valid. Not all filesystem paths/names are OK for a URI, so you would want to test for this and escape chars if needed.
Large filesystems would time-out, so spawning in batches may be useful.
As an example, I might gather the list of docs as in my XML and then process that list by spawning a new task for every 100 documents. This could be accomplished by a simple loop over xdmp:spawn-function or by using a library such as taskbot by @mblakele.
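The batching idea could look roughly like the sketch below. It is not tested against a particular MarkLogic release: the two sample paths are placeholders, the URI rewrite mirrors the prefix replace used above, and the transaction-mode option is an assumption about how the spawned task is made an update (newer releases may prefer a different option for this).

```xquery
xquery version "1.0-ml";

declare variable $batch-size := 100;

(: Assumed input: filesystem paths gathered beforehand, e.g. from the <from> elements :)
let $paths := ("C:/Temp/a.xml", "C:/Temp/b.xml")
let $batch-count := xs:integer(fn:ceiling(fn:count($paths) div $batch-size))
for $i in 1 to $batch-count
(: slice out the i-th group of at most $batch-size paths :)
let $batch := fn:subsequence($paths, ($i - 1) * $batch-size + 1, $batch-size)
return
  (: each batch is queued on the task server and runs in its own transaction :)
  xdmp:spawn-function(
    function() {
      for $path in $batch
      return xdmp:document-load($path,
        <options xmlns="xdmp:document-load">
          <uri>{fn:replace($path, "^C:/", "/expath_console/")}</uri>
        </options>)
    },
    <options xmlns="xdmp:eval">
      <transaction-mode>update-auto-commit</transaction-mode>
    </options>
  )
```

Keeping each batch in a separate task bounds the size of any single transaction, which is what avoids the time-out on large directory trees.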
I am trying to split my incoming documents using "Information Studio Flows" (MarkLogic v8.0-1.1). The problem is in the "Transform" section.
This is the document I am importing. For simplicity I have reduced its content to one stwtext element:
<docs>
<stwtext id="RD-10-00258" update="03.2011" seq="RQ-10-00001">
<head>
<ti>
<i>j</i>
</ti>
<ff-list>
<ff id="0103"/>
</ff-list>
</head><p>
Symbol für die
<vw idref="RD-19-04447">Stromdichte</vw>
.
</p>
</stwtext>
</docs>
This is my "xquery transform" content:
xquery version "1.0-ml";
(: Copyright 2002-2015 MarkLogic Corporation. All Rights Reserved. :)
(:
:: Custom action. It must be a CPF action module.
:: Replace this text completely, or use it as a template and
:: add imports, declarations,
:: and code between START and END comment tags.
:: Uses the external variables:
:: $cpf:document-uri: The document being processed
:: $cpf:transition: The transition being executed
:)
import module namespace cpf = "http://marklogic.com/cpf"
at "/MarkLogic/cpf/cpf.xqy";
(: START custom imports and declarations; imports must be in Modules/ on filesystem :)
(: END custom imports and declarations :)
declare option xdmp:mapping "false";
declare variable $cpf:document-uri as xs:string external;
declare variable $cpf:transition as node() external;
if ( cpf:check-transition($cpf:document-uri,$cpf:transition))
then
try {
(: START your custom XQuery here :)
let $doc := fn:doc($cpf:document-uri)
return
xdmp:eval(
for $wpt in fn:doc($doc)//stwtext
return
xdmp:document-insert(
fn:concat("/rom-data/", fn:concat($wpt/@id,".xml")),
$wpt
)
)
(: END your custom XQuery here :)
,
cpf:success( $cpf:document-uri, $cpf:transition, () )
}
catch ($e) {
cpf:failure( $cpf:document-uri, $cpf:transition, $e, () )
}
else ()
When I run the snippet, I get the error:
Invalid URI format
and the long description of it:
XDMP-URI: (err:FODC0005) fn:doc(fn:doc("/8122584828241226495/12835482492021535301/URI=/content/home/admin/Vorlagen/testing/v10.new-ML.xml")) -- Invalid URI format: "
j
Symbol für die
Stromdichte
"
In /18200382103958065126.xqy on line 37
In xdmp:invoke("/18200382103958065126.xqy", (xs:QName("trgr:uri"), "/8122584828241226495/12835482492021535301/URI=/content/home/admi...", xs:QName("trgr:trigger"), ...), <options xmlns="xdmp:eval"><isolation>different-transaction</isolation><prevent-deadlocks>t...</options>)
$doc = fn:doc("/8122584828241226495/12835482492021535301/URI=/content/home/admin/Vorlagen/testing/v10.new-ML.xml")
In /MarkLogic/cpf/triggers/internal-cpf.xqy on line 179
In execute-action("on-state-enter", "http://marklogic.com/states/initial", "/8122584828241226495/12835482492021535301/URI=/content/home/admi...", (xs:QName("trgr:uri"), "/8122584828241226495/12835482492021535301/URI=/content/home/admi...", xs:QName("trgr:trigger"), ...), <options xmlns="xdmp:eval"><isolation>different-transaction</isolation><prevent-deadlocks>t...</options>, (fn:doc("http://marklogic.com/cpf/pipelines/14379829270688061297.xml")/p:pipeline, fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline), fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline/p:state-transition[1]/p:default-action, fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline/p:state-transition[1])
$caller = "on-state-enter"
$state-or-status = "http://marklogic.com/states/initial"
$uri = "/8122584828241226495/12835482492021535301/URI=/content/home/admi..."
$vars = (xs:QName("trgr:uri"), "/8122584828241226495/12835482492021535301/URI=/content/home/admi...", xs:QName("trgr:trigger"), ...)
$invoke-options = <options xmlns="xdmp:eval"><isolation>different-transaction</isolation><prevent-deadlocks>t...</options>
$pipelines = (fn:doc("http://marklogic.com/cpf/pipelines/14379829270688061297.xml")/p:pipeline, fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline)
$action-to-execute = fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline/p:state-transition[1]/p:default-action
$chosen-transition = fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline/p:state-transition[1]
$raw-module-name = "/18200382103958065126.xqy"
$module-kind = "xquery"
$module-name = "/18200382103958065126.xqy"
In /MarkLogic/cpf/triggers/internal-cpf.xqy on line 320
I thought it was a problem with the "Document setting" in the "load" section of the "Flow editor":
URI=/content{$path}/{$filename}{$dot-ext}
but if I remove it, I receive the same error.
I have no idea what to do. I am really new to this. Please help.
First of all, Information Studio has been deprecated in MarkLogic 8. I would also very much recommend looking into the aggregate_record feature of MarkLogic Content Pump:
http://docs.marklogic.com/guide/ingestion/content-pump#id_65814
Apart from that, there are several issues with your code. You are calling fn:doc twice, effectively trying to interpret the doc contents as a uri. There is an unnecessary xdmp:eval wrapping the FLWOR statement, which expects a string as first param. I think you can shorten it to (showing inner part of the action only):
(: START your custom XQuery here :)
let $doc := fn:doc($cpf:document-uri)
for $wpt in $doc//stwtext
return
xdmp:document-insert(
fn:concat("/roempp-data/", fn:concat($wpt/@id,".xml")),
$wpt
)
(: END your custom XQuery here :)
HTH!
Many thanks @grtjn. This is my approach; it is practically the same solution:
(: START your custom XQuery here :)
xdmp:log(fn:doc($cpf:document-uri), "debug"),
let $doc := fn:doc($cpf:document-uri)
return
xdmp:eval('
declare variable $doc external;
for $wpt in $doc//stwtext
return (
xdmp:document-insert(
fn:concat("/roempp-data/", fn:concat($wpt/@id,".xml")),
$wpt,
xdmp:default-permissions(),
"roempp-data"
)
)'
,
(xs:QName("doc"), $doc),
<options xmlns="xdmp:eval">
<database>{xdmp:database("roempp-tutorial")}</database>
</options>
)
(: END your custom XQuery here :)
OK, now it works. That is fine, but I found that after the loading is over, I see two documents in MarkLogic:
my split document "/rom-data/RD-10-00258.xml" with the single root element "stwtext" (as desired)
the original document "URI=/content/home/admin/Vorlagen/testing/v10.new-ML.xml" with the root element "docs"
Is it possible to prevent the original document from being inserted?