How to import a project into eXist-db

I'm fairly new to eXide and eXist-db, so I'm still getting used to its GUI, menus, and ways of doing things. My problem is that I already have a project: a teacher downloaded it from his eXist instance and gave it to me as a .rar.
I want to import that project so that it keeps its existing folder structure. I've tried importing it through the Collections menu, but that ignores the folder structure and just uploads all the files into one folder.
The folder structure is as follows:
final:
-db
-contrib
-file
-file
-css
-somecss
-somecss
-js
-somejs
-somejs
-img
-img1.png
-xqueryelement.xq
-xqueryelement2.xq
-htmlelement.html

Besides the approaches suggested by Ron and Loren, you could use a WebDAV client - probably the simplest. Or you can write a query (e.g., in eXide) using the xmldb:store-files-from-pattern() function. All of the above are mentioned in the article, Getting Data into eXist-db.
If you're interested in using the xmldb:store-files-from-pattern() function, here are specific directions. You'll use the variant of the function with the $preserve-structure parameter. The direct link to the function documentation for this variant of the function is http://exist-db.org/exist/apps/fundocs/view.html?uri=http://exist-db.org/xquery/xmldb#store-files-from-pattern.5. Here are the steps:
1. Expand the .rar file into a folder on your file system, e.g., /path/to/files
2. Open eXide (probably at http://localhost:8080/exist/apps/eXide on your local system). Log in, e.g., as the admin user, so that your query will be able to write to the database.
3. Use this query:
xquery version "3.0";
let $collection-uri := '/db'
let $directory := '/path/to/files'
let $pattern := '**/*'
let $mime-type := ()
let $preserve-structure := true()
return
xmldb:store-files-from-pattern($collection-uri, $directory, $pattern, $mime-type, $preserve-structure)
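After running the import query above, it can help to confirm that the folder structure really was preserved. The following is only a minimal sketch: the function name local:tree and the target collection '/db/contrib' are assumptions chosen for illustration (adjust the path to wherever your files ended up), while xmldb:get-child-collections() and xmldb:get-child-resources() are the standard xmldb module functions.

```xquery
xquery version "3.0";

(: Recursively list a collection, its resources, and all of its descendants.
   local:tree is a hypothetical helper name, not part of eXist-db. :)
declare function local:tree($collection as xs:string) as xs:string* {
    $collection,
    for $resource in xmldb:get-child-resources($collection)
    order by $resource
    return $collection || '/' || $resource,
    for $child in xmldb:get-child-collections($collection)
    order by $child
    return local:tree($collection || '/' || $child)
};

(: '/db/contrib' is an assumed path based on the folder layout in the question :)
local:tree('/db/contrib')
```

If the result shows every file directly under a single collection, the structure was flattened and the $preserve-structure argument is worth double-checking.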

If, when the .rar file is extracted, each folder contains a "__contents__.xml" file, then you can use the restore functionality of the Java admin client.
If it does not, then you can use the following. (I have not tested the code and wrote it on the fly.) Launch eXide from the dashboard, log in as admin, copy the code below into the editor, and run it.
The code includes some functions copied from Priscilla Walmsley's FunctX function module: http://www.xqueryfunctions.com/
xquery version "3.0";
declare namespace file="http://exist-db.org/xquery/file";
(:~
: Escapes regex special characters
:
: @author Priscilla Walmsley, Datypic
: @version 1.0
: @see http://www.xqueryfunctions.com/xq/functx_escape-for-regex.html
: @param $arg the string to escape
:)
declare function local:escape-for-regex( $arg as xs:string? ) as xs:string {
replace($arg, '(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))','\\$1')
} ;
(:~
: The substring before the last occurrence of a delimiter
:
: @author Priscilla Walmsley, Datypic
: @version 1.0
: @see http://www.xqueryfunctions.com/xq/functx_substring-before-last.html
: @param $arg the string to substring
: @param $delim the delimiter
:)
declare function local:substring-before-last( $arg as xs:string?, $delim as xs:string ) as xs:string {
if (matches($arg, local:escape-for-regex($delim)))
then replace($arg,
concat('^(.*)', local:escape-for-regex($delim),'.*'),
'$1')
else ''
} ;
(:~
: The substring after the last occurrence of a delimiter
:
: @author Priscilla Walmsley, Datypic
: @version 1.0
: @see http://www.xqueryfunctions.com/xq/functx_substring-after-last.html
: @param $arg the string to substring
: @param $delim the delimiter
:)
declare function local:substring-after-last
( $arg as xs:string? ,
$delim as xs:string ) as xs:string {
replace ($arg,concat('^.*',local:escape-for-regex($delim)),'')
} ;
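declare function local:import($base as xs:string, $path as xs:string) {
    (: Database path of this entry, e.g. '/db/css' for '/tmp/lctmp/db/css' :)
    let $offset := fn:substring-after($path, $base)
    return
        if (file:is-directory($path))
        then
            (: Create the matching collection; '/db' itself already exists :)
            let $created :=
                if ($offset eq '/db') then ()
                else xmldb:create-collection(
                    local:substring-before-last($offset, '/'),
                    local:substring-after-last($offset, '/'))
            (: file:list() returns an XML listing, so read each entry's name attribute :)
            let $coll :=
                for $resource in file:list($path)/*/@name
                return local:import($base, $path || '/' || $resource)
            return $path
        else
            let $collection := local:substring-before-last($offset, '/')
            let $resource := local:substring-after-last($offset, '/')
            let $stored := xmldb:store($collection, $resource,
                file:read-binary($path), 'application/octet-stream')
            (: Make stored queries executable :)
            let $permission :=
                if (fn:ends-with($resource, ".xq"))
                then sm:chmod(xs:anyURI($collection || '/' || $resource), 'rwxr-xr-x')
                else ()
            return $path
};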
let $base := '/tmp/lctmp'
let $path := '/tmp/lctmp/db'
return local:import($base, $path)

To upload folder structures in eXist, you have two options:
1. Use the Java admin client. In a vanilla eXist installation, you can access it directly via the browser at http://localhost:8080/exist/webstart/exist.jnlp, via the link in the dashboard at http://localhost:8080/exist/apps/dashboard/index.html, or via the pop-up menu of the quick-launch button (the quick-launch toolbar is in the bottom right if you're on Windows). There you can add entire folders.
2. Package the project as a XAR archive, which is essentially a zip archive with proper project descriptor files. See details at http://exist-db.org/exist/apps/doc/repo.xml
Hope this helps,
Ron

Related

How to execute XQuery on all XML documents in the folder

I need to make sure a particular node exists in many XML files. I have to switch the context each time I want to query another document.
Is there any way I can execute XQuery on all documents in the directory without switching the context?
I may be a little late, but most probably the following XQuery will do what you wish, it returns the path to each XML-File that does not contain a specific element:
let $path := "."
for $file in file:list( $path, true(), '*.xml')
let $path := $path || "/" || $file
where not(
exists(fetch:xml($path)/foo/bar[text() = "Text"])
)
return $path
If you are only interested in whether all XML files in a specific directory contain a specific element, the following query might be useful:
declare variable $path := "/Users/michael/Code/foo";
every $doc in file:list($path, true(), '*.xml') (: returns a sequence of file-names :)
=> for-each(concat($path,"/", ?)) (: returns a sequence of full paths to each file :)
=> for-each(fetch:xml#1) (: returns a sequence of documents :)
satisfies exists(
$doc/*/*[text() = "Text"]
)
Hope this helps ;-)

XQuery Collection - Cant get multiple files from a directory

I'm trying to get all the filenames from a directory that have a specific string in the XML elements. I'm using this code for that:
for $x in collection('file:///C:/Sricpts/Software20180101-V1.1.1/workspace/Data_PreProcessing?select=*.xml')
where matches($x, 'INSERT INTO')
return $x
However, when I run the code it gives this error:
[FODC0002] Resource 'C:/Program Files (x86)/BaseX/file:/file:///C:/Sricpts/Software20180101-V1.1.1/workspace/Data_PreProcessing?select=*.xml' does not exist.
How can I solve this problem?
Many thanks!
If you use BaseX, there is no need to specify a file filter. All XML documents that occur in the specified directory will be added to your collection:
for $x in collection('file:///C:/Sricpts/Software20180101-V1.1.1/workspace/Data_PreProcessing')
where matches($x, 'INSERT INTO')
return $x
If you want to have more control over the documents to be chosen, you can take advantage of the File Module:
(: The trailing backslash matters: file:list returns paths relative to $root :)
let $root := 'C:\Sricpts\Software20180101-V1.1.1\workspace\Data_PreProcessing\'
for $path in file:list($root, true(), '*.xml')
let $doc := doc($root || $path)
where matches($doc, 'INSERT INTO')
return $doc

How to share markup snippets amongst functions in eXist-db?

I wonder whether there is a way to share HTML code snippets in eXist-db. I have two different functions (more are expected later) that return the same big HTML form for different results. It is annoying to maintain the same code in several places whenever I change something in one of them. I have tried:
1. Saving it as an HTML file and loading it with the doc() function (eXist complains that it is not an XML file but a binary one).
2. Saving it as a global variable in a separate module (eXist complains there is a problem with contexts). I don't know how to pass such a variable without the namespace prefix.
3. Saving it as a function returning its own huge variable (eXist complains there is a problem with contexts).
What is the best practice?
UPDATE
Well, I have tried to put the snippet into a variable inside a function loaded as a module. To me, that seems reasonable. However, I get an error when I try to return it:
err:XPDY0002 Undefined context sequence for 'child::snip:snippet' [at line 62, column 13, source: /db/apps/karolinum-apps/modules/app.xql]
In function: app:book-search(node(), map, xs:string?) [34:9:/db/apps/karolinum-apps/modules/app.xql]
I am calling it like so:
declare function app:list-books($node as node(), $model as map(*)) {
for $resource in collection('/db/apps/karolinum-apps/data/mono')
let $author := $resource//tei:titleStmt/tei:author/text()
let $bookName := $resource//tei:titleStmt/tei:title/text()
let $bookUri := base-uri($resource)
let $imgPath := replace($bookUri, '[^/]*?$', '')
let $fileUri := ( '/exist/rest' || $bookUri )
let $fileName := replace($bookUri, '.*?/', '')
return
if ($resource//tei:titleStmt/tei:title)
then
snip:snippet
else ()
};
Any ideas, please?
UPDATE II
Here I have the function in the module:
module namespace snip = "http://46.28.111.241:8081/exist/db/apps/karolinum-apps/modules/snip";
declare function snip:snippet($node as node(), $model as map(*), $author as xs:string, $bookTitle as xs:string, $bookUri as xs:anyURI, $fileUri as xs:anyURI) as element()* {
let $snippet :=
(
<div class="panel panel-default">
<div class="panel-heading">
<h3 class="panel-title">{$bookTitle} ({$author})</h3>
</div>
<div class="panel-body">
...
</div>
</div>
)
return $snippet
};
Here I am trying to call it:
declare function app:list-books($node as node(), $model as map(*)) {
for $resource in collection('/db/apps/karolinum-apps/data/mono')
let $author := $resource//tei:titleStmt/tei:author/text()
let $bookTitle := $resource//tei:titleStmt/tei:title/text()
let $bookUri := base-uri($resource)
let $fileUri := ('/exist/rest' || $bookUri)
let $fileName := replace($bookUri, '.*?/', '')
where not(util:is-binary-doc($bookUri))
order by $bookTitle, $author
return
snip:snippet($author, $bookTitle, $bookUri, $fileUri)
};
It throws:
err:XPST0017 error found while loading module app: Error while loading module app.xql: Function snip:snippet() is not defined in namespace 'http://46.28.111.241:8081/exist/db/apps/karolinum-apps/modules/snip' [at line 35, column 9]
When I tried to put the snippet into a variable, it was not possible to pass there those local variables used (it threw $fileUri is not set). Besides that I tried to change the returned type element()* but nothing helped.
All of your approaches should work. Let me address each one:
1. Is the HTML snippet well-formed XML? If so, save it as, e.g., form.xml or form.html (since by default eXist assumes files with the .html extension are well-formed; see mime-types.xml in your eXist installation folder) and refer to it with doc($path). If it is not well-formed, you can save it as form.txt and pull it in with util:binary-to-string(util:binary-doc($path)). Or make the HTML well-formed and use the first alternative.
2. This too is valid, so you must not be properly declaring or referring to the global variable. What is the exact error you are getting? Can you post a small example snippet that we could run to reproduce your results?
3. See #2.
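For the second and third approaches, here is a minimal sketch of a library module that shares markup both as a global variable and as a function. The namespace URI http://example.com/ns/snip, the file name snip.xqm, and the names $snip:form and snip:panel are all hypothetical and only serve the illustration.

```xquery
xquery version "3.0";

(: snip.xqm: a library module holding the shared markup.
   The namespace URI and all names below are hypothetical. :)
module namespace snip = "http://example.com/ns/snip";

(: A global variable for markup that needs no parameters :)
declare variable $snip:form :=
    <form method="get" action="search.html">
        <input type="text" name="query"/>
        <button type="submit">Search</button>
    </form>;

(: A function for markup that depends on values computed by the caller :)
declare function snip:panel($title as xs:string, $author as xs:string) as element(div) {
    <div class="panel panel-default">
        <div class="panel-heading">
            <h3 class="panel-title">{$title} ({$author})</h3>
        </div>
    </div>
};
```

A main module would then import it and refer to the variable with the `$` sign and the prefix, and to the function with parentheses and arguments:

```xquery
xquery version "3.0";

(: The "at" location assumes snip.xqm is stored next to this query :)
import module namespace snip = "http://example.com/ns/snip" at "snip.xqm";

<div>{ $snip:form, snip:panel("Some title", "Some author") }</div>
```

Writing a bare snip:snippet, without `$` or `()`, is parsed as a path step that selects a child element of that name, which is why an "Undefined context sequence" error (XPDY0002) is raised when there is no context item.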
I was very close. It was necessary to somehow pass parameters to the nested function and omit eXist’s typical $node as node(), $model as map(*) as arguments.
Templating function:
declare function app:list-books($node as node(), $model as map(*)) {
for $resource in collection('/db/apps/karolinum-apps/data/mono')
let $author := $resource//tei:titleStmt/tei:author/text()
let $bookTitle := $resource//tei:titleStmt/tei:title/text()
let $bookUri := base-uri($resource)
let $bookId := xs:integer(util:random() * 10000)
let $fileUri := ('/exist/rest' || $bookUri)
let $fileName := replace($bookUri, '.*?/', '')
where not(util:is-binary-doc($bookUri))
order by $bookTitle, $author
return
snip:snippet($author, $bookTitle, $bookUri, $bookId, $fileUri)
};
Snippet function:
declare function snip:snippet($author as xs:string, $bookTitle as xs:string, $bookUri as xs:anyURI, $bookId as xs:integer, $fileUri as xs:anyURI) as element()* {
let $snippet :=
(
<div class="panel panel-default">
...
</div>
)
return $snippet
};

split document by using MarkLogic Flow Editor

I am trying to split my incoming documents using "Information Studio Flows" (MarkLogic v8.0-1.1). The problem is in the "Transform" section.
This is the document I am importing. For simplicity I have reduced its content to one stwtext element:
<docs>
<stwtext id="RD-10-00258" update="03.2011" seq="RQ-10-00001">
<head>
<ti>
<i>j</i>
</ti>
<ff-list>
<ff id="0103"/>
</ff-list>
</head><p>
Symbol für die
<vw idref="RD-19-04447">Stromdichte</vw>
.
</p>
</stwtext>
</docs>
This is my "xquery transform" content:
xquery version "1.0-ml";
(: Copyright 2002-2015 MarkLogic Corporation. All Rights Reserved. :)
(:
:: Custom action. It must be a CPF action module.
:: Replace this text completely, or use it as a template and
:: add imports, declarations,
:: and code between START and END comment tags.
:: Uses the external variables:
:: $cpf:document-uri: The document being processed
:: $cpf:transition: The transition being executed
:)
import module namespace cpf = "http://marklogic.com/cpf"
at "/MarkLogic/cpf/cpf.xqy";
(: START custom imports and declarations; imports must be in Modules/ on filesystem :)
(: END custom imports and declarations :)
declare option xdmp:mapping "false";
declare variable $cpf:document-uri as xs:string external;
declare variable $cpf:transition as node() external;
if ( cpf:check-transition($cpf:document-uri,$cpf:transition))
then
try {
(: START your custom XQuery here :)
let $doc := fn:doc($cpf:document-uri)
return
xdmp:eval(
for $wpt in fn:doc($doc)//stwtext
return
xdmp:document-insert(
fn:concat("/rom-data/", fn:concat($wpt/@id,".xml")),
$wpt
)
)
(: END your custom XQuery here :)
,
cpf:success( $cpf:document-uri, $cpf:transition, () )
}
catch ($e) {
cpf:failure( $cpf:document-uri, $cpf:transition, $e, () )
}
else ()
When I run the snippet, I get the error:
Invalid URI format
and its long description:
XDMP-URI: (err:FODC0005) fn:doc(fn:doc("/8122584828241226495/12835482492021535301/URI=/content/home/admin/Vorlagen/testing/v10.new-ML.xml")) -- Invalid URI format: "
j
Symbol für die
Stromdichte
"
In /18200382103958065126.xqy on line 37
In xdmp:invoke("/18200382103958065126.xqy", (xs:QName("trgr:uri"), "/8122584828241226495/12835482492021535301/URI=/content/home/admi...", xs:QName("trgr:trigger"), ...), <options xmlns="xdmp:eval"><isolation>different-transaction</isolation><prevent-deadlocks>t...</options>)
$doc = fn:doc("/8122584828241226495/12835482492021535301/URI=/content/home/admin/Vorlagen/testing/v10.new-ML.xml")
In /MarkLogic/cpf/triggers/internal-cpf.xqy on line 179
In execute-action("on-state-enter", "http://marklogic.com/states/initial", "/8122584828241226495/12835482492021535301/URI=/content/home/admi...", (xs:QName("trgr:uri"), "/8122584828241226495/12835482492021535301/URI=/content/home/admi...", xs:QName("trgr:trigger"), ...), <options xmlns="xdmp:eval"><isolation>different-transaction</isolation><prevent-deadlocks>t...</options>, (fn:doc("http://marklogic.com/cpf/pipelines/14379829270688061297.xml")/p:pipeline, fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline), fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline/p:state-transition[1]/p:default-action, fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline/p:state-transition[1])
$caller = "on-state-enter"
$state-or-status = "http://marklogic.com/states/initial"
$uri = "/8122584828241226495/12835482492021535301/URI=/content/home/admi..."
$vars = (xs:QName("trgr:uri"), "/8122584828241226495/12835482492021535301/URI=/content/home/admi...", xs:QName("trgr:trigger"), ...)
$invoke-options = <options xmlns="xdmp:eval"><isolation>different-transaction</isolation><prevent-deadlocks>t...</options>
$pipelines = (fn:doc("http://marklogic.com/cpf/pipelines/14379829270688061297.xml")/p:pipeline, fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline)
$action-to-execute = fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline/p:state-transition[1]/p:default-action
$chosen-transition = fn:doc("http://marklogic.com/cpf/pipelines/15861601524191348323.xml")/p:pipeline/p:state-transition[1]
$raw-module-name = "/18200382103958065126.xqy"
$module-kind = "xquery"
$module-name = "/18200382103958065126.xqy"
In /MarkLogic/cpf/triggers/internal-cpf.xqy on line 320
I thought it was a problem with the "Document setting" in the "Load" section of the Flow Editor:
URI=/content{$path}/{$filename}{$dot-ext}
But if I remove it, I receive the same error.
I have no idea what to do. I am really new to this. Please help.
First of all, Information Studio has been deprecated in MarkLogic 8. I would also very much recommend looking into the aggregate_record feature of MarkLogic Content Pump:
http://docs.marklogic.com/guide/ingestion/content-pump#id_65814
Apart from that, there are several issues with your code. You are calling fn:doc twice, effectively trying to interpret the doc contents as a uri. There is an unnecessary xdmp:eval wrapping the FLWOR statement, which expects a string as first param. I think you can shorten it to (showing inner part of the action only):
(: START your custom XQuery here :)
let $doc := fn:doc($cpf:document-uri)
for $wpt in $doc//stwtext
return
xdmp:document-insert(
fn:concat("/roempp-data/", fn:concat($wpt/@id,".xml")),
$wpt
)
(: END your custom XQuery here :)
HTH!
Many thanks, @grtjn. This is my approach; it is practically the same solution:
(: START your custom XQuery here :)
xdmp:log(fn:doc($cpf:document-uri), "debug"),
let $doc := fn:doc($cpf:document-uri)
return
xdmp:eval('
declare variable $doc external;
for $wpt in $doc//stwtext
return (
xdmp:document-insert(
fn:concat("/roempp-data/", fn:concat($wpt/@id,".xml")),
$wpt,
xdmp:default-permissions(),
"roempp-data"
)
)'
,
(xs:QName("doc"), $doc),
<options xmlns="xdmp:eval">
<database>{xdmp:database("roempp-tutorial")}</database>
</options>
)
(: END your custom XQuery here :)
OK, now it works. That is fine, but I found that after the loading is finished, I see two documents in MarkLogic:
my split document "/rom-data/RD-10-00258.xml" with one root element "stwtext" (as desired)
the original document "URI=/content/home/admin/Vorlagen/testing/v10.new-ML.xml" with root element "docs"
Is it possible to prevent the original document from being inserted?

Encode string as html in eXist-db / XQuery

I'm trying to generate a tree view from a collection (filesystem). Unfortunately, some files have special characters such as ü, ä, and ö in their names, and I'd like to have them HTML-encoded, e.g. as &auml;
When I get them from the variable, they are URL-encoded. First I decode them to UTF-8 and then... I don't know how to go further.
<li>{util:unescape-uri($child, "UTF-8")}
The function util:parse is doing the exact opposite from that what I want.
Here is the recursive function:
xquery version "3.0";
declare namespace ls="ls";
declare option exist:serialize "method=html media-type=text/html omit-xml-declaration=yes indent=yes";
declare function ls:ls($collection as xs:string, $subPath as xs:string) as element()* {
if (xmldb:collection-available($collection)) then
(
for $child in xmldb:get-child-collections($collection)
let $path := concat($collection, '/', $child)
let $sPath := concat($subPath, '/', $child)
order by $child
return
<li>{util:unescape-uri($child, "UTF-8")}
<ul>
{ls:ls($path,$sPath)}
</ul>
</li>,
for $child in xmldb:get-child-resources($collection)
let $sPath := concat($subPath, '/', $child)
order by $child
return
<li> {util:unescape-uri($child, "UTF-8")}</li>
)
else ()
};
let $collection := request:get-parameter('coll', '/db/apps/ebner-online/resources/xss/xml')
return
<ul>{ls:ls($collection,"")}</ul>
Rather than util:unescape-uri(), I would suggest using xmldb:encode-uri() and xmldb:decode-uri(). Use the encode version on a collection or document name when creating/storing it. Use the decode version when displaying the collection or document name. See the function documentation for the xmldb module.
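As a minimal sketch of that suggestion, the query below encodes a name as it would be done before storing and decodes it again for display. The sample name "Übersicht.xml" is a hypothetical value chosen for illustration; in the recursive function from the question, the decode call would take the place of util:unescape-uri($child, "UTF-8").

```xquery
xquery version "3.0";

(: A hypothetical resource name containing a non-ASCII character :)
let $name := "Übersicht.xml"
(: Encode when creating or storing the resource :)
let $encoded := xmldb:encode-uri($name)
return
    (: Decode when displaying the name, e.g. inside the tree view :)
    <li>{xmldb:decode-uri($encoded)}</li>
```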
As to forcing &auml; instead of ä, this is an even trickier serialization issue. Both, along with &#228;, are equivalent representations of the same character. Why not just let the character through as ä?

Resources