XQuery - Using SPARQL results to dynamically create XML. Dynamic Element Names

I am using MarkLogic 8.
I have a SPARQL statement as such.
let $results :=
sem:sparql(
"
PREFIX skosxl: <http://www.w3.org/2008/05/skos-xl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX slc: <http://www.smartlogic.com/2014/08/semaphore-core#>
select ?relation ?value
where
{
$input ?relation ?c .
OPTIONAL {
?c skosxl:prefLabel/skosxl:literalForm|skosxl:literalForm ?d .
}
BIND ( if (bound(?d), ?d, ?c) as ?value )
}
", $bindings
)
This gives me back results that are a list of (relation, value) pairs.
I am trying to turn this response into an XML document that will be stored statically.
I've tried a variety of different approaches.
Attempt 1
let $doc := <test>{
for $item in $results
return element {map:get($item, 'relation')} {map:get($item, 'value')}
}</test>
return $doc
Error :
XDMP-ELEMNAME: (err:XPTY0004) for $item in $results -- Cannot use
sem:iri("http://www.w3.org/2008/05/skos-xl#altLabel") as an element
name
I tried casting the item in question, to a string using fn:string but that leads to a
[1.0-ml] XDMP-QNAMELEXFORM: for $item in $results -- Invalid lexical
form for QName
How can I declare a dynamic element name in XQuery while building XML?
What is causing this error in the first place? I have been experimenting with the syntax to figure it out; what am I missing?
Thank you for reading.

Casting to a string should be enough.
However, your example contains forward slashes, which I believe are invalid in an element name.
Second, your example would create an element in the namespace bound to the prefix http (the part before the colon), or whatever you defined that prefix to be.
Also, the first character after the colon is not a letter or underscore, which is required.
In my opinion, the name you are trying to use as an element name is the issue, not the actual approach.
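One way to get a usable element name is to split the IRI into a namespace part and a local part and build a QName from them. The following is only a sketch: the helper local:iri-to-qname is hypothetical (it is not part of the question or of MarkLogic), and it assumes that everything after the last # or / in the IRI is a valid NCName. If it is not, fn:QName raises an error.

```xquery
xquery version "1.0";

(: Hypothetical helper, not part of the original question. :)
declare function local:iri-to-qname($iri as xs:string) as xs:QName {
  (: everything after the last '#' or '/' becomes the local name :)
  let $local := fn:replace($iri, '^.*[#/]', '')
  (: the remainder of the IRI is used as the namespace URI :)
  let $ns := fn:substring($iri, 1, fn:string-length($iri) - fn:string-length($local))
  return fn:QName($ns, $local)
};

<test>{
  (: sample IRIs used only for illustration :)
  for $iri in ("http://www.w3.org/2008/05/skos-xl#altLabel",
               "http://www.w3.org/2004/02/skos/core#broader")
  return element { local:iri-to-qname($iri) } { "example value" }
}</test>
```

In the loop from the question, the argument would be fn:string(map:get($item, 'relation')) instead of the sample strings used here.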

Related

Dynamic Path in XQUERY

This is my xml-file:
<?xml version="1.0" encoding="UTF-8"?>
<QQ:Envelope xmlns:QQ="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header>
<RR:ABCInfo xmlns:RR="http://abc.test.de/abc/SOAP-Header/1.0">
<RR:Version>2.2.2.2</RR:Version>
<RR:BuildRevision>3333</RR:BuildRevision>
<RR:BuildTimestamp>2019-01-01T00:00:00.000+02:00</RR:BuildTimestamp>
<RR:Start>2019-01-01T10:10:10.101+02:00</RR:Start>
<RR:End>2019-01-01T11:11:11.111+02:00</RR:End>
<RR:Something>2.222 sek.</RR:Something>
<RR:Anything/>
</RR:ABCInfo>
<work:WorkContext xmlns:work="http://test.com/">1234567890abcdefghijklmnopqrstuvwxyz</work:WorkContext>
</SOAP-ENV:Header>
<QQ:Body>
<TT:testA xmlns:TT="http://abc.test.de/XYZ/2.0.1" xmlns:RR="http://abc.test.de/abc/abcdefgh/1.0">
<TT:testB>
<TT:testC>
<TT:testD>
<TT:testE id="1234567" quellID="09876543">
<TT:data>urn:de:abc:test:whatever</TT:data>
<TT:changeDate>2019-02-02T02:02:02.020+02:00</TT:changeDate>
<TT:part1 listURI="urn:de:abc:codeliste:555" listVersionID="V12">
<code>555_777</code>
<name>Fischers Fritze</name>
</TT:part1>
<TT:piece2>Frische Fische fischen</TT:piece2>
<TT:begin>
<TT:date>20191231</TT:date>
</TT:begin>
</TT:testE>
</TT:testD>
</TT:testC>
</TT:testB>
</TT:testA>
</QQ:Body>
</QQ:Envelope>
I have a XQuery, where I have to return XML. The first element in the returning XML is "result". The other elements in the returning XML should be dynamically created.
I get two sequences from outside, though I have hard-coded two fixed sequences in the following example to test it.
In sequence 1 I get the names for the other elements.
In sequence 2 I get the paths that belong to the element names in sequence 1.
I open the XML file and read a path (there might be several elements, though in my example there is only one).
Then I want to process this result in a loop and return the dynamic elements.
If I access the path with a fixed value (variable $c in the following code) I get the correct value, but then I must know the elements in sequence 1 and the paths in sequence 2 in advance.
If I concatenate the path, then I get the values of all elements.
This is my XQuery Code:
declare namespace TT="http://abc.test.de/XYZ/2.0.1";
declare namespace QQ="http://schemas.xmlsoap.org/soap/envelope/";
declare function local:getValue($path) as xs:string {
if (fn:exists($path)) then
(
data($path)
) else (
""
)
};
let $a := ('part1', 'piece2', 'beginDate')
let $b := ('TT:part1/name','TT:piece2', 'TT:begin/TT:date')
for $x in doc("Test.XML")/QQ:Envelope/QQ:Body/TT:testA/TT:testB/TT:testC/TT:testD/TT:testE
return <result>
{
for $item at $ind in $a
let $c := local:getValue($x/TT:part1/name)
let $d := local:getValue($x || concat("/", $b[$ind]))
return element { $item } {$c, " --- ", $d}
}
</result>
Is there a possibility to access the path dynamically?
Thank you in advance.
http://www.xqueryfunctions.com/xq/functx_dynamic-path.html could help; at least it helped me ;)
The functx:dynamic-path function dynamically evaluates a simple path expression. The function only supports element names and attribute names preceded by @, separated by single slashes. The names can optionally be prefixed, but they must use the same prefix that is used in the input document. It does not support predicates, other axes, or other node kinds. Note that most processors have an extension function that evaluates path expressions dynamically in a much more complete way.
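To illustrate the idea, here is a minimal sketch of such a path walker. It is not the functx implementation: local:simple-path is a hypothetical helper that handles element names only (no attributes, no predicates) and compares names with name(), so the prefixes in the path must match the prefixes used in the document. The inline $x element is a shortened copy of the TT:testE element from the question.

```xquery
xquery version "1.0";

(: Hypothetical helper: follows a slash-separated list of element names. :)
declare function local:simple-path($parent as node(), $path as xs:string) as node()* {
  (: first step of the path, and whatever remains after it :)
  let $step := if (contains($path, '/')) then substring-before($path, '/') else $path
  let $rest := substring-after($path, '/')
  let $children := $parent/*[name() = $step]
  return
    if ($rest = '') then $children
    else for $child in $children return local:simple-path($child, $rest)
};

let $x :=
  <TT:testE xmlns:TT="http://abc.test.de/XYZ/2.0.1">
    <TT:part1><name>Fischers Fritze</name></TT:part1>
    <TT:piece2>Frische Fische fischen</TT:piece2>
    <TT:begin><TT:date>20191231</TT:date></TT:begin>
  </TT:testE>
let $a := ('part1', 'piece2', 'beginDate')
let $b := ('TT:part1/name', 'TT:piece2', 'TT:begin/TT:date')
return
  <result>{
    for $item at $ind in $a
    return element { $item } { data(local:simple-path($x, $b[$ind])) }
  }</result>
```

The string concatenation in the question ($x || concat("/", $b[$ind])) only builds a string from the value of $x; it never navigates the tree, which is why a function like this (or a processor-specific eval function) is needed.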

pad the string with whitespaces to make it of certain length in xquery osb 12 c

I want to pad a string with whitespaces to make it of certain length in XQuery on the OSB platform.
I tried string-join and concat, but none of them pad whitespaces as they consider them as empty string.
Sample input:
<root-element xmlns="">
<string-to-pad>abc</string-to-pad>
</root-element>
Expected output:
<root-element>
<paddedString>abc </paddedString>
</root-element>
There is not much to say without a code sample. This is how the functx library solves your problem in XQuery. Either import it as a module (its URI is stable), or search for the function name.
declare namespace functx = "http://www.functx.com";
declare function functx:pad-string-to-length
( $stringToPad as xs:string? ,
$padChar as xs:string ,
$length as xs:integer ) as xs:string {
substring(
string-join (
($stringToPad, for $i in (1 to $length) return $padChar)
,'')
,1,$length)
} ;
see this fiddle: http://xqueryfiddle.liberty-development.net/jyyiVhe/2
This generates the desired output, but Oracle JDeveloper will not display it with proper spacing.
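As a usage sketch, the function can be applied to the sample input as follows. The target length of 10 is an assumption, since the question does not state the required length, and the inline $input element stands in for whatever message the OSB pipeline actually passes.

```xquery
xquery version "1.0";

declare namespace functx = "http://www.functx.com";

(: Copied from the functx library, as quoted in the answer above. :)
declare function functx:pad-string-to-length
  ( $stringToPad as xs:string? ,
    $padChar as xs:string ,
    $length as xs:integer ) as xs:string {
  substring(
    string-join(
      ($stringToPad, for $i in (1 to $length) return $padChar)
      , '')
    , 1, $length)
};

(: Assumed input; 10 is an assumed target length. :)
let $input :=
  <root-element>
    <string-to-pad>abc</string-to-pad>
  </root-element>
return
  <root-element>
    <paddedString>{ functx:pad-string-to-length($input/string-to-pad, ' ', 10) }</paddedString>
  </root-element>
```

Wrapping the padded value in string-length() is a quick way to confirm the padding when an editor collapses or hides trailing spaces.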

Multiple keyword search in xquery with tokenize and match

My attempt to ask this before was apparently too convoluted, trying again!
I am composing a search in Xquery. In one of the fields (title) it should be possible to enter multiple keywords. At the moment only ONE keyword works. When there is more than one there is the error ERROR XPTY0004: The actual cardinality for parameter 1 does not match the cardinality declared in the function's signature: concat($atomizable-values as xs:anyAtomicType?, ...) xs:string?. Expected cardinality: zero or one, got 2.
In my XQuery I am trying to tokenize the keywords by \s and then match them individually. I think this method is probably wrong, but I am not sure what other method to use. I am obviously a beginner!
Here is the example XML to be searched:
<files>
<file>
<identifier>
<institution>name1</institution>
<idno>signature</idno>
</identifier>
<title>Math is fun</title>
</file>
<file>
<identifier>
<institution>name1</institution>
<idno>signature1</idno>
</identifier>
<title>philosophy of math</title>
</file>
<file>
<identifier>
<institution>name2</institution>
<idno>signature2</idno>
</identifier>
<title>i like cupcakes</title>
</file>
</files>
Here is the Xquery with example input 'math' for the search field title and 'name1' for the search field institution. This works, the search output are the titles 'math is fun' and 'philosophy of math'. What doesn't work is if you change the input ($title) to 'math fun'. Then you get the error message. The desired output is the title 'math is fun'.
xquery version "3.0";
let $institution := 'name1'
let $title := 'math' (:change to 'math fun' and doesn't work anymore, only a single word works:)
let $title-predicate :=
if ($title)
then
if (contains($title, '"'))
then concat("[contains(lower-case(title), '", replace($title, '["]', ''), "')]") (:This works fine:)
else
for $title2 in tokenize($title, '\s') (:HERE IS THE PROBLEM, this only works when the input is a single word, for instance 'math' not 'math fun':)
return
concat("[matches(lower-case(title), '", $title2, "')]")
else ()
let $institution-predicate := if ($institution) then concat('[lower-case(string-join(identifier/institution))', " = '", $institution, "']") else ()
let $eval-string := concat
("doc('/db/Unbenannt.xml')//file",
$institution-predicate,
$title-predicate
)
let $records := util:eval($eval-string)
let $test := count($records)
let $content :=
<inner_container>
<div>
<h2>Search Results</h2>
<ul>
{
for $record in $records
return
<li id="searchList">
<span>{$record//institution/text()}</span> <br/>
<span>{$record//title/text()}</span>
</li>
}
</ul>
</div>
</inner_container>
return
$content
You have to wrap your FLWOR expression with string-join():
string-join(
for $title2 in tokenize($title, '\s')
return
concat("[matches(lower-case(title), '", $title2, "')]")
)
If tokenize($title) returns a sequence of strings, then
for $title2 in tokenize($title, '\s')
return concat("[matches(lower-case(title), '", $title2, "')]")
will also return a sequence of strings
Therefore $title-predicate will be a sequence of strings, and you can't supply a sequence of strings as one of the arguments to concat().
So it's clear what's wrong, but fixing it requires a deeper understanding of your query than I have time to acquire.
I find it hard to believe that the approach of generating a query as a string and then doing dynamic evaluation of that query is really necessary.
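For comparison, here is a sketch of the same search written with ordinary predicates instead of a generated query string. The function name local:find-files is hypothetical, the inline $files element replaces doc('/db/Unbenannt.xml') so the example is self-contained, and contains() is used in place of matches() so that keywords are not interpreted as regular expressions. The special handling of quoted phrases from the question is left out.

```xquery
xquery version "3.0";

(: Hypothetical helper: every keyword in $title must occur in the title. :)
declare function local:find-files(
  $files as element(file)*,
  $institution as xs:string?,
  $title as xs:string?
) as element(file)* {
  (: an empty or missing $title yields no keywords, so nothing is filtered :)
  let $words := tokenize(lower-case(normalize-space($title)), '\s+')
  return
    $files
      [not($institution)
       or lower-case(string-join(identifier/institution, '')) = lower-case($institution)]
      [every $w in $words satisfies contains(lower-case(title), $w)]
};

let $files :=
  <files>
    <file>
      <identifier><institution>name1</institution><idno>signature</idno></identifier>
      <title>Math is fun</title>
    </file>
    <file>
      <identifier><institution>name1</institution><idno>signature1</idno></identifier>
      <title>philosophy of math</title>
    </file>
    <file>
      <identifier><institution>name2</institution><idno>signature2</idno></identifier>
      <title>i like cupcakes</title>
    </file>
  </files>
return local:find-files($files/file, 'name1', 'math fun')/title
```

With the input 'math fun' only the title "Math is fun" is returned, because each keyword adds one more condition that the title has to satisfy.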

How to tidy-up Processing Instructions in Marklogic

I have content in my legacy database that is neither valid HTML nor valid XML. Since cleaning up the legacy data would be difficult, I want to tidy it up in MarkLogic using xdmp:tidy. I am currently using MarkLogic 8.
<sub>
<p>
<???†?>
</p>
</sub>
I'm passing this content to the tidy function as follows:
declare variable $xml as node() :=
<content>
<![CDATA[<p><???†?></p>]]>
</content>;
xdmp:tidy(xdmp:quote($xml//text()),
<options xmlns="xdmp:tidy">
<assume-xml-procins>yes</assume-xml-procins>
<quiet>yes</quiet>
<tidy-mark>no</tidy-mark>
<enclose-text>yes</enclose-text>
<indent>yes</indent>
</options>)
As a result it returns:
<p>
<? ?†?>
</p>
This result is not valid XML (I checked it with an XML validator), so when I try to insert it into MarkLogic it throws an error saying 'MALFORMED BODY | Invalid Processing Instruction names'.
I did some investigation into PIs, without much luck. I could have tried saving the content without the PI, but this is not a valid PI in the first place.
That is because what you think is a PI is in fact not a PI.
From W3C:
2.6 Processing Instructions
[Definition: Processing instructions (PIs) allow documents to contain
instructions for applications.]
Processing Instructions
[16] PI ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
[17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
So the PI name cannot start with ? as in your sample ??†
You probably want to clean up the content before you pass it to tidy.
Like below:
declare variable $xml as node() :=
<content><![CDATA[<p>Hello <???†?>world</p>]]></content>;
declare function local:copy($input as item()*) as item()* {
for $node in $input
return
typeswitch($node)
case text()
return fn:replace($node,"<\?[^>]+\?>","")
case element()
return
element {name($node)} {
(: output each attribute in this element :)
for $att in $node/@*
return
attribute {name($att)} {$att}
,
(: output all the sub-elements of this element recursively :)
for $child in $node
return local:copy($child/node())
}
(: otherwise pass it through. Used for comments and PIs :)
default return $node
};
xdmp:tidy(local:copy($xml),
<options xmlns="xdmp:tidy">
<assume-xml-procins>no</assume-xml-procins>
<quiet>yes</quiet>
<tidy-mark>no</tidy-mark>
<enclose-text>yes</enclose-text>
<indent>yes</indent>
</options>)
This would do the trick to get rid of all PIs (real and fake PIs)
Regards,
Peter

How to dynamically create a search query based on a set of quoted strings in MarkLogic

I have the following query, where I want to build a comma-separated string of values from a list and use that string as an or-query. It does not return any results; however, when I return just the concatenated string, it gives the exact value needed for the query.
The query is as follows:
xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";
declare variable $docURI as xs:string external ;
declare variable $orQuery as xs:string external ;
let $tags :=
<tags>
<tag>"credit"</tag>
<tag>"bank"</tag>
<tag>"private banking"</tag>
</tags>
let $docURI := "/2012-10-22_CSGN.VX_(Citi)_Credit_Suisse_(CSGN.VX)__Model_Update.61198869.xml"
let $orQuery := (string-join($tags/tag, ','))
for $x in cts:search(doc($docURI)/doc/Content/Section/Paragraph, cts:or-query(($orQuery)))
let $r := cts:highlight($x, cts:or-query($orQuery), <b>{$cts:text}</b>)
return <result>{$r}</result>
The exact query that i want to run is :
cts:search(doc($docURI)/doc/Content/Section/Paragraph, cts:or-query(("credit","bank","private banking")))
and when i do
return (string-join($tags/tag, ','))
it gives me exactly what i require
"credit","bank","private banking"
But why does it not return any result in or-query?
You should not need the string-join step at all. It passes in a single literal string, so the whole joined text is searched for as one phrase. In XQuery, sequences are your friend.
I think you want to do something like this:
let $tags-to-search := ($tags/tag/text() ! replace(., '^"|"$', '')) (: a sequence of tags :)
return cts:search(doc($docURI)/doc/Content/Section/Paragraph, cts:word-query($tags-to-search))
cts:word-query is the default query used for parameter 2 of cts:search if you pass in a string. cts:word-query also returns matches for any of the items if it is given a sequence.
https://docs.marklogic.com/cts:word-query
EDIT: Added the replace step for the quotes as suggested by Abel. This is specific to the data as presented by the original question. The overall approach remains the same.
Maybe you need something like this:
let $orQuery := for $tag in $tags/tag return cts:word-query($tag)
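Putting the pieces from both answers together, a complete version of the query could look like the sketch below. It assumes the document URI and element paths from the question and only runs inside MarkLogic; the variable names $terms and $query are introduced here for illustration.

```xquery
xquery version "1.0-ml";

let $tags :=
  <tags>
    <tag>"credit"</tag>
    <tag>"bank"</tag>
    <tag>"private banking"</tag>
  </tags>
let $docURI := "/2012-10-22_CSGN.VX_(Citi)_Credit_Suisse_(CSGN.VX)__Model_Update.61198869.xml"
(: strip the surrounding quotes; the result is a sequence of three strings :)
let $terms := for $tag in $tags/tag return fn:replace(fn:string($tag), '^"|"$', '')
(: one word query per term, combined with OR :)
let $query := cts:or-query(for $term in $terms return cts:word-query($term))
for $x in cts:search(fn:doc($docURI)/doc/Content/Section/Paragraph, $query)
return <result>{ cts:highlight($x, $query, <b>{$cts:text}</b>) }</result>
```

Building the query once and reusing it for both cts:search and cts:highlight keeps the two calls consistent.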
I used fn:tokenize instead, and it worked perfectly for my use case.
The reason is that I was passing these arguments from Java using the XCC API, and the query would not return anything when given a single string value.
xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";
declare variable $docURI as xs:string external ;
declare variable $orQuery as xs:string external ;
let $input := "credit,bank"
let $tokens := fn:tokenize($input, ",")
let $docURI := "2012-11-19 0005.HK (Citi) HSBC Holdings Plc (0005.HK)_ Model Update.61503613.pdf"
for $x in cts:search(fn:doc($docURI), cts:or-query(($tokens)))
let $r := cts:highlight($x, cts:or-query(($tokens)), <b>{$cts:text}</b>)
return <result>{$r}</result>
