Loop over XML, retrieve node values, and construct XML output using XQuery

Team, I need your help/expertise to retrieve node values by traversing an XML response. I would like to use this in an integration middleware.
Input file sample:
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
xml:base="https://api12preview.sapsf.eu:443/odata/v2/">
<title type="text">PerEmail</title>
<id>https://api12preview.sapsf.eu:443/odata/v2/PerEmail</id>
<updated>2022-11-09T13:58:27Z</updated>
<link href="PerEmail" rel="self" title="PerEmail"/>
<entry>
<id>https://api12preview.sapsf.eu:443/odata/v2/PerEmail(emailType='54139',personIdExternal='GI00152188')</id>
<title type="text"/>
<updated>2022-11-09T13:58:27Z</updated>
<author>
<name/>
</author>
<link href="PerEmail(emailType='54139',personIdExternal='GI00152188')"
rel="edit"
title="PerEmail"/>
<category scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme"
term="SFOData.PerEmail"/>
<content type="application/xml">
<m:properties>
<d:personIdExternal>GI00152188</d:personIdExternal>
<d:emailAddress>someone@test_boehringer.com</d:emailAddress>
</m:properties>
</content>
</entry>
<entry>
<id>https://api12preview.sapsf.eu:443/odata/v2/PerEmail(emailType='54139',personIdExternal='GI00453224')</id>
<title type="text"/>
<updated>2022-11-09T13:58:27Z</updated>
<author>
<name/>
</author>
<link href="PerEmail(emailType='54139',personIdExternal='GI00453224')"
rel="edit"
title="PerEmail"/>
<category scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme"
term="SFOData.PerEmail"/>
<content type="application/xml">
<m:properties>
<d:personIdExternal>GI00453224</d:personIdExternal>
<d:emailAddress>someone@test_boehringer.com</d:emailAddress>
</m:properties>
</content>
</entry>
<link href="https://api12preview.sapsf.eu:443/odata/v2/PerEmail?$select=emailAddress,personIdExternal&$filter=emailType%20eq%2054139&$skiptoken=eyJzdGFydFJvdyI6MTAwMCwiZW5kUm93IjoyMDAwfQ=="
rel="next"/>
</feed>
From this XML response, the XQuery should run through all 'entry' nodes and pick the value of each 'personIdExternal' node. I'm expecting a result like this:
<element>
<personIdExternal>GI00152188</personIdExternal>
<personIdExternal>GI00453224</personIdExternal>
</element>
I tried the code below earlier, but it's not working here, and I suspect this is due to the namespaces in the source XML. My knowledge of XQuery is limited. Please help.
{let $input:= /entry
for $i in $input/properties
return
<element>
<personIdExternal>{i/personIdExternal/text()}</personIdExternal>
</element>}

/entry doesn't select anything because the entry elements aren't at the top level, and they're in a namespace.
$input/properties is wrong because the properties element isn't a child of entry and it's in a namespace.
i doesn't select anything, it should be $i
personIdExternal doesn't select anything because it's in a namespace.
You just need
<element>{//*:personIdExternal}</element>
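For readers who want to cross-check the expected result outside XQuery, here is a minimal Python sketch of the same idea, assuming a trimmed copy of the feed from the question. The function name collect_person_ids is made up for illustration. Unlike //*:personIdExternal, which copies the elements with their original namespace, this sketch writes the values into new no-namespace elements, matching the output the question asks for.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the feed from the question (two entries, only the relevant nodes).
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry><content type="application/xml"><m:properties>
    <d:personIdExternal>GI00152188</d:personIdExternal>
  </m:properties></content></entry>
  <entry><content type="application/xml"><m:properties>
    <d:personIdExternal>GI00453224</d:personIdExternal>
  </m:properties></content></entry>
</feed>"""

# Prefix-to-URI map; the prefixes are local to this script and need not match the document.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "m": "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata",
    "d": "http://schemas.microsoft.com/ado/2007/08/dataservices",
}


def collect_person_ids(feed_xml: str) -> ET.Element:
    """Return <element> holding one <personIdExternal> per feed entry."""
    root = ET.fromstring(feed_xml)
    out = ET.Element("element")
    # Every step of the path is namespace-qualified, which is what /entry was missing.
    path = "atom:entry/atom:content/m:properties/d:personIdExternal"
    for node in root.findall(path, NS):
        ET.SubElement(out, "personIdExternal").text = node.text
    return out


result = collect_person_ids(FEED)
print(ET.tostring(result, encoding="unicode"))
```

The fully qualified path is stricter than the wildcard in the answer: it only matches personIdExternal where it sits under entry/content/properties, which may or may not be what you want.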

Related

How to upload and save a picture with eXist-db?

I am trying to upload a picture and store it in eXist-db, but I get the following error when opening the stored picture:
Cannot open specified file: Could not recognize image encoding.
I have tried the following code with a small adjustment for plain .txt files and it works fine, but not with pictures.
picture.xhtml
<?xml-model href="http://www.oxygenxml.com/1999/xhtml/xhtml-xforms.nvdl" schematypens="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:ev="http://www.w3.org/2001/xml-events"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xf="http://www.w3.org/2002/xforms"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<head>
<title/>
<xf:model>
<xf:instance xmlns="">
<data>
<image xsi:type="xs:base64Binary"/>
</data>
</xf:instance>
<xf:submission id="save" action="save.xquery" method="post"/>
</xf:model>
</head>
<body>
<xf:upload ref="image">
<xf:label>Upload Photo:</xf:label>
</xf:upload>
<br/>
<xf:submit submission="save">
<xf:label>Save</xf:label>
</xf:submit>
</body>
</html>
save.xquery
xquery version "3.1";
declare option exist:serialize "method=xhtml media-type=text/html indent=yes";
let $login:=xmldb:login('xmldb:exist:///db/apps/places','admin','admin')
(: The small adjustment I refer to is just changing the file extension from .jpeg to .txt :)
return xmldb:store("/db/apps/places/",concat("pic",".jpeg"), util:base64-decode(request:get-data()//image))
If you want to store images in eXist-db, you should probably replace xmldb:store() with xmldb:store-as-binary().
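A likely reason the text variant works while the picture does not is that decoded image bytes are not valid text. The Python sketch below only illustrates that general point; it says nothing about how eXist-db handles the data internally. The sample bytes are a hypothetical stand-in for the first bytes of an uploaded JPEG.

```python
import base64

# Hypothetical stand-in for an upload: the JPEG start-of-image marker plus a JFIF tag.
jpeg_head = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]) + b"JFIF\x00"

# What an XForms upload bound to xs:base64Binary would carry as text.
payload = base64.b64encode(jpeg_head).decode("ascii")

# Decoding must yield raw bytes; JPEG data starts with FF D8 FF.
decoded = base64.b64decode(payload)
is_jpeg = decoded[:3] == b"\xff\xd8\xff"

# Treating the same bytes as UTF-8 text fails, which is why a text-oriented
# store can corrupt an image while a .txt file survives.
try:
    decoded.decode("utf-8")
    text_safe = True
except UnicodeDecodeError:
    text_safe = False

print(is_jpeg, text_safe)
```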

XSLT to format Wordpress WXR XML for importing in to Drupal via Feeds

I'm trying to format a Wordpress WXR file using XSLT so I can import it into Drupal.
I'm aware of modules for Drupal that will import WXR files but I need the flexibility that the Feeds module can give as the imported data will be imported against different content types and I'll be pulling images and other attachments into the newly created Drupal pages. With this in mind the standard WordPress Migrate just won't cut it.
So, the WXR format has WordPress posts and attachments as separate items within the feed and links the posts and attachments using an id. Attachments can be images or files (pdf, doc, etc.) and are found at the XPath wp:postmeta/wp:meta_key, with values of _thumbnail_id and _wp_attached_file.
What I'd like to do is take various nodes from items of type attachment and put them within the corresponding post item, where the id links them together.
A fragment of the XML to be transformed follows. The first item is a post, the second is an attachment.
<item>
<title>Some groovy title</title>
<link>http://example.com/groovy-example</link>
<wp:post_id>2050</wp:post_id>
<wp:post_type>page</wp:post_type>
...
...
...
<wp:postmeta>
<wp:meta_key>_thumbnail_id</wp:meta_key>
<wp:meta_value>566</wp:meta_value>
</wp:postmeta>
</item>
...
...
...
<item>
<title>My fantastic attachment</title>
<link>http://www.example.com/fantastic-attachment</link>
<wp:post_id>566</wp:post_id>
<wp:post_type>attachment</wp:post_type>
...
...
...
<wp:attachment_url>http://www.example.com/wp-content/uploads/2012/12/fantastic.jpg</wp:attachment_url>
<wp:postmeta>
<wp:meta_key>_wp_attached_file</wp:meta_key>
<wp:meta_value>2012/12/fantastic.jpg</wp:meta_value>
</wp:postmeta>
</item>
After the transform I would like
<item>
<title>Some groovy title</title>
<link>http://example.com/groovy-example</link>
<wp:post_id>2050</wp:post_id>
<wp:post_type>page</wp:post_type>
...
...
...
<wp:postmeta>
<wp:meta_key>_thumbnail_id</wp:meta_key>
<wp:meta_value>566</wp:meta_value>
<wp:meta_url>http://www.example.com/wp-content/uploads/2012/12/fantastic.jpg</wp:meta_url>
</wp:postmeta>
</item>
Maybe there is a better approach? Maybe merge post and attachment where the id creates a link between the nodes?
I'm new to XSLT and have read a few posts on identity transforms. I think that's the correct direction, but I just don't have the experience to pull off what I need. Assistance would be appreciated.
It looks like I've managed to sort out a solution.
I used a number of indexes to organise the attachments. My requirements changed a little on further inspection of the XML, as there was
I changed my resulting output to be in the format of...
<item>
<title>Some groovy title</title>
<link>http://example.com/groovy-example</link>
<wp:post_id>2050</wp:post_id>
<wp:post_type>page</wp:post_type>
...
...
...
<thumbnail>
<title>Spaner</title>
<url>http://www.example.com/wp-content/uploads/2012/03/spanner.jpg</url>
</thumbnail>
<attachments>
<attachment>
<title>Fixing your widgets: An idiots guide</title>
<url>http://www.example.com/wp-content/uploads/2012/12/fixiing-widgets.pdf</url>
</attachment>
<attachment>
<title>Do It Yourself Trepanning</title>
<url>http://www.example.com/wp-content/uploads/2013/04/trepanning.pdf</url>
</attachment>
</attachments>
</item>
So using the following xsl gave me the desired result. The conditions on the indexes ensured I was selecting the correct files.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:wp="http://wordpress.org/export/1.2/">
<xsl:output indent="yes" cdata-section-elements="content"/>
<!-- Setup indexes -->
<!-- Index all main posts -->
<xsl:key
name="mainposts"
match="*/item[wp:post_type[text()='post']]"
use="wp:post_id" />
<!-- Index all sub posts (posts within posts)-->
<xsl:key
name="subposts"
match="*/item[wp:post_type[text()='post'] and category[@nicename = 'documents']]"
use="category[@domain = 'post_tag']" />
<!-- Index all image thumbs -->
<xsl:key
name="images"
match="*/item[wp:post_type[text()='attachment'] and wp:postmeta/wp:meta_key[text()='_wp_attachment_metadata']]"
use="wp:post_parent" />
<!-- Index all files (unable to sort members file at the moment)-->
<xsl:key
name="attachments"
match="*/item[wp:post_type[text()='attachment'] and not(wp:postmeta/wp:meta_key = '_wp_attachment_metadata')]"
use="wp:post_parent" />
<xsl:key
name="thumbnails"
match="*/item[wp:post_type[text()='attachment']]"
use="wp:post_id" />
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*/item/wp:post_parent[text()= 0]">
<wp:post_parent>
<xsl:value-of select="." />
</wp:post_parent>
<xsl:for-each select="key('thumbnails', ../wp:postmeta[wp:meta_key[text()='_thumbnail_id']]/wp:meta_value)">
<thumbnail>
<title><xsl:value-of select="title" /></title>
<url><xsl:value-of select="wp:attachment_url" /></url>
</thumbnail>
</xsl:for-each>
<xsl:for-each select="key('subposts', ../category[@domain = 'post_tag'])">
<attachments>
<xsl:for-each select="key('images', wp:post_id)">
<file>
<title><xsl:value-of select="title" /></title>
<url><xsl:value-of select="wp:attachment_url" /></url>
</file>
</xsl:for-each>
<xsl:for-each select="key('attachments', wp:post_id)">
<file>
<title><xsl:value-of select="title" /></title>
<url><xsl:value-of select="wp:attachment_url" /></url>
</file>
</xsl:for-each>
</attachments>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
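The xsl:key lookups above amount to building an index of attachment items by wp:post_id and then joining on the _thumbnail_id value. A rough Python sketch of that join, assuming a cut-down WXR fragment (the channel wrapper and the attach_thumbnails name are made up for illustration, and only the thumbnail case is covered):

```python
import xml.etree.ElementTree as ET

WP = "http://wordpress.org/export/1.2/"
NS = {"wp": WP}

# Cut-down fragment based on the sample in the question.
WXR = f"""<channel xmlns:wp="{WP}">
  <item>
    <title>Some groovy title</title>
    <wp:post_id>2050</wp:post_id>
    <wp:post_type>page</wp:post_type>
    <wp:postmeta>
      <wp:meta_key>_thumbnail_id</wp:meta_key>
      <wp:meta_value>566</wp:meta_value>
    </wp:postmeta>
  </item>
  <item>
    <title>My fantastic attachment</title>
    <wp:post_id>566</wp:post_id>
    <wp:post_type>attachment</wp:post_type>
    <wp:attachment_url>http://www.example.com/wp-content/uploads/2012/12/fantastic.jpg</wp:attachment_url>
  </item>
</channel>"""


def attach_thumbnails(channel: ET.Element) -> ET.Element:
    """Append <thumbnail> to each item whose _thumbnail_id matches an attachment."""
    # Plays the role of the 'thumbnails' key: attachment items indexed by post id.
    by_id = {
        item.findtext("wp:post_id", namespaces=NS): item
        for item in channel.findall("item")
        if item.findtext("wp:post_type", namespaces=NS) == "attachment"
    }
    for item in channel.findall("item"):
        for meta in item.findall("wp:postmeta", NS):
            if meta.findtext("wp:meta_key", namespaces=NS) != "_thumbnail_id":
                continue
            target = by_id.get(meta.findtext("wp:meta_value", namespaces=NS))
            if target is None:
                continue  # dangling id: no matching attachment in the export
            thumb = ET.SubElement(item, "thumbnail")
            ET.SubElement(thumb, "title").text = target.findtext("title")
            ET.SubElement(thumb, "url").text = target.findtext(
                "wp:attachment_url", namespaces=NS
            )
    return channel


channel = attach_thumbnails(ET.fromstring(WXR))
page = channel.find("item")
print(page.findtext("thumbnail/url"))
```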

Specific Attribute node replacement

How to do node replacement in MarkLogic for a particular attribute? For example like below:
<chapters>
<title id="primary">first primary content</title>
<title id="primary">second primary content</title>
<title id="secondary">this is amy middle content</title>
<title id="terciary">this is amy last content</title>
</chapters>
I want like below:
<chapters>
<title id="primary">third primary content</title>
<title id="secondary">this is amy middle content</title>
<title id="terciary">this is amy last content</title>
</chapters>
I mean, suppose an A.xml file stored in the MarkLogic database server contains data like below:
<chaptermetadata>
<chapters>
<title id="primary">first content</title>
<title id="primary">second content</title>
<title id="secondary">This is middle content</title>
<title id="terciary">This is last content</title>
</chapters>
<chapters>
<title id="primary">fouth content</title>
<title id="primary">fifth content</title>
<title id="primary">sixth content</title>
<title id="secondary">This is new content</title>
<title id="terciary">This is old content</title>
</chapters>
</chaptermetadata>
Now, I want to replace the node for every title element that has the attribute @id='primary', in all chapters, like below:
<chaptermetadata>
<chapters>
<title id="primary">common content</title>
<title id="secondary">This is middle content</title>
<title id="terciary">This is last content</title>
</chapters>
<chapters>
<title id="primary">common content</title>
<title id="secondary">This is new content</title>
<title id="terciary">This is old content</title>
</chapters>
</chaptermetadata>
If you are just getting started with XQuery and MarkLogic, http://developer.marklogic.com/learn/technical-overview and http://developer.marklogic.com/learn may help.
The best way to modify elements and attributes depends on the context, which you haven't supplied. I suppose the first question is "how do I select nodes by attribute?" A simple bit of XPath does that. For all chapters in the database:
/chapters/title[@id eq $id]
...or relative to a previously selected sequence of element(chapters)*
$chapters/title[@id eq $id]
If this is a database document, you could take it from there with the http://docs.marklogic.com/xdmp:node-replace and http://docs.marklogic.com/xdmp:node-delete functions. If the nodes are only in memory, see http://docs.marklogic.com/guide/app-dev/typeswitch for guidance and examples on using an XQuery typeswitch or XSLT. At http://developer.marklogic.com/blog/tired-of-typeswitch there are more examples and comparison of typeswitch and XSLT.
Based upon the examples you provided, it appears that you want to replace the text() for the first title element that has @id='primary', and remove the other title elements with @id='primary'.
The following will achieve that, using xdmp:node-replace() and xdmp:node-delete() methods.
for $primary-title in doc("A.xml")/chaptermetadata/chapters/title[@id='primary']
return
if ($primary-title/preceding-sibling::title[@id='primary'])
then xdmp:node-delete($primary-title)
else xdmp:node-replace($primary-title/text(), text{"common content"})
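The rule in that query (keep the first primary title per chapters element, replace its text, delete the later ones) can be tried on an in-memory copy before touching the database. Here is a small Python sketch under that assumption; it uses the standard library only, is not a MarkLogic API, and collapse_primary is a made-up name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of A.xml from the question.
DOC = """<chaptermetadata>
  <chapters>
    <title id="primary">first content</title>
    <title id="primary">second content</title>
    <title id="secondary">This is middle content</title>
  </chapters>
  <chapters>
    <title id="primary">fouth content</title>
    <title id="primary">fifth content</title>
    <title id="primary">sixth content</title>
    <title id="terciary">This is old content</title>
  </chapters>
</chaptermetadata>"""


def collapse_primary(root: ET.Element, new_text: str) -> ET.Element:
    """Per <chapters>: rewrite the first primary title and drop the rest."""
    # findall returns a list, so removing children below does not disturb the loop.
    for chapters in root.findall(".//chapters"):
        primaries = [t for t in chapters.findall("title") if t.get("id") == "primary"]
        for extra in primaries[1:]:
            chapters.remove(extra)  # same role as xdmp:node-delete
        if primaries:
            primaries[0].text = new_text  # same role as xdmp:node-replace on text()
    return root


root = collapse_primary(ET.fromstring(DOC), "common content")
summary = [
    [(t.get("id"), t.text) for t in chapters.findall("title")]
    for chapters in root.findall("chapters")
]
print(summary)
```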

Strip All Foreign-Namespace Nodes with XQuery

Input document:
<entry xmlns="http://www.w3.org/2005/Atom">
<id>urn:uuid:1234</id>
<updated>2012-01-20T11:30:11-05:00</updated>
<published>2011-12-29T15:44:11-05:00</published>
<link href="?id=urn:uuid:1234" rel="edit" type="application/atom+xml"/>
<title>Title</title>
<category scheme="http://uri/categories" term="category"/>
<fake:fake xmlns:fake="http://fake/" attr="val"/>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>Blah</p>
</div>
</content>
</entry>
<!-- more entries -->
I want the output to be exactly the same, but with non-Atom elements like <fake:fake xmlns:fake="http://fake/" attr="val"/> stripped out. This is what I have, which doesn't work at all, just giving me the same input back:
declare namespace atom = "http://www.w3.org/2005/Atom";
<feed>
<title>All Posts</title>
{
for $e in collection('/db/entries')/atom:entry
return
if
(namespace-uri($e) = "http://www.w3.org/2005/Atom")
then
$e
else
''
}
</feed>
What am I doing wrong?
You can try the following query on try.zorba-xquery.com:
let $entry := <entry xmlns="http://www.w3.org/2005/Atom">
<id>urn:uuid:1234</id>
<updated>2012-01-20T11:30:11-05:00</updated>
<published>2011-12-29T15:44:11-05:00</published>
<link href="?id=urn:uuid:1234" rel="edit" type="application/atom+xml"/>
<title>Title</title>
<category scheme="http://uri/categories" term="category"/>
<fake:fake xmlns:fake="http://fake/" attr="val"/>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>Blah</p>
</div>
</content>
</entry>
return {
delete nodes $entry//*[not(namespace-uri(.) = "http://www.w3.org/2005/Atom")];
$entry
}
The following version is more portable:
let $entry := <entry xmlns="http://www.w3.org/2005/Atom">
<id>urn:uuid:1234</id>
<updated>2012-01-20T11:30:11-05:00</updated>
<published>2011-12-29T15:44:11-05:00</published>
<link href="?id=urn:uuid:1234" rel="edit" type="application/atom+xml"/>
<title>Title</title>
<category scheme="http://uri/categories" term="category"/>
<fake:fake xmlns:fake="http://fake/" attr="val"/>
<content type="xhtml">
<div xmlns="http://www.w3.org/1999/xhtml">
<p>Blah</p>
</div>
</content>
</entry>
return
copy $new-entry := $entry
modify (delete nodes $new-entry//*[not(namespace-uri(.) = "http://www.w3.org/2005/Atom")])
return $new-entry
Sort of a round-about way of doing it but this ended up working:
declare default element namespace "http://www.w3.org/2005/Atom";
<feed>
<title>All Posts</title>
{
for $entry in collection('/db/entries')/entry
return
element{node-name($entry)}{
$entry/@*,
for $child in $entry//*[namespace-uri(.) = "http://www.w3.org/2005/Atom"]
return $child
}
}
</feed>
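For comparison, here is a Python sketch of the filtering step, assuming the goal is to drop only the direct children of entry that are outside the Atom namespace and to leave the XHTML payload inside content alone. That reading of the question is an assumption, and strip_foreign_children is a made-up name.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# Shortened copy of the entry from the question.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>urn:uuid:1234</id>
  <title>Title</title>
  <fake:fake xmlns:fake="http://fake/" attr="val"/>
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml"><p>Blah</p></div>
  </content>
</entry>"""


def strip_foreign_children(entry: ET.Element) -> ET.Element:
    """Remove direct children of entry that are not in the Atom namespace."""
    atom_prefix = "{" + ATOM + "}"  # ElementTree stores tags as {uri}local
    for child in list(entry):  # iterate over a copy because we mutate entry
        if not child.tag.startswith(atom_prefix):
            entry.remove(child)
    return entry


cleaned = strip_foreign_children(ET.fromstring(ENTRY))
child_names = [child.tag.split("}")[1] for child in cleaned]
print(child_names)
```

Note the difference from the delete expressions in the answers above: a predicate applied to $entry//* also matches the XHTML div and p inside content, so those would be removed as well. Restricting the test to direct children avoids that.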
Waiting for the time limit to expire and then I'll accept it as an answer.

Syndication format for describing threaded comments?

How to describe comments tree with Atom/RSS?
There's an extension for threaded discussions in Atom. It started as a draft and was later published as RFC 4685 (Atom Threading Extensions). This is a feed with comments:
<feed xmlns="http://www.w3.org/2005/Atom"
xmlns:thr="http://purl.org/syndication/thread/1.0">
<id>http://www.example.org/myfeed</id>
<title>My Example Feed</title>
<updated>2005-07-28T12:00:00Z</updated>
<link href="http://www.example.org/myfeed" />
<author><name>James</name></author>
<entry>
<id>tag:example.org,2005:1</id>
<title>My original entry</title>
<updated>2006-03-01T12:12:12Z</updated>
<link
type="application/xhtml+xml"
href="http://www.example.org/entries/1" />
<summary>This is my original entry</summary>
</entry>
<entry>
<id>tag:example.org,2005:1,1</id>
<title>A response to the original</title>
<updated>2006-03-01T12:12:12Z</updated>
<link href="http://www.example.org/entries/1/1" />
<thr:in-reply-to
ref="tag:example.org,2005:1"
type="application/xhtml+xml"
href="http://www.example.org/entries/1"/>
<summary>This is a response to the original entry</summary>
</entry>
</feed>
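A consumer can rebuild the comment tree from the thr:in-reply-to references. The Python sketch below groups entry ids by the id they reply to, using a shortened copy of the feed above; build_thread_index is a made-up name, and top-level entries are filed under None.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "thr": "http://purl.org/syndication/thread/1.0",
}

# Shortened copy of the example feed.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <entry>
    <id>tag:example.org,2005:1</id>
    <title>My original entry</title>
  </entry>
  <entry>
    <id>tag:example.org,2005:1,1</id>
    <title>A response to the original</title>
    <thr:in-reply-to ref="tag:example.org,2005:1"
                     href="http://www.example.org/entries/1"/>
  </entry>
</feed>"""


def build_thread_index(feed_xml: str) -> dict:
    """Map each parent id (None for top-level entries) to the ids replying to it."""
    root = ET.fromstring(feed_xml)
    replies = defaultdict(list)
    for entry in root.findall("atom:entry", NS):
        entry_id = entry.findtext("atom:id", namespaces=NS)
        parent = entry.find("thr:in-reply-to", NS)
        # The ref attribute carries the id of the entry being answered.
        key = parent.get("ref") if parent is not None else None
        replies[key].append(entry_id)
    return dict(replies)


thread_index = build_thread_index(FEED)
print(thread_index)
```

Walking the index from None downwards yields the tree; deeper nesting works the same way because each reply only names its immediate parent.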
You can use HTML in RSS, but < and > must be written as &lt; and &gt;
<description>
...
&lt;!-- comments --&gt;
&lt;ul&gt;
&lt;li&gt;comment1&lt;/li&gt;
&lt;li&gt;comment2&lt;/li&gt;
&lt;li&gt;comment3&lt;/li&gt;
&lt;li&gt;comment4&lt;/li&gt;
&lt;/ul&gt;
</description>
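If the feed is generated by code, the escaping does not have to be done by hand. A short Python sketch, assuming a flat list of comment strings (the variable names are made up): html.escape does it explicitly, and ElementTree does the same thing when the HTML is assigned as element text.

```python
import xml.etree.ElementTree as ET
from html import escape

comments = ["comment1", "comment2"]

# Build the HTML list; escape each comment so user text cannot break the markup.
html_fragment = "<ul>" + "".join(f"<li>{escape(c)}</li>" for c in comments) + "</ul>"

# Explicit escaping, as you would do when writing the feed with string templates.
escaped = escape(html_fragment, quote=False)

# An XML serializer escapes text content on its own.
description = ET.Element("description")
description.text = html_fragment
serialized = ET.tostring(description, encoding="unicode")

print(escaped)
print(serialized)
```

Escaping twice is the usual pitfall here: pass the raw HTML to the serializer, or escape it yourself, but not both.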
