Parse XML based on attributes and text values of related nodes

I have used the XML package to parse both HTML and XML before, and have a rudimentary grasp of XPath. However, I've been asked to consider XML data where the important bits are determined by a combination of text and attributes of the elements themselves, as well as those in related nodes. I've never done that. For example:
[updated example, slightly more expansive]
<Catalogue>
<Bookstore id="ID910705541">
<location>foo bar</location>
<books>
<book category="A" id="1">
<title>Alpha</title>
<author ref="1">Matthew</author>
<author>Mark</author>
<author>Luke</author>
<author ref="2">John</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="B" id="10">
<title>Beta</title>
<author ref="1">Huey</author>
<author>Duey</author>
<author>Louie</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="D" id="100">
<title>Gamma</title>
<author ref="1">Tweedle Dee</author>
<author ref="2">Tweedle Dum</author>
<year>2005</year>
<price>29.99</price>
</book>
</books>
</Bookstore>
<Bookstore id="ID910700051">
<location>foo</location>
<books>
<book category="A" id="1">
<title>Happy</title>
<author>Dopey</author>
<author>Bashful</author>
<author>Doc</author>
<author ref="1">Grumpy</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="B" id="10">
<title>Ni</title>
<author ref="1">John</author>
<author ref="2">Paul</author>
<author ref="3">George</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="D" id="100">
<title>San</title>
<author ref="1">Ringo</author>
<year>2005</year>
<price>29.99</price>
</book>
</books>
</Bookstore>
<Bookstore id="ID910715717">
<location>bar</location>
<books>
<book category="A" id="1">
<title>Un</title>
<author ref="1">Winkin</author>
<author>Blinkin</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="B" id="10">
<title>Deux</title>
<author>Nod</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="D" id="100">
<title>Trois</title>
<author>Manny</author>
<author>Moe</author>
<year>2005</year>
<price>29.99</price>
</book>
</books>
</Bookstore>
</Catalogue>
I would like to extract all author names where:
1) the location element has a text value that contains "NY"
2) the author element does NOT contain a "ref" attribute; that is where ref is not present in the author tag
I will ultimately need to concatenate the extracted authors within a given bookstore, so that my resulting data frame has one row per store. I'd like to preserve the bookstore id as an additional field in my data frame so that I can uniquely reference each store.
Since only the first bookstore is in NY, results from this simple example would look something like:
1 Jane Smith John Doe Karl Pearson William Gosset
If another bookstore contained "NY" in its location, it would comprise the second row, and so forth.
Am I asking too much of R to parse under these convoluted conditions?

require(XML)
xdata <- xmlParse(apptext)
xpathSApply(xdata,'//*/location[text()[contains(.,"NY")]]/following-sibling::books/.//author[not(@ref)]')
#[[1]]
#<author>Jane Smith</author>
#[[2]]
#<author>John Doe</author>
#[[3]]
#<author>Karl Pearson</author>
#[[4]]
#<author>William Gosset</author>
Breakdown:
Get all locations containing 'NY'
//*/location[text()[contains(.,"NY")]]
Get the books sibling of these nodes
/following-sibling::books
From these nodes, get all authors without a ref attribute
/.//author[not(@ref)]
Use xmlValue if you want the text:
> xpathSApply(xdata,'//*/location[text()[contains(.,"NY")]]/following-sibling::books/.//author[not(@ref)]',xmlValue)
[1] "Jane Smith" "John Doe" "Karl Pearson" "William Gosset"
UPDATE:
child.nodes <- xpathSApply(xdata,'//*/location[text()[contains(.,"NY")]]/following-sibling::books/.//author[not(@ref)]')
ans.func<-function(x){
xpathSApply(x,'.//ancestor::bookstore[@id]/@id')
}
sapply(child.nodes,ans.func)
# id id id id
#"1" "1" "1" "1"
UPDATE 2:
With your changed data
xdata <- '<Catalogue>
<Bookstore id="ID910705541">
<location>foo bar</location>
<books>
<book category="A" id="1">
<title>Alpha</title>
<author ref="1">Matthew</author>
<author>Mark</author>
<author>Luke</author>
<author ref="2">John</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="B" id="10">
<title>Beta</title>
<author ref="1">Huey</author>
<author>Duey</author>
<author>Louie</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="D" id="100">
<title>Gamma</title>
<author ref="1">Tweedle Dee</author>
<author ref="2">Tweedle Dum</author>
<year>2005</year>
<price>29.99</price>
</book>
</books>
</Bookstore>
<Bookstore id="ID910700051">
<location>foo</location>
<books>
<book category="A" id="1">
<title>Happy</title>
<author>Dopey</author>
<author>Bashful</author>
<author>Doc</author>
<author ref="1">Grumpy</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="B" id="10">
<title>Ni</title>
<author ref="1">John</author>
<author ref="2">Paul</author>
<author ref="3">George</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="D" id="100">
<title>San</title>
<author ref="1">Ringo</author>
<year>2005</year>
<price>29.99</price>
</book>
</books>
</Bookstore>
<Bookstore id="ID910715717">
<location>bar</location>
<books>
<book category="A" id="1">
<title>Un</title>
<author ref="1">Winkin</author>
<author>Blinkin</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="B" id="10">
<title>Deux</title>
<author>Nod</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="D" id="100">
<title>Trois</title>
<author>Manny</author>
<author>Moe</author>
<year>2005</year>
<price>29.99</price>
</book>
</books>
</Bookstore>
</Catalogue>'
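Note that the element was previously named bookstore and is now Bookstore; XPath names are case-sensitive, so the expression has to change too. NY is gone from the data, so I have used foo instead.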
require(XML)
xdata <- xmlParse(xdata)
child.nodes <- getNodeSet(xdata,'//*/location[text()[contains(.,"foo")]]/following-sibling::books/.//author[not(@ref)]')
ans.func<-function(x){
xpathSApply(x,'.//ancestor::Bookstore[@id]/@id')
}
sapply(child.nodes,ans.func)
# id id id id id
#"ID910705541" "ID910705541" "ID910705541" "ID910705541" "ID910700051"
# id id
#"ID910700051" "ID910700051"
xpathSApply(xdata,'//*/location[text()[contains(.,"foo")]]/following-sibling::books/.//author[not(@ref)]',xmlValue)
# [1] "Mark" "Luke" "Duey" "Louie" "Dopey" "Bashful" "Doc"
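The question also asks for one row per store, with the authors concatenated and the store id kept. The R code above stops short of that step, so here is a minimal sketch of the grouping logic. It is written in Python with the standard library rather than R, purely to illustrate the idea; the function name authors_by_store and the shortened sample data are my own assumptions, not part of the original answer.

```python
import xml.etree.ElementTree as ET

def authors_by_store(xml_text, needle):
    """Return one (store id, joined authors) row per matching Bookstore."""
    root = ET.fromstring(xml_text)
    rows = []
    for store in root.iter("Bookstore"):
        location = store.findtext("location", default="")
        if needle not in location:
            continue
        # Keep only <author> elements that carry no ref attribute.
        names = [(a.text or "") for a in store.iter("author") if "ref" not in a.attrib]
        rows.append((store.get("id"), " ".join(names)))
    return rows

# Shortened, hypothetical sample in the same shape as the question's data.
sample = """<Catalogue>
  <Bookstore id="S1"><location>foo bar</location><books>
    <book><author ref="1">Matthew</author><author>Mark</author><author>Luke</author></book>
  </books></Bookstore>
  <Bookstore id="S2"><location>bar</location><books>
    <book><author>Nod</author></book>
  </books></Bookstore>
</Catalogue>"""

print(authors_by_store(sample, "foo"))
```

Each tuple corresponds to one row of the desired data frame: the store id first, then the space-separated authors. In R the same shape could presumably be reached by looping over the Bookstore nodes instead of over the author nodes.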

Related

Remove or filter XML nodes by Xpaths from file in R

I have very large, complex XML files (they look like this: https://github.com/HL7/C-CDA-Examples/blob/master/General/Parent%20Document%20Replace%20Relationship/CCD%20Parent%20Document%20Replace%20(C-CDAR2.1).xml ) to process, but I only need the attributes and values at particular XPaths (nodes). Removing the unneeded nodes before detailed processing should cut processing time.
So far I have tried using xml_remove:
xmlfile <- paste0(dir,"xmlFiles/",filelist[k])
file<-read_xml(xmlfile)
file<-xml_ns_strip(file)
for(counx in 1:nrow(xpathTable)){
xr <- xml_find_all(file, xpath =paste0('/',toString(xpathTable$xpaths[counx])) )
xml_remove(xr, free = TRUE)
file<-file
}
This works well for removing a few nodes, but crashes as the number goes up (>100).
Below is an example of what I want to achieve:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
<ISBN>
<Random>12354</Random>
</ISBN>
</book>
<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<ISBN>
<Random>12345</Random>
</ISBN>
<price>39.95</price>
</book>
</bookstore>
Filter by these XPaths:
/bookstore/book/title
/bookstore/book/year
/bookstore/book/ISBN/Random
to get this result:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<year>2005</year>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<year>2005</year>
<ISBN>
<Random>12354</Random>
</ISBN>
</book>
<book category="web">
<title lang="en">XQuery Kick Start</title>
<year>2003</year>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<year>2003</year>
<ISBN>
<Random>12345</Random>
</ISBN>
</book>
</bookstore>

Looks like an XQuery job, e.g. you could recreate your document like this
<bookstore>{
for $book in /bookstore/*
return <book category="{$book/@category}">
{$book/title}
{$book/year}
{$book/ISBN}
</book>
}</bookstore>
Applied to the book example from the question, this produces the desired result shown there. You can test it online at https://www.videlibri.de/cgi-bin/xidelcgi with XQuery selected as the option.
There might be ways to run XQuery from R but I would rather do it in a pre-processing step from the command line using a tool like xidel.
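If a pre-processing step outside R is acceptable, the same filtering can also be done in a single walk over the tree instead of one removal per XPath. The following is only a sketch in Python's standard library, under the assumption that the wanted paths are simple slash-separated element names without namespaces; the names KEEP and prune are hypothetical.

```python
import xml.etree.ElementTree as ET

# Paths to keep, written without the leading slash (assumed format).
KEEP = {"bookstore/book/title", "bookstore/book/year", "bookstore/book/ISBN/Random"}

def prune(elem, path, keep):
    """Drop children of elem that neither match nor lead to a kept path."""
    for child in list(elem):  # iterate over a copy while removing
        child_path = f"{path}/{child.tag}"
        if child_path in keep:
            continue  # wanted node: keep it with its whole subtree
        if any(k.startswith(child_path + "/") for k in keep):
            prune(child, child_path, keep)  # ancestor of a wanted node
        else:
            elem.remove(child)

doc = ET.fromstring(
    "<bookstore><book category='web'>"
    "<title lang='en'>Learning XML</title><author>Erik T. Ray</author>"
    "<year>2003</year><ISBN><Random>12345</Random></ISBN><price>39.95</price>"
    "</book></bookstore>"
)
prune(doc, doc.tag, KEEP)
print(ET.tostring(doc, encoding="unicode"))
```

Attributes such as category survive because only child elements are removed. A book with no wanted children would be left as an empty element, which may or may not be what you want.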

All elements could be looked up in a single XPath 1.0 expression valid for many languages:
/bookstore/book/descendant::*[name()="title" or name()="year" or name()="Random"]
Equivalent/similar expressions:
/bookstore/book/title | /bookstore/book/year | /bookstore/book/ISBN/Random
//book/@category | //book/year | //ISBN/Random
To filter out elements:
//book/*[not(name()="title" or name()="year" or name()="ISBN" or name()="Random")]
For XMLs with namespaces, local-name() can be used instead of name() if namespace handling is not used.
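To illustrate the local-name() idea outside XPath: in Python's ElementTree a namespaced tag is stored as "{namespace-uri}local", so matching on the part after the closing brace plays the same role. This is a small sketch only; the helper names and the namespace URI urn:example:books are made up for the example.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip a '{namespace-uri}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]

def find_by_local_name(root, names):
    """Return elements whose local name is in names, in document order."""
    return [el for el in root.iter()
            if isinstance(el.tag, str) and local_name(el.tag) in names]

doc = ET.fromstring(
    '<bookstore xmlns="urn:example:books">'  # hypothetical namespace URI
    "<book><title>Learning XML</title><year>2003</year><price>39.95</price></book>"
    "</bookstore>"
)
hits = find_by_local_name(doc, {"title", "year"})
print([(local_name(el.tag), el.text) for el in hits])
```

This avoids registering namespace prefixes at all, at the cost of also matching same-named elements from other namespaces.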
Testing the descendant::* expression on the command line against the given example:
echo 'cat /bookstore/book/descendant::*[name()="title" or name()="year" or name()="Random"]' | xmllint --shell test.xml
Result:
/ > cat /bookstore/book/descendant::*[name()="title" or name()="year" or name()="Random"]
-------
<title lang="en">Everyday Italian</title>
-------
<year>2005</year>
-------
<title lang="en">Harry Potter</title>
-------
<year>2005</year>
-------
<Random>12354</Random>
-------
<title lang="en">XQuery Kick Start</title>
-------
<year>2003</year>
-------
<title lang="en">Learning XML</title>
-------
<year>2003</year>
-------
<Random>12345</Random>
/ >
For the mentioned R crash, worth looking here.

How to count elements with two "restrictions"?

My XML looks something like this:
<books>
<book id="b1">
<title>Set theory and the continuum problem</title>
<category>Mathematics</category>
<location>
<area>hall1</area>
<case>1</case>
<shelf>2</shelf>
</location>
<description>A lucid, elegant, and complete survey of set theory.</description>
<history>
<borrowed by="m4"/>
<borrowed by="m2" until="2018-04-05"/>
</history>
</book>
<book id="b2">
<title>Computational Complexity</title>
<isbn>978-0201-530-827</isbn>
<category>Computer Science</category>
<location>
<area>hall1</area>
<case>3</case>
<shelf>3</shelf>
</location>
<description>.</description>
</book>
<book id="b3">
<title>To mock a mockingbird</title>
<isbn>1-292-09761-2</isbn>
<category>Logic</category>
<category>Mathematics</category>
<location>
<area>hall1</area>
<case>1</case>
<shelf>3</shelf>
</location>
<description>.</description>
</book>
</books>
Is it possible to count how many books there are with area='hall1' and case='1'?
I tried this:
count(//books/book[location/area='hall1'])
but I do not know how to also include the case='1' "restriction".

This should work:
count(//books/book[location/area='hall1'][location/case='1'])

** ADDITIONAL QUESTION **
Is it possible to list all book's titles which have area='hall1' and case='1'?
for $bb in //books/book[location/area='hall1'][location/case='1']
let $n := //books/book/title
return <book>$n</book>
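In the XQuery attempt above, the title would need to come from the loop variable ($bb/title) and be wrapped in an enclosed expression ({...}); as written, //books/book/title selects every title and $n is emitted as literal text. To show the intended logic end to end, here is a sketch in Python's standard library (not XQuery) that applies both conditions, then counts the matches and lists their titles. The function name books_at is hypothetical and the sample is a shortened copy of the question's data.

```python
import xml.etree.ElementTree as ET

def books_at(root, area, case):
    """Return <book> elements whose location has the given area and case."""
    return [
        book for book in root.iter("book")
        if book.findtext("location/area") == area
        and book.findtext("location/case") == case
    ]

library = ET.fromstring("""<books>
  <book id="b1"><title>Set theory and the continuum problem</title>
    <location><area>hall1</area><case>1</case><shelf>2</shelf></location></book>
  <book id="b2"><title>Computational Complexity</title>
    <location><area>hall1</area><case>3</case><shelf>3</shelf></location></book>
  <book id="b3"><title>To mock a mockingbird</title>
    <location><area>hall1</area><case>1</case><shelf>3</shelf></location></book>
</books>""")

matches = books_at(library, "hall1", "1")
print(len(matches))
print([book.findtext("title") for book in matches])
```

Both conditions are checked on the same book, which is what the two stacked predicates in the XPath answer do.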

Multiple for loops implementation in XQuery

Below is my sample XML
<catalog>
<book>
<author>Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
<storeno>123</storeno>
</book>
<book>
<author>Ralls, Kim</author>
<title>Rain Fantasy</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former</description>
<storeno>123</storeno>
</book>
<book>
<author>zxcv</author>
<title>Maeve</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-11-17</publish_date>
<description>After the collapse of a nanotechnology</description>
<storeno>123</storeno>
</book>
<book>
<author>zxcv</author>
<title>Legacy</title>
<genre>Fiction</genre>
<price>5.95</price>
<publish_date>2001-03-10</publish_date>
<description>In post-apocalypse</description>
<storeno>123</storeno>
</book>
<book>
<author>Corets, Eva</author>
<title>The</title>
<genre>Fiction</genre>
<price>5.95</price>
<publish_date>2001-09-10</publish_date>
<description>The two daughters</description>
<storeno>123</storeno>
</book>
<book>
<author>Horror</author>
<title>Horror</title>
<genre>Horror</genre>
<price>4.95</price>
<publish_date>2000-09-02</publish_date>
<description>When abc meets xyz</description>
<storeno>123</storeno>
</book>
<book>
<author>Knorr, Stefan</author>
<title>Creepy Crawlies</title>
<genre>Horror</genre>
<price>4.95</price>
<publish_date>2000-12-06</publish_date>
<description>An anthology of horror stories about roaches,
centipedes, scorpions and other insects.</description>
<storeno>123</storeno>
</book>
<book>
<author>O'Brien, Tim</author>
<title>kids ganes</title>
<genre>story</genre>
<price>36.95</price>
<publish_date>2000-12-09</publish_date>
<description>Microsoft's .NET initiative is explored in
detail in this deep programmer's reference.</description>
<storeno>123</storeno>
</book>
<book>
<author>O'Brien, Tim</author>
<title>MSXML3: A Comprehensive Guide</title>
<genre>computer</genre>
<price>36.95</price>
<publish_date>2000-12-01</publish_date>
<description>The abc</description>
<storeno>123</storeno>
</book>
<book>
<author>Galos, Mike</author>
<title>Visual Studio 7: A Comprehensive Guide</title>
<genre>story</genre>
<price>49.95</price>
<publish_date>2001-04-16</publish_date>
<description>Microsoft Visual Studio</description>
<storeno>123</storeno>
</book>
</catalog>
and I need an XQuery that returns
<titles>
need the first instance of the title where the genre is “Fantasy”
need all the titles concatenated where the genre is “Computer”
need all the titles concatenated where the genre is “Fiction”
need the first instance of the title where the genre is “Story”
</titles>
example:( Rain Fantasy, XML Developer's Guide………………, Legacy……………….., kids ganes)
Note: the case can be ignored in the above for comparison.
Here is what we are trying
<Titles>
let $fan := $catalog /book[genre = ‘Fantasy’][1]/title
let $stry := $catalog /book[genre = ‘Story’][1]/title
for $comp in $catalog /book[genre ='Computer']/title
return concat($comp, “”)
for $fict in $catalog /book[genre ='Fiction']/title
return concat($fict, “”)
concat($fan, $comp, $fict, $stry)
</Titles>
We are having trouble implementing the multiple for loops.
Any help is appreciated.

From the question and the comments you seem to want something like this:
<Titles>
{
for $genre in distinct-values($catalog/book/genre)
let $books := $catalog/book[genre=$genre]
let $retval := if($genre=("Fantasy","Story")) then $books[1]/title else $books/title
return data($retval)
}
</Titles>
With your input, the result this gives is:
<Titles>XML Developer's Guide Rain Fantasy Legacy The Horror Creepy Crawlies kids ganes Visual Studio 7: A Comprehensive Guide MSXML3: A Comprehensive Guide</Titles>
My gut tells me you probably don't want the data() part though. Without it you get:
<Titles>
<title>XML Developer's Guide</title>
<title>Rain Fantasy</title>
<title>Legacy</title>
<title>The</title>
<title>Horror</title>
<title>Creepy Crawlies</title>
<title>kids ganes</title>
<title>Visual Studio 7: A Comprehensive Guide</title>
<title>MSXML3: A Comprehensive Guide</title>
</Titles>
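Note that the XQuery above compares genres case-sensitively, which is why "computer" and "story" form their own groups in that output. If the comparison should ignore case and follow the order given in the question (Fantasy, Computer, Fiction, Story), the selection logic might look like the sketch below. It is written in Python's standard library rather than XQuery, the helper names are hypothetical, and the sample is a shortened version of the catalog.

```python
import xml.etree.ElementTree as ET

def titles_for(catalog, genre, first_only=False):
    """Titles of books whose genre matches, ignoring case."""
    titles = [
        book.findtext("title")
        for book in catalog.iter("book")
        if (book.findtext("genre") or "").lower() == genre.lower()
    ]
    return titles[:1] if first_only else titles

def build_titles(catalog):
    # Fantasy and Story contribute only their first title; the others all.
    parts = (
        titles_for(catalog, "Fantasy", first_only=True)
        + titles_for(catalog, "Computer")
        + titles_for(catalog, "Fiction")
        + titles_for(catalog, "Story", first_only=True)
    )
    out = ET.Element("titles")
    out.text = ", ".join(parts)
    return out

catalog = ET.fromstring("""<catalog>
  <book><title>XML Developer's Guide</title><genre>Computer</genre></book>
  <book><title>Rain Fantasy</title><genre>Fantasy</genre></book>
  <book><title>Maeve</title><genre>Fantasy</genre></book>
  <book><title>Legacy</title><genre>Fiction</genre></book>
  <book><title>kids ganes</title><genre>story</genre></book>
  <book><title>MSXML3: A Comprehensive Guide</title><genre>computer</genre></book>
</catalog>""")

print(build_titles(catalog).text)
```

In XQuery the case-insensitive part would typically be done by applying lower-case() to both sides of the comparison.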

How to create XML from .csv properly?

I would like to create an XML file from a .csv file. I am having difficulty getting the desired structure:
<?xml version="1.0" encoding="UTF-8"?>
<document>
<employee ID="1">
<Name>Steve</Name>
<City>Boston</City>
<Age>33</Age>
</employee>
<employee ID="2">
<Name>Michael</Name>
<City>Dallas</City>
<Age>45</Age>
</employee>
<employee ID="3">
<Name>John</Name>
<City>New York</City>
<Age>89</Age>
</employee>
<employee ID="4">
<Name>Thomas</Name>
<City>LA</City>
<Age>62</Age>
</employee>
<employee ID="5">
<Name>Clint</Name>
<City>Paris</City>
<Age>30</Age>
</employee>
</document>
What I have tried:
library(XML)
# Some data
df <-
read.csv(textConnection('"ID","Name","City","Age"
"1","Steve","Boston",33
"2","Michael","Dallas",45
"3","John","New York",89
"4","Thomas","LA",62
"5","Clint","Paris",30'),
as.is=TRUE)
xml <- xmlTree()
xml$addTag("document", close=FALSE)
for (i in 1:nrow(df)) {
xml$addTag("employee", close=FALSE)
for (j in names(df)) {
xml$addTag(j, df[i, j])
}
xml$closeTag()
}
xml$closeTag()
This looks almost as desired, but ID is a child element beneath employee rather than an attribute on the employee tag, and the encoding is missing from the header:
<?xml version="1.0"?>
<document>
<employee>
<ID>1</ID>
<Name>Steve</Name>
<City>Boston</City>
<Age>33</Age>
</employee>
<employee>
<ID>2</ID>
<Name>Michael</Name>
<City>Dallas</City>
<Age>45</Age>
</employee>
<employee>
<ID>3</ID>
<Name>John</Name>
<City>New York</City>
<Age>89</Age>
</employee>
<employee>
<ID>4</ID>
<Name>Thomas</Name>
<City>LA</City>
<Age>62</Age>
</employee>
<employee>
<ID>5</ID>
<Name>Clint</Name>
<City>Paris</City>
<Age>30</Age>
</employee>
</document>

You can use addNode instead of addTag; the two are identical:
> identical(xml$addTag, xml$addNode)
[1] TRUE
so it's a matter of preference. You can pass an attrs argument to add the ID attribute, and you can set the encoding when you save the file:
library(XML)
df <-
read.csv(textConnection('"ID","Name","City","Age"
"1","Steve","Boston",33
"2","Michael","Dallas",45
"3","John","New York",89
"4","Thomas","LA",62
"5","Clint","Paris",30'),
as.is=TRUE)
xml <- xmlTree("document")
for (i in 1:nrow(df)) {
xml$addNode("employee", attrs = c(ID = df[i,"ID"]), close = FALSE)
appNames <- names(df)[names(df) != "ID"]
for (j in appNames) {
xml$addNode(j, df[i, j])
}
xml$closeNode()
}
xml$closeNode()
saveXML(xml$doc(), "text.xml", encoding = "UTF-8")
xmlParse("text.xml")
<?xml version="1.0" encoding="UTF-8"?>
<document>
<employee ID="1">
<Name>Steve</Name>
<City>Boston</City>
<Age>33</Age>
</employee>
<employee ID="2">
<Name>Michael</Name>
<City>Dallas</City>
<Age>45</Age>
</employee>
<employee ID="3">
<Name>John</Name>
<City>New York</City>
<Age>89</Age>
</employee>
<employee ID="4">
<Name>Thomas</Name>
<City>LA</City>
<Age>62</Age>
</employee>
<employee ID="5">
<Name>Clint</Name>
<City>Paris</City>
<Age>30</Age>
</employee>
</document>
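For comparison, the same two ideas (the ID column becomes an attribute, and the encoding is written when serializing) can be sketched in Python's standard library. This is not a translation of the XML package API; the function name csv_to_xml and the two-row sample are assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

CSV_TEXT = '''"ID","Name","City","Age"
"1","Steve","Boston",33
"2","Michael","Dallas",45
'''

def csv_to_xml(csv_text):
    """Build <document><employee ID=...>...</employee></document> from CSV text."""
    root = ET.Element("document")
    for row in csv.DictReader(io.StringIO(csv_text)):
        # ID becomes an attribute; every other column becomes a child element.
        employee = ET.SubElement(root, "employee", ID=row["ID"])
        for column, value in row.items():
            if column != "ID":
                ET.SubElement(employee, column).text = value
    return root

document = csv_to_xml(CSV_TEXT)
# xml_declaration=True (Python 3.8+) writes the encoding into the header.
xml_bytes = ET.tostring(document, encoding="UTF-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

As in the R version, the key step is separating the ID column from the remaining columns before creating the child elements.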

Can I get help to get the desired output with Xquery?

I have this DTD
<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT MusicCatalog (Artist*,Album*,Genre+,Company*,Country*)>
<!ELEMENT Artist (Name)>
<!ELEMENT Name (FirstName,MiddleName*,LastName?)>
<!ELEMENT FirstName (#PCDATA)>
<!ELEMENT MiddleName (#PCDATA)>
<!ELEMENT LastName (#PCDATA)>
<!ATTLIST Artist ArtistID ID #REQUIRED
countryID IDREF #REQUIRED>
<!ELEMENT Album (Title,Price,Year)>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Price (#PCDATA)>
<!ELEMENT Year (#PCDATA)>
<!ATTLIST Album AlbumID ID #REQUIRED
ArtistID IDREF #REQUIRED
CompanyID IDREF #REQUIRED
GenreName IDREFS #REQUIRED >
<!ELEMENT Company (#PCDATA)>
<!ATTLIST Company CompanyID ID #REQUIRED>
<!ELEMENT Genre EMPTY>
<!ATTLIST Genre GenreName ID #REQUIRED>
<!ELEMENT Country (#PCDATA)>
<!ATTLIST Country countryID ID #REQUIRED>
and I have this basic XML:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE MusicCatalog SYSTEM "Refined_DTD_For_catalog.dtd">
<MusicCatalog>
<Artist ArtistID="Ar_0000" countryID="US">
<Name>
<FirstName>Katey </FirstName>
<LastName>Berry</LastName>
</Name>
</Artist>
<Artist ArtistID="Ar_0001" countryID="US">
<Name>
<FirstName>Justine </FirstName>
<LastName>Temrilke</LastName>
</Name>
</Artist>
<Album AlbumID="AL_0000" ArtistID="Ar_0000" CompanyID="C_3" GenreName="Pop HipHop R_and_B">
<Title>Calfornia Girls</Title>
<Price>12</Price>
<Year>2012</Year>
</Album>
<Album AlbumID="AL_0001" ArtistID="Ar_0000" CompanyID="C_1" GenreName="Pop HipHop R_and_B">
<Title>Confessions</Title>
<Price>9</Price>
<Year>2008</Year>
</Album>
<Album AlbumID="AL_0002" ArtistID="Ar_0000" CompanyID="C_10" GenreName="Pop HipHop R_and_B">
<Title>Roar</Title>
<Price>13</Price>
<Year>2014</Year>
</Album>
<Album AlbumID="AL_0003" ArtistID="Ar_0000" CompanyID="C_4" GenreName=" HipHop R_and_B">
<Title>Teenge Dream</Title>
<Price>11</Price>
<Year>2010</Year>
</Album>
<Album AlbumID="AL_0004" ArtistID="Ar_0001" CompanyID="C_4" GenreName="HipHop R_and_B">
<Title>Future of sex</Title>
<Price>8</Price>
<Year>2007</Year>
</Album>
<Album AlbumID="AL_0005" ArtistID="Ar_0001" CompanyID="C_5" GenreName="HipHop">
<Title>Mirros</Title>
<Price>8</Price>
<Year>2013</Year>
</Album>
<Album AlbumID="AL_0006" ArtistID="Ar_0001" CompanyID="C_5" GenreName="Electro">
<Title>Holly Grail</Title>
<Price>9</Price>
<Year>2014</Year>
</Album>
<Album AlbumID="AL_0007" ArtistID="Ar_0001" CompanyID="C_6" GenreName="HipHop Electro">
<Title>Give it to me</Title>
<Price>5</Price>
<Year>2005</Year>
</Album>
<Genre GenreName="Rap"/>
<Genre GenreName="Country"/>
<Genre GenreName="R_and_B"/>
<Genre GenreName="HipHop"/>
<Genre GenreName="House"/>
<Genre GenreName="Pop"/>
<Genre GenreName="Electro"/>
<Genre GenreName="Blues"/>
<Genre GenreName="Punck"/>
<Genre GenreName="Rock"/>
<Genre GenreName="Metal"/>
<Genre GenreName="Alternative_Rock"/>
<Company CompanyID="C_1">
CBS Records
</Company>
<Company CompanyID="C_2">
RCA
</Company>
<Company CompanyID="C_3">
WEA
</Company>
<Company CompanyID="C_4">
Cloumbia
</Company>
<Company CompanyID="C_5">
Virgin Records
</Company>
<Company CompanyID="C_6">
Pickwick
</Company>
<Company CompanyID="C_7">
Atlantic
</Company>
<Company CompanyID="C_8">
Mega
</Company>
<Company CompanyID="C_9">
Grammy
</Company>
<Company CompanyID="C_10">
Wordo
</Company>
<Company CompanyID="C_11">
Fox
</Company>
<Country countryID="US">
United State
</Country>
<Country countryID="UK">
United Kingdom
</Country>
<Country countryID="FR">
France
</Country>
<Country countryID="GR">
Germany
</Country>
<Country countryID="ME">
Mexico
</Country>
<Country countryID="SP">
Spain
</Country>
<Country countryID="JP">
Japneas
</Country>
</MusicCatalog>
My problem with XQuery is that I'm not getting the expected result. In this query I'm trying to get each company's name, create an element with that name, and list all of that company's album titles inside it, but I'm only getting the company ID. I have tried many queries, but most of them don't work.
for $dc in distinct-values( //Album/@CompanyID )
return element {string ($dc)} {
for $j in //Album[@CompanyID = $dc] return $j/Title
}
This is another try:
for $i in distinct-values(//MusicCatalog/Album/@CompanyID)
return
if(compare($i,//MusicCatalog/Company/@CompanyID))
then "element {string(//MusicCatalog/Company)}"
{
for $j in //MusicCatalog/Album[@CompanyID eq $i] return $j/Title
}
}
I am getting this:
<c_3>
<title>Calfornia Girls</title>
<title>Roar</title>
</c_3>
while I want to get
<WEA>
<title>Calfornia Girls</title>
<title>Roar</title>
</WEA>
Another problem is getting the albums per genre, because GenreName holds multiple IDREFS.

If you want the company name as the element name, you should not pass the id, but the name. That is, look up the name with another path expression:
for $i in distinct-values(//MusicCatalog/Album/@CompanyID)
return
element {translate(normalize-space(//MusicCatalog/Company[@CompanyID eq $i]), " ", "_")}
{
for $j in //MusicCatalog/Album[@CompanyID eq $i]
return $j/Title
}
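The lookup step (ID to name, then name to a valid element name) is the part that was missing from the original attempts. Here is the same logic as a sketch in Python's standard library, not XQuery; the function name titles_by_company, the Companies wrapper element and the shortened sample are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def titles_by_company(catalog):
    """Group album titles under an element named after each company."""
    # Map CompanyID to its name, normalizing whitespace like normalize-space().
    names = {c.get("CompanyID"): " ".join((c.text or "").split())
             for c in catalog.iter("Company")}
    result = ET.Element("Companies")  # hypothetical wrapper element
    for album in catalog.iter("Album"):
        # Spaces are not allowed in element names, so replace them.
        tag = names[album.get("CompanyID")].replace(" ", "_")
        node = result.find(tag)
        if node is None:
            node = ET.SubElement(result, tag)
        ET.SubElement(node, "Title").text = album.findtext("Title")
    return result

catalog = ET.fromstring("""<MusicCatalog>
  <Album AlbumID="AL_0000" CompanyID="C_3"><Title>Calfornia Girls</Title></Album>
  <Album AlbumID="AL_0001" CompanyID="C_1"><Title>Confessions</Title></Album>
  <Company CompanyID="C_1">
    CBS Records
  </Company>
  <Company CompanyID="C_3">
    WEA
  </Company>
</MusicCatalog>""")

grouped = titles_by_company(catalog)
print(ET.tostring(grouped, encoding="unicode"))
```

For the genre part of the question, an IDREFS attribute such as GenreName is a whitespace-separated list, so it has to be split into individual values before matching (tokenize in XQuery, or album.get("GenreName").split() in this sketch).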
