I have a requirement where we receive a large XML file, and I need to transform it in small chunks.
Below is a sample XML with 4 records. I have to transform the XML so that the records are grouped in chunks of 2.
<!-- Original XML-->
<EmpDetails>
<Records>
<EmpID>1</EmpID>
<Age>20</Age>
</Records>
<Records>
<EmpID>2</EmpID>
<Age>21</Age>
</Records>
<Records>
<EmpID>3</EmpID>
<Age>22</Age>
</Records>
<Records>
<EmpID>4</EmpID>
<Age>23</Age>
</Records>
</EmpDetails>
<!-- Expected XML-->
<EmpDetails>
<Split>
<Records>
<EmpID>1</EmpID>
<Age>20</Age>
</Records>
<Records>
<EmpID>2</EmpID>
<Age>21</Age>
</Records>
</Split>
<Split>
<Records>
<EmpID>3</EmpID>
<Age>22</Age>
</Records>
<Records>
<EmpID>4</EmpID>
<Age>23</Age>
</Records>
</Split>
</EmpDetails>
I tried a few things, including the stylesheet below, without success.
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<EmpDetails>
<xsl:for-each select="/EmpDetails/Records">
<Split>
<Records>
<EmpID>
<xsl:value-of select="EmpID"/>
</EmpID>
<Age>
<xsl:value-of select="Age"/>
</Age>
</Records>
</Split>
</xsl:for-each>
</EmpDetails>
</xsl:template>
</xsl:stylesheet>
Thanks
Yatan
group them in chunks of 2.
This could be done simply by:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/EmpDetails">
<xsl:copy>
<xsl:for-each select="Records[position() mod 2 = 1]">
<Split>
<xsl:copy-of select=". | following-sibling::Records[1]"/>
</Split>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Added:
To divide the records into groups of 200, you can do:
...
<xsl:for-each select="Records[position() mod 200 = 1]">
<Split>
<xsl:copy-of select=". | following-sibling::Records[position() &lt; 200]"/>
</Split>
</xsl:for-each>
...
In XSLT 2.0 you could do:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/EmpDetails">
<xsl:copy>
<xsl:for-each-group select="Records" group-adjacent="(position() - 1) idiv 200">
<Split>
<xsl:copy-of select="current-group()"/>
</Split>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
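For comparison outside XSLT, the same fixed-size chunking can be sketched in Python with the standard library. This is only an illustrative sketch, not part of the answer above: the function name `chunk_records` and its parameters are made up here, and it assumes the whole document fits in memory.

```python
import xml.etree.ElementTree as ET


def chunk_records(xml_text, size=2, record_tag="Records", group_tag="Split"):
    """Wrap every `size` consecutive record elements in a new group element."""
    root = ET.fromstring(xml_text)
    records = root.findall(record_tag)
    out = ET.Element(root.tag)
    # Step through the records in strides of `size`, like position() mod size = 1.
    for start in range(0, len(records), size):
        group = ET.SubElement(out, group_tag)
        group.extend(records[start:start + size])
    return out


# Same shape as the 4-record sample in the question.
SAMPLE = (
    "<EmpDetails>"
    + "".join(
        f"<Records><EmpID>{i}</EmpID><Age>{19 + i}</Age></Records>"
        for i in range(1, 5)
    )
    + "</EmpDetails>"
)

result = chunk_records(SAMPLE, size=2)
print(ET.tostring(result, encoding="unicode"))
```

A trailing group is simply shorter when the record count is not a multiple of `size`, which matches what the XSLT versions do.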
use this code:
<xsl:for-each select="Records[position() mod 2 = 0]">
instead of this
<xsl:for-each select="Records[position() mod 2 = 1]">
Related
I have an XML file that I haven't been able to get into a good data.frame format. I'm close but it's not quite there yet.
cellosaurus.xml: I slightly modified this file by removing everything before the <cell-line-list> tag and after the </cell-line-list> tag.
This is the messy code I've written so far:
require(XML)
require(xml2)
require(rvest)
require(dplyr)
require(xmltools)
require(stringi)
require(gtools)
setwd("~/Documents/Cancer_Cell_Lines/Cellosaurus")
file <- "cellosaurus.xml"
cellosaurus <- file %>% xml2::read_xml()
nodeset <- cellosaurus %>% xml_children()
terminal_xpaths <- nodeset[1] %>% xml_get_paths() %>% unlist() %>% unique()
terminal_nodesets <- lapply(terminal_xpaths[1], xml2::xml_find_all, x = cellosaurus)
df_list <- terminal_nodesets %>% purrr::map(xml_dig_df)
df <- lapply(df_list[[1]], function(x) as.data.frame(x))
table <- do.call("smartbind", df)
Problem 1: There are duplicate column names that are mixed up. For example, in the file there are many paths that end at a node called cv-term, such as
"/cell-line-list/cell-line/disease-list/cv-term"
"/cell-line-list/cell-line/species-list/cv-term"
"/cell-line-list/cell-line/derived-from/cv-term"
but in the table I get columns called cv.term, cv.term.1, cv.term.2, and the contents are mixed up because of missing data. Is there a way to fix this?
Problem 2: The file is big and it takes a long time to run (I've only been able to test on a small subset of the full file). I haven't been able to figure out how to split the XML correctly except by splitting it into as many files as there are nodes, ~109,000. And then I had a hard time incorporating that many files into my code for R to read.
Any help appreciated.
To use relational database terminology, consider data normalization. Specifically, keep your data long: most nodes in this XML are one-to-many lists, so you can extract each one as its own long data frame and then merge them by a unique id such as the cell-line node number.
Fortunately, there is a good extraction tool for this: XSLT, a special-purpose, declarative language (the same type as SQL) designed to transform XML for various end uses, such as extracting the individual pieces so you can parse them more simply into data frames and then merge all items together. XSLT is also independent of R and portable to other application layers (Java, PHP, Python) or dedicated XSLT processors.
See the process below for a roadmap to the final solution. Each XSLT script below parses a specific part of every cell-line node and flattens the XML to one child level:
R
library(xml2)
library(xslt) # INSTALL PACKAGE BEFORE HAND
library(dplyr) # ONLY FOR bind_rows
# PARSE XML AND XSLT
doc <- read_xml('Cellosaurus.xml')
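# full.names=TRUE RETURNS PATHS THAT read_xml CAN OPEN; FILES SORT ALPHABETICALLY,
# SO NAME THEM TO LINE UP WITH THE ORDER OF xpaths BELOW
scripts <- list.files(path='/path/to/xslt/scripts', pattern='\\.xslt?$', full.names=TRUE)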
xpaths <- c('//accession', '//cell-line', '//hla_gene', '//marker',
'//name', '//species_list', '//url')
proc_xml_parse <- function(x, s) {
style <- read_xml(s)
# TRANSFORM INPUT INTO OUTPUT
new_xml <- xslt::xml_xslt(doc, style)
# INNER DF LIST BUILD
df_list <- lapply(xml_find_all(new_xml, x), function(x) {
vals <- xml_children(x)
setNames(data.frame(t(xml_text(vals)), stringsAsFactors = FALSE), xml_name(vals))
})
bind_rows(df_list)
}
# OUTER DF LIST BUILD
df_list <- Map(proc_xml_parse, xpaths, scripts)
# CHAIN MERGE
final_df <- Reduce(function(x,y) merge(x, y, by="cell_num", all=TRUE), df_list)
XSLT Scripts
Save each as a separate .xsl or .xslt file (these are ordinary XML files) to be loaded in R above. The scripts below do not capture every list node in the XML; add more by replicating the pattern for the other list nodes.
Cell Line List
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="Cellosaurus">
<xsl:copy>
<xsl:apply-templates select="cell-line-list/cell-line"/>
</xsl:copy>
</xsl:template>
<xsl:template match="cell-line">
<xsl:copy>
<cell_num>
<xsl:value-of select="count(preceding-sibling::*)+1"/>
</cell_num>
<xsl:for-each select="@*">
<xsl:element name="{name(.)}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Accession List
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="Cellosaurus">
<xsl:copy>
<xsl:apply-templates select="cell-line-list/cell-line"/>
</xsl:copy>
</xsl:template>
<xsl:template match="cell-line">
<xsl:apply-templates select="accession-list"/>
</xsl:template>
<xsl:template match="accession-list">
<xsl:apply-templates select="accession"/>
</xsl:template>
<xsl:template match="accession">
<xsl:copy>
<cell_num>
<xsl:value-of select="count(ancestor::cell-line[1]/preceding-sibling::*)+1"/>
</cell_num>
<xsl:for-each select="@*">
<xsl:element name="{name(.)}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
<accession_value><xsl:value-of select="."/></accession_value>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Name List
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="Cellosaurus">
<xsl:copy>
<xsl:apply-templates select="cell-line-list/cell-line"/>
</xsl:copy>
</xsl:template>
<xsl:template match="cell-line">
<xsl:apply-templates select="name-list"/>
</xsl:template>
<xsl:template match="name-list">
<xsl:apply-templates select="name"/>
</xsl:template>
<xsl:template match="name">
<xsl:copy>
<cell_num>
<xsl:value-of select="count(ancestor::cell-line/preceding-sibling::*)+1"/>
</cell_num>
<xsl:for-each select="@*">
<xsl:element name="{name(.)}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
<name_value><xsl:value-of select="."/></name_value>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Web Page List
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="Cellosaurus">
<xsl:copy>
<xsl:apply-templates select="cell-line-list/cell-line"/>
</xsl:copy>
</xsl:template>
<xsl:template match="cell-line">
<xsl:apply-templates select="web-page-list"/>
</xsl:template>
<xsl:template match="web-page-list">
<xsl:apply-templates select="url"/>
</xsl:template>
<xsl:template match="url">
<xsl:copy>
<cell_num>
<xsl:value-of select="count(ancestor::cell-line/preceding-sibling::*)+1"/>
</cell_num>
<xsl:for-each select="@*">
<xsl:element name="{name(.)}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
<url_value><xsl:value-of select="."/></url_value>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
HLA List
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="Cellosaurus">
<xsl:copy>
<xsl:apply-templates select="cell-line-list/cell-line"/>
</xsl:copy>
</xsl:template>
<xsl:template match="cell-line">
<xsl:apply-templates select="hla-lists/hla-list"/>
</xsl:template>
<xsl:template match="hla-list">
<xsl:apply-templates select="hla-gene"/>
</xsl:template>
<xsl:template match="hla-gene">
<hla_gene>
<cell_num>
<xsl:value-of select="count(ancestor::cell-line/preceding-sibling::*)+1"/>
</cell_num>
<xsl:for-each select="@*">
<xsl:element name="{name(.)}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
<hla_value><xsl:value-of select="."/></hla_value>
</hla_gene>
</xsl:template>
</xsl:stylesheet>
Species List
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="Cellosaurus">
<xsl:copy>
<xsl:apply-templates select="cell-line-list/cell-line"/>
</xsl:copy>
</xsl:template>
<xsl:template match="cell-line">
<xsl:apply-templates select="species-list/cv-term"/>
</xsl:template>
<xsl:template match="cv-term">
<species_list>
<cell_num>
<xsl:value-of select="count(ancestor::cell-line/preceding-sibling::*)+1"/>
</cell_num>
<xsl:for-each select="@*">
<xsl:element name="{name(.)}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
<species_value><xsl:value-of select="."/></species_value>
</species_list>
</xsl:template>
</xsl:stylesheet>
Marker List
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="Cellosaurus">
<xsl:copy>
<xsl:apply-templates select="cell-line-list/cell-line"/>
</xsl:copy>
</xsl:template>
<xsl:template match="cell-line">
<xsl:apply-templates select="str-list"/>
</xsl:template>
<xsl:template match="str-list">
<xsl:apply-templates select="marker-list"/>
</xsl:template>
<xsl:template match="marker-list">
<xsl:apply-templates select="marker"/>
</xsl:template>
<xsl:template match="marker">
<xsl:copy>
<cell_num>
<xsl:value-of select="count(ancestor::cell-line/preceding-sibling::*)+1"/>
</cell_num>
<xsl:for-each select="@*">
<xsl:element name="{name(.)}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
<xsl:copy-of select="marker-data-list/marker-data/alleles"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Output
After the chain merge, values repeat for every unique row, similar to SQL joins of long data frames (many-to-many). Note that df_list is a named list of data frames, should you prefer the individual tables to the merged output.
Just one comment: when you say "~109,000 cell lines with variations in missing data between each cell-line", you need to understand that the only mandatory fields in a Cellosaurus entry are the primary accession, the cell line name (identifier), the cell line category and the taxonomy; all the rest are not required. All of this is described in the cellosaurus.xsd file, using either minOccurs="0" or use="optional" depending on the type of field.
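To make the "keep it long" advice concrete, here is a small Python sketch of the same idea: emit one row per cv-term together with the cell-line number and the parent list it came from, so terms from different lists cannot be mixed up and optional lists simply yield no rows. The sample XML is a made-up, heavily simplified stand-in for the real file, and the function and column names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified sample; real Cellosaurus entries have more fields.
SAMPLE = """
<cell-line-list>
  <cell-line>
    <species-list><cv-term>Homo sapiens</cv-term></species-list>
    <disease-list><cv-term>Melanoma</cv-term></disease-list>
  </cell-line>
  <cell-line>
    <species-list><cv-term>Mus musculus</cv-term></species-list>
  </cell-line>
</cell-line-list>
""".strip()


def cv_terms_long(xml_text, parents=("species-list", "disease-list", "derived-from")):
    """Return one row per cv-term, keyed by cell-line number and parent list."""
    root = ET.fromstring(xml_text)
    rows = []
    for cell_num, cell in enumerate(root.findall("cell-line"), start=1):
        for parent in parents:
            # A missing optional list just produces no rows for that parent.
            for term in cell.findall(f"{parent}/cv-term"):
                rows.append(
                    {"cell_num": cell_num, "source": parent, "cv_term": term.text}
                )
    return rows


for row in cv_terms_long(SAMPLE):
    print(row)
```

Rows in this shape can be joined back to a per-cell-line table on `cell_num`, which is the role the merge step plays in the R code above.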
I'm trying to edit some XML with a transform but I'm struggling to achieve my desired results.
I have some XML:
<FX>
<Order ATTRIBUTE1="ACTIVE" ATTRIBUTE2="CCY" />
<Attribute NAME="N1" VALUE="V1" />
<Attribute NAME="N2" VALUE="V2" />
<Attribute NAME="N3" VALUE="V3" />
</FX>
And I want to transform it to look like:
<FX>
<Order ATTRIBUTE1="ACTIVE" ATTRIBUTE2="CCY" />
<Attribute NAME="N1, N2, N3" VALUE="V1,V2,V3" />
</FX>
Is this possible? Can anyone offer any suggestions on how to do this with a transform?
You can use the following ASP.NET-compatible XSLT 1.0 stylesheet to transform your source XML into your destination XML:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/FX">
<xsl:copy>
<xsl:copy-of select="Order" />
<Attribute>
<xsl:attribute name="NAME">
<xsl:for-each select="Attribute">
<xsl:value-of select="@NAME" />
<xsl:if test="position() != last()">
<xsl:text>, </xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:attribute>
<xsl:attribute name="VALUE">
<xsl:for-each select="Attribute">
<xsl:value-of select="@VALUE" />
<xsl:if test="position() != last()">
<xsl:text>,</xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:attribute>
</Attribute>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Its output is:
<FX>
<Order ATTRIBUTE1="ACTIVE" ATTRIBUTE2="CCY"/>
<Attribute NAME="N1, N2, N3" VALUE="V1,V2,V3"/>
</FX>
In general, if you want to transform some nodes but keep the rest, use the identity transformation template as the starting point and then add templates for the nodes you want to change:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="FX/Attribute[1]">
<xsl:copy>
<xsl:apply-templates select="@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="FX/Attribute[position() > 1]"/>
<xsl:template match="FX/Attribute[1]/@*">
<xsl:attribute name="{name()}">
<xsl:for-each select=". | ../following-sibling::Attribute/@*[name() = name(current())]">
<xsl:if test="position() > 1">,</xsl:if>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/jyH9rNk
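As a cross-check of what the stylesheets above are doing, the merge can be sketched in Python: fold every `Attribute` sibling into the first one by joining each attribute's values, then drop the rest. This is a hedged illustration rather than a replacement for the XSLT; `merge_attributes` and its `sep` parameter are names invented here, and it uses a plain comma for every attribute, as the second stylesheet does.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<FX>
  <Order ATTRIBUTE1="ACTIVE" ATTRIBUTE2="CCY" />
  <Attribute NAME="N1" VALUE="V1" />
  <Attribute NAME="N2" VALUE="V2" />
  <Attribute NAME="N3" VALUE="V3" />
</FX>
""".strip()


def merge_attributes(xml_text, tag="Attribute", sep=","):
    """Collapse all `tag` children of the root into the first one."""
    root = ET.fromstring(xml_text)
    items = root.findall(tag)
    if not items:
        return root
    first = items[0]
    # For each attribute of the first element, join the values of all siblings.
    for key in list(first.attrib):
        first.set(key, sep.join(el.get(key, "") for el in items))
    # Remove the now-redundant siblings; other children are left untouched.
    for extra in items[1:]:
        root.remove(extra)
    return root


merged = merge_attributes(SAMPLE)
print(ET.tostring(merged, encoding="unicode"))
```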
I have this XSLT to split a 25 MB XHTML file.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="2.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:apply-templates select="html/body"/>
</xsl:template>
<xsl:template match="body">
<xsl:for-each-group select="node()"
group-starting-with="*[position()=1 or @class='toc']">
<xsl:if test="count(current-group()[self::*]) > 0 ">
<xsl:variable name="filename" select="concat('/home/t',position(),'.xml' )"/>
<xsl:apply-templates/>
<xsl:result-document
indent="yes" method="xml" href="{$filename}">
<html>
<xsl:copy-of select="/html/@*"/>
<xsl:for-each select="/html/node()">
<xsl:choose>
<xsl:when test="not(self::body)">
<xsl:copy-of select="."/>
</xsl:when>
<xsl:otherwise>
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:copy-of select="current-group()"/>
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each>
</html>
</xsl:result-document>
</xsl:if>
</xsl:for-each-group>
</xsl:template>
<xsl:template match="text()"/>
</xsl:stylesheet>
It currently splits the file wherever it finds an element with @class='toc'. I need to alter this to be sensitive to the size of the output file, as opposed to breaking at @class='toc'.
Desired end state: I want each result document to be about 500KB. I suppose position() might be the best way to regulate the split points? I tried various string-length() approaches, but I could not get one to work. Also, I think white space may be an issue.
By my calculations with these documents, splitting the file at a <p class="i0"> found at or near every 150th position increment should reliably give me the filesize I need.
I guess the best way to get there is to change this:
group-starting-with="*[position()=1 or @class='toc']"
So far I have not succeeded in anything I have changed it to. Thoughts?
UPDATE: I'm not ready to say this is answered, because someone may have a better idea. But right now I'm using group-starting-with="body/*[position()=1 or position() mod 350 = 0]" with some success. It is testing well.
UPDATE 2: The group-starting-with="body/*[position()=1 or position() mod 350 = 0]" is not working well. Problem is that it is the position within the for-each-loop, not the overall file.
The successful solution ended up being an XSLT 3.0 accumulator.
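The accumulator idea, keeping a running size total and starting a new output whenever the total would pass a threshold, can be sketched in Python. This is not the poster's XSLT 3.0 solution, which is not shown; it is an illustrative sketch in which `split_by_size` and `max_bytes` are invented names and the size of a node is approximated by its serialized UTF-8 length.

```python
import xml.etree.ElementTree as ET


def split_by_size(elements, max_bytes):
    """Greedily group elements so each group's serialized size stays within max_bytes.

    A single element larger than max_bytes still gets a group of its own.
    """
    groups, current, running = [], [], 0
    for el in elements:
        size = len(ET.tostring(el, encoding="utf-8"))
        # Start a new group when adding this element would exceed the budget.
        if current and running + size > max_bytes:
            groups.append(current)
            current, running = [], 0
        current.append(el)
        running += size
    if current:
        groups.append(current)
    return groups


# Ten small paragraphs standing in for the children of <body>.
body = ET.fromstring(
    "<body>" + "".join(f"<p>{'x' * 10}</p>" for _ in range(10)) + "</body>"
)
groups = split_by_size(list(body), max_bytes=40)
print([len(g) for g in groups])
```

Each group could then be written to its own file, in the way `xsl:result-document` is used in the stylesheet above.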
As an alternative:
Dimitre Novatchev's solution for XSLT 1.0:
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:variable name="vResult">
<xsl:apply-templates/>
</xsl:variable>
Length of output is: <xsl:text/>
<xsl:value-of select="concat(string-length($vResult), '&#xA;')"/>
<xsl:if test="string-length($vResult) &lt;= 1800">
<xsl:copy-of select="$vResult"/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied on this source.xml:
<nums>
<num>01</num>
<num>02</num>
<num>03</num>
<num>04</num>
<num>05</num>
<num>06</num>
<num>07</num>
<num>08</num>
<num>09</num>
<num>10</num>
</nums>
produces the wanted result:
Length of output is: 51
01
02
03
04
05
06
07
08
09
10
References
XSLT FAQ: WML and HDML - Measuring the size of the output file, in bytes
XSLT 3.0: Accumulator Function
Utilizing new capabilities of XML languages to verify integrity constraints
A Functional Tokenizer (Was: Re: Looping over a CSV in XSL)
XSL Techniques
FXSL:sumTree
I have the following XML data:
<result>
<row>
<CountryId>26</CountryId>
<CountryName>United Kingdom</CountryName>
<NoOfNights>1</NoOfNights>
<AccommodationID>6004</AccommodationID>
<RoomID>1</RoomID>
<RoomName>Double for Sole Use</RoomName>
<RatePlanID>1</RatePlanID>
<RoomRatePlan>Advance</RoomRatePlan>
<NoOfSameTypeRoom>0</NoOfSameTypeRoom>
<RoomSize/>
<Max_Person>1</Max_Person>
<RackRate>189</RackRate>
<CurrencySymbol>&pound;</CurrencySymbol>
<NoOfRoomsAvailable>4</NoOfRoomsAvailable>
<Rate>79.00</Rate>
<RatePerDay>27 Mar 2013_79.00</RatePerDay>
</row>
<row>
<CountryId>26</CountryId>
<CountryName>United Kingdom</CountryName>
<NoOfNights>1</NoOfNights>
<AccommodationID>6004</AccommodationID>
<RoomID>1</RoomID>
<RoomName>Double for Sole Use</RoomName>
<RatePlanID>2</RatePlanID>
<RoomRatePlan>Standard</RoomRatePlan>
<NoOfSameTypeRoom>0</NoOfSameTypeRoom>
<RoomSize/>
<Max_Person>1</Max_Person>
<RackRate>189</RackRate>
<CurrencySymbol>&pound;</CurrencySymbol>
<NoOfRoomsAvailable>5</NoOfRoomsAvailable>
<Rate>89.00</Rate>
<RatePerDay>27 Mar 2013_89.00</RatePerDay>
</row>
<row>
<CountryId>26</CountryId>
<CountryName>United Kingdom</CountryName>
<NoOfNights>1</NoOfNights>
<AccommodationID>6004</AccommodationID>
<RoomID>2</RoomID>
<RoomName>Double Room</RoomName>
<RatePlanID>1</RatePlanID>
<RoomRatePlan>Advance</RoomRatePlan>
<NoOfSameTypeRoom>0</NoOfSameTypeRoom>
<RoomSize/>
<Max_Person>2</Max_Person>
<RackRate>199</RackRate>
<CurrencySymbol>&pound;</CurrencySymbol>
<NoOfRoomsAvailable>5</NoOfRoomsAvailable>
<Rate>89.00</Rate>
<RatePerDay>27 Mar 2013_89.00</RatePerDay>
</row>
</result>
My XSLT for the above xml is this:
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output omit-xml-declaration="yes" indent="yes" method="xml" />
<xsl:strip-space elements="*"/>
<!-- Default template : ignore unrecognized elements and text -->
<xsl:template match="*|text()" />
<!-- Match document root : add hotels element and process each children node of result -->
<xsl:template match="/">
<hotels>
<!-- We assume that the XML documents are always going to follow the structure:
result as the root node and xml_acc elements as its children -->
<xsl:for-each select="result/row">
<result>
<hotel_rooms>
<xsl:element name="hotel_id">
<xsl:value-of select="AccommodationID"/>
</xsl:element>
<xsl:apply-templates />
</hotel_rooms>
<xsl:element name="Rate">
<xsl:element name="RoomRatePlan">
<xsl:value-of select="RoomRatePlan"/>
</xsl:element>
<xsl:element name="numeric_price">
<xsl:value-of select="Rate"/>
</xsl:element>
</xsl:element>
</result>
</xsl:for-each>
</hotels>
</xsl:template>
<!-- Elements to be copied as they are -->
<xsl:template match="NoOfNights|RoomName|RoomSize|Max_Person|RackRate|RatePerDay|CurrencySymbol|NoOfRoomsAvailable|RoomDescription|RoomFacilities|PolicyComments|Breakfast|Policy|Message">
<xsl:copy-of select="." />
</xsl:template>
<xsl:template match="Photo_Max60">
<RoomImages>
<Photo_Max60>
<xsl:value-of select="." />
</Photo_Max60>
<Photo_Max300>
<xsl:value-of select="../Photo_Max300" />
</Photo_Max300>
<Photo_Max500>
<xsl:value-of select="../Photo_Max500" />
</Photo_Max500>
</RoomImages>
</xsl:template>
</xsl:stylesheet>
In XSLT 1.0, I want to group by Room ID and Hotel ID, so for the above data I want a result like this:
<hotels>
<result>
<hotel_rooms>
<hotel_id>6004</hotel_id>
<NoOfNights>1</NoOfNights>
<RoomID>1</RoomID>
<RoomName>Double for Sole Use</RoomName>
<RoomSize/>
<Max_Person>1</Max_Person>
<RackRate>189</RackRate>
<CurrencySymbol>&pound;</CurrencySymbol>
<NoOfRoomsAvailable>4</NoOfRoomsAvailable>
<RatePerDay>27 Mar 2013_79.00</RatePerDay>
</hotel_rooms>
<Rate>
<RoomRatePlan>Advance</RoomRatePlan>
<numeric_price>79.00</numeric_price>
<RoomRatePlan>Standard</RoomRatePlan>
<numeric_price>89.00</numeric_price>
</Rate>
</result>
</result>
<hotels>
I need an XSLT file that produces the XML output above. Please help.
Muenchian method:
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output indent="yes" />
<xsl:key name="room-per-hotel" match="result" use="concat(hotel_rooms/hotel_id, '-', hotel_rooms/RoomID)" />
<xsl:template match="/">
<hotels>
<xsl:for-each select="hotels/result[count(. | key('room-per-hotel', concat(hotel_rooms/hotel_id, '-', hotel_rooms/RoomID))[1]) = 1]">
<xsl:sort select="concat(hotel_rooms/hotel_id, '-', hotel_rooms/RoomID)" />
<result>
<xsl:copy-of select="hotel_rooms"/>
<Rate>
<xsl:copy-of select="key('room-per-hotel', concat(hotel_rooms/hotel_id, '-', hotel_rooms/RoomID))/Rate/*"/>
</Rate>
</result>
</xsl:for-each>
</hotels>
</xsl:template>
</xsl:stylesheet>
Well, neither your input nor your wanted result is well-formed: tags are not closed properly, and there is an entity reference &pound; to an undeclared entity. But if you want to group with XSLT 1.0, have a look at the following XSLT 1.0 sample using Muenchian grouping:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:key name="group" match="result" use="concat(hotel_rooms/hotel_id, '|', hotel_rooms/RoomID)"/>
<xsl:template match="hotels">
<xsl:copy>
<xsl:apply-templates select="result[generate-id() = generate-id(key('group', concat(hotel_rooms/hotel_id, '|', hotel_rooms/RoomID))[1])]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="result">
<xsl:copy>
<xsl:copy-of select="hotel_rooms"/>
<Rate>
<xsl:copy-of select="key('group', concat(hotel_rooms/hotel_id, '|', hotel_rooms/RoomID))/Rate/*"/>
</Rate>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
It transforms
<hotels>
<result>
<hotel_rooms>
<hotel_id>6004</hotel_id>
<NoOfNights>1</NoOfNights>
<RoomID>1</RoomID>
<RoomName>Double for Sole Use</RoomName>
<RoomSize/>
<Max_Person>1</Max_Person>
<RackRate>189</RackRate>
<CurrencySymbol>pound</CurrencySymbol>
<NoOfRoomsAvailable>4</NoOfRoomsAvailable>
<RatePerDay>27 Mar 2013_79.00</RatePerDay>
</hotel_rooms>
<Rate>
<RoomRatePlan>Advance</RoomRatePlan>
<numeric_price>79.00</numeric_price>
</Rate>
</result>
<result>
<hotel_rooms>
<hotel_id>6004</hotel_id>
<NoOfNights>1</NoOfNights>
<RoomID>1</RoomID>
<RoomName>Double for Sole Use</RoomName>
<RoomSize/>
<Max_Person>1</Max_Person>
<RackRate>189</RackRate>
<CurrencySymbol>pound</CurrencySymbol>
<NoOfRoomsAvailable>5</NoOfRoomsAvailable>
<RatePerDay>27 Mar 2013_89.00</RatePerDay>
</hotel_rooms>
<Rate>
<RoomRatePlan>Standard</RoomRatePlan>
<numeric_price>89.00</numeric_price>
</Rate>
</result>
</hotels>
into
<hotels>
<result>
<hotel_rooms>
<hotel_id>6004</hotel_id>
<NoOfNights>1</NoOfNights>
<RoomID>1</RoomID>
<RoomName>Double for Sole Use</RoomName>
<RoomSize />
<Max_Person>1</Max_Person>
<RackRate>189</RackRate>
<CurrencySymbol>pound</CurrencySymbol>
<NoOfRoomsAvailable>4</NoOfRoomsAvailable>
<RatePerDay>27 Mar 2013_79.00</RatePerDay>
</hotel_rooms>
<Rate>
<RoomRatePlan>Advance</RoomRatePlan>
<numeric_price>79.00</numeric_price>
<RoomRatePlan>Standard</RoomRatePlan>
<numeric_price>89.00</numeric_price>
</Rate>
</result>
</hotels>
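The grouping step itself, independent of the Muenchian key mechanics, can be illustrated with a dictionary keyed on the (hotel_id, RoomID) pair. The following Python sketch is an assumption-laden illustration, not a port of either stylesheet: `group_rates` and `make_result` are hypothetical helpers, and the sample is trimmed to the fields that matter for grouping.

```python
import xml.etree.ElementTree as ET


def make_result(hotel_id, room_id, plan, price):
    """Build one trimmed-down <result> element as text (sample data only)."""
    return (
        f"<result><hotel_rooms><hotel_id>{hotel_id}</hotel_id>"
        f"<RoomID>{room_id}</RoomID></hotel_rooms>"
        f"<Rate><RoomRatePlan>{plan}</RoomRatePlan>"
        f"<numeric_price>{price}</numeric_price></Rate></result>"
    )


def group_rates(xml_text):
    """Merge <result> elements that share the same hotel_id and RoomID."""
    root = ET.fromstring(xml_text)
    groups = {}  # composite key -> list of results, in first-seen order
    for result in root.findall("result"):
        rooms = result.find("hotel_rooms")
        key = (rooms.findtext("hotel_id"), rooms.findtext("RoomID"))
        groups.setdefault(key, []).append(result)
    out = ET.Element(root.tag)
    for members in groups.values():
        merged = ET.SubElement(out, "result")
        # Keep the room details of the first member, as the stylesheets do.
        merged.append(members[0].find("hotel_rooms"))
        rate = ET.SubElement(merged, "Rate")
        for member in members:
            rate.extend(list(member.find("Rate")))
    return out


SAMPLE = (
    "<hotels>"
    + make_result(6004, 1, "Advance", "79.00")
    + make_result(6004, 1, "Standard", "89.00")
    + make_result(6004, 2, "Advance", "89.00")
    + "</hotels>"
)
grouped = group_rates(SAMPLE)
print(ET.tostring(grouped, encoding="unicode"))
```

The dictionary plays the role of `xsl:key`, and taking `members[0]` corresponds to selecting the first node of each key group.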
I have a dateTime variable, and I want to convert it to a decimal value of epoch.
How can this be done?
I tried using:
seconds-from-duration($time, xs:dateTime('1970-01-01T00:00:00'))
but it just returns 0.
Please advise.
Thanks.
This transformation:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:sequence select="current-dateTime()"/>
<xsl:sequence select=
"( current-dateTime() - xs:dateTime('1970-01-01T00:00:00') )
div
xs:dayTimeDuration('PT1S')
"/>
</xsl:template>
</xsl:stylesheet>
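when applied on any XML document (not used), produces the wanted result: the current date-time and the number of seconds since 1970-01-01T00:00:00. Note that the xs:dateTime literal has no timezone, so the processor's implicit timezone is assumed for it; append Z to the literal to measure from the UTC epoch: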
2010-08-12T06:26:54.273-07:00 1281594414.273
A pure XSLT 1.0 library example:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:date="https://github.com/ilyakharlamov/pure-xsl/date"
version="1.0">
<xsl:import href="https://raw.github.com/ilyakharlamov/pure-xsl/master/date.xsl"/>
<xsl:template match="/">
<xsl:variable name="time_as_timestamp" select="1365599995640"/>
<xsl:text>time_as_timestamp:</xsl:text><xsl:value-of select="$time_as_timestamp"/><xsl:text>
</xsl:text>
<xsl:variable name="time_as_xsdatetime">
<xsl:call-template name="date:date-time">
<xsl:with-param name="timestamp" select="$time_as_timestamp"/>
</xsl:call-template>
</xsl:variable>
<xsl:text>time_as_xsdatetime:</xsl:text><xsl:value-of select="$time_as_xsdatetime"/><xsl:text>
</xsl:text>
<xsl:text>converted back:</xsl:text>
<xsl:call-template name="date:timestamp">
<xsl:with-param name="date-time" select="$time_as_xsdatetime"/>
</xsl:call-template>
</xsl:template>
</xsl:stylesheet>
Output:
time_as_timestamp:1365599995640
time_as_xsdatetime:2013-04-10T13:19:55.640Z
converted back:1365599995640
As an XPath expression that does not use division but extracts the components from the duration:
for $i in (current-dateTime()-xs:dateTime('1970-01-01T00:00:00Z'))
return ((days-from-duration($i)*86400)+(hours-from-duration($i)*3600)+(minutes-from-duration($i)*60)+(seconds-from-duration($i)))
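The same "subtract the epoch, divide by one second" arithmetic can be checked in Python, which makes the timezone handling explicit. This is a small sketch for comparison, not part of the answers above; `to_epoch_seconds` is a name chosen here, and it deliberately rejects naive datetimes so that no implicit timezone is assumed.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_seconds(moment):
    """Return seconds since 1970-01-01T00:00:00Z for a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Dividing one timedelta by another yields a float, like the XPath `div`.
    return (moment - EPOCH) / timedelta(seconds=1)


# The instant used in the pure XSLT 1.0 library example above.
stamp = datetime(2013, 4, 10, 13, 19, 55, 640000, tzinfo=timezone.utc)
print(to_epoch_seconds(stamp))
```

Multiplying the result by 1000 gives the millisecond timestamp form used in the library example.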