parse xml without namespaces - r

I'm trying to extract information from XML (more precisely, XBRL) files that reference a lot of namespaces. I don't really care about the namespaces, and I'd prefer to be able to search through the files without having to specify them.
Here is an example:
require(magrittr)
xml <-xml2::read_xml("http://regnskaber.virk.dk/21560853/ZG9rdW1lbnRsYWdlcjovLzAzLzdlLzk4L2JiLzg4L2NiNzctNDE2ZC1hOWJmLTkxN2QxZWRkMGY0Yg.xml")
This file contains the following node:
<cmn:IdentificationNumberOfAuditor contextRef=\"duration_IdentificationOfAuditorDimension_cmn_auditorIdentifier_only_1\">mne18078</cmn:IdentificationNumberOfAuditor>
I know I can find it using
xml2::xml_find_all(xml, '//cmn:IdentificationNumberOfAuditor')
But that is only if I know the namespace prefix, and I am not sure those are given the same way in all of the thousands of files I need to process. So I was hoping for this to work:
xml2::xml_find_all(xml2::xml_ns_strip(xml), '//IdentificationNumberOfAuditor')
because I thought xml_ns_strip would strip the xml file of the namespace information. However, xml_ns_strip does not actually seem to do anything at all, since:
identical(xml %>% as.character(), xml2::xml_ns_strip(xml) %>% as.character())
returns true.

For reference, the best solution I found to this problem is using an XSLT stylesheet from https://www.ibm.com/support/knowledgecenter/vi/SSEPGG_10.1.0/com.ibm.db2.luw.xml.doc/doc/r0054369.html:
strip_namespaces <- function(x){
  # parse the stylesheet so it can be passed to xslt::xml_xslt()
  stylesh <- xml2::read_xml('<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <!-- keep comments -->
  <xsl:template match="comment()">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*">
    <!-- remove element prefix -->
    <xsl:element name="{local-name()}">
      <!-- process attributes -->
      <xsl:for-each select="@*">
        <!-- remove attribute prefix -->
        <xsl:attribute name="{local-name()}">
          <xsl:value-of select="."/>
        </xsl:attribute>
      </xsl:for-each>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
')
  xslt::xml_xslt(x, stylesh)
}
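For readers working outside R, the same idea (drop the namespace part of every element and attribute name, then search by local name) can be sketched with Python's standard library. This is a minimal sketch, not the method used above: the helper name, the sample document and the namespace URI `http://example.com/cmn` are assumptions made up for illustration. It relies on `xml.etree.ElementTree` storing qualified names as `{uri}local`.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Remove the '{uri}' part from element and attribute names, in place."""
    for el in root.iter():
        # ElementTree stores a qualified tag as '{namespace-uri}local-name'
        if isinstance(el.tag, str) and "}" in el.tag:
            el.tag = el.tag.split("}", 1)[1]
        for name in list(el.attrib):
            if "}" in name:
                el.attrib[name.split("}", 1)[1]] = el.attrib.pop(name)
    return root


# Hypothetical sample; the namespace URI is invented for this sketch.
SAMPLE = (
    '<root xmlns:cmn="http://example.com/cmn">'
    '<cmn:IdentificationNumberOfAuditor contextRef="ctx_1">mne18078'
    "</cmn:IdentificationNumberOfAuditor></root>"
)

root = strip_namespaces(ET.fromstring(SAMPLE))
auditor_ids = [el.text for el in root.iter("IdentificationNumberOfAuditor")]
```

After stripping, the element can be found without knowing which prefix a given file happened to use.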

Related

XML File Transformations for inner text of an element?

I am in the process of setting up a release pipeline for one of our solutions, but I am struggling to use file transformations on my web.config to alter inner text values of elements.
From what it looks like, you can replace/insert/etc values attached to specific attributes, but not the inner text. Does this mean I will not be able to use file transformations for my purposes?
<setting name="Test" serializeAs="String">
<value>True</value>
</setting>
That "True" value must be replaced with False. There are quite a number of similar instances that need to be replaced. Can this be done with XML file transformations? I cannot use the variable substitution method as it only applies to certain elements like connectionString, etc.
Thanks in advance.
You could use a simple XSL transformation.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
version="1.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="setting[@name='Test']/value">
<xsl:element name="value">False</xsl:element>
</xsl:template>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
See it working here: https://xsltfiddle.liberty-development.net/bEzknsB
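If an XSLT processor is not available in the pipeline, the same replacement (set the inner text of the `value` child of the setting named `Test`) could be scripted instead. The following is a rough sketch in Python's standard library, assuming a hypothetical `applicationSettings` wrapper element and a second setting added only to show that other settings are left alone; `set_setting_value` is an illustrative helper, not part of any release tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical config fragment; only the "Test" setting comes from the question.
CONFIG = """<applicationSettings>
  <setting name="Test" serializeAs="String">
    <value>True</value>
  </setting>
  <setting name="Other" serializeAs="String">
    <value>True</value>
  </setting>
</applicationSettings>"""


def set_setting_value(root, name, new_text):
    """Replace the text of <value> under every <setting name=...>; return the count."""
    matches = root.findall(f".//setting[@name='{name}']/value")
    for value in matches:
        value.text = new_text
    return len(matches)


root = ET.fromstring(CONFIG)
changed = set_setting_value(root, "Test", "False")
updated_xml = ET.tostring(root, encoding="unicode")
```

The predicate `[@name='Test']` plays the same role as in the XSLT match pattern above, so only the targeted setting is rewritten.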

Simple xml document with Css

Hi, I am new to programming. I am creating a simple XML document and I want to apply CSS to my XML data. My XML is:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE Customers SYSTEM "customers.dtd">
<Employees>
<Employee>
<FirstNameLabel>FirstName:</FirstNameLabel>
<FirstNameData>Ana</FirstNameData>
<FirstNameLabel>LastName:</FirstNameLabel>
<FirstNameData>Ali</FirstNameData>
</Employee>
<Employee>
<FirstNameLabel>FirstName:</FirstNameLabel>
<FirstNameData>Ash</FirstNameData>
<FirstNameLabel>LastName:</FirstNameLabel>
<FirstNameData>Ana</FirstNameData>
</Employee>
</Employees>
I need to display results as
FirstName: Ana
LastName: Ali
FirstName: Ash
LastName: Ana
The FirstName and LastName labels should be green, while the values should be red.
Help would be appreciated. Thanks.
I don't think you looked before posting this. w3schools shows exactly how. http://www.w3schools.com/xml/xml_display.asp It is very similar to adding a style sheet to html. Also, you should fix your indentation.
Try:
FirstNameLabel{
float:left;
}
FirstNameData{
clear:left;
}
I believe you may want to use an XML stylesheet to render and format your XML files rather than CSS. CSS is, as far as I know, for HTML documents. In your case, you would need to create an XSLT stylesheet, which is essentially XML that specifies how to present each of your nodes. Below is a quick example (untested):
<?xml version="1.0" encoding="ISO-8859-1"?>
<html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml">
<body style="font-family:Arial;font-size:12pt;background-color:#EEEEEE">
<xsl:for-each select="Employees/Employee">
<div>
<!-- the sample XML reuses FirstNameData for the last name, so select by position -->
<p>Firstname: <xsl:value-of select="FirstNameData[1]"/>.</p>
<p>Lastname: <xsl:value-of select="FirstNameData[2]"/>.</p>
</div>
</xsl:for-each>
</body>
</html>
Note that you shouldn't need to store labels in your XML. You can have them in your stylesheet.

How to convert a xml element into url using css?

I have an xml file which has an element like this
<video><title>XXXX</title><url>VIDEO ID OF YOUTUBE></url></video>
Is there any way to use CSS to display the elements as
XXXX
CSS is for styling. If you wish to read XML and produce HTML content, you should use JavaScript, or any server-side language if you want it done before the page is sent to the client browser.
As Nicklas points out, many consider CSS to be intended for styling and nothing more. More generally, though, it is intended for the presentation of your information. This is a fringe case, and it is hard for me to say whether it goes too far or stays within the scope of CSS: is this simply changing the presentation, or is it doing something more?
I'm sure many, like Nicklas, would argue that what you want to do goes beyond the intended purpose of CSS. And I'd probably agree with Nicklas that in most cases this is a less-than-ideal way to go about things.
With that said, it is possible:
#url:before {
display: inline-block;
content: "<a href=\"http://www.youtube.com/";
}
#url:after {
display: inline-block;
content: "\">";
}
#text:after {
display: inline-block;
content: "</a>";
}
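Note: I used HTML in this example for the sake of making a JSFiddle, but the same strategy should work for an XML file. Keep in mind that browsers insert generated content as plain text, so the tags appear literally on the page instead of forming a clickable link.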
I do not know of a way to use CSS for this, but you can do it with an XSLT stylesheet. Is that what you want? If so, a stylesheet similar to this can help you:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
<xsl:output method="html" indent="yes"/>
<xsl:template match="/" >
<body>
<xsl:apply-templates select ="/items/video"/>
</body>
</xsl:template>
<xsl:template match="video" >
<li>
<a>
<xsl:attribute name="href">
<xsl:value-of select="./url" disable-output-escaping="yes" />
</xsl:attribute>
<xsl:value-of select="./title"/>
</a>
</li>
</xsl:template>
<xsl:template match=
"*[not(@*|*|comment()|processing-instruction())
and normalize-space()=''
]"/>
<xsl:template match="text()"/>
</xsl:stylesheet>
This assumes that the XML input file looks as follows:
<?xml version="1.0" encoding="utf-8" ?>
<items>
<video>
<title>XXXX</title>
<url>VIDEO ID OF YOUTUBE></url>
</video>
<video>
<title>XXXX</title>
<url>VIDEO ID OF YOUTUBE></url>
</video>
</items>
Exactly how you should do the transform is hard to say without knowing your requirements. An XSLT transform can be run in a number of ways. If the XML input file is static, you can let the web browser do the transformation. If the XML file is on a server, you can run the transform in a web page and send the resulting HTML to the browser. It all depends on your environment.
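As one illustration of the server-side route, the list of links could also be built with a short script before the HTML is sent to the browser. This is only a sketch under assumptions: the base URL `https://www.youtube.com/watch?v=`, the sample titles and IDs, and the function name `videos_to_html` are all invented here, and the input is assumed to follow the `items/video` layout shown above.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical input in the items/video layout; titles and IDs are made up.
ITEMS = """<items>
  <video><title>First clip</title><url>abc123</url></video>
  <video><title>Second &amp; last</title><url>xyz789</url></video>
</items>"""

# Assumption: the <url> element holds only the video ID.
BASE_URL = "https://www.youtube.com/watch?v="


def videos_to_html(xml_text, base_url=BASE_URL):
    """Render each <video> as a list item containing a link to the video."""
    root = ET.fromstring(xml_text)
    items = []
    for video in root.iter("video"):
        title = video.findtext("title", default="")
        video_id = video.findtext("url", default="").strip()
        href = escape(base_url + video_id, quote=True)
        items.append(f'<li><a href="{href}">{escape(title)}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


html_list = videos_to_html(ITEMS)
```

Escaping both the `href` and the title keeps characters such as `&` or `<` in the source XML from breaking the generated HTML.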

How to retain mathml tags after transformation?

How do I retain mathml tags after the transformation? I'm using this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:m="http://www.w3.org/1998/Math/MathML">
By the way I'm using the m namespace. After transforming into html, mathml tags are gone. Please help.
You are using XSLT, which takes some unspecified input and can generate any XML output. Since you have given no indication of the input or the transformation, it is pretty hard to help, but as a wild guess:
<xsl:template match="m:*">
<xsl:copy-of select="."/>
</xsl:template>
will copy mathml from the input to the output if templates are applied to the mathml elements.

I want to remove XSL whitespace from Markdown formatting

Markdown formatting is coming into my XSL and maintaining its whitespace and breaks. I want it to be converted to actual HTML elements to remove all whitespace.
Here's a look at the incoming data and HTML source, and here's the code used to process it:
<xsl:value-of select="description-continued" disable-output-escaping="yes" />
The XSL output method already contains indent="no"
You may just need (in case no group of whitespace characters really matters):
<xsl:template match="text()">
<xsl:value-of select="normalize-space()"/>
</xsl:template>
<xsl:value-of select="normalize-space(description-continued)" disable-output-escaping="yes" />
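To make the effect of `normalize-space()` concrete: it trims leading and trailing whitespace and collapses each internal run of spaces, tabs, carriage returns and newlines into a single space. A small Python sketch of that behaviour follows; the function name and the sample string are illustrative only.

```python
import re

# XPath treats space, tab, carriage return and newline as whitespace.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(text):
    """Mimic XPath normalize-space(): collapse whitespace runs, then trim."""
    return _XML_WS.sub(" ", text).strip(" ")


collapsed = normalize_space("  First line\n\n   second\tline  ")
```

Because every line break becomes a single space, any Markdown structure that depends on blank lines is flattened as well, which matches the caveat above about whitespace groups not mattering.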
