How to use variable with namespace in dataweave? - xml-namespaces

We have a requirement to use dynamic tag names in XML together with a namespace. I am storing the tag name in a variable and trying to use it with the namespace.
%dw 1.0
%output application/XML encoding= "UTF-8"
%namespace opt vision.soap.ogc
%var tag = flowVars.tag
---
{
opt#tag : 'something'
}
I expect the output tag to be the name stored in the variable, prefixed with the namespace. However, the actual output just uses the literal string 'tag' with the namespace. Is there a way to do this?

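Please try this. Wrapping the variable in string interpolation, "$(tag)", makes the key come from the variable's value instead of the literal name: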
%dw 1.0
%output application/XML encoding= "UTF-8"
%namespace opt vision.soap.ogc
%var tag = "a"
---
{
opt#"$(tag)" : 'something'
}
Output
<?xml version='1.0' encoding='UTF-8'?>
<opt:a xmlns:opt="vision.soap.ogc">something</opt:a>

Related

converting xhtml to xml in r

I want to parse a court document that I downloaded in XML format, but the response type is application/xhtml+xml, and I get an error when converting this XHTML document to XML in R so that I can extract the information I need. See below. Can anyone help? Thank you.
resp_xml <- readRDS("had_NH_xml.rds")
# Load xml2 (read_xml) and httr (http_type, content)
library(xml2)
library(httr)
# Check response is XML
http_type(resp_xml)
[1] "application/xhtml+xml"
# Examine returned text with content()
NH_text <- content(resp_xml, as = "text")
NH_text
[1] "<!DOCTYPE html>\n<html xmlns=\"http://www.w3.org/1999/xhtml\"><head>\n \t<meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\" /><link type=\"text/css\" rel=\"stylesheet\" href=\"/csologin/javax.faces.resource/theme.css.jsf?ln=primefaces-redmond\" /><link type=\"text/css\" rel=\"stylesheet\" href=\"/csologin/javax.faces.resource/primefaces.css.jsf?ln=primefaces&v=5.3\" /><script type=\"text/javascript\" src=\"/csologin/javax.faces.resource/jquery/jquery.js.jsf?ln=primefaces&v=5.3\"></script><script type=\"text/javascript\" src=\"/csologin/javax.faces.resource/jquery/jquery-plugins.js.jsf?ln=primefaces&v=5.3\"></script><script type=\"text/javascript\" src=\"/csologin/javax.faces.resource/primefaces.js.jsf?ln=primefaces&v=5.3\"></script><script type=\"text/javascript\" src=\"/csologin/javax.faces.resource/primefaces-extensions.js.jsf?ln=primefaces-extensions&v=4.0.0\"></script><link type=\"text/css\" rel=\"stylesheet\" href=\"/csologin/javax.faces.resou... <truncated>
# Check htmltidy package: https://cran.r-project.org/web/packages/htmltidy/htmltidy.pdf
# Turn NH_text into an XML document
NH_xml <- read_xml(NH_text)
Error in doc_parse_raw(x, encoding = encoding, base_url = base_url,
as_html = as_html, :
Entity 'nbsp' not defined [26]
Named HTML entities such as &nbsp; are not defined in XML (only &lt;, &gt;, &amp;, &apos; and &quot; are predefined), which is why the parser reports "Entity 'nbsp' not defined". I do not know R, but what you need is a string replacement that takes the following named entities:
'&nbsp;','&gt;','&lt;'
...and replaces them with their numeric character references:
'&#160;','&#62;','&#60;'
In PHP this would simply be:
$f = array('&nbsp;','&gt;','&lt;');
$r = array('&#160;','&#62;','&#60;');
$a = str_ireplace($f,$r,$a);
...and each entry in $f is replaced by the entry at the same position in $r. I am not confident enough in R to post R code based on basic tutorials.
If you clean out those entities (and any doctype), and the rest of the markup is well-formed, it should parse fine as application/xml.
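The replacement described above can be sketched in Python (rather than R) under a few assumptions: the helper name named_to_numeric and the sample string are made up for illustration, and the entity table is the HTML 4 list shipped in the standard library's html.entities module. The five entities that XML predefines are left untouched.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities that every XML parser already understands; leave them alone.
XML_PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}

def named_to_numeric(text):
    """Replace named HTML entities (e.g. &nbsp;) with numeric references (&#160;)."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep predefined or unknown entities as-is
        return f"&#{name2codepoint[name]};"
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)

# Hypothetical sample, not the court document from the question.
sample = '<p xmlns="http://www.w3.org/1999/xhtml">a&nbsp;b &lt; c</p>'
cleaned = named_to_numeric(sample)
root = ET.fromstring(cleaned)  # parses once &nbsp; has been rewritten
```

A lookup table avoids maintaining a hand-written list of entities, but it only covers names present in name2codepoint; anything else is passed through and would still fail in a strict XML parser.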

How to escape Special character in XML

I have XML data in a string and tried to load that string as XML using
XmlDocument xl=new XmlDocument();
xl.LoadXml(mystring);
It fails to parse because my string has special characters inside XML elements, like below.
<ROOT>
<SUB>
<DATA>name < lastname</DATA>
<DATA>Myname > lastname</DATA>
<DATA>some special character in between text</DATA>
......
.....
</SUB>
</ROOT>
There are many <DATA> elements in my XML, and it is generated dynamically.
I tried replacing < with &lt; and > with &gt;, but that also replaced the angle brackets of the XML tags. How can I escape the special characters above without changing the other XML tags?
Use the following entity references in XML text:
&lt; ==> less-than = <
&gt; ==> greater-than = >
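One way to escape only the text inside the DATA elements, and not the tags themselves, is sketched below in Python rather than C#. It assumes that DATA elements never nest, that their content is plain text containing no markup, and that nothing inside them is already escaped (an existing &amp; would be escaped a second time). The helper name escape_data_text is made up for this example.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def escape_data_text(xml_text):
    """Escape &, < and > inside <DATA>...</DATA> while leaving the tags intact."""
    def repl(match):
        return f"<DATA>{escape(match.group(1))}</DATA>"
    # Non-greedy match so each DATA element is handled separately.
    return re.sub(r"<DATA>(.*?)</DATA>", repl, xml_text, flags=re.DOTALL)

raw = (
    "<ROOT><SUB>"
    "<DATA>name < lastname</DATA>"
    "<DATA>Myname > lastname</DATA>"
    "</SUB></ROOT>"
)
fixed = escape_data_text(raw)
root = ET.fromstring(fixed)  # now well-formed
values = [d.text for d in root.iter("DATA")]
```

Restricting the replacement to the element content is what keeps the surrounding tags untouched; a global replace of < and > cannot tell markup from text.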

UnicodeEncodeError: "ascii" can't encode character '\xe0' while parsing HTML (Python)

I'm writing a web scraper that parses HTML by subclassing HTMLParser from the html.parser library, with convert_charrefs set to True. The program downloads a page with downloadPage(url) and passes it to myParser (I think it is better not to paste all my code here). When the parser finds a link I'm interested in (e.g. Attività e procedimenti) on a web site, the program reads the value of its href attribute, downloads the linked page with downloadPage(href), passes it to myParser, and so on.
The code for downloadPage is the following:
import re
import urllib.request

def getCharset(response):
    # Read the charset parameter from the Content-Type header; fall back to ASCII.
    contentType = response.info()["Content-type"]
    if contentType:
        match = re.search(r"charset=([^\s;]+)", contentType)
        if match:
            return match.group(1)
    return "ascii"

def downloadPage(url):
    response = urllib.request.urlopen(url)
    charset = getCharset(response)
    return response.read().decode(charset)
Now, the problem is that certain links contain accented vowels, such as "http://città.it/" (this URL is made up). Not all links found in a web page contain non-ASCII characters, so the following call only sometimes raises UnicodeEncodeError:
urllib.request.urlopen(url)
Note that I can't know in advance how each link is composed.
I solved the problem this way:
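from urllib.parse import quote, urlsplit, urlunsplit

def fromIriToUri(iri):
    # Percent-encode every URL component that contains non-ASCII characters;
    # pure-ASCII components are passed through unchanged.
    myUri = []
    for part in urlsplit(iri):
        try:
            part.encode("ascii")
            myUri.append(part)
        except UnicodeEncodeError:
            myUri.append(quote(part))
    return urlunsplit(myUri)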
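A possible refinement of fromIriToUri, offered as a sketch rather than a definitive fix: quote() applied to the host yields percent-escapes instead of an IDNA host name, and applied to a whole query string it would also escape = and &. The variant below treats each component separately. The name iri_to_uri is hypothetical, and the sketch assumes there is no user:password part and no IPv6 literal in the URL (both would be lost or mangled).

```python
from urllib.parse import quote, urlsplit, urlunsplit

def iri_to_uri(iri):
    """Convert an IRI to an ASCII-only URI, component by component."""
    parts = urlsplit(iri)
    # Host names use IDNA (punycode), not percent-encoding.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    # Keep delimiters and existing %XX escapes; encode everything non-ASCII.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, netloc, path, query, fragment))

uri = iri_to_uri("http://città.it/attività?q=più")
```

The result can be passed to urllib.request.urlopen() without triggering UnicodeEncodeError, since every component is plain ASCII.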

Using XQUERY to retrieve attributes value

Is it possible to use XQuery to retrieve the filename attribute from the following XML? I am trying to use /preFileDoc/inpXML/@filename but it doesn't work...
<?xml version="1.0"?>
<preFileDoc xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/">
<senderId>ABC</senderId>
<receiverId>XYZ</receiverId>
<tranxCode>A001</tranxCode>
<inpXML version="1.0" encoding="UTF-8">
<soap-env:Envelope>
<soap-env:Header msgcode="SPPCONVAKT" orig-system="002FTB" refid="65355ff50a172064484bf9da64c1e245" timestamp="2009-02-11 21:00:10.741" filename="SPPCONVAKT20090128001.dat"/>
<soap-env:Body>
text1
text2
</soap-env:Body>
</soap-env:Envelope>
</inpXML>
</preFileDoc>
PS: Sometimes the filename attribute is sent as fileName in the incoming XML. I am thinking of retrieving the value from @filename OR @fileName. Can that be achieved in a single XQuery? Thanks for any advice...
I think your XPath is incomplete. The last child step / in /preFileDoc/inpXML/@filename only matches attributes of the inpXML element, not of its descendants.
One way to solve the problem would be the // step:
/preFileDoc/inpXML//@filename
Note that this would also find any attributes named filename inside soap-env:Body.
A more robust way would thus be to declare the soap-env prefix in the XQuery and address the Header element explicitly:
declare namespace soap-env="http://schemas.xmlsoap.org/soap/envelope/";
/preFileDoc/inpXML//soap-env:Header/@filename
Finally, the different capitalizations of filename can be worked around by specifying both:
declare namespace soap-env="http://schemas.xmlsoap.org/soap/envelope/";
/preFileDoc/inpXML//soap-env:Header/(@filename | @fileName)
You can take the union of multiple attributes. It is unlikely that this attribute will appear more than once with different casing, so this should always return a single node:
//soap-env:Header/@filename | //soap-env:Header/@fileName
Optionally, you could wrap it in parentheses and add [1] behind it, to always take the first result.
(//soap-env:Header/@filename | //soap-env:Header/@fileName)[1]
If you replace the union with a comma, which creates a sequence instead of a node set in document order, you can also add a default at the end. Maybe not very useful here, but perhaps in other situations:
(//soap-env:Header/@filename, //soap-env:Header/@fileName, "default.dat")[1]
HTH!
You need to take the SOAP XML namespace into account!
Since I don't know what you're using, I cannot tell you exactly how to do this, but there's the xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/" declaration on the root node, and your @filename attribute is on the <soap-env:Header .... /> node, so you need to include the XML namespace in your XQuery.
In .NET / C#, you could do it like this (using the "older" XmlDocument style which supports XPath directly):
// define test XML
string xmlContent =
@"<?xml version='1.0'?>
<preFileDoc xmlns:soap-env='http://schemas.xmlsoap.org/soap/envelope/'>
<senderId>ABC</senderId>
<receiverId>XYZ</receiverId>
<tranxCode>A001</tranxCode>
<inpXML version='1.0' encoding='UTF-8'>
<soap-env:Envelope>
<soap-env:Header msgcode='SPPCONVAKT' orig-system='002FTB' refid='65355ff50a172064484bf9da64c1e245' timestamp='2009-02-11 21:00:10.741' filename='SPPCONVAKT20090128001.dat'/>
<soap-env:Body>
text1
text2
</soap-env:Body>
</soap-env:Envelope>
</inpXML>
</preFileDoc>";
// create XmlDocument and load test data
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlContent);
// define XML namespace manager and add the SOAP namespace to it
XmlNamespaceManager mgr = new XmlNamespaceManager(doc.NameTable);
mgr.AddNamespace("soap", "http://schemas.xmlsoap.org/soap/envelope/");
// use XPath and the XML namespaces to grab the <Header> node
// the first two nodes <preFileDoc> and <inpXML> are not inside any explicit
// XML namespace
// but the next two (<Envelope> and <Header>) are in the "soap" XML namespace
XmlNode header = doc.SelectSingleNode("/preFileDoc/inpXML/soap:Envelope/soap:Header", mgr);
// read the "filename" attribute from the header node
if(header != null && header.Attributes["filename"] != null)
{
string fileName = header.Attributes["filename"].Value;
}
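For comparison, the same namespace-aware lookup, including the filename / fileName fallback and a default value, might look like this in Python with the standard library's ElementTree. This is a sketch: the function name header_filename, the "soap" prefix chosen for the namespace map, and the shortened sample document are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# The prefix is arbitrary; only the namespace URI has to match the document.
SOAP_NS = {"soap": "http://schemas.xmlsoap.org/soap/envelope/"}

def header_filename(xml_text, default=None):
    """Return the Header's filename (or fileName) attribute, else the default."""
    root = ET.fromstring(xml_text)  # root is <preFileDoc>
    header = root.find("inpXML/soap:Envelope/soap:Header", SOAP_NS)
    if header is None:
        return default
    for name in ("filename", "fileName"):  # accept either capitalization
        if name in header.attrib:
            return header.attrib[name]
    return default

doc = """<preFileDoc xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/">
  <inpXML version="1.0" encoding="UTF-8">
    <soap-env:Envelope>
      <soap-env:Header msgcode="SPPCONVAKT" fileName="SPPCONVAKT20090128001.dat"/>
      <soap-env:Body>text1</soap-env:Body>
    </soap-env:Envelope>
  </inpXML>
</preFileDoc>"""

result = header_filename(doc, default="default.dat")
```

As in the C# version, the prefix used in the query ("soap") does not need to match the prefix used in the document ("soap-env"), because matching is done on the namespace URI.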

Test existence of xml attribute in as3

What is the best method to test the existence of an attribute on an XML object in ActionScript 3?
http://martijnvanbeek.net/weblog/40/testing_the_existance_of_an_attribute_in_xml_with_as3.html suggests testing with
if ( node.@test != node.@nonexistingattribute )
and I saw comments suggesting:
if ( node.hasOwnProperty('@test')) { // attribute test exists }
But in both cases the tests are case-sensitive.
From the XML specs: "XML processors should match character encoding names in a case-insensitive way", so I presume attribute names should also be matched using a case-insensitive comparison.
Thank you
Please re-read your quote from the XML specs carefully:
XML processors should match character
encoding names in a case-insensitive
way
This is in section 4.3.3 of the specs, which describes character encoding declarations, and it refers only to the names given as the encoding value in the XML declaration (<?xml ... ?>), such as "UTF-8" or "utf-8". I see no reason for this to apply to attribute names or element names anywhere else in the document.
In fact, there is no mention of this in section 2.3 of the specs, Common Syntactic Constructs, where names and name tokens are specified. There are a few restrictions on special characters and such, but there is no restriction on upper- and lower-case letters, so names are case-sensitive.
To make your comparison case-insensitive, you will have to do it in Flash:
for each ( var attr:XML in xml.@*) {
    if (attr.name().toString().toLowerCase() == test.toLowerCase()) {
        // attribute present
    }
}
or rather:
var found:Boolean = false;
for each ( var attr:XML in xml.@*) {
    if (attr.name().toString().toLowerCase() == test.toLowerCase()) {
        found = true;
        break;
    }
}
if (found) {
    // attribute present
} else {
    // attribute not present
}
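The loop above is not specific to ActionScript. As a language-neutral illustration, here is the same case-insensitive scan written in Python with ElementTree. This is a sketch: the function name find_attribute and the sample element are made up for this example.

```python
import xml.etree.ElementTree as ET

def find_attribute(element, name):
    """Return (actual_name, value) for the first attribute matching name
    case-insensitively, or None if no attribute matches."""
    wanted = name.lower()
    for key, value in element.attrib.items():
        if key.lower() == wanted:
            return key, value
    return None

node = ET.fromstring('<child ID="0" other="x"/>')
match = find_attribute(node, "id")
```

Returning the attribute's actual name along with its value makes it easy to see which capitalization the document used.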
How about using XML's contains() method or XMLList's length() method?
e.g.
var xml:XML = <root><child id="0" /><child /></root>;
trace(xml.children().@id.length()); // test if any children have the id attribute
trace(xml.child[1].@id.length()); // test if the second node has the id attribute
trace(xml.contains(<nephew />)); // test for a nonexistent node using contains()
trace(xml.children().nephew.length()); // test for a nonexistent node using length()
