How to escape Special character in XML - asp.net

I have XML data in a string and tried to load that string into an XmlDocument using:
XmlDocument xl=new XmlDocument();
xl.LoadXml(mystring);
It fails to parse because the text inside some XML elements contains special characters, as shown below.
<ROOT>
<SUB>
<DATA>name < lastname</DATA>
<DATA>Myname > lastname</DATA>
<DATA>some special character in between text</DATA>
......
.....
</SUB>
</ROOT>
There are many <DATA> elements in my XML, and it is generated dynamically.
I have tried replacing < with &lt; and > with &gt;, but that also replaced the angle brackets of the other XML tags. How can I escape the special characters above without changing the other XML tags?

Use the following entity references in XML:
&lt; ==> less-than = <
&gt; ==> greater-than = >
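Knowing the entity references is only half of the fix, because the replacement has to be limited to the text inside the elements rather than applied to the whole string. Below is a minimal sketch of that idea in Python (not the asker's C#). It assumes the problem text only ever sits inside <DATA> elements, that the text never contains the literal closing tag </DATA>, and that it has not already been escaped. The helper name escape_data_elements is hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_data_elements(raw: str, tag: str = "DATA") -> str:
    """Escape &, < and > only in the text between <tag> and </tag>."""
    pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
    # escape() turns & into &amp;, < into &lt; and > into &gt;.
    # Assumption: the content is not escaped yet, otherwise it would be escaped twice.
    return pattern.sub(lambda m: f"<{tag}>{escape(m.group(1))}</{tag}>", raw)


raw = (
    "<ROOT><SUB>"
    "<DATA>name < lastname</DATA>"
    "<DATA>Myname > lastname</DATA>"
    "<DATA>R&D</DATA>"
    "</SUB></ROOT>"
)

fixed = escape_data_elements(raw)
root = ET.fromstring(fixed)  # parses now that the element text is escaped
values = [data.text for data in root.iter("DATA")]
```

The surrounding tags are left untouched because only the captured element text is passed through escape(). If the producer of the XML can be changed, escaping the values when the document is generated is the more reliable fix.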

Related

How can I parse a quoted string with Parsers.jl

Julia’s CSV.jl parses CSV files with quoted strings, and it uses Parsers.jl to do so. Yet the Parsers.jl documentation does not make clear how to parse a double-quoted string on its own. How would I do that? As a secondary question, what is the supported set of escape sequences that Parsers.jl uses?
You can pass arbitrary characters to indicate quotation and escape characters via Parsers.Options. For example,
using Parsers
str = "{-1}"
oq, cq, e = UInt8('{'), UInt8('}'), UInt8('\\')
res = Parsers.xparse(Int64, str; openquotechar=oq, closequotechar=cq, escapechar=e)
x, code, tlen = res.val, res.code, res.tlen
print(x)

Replace occurrences of the pipe delimiter inside an XML tag in R

I have a text file with embedded XML content. The fields are separated by "|", but some values inside the XML tags are also separated by pipes. I want to replace the pipe separators inside the XML tags with white space.
A few rows of the data:
TASK_VIEWED_FLAG|TASK_OUTCOME|CURRENT_QUEUE|QUEUE_CHANGE_TS|
TASK_XML_DATA|SCORE_XML_DATA|
"N"|" "|"4"|"."|"<?xml version="1.0" encoding="UTF-8"?>
<tasks xmlns="xyz.com/abc/wkbh/task"><task><taskxml>"|
"<?xml version="1.0" encoding="UTF-8"?>
<scores xmlns:dnr="xyz.com/abc/wkbh/score"><score><scorexml>
<Params>score_var=26.0|perc_var=76.5|prop_var=0.74</Params>
<weight>w1=3.0|w2=7.0</weight>"
The attempt below messes up the headers.
newtext <- readLines("sample.txt")
newtext <- gsub(">(.+?)|(.+?)|(.+?)<", ">\\1[[:space:]]\\2[[:space:]]\\3<", newtext)
Any hints on this would be highly appreciated.
You might be able to use the stringr package to replace pipes in all strings between > and the next < using >[^<]+<. I also added [^"] to avoid replacing pipes between tags like <taskxml>"|"<?xml
library(stringr)
x <- 'this|"<is>"|"<a>"|"<test>1|2|3</test>"'
str_replace_all( x, '>[^"][^<]+<', function(x) gsub("\\|", " ", x) )
[1] "this|\"<is>\"|\"<a>\"|\"<test>1 2 3</test>\""

Looping through XMLReader to replace special characters in data field

I have XML files that I want to load into data sets and export to a database using VB.Net. New XML files are added to this list daily, and they may contain special characters (idk why anyone would include "&" in an address entry anyway). After creating the XmlReader, what is the easiest way to replace the escape characters? What would the pseudo code look like? A StreamReader, maybe? Or does that work with XmlReader?
Here is my code right now that attempts the data set creation:
For Each file1 In Directory.GetFiles(My.Settings.Local_Meter_Path, "*BadMeter*.xml")
Dim filecreatedate As String = IO.File.GetLastWriteTime(file1)
FN = Path.GetFileName(file1).ToString()
xmlFile = XmlReader.Create(Path.Combine(My.Settings.Local_Meter_Path, FN), New XmlReaderSettings())
ds.ReadXml(xmlFile)
and this is the spot where I get the ampersand entity-name parsing error:
<Cell ss:StyleID="Default"><Data ss:Type="String">1440 COUNTY ROAD 40 X-MAS LIGHT & RV #2 CAMP HILL</Data></Cell>
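One common approach to this kind of error is to read the file as plain text first, turn every bare ampersand into &amp;, and only then hand the text to the XML parser. The following is a rough sketch of that pre-processing step in Python rather than VB.Net. The pattern and the name escape_bare_ampersands are assumptions for illustration, and the sample drops the ss: prefixes so that it parses without a namespace declaration.

```python
import re
import xml.etree.ElementTree as ET

# An ampersand that does not start a predefined entity or a numeric
# character reference is treated as a bare ampersand.
BARE_AMPERSAND = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(raw: str) -> str:
    """Replace bare '&' with '&amp;' and leave existing entities alone."""
    return BARE_AMPERSAND.sub("&amp;", raw)


# Simplified sample: namespace prefixes removed (assumption, see lead-in).
raw = "<Cell><Data>1440 COUNTY ROAD 40 X-MAS LIGHT & RV #2 CAMP HILL</Data></Cell>"

cell = ET.fromstring(escape_bare_ampersands(raw))
address = cell.find("Data").text
```

The negative lookahead keeps the step idempotent, so files that are already well-formed pass through unchanged.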

XQuery "flattening" an element

I am extracting data from an XML file and I need to extract a delimited list of sub-elements. I have the following:
for $record in //record
let $person := $record/person/names
return concat($record/@uid/string()
,",", $record/@category/string()
,",", $person/first_name
,",", $person/last_name
,",", $record/details/citizenships
,"
")
The element "citizenships" contains sub-elements called "citizenship" and as the query stands it sticks them all together in one string, e.g. "UKFrance". I need to keep them in one string but separate them, e.g. "UK|France".
Thanks in advance for any help!
fn:string-join($arg1 as xs:string*, $arg2 as xs:string) is what you're looking for here.
In your currently desired usage, that would look something like the following:
fn:string-join($record/details/citizenships/citizenship, "|")
Testing outside your document, with:
fn:string-join(("UK", "France"), "|")
...returns:
UK|France
Notably, ("UK", "France") is a sequence of strings, just as a query returning multiple citizenships would likewise be a sequence (the entries in which will be evaluated for their string value when passed to fn:string-join(), which is typed as taking a sequence of strings for its first argument).
Consider the following (simplified) query:
declare context item := document { <root>
<record uid="1">
<person>
<citizenships>
<citizenship>France</citizenship>
<citizenship>UK</citizenship>
</citizenships>
</person>
</record>
</root> };
for $record in //record
return concat(fn:string-join($record//citizenship, "|"), "
")
...and its output:
France|UK
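To show the flattening step outside of XQuery, here is a small Python sketch that builds the same delimited line with ElementTree. The element names follow the question, while the sample values (uid, category and names) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data shaped like the records in the question.
doc = ET.fromstring("""
<root>
  <record uid="1" category="A">
    <person><names><first_name>Ann</first_name><last_name>Lee</last_name></names></person>
    <details>
      <citizenships>
        <citizenship>UK</citizenship>
        <citizenship>France</citizenship>
      </citizenships>
    </details>
  </record>
</root>
""")

rows = []
for record in doc.iter("record"):
    # Join the repeated sub-elements with "|" so they stay in one field.
    citizenships = "|".join(
        c.text for c in record.findall("details/citizenships/citizenship")
    )
    rows.append(",".join([
        record.get("uid"),
        record.get("category"),
        record.findtext("person/names/first_name"),
        record.findtext("person/names/last_name"),
        citizenships,
    ]))

print("\n".join(rows))
```

The str.join call plays the role of fn:string-join here: it takes the sequence of citizenship values and a separator, and returns a single string.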

Using XQUERY to retrieve attributes value

Is it possible to use XQuery to retrieve the filename attribute from the following XML? I am trying to use /preFileDoc/inpXML/@filename but it doesn't work...
<?xml version="1.0"?>
<preFileDoc xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/">
<senderId>ABC</senderId>
<receiverId>XYZ</receiverId>
<tranxCode>A001</tranxCode>
<inpXML version="1.0" encoding="UTF-8">
<soap-env:Envelope>
<soap-env:Header msgcode="SPPCONVAKT" orig-system="002FTB" refid="65355ff50a172064484bf9da64c1e245" timestamp="2009-02-11 21:00:10.741" filename="SPPCONVAKT20090128001.dat"/>
<soap-env:Body>
text1
text2
</soap-env:Body>
</soap-env:Envelope>
</inpXML>
</preFileDoc>
PS: Sometimes the filename attribute is sent as fileName in the incoming XML, so I am thinking of retrieving the value from either @filename or @fileName. Can that be achieved in a single XQuery? Thanks for any advice...
I think your XPath is incomplete. The last child step / in /preFileDoc/inpXML/@filename only matches attributes of the inpXML element itself, not those of its descendants.
One way to solve the problem would be the // step:
/preFileDoc/inpXML//@filename
Note that this would also find any attributes named filename inside soap-env:Body.
A more robust way would thus be to declare the soap-env prefix in the XQuery and address the header explicitly:
declare namespace soap-env="http://schemas.xmlsoap.org/soap/envelope/";
/preFileDoc/inpXML//soap-env:Header/@filename
Finally, the different capitalizations of filename can be worked around by specifying both:
declare namespace soap-env="http://schemas.xmlsoap.org/soap/envelope/";
/preFileDoc/inpXML//soap-env:Header/(@filename | @fileName)
You can take the union of multiple attributes. It is unlikely that this attribute will appear multiple times with different casing, so that should always return a single node:
//soap-env:Header/@filename | //soap-env:Header/@fileName
Optionally, you could wrap it in parentheses and append [1] to always take the first result:
(//soap-env:Header/@filename | //soap-env:Header/@fileName)[1]
If you replace the union with a comma, which creates a sequence instead of a node set in document order, you can also add a default at the end. Maybe not very useful here, but perhaps in other situations:
(//soap-env:Header/@filename , //soap-env:Header/@fileName, "default.dat")[1]
HTH!
You need to respect and take into account the SOAP XML namespace!
Since I don't know what you're using, I cannot tell you how to do this - but there's the xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/" on the root node, and your @filename attribute is on the <soap-env:Header .... /> node - so you need to include the XML namespace in your XQuery.
In .NET / C#, you could do it like this (using the "older" XmlDocument style which supports XPath directly):
// define test XML
string xmlContent =
@"<?xml version='1.0'?>
<preFileDoc xmlns:soap-env='http://schemas.xmlsoap.org/soap/envelope/'>
<senderId>ABC</senderId>
<receiverId>XYZ</receiverId>
<tranxCode>A001</tranxCode>
<inpXML version='1.0' encoding='UTF-8'>
<soap-env:Envelope>
<soap-env:Header msgcode='SPPCONVAKT' orig-system='002FTB' refid='65355ff50a172064484bf9da64c1e245' timestamp='2009-02-11 21:00:10.741' filename='SPPCONVAKT20090128001.dat'/>
<soap-env:Body>
text1
text2
</soap-env:Body>
</soap-env:Envelope>
</inpXML>
</preFileDoc>";
// create XmlDocument and load test data
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlContent);
// define XML namespace manager and add the SOAP namespace to it
XmlNamespaceManager mgr = new XmlNamespaceManager(doc.NameTable);
mgr.AddNamespace("soap", "http://schemas.xmlsoap.org/soap/envelope/");
// use XPath and the XML namespaces to grab the <Header> node
// the first two nodes <preFileDoc> and <inpXML> are not inside any explicit
// XML namespace
// but the next two (<Envelope> and <Header>) are in the "soap" XML namespace
XmlNode header = doc.SelectSingleNode("/preFileDoc/inpXML/soap:Envelope/soap:Header", mgr);
// read the "filename" attribute from the header node
if(header != null && header.Attributes["filename"] != null)
{
string fileName = header.Attributes["filename"].Value;
}
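The namespace-aware lookup and the filename / fileName fallback can also be combined in one small function. Here is a sketch using Python's ElementTree, assuming the document shape from the question. The function name header_filename and the trimmed sample are illustrative only, and the prefix "soap" is a local alias just as in the C# example above.

```python
import xml.etree.ElementTree as ET

# Local prefix-to-URI mapping; the prefix does not have to match the document's.
NS = {"soap": "http://schemas.xmlsoap.org/soap/envelope/"}

# Trimmed-down version of the question's XML (assumption: same nesting).
xml_content = """
<preFileDoc xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/">
  <inpXML version="1.0" encoding="UTF-8">
    <soap-env:Envelope>
      <soap-env:Header msgcode="SPPCONVAKT" filename="SPPCONVAKT20090128001.dat"/>
    </soap-env:Envelope>
  </inpXML>
</preFileDoc>
"""


def header_filename(xml_text, default=None):
    """Return the Header's filename (or fileName) attribute, else default."""
    root = ET.fromstring(xml_text)
    header = root.find("inpXML/soap:Envelope/soap:Header", NS)
    if header is None:
        return default
    # Try both capitalizations, in order of preference.
    for name in ("filename", "fileName"):
        if name in header.attrib:
            return header.attrib[name]
    return default


print(header_filename(xml_content))
```

Passing a default mirrors the sequence trick with "default.dat": the first available value wins, and the fallback is used only when neither attribute is present.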
