From the XML below that begins as:
<?xml version="1.0" encoding="UTF-8"?><searchRetrieveResponse>
<version>1.2</version>
<numberOfRecords>1</numberOfRecords>
<records>
<record>
<recordSchema>marcxml</recordSchema>
<recordPacking>xml</recordPacking>
<recordData>
<record>
<leader>01448cam a2200445Ia 4500</leader>
<controlfield tag="001">9910650701858</controlfield>
<controlfield tag="005">20181227054218.2</controlfield>
<controlfield tag="008">930525s1941 nyu b 001 0 eng d</controlfield>
<datafield tag="035" ind1=" " ind2=" ">
<subfield code="a">(OCoLC)28157672</subfield>
</datafield>
<datafield tag="035" ind1=" " ind2=" ">
<subfield code="a">(OCoLC)ocm28157672</subfield>
</datafield>
<datafield tag="035" ind1=" " ind2=" ">
<subfield code="a">(EXLNZ-01ALLIANCE_NETWORK)99153881770001451</subfield>
</datafield>
<datafield tag="040" ind1=" " ind2=" ">
<subfield code="a">UTS</subfield>
<subfield code="b">eng</subfield>
<subfield code="c">UTS</subfield>
I need to select only the text node in /searchRetrieveResponse/records/record/recordData/record/datafield[@tag="035"]/subfield[@code="a"] that contains (EXLNZ-01ALLIANCE_NETWORK) using xmlstarlet (XPath 1.0), so the desired output is (EXLNZ-01ALLIANCE_NETWORK)99153881770001451
I have attempted many variations of xmlstarlet sel -T -t -m '/searchRetrieveResponse/records/record/recordData/record/datafield[@tag="035"]/subfield[@code="a"][text()[contains(.,'ALLIANCE_NETWORK')]]' -v '.' but it keeps returning all the 035/subfield[@code="a"] values rather than just the one I want. What am I doing wrong? Thanks
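Figured it out: the contains() filter wasn't set up properly. The search string was wrapped in single quotes inside an argument that was already single-quoted for the shell, so the shell removed those quotes and XPath saw a bare element name, ALLIANCE_NETWORK. That evaluates to an empty string, and contains() with an empty string is always true, so every subfield matched. I'm posting only because I found matching the node awkward.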
xmlstarlet sel -T -t -m '/searchRetrieveResponse/records/record/recordData/record/datafield[@tag="035"]/subfield[@code="a"][contains(text(), "ALLIANCE_NETWORK")]' -v '.'
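For cross-checking the xmlstarlet result, the same selection can be sketched in Python with the standard library. This is only an illustration, not part of the original answer: the sample is a trimmed copy of the record above, and the function name alliance_ids is made up for the example. ElementTree's XPath subset has no contains(), so the substring test is done in Python.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record from the question (only the 035 fields are kept).
SAMPLE = """<searchRetrieveResponse><records><record><recordData><record>
<datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)28157672</subfield></datafield>
<datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)ocm28157672</subfield></datafield>
<datafield tag="035" ind1=" " ind2=" "><subfield code="a">(EXLNZ-01ALLIANCE_NETWORK)99153881770001451</subfield></datafield>
</record></recordData></record></records></searchRetrieveResponse>"""


def alliance_ids(xml_text, needle="ALLIANCE_NETWORK"):
    """Return the text of every 035 $a subfield whose text contains `needle`."""
    root = ET.fromstring(xml_text)
    # The root element is searchRetrieveResponse, so the path starts below it.
    path = "./records/record/recordData/record/datafield[@tag='035']/subfield[@code='a']"
    return [sf.text for sf in root.findall(path) if sf.text and needle in sf.text]


matches = alliance_ids(SAMPLE)
print(matches)
```

Only the third 035 field should come back, which is the behaviour the corrected contains() predicate gives in xmlstarlet.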
I would like to select and change a value in an XML file. I'm trying to use xmlstarlet for this.
I have this file
<?xml version='1.0' encoding='UTF-8'?>
<DeviceDescription xmlns="http://www.3s-software.com/schemas/DeviceDescription-1.0.xsd">
<House>
<Id>
<Number>1</Number>
</Id>
</House>
<Car>
<Id>
<Number>2</Number>
</Id>
</Car>
</DeviceDescription>
My problem is the xmlns= field which xmlstarlet is picky about. Without this field I can use
xmlstarlet sel -t -v '/DeviceDescription/House/Id/Number' /tmp/x.xml
I found that I can use a default namespace like this, but that returns both Ids
xmlstarlet sel -t -m "//_:Id" -v '_:Number' /tmp/x.xml
How do I specify a full path?
To only match the House id, add it to the -m argument:
xml sel -t -m '//_:House/_:Id' -v '_:Number' /tmp/x.xml
If you want to use the namespace, specify it with -N, e.g.:
xml sel -N ns="http://www.3s-software.com/schemas/DeviceDescription-1.0.xsd" \
-t -v 'ns:DeviceDescription/ns:House/ns:Id/ns:Number' /tmp/x.xml
So to update the value:
xml ed -N ns="http://www.3s-software.com/schemas/DeviceDescription-1.0.xsd" \
-u 'ns:DeviceDescription/ns:House/ns:Id/ns:Number' -v 3 /tmp/x.xml
Output:
<?xml version="1.0" encoding="UTF-8"?>
<DeviceDescription xmlns="http://www.3s-software.com/schemas/DeviceDescription-1.0.xsd">
<House>
<Id>
<Number>3</Number>
</Id>
</House>
<Car>
<Id>
<Number>2</Number>
</Id>
</Car>
</DeviceDescription>
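The same namespace-aware update can be sketched in Python's standard library, which may help when checking what the xmlstarlet command changed. This is an illustrative sketch only: the prefix ns is an arbitrary choice (as it is with -N above), and the sample is a condensed copy of the file from the question.

```python
import xml.etree.ElementTree as ET

# Map an arbitrary prefix to the document's default namespace.
NS = {"ns": "http://www.3s-software.com/schemas/DeviceDescription-1.0.xsd"}

# Condensed copy of the file from the question.
SAMPLE = """<DeviceDescription xmlns="http://www.3s-software.com/schemas/DeviceDescription-1.0.xsd">
<House><Id><Number>1</Number></Id></House>
<Car><Id><Number>2</Number></Id></Car>
</DeviceDescription>"""

root = ET.fromstring(SAMPLE)

# Paths are relative to the root element (DeviceDescription).
house_number = root.find("ns:House/ns:Id/ns:Number", NS)
house_number.text = "3"  # update only the House id

car_number = root.find("ns:Car/ns:Id/ns:Number", NS)
print(house_number.text, car_number.text)
```

As with the full path in the xmlstarlet answer, naming House explicitly is what keeps the Car id untouched.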
I have thousands of records similar to the one below
<holding>
<holding_id>2225031160001858</holding_id>
<record>
<leader>00210cx a22200085 454500</leader>
<controlfield tag="001">h38165-01alliance_ohsu</controlfield>
<controlfield tag="004">b10145746-01alliance_ohsu</controlfield>
<controlfield tag="005">20200417125900.0</controlfield>
<controlfield tag="008">2004170u\\\\0\\\0001aaund0999999</controlfield>
<datafield ind1="2" ind2=" " tag="852">
<subfield code="b">OHSUMAIN</subfield>
<subfield code="c">oldstorjrl</subfield>
</datafield>
</record>
</holding>
I need to change datafield @ind1 to " " where @tag="852" AND no subfield with @code="h" exists. In this example, @code="b" and @code="c" exist, but @code="h" does not, so I'd want to modify this record.
I can think of ways to accomplish what I need using program logic, but can I use xmlstarlet directly to select the nodes I want based on the absence of a subnode?
Desired output from this record would be
<holding>
<holding_id>2225031160001858</holding_id>
<record>
<leader>00210cx a22200085 454500</leader>
<controlfield tag="001">h38165-01alliance_ohsu</controlfield>
<controlfield tag="004">b10145746-01alliance_ohsu</controlfield>
<controlfield tag="005">20200417125900.0</controlfield>
<controlfield tag="008">2004170u\\\\0\\\0001aaund0999999</controlfield>
<datafield ind1=" " ind2=" " tag="852">
<subfield code="b">OHSUMAIN</subfield>
<subfield code="c">oldstorjrl</subfield>
</datafield>
</record>
</holding>
Not sure how I missed this, but it turned out to be straightforward
xmlstarlet ed -u '/holding/record/datafield[@tag="852"][not(subfield[@code="h"])]/@ind1' -v ' '
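The "absent subnode" condition in the command above can also be sketched in Python, for anyone who wants to verify the edit on a few records first. This is an assumption-laden illustration: the function name blank_ind1 is invented, and the second datafield in the sample (the one with a code="h" subfield) is hypothetical, added only to show a record that must stay unchanged.

```python
import xml.etree.ElementTree as ET

# First datafield is from the question; the second one (with code="h")
# is a hypothetical addition so the "leave it alone" case is visible.
SAMPLE = """<holding><record>
<datafield ind1="2" ind2=" " tag="852"><subfield code="b">OHSUMAIN</subfield><subfield code="c">oldstorjrl</subfield></datafield>
<datafield ind1="2" ind2=" " tag="852"><subfield code="h">QH 1</subfield></datafield>
</record></holding>"""


def blank_ind1(root):
    """Set ind1 to a single space on every 852 field lacking a code="h" subfield."""
    changed = 0
    for df in root.findall("./record/datafield[@tag='852']"):
        # find() returns None when no matching child exists, i.e. not(subfield[@code="h"]).
        if df.find("subfield[@code='h']") is None:
            df.set("ind1", " ")
            changed += 1
    return changed


root = ET.fromstring(SAMPLE)
changed = blank_ind1(root)
ind1_values = [df.get("ind1") for df in root.iter("datafield")]
print(changed, ind1_values)
```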
This XPath expression should select the correct target node:
"//datafield[@ind1][not(subfield[@code="h"])]"
I am trying to decode a value from XML; please find a sample below. The real input will contain multiple blocks like this one. I need to find the base64 tag, decode its contents, and generate the same output with the decoded value. I am just starting on the script.
<SOAP-ENV:Body>
<log-entry serial="abcde" domain="abc">
<date>Tue Oct 17 2017</date>
<time utc="abcde">14:14:30</time>
<type>all</type>
<class>ccccc</class>
<object>Web_Token</object>
<level num="5">notice</level>
<transaction>xxxxx</transaction>
<global-transaction-id>xxxxx</global-transaction-id>
<client>X.X.X.X</client>
<message>
<base64>**encodeddata**</base64>
</message>
</log-entry>
</SOAP-ENV:Body>
I need this output:
<SOAP-ENV:Body>
<log-entry serial="abcde" domain="abc">
<date>Tue Oct 17 2017</date>
<time utc="abcde">14:14:30</time>
<type>all</type>
<class>ccccc</class>
<object>Web_Token</object>
<level num="5">notice</level>
<transaction>xxxxx</transaction>
<global-transaction-id>xxxxx</global-transaction-id>
<client>X.X.X.X</client>
<message>
<base64>**decodeddata**</base64>
</message>
</log-entry>
</SOAP-ENV:Body>
I am working on this iteratively and started with decoding the value:
sed -n 's/<base64>\(.*\)<\/base64>/\1/p' log.txt | base64 --decode
thanks.
Try this :
xmllint --xpath '//message/base64/text()' file.xml 2>/dev/null |
base64 -d -
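The command above prints the decoded text only. If the surrounding XML has to be kept, as in the desired output, one possible approach is a small Python filter that rewrites each base64 element in place. This is a sketch under assumptions: the sample payload is invented ("hello world"), the function name is hypothetical, and a regex is used instead of an XML parser because the fragment in the question uses the SOAP-ENV prefix without declaring it, which a strict parser would reject.

```python
import base64
import re
from xml.sax.saxutils import escape

# Hypothetical payload: "aGVsbG8gd29ybGQ=" is base64 for "hello world".
SAMPLE = "<message>\n<base64>aGVsbG8gd29ybGQ=</base64>\n</message>"


def decode_base64_elements(text):
    """Replace the content of every <base64> element with its decoded text."""
    def _decode(match):
        raw = base64.b64decode(match.group(1))
        # Escape &, < and > so the decoded text cannot break the markup.
        return "<base64>" + escape(raw.decode("utf-8", errors="replace")) + "</base64>"

    return re.sub(r"<base64>([^<]*)</base64>", _decode, text)


result = decode_base64_elements(SAMPLE)
print(result)
```

Because every match is handled independently, this works for input with multiple log-entry blocks as well.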
Hi, I have a big log file from which I am trying to extract the XML data embedded in it.
The log file resembles this:
2016/01/01 bladh bqskjdqskldjqsdlqskdjqlskdj dazihzmkldjkdjqslkjd
2016/01/01: qsdhqsdlkqsmdjqsldjqslkdjqlskdjqslkdjqslkdjqskdjqsd
2016/01/01: qsjdqmlskdmlqskdmcxxxx [qskjd][qsdjqslkdj] Payload :[<LOG><a>a</a>
<b>b</b>
<c>c</c>
<id>XXXXX</id>
<d>d</d>
</LOG>]]
2016/01/01 bladh bqskjdqskldjqsdlqskdjqlskdj dazihzmkldjkdjqslkjd
2016/01/01: qsdhqsdlkqsmdjqsldjqslkdjqlskdjqslkdjqslkdjqskdjqsd
2016/01/01: qsjdqmlskdmlqskdmcxxxx [qskjd][qsdjqslkdj] Payload :[<LOG> <a>a</a>
<b>b</b>
<c>c</c>
<id>YYYYY</id>
<d>d</d>
</LOG>]]
qskdmqlskdqlsdqlskdqlsdk
qsdlkqsdlkqsdmlkqsdlk
For now I am using
sed -n '/<LOG/{:start /\/LOG/!{N;b start};/XXXXX/p}' logFile
and I am getting this
2016/01/01: qsjdqmlskdmlqskdmcxxxx [qskjd][qsdjqslkdj] Payload :[<LOG><a>a</a>
<b>b</b>
<c>c</c>
<id>XXXXX</id>
<d>d</d>
</LOG>]]
I would like to retrieve the whole XML and get :
<LOG>
<a>a</a>
<b>b</b>
<c>c</c>
<id>XXXXX</id>
<d>d</d>
</LOG>
Thanks in advance
Solution in TXR:
@(repeat)
@ (skip)Payload :[<@tag>@preamble
@ (collect)
@middle
@ (last)
</@tag>]]
@ (end)
@ (output)
<@tag>
@(trim-str preamble)
@ (repeat)
@middle
@ (end)
</@tag>
@ (end)
@(end)
Run:
$ txr extract.txr data
<LOG>
<a>a</a>
<b>b</b>
<c>c</c>
<id>XXXXX</id>
<d>d</d>
</LOG>
<LOG>
<a>a</a>
<b>b</b>
<c>c</c>
<id>YYYYY</id>
<d>d</d>
</LOG>
Try this:
sed -n '/<LOG/{:a;/<\/LOG/!{N;ba};s/.*\(<LOG>\)\(.*XXXXX.*<\/LOG>\).*/\1\n\2/p}' logFile
It should do the job, but keep in mind that sed is not the right tool for parsing XML. When you have to parse valid XML files, you should consider using xmlstarlet or xmllint.
This might work for you (GNU sed):
sed -nr '/<LOG>/,/<\/LOG>/{s/.*(<LOG>)\s*/\1\n/;s/(<\/LOG>).*/\1/;p}' file
Use sed's grep-like option (-n) to inhibit printing unless explicitly requested, and use the range feature /.../,/.../ to select each block, trimming the text before <LOG> and after </LOG>.
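For comparison with the sed and TXR answers, the extraction can also be sketched in Python. This is an illustration only: the sample is a shortened copy of the log from the question, the function name extract_logs is made up, and like the sed versions it treats the payload as text rather than parsing it as XML.

```python
import re

# Shortened copy of the log from the question.
SAMPLE = """2016/01/01 bladh
2016/01/01: xxxx [a][b] Payload :[<LOG><a>a</a>
<b>b</b>
<id>XXXXX</id>
</LOG>]]
2016/01/01: xxxx [a][b] Payload :[<LOG> <a>a</a>
<b>b</b>
<id>YYYYY</id>
</LOG>]]
trailing noise
"""


def extract_logs(text, wanted_id=None):
    """Return every <LOG>...</LOG> block, optionally only those with a given id."""
    # Non-greedy match with DOTALL so one block may span several lines.
    blocks = re.findall(r"<LOG>.*?</LOG>", text, flags=re.DOTALL)
    if wanted_id is not None:
        blocks = [b for b in blocks if "<id>" + wanted_id + "</id>" in b]
    # Put the opening tag on its own line, as in the desired output.
    return [re.sub(r"^<LOG>\s*", "<LOG>\n", b) for b in blocks]


blocks = extract_logs(SAMPLE, "XXXXX")
print(blocks[0])
```

Calling extract_logs(SAMPLE) without an id returns both blocks, similar to the TXR run shown above.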
I am passing an XML document from Java to Flex using RemoteObject.
My XML is as follows:
"
<root
<dept ID="1" Name="RND"
<Emp ID="1" Name="Aj"/>
</dept>
<dept ID="2" Name="ENG">
<Emp ID="1" Name="Aj"/>
</dept>
<dept ID="3" Name="MECH">
<Emp ID="1" Name="Aj"/>
</dept>
</root>
"
In Flex I am trying to access it using the code below:
treeData = event.result as XML;
deptTree.dataProvider = treeData;
When I try to access the result object, I get the exception below:
"
[RPC Fault faultString="org.w3c.dom.DOMException : INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified. " faultCode="Server.Processing" faultDetail="null"]
at mx.rpc::AbstractInvoker/http://www.adobe.com/2006/flex/mx/internal::faultHandler()
at mx.rpc::Responder/fault()
at mx.rpc::AsyncRequest/fault()
at NetConnectionMessageResponder/statusHandler()
at mx.messaging::MessageResponder/status()
"
Please help me to resolve this issue.
Thanks in advance.
Aj
<root
<dept ID="1" Name="RND"
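Both of these opening tags are missing their closing ">". Close the root tag and the first dept tag:
<root>
<dept ID="1" Name="RND">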
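A quick way to catch this kind of error before the document ever reaches Flex is to run it through any XML parser. The sketch below uses Python's standard library purely as an illustration (the question itself is Java/Flex, and the function name is_well_formed is made up); the two strings are shortened versions of the broken and corrected documents.

```python
import xml.etree.ElementTree as ET

# Shortened versions of the document from the question.
BROKEN = '<root\n<dept ID="1" Name="RND"\n<Emp ID="1" Name="Aj"/>\n</dept>\n</root>'
FIXED = '<root>\n<dept ID="1" Name="RND">\n<Emp ID="1" Name="Aj"/>\n</dept>\n</root>'


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(BROKEN), is_well_formed(FIXED))
```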