Extracting node values with xmlstarlet

I have this XML document. How can I extract the values of all the nodes, one by one, using XMLStarlet in a shell script?
<service>
<imageScroll>
<imageName>Photo_Gallerie_1.jpg</imageName>
</imageScroll>
<imageScroll>
<imageName>Photo_Gallerie_2.jpg</imageName>
</imageScroll>
<imageScroll>
<imageName>Photo_Gallerie_3.jpg</imageName>
</imageScroll>
</service>

xmlstarlet sel -t -m "//imageName" -v . -n your.xml
output:
Photo_Gallerie_1.jpg
Photo_Gallerie_2.jpg
Photo_Gallerie_3.jpg
Is that what you needed?
How it works:
sel: select mode
-t: starts an output template (this is pretty much required)
-m "//imageName": for each node matching this XPath expression; the double slash means the node can be anywhere in the tree, and imageName is the name of the node you want
-v .: prints the value of an expression relative to the current match; the . is the current element in the iteration (you could put the name of a node there, but it's generally easier this way)
-n: prints a newline after every value you match
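To handle the values one by one in a shell script, as the question asks, the output can be read line by line. This is a minimal sketch, assuming xmlstarlet is on the PATH and the values contain no newlines. The function name process_image_names and the "Processing:" message are made up for illustration.

```shell
# Hypothetical helper: do something once per <imageName> value.
# Assumes xmlstarlet is installed and values contain no newlines.
process_image_names() {
  xml_file=$1
  xmlstarlet sel -t -m '//imageName' -v . -n "$xml_file" |
  while IFS= read -r name; do
    # Replace this printf with whatever needs doing per value.
    printf 'Processing: %s\n' "$name"
  done
}
```

Call it as process_image_names your.xml. The loop runs in a pipeline, so variables set inside it may not be visible after the loop ends.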

This is the solution I found, and it did the job perfectly:
imagescroller=$(xmlstarlet sel -t -m "//root/services/service/imageScroll[rank_of_the_desired_item]" -v imageName -n myfile.xml)
Sorry for the late reply.

Related

Is there any way to search for a value from the root node with xmlstarlet and return its XPath?

As the title says.
I am researching xmlstarlet, but there seems to be no way to search for a value starting from the root node and then return its XPath.
Any ideas?
Thanks, Peter
Perhaps you can make use of this example which handles elements and
attributes but not other node types. Assuming a POSIX shell,
indentation and line continuation characters added for readability.
xmlstarlet select --text -t \
  -m '//xsl:*/@test[contains(.,"position")]' \
    -m 'ancestor-or-self::*' \
      --var pos='1+count(preceding-sibling::*[name() = name(current())])' \
      -v 'concat("/",name(),"[",$pos,"]")' \
    -b \
    --if 'count(. | ../@*) = count(../@*)' \
      -v 'concat("/@",name())' \
    -b \
  -n \
  /usr/share/*/xslt/docbook/common/db-common.xsl
Where:
the outer -m (--match) option specifies the target, with optional search conditions given as predicates
the inner -m (--match) builds the XPath of elements from root to target, calculating position by counting siblings
for an attribute node target, which doesn't match ancestor-or-self::*, the -i (--if) clause adjusts the XPath
a -C (--comp) option before -t lists the generated XSLT code
Output:
/xsl:stylesheet[1]/xsl:template[2]/xsl:for-each[1]/xsl:if[1]/@test
/xsl:stylesheet[1]/xsl:template[3]/xsl:for-each[1]/xsl:if[1]/@test
/xsl:stylesheet[1]/xsl:template[7]/xsl:for-each[1]/xsl:choose[1]/xsl:when[1]/@test
/xsl:stylesheet[1]/xsl:template[7]/xsl:for-each[1]/xsl:choose[1]/xsl:when[3]/@test
Output from outer -m '//xsl:param[string(@select)]':
/xsl:stylesheet[1]/xsl:template[1]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[4]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[5]/xsl:param[1]
/xsl:stylesheet[1]/xsl:template[5]/xsl:param[2]
/xsl:stylesheet[1]/xsl:template[6]/xsl:param[1]
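The same technique can be pointed at a text value, which is closer to what the question asks. This is only a sketch: the function name xpath_of_text is invented here, it handles elements only (the attribute branch is left out), and it assumes the searched value contains no single quote because the value is pasted straight into the XPath expression.

```shell
# Hypothetical helper: print the XPath of every element whose text equals $1.
# $1: text to search for (must not contain a single quote)
# $2: XML file
xpath_of_text() {
  xmlstarlet select --text -t \
    -m "//*[text()='$1']" \
      -m 'ancestor-or-self::*' \
        --var pos='1+count(preceding-sibling::*[name() = name(current())])' \
        -v 'concat("/",name(),"[",$pos,"]")' \
      -b \
    -n \
    "$2"
}
```

For example, xpath_of_text 'Photo_Gallerie_2.jpg' your.xml prints one path per matching element, built from the root down to the match.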

Using grep to find a binary pattern in a file

Previously, I was able to find binary patterns in files using grep with
grep -a -b -o -P '\x01\x02\x03' <file>
By find I mean I was able to get the byte position of the pattern in the file. But when I tried doing this with the latest version of grep (v2.16) it no longer worked.
Specifically, I can manually verify that the pattern is present in the file but grep does not find it. Strangely, some patterns are found correctly but not others. For example, in a test file
000102030405060708090a0b0c0e0f
'\x01\x02' is found but not '\x07\x08'.
Any help in clarifying this behavior is highly appreciated.
Update: The above example does not show the described behavior. Here are the commands that exhibit the problem
printf `for (( x=0; x<256; x++ )); do printf "\x5cx%02x" $x; done` > test
for (( x=$((0x70)); x<$((0x8f)); x++ )); do
p=`printf "\'\x5cx%02x\x5cx%02x\'" $x $((x+1))`
echo -n $p
echo $p test | xargs grep -c -a -o -b -P | cut -d: -f1
done
The first line creates a file with all possible bytes from 0x00 to 0xff in a sequence. The second line counts the number of occurrences of pairs of consecutive byte values in the range 0x70 to 0x8f. The output I get is
'\x70\x71'1
'\x71\x72'1
'\x72\x73'1
'\x73\x74'1
'\x74\x75'1
'\x75\x76'1
'\x76\x77'1
'\x77\x78'1
'\x78\x79'1
'\x79\x7a'1
'\x7a\x7b'1
'\x7b\x7c'1
'\x7c\x7d'1
'\x7d\x7e'1
'\x7e\x7f'1
'\x7f\x80'0
'\x80\x81'0
'\x81\x82'0
'\x82\x83'0
'\x83\x84'0
'\x84\x85'0
'\x85\x86'0
'\x86\x87'0
'\x87\x88'0
'\x88\x89'0
'\x89\x8a'0
'\x8a\x8b'0
'\x8b\x8c'0
'\x8c\x8d'0
'\x8d\x8e'0
'\x8e\x8f'0
Update: The same pattern occurs for single-byte patterns -- no bytes with value greater than 0x7f are found.
The results may depend on your current locale. To avoid this, use:
env LC_ALL=C grep -P "<binary pattern>" <file>
where env LC_ALL=C overrides your current locale so that grep matches individual bytes instead of multibyte characters. Otherwise, patterns with non-ASCII "characters" such as \xff will not match.
For example, this fails to match because (at least in my case) the environment has LANG=en_US.UTF-8:
$ printf '\x41\xfe\n' | grep -P '\xfe'
whereas this succeeds:
$ printf '\x41\xfe\n' | env LC_ALL=C grep -P '\xfe'
A?
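To get back the byte positions the question was after, the locale override can be combined with the -a -b -o options from the original command. This is a small sketch, assuming GNU grep built with -P support; the function name find_byte_offsets is made up here.

```shell
# Hypothetical helper: print the byte offset of each match of a PCRE
# byte pattern. Forcing the C locale makes grep compare raw bytes.
# $1: pattern such as '\x80\x81'
# $2: file to search
find_byte_offsets() {
  LC_ALL=C grep -a -b -o -P -- "$1" "$2" | LC_ALL=C cut -d: -f1
}
```

Called as find_byte_offsets '\x80\x81' test, it prints one offset per line. Offsets count from 0.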

Check if a string in one file exists in another in unix

I have a file that contains package names and version numbers. The contents of the first file look like this:
File1:
<Line contains the name of product1>
package_name0_9_8 >= 1.2.3x-4.5.6
package_name0_9_8-32bit >= 3.6.1g-3.5.1
package_name0_9_8-xx >= 6.3.2v-3.0.4
<Line contains the name of product2>
anotherpackage_name0_9_8 >= 3.5.6u-3.6.5
And:
File2.xml:
<package name="package_name0_9_8" version="1.2.3x-4.4.4"/>
<package name="package_name0_9_8-32bit" version="3.6.1g-3.4.0"/>
.
.
Is there a way to check whether each package_name in File1 is also present in File2, and to compare the version given for that package in File1 with the corresponding version in File2?
Frankly, I am pretty weak at combining grep and awk commands and choosing the options to use here. Please help.
for a in $(sed -n '/>=/p' File1.txt | grep -o '^[^ ]*'); do for b in $(sed -n "/^$a /{s/.*>=\(.*\)$/\1/p}" File1.txt); do ((! $(grep -c "$a.*$b" File2.txt))) && (echo "$a $b" >> missing_pkgs.txt); done; done;
This is a quick one-liner; you could lay it out more readably.
It works as a nested for loop that grabs the two pieces into separate variables (you could do that with read in a single loop if you want) and then counts the occurrences in the second file with grep. Whenever the count is zero, the ! negates it, so the arithmetic test (( )) evaluates to true and the missing package is appended to missing_pkgs.txt.
Here is another quick one-liner that does the same thing more efficiently, with one loop and the variables loaded via read:
while read each; do read a b < <(echo $each) && ((! $(grep -c "$a.*$b" File2.txt))) && (echo "$a $b" >> missing_pkgs.txt); done < <(awk '/>=/{ print $1" "$3 }' File1.txt)
Simplified further:
while read a b; do ((! $(grep -c "$a.*$b" File2.txt))) && (echo "$a $b" >> missing_pkgs.txt); done < <(awk '/>=/{ print $1" "$3 }' File1.txt)
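The same idea can be spelled out over several lines. This is only a sketch: the function name check_packages is made up, and it assumes the name="..." version="..." attribute layout shown in the question's File2. Like the one-liners above, it tests for an exact name and version pair and does not compare version order. Unlike them, it uses a fixed-string match with the surrounding quotes, so package_name0_9_8 does not also match the package_name0_9_8-32bit line.

```shell
# Hypothetical helper: report whether each "name >= version" pair in the
# requirements file appears verbatim in the XML package list.
# $1: requirements file (File1)
# $2: XML package list (File2)
check_packages() {
  awk '$2 == ">=" { print $1, $3 }' "$1" |
  while read -r name version; do
    # -F: fixed string, so dots in the version are not regex wildcards
    if grep -qF "name=\"$name\" version=\"$version\"" "$2"; then
      printf '%s %s present\n' "$name" "$version"
    else
      printf '%s %s missing\n' "$name" "$version"
    fi
  done
}
```

Usage: check_packages File1 File2.xml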
sed -n 's².*²s#<package name="\\(&"/>#\\1 Present#p²;s/ *>= */\\)" *version="/p' File1 > /tmp/File1.sed
sed -n -f /tmp/File1.sed File2
rm /tmp/File1.sed
This is not a single instruction as awk could do, but it does the job (POSIX version, so use --posix with GNU sed).
You can change the output message, which is the \\1 Present text, where \\1 will be the package name (with a few modifications, the version could also be used).
It looks like you already got a much shorter solution in a format closer to what you desired. However, since I asked if a Python solution would work, and you said yes, check out the code here:
http://pastebin.com/F5LYrmea
(I haven't debugged it more than a little, but it seems to work on at least a little more than your example files. I released the code to the public domain. CC-BY-SA isn't a software license, according to the makers of CC; so, that's why I didn't post it here, as posting it here would give it that license. Plus, you get syntax highlighting specific to Python at the link provided.)
Basically, it's a lot of complicated text parsing. Not much of an algorithm to explain. It gets the contents of both files, strips out the packages, their versions and the operands (puts all those in a dictionary for use later), and loops through lines of the other file and compares versions; then it tells you which ones match and which ones don't.

Unix command parameter and option order

I am new to Unix.
I am wondering whether the order of the options and parameters passed to a command matters.
For instance:
$ grep -i -P 'wonderful' filename
$ grep filename -i 'wonderful' -P
Do they mean exactly the same thing?
And if they don't mean the same thing: in a Unix pipe, the output of the first command is passed to the second command as input, so at which position among the second command's parameters is that output placed?
For example:
$ echo "This is a wonderful day" | grep -P -i 'Wonderful'
Is this equivalent to:
$ grep -P -i 'Wonderful' $(echo "This is a wonderful day")
or to some other order?
In general it doesn't matter, but for some commands the order is important. There is no general rule, so read the manual for each command carefully.
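On the pipe part of the question: a pipe does not add anything to the second command's parameters. It connects the first command's standard output to the second command's standard input. The two forms from the question can be compared directly. This is a small sketch (with -P dropped, since it makes no difference here), and the variable names are only for illustration.

```shell
# Pipe: the text arrives on grep's standard input, so grep searches the text.
piped_match=$(echo "This is a wonderful day" | grep -i 'Wonderful')

# Command substitution: the words become arguments, so grep treats
# "This", "is", "a", ... as names of files to open.
args_status=0
grep -i 'Wonderful' $(echo "This is a wonderful day") >/dev/null 2>&1 || args_status=$?

printf 'pipe matched: %s\n' "$piped_match"
printf 'exit status with arguments: %s\n' "$args_status"
```

Unless files with those names happen to exist in the current directory, the second grep fails with a non-zero exit status because it cannot open them.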

unix sort descending order

I want to sort a tab-delimited file in descending order according to the 5th field of the records.
I tried
sort -r -k5n filename
But it didn't work.
The presence of the n option attached to the -k5 causes the global -r option to be ignored for that field. You have to specify both n and r at the same level (globally or locally).
sort -t $'\t' -k5,5rn
or
sort -rn -t $'\t' -k5,5
If you want to sort only on the 5th field, use -k5,5.
Also, use the -t command-line switch to set the delimiter to a tab. Try this:
sort -k5,5 -r -n -t \t filename
or, if the above doesn't work (in a POSIX shell an unquoted \t reaches sort as the letter t, not a tab), this:
sort -k5,5 -r -n -t $'\t' filename
The man page for sort states:
-t, --field-separator=SEP
use SEP instead of non-blank to blank transition
Finally, this SO question Unix Sort with Tab Delimiter might be helpful.
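Putting the pieces together, here is a sketch that avoids the bash-only $'\t' quoting by producing the tab with printf. The function name sort_by_field5_desc is invented for illustration.

```shell
# Hypothetical helper: sort tab-delimited input by field 5 only,
# numerically, in descending order. Reads the named files, or stdin.
sort_by_field5_desc() {
  tab=$(printf '\t')   # a literal tab character, portable across POSIX shells
  sort -t "$tab" -k5,5nr "$@"
}
```

Usage: sort_by_field5_desc filename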
To list files by size in ascending order:
find ./ -size +1000M -exec ls -tlrh {} \; |awk -F" " '{print $5,$9}' | sort -n

Resources