I'm trying to add a node with a namespace and an attribute to an xml, but it fails if I try to do it as multiple commands in one execution of xmlstarlet:
<?xml version="1.0"?>
<levela xmlns:xi="http://www.w3.org/2001/XInclude">
<levelb>
</levelb>
</levela>
xmlstarlet ed -L -s /levela/levelb -t elem -n xi:input -i //xi:input -t attr -n "href" -v "aHref" file.xml
I'm trying to get:
<?xml version="1.0"?>
<levela xmlns:xi="http://www.w3.org/2001/XInclude">
<levelb>
<xi:input href="aHref"/>
</levelb>
</levela>
But the attribute isn't added. So I get:
<?xml version="1.0"?>
<levela xmlns:xi="http://www.w3.org/2001/XInclude">
<levelb>
<xi:input/>
</levelb>
</levela>
It works if I run it as two executions like this:
xmlstarlet ed -L -s /levela/levelb -t elem -n xi:input file.xml
xmlstarlet ed -L -i //xi:input -t attr -n "href" -v "aHref" file.xml
It also works if I add a tag without a namespace e.g:
xmlstarlet ed -L -s /levela/levelb -t elem -n levelc -i //levelc -t attr -n "href" -v "aHref" file.xml
<?xml version="1.0"?>
<levela xmlns:xi="http://www.w3.org/2001/XInclude">
<levelb>
<levelc href="aHref"/>
</levelb>
</levela>
What am I doing wrong? Why doesn't it work with the namespace?
This will do it:
xmlstarlet edit \
-s '/levela/levelb' -t elem -n 'xi:input' \
-s '$prev' -t attr -n 'href' -v 'aHref' \
file.xml
xmlstarlet edit code can use the convenience $prev (aka
$xstar:prev) variable to refer to the node created by the most
recent -i (--insert), -a (--append), or -s (--subnode) option.
Examples of $prev are given in
doc/xmlstarlet.txt and
the source code's
examples/ed-backref*.
Attributes can be added using -i, -a, or -s.
What am I doing wrong? Why doesn't it work with the namespace?
Update 2022-04-15
The -i '//xi:input' … syntax you use is perfectly logical. As your
own 2 alternative commands suggest it's the namespace xi that
triggers the omission and there's a hint in the edInsert function in
the source code's
src/xml_edit.c
where it says NULL /* TODO: NS */.
When you've worked with xmlstarlet for some
time you come to accept its limitations (or not); in this case the
$prev back reference is useful. I wouldn't expect that TODO to
go away anytime soon.
(end update)
Well, I think xmlstarlet edit looks upon node naming as a user
responsibility, as the following example suggests,
printf '<v/>' |
xmlstarlet edit --omit-decl \
-s '*' -t elem -n 'undeclared:qname' -v 'x' \
-s '*' -t elem -n '!--' -v ' wotsinaname ' \
-s '$prev' -t attr -n ' "" ' -v '' \
-s '*' -t elem -n ' <&> ' -v 'harrumph!'
the output of which is clearly not XML:
<v>
<undeclared:qname>x</undeclared:qname>
<!-- "" =""> wotsinaname </!-->
< <&> >harrumph!</ <&> >
</v>
If you want to indent the new element, for example:
xmlstarlet edit \
-s '/levela/levelb' -t elem -n 'xi:input' \
--var newnd '$prev' \
-s '$prev' -t attr -n 'href' -v 'aHref' \
-a '$newnd' -t text -n ignored -v '' \
-u '$prev' -x '(//text())[1][normalize-space()=""]' \
file.xml
The -x XPath expression grabs the first text node provided it
contains nothing but whitespace, i.e. the first child node of levela.
The --var name xpath option to define an xmlstarlet edit
variable is mentioned in
doc/xmlstarlet.txt
but not in the user's guide.
I used xmlstarlet version 1.6.1.
It seems you can't insert an attribute and attribute value into a namespaced node... Maybe someone smarter can figure out something else, but the only way I could get around that, at least in this case, is this:
xmlstarlet ed -N xi="http://www.w3.org/2001/XInclude" --subnode "//levela/levelb" \
--type elem -n "xi:input" --insert "//levela/levelb/*" --type attr --name "href" \
--value "aHref" file.xml
I have input.xml and want to update it to get output.xml (see below), but xmlstarlet fails.
First I tried to find the correct XPath expression to get the needed values, which led to this:
xmlstarlet sel -t -m //field -v . -o "=" -v 'substring("ABCDEFGHIJK",position(),1)' -n input.xml
output:
5 =A
3 =B
2 =C
4 =D
55 =E
42 =F
This made me believe that I should be able to update this XML with the following command:
xmlstarlet ed -u //field -x 'substring("ABCDEFGHIJK",position(),1)' input.xml
But I did get:
Invalid context position
Segmentation fault
I tried XmlStarlet on Windows 11 and on Ubuntu 20.04; both crashed with a core dump.
I am interested in another solution using XmlStarlet.
FILES
input.xml
<root>
<field> 5 </field>
<field> 3 </field>
<field> 2 </field>
<field> 4 </field>
<field> 55 </field>
<field> 42 </field>
</root>
(desired) output.xml
<root>
<field>A</field>
<field>B</field>
<field>C</field>
<field>D</field>
<field>E</field>
<field>F</field>
</root>
position() works with xmlstarlet select's -m (--match) option (i.e. xsl:for-each) which determines the context position.
xmlstarlet select --indent -t \
-e '{name(*)}' \
-m '//field' -e '{name()}' -v 'substring("ABCDEFGHIJK",position(),1)' \
file.xml
With xmlstarlet edit's -u (--update) you can use a sibling node count, e.g.
xmlstarlet edit -O \
-u '//field' -x 'substring("ABCDEFGHIJK",1+count(preceding-sibling::field),1)' \
file.xml
or
xmlstarlet edit -O \
-u '//field' -x 'substring("ABCDEFGHIJK",count(preceding-sibling::* | self::*),1)' \
file.xml
Each of these commands produces the desired output. Line continuation chars added for readability.
I wrote a script in R that has several arguments. I want to iterate over 20 directories and execute my script on each while passing in a substring from the file path as my -n argument using sed. I ran the following:
find . -name 'xray_data' -exec sh -c 'Rscript /Users/Caitlin/Desktop/DeMMO_Pubs/DeMMO_NativeRock/DeMMO_NativeRock/R/scipts/dataStitchR.R -f {} -b "{}/SEM_images" -c "{}/../coordinates.txt" -z ".tif" -m ".tif" -a "Unknown|SEM|Os" -d "overview" -y "overview" --overview "overview.*tif" -p FALSE -n "`sed -e 's/.*DeMMO.*[/]\(.*\)_.*[/]xray_data/\1/' "{}"`"' sh {} \;
which results in this error:
ubs/DeMMO_NativeRock/DeMMO_NativeRock/R/scipts/dataStitchR.R -f {} -b "{}/SEM_images" -c "{}/../coordinates.txt" -z ".tif" -m ".tif" -a "Unknown|SEM|Os" -d "overview" -y "overview" --overview "overview.*tif" -p FALSE -n "`sed -e 's/.*DeMMO.*[/]\(.*\)_.*[/]xray_data/\1/' "{}"`"' sh {} \;
sh: command substitution: line 0: syntax error near unexpected token `('
sh: command substitution: line 0: `sed -e s/.*DeMMO.*[/](.*)_.*[/]xray_data/1/ "./DeMMO1/D1T3rep_Dec2019_Ellison/xray_data"'
When I try to use sed with my pattern on an example file path, it works:
echo "./DeMMO1/D1T1exp_Dec2019_Poorman/xray_data" | sed -e 's/.*DeMMO.*[/]\(.*\)_.*[/]xray_data/\1/'
which produces the correct substring:
D1T1exp_Dec2019
I think there's an issue with trying to use single quotes inside the interpreted string but I don't know how to deal with this. I have tried replacing the single quotes around the sed pattern with double quotes as well as removing the single quotes, both result in this error:
sed: RE error: illegal byte sequence
How should I extract the substring from the file path dynamically in this case?
To loop through the output of find.
while IFS= read -ru "$fd" -d '' files; do
echo "$files" ##: do whatever you want to do with the files here.
done {fd}< <(find . -type f -name 'xray_data' -print0)
No embedded commands in quotes.
It uses a random fd just in case something inside the loop is eating/slurping stdin
Also -print0 delimits the files with null bytes, so it should be safe enough to handle spaces tabs and newlines on the path and file names.
A good start is to always put an echo in front of every command you want to run on the files, so you can see what is going to be executed before anything actually happens.
This is the solution that ultimately worked for me due to issues with quotes in sed:
for dir in `find . -name 'xray_data'`;
do sampleID="`basename $(dirname $dir) | cut -f1 -d'_'`";
Rscript /Users/Caitlin/Desktop/DeMMO_Pubs/DeMMO_NativeRock/DeMMO_NativeRock/R/scipts/dataStitchR.R -f "$dir" -b "$dir/SEM_images" -c "$dir/../coordinates.txt" -z ".tif" -m ".tif" -a "Unknown|SEM|Os" -d "overview" -y "overview" --overview "overview.*tif" -p FALSE -n "$sampleID";
done
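A quote-free variant of the same idea, as a minimal sketch: shell parameter expansion can do the extraction without sed, so there are no nested quotes to fight. The function name sample_id is made up for illustration. It mirrors the sed pattern from the question (everything before the last underscore of the parent directory name), not the cut -f1 variant above.

```shell
# sample_id PATH: hypothetical helper, not part of the original script.
# Prints the parent directory name of PATH up to (not including) its last "_".
sample_id() {
    parent=${1%/*}         # strip the final path component (xray_data)
    parent=${parent##*/}   # keep only the parent directory's own name
    printf '%s\n' "${parent%_*}"
}

sample_id "./DeMMO1/D1T1exp_Dec2019_Poorman/xray_data"   # prints D1T1exp_Dec2019
```

The same expansions should also work inside find -exec sh -c '...' sh {} \; when applied to "$1", because the path then arrives as a positional parameter instead of being pasted into the script text.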
I'm writing a bash script to edit Tomcat's server.xml file. I have it successfully adding a Connector node. To run this example, download and unpack Apache Tomcat 9, go into the conf directory where there is a server.xml file, and run:
xmlstarlet edit -P --inplace \
--subnode "/Server/Service" \
--type elem -n ConnectorNew -v "" \
--insert //ConnectorNew --type attr -n "port" -v "443" \
--insert //ConnectorNew --type attr -n "protocol" -v "org.apache.coyote.http11.Http11NioProtocol" \
--insert //ConnectorNew --type attr -n "keystoreFile" -v "example-key.pem" \
--insert //ConnectorNew --type attr -n "sslProtocol" -v "TLS" \
--insert //ConnectorNew --type attr -n "SSLEnabled" -v "true" \
--subnode "/Server/Service/ConnectorNew" \
--type elem -n "UpgradeProtocolNew" -v "" \
--insert //UpgradeProtocolNew --type attr -n "className" -v "org.apache.coyote.http2.Http2Protocol" \
--rename //ConnectorNew -v Connector \
--rename //UpgradeProtocolNew -v UpgradeProtocol server.xml
which is pretty cool! Upon running that there will now be a TLS Connector on port 443 with the given example key. That would run as usual assuming the key file exists and it's running as root (real server deployments shouldn't run as root but should use jsvc instead).
However that shows up at the very end of the Service element. I would like ideally to put it in the file after the last existing Connector element so the file looks normal. I don't think order of Connector elements has any effect on Tomcat, although I would like it to look like a normal config file that other people would expect, when they go looking for connector elements.
I assume there's some way to do this with xmlstarlet but I couldn't figure it out.
I hope I can avoid using xslt features to do this because I don't want to have to learn and manage another technology to get this script done.
Thank you!
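If you already have a Connector defined in your server.xml, you can replace --subnode "/Server/Service" with --append /Server/Service/Connector, and this will insert your new Connector element right after the existing Connector. If several Connector elements match, narrow the XPath (e.g. /Server/Service/Connector[last()]) so that only the last one is targeted.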
xmlstarlet edit -P --inplace \
--append /Server/Service/Connector \
--type elem -n ConnectorNew -v "" \
--insert //ConnectorNew --type attr -n "port" -v "443" \
...
If this is the first Connector to be inserted, use --insert /Server/Service/Engine instead; your Connector element will then be inserted before the Engine element, which is where Connectors usually reside in the default server.xml:
xmlstarlet edit -P --inplace \
--insert /Server/Service/Engine \
--type elem -n ConnectorNew -v "" \
--insert //ConnectorNew --type attr -n "port" -v "443" \
...
You may also want to delete all XML comments (including commented-out elements) before you start editing server.xml, so that you have a clean and readable file:
xmlstarlet ed -L -d '//comment()' server.xml
If you do so, you need to insert a space before the closing "/>", otherwise Tomcat will complain that server.xml is corrupt:
sed -i "s/\"\/>/\" \/>/g" server.xml
I want to write zsh completions for a program with the following calling convention:
program [generaloptions] operation [operationoptions]
where operation is one of --install, --upgrade...
What I have so far, are the general options and the operation options. My code looks something like this:
local generaloptions; generaloptions=(...)
local installoptions; installoptions=(...)
local upgradeoptions; upgradeoptions=(...)
case "$state" in
(install)
_arguments -s \
"$installoptions[@]" \
&& ret=0
;;
(upgrade)
_arguments -s \
"$upgradeoptions[@]" \
&& ret=0
;;
(*)
_arguments -s \
"$generaloptions[@]" \
'--install[...]: :->install' \
'--upgrade[...]: :->upgrade' \
&& ret=0
;;
esac
The problem is, after I type the operation and the first operation option, the state gets reset to the *) case.
Example
$ program --install --installoption --<tab>
list of general options
How can I set the next state to be the same as the old? Which command has similar calling conventions, so I can look at the code of the completion for this command?
The main problem is that the operations start with a --, so it is harder to find them in the arguments. In git for example all subcommands are only a word without dashes. So git solves this problem something like this:
Find the first argument without dashes because this must be the subcommand
Dispatch based on the subcommand to the commandline arguments for that subcommand.
So git dispatches in every call to the completion function (this was what I meant with "holding the state").
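The dispatch step can be sketched in plain shell. This is only an illustration: find_operation is a hypothetical helper name, and the two operations are the ones from the question. In a zsh completion function one would presumably call it as find_operation "${words[@]}" and branch on the result.

```shell
# find_operation WORD...: hypothetical helper for illustration only.
# Prints the first operation found among the words (without the leading "--")
# and returns non-zero when no operation has been typed yet.
find_operation() {
    for word in "$@"; do
        case $word in
            --install|--upgrade)
                printf '%s\n' "${word#--}"
                return 0
                ;;
        esac
    done
    return 1
}
```

Because the whole command line is rescanned on every call, nothing has to be remembered between completions; the operation is simply rediscovered each time.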
The way I solved this problem was by looking through many completion functions and finding a command that had a similar calling convention. The command that I found the most useful is pacman. Here is what I extracted from that:
# Collect the short-option words (e.g. -i), strip their leading dash,
# and drop the long options (--foo)
args=( ${${${(M)words:#-*}#-}:#-*} )
case $args in
*i)
_arguments -s \
${installoptions} \
'-i[...]' \
&& ret=0
;;
*u)
_arguments -s \
${upgradeoptions} \
'-u[...]' \
&& ret=0
;;
*)
case ${(M)words:#--*} in
*--install*)
_arguments -s \
${installoptions} \
'--install[...]' \
&& ret=0
;;
*--upgrade*)
_arguments -s \
${upgradeoptions} \
'--upgrade[...]' \
&& ret=0
;;
*)
_arguments -s \
${generaloptions} \
&& ret=0
;;
esac
esac
I know there is a lot of duplication, but I think you get the point. Also notice that I moved the --install and --upgrade options from the general case to the operation case. If you don't do that, you lose the argument when you want to complete after --install or --upgrade.
When downloading a file using curl, how would I follow a link location and use that for the output filename (without knowing the remote filename in advance)?
For example, if one clicks on the link below, one downloads a file named "pythoncomplete.vim". However, using curl's -O and -L options, the filename is simply the original remote name, a clumsy "download_script.php?src_id=10872".
curl -O -L http://www.vim.org/scripts/download_script.php?src_id=10872
In order to download the file with the correct filename you would have to know the name of the file in advance:
curl -o pythoncomplete.vim -L http://www.vim.org/scripts/download_script.php?src_id=10872
It would be excellent if you could download the file without knowing the name in advance, and if not, is there another way to quickly pull down a redirected file via command line?
The remote side sends the filename using the Content-Disposition header.
curl 7.21.2 or newer does this automatically if you specify --remote-header-name / -J.
curl -O -J -L $url
The expanded version of the arguments would be:
curl --remote-name --remote-header-name --location $url
If you have a recent version of curl (7.21.2 or later), see @jmanning2k's answer.
If you have an older version of curl (like 7.19.7, which came with Snow Leopard), do two requests: a HEAD to get the file name from the response header, then a GET:
url="http://www.vim.org/scripts/download_script.php?src_id=10872"
filename=$(curl -sI $url | grep -o -E 'filename=.*$' | sed -e 's/filename=//')
curl -o $filename -L $url
If you can use wget instead of curl:
wget --content-disposition $url
I wanted to comment to jmanning2k's answer but as a new user I can't, so I tried to edit his post which is allowed but the edit was rejected saying it was supposed to be a comment. sigh
Anyway, see this as a comment to his answer thanks.
This seems to work only if the header looks like filename=pythoncomplete.vim, as in the example. Some sites send a header that looks like filename*=UTF-8''filename.zip, and that one isn't recognized by curl 7.28.0.
I wanted a solution that worked on both older and newer Macs, and the legacy code David provided for Snow Leopard did not behave well under Mavericks. Here's a function I created based on David's code:
function getUriFilename() {
header="$(curl -sI "$1" | tr -d '\r')"
filename="$(echo "$header" | grep -o -E 'filename=.*$')"
if [[ -n "$filename" ]]; then
echo "${filename#filename=}"
return
fi
filename="$(echo "$header" | grep -o -E 'Location:.*$')"
if [[ -n "$filename" ]]; then
basename "${filename#Location\:}"
return
fi
return 1
}
With this defined, you can run:
url="http://www.vim.org/scripts/download_script.php?src_id=10872"
filename="$(getUriFilename $url)"
curl -L $url -o "$filename"
Please note that certain misconfigured webservers will serve the name using "Filename" as the key, where RFC 2183 specifies it should be "filename". curl only handles the latter case.
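For servers like that, or for the filename*= form mentioned earlier, the header can be parsed by hand. The following is only a rough sketch: disposition_filename is a hypothetical helper name, it reads raw response headers (e.g. the output of curl -sI) on stdin, and it does not percent-decode the filename*= value.

```shell
# disposition_filename: hypothetical helper, reads HTTP response headers on stdin.
# Prints the file name from the first Content-Disposition header, preferring the
# filename*=charset'lang'name form over plain filename=name.
disposition_filename() {
    tr -d '\r' |
    grep -i '^content-disposition:' |
    head -n 1 |
    sed -n \
        -e "s/.*[Ff]ilename\*=[^']*'[^']*'\([^;]*\).*/\1/p" \
        -e 's/.*[Ff]ilename="\{0,1\}\([^";]*\).*/\1/p'
}
```

It could be used as filename=$(curl -sI "$url" | disposition_filename), falling back to another method when the result is empty.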
I had the same problem as John Cooper: I got no filename back, only a Location header containing the file name. His answer also worked, but it takes two commands.
This oneliner worked for me....
url="https://download.mozilla.org/?product=firefox-latest-ssl&os=linux64&lang=de";url=$(curl -L --head -w '%{url_effective}' $url 2>/dev/null | tail -n1) ; curl -O $url
Stolen and added some stuff from
https://unix.stackexchange.com/questions/126252/resolve-filename-from-a-remote-url-without-downloading-a-file
An example using the answer above for Apache Archiva artifact repository to pull latest version. The curl returns the Location line and the filename is at the end of the line. Need to remove the CR at end of file name.
url="http://archiva:8080/restServices/archivaServices/searchService/artifact?g=com.imgur.backup&a=snapshot-s3-util&v=LATEST"
filename=$(curl --silent -sI -u user:password $url | grep Location | awk -F\/ '{print $NF}' | sed 's/\r$//')
curl --silent -o $filename -L -u user:password $url
Instead of applying grep and other Unix-fu operations, you can use curl's built-in "write out" option variable[1], which exists specifically for such a case, e.g.
$ curl -OJsL "http://www.vim.org/scripts/download_script.php?src_id=10872" -w "%{filename_effective}"
pythoncomplete.vim
[1] https://everything.curl.dev/usingcurl/verbose/writeout#available-write-out-variables
Using the solution proposed above, I wrote this helper function curl2file.
[UPDATED]
function curl2file() {
url=$1
url=$(curl -o /dev/null -L --head -w '%{url_effective}' $url 2>/dev/null | tail -n1) ; curl -O $url
}
Usage:
curl2file https://cloud.tsinghua.edu.cn/f/4666d28af98a4e63afb5/?dl=1