replace characters in Bash variable - zsh

I am trying to replace all the - chars with _ chars in a specific variable. Tried to use the tr function. What am I missing?
Thanks!
user@mbp-user ~ % echo $APP_ID
app1_someinfo_info-text_text-indfo_text
user@mbp-user ~ % APP_ID= $APP_ID tr - _
zsh: command not found: app1_someinfo_info-text_text-indfo_text
user@mbp-user ~ % APP_ID= $APP_ID tr "-" "_"
zsh: command not found: app1_someinfo_info-text_text-indfo_text
user@mbp-user ~ %

You can do this in pure bash, without invoking any other processes.
$ APP_ID=app1_someinfo_info-text_text-indfo_text
$ echo $APP_ID
app1_someinfo_info-text_text-indfo_text
$ echo ${APP_ID//-/_}
app1_someinfo_info_text_text_indfo_text
Or reassign to the same variable
$ APP_ID=${APP_ID//-/_}
Specifically, we are using the parameter expansion
${name//pattern/string}
which replaces all occurrences of pattern with string in the value of the variable name.
For more details see section 5.18 of the Bash Cookbook by Carl Albing.
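As a minimal sketch of how the single-slash and double-slash forms differ (the variable names first_only and all are illustrative, not from the answer above). This assumes bash; zsh and ksh93 accept the same syntax:

```bash
#!/usr/bin/env bash
APP_ID='app1_someinfo_info-text_text-indfo_text'

# Single slash: replace only the first occurrence of the pattern.
first_only=${APP_ID/-/_}

# Double slash: replace every occurrence of the pattern.
all=${APP_ID//-/_}

printf '%s\n' "$first_only" "$all"
```

The original value of APP_ID is left untouched until you assign the result back to it.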

Try the following:
[user@host ~]$ APP_ID="app1_someinfo_info-text_text-indfo_text"
[user@host ~]$ APP_ID=$(echo $APP_ID | tr "-" "_")
[user@host ~]$ echo $APP_ID
app1_someinfo_info_text_text_indfo_text

Using sed
$ sed 's/-/_/g' <<< "$APP_ID"
app1_someinfo_info_text_text_indfo_text
Pure bash
$ APP_ID=${APP_ID//-/_}
$ echo $APP_ID
app1_someinfo_info_text_text_indfo_text
Using awk
$ awk '{gsub(/-/,"_")}1' <<< "$APP_ID"
app1_someinfo_info_text_text_indfo_text

Try as below
echo $APP_ID | tr -- - _
To prevent arguments beginning with - from being interpreted as options, use -- to mark the end of the options.
Postscript
But that didn't matter. tr - _ works as expected. The problem is that you didn't give a string for the standard input of tr.
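To make the postscript concrete, here is a minimal sketch that feeds the variable to tr on standard input and captures the result. Using printf rather than echo is my own precaution against values that start with - or contain backslashes; it is not part of the answer above:

```bash
APP_ID='app1_someinfo_info-text_text-indfo_text'

# tr only reads standard input, so pipe the value in and capture the output.
APP_ID=$(printf '%s\n' "$APP_ID" | tr - _)

printf '%s\n' "$APP_ID"
```

The original attempt failed because `APP_ID= $APP_ID tr - _` sets APP_ID to the empty string and then tries to run the old value as a command.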

Related

command in shell to get second numeric value after "-"

Example
prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000
I need value 8080. So basically we need digit value after second occurrence of '-'.
We tried following options:
echo "prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000" | sed -r 's/([^-][:digit:]+[^-][:digit:]).*/\1/'
There is no need to resort to sed, BASH supports regular expressions:
$ A=prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000
$ [[ $A =~ ([^-]*-){2}[^[:digit:]]+([[:digit:]]+) ]] && echo "${BASH_REMATCH[2]}"
8080
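The same match can be wrapped in a small function so that a failed match is reported through the exit status. This is only a sketch: the function name digits_after_second_dash is hypothetical, and the regex is the one from the answer, so it still requires at least one non-digit character between the second - and the digits:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first run of digits that follows two
# dash-terminated fields and at least one non-digit character.
# Returns 1 if the argument does not match.
digits_after_second_dash() {
    # Keeping the regex in a variable avoids quoting surprises inside [[ ]].
    local re='([^-]*-){2}[^[:digit:]]+([[:digit:]]+)'
    [[ $1 =~ $re ]] || return 1
    printf '%s\n' "${BASH_REMATCH[2]}"
}

digits_after_second_dash 'prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000'
```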
Try this Perl solution
$ data="prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000"
$ perl -ne ' /.+?\-(\d+).+?\-(\d+).*/g and print $2 ' <<< "$data"
8080
or
$ echo "$data" | perl -ne ' /.+?\-(\d+).+?\-(\d+).*/g and print $2 '
8080
You could do this in a POSIX shell using IFS to identify the parts, and a loop to step to the pattern you're looking for:
s="prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000"
# Set a field separator
IFS=-
# Expand your variable into positional parameters
set -- $s
# Drop the first two fields
shift 2
# Drop additional fields until one that starts with a digit
while ! expr "$1" : '[0-9]' >/dev/null; do shift; done
# Capture the part of the string that is not digits
y="$1"; while expr "$y" : '[0-9]' >/dev/null; do y="${y##[[:digit:]]}"; done
# Strip off the non-digit part from the original field
x="${1%$y}"
Note that this may fail for a string that looks like aa-bb-123cc45-foo. If you might have additional strings of digits in the "interesting" field, you'll need more code.
If you have a bash shell available, you could do this with a series of bash parameter expansions...
# Strip off the first two "fields"
x="${s#*-}"; x="${x#*-}"
shopt -s extglob
x="${x##+([^[:digit:]])}"
# Identify the part on the right that needs to be stripped
y="${x##+([[:digit:]])}"
# And strip it...
x="${x%$y}"
This is not POSIX compatible because of the requirement for extglob.
Of course, bash offers you many options. Consider this function:
whatdigits() {
local IFS=- x i
local -a a
a=( $1 )
for ((i=3; i<${#a[@]}; i++)) {
[[ ${a[$i]} =~ ^([0-9]+) ]] && echo "${BASH_REMATCH[1]}" && return 0
}
return 1
}
You can then run commands like:
$ whatdigits "12-ab-cd-45ef-gh"
45
$ whatdigits "$s"
8080

using sed or awk to double quote comma separate and concatenate a list

I have the following list in a text file:
10.1.2.200
10.1.2.201
10.1.2.202
10.1.2.203
I want to encase in "double quotes", comma separate and join the values as one string.
Can this be done in sed or awk?
Expected output:
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203","10.1.2.204"
The easiest is something like this (in pseudo code):
Read a line;
Put the line in quotes;
Keep that quoted line in a stack or string;
At the end (or while constructing the string), join the lines together with a comma.
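The steps above can be sketched in plain shell as follows. The function name quote_join is made up for illustration, and skipping blank lines is my own assumption about the input, not something the question states:

```bash
# Hypothetical helper: read lines on stdin, wrap each in double quotes,
# and join them with commas into a single output line.
quote_join() {
    local line out=''
    # The `|| [ -n "$line" ]` part keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue        # skip blank lines
        out="${out:+$out,}\"$line\""      # put a comma before every item but the first
    done
    printf '%s\n' "$out"
}

printf '%s\n' 10.1.2.200 10.1.2.201 10.1.2.202 10.1.2.203 | quote_join
```

With the list in a file, the call would be `quote_join < file`.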
Depending on the language, that is fairly straightforward to do:
With awk:
$ awk 'BEGIN{OFS=","}{s=s ? s OFS "\"" $1 "\"" : "\"" $1 "\""} END{print s}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Or, for less of a 'wall of quotes', define a quote character:
$ awk 'BEGIN{OFS=",";q="\""}{s=s ? s OFS q$1q : q$1q} END{print s}' file
With sed:
$ sed -E 's/^(.*)$/"\1"/' file | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g'
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
(With Perl and Ruby, with a join function, it is easiest to push the elements onto a stack and then join that.)
Perl:
$ perl -lne 'push @a, "\"$_\""; END{print join(",", @a)}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Ruby:
$ ruby -ne 'BEGIN{@arr=[]}; @arr.push "\"#{$_.chomp}\""; END{puts @arr.join(",")}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
Here is another alternative:
sed 's/.*/"&"/' file | paste -sd,
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
awk -F'\n' -v RS="\0" -v OFS='","' -v q='"' '{NF--}$0=q$0q' file
should work for given example.
Tested with gawk:
kent$ cat f
10.1.2.200
10.1.2.201
10.1.2.202
10.1.2.203
kent$ awk -F'\n' -v RS="\0" -v OFS='","' -v q='"' '{NF--}$0=q$0q' f
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"
$ awk '{o=o (NR>1?",":"") "\""$0"\""} END{print o}' file
"10.1.2.200","10.1.2.201","10.1.2.202","10.1.2.203"

Sed - remove last two character quotes and comma

I want to keep only the IP address (10.100.52.111 in the sample below) and delete everything else. The address keeps changing, so I don't want to hard-code it.
The original output was as below
"PrivateIpAddress": "10.100.52.111",
I tried the below command and removed "PrivateIpAddress": "
sudo aws ec2 describe-instances --filter Name=tag:Name,Values=bip-spark-es-worker1 |grep PrivateIpAddress |head -1|sed 's/^[ ^t]*\"PrivateIpAddress\"[:]* \"//g'
so the output for the above command now is
10.100.52.111",
I want to delete even the ending quotes and comma.
I tried ["].$ and also \{2\}.$, but neither worked.
Please help.
Let sed do all the work. You don't need grep or head:
sed -n '/"PrivateIpAddress": /{s///; s/[",]//g; p; q}'
If the content within the double quotes does not itself contain a double quote,
grep PrivateIpAddress |head -1|sed 's/^[ ^t]*\"PrivateIpAddress\"[:]* \"//g'
can be replaced with
awk -F\" '/PrivateIpAddress/{print $4; exit}'
-F\" use " as field separator
/PrivateIpAddress/ if line matches this string
print $4 print 4th field which is 10.100.52.111 for given sample
exit will quit as only first match is required
some awk proposals
echo '"PrivateIpAddress": "10.100.52.111",'| awk -F: '{print substr($2,3,13)}'
10.100.52.111
echo '"PrivateIpAddress": "10.100.52.111",'| awk -F\" '{print $4}'
10.100.52.111
Alternative :
$ echo "\"PrivateIpAddress\": \"10.100.52.111\", "
"PrivateIpAddress": "10.100.52.111",
$ echo "\"PrivateIpAddress\": \"10.100.52.111\", " |grep -Po '(\d+[.]){3}\d+'
10.100.52.111
$ echo "\"PrivateIpAddress\": \"10.100.52.111\", " |grep -Eo '([[:digit:]]+[.]){3}[[:digit:]]+'
10.100.52.111
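If the line is already in a shell variable, the trailing quote and comma can also be removed with parameter expansion alone. A minimal sketch, assuming the line always has the shape "key": "value", and using a made-up function name, quoted_value:

```bash
# Hypothetical helper: print the value between the quotes that follow ': '.
quoted_value() {
    local line=$1
    line=${line#*: \"}    # drop the shortest prefix ending in `: "` (the key part)
    line=${line%%\"*}     # drop everything from the next double quote onward
    printf '%s\n' "$line"
}

quoted_value '    "PrivateIpAddress": "10.100.52.111",'
```

This avoids starting sed or awk, but it does no validation: a line without `: "` will not produce a meaningful result.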

substring before and substring after in shell script

I have a string:
//host:/dir1/dir2/dir3/file_name
I want to fetch value of host & directories in different variables in unix script.
Example :
host_name = host
dir_path = /dir1/dir2/dir3
Note - String length & no of directories is not fixed.
Could you please help me to fetch these values from string in unix shell script.
Using bash string operations:
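str='//host:/dir1/dir2/dir3/file_name'
host_name=${str%%:*}          # remove the longest suffix starting at ':'  -> //host
host_name=${host_name##*/}    # remove the longest prefix ending in '/'    -> host
dir_path=${str#*:}            # remove the shortest prefix ending in ':'   -> /dir1/dir2/dir3/file_name
dir_path=${dir_path%/*}       # remove the shortest suffix starting at '/' -> /dir1/dir2/dir3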
I would do it using regular expressions:
if [[ $path =~ ^//(.*):(.*)/(.*)$ ]]; then
host="${BASH_REMATCH[1]}"
dir_path="${BASH_REMATCH[2]}"
filename="${BASH_REMATCH[3]}"
else
echo "Invalid format" >&2
exit 1
fi
If you are sure that the format will match, you can do simply
[[ $path =~ ^//(.*):(.*)/(.*)$ ]]
host="${BASH_REMATCH[1]}"
dir_path="${BASH_REMATCH[2]}"
filename="${BASH_REMATCH[3]}"
Edit: Since you seem to be using ksh rather than bash (though bash was indicated in the question), the syntax is a bit different:
match=(${path/~(E)^\/\/(.*):(.*)\/(.*)$/\1 \2 \3})
host="${match[0]}"
dir_path="${match[1]}"
filename="${match[2]}"
This will break if there are spaces in the file name, though. In that case, you can use the more cumbersome
host="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\1}"
dir_path="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\2}"
filename="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\3}"
Perhaps there are more elegant ways of doing it in ksh, but I'm not familiar with it.
The shortest way I can think of is to assign two variables in one statement:
$ read host_name dir_path <<< $(echo $string | sed -e 's,^//,,;s,:, ,')
Complete script:
string="//host:/dir1/dir2/dir3/file_name"
read host_name dir_path <<< $(echo $string | sed -e 's,^//,,;s,:, ,')
echo "host_name = $host_name"
echo "dir_path = $dir_path"
Output:
host_name = host
dir_path = /dir1/dir2/dir3/file_name
Note that dir_path still includes the file name here; ${dir_path%/*} strips it.

Get last field using awk substr

I am trying to use awk to get the name of a file given the absolute path to the file.
For example, when given the input path /home/parent/child/filename I would like to get filename
I have tried:
awk -F "/" '{print $5}' input
which works perfectly.
However, I am hard coding $5 which would be incorrect if my input has the following structure:
/home/parent/child1/child2/filename
So a generic solution requires always taking the last field (which will be the filename).
Is there a simple way to do this with the awk substr function?
Use the fact that awk splits the lines in fields based on a field separator, that you can define. Hence, defining the field separator to / you can say:
awk -F "/" '{print $NF}' input
as NF refers to the number of fields of the current record, printing $NF means printing the last one.
So given a file like this:
/home/parent/child1/child2/child3/filename
/home/parent/child1/child2/filename
/home/parent/child1/filename
This would be the output:
$ awk -F"/" '{print $NF}' file
filename
filename
filename
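Since NF is just a number, arithmetic on it reaches other fields counted from the end. A small sketch, using a sample path and variable names of my own choosing:

```bash
path='/home/parent/child1/child2/filename'

# $NF is the last field; $(NF-1) is the one before it.
last=$(printf '%s\n' "$path" | awk -F/ '{print $NF}')
parent=$(printf '%s\n' "$path" | awk -F/ '{print $(NF-1)}')

printf '%s\n' "$last" "$parent"
```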
In this case it is better to use basename instead of awk:
$ basename /home/parent/child1/child2/filename
filename
If you're open to a Perl solution, here is one similar to fedorqui's awk solution:
perl -F/ -lane 'print $F[-1]' input
-F/ specifies / as the field separator
$F[-1] is the last element in the @F autosplit array
Another option is to use bash parameter substitution.
$ foo="/home/parent/child/filename"
$ echo ${foo##*/}
filename
$ foo="/home/parent/child/child2/filename"
$ echo ${foo##*/}
filename
Like 5 years late, I know, thanks for all the proposals, I used to do this the following way:
$ echo /home/parent/child1/child2/filename | rev | cut -d '/' -f1 | rev
filename
Glad to see there are better ways.
This should be a comment on the basename answer, but I don't have enough reputation points.
If you do not use double quotes, basename will not work with paths that contain a space character:
$ basename /home/foo/bar foo/bar.png
bar
It works with double quotes:
$ basename "/home/foo/bar foo/bar.png"
bar.png
file example
$ cat a
/home/parent/child 1/child 2/child 3/filename1
/home/parent/child 1/child2/filename2
/home/parent/child1/filename3
$ while IFS= read -r b ; do basename "$b" ; done < a
filename1
filename2
filename3
I know I'm like 3 years late on this but....
You should consider parameter expansion: it's built in, so no external process is started.
If your input is in a variable, say $var1, just use ${var1##*/}. See below:
$ var1='/home/parent/child1/filename'
$ echo ${var1##*/}
filename
$ var1='/home/parent/child1/child2/filename'
$ echo ${var1##*/}
filename
$ var1='/home/parent/child1/child2/child3/filename'
$ echo ${var1##*/}
filename
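The matching expansion for the directory part is ${var1%/*}. A short sketch of both, with one caveat that differs from basename; the variable names name, dir, trailing and empty are illustrative:

```bash
var1='/home/parent/child1/child2/filename'

name=${var1##*/}    # longest prefix ending in '/' removed      -> filename
dir=${var1%/*}      # shortest suffix starting at '/' removed   -> /home/parent/child1/child2

printf '%s\n' "$name" "$dir"

# Caveat: with a trailing slash this yields an empty string,
# whereas basename would print child1.
trailing='/home/parent/child1/'
empty=${trailing##*/}
```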
you can skip all of that complex regex :
echo '/home/parent/child1/child2/filename' |
mawk '$!_=$-_=$NF' FS='[/]'
filename
2nd to last :
mawk '$!--NF=$NF' FS='/'
child2
3rd last field :
echo '/home/parent/child1/child2/filename' |
mawk '$!--NF=$--NF' FS='[/]'
child1
4th-last :
mawk '$!--NF=$(--NF-!-FS)' FS='/'
echo '/home/parent/child000/child00/child0/child1/child2/filename' |
child0
echo '/home/parent/child1/child2/filename'
parent
major caveat :
- `gawk/nawk` has a slight discrepancy with `mawk` regarding
- how it tracks multiple,
- and potentially conflicting, decrements to `NF`,
- so other than the 1st solution regarding last field,
- the rest for now, are only applicable to `mawk-1/2`
just realized it's much much cleaner this way in mawk/gawk/nawk :
echo '/home/parent/child1/child2/filename' | …
'
awk ++NF FS='.+/' OFS= # updated such that
# root "/" still gets printed
'
filename
You can also use:
sed -n 's/.*\/\([^\/]\{1,\}\)$/\1/p'
or
sed -n 's/.*\/\([^\/]*\)$/\1/p'

Resources