substring before and substring after in shell script - unix

I have a string:
//host:/dir1/dir2/dir3/file_name
I want to extract the host and the directory path into separate variables in a Unix shell script.
Example:
host_name = host
dir_path = /dir1/dir2/dir3
Note: the string length and the number of directories are not fixed.
Could you please help me extract these values from the string in a Unix shell script?

Using bash string operations:
str='//host:/dir1/dir2/dir3/file_name'
host_name=${str%%:*}
host_name=${host_name##*/}
dir_path=${str#*:}
dir_path=${dir_path%/*}
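For reuse, the same four expansions can be wrapped in a function. This is only a sketch: the name split_host_path and the format check are my own additions, not part of the answer above. The %%, ##, # and % expansions are POSIX, so this should also run in sh and ksh.

```shell
#!/bin/sh
# split_host_path is a hypothetical helper name, not part of the original answer.
# It sets host_name and dir_path from a string shaped like //host:/dir/.../file.
split_host_path() {
    str=$1
    case $str in
        //*:*/*) ;;                 # expected shape: //host:/path/file
        *) echo "unexpected format: $str" >&2; return 1 ;;
    esac
    host_name=${str%%:*}            # drop everything from the first ':' onward
    host_name=${host_name##*/}      # drop the leading '//'
    dir_path=${str#*:}              # drop everything up to the first ':'
    dir_path=${dir_path%/*}         # drop the trailing '/file_name'
}

split_host_path '//host:/dir1/dir2/dir3/file_name' &&
    printf 'host_name=%s\ndir_path=%s\n' "$host_name" "$dir_path"
```

With the sample string this prints host_name=host and dir_path=/dir1/dir2/dir3.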

I would do it using regular expressions:
if [[ $path =~ ^//(.*):(.*)/(.*)$ ]]; then
host="${BASH_REMATCH[1]}"
dir_path="${BASH_REMATCH[2]}"
filename="${BASH_REMATCH[3]}"
else
echo "Invalid format" >&2
exit 1
fi
If you are sure that the format will match, you can do simply
[[ $path =~ ^//(.*):(.*)/(.*)$ ]]
host="${BASH_REMATCH[1]}"
dir_path="${BASH_REMATCH[2]}"
filename="${BASH_REMATCH[3]}"
Edit: Since you seem to be using ksh rather than bash (though bash was indicated in the question), the syntax is a bit different:
match=(${path/~(E)^\/\/(.*):(.*)\/(.*)$/\1 \2 \3})
host="${match[0]}"
dir_path="${match[1]}"
filename="${match[2]}"
This will break if there are spaces in the file name, though. In that case, you can use the more cumbersome
host="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\1}"
dir_path="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\2}"
filename="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\3}"
Perhaps there are more elegant ways of doing it in ksh, but I'm not familiar with it.

The shortest way I can think of is to assign two variables in one statement:
$ read -r host_name dir_path <<< "$(echo "$string" | sed -e 's,^//,,;s,:, ,;s,/[^/]*$,,')"
The sed expression strips the leading //, turns the first : into a space, and drops the trailing file name.
Complete script:
string="//host:/dir1/dir2/dir3/file_name"
read -r host_name dir_path <<< "$(echo "$string" | sed -e 's,^//,,;s,:, ,;s,/[^/]*$,,')"
echo "host_name = $host_name"
echo "dir_path = $dir_path"
Output:
host_name = host
dir_path = /dir1/dir2/dir3

Related

command in shell to get second numeric value after "-"

Example
prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000
I need value 8080. So basically we need digit value after second occurrence of '-'.
We tried following options:
echo "prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000" | sed -r 's/([^-][:digit:]+[^-][:digit:]).*/\1/'
There is no need to resort to sed, BASH supports regular expressions:
$ A=prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000
$ [[ $A =~ ([^-]*-){2}[^[:digit:]]+([[:digit:]]+) ]] && echo "${BASH_REMATCH[2]}"
8080
Try this Perl solution
$ data="prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000"
$ perl -ne ' /.+?\-(\d+).+?\-(\d+).*/g and print $2 ' <<< "$data"
8080
or
$ echo "$data" | perl -ne ' /.+?\-(\d+).+?\-(\d+).*/g and print $2 '
8080
You could do this in a POSIX shell using IFS to identify the parts, and a loop to step to the pattern you're looking for:
s="prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000"
# Set a field separator
IFS=-
# Expand your variable into positional parameters
set -- $s
# Drop the first two fields
shift 2
# Drop additional fields until one that starts with a digit
while ! expr "$1" : '[0-9]' >/dev/null; do shift; done
# Capture the part of the string that is not digits
y="$1"; while expr "$y" : '[0-9]' >/dev/null; do y="${y##[[:digit:]]}"; done
# Strip off the non-digit part from the original field
x="${1%$y}"
Note that this may fail for a string that looks like aa-bb-123cc45-foo. If you might have additional strings of digits in the "interesting" field, you'll need more code.
If you have a bash shell available, you could do this with a series of bash parameter expansions...
# Strip off the first two "fields"
x="${s#*-}"; x="${x#*-}"
shopt -s extglob
x="${x##+([^[:digit:]])}"
# Identify the part on the right that needs to be stripped
y="${x##+([[:digit:]])}"
# And strip it...
x="${x%$y}"
This is not POSIX compatible because of the requirement for extglob.
Of course, bash offers you many options. Consider this function:
whatdigits() {
local IFS=- x i
local -a a
a=( $1 )
for ((i=3; i<${#a[@]}; i++)) {
[[ ${a[$i]} =~ ^([0-9]+) ]] && echo "${BASH_REMATCH[1]}" && return 0
}
return 1
}
You can then run commands like:
$ whatdigits "12-ab-cd-45ef-gh"
45
$ whatdigits "$s"
8080

Replace a string which is present on first line in UNIX file

I would like to replace a string on the first line of a file only, even though the same string also appears on other lines. How can I do that in a shell script? My code is below: I extract the first line from the file, but I am not sure how to do the replacement after that. Any help would be appreciated. Thanks.
I would like to replace a string present in $line and write the new line back into the same file at the same place.
Code:
while read line
do
if [[ $v_counter == 0 ]]; then
echo "$line"
v_counter=$(($v_counter + 1));
fi
done < "$v_Full_File_Nm"
Sample data:
Input
BUXT_CMPID|MEDICAL_RECORD_NUM|FACILITY_ID|PATIENT_LAST_NAME|PATIENT_FIRST_NAME|HOME_ADDRESS_LINE_1|HOME_ADDRESS_LINE_2|HOME_CITY|HOME_STATE|HOME_ZIP|MOSAIC_CODE|MOSAIC_DESC|DRIVE_TIME| buxt_pt_apnd_20140624_head_5records.txt
100106086|5000120878|7141|HARRIS|NEDRA|6246 PARALLEL PKWY||KANSAS CITY|KS|66102|S71|Tough Times|2|buxt_pt_apnd_20140624_head_5records.txt
Output
BUXT_CMPID|MEDICAL_RECORD_NUM|FACILITY_ID|PATIENT_LAST_NAME|PATIENT_FIRST_NAME|HOME_ADDRESS_LINE_1|HOME_ADDRESS_LINE_2|HOME_CITY|HOME_STATE|HOME_ZIP|MOSAIC_CODE|MOSAIC_DESC|DRIVE_TIME| SRC_FILE_NM
100106086|5000120878|7141|HARRIS|NEDRA|6246 PARALLEL PKWY||KANSAS CITY|KS|66102|S71|Tough Times|2|buxt_pt_apnd_20140624_head_5records.txt
From the above sample data, I need to replace buxt_pt_apnd_20140624_head_5records.txt on the first line with the string SRC_FILE_NM.
Why not use sed?
sed -e '1s/fred/frog/' yourfile
will replace fred with frog on line 1.
If your 'string' is a variable, you can do this to get the variable expanded:
sed -e "1s/$varA/$varB/" yourfile
If you want to do it in place and change your file, add -i before -e.
awk -v old="string1" -v new="string2" '
NR==1 && (idx=index($0,old)) {
$0 = substr($0,1,idx-1) new substr($0,idx+length(old))
}
1' file > /usr/tmp/tmp$$ && mv /usr/tmp/tmp$$ file
The above will replace string1 with string2 only when it appears in the first line of file.
Any solution posted that uses awk but does not use index will not work in general. Same for any solution posted that uses sed. The reason is that those would work on REs, not strings and so behave undesirably for string replacement depending what characters are present in string1.
Looks like the OP's going with a sed RE-replacement solution, so this is just for anyone else looking to replace a string. Here's what a string replacement function would look like if you'd rather not have it inline:
awk -v old="string1" -v new="string2" '
function strsub(old,new,tgt, idx) {
if ( idx = index(tgt,old) ) {
tgt = substr(tgt,1,idx-1) new substr(tgt,idx+length(old))
}
return tgt
}
NR==1 { $0 = strsub(old,new,$0) }
1' file
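To illustrate the point about regular expressions versus literal strings, here is a small sketch. The sample line and the search string a.c are made up; the awk part is the same index/substr approach as above.

```shell
#!/bin/sh
# Demonstration only; the sample line and the search string "a.c" are made up.
line='abc|a.c'

# sed treats "a.c" as a regular expression, so "." also matches the "b" in "abc".
re_result=$(printf '%s\n' "$line" | sed -e '1s/a.c/X/')

# awk's index() looks for the literal characters "a.c".
lit_result=$(printf '%s\n' "$line" | awk -v old='a.c' -v new='X' '
    NR==1 && (idx = index($0, old)) {
        $0 = substr($0, 1, idx-1) new substr($0, idx+length(old))
    }
    1')

printf 'sed (regex):   %s\n' "$re_result"    # X|a.c
printf 'awk (literal): %s\n' "$lit_result"   # abc|X
```

One caveat: awk -v processes backslash escape sequences in the assigned value, so a search string containing backslashes would need extra care.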
A bash solution:
file="afile.txt"
str="hello"
repl="goodbye"
IFS= read -r line < "$file"
line=${line/$str/$repl}
tmpfile="/usr/tmp/$file.$$.tmp"
{
echo "$line"
tail -n+2 "$file"
} > "$tmpfile" && mv "$tmpfile" "$file"
Note that $str above will be interpreted as a "pattern" (a simple kind of regex) where * matches any number of any characters, ? matches any single character, [abc] matches any one of the characters in the brackets, and [^abc] (or [!abc]) matches any one character not in the brackets. See Pattern-Matching
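If you want $str to be taken literally, quoting it inside the expansion should do it, since bash matches quoted parts of a pattern character for character. A minimal sketch with made-up sample values:

```shell
#!/bin/bash
# Sample values are made up for illustration.
line='price: 5*3 and 5x3'
str='5*3'
repl='fifteen'

# Unquoted: $str is a glob pattern, so "5*3" matches from the first 5 to the last 3.
as_pattern=${line/$str/$repl}

# Quoted: the characters in $str are matched literally.
as_literal=${line/"$str"/$repl}

printf '%s\n' "$as_pattern"   # price: fifteen
printf '%s\n' "$as_literal"   # price: fifteen and 5x3
```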

How can I set a default value when incorrect/invalid input is entered in Unix?

I want to set inputLineNumber to 20 when the user enters no value. I tried checking for an empty value with [[-z "$inputLineNumber"]] and then assigning inputLineNumber=20. The script prints ./t.sh: [-z: not found on the console. How do I resolve this? Here is my full script.
#!/bin/sh
cat /dev/null>copy.txt
echo "Please enter the sentence you want to search:"
read "inputVar"
echo "Please enter the name of the file in which you want to search:"
read "inputFileName"
echo "Please enter the number of lines you want to copy:"
read "inputLineNumber"
[[-z "$inputLineNumber"]] || inputLineNumber=20
for N in `grep -n $inputVar $inputFileName | cut -d ":" -f1`
do
LIMIT=`expr $N + $inputLineNumber`
sed -n $N,${LIMIT}p $inputFileName >> copy.txt
echo "-----------------------" >> copy.txt
done
cat copy.txt
Changed the script after the suggestion from @Kevin. Now the error message is ./t.sh: syntax error at line 11: `$' unexpected
#!/bin/sh
truncate copy.txt
echo "Please enter the sentence you want to search:"
read inputVar
echo "Please enter the name of the file in which you want to search:"
read inputFileName
echo Please enter the number of lines you want to copy:
read inputLineNumber
[ -z "$inputLineNumber" ] || inputLineNumber=20
for N in $(grep -n $inputVar $inputFileName | cut -d ":" -f1)
do
LIMIT=$((N+inputLineNumber))
sed -n $N,${LIMIT}p $inputFileName >> copy.txt
echo "-----------------------" >> copy.txt
done
cat copy.txt
Try changing this line from:
[[-z "$inputLineNumber"]] || inputLineNumber=20
To this:
if [[ -z "$inputLineNumber" ]]; then
inputLineNumber=20
fi
Hope this helps.
Where to start...
You are running as /bin/sh but trying to use [[. [[ is a bash command that sh does not recognize. Either change the shebang to /bin/bash (preferred) or use [ instead.
You do not have a space between [[-z. That causes bash to read it as a command named [[-z, which clearly doesn't exist. You need [[ -z $inputLineNumber ]] (note the space at the end too). Quoting within [[ doesn't matter, but if you change to [ (see above), you will need to keep the quotes.
Your code says [[-z but your error says [-z. Pick one.
Use $(...) instead of `...`. The backticks are deprecated, and $() handles quoting appropriately.
You don't need to cat /dev/null >copy.txt, certainly not twice without writing to it in-between. Use truncate -s 0 copy.txt or just plain >copy.txt.
You seem to have inconsistent quoting. Quote or escape (\x) anything with special characters (~, `, !, #, $, &, *, ^, (), [], \, <, >, ?, ', ", ;) or whitespace and any variable that could have whitespace. You don't need to quote string literals with no special characters (e.g. ":").
Instead of LIMIT=`expr...`, use limit=$((N+inputLineNumber)).
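Putting the points above together, here is a sketch of one way to fall back to a default in plain sh. The helper name normalize_count is hypothetical, not from the posts above. Note also that [ -z "$x" ] || x=20 assigns when the value is non-empty; && is the operator that assigns when it is empty.

```shell
#!/bin/sh
# normalize_count is a hypothetical helper name, not from the original posts.
# It prints $1 when $1 consists only of digits, and the default $2 otherwise.
normalize_count() {
    case $1 in
        ''|*[!0-9]*) printf '%s\n' "$2" ;;   # empty, or contains a non-digit
        *)           printf '%s\n' "$1" ;;   # digits only: keep the input
    esac
}

# In the script, this would follow: read inputLineNumber
inputLineNumber='abc'                        # stand-in for invalid user input
inputLineNumber=$(normalize_count "$inputLineNumber" 20)
echo "$inputLineNumber"

# If only the empty case matters, parameter expansion is enough:
unset inputLineNumber
echo "${inputLineNumber:-20}"
```

Both echo lines print 20 here. The case statement is used instead of [[ so that the script keeps working under /bin/sh.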

How to quote strings in file names in zsh (passing back to other scripts)

I have a script that has a string in a file name like so:
filename_with_spaces="a file with spaces"
echo test > "$filename_with_spaces"
test_expect_success "test1: filename with spaces" "
run cat \"$filename_with_spaces\"
run grep test \"$filename_with_spaces\"
"
test_expect_success is defined as:
test_expect_success () {
echo "expecting success: $1"
eval "$2"
}
and run is defined as:
#!/bin/zsh
# make nice filename removing special characters, replace space with _
filename=`echo $@ | tr ' ' _ | tr -cd 'a-zA-Z0-9_.'`.run
echo "#!/bin/zsh" > $filename
print "$@" >> $filename
chmod +x $filename
./$filename
But when I run the toplevel script test_expect_success... I get cat_a_file_with_spaces.run with:
#!/bin/zsh
cat a file with spaces
The problem is the quotes around a file with spaces in cat_a_file_with_spaces.run is missing. How do you get Z shell to keep the correct quoting?
Thanks
Try
run cat ${(q)filename_with_spaces}
This is what the (q) modifier was written for. The same goes for the run script:
echo -E ${(q)@} >> $filename
Also, this is not bash, so you don't need to put quotes around variables: unless you set some option (I don't remember which exactly),
command $var
always passes exactly one argument to command, no matter what is in $var. To ensure that no zsh option alters this behavior, put
emulate -L zsh
at the top of every script.
Note that the initial variant (run cat \"$filename_with_spaces\") is not correct quoting: a file name may contain any character except NUL and /, which separates directories. ${(q)} takes care of that.
Update: I would have written test_expect_success function in the following fashion:
function test_expect_success()
{
emulate -L zsh
echo "Expecting success: $1" ; shift
$@
}
Usage:
test_expect_success "Message" run cat $filename_with_spaces

Quoting command-line arguments in shell scripts

The following shell script takes a list of arguments, turns Unix paths into WINE/Windows paths and invokes the given executable under WINE.
#! /bin/sh
if [ "${1+set}" != "set" ]
then
echo "Usage; winewrap EXEC [ARGS...]"
exit 1
fi
EXEC="$1"
shift
ARGS=""
for p in "$@";
do
if [ -e "$p" ]
then
p=$(winepath -w $p)
fi
ARGS="$ARGS '$p'"
done
CMD="wine '$EXEC' $ARGS"
echo $CMD
$CMD
However, there's something wrong with the quotation of command-line arguments.
$ winewrap '/home/chris/.wine/drive_c/Program Files/Microsoft Research/Z3-1.3.6/bin/z3.exe' -smt /tmp/smtlib3cee8b.smt
Executing: wine '/home/chris/.wine/drive_c/Program Files/Microsoft Research/Z3-1.3.6/bin/z3.exe' '-smt' 'Z: mp\smtlib3cee8b.smt'
wine: cannot find ''/home/chris/.wine/drive_c/Program'
Note that:
The path to the executable is being chopped off at the first space, even though it is single-quoted.
The literal "\t" in the last path is being transformed into a tab character.
Obviously, the quotations aren't being parsed the way I intended by the shell. How can I avoid these errors?
EDIT: The "\t" is being expanded through two levels of indirection: first, "$p" (and/or "$ARGS") is being expanded into Z:\tmp\smtlib3cee8b.smt; then, \t is being expanded into the tab character. This is (seemingly) equivalent to
Y='y\ty'
Z="z${Y}z"
echo $Z
which yields
zy\tyz
and not
zy yz
UPDATE: eval "$CMD" does the trick. The "\t" problem seems to be echo's fault: "If the first operand is -n, or if any of the operands contain a backslash ( '\' ) character, the results are implementation-defined." (POSIX specification of echo)
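Since echo's handling of backslashes is implementation-defined, printf with a %s format is the usual way to print a value unchanged. A minimal sketch using the same variables as in the edit above (the variable shown is only there to capture the result):

```shell
#!/bin/sh
Y='y\ty'
Z="z${Y}z"
# printf with a %s format prints the argument as is; backslashes in the
# argument are not interpreted, whichever shell runs this.
shown=$(printf '%s\n' "$Z")
printf '%s\n' "$shown"       # zy\tyz
```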
bash’s arrays are unportable but the only sane way to handle argument lists in shell
The number of arguments is in ${#}
Bad stuff will happen with your script if there are filenames starting with a dash in the current directory
If the last line of your script just runs a program, and there are no traps on exit, you should exec it
With that in mind
#! /bin/bash
# push ARRAY arg1 arg2 ...
# adds arg1, arg2, ... to the end of ARRAY
function push() {
local ARRAY_NAME="${1}"
shift
for ARG in "${@}"; do
eval "${ARRAY_NAME}[\${#${ARRAY_NAME}[@]}]=\${ARG}"
done
}
PROG="$(basename -- "${0}")"
if (( ${#} < 1 )); then
# Error messages should state the program name and go to stderr
echo "${PROG}: Usage: winewrap EXEC [ARGS...]" 1>&2
exit 1
fi
# Build the command as an array: wine first, then the executable
EXEC=(wine "${1}")
shift
for p in "${@}"; do
if [ -e "${p}" ]; then
p="$(winepath -w -- "${p}")"
fi
push EXEC "${p}"
done
exec "${EXEC[@]}"
If you do want to have the assignment to CMD you should use
eval $CMD
instead of just $CMD in the last line of your script. This should solve your problem with spaces in the paths, I don't know what to do about the "\t" problem.
Replace the last line, $CMD, with just
wine "$EXEC" $ARGS
You'll note that the error is ''/home/chris/.wine/drive_c/Program' and not '/home/chris/.wine/drive_c/Program'
The single quotes are not being interpolated properly, and the string is being split by spaces.
You can try preceding the spaces with \ like so:
/home/chris/.wine/drive_c/Program\ Files/Microsoft\ Research/Z3-1.3.6/bin/z3.exe
You can also do the same with your \t problem - replace it with \\t.
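The splitting described above can be reproduced without wine. The sketch below is an illustration under assumptions: show_args is a hypothetical stand-in for wine, and the path is made up. It compares the string approach with building the command in "$@", which also works in plain sh, where arrays are not available.

```shell
#!/bin/sh
# show_args is a hypothetical stand-in for wine so the result can be inspected.
show_args() {
    for a in "$@"; do printf '[%s]\n' "$a"; done
}

exec_path='/tmp/Program Files/z3.exe'   # made-up path containing a space

# String approach: the quotes inside $CMD are ordinary characters, and the
# unquoted expansion is split on every space.
CMD="show_args '$exec_path' -smt"
from_string=$($CMD)

# "$@" approach: each argument stays one word, spaces included.
set -- show_args "$exec_path" -smt
from_args=$("$@")

printf 'string:\n%s\nargs:\n%s\n' "$from_string" "$from_args"
```

The string version yields three arguments, two of them with a stray single quote, which matches the ''/home/chris/.wine/drive_c/Program' in the error message. The "$@" version yields two arguments with the path intact.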
