command in shell to get second numeric value after "-" - unix

Example
prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000
I need value 8080. So basically we need digit value after second occurrence of '-'.
We tried following options:
echo "prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000" | sed -r 's/([^-][:digit:]+[^-][:digit:]).*/\1/'

There is no need to resort to sed, BASH supports regular expressions:
$ A=prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000
$ [[ $A =~ ([^-]*-){2}[^[:digit:]]+([[:digit:]]+) ]] && echo "${BASH_REMATCH[2]}"
8080

Try this Perl solution
$ data="prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000"
$ perl -ne ' /.+?\-(\d+).+?\-(\d+).*/g and print $2 ' <<< "$data"
8080
or
$ echo "$data" | perl -ne ' /.+?\-(\d+).+?\-(\d+).*/g and print $2 '
8080

You could do this in a POSIX shell using IFS to identify the parts, and a loop to step to the pattern you're looking for:
s="prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000"
# Set a field separator
IFS=-
# Expand your variable into positional parameters
set -- $s
# Drop the first two fields
shift 2
# Drop additional fields until one that starts with a digit
while ! expr "$1" : '[0-9]' >/dev/null; do shift; done
# Capture the part of the string that is not digits
y="$1"; while expr "$y" : '[0-9]' >/dev/null; do y="${y##[[:digit:]]}"; done
# Strip off the non-digit part from the original field
x="${1%$y}"
Note that this may fail for a string that looks like aa-bb-123cc45-foo. If you might have additional strings of digits in the "interesting" field, you'll need more code.
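One way to cope with a field such as 123cc45 in plain POSIX sh is to cut the field at its first non-digit instead of looping. The following is only a sketch: the function name leading_digits is made up for illustration, and it assumes the wanted value is the leading digits of the first field, after the first two, that starts with a digit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the answer above): print the leading digits
# of the first '-'-separated field, after the first two, that starts with a digit.
leading_digits() (
    IFS=-
    set -f                       # disable globbing while the string is split
    set -- $1                    # split the argument on '-'
    [ "$#" -gt 2 ] || exit 1     # need at least three fields
    shift 2                      # drop the first two fields
    for field in "$@"; do
        case $field in
            [0-9]*)
                # Remove everything from the first non-digit onward
                printf '%s\n' "${field%%[!0-9]*}"
                exit 0
                ;;
        esac
    done
    exit 1                       # no field started with a digit
)

leading_digits "aa-bb-123cc45-foo"
leading_digits "prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000"
```

The two calls print 123 and 8080. The function body runs in a subshell, so the changed IFS and positional parameters do not leak into the caller.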
If you have a bash shell available, you could do this with a series of bash parameter expansions...
# Strip off the first two "fields"
x="${s#*-}"; x="${x#*-}"
shopt -s extglob
x="${x##+([^[:digit:]])}"
# Identify the part on the right that needs to be stripped
y="${x##+([[:digit:]])}"
# And strip it...
x="${x%$y}"
This is not POSIX compatible because of the requirement for extglob.
Of course, bash offers you many options. Consider this function:
whatdigits() {
local IFS=- x i
local -a a
a=( $1 )
for ((i=3; i<${#a[@]}; i++)) {
[[ ${a[$i]} =~ ^([0-9]+) ]] && echo "${BASH_REMATCH[1]}" && return 0
}
return 1
}
You can then run commands like:
$ whatdigits "12-ab-cd-45ef-gh"
45
$ whatdigits "$s"
8080
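Since the question started from sed, here is a sketch of a sed expression that produces the requested value. It is not taken from any of the answers above; the function name second_dash_digits is hypothetical, and it assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do). The expression skips the first two fields, then skips any further fields that contain no digits, and captures the digits at the start of the next field.

```shell
#!/bin/sh
# Hypothetical helper: print the digits that start the first digit-led field
# after the second '-'. Prints nothing if the input does not have that shape.
second_dash_digits() {
    printf '%s\n' "$1" |
        sed -n -E 's/^[^-]*-[^-]*-([^0-9-]*-)*([0-9]+).*$/\2/p'
}

second_dash_digits "prod2-03_dl-httpd-prod-8080_access_referer_log.20181111-050000"
```

For the example file name this prints 8080. The -n flag together with the p modifier means that lines which do not match produce no output rather than being echoed unchanged.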

Related

Unix partial string comparison

I need to compare a string partially to check for a given condition.
Like my $1 will be checked if it has a part of a string BLR
while my file input has $1 entries as BLR21 BLR64 IND23
I only need a true condition when $1 is equal to BLR**
where these stars can be anything.
I used a simple if condition
if($1=="BLR21")
{print $2}
Now this only works when whole BLR21 is there in row.
I need to check not for BLR21 but only BLR.
Please Help
Your question is not great, I hope I understood.
Quick and easy solution
grep BLR input.txt
This will output all the lines in which "BLR" is found, in file input.txt. It will match "BLR" with any prefix and suffix, whatever they might be (spaces, alphanumerical, tabs, ...).
"Complicated" solution
A bit more complicated. It does the same thing, but makes sure input.txt exists, and is in the form of a script.
Input file, input.txt:
BLR21 BLR64 IND23
Your script could be:
#!/bin/bash
#
# Arguments
inputfile="input.txt"
if [[ $# -ne 1 ]]
then
echo "Usage: myscript.bash <STRING>"
exit 1
else
string="$1"
fi
# Validation, and processing...
if [[ ! -f "$inputfile" ]]
then
echo "ERROR: file >>$inputfile<< does not exist."
exit 2
else
grep "$string" "$inputfile"
fi
And to call the script, you do:
./myscript.bash BLR
But really, a simple grep does the job here.
Taking it even further...
#!/bin/bash
#
# Arguments
inputfile="input.txt"
if [[ $# -ne 1 ]]
then
echo "Usage: check.bash <STRING>"
exit 1
else
string="$1"
fi
# Validation, and processing...
if [[ ! -f "$inputfile" ]]
then
echo "ERROR: file >>$inputfile<< does not exist."
exit 2
else
while read -r line
do
if [[ "$line" =~ $string ]]
then
echo "$line"
fi
done <"$inputfile"
fi
Now this one is like going to the moon via mars...
It reads each line of the file, one by one. Then it checks if that line contains the string, using the =~ operator inside the if.
But this is crazy, when a simple grep would do.
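Because the question's snippet is awk, the same check can also stay in awk. The sketch below is an assumption about what was wanted: it prints the second field only when the first field starts with the given prefix. The function name print_if_prefix and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read rows on stdin and print the second field of every
# row whose first field starts with the given prefix (literal comparison).
print_if_prefix() {
    awk -v prefix="$1" 'index($1, prefix) == 1 { print $2 }'
}

# Sample rows invented for this example
printf '%s\n' 'BLR21 alpha' 'BLR64 beta' 'IND23 gamma' | print_if_prefix BLR
```

This prints alpha and beta. With a fixed prefix, the condition could equally be written as $1 ~ /^BLR/ in place of the original $1=="BLR21".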

substring before and substring after in shell script

I have a string:
//host:/dir1/dir2/dir3/file_name
I want to fetch value of host & directories in different variables in unix script.
Example :
host_name = host
dir_path = /dir1/dir2/dir3
Note - String length & no of directories is not fixed.
Could you please help me to fetch these values from string in unix shell script.
Using bash string operations:
str='//host:/dir1/dir2/dir3/file_name'
host_name=${str%%:*}
host_name=${host_name##*/}
dir_path=${str#*:}
dir_path=${dir_path%/*}
I would do it using regular expressions:
if [[ $path =~ ^//(.*):(.*)/(.*)$ ]]; then
host="${BASH_REMATCH[1]}"
dir_path="${BASH_REMATCH[2]}"
filename="${BASH_REMATCH[3]}"
else
echo "Invalid format" >&2
exit 1
fi
If you are sure that the format will match, you can do simply
[[ $path =~ ^//(.*):(.*)/(.*)$ ]]
host="${BASH_REMATCH[1]}"
dir_path="${BASH_REMATCH[2]}"
filename="${BASH_REMATCH[3]}"
Edit: Since you seem to be using ksh rather than bash (though bash was indicated in the question), the syntax is a bit different:
match=(${path/~(E)^\/\/(.*):(.*)\/(.*)$/\1 \2 \3})
host="${match[0]}"
dir_path="${match[1]}"
filename="${match[2]}"
This will break if there are spaces in the file name, though. In that case, you can use the more cumbersome
host="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\1}"
dir_path="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\2}"
filename="${path/~(E)^\/\/(.*):(.*)\/(.*)$/\3}"
Perhaps there are more elegant ways of doing it in ksh, but I'm not familiar with it.
The shortest way I can think of is to assign two variables in one statement:
$ read host_name dir_path <<< $(echo $string | sed -e 's,^//,,;s,:, ,')
Complete script:
string="//host:/dir1/dir2/dir3/file_name"
read host_name dir_path <<< $(echo $string | sed -e 's,^//,,;s,:, ,')
echo "host_name = $host_name"
echo "dir_path = $dir_path"
Output:
host_name = host
dir_path = /dir1/dir2/dir3/file_name
Note that dir_path still ends with the file name here; dir_path=${dir_path%/*} strips it.
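For a shell without bash's =~ operator, the parameter-expansion approach can be wrapped in a function that also checks the input shape. This is a sketch only: the function name split_host_path and the variable names rest, full_path and file_name are assumptions introduced here, and only POSIX parameter expansions are used.

```shell
#!/bin/sh
# Hypothetical helper: split "//host:/dir/.../file" into host_name, dir_path
# and file_name. Returns 1 if the string does not look like //host:/path.
split_host_path() {
    case $1 in
        //*:/*) ;;                                  # expected shape
        *) echo "Invalid format: $1" >&2; return 1 ;;
    esac
    rest=${1#//}                # drop the leading "//"
    host_name=${rest%%:*}       # text before the first ":"
    full_path=${rest#*:}        # text after the first ":"
    dir_path=${full_path%/*}    # everything up to the last "/"
    file_name=${full_path##*/}  # everything after the last "/"
}

split_host_path '//host:/dir1/dir2/dir3/file_name' &&
    printf '%s\n' "host_name = $host_name" "dir_path = $dir_path"
```

For the example string this prints host_name = host and dir_path = /dir1/dir2/dir3. Nothing is word-split along the way, so spaces in directory or file names are preserved.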

How can I set a default value when incorrect/invalid input is entered in Unix?

I want inputLineNumber to default to 20 when the user gives no value. I tried checking for an empty value with [[-z "$inputLineNumber"]] and then setting it with inputLineNumber=20. The script prints the message ./t.sh: [-z: not found on the console. How do I resolve this? Here's my full script as well.
#!/bin/sh
cat /dev/null>copy.txt
echo "Please enter the sentence you want to search:"
read "inputVar"
echo "Please enter the name of the file in which you want to search:"
read "inputFileName"
echo "Please enter the number of lines you want to copy:"
read "inputLineNumber"
[[-z "$inputLineNumber"]] || inputLineNumber=20
for N in `grep -n $inputVar $inputFileName | cut -d ":" -f1`
do
LIMIT=`expr $N + $inputLineNumber`
sed -n $N,${LIMIT}p $inputFileName >> copy.txt
echo "-----------------------" >> copy.txt
done
cat copy.txt
Changed the script after a suggestion from @Kevin. Now the error message is ./t.sh: syntax error at line 11: `$' unexpected
#!/bin/sh
truncate copy.txt
echo "Please enter the sentence you want to search:"
read inputVar
echo "Please enter the name of the file in which you want to search:"
read inputFileName
echo Please enter the number of lines you want to copy:
read inputLineNumber
[ -z "$inputLineNumber" ] || inputLineNumber=20
for N in $(grep -n $inputVar $inputFileName | cut -d ":" -f1)
do
LIMIT=$((N+inputLineNumber))
sed -n $N,${LIMIT}p $inputFileName >> copy.txt
echo "-----------------------" >> copy.txt
done
cat copy.txt
Try changing this line from:
[[-z "$inputLineNumber"]] || inputLineNumber=20
To this:
if [[ -z "$inputLineNumber" ]]; then
inputLineNumber=20
fi
Hope this helps.
Where to start...
You are running as /bin/sh but trying to use [[. [[ is a bash command that sh does not recognize. Either change the shebang to /bin/bash (preferred) or use [ instead.
You do not have a space between [[-z. That causes bash to read it as a command named [[-z, which clearly doesn't exist. You need [[ -z $inputLineNumber ]] (note the space at the end too). Quoting within [[ doesn't matter, but if you change to [ (see above), you will need to keep the quotes.
Your code says [[-z but your error says [-z. Pick one.
Use $(...) instead of `...`. The backticks are deprecated, and $() handles quoting appropriately.
You don't need to cat /dev/null >copy.txt, certainly not twice without writing to it in-between. Use truncate -s 0 copy.txt (truncate needs a size argument) or just plain >copy.txt.
You seem to have inconsistent quoting. Quote or escape (\x) anything with special characters (~, `, !, #, $, &, *, ^, (), [], \, <, >, ?, ', ", ;) or whitespace and any variable that could have whitespace. You don't need to quote string literals with no special characters (e.g. ":").
Instead of LIMIT=`expr...`, use limit=$((N+inputLineNumber)).
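The title also asks about invalid input, not just empty input. A sketch of one way to handle both with a case pattern is shown below. The function name number_or_default is hypothetical, and the sketch assumes a shell that supports $(...), which the second error message in the question suggests the asker's /bin/sh may not.

```shell
#!/bin/sh
# Hypothetical helper: print $1 if it consists only of digits,
# otherwise print the default given as $2.
number_or_default() {
    case $1 in
        ''|*[!0-9]*) printf '%s\n' "$2" ;;   # empty, or contains a non-digit
        *)           printf '%s\n' "$1" ;;   # digits only: keep the input
    esac
}

# inputLineNumber would normally come from "read inputLineNumber"
inputLineNumber=$(number_or_default "$inputLineNumber" 20)
echo "Copying $inputLineNumber lines"
```

If only the empty case matters, the expansion ${inputLineNumber:-20} is enough. Note also that the test in the question, [ -z "$inputLineNumber" ] || inputLineNumber=20, assigns 20 when the value is not empty; && gives the intended behaviour.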

Is there a way to ignore header lines in a UNIX sort?

I have a fixed-width-field file which I'm trying to sort using the UNIX (Cygwin, in my case) sort utility.
The problem is there is a two-line header at the top of the file which is being sorted to the bottom of the file (as each header line begins with a colon).
Is there a way to tell sort either "pass the first two lines across unsorted" or to specify an ordering which sorts the colon lines to the top? The remaining lines always start with a 6-digit number (which is actually the key I'm sorting on), if that helps.
Example:
:0:12345
:1:6:2:3:8:4:2
010005TSTDOG_FOOD01
500123TSTMY_RADAR00
222334NOTALINEOUT01
477821USASHUTTLES21
325611LVEANOTHERS00
should sort to:
:0:12345
:1:6:2:3:8:4:2
010005TSTDOG_FOOD01
222334NOTALINEOUT01
325611LVEANOTHERS00
477821USASHUTTLES21
500123TSTMY_RADAR00
(head -n 2 <file> && tail -n +3 <file> | sort) > newfile
The parentheses create a subshell, wrapping up the stdout so you can pipe it or redirect it as if it had come from a single command.
If you don't mind using awk, you can take advantage of awk's built-in pipe abilities
eg.
extract_data | awk 'NR<3{print $0;next}{print $0| "sort -r"}'
This prints the first two lines verbatim and pipes the rest through sort.
Note that this has the very specific advantage of being able to selectively sort parts of a piped input. All the other methods suggested will only sort plain files which can be read multiple times. This works on anything.
In simple cases, sed can do the job elegantly:
your_script | (sed -u 1q; sort)
or equivalently,
cat your_data | (sed -u 1q; sort)
The key is in the 1q -- print first line (header) and quit (leaving the rest of the input to sort).
For the example given, 2q will do the trick.
The -u switch (unbuffered) is required for those seds (notably, GNU's) that would otherwise read the input in chunks, thereby consuming data that you want to go through sort instead.
Here is a version that works on piped data:
(read -r; printf "%s\n" "$REPLY"; sort)
If your header has multiple lines:
(for i in $(seq $HEADER_ROWS); do read -r; printf "%s\n" "$REPLY"; done; sort)
This solution is from here
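The same idea can be written without seq and without bash's REPLY behaviour, which may matter for a fixed-width file where leading whitespace has to survive. This is a sketch under those assumptions; the function name header_sort is made up, and the first argument is taken to be the number of header lines, with any remaining arguments passed to sort.

```shell
#!/bin/sh
# Hypothetical helper: copy the first N lines of stdin through unchanged,
# then sort the rest. Usage: header_sort N [sort options]
header_sort() (
    header_lines=$1
    shift
    # IFS= and -r keep leading whitespace and backslashes intact
    while [ "$header_lines" -gt 0 ] && IFS= read -r line; do
        printf '%s\n' "$line"
        header_lines=$((header_lines - 1))
    done
    sort "$@"        # sort reads whatever is left on stdin
)

printf '%s\n' ':0:12345' ':1:6:2:3:8:4:2' \
    '500123TSTMY_RADAR00' '010005TSTDOG_FOOD01' '222334NOTALINEOUT01' |
    header_sort 2
```

The two colon lines come out first, followed by the data lines in the order 010005, 222334, 500123. The function body runs in a subshell so the counter and line variables stay local.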
You can use tail -n +3 <file> | sort ... (tail will output the file contents from the 3rd line).
head -2 <your_file> && nawk 'NR>2' <your_file> | sort
example:
> cat temp
10
8
1
2
3
4
5
> head -2 temp && nawk 'NR>2' temp | sort -r
10
8
5
4
3
2
1
It only takes 2 lines of code...
head -1 test.txt > a.tmp;
tail -n+2 test.txt | sort -n >> a.tmp;
For a numeric data, -n is required. For alpha sort, the -n is not required.
Example file:
$ cat test.txt
header
8
5
100
1
-1
Result:
$ cat a.tmp
header
-1
1
5
8
100
So here's a bash function where arguments are exactly like sort. Supporting files and pipes.
function skip_header_sort() {
    if [[ $# -gt 0 ]] && [[ -f ${@: -1} ]]; then
        local file=${@: -1}
        set -- "${@:1:$(($#-1))}"
    fi
    awk -vsargs="$*" 'NR<2{print; next}{print | "sort "sargs}' $file
}
How it works: this line checks whether there is at least one argument and whether the last argument is a file.
if [[ $# -gt 0 ]] && [[ -f ${@: -1} ]]; then
This saves the file name in a separate variable, since we're about to erase the last argument.
local file=${@: -1}
Here we remove the last argument, since we don't want to pass it as a sort argument.
set -- "${@:1:$(($#-1))}"
Finally, we do the awk part, passing the arguments (minus the last argument if it was the file) to sort in awk. This was originally suggested by Dave, and modified to take sort arguments. We rely on the fact that $file will be empty if we're piping, and thus ignored.
awk -vsargs="$*" 'NR<2{print; next}{print | "sort "sargs}' $file
Example usage with a comma separated file.
$ cat /tmp/test
A,B,C
0,1,2
1,2,0
2,0,1
# SORT NUMERICALLY SECOND COLUMN
$ skip_header_sort -t, -nk2 /tmp/test
A,B,C
2,0,1
0,1,2
1,2,0
# SORT REVERSE NUMERICALLY THIRD COLUMN
$ cat /tmp/test | skip_header_sort -t, -nrk3
A,B,C
0,1,2
2,0,1
1,2,0
Here's a bash shell function derived from the other answers. It handles both files and pipes. First argument is the file name or '-' for stdin. Remaining arguments are passed to sort. A couple examples:
$ hsort myfile.txt
$ head -n 100 myfile.txt | hsort -
$ hsort myfile.txt -k 2,2 | head -n 20 | hsort - -r
The shell function:
hsort ()
{
if [ "$1" == "-h" ]; then
echo "Sort a file or standard input, treating the first line as a header.";
echo "The first argument is the file or '-' for standard input. Additional";
echo "arguments to sort follow the first argument, including other files.";
echo "File syntax : $ hsort file [sort-options] [file...]";
echo "STDIN syntax: $ hsort - [sort-options] [file...]";
return 0;
elif [ -f "$1" ]; then
local file=$1;
shift;
(head -n 1 $file && tail -n +2 $file | sort $*);
elif [ "$1" == "-" ]; then
shift;
(read -r; printf "%s\n" "$REPLY"; sort $*);
else
>&2 echo "Error. File not found: $1";
>&2 echo "Use either 'hsort <file> [sort-options]' or 'hsort - [sort-options]'";
return 1 ;
fi
}
This is the same as Ian Sherbin's answer, but my implementation is:
cut -d'|' -f3,4,7 $arg1 | uniq > filetmp.tc
head -1 filetmp.tc > file.tc;
tail -n+2 filetmp.tc | sort -t"|" -k2,2 >> file.tc;
Another simple variation on all the others, reading a file once
HEADER_LINES=2
(head -n $HEADER_LINES; sort) < data-file.dat
With Python:
import sys
HEADER_ROWS = 2
for _ in range(HEADER_ROWS):
    sys.stdout.write(next(sys.stdin))
for row in sorted(sys.stdin):
    sys.stdout.write(row)
cat file_name.txt | sed 1d | sort
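This sorts the data lines only. Note that sed 1d deletes the first line, so the header is dropped from the output rather than kept at the top.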

How do I manipulate $PATH elements in shell scripts?

Is there an idiomatic way of removing elements from PATH-like shell variables?
That is I want to take
PATH=/home/joe/bin:/usr/local/bin:/usr/bin:/bin:/path/to/app/bin:.
and remove or replace the /path/to/app/bin without clobbering the rest of the variable. Extra points for allowing me put new elements in arbitrary positions. The target will be recognizable by a well defined string, and may occur at any point in the list.
I know I've seen this done, and can probably cobble something together on my own, but I'm looking for a nice approach. Portability and standardization a plus.
I use bash, but examples are welcome in your favorite shell as well.
The context here is one of needing to switch conveniently between multiple versions (one for doing analysis, another for working on the framework) of a large scientific analysis package which produces a couple dozen executables, has data stashed around the filesystem, and uses environment variables to help find all this stuff. I would like to write a script that selects a version, and need to be able to remove the $PATH elements relating to the currently active version and replace them with the same elements relating to the new version.
This is related to the problem of preventing repeated $PATH elements when re-running login scripts and the like.
Previous similar question: How to keep from duplicating path variable in csh
Subsequent similar question: What is the most elegant way to remove a path from the $PATH variable in Bash?
Addressing the proposed solution from dmckee:
While some versions of Bash may allow hyphens in function names, others (MacOS X) do not.
I don't see a need to use return immediately before the end of the function.
I don't see the need for all the semi-colons.
I don't see why you have path-element-by-pattern export a value. Think of export as equivalent to setting (or even creating) a global variable - something to be avoided whenever possible.
I'm not sure what you expect 'replace-path PATH $PATH /usr' to do, but it does not do what I would expect.
Consider a PATH value that starts off containing:
.
/Users/jleffler/bin
/usr/local/postgresql/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/usr/bin
/bin
/sw/bin
/usr/sbin
/sbin
The result I got (from 'replace-path PATH $PATH /usr') is:
.
/Users/jleffler/bin
/local/postgresql/bin
/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/local/bin
/bin
/bin
/sw/bin
/sbin
/sbin
I would have expected to get my original path back since /usr does not appear as a (complete) path element, only as part of a path element.
This can be fixed in replace-path by modifying one of the sed commands:
export $path=$(echo -n $list | tr ":" "\n" | sed "s:^$removestr\$:$replacestr:" |
tr "\n" ":" | sed "s|::|:|g")
I used ':' instead of '|' to separate parts of the substitute since '|' could (in theory) appear in a path component, whereas by definition of PATH, a colon cannot. I observe that the second sed could eliminate the current directory from the middle of a PATH. That is, a legitimate (though perverse) value of PATH could be:
PATH=/bin::/usr/local/bin
After processing, the current directory would no longer be on the PATH.
A similar change to anchor the match is appropriate in path-element-by-pattern:
export $target=$(echo -n $list | tr ":" "\n" | grep -m 1 "^$pat\$")
I note in passing that grep -m 1 is not standard (it is a GNU extension, also available on MacOS X). And, indeed, the -n option for echo is also non-standard; you would be better off simply deleting the trailing colon that is added by virtue of converting the newline from echo into a colon. Since path-element-by-pattern is used just once and has undesirable side-effects (it clobbers any pre-existing exported variable called $removestr), it can be replaced sensibly by its body. This, along with more liberal use of quotes to avoid problems with spaces or unwanted file name expansion, leads to:
# path_tools.bash
#
# A set of tools for manipulating ":" separated lists like the
# canonical $PATH variable.
#
# /bin/sh compatibility can probably be regained by replacing $( )
# style command expansion with ` ` style
###############################################################################
# Usage:
#
# To remove a path:
# replace_path PATH $PATH /exact/path/to/remove
# replace_path_pattern PATH $PATH <grep pattern for target path>
#
# To replace a path:
# replace_path PATH $PATH /exact/path/to/remove /replacement/path
# replace_path_pattern PATH $PATH <target pattern> /replacement/path
#
###############################################################################
# Remove or replace an element of $1
#
# $1 name of the shell variable to set (e.g. PATH)
# $2 a ":" delimited list to work from (e.g. $PATH)
# $3 the precise string to be removed/replaced
# $4 the replacement string (use "" for removal)
function replace_path () {
path=$1
list=$2
remove=$3
replace=$4 # Allowed to be empty or unset
export $path=$(echo "$list" | tr ":" "\n" | sed "s:^$remove\$:$replace:" |
tr "\n" ":" | sed 's|:$||')
}
# Remove or replace an element of $1
#
# $1 name of the shell variable to set (e.g. PATH)
# $2 a ":" delimited list to work from (e.g. $PATH)
# $3 a grep pattern identifying the element to be removed/replaced
# $4 the replacement string (use "" for removal)
function replace_path_pattern () {
path=$1
list=$2
removepat=$3
replacestr=$4 # Allowed to be empty or unset
removestr=$(echo "$list" | tr ":" "\n" | grep -m 1 "^$removepat\$")
replace_path "$path" "$list" "$removestr" "$replacestr"
}
I have a Perl script called echopath which I find useful when debugging problems with PATH-like variables:
#!/usr/bin/perl -w
#
# "@(#)$Id: echopath.pl,v 1.7 1998/09/15 03:16:36 jleffler Exp $"
#
# Print the components of a PATH variable one per line.
# If there are no colons in the arguments, assume that they are
# the names of environment variables.
@ARGV = $ENV{PATH} unless @ARGV;
foreach $arg (@ARGV)
{
    $var = $arg;
    $var = $ENV{$arg} if $arg =~ /^[A-Za-z_][A-Za-z_0-9]*$/;
    $var = $arg unless $var;
    @lst = split /:/, $var;
    foreach $val (@lst)
    {
        print "$val\n";
    }
}
When I run the modified solution on the test code below:
echo
xpath=$PATH
replace_path xpath $xpath /usr
echopath $xpath
echo
xpath=$PATH
replace_path_pattern xpath $xpath /usr/bin /work/bin
echopath xpath
echo
xpath=$PATH
replace_path_pattern xpath $xpath "/usr/.*/bin" /work/bin
echopath xpath
The output is:
.
/Users/jleffler/bin
/usr/local/postgresql/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/usr/bin
/bin
/sw/bin
/usr/sbin
/sbin
.
/Users/jleffler/bin
/usr/local/postgresql/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/work/bin
/bin
/sw/bin
/usr/sbin
/sbin
.
/Users/jleffler/bin
/work/bin
/usr/local/mysql/bin
/Users/jleffler/perl/v5.10.0/bin
/usr/local/bin
/usr/bin
/bin
/sw/bin
/usr/sbin
/sbin
This looks correct to me - at least, for my definition of what the problem is.
I note that echopath LD_LIBRARY_PATH evaluates $LD_LIBRARY_PATH. It would be nice if your functions were able to do that, so the user could type:
replace_path PATH /usr/bin /work/bin
That can be done by using:
list=$(eval echo '$'$path)
This leads to this revision of the code:
# path_tools.bash
#
# A set of tools for manipulating ":" separated lists like the
# canonical $PATH variable.
#
# /bin/sh compatibility can probably be regained by replacing $( )
# style command expansion with ` ` style
###############################################################################
# Usage:
#
# To remove a path:
# replace_path PATH /exact/path/to/remove
# replace_path_pattern PATH <grep pattern for target path>
#
# To replace a path:
# replace_path PATH /exact/path/to/remove /replacement/path
# replace_path_pattern PATH <target pattern> /replacement/path
#
###############################################################################
# Remove or replace an element of $1
#
# $1 name of the shell variable to set (e.g. PATH)
# $2 the precise string to be removed/replaced
# $3 the replacement string (use "" for removal)
function replace_path () {
path=$1
list=$(eval echo '$'$path)
remove=$2
replace=$3 # Allowed to be empty or unset
export $path=$(echo "$list" | tr ":" "\n" | sed "s:^$remove\$:$replace:" |
tr "\n" ":" | sed 's|:$||')
}
# Remove or replace an element of $1
#
# $1 name of the shell variable to set (e.g. PATH)
# $2 a grep pattern identifying the element to be removed/replaced
# $3 the replacement string (use "" for removal)
function replace_path_pattern () {
path=$1
list=$(eval echo '$'$path)
removepat=$2
replacestr=$3 # Allowed to be empty or unset
removestr=$(echo "$list" | tr ":" "\n" | grep -m 1 "^$removepat\$")
replace_path "$path" "$removestr" "$replacestr"
}
The following revised test now works too:
echo
xpath=$PATH
replace_path xpath /usr
echopath xpath
echo
xpath=$PATH
replace_path_pattern xpath /usr/bin /work/bin
echopath xpath
echo
xpath=$PATH
replace_path_pattern xpath "/usr/.*/bin" /work/bin
echopath xpath
It produces the same output as before.
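One remaining caveat with the sed-based functions is that the target is interpreted as a regular expression, so a "." in an element such as v5.10.0 matches any character. A sketch of a literal, whole-element comparison in plain POSIX sh follows. It is not part of the answer above; the function name path_replace_exact is an assumption, and it prints the new list instead of exporting a variable.

```shell
#!/bin/sh
# Hypothetical helper: print the colon-separated list $1 with every element
# that is exactly equal to $2 replaced by $3 (or removed if $3 is empty).
path_replace_exact() (
    rest=$1:                 # trailing ':' so the last element is handled like the others
    result=
    while [ -n "$rest" ]; do
        element=${rest%%:*}  # text before the first ':'
        rest=${rest#*:}      # text after the first ':'
        if [ "$element" = "$2" ]; then
            [ -n "${3-}" ] || continue   # empty replacement: drop the element
            element=$3
        fi
        result=$result:$element
    done
    printf '%s\n' "${result#:}"   # strip the leading ':' added by the loop
)

xpath=/usr/local/bin:/usr/bin:/bin
path_replace_exact "$xpath" /usr                 # no element equals /usr: unchanged
path_replace_exact "$xpath" /usr/bin /work/bin   # replace one element
path_replace_exact "$xpath" /bin                 # remove one element
```

The three calls print /usr/local/bin:/usr/bin:/bin, then /usr/local/bin:/work/bin:/bin, then /usr/local/bin:/usr/bin. To apply a result, assign it back, for example xpath=$(path_replace_exact "$xpath" /usr/bin /work/bin).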
Reposting my answer to What is the most elegant way to remove a path from the $PATH variable in Bash? :
#!/bin/bash
IFS=:
# convert it to an array
t=($PATH)
unset IFS
# perform any array operations to remove elements from the array
t=(${t[@]%%*usr*})
IFS=:
# output the new array
echo "${t[*]}"
or the one-liner:
PATH=$(IFS=':';t=($PATH);unset IFS;t=(${t[@]%%*usr*});IFS=':';echo "${t[*]}");
For deleting an element you can use sed:
#!/bin/bash
NEW_PATH=$(echo -n $PATH | tr ":" "\n" | sed "/foo/d" | tr "\n" ":")
export PATH=$NEW_PATH
will delete the paths that contain "foo" from the path.
You could also use sed to insert a new line before or after a given line.
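Edit: you can remove duplicates by piping through sort and uniq. Note that sort changes the order of the elements, and that the command below keeps only elements that occurred exactly once, so an element that appeared more than once is dropped entirely: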
echo -n $PATH | tr ":" "\n" | sort | uniq -c | sed -n "/ 1 / s/.*1 \(.*\)/\1/p" | sed "/foo/d" | tr "\n" ":"
There are a couple of relevant programs in the answers to "How to keep from duplicating path variable in csh". They concentrate more on ensuring that there are no repeated elements, but the script I provide can be used as:
export PATH=$(clnpath $head_dirs:$PATH:$tail_dirs $remove_dirs)
Assuming you have one or more directories in $head_dirs and one or more directories in $tail_dirs and one or more directories in $remove_dirs, then it uses the shell to concatenate the head, current and tail parts into a massive value, and then removes each of the directories listed in $remove_dirs from the result (not an error if they don't exist), as well as eliminating second and subsequent occurrences of any directory in the path.
This does not address putting path components into a specific position (other than at the beginning or end, and those only indirectly). Notationally, specifying where you want to add the new element, or which element you want to replace, is messy.
Just a note that bash itself can do search and replace. It supports the usual "first match or all matches" and case [in]sensitive options you would expect.
From the man page:
${parameter/pattern/string}
The pattern is expanded to produce a pattern just as in pathname expansion. Parameter is expanded and the longest match of pattern against its value is replaced with string. If pattern begins with /, all matches of pattern are replaced with string. Normally only the first match is replaced. If pattern begins with #, it must match at the beginning of the expanded value of parameter. If pattern begins with %, it must match at the end of the expanded value of parameter. If string is null, matches of pattern are deleted and the / following pattern may be omitted. If parameter is @ or *, the substitution operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with @ or *, the substitution operation is applied to each member of the array in turn, and the expansion is the resultant list.
You can also do field splitting by setting $IFS (input field separator) to the desired delimiter.
OK, thanks to all responders. I've prepared an encapsulated version of florin's answer. The first pass looks like this:
# path_tools.bash
#
# A set of tools for manipulating ":" separated lists like the
# canonical $PATH variable.
#
# /bin/sh compatibility can probably be regained by replacing $( )
# style command expansion with ` ` style
###############################################################################
# Usage:
#
# To remove a path:
# replace-path PATH $PATH /exact/path/to/remove
# replace-path-pattern PATH $PATH <grep pattern for target path>
#
# To replace a path:
# replace-path PATH $PATH /exact/path/to/remove /replacement/path
# replace-path-pattern PATH $PATH <target pattern> /replacement/path
#
###############################################################################
# Finds the _first_ list element matching $2
#
# $1 name of a shell variable to be set
# $2 name of a variable with a path-like structure
# $3 a grep pattern to match the desired element of $1
function path-element-by-pattern (){
target=$1;
list=$2;
pat=$3;
export $target=$(echo -n $list | tr ":" "\n" | grep -m 1 $pat);
return
}
# Removes or replaces an element of $1
#
# $1 name of the shell variable to set (e.g. PATH)
# $2 a ":" delimited list to work from (e.g. $PATH)
# $3 the precise string to be removed/replaced
# $4 the replacement string (use "" for removal)
function replace-path () {
path=$1;
list=$2;
removestr=$3;
replacestr=$4; # Allowed to be ""
export $path=$(echo -n $list | tr ":" "\n" | sed "s|$removestr|$replacestr|" | tr "\n" ":" | sed "s|::|:|g");
unset removestr
return
}
# Removes or replaces an element of $1
#
# $1 name of the shell variable to set (e.g. PATH)
# $2 a ":" delimited list to work from (e.g. $PATH)
# $3 a grep pattern identifying the element to be removed/replaced
# $4 the replacement string (use "" for removal)
function replace-path-pattern () {
path=$1;
list=$2;
removepat=$3;
replacestr=$4; # Allowed to be ""
path-element-by-pattern removestr $list $removepat;
replace-path $path $list $removestr $replacestr;
}
Still needs error trapping in all the functions, and I should probably stick in a repeated path solution while I'm at it.
You use it by doing a . /include/path/path_tools.bash in the working script and calling one of the replace-path* functions.
I am still open to new and/or better answers.
This is easy using awk.
Replace
{
for(i=1;i<=NF;i++)
if($i == REM)
if(REP)
print REP;
else
continue;
else
print $i;
}
Start it using
function path_repl {
echo $PATH | awk -F: -f rem.awk REM="$1" REP="$2" | paste -sd:
}
$ echo $PATH
/bin:/usr/bin:/home/js/usr/bin
$ path_repl /bin /baz
/baz:/usr/bin:/home/js/usr/bin
$ path_repl /bin
/usr/bin:/home/js/usr/bin
Append
Inserts at the given position. By default, it appends at the end.
{
if(IDX < 1) IDX = NF + IDX + 1
for(i = 1; i <= NF; i++) {
if(IDX == i)
print REP
print $i
}
if(IDX == NF + 1)
print REP
}
Start it using
function path_app {
echo $PATH | awk -F: -f app.awk REP="$1" IDX="$2" | paste -sd:
}
$ echo $PATH
/bin:/usr/bin:/home/js/usr/bin
$ path_app /baz 0
/bin:/usr/bin:/home/js/usr/bin:/baz
$ path_app /baz -1
/bin:/usr/bin:/baz:/home/js/usr/bin
$ path_app /baz 1
/baz:/bin:/usr/bin:/home/js/usr/bin
Remove duplicates
This one keeps the first occurrences.
{
for(i = 1; i <= NF; i++) {
if(!used[$i]) {
print $i
used[$i] = 1
}
}
}
Start it like this:
echo $PATH | awk -F: -f rem_dup.awk | paste -sd:
Validate whether all elements exist
The following will print an error message for all entries that are not existing in the filesystem, and return a nonzero value.
echo -n $PATH | xargs -d: stat -c %n
To simply check whether all elements are paths and get a return code, you can also use test:
echo -n $PATH | xargs -d: -n1 test -d
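Both commands above rely on GNU extensions (xargs -d and stat -c). A portable sketch of the same check in POSIX sh is given below. The function name check_path_dirs is hypothetical, and the sketch treats an empty element as the current directory, which is how PATH lookups interpret it.

```shell
#!/bin/sh
# Hypothetical helper: report every element of the colon-separated list $1
# that is not an existing directory. Exit status is 1 if any element fails.
check_path_dirs() (
    status=0
    rest=$1:                      # trailing ':' so the last element is seen too
    while [ -n "$rest" ]; do
        dir=${rest%%:*}           # text before the first ':'
        rest=${rest#*:}           # text after the first ':'
        [ -n "$dir" ] || dir=.    # an empty element means the current directory
        if [ ! -d "$dir" ]; then
            echo "not a directory: $dir" >&2
            status=1
        fi
    done
    exit "$status"
)

check_path_dirs "/:." && echo "all elements are directories"
```

A typical call would be check_path_dirs "$PATH"; the messages go to standard error so that the exit status can be tested quietly with 2>/dev/null.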
Suppose
echo $PATH
/usr/lib/jvm/java-1.6.0/bin:lib/jvm/java-1.6.0/bin/:/lib/jvm/java-1.6.0/bin/:/usr/lib/qt-3.3/bin:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/tvnadeesh/bin
If you want to remove /lib/jvm/java-1.6.0/bin/, do the following:
export PATH=$(echo $PATH | sed 's/\/lib\/jvm\/java-1.6.0\/bin\/://g')
sed takes its input from echo $PATH and replaces /lib/jvm/java-1.6.0/bin/: with the empty string, which removes that element.
Order of PATH is not disturbed
Handles corner cases like empty path, space in path gracefully
Partial match of dir does not give false positives
Treats path at head and tail of PATH in proper ways. No : garbage and such.
Say you have
/foo:/some/path:/some/path/dir1:/some/path/dir2:/bar
and you want to replace
/some/path
Then it correctly replaces "/some/path" but leaves "/some/path/dir1" and "/some/path/dir2" alone, as you would expect.
function __path_add(){
if [ -d "$1" ] ; then
local D=":${PATH}:";
[ "${D/:$1:/:}" == "$D" ] && PATH="$PATH:$1";
PATH="${PATH/#:/}";
export PATH="${PATH/%:/}";
fi
}
function __path_remove(){
local D=":${PATH}:";
[ "${D/:$1:/:}" != "$D" ] && PATH="${D/:$1:/:}";
PATH="${PATH/#:/}";
export PATH="${PATH/%:/}";
}
# Just for the sake of completeness
function __path_replace(){
if [ -d "$2" ] ; then
local D=":${PATH}:";
if [ "${D/:$1:/:}" != "$D" ] ; then
PATH="${D/:$1:/:$2:}";
PATH="${PATH/#:/}";
export PATH="${PATH/%:/}";
fi
fi
}
Related post
What is the most elegant way to remove a path from the $PATH variable in Bash?
I prefer using ruby to the likes of awk/sed/foo these days, so here's my approach to deal with dupes,
# add it to the path
PATH=~/bin/:$PATH:~/bin
export PATH=$(ruby -e 'puts ENV["PATH"].split(/:/).uniq.join(":")')
create a function for reuse,
mungepath() {
export PATH=$(ruby -e 'puts ENV["PATH"].split(/:/).uniq.join(":")')
}
Hash, arrays and strings in a ruby one liner :)
The first thing to pop into my head to change just part of a string is a sed substitution.
example:
if echo $PATH => "/usr/pkg/bin:/usr/bin:/bin:/usr/pkg/games:/usr/pkg/X11R6/bin"
then to change "/usr/bin" to "/usr/local/bin" could be done like this:
## writes the result to standard output
## the "=" character is used as the delimiter instead of slash ("/") since that would be messy;
# the alternative delimiter should be a character unlikely to appear in PATH
## the path separator character ":" is both removed and re-added here,
# might want an extra colon after the last path
echo $PATH | sed 's=/usr/bin:=/usr/local/bin:='
This solution replaces an entire path-element so might be redundant if new-element is similar.
If the new PATHs aren't dynamic but always come from some constant set, you could save those in variables and assign them as needed:
PATH=$TEMP_PATH_1;
# commands ... ; \n
PATH=$TEMP_PATH_2;
# commands etc... ;
Might not be what you were thinking. some of the relevant commands on bash/unix would be:
pushd
popd
cd
ls # maybe l -1A for single column;
find
grep
which # could confirm that file is where you think it came from;
env
type
..and all that and more have some bearing on PATH or directories in general. The text altering part could be done any number of ways!
Whatever solution chosen would have 4 parts:
1) fetch the path as it is
2) decode the path to find the part needing changes
3) determining what changes are needed/integrating those changes
4) validation/final integration/setting the variable
In line with dj_segfault's answer, I do this in scripts that append/prepend environment variables that might be executed multiple times:
ld_library_path=${ORACLE_HOME}/lib
LD_LIBRARY_PATH=${LD_LIBRARY_PATH//${ld_library_path}?(:)/}
export LD_LIBRARY_PATH=${ld_library_path}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Using this same technique to remove, replace or manipulate entries in PATH is trivial given the filename-expansion-like pattern matching and pattern-list support of shell parameter expansion. Note that the ?(:) pattern list used above requires shopt -s extglob in bash.

Resources