How to compare two files in shell script? - unix

Here is my scenario.
I have two files of records, and characters 3-25 of each record are an identifier. I need to compare the two files on this identifier and update the old file with the new file's data wherever the identifiers match. Identifiers start with 01.
Please look at the script below.
This gives an error, "argument expected" at line 12, which I am not able to understand.
#!/bin/ksh
while read line
do
c=`echo $line|grep '^01' `
if [ $c -ne NULL ];
then
var=`echo $line|cut -c 3-25`
fi
while read i
do
d=`echo $i|grep '^01' `
if [ $d -ne NULL ];
then
var1=`echo $i|cut -c 3-25`
if [ $var -eq $var1 ];
then
$line=$i
fi
fi
done < test_monday
done < test_sunday
Please help me out thanks in advance

I think what you need is :
if [ "$d" != NULL ];
Try.

I think you could use the diff command:
diff file1 file2 > whats_the_diff.txt

Unless you are writing a script for portability to the original Bourne shell or others that do not support the feature, in Bash and ksh you should use the [[ form of test for strings and files.
There is a reduced need for quoting and escaping, additional conditions such as pattern and regular expression matching and the ability to use && and || instead of -a and -o.
if [[ $var == $var1 ]]
Also, "NULL" is not a special value in Bash and ksh and so your test will always succeed since $d is tested against the literal string "NULL".
if [[ $d != "" ]]
or
if [[ $d ]]
For numeric values (not including leading zeros unless you're using octal), you can use numeric expressions. You can omit the dollar sign for variables in this context.
numval=41
if ((++numval >= 42)) # increment then test
then
echo "don't panic"
fi
It's not necessary to use echo and cut for substrings. In Bash and ksh you can do:
var=${line:2:23}
Note: cut uses 1-based character positions for the beginning and end of a range, while this shell construct uses a 0-based starting offset and a character count, so you have to adjust the numbers accordingly: cut -c 3-25 corresponds to offset 2 and length 23.
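To see the two numbering schemes side by side, here is a small sketch that extracts characters 3-25 both ways. The function names are made up for this illustration, and the second function assumes bash, ksh93 or zsh, since ${var:offset:length} is not part of plain POSIX sh.

```bash
# id_with_cut LINE: identifier via an external cut process.
# cut counts characters from 1 and takes an inclusive start-end range.
id_with_cut() {
    printf '%s\n' "$1" | cut -c 3-25
}

# id_with_expansion LINE: same identifier via parameter expansion.
# The offset counts from 0, so character 3 is offset 2; 25 - 3 + 1 = 23 characters.
id_with_expansion() {
    line=$1
    printf '%s\n' "${line:2:23}"
}
```

Both functions should print the same 23 characters for any record that is at least 25 characters long.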
And it's a good idea to get away from using backticks. Use $() instead. This can be nested and quoting and escaping is reduced or easier.
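For the task in the question itself, nested read loops are probably not needed at all. The following is only a sketch under some assumptions: the records of interest are the lines starting with 01 (as in the question's grep), the identifier is characters 3-25, the new file is not empty, and both files fit comfortably in memory. The function name merge_by_id is hypothetical.

```sh
# merge_by_id OLD NEW: print OLD with each "01" record replaced by the
# NEW record that carries the same identifier (characters 3-25).
merge_by_id() {
    awk '
        # First file read (NEW): remember every "01" record by its identifier.
        NR == FNR { if (/^01/) new[substr($0, 3, 23)] = $0; next }
        # Second file read (OLD): print the NEW record when identifiers match.
        /^01/ && (substr($0, 3, 23) in new) { print new[substr($0, 3, 23)]; next }
        # Everything else passes through unchanged.
        { print }
    ' "$2" "$1"
}
```

With the file names from the question this might be called as merge_by_id test_sunday test_monday > test_sunday.updated, writing to a new file rather than overwriting the old one in place.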

Related

Zsh returning `<function>:<linenumber> = not found`

I used to have the following tmux shortcut function defined in a separate script and aliased, which worked fine but was messy. I decided to move it to my .zshrc, where it naturally belongs, and encountered a problem I wasn't able to figure out.
function t () {re='^[0-9]+$'
if [ "$1" == "kill" ]
then
tmux kill-session -t $2
elif [[ "$1" =~ "$re" ]]
then
tmux attach-session -d -t $1
fi}
I source my .zshrc, call the function, and get:
t:1: = not found
I know the function is defined:
╭─bennett@Io [~] using
╰─○ which t
t () {
re='^[0-9]+$'
if [ "$1" == "kill" ]
then
tmux kill-session -t $2
elif [[ "$1" =~ "$re" ]]
then
tmux attach-session -d -t $1
fi
}
I'm assuming this is complaining about the first line of the function. I've tried shifting the first line of the function down several lines, which doesn't change anything except which line the error message refers to. Any clue what's going on? I haven't found anything relating to this specific issue on SO.
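POSIX only specifies a single = for checking the equality of two strings with [ (or test). In zsh, an unquoted == is also subject to = expansion: a word starting with = is replaced by the path of the command named after it, so == makes zsh look for a command called =, which fails with the "= not found" error message. (See man 1 test)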
zsh has the [ builtin mainly for compatibility reasons. It tries to implement POSIX where possible, with all the quirks this may bring (See the Zsh Manual).
Unless you need a script to be POSIX compliant (e.g. for compatibility with other shells), I would strongly suggest using conditional expressions, that is [[ ... ]], instead of [ ... ]. They have more features, do not require quotes or other workarounds for possibly empty values, and even allow arithmetic expressions.
Wrapping the first conditional in a second set of square-brackets seemed to resolve the issue.
More information on single vs double brackets here:
Is [[ ]] preferable over [ ] in bash scripts?
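If the function ever has to run in a plain POSIX shell as well, the same dispatch can be written with a single = and a case pattern instead of =~. This is only a sketch: t_action is a made-up name, and it prints the tmux subcommand it would run instead of calling tmux, so the logic can be tried without a running tmux server.

```sh
# Hypothetical helper: report which branch the t function would take.
t_action() {
    # A single = is the portable string-equality operator for [ / test.
    if [ "$1" = "kill" ]; then
        echo "kill-session $2"
    else
        # POSIX sh has no =~, so "digits only" is checked with a case pattern.
        case $1 in
            ''|*[!0-9]*) echo "nothing" ;;           # empty or contains a non-digit
            *)           echo "attach-session $1" ;; # one or more digits
        esac
    fi
}
```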

Getting highest extensions value in unix script

I need to create new files with extensions like file.1, file.2, file.3: check which numbered files already exist and create file.(n+1), where n is the highest existing number. I was trying to get the extensions using basename, but it doesn't work when several files match:
file=`basename $file.*`
ext=${file##*.}
It only works when I give the whole file name, like $file.3
If the filenames are guaranteed not to have newline characters in them, you can, for example, use standard unix text processing tools:
printf '%s\n' file.* | #full list
sed 's/.*\.//' | #extensions
grep '^[0-9][0-9]*$' | #numerical extensions
awk '{ if($0>m) m=$0} END{ print m }' #get maximum
Here's my take on this.
You can do this entirely in standard awk.
$ awk '{ext=FILENAME;sub(/.*\./,"",ext)} ext~/^[0-9]+$/&&ext+0>n+0{n=ext}{nextfile} END {print n}' *.*
Broken out for easier reading:
$ awk '
{
# Capture the extension...
ext=FILENAME
sub(/.*\./,"",ext)
}
# Then, if we have a numeric extension that is bigger than "n"...
# (the +0 forces a numeric comparison; compared as strings, "9" would beat "10")
ext ~ /^[0-9]+$/ && ext+0 > n+0 {
# let "n" be that extension.
n=ext
}
{
# We aren't actually interested in the contents of this file, so move on.
nextfile
}
# No more files? Print our result.
END {print n}
' *.*
The idea here is that we'll step through the list of filenames and let awk do ALL the processing to capture and "sort" the extensions. (We're not really sorting, we're just recording the highest number as we pass through the files.)
There are a few provisos with this solution:
This only works if all the files have a non-zero length. Technically awk conditions are being compared on "lines of the file", so if there are no lines, awk will pass right by that file.
You don't really need to use the ext variable, you can modify FILENAME directly. I included it for improved readability.
The nextfile command is fairly standard, but not universal. If you have a very old machine, or are running an esoteric variety of unix, nextfile may not be included. (I don't expect this to be a problem.)
Another alternative, which might be easier for you, would be to implement the same logic directly in POSIX shell:
$ n=0; for f in *.*; do ext=${f##*.}; if expr "$ext" : '[0-9][0-9]*$' >/dev/null && [ "$ext" -gt "$n" ]; then n="$ext"; fi; done; echo "$n"
Or, again broken out for easier reading (or scripting):
n=0
for f in *.*; do
ext=${f##*.}
if expr "$ext" : '[0-9][0-9]*$' >/dev/null && [ "$ext" -gt "$n" ]; then
n="$ext"
fi
done
echo "$n"
This steps through all files using a for loop, captures the extension, makes sure it's numeric, determines whether it's greater than "n" and records it if it is, then prints its result.
It requires no pipes and no external tools except expr, which is a POSIX.1 tool available on every system.
One proviso for this solution is that if you have NO files with extensions (i.e. *.* returns no files), this script will erroneously report that the highest numbered extension is 0. You can of course handle that easily enough, but I thought I should mention it.
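Building on the loop above, here is a rough sketch that goes one step further and prints the next free name, which is what the question ultimately asks for. The function name next_numbered is hypothetical, and the sketch assumes the numeric extensions have no leading zeros (a value such as 08 could be read as invalid octal by the arithmetic expansion in some shells).

```sh
# next_numbered BASE: print BASE.(n+1), where n is the highest existing
# numeric extension among BASE.*, or BASE.1 when there is none.
next_numbered() {
    n=0
    for f in "$1".*; do
        ext=${f##*.}
        case $ext in
            ''|*[!0-9]*) continue ;;   # skip non-numeric extensions and an unmatched literal *
        esac
        if [ "$ext" -gt "$n" ]; then
            n=$ext
        fi
    done
    echo "$1.$((n + 1))"
}
```

Because an unmatched pattern is left as a literal BASE.* and then skipped as non-numeric, an empty directory simply yields BASE.1.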
Thanks for all the answers. I've come up with a quite similar and slightly simpler idea, which I'd like to present:
n=0
for i in file.*; do
#reading the extensions
ext=${i##*.}
if [ "$ext" -gt "$n" ];
then
#increasing n
n=$((n+1))
fi
done
Then, if we want the number that exceeds n by one:
until [[ $a -gt "$n" ]]; do
a=$((a+1))
done
Finally, a is one greater than the number of file extensions. So if there are three files, file.1, file.2 and file.3, the returned value will be 4.

Combining file tests in Zsh

What is the most elegant way in zsh to test whether a file is a readable regular file?
I understand that I can do something like
if [[ -r "$name" && -f "$name" ]]
...
But it requires repeating "$name" twice. I know that we can't combine conditions (-rf $name), but maybe some other feature in zsh could be used?
By the way, I considered also something like
if ls ${name}(R.) >/dev/null 2>&1
...
But in this case, the shell would complain "no matches found" when $name does not fulfil the criterion. Setting NULL_GLOB wouldn't help here either, because it would just replace the pattern with an empty string, and the expression would always be true.
In very new versions of zsh (works for 5.0.7, but not 5.0.5) you could do this
setopt EXTENDED_GLOB
if [[ -n $name(#qNR.) ]]
...
$name(#qNR.) matches files with name $name that are readable (R) and regular (.). N enables NULL_GLOB for this match. That is, if no files match the pattern, it does not produce an error but is removed from the argument list. -n checks that the match is in fact non-empty. EXTENDED_GLOB is needed to enable the (#q...) type of extended globbing, which in turn is needed because parentheses usually have a different meaning inside conditional expressions ([[ ... ]]).
Still, while it is indeed possible to write something up that uses $name only once, I would advise against it. It is rather more convoluted than the original solution and thus harder to understand (i.e. needs thinking) for the next guy who reads it (your future self counts as "next guy" after at most half a year). And this solution will work only on zsh, and there only on new versions, while the original would run unaltered on bash.
How about making small shell functions, as you mentioned?
tests-raw () {
setopt localoptions no_ksharrays
local then="$1"; shift
local f="${@[-1]}" t=
local -i ret=0
set -- "${@[1,-2]}"
for t in ${@[@]}; do
if test "$t" "$f"; then
ret=$?
"$then"
else
return $?
fi
done
return ret
}
and () tests-raw continue "${@[@]}";
or () tests-raw break "${@[@]}";
# examples
name=/dev/null
if and -r -c "$name"; then
echo 'Ok, it is a readable+character special file.'
fi
#>> Ok, it is...
and -r -f ~/.zshrc ; echo $? #>> 0
or -r -d ~/.zshrc ; echo $? #>> 0
and -r -d ~/.zshrc ; echo $? #>> 1
# It could be `and -rd ~/.zshrc` possible.
I feel this is somewhat overkill though.
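A lighter-weight variant of the same idea, as a rough sketch: wrap the original two tests in one small function, so that the file name is written only once at each call site. The name is_readable_file is made up for this illustration, and the function uses nothing beyond POSIX test, so it should behave the same in zsh, bash and sh.

```sh
# Hypothetical helper: succeed when FILE is a regular file and is readable.
is_readable_file() {
    [ -f "$1" ] && [ -r "$1" ]
}
```

It would be used as: if is_readable_file "$name"; then ... fi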

ZSH subString extraction

Goal
In a zsh script, for a given argument, I want to obtain the first character and the rest of the string.
For instance, when the script is named test
sh test hello
should extract h and ello.
ZSH manual
http://zsh.sourceforge.net/Doc/zsh_a4.pdf
says:
Subscripting may also be performed on non-array values, in which case the subscripts specify a
substring to be extracted. For example, if FOO is set to ‘foobar’, then ‘echo $FOO[2,5]’ prints
‘ooba’.
Q1
So, I wrote a shell script in a file named test
echo $1
echo $1[1,1]
terminal:
$ sh test hello
hello
hello[1,1]
the result fails. What's wrong with the code?
Q2
Also, I don't know how to extract a substring from position n to the end. Do I perhaps have to use an array split by regex?
EDIT: Q3
This may be another question, so if it's more appropriate to start a new thread, I will do so.
Thanks to @skishore, here is the further code:
#! /bin/zsh
echo $1
ARG_FIRST=`echo $1 | cut -c1`
ARG_REST=`echo $1 | cut -c2-`
echo ARG_FIRST=$ARG_FIRST
echo ARG_REST=$ARG_REST
if $ARG_FIRST = ""; then
echo nullArgs
else
if $ARG_FIRST = "#"; then
echo #Args
else
echo regularArgs
fi
fi
I'm not sure how to compare string variables to strings. For the argument hello, the result is:
command not found: h
What's wrong with the code?
EDIT2:
What I've found right
#! /bin/zsh
echo $1
ARG_FIRST=`echo $1 | cut -c1`
ARG_REST=`echo $1 | cut -c2-`
echo ARG_FIRST=$ARG_FIRST
echo ARG_REST=$ARG_REST
if [ $ARG_FIRST ]; then
if [ $ARG_FIRST = "#" ]; then
echo #Args
else
echo regularArgs
fi
else
echo nullArgs
fi
EDIT3:
As the result of whole, this is what I've done with this question.
https://github.com/kenokabe/GitSnapShot
GitSnapShot is a ZSH thin wrapper for Git commands for easier and simpler usage
A1
As others have said, you need to wrap it in curly braces. Also, use a command interpreter (#!...), mark the file as executable, and call it directly.
#!/bin/zsh
echo $1
echo ${1[1,1]}
A2
The easiest way to extract a substring from a parameter (zsh calls variables parameters) is to use parameter expansion. Using the square brackets tells zsh to treat the scalar (i.e. string) parameter as an array. For a single character, this makes sense. For the rest of the string, you can use the simpler ${parameter:start:length} notation instead. If you omit the :length part (as we will here), then it will give you the rest of the scalar.
File test:
#!/bin/zsh
echo ${1[1]}
echo ${1:1}
Terminal:
$ ./test Hello
H
ello
A3
As others have said, you need (preferably double) square brackets to test. Also, to test if a string is NULL use -z, and to test if it is not NULL use -n. You can just put a string in double brackets ([[ ... ]]), but it is preferable to make your intentions clear with -n.
if [[ -z "${ARG_FIRST}" ]]; then
...
fi
Also remove the space between #! and /bin/zsh.
And if you are checking for equality, use ==; if you are assigning a value, use =.
RE:EDIT2:
Declare all parameters to set the scope. If you do not, you may clobber or use a parameter inherited from the shell, which may cause unexpected behavior. Google's shell style guide is a good resource for stuff like this.
Use builtins over external commands.
Avoid backticks. Use $(...) instead.
Use single quotes when quoting a literal string. This prevents pattern matching.
Make use of elif or case to avoid nested ifs. case will be easier to read in your example here, but elif will probably be better for your actual code.
Using case:
#!/bin/zsh
typeset ARG_FIRST="${1[1]}"
typeset ARG_REST="${1:1}"
echo $1
echo 'ARG_FIRST='"${ARG_FIRST}"
echo 'ARG_REST='"${ARG_REST}"
case "${ARG_FIRST}" in
('') echo 'nullArgs' ;;
('#') echo '#Args' ;;
(*)
# Recommended formatting example with more than 1 sloc
echo 'regularArgs'
;;
esac
using elif:
#!/bin/zsh
typeset ARG_FIRST="${1[1]}"
typeset ARG_REST="${1:1}"
echo $1
echo 'ARG_FIRST='"${ARG_FIRST}"
echo 'ARG_REST='"${ARG_REST}"
if [[ -z "${ARG_FIRST}" ]]; then
echo nullArgs
elif [[ '#' == "${ARG_FIRST}" ]]; then
echo '#Args'
else
echo regularArgs
fi
RE:EDIT3
Use "$@" unless you really know what you are doing.
You can use the cut command:
echo $1 | cut -c1
echo $1 | cut -c2-
Use $() to assign these values to variables:
ARG_FIRST=$(echo $1 | cut -c1)
ARG_REST=$(echo $1 | cut -c2-)
echo ARG_FIRST=$ARG_FIRST
echo ARG_REST=$ARG_REST
You can also replace $() with backticks, but the former is recommended and the latter is somewhat deprecated due to nesting issues.
So, I wrote a shell script in a file named test
$ sh test hello
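This isn't being run as a zsh script: you're calling it with sh, which on most systems is bash, dash or another POSIX-style shell rather than zsh, and running it this way ignores the shebang line. If you've got the shebang (#!/bin/zsh), you can make the script executable (chmod +x <script>) and run it directly: ./<script>. Alternatively, you can run it with zsh <script>.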
the result fails. What's wrong with the code?
You can wrap in braces:
echo ${1} # This'll work with or without the braces.
echo ${1[3,5]} # This works in the braces.
echo $1[3,5] # This doesn't work.
Running this as ./test-script.zsh hello gives:
./test-script.zsh hello
hello
llo
./test-script.zsh:5: no matches found: hello[3,5]
Also I don't know how to extract subString from n to the last. Perhaps do I have to use Array split by regex?
Use the [n,last] notation, but wrap it in braces. We can determine how long our variable is with ${#1}, then use the length:
# Store the length of $1 in LENGTH.
LENGTH=${#1}
echo ${1[2,${LENGTH}]} # Display from `2` to `LENGTH`.
This'll produce ello (prints from the 2nd to the last character of hello).
Script to play with:
#!/usr/local/bin/zsh
echo ${1} # Print the input
echo ${1[3,5]} # Print from 3rd->5th characters of input
LENGTH=${#1}
echo ${1[2,${LENGTH}]} # Print from 2nd -> last characters of input.
You can use the cut command:
But that would be using extra baggage - zsh is quite capable of doing all this on its own without spawning multiple subshells for simple operations.
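Since the script in the question was started with sh, it may also help to know that the split into first character and rest can be done without cut and without zsh subscripts, using only POSIX parameter expansion. This is a sketch; the function name split_first and the variables FIRST and REST are chosen for this illustration.

```sh
# split_first STRING: set FIRST to the first character and REST to the remainder.
split_first() {
    REST=${1#?}            # remove one leading character (if there is one)
    FIRST=${1%"$REST"}     # remove that remainder from the end, leaving the first character
}
```

After split_first hello, FIRST holds h and REST holds ello; for an empty argument both end up empty.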

UNIX command line argument referencing issues

I'm trying to tell unix to print out the command line arguments passed to a Bourne Shell script, but it's not working. I get the value of x at the echo statement, and not the command line argument at the desired location.
This is what I want:
./run a b c d
a
b
c
d
this is what I get:
1
2
3
4
What's going on? I know that UNIX is confused about what I'm referencing in the shell script (the variable x or the command line argument at the x'th position). How can I clarify what I mean?
#!/bin/sh
x=1
until [ $x -gt $# ]
do
echo $x
x=`expr $x + 1`
done
EDIT: Thank you all for the responses, but now I have another question; what if you wanted to start counting not at the first argument, but at the second, or third? So, what would I do to tell UNIX to process elements starting at the second position, and ignore the first?
echo $*
$x is not the xth argument. It's the variable x, and x=`expr $x + 1` is like x++ in other languages.
The simplest change to your script to make it do what you asked is this:
#!/bin/sh
x=1
until [ $x -gt $# ]
do
eval "echo \${$x}"
x=`expr $x + 1`
done
HOWEVER (and this is a big however), using eval (especially on user input) is a huge security problem. A better way is to use shift and the first positional argument variable like this:
#!/bin/sh
while [ $# -gt 0 ]; do
x=$1
shift
echo ${x}
done
If you want to start counting at the 2nd argument (in bash, ksh93 or zsh; this slice syntax is not available in plain Bourne/POSIX sh):
for i in "${@:2}"
do
echo $i
done
A solution not using shift:
#!/bin/sh
for arg in "$@"; do
printf "%s " "$arg"
done
echo
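For the follow-up question about starting at the second or third argument in a plain Bourne-style script, one possible approach is to shift the unwanted arguments away first and then loop over what is left. This is only a sketch, and the function name print_from is hypothetical.

```sh
# print_from START ARG...: print the arguments from position START onwards,
# one per line (START=1 prints all of them).
print_from() {
    start=$1
    shift                      # drop START itself from the argument list
    # Drop the first START-1 arguments, stopping early if we run out.
    while [ "$start" -gt 1 ] && [ "$#" -gt 0 ]; do
        shift
        start=$((start - 1))
    done
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}
```

Inside a script this could be called as print_from 2 "$@" to ignore the first command line argument.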

Resources