I'm trying to write a simple script that finds the most recently modified file in a directory and then counts the lines of that file. Below is part of my script.
Note: the $stg variable is created for another directory
echo "LATEST LOG = $(ls -ltr $stg/abc/foo***.txt | awk '{print $5, $6, $7, $8, $9}' | tail -n1)"
echo "COUNT = $(wc -l $stg/abc/foo***.txt | tail -n1)"
The problem is that the "COUNT" part does not match the LATEST LOG, because it seems to be counting a different log file.
Any suggestions? Thank you!
Suggestion: store the result of the latest log in a variable, and reuse it in the count. Like this:
#!/bin/bash
latestlogline=$(ls -ltr "$stg"/abc/foo*.txt | awk '{print $5, $6, $7, $8, $9}' | tail -n1)
latestlogfilename=$(echo "$latestlogline" | awk 'NF>1{print $NF}')
echo "LATEST LOG = $latestlogline"
echo "COUNT = $(wc -l "$latestlogfilename")"
Details:
latestlogline: your code exactly, to extract the complete line of information
latestlogfilename: just the filename. wc -l expects a filename, so extract it from your first command.
Then just echo the variable values.
As commented before, *** is exactly the same thing as *.
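As an aside, a simpler sketch for the same problem (assuming the file names contain no spaces or newlines) is to let ls -t sort by modification time directly:
#!/bin/bash
# ls -t lists newest first; head -n1 keeps only the most recently modified file
latestlogfilename=$(ls -t "$stg"/abc/foo*.txt | head -n1)
echo "LATEST LOG = $(ls -l "$latestlogfilename")"
echo "COUNT = $(wc -l < "$latestlogfilename")"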
I'm new to Unix, and I'm trying to fetch the files in a directory that have the current date. I tried the command below, but it fetches some other files instead:
cd /path/; ls -lrt abc833* | grep `date '+%d'`
I also want to try something like the following, but it doesn't work:
for file in /path/abc833*
if [ `$file | awk '{print $7}'` =`date '+%d'`];then
echo $file
fi
done
What's the mistake?
Why not use find?
find ./ -mtime -1
returns all files modified in the last 24 hours. Also make sure the date command is wrapped in a command substitution:
cd /path/; ls -lrt abc833* | grep $(date '+%d')
%d only gives the day of the month; today that would be "28". That would also match "20:28" in the time column, or the 28th of an earlier month.
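To avoid those false matches, a hedged alternative sketch is to let find do the date test itself instead of grepping ls output (GNU find assumed for -daystart):
# modified within the last 24 hours
find /path -maxdepth 1 -name 'abc833*' -mtime -1
# or restricted to the current calendar day
find /path -maxdepth 1 -name 'abc833*' -daystart -mtime 0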
EDIT:
The syntax in your first command was actually fine; you wrapped the date command correctly. Your second approach, however, is full of syntax errors: the for loop is missing its do, and $file | awk '{print $7}' tries to execute each file and pipe its output to awk; you forgot an ls -l.
But the same date problem applies there. stat -c %Y <file> gives you the modification time of a file in seconds since the epoch, which may be easier to calculate with.
cd /path/; ls -lrt abc833* | sed 1d | tr -s ' ' | cut -d' ' -f9 | grep $(date '+%d')
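Or, following the stat idea above, a minimal sketch (GNU stat assumed) that picks files modified within the last 24 hours by comparing epoch seconds:
now=$(date +%s)
for file in /path/abc833*; do
    mtime=$(stat -c %Y "$file")   # modification time in seconds since the epoch
    if [ $((now - mtime)) -lt 86400 ]; then
        echo "$file"
    fi
done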
You can do all the logic in awk:
ls -ltr | awk '{date=strftime("%d"); if($7==date){f="";for(i=9;i<=NF;i++){f=f" "$i} print f}}'
If your file names do not contain spaces, it can be simplified:
ls -ltr | awk '{date=strftime("%d"); if($7==date){print $9}}'
And if, instead of the file name, you want the whole line from ls -ltr:
ls -ltr | awk '{date=strftime("%d"); if($7==date){print $0}}'
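Note that strftime() is not in POSIX awk (gawk provides it); if it is not available, one workaround is to pass today's day in from the shell instead:
ls -ltr | awk -v date="$(date '+%d')" '$7==date {print $9}'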
I am trying to unzip the files in a folder and print the first line, which should contain LASTMODIFIEDDATE.
But the code below prints the first line with '-' where the filename should be.
for file in /export/home/xxxxxx/New_folder/*.gz;
do
gzip -dc "$file" | awk 'NR=1 {print $0, FILENAME}' | awk '/LASTMODIFIEDDATE/'
done
1. How can I modify the above code to print the name of the file being unzipped?
2. I am a beginner, and suggestions to improve the above code are welcome.
A few issues:
Your first awk should have double equals signs if you mean to address the first line:
awk 'NR==1{...}'
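A quick way to see the difference (hypothetical three-line input): with a single = sign, awk treats NR=1 as an assignment, which is always true, so every line is printed.
printf 'a\nb\nc\n' | awk 'NR=1'      # assignment: always true, prints a, b and c
printf 'a\nb\nc\n' | awk 'NR==1'     # comparison: prints only a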
Your second awk will only ever see the output of the first awk, which is just the first line of each file, so it will never find lines containing LASTMODIFIED unless they happen to be on that first line. This will show you the first line and any lines containing LASTMODIFIED:
for file in /export/home/xxxxxx/New_folder/*.gz
do
echo "$file"
gzip -dc "$file" | awk 'NR==1 || /LASTMODIFIED/'
done
Or you may mean this:
for file in /export/home/xxxxxx/New_folder/*.gz
do
gzip -dc "$file" | awk -v file="$file" 'NR==1{print $0 " " file} /LASTMODIFIED/'
done
which will print the first line followed by the filename and also any lines containing LASTMODIFIED.
Print the filename with an echo. Also, you might want to use grep instead of awk in this case:
for file in /export/home/xxxxxx/New_folder/*.gz;
do
echo "$file"
gzip -dc "$file" | grep LASTMODIFIEDDATE
done
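As a side note, assuming zgrep is available (it normally ships with gzip), the whole loop can be collapsed into a single command; -H makes the name of each archive print next to its matching lines:
zgrep -H LASTMODIFIEDDATE /export/home/xxxxxx/New_folder/*.gz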
I've always used Stack Overflow to get help with issues, but this is my first post. I am new to UNIX scripting, and I was given a task to get the values of column two and then run a command on them. The command I am supposed to run is 'echo -n "$2" | openssl dgst -sha1;', which hashes a value. My problem is not hashing one value, but hashing all of them and then printing them. Can someone help me figure this out? This is how I am starting, but I think the path I am going down is wrong.
NOTE: this is a CSV text file, and I know I need to use the awk command for this.
awk 'BEGIN { FS = "," } ; { print $2 }'
while [ "$2" != 0 ];
do
echo -n "$2" | openssl dgst -sha1
done
This prints the second column in its entirety and also prints some sort of hashed value.
Sorry for the long first post, just trying to be as specific as possible. Thanks!
You don't really need awk just to extract the second column. You can do it with the bash read builtin, setting IFS to the delimiter:
while IFS=, read -ra line; do
[[ ${line[1]} != 0 ]] && echo -n "${line[1]}" | openssl dgst -sha1   # -n to match the question's echo -n
done < inputFile
You should probably post some sample input data and the error you are getting so that someone can debug your existing code better.
This will do the trick:
$ awk '{print $2}' file | xargs -n1 sh -c 'printf "%s" "$0" | openssl dgst -sha1'
Use awk to print the second field in the file and xargs with -n1 to hand each value separately to a small shell command; the shell pipes the value into openssl, since openssl dgst hashes stdin and would treat a bare argument as a file name.
If by CSV you mean each field is separated by a comma, then you need to add -F, to awk.
$ awk -F, '{print $2}' file | xargs -n1 sh -c 'printf "%s" "$0" | openssl dgst -sha1'
Is there a program that produces output like wc -l but keeps updating the counter as more data arrives?
Here is what I want it for:
tail -f log/production.log | grep POST | wc -l
But wc -l needs to be replaced with something else.
tail -f log/production.log | grep --line-buffered POST | awk '{printf "\r%d", ++i} END {print ""}'
This prints the line count after every line of input. The carriage return \r makes each line number overwrite the last, so you only see the most recent one.
Use grep --line-buffered to make grep flush its output after each line rather than every 4KB. Or you can combine the grep and awk into one:
tail -f log/production.log | awk '/POST/ {printf "\r%d", ++i} END {print ""}'
In order to use the uniq command, you have to sort your file first.
But in the file I have, the order of the lines is important. How can I keep the original order of the file but still get rid of the duplicate content?
Another awk version:
awk '!_[$0]++' infile
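For example, with a hypothetical four-line input where "a" repeats, the duplicate is dropped and the original order is kept:
printf 'a\nb\na\nc\n' | awk '!_[$0]++'     # prints a, b, c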
This awk keeps the first occurrence. Same algorithm as other answers use:
awk '!($0 in lines) { print $0; lines[$0]; }'
Here's one that only needs to store duplicated lines (as opposed to all lines) using awk:
sort file | uniq -d | awk '
FNR == NR { dups[$0] }
FNR != NR && (!($0 in dups) || !lines[$0]++)
' - file
There's also the "line-number, double-sort" method: number every line, sort unique on the content (field 2 onward), re-sort by the original line number, and cut the numbers back off.
nl -ba -n ln file | sort -u -k 2 | sort -k 1n | cut -f 2-
You can run uniq -d on the sorted version of the file to find the duplicate lines, then run some script that says:
if this_line is in duplicate_lines {
if not i_have_seen[this_line] {
output this_line
i_have_seen[this_line] = true
}
} else {
output this_line
}
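A minimal bash sketch of that pseudo-code (assuming bash 4 for the associative array):
#!/bin/bash
sort file | uniq -d > duplicate_lines   # only the lines that occur more than once
declare -A seen
while IFS= read -r line; do
    if grep -qxF -- "$line" duplicate_lines; then
        # duplicated line: print only its first occurrence
        if [[ -z ${seen[$line]} ]]; then
            printf '%s\n' "$line"
            seen[$line]=1
        fi
    else
        printf '%s\n' "$line"           # unique line: always print
    fi
done < file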
Using only uniq and grep:
Create d.sh:
#!/bin/sh
sort "$1" | uniq > "$1_uniq"
while IFS= read -r line; do
# print each line the first time it is seen, then drop it from the uniq list
grep -m1 -Fx -- "$line" "$1_uniq" >> "$1_out"
grep -Fxv -- "$line" "$1_uniq" > "$1_uniq2"
mv "$1_uniq2" "$1_uniq"
done < "$1"
rm "$1_uniq"
Example:
./d.sh infile
You could use some horrible O(n^2) thing, like this (Pseudo-code):
file2 = EMPTY_FILE
for each line in file1:
if not line in file2:
file2.append(line)
This is potentially rather slow, especially if implemented at the Bash level. But if your files are reasonably short, it will probably work just fine, and it would be quick to implement (not line in file2 is then just a grep -q with the result negated, and so on).
Otherwise you could of course code up a dedicated program, using some more advanced data structure in memory to speed it up.
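For completeness, a rough bash rendering of that pseudo-code, using the same file1/file2 names (grep -qxF plays the role of "not line in file2"):
#!/bin/bash
: > file2                                # start with an empty file2
while IFS= read -r line; do
    # append the line only if an identical line is not already in file2
    grep -qxF -- "$line" file2 || printf '%s\n' "$line" >> file2
done < file1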
sort file | uniq | while IFS= read -r line; do
grep -n -m1 -Fx -- "$line" file >> out
done
sort -n out
first do the sort,
for each unique value grep for the first match (-m1)
and preserve the line numbers
sort the output numerically (-n) by line number.
You could then remove the line numbers with sed or awk.
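For example, a minimal last step (grep -n separates the line number from the content with a colon):
sort -n out | sed 's/^[0-9]*://'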