I am new to UNIX commands. I am trying to list the different years of the files in /etc, with a count of files for each year, based upon each file's modification date.
I am playing around with variations of:
ls -lt /etc | sort | uniq -c
I realise that this only counts each unique file. I want to list the different years.
Can anyone help guide me in the right direction? Thanks.
Try this.
Mac:
ls -lT * | awk '{ print $9 }' | sort | uniq -c
Linux:
ls -l --time-style +"%Y" * | awk '{ print $6 }' | sort | uniq -c
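If GNU find is available, a sketch that avoids parsing ls output altogether (this assumes you only care about regular files directly under /etc):

# print each file's modification year with GNU find's -printf, then count each year
find /etc -maxdepth 1 -type f -printf '%TY\n' | sort | uniq -c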
Related
I am currently trying to extract all the sender domains from maillog. I am able to do some of that with the command below, but the output is not quite what I want. What would be the best approach to retrieve a unique list of sender domains from maillog?
grep from= /var/log/maillog | awk '{print $7}' | sort | uniq -c | sort -n
output
1 from=<user#test.com>,
1 from=<apache#app1.com>,
2 from=<bounceld_5BFa-bx0p-P3tQ-67Nn#example.com>,
2 from=<bounceld_19iI-HqaS-usVU-fqe5#example.com>,
12 reject:
666 from=<>,
desired output
test.com
app1.com
example.com
See useless use of grep; if you are using Awk anyway, you don't really need grep at all.
awk '$7 ~ /from=.*#/ { split($7, a, /#/); dom = a[2]; sub(/>,?$/, "", dom); ++count[dom] }
     END { for (dom in count) print count[dom], dom }' /var/log/maillog
Collecting the counts in an associative array does away with the need to call sort and uniq, too. Obviously, if you don't care about the count, don't print count[dom] at the end.
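For instance, a minimal variant of the same idea (assuming the maillog layout shown in the sample output above) that prints only the unique domains, stripping the trailing ">," from the field:

# print each sender domain once, without counts
awk '$7 ~ /from=.*#/ { split($7, a, /#/); sub(/>,?$/, "", a[2]); if (!seen[a[2]]++) print a[2] }' /var/log/maillog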
This should give you the answer:
grep from= /var/log/maillog | awk '{print $7}' | grep -Po '(?=#).{1}\K.*(?=>)' | sort -n | uniq -c
... change the last part to "| sort | uniq" to remove the counts.
References:
https://www.baeldung.com/linux/bash-remove-first-characters (for the {1}\K usage)
Extract email addresses from log with grep or sed (for the grep -Po option)
I think I'm close on this, and saw similar questions but couldn't get it to work as I want. So, I have several log files and I would like to count the occurrences of several different service calls by date.
First I tried the command below; the cut is just to get the first field (the date) and the 11th field (the name of the service call), which is specific to my log file:
grep -E "invoking webservice" *.log* | cut -d ' ' -f1 -f11 | sort | uniq -c
But this returned something that looks like:
5 log_1.log:2017-12-05 getLegs()
10 log_1.log:2017-12-05 getArms()
7 log_2.log:2017-12-05 getLegs()
13 log_2.log:2017-12-04 getLegs()
What I really want is:
12 2017-12-05 getLegs()
10 2017-12-05 getArms()
13 2017-12-04 getLegs()
I've seen examples where they cat * first, but it looks like I hit the same problem:
cat * | grep -E "invoking webservice" *.log* | cut -d ' ' -f1 -f11 | sort | uniq -c
What am I doing wrong? As always, thanks a lot!
Your issue seems to be that grep prefixes the matched lines with the filenames. (grep has this behavior when multiple filenames are specified, to disambiguate the results.) You can pass the -h option to grep so that it does not print the filenames:
grep -h "invoking webservice" *.log | cut -d ' ' -f1 -f11 | sort | uniq -c
Note that I dropped the -E flag, because it is used to enable extended regex support, and your example doesn't need it.
Alternatively, you could use cat to dump the content of files to standard output, and pipe that to grep. That would work, because it removes the need for filename parameters for grep:
cat *.log | grep "invoking webservice" | cut -d ' ' -f1 -f11 | sort | uniq -c
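A single awk pass can also do both the matching and the field selection; a sketch, assuming the date is field 1 and the service name is field 11 as in your cut command:

# awk does not prefix filenames, so this works across several log files
awk '/invoking webservice/ { print $1, $11 }' *.log | sort | uniq -c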
I want to sort the owners in alphabetical order from a call to ls -l and cannot figure out a way to do it. I know something like ls -l | sort would sort the file names, but how do I sort the owners?
The owner is the third field, so use -k 3:
ls -l | sort -k 3
You can extend this idea to sorting based on other fields, and you can have multiple -k options. For instance, maybe you want to sort by owner, and then size in descending order:
ls -l | sort -k 3,3 -k 5rn
I am not sure if you want only the owners or the whole listing sorted by owner. In the former case superfo's solution is almost correct.
Additionally, you need to squeeze the repeated whitespace in ls's output with tr, because otherwise cut, which uses a single space as its delimiter, won't work in all directories.*
So in the end you get this:
ls -l | tr -s ' ' | cut -d ' ' -f 3 | sort | uniq
*Some directories have a two-digit value in the second field, and all other lines with a single-digit value get an additional space to preserve the column layout.
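As a side note, awk splits on runs of whitespace by itself, so a sketch that sidesteps the padding issue entirely (NR > 1 skips the "total" line that ls -l prints for a directory):

# print the owner column only, then deduplicate
ls -l | awk 'NR > 1 { print $3 }' | sort -u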
How about ...
ls -l | cut -d ' ' -f 3 | sort | uniq
Try this:
ls -l | awk '{print $3, $4, $9}' | sort
It will print the user name, the group name and the file name. (File name cannot contain spaces)
ls -l | awk '{print $3, $4, $0}' | sort
This will print the user name, group name and the full ls -l output, sorted by the user name first, then the group name, then what ls -l prints first
Updated:
Thanks for your answers. As I said, my question was just a translation of my use case.
Let me get into more details of what I want to achieve.
In my dev env, we use "ade" as our version control system. What I want to do is:
ade describetrans | awk '/myapps/{ print $2 }' | sort -fr | xargs -iF ade unbranch F
Now, every single time I run the unbranch command, a new file/dir gets checked out, so I need to run ade checkin -all after all my unbranch commands. So I needed something like:
"pre part till sort" | xargs -iF (ade unbranch + ade checkin -all)
Is there any way to run 2 commands on the output of a pipe?
Thanks
Original question asked :
I can translate the use case I have into the following:
I need to get the 1st line of a file. I do
cat file | head -1
Now I want to do this on a list of files. How do I do the following in one UNIX command?
E.g.:
find . -name "*.log" | ( xargs -iF cat F | head -1 )
Obviously the brackets in the above command do not work.
Is there a way to pipe the output of the find command and run 2 commands on it (cat and head)? I tried using ; and && but it didn't help.
I can create a script - but wanted to do this in one command.
Again - this is just a translation of the case I have.
thanks
Rohan
First of all, head accepts more than one file name, so you can simply write:
head -q -n 1 *.log
or:
find . -name '*.log' -exec head -n 1 '{}' ';'
However, if you really need to duplicate the stream, then use tee and do something along the lines of:
wget -O - http://example.com/dvd.iso | tee >(sha1sum > dvd.sha1) > dvd.iso
This example is taken from info coreutils 'tee invocation'.
UPDATE (following the ade comment): In your case tee will not work. You need to perform a task after another task finishes, and tee will trigger the tasks more or less simultaneously (modulo buffering).
Another approach will work, provided that there are no spaces in the input lines (a serious issue, I know, but I'm not sure how to overcome it right now). I'll start with a generic solution:
echo -e 'Foo\nBar\nBaz' | ( for i in `cat` ; do echo 1$i ; echo 2$i ; done )
Here, echo -e 'Foo\nBar\nBaz' creates a sample multi-line input; echo 1$i is the first command to run, and echo 2$i is the second command to run.
I'm not familiar with ade and I don't have it installed on my system, but I'm guessing that in your case something like this might work:
ade describetrans | awk '/myapps/{ print $2 }' | sort -fr | ( for i in `cat` ; do ade unbranch $i ; ade checkin -all $i ; done )
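As a follow-up to the caveat about spaces: a while read loop avoids the word-splitting entirely. A sketch, still only guessing at the exact ade invocations:

ade describetrans | awk '/myapps/{ print $2 }' | sort -fr |
while IFS= read -r f; do
    # one line at a time, spaces preserved
    ade unbranch "$f"
    ade checkin -all "$f"
done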
Useless use of cat. Simply pass the file names to head:
find . -name '*.log' -exec head -n 1 '{}' '+'
or
find . -name '*.log' -print0 | xargs -0 head -n 1
EDIT: This will print headers for each file. On Linux, this can be suppressed with -q, but on other systems you must make sure that head is called with a single argument:
find . -name '*.log' -exec head -n 1 '{}' ';'
You're making this much more complicated than it needs to be by piping input into head rather than simply giving it the filename(s):
head -n1 `find -X . -name '*.log' | xargs`
If you don't need to traverse subdirectories, you can make it even simpler:
head -n1 *.log
You can screen out the filename headers and the blank separator lines by piping through grep: | grep -Ev '^(==> .* <==)?$'
Try
ade describetrans | awk '/myapps/{ print $2 }' | sort -fr | xargs sh -c 'ade unbranch "$@"; ade checkin -all "$@"' arg0
This is assuming that ade accepts multiple files at once and ade checkin -all needs to be called for every file.
The string arg0 supplies the value of $0 in the -c string.
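To see how the pattern behaves, here is a harmless stand-in with echo in place of the ade commands:

printf '%s\n' a b c | xargs sh -c 'echo unbranch "$@"; echo checkin -all "$@"' arg0
# prints: unbranch a b c
# prints: checkin -all a b c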
I do not know ade, so I will have to guess what the 2 commands you want to run really are. But if you have GNU Parallel (http://www.gnu.org/software/parallel/) installed, one of these should work:
ade describetrans | awk '/myapps/{ print $2 }' | sort -fr |
parallel -j1 ade unbranch {} ";" ade checkin -all {}
ade describetrans | awk '/myapps/{ print $2 }' | sort -fr |
parallel -j1 ade unbranch {} ";" ade checkin -all
ade describetrans | awk '/myapps/{ print $2 }' | sort -fr |
parallel -j1 ade unbranch {} ; ade checkin -all
If ade can be run in parallel, you can remove -j1.
Watch the intro video for GNU Parallel to learn more:
http://www.youtube.com/watch?v=OpaiGYxkSuQ
In order to use the uniq command, you have to sort your file first.
But in the file I have, the order of the information is important, so how can I keep the original order of the file but still get rid of the duplicate lines?
Another awk version:
awk '!_[$0]++' infile
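For example, with some made-up input:

printf 'a\nb\na\nc\nb\n' | awk '!_[$0]++'
# a
# b
# c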
This awk keeps the first occurrence. Same algorithm as other answers use:
awk '!($0 in lines) { print $0; lines[$0]; }'
Here's one that only needs to store duplicated lines (as opposed to all lines) using awk:
sort file | uniq -d | awk '
FNR == NR { dups[$0] }
FNR != NR && (!($0 in dups) || !lines[$0]++)
' - file
There's also the "line-number, double-sort" method.
nl -n ln | sort -u -k 2 | sort -k 1n | cut -f 2-
You can run uniq -d on the sorted version of the file to find the duplicate lines, then run some script that says:
if this_line is in duplicate_lines {
if not i_have_seen[this_line] {
output this_line
i_have_seen[this_line] = true
}
} else {
output this_line
}
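A direct awk rendering of that pseudo-code (essentially the same two-pass idea as the answer above; dups.txt is just a temporary file name used here):

# first pass: remember the duplicated lines; second pass: print non-dups, and dups only on first sight
sort file | uniq -d > dups.txt
awk 'NR == FNR { dup[$0]; next }
     !($0 in dup) || !seen[$0]++' dups.txt file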
Using only uniq and grep:
Create d.sh:
#!/bin/sh
# keep the first occurrence of every line of $1, preserving the original order
sort "$1" | uniq > "${1}_uniq"
while IFS= read -r line; do
    grep -m1 -xF -- "$line" "${1}_uniq" >> "${1}_out"
    grep -vxF -- "$line" "${1}_uniq" > "${1}_uniq2"
    mv "${1}_uniq2" "${1}_uniq"
done < "$1"
rm "${1}_uniq"
Example:
./d.sh infile
You could use some horrible O(n^2) thing, like this (Pseudo-code):
file2 = EMPTY_FILE
for each line in file1:
if not line in file2:
file2.append(line)
This is potentially rather slow, especially if implemented at the Bash level. But if your files are reasonably short, it will probably work just fine, and would be quick to implement (not line in file2 is then just grep -v, and so on).
Otherwise you could of course code up a dedicated program, using some more advanced data structure in memory to speed it up.
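A minimal shell rendering of the pseudo-code above, using grep -qxF so each line is matched literally and in full:

: > file2                          # start with an empty output file
while IFS= read -r line; do
    grep -qxF -- "$line" file2 || printf '%s\n' "$line" >> file2
done < file1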
for line in $(sort file1 | uniq); do
    grep -n -m1 -- "$line" file1 >> out
done
sort -n out
First do the sort.
For each unique value, grep for the first match (-m1) and preserve the line numbers.
Sort the output numerically (-n) by line number.
You could then remove the line numbers with sed or awk.
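For instance, one way to strip the grep -n prefix (the leading line number and colon) at the end:

sort -n out | cut -d: -f2-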