We are having a discussion at work about the best UNIX command-line tool for viewing log files. One side says use less, the other says use more. Is one better than the other?
A common problem is that logs have too many processes writing to them. I prefer to filter my log files and control the output using:
tail -f /var/log/<some logfile> | grep <some identifier> | more
This combination of commands allows you to watch an active log file without getting overwhelmed by the output.
I opt for less. One reason is that (with the aid of lessopen) it can read gzipped logs (as archived by logrotate).
For example, with this single command I can read the dpkg logs in time order, without having to treat the gzipped ones differently:
less $(ls -rt /var/log/dpkg.log*)
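For context, the lessopen hook is typically provided by the lesspipe helper script; on Debian-style systems the stock ~/.bashrc enables it with a line like the following (the exact path is an assumption and varies by distro):

# enable less to read .gz and other compressed files via lesspipe
[ -x /usr/bin/lesspipe ] && eval "$(SHELL=/bin/sh lesspipe)"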
Multitail is the best option, because you can view multiple logs at the same time. It also colors stuff, and you can set up regex to highlight entries you're looking for.
You can use any program: less, nano, vi, tail, cat, etc.; they differ in functionality.
There are also many log viewers: gnome-system-log, kiwi, etc. (they can sort logs by date, type, and so on).
Less is more. Although, since when I'm looking at my logs I'm typically searching for something specific or just interested in the last few events, I find myself using cat, pipes, and grep or tail rather than more or less.
less is the best, imo. It is lightweight compared to an editor, it allows forward and backward navigation, it has powerful search capabilities, and many more things. Hit 'h' for help. It's well worth the time getting familiar with it.
On my Mac, using the standard terminal windows, there's one difference between less and more, namely, after exiting:
less leaves less mess on my screen
more leaves more useful information on my screen
Consequently, if I think I might want to do something with the material I'm viewing after the viewer finishes (for example, copy'n'paste operations), I use more; if I don't want to use the material after I've finished, then I use less.
The primary advantage of less is the ability to scroll backwards; therefore, I tend to use less rather than more, but both have uses for me. YMMV (YMWV; W = Will in this case!).
As your question was generically about 'Unix systems', keep in mind that in some cases you have no choice: on old systems only more is available, not less.
less is part of the GNU tools; more comes from the UCB (BSD) days.
Turn on grep's line-buffering mode when it sits in the middle of a pipeline, so matches appear immediately instead of being held back in a block buffer (see the examples below).
Using tail (Live monitoring)
tail -f fileName
Using less (Live monitoring)
less +F fileName
Using tail & grep
tail -f fileName | grep --line-buffered my_pattern
Using tail, grep & less (piping less into grep would make less act like cat, so filter before less instead)
tail -f fileName | grep --line-buffered my_pattern | less +F
Using watch & tail to highlight new lines
watch -d tail fileName
Note: for Linux systems.
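One caveat to the tail examples: if the file is rotated away while you watch it, tail -f keeps following the old, renamed file. GNU tail's -F flag follows the file by name and reopens it after rotation (the log path here is just an example):

tail -F /var/log/syslog | grep --line-buffered my_pattern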
I am really amazed by the functionality of grep in the shell. I used to use the substring method in Java, but now I use grep for it and it executes in a matter of seconds; it is blazingly faster than the Java code I used to write (in my experience, at least; I might be wrong).
That being said, I have not been able to figure out how this is happening, and there is also not much available on the web about it.
Can anyone help me with this?
Assuming your question regards GNU grep specifically, here's a note from the author, Mike Haertel:
GNU grep is fast because it AVOIDS LOOKING AT EVERY INPUT BYTE.
GNU grep is fast because it EXECUTES VERY FEW INSTRUCTIONS FOR EACH BYTE that it does look at.
GNU grep uses the well-known Boyer-Moore algorithm, which looks first for the final letter of the target string, and uses a lookup table to tell it how far ahead it can skip in the input whenever it finds a non-matching character.
GNU grep also unrolls the inner loop of Boyer-Moore, and sets up the Boyer-Moore delta table entries in such a way that it doesn't need to do the loop exit test at every unrolled step. The result of this is that, in the limit, GNU grep averages fewer than 3 x86 instructions executed for each input byte it actually looks at (and it skips many bytes entirely).
GNU grep uses raw Unix input system calls and avoids copying data after reading it. Moreover, GNU grep AVOIDS BREAKING THE INPUT INTO LINES. Looking for newlines would slow grep down by a factor of several times, because to find the newlines it would have to look at every byte!
So instead of using line-oriented input, GNU grep reads raw data into a large buffer, searches the buffer using Boyer-Moore, and only when it finds a match does it go and look for the bounding newlines. (Certain command-line options like -n disable this optimization.)
This answer is a subset of the information taken from here.
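To make the skip-table idea concrete, here is a minimal awk sketch of the Boyer-Moore-Horspool variant (the simplified bad-character rule only). It is purely illustrative, not how GNU grep is actually implemented, and pat and file.txt are placeholders:

# Horspool-style search: align the pattern's end against the text,
# then skip ahead based on the text character under that position.
awk -v pat="needle" '
BEGIN {
    m = length(pat)
    # bad-character table: for every pattern char except the last,
    # store the distance from its rightmost occurrence to the pattern end
    for (i = 1; i < m; i++) skip[substr(pat, i, 1)] = m - i
}
{
    n = length($0); i = m
    while (i <= n) {
        # compare backwards from the current alignment point
        j = m; k = i
        while (j >= 1 && substr($0, k, 1) == substr(pat, j, 1)) { j--; k-- }
        if (j == 0) { print; next }        # whole pattern matched on this line
        c = substr($0, i, 1)
        i += (c in skip) ? skip[c] : m     # chars not in the table skip the full pattern length
    }
}' file.txt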
To add to Steve's excellent answer.
It may not be widely known but grep is almost always faster when grepping for a longer pattern-string than a short one, because in a longer pattern, Boyer-Moore can skip forward in longer strides to achieve even better sublinear speeds:
Example:
# after running these twice to ensure apples-to-apples comparison
# (everything is in the buffer cache)
$ time grep -c 'tg=f_c' 20140910.log
28
0.168u 0.068s 0:00.26
$ time grep -c ' /cc/merchant.json tg=f_c' 20140910.log
28
0.100u 0.056s 0:00.17
The longer form is 35% faster!
How come? Boyer-Moore constructs a skip-forward table from the pattern-string, and whenever there's a mismatch, it picks the longest skip possible (scanning from last char to first) before comparing a single char in the input to the char in the skip table.
Here's a video explaining Boyer-Moore (credit to kommradHomer).
Another common misconception (for GNU grep) is that fgrep is faster than grep. The f in fgrep doesn't stand for 'fast', it stands for 'fixed' (see the man page), and since both are the same program and both use Boyer-Moore, there's no difference in speed between them when searching for fixed strings without regexp special chars. The only reason I use fgrep is when there's a regexp special char (like ., [], or *) that I don't want interpreted as such. And even then, the more portable/standard form grep -F is preferred over fgrep.
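A quick way to see the 'fixed' semantics in action:

$ printf 'a.b\naxb\n' | grep 'a.b'       # '.' is a regex metachar, so both lines match
a.b
axb
$ printf 'a.b\naxb\n' | grep -F 'a.b'    # the pattern is taken literally, so only one matches
a.b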
For my developer work I reside in the *nix shell environment pretty much all day, but I still can't seem to memorize the names and argument specifics of programs I don't use daily. I wonder how other 'casual amnesiacs' handle this. Do you maintain a big cheat sheet? Do you rehearse the emacs shortcuts when you take your weekly shower? Or is your desk covered in sticky notes?
Using bash_completion is one way of not having to remember the precise syntax of program arguments.
> svn [tab][tab]
--help checkout delete lock pdel propget revert
--version ci diff log pedit proplist rm
-h cleanup export ls pget propset status
add co help merge plist pset switch
annotate commit import mkdir praise remove unlock
blame copy info move propdel rename update
cat cp list mv propedit resolved
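If completion isn't active in your shell yet, it usually just needs to be sourced from your ~/.bashrc; the location of the stock file varies by distro, so both paths below are assumptions to check on your system:

[ -f /etc/bash_completion ] && . /etc/bash_completion
# or, on newer systems:
# . /usr/share/bash-completion/bash_completion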
If I don't use a command regularly enough to remember what I want, I tend to just use --help or the man pages when I need to.
Or, if I'm lucky, I use CTRL+R and let bash's history search find when I last used it.
Eventually you just remember them (well, the set that you use, anyway). I used to maintain a README in my home directory when I was starting out, but that disappeared many years ago.
One useful command is man -k: you pass it a word and it returns a list of all commands whose man page summary contains that word.
'apropos' is also a very useful command. It will list all commands whose man page descriptions contain the keyword.
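For example (output trimmed; the exact hits depend on which man pages are installed):

$ man -k copy | head -3
cp (1)               - copy files and directories
dd (1)               - convert and copy a file
install (1)          - copy files and set attributes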
I have a text file (more correctly, a "German-style" CSV file, i.e. semicolon-separated with decimal commas) which has a date and the value of a measurement on each line.
There are stretches of faulty values which I want to remove before further work. I'd like to store these cuts in some script so that my corrections are documented and I can replay those corrections if necessary.
The lines look like this:
28.01.2005 14:48:38;5,166
28.01.2005 14:50:38;2,916
28.01.2005 14:52:38;0,000
28.01.2005 14:54:38;0,000
(long stretch of values that should be removed; could also be something else beside 0)
01.02.2005 00:11:43;0,000
01.02.2005 00:13:43;1,333
01.02.2005 00:15:43;3,250
Now I'd like to store a list of begin and end patterns like 28.01.2005 14:52:38 + 01.02.2005 00:11:43, and the script would cut the lines matching these begin/end pairs and everything that's between them.
I'm thinking about hacking an awk script, but perhaps I'm missing an already existing tool.
Have a look at sed:
sed '/start_pat/,/end_pat/d'
will delete lines between start_pat and end_pat (inclusive).
To delete multiple such pairs, you can combine them with multiple -e options:
sed -e '/s1/,/e1/d' -e '/s2/,/e2/d' -e '/s3/,/e3/d' ...
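Applied to the data in the question, with the dots escaped so they match literally (the file name is a placeholder):

sed '/^28\.01\.2005 14:52:38/,/^01\.02\.2005 00:11:43/d' measurements.csv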
Firstly, why do you need to keep a record of what you have done? Why not keep a backup of the original file, or take a diff between the old & new files, or put it under source control?
For the actual changes I suggest using Vim.
The Vim :global command (abbreviated to :g) can be used to run :ex commands on lines that match a regex. This is in many ways more powerful than awk since the commands can then refer to ranges relative to the matching line, plus you have the full text processing power of Vim at your disposal.
For example, this will do something close to what you want (untested, so caveat emptor):
:g!/^\d\d\.\d\d\.\d\d\d\d/ -1 write >> tmp.txt | delete
This matches lines that do NOT start with a date (the ! negates the match), appends the previous line to the file tmp.txt, then deletes the current line.
You will probably end up with duplicate lines in tmp.txt, but they can be removed by running the file through uniq.
You can also use awk:
awk '/start/,/end/' file
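As written, that prints the block between the two patterns (inclusive). To do what the question asks, deleting the block and keeping everything else, invert the logic; a minimal sketch:

# skip everything from start to end (inclusive), print the rest
awk '/start/,/end/ { next } { print }' file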
I would seriously suggest learning the basics of perl (i.e. not the OO stuff). It will repay you in bucket-loads.
It is fast and simple to write a bit of perl to do this (and many other such tasks) once you have grasped the fundamentals, which if you are used to using awk, sed, grep etc are pretty simple.
You won't have to remember how to use lots of different tools and where you would previously have used multiple tools piped together to solve a problem, you can just use a single perl script (usually much faster to execute).
And, perl is installed on virtually every unix/linux distro now.
(that sed is neat though :-)
Use grep -v (print non-matching lines).
Sorry, I thought you just wanted the lines without 0,000 at the end.
I'm trying to use the Unix command paste, which is like a column-appending form of cat, and came across a puzzle I've never known how to solve in Unix.
How can you use the outputs of two different programs as the input for another program (without using temporary files)?
Right now I do this, with temporary files:
./progA > tmpA
./progB > tmpB
paste tmpA tmpB
This seems to come up relatively frequently for me, but I can't figure out how to use the output from two different programs (progA and progB) as input to another without using temporary files (tmpA and tmpB).
For commands like paste, simply using paste $(./progA) $(./progB) (in bash notation) won't do the trick, because paste expects file names (or stdin), and command substitution would hand it the programs' output text as arguments instead.
The reason I'm wary of the temporary files is that I don't want to have jobs running in parallel to cause problems by using the same file; ensuring a unique file name is sometimes difficult.
I'm currently using bash, but would be curious to see solutions for any Unix shell.
And most importantly, am I even approaching the problem in the correct way?
Cheers!
You do not need temp files under bash; try this:
paste <(./progA) <(./progB)
See "Process Substitution" in the Bash manual.
Use named pipes (FIFOs) like this:
mkfifo fA
mkfifo fB
progA > fA &
progB > fB &
paste fA fB
rm fA fB
Process substitution in Bash does a similar thing transparently, so use this only if you have a different shell.
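To address the uniqueness worry from the question, you can create the FIFOs inside a per-run directory from mktemp, so parallel jobs can't collide; a sketch:

dir=$(mktemp -d)            # a fresh directory per invocation
mkfifo "$dir/fA" "$dir/fB"
./progA > "$dir/fA" &
./progB > "$dir/fB" &
paste "$dir/fA" "$dir/fB"
rm -r "$dir"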
Holy moly, I recently found out that in some instances you can get process substitution to work if you set the following inside a bash script (should you need to):
set +o posix
http://www.linuxjournal.com/content/shell-process-redirection
From the link:
"Process substitution is not a POSIX compliant feature and so it may have to be enabled via: set +o posix"
I was stuck for many hours until I did this. Here's hoping this additional tidbit will help.
This works in all shells, but note that it simply concatenates the two outputs into one stream, one after the other; it does not put them side by side in columns the way paste normally would:
{
progA
progB
} | paste
It seems like the only way to do this is to pass the -i parameter in when you initially run less. Does anyone know of some secret hack to make something like this work?
/something to search for/i
You can also type the command -I while less is running. It toggles case sensitivity for searches.
You can also set the LESS environment variable.
I use LESS=-Ri so that I can pump colorized output from grep into it and maintain the ANSI colour sequences.
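For instance (app.log is a placeholder; the export line would go in ~/.bashrc or similar):

export LESS='-Ri'
grep --color=always ERROR app.log | less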
Another little-used feature of less that I found is starting it with +F as an argument (or hitting SHIFT+F while in less). This causes it to follow the file you've opened, in the same way that tail -f <file> will. Very handy if you're watching log files from an application and are likely to want to page back up (if it's generating hundreds of lines of logging every second, for instance).
Add-on to what #Juha said: actually, -i turns on case-insensitive search with smart casing, i.e. if your search contains an uppercase letter, then the search will be case-sensitive; otherwise it will be case-insensitive. Think of it as :set smartcase in Vim.
E.g.: with -i, a search for 'log' in 'Log,..' will match, whereas 'Log' in 'log,..' will not match.
It appears that you can summon this feature on a per search basis like so:
less prompt> /search string/-i
This option is in less's interactive help which you access via h:
less prompt> h
...
-i ........ --ignore-case
Ignore case in searches that do not contain uppercase.
-I ........ --IGNORE-CASE
Ignore case in all searches.
...
I've not extensively checked, but the help in less version 487 on macOS, as well as in versions on various Linux distros, lists this option as being available.
On MacOS you can also install a newer version of less via brew:
$ brew install less
$ less --version
less 530 (POSIX regular expressions)
Copyright (C) 1984-2017 Mark Nudelman
References
less is always case-insensitive
When using the -i flag, be sure to enter the search string entirely in lower case, because if any letter is upper case, the search becomes case-sensitive.
See also: the -I (capital i) flag of less(1) to change this behavior.