Piping program output to less does not display beginning of the output - unix

I am trying to make a bunch of files in my directory, but the files are generating ~200 lines of errors, so they fly past my terminal screen too quickly and I have to scroll up to read them.
I'd like to pipe the output that displays on the screen to a pager that will let me read the errors starting at the beginning. But when I try
make | less
less does not display the beginning of the output - it displays the end of the output that's usually piped to the screen, and then tells me the output is 1 line long. When I try typing Gg, the only line on the screen is the line of the makefile that executed, and the regular screen output disappears.
Am I using less incorrectly? I haven't really ever used it before, and I'm having similar problems with something like sh myscript.sh | less, where it doesn't immediately display the beginning of the output either.

The errors from make appear on the standard error stream (stderr in C), which is not redirected by normal pipes. If you want to have it redirected to less as well, you need either make |& less (csh, etc.) or make 2>&1 | less (sh, bash, etc.).

Error output is sent to a slightly different place which isn't caught by normal pipelines, since you often want to see errors but not have them intermixed with data you're going to process further. For things like this you use a redirection:
$ make 2>&1 | less
In bash and zsh (and csh/tcsh, which is where they borrowed it from) this can be shortened to
$ make |& less
With things like make, which are prone to producing lots of errors that I will want to inspect later, I generally capture the output to a file and then less that file:
$ make |& tee make.log
$ less make.log
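If you only care about the errors and not the normal output, the order of the redirections can be used to send just stderr down the pipe. A small sketch (standard shell behaviour, not from the answer above): stderr is duplicated onto the pipe first, then stdout is discarded:
$ make 2>&1 >/dev/null | less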

Related

The best way in Unix to add a header to multiple files in a directory?

Before anyone else checks, I am confident this is not a duplicate of the existing question of how to add a header in Unix to multiple files (the question is here: Adding header into multiple text files). This is more about optimisation of a solution I am currently using for this current issue.
I have numerous directories in which I have over 20000 files and for each file I want to add the same header.
What I have been doing is:
sed -i '1ichr\tpos\tref\talt\treffrq\tinfo\trs\tpval\teffalt\tgene' *.txt
Now, this does work exactly as I want it to, but there have been a couple of issues.
First, this seems to be an extremely slow way of doing it, and it can take a very long time to get through all 20K+ files.
Second, and more frustratingly, my connection to the server has occasionally timed out during this long process, so the command never finishes and I end up with half the files having the header and half not. If I simply started from the top again, a number of files would get the header twice, so I actually have to recreate them before I can add the header to all of them in one go.
So, what I am wondering is whether there is a better/quicker solution to this problem. The question I linked above seems like it would actually be slower (given that there seems to be more for the shell to do for each file as it goes through a loop), so it doesn't seem like it would fix this.
Don't use -i. It confuses things when you get interrupted. Instead, use
mkdir -p ../output-dir
for file in *.txt; do
sed '1ichr\tpos\tref\talt\treffrq\tinfo\trs\tpval\teffalt\tgene' "$file" > ../output-dir/"$file"
done
When you're done, you can rename the directories if you wish. This doesn't address the connection issue (ThoriumBR's suggestion of nohup is good for that), but when it happens you can recover state more easily.
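If restartability matters (the connection-timeout scenario above), a minimal sketch building on that loop: skip any file whose output already exists, so an interrupted run can simply be rerun. (If a run was killed mid-file, delete the newest file in ../output-dir before restarting.)
mkdir -p ../output-dir
for file in *.txt; do
[ -e ../output-dir/"$file" ] && continue   # already done in an earlier run
sed '1ichr\tpos\tref\talt\treffrq\tinfo\trs\tpval\teffalt\tgene' "$file" > ../output-dir/"$file"
done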
First, adding a header is slow. You have to move the entire file contents to add something at the start. Adding a trailer would be very fast.
Second, use nohup:
nohup - run a command immune to hangups, with output to a non-tty
Using nohup sed -i '1ichr\tpos\tref\talt\treffrq\tinfo\trs\tpval\teffalt\tgene' *.txt will keep the command running even if the server times you out and hangs up your session.
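A sketch of how that invocation might look in full; the trailing & puts it in the background, and the log file name is just an example:
nohup sed -i '1ichr\tpos\tref\talt\treffrq\tinfo\trs\tpval\teffalt\tgene' *.txt > add-header.log 2>&1 &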

How can I tail -f but only in whole lines?

I have a constantly updating huge log file (MainLog).
I want to create another file which is only the last n lines of the log file BUT also updating.
If I use:
tail -f MainLog > RecentLog
I get ALMOST what I want, except that RecentLog is written as data arrives in MainLog and may at any point contain only part of MainLog's last line.
How can I specify to tail that I only want it to write when a WHOLE line is available?
By default, tail outputs whole lines unless you use the -c switch to count characters. Something like
tail -n 20 -f MainLog > RecentLog
(substituting for "20" the number of lines you want the second file to start with) should work as you want.
But if it doesn't, it is possible that using grep to line-buffer your output will fix the problem. See this question.
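A sketch of that grep approach: an empty pattern matches every line, and --line-buffered makes grep flush a whole line at a time:
tail -f MainLog | grep --line-buffered '' > RecentLog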
After many attempts, the only solution for multiple files that worked (fantastically well) for me is the fdlinecombine command. It's a small binary that reads multiple file descriptors and prints data to stdout linewise.
My use case is spawning multiple long-running ssh commands in the background and following their output, without having the lines garbled or interrupted in between.

Vim execute a command and send out buffer over stdout [duplicate]

This question already has answers here: Redirect ex command to STDOUT in vim (3 answers). Closed 9 years ago.
Here's how you can automate vim in an interesting way:
vim -c '0,$d | r source.txt | 1d | w | q' dest.txt
This uses vim ex commands to erase dest.txt, read source.txt into the buffer, erase the first line (which ends up as a blank line due to the way r works), write to the file (dest.txt), and then quit.
This (as far as I can tell) skips the entire vim terminal UI from loading and is conceptually a little like having a vimscript interpreter.
Now I'd love to take this one little step further and abuse vim's capabilities some more: I want a script (part of an interactive automation shell script) to peer at the unsaved changes of a file currently being edited in another vim, which live in vim's *.swp swapfiles, apply those changes via vim's recover command, and then obtain the output.
Of course it would be perfectly serviceable to use an actual file, e.g. orig_file.txt is being edited in vim in another terminal; my script could do this at each point that the swapfile is detected to change:
cp orig_file.txt orig_file_ephemeral.txt
cp .orig_file.txt.swp .orig_file.txt_ephemeral.txt.swp
vim -c 'recover | w | q'
At this point orig_file_ephemeral.txt shall contain the content of the vim buffer from the other process in which editing is taking place, and we obtained this data without requiring any direct interaction with said process. Which is neat.
Of course, for practical purposes it would probably make more sense to do exactly that and have the primary vim participate in the process. It would mean splitting the script's functionality out into vim's configuration, which is a downside, but it would be more straightforward both conceptually and computationally, since that vim already has the buffer contents available for writing, and I believe there is an autocommand that could be used (though whether that autocommand runs before or after the swapfile is saved remains to be seen).
Either way, for the sake of completeness I'm curious to know if there exists an ex command to write stuff to the STDOUT of vim. Or if this even makes any sense.
I think it perhaps makes no sense, as STDOUT is bound to be the actual terminal: it is where vim sends its "view" of the UI, the buffer, and everything else. So, for example, if any of the vim -c 'vimscript commands' commands produce vim errors, I'll see vim's terminal output displaying those errors over STDOUT.
Therefore it may only be practical to use a file. But maybe there's some kind of craziness like !tee /dev/fd/3 I could do?
In addition, there is a wrinkle with this roundabout approach, which is that vim presents a Warning: Original file may have been changed error in bright red background text for about a second, and this is surely due to renaming the file. I can likely work around that by doing this work inside of a sub-directory while keeping the filenames identical.
That's the p command (and where the p in grep comes from):
ex -sc '%p|q' file
Would be a bit like cat file.
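Since that writes to ordinary standard output, it can be piped like any other command. For instance, reusing the ephemeral file names from the question (a sketch only), you could diff the recovered copy against the on-disk original:
ex -sc '%p|q' orig_file_ephemeral.txt | diff - orig_file.txt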

Viewing Unix Log Files

We are having a discussion at work about the best UNIX command-line tool for viewing log files. One side says use LESS, the other says use MORE. Is one better than the other?
A common problem is that logs have too many processes writing to them. I prefer to filter my log files and control the output using:
tail -f /var/log/<some logfile> | grep <some identifier> | more
This combination of commands allows you to watch an active log file without getting overwhelmed by the output.
I opt for less. One reason is that (with the aid of lessopen) it can read gzipped logs (as archived by logrotate).
For example, with this single command I can read the dpkg logs in time order without treating the gzipped ones differently:
less $(ls -rt /var/log/dpkg.log*)
Multitail is the best option, because you can view multiple logs at the same time. It also colors stuff, and you can set up regex to highlight entries you're looking for.
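A minimal sketch (the log paths are just examples); giving multitail several files opens each in its own window in the same terminal:
multitail /var/log/syslog /var/log/auth.log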
You can use any program: less, nano, vi, tail, cat, etc.; they differ in functionality.
There are also many log viewers: gnome-system-log, kiwi etc (they can sort log by date / type etc)
Less is more. That said, since when I'm looking at my logs I'm typically searching for something specific or just interested in the last few events, I find myself using cat, pipes, grep, or tail rather than more or less.
less is the best, imo. It is lightweight compared to an editor, it allows forward and backward navigation, it has powerful search capabilities, and much more. Hit 'h' for help. It's well worth the time getting familiar with it.
On my Mac, using the standard terminal windows, there's one difference between less and more, namely, after exiting:
less leaves less mess on my screen
more leaves more useful information on my screen
Consequently, if I think I might want to do something with the material I'm viewing after the viewer finishes (for example, copy'n'paste operations), I use more; if I don't want to use the material after I've finished, then I use less.
The primary advantage of less is the ability to scroll backwards; therefore, I tend to use less rather than more, but both have uses for me. YMMV (YMWV; W = Will in this case!).
As your question was generically about 'Unix systems', keep in mind that in some cases you have no choice: on old systems only MORE is available, not LESS.
LESS is part of the GNU tools; MORE comes from the UCB days.
Turn on grep's line buffering mode.
Using tail (Live monitoring)
tail -f fileName
Using less (Live monitoring)
less +F fileName
Using tail & grep
tail -f fileName | grep --line-buffered my_pattern
Using less & grep
less +F fileName | grep --line-buffered my_pattern
Using watch & tail to highlight new lines
watch -d tail fileName
Note: for Linux systems.

cat file | ... vs ... <file

Is there a case of ... or context where cat file | ... behaves differently than ... <file?
When reading from a regular file, cat is in charge of reading the data; it does so as it pleases and may constrain the way it writes it to the pipeline. Obviously, the contents themselves are preserved, but anything else can be affected, for example the block size and the timing with which the data arrives. Additionally, the pipe itself isn't always neutral: it serves as an additional buffer between the input and ....
Quick and easy way to make the block size issue apparent:
$ cat large-file | pv >/dev/null
5,44GB 0:00:14 [ 393MB/s] [ <=> ]
$ pv <large-file >/dev/null
5,44GB 0:00:03 [1,72GB/s] [=================================>] 100%
Besides what the other users have posted: when using input redirection from a file, standard input is the file itself, but when piping the output of cat into the command, standard input is a pipe carrying the contents of the file. When standard input is the file, the program can seek within it; a pipe does not allow seeking. You can see this by finding a zip file and running the following commands:
zipinfo /dev/stdin < thezipfile.zip
and
cat thezipfile.zip | zipinfo /dev/stdin
The first command will show the contents of the zipfile while the second will show an error, though it is a misleading error because zipinfo does not check the result of the seek call and errors later on.
A useless use of cat is always to be avoided. It's like driving with the handbrake on. It wastes CPU cycles for nothing, the OS constantly context switching between the cat process and the next in the pipe. If all the world's useless cats were gone and stopped being invented, reinvented, passed on from father to son, we wouldn't have global warming because we could easily live with 1.21 Gigawatts of power saved.
Thanks. I feel better now. Please join me in my crusade to stamp out useless use of cat on stackoverflow. This site is, as far as I perceive it, a major contribution to the proliferation of useless cats. I don't blame the newbies, but I do want to teach them. Workers and newbies of the world, loosen the handbrakes and save the planet!!!1!
cat will allow you to pipe multiple files in sequence. Otherwise, < redirection and cat file | behave the same.
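For example (the file names are placeholders), the concatenating form has no equivalent with a single < redirection:
cat part1.log part2.log | grep ERROR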
Pipes cause a subshell to be invoked for the command on the right. This means that variables set inside the loop are lost when the subshell exits.
cat foo | while read line
do
...
done
echo "$line"
versus
while read line
do
...
done < foo
echo "$line"
One further difference is behavior on a blocking open() of the input file.
For example, assuming input is a FIFO with no writers, one invocation will not spawn any child programs until the input file is opened, while the other will spawn two processes:
prog ... < a_fifo # 'prog' not launched until shell can open file
cat a_fifo | prog ... # 'prog' and 'cat' are running (latter may block on open)
In practice this rarely matters except in contrived circumstances. prog might periodically log or do some cleanup work while waiting for input, for example, which you might want to happen even if no input is available. (Why wouldn't prog be sophisticated enough to open its own input fifo nonblocking?)
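A quick way to observe this (a sketch; sleep just stands in for prog):
mkfifo a_fifo
sleep 100 < a_fifo &       # the shell blocks opening a_fifo, so ps shows no sleep process yet
cat a_fifo | sleep 100 &   # sleep appears in ps right away; only cat blocks on the open
# clean up afterwards with kill %1 %2 and rm a_fifo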
cat file | starts up another program (cat) that doesn't have to start in the second case. It also makes it more confusing if you want to use "here documents". But it should behave the same.
