Ring buffer log file on unix

I'm trying to come up with a unix pipeline of commands that will allow me to log only the most recent n lines of a program's output to a text file.
The text file should never be more than n lines long (it may be shorter while the file is first filling up).
It will be run on a device with limited memory/resources, so keeping the filesize small is a priority.
I've tried stuff like this (n=500):
program_spitting_out_text > output.txt
cat output.txt | tail -500 > recent_output.txt
rm output.txt
or
program_spitting_out_text | tee output.txt | tail -500 > recent_output.txt
Obviously neither works for my purposes...
Anyone have a good way to do this in a one-liner? Or will I have to write a script/utility?
Note: I don't want anything to do with dmesg and must use standard BSD unix commands. The "program_spitting_out_text" prints about 60 lines per second.
Thanks in advance!

If program_spitting_out_text runs continuously and keeps its file open, there's not a lot you can do.
Even deleting the file won't help since it will still continue to write to the now "hidden" file (data still exists but there is no directory entry for it) until it closes it, at which point it will be really removed.
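If you want to confirm that the writer still holds the deleted file open, one quick check (a sketch; it assumes lsof is available, and +L1 restricts the listing to open files whose link count has dropped below one, i.e. unlinked files) is:
lsof +L1    # open files that have already been removed from the directory
Any entry for your program here means the space won't be freed until it closes the file.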
If it closes and reopens the log file periodically (every line or every ten seconds or whatever), then you have a relatively easy option.
Simply monitor the file until it reaches a certain size, then roll the file over, something like:
while true; do
    sleep 5
    lines=$(wc -l <file.log)
    if [[ $lines -ge 5000 ]]; then
        rm -f file2.log
        mv file.log file2.log
        touch file.log
    fi
done
This script will check the file every five seconds and, if it's 5000 lines or more, will move it to a backup file. The program writing to it will continue to write to that backup file (since it has the open handle to it) until it closes it, then it will re-open the new file.
This means you will always have (roughly) between five and ten thousand lines in the log file set, and you can search them with commands that combine the two:
grep ERROR file2.log file.log
Another possibility is if you can restart the program periodically without affecting its function. By way of example, a program which looks for the existence of a file once a second and reports on that, can probably be restarted without a problem. One calculating PI to a hundred billion significant digits will probably not be restartable without impact.
If it is restartable, then you can basically do the same trick as above. When the log file reaches a certain size, kill off the current program (which you will have started as a background task from your script), do whatever magic you need to in rolling over the log files, then restart the program.
For example, consider the following (restartable) program prog.sh which just continuously outputs the current date and time:
#!/usr/bin/bash
while true; do
    date
done
Then, the following script will be responsible for starting and stopping the other script as needed, by checking the log file every five seconds to see if it has exceeded its limits:
#!/usr/bin/bash
exe=./prog.sh
log1=prog.log
maxsz=500
pid=-1
touch ${log1}
log2=${log1}-prev
while true; do
    if [[ ${pid} -eq -1 ]]; then
        lines=${maxsz}
    else
        lines=$(wc -l <${log1})
    fi

    if [[ ${lines} -ge ${maxsz} ]]; then
        if [[ $pid -ge 0 ]]; then
            kill $pid >/dev/null 2>&1
        fi
        sleep 1
        rm -f ${log2}
        mv ${log1} ${log2}
        touch ${log1}
        ${exe} >> ${log1} &
        pid=$!
    fi

    sleep 5
done
And this output (from an every-second wc -l on the two log files) shows what happens at the time of switchover, noting that it's approximate only, due to the delays involved in switching:
474 prog.log 0 prog.log-prev
496 prog.log 0 prog.log-prev
518 prog.log 0 prog.log-prev
539 prog.log 0 prog.log-prev
542 prog.log 0 prog.log-prev
21 prog.log 542 prog.log-prev
Now keep in mind that's a sample script. It's relatively intelligent but probably needs some error handling so that it doesn't leave the executable running if you shut down the monitor.
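One minimal way to add that handling (a sketch, assuming bash) is to trap the monitor's own exit and kill whatever it started, for example by adding this near the top of the script, just after pid is initialised:
# kill the background program (if any) when the monitor itself exits
cleanup() {
    [[ ${pid} -ge 0 ]] && kill ${pid} 2>/dev/null
}
trap cleanup EXIT
trap 'exit 1' INT TERM    # ensure Ctrl-C and kill also trigger the EXIT trap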
And, finally, if none of that suffices, there's nothing stopping you from writing your own filter program which takes standard input and continuously outputs that to a real ring buffer file.
Then you would simply do:
program_spitting_out_text | ringbuffer 4096 last4k.log
That program could be a true ring buffer in that it treats the 4k file as a circular character buffer but, of course, you'll need a special marker in the file to indicate the write-point, along with a program that can turn it back into a real stream.
Or, it could do much the same as the scripts above, rewriting the file so that it's always below the size desired.
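As an illustration of that second approach, here is a minimal sketch of such a filter as a plain shell script (the name ringtail.sh and its interface are made up for this example). It appends standard input to the output file and, whenever the file grows to roughly twice the requested size, rewrites it so that only the most recent lines remain:
#!/bin/sh
# ringtail.sh MAX OUTFILE -- keep roughly the last MAX lines of stdin in OUTFILE
max=${1:?"usage: ringtail.sh MAX OUTFILE"}
out=${2:?"usage: ringtail.sh MAX OUTFILE"}
: > "${out}"                    # start with an empty file
count=0
while IFS= read -r line; do
    printf '%s\n' "${line}" >> "${out}"
    count=$((count + 1))
    if [ "${count}" -ge $((max * 2)) ]; then
        # trim the file back down to the most recent MAX lines
        tail -n "${max}" "${out}" > "${out}.tmp" && mv "${out}.tmp" "${out}"
        count=${max}
    fi
done
You would use it much like the hypothetical ringbuffer above, e.g. program_spitting_out_text | sh ringtail.sh 500 recent_output.txt. Note the trade-off: the file oscillates between MAX and 2*MAX lines so rewrites stay infrequent; if the "never more than n lines" requirement is strict, trim on every iteration instead.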

Since apparently this basic feature (a circular file) does not exist on GNU/Linux, and because I needed it to track logs on my Raspberry Pi with limited storage, I just wrote the code as suggested above!
Behold: circFS
Unlike other tools quoted in this post and similar ones, the maximum size is arbitrary and only limited by the actual available storage.
It does not rotate across several files; everything is kept in a single file, which is rewritten on "release".
You can have as many log files as needed in the virtual directory.
It is a single C file (~600 lines including comments), and it builds with a single compile line after installing the FUSE development dependencies.
This first version is very basic (see the README); if you want to improve it with some of the TODOs (see the TODO), you are welcome to submit pull requests.
As a joke, this is my first "write only" fuse driver! :-)

Related

Process not listed on "ps -ef" (AIX 7.1)

I have an unusual problem involving the output from the ps -ef command on AIX 7.1.
A shell script monitors processes by parsing this output. I've noticed on two occasions that a process (a Perl program) was omitted from this list. Everything I've read on the subject says this is not possible. The program in question starts via crontab at 6am and runs until 11pm, when it self-terminates. I checked the output of ps -ef immediately after it was omitted by the monitor script, and it displays:
user 1249864 9569338 0 06:00:00 - 0:19 /usr/bin/perl -w /path/to/omittedProgram.pl
... which means it's the same process that was started at 6am. The program did not terminate, then restart.
What is causing it to be omitted from the ps -ef output?
Edit: This is the program that examines the output of ps -ef, which has been running successfully for about five years. I've only noticed this problem twice, but both have been in the last 2 months:
# set global variables
PROCESS_FILE=/tmp/processList.txt
TEMP_FILE=/tmp/greppedProcesses.tmp
BOX=`uname -n`
DATE=`date`
EMAIL_LIST="Support#email.address"
# Get list of running processes
ps -ef > $PROCESS_FILE
checkProcess() {
    PROCESS_NAME=$1
    PROCESS_ABBREVIATION=$2
    PROCESS_COUNT=$3
    UNIQUE_PROCESS_IDENTIFIER=$4
    GREPPED_LINES=$TEMP_FILE-$PROCESS_ABBREVIATION

    grep $UNIQUE_PROCESS_IDENTIFIER $PROCESS_FILE | grep -v grep > $GREPPED_LINES
    NUM=`cat $GREPPED_LINES | wc -l`

    # Incorrect number of processes running!
    if [[ $NUM -ne $PROCESS_COUNT ]]
    then
        MESSAGE=`printf "The \"$PROCESS_NAME\" process count is %1d, but it should be $PROCESS_COUNT!!!" $NUM`
        echo "Monitor - starting on $DATE\n\n$MESSAGE\n\n`cat $GREPPED_LINES`" | mail -s "Problem with $PROCESS_NAME on $BOX" $EMAIL_LIST
    fi

    # Delete the temp file
    rm $GREPPED_LINES
}
checkProcess "Full Name of Program" "Program Abbreviation" <expected number of processes running> "Unique string to identify program in ps output"
checkProcess ... (for other processes) ...
exit 0
This might be a long shot in your case, but I had the same experience with "ps -ef" in the past (I don't remember the exact OS where I saw it, but my script had to work on any Linux, AIX, Solaris and HP-UX).
The "ps -ef" output might be limited to a certain number of columns when used inside a script executed without a terminal. The user, pid, ppid and cputime columns are dynamic and sometimes break the format (when the data is larger than the reserved space).
For example, if the PID of the process gets too large, the name of the process might be cut off so that it doesn't appear within the already limited number of columns displayed by "ps -ef", and your monitor script would fail.
You could keep the file containing the "ps -ef" output and check whether this is the problem. No need to wait for the issue to happen again; just check whether there are extra-long process entries in the file (anything longer than the process you're looking for).
My workaround for this problem is to specify a large enough number of columns, like this: COLUMNS=8192 ps -ef > file.out. The variable is set just for this one command.
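One quick way to check whether truncation is what's happening (a sketch, run against the file.out captured above) is to look at the longest line in the saved output; if it is pinned at a fixed width such as 80 or 128, the lines were being cut:
# print the length of the longest line in the captured ps output
awk '{ if (length($0) > max) max = length($0) } END { print max }' file.out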
I just heard from my server support team that the AIX 7.1 TL4 SP4 patch will fix this! We're installing it on our servers now and hopefully this won't happen again.

How can I tail -f but only in whole lines?

I have a constantly updating huge log file (MainLog).
I want to create another file which is only the last n lines of the log file BUT also updating.
If I use:
tail -f MainLog > RecentLog
I get ALMOST what I want, except that RecentLog is written as MainLog data becomes available and might at any point contain only part of the last MainLog line.
How can I specify to tail that I only want it to write when a WHOLE line is available?
By default, tail outputs whole lines unless you use the -c switch to count characters. Something like
tail -n 20 -f MainLog > RecentLog
(substituting the number of lines you want written to the second file for "20") should work as you want.
But if it doesn't, it is possible that using grep to line-buffer your output will fix this condition. See this question.
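For example, a sketch of that suggestion using GNU grep (the empty pattern matches every line, so grep acts purely as a line buffer between tail and the file):
tail -n 20 -f MainLog | grep --line-buffered '' > RecentLog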
After many attempts, the only solution for multiple files that worked (fantastically well) for me is the fdlinecombine command. It's a small binary that reads multiple file descriptors and prints data to stdout linewise.
My use case is spawning multiple long-running ssh commands in the background and following their output, without having the lines garbled or interrupted in between.

Unix Shell Script: sleep command not working

I have a scenario in which I need to download files with the curl command, and I want my script to pause for some time before downloading the next one. I used the sleep command like this:
sleep 3m
but it is not working.
Any ideas?
Thanks in advance.
Make sure your text editor is writing only a \n for every new line, not \r\n. This is typical if you are writing the script on Windows.
Use Notepad++ (Windows) and go to Edit | EOL Conversion | UNIX, then save it. If you are stuck with the file on the computer, I have read here [talk.maemo.org/showthread.php?t=67836] that you can use [tr -d "\r" < oldname.sh > newname.sh] to remove the problem. If you want to see whether the file has the problem, use [od -c yourscript.sh]; a \r will appear before each \n.
Other problems I have seen it cause: cd /dir/dir fails with [cd: 1: can't cd to /dir/dir], or copying scriptfile.sh to newfilename produces a file called newfilenameX, where X is an invisible carriage-return character (ensure you can delete it before trying this); if the file is on a network share, a Windows machine can see the character. Make sure the command you test with is not on the last line of the script.
Until I figured it out (I knew I had to ask Google about something that can manifest in various ways), I thought there was an issue with the Linux version I was using (sleep not working in a script???).
Are you sure you are using sleep the right way? Based on your description, you should be invoking it as:
sleep 180
Is this the way you are doing it?
You might also want to consider the wget command, as it has an explicit --wait flag, so you might avoid having the loop in the first place (see the sketch after the loop below).
while read -r urlname
do
    curl ..........${urlname}....
    sleep 180    # 180 seconds is 3 minutes
done < file_with_several_url_to_be_fetched
Something like that?
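And for the wget route mentioned above, a sketch (it assumes the file lists one URL per line; --wait pauses between retrievals and -i reads the URLs from a file):
wget --wait=180 -i file_with_several_url_to_be_fetched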

cat file | ... vs ... <file

Is there a case of ... or context where cat file | ... behaves differently than ... <file?
When reading from a regular file, cat is in charge of reading the data; it performs the reads as it pleases and may constrain the data in the way it writes it to the pipeline. Obviously, the contents themselves are preserved, but anything else could be altered, for example the block size and data arrival timing. Additionally, the pipe itself isn't always neutral: it serves as an additional buffer between the input and the command.
Quick and easy way to make the block size issue apparent:
$ cat large-file | pv >/dev/null
5,44GB 0:00:14 [ 393MB/s] [ <=> ]
$ pv <large-file >/dev/null
5,44GB 0:00:03 [1,72GB/s] [=================================>] 100%
Besides what has been posted by other users: when using input redirection from a file, standard input is the file itself, but when piping the output of cat to the input, standard input is a stream with the contents of the file. When standard input is the file, the program can seek within it, but a pipe does not allow seeking. You can see this by finding a zip file and running the following commands:
zipinfo /dev/stdin < thezipfile.zip
and
cat thezipfile.zip | zipinfo /dev/stdin
The first command will show the contents of the zipfile while the second will show an error, though it is a misleading error because zipinfo does not check the result of the seek call and errors later on.
A useless use of cat is always to be avoided. It's like driving with the handbrake on. It wastes CPU cycles for nothing, the OS constantly context switching between the cat process and the next in the pipe. If all the world's useless cats were gone and stopped being invented, reinvented, passed on from father to son, we wouldn't have global warming because we could easily live with 1.21 Gigawatts of power saved.
Thanks. I feel better now. Please join me in my crusade to stamp out useless use of cat on stackoverflow. This site is, as far as I perceive it, a major contribution to the proliferation of useless cats. I don't blame the newbies, but I do want to teach them. Workers and newbies of the world, loosen the handbrakes and save the planet!!!1!
cat will allow you to pipe multiple files in sequentially. Otherwise, < redirection and cat file | produce the same side effects.
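For example (the file names here are just placeholders):
cat part1.log part2.log part3.log | grep ERROR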
Pipes cause a subshell to be invoked for the command on the right. This interferes with variables: anything assigned inside the loop is set in that subshell and is lost when it exits.
cat foo | while read line
do
...
done
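# in bash and plain POSIX sh, the loop above ran in a subshell, so assignments made in it are not visible here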
echo "$line"
versus
while read line
do
...
done < foo
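# this loop ran in the current shell, so assignments made inside it survive past done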
echo "$line"
One further difference is behavior on a blocking open() of the input file.
For example, assuming input is a FIFO with no writers, one invocation will not spawn any child programs until the input file is opened, while the other will spawn two processes:
prog ... < a_fifo # 'prog' not launched until shell can open file
cat a_fifo | prog ... # 'prog' and 'cat' are running (latter may block on open)
In practice this rarely matters except in contrived circumstances. prog might periodically log or do some cleanup work while waiting for input, for example, which you might want to happen even if no input is available. (Why wouldn't prog be sophisticated enough to open its own input fifo nonblocking?)
cat file | starts up another program (cat) that doesn't have to start in the second case. It also makes it more confusing if you want to use "here documents". But it should behave the same.

How to tail -f a file (or similar) for a specified interval?

I am working on adding some Nagios alerts to our system -- some of which will monitor the rate of certain events hitting the nginx/apache logs (or parse values from those logs). The way I've approached the problem so far is with a simple shell script that tail -f's the log for 25 seconds or so into a temporary file, kills the process, and then runs awk, etc. over the temp file. The goal here is to get a log "sample" over 25 seconds and then perform analysis.
This is obviously less than ideal because of the increase in disk IO due to these temp files -- what I really would like is an "enhanced" tail -f that would terminate the pipe cleanly after a certain number of seconds. I.e.:
tail -f --interval '5 seconds' | grep "/serve"
Would tail the log for 5 seconds and show me all the lines that have "/serve".
I imagine I could whip up a ruby script to do this pretty quickly, but I wanted to make sure there isn't a more unixy way to accomplish it. At a high level, is there a better way to take samples of a log from the last N seconds (and no, I'd rather not be parsing timestamps, etc.)?
Found the solution. "apt-get install timeout" :)
Edit: Actually this kills tail, doesn't cause it to exit gracefully, so we lose the entire pipe. What I want to work is:
timeout -15 5 tail -f /mnt/log/nginx/nginx-access.log | grep '/javascripts' | wc -l
To tell me how many javascript files served in last 5 seconds, etc.
A slightly different approach:
(tail -f /var/log/messages & P=$! ; sleep 5; kill -9 $P) | grep /serve
I'm thinking that, as a Nagios user myself, you do not want probe processes pausing for arbitrary amounts of time. In the worst case, that's going to make Nagios check other things less often, or "clump" the checks.
What about a script that runs quickly (instantly) and parses the last few lines of the file, returning only interesting things with a timestamp later than a given time?
GNU's tail has a --pid flag that can be used for this (tail will exit once a process with that PID no longer exists). Just start up a sleep process in the background and tell tail to exit when it does. Like so:
sleep 5 & tail --pid=$! -f /var/log/system.log
tail will exit with a 0 exit code when time is out.
