The command
ps -o time -p 21361
works; however what I need is the running time of the process including all
the children. For example, if 21361 is a bash script, which calls other scripts,
then I want the total running time, including the running time of all children.
Now the ps documentation lists the "OUTPUT MODIFIER":
S
Sum up some information, such as CPU usage, from dead child processes into their parent. This is useful for examining a system where a parent process repeatedly forks off short-lived children to do work.
Sounds just right. Unfortunately, there is no specification of the ps syntax, so
I have no clue where to place the "S". I have tried many combinations for hours, but
either I get syntax errors or "S" does nothing. The information about ps on the Internet is
very basic (and always the same); I couldn't find the "S" modifier
mentioned anywhere, and nobody ever explains the syntax of ps.
I am not sure, but it might be that ps is somewhat buggy in this respect. Try this here:
$ ps p 12104 k time
PID TTY STAT TIME COMMAND
12104 ? Ss 16:17 /usr/sbin/apache2 -k start
$ ps p 12104 k time S
PID TTY STAT TIME COMMAND
12104 ? Ss 143:16 /usr/sbin/apache2 -k start
This uses the BSD options for ps. It works on my machine; however, you get a header row and extra columns. I would cut them away using tail, tr and cut:
$ ps p 12104 k time S | tail -n 1 | tr -s '[:space:]' | cut -d ' ' -f 4
143:39
$ ps p 12104 k time | tail -n 1 | tr -s '[:space:]' | cut -d ' ' -f 4
16:17
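If the summed time is needed as a number, for example to compare two runs, the TIME value can be converted to seconds. A minimal sketch, assuming the value has the form [[DD-]HH:]MM:SS as in the outputs above; the function name time_to_seconds is made up for illustration:

```shell
#!/bin/sh
# Convert a ps TIME value of the form [[DD-]HH:]MM:SS to seconds.
# time_to_seconds is a hypothetical helper, not part of ps.
time_to_seconds() {
    printf '%s\n' "$1" | awk -F'[-:]' '{
        # Fields are, from right to left: seconds, minutes, hours, days.
        n = NF
        s = $n
        if (n >= 2) s += $(n - 1) * 60
        if (n >= 3) s += $(n - 2) * 3600
        if (n >= 4) s += $(n - 3) * 86400
        print s
    }'
}
```

For example, time_to_seconds 143:16 prints 8596.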
On MacOS X (10.7, Lion) the manual page says:
-S Change the way the process time is calculated by summing all exited children to their parent process.
So, I was able to get output using:
$ ps -S -o time,etime,pid -p 305
TIME ELAPSED PID
0:00.12 01-18:31:07 305
$
However, that output was not really any different from when the '-S' option was omitted.
I tried:
$ ps -S -o time,etime,pid -p 305
TIME ELAPSED PID
0:00.14 01-18:43:59 305
$ time dd if=/dev/zero of=/dev/null bs=1m count=100k
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 15.374440 secs (6983941055 bytes/sec)
real 0m15.379s
user 0m0.056s
sys 0m15.034s
$ ps -S -o time,etime,pid -p 305
TIME ELAPSED PID
0:00.14 01-18:44:15 305
$
As you can see, the 15 seconds of system time spent copying /dev/zero to /dev/null did not get included in the summary.
At this stage, the only way of working out what the '-S' option does, if anything, is to look at the source. You could look for sumrusage in the FreeBSD version of ps, for example.
I am trying to run more than one MPI code (e.g. 2) in a PBS queue system across multiple nodes as a single job.
E.g. for my cluster, 1 node = 12 procs.
I need to run 2 codes (abc1.out & abc2.out) as a single job, each code using 24 procs. Hence, I need 4x12 cores for this job, and I need software which can assign 2x12 to each code.
Someone suggested:
How to run several commands in one PBS job submission
which is:
(cd jobdir1; myexecutable argument1 argument2) &
(cd jobdir2; myexecutable argument1 argument2) &
wait
but it doesn't work. The codes are not distributed among all processes.
Can GNU parallel be used? I read somewhere that it can't work across multiple nodes.
If so, what's the command line for the PBS queue system?
If not, is there any software which can do this?
This is a simplified version of my final objective, which is similar but much more complicated.
Thanks for the help.
Looking at https://hpcc.umd.edu/hpcc/help/running.html#mpi it seems you need to use $PBS_NODEFILE.
Let us assume you have $PBS_NODEFILE containing the 4 reserved nodes. You then need a way to split these in 2x2. This will probably do:
run_one_set() {
cat > nodefile.$$
mpdboot -n 2 -f nodefile.$$
mpiexec -n 1 YOUR_PROGRAM
mpdallexit
rm nodefile.$$
}
export -f run_one_set
cat $PBS_NODEFILE | parallel --pipe -N2 run_one_set
(Completely untested).
Thanks for the suggestions.
By the way, I tried using GNU parallel and so far it only works for jobs within a single node. After some trial and error, I finally found the solution.
Suppose each node has 12 procs, and you need to run 2 jobs, each requiring 24 procs.
So you can request:
#PBS -l select=4:ncpus=12:mpiprocs=12:mem=32gb:ompthreads=1
Then
sort -u $PBS_NODEFILE > unique-nodelist.txt
sed -n '1,2p' unique-nodelist.txt > host.txt
sed 's/.*/& slots=12/' host.txt > host1.txt
sed -n '3,4p' unique-nodelist.txt > host.txt
sed 's/.*/& slots=12/' host.txt > host2.txt
mv host1.txt 1/
mv host2.txt 2/
(cd 1; ./run_solver.sh) &
(cd 2; ./run_solver.sh) &
wait
What the above does is:
get the nodes used and remove the repetitions,
split them into two groups of 2 nodes, one group per job,
go to directories 1 and 2 and run each job using run_solver.sh.
Inside run_solver.sh for job 1 in dir 1:
...
mpirun -n 24 --hostfile host1.txt abc
Inside run_solver.sh for job 2 in dir 2:
...
mpirun -n 24 --hostfile host2.txt def
Note the different hostfile name in each script.
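The same split can be written as a small function, so that the number of nodes per job and the slots per node are not hard-coded. This is only a sketch: split_hostfile and the host<N>.txt naming are illustrative, and the "slots=" syntax is simply the hostfile form used above.

```shell
#!/bin/sh
# split_hostfile NODEFILE NODES_PER_JOB SLOTS_PER_NODE
# Writes host1.txt, host2.txt, ... into the current directory, each listing
# NODES_PER_JOB unique nodes followed by " slots=SLOTS_PER_NODE".
# (Hypothetical helper; the names are assumptions, not part of PBS or MPI.)
split_hostfile() {
    sort -u "$1" | awk -v per="$2" -v slots="$3" '{
        job = int((NR - 1) / per) + 1          # 1 for the first group, 2 for the next, ...
        print $0 " slots=" slots > ("host" job ".txt")
    }'
}
```

With the request above it would be called as split_hostfile "$PBS_NODEFILE" 2 12, after which host1.txt and host2.txt can be moved into the job directories as before.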
I have an unusual problem involving the output from the ps -ef command on AIX 7.1.
A shell script monitors processes by parsing this output. I've noticed on two occasions a process (a Perl program) was omitted from this list. Everything I've read on the subject says this is not possible. The program in question starts via crontab at 6am and runs until 11pm, when it self terminates. I checked the output of ps -ef immediately after being omitted by the monitor script, and it displays:
user 1249864 9569338 0 06:00:00 - 0:19 /usr/bin/perl -w /path/to/omittedProgram.pl
... which means it's the same process that was started at 6am. The program did not terminate, then restart.
What is causing it to be omitted from the ps -ef output?
Edit: This is the program that examines the output of ps -ef, which has been running successfully for about five years. I've only noticed this problem twice, but both have been in the last 2 months:
# set global variables
PROCESS_FILE=/tmp/processList.txt
TEMP_FILE=/tmp/greppedProcesses.tmp
BOX=`uname -n`
DATE=`date`
EMAIL_LIST="Support@email.address"
# Get list of running processes
ps -ef > $PROCESS_FILE
checkProcess() {
PROCESS_NAME=$1
PROCESS_ABBREVIATION=$2
PROCESS_COUNT=$3
UNIQUE_PROCESS_IDENTIFIER=$4
GREPPED_LINES=$TEMP_FILE-$PROCESS_ABBREVIATION
grep $UNIQUE_PROCESS_IDENTIFIER $PROCESS_FILE | grep -v grep > $GREPPED_LINES
NUM=`cat $GREPPED_LINES | wc -l`
if [[ $NUM -ne $PROCESS_COUNT ]]
# Incorrect number of processes running!
then MESSAGE=`printf "The \"$PROCESS_NAME\" process count is %1d, but it should be $PROCESS_COUNT!!!" $NUM`
echo "Monitor - starting on $DATE\n\n$MESSAGE\n\n`cat $GREPPED_LINES`" | mail -s "Problem with $PROCESS_NAME on $BOX" $EMAIL_LIST
fi
# Delete the temp file
rm $GREPPED_LINES
}
checkProcess "Full Name of Program" "Program Abbreviation" <expected number of processes running> "Unique string to identify program in ps output"
checkProcess ... (for other processes) ...
exit 0
This might be a long shot in your case, but I had the same experience with "ps -ef" in the past (I don't remember the exact OS type where I saw it, but my script had to work on any Linux, AIX, Solaris and HP-UX).
The "ps -ef" output might be limited to a certain number of columns when used inside a script executed without a terminal. The user, pid, ppid and cputime columns are dynamic and sometimes break the format (when the data is larger than the reserved space).
For example, if the PID of the process gets too large, then the name of the process might be cut off so that it no longer appears within the already limited number of columns displayed by "ps -ef", and your monitor script would fail.
You could try keeping the file containing the "ps -ef" output and checking whether this is the problem. There is no need to wait until the issue happens; just check whether the file contains any extra-long process names (anything longer than the process you're looking for).
My workaround for this problem is to specify a large enough number of columns, like this: COLUMNS=8192 ps -ef > file.out (the variable is set just for this one command).
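The check on the saved file could look roughly like the following. It is a sketch under the assumption that truncation shows up as many lines ending at exactly the same width; line_width_report is a made-up name.

```shell
#!/bin/sh
# line_width_report FILE
# Prints the length of the longest line in FILE and how many lines have
# exactly that length. Many lines at the same maximum width suggest that
# the output was cut off at a fixed column.
line_width_report() {
    awk '{ n = length($0); count[n]++; if (n > max) max = n }
         END { print max + 0, count[max] + 0 }' "$1"
}
```

For the monitor script above it would be run as line_width_report /tmp/processList.txt, with the rm of the file postponed until after the check.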
I just heard from my server support team that the AIX 7.1 TL4 SP4 patch will fix this! We're installing it on our servers now and hopefully this won't happen again.
I would like a dead-simple way to query my gps location from a usb dongle from the unix command line.
Right now, I know I've got a functioning software and hardware system, as evidenced by the success of the cgps command in showing me my position. I'd now like to be able to make short requests for my GPS location (lat, long in decimals) from the command line. My USB serial device's path is /dev/ttyUSB0 and I'm using a Global Sat dongle that outputs generic NMEA sentences.
How might I accomplish this?
Thanks
telnet 127.0.0.1 2947
?WATCH={"enable":true}
?POLL;
gives you your answer, but you still need to separate the wheat from the chaff. It also assumes the gps is not coming in from a cold start.
A short script could be called, e.g.;
#!/bin/bash
exec 2>/dev/null
# get positions
gpstmp=/tmp/gps.data
gpspipe -w -n 40 >$gpstmp"1"&
ppid=$!
sleep 10
kill -9 $ppid
cat $gpstmp"1"|grep -om1 "[-]\?[[:digit:]]\{1,3\}\.[[:digit:]]\{9\}" >$gpstmp
size=$(stat -c%s $gpstmp)
if [ $size -gt 10 ]; then
cat $gpstmp|sed -n -e 1p >/tmp/gps.lat
cat $gpstmp|sed -n -e 2p >/tmp/gps.lon
fi
rm $gpstmp $gpstmp"1"
This will cause 40 sentences to be output and then grep lat/lon to temporary files and then clean up.
Or, from the GPS3 GitHub repository, place the alpha gps3.py in the same directory as the following Python 2.7-3.4 script and execute it.
from time import sleep
import gps3
the_connection = gps3.GPSDSocket()
the_fix = gps3.DataStream()
try:
    for new_data in the_connection:
        if new_data:
            the_fix.refresh(new_data)
        if not isinstance(the_fix.TPV['lat'], str):  # check for valid data
            speed = the_fix.TPV['speed']
            latitude = the_fix.TPV['lat']
            longitude = the_fix.TPV['lon']
            altitude = the_fix.TPV['alt']
            print('Latitude:', latitude, 'Longitude:', longitude)
            sleep(1)
except KeyboardInterrupt:
    the_connection.close()
    print("\nTerminated by user\nGood Bye.\n")
If you want it to close after one iteration also import sys and then replace sleep(1) with sys.exit()
much easier solution:
$ gpspipe -w -n 10 | grep -m 1 lon
{"class":"TPV","device":"tcp://localhost:4352","mode":2,"lat":11.1111110000,"lon":22.222222222}
You can use my script, gps.sh, which returns "x,y":
#!/bin/bash
x=$(gpspipe -w -n 10 |grep lon|tail -n1|cut -d":" -f9|cut -d"," -f1)
y=$(gpspipe -w -n 10 |grep lon|tail -n1|cut -d":" -f10|cut -d"," -f1)
echo "$x,$y"
sh gps.sh
43.xx4092000,6.xx1269167
Putting a few of the bits of different answers together with a bit more jq work, I like this version:
$ gpspipe -w -n 10 | grep -m 1 TPV | jq -r '[.lat, .lon] | @csv'
40.xxxxxx054,-79.yyyyyy367
Explanation:
(1) use grep -m 1 after invoking gpspipe, as used by @eadmaster's answer, because the grep will exit as soon as the first match is found. This gets you results faster instead of having to wait for 10 lines (or using two invocations of gpspipe).
(2) use jq to extract both fields simultaneously; the @csv formatter is more readable. Note the use of jq -r (raw output), so that the output is not put in quotes. Otherwise the output would be "40.xxxx,-79.xxxx" - which might be fine or better for some applications.
(3) Search for the TPV field by name for clarity. This is the "time, position, velocity" record, which is the one we want for extracting the current lat & lon. Just searching for "lat" or "lon" risks getting confused by the GST object that some GPSes may supply, and in that object, 'lat' and 'lon' are the standard deviation of the position error, not the position itself.
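Where jq is not installed, a rough equivalent can be had with awk alone. This is a sketch and not a JSON parser: it assumes gpsd's compact output with one object per line, as printed by gpspipe -w, and first_fix is a hypothetical name.

```shell
#!/bin/sh
# first_fix: read gpsd JSON lines on stdin and print "lat,lon" from the
# first TPV object that carries both fields, then stop reading.
# Assumes compact one-object-per-line JSON; not a general JSON parser.
first_fix() {
    awk '/"class":"TPV"/ && /"lat":/ && /"lon":/ {
        lat = $0; sub(/.*"lat":/, "", lat); sub(/[,}].*/, "", lat)
        lon = $0; sub(/.*"lon":/, "", lon); sub(/[,}].*/, "", lon)
        print lat "," lon
        exit
    }'
}
```

It would be used as gpspipe -w -n 10 | first_fix. Matching on the TPV class keeps the GST caveat from point (3) in place.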
Improving on eadmaster's answer, here is a more elegant solution:
gpspipe -w -n 10 | jq -r '.lon' | grep "[[:digit:]]" | tail -1
Explanation:
Request 10 messages from gpsd
Parse the received JSON objects using jq
We want only numeric values, so filter using grep
We want the last received value, so use tail for that
Example:
$ gpspipe -w -n 10 | jq -r '.lon' | grep "[[:digit:]]" | tail -1
28.853181286
The version of diff in my cygwin has a number of advanced options which allow me to print out one difference per line.
Given two files one.txt and two.txt.
one.txt:
one
two
three
four
five
six
two.txt:
one
two2
three
four
five5
six
And running diff in cygwin with the following options/parameters:
diff -y --suppress-common-lines one.txt two.txt
Gives an output of:
two |two2
five |five5
This is the type of format I'm after whereby one difference is printed out per line.
On my dev solaris box, the "-y" option is not supported, so I'm stuck with an output which looks like this:
2c2
< two
---
> two2
5c5
< five
---
> five5
Does anyone know of a way I can get an output of one difference per line on this solaris box? Maybe using a sed/awk one liner to massage the output from this more primitive diff output? (Please note, I am not able to install a more up-to-date diff version on this solaris box).
Thanks!
Use GNU diff.
http://www.gnu.org/software/diffutils/
You can build and install it into your local directory, no? If you have a home directory and a compiler and a make, you can build your own GNU diff.
I don't have Solaris, but I can't imagine it would be much more than this:
./configure --prefix=/home/bob
make
make install
No root privileges required.
comm -3 almost does what you want, but requires sorted input. It also will put them in separate lines by alphabetical order. Your example (once sorted) would show up as
five
five5
two
two2
If solaris diff won't do what you want, then nothing on a standard solaris box is liable to do so either, which means introducing code from elsewhere, either your own or someone else's. As GNU diff does what you want, just use that.
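If installing GNU diff really is not an option, the awk route suggested in the question can be sketched as follows. It reads the traditional diff output and pairs the "<" and ">" lines of each hunk; diff_one_per_line is a made-up name, and on Solaris the awk call would presumably have to be nawk or /usr/xpg4/bin/awk, since the script uses a user-defined function.

```shell
#!/bin/sh
# diff_one_per_line: read traditional diff output on stdin and print one
# line per difference as LEFT<TAB>MARK<TAB>RIGHT, where MARK is
# "|" (changed), "<" (only in the first file) or ">" (only in the second).
diff_one_per_line() {
    awk '
        /^[0-9]/ { emit_hunk(); next }                    # hunk header such as 2c2
        /^</     { left[++nl]  = substr($0, 3); next }    # line from the first file
        /^>/     { right[++nr] = substr($0, 3); next }    # line from the second file
        END      { emit_hunk() }
        function emit_hunk(   i, n, mark) {
            n = (nl > nr) ? nl : nr
            for (i = 1; i <= n; i++) {
                mark = (i <= nl && i <= nr) ? "|" : (i <= nl ? "<" : ">")
                print left[i] "\t" mark "\t" right[i]
            }
            split("", left); split("", right); nl = nr = 0   # reset for the next hunk
        }'
}
```

For the two example files, diff one.txt two.txt | diff_one_per_line prints the pairs two/two2 and five/five5, one per line, with a "|" column between them.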
Example output:
# ~/config_diff.sh postfix_DIST/master.cf postfix/master.cf
postfix_DIST/master.cf: -o smtpd_tls_security_level=encrypt -o smtpd_sasl_auth_enable=yes -o smtpd_client_restrictions=permit_sasl_authenticated,reject -o smtpd_tls_wrappermode=yes -o smtp_fallback_relay=
postfix/master.cf: -o cleanup_service_name=cleanup_sasl -o smtpd_tls_security_level=encrypt -o smtpd_sasl_auth_enable=yes -o smtpd_client_restrictions=permit_sasl_authenticated,reject -o cleanup_service_name=cleanup_sasl -o smtpd_tls_wrappermode=yes -o smtpd_sasl_auth_enable=yes -o smtpd_client_restrictions=permit_sasl_authenticated,reject -o smtp_fallback_relay=
postfix_DIST/master.cf:smtp inet n - - - - smtpd smtp unix - - - - - smtp
postfix/master.cf:smtp inet n - - - - smtpd smtp unix - - - - - smtp
Sadly, it cannot currently handle a configuration variable that appears several times ... it counts the occurrences and will think the files differ.
All the answers given above and below are perfect, but just typing a command and getting a result won't help you solve similar problems in the future.
Here is a link which explains how diff works. Once you go through it, you can solve the problem yourself:
https://www.youtube.com/watch?v=5_dyVrvbWjc
#! /bin/bash
FILES="$@"
COLUMN=1
for variable in $( awk -F: '{print $X}' X=${COLUMN} ${FILES} | sort -u ) ; do
NUM_CONFIGS=$( for file in ${FILES} ; do
grep "^${variable}" ${file}
done | sort -u | wc -l )
if [ "${NUM_CONFIGS}" -ne 1 ] ; then
for file in ${FILES} ; do
echo ${file}:$( grep "^${variable}" ${file} )
done
echo
fi
done
I need to extract a set number of lines from a file given the start line number and end line number.
How could I quickly do this under unix (it's actually Solaris so gnu flavour isn't available).
Thx
To print lines 6-10:
sed -n '6,10p' file
If the file is huge, and the end line number is small compared to the number of lines, you can make it more efficient by quitting after the last wanted line (the q must come after the p, otherwise line 10 is never printed):
sed -n '6,10p;10q' file
From testing a file with a fairly large number of lines:
$ wc -l test.txt
368048 test.txt
$ du -k test.txt
24640 test.txt
$ time sed -n '6,10p;10q' test.txt >/dev/null
real 0m0.005s
user 0m0.001s
sys 0m0.003s
$ time sed -n '6,10p' test.txt >/dev/null
real 0m0.123s
user 0m0.092s
sys 0m0.030s
Or
head -n "$last" file | tail -n +"$first"
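Either form can be wrapped in a small function that takes the two line numbers as arguments. A minimal sketch using the sed variant; print_lines is a hypothetical name, and the q is placed after the p so that the last line is still printed before sed quits:

```shell
#!/bin/sh
# print_lines FIRST LAST FILE
# Prints lines FIRST..LAST of FILE and stops reading after line LAST.
print_lines() {
    sed -n "${1},${2}p;${2}q" "$3"
}
```

For example, print_lines 6 10 file prints lines 6 to 10.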
I wrote a Haskell program called splitter that does exactly this: have a read through my release blog post.
You can use the program as follows:
$ cat somefile | splitter 4,6-10,50-
That will get lines four, six to ten and lines fifty onwards. And that is all that there is to it. You will need Haskell to install it. Just:
$ cabal install splitter
And you are done. I hope that you find this program useful.
you can do it with nawk as well
#!/bin/sh
start=10
end=20
nawk -vs="$start" -ve="$end" 'NR>e{exit}NR>=s' file