How to understand the following quote from the termios description (Unix)

I'm using a Mac (10.14.6); the following quote is from man termios:
When a controlling terminal becomes associated with a session, its foreground process group is set to the process group of the session leader.
To verify this, I opened a terminal window, ran a sleep command in the foreground, then opened another terminal window and ran ps:
$ ps -o pid,pgid,tpgid,sess,stat,command,tty
PID PGID TPGID SESS STAT COMMAND TTY
44606 44606 45006 0 S -bash ttys000
45006 45006 45006 0 S+ sleep 3000 ttys000
which shows the foreground process group as 45006. To get the session leader, I wrote some C code using getsid and getpgid (a sketch is included at the end of the question) and got the following info:
pid: 45006 pgid: 45006 sid: 44605
pid: 44605 pgid: 44605 sid: 44605
The session leader is 44605, which is the login process:
$ ps -o pid,pgid,tpgid,sess,stat,command,tty -p 44605
PID PGID TPGID SESS STAT COMMAND TTY
44605 44605 45006 0 Ss login -pfl mz /b ttys000
Obviously the foreground process group (45006) differs from the session leader's process group (44605). What am I missing? Thanks!
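For reference, a minimal sketch of the kind of lookup code used (not the exact program; it prints the process group and session IDs for a PID given on the command line, plus the terminal's foreground process group via tcgetpgrp):

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    /* Default to our own PID if none is given on the command line. */
    pid_t pid = (argc > 1) ? (pid_t) atoi(argv[1]) : getpid();

    printf("pid: %d pgid: %d sid: %d\n",
           (int) pid, (int) getpgid(pid), (int) getsid(pid));

    /* tcgetpgrp() reports the foreground process group of the terminal. */
    int fd = open("/dev/tty", O_RDONLY);
    if (fd >= 0) {
        printf("terminal foreground pgid: %d\n", (int) tcgetpgrp(fd));
        close(fd);
    }
    return 0;
}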

Related

OpenStack DevStack installation error: g-api did not start in Ubuntu 18.04

I'm installing OpenStack using the All-In-One Single Machine setup and run the stack.sh script for the DevStack setup. On starting the Glance service, I get the following error on my console:
++:: curl -g -k --noproxy '*' -s -o /dev/null -w '%{http_code}' http://10.10.20.10/image
+:: [[ 503 == 503 ]]
+:: sleep 1
+functions:wait_for_service:485 rval=124
+functions:wait_for_service:490 time_stop wait_for_service
+functions-common:time_stop:2310 local name
+functions-common:time_stop:2311 local end_time
+functions-common:time_stop:2312 local elapsed_time
+functions-common:time_stop:2313 local total
+functions-common:time_stop:2314 local start_time
+functions-common:time_stop:2316 name=wait_for_service
+functions-common:time_stop:2317 start_time=1602763779096
+functions-common:time_stop:2319 [[ -z 1602763779096 ]]
++functions-common:time_stop:2322 date +%s%3N
+functions-common:time_stop:2322 end_time=1602763839214
+functions-common:time_stop:2323 elapsed_time=60118
+functions-common:time_stop:2324 total=569
+functions-common:time_stop:2326 _TIME_START[$name]=
+functions-common:time_stop:2327 _TIME_TOTAL[$name]=60687
+functions:wait_for_service:491 return 124
+lib/glance:start_glance:480 die 480 'g-api did not start'
+functions-common:die:198 local exitcode=0
+functions-common:die:199 set +o xtrace
[Call Trace]
./stack.sh:1306:start_glance
/opt/stack/devstack/lib/glance:480:die
[ERROR] /opt/stack/devstack/lib/glance:480 g-api did not start
Error on exit
World dumping... see /opt/stack/logs/worlddump-2020-10-15-121040.txt for details
neutron-dhcp-agent: no process found
neutron-l3-agent: no process found
neutron-metadata-agent: no process found
neutron-openvswitch-agent: no process found
I also tried increasing the timeout duration, but it still failed, and I verified that devstack@g-api.service is in the active state. Can someone let me know the exact reason behind this issue and how to resolve it?
The only solution is to reload the entire system, including the OS.

Timeout in tests when running Pintos

I am just getting started with the Pintos projects, working from my home computer, which runs Ubuntu 14.04 x64.
I'm able to compile the project from the src/threads/ directory, and the initial test pintos run alarm-multiple seems to work okay (note that it runs QEMU by default):
zay@ubuntu:~/Documents/pintos/src/threads/build$ pintos run alarm-multiple
Prototype mismatch: sub main::SIGVTALRM () vs none at /home/zay/Documents/pintos/src/utils/pintos line 935.
Constant subroutine SIGVTALRM redefined at /home/zay/Documents/pintos/src/utils/pintos line 927.
qemu-system-x86_64 -drive cache=writeback,file=/tmp/YS3E7FICwo.dsk -m 4 -net none -serial stdio
PiLo hda1
Loading..........
Kernel command line: run alarm-multiple
Pintos booting with 4,088 kB RAM...
382 pages available in kernel pool.
382 pages available in user pool.
Calibrating timer... 286,310,400 loops/s.
Boot complete.
Executing 'alarm-multiple':
(alarm-multiple) begin
(alarm-multiple) Creating 5 threads to sleep 7 times each.
(alarm-multiple) Thread 0 sleeps 10 ticks each time,
(alarm-multiple) thread 1 sleeps 20 ticks each time, and so on.
(alarm-multiple) If successful, product of iteration count and
(alarm-multiple) sleep duration will appear in nondescending order.
(alarm-multiple) thread 0: duration=10, iteration=1, product=10
(alarm-multiple) thread 0: duration=10, iteration=2, product=20
However, when I run make check under src/threads/build, all tests fail with a timeout:
zay@ubuntu:~/Documents/pintos/src/threads/build$ make check
pintos -v -k -T 60 --qemu -- -q run alarm-multiple < /dev/null 2> tests/threads/alarm-multiple.errors > tests/threads/alarm-multiple.output
perl -I../.. ../../tests/threads/alarm-multiple.ck tests/threads/alarm-multiple tests/threads/alarm-multiple.result
FAIL tests/threads/alarm-multiple
run: TIMEOUT after 61 seconds of wall-clock time - load average: 0.20, 0.45, 0.26
pintos -v -k -T 60 --qemu -- -q run alarm-simultaneous < /dev/null 2> tests/threads/alarm-simultaneous.errors > tests/threads/alarm-simultaneous.output
perl -I../.. ../../tests/threads/alarm-simultaneous.ck tests/threads/alarm-simultaneous tests/threads/alarm-simultaneous.result
FAIL tests/threads/alarm-simultaneous
run: TIMEOUT after 61 seconds of wall-clock time - load average: 0.18, 0.40, 0.25
pintos -v -k -T 60 --qemu -- -q run alarm-priority < /dev/null 2> tests/threads/alarm-priority.errors > tests/threads/alarm-priority.output
perl -I../.. ../../tests/threads/alarm-priority.ck tests/threads/alarm-priority tests/threads/alarm-priority.result
FAIL tests/threads/alarm-priority
run: TIMEOUT after 61 seconds of wall-clock time - load average: 0.10, 0.34, 0.2
What changes should I make to solve this problem?
Apparently, QEMU no longer supports the power-off sequence on port 0x8900. Here is a fix that made it work for me (found in chaOs): in the file devices/shutdown.c, patch shutdown_power_off as follows:
void
shutdown_power_off (void)
{
  // ...
  printf ("Powering off...\n");
  serial_flush ();
  outw (0xB004, 0x2000);   // <-- Add this line
  // ...
}
If you are using QEMU for Pintos, you need to add one line of code in devices/shutdown.c.
Insert the line outw (0x604, 0x0 | 0x2000); after printf ("Powering off...\n"); and serial_flush (); as shown below:
  printf ("Powering off...\n");
  serial_flush ();

  // add the following line
  outw (0x604, 0x0 | 0x2000);

  /* This is a special power-off sequence supported by Bochs and
     QEMU, but not by physical hardware. */
  for (p = s; *p != '\0'; p++)
    outb (0x8900, *p);
Follow this guide to find out more
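Putting the two answers together, here is roughly what the patched function might look like with both writes in place. This is a sketch based on the stock Pintos devices/shutdown.c, and the port values come from the answers above (0x604 for newer QEMU machine types, 0xB004 for older QEMU/Bochs):

void
shutdown_power_off (void)
{
  const char s[] = "Shutdown";
  const char *p;

  /* ... stats printing etc. as in the original file ... */

  printf ("Powering off...\n");
  serial_flush ();

  /* Added: ACPI power-off write used by newer QEMU machine types. */
  outw (0x604, 0x2000);
  /* Added: ACPI power-off write used by older QEMU/Bochs. */
  outw (0xB004, 0x2000);

  /* Original special power-off sequence supported by Bochs and
     older QEMU, but not by physical hardware. */
  for (p = s; *p != '\0'; p++)
    outb (0x8900, *p);

  /* ... original code continues (halt loop, etc.) ... */
}

Whichever port your QEMU honors, the write should make the emulator exit, which in turn lets the make check runs finish instead of timing out.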

Using ps: cannot see my work from a new terminal window

Hi, I just started using ps. I am trying to run an R script on a cluster (after logging in using ssh) by writing:
nohup R CMD BATCH a.R > a.out &
Then when I use ps -u myusername it shows:
PID TTY TIME CMD
4200 pts/14 00:00:00 ps
29985 ? 00:00:00 sshd
29986 pts/14 00:00:00 tcsh
32146 pts/14 00:00:00 sh
32150 pts/14 01:10:10 R
Now when I log in over ssh from a new terminal window and run the same ps -u myusername, it shows:
PID TTY TIME CMD
444 ? 00:00:00 sshd
445 pts/1 00:00:00 tcsh
2460 pts/1 00:00:00 ps
It doesn't show the R job I started in the other terminal window. How can I see it? And even if I close the first terminal or log out of ssh, the R script should still be running, right? Because of the nohup. Sorry for so many basic questions; I have never used nohup and ps before, and all the details are making me feel lost.
Try
top -u myusername
If you still don't see your job, the job is not running.
I personally had the same issue, but somehow the command
nohup R --vanilla < mycode.R >& mycode.Rout&
worked.
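As background on why the job survives logout: nohup essentially ignores SIGHUP (and redirects output) before running the command, so the hangup signal sent when the terminal closes does not kill the R process. A rough illustrative sketch of that idea in C (not the actual nohup source):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
        return 125;
    }

    /* Ignore the hangup signal so closing the terminal does not kill us. */
    signal(SIGHUP, SIG_IGN);

    /* Replace this process with the requested command. */
    execvp(argv[1], &argv[1]);
    perror("execvp");
    return 127;
}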

Nginx pid keeps changing every few seconds after killing the master process

I'm using the following Upstart script to keep Nginx up and running on an Ubuntu server:
start on (filesystem and net-device-up IFACE=lo)
stop on runlevel [!2345]
env DAEMON=/usr/sbin/nginx
env CONF=/etc/nginx/nginx.conf
respawn
respawn limit 10 5
pre-start script
$DAEMON -t
if [ $? -ne 0 ]; then
exit $?
fi
end script
exec $DAEMON -c $CONF -g "daemon off;" > /dev/null 2>&1
This script works fine except when I kill the Nginx master process using the kill command. After killing the master process, /var/run/nginx.pid stays the same, but the Nginx PID keeps changing every few seconds (does this mean Nginx is restarting all the time?). Any idea how to fix this?

Submitting Open MPI jobs to SGE

I've installed Open MPI, not in /usr/... but in /commun/data/packages/openmpi/; it was compiled with --with-sge.
I've added a new PE in SGE as described in http://docs.oracle.com/cd/E19080-01/n1.grid.eng6/817-5677/6ml49n2c0/index.html
# /commun/data/packages/openmpi/bin/ompi_info | grep gridengine
MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.3)
# qconf -sq all.q | grep pe_
pe_list make orte
Without SGE, the program runs without any problem, using several processors.
/commun/data/packages/openmpi/bin/orterun -np 20 ./a.out args
Now I want to submit my program to SGE.
In the Open MPI FAQ, I read:
# Allocate a SGE interactive job with 4 slots
# from a parallel environment (PE) named 'orte'
shell$ qsh -pe orte 4
but my output is:
qsh -pe orte 4
Your job 84550 ("INTERACTIVE") has been submitted
waiting for interactive job to be scheduled ...
Could not start interactive job.
I've also tried the mpirun command embedded in a script:
$ cat ompi.sh
#!/bin/sh
/commun/data/packages/openmpi/bin/mpirun \
/path/to/a.out args
but it fails:
$ cat ompi.sh.e84552
error: executing task of job 84552 failed: execution daemon on host "node02" didn't accept task
--------------------------------------------------------------------------
A daemon (pid 18327) died unexpectedly with status 1 while attempting
to launch so we are aborting.
There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
error: executing task of job 84552 failed: execution daemon on host "node01" didn't accept task
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
How can I fix this?
The answer was on the Open MPI mailing list: http://www.open-mpi.org/community/lists/users/2013/02/21360.php
In my case, setting "job_is_first_task FALSE" and "control_slaves TRUE" solved the problem.
# qconf -mp mpi1
pe_name mpi1
slots 9
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
accounting_summary FALSE
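With the PE configured like this, a quick sanity check is to submit a trivial MPI program that reports each rank's host. mpi_hello.c below is a hypothetical test program (not the asker's a.out), and the mpicc path assumes the compiler wrapper sits alongside mpirun in /commun/data/packages/openmpi/bin/:

/* mpi_hello.c - each rank prints its number and the host it runs on.
   Build (assumed path):
   /commun/data/packages/openmpi/bin/mpicc mpi_hello.c -o mpi_hello */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int rank, size;
    char host[256];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    gethostname(host, sizeof host);

    printf("rank %d of %d on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}

Submitted through a wrapper script like ompi.sh above (e.g. with qsub -pe orte 20), the output should show ranks spread across node01 and node02 once control_slaves and job_is_first_task are set as described.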
