Supervisord - NGINX stop OSError

I ran into an error when trying to stop NGINX using supervisord.
To start NGINX without error from supervisord I had to prepend sudo to the nginx command in supervisord.conf:
[supervisord]
[program:nginx]
command=sudo nginx -c %(ENV_PWD)s/configs/nginx.conf
When I run this:
$ supervisord -n
2017-02-09 12:26:06,371 INFO RPC interface 'supervisor' initialized
2017-02-09 12:26:06,372 INFO RPC interface 'supervisor' initialized
2017-02-09 12:26:06,372 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2017-02-09 12:26:06,373 INFO supervisord started with pid 22152
2017-02-09 12:26:07,379 INFO spawned: 'nginx' with pid 22155
2017-02-09 12:26:08,384 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
^C # SIGINT: Should stop all processes
2017-02-09 13:59:08,550 WARN received SIGINT indicating exit request
2017-02-09 13:59:08,551 CRIT unknown problem killing nginx (22155):Traceback (most recent call last):
File "/Users/ocervell/.virtualenvs/ndc-v3.3/lib/python2.7/site-packages/supervisor/process.py", line 432, in kill
options.kill(pid, sig)
File "/Users/ocervell/.virtualenvs/ndc-v3.3/lib/python2.7/site-packages/supervisor/options.py", line 1239, in kill
os.kill(pid, signal)
OSError: [Errno 1] Operation not permitted
Same when using supervisorctl to stop the process:
$ supervisorctl stop nginx
FAILED: unknown problem killing nginx (22321):Traceback (most recent call last):
File "/Users/ocervell/.virtualenvs/ndc-v3.3/lib/python2.7/site-packages/supervisor/process.py", line 432, in kill
options.kill(pid, sig)
File "/Users/ocervell/.virtualenvs/ndc-v3.3/lib/python2.7/site-packages/supervisor/options.py", line 1239, in kill
os.kill(pid, signal)
OSError: [Errno 1] Operation not permitted
Is there a workaround for this?

If a process created by supervisord creates its own child processes, supervisord cannot kill them.
...
The pidproxy program is put into your configuration’s $BINDIR when supervisor is installed (it is a “console script”).[1]
So what you have to do is change your supervisord configuration like this:
[program:nginx]
command=/path/to/pidproxy /path/to/nginx-pidfile sudo nginx -c %(ENV_PWD)s/configs/nginx.conf
This may not work either, since the nginx process is created by sudo and therefore runs as root, which an unprivileged supervisord is not permitted to signal. But try it first.
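To make the pidfile idea concrete, here is a minimal Python sketch of the signal-forwarding step only. It is not supervisor's pidproxy implementation; the function names read_pid and forward_signal are made up for illustration. It also shows where the "Operation not permitted" error from the traceback would surface if the target process belongs to another user.

```python
import os
import signal


def read_pid(pidfile):
    """Return the integer PID stored in a pidfile (hypothetical helper)."""
    with open(pidfile) as fh:
        return int(fh.read().strip())


def forward_signal(pidfile, sig=signal.SIGTERM):
    """Send sig to the PID recorded in pidfile.

    Returns True if the signal was delivered and False if the kernel
    refused it with EPERM (Errno 1), which is what happens when the
    target is owned by another user, for example root via sudo.
    """
    try:
        os.kill(read_pid(pidfile), sig)
        return True
    except PermissionError:
        return False
```

If forward_signal returns False for the nginx pidfile, the forwarding approach alone will not help, and the process has to be started as the same user that runs supervisord (or supervisord itself has to run with sufficient privileges).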

Related

openstack-nova-api conflicts with the httpd service on port 8774

I can't run httpd and nova-api at the same time.
When the httpd service is running, nova-api is dead (or inactive).
# systemctl restart openstack-nova-api
OUTPUT:
Job for openstack-nova-api.service failed because the control process exited
with error code. See "systemctl status openstack-nova-api.service" and
"journalctl -xe" for details.
I checked the log and found the following error.
LOG:
ERROR nova.wsgi [-] Could not bind to 0.0.0.0:8774: error: [Errno 98] Address already in use.
CRITICAL nova [-] Unhandled error: error: [Errno 98] Address already in use.
Then I tried to find which process was using port 8774.
# netstat -tunlp | grep 8774
OUTPUT:
tcp 0 0 0.0.0.0:8774 0.0.0.0:* LISTEN 61690/httpd
When I run systemctl stop httpd, then systemctl restart nova-api, then systemctl restart httpd, I get a similar error (I used RDO to install the OpenStack Train release on CentOS 7).
The two services can't run together.
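The "[Errno 98] Address already in use" failure can be reproduced outside nova with a few lines of Python. This is only a sketch for checking whether a port is free before starting a service; port_in_use is a hypothetical helper, not part of nova or httpd.

```python
import errno
import socket


def port_in_use(port, host="0.0.0.0"):
    """Return True if binding host:port fails with EADDRINUSE.

    EADDRINUSE is reported as Errno 98 on Linux, the same error
    nova.wsgi logs when another process already listens on the port.
    Any other bind error is re-raised unchanged.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
        return False
```

Calling port_in_use(8774) while httpd is listening on 0.0.0.0:8774 would be expected to return True, which matches the netstat output above.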

Setting up a repmgr witness node on Debian

I am trying to set up repmgr version 5 on Debian with PostgreSQL 11.
The documentation seems to be oriented more towards CentOS/RHEL.
When I try to set up the witness node and start the repmgr daemon, I get an error with no indication of where to look for the cause.
This is my repmgr.conf file:
node_id=3
node_name='PG-Node-Witness'
conninfo='host=10.97.7.140 user=repmgr dbname=repmgr connect_timeout=2'
data_directory='/var/lib/postgresql/11/main'
failover='automatic'
promote_command='/usr/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'
priority=60
monitor_interval_secs=2
connection_check_type='ping'
reconnect_attempts=6
reconnect_interval=8
primary_visibility_consensus=true
standby_disconnect_on_failover=true
repmgrd_service_start_command='sudo /etc/init.d/repmgrd start' #??????
repmgrd_service_stop_command='sudo //etc/init.d/repmgrd stop'#??????
service_start_command='sudo /usr/bin/systemctl start postgresql@11-main.service'
service_stop_command='sudo /usr/bin/systemctl stop postgresql@11-main.service'
service_restart_command='sudo /usr/bin/systemctl restart postgresql@11-main.service'
service_reload_command='sudo /usr/bin/systemctl reload postgresql@11-main.service'
monitoring_history=yes
log_status_interval=60
Registration is OK:
repmgr -f /etc/repmgr.conf witness register -h 10.97.7.97
INFO: connecting to witness node "PG-Node-Witness" (ID: 3)
INFO: connecting to primary node
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
INFO: witness registration complete
NOTICE: witness node "PG-Node-Witness" (ID: 3) successfully registered
The repmgr daemon dry run is OK too:
$ repmgr -f /etc/repmgr.conf daemon start --dry-run
INFO: prerequisites for starting repmgrd met
DETAIL: following command would be executed:
sudo /usr/bin/systemctl start postg...@11-main.service
I set up /etc/default/repmgrd with:
REPMGRD_ENABLED=yes
and
REPMGRD_CONF="/etc/repmgr.conf"
But I still get an error when trying to run daemon start:
$ repmgr -f /etc/repmgr.conf daemon start
I get:
NOTICE: executing: "sudo /etc/init.d/repmgrd start"
ERROR: repmgrd does not appear to have started after 15 seconds
HINT: use "repmgr service status" to confirm that repmgrd was successfully started
It is recommended to run repmgrd as a systemd service.
According to the docs (for Debian), you may first need to configure /etc/default/repmgrd.
My configuration looks like this:
# default settings for repmgrd. This file is sourced by /bin/sh from
# /etc/init.d/repmgrd
# disable repmgrd by default so it won't get started upon installation
# valid values: yes/no
REPMGRD_ENABLED=yes
# configuration file (required)
REPMGRD_CONF="/etc/repmgr/12/repmgr.conf"
# additional options
REPMGRD_OPTS="--daemonize=false"
# user to run repmgrd as
REPMGRD_USER=postgres
# repmgrd binary
REPMGRD_BIN=/bin/repmgrd
# pid file
REPMGRD_PIDFILE=/var/run/repmgrd.pid
Secondly, I would revisit sudoers (visudo) in order to check whether the non-root user can execute sudo /etc/init.d/repmgrd start.
Further, the user who runs repmgr commands must be able to write to whatever log location your configuration specifies.
Apparently the correct command to start the repmgr daemon is:
repmgrd -f /etc/repmgr.conf

I can access the site over SSH but not via HTTP?

I have a WordPress instance on AWS Lightsail.
I can access the VPS via SSH, but the site won't load over HTTP.
Here is what my error_log is saying:
[Mon May 13 10:58:14.946209 2019] [proxy:error] [pid 2780:tid 139711779657472] (2)No such file or directory: AH02454: FCGI: attempt to connect to Unix domain socket /opt/bitnami/php/var/run/wordpress.sock (wordpress-fpm) failed
[Mon May 13 10:58:14.946221 2019] [proxy_fcgi:error] [pid 2780:tid 139711779657472] [client 78.46.85.236:11708] AH01079: failed to make connection to backend: httpd-UDS
I have checked that all services are running, i.e. apache2, MySQL and PHP.
This isn't Apache itself failing. Apache is running just fine but cannot reverse proxy to the Unix socket for wordpress-fpm; the "(2)No such file or directory" in the log means the socket file does not exist. Most likely the php-fpm service is not started, or your app is erroring. There should be a separate PHP error log in addition to the Apache error log (the one shown looks like the Apache one). Try the commands below to make sure your app is running at all.
$ sudo service php-fpm start # <- start it
$ sudo service php-fpm stop # <- stop it
$ sudo service php-fpm restart # <- restart it
$ sudo service php-fpm reload # <- reload it
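To tell a missing socket file apart from a socket nobody is listening on, a small Python check along these lines may help. It is a sketch only; unix_socket_accepts is a hypothetical helper, and the path to test would be the one from the error log (/opt/bitnami/php/var/run/wordpress.sock).

```python
import os
import socket


def unix_socket_accepts(path, timeout=1.0):
    """Return True if a process accepts connections on the Unix socket at path."""
    if not os.path.exists(path):
        # Same condition Apache reports as "(2)No such file or directory".
        return False
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except OSError:
            # The file exists but nothing is listening (stale socket),
            # or the current user lacks permission to connect.
            return False
        return True
```

A False result after starting php-fpm suggests the pool is configured to listen somewhere other than the path Apache proxies to.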

Unable to create MariaDB Galera Cluster

I have built an image based on mariadb:10.1 that basically adds a new cluster.conf, but I am facing the following error on the second node after the first node has started successfully. Can somebody help me debug this?
Error log tail
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
at gcomm/src/pc.cpp:connect():162
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: -110 (Connection timed out)
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: gcs/src/gcs.cpp:gcs_open():1380: Failed to open channel 'test_cluster' at 'gcomm://172.17.0.2,172.17.0.3,172.17.0.4': -110 (Connection timed out)
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: gcs connect failed: Connection timed out
2016-09-28 10:12:55 139799503415232 [ERROR] WSREP: wsrep::connect(gcomm://172.17.0.2,172.17.0.3,172.17.0.4) failed: 7
2016-09-28 10:12:55 139799503415232 [ERROR] Aborting
MySQL init process failed.
Debugging steps taken
NOTE: Container IP addresses were ensured to be the same as shown.
To ensure networking between containers is working, I created another container and confirmed that it could log in to the first container's mysql instance.
This is definitely not related to MYSQL_HOST
To see if the container was running out of memory, I used docker stats and saw that the failed container was using only a meagre 142MB throughout its lifecycle until it failed, which is far less than the total memory it was allowed (~4GB).
I am using Docker for Mac, but running the same setup in a CentOS VirtualBox VM gives the same result, so Docker for Mac doesn't appear to be the problem.
Config
[mysqld]
user=mysql
binlog_format=ROW
bind-address=0.0.0.0
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
innodb_flush_log_at_trx_commit=0
innodb_buffer_pool_size=122M
innodb_file_per_table=1
innodb_doublewrite=1
query_cache_size=0
query_cache_type=0
wsrep_on=ON
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_sst_method=rsync
Steps to start containers
# bootstrap node
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
activatedgeek/mariadb:devel \
--wsrep-cluster-name=test_cluster \
--wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4 \
--wsrep-new-cluster
# add node into cluster
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
activatedgeek/mariadb:devel \
--wsrep-cluster-name=test_cluster \
--wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4
# add node into cluster
docker run --rm -e MYSQL_ROOT_PASSWORD=123 \
activatedgeek/mariadb:devel \
--wsrep-cluster-name=test_cluster \
--wsrep-cluster-address=gcomm://172.17.0.2,172.17.0.3,172.17.0.4
This problem is caused by a hanging init process. The configuration and CLI arguments above are correct. The only thing that needs to be done before the init process starts is to create an empty mysql directory in the data directory (/var/lib/mysql by default). It must be created on all nodes except the bootstrap node.
mkdir -p /var/lib/mysql/mysql
For usage, see the sample MariaDB Cluster, which uses a custom MariaDB image and is a proof of concept for creating clusters.
I guess your containers should either expose the required ports:
-p 3306:3306 -p 4444:4444 -p 4567:4567 -p 4568:4568
or should be linked together (--link).
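Whether those four ports are actually reachable from a joining node can be checked with a short Python sketch like the one below. reachable_ports and GALERA_PORTS are names invented for this example, and only TCP is tested.

```python
import socket

# Ports from the -p flags above: 3306 (MySQL clients), 4444 (SST),
# 4567 (Galera replication traffic), 4568 (IST).
GALERA_PORTS = (3306, 4444, 4567, 4568)


def reachable_ports(host, ports=GALERA_PORTS, timeout=2.0):
    """Map each port to True if a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Refused, timed out, or unroutable: treat as unreachable.
            results[port] = False
    return results
```

Run from inside the second container, reachable_ports("172.17.0.2") would show which of the bootstrap node's ports accept connections. Note that 4444 and 4568 are typically only open while a state transfer is in progress, so False for those two is not necessarily a fault.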

Riak 1.3.1 will not start on a Lucid EC2 instance

I have installed Riak (via apt-get) on an EC2 instance running Lucid (amd64) with libssl.
When running riak start I get:
Attempting to restart script through sudo -H -u riak
Riak failed to start within 15 seconds,
see the output of 'riak console' for more information.
If you want to wait longer, set the environment variable
WAIT_FOR_ERLANG to the number of seconds to wait.
Running riak console:
Exec: /usr/lib/riak/erts-5.9.1/bin/erlexec -boot /usr/lib/riak/releases/1.3.1/riak
-embedded -config /etc/riak/app.config
-pa /usr/lib/riak/lib/basho-patches
-args_file /etc/riak/vm.args -- console
Root: /usr/lib/riak
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:2:2] [async-threads:64] [kernel-poll:true]
/usr/lib/riak/lib/os_mon-2.2.9/priv/bin/memsup: Erlang has closed.
Erlang has closed
{"Kernel pid terminated",application_controller,"{application_start_failure,riak_core, {shutdown,{riak_core_app,start,[normal,[]]}}}"}
Crash dump was written to: /var/log/riak/erl_crash.dump
Kernel pid terminated (application_controller) ({application_start_failure,riak_core, {shutdown,{riak_core_app,start,[normal,[]]}}})
The error logs:
2013-04-24 11:36:20.897 [error] <0.146.0> CRASH REPORT Process riak_core_handoff_listener with 1 neighbours exited with reason: bad return value: {error,eaddrinuse} in gen_server:init_it/6 line 332
2013-04-24 11:36:20.899 [error] <0.145.0> Supervisor riak_core_handoff_listener_sup had child riak_core_handoff_listener started with riak_core_handoff_listener:start_link() at undefined exit with reason bad return value: {error,eaddrinuse} in context start_error
2013-04-24 11:36:20.902 [error] <0.142.0> Supervisor riak_core_handoff_sup had child riak_core_handoff_listener_sup started with riak_core_handoff_listener_sup:start_link() at undefined exit with reason shutdown in context start_error
2013-04-24 11:36:20.903 [error] <0.130.0> Supervisor riak_core_sup had child riak_core_handoff_sup started with riak_core_handoff_sup:start_link() at undefined exit with reason shutdown in context start_error
I'm new to Riak and basically tried to run through the "Fast Track" docs.
None of the default core IP settings in the configs have been changed. They are still set to {http, [ {"127.0.0.1", 8098 } ]}, {handoff_port, 8099 }
Any help would be greatly appreciated.
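I know this is old, but there is some solid documentation about the errors in the erl_crash.dump file on the Riak site. In the error log above, riak_core_handoff_listener fails with eaddrinuse, which indicates that the handoff port (8099 in the configuration shown) is already in use by another process.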

Resources