Gearmand not starting properly, getting "multiple occurences" returned - syslog

Gearmand ran for about a day before I tried to restart it and it wouldn't come back up. Getting the following and the syslog doesn't have anything in it that refers to gearman.
~$ /etc/init.d/gearman-job-server stop && /etc/init.d/gearman-job-server start
* Stopping Gearman Server gearmand No /usr/sbin/gearmand found running; none killed.
[ OK ]
* Starting Gearman Server gearmand multiple occurrences
[fail]
* Please take a look at the syslog

Just had this problem too. I think the 'multiple occurrences' is referring to a conflicting commandline switch
In my case I was passing --listen=127.0.0.1 aswell as --listen=0.0.0.0
Removing 127.0.0.1 fixed the problem

Related

How to control vhost_shared_traffic memory K8s nginx ingress?

Background
We run a kubernetes cluster that handles several php/lumen microservices. We started seeing the app php-fpm/nginx reporting 499 status code in it's logs, and it seems to correspond with the client getting a blank response (curl returns curl: (52) Empty reply from server) while the applications log 499.
10.10.x.x - - [09/Mar/2020:18:26:46 +0000] "POST /some/path/ HTTP/1.1" 499 0 "-" "curl/7.65.3"
My understanding is nginx will return the 499 code when the client socket is no longer open/available to return the content to. In this situation that appears to mean something before the nginx/application layer is terminating this connection. Our configuration currently is:
ELB -> k8s nginx ingress -> application
So my thoughts are either ELB or ingress since the application is the one who has no socket left to return to. So i started hitting ingress logs...
Potential core problem?
While looking the the ingress logs i'm seeing quite a few of these:
2020/03/06 17:40:01 [crit] 11006#11006: ngx_slab_alloc() failed: no memory in vhost_traffic_status_zone "vhost_traffic_status"
Potential Solution
I imagine if i gave vhost_traffic_status_zone some more memory at least that error would go away and on to finding the next error.. but I can't seem to find any configmap value or annotation that would allow me to control this. I've checked the docs:
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/
Thanks in advance for any insight / suggestions / documentation I might be missing!
here is the standard way to look up how to modify the nginx.conf in the ingress controller. After that, I'll link in some info on suggestions on how much memory you should give the zone.
First start by getting the ingress controller version by checking the image version on the deploy
kubectl -n <namespace> get deployment <deployment-name> | grep 'image:'
From there, you can retrieve the code for your version from the following URL. In the following, I will be using version 0.10.2.
https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.10.2
The nginx.conf template can be found at rootfs/etc/nginx/template/nginx.tmpl in the code or /etc/nginx/template/nginx.tmpl on a pod. This can be grepped for the line of interest. I the example case, we find the following line in the nginx.tmpl
vhost_traffic_status_zone shared:vhost_traffic_status:{{ $cfg.VtsStatusZoneSize }};
This gives us the config variable to look up in the code. Our next grep for VtsStatusZoneSize leads us to the lines in internal/ingress/controller/config/config.go
// Description: Sets parameters for a shared memory zone that will keep states for various keys. The cache is shared between all worker processe
// https://github.com/vozlt/nginx-module-vts#vhost_traffic_status_zone
// Default value is 10m
VtsStatusZoneSize string `json:"vts-status-zone-size,omitempty"
This gives us the key "vts-status-zone-size" to be added to the configmap "ingress-nginx-ingress-controller". The current value can be found in the rendered nginx.conf template on a pod at /etc/nginx/nginx.conf.
When it comes to what size you may want to set the zone, there are the docs here that suggest setting it to 2*usedSize:
If the message("ngx_slab_alloc() failed: no memory in vhost_traffic_status_zone") printed in error_log, increase to more than (usedSize * 2).
https://github.com/vozlt/nginx-module-vts#vhost_traffic_status_zone
"usedSize" can be found by hitting the stats page for nginx or through the JSON endpoint. Here is the request to get the JSON version of the stats and if you have jq the path to the value: curl http://localhost:18080/nginx_status/format/json 2> /dev/null | jq .sharedZones.usedSize
Hope this helps.

How do I get back to the running instance of riak-shell?

I was in riak-shell when ssh lost its connection to the server. After reconnecting, I do the following:
sudo riak-shell
and get:
An instance of riak-shell is already running
So, I restarted the riak node in question. This did not seem to solve the problem. I do not see anything using ps -aux to kill. According to the docs, only one instance can run at a time. That makes sense, but when I run riak-shell from another node and try to connect to any node, I now get the following:
Error: invalid function call : connection_EXT:connect ["riak#<<<ip_address_elided>>>"]
You can connect to a specific node (whether in your riak_shell.config
or not) by typing 'connect "dev1#127.0.0.1";' substituting your
node name for dev1.
You may need to change the Erlang cookie to do this.
See also the 'reconnect' command.
Unhandled message received is {#Ref<0.0.0.135>,disconnected}
riak-shell(3)>
I have not changed the cookies during this process, and the cookie appears to be the same (at least in /etc/riak/riak_shell.config). (I am running the Riak TS AMI on AWS.)
riak-shell runs in its own Erlang VM - entirely separate from the riak node
(You don't need to run riak-shell from the machine your node is on - it uses the normal riak-erlang-client to talk to riak)
If you you are on a Linux do ps aux | grep riak_shell_app it will give you the process number you need to kill that instance:
08:30:45:~ $ ps aux | grep riak_shell_app
vagrant 4671 0.0 0.3 493260 34884 pts/4 Sl+ Aug17 0:03 /home/vagrant/riak_ee/dev/dev1/erts-5.10.3/bin/beam.smp -- -root /home/vagrant/riak_ee/dev/dev1 -progname erl -- -home /home/vagrant -- -boot /home/vagrant/riak_ee/dev/dev1/releases/2.1.1/start_clean -run riak_shell_app boot debug_off /home/vagrant/riak_ee/dev/dev1/bin/../log/riak_shell/riak_shell -noshell -config /home/vagrant/riak_ee/dev/dev1/bin/../etc/riak
I wrote a good chunk of it so let me know how you got on:
https://github.com/basho/riak_shell/graphs/contributors

Ensure nginx master process stays running

I am currently trying to setup a docker container using ubuntu:14.04 as my base image, with nginx and gunicorn/django/celery running inside. I am using supervisor to start all of the processes, and have tested to make sure gunicorn is relaunched when it goes down. However, I can't figure out how to do it with nginx.
My supervisord.conf for nginx is as follows:
[program:nginx]
command=nginx
autorestart=false
I have autorestart set to false because, from what I can tell, the nginx command simply starts the master process and worker processes, and then exits with status code 0. If I have autorestart set to true, it simply keeps trying to restart that nginx command, which will fail for subsequent retries because the master/worker processes are already running and bound to the port.
On the surface, this seems okay, because if I try and kill a worker process, the master will start another worker to take it's place. But how do I ensure the master process stays running as well?
You need to append daemon off; to your nginx.conf configuration instructing nginx to run in the foreground.
Then modify your supervisor stanza to be:
[program:nginx]
command=nginx
autorestart=true
It will still spawn master/worker processes/subprocesses and can be used this way in production setups just fine. In this case it's supervisor that runs the process in the background and controls and supervises it.
See this FAQ entry

Cannot get MariaDB to start

I've been away and one of my databases became a massive size that MariaDB had crashed and I cannot get it to restart. I have tried moving the path /var/lib/ and restart to no avail. I am trying to find messages in /var/log/mysql/ and /var/log but there are no error messages to give me a clue.
Can anyone offer some solutions?
MariaDB: 10.0.12
Debian 7.6.
Thanks.
Check for error messages in the syslog. Mariadb 10.0.* send tehm there if you do not change the /etc/mariadb.conf.d/50-mysqld_safe.cnf and comment the 2 lines: skip_error_log and syslog

exec function blocks other requests during the execution

I can't get http response when a exec or system function is excuting.
A specific request launch the exec function sudo find / -name toto, this take a while
During the execution, the client send any other requests
I don't get responses
Did you have a similar problem ? I tried shell_exec, exec, system...
I tried also this case :
A request launch a infinite loop
A second request launch an other script who return 'toto'
It work fine. What's wrong with exec ?
Thank you for your ideas

Resources