OpenStack Neutron LBaaS: 503 returned by HAProxy

Recently I have been working on the LBaaS service. When I set up a pool and it starts serving traffic, the haproxy process randomly returns 503:
503 Service Unavailable
No server is available to handle this request
I am pretty sure the member servers are up when this problem happens.
Can anyone help me with this problem?
PS: when I first add members to the load balancer, their status is active; however, it turns to inactive after a few minutes. I found a way to resolve this by executing
echo -e 'HTTP/1.0 200 OK\r\n\r\n<serverX>' | nc -l -p <port>
then the status of the members turns back to active.

Ahaaaa...
Finally I got it; I was really careless.
I should wrap it in a while true loop when I execute
echo -e 'HTTP/1.0 200 OK\r\n\r\n' | nc -l -p <port>
otherwise the command only runs once (nc exits after serving a single connection).
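A minimal sketch of that loop (assuming the pool member is meant to answer the health monitor on port 80 - substitute the real member port, and fill in <serverX> as before):

while true; do
    # serve one health-check connection, then loop so the next probe also gets answered
    echo -e 'HTTP/1.0 200 OK\r\n\r\n<serverX>' | nc -l -p 80
done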

Related

Linux Alpine: restart nginx

I have Alpine Linux 3.15.0 on the server.
The installed nginx version is 1.21.6. I have performed apk update.
The nginx -t command responds successfully with
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
When I type nginx -s reload, the server responds with
2023/02/03 10:58:00 [notice] 54#54: signal process started
but nothing actually happens. It's like the process started and that's all.
What am I missing?
According to the nginx documentation, the nginx -s reload command actually sends a signal to the nginx master process, and
once the master process receives the signal to reload configuration,
it checks the syntax validity of the new configuration file and tries
to apply the configuration provided in it. If this is a success, the
master process starts new worker processes and sends messages to old
worker processes, requesting them to shut down.
Thus, we can consider that nginx has been restarted (if we disregard the fact that the master process itself kept running).
That said, if you want to fully restart nginx, you can stop it with the nginx -s quit command and then start it again. Or, better, use your system's service manager. If I'm not mistaken, Alpine uses OpenRC, so the command would be rc-service nginx restart.
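A quick way to confirm that the reload actually took effect (a sketch, assuming the busybox ps that ships with Alpine): the master PID should stay the same while the worker PIDs change.

ps | grep '[n]ginx'     # note the master and worker PIDs
nginx -s reload
ps | grep '[n]ginx'     # same master PID, new worker PIDs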

autossh tunnel getting killed after 10 minutes

I have an autossh tunnel set up over which I am sending something that needs an uninterrupted connection for a couple dozen minutes. However, I noticed that every 10 minutes the SSH tunnel managed by autossh is killed and recreated.
This is not due to an inactive connection, as there is active communication happening through that channel.
The command used to set up the tunnel was:
autossh -C -f -M 9910 -N -L 6969:127.0.0.1:12345 remoteuser@example.com
In my case the problem was a clash of the monitoring ports on the remote server. There are multiple servers all autossh-ing to a single central server, and two of those "clients" used the same monitoring port (-M).
The default interval at which autossh tries to communicate over the monitoring channel is 600 seconds, i.e. 10 minutes. When autossh starts up, it does not verify that it was able to open the remote monitoring port. Everything will look fine until autossh tries to check that the connection is open - and the check fails. At that point the SSH tunnel is forcibly killed and recreated.
A good way to check whether this is your case as well is to shorten the poll interval using the AUTOSSH_POLL environment variable:
AUTOSSH_POLL=10 autossh -C -f -M 9910 -N -L 6969:127.0.0.1:12345 remoteuser@example.com
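If the clash keeps recurring, another option (my suggestion, not part of the original answer) is to disable autossh's monitoring port entirely with -M 0 and let OpenSSH's own keepalives detect a dead tunnel:

autossh -M 0 -C -f -N \
    -o "ServerAliveInterval 30" -o "ServerAliveCountMax 3" \
    -L 6969:127.0.0.1:12345 remoteuser@example.com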

How do I get back to the running instance of riak-shell?

I was in riak-shell when ssh lost its connection to the server. After reconnecting, I do the following:
sudo riak-shell
and get:
An instance of riak-shell is already running
So I restarted the riak node in question. This did not seem to solve the problem. I do not see anything to kill in the output of ps -aux. According to the docs, only one instance can run at a time. That makes sense, but when I run riak-shell from another node and try to connect to any node, I now get the following:
Error: invalid function call : connection_EXT:connect ["riak@<<<ip_address_elided>>>"]
You can connect to a specific node (whether in your riak_shell.config
or not) by typing 'connect "dev1@127.0.0.1";' substituting your
node name for dev1.
You may need to change the Erlang cookie to do this.
See also the 'reconnect' command.
Unhandled message received is {#Ref<0.0.0.135>,disconnected}
riak-shell(3)>
I have not changed the cookies during this process, and the cookie appears to be the same (at least in /etc/riak/riak_shell.config). (I am running the Riak TS AMI on AWS.)
riak-shell runs in its own Erlang VM - entirely separate from the riak node
(You don't need to run riak-shell from the machine your node is on - it uses the normal riak-erlang-client to talk to riak)
If you are on Linux, run ps aux | grep riak_shell_app; it will give you the process number you need to kill that instance:
08:30:45:~ $ ps aux | grep riak_shell_app
vagrant 4671 0.0 0.3 493260 34884 pts/4 Sl+ Aug17 0:03 /home/vagrant/riak_ee/dev/dev1/erts-5.10.3/bin/beam.smp -- -root /home/vagrant/riak_ee/dev/dev1 -progname erl -- -home /home/vagrant -- -boot /home/vagrant/riak_ee/dev/dev1/releases/2.1.1/start_clean -run riak_shell_app boot debug_off /home/vagrant/riak_ee/dev/dev1/bin/../log/riak_shell/riak_shell -noshell -config /home/vagrant/riak_ee/dev/dev1/bin/../etc/riak
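Then kill that PID (4671 in the sample output above - yours will differ); a plain TERM usually suffices:

kill 4671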
I wrote a good chunk of it so let me know how you got on:
https://github.com/basho/riak_shell/graphs/contributors

Play Framework always times out around the 16k-th request from ApacheBench if the keep-alive flag is not set

An activator template project was created by
activator new rest-benchmark simple-rest-scala
cd rest-benchmark
activator clean stage
target/universal/stage/bin/rest-benchmark -Dplay.crypto.secret=1234
The template can be found here.
Then I run ApacheBench to get a rough idea of Play's throughput:
ab -n 20000 -c 5 http://127.0.0.1:9000/books
which always gives a similar result - a timeout at around the 16,300th request:
apr_socket_recv: Operation timed out (60)
Total of 16345 requests completed
However, if I run ab with the -k (keep-alive) flag:
ab -k -n 20000 -c 5 http://127.0.0.1:9000/books
The benchmark was able to complete.
I have several questions:
Why does it always time out when Keep-Alive is absent? Shouldn't Play be able to handle requests regardless of this header? Or is my OS keeping the connections open, so that no further requests can be processed? (See the check sketched below, after the edits.)
Why around the 16,300th request? Is it related to ulimit?
If a missing Keep-Alive header causes connection timeouts, what can I do in production?
Edit: Switched to Play 2.5.4.
Edit 2: Changed the app launch command as marcospereira suggested; observed the same result.
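One way to test the "OS keeps the connections open" and ulimit hypotheses (a sketch; assumes a Linux or macOS client with netstat available) is to count sockets stuck in TIME_WAIT while ab runs and compare that with your limits:

netstat -an | grep -c TIME_WAIT          # closed connections still pinned by the OS
ulimit -n                                # per-process open file descriptor limit
sysctl net.ipv4.ip_local_port_range      # ephemeral port range (Linux only)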

Keepalived health check can't connect to 127.0.0.1

I've currently got a cluster of servers running CentOS 7 and Docker, and I want to use Keepalived to allocate a floating IP between them. I've configured Keepalived to run a check command on each node which just does curl --silent --fail localhost:80 to ensure an HTTP server is listening.
The web app is running in a Docker container bound to port 80 with --net=host on Docker 1.10.3. Firewalld is also completely disabled.
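For reference, a check like this is usually wired into keepalived.conf roughly as follows (a sketch only - the script path, timings and instance name are assumptions, not the actual config from this setup):

vrrp_script chk_http {
    script "/usr/bin/curl --silent --fail http://127.0.0.1:80/"
    interval 2    # run the check every 2 seconds
    fall 2        # 2 consecutive failures mark the node as faulty
    rise 2        # 2 consecutive successes mark it healthy again
}

vrrp_instance VI_1 {
    ...
    track_script {
        chk_http
    }
}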
The problem I'm having is that the curl never succeeds. If I change the check command to echo '' or anything else that exits 0 (without any network interaction), it works fine, but for some reason the curl doesn't. When I run it from a normal bash terminal it succeeds, and echo $? prints 0.
I'm not even sure how to debug this, as Keepalived doesn't provide any documentation on the matter and doesn't seem to log anything related to errors coming from the vrrp script.
Any help or suggestions would be greatly appreciated.
Turns out I was using an ancient version of Keepalived. Compiling the latest version from source (rather than using the binary from the CentOS repos) fixed the issue.
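Before rebuilding, you can check which version you are currently running (assuming keepalived is on your PATH):

keepalived --version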
