Docker dynamic load balancing with Nginx

I'm doing an internship focused on Docker, and I have to load-balance an application which has a client, a server and a database. I use Nginx as a load balancer, and my goal is to dynamically scale the number of server containers according to their CPU usage. For instance, if the CPU usage is over 60% I want to add a new container on the fly, without restarting Nginx, to divide the CPU usage.
I have to modify the nginx.conf file to add a new container, but then I have to restart the Nginx container to apply the changes, which is very slow.
So my question is: is there a (free) way to do this dynamically?
Tell me if you want further information, and forgive my poor English.
Thanks.
EDIT: I did as @Konstantin Azizov suggested:
docker cp ./new.conf $(docker ps -f "name=dockerizedrubis_nginx" -q ):/etc/nginx/nginx.conf
docker exec $(docker ps -f "name=dockerizedrubis_nginx" -q) bash -c 'kill -HUP $(cat /run/nginx.pid)'
docker exec $(docker ps -f "name=dockerizedrubis_nginx" -q) bash -c '/etc/init.d/nginx reload'
The configuration file is copied correctly into the container running Nginx; I send the HUP signal to reconfigure the Nginx process and then reload to apply my changes. There are no errors and the on-the-fly reload works fine, but my new nodes are not taken into account by Nginx: requests are still directed only to the first node created ...
EDIT 2: I found the origin of the problem. It seems that in order to update a container's /etc/hosts after a 'docker-compose scale', the container needs to be stopped, removed and restarted. In my case, I really don't want to stop the container running Nginx.
Question: does anyone have an idea of how to update a container's /etc/hosts after a re-scale without having to restart the container (besides a dirty script)?
Thanks.

I used the nginx-proxy image from Jason Wilder for a while to load balance between containers, and it works for more than one scalable service. It monitors the Docker daemon and, whenever an event happens, it rebuilds the nginx config file, adding new container instances when you scale out and removing them when you scale in.
Since Docker 1.10 (I'm not sure this is the exact version) there is an internal DNS server embedded in the Docker daemon, so since then I have been using its round-robin feature. Now I use the official nginx image to proxy_pass requests to a domain that I define as an alias in the network options. I don't know if I was clear, due to my poor English, but I believe my GitHub example may help.
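For illustration, here is a minimal sketch of that second setup (the network alias backend, the port 8000 and the 10s resolver TTL are my assumptions, not from the original answer). Docker's embedded DNS listens on 127.0.0.11 inside user-defined networks, and putting the upstream in a variable forces nginx to re-resolve it at request time, so newly scaled containers are picked up without a reload:
# nginx.conf excerpt (sketch)
http {
    server {
        listen 80;
        location / {
            resolver 127.0.0.11 valid=10s;     # Docker's embedded DNS
            set $upstream http://backend:8000;  # "backend" is a network alias (assumed)
            proxy_pass $upstream;               # variable => re-resolved per request
        }
    }
}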

Unfortunately there is no easy (free) way to change the configuration without restarting. The only way to achieve zero-downtime scaling is a graceful restart: when you restart Nginx gracefully, it will spawn a new instance with the new configuration, wait until it boots up, and then kill the old instance with the previous configuration.
See the official guide.
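Concretely, a graceful reconfiguration is driven by signals to the master process; a minimal sketch, assuming the master PID is in /run/nginx.pid:
# SIGHUP tells the master to start workers with the new configuration and
# to shut the old workers down once their in-flight requests finish
kill -HUP $(cat /run/nginx.pid)
# equivalent shortcut:
nginx -s reload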

Related

How can I increase the gitlab CE lfs file size limitation so as not to get 500 server errors?

I'm using the excellent sameersbn/gitlab to set up a custom gitlab server for my job.
So I have a ridiculous scenario: I'm using git lfs to store files in the 10-20 GB range, with gitlab ce v8.12.5, but I'm seeing 500 server errors all over the place and my uploads cannot finish.
Question: Does anyone know how I can increase the server side limitations?
Note: This is not a 413 nginx issue; I've set client_max_body_size 500G, so it should be forwarding to gitlab just fine.
If any more info is required (i.e. log files, etc.) I will gladly provide it, just leave a comment.
Update.1:
There seems to be a related gitlab issue on this same problem.
Update.2
Other resources which are relevant:
For now my hypothesis is that there is a timeout somewhere in the chain of proxy servers in the docker container.
git bash: error: RPC failed; result = 18, HTTP code = 200 | 1 KiB/s
https://github.com/gitlabhq/gitlabhq/issues/694
So here's something I just noticed: the docker-mapped device /dev/dm-7 becomes 100% full around the same time that gitlab errors out with a 500.
Now I'm starting to believe that this is not a gitlab problem but a docker problem, and that gitlab is just running out of space.
Thanks for your time, and cheers.
Problem 1
The first major issue was the following error in ~/log/gitlab-workhorse.log
error: handleStoreLfsObject: copy body to tempfile: unexpected EOF
This error had nothing to do with gitlab itself; rather, Docker in its latest version decided to shrink the default container size from 100 GB per container to 10 GB per container. Consequently, whenever I tried to upload a file larger than 10 GB, gitlab would attempt to make a temporary file of the size of the uploaded file (in my case 30 GB) and subsequently blow up with the above error message for lack of space in the docker container.
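If you want to confirm what your daemon is currently using before changing anything, docker info reports the base size when the devicemapper storage driver is in use (a sketch; the field is absent on other storage drivers):
sudo docker info | grep "Base Device Size"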
I followed this excellent guide on how to increase the size of my container, but it basically boils down to:
sudo `which docker-compose` down
to stop the running container.
sudo vim /etc/systemd/system/docker.service
and appending
--storage-opt dm.basesize=100G
as the default size for new base images. Now since there seems to be a current issue with docker, you have to
sudo `which docker` rmi gitlab
assuming your image is called gitlab, and
sudo `which docker-compose` up
to re-pull the image and have it be created with the proper size.
If this still doesn't work try sudo systemctl restart docker.service as this seems to help when docker seems to not do what you asked it to do.
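For reference, after the edit the relevant part of the unit file looks something like this (a sketch; the exact ExecStart line varies with your Docker version and install, so append the flag to whatever daemon invocation is already there):
# /etc/systemd/system/docker.service (excerpt, hypothetical)
[Service]
ExecStart=/usr/bin/dockerd --storage-opt dm.basesize=100G
Don't forget sudo systemctl daemon-reload afterwards so systemd picks up the edited unit.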
sudo docker exec -it gitlab df -h
Should produce something like:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/docker-253:1-927611-353ffe52e1182750efb624c81a3a040d5c054286c6c5b5f709bd587afc92b38f 100G 938M 100G 1% /
I'm not 100% certain that all of these settings were necessary, but in solving Problem 2 below I ended up having to set these as well in the docker-compose.yml:
- GITLAB_WORKHORSE_TIMEOUT=60m0s
- UNICORN_TIMEOUT=3600
- GITLAB_TIMEOUT=3600
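In context, those settings sit under the gitlab service's environment block, roughly like this (a sketch; the service name and image tag are assumptions based on the question):
# docker-compose.yml excerpt (sketch)
gitlab:
  image: sameersbn/gitlab:8.12.5
  environment:
    - GITLAB_WORKHORSE_TIMEOUT=60m0s
    - UNICORN_TIMEOUT=3600
    - GITLAB_TIMEOUT=3600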
Problem 2
In addition to resizing the container from 10 GB to 100 GB, I had to add the following to the running instance.
The reason was that the file size was so large (30 GB+) and the network speed so slow (10 MB/s) that the upload took longer than the default nginx timeouts allow, failing with a 504 Gateway Timeout.
sudo docker exec -it gitlab /bin/bash
to get a shell in the gitlab container, then edit /etc/nginx/nginx.conf (vim.tiny is available inside):
http {
...
client_max_body_size 500G;
proxy_connect_timeout 3600;
proxy_send_timeout 3600;
proxy_read_timeout 3600;
send_timeout 3600;
...
}
then I restarted nginx. Sadly, service nginx restart did not work and I had to:
service nginx stop
service nginx start
Note: I have a reverse proxy running on this server that catches all http requests, so I'm not certain, but I believe all the settings I added to the container's nginx config have to be duplicated on the proxy side.
If I've left a step out, or you would like some clarification on how exactly to do a certain part of the procedure, please leave a comment and ask. This was a royal pain and I hope this solution helps someone.

Getting "connection refused" when trying to access etcd from within a Docker container

I am trying to access etcd from within a running Docker container. When I run
curl http://172.17.42.1:4001/v2/keys
I get
curl: (7) Failed to connect to 172.17.42.1 port 4001: Connection refused
I have four other hosts where this works fine, but every container on this machine has this problem. I'm really at a loss as to what's going on, and I don't know how to debug it.
My etcd environment variables are
ETCD_ADVERTISE_CLIENT_URLS=http://10.242.10.2:2379
ETCD_DISCOVERY=https://discovery.etcd.io/<token_removed>
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://10.242.10.2:2380
ETCD_LISTEN_CLIENT_URLS=http://10.242.10.2:2379,http://127.0.0.1:2379,http://0.0.0.0:4001
ETCD_LISTEN_PEER_URLS=http://10.242.10.2:2380
I can also access etcd from the host with
curl http://localhost:4001/v2/keys
So there seems to be some error when routing from the container out to the host. But I can't figure out what it is. Can anyone point me in the right direction?
I found I had to use both the --advertise-client-urls and --listen-client-urls flags, like so:
./etcd --advertise-client-urls 'http://0.0.0.0:2379,http://0.0.0.0:4001' --listen-client-urls 'http://0.0.0.0:2379,http://0.0.0.0:4001'
Then I was able to successfully do
curl -L http://hostname:2379/version
from any machine that could reach that server and it worked.
It turns out etcd was only listening on localhost:4001 on that machine, which is why I couldn't access it from within a container. This is despite me configuring one of the listen client urls to 0.0.0.0:4001.
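A quick way to confirm which addresses etcd is actually bound to (a sketch; use netstat -tlnp instead if ss isn't available):
# a localhost-only bind shows 127.0.0.1:4001 here, rather than 0.0.0.0:4001 or *:4001
sudo ss -tlnp | grep -E '2379|4001'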
It turns out that I had run sudo systemctl enable etcd2, which caused it to run before the cloud-config service ran. As such, etcd started with default configuration instead of the one that I had specified in my cloud config. Running sudo systemctl disable etcd2 fixed the issue.

Docker restart not showing the desired effect

I have a small nginx-based test application that I want to run inside a docker container. So I followed the example given here: docker installation.
So I have a folder named restartTest and it contains an index.html file with this single line in it: Docker Test 1. I mount this as my volume at runtime for the docker container. So the command I use is
docker run -dP -v /Users/Sachin/restartTest/:/usr/share/nginx/html --name engine2 nginx
And it runs fine. I use curl to verify that the volume has mounted properly and the application is running as desired. Now what I do is change the content of the index.html file (from my localhost) to Docker Test 2 and then restart the container. I execute the following command to verify that the content has indeed changed inside the docker container:
docker exec engine2 cat /usr/share/nginx/html/index.html
And as expected, the file reads Docker Test 2. However, when I use the curl command to see if the webpage also reflects the change, I still get Docker Test 1 as the response. The index.html reflects the change, yet when I run the curl command or access the app from the browser I still get the same result. I have tried the following, but to no avail:
Restart the service
Stop and start the container
Stop and start the boot2docker VM and docker daemon.
I have no clue as to why this is happening.
So I found this known bug with the VirtualBox VM that is used for running Docker on Mac.
We only hit this bug when content is shared between the host machine and the VirtualBox VM. There is an optimisation in web servers like nginx and apache (and apparently vertx): whenever we request a static file, the server uses sendfile to deliver it. The bug is that with VirtualBox (in the scenario described above) we always get the first version of the file, no matter what we try. The workaround for nginx and apache is to turn sendfile off (see the sketch after the list below). For vertx, there is a hack that we use instead:
rename the file, say login.html, to login.html.moved (anything works)
curl :/….../login.html (we won't get anything)
rename the file back from login.html.moved to login.html
hard refresh the page (Command + Shift + R)
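For nginx, the workaround mentioned above is a one-line directive; a minimal sketch (it can equally live in a server or location block of your config):
http {
    sendfile off;  # stop serving static files via sendfile, so VirtualBox shared folders return fresh content
}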
For further reading about this bug, consult the following:
Link1
Link2
Link3
Link4
I assume it is a caching problem. Did you try setting expires -1 in your index.html location configuration to disable caching for static files?
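Something along these lines (a sketch; the exact location match is an assumption):
location = /index.html {
    expires -1;  # emits an already-expired Expires header plus Cache-Control: no-cache, forcing revalidation
}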

Ensure nginx master process stays running

I am currently trying to set up a docker container using ubuntu:14.04 as my base image, with nginx and gunicorn/django/celery running inside. I am using supervisor to start all of the processes, and have tested to make sure gunicorn is relaunched when it goes down. However, I can't figure out how to do the same for nginx.
My supervisord.conf for nginx is as follows:
[program:nginx]
command=nginx
autorestart=false
I have autorestart set to false because, from what I can tell, the nginx command simply starts the master and worker processes and then exits with status code 0. If I set autorestart to true, supervisor keeps trying to restart the nginx command, which fails on subsequent retries because the master/worker processes are already running and bound to the port.
On the surface this seems okay, because if I kill a worker process, the master starts another worker to take its place. But how do I ensure the master process itself stays running?
You need to append daemon off; to your nginx.conf configuration instructing nginx to run in the foreground.
Then modify your supervisor stanza to be:
[program:nginx]
command=nginx
autorestart=true
It will still spawn master/worker processes/subprocesses and can be used this way in production setups just fine. In this case it's supervisor that runs the process in the background and controls and supervises it.
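Alternatively, you can pass the directive on the command line instead of editing nginx.conf, which is a common pattern in containerised setups:
[program:nginx]
command=nginx -g 'daemon off;'
autorestart=true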
See this FAQ entry

Nginx Tornado and Flask - What's a good start/stop script and keep-alive method

I've set up a Flask application to run on a Tornado server behind nginx. I've written a couple of bash scripts to reload the server configuration when a new version is deployed, but I am unhappy with them. Basically what I have is:
to start the server (assuming in project root)
# this starts the tornado-flask wrapper
python myapp.py --port=8000 # .. some more misc settings
# this starts nginx
nginx
to stop it
pkill -f 'myapp.py'
nginx -s stop
to restart
cd $APP_ROOT
./script/stop && ./script/start
Many times these don't work smoothly and I need to run the commands manually. Also, I'm looking for a way to verify the service is alive and start it up if it's down. Thoughts? Thanks.
Supervisor is what you are looking for.
It's what I use to manage my Tornado apps along with some other processing daemons.
It will daemonize, handle logging, pid files... Pretty much everything you need.
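For the app in the question, a minimal stanza might look like this (a sketch; the program name, paths and port are assumptions taken from the question):
[program:myapp]
command=python /path/to/project/myapp.py --port=8000
directory=/path/to/project
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/myapp.log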
