How to scale Docker containers in production

So I recently discovered this awesome tool, and it says
Docker is an open-source project to easily create lightweight,
portable, self-sufficient containers from any application. The same
container that a developer builds and tests on a laptop can run at
scale, in production, on VMs, bare metal, OpenStack clusters, public
clouds and more.
Let's say I have a Docker image which runs Nginx, and a website that connects to an external database. How do I scale the container in production?

Update: 2019-03-11
First of all thanks for those who have upvoted this answer over the years.
Please be aware that this question was asked in August 2013, when Docker was still a very new technology. Since then: Kubernetes was launched in June 2014, Docker Swarm was integrated into the Docker engine in February 2015, Amazon launched its container solution, ECS, in April 2015, and Google launched GKE in August 2015. It's fair to say the production container landscape has changed substantially.
The short answer is that you'd have to write your own logic to do this.
I would expect this kind of feature to emerge from the following projects, built on top of docker, and designed to support applications in production:
flynn
deis
coreos
Mesos
Update 1
Another related project I recently discovered:
maestro
Update 2
The latest OpenStack release contains support for managing Docker containers:
Docker Openstack
Paas zone within OpenStack
Update 3
System for managing Docker instances
Shipyard
And a presentation on how to use tools like Packer, Docker and Serf to deliver an immutable server infrastructure pattern
FutureOps with Immutable Infrastructure
Slides
Update 4
A neat article on how to wire together docker containers using serf:
Decentralizing Docker: How to use serf with Docker
Update 5
Run Docker on Mesos using the Marathon framework
Mesosphere Docker Developer Tutorial
Update 6
Run Docker on Tsuru as it supports docker-cluster and segregated scheduler deploy
http://blog.tsuru.io/2014/04/04/running-tsuru-in-production-scaling-and-segregating-docker-containers/
Update 7
Docker-based environment orchestration
maestro-ng
Update 8
decking.io
Update 9
Google Kubernetes
Update 10
Red Hat has refactored its OpenShift PaaS to integrate Docker
Project Atomic
Geard
Update 11
A Node.js library that wraps the Docker command line and manages it from a JSON file.
docker-cmd
Update 12
Amazon's new container service enables scaling in the cluster.
Update 13
Strictly speaking Flocker does not "scale" applications, but it is designed to fulfil a related function: making stateful containers (such as those running database services) portable across multiple Docker hosts:
https://clusterhq.com/
Update 14
A project to create portable templates that describe Docker applications:
http://panamax.io/
Update 15
The Docker project is now addressing orchestration natively (See announcement)
Docker machine
Docker swarm
Docker compose
Update 16
Spotify Helios
See also:
https://blog.docker.com/tag/helios/
Update 17
The OpenStack project now has a new "container as a service" project called Magnum:
https://wiki.openstack.org/wiki/Magnum
It shows a lot of promise and enables easy setup of Docker orchestration frameworks like Kubernetes and Docker Swarm.
Update 18
Rancher is a project that is maturing rapidly
http://rancher.com/
Nice UI and a strong focus on hybrid Docker infrastructures
Update 19
The Lattice project is an offshoot of Cloud Foundry for managing container clusters.
Update 20
Docker recently bought Tutum:
https://www.docker.com/tutum
Update 21
Package manager for applications deployed on Kubernetes.
http://helm.sh/
Update 22
Vamp is an open source and self-hosted platform for managing (micro)service oriented architectures that rely on container technology.
http://vamp.io/
Update 23
A Distributed, Highly Available, Datacenter-Aware Scheduler
https://www.nomadproject.io/
From the guys that gave us Vagrant and other powerful tools.
Update 24
Container hosting solution for AWS, open source and based on Kubernetes
https://supergiant.io/
Update 25
Apache Mesos based container hosting located in Germany
https://sloppy.io/features/#features
And Docker Inc. also provides a container hosting service called Docker Cloud
https://cloud.docker.com/
Update 26
Jelastic is a hosted PaaS service that scales containers automatically.

Deis automates scaling of Docker containers (among other things).
Deis (pronounced DAY-iss) is an open source PaaS that makes it easy to deploy and manage applications on your own servers. Deis builds upon Docker and CoreOS to provide a lightweight PaaS with a Heroku-inspired workflow.
Here is the developer workflow:
deis create myapp # create a new deis app called "myapp"
git push deis master # built with a buildpack or dockerfile
deis scale web=16 worker=4 # scale up docker containers
Deis automatically deploys your Docker containers across a CoreOS cluster and configures the Nginx routers to route requests to healthy Docker containers. If a host dies, containers are automatically restarted on another host in seconds. Just browse to the proxy URL or use deis open to hit your app.
Some other useful commands:
deis config:set DATABASE_URL= # attach to a database w/ an envvar
deis run make test # run ephemeral containers for one-off tasks
deis logs # get aggregated logs for troubleshooting
deis rollback v23 # rollback to a prior release
To see this in action, check out the terminal video at http://deis.io/overview/. You can also learn about Deis concepts or jump right into deploying your own private PaaS.

You can try Tsuru. Tsuru is an open-source PaaS inspired by Heroku, and it already runs some products in production at Globo.com (the internet arm of the biggest broadcast television company in Brazil).
It manages the entire flow of an application, from container creation through deploy and routing (with Hipache), with many nice features such as Docker clusters, scaling of units, segregated deploys, etc.
Take a look at our documentation below:
http://docs.tsuru.io/
Here our post covering our environment:
http://blog.tsuru.io/2014/04/04/running-tsuru-in-production-scaling-and-segregating-docker-containers/

Have a look at Rancher.com - it can manage multiple Docker hosts and much more.

A sensible approach to scaling Docker could be:
Each service will be a Docker container
Inter-container service discovery managed through links (new feature in Docker 0.6.5)
Containers will be deployed through Dokku
Applications will be managed through Shipyard, which in turn uses Hipache
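The links-based discovery step above can be sketched as follows; the `webapp` image and the `db` alias are illustrative, and the commands assume a running Docker daemon:

```shell
# Start a database container and give it a name
docker run -d --name db postgres

# Start the web container with a link to "db"; Docker injects DB_PORT_*
# environment variables and an /etc/hosts entry that the application
# can use to discover the database, without hardcoding an address
docker run -d --name web --link db:db webapp
```

Note that links only work between containers on the same host, which is one reason the cluster-level tools listed in the accepted answer emerged.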
Another open-source Docker project, from Yandex:
cocaine

The OpenShift folks have also created a project. You can find more information here, try a test container, and find detailed info here.
The only problem is that the solution is Red Hat-centric for now :)

While we're big fans of Deis (deis.io) and are actively deploying to it, there are other Heroku-like PaaS-style deployment solutions out there, including:
Longshoreman from the Wayfinder folks:
https://github.com/longshoreman/longshoreman
Decker from the CloudCredo folks, using CloudFoundry:
http://www.cloudcredo.com/decker-docker-cloud-foundry/
As for straight-up orchestration, New Relic's open-source Centurion project seems quite promising:
https://github.com/newrelic/centurion

Take a look also at etcd and Consul.

Panamax: Docker Management for Humans. panamax.io
Fig: Fast, isolated development environments using Docker. fig.sh

One option not mentioned in other posts is Helios. It is built by Spotify and does not try to do too much.
https://github.com/spotify/helios

Related

How to run DockerOperator when Airflow is already a docker container?

I currently have an Airflow instance running with docker-compose. In the future, I will be moving to a Kubernetes cluster. So, Airflow will always be running in a Docker container.
That being said, how do I run a DockerOperator when Airflow itself is inside a docker container?
It's a docker-in-docker inception problem that I don't fully understand how to mitigate.
Thanks!
You could map the Docker socket inside the container (/var/run/docker.sock), or configure the Docker engine URL to connect to an external Docker engine (Docker's architecture is such that the client runs locally while the engine actually running your containers can be elsewhere).
Just look for the term Docker-in-Docker.
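A minimal sketch of the socket-mapping option; the image tag and command are illustrative:

```shell
# Bind-mount the host's Docker socket so the containerized Airflow can
# drive the host's Docker engine instead of needing its own daemon
docker run -d \
  -v /var/run/docker.sock:/var/run/docker.sock \
  apache/airflow:2.1.2 webserver
```

With the socket mounted, DockerOperator's default docker_url (unix://var/run/docker.sock) points at the host engine, so the containers it launches run as siblings of the Airflow container rather than nested inside it.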
There is currently (as of 2.0.0 of the Docker provider) a bug preventing that, but it is being addressed in https://github.com/apache/airflow/pull/16932 and should be released within a week; alternatively, you can downgrade to the previous version of the provider.

Airflow in ECS cluster

I have previously used Airflow running on an Ubuntu VM. I was able to log into the VM via WinSCP and PuTTY to run commands and edit Airflow-related files as required.
But this is the first time I have come across Airflow running on an AWS ECS cluster. I am new to ECS, so I am trying to see what the best possible way is to:
run commands like "airflow dbinit", stop/start the web server and scheduler, etc.
install new Python packages like "pip install "
view and edit Airflow files in the ECS cluster
I was reading about the AWS CLI and the ECS CLI; could they be helpful? Or is there any other way that lets me do the above-mentioned actions?
Thanks in Advance.
Kind Regards
P.
There are many articles that describe how to run Airflow on ECS:
https://tech.gc.com/apache-airflow-on-aws-ecs/
https://towardsdatascience.com/building-an-etl-pipeline-with-airflow-and-ecs-4f68b7aa3b5b
https://aws.amazon.com/blogs/containers/running-airflow-on-aws-fargate/
and many more
[Note: Fargate allows you to run containers via ECS without needing to have EC2 instances. More here if you want additional background on what Fargate is.]
Also, the AWS CLI is a generic CLI that maps all AWS APIs (mostly 1:1). You may consider it for your deployment (but it should not be your starting point).
The ECS CLI is a more abstracted CLI that exposes higher-level constructs and workflows specific to ECS. Note that the ECS CLI has been superseded by AWS Copilot, an opinionated CLI to deploy workload patterns on ECS. It's perhaps too opinionated to be able to deploy Airflow.
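For the one-off commands in the question, one option is the AWS CLI's ECS Exec feature, assuming it has been enabled on the task (it requires extra IAM permissions and task-definition settings); the cluster, task ID, and container name below are illustrative:

```shell
# Run a one-off command inside a running Airflow container on ECS
aws ecs execute-command \
  --cluster airflow-cluster \
  --task 0123456789abcdef0 \
  --container webserver \
  --interactive \
  --command "airflow db init"
```

This replaces the SSH-into-the-VM workflow: instead of a persistent login, you open a session into whichever task is currently running.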
My suggestion is to go through the blogs above and get inspired.

How to keep persistent volumes in sync between clusters?

I'm trying to get an installation of Wordpress running in Kubernetes, as well as have an option of running the same configuration locally in minikube. I want to use the standard Docker image of Wordpress: https://hub.docker.com/_/wordpress/.
I'm having trouble with making sure that the plugins and templates are in sync though. The Docker container exposes a Volume at /var/www/html. Wordpress installation, as well as my plugins will live there.
Assuming I do the development on Minikube, along with the installation of plugins etc., how do I handle the move of persistent volumes between my local cluster and the target cluster? Should I just reinstall Wordpress every time the Pod is scaled?
You can follow the Writing Portable Configuration guide (https://kubernetes.io/docs/concepts/storage/persistent-volumes/#writing-portable-configuration) for persistent volumes if you are planning to migrate to a different cluster.
In a real production scenario you would want to use a standard tool to back up and migrate persistent volumes between clusters. Velero is such a tool, which enables you to achieve that.
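A sketch of that migration with Velero's CLI, assuming Velero is installed in both clusters and pointed at the same backup storage location; the backup name and namespace are illustrative:

```shell
# On the source cluster: back up the wordpress namespace,
# including its persistent volumes
velero backup create wordpress-backup --include-namespaces wordpress

# On the target cluster (with kubectl/velero pointed at it):
# restore the namespace from that backup
velero restore create --from-backup wordpress-backup
```

This treats the volume contents as part of the application state to be moved, rather than trying to keep two live volumes in sync.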

Cloudify architecture

I am trying to setup cloudify in an OpenStack installation using this offline guide.
This guide does not specify much about the cloud platform, so I have assumed it can be used in an OpenStack environment. I am using the simple manager blueprint YAML file for bootstrapping.
I have the following questions:
Can I use fabric 1.4.2 with cloudify 3.4.1 ?
If not, where can I find the wagon (.wgn) file for Fabric 1.4.1?
Architecture: Can I use the CLI inside a network to bootstrap a manager within that network? This network lies inside an OpenStack environment. Can the Cloudify CLI machine, Cloudify Manager, and the application reside within one network inside OpenStack? If so, how? We would like to test it inside a single network.
(Full disclosure: I wrote the document you linked to.)
Yes you can.
You can find all Wagon files for all versions of the Fabric plugin here: https://github.com/cloudify-cosmo/cloudify-fabric-plugin/releases
Yes.

Dart lang app with open stack / docker / vagrant

I'm a newbie to these techs (OpenStack / Docker / Vagrant), and not sure if I understood them correctly (most likely I did not). My understanding is that it is something like having a portable application that runs with the same development configuration, to ensure the whole development team has the same setup. But I did not understand what comes after development, and how to benefit from them with a Dart app.
my question is:
1. Correct my understanding.
2. Do I need the end user to have these things installed on his system, and to run my application through them, same as in the development stage?
3. How can I build/develop/distribute a Dart lang app through them? Maybe because these, as well as Dart, are new, I could not find enough info while googling.
thanks
Docker is similar to a virtual machine like VMware or VirtualBox in that it creates an abstraction layer between the host operating system and the operating system running within a Docker container. The difference is that Docker doesn't emulate the entire hardware. The disadvantage is that Docker only runs on Linux and only Linux can run inside Docker. If your host is an Intel system you can't run an ARM Linux inside the container. (Theoretically you can run VirtualBox inside Docker and run Windows or other OSes in it.)
With Docker you can test your application locally in the same environment as the application will run when deployed.
When you for example create an application you want to run in Google Compute Engine, you install and test it locally inside a Docker container and then deploy the Docker container to Google Compute Engine as a whole unit. When there is a bug in the deployed application, you should be able to reproduce it locally as well, because it's just a 1:1 copy. No bug could have been introduced because the operating system or other dependencies were installed differently in the deployment environment than in the development/test environment.
The Dockerfile is a set of instructions to set up a Docker container. If you want to create a new Docker container (for example for a new developer), you just let Docker process the Dockerfile and a new Docker container is created from it. This makes it easy to create new containers.
If you want to update one dependency to a newer version, or want to add or remove components to/from the environment, you change the Dockerfile and create a new container from it. This way you avoid the situation where manual additions/removals to existing containers let the containers of different developers/testers/deployments diverge from each other.
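The rebuild-from-Dockerfile workflow described above boils down to a short command cycle; the image and container name `myapp` is illustrative, and the commands assume a Dockerfile in the current directory and a running Docker daemon:

```shell
# Rebuild the image from the (updated) Dockerfile
docker build -t myapp:latest .

# Replace the running container with one created from the fresh image,
# instead of modifying the old container in place
docker rm -f myapp
docker run -d --name myapp myapp:latest
```

The key point is that containers are disposable: changes go into the Dockerfile, never into a running container.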
I haven't used OpenStack myself but from the web page it seems to provide components and tools to build and manage your own cloud infrastructure.
I also haven't used Vagrant myself, but it seems to help automate a lot of tasks related to creating and managing virtual machines like VMware, VirtualBox, Docker, and probably others.
When you have for example a server application, it probably consists of a number of components that you don't want to all run in one container, but rather split up into several containers: one container for the database, one for the web server, one for the backend application (created in Dart for example), and others. It can become cumbersome to manage all those containers; Vagrant helps to automate the related tasks.
