Autoscaling Google Cloud Composer - airflow

I have read this Medium article which is one of the top hits when searching for autoscaling and Cloud Composer. It shows some 'hacks' that you can use to autoscale Composer while it remains configured to use the CeleryExecutor.
I have also read the GCP docs on using KubernetesPodOperator (KPO) with Cloud Composer, and have implemented that before.
However, using KPO means that you don't get to utilize all the other Airflow Operators - you have to write your own container and code every time.
KubernetesExecutor seems to be the best way forward - you get to use the Airflow Operators, and autoscaling can be enabled since it will create a new Kubernetes Pod for every task instance.
Google Cloud Composer currently runs on CeleryExecutor. Under Blocked Airflow Configurations, the docs currently state, for the core-executor setting:
Cloud Composer configures Airflow to use the Celery executor
Will KubernetesExecutor ever be an option for Composer?

Unfortunately, your question can't be answered yet, as there are no official plans to do so. That said, I would be surprised if this wasn't at least under consideration by the Cloud Composer product team.
But as soon as there is news about it, it should be published in this Feature Request.

I would recommend this airflow-executors-explained overview for a feature comparison of CeleryExecutor and KubernetesExecutor. As you have already found in the links you provided, the CeleryExecutor does provide scalability for the Composer environment. Having KubernetesExecutor as an option would be nice, but it is not necessary, as the additional benefits don't outweigh the downsides.

Related

Google Cloud Composer: Save on costs

I am trying to figure out how to save on costs with Google Cloud Composer. Is there any way to spin down the server when none of your dags are running, then spin it up again when a dag needs to run?
It's costing way too much since I believe even though my dags are not running the server remains up and we're getting charged.
Thanks,
For now, there is no way to enable or disable a Composer environment on demand. Saving money on a server that is not in use would need a feature similar to autoscaling, for which a request has already been filed.
You can find a lot of useful information about saving costs on Medium.
One way to control your costs in Cloud Composer is to use autoscaling. The number of nodes in the GKE cluster can be set to autoscale; follow this guide. A smaller Cloud Composer environment and shorter running times are the best practice.
Cloud Composer charges for the compute resources allocated to an environment, and its components continue to run even when no DAGs are deployed. There's not much you can reduce or turn off, so you may want to consider another platform service, such as Dataflow, which is serverless.
I hope you find the above pieces of information useful.
You can now take a snapshot from the GUI or the v1beta API, then delete the environment. When you want to work on it again, simply create a new environment and load the snapshot from GCS via the GUI or API. Creation and snapshot operations may take 20-30 minutes.

Google stackdriver database agent for OracleDB?

We know that Google's stackdriver supports monitoring for third-party applications like postgresql, mysql, couchdb and others mentioned here. They have also defined the service configuration files for the monitoring agent here.
As per my understanding, I think they somehow use collectd's third-party plugins somewhere in this. Also, since there exists a plugin for Oracle, stackdriver should support that too. But I can't see Oracle in the list of supported third-party applications. So, does stackdriver support it or not?
The Stackdriver monitoring agent package does not bundle the oracle plugin, so it's not supported. You may be able to write a shell script (invoked via the exec plugin) or a Python script (invoked via the python plugin) to query your database, and use the custom metrics mechanism to ingest the metrics.
You could also try BindPlane from our partner, Blue Medora.
Disclaimer: I'm an engineer on the Stackdriver team.
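The exec-plugin route suggested above could look roughly like the following. This is only a minimal sketch: it assumes the agent's collectd exec plugin is configured to run the script, and it relies on collectd's plain-text PUTVAL protocol and the COLLECTD_HOSTNAME / COLLECTD_INTERVAL environment variables that the exec plugin sets. The plugin name "oracle", the metric name "active_sessions" and the query_active_sessions() helper are hypothetical placeholders, and mapping the value onto a Stackdriver custom metric is not covered here.

```python
import os
import time


def format_putval(host, plugin, type_name, value, interval=60, type_instance=None):
    """Build one line of collectd's plain-text protocol (PUTVAL)."""
    identifier = f"{host}/{plugin}/{type_name}"
    if type_instance:
        identifier += f"-{type_instance}"
    # "N" tells collectd to use the current time as the timestamp.
    return f'PUTVAL "{identifier}" interval={interval} N:{value}'


def query_active_sessions():
    """Hypothetical placeholder: replace with a real query against your Oracle database."""
    return 42


def emit_forever():
    """Print one PUTVAL line per interval, as an exec-plugin script is expected to do."""
    host = os.environ.get("COLLECTD_HOSTNAME", "localhost")
    interval = int(float(os.environ.get("COLLECTD_INTERVAL", "60")))
    while True:
        line = format_putval(host, "oracle", "gauge", query_active_sessions(),
                             interval, "active_sessions")
        print(line, flush=True)  # collectd reads the script's stdout
        time.sleep(interval)

# When deploying the script, call emit_forever() at the bottom of the file.
```

For example, format_putval("db1", "oracle", "gauge", 5, 10, "active_sessions") produces the line that collectd would parse as a gauge named active_sessions for host db1.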

Deploying wordpress as AWS lambda functions?

I am wondering if it is feasible to deploy wordpress as a series of lambda functions on AWS API gateway. Any pointers on the feasibility/gotchas would be greatly appreciated!
Thanks in advance,
PKK
You'll have a lot of things to consider with persistence and even before that, Lambda doesn't support PHP. I'd probably look at Microsoft Azure Functions instead that do support PHP and do have persistent storage.
While other languages (such as Go, Rust, Swift etc.) can be "wrapped" to run in AWS Lambda with relative ease, compiling PHP targeting the same platform and running it is a bit different (and certainly more painstaking). Think about all the various PHP modules you'd need for starters. Moreover, I can't imagine performance will be as good as something like a Go binary.
If you can do something clever with the Phalcon framework and come up with an easy build and deploy process, then maayyyybee.
Though, you'd probably need to really overhaul something like WordPress which was not designed for this at all. It still uses some pretty old conventions due to the age of the project and while that is all well and good for your typical PHP server, it's a different ball game in the sense of this "portable" PHP installation.
Keep in mind that PHP sessions are relied upon as well and so you're going to need to move those elsewhere due to the lack of persistence with AWS Lambda. You can probably find some sort of plugin for WordPress that works with Redis?? I have to imagine something like that has been built by now... But there will be many complications.
I would seriously consider using Azure Functions to begin with OR using Docker and forgoing the pricing model that cloud functions offers. You can still find some pretty cheap and scalable hosting out there.
What I've done previously was use AWS ECS (Docker) with EFS (network storage) for persistence and RDS for the database. While this doesn't carry the same pricing model as Lambda, it is still cost efficient. You can set up your ECS Service to autoscale up and down. So that way you're running the bare minimum until you need more.
I've written a more in depth article about it here: https://serifandsemaphore.io/how-to-host-wordpress-like-a-boss-b5993fcfbd8e#.n6fbnf8ii ... but it's basically just the idea of running WordPress in Docker and using EFS to offload the persistent storage issues. You can swap many of the pieces of the puzzle out if you like. Use a database hosted in some other Docker service or Compose or where ever. That part need not be RDS for example. Even your storage could be handled in a different way, though EFS worked pretty well! The only major thing to note about EFS is the write speed. Most WordPress sites are read heavy though. Your mileage will vary depending on your needs.
Is it possible? Yes, anything is possible with enough time and effort. Is it worth it? That is a question best to ask yourself.
PHP can be run on Lambda as per the documentation located here: https://aws.amazon.com/blogs/compute/scripting-languages-for-aws-lambda-running-php-ruby-and-go/ .
The bigger initial problem as stated in other comments is a persistent file system. S3 for media storage is doable via Wordpress plugin (again from the comments) but any other persistent storage for the request / script execution is the initial biggest hurdle. Tackle one problem at a time till you get to the end!

Is there any solution for making my own Meteor cloud server?

meteor deploy myapp.meteor.com
When I run this command, my Meteor app is uploaded to the Meteor cloud server.
Is there any solution or repository for making my own Meteor cloud server?
meteor deploy mycloud.server.com myapp.mydomain.com
I know I can use my own domain with this command:
meteor deploy myapp.mydomain.com
But I want to make my own cloud service like Meteor does.
I know about https://github.com/arunoda/meteor-up, but it is a single-service solution.
It is not meant for one or more servers (a clustered setup) running many services.
If there is no solution for this, I'll build one myself.
For now Galaxy is still not released; it should do exactly what you are looking for, i.e. using deploy on your own server.
An alternative might be modulus.io but it is still not the easy deployment we would like.
The simplest option I have found so far is still meteor-up. You can use it to deploy to several servers too. The point is that meteor-up expects running Ubuntu (or Debian) machines, and you deploy to those machines. You still need to set up an oplog for MongoDB and a high-availability proxy (with sticky sessions) to forward requests to the right virtual machines.
If only performance matters, you can build microservices and integrate them through service discovery, as provided by meteorhacks:cluster. While this will help load balance your app, it does not (yet?) provide a way to route the client according to the domain name, meaning you still need a reverse proxy to reach the right service discovery from a domain. Also, this package does not provide any way to deploy your app; it is just a convenient way to help manage and scale your services.
If you need a reliable solution right now for dockerizing Meteor, deploying it on clusters and managing them, I would strongly advise looking at: https://bulletproofmeteor.com It is a very good source for building reliable Meteor apps with high availability. Note that not all the chapters are free, but there is a whole chapter covering "Deploying Meteor Apps into a Kubernetes Cluster" which goes step by step through the process of setting up your server(s) to run your Meteor app in a PaaS way.
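To illustrate what "sticky session" means for the proxy mentioned above, here is a minimal sketch of the routing decision only, not a proxy. It assumes the proxy can see some stable per-client identifier (a cookie value, for instance); the BACKENDS addresses and the pick_backend name are hypothetical, and real proxies usually implement stickiness with cookies or source-IP hashing in their own configuration.

```python
import hashlib

# Hypothetical addresses of the machines running the Meteor app.
BACKENDS = ["10.0.0.1:3000", "10.0.0.2:3000", "10.0.0.3:3000"]


def pick_backend(session_id, backends):
    """Map a session id to one backend, always the same one for the same id."""
    if not backends:
        raise ValueError("at least one backend is required")
    # A stable hash (unlike Python's built-in hash()) gives the same result
    # across processes and restarts.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Calling pick_backend with the same session id and the same backend list always returns the same address, which is what keeps a client's long-lived connection on one machine. Note that with this simple modulo scheme, changing the number of backends remaps most sessions.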

How to configure Meteor Oplog Tailing on a Sharded Mongo DB

As we're developing a greedy real-time monitoring application on Meteor, we reached the limit of our single MongoDB instance.
We migrated the DB to a sharded cluster with 2 shards for now, but we might expand up to 6 shards. (I have 2 BladeE chassis with 28 servers)
How do we configure Meteor Oplog Tailing on a mongo db cluster with sharding enabled?
Thanks,
Now there is good news :) You can use a sharded MongoDB database with Meteor easily, with a bit of tweaking.
Although the Meteor core development team hasn't yet added Oplog Tailing support for sharding to their roadmap, the workaround is simple. You just add these 2 Meteor packages: cultofcoders:redis-oplog and disable-oplog, add a Redis server, tweak your code a bit and you are good to go.
The reason Oplog Tailing doesn't work with a sharded MongoDB database is simply that the core development team hasn't yet planned to support it. In fact, it's now possible to add support for sharded databases. If you add a bunch of new records and read the oplogs with tailable cursors from all shards, you will notice that the MongoDB balancer moves data, let's say from shard01 to shard02, so that record id 0001 gets removed from shard01 and added to shard02. This situation is confusing for Meteor, as it doesn't know whether records were actually removed/added by users or by the MongoDB balancer. However, there is a way to tell the two apart: the fromMigrate flag (read more about this on the official MongoDB blog). So what we can do for now is wait for an update from the core development team or work around it with a few tricks.
And the most promising workaround I've found so far is a Meteor package called cultofcoders:redis-oplog. It's open source and available on GitHub (please check their repository for full documentation; it's very easy to use). The idea behind this package is to use a separate Redis server as the pub/sub system (it doesn't store any data, it's just for pub and sub) instead of Meteor's own, which relies heavily on the oplog. This way, we don't have to worry about the oplog of a sharded database, which Meteor doesn't yet support. Redis is mature and has been used in production by many big companies. It's fully compatible with Meteor and you don't have to change the way you use Meteor. However, you do have to tweak your code a bit when updating a collection's data in order to publish changes to Redis, and then the cultofcoders:redis-oplog package will handle the rest.
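As a rough illustration of the fromMigrate idea, the sketch below filters a list of oplog-style entries and keeps only the changes that did not come from the balancer. It is not how Meteor or any package does it; the entries are plain dictionaries shaped like oplog documents (op, ns, o, and fromMigrate set to true on entries written during chunk migration), and the sample data and function names are made up for the example. Merging and ordering entries from several shards is left out.

```python
def is_user_change(entry):
    """True for inserts, updates and deletes that were not caused by chunk migration."""
    if entry.get("op") not in ("i", "u", "d"):
        return False  # skip no-ops, commands, etc.
    return not entry.get("fromMigrate", False)


def user_changes(entries):
    """Keep only the oplog entries that represent real user writes."""
    return [entry for entry in entries if is_user_change(entry)]


# Made-up sample: one user insert, then the balancer moving record 0001
# from one shard to another (a delete plus an insert, both flagged).
sample_oplog = [
    {"op": "i", "ns": "app.metrics", "o": {"_id": "0001"}},
    {"op": "d", "ns": "app.metrics", "o": {"_id": "0001"}, "fromMigrate": True},
    {"op": "i", "ns": "app.metrics", "o": {"_id": "0001"}, "fromMigrate": True},
    {"op": "n", "ns": "", "o": {"msg": "periodic noop"}},
]
```

With this sample, user_changes(sample_oplog) keeps only the first entry, so the move performed by the balancer is not mistaken for a user deleting and re-adding the record.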
Seems Meteor does not yet support sharded mongoDB setups with > 1 shard: https://forums.meteor.com/t/mongodb-sharding-and-livequery/3712
