How to set up a database on startup in AWS amplify? - aws-amplify

I have an amplify and I get how to add an api function, use a lambda layer etc. What I don't see is how to create the database on startup -- it appears from the documentation that this is done from a CloudFormation stack, but I still can't see how to ensure that the database is set up on startup of the app (or build the tables if not) using something like SQLAlchemy.
What's the intended flow here?

Related

Airflow Metadata DB = airflow_db?

I have a project requirement to back-up Airflow Metadata DB to some data warehouse (but not using an Airflow DAG). At the same time, the requirement mentions some connection called airflow_db.
I am quite new to Airflow, so I googled a bit on the topic. I am a bit confused about this part. Our Airflow Metadata DB is PostgreSQL (this is built from docker-compose, so I am tinkering on a local install), but when I look at Connections in Airflow Web UI, it says airflow_db is MySQL.
I initially assumed that they are the same, but by the looks of it, they aren't? Can someone explain the difference and what they are for?
Airflow creates airflow_db Conn Id with MySQL by default (see source code)
Default connections are not really useful in production system. It's just a long list of stuff that you are probably not going to use.
Airflow 1.1.10 introduced the ability not to create the default list by setting:
load_default_connections = False in airflow.cfg (See PR)
To give more background the connection list is where hooks find the information needed in order to connect to a service. It's not related to the backend database. Though the backend is db like any db and if you wish to allow hooks to interact with it you can define it in the list like any other connection (which is probably why you have this as option in the default).

Web.Config transforms for Multi-Tenant deployment of WebForms app in docker over AWS ECS

Environment
ASP.NET WebForms app over IIS
Docker container host
AWS ECS hosting platform
Each client hosting its own copy of the app with private database connection string
Background
In the non-docker environment, each copy is a virtual directory under IIS, and thus have their own individual web.config pointing to dedicated databases. The underlying codebase is the same for each client, with no client-specific customization involved. The route becomes / here.
In the docker environment (one container per client), each copy goes over as a central root application.
Challange
Since the root image is going to be the same, how to have the web.config overridden for each client deployment.
We shouldn't create multiple images (one per client) as that will mean having extra deployment jobs and losing out on centralization. The connection strings should ideally be stored in some kind of dictionary storage applicable at ECS level which can provide client-specific values upon loading of corresponding containers.
Presenting the approach we used to solve this issue. Hope it may help others struck in similar cases.
With the problem statement tied to having a single root image and having any customization being applied at runtime, we knew that there needs to be a transformation of web.config at time of loading of the corresponding containers.
The solution was to use a PowerShell script that will read the web.config and get replace the specific values which were having a custom prefix embedded to the key. The values got passed from custom environmental variables within ECS and the web.config also got updated to have the keys with the prefix added.
Now since the docker container can have only a single entry point, a new base image was created which instantiated an IIS server and called a PowerShell script as startup. The called script called this transformation script and then set the ServiceMonitor on the w3cwp.
Thanks a lot for this article https://anthonychu.ca/post/overriding-web-config-settings-environment-variables-containerized-aspnet-apps/
I would use environment variables as the OP suggests for this with a start up transform, however I want to make the point that you do not want sensitive information in ENV variables, like DB passwords, in your ECS task definition.
For that protected information, you should use ECS secrets coupled with Parameter Store in Systems Manager. These values can be stored encrypted in the Parameter Store (using a KMS key) and the ECS Agent will 'inject' them as ENV variables on task startup.
For me, to simplify matters, I simply use secrets for everything although you can choose to only encrypt the sensitive information and leave the others clear.
I dynamically add the secrets for the given application into my task definitions at deploy time by looking up the 'secrets' for the given app by 'namespace' (something that Parameter Store supports). Then, if I need to add a new parameter, I can just add a new secret to the store in the given namespace and re-deploy the app. It will pick up and inject into the task definition any newly defined secrets automatically (or remove ones that have been retired).
Sample ruby code for creating task definition:
params = ssm_client.get_parameters_by_path(path: '/production/my_app/').parameters
secrets = params.map{ |p| { name: p.name.split("/")[-1], value_from: p.arn } }
task_def.container_definitions[0].secrets = secrets
This last transform injects the secrets such that the secret 'name' is the ENV variable name... which ends up looking like this:
"secrets": [
{
"valueFrom": "arn:aws:ssm:us-east-1:578610029524:parameter/production/my_app/DB_HOSTNAME",
"name": "DB_HOSTNAME"
},
{
"valueFrom": "arn:aws:ssm:us-east-1:578610029524:parameter/production/my_app/DB_PASSWORD",
"name": "DB_PASSWORD"
}
You can see there are no values now in the task definition. They are retrieved and injected when ECS starts up your task.
More information:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/specifying-sensitive-data.html

Cloudera Post deployment config updates

In cloudera is there a way to update list of configurations at a time using CM-API or CURL?
Currently I am updating one by one one using below CM API.
services_api_instance.update_service_config()
How can we update all configurations stored in json/config file at a time.
The CM API endpoint you're looking for is PUT /cm/deployment. From the CM API documentation:
Apply the supplied deployment description to the system. This will create the clusters, services, hosts and other objects specified in the argument. This call does not allow for any merge conflicts. If an entity already exists in the system, this call will fail. You can request, however, that all entities in the system are deleted before instantiating the new ones.
This basically allows you to configure all your services with one call rather than doing them one at a time.
If you are using services that require a database (Hive, Hue, Oozie ...) then make sure you set them up before you call the API. It expects all the parameters you pass in to work so external dependencies must be resolved first.

Firebase + Datastore = need_index

I'm working through the appengine+go tutorial, which connects in with Firebase: https://cloud.google.com/appengine/docs/standard/go/building-app/. The code is available at https://github.com/GoogleCloudPlatform/golang-samples/tree/master/appengine/gophers/gophers-6, which aside from my Firebase keys is identical.
I have it working locally just fine under dev_appserver.py, and it queries the Vision API and adds labels. However, after I deploy to appengine I get an index error on datastore. If I go to the Firebase console, I see the collection (Post) and the field (Posted) which is a timestamp.
If I change this line: https://github.com/GoogleCloudPlatform/golang-samples/blob/master/appengine/gophers/gophers-6/main.go#L193 to remove the Order("-Posted") then everything works (it's important to note that any Order call causes it to error, except the test records I've posted come in random order.
The error message when running in appengine is: "Getting posts: API error 4 (datastore_v3: NEED_INDEX): no matching index found."
I've attempted to create a composite index, or test locally with --require_indexes=true and it hasn't helped me debug the issue.
Edit: I've moved this over to use Firebase's Datastore libraries directly, instead of the GCP updates. I never solved this particular issue, but was able to move forward with my app actually working :)
By default the local development server automatically creates the composite indexes needed for the actual queries invoked in your app. From Creating indexes using the development server:
The development web server (dev_appserver.py) automatically adds
items to this file when the application tries to execute a query that
needs an index that does not have an appropriate entry in the
configuration file.
In the development server, if you exercise every query that your app
will make, the development server will generate a complete list of
entries in the index.yaml file.
When the development web server adds a generated index definition to
index.yaml, it does so below the following line, inserting it if
necessary:
# AUTOGENERATED
The development web server considers all index definitions below this
line to be automatic, and it might update existing definitions below
this line as the application makes queries.
But you also need to deploy the generated index configurations to the Datastore and let the Datastore update indexing information (i.e. the indexes to get into the Serving state) for the respective queries to not hit the NEED_INDEX error. From Updating indexes:
You upload your index.yaml configuration file to Cloud Datastore
with the gcloud command. If the index.yaml file defines any
indexes that don't exist in Cloud Datastore, those new indexes are
built.
It can take a while for Cloud Datastore to create all the indexes and
therefore, those indexes won't be immediately available to App Engine.
If your app is already configured to receive traffic, then exceptions
can occur for queries that require an index that is still in the
process of being built.
To avoid exceptions, you must allow time for all the indexes to build.
For more information and examples about creating indexes, see
Deploying a Go App.
To upload your index configuration to Cloud Datastore, run the
following command from the directory where your index.yaml is located:
gcloud datastore create-indexes index.yaml
For information, see the gcloud datastore reference.
You can use the GCP Console, to check the status of your indexes.

Meteor: Delay on direct db inserts (with external MongoDB)

I have a C application that inserts data directly to the database of my Meteor application. The app works fine (withoud delays) when I run it in development mode (with "meteor"). However, if I run the app as a node app (bundled) and with external MongoDB, there's an annoying delay in screen updates (5-10s).
I have noticed some previous discussions about this:
Meteor: server-side db insert delays
Using node ddp-client to insert into a meteor collection from Node
Questions:
Is there any other way than building a server-side API for doing the db inserts through Meteor?
Why the delay is only when using external MongoDB?
Is there a way in Meteor to shorten this database polling interval?
You need to enable oplog tailing. Without oplog tailing, when your C program makes a database write, the Meteor server doesn't realise anything has changed until it polls MongoDB again. With oplog tailing, it can pick up the changes much more quickly and efficiently. In development mode, oplog tailing is enabled automatically, but for production it needs some additional setup.
Your MongoDB must be set up as a replica set (a replica set of one node does work).
You have to pass in a mongo URL for the replica set's local database with the environment variable MONGO_OPLOG_URL.
For more information, see this article.

Resources