We are using Airflow to schedule our data pipelines, and as part of that we have also added a few connections and variables in the Airflow admin UI.
Everything worked fine in DEV; now we want to set up a PROD environment. How do we migrate these values into the PROD environment?
You can list or export variables and connections through the command line: https://airflow.apache.org/cli.html
Relevant commands:
airflow variables -e variables.json
airflow connections --list
For variables, I generally keep JSON files in our code repo to store the non-sensitive variables for the different environments; these can then be imported easily via the command line, and changes are tracked through git.
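For example, a variables.json kept in the repo might look like this (the keys and values here are made up for illustration):

{
    "env": "prod",
    "s3_bucket": "my-prod-bucket"
}

It can then be imported into the PROD metadata database with:

airflow variables -i variables.json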
For connections, the other option is to use environment variables instead of setting them up in the UI. You can define a connection by setting AIRFLOW_CONN_{CONNECTION_NAME}, for example AIRFLOW_CONN_AWS_DEFAULT for the connection aws_default.
The value stored in the variable must be in URI format, e.g. postgres://user:password@localhost:5432/master or s3://accesskey:secretkey@S3
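For example (the connection name, credentials, and host here are placeholders), you could export the variable in the environment that starts the Airflow processes:

export AIRFLOW_CONN_POSTGRES_MASTER=postgres://user:password@localhost:5432/master

Any hook or operator that references the connection id postgres_master will then resolve it from the environment, without the connection ever being stored in the metadata database.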
Related
I have an Airflow instance on a local Ubuntu machine. This instance doesn't work very well, so I would like to install it again. The problem is that I can't delete the current instance, because it is used by other people, so I would like to create a new Airflow instance on the same machine to put various DAGs there.
How could I do it? I created a different virtual environment, but I don't know how to install a second Airflow server in that environment that works in parallel with the current one.
Thank you!
use a different port for the webserver
use a different AIRFLOW_HOME variable
use a different sql_alchemy_conn (to point to a different database)
copy the deployment you have to start/stop your airflow components (see the sketch below)
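A minimal sketch of what that could look like for the second instance, assuming the 1.x-style CLI used elsewhere in this thread (paths, port, and database name are made up):

export AIRFLOW_HOME=/home/user/airflow2
export AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@localhost/airflow2
airflow initdb
airflow webserver -p 8081
airflow scheduler

Both environment variables must be set in every shell (or service definition) that starts one of the second instance's components.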
Depending on your deployment you might somehow record the process ids of your running Airflow processes (so-called pid files) or have some other way to determine which processes are running. But that is nothing Airflow-specific; it is specific to your deployment.
The documentation for Airflow (https://airflow.readthedocs.io/en/1.9.0/configuration.html) talks about setting an environment variable named $AIRFLOW_HOME, which is where Airflow will be installed. The configuration file airflow.cfg created by this process has an attribute called airflow_home in the [core] section at the top of the file. This makes sense.
But the way you override settings in airflow.cfg with environment variables is with the pattern AIRFLOW__[SECTION]__VARIABLENAME. Based on that pattern, the Airflow home setting should technically be managed by the environment variable AIRFLOW__CORE__AIRFLOW_HOME and not AIRFLOW_HOME.
Why the difference?
Are both needed?
Is one of them not needed?
Do they do different things?
They do different things insofar as $AIRFLOW_HOME works as intended: the value you set will be what you get, while $AIRFLOW__CORE__AIRFLOW_HOME is likely to screw things up.
The $AIRFLOW_HOME value is special in that it is a prerequisite for a handful of actions and is read without support for the $AIRFLOW__[SECTION]__VARIABLENAME interpolation.
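For example, the conventional way works as expected (the path here is illustrative):

export AIRFLOW_HOME=/opt/airflow
airflow initdb

airflow.cfg (and, by default, a SQLite database) is then created under /opt/airflow, whereas setting only AIRFLOW__CORE__AIRFLOW_HOME can leave those bootstrap steps and the rest of the configuration disagreeing about where home is.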
Can a single installation of Apache Airflow be used to handle multiple environments, e.g. Dev, QA1, QA2, and Production (if so, please guide), or do I need a separate install for each? What would be the best design, considering maintenance of all environments?
You can do whatever you want. If you want to keep a single Airflow installation to handle different environments, you could switch connections or Airflow variables according to the environment.
At the end of the day, DAGs are written in Python, so you're really flexible.
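One way to sketch that on a single install (the connection ids and URIs here are made up) is to register one connection per environment and have each DAG pick the id that matches the environment it targets:

airflow connections -a --conn_id warehouse_dev --conn_uri postgres://dev_user:dev_pass@dev-host:5432/warehouse
airflow connections -a --conn_id warehouse_prod --conn_uri postgres://prod_user:prod_pass@prod-host:5432/warehouse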
I would like to know the best practice for setting environment variables on a local machine so that it reflects the production environment.
I want to put the private API keys in ENV variables rather than committing them directly to Git. In Rails, I would use a plugin like figaro to put all ENV variables in a single YML file, and they would be available.
What is the common practice in Meteor?
I think I could:
run SECRET_KEY=some_key OTHER_SECRET_KEY=some_other_key meteor every time I run the local server, but that's too much to remember;
set environment variables locally, but I don't want them to live in the global namespace on my machine.
Any alternatives?
Found this old post while having the same problem.
It looks like Meteor now offers starting with a config file:
meteor run --settings config.json
You would exclude that file (or rather gitignore it) to keep it local. More here in the docs.
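A small sketch of that approach (the file name and keys are made up): create a config.json such as

{
    "secretKey": "some_key",
    "otherSecretKey": "some_other_key"
}

add it to .gitignore, and start the local server with

meteor run --settings config.json

The values are then available on the server as Meteor.settings.secretKey and so on, without living in your shell's global namespace (top-level keys are server-only; only what you nest under a public key reaches the client).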
Setting an environment variable on localhost is done using export.
e.g. export PORT=80
My Question is how to set an environment var for the remote meteor server.
I am using Meteor's free hosting service and deploy using meteor deploy appname, and therefore have no ssh access to the remote command line.
I'd like to set DISABLE_WEBSOCKETS to true.
I've looked at the list of possible meteor commands and haven't found one which relates to setting env vars.
You do it the same way as when you run your server; e.g., you don't have to use export, you can just put the environment variables on the line you use to start Meteor.
PORT=80 node main.js
or if you use forever
PORT=80 forever start main.js
or even with meteor
DISABLE_WEBSOCKETS=TRUE meteor
I'm a bit confused about your setup: by remote meteor server do you mean a production environment? You shouldn't use the meteor command in production, as it is not optimized for that and performance would be significantly affected.
Meteor reads environment variables through process.env, so whatever you use to start the process, you can pass the environment variables to it from the terminal/bash/shell/ssh session that starts it.
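For instance, for a bundled app started with node (all values here are placeholders), Meteor's usual production variables are passed the same way:

MONGO_URL=mongodb://localhost:27017/myapp ROOT_URL=http://myapp.example.com PORT=80 DISABLE_WEBSOCKETS=TRUE node main.js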