How to mount Airflow DAGs via Git? - airflow

Found that in airflow.cfg there is possibility to 'Mount DAGs via Git':
# Git credentials and repository for DAGs mounted via Git (mutually exclusive with volume claim)
git_repo = https://username#bitbucket.org/repo.git
git_branch = master
git_subpath =
# Use git_user and git_password for user authentication or git_ssh_key_secret_name and git_ssh_key_secret_key
# for SSH authentication
git_user = username
git_password = pass
git_sync_root = /git
git_sync_dest = repo
# Mount point of the volume if git-sync is being used.
# i.e. /home/user/airflow/dags
git_dags_folder_mount_point = /home/user/airflow/dags
Was unable to find any documentation on this.
Do I understand it correctly, that I am able to deploy my DAGs to Airflow production server using Git repo by just pushing to the branch specified?
Couldn't get that operating, would appreciate any thoughts.

Related

How to collect jolokia data via telegraf but just if the jolokia connection is active?

When my application is up, telegraf works fine and collects data related to jolokia since my application opens the port 11722 that telegraf uses to get the metrics. But then, when my application is down, telegraf starts to get errors since it can't connect to Jolokia. My telegraf version is 1.5.3 and this is a Production environment, so I don't have much flexibility to change the version. Is there a way to collect the jolokia metrics just when my application is up and running?
I've tried to create a script to check if jolokia was running and use with a tag that then I could use with my agent, but this didn't work:
[[inputs.exec]]
commands = ["sh /local/1/home/svcegctp/telegraf/inputs/scripts/check_jolokia.sh"]
timeout = "1s"
data_format = "influx"
name_override = "jvm_status"
[inputs.exec.tags]
running = "true"
(...)
[[inputs.jolokia2_agent]]
# Add agents URLs to query
urls = ["http://localhost:11722/jolokia"]
[inputs.jolokia2_agent.tags]
running = "true"
This is my script:
check_jolokia.sh
#!/bin/bash
if curl -s -u <username>:<password> http://localhost:11722/jolokia/version >/dev/null 2>&1; then
echo "jvm_status running=true"
else
echo "jvm_status running=false"
fi

How to make Airflow to use default aws profile

I would like airflow to use the default local ~./aws credentials when running locally and when ran in EMR it must take those credentials.
Currently, I am exporting them as AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID . I have two environments. therefore I am extracting the account number and based on that it will take the appropriate credentials
os.environ["AWS_ACCESS_KEY_ID"] = "A"
os.environ["AWS_SECRET_ACCESS_KEY"] = "s"
account_id = boto3.client('sts').get_caller_identity().get('Account')
if account_id=="2222222":
path="prd-datahub"
elif account_id=="111111111":
path="datahub"
tz = pytz.timezone('US/Central')
DAG_ID = os.path.basename(__file__).replace(".py", "")
os.environ['AWS_DEFAULT_REGION'] = 'us-west-1'
When I execute locally it says "botocore. credentials not found". How do I give this local AWS credentials path?

How do I deploy Apache-Airflow via uWSGI and nginx?

I'm trying to deploy airflow in a production environment on a server running nginx and uWSGI.
I've searched the web and found instructions on installing airflow behind a reverse proxy, but those instructions only have nginx config examples. However, due to the permissions, I can't change the nginx.conf itself and have to solve it via uswsgi.
My folder structure is:
project_folder
|_airflow
|_airflow.cfg
|_webserver_config.py
|_wsgi.py
|_env
|_start
|_stop
|_uwsgi.ini
My path/to/myproject/uwsgi.ini file is configured as follows:
[uwsgi]
master = True
http-socket = 127.0.0.1:9999
virtualenv = /path/to/myproject/env/
daemonize = /path/to/myproject/uwsgi.log
pidfile = /path/to/myproject/tmp/myapp.pid
workers = 2
threads = 2
# adjust the following to point to your project
wsgi-file = /path/to/myproject/airflow/wsgi.py
touch-reload = /path/to/myproject/airflow/wsgi.py
and currently the /path/to/myproject/airflow/wsgi.py looks as follows:
def application(env, start_response):
start_response('200 OK', [('Content-Type','text/html')])
return [b'Hello World!']
I'm assuming I have to somehow call the airflow flask app from the wsgi.py file (perhaps by also changing some reverse proxy fix configs, since I'm behind SSL), but I'm stuck; what do I have to configure?
Will this procedure then be identical for the workers and scheduler?

How to use airflow configuration file (airflow.cfg) when airflow run in container?

I'm using Airflow that run in container as described here. It seems that the configuration file airflow.cfg on the host have no impact on Airflow. I tried the solution here but it didn't help.
The configuration fields I changed are:
default_timezone = system #(from utc)
load_examples = False #(from True)
base_url = http://localhost:8081 #(from 8080)
default_ui_timezone = system #(from UTC)
I didn't see any impact on airflow eventhough I did docker-compose down and docker-compose up

Openstack-Keystone failing to start

I've tried almost everything in the past couple of days to get keystone running to no avail.
The setup is all on the same host, the virtualization and openstack and keystone are all on the same host, so I've tried setting up keystone with 127.0.0.1 and localhost and the IP of the host with no luck
[DEFAULT] log_file = /var/log/keystone/keystone.log
admin_token = ***
bind_host = 192.168.33.11
public_port = 5000
admin_port = 35357
compute_port = 8774
# === Logging Options ===
# Print debugging output verbose = True
# Print more verbose output
# (includes plaintext request logging, potentially including passwords)
# debug = False
# Name of log file to output to. If not set, logging will go to stdout. log_file = keystone.log
# The directory to keep log files in (will be prepended to --logfile) log_dir = /var/log/keystone
# Use syslog for logging.
# use_syslog = False
# syslog facility to receive log lines
# syslog_log_facility = LOG_USER
# If this option is specified, the logging configuration file specified is
# used and overrides any other logging options specified. Please see the
# Python logging module documentation for details on logging configuration
# files. log_config = logging.conf
# A logging.Formatter log message format string which may use any of the
# available logging.LogRecord attributes.
# log_format = %(asctime)s %(levelname)8s [%(name)s] %(message)s
# Format string for %(asctime)s in log records.
# log_date_format = %Y-%m-%d %H:%M:%S
# onready allows you to send a notification when the process is ready to serve
# For example, to have it notify using systemd, one could set shell command:
# onready = systemd-notify --ready
# or a module with notify() method:
# onready = keystone.common.systemd
[sql] connection = mysql://keystone:***#localhost/keystone
# idle_timeout = 200
[identity] driver = keystone.identity.backends.sql.Identity
[catalog] template_file = /etc/keystone/default_catalog.templates driver = keystone.catalog.backends.sql.Catalog
# dynamic, sql-based backend (supports API/CLI-based management commands)
# driver = keystone.catalog.backends.sql.Catalog
# static, file-based backend (does *NOT* support any management commands)
# driver = keystone.catalog.backends.templated.TemplatedCatalog
# template_file = default_catalog.templates
[token] driver = keystone.token.backends.sql.Token
# driver = keystone.token.backends.kvs.Token
# Amount of time a token should remain valid (in seconds)
# expiration = 86400
I've enabled logging in the logging.conf file and set the level to DEBUG and INFO, however nothing in log files.
[root#* keystone]# service openstack-keystone restart
Stopping keystone: [FAILED]
Starting keystone: [ OK ]
[root#* keystone]# service openstack-keystone restart
Stopping keystone: [FAILED]
Starting keystone: [ OK ]
[root#* keystone]# ps aux | grep keystone
root 25580 0.0 0.0 103236 880 pts/1 S+ 09:41 0:00 grep keystone
[root#* keystone]#
Any ideas will be greatly appreciated.Thank you
As I mentioned in the comment, I've never seen a config file with the section headings on the same line as config option:
[DEFAULT] log_file = /var/log/keystone/keystone.log
I've also seen it like this instead:
[DEFAULT]
log_file = /var/log/keystone/keystone.log
However, I have no idea if this is related to your issue.
To enable debug-level logging, make sure you set the following in /etc/keystone/logging.conf:
[logger_root]
level=DEBUG
Then try running keystone manually instead of as a service:
$ sudo -u keystone bash
$ HOME=/var/lib/keystone keystone-all --debug
Hopefully you'll see a relevant error message on standard out.
(I believe it will still send the logging to /var/log/keystone/keystone.log, not sure how to actually get it to log to standard out when running manually like this).
Add a valid token for admin_token. It should not be "*".
Check the below line:
[sql] connection = mysql://keystone:*#localhost/keystone
It should be something like:
connection = mysql://keystone:keystone#localhost/keystone
Refer to this url for an example keystone.conf file
http://docs.openstack.org/trunk/openstack-compute/install/yum/content/keystone-conf-file.html
I ran into this issue as well. I am running on Ubuntu 12.04LTS. What i found was the the service start command in /etc/init/keystone.conf is using start-stop-daemon to run the service. It was written for a newer version than the one on my box. The --chdir variable is not accepted as an input. once i removed that line keystone started right up.
Try running:
start-stop-daemon --start --chuid keystone --name keystone --exec /usr/bin/keystone-all
/etc/init/keystone.conf after
description "Keystone API server"
author "Soren Hansen <soren#linux2go.dk>"
start on runlevel [2345]
stop on runlevel [!2345]
respawn
exec start-stop-daemon --start --chuid keystone \
--name keystone \
--exec /usr/bin/keystone-all
Check if your IP-adress is equal to HOST_IP=... in localrc
This might be due to keystone not getting started properly and therefore port 35357 is not in listening mode.
This seems to be anomalous behavior of service keystone.
I am mentioning steps which have worked on my system for havana installtion on Ubuntu 12.04 Kernel version 3.2.0-67-generic. After a day of headache around this issue. Try these steps, preferably in the same order.
1) Remove keystone package:-
apt-get remove keystone
2) Reboot your system
reboot
3) After reboot again INSTALL KEYSTONE.
apt-get install keystone
4) Check status of keystone service
service keystone status
It will show start/running
5) Now do the necessary changes you want to do in /etc/keystone/keystone.conf
after making changes in conf file DO NOT RESTART KEYSTONE SERVICE
Use stop and start command to make an effect of restart but don't restart.
service keystone stop
service keystone start
For further help, pasting a dump of my CLI :-
http://pastebin.com/sduuFCL7
There are multiple problems with the icehouse documentations and install. packstack is broken so the only way to get started is to manually follow the upstream docs for your distro. keystone is very important to set up first correctly before moving on, because other services rely on it.
the paste-file /usr/share/keystone/keystone-dist-paste.ini should be copied to /etc/ to be accessible to the config scripts like this:
cp /usr/share/keystone/keystone-dist-paste.ini /etc/keystone/
chown keystone:keystone /etc/keystone/*
make sure to update keystone.conf with the new config_file value
documentation is wrong about the mysql connection, it should go to [sql] and not [database] so:
openstack-config --set /etc/keystone/keystone.conf sql connection mysql://keystone:PASSWD#controller/keystone
the name controller should be resolved to whatever mysql is bound to, I will add it to /etc/hosts like this if [mysqld]/bind-address in /etc/my.cnf is 10.1.1.100:
10.1.1.100 controller
make sure to uncomment log_file in keystone.conf to get what is happening.
I was facing similar issue.I followed below mentioned steps and openstack-keystone service got started.
Edit the /etc/keystone/keystone.conf file and complete the following actions:
In the [DEFAULT] section
[DEFAULT]
admin_token = ADMIN_TOKEN
In the [database] section
[database]
connection = mysql://keystone:KEYSTONE_DBPASS#controller/keystone
In the [token] section, configure the UUID token provider and SQL driver
[token]
provider = keystone.token.providers.uuid.Provider
driver = keystone.token.persistence.backends.sql.Token
In the [revoke] section
[revoke]
driver = keystone.contrib.revoke.backends.sql.Revoke
After making above changes populate the Identity service database using command
su -s /bin/sh -c "keystone-manage db_sync" keystone
Start the openstack-keystone service using below command
systemctl start openstack-keystone

Resources