How to change default region in Apache Airflow - airflow

I am using SQS for apache airflow.The setup running quiet well for now.
But the default region is us-east-1 , how can we change the default region?
I have added the config parameter region=us-west-2 to the [celery_broker_transport_options] and [celery] sections ,but it doesnt work, the default region is still us-east-1 .
Any suggestions would be of great help.

You are specifying the wrong key name:
[celery_broker_transport_options]
region = us-west-2
See the Celery docs for the key name (region) [1] and the Airflow configuration code for the section name (celery_broker_transport_options) [2].
[1] http://docs.celeryproject.org/en/latest/getting-started/brokers/sqs.html
[2] https://github.com/apache/incubator-airflow/blob/master/airflow/config_templates/default_celery.py#L50

Related

Why some openstack instances use the commands 'openstack server list|grep [ip]' or 'nova list|grep [ip]' can not be found

Why are some openstack instances found in the "dashboard" and viewed using the command 'nova list --ip [ip]', but not found using the command 'openstack server list|grep [ip]', 'nova list|grep [ip]'?
It is hard to say for sure what the cause of your problem is from the information you provided.
However, nova list and openstack server list both default to listing instances for the current project only. To list instances for all projects, you need to include the --all-tenants or --all-projects option respectively.
Refer to the Nova list command command to select parameter parameters:
--limit
Maximum number of servers to display. If limit == -1, all servers will be displayed. If limit is bigger than 'CONF.api.max_limit' option of Nova API, limit 'CONF.api.max_limit' will be used instead.

AWS credentials not found for celery-k8s deployment

I'm trying to run dagster using celery-k8s and using the examples/celery-k8s as a start. upon running the pipeline from playground I get
Initialization of resources [s3, io_manager] failed.
botocore.exceptions.NoCredentialsError: Unable to locate credentials
I have configured aws credentials in env variables as mentioned in the document
deployments:
- name: "user-code-deployment-test"
image:
repository: "somasays/dagster-usercode-example"
tag: "0.5"
pullPolicy: Always
dagsterApiGrpcArgs:
- "-f"
- "/workspace/repo.py"
port: 3030
env:
AWS_ACCESS_KEY_ID: AAAAAAAAAAAAAAAAAAAAAAAAA
AWS_SECRET_ACCESS_KEY: qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
AWS_DEFAULT_REGION: eu-central-1
and I can also see these values are set in the env variables of the pod and can also access the s3 location after pip install awscli and aws s3 ls see the screenshot below the job pod however throws Unable to locate credentials
Please help
The deployment configuration applies to the user code servers. Meanwhile the celery executor runs your pipeline code in separate kubernetes jobs. To provide your secrets there, you will want to configure the env_secrets field of the celery-k8s executor in your pipeline run config.
See https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagster-k8s/dagster_k8s/job.py#L321-L327 for details on the config.

Euca 5.0 Ansible Console Task Failing

Background:
I am only able to get past the ansible console install/config tasks by adding --region localhost to anywhere in: /usr/share/eucalyptus-ansible/roles/cloud-post/tasks/console.yml wherever it calls tools that take that argument.
Otherwise each sub task fails like this: ["euca-describe-images: error: connection error (('Connection aborted.', gaierror(-2, 'Name or service not known')))"]
Running the commands from that playbook directly on the euca server being configured gives the same result unless I specify --region localhost
Problem:
I'm stuck here: [cloud-post : update console route53 system domain for eucalyptus-cloud authentication]
Error: "euform-update-stack: error (ValidationError): No updates are to be performed.", "stderr_lines": ["euform-update-stack: error (ValidationError): No updates are to be performed."]
All services are running except the ImagingBackend is Not Ready
No instances are running according to euca-describe-instances
Images are available:
IMAGE ami-5be483c81cf8bd65c eucalyptus-console-image-5-0-823/eucalyptus-console-image-5-0-823.raw.manifest.xml 000216594841 available private x86_64 machine instance-store hvm
TAG image ami-5be483c81cf8bd65c type eucalyptus-console-image
TAG image ami-5be483c81cf8bd65c version 5.0.823
IMAGE ami-f31092ddb73e29af9 eucalyptus-service-image-v5.0.100/eucalyptus-service-image.raw.manifest.xml 000216594841 available privatx86_64 machine instance-store hvm
TAG image ami-f31092ddb73e29af9 provides imaging,loadbalancing
TAG image ami-f31092ddb73e29af9 type eucalyptus-service-image
TAG image ami-f31092ddb73e29af9 version 5.0.100
---
all:
hosts:
exp-euca.lan.com:
exp-enc-[01:02].lan.com:
vars:
vpcmido_public_ip_range: "192.168.100.5-192.168.100.254"
vpcmido_public_ip_cidr: "192.168.100.1/24"
cloud_system_dns_dnsdomain: "cloud.lan.com"
cloud_public_port: 443
eucalyptus_console_cloud_deploy: yes
cloud_service_image_rpm: no
cloud_properties:
services.imaging.worker.ntp_server: "x.x.x.x"
services.loadbalancing.worker.ntp_server: "x.x.x.x"
children:
cloud:
hosts:
exp-euca.lan.com:
console:
hosts:
exp-euca.lan.com:
node:
hosts:
exp-enc-[01:02].lan.com:
EDIT:
Solved. Details are in the comments of the marked answer.
The name error most likely means that DNS for the domain cloud.lan.com is not being correctly delegated to your deployment. To test this, check if the nameserver is found:
dig +short NS cloud.lan.com
you should see "ns1.cloud.lan.com" and then should be able to use that nameserver to resolve services, e.g.
dig +short ec2.cloud.lan.com #ns1.cloud.lan.com
which should be the IP of the host for the compute service.
The second item is a bug in the ansible playbook that occurs when the stack is already present and up to date. To work around it, you can either update your playbook or delete the stack before running the playbook. Depending on how far the playbook progressed you may have a script to do this:
/usr/local/bin/console-manage-stack -a delete
the related playbook change is https://github.com/AppScale/ats-deploy/pull/36

How to write airflow logs to Elasticsearch?

I am using Airflow 1.10.5. Can't seem to find complete documentation or sample on how to setup remote logging using Elasticsearch. I saw airflow documentation about logging, but it wasn't helpful. I am trying to write the airflow (not task) logs to ES.
As far as I understand the docs, the ES log handler can only read from ES. You would have to setup your logging to print into a file, then use something like filebeat to post the file content to ES and Airflow can then read them back...
https://airflow.readthedocs.io/en/stable/howto/write-logs.html#writing-logs-to-elasticsearch
Writing Logs to Elasticsearch
Airflow can be configured to read task
logs from Elasticsearch and optionally write logs to stdout in
standard or json format. These logs can later be collected and
forwarded to the Elasticsearch cluster using tools like fluentd,
logstash or others.
I was able to achieve using [filebeat][1] shipper.
Input config section in filebeat.yml
</snip>
# ============================== Filebeat inputs ===============================
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
# Change to true to enable this input configuration.
enabled: true
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /path/to/logs/*.log
</snip>
Output config section in filebeat.yml
<snip>
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
# Array of hosts to connect to.
hosts: ["localhost:9200"]
# Protocol - either `http` (default) or `https`.
#protocol: "https"
# Authentication credentials - either API key or username/password.
#api_key: "id:api_key"
username: "elastic"
password: "changeme"
</snip>
Good doc to read especially about airflow --> ES.

Where is the Compose Scylladb SSL certificate?

I'm trying to connect to my scylladb 1.7.4 instance using the connection string provided for me in the compose overview section of the management UI:
$ cqlsh --ssl portal-xxxx.ibm-343.composedb.com 19228 -u scylla -p XXXX --cqlversion=3.3.1
However, the response is:
Validation is enabled; SSL transport factory requires a valid certfile to be specified. Please provide path to the certfile in [ssl] section as 'certfile' option in /Users/snowch/.cassandra/cqlshrc (or use [certfiles] section) or set SSL_CERTFILE environment variable
Where can I get access to the Compose SSL certificate so that I can connect with:
$ SSL_CERTFILE=/path/to/scylla_certfile cqlsh --ssl portal-xxxx-0.csnow-scylla-45.ibm-343.composedb.com 19228 -u scylla -p XXXX --cqlversion=3.3.1
I have seen the option SSL_VALIDATE=false in the documentation however, I don't want to disable SSL validation.
The information is further down in the documentation in the section https://help.compose.com/docs/scylla-and-certificates.
My confusion was because I was drawn to the information on ssl (#2) because of the issue I had encountered and as such I jumped over the section on full configuration for cqlsh (#1):
Cqlsh Command Line
The Cqlsh Command Line panel contains three cqlsh commands, each of which connect to the three Compose portals. Full details on obtaining cqlsh and configuring it are available in Scylla and cqlsh. (#1)
The displayed command include required flags (--ssl and --cqlversion). If the command is preceded by setting the environment variable SSL_VALIDATE=false, then no further configuration is needed. (#2)
I think this section would be a bit clearer if it was re-ordered:
Cqlsh Command Line
The Cqlsh Command Line panel contains three cqlsh commands, each of which connect to the three Compose portals.
The displayed command include required flags (--ssl and --cqlversion). If the command is preceded by setting the environment variable SSL_VALIDATE=false, then no further configuration is needed.
Full details on obtaining cqlsh and configuring it are available in Scylla and cqlsh. This section includes information on configuring cqlsh to use ssl.

Resources