MWAA Airflow 2.0 in AWS: Snowflake connection not showing

Snowflake is not showing in the connection type dropdown.
I am using MWAA 2.0 and the providers are already in the requirements.txt.
MWAA uses Python 3.7; I don't know if that could be the cause.
Requirements.txt:
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.0.2/constraints-3.7.txt"
asn1crypto
azure-common
azure-core
azure-storage-blob
boto3
botocore
certifi
cffi
chardet
cryptography
greenlet
idna
isodate
jmespath
msrest
numpy
oauthlib
oscrypto
pandas
pyarrow
pycparser
pycryptodomex
PyJWT
pyOpenSSL
python-dateutil
pytz
requests
requests-oauthlib
s3transfer
six
urllib3
apache-airflow-providers-http
apache-airflow-providers-snowflake
#apache-airflow-providers-snowflake[slack]
#apache-airflow-providers-slack
snowflake-connector-python >=2.4.1
snowflake-sqlalchemy >=1.1.0

If anyone runs into this problem: instead of choosing Snowflake in the dropdown, you can choose Amazon Web Services as the connection type and it will work fine.

It took me a while to finally figure this one out after trying many different parameter combinations.
My full Snowflake URL is:
https://xx12345.us-east-2.aws.snowflakecomputing.com
The correct format for the Host field is:
xx12345.us-east-2.snowflakecomputing.com
For the Extra field, this is what worked for me:
{
    "account": "xx12345.us-east-2.aws",
    "warehouse": "my_warehouse_name",
    "database": "my_database_name"
}
Make sure you put Amazon Web Services for the Conn Type, like @AXI said.
Also, I have these modules defined in my requirements.txt file:
apache-airflow-providers-snowflake==1.3.0
snowflake-connector-python==2.4.5
snowflake-sqlalchemy==1.2.4
My Airflow version is 2.0.2.
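For reference, here is a minimal sketch (not from the original answer) of using such a connection from a DAG once it is saved; the connection id snowflake_default, the DAG id, and the query are placeholder assumptions:

from datetime import datetime

from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="snowflake_smoke_test",          # placeholder DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # Runs a trivial query to confirm the account/warehouse settings resolve.
    check_connection = SnowflakeOperator(
        task_id="check_connection",
        snowflake_conn_id="snowflake_default",  # placeholder connection id
        sql="SELECT CURRENT_ACCOUNT(), CURRENT_REGION();",
    )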

According to the MWAA docs, it should be enough to add apache-airflow-providers-snowflake==1.3.0 to the requirements file. When I added it to an existing MWAA env, where I had already tried many different combinations of packages, it helped only partially: it was possible to create a connection using the CLI, but not through the UI.
But when I created a new, clean MWAA env with the requirements file as stated in the AWS doc mentioned above, it worked well. The connection was available in the UI.
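For completeness, a rough sketch of creating the same connection from the CLI (the connection id and all values are placeholders; on MWAA the command has to go through the MWAA CLI endpoint rather than a local shell):

airflow connections add 'snowflake_default' \
    --conn-type 'snowflake' \
    --conn-host 'xx12345.us-east-2.snowflakecomputing.com' \
    --conn-login 'my_user' \
    --conn-password 'my_password' \
    --conn-extra '{"account": "xx12345.us-east-2.aws", "warehouse": "my_warehouse_name", "database": "my_database_name"}'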

Related

Airflow 2.0 support for DataprocClusterCreateOperator

In our project we are using DataprocClusterCreateOperator, which was under contrib (from airflow.contrib.operators import dataproc_operator). It works fine with Airflow version 1.10.14.
We are in the process of upgrading to Airflow 2.1.2. While testing our DAGs that spin up a Dataproc cluster, we get the error: airflow.exceptions.AirflowException: Invalid arguments were passed to DataprocClusterCreateOperator (task_id: <task_id>). Invalid arguments were: **kwargs: {'config_bucket': None, 'autoscale_policy': None}
I am not able to find any documentation for this operator in Airflow 2 that would let me identify the new params or the changes that happened.
Please share the relevant link.
We are using google-cloud-composer version 1.17.2 with Airflow version 2.1.2.
Since Airflow 2.0, third-party provider operators/hooks (like Google's in this case) have been moved out of the Airflow core into separate provider packages. You can read more here.
Since you are using Cloud Composer, the Google providers package is already installed.
Regarding the DataprocClusterCreateOperator, it has been renamed to DataprocCreateClusterOperator and moved to airflow.providers.google.cloud.operators.dataproc so you can import it with:
from airflow.providers.google.cloud.operators.dataproc import DataprocCreateClusterOperator
The accepted parameters differ from the ones included in Airflow 1.x. You can find an example of usage here.
The supported parameters for the DataprocCreateClusterOperator in Airflow 2 can be found here, in the source code. The cluster configuration parameters that can be passed to the operator can be found here.
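To make the parameter change concrete, here is a minimal sketch of the renamed operator with Airflow 2 style arguments; the project, region, cluster name, and machine settings are placeholders, and the exact cluster_config keys should be checked against the linked source:

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocCreateClusterOperator

# Placeholder cluster configuration; adjust to your own project requirements.
CLUSTER_CONFIG = {
    "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-4"},
    "worker_config": {"num_instances": 2, "machine_type_uri": "n1-standard-4"},
}

with DAG(
    dag_id="dataproc_create_cluster_example",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    create_cluster = DataprocCreateClusterOperator(
        task_id="create_cluster",
        project_id="my-gcp-project",        # placeholder
        region="us-central1",               # placeholder
        cluster_name="my-dataproc-cluster", # placeholder
        cluster_config=CLUSTER_CONFIG,
    )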
The DataprocClusterCreateOperator has been renamed to DataprocCreateClusterOperator since January 13, 2020, as per this GitHub commit, and has been moved from the airflow.contrib.operators import path to airflow.providers.google.cloud.operators.dataproc.
As given in @itroulli's answer, an example implementation of the operator can be found here.

Airflow connection password decryption

I want to decrypt the passwords for Airflow connections (getting the value from the connection table). Is there any way I can decrypt the password value?
You can do:
from airflow.hooks.base_hook import BaseHook
connection = BaseHook.get_connection("conn_name")
conn_password = connection.password
conn_login = connection.login
Export your connections
airflow connections export connections.json
Install ejson to encrypt your file
brew tap shopify/shopify && brew install ejson or download the .deb package from Github Releases.
Add the public key at the top of your file; generate a key pair with:
ejson keygen -w
Encrypt your connections
ejson encrypt connections.json
Version the file in Git, then decrypt the connections and import them into the DB within your CI/CD pipeline.
Credits to Marc Lamberti from Astronomer.
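A rough sketch of what the decrypt-and-import step could look like in CI (assuming the ejson private key is available to the pipeline, the file is named connections.json, and an Airflow version recent enough to support airflow connections import):

# Decrypt the versioned file and load the connections into the metadata DB.
ejson decrypt connections.json > connections_decrypted.json
airflow connections import connections_decrypted.json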
I recently encountered a similar issue. As of Airflow 2.3.2 you can export connections in JSON or YAML format, which gives you all the key/value fields that an Airflow connection is represented by.
Command to run:
airflow connections export connections.yml --file-format yaml
See the Airflow documentation for more details:
https://airflow.apache.org/docs/apache-airflow/2.0.2/howto/connection.html#exporting-connections-from-the-cli

Airflow 'GoogleCloudStorageDownloadOperator' is not defined

Importing the operator in the following way:
from airflow.contrib.operators.gcs_download_operator import GoogleCloudStorageDownloadOperator
Then trying to use it in a DAG:
download_file = GoogleCloudStorageDownloadOperator(
    bucket='us-central1-scale-training-d7d12089-bucket',
    google_cloud_storage_conn_id='google_cloud_default',
    object='params.json',
    filename='params.json')
Receiving this error:
'GoogleCloudStorageDownloadOperator' is not defined
Edit: I am using Google Cloud Composer, so I assume the relevant dependencies are installed.
If you haven't already, you also need to add the GCP dependency to Airflow:
pip install apache-airflow[gcp_api]
There's more information about installation in the docs: https://airflow.apache.org/installation.html
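For context, a minimal sketch of how this operator is typically wired into an Airflow 1.x DAG once the GCP extra is installed; the DAG id and task_id are placeholders, while the bucket and object names are taken from the question:

from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.gcs_download_operator import GoogleCloudStorageDownloadOperator

with DAG(
    dag_id="gcs_download_example",          # placeholder DAG id
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,
) as dag:
    download_file = GoogleCloudStorageDownloadOperator(
        task_id="download_file",            # placeholder task id
        bucket="us-central1-scale-training-d7d12089-bucket",
        object="params.json",
        filename="params.json",
        google_cloud_storage_conn_id="google_cloud_default",
    )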

configuring Airflow to work with CeleryExecutor

I'm trying to configure Airbnb Airflow to use the CeleryExecutor like this:
I changed the executor in airflow.cfg from SequentialExecutor to CeleryExecutor:
# The executor class that airflow should use. Choices include
# SequentialExecutor, LocalExecutor, CeleryExecutor
executor = CeleryExecutor
But I get the following error:
airflow.configuration.AirflowConfigException: error: cannot use sqlite with the CeleryExecutor
Note that the sql_alchemy_conn is configured like this:
sql_alchemy_conn = sqlite:////root/airflow/airflow.db
I looked at Airflow's Git repository (https://github.com/airbnb/airflow/blob/master/airflow/configuration.py)
and found that the following code throws this exception:
def _validate(self):
    if (
            self.get("core", "executor") != 'SequentialExecutor' and
            "sqlite" in self.get('core', 'sql_alchemy_conn')):
        raise AirflowConfigException("error: cannot use sqlite with the {}".
                                     format(self.get('core', 'executor')))
It seems from this _validate method that sql_alchemy_conn cannot contain sqlite.
Do you have any idea how to configure the CeleryExecutor without SQLite? Please note that I installed RabbitMQ for working with the CeleryExecutor, as required.
As Airflow states, the CeleryExecutor requires a backend other than the default SQLite database. You have to use MySQL or PostgreSQL, for example.
The sql_alchemy_conn in airflow.cfg must be changed to follow the SQLAlchemy connection string structure (see the SQLAlchemy documentation).
For example,
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@127.0.0.1:5432/airflow
To configure Airflow for MySQL:
First install MySQL (this might help, or just google it).
Go to the Airflow installation directory, usually /home/<user>/airflow.
Edit airflow.cfg and locate
sql_alchemy_conn = sqlite:////home/vipul/airflow/airflow.db
(if you have the default SQLite) and add # in front of it so it looks like
#sql_alchemy_conn = sqlite:////home/vipul/airflow/airflow.db
Add this line below it:
sql_alchemy_conn = mysql://<user>:<password>@localhost:3306/<database>
Save the file, run the command
airflow initdb
and done!
As other answers have stated, you need to use a different database besides SQLite. Additionally, you need to install RabbitMQ, configure it appropriately, and change each of your airflow.cfg files to have the correct RabbitMQ information. For an excellent tutorial on this, see A Guide On How To Build An Airflow Server/Cluster.
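As an illustration only (not taken from the linked guide), the relevant airflow.cfg entries usually end up looking something like the following; hosts, credentials, and database names are placeholders, and option names can differ between Airflow versions (for example celery_result_backend in older 1.x releases versus result_backend later):

[core]
executor = CeleryExecutor
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@127.0.0.1:5432/airflow

[celery]
# Placeholder RabbitMQ broker and PostgreSQL result backend.
broker_url = amqp://guest:guest@localhost:5672//
result_backend = db+postgresql://airflow:airflow@127.0.0.1:5432/airflow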
If you run it on a Kubernetes cluster, use the following config:
airflow:
  config:
    AIRFLOW__CORE__SQL_ALCHEMY_CONN: postgresql+psycopg2://postgres:airflow@airflow-postgresql:5432/airflow

Symfony ICU Issue, routes using locale different than EN will fail

After installing Yosemite and a new version of MAMP, when I try to execute
domain/app_dev.php/es/venues/3/show
with 'es' as the locale I get errors. If I change it to 'en' there's no problem.
This route renders a form containing a language type field, so it requires ICU.
The errors are:
[1/2] ResourceBundleNotFoundException: The resource bundle
"/Users/a77/Documents/DEV/UVox
Com/vendor/symfony/icu/Symfony/Component/Icu/Resources/data/lang/root.php"
does not exist.
[2/2] Couldn't read the indices [Languages] from
"/Users/a77/Documents/DEV/UVox
Com/vendor/symfony/icu/Symfony/Component/Icu/Resources/data/lang/es.res".
The indices also couldn't be found in the fallback locale(s)
"root.res".
My Symfony version is 2.5, and I'm running MAMP with PHP 5.5.10.
I updated dependencies via Composer, including "symfony/intl": "*".
I have followed several guides to install icu and intl via pecl, but I still get the error. I don't know how to check whether the installations or the configs are OK. Maybe you can let me know how to test both via the terminal and I'll report back the result.
This is because you are trying to get resources for the language code es only. Since the ICU data was imported into Symfony, you need to request language resources via the combined language and country code, e.g. es_ES.
You may not be able to simply activate intl.so after the Yosemite update. I solved the issue by installing intl.so following an excellent article by Danilo Braband: http://dab.io/posts/getting-started-with-symfony-on-yosemite.html
Solved by upgrading to Symfony 2.5.6.
