MS SQL Hook and Operators not importing into Airflow

I'm trying to import the MsSql hook and operator into my DAG, but I keep getting an error from Airflow.
I'm currently importing with the newest syntax:
from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook
from airflow.providers.microsoft.mssql.operators.mssql import MsSqlOperator
and I'm getting this import error:
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/opt/airflow/dags/hevo_dag.py", line 5, in <module>
from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook
ModuleNotFoundError: No module named 'airflow.providers.microsoft.mssql'

To import the hook and operator you need to install the right package:
For Airflow >= 2.0:
you need the MSSQL provider package:
pip install apache-airflow-providers-microsoft-mssql
For Airflow < 2.0:
you need the MSSQL backport provider package:
pip install apache-airflow-backport-providers-microsoft-mssql
In your code the import path is the same regardless of the package. Airflow backported the providers to ease the migration from Airflow 1 to Airflow 2, so you will not need to change the import paths when you upgrade.
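Once the provider package is installed, the imports from the question work unchanged. A minimal sketch of using the operator (the DAG id, connection id and query are placeholder assumptions):
from airflow import DAG
from airflow.providers.microsoft.mssql.operators.mssql import MsSqlOperator
from airflow.utils.dates import days_ago

with DAG(dag_id="mssql_example", start_date=days_ago(1), schedule_interval=None) as dag:
    run_query = MsSqlOperator(
        task_id="run_query",
        mssql_conn_id="mssql_default",  # placeholder connection id
        sql="SELECT 1;",                # placeholder query
    )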

Related

error while importing DAG file in Airflow 2.5.0

When I start my Airflow scheduler and webserver, my bigdata.py file is not getting imported; below is the error I am getting.
Broken DAG: [/home/adminn/airflow/dags/bigdata.py] Traceback (most recent call last):
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'airflow.providers.apache'
This is the DAG I have written; am I missing something?
Here I am trying to pull a MySQL table using Sqoop and load it into HDFS, and to schedule this operation using Airflow.
from airflow.models import DAG
from airflow.contrib.operators.sqoop_operator import SqoopOperator
from airflow.utils.dates import days_ago
Dag_Sqoop_Import = DAG(dag_id="SqoopImport",
                       schedule_interval="* * * * *",
                       start_date=days_ago(2))
sqoop_mysql_import = SqoopOperator(conn_id="sqoop_local",
                                   table="shipmethod",
                                   cmd_type="import",
                                   target_dir="/airflow_sqoopImport",
                                   num_mappers=1,
                                   task_id="SQOOP_Import",
                                   dag=Dag_Sqoop_Import)
sqoop_mysql_import
Replies appreciated, thanks.
From Airflow 2.x the providers are no longer included by default; you have to install them separately, and the import paths have changed.
In your case you have to:
install the Sqoop provider by running: pip install 'apache-airflow-providers-apache-sqoop'
change the import for the operator to: from airflow.providers.apache.sqoop.operators.sqoop import SqoopOperator
Here is the full list of the providers available:
https://airflow.apache.org/docs/apache-airflow-providers/packages-ref.html
And this is the provider you are looking for:
https://airflow.apache.org/docs/apache-airflow-providers/packages-ref.html#apache-airflow-providers-apache-sqoop
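Put together, only the import of the operator at the top of the DAG file needs to change; the rest of the DAG can stay exactly as written in the question:
from airflow.models import DAG
from airflow.providers.apache.sqoop.operators.sqoop import SqoopOperator  # provider import instead of airflow.contrib
from airflow.utils.dates import days_ago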

Broken DAG : ModuleNotFoundError: No module named 'airflow.providers.snowflake'

Hi, I have set up my environment to run Airflow. I want to run a DAG which connects to Snowflake. I have installed the necessary packages below from Cloud Shell.
pip3 install snowflake-connector-python==2.4.5
pip3 install snowflake-sqlalchemy==1.2.4
pip3 install apache-airflow-providers-snowflake==2.3.0
pip3 install apache-airflow-providers-common-sql
I have established the Snowflake connection in Airflow.
Now, while executing the DAG, I have been getting this error for a long time:
Broken DAG: [/home/airflow/gcs/dags/snowflake_connect_mine.py] Traceback (most recent call last):
File "/home/airflow/gcs/dags/snowflake_connect_mine.py", line 6, in <module>
from airflow.contrib.hooks.snowflake_hook import SnowflakeHook
File "/opt/python3.8/lib/python3.8/site-packages/airflow/contrib/hooks/snowflake_hook.py", line 23, in <module>
from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook # noqa
ModuleNotFoundError: No module named 'airflow.providers.snowflake'
Please help me resolve this issue.
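For reference, the traceback above already shows the provider-style import that the deprecated airflow.contrib shim forwards to; once apache-airflow-providers-snowflake is actually installed in the environment that parses the DAGs, the hook can be imported from the provider directly. A minimal sketch, with a placeholder connection id:
from airflow.providers.snowflake.hooks.snowflake import SnowflakeHook

hook = SnowflakeHook(snowflake_conn_id="snowflake_default")  # placeholder connection id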

MWAA Apache Airflow DAG error importing EcsOperator

I am trying to deploy an Airfow DAG to MWAA.
My requirements.txt:
apache-airflow[amazon] == 3.2.0
I import EcsOperator like this:
from airflow.contrib.operators.ecs_operator import EcsOperator
However, I get this error:
Broken DAG: [/usr/local/airflow/dags/mydag.py] Traceback (most recent call last):
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/usr/local/airflow/dags/mydag.py", line 4, in <module>
from airflow.contrib.operators.ecs_operator import EcsOperator
ImportError: cannot import name 'EcsOperator' from 'airflow.contrib.operators.ecs_operator' (/usr/local/lib/python3.7/site-packages/airflow/contrib/operators/ecs_operator.py)
What am I doing wrong here?
You might be referencing a different version (1.10.12?) of the Airflow documentation:
airflow.contrib.operators.ecs_operator (1.10.12)
The documentation for provider release 3.2.0 shows the current import path. You can import the EcsOperator like this:
from airflow.providers.amazon.aws.operators.ecs import EcsOperator
airflow.providers.amazon.aws.operators.ecs (3.2.0)
The correct requirements.txt:
(empty; MWAA already bundles a version of the Amazon provider)
And the correct import:
from airflow.providers.amazon.aws.operators.ecs import ECSOperator
Note the casing!
There are several issues here, so I'll compile a detailed answer, since previous answers didn't cover all of them.
First, the updated import path (provider release 3.2.0) is:
from airflow.providers.amazon.aws.operators.ecs import EcsOperator
The reason this doesn't work for you is that you installed the provider via extras:
apache-airflow[amazon]
As explained in the provider extras docs, when installing a provider in that manner you get the provider version that was released at the time of the Airflow version you are using, so you are not guaranteed to get the latest provider version. For example, if you are using Airflow 2.2.4 (latest at the time of writing this answer) you will get Amazon provider version 3.0.0, which is not the most recent one.
To get updated provider you should install it as:
pip install apache-airflow-providers-amazon
if you would like to pin a specific version:
pip install apache-airflow-providers-amazon==3.2.0
Please note that you should always install from constraint files provided by Airflow. Example:
pip install "apache-airflow-providers-amazon" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
Note that the example refers to constraints-main, which is constantly updated, and not to constraints-2.2.4 or any other specific Airflow version.
You can read more about it in the doc about Installation and upgrading of Airflow providers separately.
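With the provider installed that way (or pinned in requirements.txt on MWAA), a minimal task using the updated import might look roughly like this; the cluster name, task definition and overrides are placeholder assumptions:
from airflow import DAG
from airflow.providers.amazon.aws.operators.ecs import EcsOperator
from airflow.utils.dates import days_ago

with DAG(dag_id="ecs_example", start_date=days_ago(1), schedule_interval=None) as dag:
    run_task = EcsOperator(
        task_id="run_ecs_task",
        cluster="my-cluster",                   # placeholder cluster name
        task_definition="my-task-def",          # placeholder task definition
        overrides={"containerOverrides": []},   # no per-container overrides in this sketch
        aws_conn_id="aws_default",
    )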

airflow initdb: cannot import name 'Pendulum' from 'pendulum'

I installed Airflow within one of my Anaconda envs, named engdados. When I execute the command airflow initdb I'm getting the following error: cannot import name 'Pendulum' from 'pendulum'. The full traceback is shown below:
(engdados) guilherme#Athena-LNX:~$ airflow initdb
Traceback (most recent call last):
File "/home/guilherme/anaconda3/envs/engdados/bin/airflow", line 25, in <module>
from airflow.configuration import conf
File "/home/guilherme/anaconda3/envs/engdados/lib/python3.8/site-packages/airflow/__init__.py", line 47, in <module>
settings.initialize()
File "/home/guilherme/anaconda3/envs/engdados/lib/python3.8/site-packages/airflow/settings.py", line 403, in initialize
configure_adapters()
File "/home/guilherme/anaconda3/envs/engdados/lib/python3.8/site-packages/airflow/settings.py", line 319, in configure_adapters
from pendulum import Pendulum
ImportError: cannot import name 'Pendulum' from 'pendulum' (/home/guilherme/anaconda3/envs/engdados/lib/python3.8/site-packages/pendulum/__init__.py)
(engdados) guilherme#Athena-LNX:~$ service start mysql$
start: unrecognized service
(engdados) guilherme#Athena-LNX:~$ service mysql start$
Usage: /etc/init.d/mysql start|stop|restart|reload|force-reload|status|bootstrap
(engdados) guilherme#Athena-LNX:~$ airflow initdb
Traceback (most recent call last):
File "/home/guilherme/anaconda3/envs/engdados/bin/airflow", line 25, in <module>
from airflow.configuration import conf
File "/home/guilherme/anaconda3/envs/engdados/lib/python3.8/site-packages/airflow/__init__.py", line 47, in <module>
settings.initialize()
File "/home/guilherme/anaconda3/envs/engdados/lib/python3.8/site-packages/airflow/settings.py", line 403, in initialize
configure_adapters()
File "/home/guilherme/anaconda3/envs/engdados/lib/python3.8/site-packages/airflow/settings.py", line 319, in configure_adapters
from pendulum import Pendulum
ImportError: cannot import name 'Pendulum' from 'pendulum' (/home/guilherme/anaconda3/envs/engdados/lib/python3.8/site-packages/pendulum/__init__.py)
The problem is: the pendulum is installed! When I execute the conda list command I can see the Pendulum there as follows:
Name Version Build Channel
pendulum 2.1.2 pypi_0 pypi
What I've checked so far:
Is the engdados environment activated? Yes
Is the Pendulum installed on Anaconda environment? Yes
The version of Pendulum that Anaconda shows is different from the one shown by conda list (1.4.4). Why?
I have no idea what is going on. Thanks in advance.
In pendulum version 2, the class pendulum.Pendulum is replaced with pendulum.DateTime.
Your version of airflow is expecting pendulum 1.x but your environment has 2.x.
You may be able to fix this by making a new env and installing airflow 2.0 (which uses pendulum 2.x). If you must use airflow < 2.0, you will need to pin pendulum to < 2.0 (e.g. using pip constraints).
Also, if you use Pendulum in your code, for example in custom operators, you can add:
try:
    from pendulum import DateTime as Pendulum
except ImportError:
    from pendulum import Pendulum
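With that fallback in place, the rest of the code can keep using the Pendulum name against either major version, e.g.:
dt = Pendulum(2021, 1, 1)  # constructs a datetime under both pendulum 1.x and 2.x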

Apache Airflow initdb command fails, due to syntax error

I have created virtualenv for python3 using:
virtualenv -p $(which python3) ENV
Then activate the environment:
source /Users/myusername/ENV/bin/activate
Install apache-airflow:
pip install apache-airflow
then which airflow yields /Users/myusername/ENV/bin/airflow
But when I try to initdb using:
airflow initdb
I get the error below:
{db.py:350} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
WARNI [airflow.utils.log.logging_mixin.LoggingMixin] cryptography not found - values will not be stored encrypted.
ERROR [airflow.models.DagBag] Failed to import: /Library/Python/2.7/site-packages/airflow/example_dags/example_http_operator.py
Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/airflow/models/__init__.py", line 413, in process_file
m = imp.load_source(mod_name, filepath)
File "/Library/Python/2.7/site-packages/airflow/example_dags/example_http_operator.py", line 27, in <module>
from airflow.operators.http_operator import SimpleHttpOperator
File "/Library/Python/2.7/site-packages/airflow/operators/http_operator.py", line 21, in <module>
from airflow.hooks.http_hook import HttpHook
File "/Library/Python/2.7/site-packages/airflow/hooks/http_hook.py", line 23, in <module>
import tenacity
File "/Library/Python/2.7/site-packages/tenacity/__init__.py", line 375, in <module>
from tenacity.tornadoweb import TornadoRetrying
File "/Library/Python/2.7/site-packages/tenacity/tornadoweb.py", line 24, in <module>
from tornado import gen
File "/Library/Python/2.7/site-packages/tornado-6.0.3-py2.7-macosx-10.14-intel.egg/tornado/gen.py", line 126
def _value_from_stopiteration(e: Union[StopIteration, "Return"]) -> Any:
^
SyntaxError: invalid syntax
Done.
(ENV) ---------------------------------------------------------
It seems the example scripts are being loaded with Python 2.7, which can't recognize the Python 3 function definition syntax.
Does the apache-airflow package need to be fixed in the next release, or can I do something to fix this?
I tried fixing this:
I used Python 2.7 instead of Python 3 and installed Airflow on the default Python 2.7 enabled on the Mac, but this throws other errors, e.g. the package "six" is not compatible.
You need to turn off loading of the example DAGs in the config file to solve this problem.
Anyway, it seems weird that Airflow uses Python 2.7 when you said it is installed into a Python 3 virtual environment.
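For reference, the example DAGs are controlled by the load_examples option in the [core] section of airflow.cfg (or by the AIRFLOW__CORE__LOAD_EXAMPLES environment variable):
[core]
load_examples = False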
