Unable to find Airflow.cfg file - airflow

I have set up the Airflow environment on AWS EKS using the official doc
https://airflow.apache.org/docs/helm-chart/stable/index.html
The pods are up and running.
From the pod description I can see that there is supposed to be a mount for airflow.cfg.
But when I try to check the path and mount on the node, I am unable to find the airflow.cfg file at /opt/airflow.
I am currently using version 2.3.0, but the same was seen with the 2.4.1 and 2.5.1 versions.
How is Airflow running without airflow.cfg?
Let me know where I can find this file.
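A quick way to check is from inside one of the Airflow pods rather than on the node, since the chart mounts the rendered config into the containers. A minimal sketch, assuming the release is in the airflow namespace (replace <scheduler-pod> with an actual pod name from the first command):
kubectl get pods -n airflow
kubectl exec -n airflow -it <scheduler-pod> -- ls -l /opt/airflow/airflow.cfg
kubectl exec -n airflow -it <scheduler-pod> -- cat /opt/airflow/airflow.cfg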

Related

How to upgrade airflow?

There seems to be no proper documentation about upgrading airflow. The Upgrading Airflow to a newer version page only talks about upgrading the database. So what is the proper way of upgrading airflow?
Is it just upgrading the python packages to the newest versions? Or should I use the same venv and install the newer airflow version completely from scratch? Or is it something else altogether?
I'm guessing doing the database upgrade would be the final step followed by one of these steps.
I was also struggling with upgrading airflow for minor versions and didn't feel like I found a good answer in the docs. I think I have the right approach after looking back at how I installed airflow in the first place.
If you followed the guide to run Airflow locally, you'll want to change the value of AIRFLOW_VERSION in the commands to your desired version.
If you followed the guide to run Airflow on Docker, you'll want to fetch the latest docker-compose.yaml; the command on the site always points at the latest version. Then re-run docker compose up.
You can confirm you have the right version by running airflow version. I run Airflow via Docker, so the Docker steps work for me; I imagine the local steps should be about the same.
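For the local install, the quick-start commands look roughly like the following; re-running them with a newer AIRFLOW_VERSION against the matching constraints file is the upgrade (2.5.1 here is only an example target):
AIRFLOW_VERSION=2.5.1
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
# install the target version pinned against its published constraints file
pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
# then migrate the metadata database
airflow db upgrade
For the Docker route, the rough equivalent is to fetch the docker-compose.yaml for the target version and restart (adjust the version in the URL to the one you want):
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.5.1/docker-compose.yaml'
docker compose up -d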
Adding to Vivian's answer -
I had installed airflow from PyPi and was upgrading from 2.2.4 to 2.3.0.
To upgrade airflow,
I installed the new version of airflow in the same virtual environment as 2.2.4 (using this).
Upgraded the database using airflow db upgrade. More details here.
You might have to manually upgrade providers using pip install <package-name> --upgrade.
After this, when I started Airflow, I got an error about some missing config: Airflow wanted the newest version of airflow.cfg, but I had the older one. To fix this, I:
Renamed my airflow.cfg to airflowbackup.cfg. This is done so that Airflow will create a new airflow.cfg on startup when it sees that there is no config file.
Compared airflowbackup.cfg with a default 2.2.4 config to find all the fields I had changed.
Manually made those same changes in the newly created airflow.cfg (see the sketch below).
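Put together, the sequence looks roughly like this; a sketch assuming a virtualenv install, 2.3.0 as the target, Python 3.8, and the default AIRFLOW_HOME (the Postgres provider is only an example of one that may need a manual bump):
# inside the same virtual environment as the old install
pip install "apache-airflow==2.3.0" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.3.0/constraints-3.8.txt"
airflow db upgrade
pip install --upgrade apache-airflow-providers-postgres
# move the old config aside so the next airflow command regenerates a fresh one
mv ~/airflow/airflow.cfg ~/airflow/airflowbackup.cfg
airflow version
# compare and re-apply your customisations by hand
diff ~/airflow/airflowbackup.cfg ~/airflow/airflow.cfg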

helm chart - airflow version upgrade

I've installed Airflow on my k8s cluster using the Helm chart apache-airflow/airflow.
The currently installed Airflow version is 2.2.4; which of the options below should be followed to upgrade to Airflow 2.3.0?
The official Helm chart version is 1.5.0 and the default Airflow version is 2.2.4. When the newer version of the Helm chart is released, the default Airflow version will be set to 2.3.0. Will helm repo update and helm upgrade provision the Airflow upgrade, or is there some other similar process or an official upgrade guide?
If the upgrade to the new default Airflow version has to be manual, what process/steps should be followed? N.B. changing the defaultAirflowTag value from 2.2.3 -> 2.3.0 in values.yaml is not an option, as it causes an exception.
Thanks in advance.
Hi there. I checked my Airflow environment on my AWS EC2 instance.
It didn't work properly, so I tried lots of approaches, all of which failed.
But finally I found a solution.
Before running the Helm commands below, you should delete your airflow namespace first.
There were some errors in the airflow namespace, and I found that everything worked properly after deleting it.
1) kubectl delete namespace airflow
2) helm repo remove apache-airflow
   helm repo add apache-airflow https://airflow.apache.org
3) Reinstall the chart (see the sketch after this list) and you should see your brand-new Airflow interface.
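For completeness, the reinstall after that cleanup could look like this (the release name, namespace, and values file are assumptions; adjust to your setup):
helm repo update
helm install airflow apache-airflow/airflow --namespace airflow --create-namespace -f values.yaml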
How I solved it:
helm upgrade --install actually works, but I saw a message in the Airflow web UI saying that some of the data migrations failed because of schema changes in Airflow 2.3.0. After I dropped the mentioned table in PostgreSQL, Airflow worked fine.
Changing values.yaml also works:
images:
  airflow:
    repository: apache/airflow
    tag: latest
    pullPolicy: IfNotPresent
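For reference, applying that values change to the release could look roughly like this (the release and namespace names are assumptions):
helm repo update
helm upgrade --install airflow apache-airflow/airflow --namespace airflow -f values.yaml
Pinning an explicit version in tag instead of latest also makes it clearer which Airflow version the upgrade will roll out.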

GCP Composer - ModuleNotFoundError: No module named 'airflow.providers.sftp'

I am trying to get data from an FTP server's txt file using GCP Composer tasks.
So I imported the SFTPOperator package in my code,
but this error occurred:
ModuleNotFoundError: No module named 'airflow.providers.sftp'
Then I tried a few things:
Following "Getting exception "No module named 'airflow.providers.sftp'""
Installing apache-airflow-providers-sftp via the Composer PyPI packages page
but it didn't work. 😭
My GCP Composer Environment is as below:
Image Version : composer-1.17.7-airflow-2.1.4
python version : 3
Network VPC-native : Enable
How can I use SFTPOperator?
For this you will have to install the sftp package: pip install 'apache-airflow[sftp]'. You can check the built-in and extras packages that Airflow components ship with when installed (this varies by version).
Once you have it installed, you should be able to use SFTPOperator by importing the operator inside your DAG.
from airflow import DAG
from airflow.providers.sftp.operators.sftp import SFTPOperation, SFTPOperator

with DAG(...) as dag:
    # SFTPOperation.GET copies remote_filepath down to local_filepath over the SSH connection
    upload_op = SFTPOperator(
        task_id="test_sftp",
        ssh_conn_id="ssh_default",
        local_filepath="/tmp/file.txt",
        remote_filepath="/tmp/tmp1/tmp2/file.txt",
        operation=SFTPOperation.GET,
        dag=dag,
    )
    ...
You can also find mock tests in the Airflow GitHub project that can give you some guidance; check this link.
UPDATE 17/08/2022: As commented by Diana, Composer has a documented way to install its components, as mentioned in this link. Be advised to pick the Composer version your project uses, as there are separate guides for version 1 and version 2.
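Since this question is about Composer rather than a self-managed install, the provider can also be added to the environment itself with the gcloud CLI; a sketch with a placeholder environment name and location (a ==X.Y.Z version specifier can be appended to the package name if you need to pin one):
gcloud composer environments update my-composer-env --location us-central1 --update-pypi-package apache-airflow-providers-sftp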

unable to find airflow.cfg file in windows 10

I have installed Apache Airflow using Docker Desktop for Windows and am able to create DAGs and run the webserver without issue. My question is: I am unable to find the airflow.cfg file on my local machine.
I am new to Airflow and Docker; please help me find it.
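With the Docker setup, the config lives inside the containers (by default at /opt/airflow/airflow.cfg) rather than on the Windows filesystem, unless you mount it out yourself. A small sketch for viewing or copying it, assuming the docker-compose.yaml from the official guide and its default airflow-webserver service name:
docker compose exec airflow-webserver cat /opt/airflow/airflow.cfg
# or copy it out to the host
docker compose cp airflow-webserver:/opt/airflow/airflow.cfg ./airflow.cfg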

Unable to see snowflake conn_type in Airflow

I can see the following in my pip list but when I try to add a Snowflake connection via the GUI, Snowflake is not an option from the dropdown.
apache-airflow-providers-snowflake 2.1.0
snowflake-connector-python 2.5.1
snowflake-sqlalchemy 1.2.3
Am I missing something?
I have had this issue with MWAA recently.
I find that if I select AWS in the dropdown and provide the correct Snowflake host name etc., it works, though.
I ran into the same issue using the official Helm chart 1.3.0.
But finally I was able to make the Snowflake connection type visible by doing the following:
I uninstalled apache-airflow-providers-google. I'm not sure whether this is important, but I'd like to mention it here; I did it because I got some warnings.
Because some breaking changes were introduced with SQLAlchemy 1.4, I made sure that version 1.3.24 gets installed. Based on that, I chose fitting versions for the Snowflake packages.
So this is my requirements.txt for my custom Airflow container:
apache-airflow-providers-snowflake==2.3.0
pyarrow==5.0.0
snowflake-connector-python==2.5.1
snowflake-sqlalchemy==1.2.5
SQLAlchemy==1.3.24
This is my Dockerfile:
FROM apache/airflow:2.2.1-python3.8
## adding missing python packages
USER airflow
COPY requirements.txt .
RUN pip uninstall apache-airflow-providers-google -y \
&& pip install -r requirements.txt
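To use the resulting image with the official chart, one possible flow is to build and push it and point the chart's image values at it; a sketch with placeholder registry, release, and namespace names:
docker build -t registry.example.com/airflow-custom:2.2.1 .
docker push registry.example.com/airflow-custom:2.2.1
helm upgrade --install airflow apache-airflow/airflow --namespace airflow \
  --set images.airflow.repository=registry.example.com/airflow-custom \
  --set images.airflow.tag=2.2.1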
I had the same issue where my pip freeze showed apache-airflow-providers-snowflake, yet I did not have the provider in the UI. I had to add the line apache-airflow-providers-snowflake to my requirements.txt file and then restart. Then I was able to see the Snowflake provider and connector in the UI.
