There seems to be no proper documentation about upgrading Airflow. The "Upgrading Airflow to a newer version" page only talks about upgrading the database. So what is the proper way of upgrading Airflow?
Is it just upgrading the Python packages to the newest versions? Or should I use the same venv and install the newer Airflow version completely from scratch? Or is it something else altogether?
I'm guessing the database upgrade would be the final step, after doing one of these.
I was also struggling with upgrading airflow for minor versions and didn't feel like I found a good answer in the docs. I think I have the right approach after looking back at how I installed airflow in the first place.
If you followed the guide to run Airflow locally, you'll want to change the value of AIRFLOW_VERSION in those commands to your desired version.
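For example, with the pinned install commands from that guide it would look something like this (the Airflow and Python versions below are only examples; use whatever release you're targeting):

# upgrade a local pip-based install; adjust AIRFLOW_VERSION to your target release
AIRFLOW_VERSION=2.3.0
PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
CONSTRAINT_URL="https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
pip install "apache-airflow==${AIRFLOW_VERSION}" --constraint "${CONSTRAINT_URL}"
airflow db upgrade   # then upgrade the metadata database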
If you followed the guide to run Airflow on Docker, you'll want to fetch the latest docker-compose.yaml; the command on that page always references the latest version. Then re-run docker compose up.
You can confirm you have the right version by running airflow version. I run Airflow via Docker, so the Docker steps are what worked for me; I imagine the local steps are much the same.
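A rough sketch of the Docker route (the version in the URL is just an example, and the service names come from the official compose file):

curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.3.0/docker-compose.yaml'
docker compose down
docker compose up airflow-init                            # the init service applies the DB migration
docker compose up -d
docker compose exec airflow-webserver airflow version     # confirm the running version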
Adding to Vivian's answer -
I had installed Airflow from PyPI and was upgrading from 2.2.4 to 2.3.0.
To upgrade Airflow:
I installed the new version of airflow in the same virtual environment as 2.2.4 (using this).
Upgraded the database using airflow db upgrade. More details here.
You might have to manually upgrade providers using pip install packagename -U
After this, when I started Airflow, I got an error about some missing config. Airflow wanted the newest version of airflow.cfg, but I had the older one. To fix this, I:
Renamed my airflow.cfg to airflowbackup.cfg, so that Airflow would generate a fresh airflow.cfg on startup when it sees there is no config file.
Compared airflowbackup.cfg with the 2.2.4 config to find all the fields I had changed.
Manually made those same changes in the newly generated airflow.cfg.
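For reference, that shuffle looks roughly like this (assuming the default AIRFLOW_HOME of ~/airflow):

cd ~/airflow
mv airflow.cfg airflowbackup.cfg     # Airflow writes a fresh default airflow.cfg on the next start
# start Airflow once (webserver/scheduler) so it regenerates airflow.cfg, then:
diff airflowbackup.cfg airflow.cfg   # spot the fields you had customised and copy them over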
I can see the following in my pip list but when I try to add a Snowflake connection via the GUI, Snowflake is not an option from the dropdown.
apache-airflow-providers-snowflake 2.1.0
snowflake-connector-python 2.5.1
snowflake-sqlalchemy 1.2.3
Am I missing something?
I have had this issue with MWAA recently.
I found that if I select AWS in the dropdown and provide the correct Snowflake host name, etc., it works, though.
I ran into the same issue using the official Helm chart 1.3.0.
But I was finally able to make the Snowflake connection type visible with the following steps:
I uninstalled apache-airflow-providers-google. I'm not sure whether this is important, but I mention it because I got some warnings from it.
Because SQLAlchemy 1.4 introduced some breaking changes, I made sure that version 1.3.24 gets installed, and based on that I chose the matching versions for the Snowflake packages.
So this is my requirements.txt for my custom Airflow container:
apache-airflow-providers-snowflake==2.3.0
pyarrow==5.0.0
snowflake-connector-python==2.5.1
snowflake-sqlalchemy==1.2.5
SQLAlchemy==1.3.24
This is my Dockerfile:
FROM apache/airflow:2.2.1-python3.8
## adding missing python packages
USER airflow
COPY requirements.txt .
RUN pip uninstall apache-airflow-providers-google -y \
&& pip install -r requirements.txt
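If it helps, building and sanity-checking the image can look something like this (the image tag is just an example name I made up):

docker build -t my-airflow:2.2.1-snowflake .
docker run --rm my-airflow:2.2.1-snowflake bash -c "airflow providers list"   # apache-airflow-providers-snowflake should appear in the output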
I had the same issue where pip freeze showed apache-airflow-providers-snowflake, yet I did not have the provider in the UI. I had to add the line apache-airflow-providers-snowflake to my requirements.txt file and then restart. Then I was able to see the Snowflake provider and connector in the UI.
I upgraded the Docker image to use Airflow 1.10.14. Airflow is deployed with Helm, and I have an init container which executes a script to initialize Airflow. The init script contains the commands
...
airflow upgradedb
alembic upgrade heads
...
The upgrade failed, so I needed to roll back to the previously deployed release, which contains Airflow 1.10.10, but now I'm getting an Alembic error. Based on my searching, I tried deleting the row in the alembic_version table.
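For reference, the revision Alembic has recorded can be checked with a query like this (the connection string is a placeholder for your metadata DB):

psql "$AIRFLOW_METADATA_DB_URI" -c "SELECT version_num FROM alembic_version;"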
The error in the scheduler container is this:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.DuplicateColumn) column "operator" of relation "task_instance" already exists
All the other pods are running fine (webserver and workers).
Any resolution/workaround to this issue?
Unless you are OK with scrapping your entire metadata DB (connections, variables, task runs, etc.), I might opt to just push forward to 1.10.15 and see if the bug you encountered is resolved there. To my best understanding, it is not possible to downgrade the DB after the upgrade has been done.
I'm suggesting the upgrade to 1.10.15 in case the issue you hit is similar to the one this user reported here. The CLI fix can be found here. If you ran into another issue with your 1.10.14 upgrade besides the CLI one I noted, it might be worth investigating a path to resolving that instead.
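A rough sketch of that forward path (same shape as a normal 1.10.x bump):

pip install "apache-airflow==1.10.15"   # last 1.10.x "bridge" release
airflow upgradedb                       # 1.10.x command; 2.x renames this to "airflow db upgrade"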
Now that Airflow 2.0 is released, we're excited to try out some of the new features.
What's the best way of upgrading from 1.10.11 to Airflow 2.0?
Will my existing code work or will I be required to change my DAGs?
We'll start upgrading in our DEV environment for testing later this week.
We're on Airflow 1.10.11 with the LocalExecutor and Python 3.
The documentation doesn't spell out how exactly to upgrade to 1.10.14 when a newer version is already available.
According to the PIP documentation (https://pip.pypa.io/en/stable/user_guide/#installing-packages) this should work:
python -m pip install apache-airflow==1.10.14
This seemed to work for me, but I was not able to start the webserver afterwards.
First, I had to upgrade the DB:
airflow upgradedb
Second, starting the webserver surfaced the problem that the "secret_key" setting now has to contain a real secret key.
Execute
openssl rand -hex 30
and add the hex key to the airflow.cfg file.
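One way to script that, assuming the default config location (the key lives in the [webserver] section):

SECRET_KEY="$(openssl rand -hex 30)"
sed -i.bak "s|^secret_key =.*|secret_key = ${SECRET_KEY}|" "${AIRFLOW_HOME:-$HOME/airflow}/airflow.cfg"   # keeps a .bak copy of the old file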
Then follow the remaining steps (including executing the check script) from the upgrade documentation.
As it is not described either, the actual upgrade to 2.0 should work by using
pip install -U apache-airflow
Note especially the change in the DB upgrade command (airflow db upgrade instead of airflow upgradedb).
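That is, once the new package is installed, the database step becomes:

airflow db upgrade   # the 2.0 replacement for "airflow upgradedb"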
Regards,
HerrB92
We have documented it at https://airflow.apache.org/docs/apache-airflow/stable/upgrading-to-2.html
Step 1: Upgrade to Python 3
Step 2: Upgrade to Airflow 1.10.14 (a.k.a Airflow "bridge" release)
Step 3: Install and run the Airflow Upgrade check scripts (https://pypi.org/project/apache-airflow-upgrade-check/)
Step 4: Import Operators from Backport Providers
Step 5: Upgrade Airflow DAGs
Step 6: Upgrade Configuration settings
Step 7: Upgrade to Airflow 2.0
The upgrade-check package should help you in upgrading.
Read https://airflow.apache.org/docs/apache-airflow/stable/upgrading-to-2.html#step-3-install-and-run-the-upgrade-check-scripts
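In practice, step 3 boils down to something like this on the 1.10.14 "bridge" install (package and command names are taken from the linked page):

pip install apache-airflow-upgrade-check
airflow upgrade_check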
I deployed Graphite with nginx some time ago, using Chef, but didn't pin the versions to be installed. So now, trying to install with the same recipe, I get errors because something version-related is missing.
I need to find out what version of Graphite I have installed on my other CentOS machines, so I can figure out how to repair the recipe.
Thank you.
Gabriel
If you have web access to the Graphite installation, you can also see the currently running version under the /version/ path.
I had the same issue. I solved it by running pip list, which displays every package you installed with pip. You can also run pip show graphite-web or pip show whisper to get more specific information.
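Concretely, that looks something like this:

pip list | grep -i graphite
pip show graphite-web | grep '^Version'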
Airflow has an upgradedb command that needs to be run when upgrading Airflow versions. I wonder whether it's safe to run even when the version is the same.
The way it works is that in db.py Airflow uses the Alembic command module to check the checked-in migration files in https://github.com/apache/incubator-airflow/tree/master/airflow/migrations/versions, and it only applies changes if the recorded revision differs. But these files only get changed/added when the Airflow version changes, so the upgrade db step does nothing when it's the same version/wheel.
Adding it as a default step since I've verified it's safe to do so.
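In other words, a deploy script can run the migration step unconditionally; when the database is already at the latest revision it simply does nothing (1.10.x syntax shown):

airflow upgradedb   # no-op if alembic_version already matches the newest checked-in migration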