You have two airflow.cfg files

I have created a venv project and installed Airflow within this venv. I have also set AIRFLOW_HOME to a directory (airflow_home) within this venv project. The first time, after I ran
$airflow version
it created airflow.cfg and a logs directory under this 'airflow_home' folder. However, when I repeated the same thing the next day, I got an error message saying that I have two airflow.cfg files:
one airflow.cfg under my venv project
another one under /home/username/airflow/airflow.cfg
Why is that? I haven't installed Airflow anywhere outside this venv project.

Found the issue. If I don't set the environment variable AIRFLOW_HOME, Airflow by default creates a new airflow.cfg under /home/username/airflow. To avoid this, AIRFLOW_HOME should be set before calling airflow each time a terminal starts, or the export should be added to the bash profile.
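For example, a minimal sketch of making the setting permanent (the path /home/username/venv_project/airflow_home below is a placeholder; point it at your own airflow_home directory):
# append the export to your shell profile so every new terminal picks it up
echo 'export AIRFLOW_HOME=/home/username/venv_project/airflow_home' >> ~/.bashrc
# reload the profile in the current shell and confirm the value
source ~/.bashrc
echo $AIRFLOW_HOME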

Related

(Dagster) Schedule my_hourly_schedule was started from a location that can no longer be found

I'm getting the following Warning message when trying to start the dagster-daemon:
Schedule my_hourly_schedule was started from a location Scheduler that can no longer be found in the workspace, or has metadata that has changed since the schedule was started. You can turn off this schedule in the Dagit UI from the Status tab.
I'm trying to automate some pipelines with dagster and created a new project using dagster new-project Scheduler where "Scheduler" is my project.
This command, as expected, created a directory with some hello_world files. Inside it I put the dagster.yaml file with the configuration for a Postgres DB to which I want to write the logs. The whole thing looks like this:
However, whenever I run dagster-daemon run from the directory where the workspace.yaml file is located, I get the message above. I tried running the daemon from other folders, but then it complains that it can't find any workspace.yaml files.
I guess I'm running into a "beginner mistake", but could anyone help me with this?
I appreciate any counsel.
One thing to note is that the dagster.yaml file will not do anything unless you've set your DAGSTER_HOME environment variable to point at the directory that this file lives in.
That being said, I think what's going on here is that you don't have the Scheduler package installed into the python environment that you're running your dagster-daemon in.
To fix this, you can run pip install -e . in the Scheduler directory, although the README.md inside that directory has more specific instructions for working with virtualenvs.
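A minimal sketch of both fixes together, assuming the project lives at ~/Scheduler and the virtualenv you run the daemon from is already activated (the paths are placeholders):
# point DAGSTER_HOME at the directory that contains dagster.yaml
export DAGSTER_HOME=~/Scheduler
# install the Scheduler package into the daemon's Python environment
cd ~/Scheduler
pip install -e .
# run the daemon from the directory that contains workspace.yaml
dagster-daemon run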

Composer doesn't use imported variables

I imported a JSON file that defines variables to be used by Composer.
I used the gcloud beta composer environments storage data import command, and I can see that the file is imported correctly to <composer_bkt>/data/variables. However, when I access the Airflow web UI, I find that no variables are declared!
Moving the file to <COMPOSER_BCKT>/data/variables is not enough by itself to import the variables into Airflow. Apart from that, you need to run the Airflow CLI command:
airflow variables --i <JSON_FILE>
To do that in Composer you have to run the following command as described here:
gcloud composer environments run <ENVIRONMENT_NAME> --location=<LOCATION> variables -- --i /home/airflow/gcs/data/variables/variables.json
Thank you for the answer, @itroulli. It appears that my Composer version (v2.2.5) failed on that command, but a command of this form worked instead:
gcloud composer environments run <ENVIRONMENT_NAME> --location=<LOCATION> variables -- import /home/airflow/gcs/data/variables/variables.json
I'll leave this here for anyone else who comes across this problem.
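Putting the whole flow together, a sketch with placeholder names (my-env, us-central1 and variables.json are assumptions; substitute your own environment, location and file):
# upload the JSON file into the environment's bucket under data/variables/
gcloud composer environments storage data import \
    --environment=my-env --location=us-central1 \
    --source=variables.json --destination=variables
# import the variables using the Airflow 2 CLI syntax that worked above
gcloud composer environments run my-env --location=us-central1 \
    variables -- import /home/airflow/gcs/data/variables/variables.json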

DOTNET_ROOT Not Recognised After Raspbian Reboot

I have been following some basic tutorials for dotnet on Raspbian. They state:
export DOTNET_ROOT=$HOME/dotnet-arm32
export PATH=$PATH:$HOME/dotnet-arm32
However, when I reboot, these are lost. After some reading, I found that adding PATH=$PATH:$HOME/dotnet-arm32 to my ~/.profile solved the dotnet command issue, but DOTNET_ROOT does not work. I have to run export DOTNET_ROOT=$HOME/dotnet-arm32 after each reboot to get a project to run.
This is what the bottom of my ~/.profile looks like:
# set PATH to dotnet
PATH="$PATH:$HOME/dotnet-arm32"
# set ENV for runtime
DOTNET_ROOT="$HOME/dotnet-arm32"
You need to export the variables:
# set PATH to dotnet
export PATH="$PATH:$HOME/dotnet-arm32"
# set ENV for runtime
export DOTNET_ROOT="$HOME/dotnet-arm32"
PATH was already an exported variable, so not exporting it doesn't make a difference. But DOTNET_ROOT is treated as a local variable in .profile unless it's exported explicitly.
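A quick way to see the difference (FOO is just a throwaway variable for illustration): a plain assignment stays local to the current shell, while an exported one is inherited by child processes such as the dotnet host.
# assigned but not exported: invisible to child processes
FOO=bar
bash -c 'echo "without export: $FOO"'
# exported: inherited by child processes
export FOO=bar
bash -c 'echo "with export: $FOO"'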

New airflow directory created when I run airflow webserver

I'm having a few problems with airflow. When I installed airflow, I set the airflow home directory to
my_home/Workspace/airflow_home
But when I start the webserver, a new airflow directory is created:
my_home/airflow
I thought maybe something in the airflow.cfg file needs to be changed but I'm not really sure. Has anyone had this problem before?
Try doing echo $AIRFLOW_HOME and see if it is the correct path you set.
You need to set AIRFLOW_HOME to the directory where you keep your Airflow config file.
If the full path of the airflow.cfg file is /home/test/bigdata/airflow/airflow.cfg, just run:
export AIRFLOW_HOME=/home/test/bigdata/airflow
If AIRFLOW_HOME is not set, Airflow will use ~/airflow as the default.
You could also write a shell script to start the Airflow webserver. It might contain the lines below:
source ~/.virtualenvs/airflow/bin/activate # skip this line if Airflow is not installed in a virtualenv
export AIRFLOW_HOME=/home/test/bigdata/airflow # path should be changed according to your environment
airflow webserver -D # start airflow webserver as daemon
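For example (start_airflow.sh is just a placeholder name), you could save the lines above to a file and run it like this:
chmod +x start_airflow.sh    # make the script executable
./start_airflow.sh           # exports AIRFLOW_HOME and starts the webserver as a daemon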

Airflow user issues

We have installed Airflow from a service account, say 'ABC', using sudo root in a virtual environment, but we are facing a few issues.
We call a Python script using the BashOperator. The Python script uses some environment variables from the Unix account 'ABC'. When running from Airflow, the environment variables are not picked up. To find out which user Airflow runs as, we created a dummy DAG with a BashOperator running 'whoami', and it returned the ABC user. So Airflow is using the same 'ABC' user. Then why are the environment variables not picked up?
We then tried sudo -u ABC python script. The environment variables are not picked up because of the sudo usage. We did a workaround without the environment variables and it ran well in the development environment without issues. But when moving to a different environment we got the error below, and we don't have permission to edit the sudoers file; the admin policy doesn't allow it.
sudo: sorry, you must have a tty to run sudo
We then used the 'impersonation=ABC' option in the .cfg file and ran Airflow without sudo. This time the bash command fails to see the environment variables, and it requires all the packages used by the script to be present in the virtual environment.
My questions:
Airflow is installed through ABC after sudoing to root. Why was ABC not used while running the script?
Why are ABC's environment variables not picked up?
Why does even the impersonation option not pick up the environment variables?
Can Airflow be installed without a virtual environment?
Which is the best approach to install Airflow: using a separate user and sudoing to root? We are using a dedicated user for running the Python script. Experts, kindly clarify.
It's always a good idea to use a virtualenv for installing any Python packages, so you should always prefer installing Airflow in a virtualenv.
You can use systemd or supervisord and create programs for the Airflow webserver and scheduler. An example configuration for supervisor:
[program:airflow-webserver]
command=sh /home/airflow/scripts/start-airflow-webserver.sh
directory=/home/airflow
autostart=true
autorestart=true
startretries=3
stderr_logfile=/home/airflow/supervisor/logs/airflow-webserver.err.log
stdout_logfile=/home/airflow/supervisor/logs/airflow-webserver.log
user=airflow
environment=AIRFLOW_HOME='/home/airflow/'
[program:airflow-scheduler]
command=sh /home/airflow/scripts/start-airflow-scheduler.sh
directory=/home/airflow
autostart=true
autorestart=true
startretries=3
stderr_logfile=/home/airflow/supervisor/logs/airflow-scheduler.err.log
stdout_logfile=/home/airflow/supervisor/logs/airflow-scheduler.log
user=airflow
environment=AIRFLOW_HOME='/home/airflow/'
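To apply a configuration like the one above (the path /etc/supervisor/conf.d/airflow.conf and the start scripts are assumptions you would create yourself), you would typically reload supervisor and start the programs:
# reload supervisor so it picks up the new program definitions
sudo supervisorctl reread
sudo supervisorctl update
# start both programs and check their status
sudo supervisorctl start airflow-webserver airflow-scheduler
sudo supervisorctl status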
We got the same issue:
sudo: sorry, you must have a tty to run sudo
The solution we found was:
su ABC python script
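For reference, a more typical form of that command (a sketch; the script path is a placeholder) passes the command with -c and uses - to load ABC's login environment, which is what makes ABC's environment variables available:
# run the script as ABC with ABC's login environment loaded
su - ABC -c 'python /path/to/script.py'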
