What is the export command doing in this Apache Airflow setup?

I am following this tutorial.
https://towardsdatascience.com/getting-started-with-apache-airflow-df1aa77d7b1b
When I run the export command as below:
export AIRFLOW_HOME='pwd' airflow_home
What is this export command doing? Will it create an environment variable AIRFLOW_HOME = pwd? Is that the purpose?
When I run the next command, airflow initdb, it creates a folder called pwd inside my newly created project directory and puts the files in there.
Am I missing something here?
I am using a MacBook, Python 3.7, and Airflow 1.10.9.

You're using a single quote ' where you need a backtick `.
On *nix systems, `pwd` in backticks is evaluated by the shell and replaced with the current directory; in single quotes it is just the literal string pwd. That's why a folder called pwd is created instead of the current directory being used as the Airflow home.
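A minimal sketch of the difference (the /airflow_home suffix is an assumption about what the tutorial intends; adjust to your project layout):

export AIRFLOW_HOME='pwd'                   # single quotes: assigns the literal string "pwd"
export AIRFLOW_HOME=`pwd`/airflow_home      # backticks: runs pwd and substitutes its output
export AIRFLOW_HOME="$(pwd)/airflow_home"   # equivalent modern form, easier to read
echo "$AIRFLOW_HOME"                        # verify the value before running airflow initdb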

Related

Loading own dags in Airflow

When running the Airflow server for the first time after installation, the DAG list already contains some DAGs, such as "example_bash_operator", "example_branch_labels", etc.
The Airflow docs say that, to create our own DAGs, we should put them in a dags folder located at AIRFLOW_HOME/airflow/dags/ (AIRFLOW_HOME is the folder where I installed Airflow). I put a sample dag1.py in this folder, but after logging back in to localhost:8080 I still see only the standard post-installation list of DAGs; I don't see dag1.py. I have both the server and the scheduler running with:
airflow webserver --port 8080
airflow scheduler
The full folder structure is as following:
\AIRFLOW_HOME\
    airflow\
        airflow-webserver.pid
        airflow.db
        logs\
        airflow.cfg
        dags\
            dag1.py
        webserver_config.py
This thread advised running airflow dags list first. The DAG from dag1.py does not appear in the list when I run that command, and after running it and restarting the server and scheduler, the web UI still does not list it.
In airflow.cfg, I have this line defining the dags folder:
dags_folder = /xxxxxx/airflow/dags
where the xxxxxx is the absolute path of AIRFLOW_HOME.
The content of dag1.py is code copied from a YouTube tutorial, so I think it is a valid DAG.
What am I missing?
AIRFLOW_HOME is NOT where you install Airflow. AIRFLOW_HOME is either:
the directory that the AIRFLOW_HOME environment variable points to,
or, if you have no AIRFLOW_HOME variable defined when you run airflow, the default "${HOME}/airflow"
See:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html?highlight=airflow_home
The first time you run Airflow, it will create a file called airflow.cfg in your $AIRFLOW_HOME directory (~/airflow by default). This file contains Airflow's configuration and you can edit it to change any of the settings. You can also set options with environment variables by using this format: AIRFLOW__{SECTION}__{KEY} (note the double underscores).
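For example, a sketch with placeholder paths (substitute your actual install directory):

echo "${AIRFLOW_HOME:-$HOME/airflow}"                      # where Airflow will look; unset means ~/airflow
export AIRFLOW_HOME=/path/to/airflow                       # placeholder; point at your real directory
export AIRFLOW__CORE__DAGS_FOLDER=/path/to/airflow/dags    # same setting via the double-underscore form
airflow dags list                                          # dag1.py's DAG should now show up here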

Creating a pyinstaller executable that uses virtualenv imported modules

So, the title basically covers my question. I've created a project using virtualenv, i.e. I have to
source ./env/bin/activate
to run my script.
When I try creating an executable using:
pyinstaller --onefile <myscript.py>
None of the virtualenv packages are included; just the ones that are installed globally. I have a requirements.txt file that contains all of the modules I need. Is there a way to have pyinstaller point to that for the needed modules, or is there another way?
As Valentino pointed out in How can I create the minimum size executable with pyinstaller?, you have to run PyInstaller from inside the virtual environment:
(venv_test) D:\testenv>pyinstaller
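On macOS/Linux the equivalent would be along these lines (a sketch using the asker's env directory; myscript.py is a placeholder):

source ./env/bin/activate          # enter the virtualenv first
pip install pyinstaller            # install PyInstaller inside the venv so it resolves the venv's packages
pyinstaller --onefile myscript.py  # now it bundles the venv's site-packages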
How to fix modules from the virtual environment not being imported
The virtual environment saves modules in a different directory than the global module directory. If you are using Windows like me, you can find the modules directory here:
C:\Users\<your username>\.virtualenvs\<your project name>\Lib\site-packages
When you have found your virtualenv directory, run this command instead of the simple command (pyinstaller <script>.py):
pyinstaller --paths "C:\Users\<your username>\.virtualenvs\<your project name>\Lib\site-packages" --hidden-import <module name that should be import> <your script name>.py
To output just one file, add -F or --onefile.
You can include as many modules as you need by adding more --hidden-import flags with module names.
Flag description:
--paths: PyInstaller will search for imports here
--hidden-import: modules that PyInstaller should import from that path
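A rough macOS/Linux equivalent of that command, assuming the virtualenv lives in ./env with Python 3.7 (requests and myscript.py are only example names):

pyinstaller --onefile \
    --paths "./env/lib/python3.7/site-packages" \
    --hidden-import requests \
    myscript.py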

Passing environment variables through jar file which app uses

I am currently trying out the Docker link between my app and db containers. I've checked my app container, and the environment variables are automatically set when I link the containers together.
What I want is for my config file, which is packaged into a jar file, to receive the environment variables and set the required values from them. Any advice or help?
And this is how I create a config file in my jar file to connect to MySQL
database {
  url = "jdbc:mysql://${MYSQL_PORT_3306_TCP_ADDR}:${MYSQL_PORT_3306_TCP_PORT}/mydb"
  driver = "com.mysql.jdbc.Driver"
}
Updating the config file inside the jar would be quite overkill.
I think you have several choices:
read the config environment variables directly in your program, and use them either directly or to generate the config file there
create a launch script (the details depend on your guest OS in Docker; sh/bash for Linux, etc.); that script can generate a new config file from the environment and put it on the classpath before the jar, so your program sees it
EDIT: added example
You can save this kind of launcher script on docker image which dynamically creates configuration before launching actual program.
#!/bin/bash
# some default values for testing even without links to other container
MYSQL_PORT_3306_TCP_ADDR=${MYSQL_PORT_3306_TCP_ADDR:-127.0.0.1}
MYSQL_PORT_3306_TCP_PORT=${MYSQL_PORT_3306_TCP_PORT:-3306}
cat << EOF > /opt/yourprogram/dbconfig.conf
database {
  url = "jdbc:mysql://${MYSQL_PORT_3306_TCP_ADDR}:${MYSQL_PORT_3306_TCP_PORT}/mydb"
  driver = "com.mysql.jdbc.Driver"
}
EOF
scala -classpath /opt/yourprogram YourProgram
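For instance, if the script above is saved in the image as /opt/yourprogram/launch.sh (the image name below is a placeholder), you can exercise both the defaults and the injected values:

docker run yourimage /opt/yourprogram/launch.sh        # no env set: falls back to 127.0.0.1:3306
docker run -e MYSQL_PORT_3306_TCP_ADDR=10.0.0.5 \
           -e MYSQL_PORT_3306_TCP_PORT=3306 \
           yourimage /opt/yourprogram/launch.sh        # linked/explicit values override the defaults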
What I did was write the sh file in my directory /tmp/restcore-1.0-SNAPSHOT/bin like this:
#!/bin/bash
echo "database { url=\"jdbc:mysql://${MYSQL_PORT_3306_TCP_ADDR}:${MYSQL_PORT_3306_TCP_PORT}/mydb\" driver=\"com.mysql.jdbc.Driver\" }" > myconf.conf
jar uf /tmp/restcore-SNAPSHOT/lib/com.organization.restcore-1.0-SNAPSHOT.jar /tmp/restcore-1.0-SNAPSHOT/bin/myconf.conf
After building the Dockerfile and running the sh file in CMD, I use cat myconf.conf to check that the config file picked up the environment values.
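To reproduce that check (the image tag here is hypothetical):

docker build -t restcore .
docker run restcore cat myconf.conf   # the ${...} placeholders should come out expanded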

Making a specific application (e.g. pytest) use a selected Unix path

I have two versions of pytest installed, one locally in a directory in my home directory, and one that is installed in /usr/local/bin.
The version of pytest installed in /usr/local/bin is 2.2.4, and I don't have sudo rights to upgrade it to the newer version, 2.3.4, but I need some tests to run with 2.3.4.
Is there a way to adjust the path so that it always uses the pytest in my home directory over the one in /usr/local/bin when I invoke pytest?
Because I need to run many tests, a shortcut would be much more convenient!
You should add a directory to your $PATH that contains the copy of pytest you would like to use. For example, place pytest in ~/bin and add ~/bin (or $HOME/bin) to your path:
PATH="$HOME/bin:$PATH"
export PATH
As indicated, place the new directory at the front of the path so that your copy of pytest (and whatever else you put in ~/bin) will be found first.
Even better, put those two lines into ~/.profile so that your $PATH will be updated every time you log in.
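Putting it together, a sketch assuming the newer pytest binary sits at ~/local/bin/pytest (adjust to wherever yours actually lives):

mkdir -p ~/bin
ln -s ~/local/bin/pytest ~/bin/pytest   # or cp instead of ln -s
PATH="$HOME/bin:$PATH"
export PATH
command -v pytest                       # should now print $HOME/bin/pytest
pytest --version                        # should report 2.3.4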

Adding directory to PATH through Makefile

I'm having some trouble exporting the PATH I've modified inside the Makefile to the current terminal.
I'm trying to add the bin folder, inside whatever directory the Makefile is in, to the PATH.
Here's the relevant part of the Makefile:
PATH := $(shell pwd)/bin:$(PATH)

install:
	mkdir -p ./bin
	export PATH
	echo $(PATH)
The echo prints it correctly but if I redo the echo in the terminal, the PATH remains the same.
Thanks in advance for the help.
If you're using GNU make, you need to explicitly export the PATH variable to the environment for subprocesses:
export PATH := $(shell pwd)/bin:$(PATH)

install:
	mkdir -p ./bin
	export PATH
	echo $(PATH)
What you are trying to do is not possible. Make runs in another process than the shell in your terminal. Changes to the environment in the make process do not transfer to the shell.
Perhaps you are confusing the effect of the export statement. export does not export the values of the variables from the make process to the shell. Instead, export marks variables so that they will be transferred to any child processes of make. As far as I know, there is no way to change the environment of the parent process (the shell where you started make is the parent process of the make process).
Perhaps this answer will make the concept of exporting variables to child processes a bit clearer.
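A quick shell demonstration of that direction of travel: exported variables reach child processes, and nothing a child does reaches the parent:

FOO=bar                               # plain shell variable, not exported
export BAZ=qux                        # marked for export to child processes
bash -c 'echo "FOO=$FOO BAZ=$BAZ"'    # child prints: FOO= BAZ=qux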
Perhaps you can rely on the user to do it for you. Note the quoting:
install_hint:
	@echo "Execute this command at your shell prompt:"
	@echo "export PATH=$(shell pwd)/bin:\$$PATH"
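Running it prints something the user can paste into their shell; assuming the Makefile lives in /home/user/project, the output would look like:

$ make install_hint
Execute this command at your shell prompt:
export PATH=/home/user/project/bin:$PATH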
