Change Airflow Services Log Paths - airflow

I am looking for resources on changing the log paths for Airflow services such as the webserver and scheduler. I keep running out of space, so I want to move the logs onto a larger mount.
airflow-scheduler.log
airflow-webserver.log
airflow-scheduler.out
airflow-webserver.out
airflow-scheduler.err
airflow-webserver.err
I am starting the services using the commands below:
airflow webserver -D
airflow scheduler -D
Thanks in advance!

From https://airflow.apache.org/howto/write-logs.html#writing-logs-locally
Users can specify a logs folder in airflow.cfg using the base_log_folder setting. By default, it is in the AIRFLOW_HOME directory.
You need to change the airflow.cfg for log related parameters as below:
[core]
...
# The folder where airflow should store its log files
# This path must be absolute
base_log_folder = /YOUR_MOUNTED_PATH/logs
...
[webserver]
...
# Log files for the gunicorn webserver. '-' means log to stderr.
access_logfile = /YOUR_MOUNTED_PATH/webserver-access.log
error_logfile = /YOUR_MOUNTED_PATH/webserver-error.log
...
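If you'd rather not edit airflow.cfg, Airflow also lets you override any config option through environment variables named AIRFLOW__{SECTION}__{KEY}. A minimal sketch of that approach, assuming /YOUR_MOUNTED_PATH exists and is writable by the user running Airflow (on newer versions the log options live in the [logging] section, so the prefix becomes AIRFLOW__LOGGING__):
# Sketch: override the same log settings via environment variables
export AIRFLOW__CORE__BASE_LOG_FOLDER=/YOUR_MOUNTED_PATH/logs
export AIRFLOW__WEBSERVER__ACCESS_LOGFILE=/YOUR_MOUNTED_PATH/webserver-access.log
export AIRFLOW__WEBSERVER__ERROR_LOGFILE=/YOUR_MOUNTED_PATH/webserver-error.log
airflow webserver -D
airflow scheduler -D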

The log location can be specified in airflow.cfg as follows. By default, it is under AIRFLOW_HOME:
[core]
...
# The folder where airflow should store its log files
# This path must be absolute
base_log_folder = /airflow/logs
...
Please refer to https://airflow.apache.org/howto/write-logs.html?highlight=logs for additional information.

In both the master branch and the 1.10 branch of the Airflow code, the locations of the following files are hardcoded unless you pass an argument to the CLI:
airflow-webserver.err
airflow-webserver.out
airflow-webserver.log
airflow-scheduler.err
airflow-scheduler.out
airflow-scheduler.log
The rest of the log locations can be modified through one of the following settings (a config sketch follows the list):
In the [core] section:
base_log_folder
log_filename_template
log_processor_filename_template
dag_processor_manager_log_location
And in the [webserver] section:
access_logfile
error_logfile
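Put together, a sketch of the relevant airflow.cfg entries might look like this (the filename-template defaults are version-specific, so only override them if you actually need a different layout):
[core]
# Task and DAG-processing logs go under the mounted path (absolute path required)
base_log_folder = /YOUR_MOUNTED_PATH/logs
dag_processor_manager_log_location = /YOUR_MOUNTED_PATH/logs/dag_processor_manager/dag_processor_manager.log
# log_filename_template and log_processor_filename_template can stay at their defaults

[webserver]
# Gunicorn access/error logs; '-' means log to stderr
access_logfile = /YOUR_MOUNTED_PATH/webserver-access.log
error_logfile = /YOUR_MOUNTED_PATH/webserver-error.log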

You can supply flags to the airflow webserver -D and airflow scheduler -D commands to put all of the generated webserver and scheduler log files where you want them. Here's an example:
airflow webserver -D \
--port 8080 \
-A $AIRFLOW_HOME/logs/webserver/airflow-webserver.out \
-E $AIRFLOW_HOME/logs/webserver/airflow-webserver.err \
-l $AIRFLOW_HOME/logs/webserver/airflow-webserver.log \
--pid $AIRFLOW_HOME/logs/webserver/airflow-webserver.pid \
--stderr $AIRFLOW_HOME/logs/webserver/airflow-webserver.stderr \
--stdout $AIRFLOW_HOME/logs/webserver/airflow-webserver.stdout
and
airflow scheduler -D \
-l $AIRFLOW_HOME/logs/scheduler/airflow-scheduler.log \
--pid $AIRFLOW_HOME/logs/scheduler/airflow-scheduler.pid \
--stderr $AIRFLOW_HOME/logs/scheduler/airflow-scheduler.stderr \
--stdout $AIRFLOW_HOME/logs/scheduler/airflow-scheduler.stdout
Note: If you use these flags, you'll need to create the logs/webserver and logs/scheduler subfolders yourself. This has only been tested on Airflow 2.1.2.
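For example, something like this before starting the daemons (paths assume the example above):
# Create the subfolders the daemonized webserver and scheduler will write into
mkdir -p $AIRFLOW_HOME/logs/webserver $AIRFLOW_HOME/logs/scheduler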

Related

Varnish 6.0.8 Secret file is not created

We're facing an issue when installing Varnish 6.0.8 on Ubuntu 18.04.6: it doesn't create the secret file inside the /etc/varnish directory.
We use the following script for the installation:
curl -s https://packagecloud.io/install/repositories/varnishcache/varnish60lts/script.deb.sh | sudo bash
Can someone please help?
PS: We tried installing later versions (6.6 and 7.0.0) and got the same issue.
From a security point of view, remote CLI access is not enabled by default. You can see this by looking at /lib/systemd/system/varnish.service:
[Unit]
Description=Varnish Cache, a high-performance HTTP accelerator
After=network-online.target nss-lookup.target
[Service]
Type=forking
KillMode=process
# Maximum number of open files (for ulimit -n)
LimitNOFILE=131072
# Locked shared memory - should suffice to lock the shared memory log
# (varnishd -l argument)
# Default log size is 80MB vsl + 1M vsm + header -> 82MB
# unit is bytes
LimitMEMLOCK=85983232
# Enable this to avoid "fork failed" on reload.
TasksMax=infinity
# Maximum size of the corefile.
LimitCORE=infinity
ExecStart=/usr/sbin/varnishd \
-a :6081 \
-a localhost:8443,PROXY \
-p feature=+http2 \
-f /etc/varnish/default.vcl \
-s malloc,256m
ExecReload=/usr/sbin/varnishreload
[Install]
WantedBy=multi-user.target
There are no -T and -S parameters in the standard systemd configuration. However, you can enable this by modifying the systemd configuration yourself.
Just run sudo systemctl edit --full varnish to edit the runtime configuration and add a -T parameter to enable remote CLI access.
Be careful with this and make sure you restrict access to this endpoint via firewalling rules.
Additionally, you'll add -S /etc/varnish/secret as a varnishd runtime parameter in /lib/systemd/system/varnish.service.
You can use the following command to add a random unique value to the secret file:
uuidgen | sudo tee /etc/varnish/secret
This is what your runtime parameters would look like:
ExecStart=/usr/sbin/varnishd \
-a :6081 \
-a localhost:8443,PROXY \
-p feature=+http2 \
-f /etc/varnish/default.vcl \
-s malloc,2g \
-S /etc/varnish/secret \
-T :6082
When you're done just run the following command to restart Varnish:
sudo systemctl restart varnish
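Once Varnish is back up, a quick way to check that the CLI is reachable with the new secret is varnishadm (a sketch; adjust the -T address to whatever you configured):
# Should answer with PONG if the CLI address and secret are correct
sudo varnishadm -T localhost:6082 -S /etc/varnish/secret ping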

Airflow 2.0.2 - No user yet created

We're moving from Airflow 1.x to 2.0.2, and I'm noticing the error below in my terminal after I run docker-compose run --rm webserver initdb:
{{manager.py:727}} WARNING - No user yet created, use flask fab command to do it.
But in my entrypoint.sh I have the following to create users:
echo "Creating airflow user: ${AIRFLOW_CREATE_USER_USER_NAME}..."
su -c "airflow users create -r ${AIRFLOW_CREATE_USER_ROLE} -u ${AIRFLOW_CREATE_USER_USER_NAME} -e ${AIRFLOW_CREATE_USER_USER_NAME}#vice.com \
-p ${AIRFLOW_CREATE_USER_PASSWORD} -f ${AIRFLOW_CREATE_USER_FIRST_NAME} -l \
${AIRFLOW_CREATE_USER_LAST_NAME}" airflow
echo "Created airflow user: ${AIRFLOW_CREATE_USER_USER_NAME} done!"
;;
Because of this error, whenever I run Airflow locally I still have to create a user manually with the commands below every time I start it up:
docker-compose run --rm webserver bash
airflow users create \
--username name \
--firstname fname \
--lastname lname \
--password pw \
--role Admin \
--email email@email.com
Looking at the Airflow Docker entrypoint script entrypoint_prod.sh, it looks like Airflow will create an admin user for you when the container boots.
By default the admin user is 'admin' with no password.
If you want something different, set these variables: _AIRFLOW_WWW_USER_PASSWORD and _AIRFLOW_WWW_USER_USERNAME.
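A minimal docker-compose sketch of those variables, assuming the official apache/airflow image and its stock entrypoint (service name and values are placeholders):
services:
  webserver:
    image: apache/airflow:2.0.2
    environment:
      _AIRFLOW_DB_UPGRADE: 'true'       # run the DB migrations on boot
      _AIRFLOW_WWW_USER_CREATE: 'true'  # create the admin user on boot
      _AIRFLOW_WWW_USER_USERNAME: admin
      _AIRFLOW_WWW_USER_PASSWORD: admin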
(I'm on airflow 2.2.2)
Looks like they changed the admin creation command password from -p test to -p $DEFAULT_PASSWORD. I had to pass in this DEFAULT_PASSWORD env var to the docker-compose environment for the admin user to be created. It also looks like they now suggest using the .env.localrunner file for configuration.
Here is the commit where that change was made.
(I think you asked this question prior to that change being made, but maybe this will help someone in the future who had my same issue).

Dokku - persistent volumes?

I'm attempting to set up Mautic (https://github.com/mautic/docker-mautic) on Dokku. I have everything working well except for the mounted volume. Mautic stores config files in the volume, so every time the container restarts it needs to be reconfigured if the volume is not set up. The instructions on the above page are:
$ docker volume create mautic_data
$ docker run --name mautic -d \
--restart=always \
-e MAUTIC_DB_HOST=127.0.0.1 \
-e MAUTIC_DB_USER=root \
-e MAUTIC_DB_PASSWORD=mypassword \
-e MAUTIC_DB_NAME=mautic \
-e MAUTIC_RUN_CRON_JOBS=true \
-e MAUTIC_TRUSTED_PROXIES=0.0.0.0/0 \
-p 8080:80 \
-v mautic_data:/var/www/html \
mautic/mautic:latest
I have created a persistent volume in Dokku with:
dokku storage:mount mautic /var/lib/dokku/data/storage/mautic:/mautic_data
This is confirmed:
root@apps:/var/lib# dokku storage:report mautic
=====> mautic storage information
Storage build mounts:
Storage deploy mounts: -v /var/lib/dokku/data/storage/mautic:/mautic_data
Storage run mounts: -v /var/lib/dokku/data/storage/mautic:/mautic_data
However the config file is not saved. Can anyone point out where I am going wrong?
It looks like the directory that the config files are stored in is /var/www/html instead of /mautic_data. In the docker command referenced, mautic_data in -v mautic_data:/var/www/html is the name of the volume on the host created by docker volume create mautic_data, not the directory inside the container.
Try using:
dokku storage:mount mautic /var/lib/dokku/data/storage/mautic:/var/www/html
This will bind /var/lib/dokku/data/storage/mautic in the host computer to /var/www/html inside the container.
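If the old mount is still attached, you can drop it and re-mount before restarting the app, roughly like this (command names per the dokku storage plugin; adjust paths to yours):
# Remove the mis-targeted mount, add the corrected one, then restart the app
dokku storage:unmount mautic /var/lib/dokku/data/storage/mautic:/mautic_data
dokku storage:mount mautic /var/lib/dokku/data/storage/mautic:/var/www/html
dokku ps:restart mautic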

What is RENV_PATHS_CACHE_HOST? -- docker documentation

In the docker vignette/documentation, they give an example with a Shiny app, but don't exactly specify what their parameters mean. Some of them are self-explanatory, but others aren't. More specifically:
https://rstudio.github.io/renv/articles/docker.html
RENV_PATHS_CACHE_HOST=/opt/local/renv/cache
RENV_PATHS_CACHE_CONTAINER=/renv/cache
docker run --rm \
-e "RENV_PATHS_CACHE=${RENV_PATHS_CACHE_CONTAINER}" \
-v "${RENV_PATHS_CACHE_HOST}:${RENV_PATHS_CACHE_CONTAINER}" \
-p 14618:14618 \
R -s -e 'renv::restore(); shiny::runApp(host = "0.0.0.0", port = 14618)'
What is RENV_PATHS_CACHE_HOST?
And is RENV_PATHS_CACHE_CONTAINER the location where my cache will live when I run the container?
I'm not entirely sure how to use this example, but I feel I'll need it.
The example here demonstrates how one might mount an renv cache from the host filesystem onto a Docker container.
In this case, RENV_PATHS_CACHE_HOST points to a (theoretical) cache directory on the host filesystem, at /opt/local/renv/cache, whereas RENV_PATHS_CACHE_CONTAINER points to the location in the container where the host cache will be visible.
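As a quick sanity check, you can run a throwaway container against the same mount and ask renv where its cache lives; it should print the container-side path. A sketch, assuming your image already has R and renv installed (YOUR_R_IMAGE is a placeholder):
RENV_PATHS_CACHE_HOST=/opt/local/renv/cache
RENV_PATHS_CACHE_CONTAINER=/renv/cache
docker run --rm \
-e "RENV_PATHS_CACHE=${RENV_PATHS_CACHE_CONTAINER}" \
-v "${RENV_PATHS_CACHE_HOST}:${RENV_PATHS_CACHE_CONTAINER}" \
YOUR_R_IMAGE \
R -s -e 'renv::paths$cache()'  # prints the cache path renv will use inside the container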

Firestore authorization for Google Compute engine for app on a docker container

I have deployed a Node.js app on a Google compute instance via a Docker container. Is there a recommended way to pass the GOOGLE_APPLICATION_CREDENTIALS to the docker container?
I see the documentation states that GCE has Application Default Credentials (ADC), but these are not available in the docker container. (https://cloud.google.com/docs/authentication/production)
I am a bit new to docker & GCP, so any help would be appreciated.
Thank you!
I found documentation showing how to inject your GOOGLE_APPLICATION_CREDENTIALS into a Docker container in order to test Cloud Run locally. I know this is not Cloud Run, but I believe the same command can be used to inject your credentials into the container.
Since links and their contents can change over time, I will copy the steps needed to inject the credentials here.
Refer to Getting Started with Authentication for instructions
on generating, retrieving, and configuring your Service Account
credentials.
The following Docker run flags inject the credentials and
configuration from your local system into the local container:
Use the --volume (-v) flag to inject the credential file into the
container (assumes you have already set your
GOOGLE_APPLICATION_CREDENTIALS environment variable on your machine):
-v $GOOGLE_APPLICATION_CREDENTIALS:/tmp/keys/FILE_NAME.json:ro
Use the --environment (-e) flag to set the
GOOGLE_APPLICATION_CREDENTIALS variable inside the container:
-e GOOGLE_APPLICATION_CREDENTIALS=/tmp/keys/FILE_NAME.json
Optionally, use this fully configured Docker run command:
PORT=8080 && docker run \
-p 9090:${PORT} \
-e PORT=${PORT} \
-e K_SERVICE=dev \
-e K_CONFIGURATION=dev \
-e K_REVISION=dev-00001 \
-e GOOGLE_APPLICATION_CREDENTIALS=/tmp/keys/FILE_NAME.json \
-v $GOOGLE_APPLICATION_CREDENTIALS:/tmp/keys/FILE_NAME.json:ro \
gcr.io/PROJECT_ID/IMAGE
Note that the path
/tmp/keys/FILE_NAME.json
shown in the example above is a reasonable location to place your
credentials inside the container. However, other directory locations
will also work. The crucial requirement is that the
GOOGLE_APPLICATION_CREDENTIALS environment variable must match the
bind mount location inside the container.
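For instance, a quick sanity check that the key file is actually visible inside the container at the path the variable points to (PROJECT_ID, IMAGE, and FILE_NAME are placeholders, as above):
# The mounted key should show up read-only at the expected path
docker run --rm \
-e GOOGLE_APPLICATION_CREDENTIALS=/tmp/keys/FILE_NAME.json \
-v $GOOGLE_APPLICATION_CREDENTIALS:/tmp/keys/FILE_NAME.json:ro \
gcr.io/PROJECT_ID/IMAGE \
ls -l /tmp/keys/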
Hope this works for you.
