How to set Airflow scheduler log file mode/permissions

I'm running Airflow 1.10.3 on Red Hat Linux. I'm using a LocalExecutor, and the webserver and scheduler are both started via systemd.
The log files generated by the scheduler are world-readable and world-writable (mode "-rw-rw-rw-"). The log directories being created are "drwxrwxrwx".
This fails the security scans my organisation has in place. I need to be able to restrict the permissions on these files.
The umask in /etc/profile is 077. I've also added UMask=0007 to both systemd unit files for the services. However, although this seems to work for the logs under dags/logs/scheduler/, it does not affect the DAG run logs.
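For reference, the override can live in a drop-in like the following (the unit name here is an assumption - use whatever your installation calls its services), followed by systemctl daemon-reload and a restart:
# /etc/systemd/system/airflow-scheduler.service.d/override.conf
[Service]
UMask=0007
Yet the DAG run logs still come out like this: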
[root@server logs]# ls -la s3_dag_test/
total 4
drwxrwxrwx. 4 airflow airflow 54 Aug 7 17:35 .
drwxrwx---. 46 airflow airflow 4096 Aug 7 20:00 ..
drwxrwxrwx. 5 airflow airflow 126 Aug 7 17:37 bash_test
drwxrwxrwx. 5 airflow airflow 126 Aug 7 17:29 check_s3_for_file_in_s3
[root@server logs]# ls -la s3_dag_test/bash_test/2019-08-07T17\:29\:27.988953+00\:00/
total 12
drwxrwxrwx. 2 airflow airflow 19 Aug 7 17:35 .
drwxrwxrwx. 5 airflow airflow 126 Aug 7 17:37 ..
-rw-rw-rw-. 1 airflow airflow 8241 Aug 7 17:35 1.log

This is probably too late to be a helpful answer for you, but I had the exact same issue. My organization flagged the permissions of the Airflow log directories as a security finding. I likewise checked the umask, to no avail.
I did manage to find this:
https://anoopkm.wordpress.com/2020/03/26/world-readable-airflow-dag-logs-issue/
In a nutshell, Airflow hard-codes the permissions it uses when creating log files and folders.
I edited venv/lib/python3.8/site-packages/airflow/utils/log/file_task_handler.py, changing lines 242 and 247 to use 0o770 and 0o660 instead of 0o777 and 0o666 for creating folders and files, respectively. Then I manually triggered a DAG and checked the folder permissions. The newest log folder no longer had global rwx permissions.
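For anyone on a similar version, the patched section of _init_file in that file looks roughly like this - a sketch rather than the exact upstream source (line numbers and structure vary by release), with the two changed modes marked:
def _init_file(self, ti):
    # build the local log path for this task instance
    relative_path = self._render_filename(ti, ti.try_number)
    full_path = os.path.join(self.local_base, relative_path)
    directory = os.path.dirname(full_path)

    if not os.path.exists(directory):
        os.makedirs(directory)
    os.chmod(directory, 0o770)      # was hard-coded 0o777

    if not os.path.exists(full_path):
        open(full_path, "a").close()
        os.chmod(full_path, 0o660)  # was hard-coded 0o666

    return full_path
Bear in mind that a patch like this is silently overwritten whenever the airflow package is reinstalled or upgraded.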

Can you let us know how Airflow is installed - as a normal user or as root?

Related

User Plugin in Artifactory not loading

I have Artifactory 6.20.1 running in a Docker container. I'm trying to install the artifactCleanup plugin (https://github.com/jfrog/artifactory-user-plugins/tree/master/cleanup/artifactCleanup).
I have put the artifactCleanup.groovy file in the corresponding folder:
$ ls -all /opt/jfrog/artifactory/var/etc/artifactory/plugins/
total 36
drwxr-xr-x 2 artifact artifact 4096 Feb 24 10:28 .
drwxr-xr-x 3 artifact artifact 4096 Feb 23 15:24 ..
-rwxr-xr-x 1 artifact artifact 5829 Feb 23 15:25 README.md
-rwxr-xr-x 1 artifact artifact 14043 Feb 23 15:26 artifactCleanup.groovy
-rwxr-xr-x 1 artifact artifact 325 Feb 24 10:28 artifactCleanup.json
However, when I try to list the loaded plugins, I get an empty response:
curl -X GET -u "admin:password" http://localhost:8081/artifactory/api/plugins
{}
The server was restarted before running that request. All commands were run inside the Docker container. I have been looking at the documentation (https://www.jfrog.com/confluence/display/JFROG/User+Plugins) on how to install plugins. The user account used for the REST calls is an admin account.
Now I am out of clues as to why the plugin is not loading.
You can reload the plugins using the Reload Plugins REST API endpoint:
https://www.jfrog.com/confluence/display/JFROG/Artifactory+REST+API#ArtifactoryRESTAPI-ReloadPlugins
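A minimal call, reusing the credentials and port from the question:
curl -X POST -u "admin:password" http://localhost:8081/artifactory/api/plugins/reload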
Please comment here if you are running into any issues.
Turns out I had created the wrong directory. The correct directory is
/var/opt/jfrog/artifactory/etc/plugins
which already existed.
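For example, to move the plugin files from the directory in the question into the correct one (paths as above - adjust for your container layout):
mv /opt/jfrog/artifactory/var/etc/artifactory/plugins/artifactCleanup.* \
   /var/opt/jfrog/artifactory/etc/plugins/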

How to handle task log permissions in Airflow

Issue: while jobs run under the airflow admin ID, the logs in the Linux directory are created with readable permissions.
But when we create a new user (with a User/Op role rather than the Admin role) and that user triggers the DAG, the logs are created with different permissions, and the user cannot monitor the logs from the server (though they can still access them from the UI).
I tried modifying "venv/lib/python3.8/site-packages/airflow/utils/log/file_task_handler.py", and the listing now looks like this:
[root@server logs]# ls -la
total 12
drwx------ 2 airflow airflow 23 Feb 15 03:39 2022-01-15T00:00:00+00:00
and under this directory:
-rw-rw-rw-. 1 airflow airflow 23 Feb 15 03:39 1.log
but the issue still persists: the directory is created with "drwx------" permissions.
Could anyone please suggest what can be done?
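One thing worth checking is the effective umask of the process that actually writes these logs; on kernels 4.7+ it is exposed in /proc (the pgrep pattern here is an assumption - point it at your scheduler or worker process):
pid=$(pgrep -f "airflow scheduler" | head -n1)
grep -i umask /proc/$pid/status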

How to grant nginx permissions to phpMyAdmin on synology diskstation

I have a Synology Diskstation DS216se running DSM 6.2.3-25426. I've installed MariaDB 10, Web Station, PHP 7.2, and phpMyAdmin, but when I open it at http://diskstation/phpMyAdmin/ I get this error message:
"Sorry, the page you are looking for is not found."
I'm using an nginx server in Web Station, and the error log at /var/log/nginx/error.log contains multiple entries like the following
*621 open() "/var/services/web/phpMyAdmin/js/vendor/jquery/jquery.debounce-1.0.5.js" failed (13: Permission denied)
The file, and all other files with permission denied entries in the logs, exist in the /var/services/web/phpMyAdmin/ directory - what permissions need to be granted to the directory for this to succeed?
I hit this as well. I managed to recover, but the fix effectively amounts to clearing out any evidence of prior installs of Web Station, PHP 7.2, phpMyAdmin, and any other web-related services, then manually ripping out some bad directories with broken symlinks/permissions.
My hypothesis is that I tried to install adminer prior to this and - having not done any setup for Web Station et al. - it put the filesystem in a bad state.
I am not willing to try installing adminer again to test this hypothesis.
What I did to fix this:
Backup what you need (e.g., any personal web site).
SSH into your diskstation. Please be aware of what you are doing and keep in mind the big picture. Don't go deleting random things.
Uninstall Web Station, PHP 7.2, Apache, phpMyAdmin, etc. Anything that Web Station would ultimately be inclined to read and serve up.
Verify that /var/services/web doesn't contain anything you care about, and delete it (sudo rm -rf /var/services/web).
Verify that /volume1/web doesn't contain anything you care about, and delete everything inside it (sudo rm -rf /volume1/web/*). You may need to chmod permissions for this - I ended up leaving the web directory itself intact, but nothing inside.
Reboot. Mount any encrypted disks, etc.
Check that /var/services/web now shows it is symlinked to /volume1/web, e.g. sudo readlink -e /var/services/web.
Also check permissions for /volume1/web, e.g. ls -al /volume1. It should be owned by root:root and have permissive (777) bits.
Install Web Station, PHP 7.2, and phpMyAdmin in that order.
After this, I could open phpMyAdmin and be served its login screen.
Debugging notes:
For me, when I SSHed in, I saw similar issues in the logs:
2020/12/17 10:36:35 [error] 32658#32658: *1028 "/var/services/web/phpMyAdmin/index.php" is forbidden (13: Permission denied),
ps says that the nginx workers run as the http user (uid=1023(http) gid=1023(http) groups=1023(http)).
The directory /var/services/web/ appears to be owned by root, both group and user:
# ls -al /var/services/web/
total 424
drwxr-xr-x 3 root root 4096 Dec 17 10:29 .
drwxr-xr-x 3 root root 4096 Dec 17 10:22 ..
-rw-r--r-- 1 root root 27959 Apr 13 2016 adminer.css
-rw-r--r-- 1 root root 82 Apr 13 2016 .htaccess
-rw-r--r-- 1 root root 387223 Apr 13 2016 index.php
drwxr-xr-x 10 root root 4096 Dec 17 10:29 phpMyAdmin
It's not clear to me how Web Station's nginx is intended to work at all given the mismatch - perhaps some set of actions I took prior caused it to decide to install with bad ownership.
I decided to leave everything owned by root, but changed group permissions so that http can access:
# chown -R root:http /var/services/web/
# chmod -R 775 /var/services/web/
This got past the initial error, but revealed a new one:
"/usr/syno/synoman/phpMyAdmin/index.cgi" is not found (2: No such file or directory)
Indeed, there was no trace of phpMyAdmin anywhere in that directory. Evidence of a bad install.
I decided to uninstall anything web related: phpMyAdmin, PHP 7, Apache (which happened to be installed), nginx, and Web Station. Once I did, I still had two files in /var/services/web: adminer.css and index.php.
I had tried adminer prior to this. In /var/services, there were symlinks to specific volume locations, e.g.:
# ls -al /var/services/
total 12
drwxr-xr-x 3 root root 4096 Dec 17 10:22 .
drwxr-xr-x 17 root root 4096 Dec 17 10:21 ..
lrwxrwxrwx 1 root root 18 Jan 20 2020 download -> /volume1/#download
lrwxrwxrwx+ 1 root root 14 Dec 17 10:22 homes -> /volume1/homes
lrwxrwxrwx 1 root root 24 Jan 20 2020 pgsql -> /volume1/#database/pgsql
lrwxrwxrwx 1 root root 13 Dec 17 10:22 tmp -> /volume1/#tmp
lrwxrwxrwx 1 root root 13 Dec 17 10:22 web
Interestingly, web was not symlinked. I fully deleted /var/services/web.
Looking over at /volume1, I do see a /volume1/web, again fully owned by root but with extremely constrained permission:
d---------+ 1 root root 52 Dec 17 10:14 web
There were only a few things in here, which looked related to a blank install of Web Station. I fully deleted everything within /volume1/web but left the directory itself in place. With everything maximally cleaned, I rebooted.
Upon boot, /var/services/web was now symlinked to /volume1/web, which now also had useful permission bits (777) and was owned by root:root. Maybe this was done by some boot recovery process, who knows. (I still had nothing web related installed at this point.)
I installed Web Station, then PHP 7.2, then phpMyAdmin.
I had the same issue when accessing my server via
<name>.local/phpMyAdmin/
It worked when I accessed it via
<local ip>/phpMyAdmin/

Airflow task intermittently fails due to "Failed to fetch log file" and "Could not read logs"

I'm running a DAG that runs once per day. It starts with 9 concurrently running tasks that all do the same thing - each basically polls S3 to see whether that task's designated file exists. Each task has the same code in Airflow and is put into the structure in the same way. One of these tasks, on random days, fails to "begin" - it never enters the running state; it just sits queued. When it does this, here's what its log says:
*** Log file isn't local.
*** Fetching here: http://:8793/log/my.dag.name./my_airflow_task/2020-03-14T07:00:00
*** Failed to fetch log file from worker.
*** Reading remote logs...
Could not read logs from s3://mybucket/airflow/logs/my.dag.name./my_airflow_task/2020-03-14T07:00:00
Why does this only happen on random days? All similar questions I've seen point to this error happening consistently and, once overcome, it no longer continues. To "trick" this task into running, I manually touch the file the log is supposed to be written to, and then the task changes to running.
The issue turned out to be the system's ownership rules for the folder that particular task's logs were written to. I used a CI tool to ship the new task_3 when I deployed updated Airflow Python code to the production environment, so the task's log directory was created by that tool. When I peeked at the log directory ownership, I noticed this for the tasks:
# inside/airflow/log/dir:
drwxrwxr-x 2 root root 4096 Mar 25 14:53 task_3 # is the offending task
drwxrwxr-x 2 airflow airflow 20480 Mar 25 00:00 task_2
drwxrwxr-x 2 airflow airflow 20480 Mar 25 15:54 task_1
So I think what was going on was that, at random, Airflow couldn't get permission to write the log file, so it wouldn't start the rest of the task. I applied the appropriate chown command, something like sudo chown -R airflow:airflow task_3, and ever since I changed this, the issue has disappeared.
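A quick way to spot any other wrongly-owned directories under the log root and fix them in one go (the path is an assumption - use your configured base_log_folder):
find /usr/local/airflow/logs -not -user airflow
sudo chown -R airflow:airflow /usr/local/airflow/logs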

Make Redis unixsocket owned by redis user

I have installed Redis 3.0.6 on Debian. There's a /etc/init.d/redis file which starts the Redis server when the system starts or I can invoke it manually to start/stop the server. Problem is that this script is run as root user.
I have a redis user and group that I want to make Redis run under. But I can't figure out how (I have not found an option to make Redis switch user ID after startup). In my config file I use
unixsocket /home/redis/redis.sock
unixsocketperm 770
But, of course, the redis.sock is owned by root.
drwxr-xr-x 2 redis redis 4096 Jan 18 03:34 bin
drwxr-xr-x 2 redis redis 4096 Jan 18 03:55 data
-rw-r--r-- 1 redis redis 41638 Jan 18 03:52 redis.conf
-rw-r--r-- 1 redis redis 16348 Jan 18 03:55 redis.log
-rw-r--r-- 1 root root 5 Jan 18 03:55 redis.pid
srwxrwx--- 1 root root 0 Jan 18 03:55 redis.sock
And the process is, too.
root 7913 0.1 0.1 38016 1976 ? Ssl 03:55 0:00 /home/redis/bin/redis-server *:6379
Ultimately, I have a git user that is also in the redis group and thus should in the end have access to redis.sock. (This is for a manual deployment of GitLab CE).
How can I configure the Redis server that way?
Update your /etc/init.d script to use sudo when starting the service (line 33):
sudo -u redis $EXEC $CONF
You may need to clean up old files (in /var/lib) or reset their ownership to the redis user.
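For context, the start case of the stock init script bundled with the Redis sources looks roughly like this, so the change amounts to (a sketch - your script may differ):
start)
    if [ -f $PIDFILE ]; then
        echo "$PIDFILE exists, process is already running or crashed"
    else
        echo "Starting Redis server..."
        sudo -u redis $EXEC $CONF
    fi
    ;;
An alternative to sudo is to have the script drop privileges via start-stop-daemon --chuid redis:redis, which is how the Debian-packaged redis-server init script does it.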
