Airflow 2.0 API response 403 Forbidden - airflow

I'm trying to trigger a new dag run via Airflow 2.0 REST API. If I am logged in to the Airflow webserver on the remote machine and I go to the swagger documentation page to test the API, the call is successful. If I log out or if the API call is sent through Postman or curl, then I get a 403 forbidden message. The same 403 error message is received in curl or postman whether I provide the web server username password or not.
curl -X POST --user "admin:blabla" "http://10.0.0.3:7863/api/v1/dags/tutorial_taskflow_api_etl/dagRuns" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"conf\":{},\"dag_run_id\":\"string5\"}"
{
"detail": null,
"status": 403,
"title": "Forbidden",
"type": "https://airflow.apache.org/docs/2.0.0/stable-rest-api-ref.html#section/Errors/PermissionDenied"
}
The security for API has been changed to default, instead of deny_all (auth_backend = airflow.api.auth.backend.default). The installation of airflow has been done using pip using ubuntu 18 bionic. Dags are running fine if triggered manually or scheduled. The database backend is postgres.
Also tried copying the cookie details from Chrome into postman to get past this issue, but it did not work.
Here is the log on the web server for the two calls mentioned above.
airflowWebserver_container | 10.0.0.4 - - [05/Jan/2021:06:35:33 +0000] "POST /api/v1/dags/tutorial_taskflow_api_etl/dagRuns HTTP/1.1" 403 170 "http://10.0.0.3:7863/api/v1/ui/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
airflowWebserver_container | 10.0.0.4 - - [05/Jan/2021:06:35:07 +0000] "POST /api/v1/dags/tutorial_taskflow_api_etl/dagRuns HTTP/1.1" 409 251 "http://10.0.0.3:7863/api/v1/ui/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"

I am using basic_auth for Airflow v2.0. The AIRFLOW__API__AUTH_BACKEND environment variable should be set to airflow.api.auth.backend.basic_auth. You will have to restart the webserver container. Then you should be able to access all stable APIs using the cURL commands with --user option.

In Airflow 2.0, There seems to be some bug.
If you set this auth configuration in airflow.cfg, it doesn't work.
auth_backend = airflow.api.auth.backend.basic_auth
But setting this as an environment variable works
AIRFLOW__API__AUTH_BACKEND: "airflow.api.auth.backend.basic_auth"

#AmitSingh was correct. Setting security to default only works with the experimental api. I changed the relevant configuration in airflow, restarted and added 'experimental' in the api path. Please see https://airflow.apache.org/docs/apache-airflow/stable/rest-api-ref.html

Maybe also good to know:
You can only disable authentication for experimental API, not the stable REST API.
See: https://airflow.apache.org/docs/apache-airflow/stable/security/api.html#disable-authentication

Related

Different API response for axios and curl requests

Instagram has an endpoint that gives you a JSON in the response.
When I try to call it using curl I would get a JSON.
curl -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36" https://www.instagram.com/microsoft/\?__a\=1
However, if I use other HTTP clients such as Axios for node.js I would get an html page instead. Also for some other HTTP clients I would get 302 redirect occasionally.
const axios = require('axios')
axios.request({
url: 'https://www.instagram.com/microsoft/\?__a',
headers: {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36'}
})
Is there a way to get around this with axios or other HTTP clients so that they follow the curl behaviour for HTTP requests?
What I usually do in these cases, is use Postman in order to convert any type of requests, like Curl based request, to any other type, for example axios, fetch etc...

nginx errors with very large headers

When the user selects the ‘All’ filter on our dashboards, most queries fail and we get this error: 502 - Bad Gateway in Grafana. If it refreshes the page, the errors disappear and the dashboards work. We use an nginx as a reverse proxy and imagine that the problem is linked to URI size or headers. We made an attempt to increase the buffers: large_client_header_buffers 32 1024k. A second attempt was to change the InfluxDB method from GET to POST. Errors have diminished, but they still happen constantly. Our configuration uses nginx + Grafana + InfluxDB.
When using All nodes as filter on our dashboards ( the maximum of possible information), most of the queries return an failure (502 - Bad Gateway) on grafana. We have Keycloak for authetication and an nginx, working as an reverse proxy in front of our grafana server and somehow the problem is linked to it, when acessing the grafana server directly, trhough an ssh-tunnel for example, we do not experience the failure.
nginx log error example:
<my_ip> - - [22/Dec/2021:14:35:27 -0300] "POST /grafana/api/datasources/proxy/1/query?db=telegraf&epoch=ms HTTP/1.1" 502 3701 "https://<my_domain>/grafana/d/gQzec6oZk/compute-nodes-administrative-dashboard?orgId=1&refresh=1m" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36" "-"
below prints of the error in grafana and the configuration variables
variables we use in them as a whole
error in grafana

Why wget fails to download a file but browser succeeds?

I am trying to download virus database for clamav from http://database.clamav.net/main.cvd location. I am able to download main.cvd from web browser(chrome or firefox) but unable to do same with wget and get the following error:
--2021-05-03 19:06:01-- http://database.clamav.net/main.cvd
Resolving database.clamav.net (database.clamav.net)... 104.16.219.84, 104.16.218.84, 2606:4700::6810:db54, ...
Connecting to database.clamav.net (database.clamav.net)|104.16.219.84|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2021-05-03 19:06:01 ERROR 403: Forbidden.
Any lead on this issue?
Edit 1:
This is how my chrome cookies look like when I try to download main.cvd
Any lead on this issue?
It might be possible that blocking is based on User-Agent header. You might use --user-agent= option to set same User-Agent as for browser. Example
wget --user-agent="Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0" https://www.example.com
will download example.com page and identify itself as Firefox to server. If you want know more about User-Agent's part meaning you might read Developer Mozilla docs for User-Agent header
Check for the session cookies or tokens from browser, as some websites place similar kind of security

Wordpress site gets infected with malware, random POST requests from hackers return 200 results, trying to understand how this happens

A word press site i maintain, gets infected with .ico extension PHP scripts and their invocation links. I periodically remove them. Now i have written a cron job to find and remove them every minute. I am trying to find the source of this hack. I have closed all the back doors as far as i know ( FTP, DB users etc..).
After reading similar questions and looking at https://perishablepress.com/protect-post-requests/, now i think this could be because of malware POST requests. Monitoring the access log i see plenty of POST requests that fail with 40X response. But i also see requests that succeed which should not. Example one below, first request fails, similar POST Requests succeeds with 200 response few hours later.
I tried duplicating a similar request from https://www.askapache.com/online-tools/http-headers-tool/, but that fails with 40X response. Help me understand this behavior. Thanks.
POST Fails as expected
146.185.253.165 - - [08/Dec/2019:04:49:13 -0700] "POST / HTTP/1.1" 403 134 "http://website.com/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/534.24 (KHTML, like Gecko) RockMelt/0.9.58.494 Chrome/11.0.696.71 Safari/534.24" website.com
Few hours later same post succeeds
146.185.253.165 - - [08/Dec/2019:08:55:39 -0700] "POST / HTTP/1.1" 200 33827 "http://website.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.861.0 Safari/535.2" website.com
146.185.253.167 - - [08/Dec/2019:08:55:42 -0700] "POST / HTTP/1.1" 200 33827 "http://website.com/" "Mozilla/5.0 (Windows NT 5.1)

REST API - works in chrome but curl did not work

I am using a web service API.
http://www.douban.com/j/app/radio/people?app_name=radio_desktop_win&version=100&user_id=&expire=&token=&sid=&h=&channel=1&type=n
Typing that address into the chrome, expected result (json file containing song information) could be returned but when using curl it failed. (in both case,response code is OK but the response body is not correct in the later case )
Here are the request info dumped using the Chrome developer tool:
Request URL:http://www.douban.com/j/app/radio/people?app_name=radio_desktop_win&version=100&user_id=&expire=&token=&sid=&h=&channel=7&type=n
Request Method:GET
Status Code:200 OK
Request Headersview source
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip,deflate,sdch
Accept-Language:zh-CN,zh;q=0.8
Connection:keep-alive
Cookie:bid="lwaJyClu5Zg"
Host:www.douban.com
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36
Query String Parametersview sourceview URL encoded
app_name:radio_desktop_win
version:100
user_id:
expire:
token:
sid:
h:
channel:7
type:n
However, using that API with curl, i.e curl http://www.douban.com/j/app/radio/people?app_name=radio_desktop_win&version=100&user_id=&expire=&token=&sid=&h=&channel=7&type=n will not return expected result.
Even specifying the exactly header as what dumped from Chrome still failed.
curl -v -H "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" -H "Accept-Encoding:gzip,deflat,sdcn" -H "Accept-Language:zh-CN,zh;q=0.8" -H "Cache-Control:max-age=0" -H "Connection:keep-alive" -H "Host:www.douban.com" -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36" http://www.douban.com/j/app/radio/people?app_name=radio_desktop_win&version=100&user_id=&expire=&token=&sid=&h=&channel=7&type=n
Below is what print out with -v from curl. Seems everything was identical with the request made by Chrome but still the response body is not correct.
GET /j/app/radio/people?app_name=radio_desktop_win HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,/;q=0.8
Accept-Encoding:gzip,deflat,sdcn
Accept-Language:zh-CN,zh;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Host:www.douban.com
Why this happened? Appreciate your help.
You need to put quotes around that url in the shell. Otherwise the &s are going to cause trouble.
Another common problem: you may be using an HTTP proxy with Chrome. If so, you need to tell curl about this proxy as well. You can do so by setting the environmental variable http_proxy.

Resources