The FastAPI app that I am working on does not return a traceback when a request fails; instead, it returns 500 Internal Server Error with this error:
ERROR: ASGI callable returned without starting response.
2021-05-14 16:12:08 - uvicorn.error:409 - ERROR - ASGI callable returned without starting response.
Has anyone experienced this problem before and knows how to fix it?
Most probably the __call__ method of your middleware is not working correctly. I would advise checking a built-in middleware and comparing it with yours.
E.g.:
async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
    if scope["type"] != "http":
        await self.app(scope, receive, send)
        return
    request = Request(scope, receive=receive)
    response = await self.dispatch_func(request, self.call_next)
    await response(scope, receive, send)
My issue was caused by a custom middleware. I was able to update the middleware and fix the issue.
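For comparison, here is a minimal sketch of a custom pure-ASGI middleware (the class name and logic are hypothetical); the usual cause of this error is a code path in __call__ that returns without either delegating to the wrapped app or sending a response:
class MyMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        # ... custom logic here ...
        # Bug pattern: returning at this point without the await below
        # leaves the response unstarted and triggers the ASGI error above.
        await self.app(scope, receive, send)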
I need to fetch data from a REST API via HTTP GET on Apache Airflow (e.g. from https://something.com/api/data).
The data comes in pages with the following structure:
{
  "meta": {
    "size": 50,
    "currentPage": 3,
    "totalPage": 10
  },
  "data": [
    ....
  ]
}
The problem is, the API provider is not reliable. Sometimes we get a 504 Gateway Timeout, so I have to retry the API call until currentPage equals totalPage, retrying whenever we get a 504. However, the overall retry process must not exceed 15 minutes.
Is there any way I can achieve this using Apache Airflow?
Thanks
You could use the HTTP operator from the HTTP provider package. Check the examples and the guide in those links.
If you don't have it already, start by installing the provider package:
pip install apache-airflow-providers-http
Then you could try it out by sending requests to https://httpbin.org. To do so, create an HTTP connection with that URL as the host.
You can then create tasks using the SimpleHttpOperator:
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.http.operators.http import SimpleHttpOperator

with DAG(
    'example_http_operator',
    default_args={
        'retries': 1,
        'retry_delay': timedelta(minutes=5),
    },
    start_date=datetime(2021, 10, 9),
) as dag:

    task_get_op = SimpleHttpOperator(
        task_id='get_op',
        method='GET',
        endpoint='get',
        data={"param1": "value1", "param2": "value2"},
        headers={},
    )
By default, under the hood, this operator calls raise_for_status on the obtained response, so if the response status_code is not in the 1xx or 2xx range it raises an exception and the task is marked as failed. If you want to customize this behaviour, you can provide your own response_check as an argument to the SimpleHttpOperator:
:param response_check: A check against the 'requests' response object.
    The callable takes the response object as the first positional argument
    and optionally any number of keyword arguments available in the context dictionary.
    It should return True for 'pass' and False otherwise.
:type response_check: A lambda or defined function.
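For instance, here is a sketch of a response_check built around the pagination structure from the question (the function name, task id, and endpoint are placeholders):
def is_last_page(response):
    # Return False (failing the task, so Airflow retries it) until the
    # API reports that the current page is the final one.
    meta = response.json()["meta"]
    return meta["currentPage"] == meta["totalPage"]

task_get_data = SimpleHttpOperator(
    task_id='get_data',
    method='GET',
    endpoint='api/data',
    response_check=is_last_page,
)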
To handle retries on failure as needed, you could use the following parameters, available on any operator in Airflow (docs); a short sketch combining them follows this list:
retries (int) -- the number of retries that should be performed before failing the task
retry_delay (datetime.timedelta) -- delay between retries
retry_exponential_backoff (bool) -- allow progressive longer waits between retries by using exponential backoff algorithm on retry delay (delay will be converted into seconds)
max_retry_delay (datetime.timedelta) -- maximum delay interval between retries
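For example, to keep the overall retry window under the 15 minutes mentioned in the question, these parameters could be combined roughly like this (the task id, endpoint, and exact numbers are assumptions, not a prescription):
task_get_data = SimpleHttpOperator(
    task_id='get_data',
    method='GET',
    endpoint='api/data',
    # With exponential backoff the waits grow roughly 1, 2, 4, 4 minutes,
    # keeping four retries comfortably under a 15-minute total.
    retries=4,
    retry_delay=timedelta(minutes=1),
    retry_exponential_backoff=True,
    max_retry_delay=timedelta(minutes=4),
)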
Finally, to try out how everything works together, perform a request to an endpoint which answers with a specific error status code:
task_get_op = SimpleHttpOperator(
    task_id='get_op',
    method='GET',
    endpoint='status/400',  # response status code will be 400
    data={"param1": "value1", "param2": "value2"},
    headers={},
)
Let me know if that works for you!
I've asked here but thought I'd post on SO as well. Given this code:
local redis = require('resty.redis')
local client = redis:new()
client:connect(host, port)

ngx.thread.spawn(function()
    ngx.say(ngx.time(), ' ', #client:keys('*'))
end)

ngx.timer.at(2, function()
    ngx.say(ngx.time(), ' ', #client:keys('*'))
end)
I get this error:
---urhcdhw2pqoz---
1611628086 5
2021/01/26 10:28:08 [error] 4902#24159: *4 lua entry thread aborted: runtime error: ...local/Cellar/openresty/1.19.3.1_1/lualib/resty/redis.lua:349: bad request
stack traceback:
coroutine 0:
[C]: in function 'send'
...local/Cellar/openresty/1.19.3.1_1/lualib/resty/redis.lua:349: in function 'keys'
./src/main.lua:20: in function <./src/main.lua:19>, context: ngx.timer
so it seems that threads work with redis but timers don't. Why is that?
There are two errors in your code.
It is not possible to pass the cosocket object between Lua handlers (emphasis added by me):
The cosocket object created by this API function has exactly the same lifetime as the Lua handler creating it. So never pass the cosocket object to any other Lua handler (including ngx.timer callback functions) and never share the cosocket object between different Nginx requests.
https://github.com/openresty/lua-nginx-module#ngxsockettcp
In your case, the reference to the cosocket object is stored in the client table (client._sock).
ngx.print/ngx.say are not available in the ngx.timer.* context.
https://github.com/openresty/lua-nginx-module#ngxsay (check the context: section).
You can use ngx.log instead (it writes to the nginx log; set error_log stderr debug; in nginx.conf to print logs to stderr).
The following code works as expected:
ngx.timer.at(2, function()
    local client = redis:new()
    client:connect('127.0.0.1', 6379)
    ngx.log(ngx.DEBUG, #client:keys('*'))
end)
I am trying to achieve Airflow integration with Slack. I have received the webhook URL and created the connection as below. Why is it showing google.com?
Why is it using the default http_conn_id and connecting to Google?
But I got an error, as below:
ERROR - Error in sending a message to Slack channel #airflow-alerts
by Airflow: 404:Not Found
{base_hook.py:83} INFO - Using connection to: id: http_default. Host: https://www.google.com/, Port: None, Schema: None, Login: None, Password: None, extra: {}
{logging_mixin.py:95} INFO - [2020-05-29 12:43:21,374] {http_hook.py:128} INFO - Sending 'POST' to url: https://www.google.com//T00A6ASFHD8S/G1FDF4K/a3zfKsadfsrScxgadfsdafOIgIvgW
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://www.google.com//T00A6ASFHD8S/G1FDF4K/a3zfKsadfsrScxgadfsdafOIgIvgW
I am unable to figure out what is going wrong.
Your connection is not set up correctly: you need to select HTTP as the Conn Type, leave the Extra field blank, and put the webhook token (format is /STRING/STRING/STRING) in the Password field. Then you can use the SlackWebhookOperator, which allows you to set the channel and username.
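A minimal sketch of such a task (import path as in Airflow 1.10.x; the connection id, channel, and token below are placeholders):
from airflow.contrib.operators.slack_webhook_operator import SlackWebhookOperator

notify_slack = SlackWebhookOperator(
    task_id='notify_slack',
    http_conn_id='slack_webhook',  # HTTP connection, host https://hooks.slack.com/services
    webhook_token='/STRING/STRING/STRING',  # the token stored in the Password field
    message='Airflow alert',
    channel='#airflow-alerts',
    username='airflow',
)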
I finally figured it out after a long struggle...
There was a bug in SlackWebhookOperator in Airflow≤1.10.3 (Bug Jira Issue). This was fixed in 1.10.4 with this PR (fix commit).
I wrote the following OCaml code in order to make POST requests to an https server.
open Lwt
open Cohttp
open Cohttp_lwt_unix

try
  let headers =
    Header.init ()
    |> fun h -> Header.add h "content-type" "application/json"
  in
  let body =
    Client.post
      ~headers
      ~body:(body_of_credentials nickname secret)
      (uri_of_project project)
    >>= fun (response, body) ->
    let code = response |> Response.status |> Code.code_of_status in
    Printf.printf "Response code: %d\n" code;
    Printf.printf "Headers: %s\n" (response |> Response.headers |> Header.to_string);
    body |> Cohttp_lwt.Body.to_string >|= fun body ->
    Printf.printf "Body of length: %d\n" (String.length body);
    body
  in
  let body = Lwt_main.run body in
  print_endline ("Received body\n" ^ body)
with
| Tls_lwt.Tls_alert e ->
  print_endline (Tls.Packet.alert_type_to_string e);
  exit 1
But when executing it with CONDUIT_TLS=native CONDUIT_DEBUG=true COHTTP_DEBUG=true I get the following response:
Selected TLS library: Native
Resolver system: https://my_server/auth/tokens/ ((name https) (port 443) (tls true)) -> (TCP ((V4 xx.xxx.xx.xxx) 443))
HANDSHAKE_FAILURE
I've read all the Google results (documentation, ocaml/cohttp/ocaml-tls lists, and Stack Overflow questions) I could find about it, but nothing helps, so I would like to start from scratch here.
How can I get more details about this failure?
In case it helps, I'm using the following opam configuration:
"lwt" {>= "4.1.0"}
"cohttp" {>= "1.1.1"}
"cohttp-lwt"
"cohttp-lwt-unix" {>= "1.1.1"}
"tls" {>= "0.9.2"}
EDIT:
As suggested by @ivg, I tried with CONDUIT_TLS=openssl, but then I got the following error:
Selected TLS library: OpenSSL
Resolver system: https://my_server/auth/tokens/ ((name https) (port 443) (tls true)) -> (TCP ((V4 xx.xxx.xx.xxx) 443))
...: internal error, uncaught exception:
SSL: Method error
EDIT²:
As suggested in the following discussion: github.com/savonet/ocaml-ssl/issues/40, I pinned ssl to 0.5.5 (opam pin add ssl 0.5.5) to fix this error. Now I am able to post requests to my https server, but not with the pure OCaml implementation of TLS.
You're getting this alert because the handshaking process (authentication) has failed. This alert means that the peer is not authenticated (either the server or the client), and thus a secure connection could not be established.
To debug the issue, I would suggest first ensuring that everything works fine with conventional tools, e.g., openssl, wget, or curl.
If you're sure that your configuration is fine and this is a problem on the ocaml-tls side, I would suggest using the low-level interface of the ocaml-tls library. Apparently, conduit doesn't expose or use any of the tracing capabilities of ocaml-tls, so there is no other choice.
The alert type is projected from two much richer fatal and error types that carry far more information about the nature of the problem; cf. the code that creates an alert, and consider the possible input values that lead to the HANDSHAKE_FAILURE alert.
To access the particular error or alert, I would suggest using the tracing capabilities of ocaml-tls. There are examples in the source repository that already enable tracing and should provide sufficient information. I hope those examples fit your use case.
So I get status 503 when I crawl. It's retried, but then it gets ignored. I want it to be marked as an error, not ignored. How do I do that?
I prefer to set this in settings.py so it applies to all of my spiders; handle_httpstatus_list seems to only affect one spider.
There are two settings that you should look into:
RETRY_HTTP_CODES:
Default: [500, 502, 503, 504, 408]
Which HTTP response codes to retry. Other errors (DNS lookup issues, connections lost, etc) are always retried.
https://doc.scrapy.org/en/latest/topics/downloader-middleware.html#retry-http-codes
And HTTPERROR_ALLOWED_CODES:
Default: []
Pass all responses with non-200 status codes contained in this list.
https://doc.scrapy.org/en/latest/topics/spider-middleware.html#std:setting-HTTPERROR_ALLOWED_CODES
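For example, in settings.py (a sketch; adjust both lists to your needs):
# Keep retrying 503s; once retries are exhausted, pass the 503 response
# through to the spider instead of dropping it silently.
RETRY_HTTP_CODES = [500, 502, 503, 504, 408]
HTTPERROR_ALLOWED_CODES = [503]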
In the end, I overrode the retry middleware with just a small change: whenever the scraper gives up retrying something, no matter what the status code is, it gets logged as an error.
It seems Scrapy somehow doesn't treat giving up retrying as an error, which is weird to me.
This is the middleware if anyone wants to use it. Don't forget to activate it in settings.py, as shown after the code.
import logging

from scrapy.downloadermiddlewares.retry import RetryMiddleware

logger = logging.getLogger(__name__)


class Retry500Middleware(RetryMiddleware):

    def _retry(self, request, reason, spider):
        retries = request.meta.get('retry_times', 0) + 1
        if retries <= self.max_retry_times:
            logger.debug("Retrying %(request)s (failed %(retries)d times): %(reason)s",
                         {'request': request, 'retries': retries, 'reason': reason},
                         extra={'spider': spider})
            retryreq = request.copy()
            retryreq.meta['retry_times'] = retries
            retryreq.dont_filter = True
            retryreq.priority = request.priority + self.priority_adjust
            return retryreq
        else:
            # This is the point where I updated it: it used to be
            # `logger.debug` instead of `logger.error`.
            logger.error("Gave up retrying %(request)s (failed %(retries)d times): %(reason)s",
                         {'request': request, 'retries': retries, 'reason': reason},
                         extra={'spider': spider})
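Activation in settings.py might look like this (the module path myproject.middlewares is a placeholder; 550 is the slot the built-in retry middleware normally occupies):
DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.retry.RetryMiddleware': None,
    'myproject.middlewares.Retry500Middleware': 550,
}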