How to wrap asynchronous and gen functions together in Tornado?
My code is below; the error is 'Future' object has no attribute 'body'.
Did I place the decorators the wrong way?
import tornado.httpclient
import tornado.httputil
import tornado.web
import tornado.gen
import tornado.httpserver
import tornado.ioloop
class Class1(tornado.web.RequestHandler):

    @tornado.web.asynchronous
    def post(self, *args, **kwargs):
        url = self.get_argument('url', None)
        response = self.json_fetch('POST', url, self.request.body)
        self.write(response.body)
        self.finish()

    @tornado.gen.engine
    def json_fetch(self, method, url, body=None, *args, **kwargs):
        client = tornado.httpclient.AsyncHTTPClient()
        headers = tornado.httputil.HTTPHeaders({"content-type": "application/json; charset=utf-8"})
        request = tornado.httpclient.HTTPRequest(url, method, headers, body)
        yield tornado.gen.Task(client.fetch, request)
You don't need "asynchronous" in this code example. "gen.engine" is obsolete, use "coroutine" instead. You don't generally need to use "gen.Task" much these days, either. Make four changes to your code:
Wrap "post" in "coroutine"
"yield" the result of self.json_fetch instead of using the result directly.
No need to call "finish" in a coroutine, Tornado finishes the response when a coroutine completes.
Wrap json_fetch in "coroutine", too.
The result:
class ClubCreateActivity(tornado.web.RequestHandler):

    @tornado.gen.coroutine
    def post(self, *args, **kwargs):
        url = self.get_argument('url', None)
        response = yield self.json_fetch('POST', url, self.request.body)
        self.write(response.body)

    @tornado.gen.coroutine
    def json_fetch(self, method, url, body=None, *args, **kwargs):
        client = tornado.httpclient.AsyncHTTPClient()
        headers = tornado.httputil.HTTPHeaders({"content-type": "application/json; charset=utf-8"})
        request = tornado.httpclient.HTTPRequest(url, method, headers, body)
        response = yield client.fetch(request)
        raise tornado.gen.Return(response)
Further reading:
The section on coroutines in the Tornado User's Guide
Tornado async request handlers
My article on refactoring coroutines
The method recommended in the official Tornado documentation is to use @tornado.gen.coroutine together with yield.
If you want to use both @tornado.web.asynchronous and the advantages of yield, you should nest the decorators: @tornado.web.asynchronous followed by @tornado.gen.engine.
The documentation on asynchronously calling your own functions, without an additional external callback function, is in Asynchronous and non-Blocking I/O.
You can make your json_fetch like this:
from tornado.concurrent import Future

def json_fetch(self, method, url, body=None, *args, **kwargs):
    http_client = tornado.httpclient.AsyncHTTPClient()
    my_future = Future()
    request = tornado.httpclient.HTTPRequest(url, method, body=body)
    fetch_future = http_client.fetch(request)
    # Complete my_future once the fetch finishes.
    fetch_future.add_done_callback(
        lambda f: my_future.set_result(f.result()))
    return my_future
Or like this (from A. Jesse Jiryu Davis's answer):
from tornado import gen

@gen.coroutine
def json_fetch(self, method, url, body=None, *args, **kwargs):
    http_client = tornado.httpclient.AsyncHTTPClient()
    headers = tornado.httputil.HTTPHeaders({"content-type": "application/json; charset=utf-8"})
    request = tornado.httpclient.HTTPRequest(url, method, headers, body)
    response = yield http_client.fetch(request)
    raise gen.Return(response)
* Wrap "post" in "gen.coroutine" and "yield" the call to json_fetch.
** "raise gen.Return(response)" is for Python 2 only; in Python 3.3 and later you should write "return response".
Thanks to A. Jesse Jiryu Davis for the link "Tornado async request handlers", where "Asynchronous and non-Blocking I/O" was found.
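The chaining trick in the Future-based variant above (complete an outer future when an inner one resolves) isn't Tornado-specific. Here is a minimal sketch of the same pattern using only stdlib asyncio; the names `chain` and `demo` are mine, not from any library:

```python
import asyncio

def chain(inner):
    # Same pattern as the Future-based json_fetch: create an outer
    # future and complete it when the inner future resolves.
    outer = asyncio.get_running_loop().create_future()
    inner.add_done_callback(lambda f: outer.set_result(f.result()))
    return outer

async def demo():
    loop = asyncio.get_running_loop()
    inner = loop.create_future()
    outer = chain(inner)
    inner.set_result("payload")  # resolves inner; the callback fills outer
    return await outer

result = asyncio.run(demo())
```

Awaiting `outer` suspends until the event loop runs the done-callback, exactly as a Tornado coroutine suspends on the returned Future.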
I want to be able to create a custom WebSocket object rather than using Starlette's so that I can add some more things in the constructor and add some more methods. In FastAPI, you're able to subclass the APIRoute and pass in your own Request object. How would I do the same for the WebSocket router?
As you say, there doesn't seem to be an easy way to set the websocket route class (short of a lot of subclassing and rewriting). I think the simplest way to do this would be to define your own wrapper class around the websocket, taking whatever extra data you want, and then define the methods you need. Then you can inject that as a dependency, either with a separate function, or use the class itself as a dependency, see the documentation for details, which is what I'm doing below.
I've put together a minimal example, where the URL parameter name is passed to the wrapper class:
# main.py
from fastapi import Depends, FastAPI, WebSocket

app = FastAPI()

class WsWrapper:
    def __init__(self, websocket: WebSocket, name: str) -> None:
        self.name = name
        self.websocket = websocket

    # You can define all your custom logic here, I'm just adding a print
    async def receive_json(self, mode: str = "text"):
        print(f"Hello from {self.name}", flush=True)
        return await self.websocket.receive_json(mode)

@app.websocket("/{name}")
async def websocket(ws: WsWrapper = Depends()):
    await ws.websocket.accept()
    while True:
        data = await ws.receive_json()
        print(data, flush=True)
You can test it by running uvicorn main:app and connecting to ws://localhost:8000/test, and it should print "Hello from test" when receiving JSON.
Ended up just monkeypatching the modules. Track this PR for when monkeypatching isn't necessary: https://github.com/tiangolo/fastapi/pull/4968
from typing import Callable
from fastapi import routing as fastapi_routing
from starlette._utils import is_async_callable
from starlette.concurrency import run_in_threadpool
from starlette.requests import Request as StarletteRequest
from starlette.websockets import WebSocket as StarletteWebSocket
from starlette.types import ASGIApp, Receive, Scope, Send
class Request(StarletteRequest):
    pass

class WebSocket(StarletteWebSocket):
    pass
def request_response(func: Callable) -> ASGIApp:
    """
    Takes a function or coroutine `func(request) -> response`,
    and returns an ASGI application.
    """
    is_coroutine = is_async_callable(func)

    async def app(scope: Scope, receive: Receive, send: Send) -> None:
        request = Request(scope, receive=receive, send=send)
        if is_coroutine:
            response = await func(request)
        else:
            response = await run_in_threadpool(func, request)
        await response(scope, receive, send)

    return app
fastapi_routing.request_response = request_response
def websocket_session(func: Callable) -> ASGIApp:
    """
    Takes a coroutine `func(session)`, and returns an ASGI application.
    """
    # assert asyncio.iscoroutinefunction(func), "WebSocket endpoints must be async"

    async def app(scope: Scope, receive: Receive, send: Send) -> None:
        session = WebSocket(scope, receive=receive, send=send)
        await func(session)

    return app

fastapi_routing.websocket_session = websocket_session
Say I have a function that takes some *args (or **kwargs??) for a http request and I want to input different arguments each time the function is called - something like:
def make_some_request(self, *args):
    response = requests.get(*args)
    return response
where *args might be e.g. url, headers and parameters in one case and url and timeout in another case. How can this be formatted to get the request to look like
response = requests.get(url, headers=headers, params=parameters)
on the first function call and
response = requests.get(url, timeout=timeout)
on the second function call?
I wondered if this was possible using *args or **kwargs but the format doesn't seem quite right with either.
You can do this with **kwargs:
import requests

class Client:
    def make_some_request(self, *args, **kwargs):
        response = requests.get(*args, **kwargs)
        return response

client = Client()
client.make_some_request("https://domain.tld/path?param=value", timeout=...)
client.make_some_request("https://domain.tld/path?param=value", headers=..., params=...)
*args will take care of passing the positional argument (the URL), whereas the named arguments (timeout, headers, params) will be passed to requests.get by **kwargs.
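The forwarding mechanics can be seen in isolation with a stub in place of requests.get (the stub and its names are mine, just for illustration):

```python
# Stand-in for requests.get: echoes back what it received so the
# argument forwarding is visible.
def fake_get(url, timeout=None, headers=None, params=None):
    return {"url": url, "timeout": timeout, "headers": headers, "params": params}

class Client:
    def make_some_request(self, *args, **kwargs):
        # *args collects positional arguments (the URL), **kwargs the
        # keyword arguments; both are forwarded unchanged.
        return fake_get(*args, **kwargs)

client = Client()
first = client.make_some_request("https://example.com", headers={"X-A": "1"}, params={"q": "x"})
second = client.make_some_request("https://example.com", timeout=5)
```

Each call site chooses its own keyword arguments, and the wrapper never needs to know which ones exist.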
Let's say I am building a simple chat app using gRPC with the following .proto:
service Chat {
    rpc SendChat(Message) returns (Void);
    rpc SubscribeToChats(Void) returns (stream Message);
}

message Void {}
message Message { string text = 1; }
The way I often see the servicer implemented (in examples) in Python is like this:
class Servicer(ChatServicer):
    def __init__(self):
        self.messages = []

    def SendChat(self, request, context):
        self.messages.append(request.text)
        return Void()

    def SubscribeToChats(self, request, context):
        while True:
            if len(self.messages) > 0:
                yield Message(text=self.messages.pop())
While this works, it seems very inefficient to spawn an infinite loop that continuously checks a condition for each connected client. It would be preferable to instead have something like this, where the send is triggered right as a message comes in and doesn't require any constant polling on a condition:
class Servicer(ChatServicer):
    def __init__(self):
        self.listeners = []

    def SendChat(self, request, context):
        for listener in self.listeners:
            listener(Message(text=request.text))
        return Void()

    def SubscribeToChats(self, request, context, callback):
        self.listeners.append(callback)
However, I can't seem to find a way to do something like this using gRPC.
I have the following questions:
Am I correct that an infinite loop is inefficient for a case like this? Or are there optimizations happening in the background that I'm not aware of?
Is there any efficient way to achieve something similar to my preferred solution above? It seems like a fairly common use case, so I'm sure there's something I'm missing.
Thanks in advance!
I figured out an efficient way to do it. The key is to use the AsyncIO API. Now my SubscribeToChats function can be an async generator, which makes things much easier.
Now I can use something like an asyncio Queue, which my function can await on in a while loop. Similar to this:
class Servicer(ChatServicer):
    def __init__(self):
        self.queue = asyncio.Queue()

    async def SendChat(self, request, context):
        await self.queue.put(request.text)
        return Void()

    async def SubscribeToChats(self, request, context):
        while True:
            yield Message(text=await self.queue.get())
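One caveat with a single shared asyncio.Queue: queue.get consumes the item, so each message reaches exactly one subscriber. To broadcast to every subscriber you'd want one queue per listener. A minimal stdlib-only sketch of that idea (the Broadcaster class and its names are mine, not part of the gRPC API):

```python
import asyncio

class Broadcaster:
    # One queue per subscriber, so every subscriber sees every message.
    def __init__(self):
        self.subscribers = []

    def subscribe(self):
        q = asyncio.Queue()
        self.subscribers.append(q)
        return q

    async def publish(self, text):
        for q in self.subscribers:
            await q.put(text)

async def demo():
    b = Broadcaster()
    q1, q2 = b.subscribe(), b.subscribe()
    await b.publish("hello")
    return await q1.get(), await q2.get()

received = asyncio.run(demo())
```

In the servicer, SubscribeToChats would call subscribe() once per client and loop over its own queue, while SendChat calls publish().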
I am using Scrapy to consume messages (urls) from RabbitMQ, but when I use yield to call the parse method, passing my url as a parameter, the program never enters the callback method. Below is the code of my spider:
# -*- coding: utf-8 -*-
import scrapy
import pika
from scrapy import cmdline
import json

class MydeletespiderSpider(scrapy.Spider):
    name = 'Mydeletespider'
    allowed_domains = []
    start_urls = []

    def callback(self, ch, method, properties, body):
        print(" [x] Received %r" % body)
        body = json.loads(body)
        url = body.get('url')
        yield scrapy.Request(url=url, callback=self.parse)

    def start_requests(self):
        cre = pika.PlainCredentials('test', 'test')
        connection = pika.BlockingConnection(
            pika.ConnectionParameters(host='10.0.12.103', port=5672, credentials=cre, socket_timeout=60))
        channel = connection.channel()
        channel.basic_consume(self.callback,
                              queue='Deletespider_Batch_Test',
                              no_ack=True)
        print(' [*] Waiting for messages. To exit press CTRL+C')
        channel.start_consuming()

    def parse(self, response):
        print(response.url)

cmdline.execute('scrapy crawl Mydeletespider'.split())
My goal is to pass the url response to parse method
To consume urls from rabbitmq you can take a look at scrapy-rabbitmq package:
Scrapy-rabbitmq is a tool that lets you feed and queue URLs from RabbitMQ via Scrapy spiders, using the Scrapy framework.
To enable it, set these values in your settings.py:
# Enables scheduling storing requests queue in rabbitmq.
SCHEDULER = "scrapy_rabbitmq.scheduler.Scheduler"
# Don't cleanup rabbitmq queues, allows to pause/resume crawls.
SCHEDULER_PERSIST = True
# Schedule requests using a priority queue. (default)
SCHEDULER_QUEUE_CLASS = 'scrapy_rabbitmq.queue.SpiderQueue'
# RabbitMQ Queue to use to store requests
RABBITMQ_QUEUE_NAME = 'scrapy_queue'
# Provide host and port to RabbitMQ daemon
RABBITMQ_CONNECTION_PARAMETERS = {'host': 'localhost', 'port': 6666}
# Bonus:
# Store scraped item in rabbitmq for post-processing.
# ITEM_PIPELINES = {
# 'scrapy_rabbitmq.pipelines.RabbitMQPipeline': 1
# }
And in your spider:
from scrapy import Spider
from scrapy_rabbitmq.spiders import RabbitMQMixin

class RabbitSpider(RabbitMQMixin, Spider):
    name = 'rabbitspider'

    def parse(self, response):
        # mixin will take urls from rabbit queue by itself
        pass
Refer to this: http://30daydo.com/article/512
start_requests(self) should return a generator; otherwise Scrapy won't work.
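To see why the original code fails: the decoding logic can be kept as a plain generator (the function name and sample payloads here are hypothetical), which start_requests would then drive, yielding scrapy.Request(url=url, callback=self.parse) for each url instead of handing control to channel.start_consuming():

```python
import json

def urls_from_messages(messages):
    # Pure generator: decode each RabbitMQ payload and yield the url
    # it carries, skipping messages without one.
    for body in messages:
        payload = json.loads(body)
        url = payload.get('url')
        if url:
            yield url

urls = list(urls_from_messages(['{"url": "http://example.com/a"}', '{}']))
```

Because start_requests itself must yield Request objects, a blocking consume loop inside it never returns control to Scrapy's engine, which is why the callback is never reached.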
I am building a rest API and am wondering both about HTTP best practices I guess, and how that would apply to the DRF. When sending a PUT request, in the body, do requests have all of the parameters for objects that they would be manipulating? Even if not all of them are changing? Or do they only send the fields that are being updated? So for example if we had a House object with No. of rooms and floors, and I was changing No. of rooms should I only send that parameter, or both of them?
If requests should only contain the fields that are being updated, then how would that translate to the Django REST Framework? Any help would be greatly appreciated!
My Views are:
class HouseDetail(generics.RetrieveUpdateDestroyAPIView):
    queryset = House.objects.all()
    serializer_class = HouseSerializer
and serializer is:
class HouseSerializer(serializers.ModelSerializer):
    quotes = serializers.PrimaryKeyRelatedField(many=True, read_only=True)

    class Meta:
        model = House
        fields = (
            'pk',
            'address',
            'quotes',
        )
PUT is for full resource updates, PATCH is for partial resource updates because Fielding says so.
Therefore, in your example, if you wanted to only update a No. of rooms for your HouseDetail model, you would send the following payload in a PATCH request:
{ "no. of rooms": "42" }
It is still possible to partially update things with a PUT request in DRF (explained below) but you should simply use PATCH for that because it was created for this. You get PUT and PATCH functionality for your model because you have subclassed the generics.RetrieveUpdateDestroyAPIView class and have defined a serializer.
If a required field is omitted in your PUT request (see documentation for required), then you will receive a 400 status code with a response like {"detail": "x field is required"}. However, when you request via PATCH, there is a partial=True argument passed to the serializer which allows this partial only update. We can see this for the UpdateModelMixin:
def partial_update(self, request, *args, **kwargs):
    kwargs['partial'] = True
    return self.update(request, *args, **kwargs)
which calls update, which is:
def update(self, request, *args, **kwargs):
    partial = kwargs.pop('partial', False)
    instance = self.get_object()
    serializer = self.get_serializer(instance, data=request.data, partial=partial)
    serializer.is_valid(raise_exception=True)
    self.perform_update(serializer)
    return Response(serializer.data)
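The effect of partial=True can be illustrated with a toy validator (a sketch of the behavior, not DRF's actual code; field names are from the question's House example):

```python
def validate(data, required_fields, partial=False):
    # With partial=False (PUT) every required field must be present;
    # with partial=True (PATCH) missing fields are simply skipped.
    return {f: "This field is required."
            for f in required_fields
            if f not in data and not partial}

# PUT-style: omitting 'floors' is an error.
put_errors = validate({"rooms": 3}, ["rooms", "floors"], partial=False)
# PATCH-style: the same payload validates cleanly.
patch_errors = validate({"rooms": 3}, ["rooms", "floors"], partial=True)
```

This mirrors what you see from the API: the PUT request returns a 400 with a "field is required" detail, while the PATCH request updates only the fields you sent.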