Asynchronous Pool of Connections in Tornado with multiple processes

I'm using both the Tornado 4.2.1 and tornadoes 2.4.1 libraries to query my Elasticsearch database, and I'm looking for a way to initialize a pool of connections to be shared between several RequestHandler instances in a multi-process service.
Is it possible to do that? Are there specific libraries for Tornado to do that?
Thanks in advance

Since tornado-es is just an HTTP client, it uses AsyncHTTPClient in ESConnection. A new TCP connection is made for every request unless the Connection: keep-alive header is specified.
conn = ESConnection()
conn.httprequest_kwargs['headers'] = {'Connection': 'keep-alive'}
I haven't tested this, but it should work. I used a similar setup in Ruby (with the patron HTTP client), and it works well.
Next thing
AsyncHTTPClient has a limit on the number of simultaneous requests (fetch) per IOLoop. Every request beyond that limit is simply queued internally.
You may want to increase the global limit:
AsyncHTTPClient.configure(None, max_clients=50)
or separate the client with its own limit (force_instance):
from tornadoes import ESConnection
from tornado.httpclient import AsyncHTTPClient

class CustomESConnection(ESConnection):
    def __init__(self, host='localhost', port='9200', io_loop=None, protocol='http', max_clients=20):
        super(CustomESConnection, self).__init__(host, port, io_loop, protocol)
        # force_instance=True gives this connection its own client and its own limit
        self.client = AsyncHTTPClient(force_instance=True, max_clients=max_clients)
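The queuing behaviour described above can be illustrated without Tornado. The following is a minimal sketch using plain asyncio and a semaphore as a stand-in for the max_clients limit; it is not how AsyncHTTPClient is implemented, and fetch, run and the sleep that stands in for the HTTP round trip are all assumptions made for the illustration.

```python
import asyncio

async def fetch(name, limit, active, peak):
    # Requests beyond the limit wait here, i.e. they are "queued".
    async with limit:
        active[0] += 1
        peak[0] = max(peak[0], active[0])
        await asyncio.sleep(0.01)  # stand-in for the HTTP round trip
        active[0] -= 1
    return name

async def run(max_clients, n_requests):
    limit = asyncio.Semaphore(max_clients)
    active, peak = [0], [0]  # mutable counters shared by all tasks
    results = await asyncio.gather(
        *(fetch(i, limit, active, peak) for i in range(n_requests))
    )
    return results, peak[0]

# Five requests are issued, but at most two are ever in flight.
results, peak = asyncio.run(run(max_clients=2, n_requests=5))
```

All five requests complete, while the peak counter shows that no more than max_clients of them ran at the same time.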
And finally
To reuse the same ESConnection, you can create it in the Application, since the application is available from every request (RequestHandler).
import tornado.ioloop
from tornado import gen
from tornado.web import Application, RequestHandler
from tornadoes import ESConnection

class MainHandler(RequestHandler):
    @gen.coroutine  # needed so that `yield` waits on the returned future
    def get(self):
        yield self.application.es.search('something')

class MyApp(Application):
    def __init__(self, *args, **kwargs):
        super(MyApp, self).__init__(*args, **kwargs)
        # one shared connection, reachable from every handler
        self.es = ESConnection()

if __name__ == "__main__":
    application = MyApp([
        (r"/", MainHandler),
    ])
    application.listen(8888)
    tornado.ioloop.IOLoop.current().start()
Multiprocess
Actually, there is no easy way. The common approach is a pooler, which is mostly used when a persistent connection is needed, as with databases (pgbouncer for Postgres), or as an optimization for a high-load service.
So you will have to write a pooler: a gateway application in front of Elasticsearch.
subprocess1
           \ (http, zmq, ...)
            \
             > pooler (some queue and tornadoes api) - http -> elasticsearch
            /
           /
subprocess2
The subprocesses could communicate with the pooler via HTTP, ØMQ (there are many examples, even of poolers) or some implementation of IPC (sockets, ...).
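A minimal sketch of the gateway idea, under loud assumptions: threads stand in for the subprocesses, queue.Queue stands in for the HTTP/ØMQ transport, and fake_es_search is a hypothetical placeholder for the real tornadoes call. The point is only the shape: workers never talk to the backend themselves, they hand requests to the single pooler and wait for its reply.

```python
import queue
import threading

def fake_es_search(term):
    # Hypothetical stand-in for a tornadoes/Elasticsearch query.
    return "results for %s" % term

def pooler(requests):
    # The pooler is the only component that talks to the backend.
    while True:
        item = requests.get()
        if item is None:  # shutdown sentinel
            break
        term, reply = item
        reply.put(fake_es_search(term))

def worker(requests, term, out):
    reply = queue.Queue(maxsize=1)
    requests.put((term, reply))  # hand the query to the pooler
    out[term] = reply.get()      # block until the pooler answers

requests = queue.Queue()
pool_thread = threading.Thread(target=pooler, args=(requests,))
pool_thread.start()

out = {}
workers = [threading.Thread(target=worker, args=(requests, t, out))
           for t in ("a", "b")]
for w in workers:
    w.start()
for w in workers:
    w.join()

requests.put(None)  # stop the pooler
pool_thread.join()
```

In a real deployment the queue would be replaced by a network transport, and the pooler would be a separate process holding the persistent connections.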

Related

HTTP Client in DoFn

I would like to make POST requests from a DoFn in an Apache Beam pipeline running on Dataflow.
For that, I have created a client which instantiates a CloseableHttpClient configured with a PoolingHttpClientConnectionManager.
However, I instantiate a client for each element that I process.
How could I set up a persistent client used by all my elements?
And is there another class for parallel, high-speed HTTP requests that I should use?
You can put the client into a member variable, use the @Setup method to open it, and @Teardown to close it. The implementation of almost all IOs in Beam uses this pattern; see JdbcIO, for example.
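The lifecycle described above can be sketched in plain Python. This is only an illustration of the pattern, not Beam code: FakeHttpClient, PostFn and run_bundle are hypothetical names, and run_bundle stands in for what a runner does with a single DoFn instance.

```python
class FakeHttpClient:
    """Hypothetical stand-in for a pooled HTTP client."""
    instances = 0

    def __init__(self):
        FakeHttpClient.instances += 1
        self.closed = False

    def post(self, payload):
        return "posted:%s" % payload

    def close(self):
        self.closed = True


class PostFn:
    """Mimics the lifecycle: setup once, process per element, teardown once."""

    def setup(self):
        # Open the expensive resource once per instance.
        self.client = FakeHttpClient()

    def process(self, element):
        # Every element reuses the same client.
        return self.client.post(element)

    def teardown(self):
        self.client.close()


def run_bundle(fn, elements):
    # Stand-in for the runner driving one function instance.
    fn.setup()
    try:
        return [fn.process(e) for e in elements]
    finally:
        fn.teardown()


fn = PostFn()
outputs = run_bundle(fn, ["a", "b", "c"])
```

Three elements are processed, but only one client is ever created, and it is closed when the instance is torn down.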

PySide - QTimer triggered socket.socket.recv() hangs

I've built a simple PySide GUI application that is controlled by a QObject-derived class. The class contains methods to connect to a TCP port and receive data. My plan is to have a QTimer trigger a socket_read() method that reads any input from the TCP socket and processes the data without blocking the main application.
So far, the main app connects to a TCP port successfully and a QTimer triggers socket_read(). When socket_read() contains only simple print statements, the application stays responsive and everything is fine. But when I use socket.socket.recv() inside the method, the method keeps receiving data in a loop and running its code while the rest of the app freezes.
Does anybody know what I'm doing wrong, or what I need to do to keep the app responsive in this situation?
from PySide import QtGui, QtCore
import socket

class AppController(QtCore.QObject):
    def __init__(self, main_window):
        QtCore.QObject.__init__(self)
        self.main_window = main_window  # used below for console output
        # socket
        self.socket_connection = None
        self.socket_listen_timer = QtCore.QTimer(self)
        self.socket_listen_timer.timeout.connect(self.socket_read)

    def socket_connect(self, port):
        print 'connecting'
        host_name = 'localhost'
        socket_connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        socket_connection.bind((host_name, port))
        socket_connection.listen(1)
        self.socket_connection, addr = socket_connection.accept()
        self.main_window.console_info('Connected to {0}'.format(addr))
        self.socket_listen_timer.start(1000)

    def socket_disconnect(self):
        self.main_window.console_info('Disconnecting')
        self.socket_listen_timer.stop()
        if self.socket_connection is not None:
            self.socket_connection.close()

    def socket_read(self):
        print 'test'
        data = self.socket_connection.recv(1024)
        do_something_with_data(data)
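One likely cause is that recv() blocks until data arrives, and a QTimer callback runs on the GUI thread, so the event loop stalls while it waits. A common way around this is to poll the socket with select and call recv() only when data is ready. Below is a minimal sketch, independent of PySide; read_available is a hypothetical helper name, and socketpair() stands in for the accepted connection.

```python
import select
import socket

def read_available(conn, bufsize=1024, timeout=0.0):
    """Return pending bytes, or None if nothing is ready within `timeout`."""
    # With timeout 0 this only polls and never blocks the caller.
    readable, _, _ = select.select([conn], [], [], timeout)
    if not readable:
        return None
    return conn.recv(bufsize)  # b"" here means the peer closed the socket

a, b = socket.socketpair()

first = read_available(a)                 # nothing has been sent yet
b.sendall(b"hello")
second = read_available(a, timeout=1.0)   # data is now pending

a.close()
b.close()
```

In the timer slot one would call read_available(self.socket_connection) and return immediately when it yields None, so control goes straight back to the event loop.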

Streaming Zope HTTP responses with proxy views

I am using the following Plone + urllib2 code to proxy responses from another server through a BrowserView:
req = urllib2.Request(full_url)
try:
    # Important, or if the remote server is slow
    # all our web server threads get stuck here.
    # But this is UGLY, as Python does not provide per-thread
    # or per-socket timeouts through urllib
    original_timeout = socket.getdefaulttimeout()
    try:
        socket.setdefaulttimeout(10)
        response = urllib2.urlopen(req)
    finally:
        # restore the original timeout
        socket.setdefaulttimeout(original_timeout)
    # XXX: How to stream the response through Zope?
    # AFAIK we cannot do it currently
    return response.read()
My question is: how can I make this function non-blocking, so that it starts streaming the proxied response through Zope as soon as the first bytes arrive? What interfaces, objects or patterns are used to make streamable Zope responses?
I think there are two ways you can do this. Firstly, the Zope response itself is file-like so you can use the response's write() method to write successive chunks of data to the response as they come in. Here's an example where I use a Zope response as a file-like object for a csv.writer.
Or you can use ZPublisher's IStreamIterators and wrap the response in a ZPublisher.Iterators.filestream_iterator wrapper and return the wrapper.
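The first option, writing successive chunks as they arrive, can be sketched as follows. This is only an illustration under assumptions: both ends are io.BytesIO stand-ins (one for the object returned by urllib2.urlopen, one for the Zope response), and stream_through is a hypothetical helper name; the only thing it relies on is a read() method on one side and a write() method on the other.

```python
import io

def stream_through(upstream, response, chunk_size=8192):
    """Copy upstream to response chunk by chunk; return the bytes copied."""
    total = 0
    while True:
        chunk = upstream.read(chunk_size)
        if not chunk:  # empty read means the upstream is exhausted
            break
        response.write(chunk)  # forward each chunk as soon as it is read
        total += len(chunk)
    return total

upstream = io.BytesIO(b"x" * 20000)  # stand-in for the remote response
response = io.BytesIO()              # stand-in for the Zope response
copied = stream_through(upstream, response)
```

The proxied body is never held in memory as a whole; only one chunk at a time passes through the view.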
This should actually be a comment, but I don't have the reputation yet.
I am trying to do the same thing as you Mikko, and RESPONSE.write() does exactly that, as Ross said it would. Note however that the bytes won't actually leave the interface until there's 64K of them (or connection closes). Flushing stdout won't help so it seems that you will have to interfere further down with the socket to promptly send a few bytes right away.

How to write a simple HTTP server in Haskell using Network.HTTP.receiveHTTP

The module Network.HTTP exposes the functions receiveHTTP and respondHTTP, which I'd like to use for a very basic web server. I wrote a stub that just waits for clients:
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Network.HTTP
import Network
import Control.Monad

main = withSocketsDo $ do
  socket <- listenOn $ PortNumber 8080
  forever $ do
    (handle, host, port) <- accept socket
    print (host, port)
Here accept gives me a Handle, and now I can't figure out how to use a Handle with receiveHTTP.
I found an example with Google, but it is from 2008 and does not work anymore. And I was not able to port it.
Any ideas?
You can do this, but I really think you shouldn't. HTTP can act as a server, but is designed to be used client side. I Googled a little and I can't find any examples of someone actually using respondHTTP. If you're doing client side HTTP in 2016 use http-conduit. On the server side, warp or something that depends upon it is probably what you want.
Nevertheless, here's the code.
#!/usr/bin/env stack
-- stack --resolver lts-6.3 --install-ghc runghc --package HTTP
{-# LANGUAGE RecordWildCards #-}

import Control.Monad
import qualified Data.ByteString as B
import Network.HTTP
import Network.Socket
import Network.URI

main = do
  lsock <- socket AF_INET Stream defaultProtocol
  bind lsock (SockAddrInet 8080 iNADDR_ANY)
  listen lsock 1
  forever $ do
    (csock, _) <- accept lsock
    hs <- socketConnection "" 8080 csock
    req <- receiveHTTP hs
    case req of
      Left _ -> error "Receiving request failed"
      Right (Request {..}) -> if uriPath rqURI == "/"
        then do
          respondHTTP hs $
            Response (2,0,0) "OK" [] "Hello HTTP"
          Network.HTTP.close hs
        else do
          respondHTTP hs $
            Response (4,0,4) "Not found" [] "Nothing here"
          Network.HTTP.close hs
The above uses Stack's support for shebang scripts. chmod +x it or run it with stack foo.hs.
The Network module is deprecated. Always use Network.Socket if you need a socket API. For something higher level, use connection.
You do the normal POSIX socket thing, then convert the connected socket to a HandleStream with socketConnection and run respondHTTP and receiveHTTP on it. socketConnection is a weird function. The first two parameters are a hostname and a port which AFAICT aren't used when running a server.
I used the RecordWildCards extension. It lets me write Right (Request {..}) in a pattern and have all the fields of the record in scope on the right hand side.
Perhaps it expects you to use the accept function from Network.Socket instead of Network? That gives you a Socket instead of a Handle, which you should be able to convert to a form that receiveHTTP can use.
Normally a Handle would be nicer to work with directly, but here the HTTP library is taking care of it for you so it expects the lower-level interface instead.
EDIT: After looking at it a bit further, it seems the socketConnection function in Network.TCP does what you need. The funny part is it's actually making the socket into a Handle internally before it reads from it, but doesn't seem to provide a way to read from an externally-provided Handle. The string parameter to the function is supposed to be the name of the remote host, but it looks like that's merely kept for reference; it's not actually initiating a connection to that host or anything.

Automated naming of AF_UNIX local datagram sockets?

I'm implementing a simple service using datagrams over unix local sockets (AF_UNIX address family, i.e. not UDP). The server is bound to a public address, and it receives requests just fine. Unfortunately, when it comes to answering back, sendto fails unless the client is bound too. (the common error is Transport endpoint is not connected).
Binding to some random name (filesystem-based or abstract) works. But I'd like to avoid that: who am I to guarantee the names I picked won't collide?
The documentation for unix sockets in stream mode tells us that an abstract name will be assigned to them at connect time if they don't have one already. Is such a feature available for datagram-oriented sockets?
The unix(7) man page I referenced had this information about autobind UNIX sockets:
If a bind(2) call specifies addrlen as sizeof(sa_family_t), or the SO_PASSCRED socket option was specified for a socket that was not explicitly bound to an address, then the socket is autobound to an abstract address.
This is why the Linux kernel checks the address length is equal to sizeof(short) because sa_family_t is a short. The other unix(7) man page referenced by Rob's great answer says that client sockets are always autobound on connect, but because SOCK_DGRAM sockets are connectionless (despite calling connect on them) I believe this only applies to SOCK_STREAM sockets.
Also note that when supplying your own abstract namespace socket names, the socket's address in this namespace is given by the additional bytes in sun_path that are covered by the specified length of the address structure.
struct sockaddr_un me;
const char name[] = "\0myabstractsocket";
me.sun_family = AF_UNIX;
// size-1 because abstract socket names are not null terminated
memcpy(me.sun_path, name, sizeof(name) - 1);
int result = bind(fd, (void*)&me, sizeof(me.sun_family) + sizeof(name) - 1);
sendto() should likewise limit the address length, and not pass sizeof(sockaddr_un).
I assume that you are running Linux; I don't know if this advice applies to SunOS or any UNIX.
First, the answer: after the socket() and before the connect() or first sendto(), try adding this code:
struct sockaddr_un me;
me.sun_family = AF_UNIX;
int result = bind(fd, (void*)&me, sizeof(short));
Now, the explanation: the unix(7) man page says this:
When a socket is connected and it doesn't already have a local address a unique address in the abstract namespace will be generated automatically.
Sadly, the man page lies.
Examining the Linux source code, we see that unix_dgram_connect() only calls unix_autobind() if SOCK_PASSCRED is set in the socket flags. Since I don't know what SOCK_PASSCRED is, and it is now 1:00AM, I need to look for another solution.
Examining unix_bind, I notice that unix_bind calls unix_autobind if the passed-in size is equal to "sizeof(short)". Thus, the solution above.
Good luck, and good morning.
Rob
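The same autobind trick can be tried from Python. This is a minimal sketch assuming Linux and a recent CPython, where binding an AF_UNIX socket to an empty path is passed to bind(2) with an address length of sizeof(sa_family_t); the server name demo-autobind-<pid> is made up for the example.

```python
import os
import socket

# Server: bound to a known abstract name (leading NUL byte, Linux only).
server_name = b"\0demo-autobind-%d" % os.getpid()
server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
server.bind(server_name)

# Client: the empty path asks the kernel to pick an abstract name for us.
client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
client.bind("")
client_name = client.getsockname()  # abstract names come back as bytes

client.sendto(b"ping", server_name)
data, addr = server.recvfrom(1024)
server.sendto(b"pong", addr)  # works because the client now has an address
reply = client.recv(1024)

client.close()
server.close()
```

The address the server sees in recvfrom() is the name the kernel chose for the client, so the reply can be sent without the client ever picking a name itself.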
A bit of a late response, but for whoever finds this using Google as I did: Rob Adam's answer helped me get the 'real' answer to this. Simply use setsockopt (level SOL_SOCKET, see man 7 unix) to set SO_PASSCRED to 1. No need for a silly bind.
I used this in PHP, but it doesn't have SO_PASSCRED defined (stupid PHP). It does still work, though, if you define it yourself. On my computer it has the value of 16, and I reckon that it will work quite portably.
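For comparison, here is a minimal sketch of the SO_PASSCRED variant in Python, assuming Linux, where the socket module exposes the SO_PASSCRED constant. The client never calls bind(); the server name demo-passcred-<pid> is made up for the example.

```python
import os
import socket

# Server: bound to a known abstract name (leading NUL byte, Linux only).
server_name = b"\0demo-passcred-%d" % os.getpid()
server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
server.bind(server_name)

# Client: no bind(); SO_PASSCRED makes the kernel autobind on first send.
client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
client.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)

client.sendto(b"ping", server_name)
data, addr = server.recvfrom(1024)  # addr is the autobound abstract name
server.sendto(b"pong", addr)
reply = client.recv(1024)

client.close()
server.close()
```

Without the setsockopt call, the server would receive the datagram with no sender address and would have nowhere to send the reply.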
I'm not so sure I understand your question completely, but here is a datagram implementation of an echo server I just wrote. You can see the server is responding to the client on the same IP/PORT it was sent from.
Here's the code
First, the server (listener)
from socket import *
import time

class Listener:
    def __init__(self, port):
        self.port = port
        self.buffer = 102400

    def listen(self):
        sock = socket(AF_INET, SOCK_DGRAM)
        sock.bind(('', self.port))
        while 1:
            data, addr = sock.recvfrom(self.buffer)
            print "Received: " + data
            print "sending to %s" % addr[0]
            print "sending data %s" % data
            time.sleep(0.25)
            #print addr # will tell you what IP address the request came from and port
            sock.sendto(data, (addr[0], addr[1]))
            print "sent"
        sock.close()

if __name__ == "__main__":
    l = Listener(1975)
    l.listen()
And now, the Client (sender) which receives the response from the Listener
from socket import *
from time import sleep

class Sender:
    def __init__(self, server):
        self.port = 1975
        self.server = server
        self.buffer = 102400

    def sendPacket(self, packet):
        sock = socket(AF_INET, SOCK_DGRAM)
        sock.settimeout(10.75)
        sock.sendto(packet, (self.server, int(self.port)))
        while 1:
            print "waiting for response"
            data, addr = sock.recvfrom(self.buffer)
            sock.close()
            return data

if __name__ == "__main__":
    s = Sender("127.0.0.1")
    response = s.sendPacket("Hello, world!")
    print response

Resources