Minimal example of HTTP server doing asynchronous database queries?


I'm playing with different asynchronous HTTP servers to see how they can handle multiple simultaneous connections. To force a time-consuming I/O operation I use the pg_sleep PostgreSQL function to emulate a time-consuming database query. Here is for instance what I did with Node.js:
var http = require('http');
var pg = require('pg');
var conString = "postgres://al:al@localhost/al";
/* SQL query that takes a long time to complete */
var slowQuery = 'SELECT 42 as number, pg_sleep(0.300);';
var server = http.createServer(function(req, res) {
pg.connect(conString, function(err, client, done) {
client.query(slowQuery, [], function(err, result) {
done();
res.writeHead(200, {'content-type': 'text/plain'});
res.end("Result: " + result.rows[0].number);
});
});
})
console.log("Serve http://127.0.0.1:3001/")
server.listen(3001)
So this is a very simple request handler that runs an SQL query taking 300ms and returns a response. When I benchmark it, I get the following results:
$ ab -n 20 -c 10 http://127.0.0.1:3001/
Time taken for tests: 0.678 seconds
Complete requests: 20
Requests per second: 29.49 [#/sec] (mean)
Time per request: 339.116 [ms] (mean)
This shows clearly that requests are executed in parallel. Each request takes 300ms to complete and because we have 2 batches of 10 requests executed in parallel, it takes 600ms overall.
Now I'm trying to do the same with Elixir, since I heard it does asynchronous I/O transparently. Here is my naive approach:
defmodule Toto do
import Plug.Conn
def init(options) do
{:ok, pid} = Postgrex.Connection.start_link(
username: "al", password: "al", database: "al")
options ++ [pid: pid]
end
def call(conn, opts) do
sql = "SELECT 42, pg_sleep(0.300);"
result = Postgrex.Connection.query!(opts[:pid], sql, [])
[{value, _}] = result.rows
conn
|> put_resp_content_type("text/plain")
|> send_resp(200, "Result: #{value}")
end
end
In case it is relevant, here is my supervisor:
defmodule Toto.Supervisor do
use Application
def start(type, args) do
import Supervisor.Spec, warn: false
children = [
worker(Plug.Adapters.Cowboy, [Toto, []], function: :http),
]
opts = [strategy: :one_for_one, name: Toto.Supervisor]
Supervisor.start_link(children, opts)
end
end
Unfortunately, this doesn't give me the result I was hoping for:
$ ab -n 20 -c 10 http://127.0.0.1:4000/
Time taken for tests: 6.056 seconds
Requests per second: 3.30 [#/sec] (mean)
Time per request: 3028.038 [ms] (mean)
It looks like there's no parallelism: requests are handled one after the other. What am I doing wrong?

Elixir should be completely fine with this setup. The difference is that your Node.js code creates a connection to the database for every request. In your Elixir code, however, init is called once (not per request!), so you end up with a single process that sends the queries for all requests to Postgres, and that process becomes your bottleneck.
The easiest solution would be to move the connection to Postgres out of init and into call. However, I would advise you to use Ecto which will set up a connection pool to the database too. You can also play with the pool configuration for optimal results.
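As a rough sanity check of this explanation, the benchmark figures in the question can be reproduced with a back-of-the-envelope model. The sketch below is only an illustration: the function name and the "waves" model are assumptions, and it ignores connection setup and HTTP overhead.

```python
import math


def estimated_total_seconds(requests: int, concurrency: int,
                            db_connections: int, query_seconds: float) -> float:
    """Rough estimate: queries run in waves limited by the scarcest resource."""
    # At most this many queries can be in flight at the same time.
    in_flight = min(concurrency, db_connections)
    # Each wave of in-flight queries takes one full query duration.
    waves = math.ceil(requests / in_flight)
    return waves * query_seconds


# One shared connection: 20 waves of 0.3 s, close to the 6.056 s measured.
single = estimated_total_seconds(20, 10, db_connections=1, query_seconds=0.3)

# Ten connections: 2 waves of 0.3 s, in line with the sub-second runs measured.
pooled = estimated_total_seconds(20, 10, db_connections=10, query_seconds=0.3)
```

With `ab -n 20 -c 10`, a single connection serializes all 20 queries, while ten connections let each wave of 10 requests run together.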

UPDATE: This was just test code. If you want to do something like this, see @AlexMarandon's Ecto pool answer instead.
I've just been playing with moving the connection setup as José suggested:
defmodule Toto do
import Plug.Conn
def init(options) do
options
end
def call(conn, opts) do
{ :ok, pid } = Postgrex.Connection.start_link(username: "chris", password: "", database: "ecto_test")
sql = "SELECT 42, pg_sleep(0.300);"
result = Postgrex.Connection.query!(pid, sql, [])
[{value, _}] = result.rows
conn
|> put_resp_content_type("text/plain")
|> send_resp(200, "Result: #{value}")
end
end
With results:
% ab -n 20 -c 10 http://127.0.0.1:4000/
Time taken for tests: 0.832 seconds
Requests per second: 24.05 [#/sec] (mean)
Time per request: 415.818 [ms] (mean)

Here is the code I came up with following José's answer:
defmodule Toto do
import Plug.Conn
def init(options) do
options
end
def call(conn, _opts) do
sql = "SELECT 42, pg_sleep(0.300);"
result = Ecto.Adapters.SQL.query(Repo, sql, [])
[{value, _}] = result.rows
conn
|> put_resp_content_type("text/plain")
|> send_resp(200, "Result: #{value}")
end
end
For this to work we need to declare a repo module:
defmodule Repo do
use Ecto.Repo, otp_app: :toto
end
And start that repo in the supervisor:
defmodule Toto.Supervisor do
use Application
def start(type, args) do
import Supervisor.Spec, warn: false
children = [
worker(Plug.Adapters.Cowboy, [Toto, []], function: :http),
worker(Repo, [])
]
opts = [strategy: :one_for_one, name: Toto.Supervisor]
Supervisor.start_link(children, opts)
end
end
As José mentioned, I got the best performance by tweaking the configuration a bit:
config :toto, Repo,
adapter: Ecto.Adapters.Postgres,
database: "al",
username: "al",
password: "al",
size: 10,
lazy: false
Here is the result of my benchmark (after a few runs so that the pool has the time to "warm up") with default configuration:
$ ab -n 20 -c 10 http://127.0.0.1:4000/
Time taken for tests: 0.874 seconds
Requests per second: 22.89 [#/sec] (mean)
Time per request: 436.890 [ms] (mean)
And here is the result with size: 10 and lazy: false:
$ ab -n 20 -c 10 http://127.0.0.1:4000/
Time taken for tests: 0.619 seconds
Requests per second: 32.30 [#/sec] (mean)
Time per request: 309.564 [ms] (mean)

Related

Julia WebSocket slow to read/write

I was testing Julia WebSockets and I found that they are much slower than the Python/Node equivalents. I wrote a test to send a ping and measure the time taken for a pong response from some crypto exchanges as an example. Julia consistently takes 50ms for the test below, whereas python takes 2-4 ms. Is this expected?
Julia:
using WebSockets
function main()
WebSockets.open("wss://wsaws.okex.com:8443/ws/v5/public") do ws
for i = 1:100
a = time_ns()
write(ws, "ping")
data, success = readguarded(ws)
b = time_ns()
println(String(data), " ", (b - a) / 1000000)
sleep(0.1)
end
end
end
main()
Equivalent Python:
import websockets
import asyncio
import time
async def main():
    uri = "wss://wsaws.okex.com:8443/ws/v5/public"
    async with websockets.connect(uri) as websocket:
        msg = "hello"
        for i in range(101):
            a = time.time()
            await websocket.send(msg)
            x = await websocket.recv()
            b = time.time()
            print(b-a)
            time.sleep(0.1)
asyncio.get_event_loop().run_until_complete(main())

Python: Run only one function async

I have a large legacy application that has one function that is a prime candidate to be executed async. It's IO bound (network and disk) and doesn't return anything.
This is a very simple similar implementation:
import random
import time
import requests
def fetch_urls(site):
    wait = random.randint(0, 5)
    filename = site.split("/")[2].replace(".", "_")
    print(f"Will fetch {site} in {wait} seconds")
    time.sleep(wait)
    r = requests.get(site)
    with open(filename, "w") as fd:
        fd.write(r.text)
def something(sites):
    for site in sites:
        fetch_urls(site)
    return True
def main():
    sites = ["https://www.google.com", "https://www.reddit.com", "https://www.msn.com"]
    start = time.perf_counter()
    something(sites)
    total_time = time.perf_counter() - start
    print(f"Finished in {total_time}")
if __name__ == "__main__":
    main()
My end goal would be updating the something function to run fetch_urls async.
I cannot change fetch_urls.
All documentation and tutorials I can find assumes my entire application is async (starting from async def main()) but this is not the case.
It's a huge application spanning across multiple modules and re-factoring everything for a single function doesn't look right.
From what I understand, I will need to create a loop, add tasks to it and dispatch it somehow, but I tried many different things and I still get everything running one after another, as opposed to concurrently.
I would appreciate any assistance. Thanks!

Replying to myself: it seems there is no easy way to do that with async. I ended up using concurrent.futures:
import time
import requests
import concurrent.futures
def fetch_urls(url, name):
    wait = 5
    filename = url.split("/")[2].replace(".", "_")
    print(f"Will fetch {name} in {wait} seconds")
    time.sleep(wait)
    r = requests.get(url)
    with open(filename, "w") as fd:
        fd.write(r.text)
def something(sites):
    # Submit one fetch per child site; each runs in its own worker process
    with concurrent.futures.ProcessPoolExecutor(max_workers=5) as executor:
        future_to_url = {
            executor.submit(fetch_urls, url["url"], url["name"]): (url)
            for url in sites["children"]
        }
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
            except Exception as exc:
                print("%r generated an exception: %s" % (url, exc))
    return True
def main():
    sites = {
        "parent": "https://stackoverflow.com",
        "children": [
            {"name": "google", "url": "https://google.com"},
            {"name": "reddit", "url": "https://reddit.com"},
        ],
    }
    start = time.perf_counter()
    something(sites)
    total_time = time.perf_counter() - start
    print(f"Finished in {total_time}")
# The guard is required so worker processes can import this module safely
if __name__ == "__main__":
    main()
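For what it's worth, on Python 3.9+ it may also be possible to keep the rest of the application synchronous and still use asyncio for this one function, by running a private event loop inside it and pushing each blocking call onto a worker thread. The sketch below is only an illustration under that assumption: `run_blocking_concurrently` and `slow_echo` are hypothetical names, and `slow_echo` stands in for the unmodified `fetch_urls` so the example needs no network access.

```python
import asyncio
import time
from typing import Any, Callable, Iterable, List


def run_blocking_concurrently(func: Callable[[Any], Any],
                              items: Iterable[Any]) -> List[Any]:
    """Run a blocking function once per item on worker threads; results keep input order."""

    async def runner() -> List[Any]:
        # asyncio.to_thread (Python 3.9+) hands each blocking call to a thread,
        # so the unmodified function does not block the event loop.
        tasks = [asyncio.to_thread(func, item) for item in items]
        return await asyncio.gather(*tasks)

    # asyncio.run creates and closes its own event loop, so callers stay synchronous.
    return asyncio.run(runner())


def slow_echo(site: str) -> str:
    """Hypothetical stand-in for fetch_urls: blocks briefly, then returns a filename."""
    time.sleep(0.2)
    return site.split("/")[2].replace(".", "_")


def something(sites: List[str]) -> bool:
    # The only function that changes; its callers still see a plain synchronous call.
    run_blocking_concurrently(slow_echo, sites)
    return True
```

Note that `asyncio.run` raises an error if an event loop is already running in the calling thread, so this only fits code that is otherwise synchronous. For I/O-bound work like this, a `concurrent.futures.ThreadPoolExecutor` is also a lighter alternative to a process pool.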

How create batch process in requests elixir phoenix

I must create an API with great performance, and I want to build it with Elixir.
I have a (slow) process that I must run after some requests. I want to implement this flow:
In each request, save the data received in memory
After x requests, send to another api (or after x seconds)
In Node I can do this:
let batchData = []
const handlerRequest = (req, res) => {
batchData.push(req.body.data)
if (batchData.length > 1000) {
// Process to send to another api
batchData = []
}
res.json({ success: true })
}
Or
let batchData = []
setInterval(() => {
if (batchData.length > 1000) {
// Process to send to another api
batchData = []
}
}, 10000)
const handlerRequest = (req, res) => {
batchData.push(req.body.data)
res.json({ success: true })
}
How can I do something like this in Elixir Phoenix?
Thanks for this

Here is an approach using a GenServer. I presume you want to start the timer when the first item is received.
defmodule RequestHandler do
use GenServer
@name __MODULE__
@timeout 5_000
@size 5
def start_link(args \\ []) do
GenServer.start_link(__MODULE__, args, name: @name)
end
def request(req) do
GenServer.cast(@name, {:request, req})
end
def init(_) do
{:ok, %{timer_ref: nil, requests: []}}
end
def handle_cast({:request, req}, state) do
{:noreply, state |> update_in([:requests], & [req | &1]) |> handle_request()}
end
def handle_info(:timeout, state) do
# sent to another API
send_api(state.requests)
{:noreply, reset_requests(state)}
end
defp handle_request(%{requests: requests} = state) when length(requests) == 1 do
start_timer(state)
end
defp handle_request(%{requests: requests} = state) when length(requests) > @size do
# sent to another API
send_api(requests)
reset_requests(state)
end
defp handle_request(state) do
state
end
defp reset_requests(state) do
state
|> Map.put(:requests, [])
|> cancel_timer()
end
defp start_timer(state) do
timer_ref = Process.send_after(self(), :timeout, @timeout)
state
|> cancel_timer()
|> Map.put(:timer_ref, timer_ref)
end
defp cancel_timer(%{timer_ref: nil} = state) do
state
end
defp cancel_timer(%{timer_ref: timer_ref} = state) do
Process.cancel_timer(timer_ref)
Map.put(state, :timer_ref, nil)
end
defp send_api(requests) do
IO.puts "sending #{length requests} requests"
end
end
And here are a few tests:
iex(5)> RequestHandler.start_link
{:ok, #PID<0.119.0>}
iex(6)> for i <- 1..6, do: Request
[Request, Request, Request, Request, Request, Request]
iex(7)> for i <- 1..6, do: RequestHandler.request(i)
sending 6 requests
[:ok, :ok, :ok, :ok, :ok, :ok]
iex(8)> for i <- 1..7, do: RequestHandler.request(i)
sending 6 requests
[:ok, :ok, :ok, :ok, :ok, :ok, :ok]
sending 1 requests
iex(9)> for i <- 1..3, do: RequestHandler.request(i)
[:ok, :ok, :ok]
sending 3 requests
iex(10)>
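The flush rules used by this GenServer (start a timer on the first item, flush when the batch grows past the size limit or when the timer fires) can also be modeled outside Elixir. The following is a minimal, language-neutral sketch in Python, not a replacement for the GenServer: the `Batcher` class, its method names, and the explicit `now` argument (used instead of a real timer so the behavior is deterministic) are all assumptions made for illustration.

```python
from typing import Any, Callable, List, Optional


class Batcher:
    """Collects items and flushes them when the batch is too large or too old."""

    def __init__(self, send: Callable[[List[Any]], None],
                 max_size: int = 5, timeout: float = 5.0) -> None:
        self.send = send
        self.max_size = max_size
        self.timeout = timeout
        self.items: List[Any] = []
        self.deadline: Optional[float] = None  # set when the first item arrives

    def add(self, item: Any, now: float) -> None:
        self.tick(now)  # flush an expired batch before accepting the new item
        self.items.append(item)
        if len(self.items) == 1:
            self.deadline = now + self.timeout  # the timer starts on the first item
        if len(self.items) > self.max_size:  # mirrors `length(requests) > @size`
            self.flush()

    def tick(self, now: float) -> None:
        """Plays the role of the :timeout message: flush once the deadline passes."""
        if self.deadline is not None and now >= self.deadline:
            self.flush()

    def flush(self) -> None:
        if self.items:
            self.send(self.items)
        self.items = []
        self.deadline = None  # equivalent to cancelling the timer
```

Feeding seven items at time 0 and then calling `tick(5.0)` sends one batch of six followed by one batch of one, which matches the `iex(8)` run above.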

You can use GenServer or Agent
GenServer
The general idea is to have a GenServer process that holds the data to be processed and also handles the background processing. Using GenServer.cast/2 we can send a message to a process asynchronously. So whenever the controller receives a request, we add a new item to the queue, check whether the batch size has been reached, and process the batch if so.
# In Controller (page_controller.ex) module
def index(conn, params) do
App.BatchProcessor.add_item(params)
conn |> json(%{success: true})
end
Add module for GenServer. You can add a new file lib/batch_processor.ex
defmodule App.BatchProcessor do
use GenServer
@batch_size 10 # whenever the queue reaches this size we'll start processing
def init(_) do
initial_queue = []
{:ok, initial_queue}
end
def start_link() do
GenServer.start_link(__MODULE__, [], [name: __MODULE__])
end
# API function to add an item to the queue
def add_item(data) do
# GenServer.cast/2 takes the server first, then the message
GenServer.cast(__MODULE__, {:add, data})
end
# implement GenServer behavior function to handle cast messages for adding item to the queue
def handle_cast({:add, data}, queue) do
updated_queue = [data | queue] # prepend the new item to the front of the queue
# check if the batch size is reached and process the current batch
if Enum.count(updated_queue) >= @batch_size do
# send an async message to the current process to process the batch
GenServer.cast(__MODULE__, :process_queue)
end
{:noreply, updated_queue}
end
#implement GenServer behavior function to handle cast messages for processing batch
def handle_cast(:process_queue, queue) do
spawn(fn ->
Enum.each(queue, fn data ->
IO.inspect(data)
end)
end)
{:noreply, []} # reset queue to empty
end
end
Start the BatchProcessor process when the Phoenix app starts
#application.ex
children = [
# Start the endpoint when the application starts
supervisor(App.Web.Endpoint, []),
# Start your own worker by calling: App.Web.Worker.start_link(arg1, arg2, arg3)
worker(App.BatchProcessor, []),
]
Read more on GenServer
Hope this helps

How do I share data between worker processes in Elixir?

I have 2 workers
worker(Mmoserver.MessageReceiver, []),
worker(Mmoserver.Main, [])
The MessageReceiver will wait until messages are received on TCP and process them, the Main loop will take that information and act on it. How do I share the info obtained by worker1 with worker2?
Mmoserver.ex
This is the main file that starts the workers
defmodule Mmoserver do
use Application
def start(_type, _args) do
import Supervisor.Spec, warn: false
IO.puts "Listening for packets..."
children = [
# We will add our children here later
worker(Mmoserver.MessageReceiver, []),
worker(Mmoserver.Main, [])
]
# Start the main supervisor, and restart failed children individually
opts = [strategy: :one_for_one, name: AcmeUdpLogger.Supervisor]
Supervisor.start_link(children, opts)
end
end
MessageReceiver.ex
This will just start a tcp listener. It should be able to get a message, figure out what it is (by its id), then parse the data and send it to a specific function in Main
defmodule Mmoserver.MessageReceiver do
use GenServer
require Logger
def start_link(opts \\ []) do
GenServer.start_link(__MODULE__, :ok, opts)
end
def init (:ok) do
{:ok, _socket} = :gen_udp.open(21337)
end
# Handle UDP data
def handle_info({:udp, _socket, _ip, _port, data}, state) do
parse_packet(data)
# Logger.info "Received a secret message! " <> inspect(message)
{:noreply, state}
end
# Ignore everything else
def handle_info({_, _socket}, state) do
{:noreply, state}
end
def parse_packet(data) do
# Convert data to string, then split all data
# WARNING - SPLIT MAY BE EXPENSIVE
dataString = Kernel.inspect(data)
vars = String.split(dataString, ",")
# Get variables
packetID = Enum.at(vars, 0)
x = Enum.at(vars, 1)
# Do stuff with them
IO.puts "Packet ID:"
IO.puts packetID
IO.puts x
# send data to main
Mmoserver.Main.handle_data(vars)
end
end
Main.ex
This is the main loop. It will process all the most recent data received by the tcp listener and act on it. Eventually it will update the game state too.
defmodule Mmoserver.Main do
use GenServer
@tickDelay 33
def start_link(opts \\ []) do
GenServer.start_link(__MODULE__, [], name: Main)
end
def init (state) do
IO.puts "Main Server Loop started..."
# start the main loop, parameter is the initial tick value
mainLoop(0)
# return, why 1??
{:ok, 1}
end
def handle_data(data) do
GenServer.cast(:main, {:handle_data, data})
end
def handle_info({:handle_data, data}, state) do
# my_function(data)
IO.puts "Got here2"
IO.puts inspect(data)
{:noreply, state}
end
# calls respective game functions
def mainLoop(-1) do
IO.inspect "Server Loop has ended!" # base case, end of loop
end
def mainLoop(times) do
# do shit
# IO.inspect(times) # operation, or body of for loop
# sleep
:timer.sleep(@tickDelay);
# continue the loop RECURSIVELY
mainLoop(times + 1)
end
end

Because Mmoserver.MessageReceiver is going to send messages to Mmoserver.Main, Main has to be started first, and it needs to have a name associated with it:
worker(Mmoserver.Main, []),
worker(Mmoserver.MessageReceiver, [])
The easiest way is to register the name in your Mmoserver.Main, assuming it is a GenServer:
defmodule Mmoserver.Main do
use GenServer
def start_link do
GenServer.start_link(__MODULE__, [], name: :main)
end
# ...
end
You can add a convenience function, plus the callback that implements it, like this:
defmodule Mmoserver.Main do
# ...
def handle_data(data) do
GenServer.cast(:main, {:handle_data, data})
end
def handle_cast({:handle_data, data}, state) do
my_function(data)
{:noreply, state}
end
end
So your MessageReceiver can send a message like this:
defmodule Mmoserver.MessageReceiver do
def when_data_received(data) do
Mmoserver.Main.handle_data(data)
end
end
This assumes Mmoserver.MessageReceiver doesn't expect Mmoserver.Main to respond. I've decided to do it this way as you didn't specify how you want to handle the data, and this seems the easiest example of how to do this.

send calls via an endpoint to a GenServer

Given a running GenServer, is there a known way to send synchronous/asynchronous calls to the pid via an endpoint, without using the Phoenix framework?
Here's an example call (using python's requests library) that maps the reply term to JSON:
iex> give_genserver_endpoint(pid, 'http://mygenserverendpoint/api')
iex> {:ok, 'http://mygenserverendpoint/api'}
>>> requests.get(url='http://mygenserverendpoint/getfood/fruits/colour/red')
>>> '{ "hits" : ["apple", "plum"]}'

You can write a complete Elixir HTTP server using Cowboy and Plug:
Application Module
defmodule MyApp do
use Application
def start(_type, _args) do
import Supervisor.Spec
children = [
worker(MyGenServer, []),
Plug.Adapters.Cowboy.child_spec(:http, MyRouter, [], [port: 4001])
]
opts = [strategy: :one_for_one, name: MyApp.Supervisor]
Supervisor.start_link(children, opts)
end
end
Router Module
defmodule MyRouter do
use Plug.Router
plug :match
plug :dispatch
get "/mygenserverendpoint/getfood/fruits/colour/:colour" do
response_body = MyGenServer.get_fruit_by_colour(colour)
conn
|> put_resp_content_type("application/json")
|> send_resp(200, Poison.encode!(response_body))
end
match _ do
send_resp(conn, 404, "oops")
end
end
GenServer module
defmodule MyGenServer do
use GenServer
def start_link do
GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
end
def get_fruit_by_colour(colour) do
GenServer.call(__MODULE__, {:get_by_colour, colour})
end
def handle_call({:get_by_colour, colour}, _from, state) do
{:reply, %{"hits" => ["apple", "plum"]}, state}
end
end