F# Http.AsyncRequestStream just 'hangs' on long queries - http

I am working with:
let callTheAPI = async {
    printfn "\t\t\tMAKING REQUEST at %s..." (System.DateTime.Now.ToString("yyyy-MM-ddTHH:mm:ss"))
    let! response = Http.AsyncRequestStream(url,query,headers,httpMethod,requestBody)
    printfn "\t\t\t\tREQUEST MADE."
    // Return the response so the caller can read its stream
    return response
}
And
let cts = new System.Threading.CancellationTokenSource()
let timeout = 1000*60*4 // 4 minutes (4 mins no grace)
cts.CancelAfter(timeout)
// Bind the result; the response is needed below
let response = Async.RunSynchronously(callTheAPI,timeout,cts.Token)
use respStrm = response.ResponseStream
respStrm.Flush()
writeLinesTo output (responseLines respStrm)
This calls a REST web API, and the line let! response = Http.AsyncRequestStream(url,query,headers,httpMethod,requestBody) just hangs on certain queries, particularly ones that take a long time (>4 minutes). This is why I made it Async and added a 4 minute timeout. (I collect the calls that time out and make them again with smaller time range parameters.)
I started with Http.RequestStream from FSharp.Data, but I couldn't add a timeout to it, so the script would just 'hang'.
I have looked at the API's IIS server and the application pool Worker Process active requests in IIS manager and I can see the requests come in and go again. They then 'vanish' and the F# script hangs. I can't find an error message anywhere on the script side or server side.
I included the Flush() and removed the timeout and it still hung. (Removing the Async in the process)
Additional:
Successful calls are made. Failed calls can be followed by successful calls. However, it seems to get to a point where all the calls time out, and they do so without even reaching the server any more. (Worker Process Active Requests doesn't show the query.)
Update:
I made the Fsx script output the queries and ran them through IRM with no issues (I have a timeout and it never locks up). I have a suspicion that there is an issue with FSharp.Data.Http.

Async.RunSynchronously blocks. Read the remarks section in the docs: RunSynchronously. Instead, use Async.AwaitTask.

Related

start_server() method in python asyncio module

This question is really about the various coroutines in base_events.py and streams.py that deal with network connections, network servers and their higher-level API equivalents under Streams. Since it's not really clear how to group these functions, I am going to use start_server() to explain what I don't understand about these coroutines and haven't found online (unless I missed something obvious).
When running the following code, I am able to create a server that is able to handle incoming messages from a client and I also periodically print out the number of tasks that the EventLoop is handling to see how the tasks work. What I'm surprised about is that after creating a server, the task is in the finished state not too long after the program starts. I expected that a task in the finished state was a completed task that no longer does anything other than pass back the results or exception.
However, of course this is not true, the EventLoop is still running and handling incoming messages from clients and the application is still running. Monitor however shows that all tasks are completed and no new task is dispatched to handle a new incoming message.
So my question is this:
What is going on underneath asyncio that I am missing that explains the behavior I am seeing? For example, I would have expected a task (or tasks created for each message) that is handling incoming messages in the pending state.
Why is asyncio.Task.all_tasks() passing back finished tasks? I would have thought that once a task has completed it is garbage collected (so long as there are no other references to it).
I have seen similar behavior with the other asyncio functions like using create_connection() with a websocket from a site. I know at the end of these coroutines, their result is usually a tuple such as (reader, writer) or (transport, protocol) but I don't understand how it all ties together or what other documentation/code to read to give me more insight. Any help is appreciated.
import asyncio
from pprint import pprint

# start_server() calls the callback with (reader, writer) only, so no `self`
async def echo_message(reader, writer):
    data = await reader.read(1000)
    message = data.decode()
    addr = writer.get_extra_info('peername')
    print('Received %r from %r' % (message, addr))
    print('Send: %r' % message)
    writer.write(message.encode())
    await writer.drain()
    print('Close the client socket')
    writer.close()

async def monitor():
    while True:
        # Pre-3.7 API (removed in Python 3.9); newer code uses asyncio.all_tasks()
        tasks = asyncio.Task.all_tasks()
        pprint(tasks)
        await asyncio.sleep(60)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.create_task(monitor())
    # `loop` is keyword-only in start_server()
    loop.create_task(asyncio.start_server(echo_message, 'localhost', 7777, loop=loop))
    loop.run_forever()
Outputs:
###
# Soon after starting the application, monitor prints out:
###
{<Task pending coro=<start_server() running ...>,
<Task pending coro=<monitor() running ...>,
<Task pending coro=<BaseEventLoop._create_server_getaddrinfo() running ...>}
###
# After things are initialized and the server has started, the next printout is:
###
{<Task finished coro=<start_server() done ...>,
<Task pending coro=<monitor() running ...>,
<Task finished coro=<BaseEventLoop._create_server_getaddrinfo() done ...>}
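A minimal sketch of what the output above suggests, assuming Python 3.7+ (asyncio.run() and asyncio.current_task() do not exist in the older version the question targets). The start_server() coroutine only has to set up the listening socket and return a Server object, so its task finishes early. When the callback is a coroutine function, asyncio schedules it as a new task for each connection, and that task lives only while the connection is being handled. The names handler_tasks, demo() and run_demo() are invented for this illustration.

```python
import asyncio

handler_tasks = []  # hypothetical helper: records the Task each connection runs in

async def echo_message(reader, writer):
    # asyncio wraps this coroutine in a new Task for every accepted connection.
    handler_tasks.append(asyncio.current_task())
    data = await reader.read(1000)
    writer.write(data)
    await writer.drain()
    writer.close()

async def demo():
    # Port 0 lets the OS pick a free port (a choice made for this sketch).
    server = await asyncio.start_server(echo_message, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    # start_server() has already finished here: it only had to create the
    # listening socket. New clients are accepted by the event loop itself.
    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    writer.write(b'hello')
    await writer.drain()
    reply = await reader.read(1000)
    writer.close()
    server.close()
    await server.wait_closed()
    return reply

def run_demo():
    handler_tasks.clear()
    return asyncio.run(demo())

if __name__ == '__main__':
    print(run_demo(), handler_tasks)
```

Under these assumptions, a monitor that polls once a minute would rarely catch a handler task in the pending state, because each one finishes as soon as its single message has been echoed.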

How to make async requests using HTTPoison?

Background
We have an app that deals with a considerable amount of requests per second. This app needs to notify an external service, by making a GET call via HTTPS to one of our servers.
Objective
The objective here is to use HTTPoison to make async GET requests. I don't really care about the response of the requests; all I care about is knowing whether they failed or not, so I can write any possible errors into a logger.
If it succeeds I don't want to do anything.
Research
I have checked the official documentation for HTTPoison and I see that they support async requests:
https://hexdocs.pm/httpoison/readme.html#usage
However, I have 2 issues with this approach:
They use flush to show the request was completed. I can't log into the app and manually flush to see how the requests are going; that would be insane.
They don't show any notifications mechanism for when we get the responses or errors.
So, I have a simple question:
How do I get asynchronously notified that my request failed or succeeded?
I assume that the default HTTPoison.get is synchronous, as shown in the documentation.
This could be achieved by spawning a new process per-request. Consider something like:
notify = fn response ->
  # Any handling logic - write to DB? Send a message to another process?
  # Here, I'll just print the result
  IO.inspect(response)
end

spawn(fn ->
  resp = HTTPoison.get("http://google.com")
  notify.(resp)
end) # spawn will not block, so it will attempt to execute the next spawn straight away

spawn(fn ->
  resp = HTTPoison.get("http://yahoo.com")
  notify.(resp)
end) # This will be executed immediately after the previous `spawn`
Please take a look at the documentation of spawn/1 I've pointed out here.
Hope that helps!

Asynchronous calls in OpenCPU

I would like to run OpenCPU job asynchronously and collect its results from a different session. In Rserve + RSclient I can do the following:
RS.eval(connection, expression, wait = FALSE)
# do something while the job is running
and then when I'm ready to receive results call either:
RS.collect(connection)
to try to collect results and wait until they are ready if job is still running or:
RS.collect(connection, timeout = 0)
if I want to check the job state and let it run if it is still not finished.
Is it possible with OpenCPU to receive the tmp/*/... path with the result id before the job has finished?
It seems, according to this post, that OpenCPU does not support asynchronous jobs. Every request between the browser and the OpenCPU server must be kept alive in order to execute a script or function and receive a response successfully.
If you find any workaround I would be pleased to know it.
In my case, I need to run a long process (it may take a few hours) and I can't keep the client request alive until the process finishes.

Erlang stopping application doesn't end all processes?

When I stop an Erlang application that I built, the cowboy listener process stays alive, continuing to handle requests. In the gen_server that I wrote I start a server on init. As you can see below:
init([Port]) ->
    Dispatch = cowboy_router:compile([
        {'_', [
            {"/custom/[...]", ?MODULE, []},
            % Serve index.html as default file
            % Serve entire directory
            {"/[...]", cowboy_static, {priv_dir,
                app, "www"}}
        ]}
    ]),
    Name = custom_name,
    {ok, Pid} = cowboy:start_http(Name, 100,
        [{port, Port}],
        [{env, [{dispatch, Dispatch}]}]),
    {ok, #state{handler_pid = Pid}}.
This starts the cowboy HTTP server, which uses cowboy_static to serve some stuff in the priv/app/ dir and the current module to handle custom stuff (the module implements all the cowboy HTTP handler callbacks). It takes the pid returned from the call and assigns it to handler_pid in the state record. This all works. However, when I start the application containing this module (which works) and then stop it, all processes end (at least the ones in my application). The custom handler (which is implemented in the same module as the gen_server) no longer works, but the cowboy_static handler continues to handle requests. It continues to serve static files until I kill the node. I tried fixing this by adding this to the gen_server:
terminate(_Reason, State) ->
    exit(State#state.handler_pid, normal),
    cowboy:stop_listener(listener_name()),
    ok.
But nothing changes. The cowboy_static handler continues to serve static files.
Questions:
Am I doing anything wrong here?
Is cowboy_static running under the cowboy application? I assume it is.
If so, how do I stop it?
And also, should I be concerned about stopping it? Maybe this isn't that big a deal.
Thanks in advance!
I don't think it is really important: generally you use one node/VM per application (in fact a bunch of Erlang applications working together, but I don't have a better word). But I think you can stop the server using application:stop(cowboy), application:stop(ranch).
You should fix 3 things:
the symbol in start_http(Name, ...) and stop_listener(Name) should match.
trap exit in service init: process_flag(trap_exit, true)
remove exit call from terminate.

Distributing workload

In my application I have a number of objects which can perform some long lasting calculations, let's call them clients. Also I have a number of objects which contain descriptions of tasks to be calculated. Something like this:
open System.Threading

let clients = [1..4]
let tasks = [1..20]
let calculate c t =
    printf "Starting task %d with client %d\n" t c
    Thread.Sleep(3000)
    printf "Finished task %d with client %d\n" t c
With one client I can only start one task at a time.
I want to create a function/class which will assign the tasks to the clients and perform the calculations. I've done this in C# using a queue of clients, so as soon as a new task is assigned to a client, this client is removed from the queue and when the calculations are finished, the client is released and is placed in the queue again. Now I'm interested in implementing this in a functional way. I've tried to experiment with asynchronous workflows, but I cannot think of a proper way to implement this.
Here's some F#-like code that I was trying to make work, but couldn't:
let rec distribute clients tasks calculate tasks_to_wait =
    match clients, tasks with
    | _ , [] -> () // No tasks - we're done!
    | [], th::tt -> // No free clients, but there are still tasks to calculate.
        let released_client = ?wait_for_one? tasks_to_wait
        distribute [released_client] tasks calculate ?tasks_to_wait?
    | ch::ct, th::tt -> // There are free clients.
        let task_to_wait() =
            do calculate ch th
            ch
        distribute ct tt calculate (task_to_wait::tasks_to_wait)
How do I do this? Is there a functional design pattern to solve this task?
There are various ways to do this. It would be perfectly fine to use some concurrent collection (like a queue) from .NET 4.0 from F#, as this is often the easiest thing to do, if the collection already implements the functionality you need.
The problem requires concurrent access to some resource, so it cannot be solved in a purely functional way, but F# provides agents, which give you a nice alternative way to solve the problem.
The following snippet implements an agent that schedules the work items. It uses its own mailbox to keep the available clients (which gives you a nice way to wait for the next available client). After the agent is created, you can just send all the initial clients. It will continue iterating over the tasks while clients are available. When there is no available client, it will block (asynchronously - without blocking of threads), until some previous processing completes and a client is sent back to the agent's mailbox:
let workloadAgent = MailboxProcessor.Start(fun agent ->
  // The body of the agent, runs as a recursive loop that iterates
  // over the tasks and waits for client before it starts processing
  let rec loop tasks = async {
    match tasks with
    | [] ->
        // No more work to schedule (but there may be still calculation,
        // which was started earlier, going on in the background)
        ()
    | task::tasks ->
        // Wait for the next client to become available by receiving the
        // next message from the inbox - the messages are clients
        let! client = agent.Receive()
        // Spawn processing of the task using the client
        async { calculate client task
                // When the processing completes, send the client
                // back to the agent, so that it can be reused
                agent.Post(client) }
        |> Async.Start
        // Continue processing the rest of the tasks
        return! loop tasks }

  // Start the agent with the initial list of tasks
  loop tasks )

// Add all clients to the agent, so that it can start
for client in clients do workloadAgent.Post(client)
If you're not familiar with F# agents, then the MSDN section Server-Side Programming has some useful information.
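For comparison, the same scheduling idea can be sketched in Python with asyncio. This is a rough analogue and not a translation of the F# agent: an asyncio.Queue stands in for the agent's mailbox of free clients, a short asyncio.sleep() stands in for the long calculation, and the names distribute, run_one and run_distribution are invented for the illustration. Unlike the agent above, this sketch also waits for the outstanding calculations before returning.

```python
import asyncio

async def calculate(client, task, log):
    # Stand-in for the long-running calculation (the F# version sleeps 3 s).
    log.append(('start', client, task))
    await asyncio.sleep(0.01)
    log.append(('finish', client, task))

async def distribute(clients, tasks):
    free_clients = asyncio.Queue()  # plays the role of the agent's mailbox
    for client in clients:
        free_clients.put_nowait(client)
    log, running = [], []

    async def run_one(client, task):
        try:
            await calculate(client, task, log)
        finally:
            # Hand the client back so the next task can reuse it.
            free_clients.put_nowait(client)

    for task in tasks:
        # Suspends (without blocking a thread) until some client is free.
        client = await free_clients.get()
        running.append(asyncio.create_task(run_one(client, task)))

    # Wait for the calculations that are still in flight.
    await asyncio.gather(*running)
    return log

def run_distribution(clients, tasks):
    return asyncio.run(distribute(clients, tasks))

if __name__ == '__main__':
    for entry in run_distribution([1, 2, 3, 4], list(range(1, 21))):
        print(entry)
```

Each client appears in at most one running calculation at a time, because it is only put back on the queue after its calculation has finished.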
