Prolog HTTP dispatch handler encapsulation

I'm creating a program in Prolog that uses an HTTP server to handle requests...
I want to encapsulate the code in a way that I can reuse it: the HTTP plumbing in one module, my "controller" that handles requests in another module, etc.
I started having problems with the http dispatch handler registration:
:- http_handler('/test', foobar, []).
Is it possible to have something like this:
register_handler(path, callback) :-
    http_handler(path, callback, []).
I tried using that but I got an error, most likely due to the "callback" parameter. Also, the callback predicate is defined in a different module, so I used:
:- consult(api_controller).
[EDIT]
server.pl
:- use_module(library(http/thread_httpd)).
:- use_module(library(http/http_dispatch)).
:- use_module(library(http/http_parameters)).
:- use_module(library(http/http_json)).
:- use_module(api_controller).
:- http_handler('/test', foo, []).
server(Port) :- http_server(http_dispatch, [port(Port)]).
api_controller.pl
foo(_request) :-
    format('Content-type: text/plain~n~n'),
    format('Hello world!~n').
Error:
http_dispatch:call_action/2: Undefined procedure: foo/1

http_handler/3 is a directive, and you can place such directives in other files and then use include/1 to load them.
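For example, here is a minimal sketch of that layout. The file split mirrors the one in the question (routes.pl is a file name invented for illustration); because include/1 loads a file textually, foo/1 ends up defined in the same module in which the http_handler/3 directive runs, which should avoid the "Undefined procedure: foo/1" error:
server.pl
:- use_module(library(http/thread_httpd)).
:- use_module(library(http/http_dispatch)).

% Textually include the controller code and the route registrations.
:- include(api_controller).
:- include(routes).

server(Port) :- http_server(http_dispatch, [port(Port)]).
routes.pl
% Only the dispatch directives live here.
:- http_handler('/test', foo, []).
api_controller.pl
foo(_Request) :-
    format('Content-type: text/plain~n~n'),
    format('Hello world!~n').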
In addition, you can have total control over the HTTP dispatch by installing a generic handler as follows:
:- http_handler(/, handle_request, [prefix]).
Note the prefix option.
Then, you supply a suitable handle_request/1, for example like this:
handle_request(Request) :-
    debug(my_dispatch, "~q\n", [Request]),
    memberchk(path(Path0), Request),
    atom_concat('.', Path0, Path1),
    http_safe_file(Path1, []),
    absolute_file_name(Path1, Path),
    (   reply_file(Path0, File) -> http_reply_file(File, [unsafe(true)], Request)
    ;   redirect(Path0, Other)  -> http_redirect(moved, Other, Request)
    ;   see_other(Path0, Other) -> http_redirect(see_other, Other, Request)
    ;   hidden_file(Path0)      -> http_404([], Request)
    ;   exists_file(Path)       -> http_reply_file(Path, [unsafe(true)], Request)
    ;   ...
    ).
In this example, the following predicates are meant to be supplied by you to tailor the server to your exact use cases (a small sketch follows this list):
reply_file(Path, File): send the contents of File in response to a request for Path.
redirect(Path0, Path): redirect to Path in response to a request for Path0.
see_other/2: Meaning left as an exercise.
hidden_file/1: Meaning left as an exercise.
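For instance, reply_file/2 and redirect/2 could start out as plain facts (the paths and file names below are purely illustrative):
reply_file('/',          'index.html').
reply_file('/style.css', 'style.css').

redirect('/old-page', '/new-page').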
These rules can be defined elsewhere, and you can include these files with the directive:
:- include(other_source).
A related directive you should check out is multifile/1.
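A hedged sketch of where multifile/1 fits in: declaring a predicate multifile lets several separately loaded files each contribute clauses, without the clauses of a later file replacing those of an earlier one (file names invented for illustration):
% In the main server file, before loading the files that contribute clauses:
:- multifile reply_file/2, redirect/2.
:- ensure_loaded(static_routes).
:- ensure_loaded(redirects).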
I leave figuring out the precise libraries you need for the above to work as an exercise. A starting point:
:- use_module(library(http/thread_httpd)).
:- use_module(library(http/http_dispatch)).
:- use_module(library(http/http_server_files)).
:- use_module(library(http/http_files)).
:- use_module(library(http/http_header)).

Related

Lua - Download file asynchronously via HTTP

I just finished reading the copas core code, and I want to write code to download a file from a website asynchronously, but copas seems to only support socket IO.
Since Lua does not provide async syntax, other packages will surely have their own event loop which, I think, cannot run alongside copas' loop.
So, to download a file asynchronously via HTTP, do I have to find a package that supports async HTTP and async file IO at the same time? Or are there any other ideas?
After reading a bunch of code, I can finally answer my own question.
As I mentioned in my comment on the question, one can make use of the step function exported by an async IO library and merge the individual step calls into one bigger loop.
In the case of luv, it uses an external thread pool in C to manage file IO, and a single-threaded loop to call pending callbacks and manage IO polling (polling is not needed in my use case).
One can simply call the file operation functions provided by luv to do async file IO, but you still need to step luv's loop so that the callbacks bound to those IO operations get called.
The integrated main loop goes like this:
local function main_loop()
    copas.running = true
    while not copas.finished() or uv.loop_alive() do
        if not copas.finished() then
            copas.step()
        end
        if uv.loop_alive() then
            uv.run("nowait")
        end
    end
end
copas.step() is the stepping function of copas, and uv.run("nowait") makes luv run just one pass of its event loop without blocking if there is no ready IO when polling.
A working solution looks like this:
local copas = require "copas"
local http = require "copas.http"
local uv = require "luv"

local urls = {
    "http://example.com",
    "http://example.com"
}

local function main_loop()
    copas.running = true
    while not copas.finished() or uv.loop_alive() do
        if not copas.finished() then
            copas.step()
        end
        if uv.loop_alive() then
            uv.run("nowait")
        end
    end
end

local function write_file(file_path, data)
    -- ** call to luv async file IO **
    uv.fs_open(file_path, "w+", 438, function(err, fd)
        assert(not err, err)
        uv.fs_write(fd, data, nil, function(err_o, _)
            assert(not err_o, err_o)
            uv.fs_close(fd, function(err_c)
                assert(not err_c, err_c)
                print("finished:", file_path)
            end)
        end)
    end)
end

local function dl_url(url)
    local content, _, _, _ = http.request(url)
    write_file("foo.txt", content)
end

-- adding task to copas' loop
for _, url in ipairs(urls) do
    copas.addthread(dl_url, url)
end

main_loop()

How to send Erlang function source to Riak mapreduce via HTTP?

I'm trying to use Riak's mapreduce via HTTP. This is what I'm sending:
{
  "inputs": {
    "bucket": "test",
    "key_filters": [["matches", ".*"]]
  },
  "query": [
    {
      "map": {
        "language": "erlang",
        "source": "value(RiakObject, _KeyData, _Arg) -> Key = riak_object:key(RiakObject), Count = riak_kv_crdt:value(RiakObject, <<\"riak_kv_pncounter\">>), [ {Key, Count} ]."
      }
    }
  ]
}
Riak fails with "[worker_startup_failed]", which isn't very informative. Could anyone please help me get this to actually execute the function?
WARNING
Allowing arbitrary Erlang functions via map-reduce is a security risk. Any valid Erlang can be executed, including sending your entire data set offsite or formatting the hard drive.
You have been warned.
However, if you implicitly trust any client that may connect to your cluster, you can allow Erlang source to be passed in a map-reduce request by setting {allow_strfun, true} in the riak_kv section of app.config (or in the advanced.config if you are using riak.conf).
Once you have allowed passing an Erlang function in a map-reduce phase, you need to pass in a function of the form fun(RiakObject,KeyData,Arg) -> [result] end. Note that this must be an anonymous fun, so fun is a keyword, not a name, and it must end with end.
Your function should handle the case where {error,notfound} is passed as the first argument instead of an object. Simply adding a catch-all clause to the function could accomplish that.
Perhaps something like:
{
  "inputs": {
    "bucket": "test",
    "key_filters": [["matches", ".*"]]
  },
  "query": [
    {
      "map": {
        "language": "erlang",
        "source": "fun(RiakObject, _KeyData, _Arg) ->
                        Key = riak_object:key(RiakObject),
                        Count = riak_kv_crdt:value(
                                    RiakObject,
                                    <<\"riak_kv_pncounter\">>),
                        [ {Key, Count} ];
                       (_,_,_) -> [{error,0}]
                    end."
      }
    }
  ]
}
Allowing the source to be passed in the request is very useful while developing and debugging. For production, you really should put the functions in a dedicated pre-compiled module that you copy to the code path of each node so that the phase spec can specify the module and function by name instead of providing arbitrary code.
{"map":{
"language":"erlang",
"module":"yourprecompiledmodule",
"function":"functionname"}}
You need to enable allow_strfun on all nodes in your cluster. To do so in Riak 2, you will need to use the advanced.config file to add this to the riak_kv configuration:
[
  {riak_kv, [
    {allow_strfun, true}
  ]}
].
The other option is to create your own Erlang module by using the compiler shipped with Riak and placing the *.beam file in a well-known location for Riak to find. The basho-patches directory is one such place.
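For instance, a rough sketch of such a module, mirroring the inline fun above (the module and function names are invented; compile it with the Erlang compiler shipped with Riak and place the resulting .beam in basho-patches on every node):
-module(my_mapreduce).
-export([map_counter/3]).

%% Map phase: emit {Key, CounterValue} for each object,
%% and skip inputs that were not found.
map_counter({error, notfound}, _KeyData, _Arg) ->
    [];
map_counter(RiakObject, _KeyData, _Arg) ->
    Key = riak_object:key(RiakObject),
    Count = riak_kv_crdt:value(RiakObject, <<"riak_kv_pncounter">>),
    [{Key, Count}].
The map phase spec would then reference it by name, as in the "module"/"function" example above.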
Please see the documentation as well:
advanced.config
Installing custom Erlang code
HTTP MapReduce
Using MapReduce
Advanced MapReduce
MapReduce / curl example

Passing an XQuery XML element as an external variable to MarkLogic via XCC

We have fairly simple XQuery and Groovy code, as follows.
XQuery code:
declare variable $criteria as element(criteria) external ;
<scopedInterventions>{
$criteria/equals/field
}</scopedInterventions>
Here is the test code that is trying to invoke it:
def uri = new URI("xcc://admin:admin@localhost:8001")
def contentSource = ContentSourceFactory.newContentSource(uri)
// create a session from the content source
def session = contentSource.newSession()
def request = session.newModuleInvoke("ourQuery.xqy")
def criteria =
    """<criteria>
      <equals>
        <field>status</field>
        <value>draft</value>
      </equals>
    </criteria>
    """
request.setNewVariable("criteria", ValueType.ELEMENT, criteria);
session.submitRequest(request).asString()
We are getting this error when executing:
Caused by: com.marklogic.xcc.exceptions.XQueryException: XDMP-LEXVAL:
xs:QName("element()") -- Invalid lexical value "element()" [Session:
user=admin, cb={default} [ContentSource: user=admin, cb={none}
[provider: address=localhost/127.0.0.1:9001, pool=1/64]]] [Client:
XCC/5.0-3, Server: XDBC/5.0-3] expr: xs:QName("element()") at
com.marklogic.xcc.impl.handlers.ServerExceptionHandler.handleResponse(ServerExceptionHandler.java:34)
at
com.marklogic.xcc.impl.handlers.EvalRequestController.serverDialog(EvalRequestController.java:83)
at
com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:84)
at
com.marklogic.xcc.impl.SessionImpl.submitRequestInternal(SessionImpl.java:373)
at
com.marklogic.xcc.impl.SessionImpl.submitRequest(SessionImpl.java:356)
at
com.zynx.galen.dataaccess.MarkLogicUtilities.executeQueryWithMultipleXMLParameters(MarkLogicUtilities.groovy:52)
at
com.zynx.galen.repositories.ScopedInterventionService.getScopedInterventionsByCriteria(ScopedInterventionService.groovy:20)
... 1 more
Any help would be greatly appreciated.
http://docs.marklogic.com/javadoc/xcc/overview-summary.html has the answer, I think:
Passing Variables With Queries
Variables may be bound to Request objects. When an execution request
is issued to the server with Session.submitRequest(Request) all the
variables currently bound to the Request object are sent along and
defined as external variables in the execution context in the server.
XCC lets you create XdmNodes and XdmSequences, as well as XdmAtomic
values. However, in the initial XCC release values of this type may
not be bound as external variables because MarkLogic Server cannot yet
accept them. This capability is anticipated for a future release.
Since XdmNode is not supported, I suppose its subclass XdmElement is not supported either. So these classes are only useful for responses, not requests. The error message could stand to be improved.
You could pass the XML string using setNewStringVariable, then call xdmp:unquote in your XQuery module. Note that xdmp:unquote returns a document-node, so the /* XPath step yields its root element.
declare variable $xml-string as xs:string external ;
declare variable $criteria as element(criteria) := xdmp:unquote($xml-string)/* ;
....
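On the Groovy/XCC side, the corresponding change would roughly be the following (an untested sketch; the name passed to setNewStringVariable has to match the external variable declared in the XQuery module):
def request = session.newModuleInvoke("ourQuery.xqy")
// Bind the XML as a plain string; the XQuery module unquotes it into an element.
request.setNewStringVariable("xml-string", criteria)
session.submitRequest(request).asString()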

Parallel HTTP web crawler in Erlang

I'm working on a simple web crawler and have generated a bunch of static files that I try to crawl with the code at the bottom. I have two issues/questions I don't have an idea for:
1.) Looping over the sequence 1..200 throws me an error exactly after 100 pages have been crawled:
** exception error: no match of right hand side value {error,socket_closed_remotely}
in function erlang_test_01:fetch_page/1 (erlang_test_01.erl, line 11)
in call from lists:foreach/2 (lists.erl, line 1262)
2.) How do I parallelize the requests, e.g. 20 concurrent requests?
-module(erlang_test_01).
-export([start/0]).

-define(BASE_URL, "http://46.4.117.69/").

to_url(Id) ->
    ?BASE_URL ++ io_lib:format("~p", [Id]).

fetch_page(Id) ->
    Uri = to_url(Id),
    {ok, {{_, Status, _}, _, Data}} = httpc:request(get, {Uri, []}, [], [{body_format,binary}]),
    Status,
    Data.

start() ->
    inets:start(),
    lists:foreach(fun(I) -> fetch_page(I) end, lists:seq(1, 200)).
1. Error message
socket_closed_remotely indicates that the server closed the connection, maybe because you made too many requests in a short timespan.
2. Parallelization
Create 20 worker processes and one process holding the URL queue. Let each process ask the queue for a URL (by sending it a message). This way you can control the number of workers.
An even more "Erlangy" way is to spawn one process for each URL! The upside to this is that your code will be very straightforward. The downside is that you cannot control your bandwidth usage or number of connections to the same remote server in a simple way.
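A minimal sketch of the spawn-per-URL idea, reusing fetch_page/1 from the module in the question (start_parallel/0 and collect/1 are names invented here; there is no rate limiting, so the socket_closed_remotely problem may get worse):
start_parallel() ->
    inets:start(),
    Parent = self(),
    Ids = lists:seq(1, 200),
    %% one process per page; each sends its result back to the parent
    [spawn_link(fun() -> Parent ! {done, Id, fetch_page(Id)} end) || Id <- Ids],
    collect(length(Ids)).

collect(0) ->
    ok;
collect(N) ->
    receive
        {done, _Id, _Data} -> collect(N - 1)
    end.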

SWI-Prolog http_post and http_delete inexplicably hang

When I attempt to use SWI-Prolog's http_post/4, as follows:
:- use_module(library(http/http_client)).

update(URL, Arg) :-
    http_post(URL, form([update = Arg]), _, [status_code(204)]).
When I query this rule and watch the TCP traffic, I see that the HTTP POST request and the reply with the expected 204 status code both occur immediately. However, Prolog hangs for up to 30 seconds before returning 'true'. What is happening that prevents the rule from returning immediately?
I've tried this variant as well, but it also hangs:
:- use_module(library(http/http_client)).

update(URL, Arg) :-
    http_post(URL, form([update = Arg]), Reply, [status_code(204)]),
    close(Reply).
I have a similar issue with http_delete/3, but not with http_get/3.
The library docs state that http_post/4 "is equivalent to http_get/3, except for providing an input document, which is posted using http_post_data/3."
http_get/3 has timeout(+Timeout) among its options. That could help to lower the latency, but as it defaults to infinite, I fear it will not solve the issue. It seems like the service you are calling keeps the connection alive up to some timeout.
Personally, I had to use http_open/3 instead of http_post/4 when calling Google API services over HTTPS...
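For reference, a rough sketch of the http_open/3 variant (only a sketch: post(form(...)) and status_code(Code) are options handed to http_open/3 and http_post_data/3; the reply handling below is just one way to drain the stream):
:- use_module(library(http/http_open)).

update(URL, Arg) :-
    setup_call_cleanup(
        http_open(URL, In,
                  [ post(form([update = Arg])),
                    status_code(Code)
                  ]),
        read_string(In, _Length, _Body),   % drain whatever body the server sends
        close(In)),
    Code == 204.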
