Send a large file with HTTP.jl

I would like to implement a server with HTTP.jl and Julia. After some computation the server would return a "large" file (several hundred MB). I would like to avoid having to read the whole file into memory and then send it to the client.
Some frameworks have a specific function for this (e.g. Flask's send_file, http://flask.pocoo.org/docs/0.12/api/#flask.send_file) or allow streaming the content to the client (http://flask.pocoo.org/docs/0.12/patterns/streaming/).
Is one of these two options also available in HTTP.jl, or in any other Julia web package?
Here is some test code which reads the file testfile.txt, but I want to avoid loading the complete file into memory.
import HTTP
using Sockets  # for the ip"" string macro

# create a small test file
f = open("testfile.txt", "w")
write(f, "test")
close(f)

router = HTTP.Router()

function testfun(req::HTTP.Request)
    f = open("testfile.txt")
    data = read(f)  # reads the whole file into memory
    close(f)
    return HTTP.Response(200, data)
end

HTTP.register!(router, "GET", "/testfun", HTTP.HandlerFunction(testfun))
server = HTTP.Servers.Server(router)
task = @async HTTP.serve(server, ip"127.0.0.1", 8000; verbose=false)
sleep(1.0)

req = HTTP.request("GET", "http://127.0.0.1:8000/testfun/")

# end server
put!(server.in, HTTP.Servers.KILL)

@show String(req.body)

You can use memory-mapped IO like this:
using Mmap

function testfun(req::HTTP.Request)
    data = Mmap.mmap(open("testfile.txt"), Array{UInt8,1})
    return HTTP.Response(200, data)
end
data now looks like a normal byte array to Julia, but is actually linked to the file, which might be exactly what you want. The file will be closed upon garbage collection; if you serve many requests and no garbage collection is triggered, you might end up with a lot of open files. If your request takes quite long anyway, you might consider calling gc() (GC.gc() on Julia ≥ 1.0) at the beginning of the request.
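Alternatively, newer versions of HTTP.jl expose a streaming handler API, so the response can be written in chunks without ever holding the whole file in memory. A minimal sketch, assuming HTTP.jl 1.x (where stream=true hands the handler a raw HTTP.Stream; the handler name and chunk size are arbitrary):

using HTTP

function stream_file(http::HTTP.Stream)
    HTTP.setstatus(http, 200)
    HTTP.setheader(http, "Content-Type" => "application/octet-stream")
    HTTP.startwrite(http)
    open("testfile.txt") do io
        buf = Vector{UInt8}(undef, 64 * 1024)  # 64 KiB per write
        while !eof(io)
            n = readbytes!(io, buf)
            write(http, view(buf, 1:n))
        end
    end
end

HTTP.serve(stream_file, "127.0.0.1", 8000; stream=true)

This keeps at most one buffer's worth of the file in memory at a time, at the cost of one write call per chunk.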

Related

Lua - Download file asynchronously via HTTP

I just finished reading the copas core code, and I want to write code that downloads files from a website asynchronously, but copas seems to support only socket IO.
Since Lua does not provide async syntax, other packages will surely have their own event loop which, I think, cannot run alongside copas' loop.
So to download a file asynchronously via HTTP, do I have to find a package that supports async HTTP and async file IO at the same time? Or are there any other ideas?
After reading bunches of code, I can finally answer my own question.
As I mentioned in my comment to the question, one can make use of the step function exported by an async IO library and merge multiple step calls into one bigger loop.
In the case of luv, it uses an external thread pool in C to manage file IO and a single-threaded loop to run pending callbacks and manage IO polling (polling is not needed in my use case).
One can simply call the file operation functions provided by luv to do async file IO, but one still needs to step luv's loop so that the callbacks bound to IO operations get called.
The integrated main loop goes like this:
local function main_loop()
    copas.running = true
    while not copas.finished() or uv.loop_alive() do
        if not copas.finished() then
            copas.step()
        end
        if uv.loop_alive() then
            uv.run("nowait")
        end
    end
end
copas.step() is the stepping function of copas, and uv.run("nowait") makes luv run just one pass of its event loop, without blocking if no IO is ready when polling.
A working solution looks like this:
local copas = require "copas"
local http = require "copas.http"
local uv = require "luv"

local urls = {
    "http://example.com",
    "http://example.com"
}

local function main_loop()
    copas.running = true
    while not copas.finished() or uv.loop_alive() do
        if not copas.finished() then
            copas.step()
        end
        if uv.loop_alive() then
            uv.run("nowait")
        end
    end
end

local function write_file(file_path, data)
    -- ** call to luv async file IO **
    -- 438 is the file permission mode (0666 in octal)
    uv.fs_open(file_path, "w+", 438, function(err, fd)
        assert(not err, err)
        uv.fs_write(fd, data, nil, function(err_o, _)
            assert(not err_o, err_o)
            uv.fs_close(fd, function(err_c)
                assert(not err_c, err_c)
                print("finished:", file_path)
            end)
        end)
    end)
end

local function dl_url(url)
    local content, _, _, _ = http.request(url)
    write_file("foo.txt", content)
end

-- adding tasks to copas' loop
for _, url in ipairs(urls) do
    copas.addthread(dl_url, url)
end

main_loop()

urequests micropython problem (multiple POST requests to google forms)

I'm trying to send data to Google Forms directly (without an external service like IFTTT) using an esp8266 with MicroPython. I've already used IFTTT, but at this point it is not useful for me: I need a sampling rate of 100 Hz or more, and as you know this exceeds IFTTT's usage limit. I've tried making a RAM buffer, but I got an error saying that the buffer exceeded the RAM size (4 MB), so that's why I'm trying to send directly.
After trying for some time I got it working, partially. I say "partially" because I have to do a random GET request after the POST request; I don't know why it works, but it works (this way I can send data to Google Forms roughly every second, or maybe less). I guess the problem is that the esp8266 can't close the connection with Google Forms and gets stuck when it tries to do a new POST request. If that is the problem, I don't know how else to fix it; any suggestions? The complete code is here:
ssid = 'my_network'
password = 'my_password'

import urequests

def do_connect():
    import network
    sta_if = network.WLAN(network.STA_IF)
    if not sta_if.isconnected():
        print('connecting to network...')
        sta_if.active(True)
        sta_if.connect(ssid, password)
        while not sta_if.isconnected():
            pass
    print('network config:', sta_if.ifconfig())

def main():
    do_connect()
    print("CONNECTED")
    url = 'url_of_my_google_form'
    form_data = 'entry.61639300=example'  # have to change the entry
    user_agent = {'Content-Type': 'application/x-www-form-urlencoded'}
    while True:
        response = urequests.post(url, data=form_data, headers=user_agent)
        print("DATA HAVE BEEN SENT")
        response.close
        print("TRYING TO SEND ANOTHER ONE...")
        response = urequests.get("http://micropython.org/ks/test.html")  # <------ RANDOM URL, I DON'T KNOW WHY THIS CODE WORKS CORRECTLY IN THIS WAY
        print("RANDOM GET:")
        print(response.text)
        response.close

if __name__ == '__main__':
    main()
Thank you for your time, guys. Also, I had tried the following code before, but it DOESN'T WORK: without the random GET request, it gets stuck after one or two POSTs:
while True:
    response = urequests.post(url, data=form_data, headers=user_agent)
    print("DATA HAVE BEEN SENT")
    response.close
    print("TRYING TO SEND ANOTHER ONE...")
Shouldn't it be response.close() (with brackets)? 🤔
Without brackets you only reference the method close of the response object instead of calling it, so you never actually close the connection. This can lead to memory overflow.
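For reference, here is the failing loop with just that call fixed (everything else unchanged):

while True:
    response = urequests.post(url, data=form_data, headers=user_agent)
    print("DATA HAVE BEEN SENT")
    response.close()  # parentheses: actually calls close() and frees the socket
    print("TRYING TO SEND ANOTHER ONE...")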

TidHTTPServer "Out of memory" on large file upload

I'm using Delphi 10.3.1 and Indy TIdHTTP / TIdHTTPServer
I created a client / server application to archive files.
The client uses a TIdHTTP component, the code is something like this:
procedure TForm1.SendFileClick(Sender: TObject);
var
  Stream: TIdMultipartFormDataStream;
begin
  Stream := TIdMultipartFormDataStream.Create;
  try
    Stream.AddFormField('field1', 'hello world');
    Stream.AddFile('field2', 'c:\temp\gigafile.mp4');
    idHTTP.Post('http://192.168.1.100:1717', Stream);
  finally
    Stream.Free;
  end;
end;
The server uses a TIdHTTPServer component.
Everything seemed to work perfectly until I uploaded very large video files (>= 1 GB), because then I got an "Out of memory" error.
By debugging, I saw that the error occurs in the function PreparePostStream (line 1229 of the IdCustomHTTPServer unit) when it calls LIOHandler.ReadStream; the OnCommandGet event has not fired yet.
The function LIOHandler.ReadStream goes wrong when it runs AdjustStreamSize (line 2013 of the IdIOHandler unit).
In my last test, with a large video file, the value of ASize in the AdjustStreamSize function was 1091918544, and I got the error during the execution of the line AStream.Size := ASize.
I think the origin of the error is in the System.Classes unit, in the following procedure, at the SetPointer... line:
procedure TMemoryStream.SetCapacity(NewCapacity: NativeInt);
{$IF SizeOf(LongInt) = SizeOf(NativeInt)}
begin
  SetPointer(Realloc(LongInt(NewCapacity)), FSize);
  FCapacity := NewCapacity;
end;
I read many articles on the web, but I couldn't figure out whether there is something wrong in my code.
How can I solve this, or is there a limit on the size of the files I can upload with TIdHTTPServer?
By default, TIdHTTPServer receives posted data using a TMemoryStream, which will obviously not work well for such large files. You can use the server's OnCreatePostStream event to provide an alternative TStream object to receive into, such as a TFileStream.
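A minimal sketch of such a handler (the event signature is Indy 10's; the temp-file path is just an illustration):

procedure TForm1.IdHTTPServer1CreatePostStream(AContext: TIdContext;
  AHeaders: TIdHeaderList; var VPostStream: TStream);
begin
  // receive the request body on disk instead of in a TMemoryStream
  VPostStream := TFileStream.Create('c:\temp\upload.tmp', fmCreate);
end;

In OnCommandGet the posted data is then available as ARequestInfo.PostStream, backed by the file rather than by RAM.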
Delphi by default seems to have some kind of limit on memory usage; by adding these lines to the .DPR project file:
const
  IMAGE_FILE_LARGE_ADDRESS_AWARE = $0020;
{$SetPEFlags IMAGE_FILE_LARGE_ADDRESS_AWARE}
applications can use up to 2.5 GB of RAM on 32-bit versions of Windows and up to 3.5 GB of RAM on 64-bit versions.
(https://cc.embarcadero.com/item/24309)
Anyway, I think @RemyLebeau's solution is the best.

Invalidate/prevent memoize with plone.memoize.ram

I have a Zope utility with a method that performs network processes.
As the result is valid for a while, I'm using plone.memoize.ram to cache it:
class MyClass(object):

    @cache(cache_key)
    def do_auth(self, adapter, data):
        # performing expensive network process here
        ...
...and the cache key function:
def cache_key(method, utility, data):
    return time() // (60 * 60)
But I want to prevent the memoization from taking place when the do_auth call returns empty results (or raises network errors).
Looking at the plone.memoize code, it seems I need to raise the ram.DontCache() exception, but before doing that I need a way to inspect the old cached value.
How can I get the cached data from the cache storage?
I put this together from several pieces of code I wrote...
It's not tested but may help you.
You may access the cached data using the ICacheChooser utility.
Its call method needs the dotted name of the function you cached, in your case:
key = '{0}.{1}'.format(__name__, method.__name__)
cache = getUtility(ICacheChooser)(key)
storage = cache.ramcache._getStorage()._data
cached_infos = storage.get(key)
In cached_infos there should be all the info you need.
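Assembled into a helper, it might look like this (a rough, untested sketch; _getStorage()._data are private RAMCache internals, and the exact key layout may differ in your setup):

from zope.component import getUtility
from plone.memoize.interfaces import ICacheChooser

def get_cached_value(method):
    # dotted name plone.memoize.ram uses as the storage key for the decorated method
    key = '{0}.{1}'.format(method.__module__, method.__name__)
    cache = getUtility(ICacheChooser)(key)
    storage = cache.ramcache._getStorage()._data
    return storage.get(key)  # None when nothing has been cached yet

With that, your cache_key function can inspect the previous value and raise ram.DontCache() when caching should be skipped.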

Seeking not working in HTML5 audio tag

I have a lighttpd server running locally. If I load a static file on the server (through an HTML5 audio tag), it plays and seeks fine.
However, seeking doesn't work when running a dev server (web.py/CherryPy) or if I return the bytes via a defined action URL instead of as a static file. It won't load the duration either.
According to the "HTTP byte range requests" section on this Opera page, it has something to do with support for byte range requests/partial content responses. The content is treated as streaming instead.
What I don't understand is:
If the browser has the whole file downloaded, surely it can display the duration and surely it can seek.
What I need to do on the web server to enable byte range requests (for non-static URLs).
Any advice would be most gratefully received.
Here's some web.py code to get you started (just happened to need this as well and ran into your question):
## experimental partial content support
## perhaps this shouldn't be enabled by default
range_header = web.ctx.env.get('HTTP_RANGE')
if range_header is None:
    return result

# only simple "bytes=start-" and "bytes=start-end" ranges are handled here
total = len(result)
_, r = range_header.split("=")
partial_start, partial_end = r.split("-")
start = int(partial_start)
if not partial_end:
    end = total - 1   # open-ended range: serve to the end of the file
else:
    end = int(partial_end)
chunksize = (end - start) + 1

web.ctx.status = "206 Partial Content"
web.header("Content-Range", "bytes %d-%d/%d" % (start, end, total))
web.header("Accept-Ranges", "bytes")
web.header("Content-Length", chunksize)
return result[start:end + 1]
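For reference, the exchange this enables looks roughly like the following (the byte counts are illustrative):

GET /audio HTTP/1.1
Range: bytes=0-1023

HTTP/1.1 206 Partial Content
Content-Range: bytes 0-1023/146515
Accept-Ranges: bytes
Content-Length: 1024

Browsers issue such Range requests when seeking; answering with 206 and a correct Content-Range (instead of 200 with the whole body) is what makes the seek bar work.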
Google tells me you have to use the staticFilter for byte ranges to work in CherryPy - but that is for static files only. Luckily this posting also includes pointers on how to do it for non-static data :-)
