Erlang Nitrogen file download over HTTP

I need to provide a file download feature in my Nitrogen app.
In principle I need to set headers like this:
wf:header("Content-Disposition", "attachment; filename=\"" ++ Filename ++ "\""),
but I can't find a function in the Nitrogen API to send my file's data in blocks.
I need to send the file in portions because the files might be very large. In addition, the files are not available on local storage; the binary data is obtained from other modules. So in practice I need to handle sending blocks of data to the HTTP stream myself.
Any idea or example of how to do that, and which API function can be used?

The best answer I can give you is one I gave a few days ago on the Nitrogen mailing list:
There isn't a great way to deal with this, but there are two approaches:
1) Use the underlying server's streaming mechanisms (such as making a Cowboy-specific dispatch table that targets a Cowboy handler module that deals with the streaming), or a Yaws outfile.
2) There's a bit of a hack that can work in simple_bridge if you're using Cowboy: have your module's main() function return the tuple {stream, StreamFun}, where StreamFun is a function of arity 2, fun(Socket, Transport) (Transport being a Ranch transport). Really, this is just a shortcut that allows you to use Transport:send(Socket, Data) to send data. I'll admit I haven't done this before, but it should work with a little bit of tinkering.
Adding this as an actual option to simple_bridge and Nitrogen would
probably be worthwhile.
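To make option 2 concrete, here is a rough, untested sketch of what such a page module might look like; my_source:next_block/0 is a hypothetical stand-in for whatever module supplies the binary data:

%% Hypothetical Nitrogen page module illustrating option 2:
%% main() returns {stream, StreamFun} under Cowboy/simple_bridge.
-module(download_page).
-export([main/0]).

main() ->
    wf:header("Content-Disposition",
              "attachment; filename=\"data.bin\""),
    {stream, fun(Socket, Transport) ->
        send_blocks(Socket, Transport)
    end}.

%% Pull blocks from the (assumed) data source and push each one down
%% the socket until the source is exhausted.
send_blocks(Socket, Transport) ->
    case my_source:next_block() of
        {ok, Bin} ->
            ok = Transport:send(Socket, Bin),
            send_blocks(Socket, Transport);
        eof ->
            ok
    end.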


Use Julia to perform computations on a webpage

I was wondering if it is possible to use Julia to perform computations on a webpage in an automated way.
For example, suppose we have a 3x3 HTML form in which we input some numbers. These form a square matrix A, and finding its eigenvalues in Julia is pretty straightforward. I would like to use Julia to make the computation and then return the results.
In my understanding (which is limited in this direction) I guess the process should be something like:
collect the data entered in the form
send the data to a machine which has Julia installed
run the Julia code with the given data and store the result
send the result back to the webpage and show it.
Do you think something like this is possible? (I've seen some stuff using HttpServer which allows computation with the browser, but I'm not sure it's the right thing to use.) If yes, what are the things I need to look into? Do you have any examples of such implementations of web calculations?
If you are using or can use Node.js, you can use node-julia. It has some limitations, but should work fine for this.
Coincidentally, I was already mostly done with putting together an example that does this. A rough mockup is available here, which uses express to serve the pages and plotly to display results (among other node modules).
Another option would be to write the server itself in Julia using Mux.jl and skip server-side javascript entirely.
Yes, it can be done with HttpServer.jl
It's pretty simple - you make a small script that starts your HttpServer, which then listens on the designated port. Part of configuring the web server is defining some handlers (functions) that are invoked when certain events take place in your app's life cycle (new request, error, etc.).
Here's a very simple official example:
https://github.com/JuliaWeb/HttpServer.jl/blob/master/examples/fibonacci.jl
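Adapted to the eigenvalue example from the question, a minimal sketch (untested, using HttpServer.jl's HttpHandler/Server API; the matrix is hard-coded where a real app would parse it out of the form data) might look like:

using HttpServer

http = HttpHandler() do req, res
    # In a real app, build A from the submitted form fields;
    # a 3x3 matrix is hard-coded here for illustration.
    A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
    Response(string("Eigenvalues: ", eigvals(A)))
end

server = Server(http)
run(server, 8000)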
However, things can get complex fast:
even in this simple case you need to perform two actions:
a. render your HTML page where you take the user input (by default)
b. render the response page as a consequence of receiving a POST request
you'll need to extract the data payload coming through the form. Data sent via GET is easy to reach; data sent via POST, not so much.
if you expose this to users you need to set up some failsafe measures to respawn your server script - otherwise it might just crash and exit.
if you open your script to the world you must make sure that it's not vulnerable to attacks - you don't want to empower a hacker to execute arbitrary Julia code on your server or access your DB.
So for basic usage on a small case, yes, HttpServer.jl should be enough.
If however you expect a bigger project, you can give Genie a try (https://github.com/essenciary/Genie.jl). It's still a work in progress but it handles most of the low-level work, allowing developers to focus on the specific app logic rather than on the transport layer (Genie's author here, btw).
If you get stuck there are GitHub issues and a Gitter channel.
Try Escher.jl.
This enables you to build up the web page in Julia.

Can I append or overwrite bytes in an existing object in OpenStack Swift?

I need to append some bytes to an existing object stored in OpenStack Swift - say, a log file object to which new log entries are constantly appended. Is this possible?
Moreover, can I change (overwrite) some bytes (specified by offset and length) in an existing object?
I believe ZeroVM (zerovm.org) would be perfect for doing this.
Disclaimer: I work for Rackspace, who owns ZeroVM. Opinions are mine and mine alone.
tl;dr: There's no append support currently in Swift.
There's a blueprint for Swift append support: https://blueprints.launchpad.net/swift/+spec/object-append. It doesn't look very active.
user2195538 is correct. Using ZeroVM + Swift (via the ZeroCloud middleware for Swift) you could get a performance boost on large-ish objects by sending deltas to a ZeroVM app and processing them in place. Of course you still have to read/update/write the file, but you can do it in place. You don't need to pipe the entire file over the network, which could/would be costly for large files.
Disclaimer: I also work for Rackspace, and I work on ZeroVM for my day job.
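Until append support lands, the workaround implied above is a full read-modify-write cycle. A minimal sketch using python-swiftclient (the auth details, container, and object names are placeholders):

from swiftclient import client

# Placeholder credentials - substitute your own auth details.
conn = client.Connection(authurl="http://example.com/auth/v1.0",
                         user="account:user", key="secret")

# "Append" by downloading the object, extending it, and re-uploading.
headers, body = conn.get_object("logs", "app.log")
conn.put_object("logs", "app.log", contents=body + b"new log line\n")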

How do I speed up loading many small RDF files into Sesame?

I'm working with an RDF dataset generated as part of our data collection which consists of around 1.6M small files totalling 6.5G of text (ntriples) and around 20M triples. My problem relates to the time it's taking to load this data into a Sesame triple store running under Tomcat.
I'm currently loading it from a Python script via the HTTP api (on the same machine) using simple POST requests one file at a time and it's taking around five days to complete the load. Looking at the published benchmarks, this seems very slow and I'm wondering what method I might use to load the data more quickly.
I did think that I could write Java code to connect directly to the store and so avoid the HTTP overhead. However, I read in an answer to another question here that concurrent access is not supported, so that doesn't look like an option.
If I were to write Java code to connect to the HTTP repository does the Sesame library do some special magic that would make the data load faster?
Would grouping the files into larger chunks help? This would cut down the HTTP overhead for sending the files. What size of chunk would be good? This blog post suggests 100,000 lines per chunk (it's cutting a larger file up, but the idea would be the same).
Thanks,
Steve
If you are able to work in Java instead of Python I would recommend using the transactional support of Sesame's Repository API to your advantage - start a transaction, add several files, then commit; rinse & repeat until you've sent all files.
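A rough sketch of that loop using the openrdf classes from Sesame 2.x (the server URL, repository ID, data directory, and batch size are assumptions):

import java.io.File;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.http.HTTPRepository;
import org.openrdf.rio.RDFFormat;

public class BatchLoader {
    public static void main(String[] args) throws Exception {
        Repository repo = new HTTPRepository(
                "http://localhost:8080/openrdf-sesame", "myRepo");
        repo.initialize();
        RepositoryConnection con = repo.getConnection();
        try {
            int inTx = 0;
            con.begin();                  // start a transaction
            for (File f : new File("/data/ntriples").listFiles()) {
                con.add(f, null, RDFFormat.NTRIPLES);
                if (++inTx >= 1000) {     // commit every 1000 files
                    con.commit();
                    con.begin();
                    inTx = 0;
                }
            }
            con.commit();                 // commit the final batch
        } finally {
            con.close();
        }
    }
}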
If that is not an option then indeed chunking the data into larger files (or larger POST request bodies - you of course do not necessarily need to physically modify your files) would help. A good chunk size would probably be around 500,000 triples in your case - it's a bit of a guess to be honest, but I think that will give you good results.
You can also cut down on overhead by using gzip compression on the POST request body (if you don't do so already).
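Sticking with Python, a rough sketch of chunked, gzipped POSTs (the statements endpoint is Sesame's standard REST path, but the server URL, repository ID, and chunk size are assumptions):

import gzip
import requests

ENDPOINT = ("http://localhost:8080/openrdf-sesame"
            "/repositories/myRepo/statements")
CHUNK_TRIPLES = 500000  # roughly the chunk size suggested above

def post_chunk(lines):
    body = gzip.compress("".join(lines).encode("utf-8"))
    resp = requests.post(ENDPOINT, data=body,
                         headers={"Content-Type": "text/plain",  # N-Triples
                                  "Content-Encoding": "gzip"})
    resp.raise_for_status()

def load(paths):
    buf = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            buf.extend(f)           # one N-Triples statement per line
        if len(buf) >= CHUNK_TRIPLES:
            post_chunk(buf)
            buf = []
    if buf:                         # flush the final partial chunk
        post_chunk(buf)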

Qt - How to save downloaded data for a multipart downloader

I'm writing a multipart downloader in Qt. Multiple QNetworkRequests with the HTTP header "Range" are used to download a file. Right now I write the data received in each part (QNetworkReply) to file.part1, file.part2, etc.
Is it possible to write data to the same file simultaneously? Do I have to implement a lock for it, and what is the best way to save data in my application?
Any suggestions?
Why not just merge the file parts when you are finished? You can write the parts to a QFile easily. Perhaps something about the approach or the data keeps you from doing this, but if you can, it's probably the approach I would take before dealing with treating a QFile as a shared resource.
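A rough, untested sketch of that merge step (the part-file naming scheme is an assumption):

#include <QFile>
#include <QString>

bool mergeParts(const QString &baseName, int partCount)
{
    QFile out(baseName);
    if (!out.open(QIODevice::WriteOnly))
        return false;

    for (int i = 1; i <= partCount; ++i) {
        QFile part(QString("%1.part%2").arg(baseName).arg(i));
        if (!part.open(QIODevice::ReadOnly))
            return false;
        // Stream each part across in modest blocks to bound memory use.
        while (!part.atEnd())
            out.write(part.read(1 << 16));
        part.close();
        part.remove();   // discard the part file once copied
    }
    return true;
}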
If you want multiple concurrent replies to be able to write to and access the QFile, then yes, the QFile becomes a shared resource. As far as I know, you're going to need to lock it. At that point, you have several options. You can have the slot(s) handling the replies attempt to acquire a QSemaphore, or you can use QMutex and QMutexLocker if you'd prefer to lock on a mutex. You could treat it as a multiple producer (the various QNetworkReplys) single consumer (whatever is writing to the file) problem (here's a Stack Overflow post that provides some useful links) if you want to go that route. In short, there are numerous approaches here, all of which I think are more of a hassle than simply merging the file.part files at the end if you're able to go that route.
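If you do treat the QFile as a shared resource, a minimal QMutex-based sketch (untested) could look like this, with each reply handler writing at its own byte offset:

#include <QFile>
#include <QMutex>
#include <QMutexLocker>
#include <QString>

class SharedWriter
{
public:
    explicit SharedWriter(const QString &path) : m_file(path)
    {
        m_file.open(QIODevice::ReadWrite);  // open once, up front
    }

    // Called from the slot handling each QNetworkReply's data.
    bool writeAt(qint64 offset, const QByteArray &data)
    {
        QMutexLocker locker(&m_mutex);      // serialize file access
        if (!m_file.seek(offset))
            return false;
        return m_file.write(data) == data.size();
    }

private:
    QFile m_file;
    QMutex m_mutex;
};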
In terms of merging to a single QFile concurrently, there may be an easier Qt way of doing it, but I've never found it. Perhaps someone else can chime in if such a method exists.
I'm not sure what you mean by "which is the best way to save data in my application?" Are you referring to saving application-specific settings in a persistent manner? If so, look into QSettings. If you're referring to saving the data you're downloading, I'd probably write it to a QFile, just like you appear to be doing. Although it's hard to know for sure without knowing more about what you're downloading.

Store map key/values in a persistent file

I will be creating a structure more or less of the form:
type FileState struct {
    LastModified int64
    Hash         string
    Path         string
}
I want to write these values to a file and read them in on subsequent calls. My initial plan is to read them into a map and look up values (Hash and LastModified) using the key (Path). Is there a slick way of doing this in Go?
If not, what file format can you recommend? I have read about and experimented with some key/value file stores in previous projects, but not using Go. Right now, my requirements are probably fairly simple, so a big database server system would be overkill. I just want something I can write to and read from quickly, easily, and portably (Windows, Mac, Linux). Because I have to deploy on multiple platforms, I am trying to keep my non-Go dependencies to a minimum.
I've considered XML, CSV, JSON. I've briefly looked at the gob package in Go and noticed a BSON package on the Go package dashboard, but I'm not sure if those apply.
My primary goal here is to get up and running quickly, which means the least amount of code I need to write along with ease of deployment.
As long as your entire data set fits in memory, you shouldn't have a problem. Using an in-memory map and writing snapshots to disk regularly (e.g. by using the gob package) is a good idea. The Practical Go Programming talk by Andrew Gerrand uses this technique.
If you need to access those files with different programs, using a popular encoding like JSON or CSV is probably a good idea. If you just have to access those files from within Go, I would use the excellent gob package, which has a lot of nice features.
As soon as your data becomes bigger, it's not a good idea to always write the whole database to disk on every change. Also, your data might not fit into RAM anymore. In that case, you might want to take a look at the leveldb key-value database package by Nigel Tao, another Go developer. It's currently under active development (but not yet usable), but it will also offer some advanced features like transactions and automatic compression. Also, the read/write throughput should be quite good because of the leveldb design.
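A short sketch of the snapshot approach with gob (the file name is an assumption, and error handling is kept minimal):

package main

import (
    "encoding/gob"
    "os"
)

type FileState struct {
    LastModified int64
    Hash         string
    Path         string
}

// save snapshots the whole in-memory map to disk.
func save(path string, m map[string]FileState) error {
    f, err := os.Create(path)
    if err != nil {
        return err
    }
    defer f.Close()
    return gob.NewEncoder(f).Encode(m)
}

// load reads a snapshot back into a map on startup.
func load(path string) (map[string]FileState, error) {
    f, err := os.Open(path)
    if err != nil {
        return nil, err
    }
    defer f.Close()
    m := make(map[string]FileState)
    err = gob.NewDecoder(f).Decode(&m)
    return m, err
}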
There's an ordered, key-value persistence library for Go that I wrote called gkvlite -
https://github.com/steveyen/gkvlite
JSON is very simple but makes bigger files because of the repeated variable names. XML has no advantage. You should go with CSV, which is really simple too. Your program will be less than a page long.
But it depends, in fact, upon your modifications. If you make a lot of modifications and must have them stored synchronously on disk, you may need something a little more complex than a single file. If your map is mainly read-only, or if you can afford to dump it to file rarely (not every second), a single CSV file alongside an in-memory map will keep things simple and efficient.
BTW, use Go's encoding/csv package to do this.
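A short sketch of that CSV round trip with encoding/csv (the column order is an assumption):

package main

import (
    "encoding/csv"
    "os"
    "strconv"
)

type FileState struct {
    LastModified int64
    Hash         string
    Path         string
}

// saveCSV writes one record per entry: Path, Hash, LastModified.
func saveCSV(path string, m map[string]FileState) error {
    f, err := os.Create(path)
    if err != nil {
        return err
    }
    defer f.Close()
    w := csv.NewWriter(f)
    for _, fs := range m {
        rec := []string{fs.Path, fs.Hash,
            strconv.FormatInt(fs.LastModified, 10)}
        if err := w.Write(rec); err != nil {
            return err
        }
    }
    w.Flush()
    return w.Error()
}

// loadCSV reads the records back into a map keyed by Path.
func loadCSV(path string) (map[string]FileState, error) {
    f, err := os.Open(path)
    if err != nil {
        return nil, err
    }
    defer f.Close()
    recs, err := csv.NewReader(f).ReadAll()
    if err != nil {
        return nil, err
    }
    m := make(map[string]FileState, len(recs))
    for _, rec := range recs {
        t, err := strconv.ParseInt(rec[2], 10, 64)
        if err != nil {
            return nil, err
        }
        m[rec[0]] = FileState{LastModified: t, Hash: rec[1], Path: rec[0]}
    }
    return m, nil
}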
