File upload and store with lighttpd - multipart

I am running lighttpd on Linux on an embedded platform.
Now I want to make it possible to transfer a file to the system, with an upload web page containing a file selector and an "Upload" button (an HTML form with an <input type="file"> element and a submit button). The selected file is transferred in a POST HTTP request as multipart/form-data. The file should then simply be stored as a regular file in the file system.
I already have a CGI interface: a bash script which receives the request and passes it to the backend C++ application. And because it is an embedded platform, I would like to avoid pulling in PHP, Python, etc. just for this case.
As far as I can see, lighttpd is not able to save the files from a multipart-encoded request body directly to plain files, correct?
To decode the body I found the 'munpack' tool from the mpack package, which writes the encoded body parts to files on disk, but is intended for MIME-encoded emails. Nevertheless I can call it in the CGI bash script, and it works almost as expected, except that it can't handle the terminating boundary (the boundary ID given in 'Content-Type', appended with two dashes), resulting in the last file still containing the final boundary. Update: this munpack behaviour came from a faulty script, but it still doesn't work: munpack produces wrong files when the body contains CRLF line endings; only LF produces the correct result.
Is there any other direct request-to-file-on-disk approach? Or do I really have to filter out the terminating boundary manually in the script, or write a multipart-message parser in my C++ application?
To make the use case clear: a user should be able to upload a firmware file to my system. So he connects to my system with a web browser and receives an upload page where he can select the file and send it with an "Upload" button. This transferred file should then simply be stored on my system. The CGI script for receiving the request already exists (as does a C++ backend where I could handle the request), the only problem is converting the multipart/form-data-encoded file to a plain file on disk.
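
Since munpack mangles CRLF bodies, one workaround is to strip the multipart framing in the CGI bash script itself using byte offsets instead of line-oriented tools, so the payload bytes are never rewritten. A minimal sketch, assuming a single file part with CRLF framing per RFC 7578; the function name and paths are made up, and it needs grep -b and head -c (present in GNU coreutils and busybox):

```shell
#!/bin/sh
# extract_part BODY_FILE BOUNDARY OUT_FILE
# Copies the file payload of a single-part multipart/form-data body
# (already saved to disk by the CGI script) to OUT_FILE, byte for byte.
extract_part() {
    body=$1; boundary=$2; out=$3
    # Byte offset just past the first blank (CRLF-only) line, i.e. the
    # end of the part headers. awk exits there, before the payload.
    start=$(awk '{ off += length($0) + 1; if ($0 == "\r") { print off; exit } }' "$body")
    # Byte offset of the terminating boundary "--BOUNDARY--"; the two
    # bytes before it are the CRLF that ends the payload.
    end=$(grep -F -abo -- "--$boundary--" "$body" | tail -1 | cut -d: -f1)
    # Copy exactly the payload bytes - no line processing, so CRLF or
    # binary content passes through untouched.
    tail -c +$((start + 1)) "$body" | head -c $((end - 2 - start)) > "$out"
}
```

The boundary value itself comes from the request's Content-Type header (CONTENT_TYPE in the CGI environment), e.g. multipart/form-data; boundary=XX.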

Now I want to make it possible to transfer a file to the system, through a POST HTTP request. The file should simply be stored as a regular file in the file system.
That sounds more like it should be an HTTP PUT rather than an HTTP POST.
As far as i see, lighttpd is not able to save the received files directly from multipart-encoded request body to pure files, correct?
Do you mean application/x-www-form-urlencoded with the POST?
Why multipart-encoded? Are there multiple files being uploaded?
lighttpd mod_webdav supports PUT. Otherwise, you need your own program to handle the request body, be it a shell script or a compiled program. You can use libfcgi with your C++, or you can look at the C programs that lighttpd uses for testing, which implement FastCGI and SCGI in < 300 lines of C each.
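
If the PUT route is acceptable, the lighttpd side can be as small as enabling mod_webdav for one URL prefix; a config sketch (the /upload/ path is made up, and you would want to add access control around it):

```conf
# lighttpd.conf sketch: accept HTTP PUT under /upload/ via mod_webdav
server.modules += ( "mod_webdav" )
$HTTP["url"] =~ "^/upload/" {
    webdav.activate = "enable"
}
```

A client can then store a file with a plain PUT, e.g. curl -T firmware.bin http://device/upload/firmware.bin, with no multipart parsing involved. Note that a plain HTML form still sends POST, so this only helps if the upload page can use JavaScript (fetch/XHR with method PUT) or a non-browser client.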

Related

How can I tell if visiting a URL would download a file of a certain mimetype?

I am building an application that tells me whether visiting a URL would make a user download a file of a certain MIME type.
My question is: what information (like header fields) can be used to achieve this?
I was thinking about sending a HEAD request and looking at the Content-Disposition and Content-Type header fields. But an attacker might simply lie in these fields, and because of MIME sniffing my browser would still save the file.
Is there a way to get this information without downloading the file (which would cause unwanted traffic)?
EDIT:
I want to develop an application that gets a URL as input.
The output should be three things:
1: does visiting the URL make browsers save ("download") a file delivered by the webserver?
if 1:
2: what is the mimetype of this file?
3: what is the filename of this file?
Example: The URL https://foo.bar/game.exe visited with a browser saves the file game.exe.
How could I tell (without causing huge traffic by downloading the file) that the URL will: 1. make me download a file, 2. of type application/octet-stream, 3. named game.exe?
I already know how to make a head request. But can I really trust the Content-Disposition and Content-Type header fields? I have observed responses that did not contain a Content-Disposition field and my browser still saved the file. This would cause my application to think the URL is clear while it isn't.
Browsers do not guess the MIME type if the type is present in the Content-Type header (see MDN: MIME types).
So you can rely on this: if the Content-Type and/or Content-Disposition header is present, the browser will not guess.
Now, in order to detect what you are actually getting, the best way is to request the head of the file (the first few bytes) and decipher the magic value from it (i.e. the *NIX file(1) way of determining what a file is).
This is more reliable and less risky than depending on the file extension.
But if you need a foolproof method to determine whether a file will be downloaded, there isn't one that I know of.
This can be done using curl, with the -I option (to fetch headers only), like so:
curl -I https://www.irs.gov/pub/irs-pdf/f1040.pdf
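
The magic-value idea above can be sketched as a small shell helper: classify by the leading bytes rather than trusting Content-Type. The function name and the tiny signature table are illustrative, not exhaustive:

```shell
#!/bin/sh
# file_magic FILE: print a coarse type based on the file's leading magic
# bytes, the same idea as *NIX file(1).
file_magic() {
    # First four bytes as a hex string, e.g. "25504446" for "%PDF".
    sig=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case $sig in
        25504446) echo pdf ;;       # "%PDF"
        89504e47) echo png ;;       # PNG signature
        4d5a*)    echo exe ;;       # "MZ" - DOS/Windows executable
        *)        echo unknown ;;
    esac
}
```

Only a short prefix is needed, so for a remote URL you can fetch it cheaply with a Range request, e.g. curl -s -r 0-3 "$url" -o prefix && file_magic prefix; servers that ignore Range just send the whole body, so you may want to cap the download size as well.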

What are these two methods by which a web server handles a HTTP request?

From https://en.wikipedia.org/wiki/Query_string
A web server can handle a Hypertext Transfer Protocol request either
by reading a file from its file system based on the URL path or by
handling the request using logic that is specific to the type of
resource. In cases where special logic is invoked, the query string
will be available to that logic for use in its processing, along with
the path component of the URL.
What does the quote mean by the two methods by which a web server can handle an HTTP request:
"by reading a file from its file system based on the URL path"
"by handling the request using logic that is specific to the type of resource"?
Can you give specific examples to explain the two methods?
Is the query string used in both methods?
Thanks.
by reading a file from its file system based on the URL path
^ The web site uses a generic mapping mechanism to convert a URL path to a local filesystem path, and then returns the file located at that path. This is common with static files like .css.
by handling the request using logic that is specific to the type of resource
^ The web site turns control over to a web application, which contains code written by a developer. The code reads the query string and decides what to do. The logic for deciding what to do is completely customizable, and there does not need to be a static file in the local filesystem that matches the URL.
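
A hypothetical CGI script makes the second method concrete: the server does not map the URL to any file on disk, it runs this code, which reads the query string from the environment (per the CGI spec) and builds the response. The /cgi-bin/search path and the q parameter are made up for illustration:

```shell
#!/bin/sh
# Hypothetical /cgi-bin/search handler: for a request like
# /cgi-bin/search?q=cats the server sets QUERY_STRING to "q=cats"
# and runs this script instead of reading a file from disk.
handle_request() {
    printf 'Content-Type: text/plain\r\n\r\n'
    printf 'You searched for: %s\n' "${QUERY_STRING#q=}"
}
handle_request
```

By contrast, the first method needs no code at all: the server resolves /style.css to something like /var/www/htdocs/style.css and streams that file back, and the query string (if any) is simply ignored.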

Detecting if a URL is a file download

How can I detect if a given URL is a file to be downloaded?
I came across the Content-Disposition header, however it seems that this isn't part of HTTP/1.1 directly (it is specified separately, in RFC 6266).
Is there a more standard way to detect if the response for a GET request made to a given URL is actually a file to/can be downloaded?
That is the response is not html or json or anything similar, but something like an image, mp3, pdf file etc.?
HTTP is a transfer protocol - which is a very different thing to hard drive storage layouts. The concept of "file" simply does not exist in HTTP. No more than your computer hard drive contains actual paper-and-cardboard "files" that one would see in an office filing system.
Whatever you may think the HTTP message or URL are saying the response content does not have to come from any computer file, and does not have to be stored in one by the recipient.
The response to any GET message in HTTP can always be "downloaded" by sending another GET request with that same URL (and maybe other headers in the case of HTTP/1.1 variants). That is built into the definition of what a GET message is and has nothing to do with files.
I ended up using the content-type to decide if it's an html file or some other type of file that is on the other end of a given URL.
I'm using the content-disposition header content to detect the original file name if it exists since the header isn't available everywhere.
Could checking for a file extension be a possibility? Sorry I can't enlarge on that much without knowing more, but I guess you could consider using PHP to implement this if HTML doesn't have enough functionality?

How to know when to resolve referer

I was working on my server and encountered the need to use request.headers.referer. When I did tests and read headers to determine how to write the parsing functions, I couldn't find a way to differentiate between requests invoked from a link outside the server, from outside the directory, or from a given HTML response requesting local resources. For instance,
Going from localhost/dir1 to localhost/dir2 using <a href="http://localhost/dir2"> will yield the response headers:
referer:"http://localhost/dir1" url:"/dir2"
while the HTML file sent from localhost/dir2 asking for resources using the local URI style.css will yield:
referer:"http://localhost/dir2" url:"/style.css"
and the same situation involving an image could end up
referer:"http://localhost/dir2" url:"/_images/image.png"
How would I prevent incorrect resolution between url and referer, e.g. accidentally parsing them as http://localhost/dir1/dir2 or http://localhost/_images/image.png and so on? Is there a way to tell in what way the browser is referring to the URI, and how can either the browser or the server identify when http://localhost/dir2/../dir1 is the intended destination?

How can I prevent an XSS vulnerability when using Flex and ASP.NET to save a file?

I've implemented a PDF generation function in my flex app using alivePDF, and I'm wondering if the process I've used to get the file to the user creates an XSS vulnerability.
This is the process I'm currently using:
Create the PDF in the flex application.
Send the binary PDF file to the server using a POST, along with the filename to deliver it as.
An ASP.NET script on the server checks the filename to make sure it's valid, and then sends it back to the user as an HTTP attachment.
Given that, what steps should I take to prevent XSS?
Are there any other GET or POST parameters other than the filename?
In preventing XSS, there are three main strategies: validation, escaping, and filtering.
Validation: Upon detecting invalid characters, reject the POST request (and issue an error to the user).
Escaping: Likely not applicable when saving the file, as your OS will have restrictions on valid file names.
Filtering: Automatically strip the POST filename parameter of any invalid characters. This is what I'd recommend for your situation.
Within the ASP.NET script, immediately grab the POST string and remove the following characters:
< > & ' " ? % # ; +
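
A sketch of that filtering step in shell (the original setting is ASP.NET, so treat this purely as an illustration of the idea; the function name is made up). It strips the characters listed above plus the path separators / and \:

```shell
#!/bin/sh
# sanitize_filename NAME: remove characters that are dangerous in a
# Content-Disposition header or in a filesystem path.
sanitize_filename() {
    # First pass drops < > & " ? % # ; + / \ ; second pass drops the
    # single quote (kept separate to avoid quoting contortions).
    printf '%s' "$1" | tr -d '<>&"?%#;+/\\' | tr -d "'"
}
```

Stripping is not a full defense on its own: with separators removed, a leading ".." can no longer traverse directories, but validating the result against an allowlist (or generating the server-side filename yourself) is still the safer design.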
How is this going to be XSS exploitable? You aren't outputting anything directly to the user. The filesystem will just reject strange characters, and when putting the file on the output stream, neither the name nor the content matters.
