Flask send_file: how do I know if I need "as_attachment"? - http

Flask has a method to return a file: http://flask.pocoo.org/docs/1.0/api/#flask.send_file
There is a parameter called as_attachment with a default of False, and there is a handwaavy statement about it: "For extra security you probably want to send certain files as attachment (HTML for instance)"
How do I know if my use case is "those certain files"? Or alternatively stated, what does this do as opposed to leaving this as False?

You'll find a clue in the same documentation, further down in the parameter list:
as_attachment – set to True if you want to send this file with a Content-Disposition: attachment header.
So when the flag is set, an extra header is added to the response, which controls how the browser will handle the response. From the MDN documenation on Content-Disposition:
In a regular HTTP response, the Content-Disposition response header is a header indicating if the content is expected to be displayed inline in the browser, that is, as a Web page or as part of a Web page, or as an attachment, that is downloaded and saved locally.
Without an explicit Content-Disposition header, a text/html response from your Flask server will be shown as a web page in the browser. If you needed the file to be saved to disk instead (the browser prompting you what to do with the file), then you need to have a Content-Disposition: attachment set.
So when your response content type is likely to be shown in the browser as a web page but you want the user to download it instead, use as_attachment=True. These days, in addition to HTML, you probably want to set that flag for images, PDF files, and XML as well.

Related

Generating PDF on the fly with standard HTTP response fields

I'm developing a web page with a form which returns a PDF document based on the form data. Currently I use the HTTP response fields
Content-Type: application/pdf
Content-Disposition: attachment; filename="foo.pdf"
However, since the field Content-Disposition is non-standard and doesn't work in all browsers I'm looking for a different approach. Do I have to save the PDF document on the server? What is the modus operandi?
Edit: By "doesn't work in all browsers" I mean that with some browsers the filename is not set to foo.pdf. Dillo, for instance, just sets the default filename (in the download dialog) to the basename of the URL path (plus query string).
Do I have to save the PDF document on the server?
No. As far as the HTTP client is concerned it, the inner workings of the server are completely opaque to it. All it sees is a TCP stream of bytes from the server and how exactly that stream is produced doesn't matter as long as it matches the specified Content-Type.
Just send the PDF right after the HTTP headers and you're done with.
Update due to comment
So if you're wondering how to supply a filename without using a header field: Just augment the URL with it. I.e. something like
http://${DOMAIN}/${PDF_GENERATOR}/${DESIRED_FILENAME}
In the HTTP server add a rewrite rule to simply omit the filename part and redirect to just
http://${DOMAIN}/${PDF_GENERATOR}
The HTTP client does not see that, all it see is some URL ending with a "filename", that it can present the user as a default for saving.

Header Accept in HTTP

I have a problem with "Accept" header in http. I've writen a http client, and when I set "Accept: image/png" I can still read any file (like txt, html, etc).
I think it shouldn't be possible when header "Accept" is set like above.
I tried to check how my Firefox behaves. I wrote "about:config" and I set "network.http.accept.default" as "image/png", and I can surf the net as usually.
Am I misunderstanding meaning of this header? I think that I should only be able to open files *.png.
Accept isn't mandatory; the server can (and often does) either not implement it, or decides to return something else.
If the [Accept] header field is present in a request and none of the available representations for the response have a media type that is listed as acceptable, the origin server can either honor the header field by sending a 406 (Not Acceptable) response or disregard the header field by treating the response as if it is not subject to content negotiation.
Source - RFC 7231 5.3.2. Accept
Actually, the former behavior is normal. Let me give you an example.
If the given URL points to a PDF file and the Accept header accepts only docx, then the server will blindly ignore it and send the PDF file because server is not setup to decide between PDF and other documents.
If there are multiple formats available, then server will consider the " Accept " header and try to send the response accordingly, if not, then it will ignore the " Accept " header.
As you suppose, setting Accept means that you can't accept others medias than these specified, and servers should return a 406 response code.
It practice, servers don't implements correctly, and always send a response.
All details are available in RFC 2616
The accept header is poorly implemented by browsers and causes strange errors when used on public sites where crawlers make requests too.
That's why, accept header is ignore most of the time like in the Rail framework.

How do you include a cookie in an HTTP Response (e.g. an image request)?

I was wondering how companies like Double-Click include a cookie in their image responses to track users. Similarly, how do the images (e.g. smart pixels) send information back to their servers?
Please provide a scripting example if possible (any language is okay) [note: if this is resolve doings something server side, please describe how this would be accomplished using APACHE].
Cheers,
Rob
How do they include a cookie ? Configure the server, probably via script to send cookies with the responses. Images are http requests, that follow the http protocol, there is nothing magical about them.
"Smart pixels" convey their information simply via the request the browser must send to the server in order to load the image. Information about the user/browser, can be gathered via javascript and embedded in the url.
To do this in php, you'd use the setCookie function.
<?php
$value = 'something from somewhere';
setcookie("TestCookie", $value);
setcookie("TestCookie", $value, time()+3600); /* expire in 1 hour */
setcookie("TestCookie", $value, time()+3600, "/~rasmus/", ".example.com", 1);
?>
That code was taken from the php doc I referenced above. Basically this adds the Set-Cookie to the HttpResponse header like: Set-Cookie: UserID=JohnDoe; Max-Age=3600; Version=1
See http://en.wikipedia.org/wiki/List_of_HTTP_header_fields and search for Set-Cookie
ALSO, in scripting languages, like PHP, make sure you set the header before you render any content. This is because the HTTP Headers are the first thing sent in the response, so as soon as you write content, the headers should've already been written.
Another quote from the PHP:setcookie doc:
Like other headers, cookies must be sent before any output from your
script (this is a protocol restriction). This requires that you place
calls to this function prior to any output, including and
tags as well as any whitespace.

nginx resumable upload with upload_module and multipart/form

I currently upload to a webservice on an nginx server using the upload module (http://www.grid.net.ru/nginx/upload.en.html) from a custom desktop application doing a simple multipart-form POST that sends a file in one part and a base64 encoded XML with the file's metadata in another part.
The server receives this POST, passes it to my webservice which reads the metadata, processes the file and all is good.
What I want to do now is use the upload module's upload_resumable directive to do the POST in several chunks to minimize disconnection chances and allow resume. I can currently do this following the protocol described here: http://www.grid.net.ru/nginx/resumable_uploads.en.html
One sends byte ranges of the file along with some headers to identify the chunk and the session in several posts and once all the parts have been uploaded, nginx will compose the final POST containing the file name and path and pass it to your upload_pass location (which in my case CGIs to a django app).
However, I am not clear on how one would send a multipart post with this method since the protocol indicates that the body of the POST must be the bytes indicated in the byte range. I need the final post to also contain the XML I wrote about above.
I can think of sending the XML as the first bytes of the body and a header that indicates how many bytes belong to it but that would mean extra handling of the final file to remove that header and the final files are potentially in the GB size range.
Any other ideas?
Since the protocol supported by nginx specifically states that the post should not be multipart I ended up sending the file in the body, and the rest of the parameters encoded in the URL. Not the prettiest URLs but it works.

Should a HTTP POST'ed file be base64 encoded?

I'm currently implementing a client application that POST's a file over HTTP and have implemented base64 encoding on the file's data parameter.
However, it appears that when inspecting the traffic between a simple HTML page with a file upload form and the server that no Content-Transfer-Encoding header is sent in the body when describing the file's parameter.
Is this the preferred way of POST'ing a file over HTTP?
No, the preferred way is using multipart/form-data encoding, exactly as you would use with HTML form based file uploads.

Resources