Media metadata in HTTP header? - http

I am working on an HTTP server specifically designed for serving image, video and audio files. I thought that it might be useful to include some metadata in the HTTP response header to describe attributes of the file such as dimensions, duration and codec. This would allow querying this info with a HEAD request, possibly saving the download of the file.
A naive example might look something like this:
X-Media-Duration: 1:23.4
X-Media-Dimensions: (1280px,720px)
X-Media-VideoCodec: h264
Are there any existing standards that perform a similar task? Or, is this not a good idea and why?
Thanks.

Related

Detecting if a URL is a file download

How can I detect if a given URL is a file to be downloaded?
I came across the content-disposition header, however it seems that this isn't a part of http 1.1 directly.
Is there a more standard way to detect if the response for a GET request made to a given URL is actually a file to/can be downloaded?
That is the response is not html or json or anything similar, but something like an image, mp3, pdf file etc.?
HTTP is a transfer protocol - which is a very different thing to hard drive storage layouts. The concept of "file" simply does not exist in HTTP. No more than your computer hard drive contains actual paper-and-cardboard "files" that one would see in an office filing system.
Whatever you may think the HTTP message or URL are saying the response content does not have to come from any computer file, and does not have to be stored in one by the recipient.
The response to any GET message in HTTP can always be "downloaded" by sending another GET request with that same URL (and maybe other headers in the case of HTTP/1.1 variants). That is built into the definition of what a GET message is and has nothing to do with files.
I ended up using the content-type to decide if it's an html file or some other type of file that is on the other end of a given URL.
I'm using the content-disposition header content to detect the original file name if it exists since the header isn't available everywhere.
Could checking for a file extension be a possibility? Sorry I can't enlarge on that much without knowing more, but I guess you could consider using PHP to implement this if HTML doesn't have enough functionality?

HTTP Response before Request

My question might sound stupid, but I just wanted to be sure:
Is it possible to send an HTTP response before having the request for that resource?
Say for example you have an HTML page index.html that only shows a picture called img.jpg.
Now, if your server knows that a visitor will request the HTML file and then the jpg image every time:
Would it be possible for the server to send the image just after the HTML file to save time?
I know that HTTP is a synchronous protocol, so in theory it should not work, but I just wanted someone to confirm it (or not).
A recent post by Jacques Mattheij, referencing your very question, claims that although HTTP was designed as a synchronous protocol, the implementation was not. In practise the browser (he doesn't specify which exactly) accepts answers to requests have not been sent yet.
On the other hand, if you are looking to something less hacky, you could have a look at :
push techniques that allows the server to send content to the browser. The modern implementation that replace long-polling/Comet "hacks" are the websockets. You may want to have a look at socket.io also.
Alternatively you may want to have a look at client-side routing. Some implementations combine this with caching techniques (like in derby.js I believe).
If someone requests /index.html and you send two responses (one for /index.html and the other for /img.jpg), how do you know the recipient will get the two responses and know what to do with them before the second request goes in?
The problem is not really with the sending. The problem is with the receiver possibly getting unexpected data.
One other issue is that you're denying the client the ability to use HTTP caching tools like If-Modified-Since and If-None-Match (i.e. the client might not want /img.jpg to be sent because it already has a cached copy).
That said, you can approximate the server-push benefits by using Comet techniques. But that is much more involved than simply anticipating incoming HTTP requests.
You'll get a better result by caching resources effectively, i.e. setting proper cache headers and configuring your web server for caching. You can also inline images using base 64 encoding, if that's a specific concern.
You can also look at long polling javascript solutions.
You're looking for server push: it isn't available in HTTP. Protocols like SPDY have it, but you're out of luck if you're restricted to HTTP.
I don't think it is possible to mix .html and image in the same HTTP response. As for sending image data 'immediately', right after the first request - there is a concept of 'static resources' which could be of help (but it will require client to create a new reqest for a specific resource).
There are couple of interesting things mentioned in the the article.
No it is not possible.
The first line of the request holds the resource being requested so you wouldn't know what to respond with unless you examined the bytes (at least one line's worth) of the request first.
No. HTTP is defined as a request/response protocol. One request: one response. Anything else is not HTTP, it is something else, and you would have to specify it properly and implement it completely at both ends.

Custom HTTP Headers with old proxies

Is it true that some old proxies/caches will not honor some custom HTTP headers? If so, can you prove it with sections from the HTTP spec or some other information online?
I'm designing a REST API interface. For versioning I'm debating whether to use version as a part of the URL like (/path1/path2/v1 OR /path1/path2?ver=1) OR to use a custom Accepts X-Version header.
I was just reading in O'Reilly's Even Faster Websites about how mainly internet security software, but really anything that has to check the contents of a page, might filter the Accept-Encoding header in order to reduce the CPU time used decompressing and reading the file. The books cites that about 15% of user have this issue.
However, I see no reason why other, custom headers would be filtered. On the other hand, there also isn't really any reason to send it as a header and not with GET is there? It's not really part of the HTTP protocol, it's just your API.
Edit: Also, see the actual section of the book I mention.

HTTP POST vs HTTP PUT

Does HTTP PUT have advantages over HTTP POST, particularly for File Uploads? Data transfer should be highly secure. Your ideas / guidance on this will be of great help.
PUT is designed for file uploads moreso than POST which requires doing a multipart upload, but then it comes down to what your server can do as to which is more convenient for you to implement.
Whichever HTTP method you use, you'll be transmitting data in the clear unless you secure the connection using SSL.
I think the choice of PUT vs. POST should be more based on the rule:
PUT to a URL should be used to update or create the resource that can be located at that URL.
POST to a URL should be used to update or create a resource which is located at some other ("subordinate") URL, or is not locatable via http.
Any choices regarding security should work equally with both PUT and POST. https is a good start, if you are building a REST API then keys, authorisation, authentication and message signing are worth investigating.
Does HTTP PUT have advantages over HTTP POST, particularly for File Uploads?
You can use standard tools for sending the data (i.e. ones that don't have to be aware of your custom scheme for describing where the file should be uploaded to or how to represent that file). For example, OpenOffice.org includes WebDAV support.
Data transfer should be highly secure
The method you use has nothing to do with that. For security use SSL in combination with some form of authentication and authorization.

Better file uploading approach: HTTP post multipart or HTTP put?

Use-case: Upload a simple image file to a server, which clients could later retrieve
Designate a FTP Server for the job.
HTTP Put: It can directly upload files to a server without the need of a server side
component to handle the bytestream.
HTTP Post: Handle the bytestream by the server side component.
I think to safely use PUT on a public website requires even more effort than using POST (and is less commonly done) due to potential security issues. See http://bitworking.org/news/PUT_SaferOrDangerous.
OTOH, I think there are plenty of resources for safely uploading files with POST and checking them in the server side script, and that this is the more common practice.
PUT is only appropriate when you know the URL you are putting to.
You could also do:
4) POST to obtain a URL to which you then PUT the file.
edit: how are you going to get the HTTP server to decide whether it is OK to accept a particular PUT request?
What I usually do (via PHP) is HTTP POST.
And employ PHP's move_uploaded_file() to get it to whatever destination I want.

Resources