Is HTTP partial GET a reliable mechanism? If it is, how come it seems like modern browsers still start from the beginning instead of resuming the download?
In my experience this feature is not ubiquitous across all web servers, probably because it is not widely used by web clients. Sort of like HTTP HEAD requests, which may or may not be implemented. As always, YMMV depending on the clients and servers involved.
The download resumption mechanism is based on HTTP range request headers that specify what part of the content you want (see here). I have not messed with this much in the last few years, so you may be better served doing a little more Google research. Here is a link to a blog posting that talks about some of the latest developments regarding this feature.
When I download big files with wget, I sometimes interrupt them and resume with -c. I don't remember ever getting a corrupted file. Safari allows you to resume (instead of restart) a stopped download, and it works fine there too.
Yes, when done properly (e.g., validating the partial copy against the resource's ETag via If-Range or If-Match), it is reliable.
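As a concrete illustration, here is a minimal sketch of the resume mechanism in Python, assuming the third-party requests library and a made-up URL: it sends a Range header starting at the size of the partial file, and uses If-Range with the previously seen ETag so the server sends the full body again if the file has changed in the meantime.

    import os
    import requests  # third-party: pip install requests

    URL = "https://example.com/big-file.iso"   # hypothetical URL
    DEST = "big-file.iso"

    def resume_download(url, dest, etag=None):
        """Download url to dest, resuming from the current file size."""
        offset = os.path.getsize(dest) if os.path.exists(dest) else 0
        headers = {"Range": "bytes=%d-" % offset}
        if etag:
            # Only honour the Range if our partial copy still matches the server's file.
            headers["If-Range"] = etag

        with requests.get(url, headers=headers, stream=True, timeout=30) as resp:
            resp.raise_for_status()
            # 206 Partial Content -> append; anything else -> start over from scratch.
            mode = "ab" if resp.status_code == 206 else "wb"
            with open(dest, mode) as fh:
                for chunk in resp.iter_content(chunk_size=64 * 1024):
                    fh.write(chunk)
            return resp.headers.get("ETag")

    if __name__ == "__main__":
        tag = resume_download(URL, DEST)            # first attempt, possibly interrupted
        tag = resume_download(URL, DEST, etag=tag)  # later: resume, validated by If-Range

Whether this works in practice still depends on the server sending an ETag and honouring Range, which is exactly the YMMV caveat above.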
I used to use whurl.heroku.com to make HTTP requests and share the responses with people. It's a great service for letting people see the results of requests themselves and test fixes.
It appears that whurl is going offline soon. Are there any good alternatives out there (besides hosting my own)?
Similar to what Mihai posted, I found Advanced Rest Client, a Google Chrome app. I prefer ARC a little more: since it's an app it doesn't take up space in my URL bar, and it's also easier to use and has a richer saved-history feature than XHR Poster.
It seems someone is hosting Whurl again on heroku.com, https://gcurl.heroku.com/.
You can use the XHR Poster extension for Google Chrome. It contains much of whurl's functionality.
It has a JSON pretty-print feature and handles all types of requests.
Most websites only use GET and POST for all operations, yet there are seven more verbs out there. Were they used in older times but not so much now?
Or is it maybe because some browsers don't recognize the other verbs? And if that's the case, why do browser vendors choose to implement only half of the protocol?
[Update]
I found this article, which gives a good summary of the situation: Why REST failed.
The HTML spec is a big culprit, since it only really allows GET, POST and HEAD. The other verbs do get used quite a bit, just not as much directly in browsers.
The most common uses of the other CRUD verbs, such as PUT and DELETE, are in REST services and WebDAV.
You'll see OPTIONS more in the future, as it's used by the CORS specification (cross-domain XMLHttpRequest).
TRACE is pretty much disabled everywhere, as it poses a pretty big security risk. CONNECT is definitely used quite a bit by proxies.
PATCH is brand new. While it's odd to me that they decided to add it to the list (but not PROPFIND, MKCOL, ACL, LOCK, and so on), I do think we'll see it appear more in the future in RESTful services.
Addendum: The original browser used both GET and PUT (the latter for updating web pages). Later browsers pretty much became read-only until forms and the POST request made their way into the specifications.
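For what it's worth, exercising the less common verbs from a non-browser client is trivial; here is a hedged sketch in Python with the requests library against a made-up REST endpoint (the URL and payloads are illustrative assumptions, not any real API):

    import requests  # third-party: pip install requests

    BASE = "https://api.example.com/articles"  # hypothetical endpoint

    # PUT: create or fully replace the resource at a known URL.
    requests.put(BASE + "/42", json={"title": "Hello", "body": "First draft"})

    # PATCH: partially update an existing resource.
    requests.patch(BASE + "/42", json={"title": "Hello again"})

    # DELETE: remove the resource.
    requests.delete(BASE + "/42")

    # OPTIONS: ask which methods are allowed (and, under CORS, which origins/headers).
    allowed = requests.options(BASE + "/42").headers.get("Allow")
    print("Allowed methods:", allowed)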
Most of them are still used, though not as widely as GET or POST. For example, RESTful web services use PUT and DELETE as well as GET and POST:
RESTful Web Service - Wiki Article
HEAD is very useful for debugging a server's HTTP headers, but as it doesn't return the response body, it's not much use to the browser or the average web visitor...
Other verbs like TRACE aren't as widespread because of potential security concerns, etc. This is mentioned briefly in the Wiki article:
HTTP Protocol Methods - Wiki Article
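To make the HEAD point above concrete, here is a quick Python sketch (requests library, hypothetical URL): you get the same status line and headers a GET would produce, but no body.

    import requests  # third-party: pip install requests

    resp = requests.head("https://example.com/report.pdf")  # hypothetical URL

    print(resp.status_code)                    # e.g. 200
    print(resp.headers.get("Content-Type"))    # e.g. application/pdf
    print(resp.headers.get("Content-Length"))  # size, without downloading the file
    print(len(resp.content))                   # 0 -- a HEAD response carries no body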
A decade later, these other verbs are used very commonly in RESTful APIs, which back nearly all of today's ubiquitous single-page applications (SPAs) and many mobile applications.
That said, interest in REST as an API style is beginning to wane with the advent of GraphQL and growing interest in functional programming styles that benefit from RPC-style APIs.
In an ASP.NET application, how is it possible to download all PNG, CSS, JavaScript, and other resources in parallel?
I am monitoring with Fiddler and found that the content is downloaded one after another.
That is actually more of a browser (client) behaviour, in accordance with the HTTP 1.1 specification. The guideline is to limit simultaneous downloads to two per hostname.
http://www.yuiblog.com/blog/2007/04/11/performance-research-part-4/
While you may be able to alter your browser's settings to download more per hostname, that only affects your machine and not those of others out in the Internet wilderness. One way to trick clients into downloading more simultaneously is to spread your web resources across different hostnames, e.g. images stored at http://images.yoursite.com. But you may want to test this and balance it out, as per the article's suggestion.
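To illustrate why sharding helps, here is a rough Python sketch (standard library only; the hostnames are the hypothetical ones from above) of a client that, like browsers of that era, caps itself at two in-flight requests per hostname. Splitting assets across a second hostname doubles how many downloads can run at once.

    import threading
    from concurrent.futures import ThreadPoolExecutor
    from urllib.parse import urlparse
    from urllib.request import urlopen  # standard-library HTTP client

    PER_HOST_LIMIT = 2  # the old HTTP/1.1 guideline that browsers enforced

    urls = [  # hypothetical assets, spread across two hostnames
        "https://www.yoursite.com/styles.css",
        "https://www.yoursite.com/app.js",
        "https://images.yoursite.com/logo.png",
        "https://images.yoursite.com/hero.png",
    ]

    # One semaphore per hostname: at most PER_HOST_LIMIT requests in flight for each.
    slots = {host: threading.BoundedSemaphore(PER_HOST_LIMIT)
             for host in {urlparse(u).netloc for u in urls}}

    def fetch(url):
        with slots[urlparse(url).netloc]:
            with urlopen(url, timeout=30) as resp:
                return len(resp.read())

    # Two hostnames x two slots each = up to four concurrent downloads instead of two.
    with ThreadPoolExecutor(max_workers=8) as pool:
        sizes = list(pool.map(fetch, urls))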
You can try AJAX for that: since there are usually around five allowed HTTP connections between client and server, you could theoretically use them all at once.
However, I suspect you will gain little from this unless you have really big (or many) CSS and JavaScript files.
Not sure if this will work on images or other files.
Looking around, I can't name a single web application (not web service) that uses anything besides GET and POST requests. Is there a specific reason for this? Do some browsers (or servers) not support any other types of requests? Or is this only for historical reasons? I'd like to make use of PUT and DELETE requests to make my life a little easier on the server-side, but I'm reluctant to because no one else does.
Actually a fair amount of people use PUT and DELETE, mostly for non-browser APIs. Some examples are the Atom Publishing Protocol and the Google Data APIs:
http://www.ietf.org/rfc/rfc5023.txt
http://code.google.com/apis/gdata/docs/2.0/basics.html
Beyond that, you don't see PUT/DELETE in common usage because most browsers don't support PUT and DELETE through Forms. HTML5 seems to be fixing this:
http://www.w3.org/TR/html5/forms.html#form-submission-0
The way it works for browser applications is: people design RESTful applications with PUT and DELETE in mind, then "tunnel" those requests through POSTs from the browser. For example, see this SO question on how Ruby on Rails accomplishes this using hidden fields:
How can I emulate PUT/DELETE for Rails and GWT?
So, you wouldn't be on your own designing your application with the larger set of HTTP verbs in mind.
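If you want to roll that tunneling pattern yourself rather than rely on a framework, a minimal sketch in Python/WSGI (standard library only) might look like the following; the hidden _method field name mirrors the Rails convention and is an assumption here, not part of any spec.

    from io import BytesIO
    from urllib.parse import parse_qs

    ALLOWED_OVERRIDES = {"PUT", "DELETE", "PATCH"}

    class MethodOverrideMiddleware:
        """Rewrite a POST into the verb named by a hidden `_method` form field."""

        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            if environ.get("REQUEST_METHOD") == "POST":
                length = int(environ.get("CONTENT_LENGTH") or 0)
                body = environ["wsgi.input"].read(length)
                # Assumes an application/x-www-form-urlencoded body, as plain HTML forms send.
                fields = parse_qs(body.decode("utf-8", errors="replace"))
                override = fields.get("_method", [""])[0].upper()
                if override in ALLOWED_OVERRIDES:
                    environ["REQUEST_METHOD"] = override
                # Replay the body for the downstream app, since we consumed it.
                environ["wsgi.input"] = BytesIO(body)
                environ["CONTENT_LENGTH"] = str(len(body))
            return self.app(environ, start_response)

    def app(environ, start_response):
        # Downstream code can now dispatch on PUT/DELETE as if the browser had sent them.
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [environ["REQUEST_METHOD"].encode("utf-8")]

    application = MethodOverrideMiddleware(app)

Run it under any WSGI server (wsgiref.simple_server is enough to try it), post a form containing a hidden field named _method with the value DELETE, and the inner app sees a DELETE.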
EDIT: By the way, if you're curious about why PUT/DELETE are missing from browser-based form posts, it turns out there's no really good technical reason. Reading through this thread on the rest-discuss mailing list, especially Roy Fielding's comments, is interesting for some context:
http://tech.groups.yahoo.com/group/rest-discuss/message/9620?threaded=1&var=1&l=1&p=13
EDIT: There are some comments on whether AJAX libraries support all the methods. It does come down to the actual browser implementation of XMLHttpRequest. I thought someone might find this link handy; it tests your browser to see how compliant its XMLHttpRequest object is with various HTTP options.
http://www.mnot.net/javascript/xmlhttprequest/
Unfortunately, I don't know of a reference which collects these results.
Quite simply, the HTML 4.01 form element only allows the values "POST" and "GET" in its method attribute.
Some proxy servers with tough security policies might drop them. I'm using PUT and DELETE anyway.
I've read that some browsers do not support other HTTP methods properly, though I can't name any specifics.
Rails, in particular, will pack your forms with a method parameter to explicitly set this even if the browser doesn't support those methods. That seems like a reasonable precaution if you're going to do this.
I say use all the features of HTTP, browsers be damned, lol. Maybe it'll inspire more complete and proper use of the HTTP protocol moving forward. There's more happening on the net than just POSTs and GETs. About time browser implementations reflected this.
This depends on your browser and Ajax library. For example jQuery supports all HTTP methods even though the browser may not. See for example the jQuery "ajax" documentation on the "type" attribute.
The Restlet Java framework lets you tunnel PUT and DELETE requests through HTML POST operations. To do this, you just add method=put or method=delete to your URI's query string, e.g.:
http://www.example.com/user=xyz?method=delete ...
This is the same as Ruby on Rails' approach (as described by ars above).
Personally, I really don't see any purpose for using PUT or DELETE in a web application. All operations that an application performs are reads or writes, i.e. input/output. Why do you need to distinguish the nature of the operation in the header of the HTTP request?
I could make AJAX calls to the same URL of the form /object/object_id and perform multiple operations, like delete, update, get the value, or create. Just by looking at the URL, I have no clue which one it is.
By using GET and POST only, my urls will be:
/object/id/delete
/object/id/create
/object/id/update
/object/id --> implied GET
etc.
Based on my limited experience, this is a lot cleaner than hidden header request types in many cases.
I am not saying one should never use PUT or DELETE, just saying, use them only if absolutely needed.
Refer to "RESTful Web API" by Leonard Richardson to read more about different use cases and conventions regarding HTTP request methods in a RESTful web api.
I know that using non-GET methods (POST, PUT, DELETE) to modify server data is The Right Way to do things. I can find multiple resources claiming that GET requests should not change resources on the server.
However, if a client were to come up to me today and say "I don't care what The Right Way to do things is, it's easier for us to use your API if we can just use call URLs and get some XML back - we don't want to have to build HTTP requests and POST/PUT XML," what business-conducive reasons could I give to convince them otherwise?
Are there caching implications? Security issues? I'm kind of looking for more than just "it doesn't make sense semantically" or "it makes things ambiguous."
Edit:
Thanks for the answers so far regarding prefetching. I'm not as concerned with prefetching, since this is mostly about internal network API use rather than visitable HTML pages that would have links a browser could prefetch.
Prefetch: A lot of web browsers use prefetching, which means they will load a page before you click on the link, anticipating that you will click on it later.
Bots: There are many bots that scan and index the internet for information, and they only issue GET requests. For this reason you don't want a GET request to delete anything (see the sketch after this list).
Caching: GET requests should not change state and should be idempotent. Idempotent means that issuing a request once or issuing it multiple times gives the same result, i.e. there are no side effects. For this reason, GET requests are tightly tied to caching.
HTTP standard says so: The HTTP standard says what each HTTP method is for. Several programs are built to use the HTTP standard, and they assume that you will use it the way you are supposed to. So you will get undefined behavior from a slew of random programs if you don't follow it.
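To see why the bot and prefetch points matter in practice, here is a deliberately naive Python sketch of what a crawler or prefetcher effectively does: it blindly issues a GET for every link it finds, so if you expose delete actions as plain links, resources disappear just by being crawled. (Standard library only; the URL and regex are illustrative assumptions.)

    import re
    from urllib.parse import urljoin
    from urllib.request import urlopen  # standard-library HTTP client

    START = "https://example.com/admin/items"  # hypothetical page full of action links

    def crawl(url):
        html = urlopen(url, timeout=30).read().decode("utf-8", errors="replace")
        # Naive href extraction -- real crawlers parse properly, but they still GET every link.
        for href in re.findall(r'href="([^"]+)"', html):
            link = urljoin(url, href)
            # The crawler has no idea that a /delete-style link has side effects:
            # to it, this is just another safe, cacheable GET.
            urlopen(link, timeout=30)

    crawl(START)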
How about Google finding a link to that page with all the GET parameters in the URL and revisiting it every now and then? That could lead to a disaster.
There's a funny article about this on The Daily WTF.
GETs can be forced on a user and result in Cross-Site Request Forgery (CSRF). For instance, if you have a logout function at http://example.com/logout.php which changes the server-side state of the user, a malicious person could place an image tag on any site with http://example.com/logout.php as its source. Loading that image would cause the user to get logged out. Not a big deal in the example given, but if it were a command to transfer funds out of an account, it would be a big deal.
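A minimal sketch of the usual mitigation, in Python with a plain dict standing in for real session storage: state-changing endpoints accept only POST and require a per-session CSRF token, so an image tag, which can only trigger a GET and cannot read or send the token, cannot forge the request. (The function names and session layout are assumptions for illustration.)

    import hmac
    import secrets

    def issue_csrf_token(session):
        # Store a random token in the user's server-side session and embed it
        # in every form as a hidden field.
        token = secrets.token_urlsafe(32)
        session["csrf_token"] = token
        return token

    def handle_logout(method, form, session):
        # 1. Refuse safe methods for state changes: an <img src=...> can only issue a GET.
        if method != "POST":
            return "405 Method Not Allowed"
        # 2. Require the token that only our own pages could have embedded in the form.
        sent = form.get("csrf_token", "")
        expected = session.get("csrf_token", "")
        if not hmac.compare_digest(sent.encode("utf-8"), expected.encode("utf-8")):
            return "403 Forbidden"
        session.clear()  # actually log the user out
        return "200 OK"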
Good reasons to do it the right way...
They are industry standard, well documented, and easy to secure. While you fully support making life as easy as possible for the client, you don't want to implement something that's easier in the short term in preference to something that's not quite so easy for them but offers long-term benefits.
One of my favourite quotes: "Quick and Dirty... long after the Quick has departed, the Dirty remains."
For you, this one is a case of "a stitch in time saves nine." ;)
Security:
CSRF is so much easier in GET requests.
Using POST won't protect you by itself, but GET allows easier exploitation, and mass exploitation, via forums and other places that accept image tags.
Depending on what you do server-side, using GET can help an attacker launch a DoS (denial of service). An attacker can spam thousands of websites with your expensive GET request in an image tag, and every single visitor of those websites will then issue that expensive GET request against your web server, which will cost you a lot of CPU cycles.
I'm aware that some pages are heavy anyway and this is always a risk, but the risk is bigger if you add 10 big records on every single GET request.
Security for one. What happens if a web crawler comes across a delete link, or a user is tricked into clicking a hyperlink? A user should know what they're doing before they actually do it.
I'm kind of looking for more than just "it doesn't make sense semantically" or "it makes things ambiguous."
...
I don't care what The Right Way to do things is, it's easier for us
Tell them to think of the worst API they've ever used. Can they not imagine how that was caused by a quick hack that got extended?
It will be easier (and cheaper) in 2 months if you start with something that makes sense semantically. We call it the "Right Way" because it makes things easier, not because we want to torture you.