I want to improve my understanding of the mechanisms behind the HTTP protocol from a web application developer's perspective. I want to clear up my confusion about which objects are involved, such as the session object and the request object: when are they generated and terminated, and which of their attributes would we commonly use in a web application? At the risk of not making much sense, I won't say too much more. I just wish to be pointed to a good source of knowledge on this, whether it be a book, video, web page, or a detailed response to this post. Thanks kindly.
Try this for HTTP and REST.
Here's a good place to start for sessions.
Wikipedia is a really good source for all HTTP topics as well.
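To make the lifecycle concrete: in most web frameworks, the request object is created when an HTTP request arrives and discarded once the response is sent, while the session object persists across requests, typically keyed by a cookie. Here's a minimal sketch using Flask (my choice of framework purely for illustration; the names are Flask's, not anything mandated by HTTP):

```python
from flask import Flask, request, session

app = Flask(__name__)
app.secret_key = "dev-only-secret"  # needed so Flask can sign the session cookie

@app.route("/visit")
def visit():
    # `request` exists only for the duration of this request/response cycle.
    user_agent = request.headers.get("User-Agent", "unknown")
    # `session` is a dict that survives across requests from the same client,
    # carried between them by a signed cookie.
    session["visits"] = session.get("visits", 0) + 1
    return f"Visit #{session['visits']} from {user_agent}"
```

Worth remembering that HTTP itself is stateless: the session is a convenience the framework layers on top, usually via a session cookie or a server-side store.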
I'm designing a RESTful service aligning to HATEOAS principles as much as possible. As a result, I need a way to have my cool URLs return a list of links describing available options. I'm using HAL-JSON to facilitate the data format so that's all good, but I'm now considering what HTTP method should pull this.
I'm sure I could stick with a simple GET, but from reading over the HTTP RFC, it seems that OPTIONS might fit the bill here. My only concern is the final sentence of the quote below:
9.2 OPTIONS

The OPTIONS method represents a request for information about the communication options available on the request/response chain identified by the Request-URI. This method allows the client to determine the options and/or requirements associated with a resource, or the capabilities of a server, without implying a resource action or initiating a resource retrieval.

Responses to this method are not cacheable.
Could someone with more experience on the standards side of the web please explain why this is the case? In my view, you would certainly want clients caching this result at least for a short period of time, as in a fully HATEOAS system this call is likely to be made quite frequently to traverse the rel links to arrive at the operation you're looking for.
I'd also love some opinions on using OPTIONS vs a simple GET for retrieval of operations from a cool URL.
The OPTIONS HTTP request returns the available methods which can be performed on a resource (the object's methods).
I cannot say for certain why you cannot cache the response, but it's most likely a precaution. Caching would have little value for the OPTIONS HTTP method.
A resource is "any information that can be given a name", and that name is its URI. The response to an OPTIONS request is only a list of the methods that can be requested on that resource (e.g. "GET PUT POST" may be the response). To actually get at the information stored, you must use the GET method.
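To make that concrete, here's a sketch of the client side using Python's requests library (the endpoint is hypothetical):

```python
import requests

# The Allow header is where a server advertises which methods a resource supports.
resp = requests.options("https://api.example.com/widgets/42")
print(resp.headers.get("Allow"))  # e.g. "GET, PUT, POST"

# To retrieve the stored information itself, you still issue a GET.
widget = requests.get("https://api.example.com/widgets/42").json()
```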
It's not cacheable, period. Sorry.
While reading some articles about writing web servers using Twisted, I came across this page that includes the following statement:
While it's convenient for this example, it's often not a good idea to make a resource that POSTs to itself; this isn't about Twisted Web, but the nature of HTTP in general; if you do this, make sure you understand the possible negative consequences.
In the example discussed in the article, the resource is a web resource retrieved using a GET request.
My question is: what are the possible negative consequences that can arise from having a resource POST to itself? I am only concerned with the aspects related to the HTTP protocol, so please ignore the fact that I mentioned Twisted.
The POST verb is used for making a new resource in a collection.
This means that POSTing to a resource has no direct meaning (POST endpoints should always be collections, not resources).
If you want to update your resource, you should PUT to it.
Sometimes, you do not know if you want to update or create the resource (maybe you've created it locally and want to create-or-update it). I think that in that case, the PUT verb is more appropriate because POST really means "I want to create something new".
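A sketch of that convention using Python's requests library (URLs and payloads are hypothetical):

```python
import requests

# POST to the collection to create a new resource; the server picks the URL
# and typically returns 201 with a Location header pointing at it.
created = requests.post("https://api.example.com/articles", json={"title": "Hello"})
print(created.status_code, created.headers.get("Location"))

# PUT to a specific resource URL to create-or-update it at a known address.
updated = requests.put(
    "https://api.example.com/articles/hello",
    json={"title": "Hello", "body": "Updated text"},
)
```

Note that PUT is also defined as idempotent (repeating the same PUT yields the same state), which is another reason it suits create-or-update.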
There's nothing inherently wrong with a page POSTing back to itself - in fact, many widely-used frameworks (ASP.NET, etc.) use that method to handle various events that happen on the client - some data is posted back to the same page, where the server processes it and sends a new response.
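For illustration, a minimal Flask sketch of that pattern (Flask is just my example framework): the same URL renders the form on GET and handles the submission on POST:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/feedback", methods=["GET", "POST"])
def feedback():
    if request.method == "POST":
        # Handle the submission posted back to this same URL.
        message = request.form.get("message", "")
        return f"Thanks! You said: {message}"
    # Plain GET: render the form, which posts back to /feedback.
    return ('<form method="post">'
            '<input name="message"><button>Send</button>'
            '</form>')
```

One common safeguard with this pattern is POST/redirect/GET: redirect after a successful POST so that refreshing the page doesn't resubmit the form.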
In a RESTful web service, is the OPTIONS method supposed to be used to provide a list of available services? If so, is that mandatory, or just good programming practice?
Thank you
A response to an OPTIONS request should provide some information about what the client can do with the resource in question. The most obvious example is to show which methods the resource will respond to, probably using an Allow header. You could also respond with the Accept-Ranges header to show that you support range requests.
In practice, though, the Allow header is the only common use of the OPTIONS method, and even then implementation is far from universal. So it’s a nice feature to have, but isn’t likely to make a huge difference in a real world service.
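Server-side, honouring OPTIONS can be as small as this standard-library sketch (the advertised method list is whatever your resource actually supports):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_OPTIONS(self):
        # Advertise the methods this resource responds to via the Allow header.
        self.send_response(204)
        self.send_header("Allow", "GET, POST, OPTIONS")
        self.end_headers()

    def do_GET(self):
        body = b'{"status": "ok"}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("localhost", 8000), Handler).serve_forever()
```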
It's definitely not mandatory, but certainly an option.
But keep in mind that to use OPTIONS for service discovery you'll need extension header fields or a custom response media type, in which case GET might work better in practice.
Most of the time, websites use only GET and POST for all operations, yet there are seven more verbs out there. Were they used in older times but not so much now?
Or is it maybe because some browsers don't recognize the other verbs? And if that's the case, why do browser vendors choose to implement only half of the protocol?
[Update]
I found this article, which gives a good summary of the situation: Why REST failed.
The HTML spec is a big culprit, only really allowing GET and POST in forms. The other verbs do get used quite a bit, though, just not as much directly in browsers.
The most common uses of the other CRUD verbs, such as PUT and DELETE, are in REST services and WebDAV.
You'll see OPTIONS more in the future, as it's used by the CORS specification (cross-domain XMLHttpRequest).
TRACE is pretty much disabled everywhere, as it poses a pretty big security risk. CONNECT is definitely used quite a bit by proxies.
PATCH is brand new. While it's odd to me that they decided to add it to the list (but not PROPFIND, MKCOL, ACL, LOCK, and so on), I do think we'll see it appear more in the future in RESTful services.
Addendum: The original browser used both GET and PUT (the latter for updating web pages). Later browsers pretty much became read-only until forms and the POST request made their way into the specifications.
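Since CORS came up above: a preflight is just an OPTIONS request carrying a couple of extra headers. A rough simulation of what the browser sends, using Python's requests library (origin and endpoint are hypothetical):

```python
import requests

# The browser sends this automatically before a cross-origin PUT;
# the server answers with Access-Control-Allow-* headers.
resp = requests.options(
    "https://api.example.com/widgets/42",
    headers={
        "Origin": "https://app.example.org",
        "Access-Control-Request-Method": "PUT",
    },
)
print(resp.headers.get("Access-Control-Allow-Methods"))  # e.g. "GET, PUT, DELETE"
```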
Most of them are still used, though not as widely as GET or POST. For example, RESTful web services use PUT & DELETE as well as GET & POST:
RESTful Web Service - Wiki Article
HEAD is very useful for debugging a server's HTTP headers, but as it doesn't return the response body, it's not much use to the browser / average web visitor...
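For example, a quick header check without downloading the body (hypothetical URL):

```python
import requests

# HEAD returns the same headers a GET would, minus the body, so it's a cheap
# way to check metadata such as size and freshness.
resp = requests.head("https://example.com/big-file.iso")
print(resp.headers.get("Content-Length"), resp.headers.get("Last-Modified"))
```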
Other verbs like TRACE aren't as widespread because of potential security concerns, etc. This is mentioned briefly in the Wiki article:
HTTP Protocol Methods - Wiki Article
A decade later, these other verbs are used very commonly in RESTful APIs, which back nearly all of today's ubiquitous SPAs and many mobile applications.
That said, interest in REST as an API structure is beginning to wane with the advent of GraphQL and growing interest in functional programming styles that benefit from RPC-style API structures.
I'm developing a wrapper around an existing RESTful API. I basically have to do some preprocessing, call the underlying API, and do some postprocessing, with a little bit of caching in the middle. The API is specially designed for RESTful access via HTTP.
My question is: should I refactor the API so I can invoke it via code, or should I make local HTTP calls to it? The second option seems nice since it increases decoupling, but I'm afraid that creating the HTTP requests/responses could seriously affect performance. I've heard, though, that CouchDB does something like that (its API is RESTful and accessed via HTTP).
No one can answer this for you, as it depends hugely on how your current RESTful API is implemented. For example, you can write a relatively short C program that listens on a socket and handles HTTP requests - if it does RESTful things in response to the different HTTP methods, then it's an implementation of a RESTful API and can have very little overhead over just calling the underlying functions directly (without HTTP). On the other hand, you can write your program as a bloated, heavy Java EE monster - in that case, the overhead may be quite large.
Thus, skaffman was correct to say "Measure it and see" - this really is the only way to get a meaningful answer.
All that said, if you are asking this question, odds are good that you're not really facing a Google-scale problem. If the refactoring is going to be a lot of work and just intercepting HTTP requests is easy for you, I'd suggest you first get the functionality you need with the HTTP wrapper, and only start worrying about performance optimization once you have a working product.
Look at section 5.1.6 of the REST dissertation, on layered systems. What you are describing fits very nicely into this idea of a layered architecture: effectively you are building an HTTP proxy that will intercept the incoming requests, do some work, and then pass them along to the next layer.
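As a sketch of that layering (addresses are placeholders and error handling is omitted), an in-process pass-through proxy might look like:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import urllib.request

UPSTREAM = "http://localhost:8080"  # the existing RESTful API

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Preprocessing (auth checks, cache lookups, URL rewriting) goes here.
        with urllib.request.urlopen(UPSTREAM + self.path) as upstream:
            body = upstream.read()
        # Postprocessing of the upstream response goes here.
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("localhost", 8000), ProxyHandler).serve_forever()
```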
I'd refactor it. You used to have some set of functionality exposed by a RESTful API. You've now got a set of functionality exposed by a RESTful API and by your wrapper. You should refactor the code so that it can do both. It should be easy if your code is reasonably well organized.
When in doubt, err on the side of doing less work. Write the wrapper and test it. Refactor if you have to.