Why isn't HTTP PUT allowed to do partial updates in a REST API?

Who says RESTful APIs must support partial updates separately via HTTP PATCH?
It seems to have no benefits. It adds more work to implement on the server side and more logic on the client side to decide which kind of update to request.
I am asking this question within the context of creating a REST API with HTTP that provides an abstraction over known data models. Requiring PATCH for partial updates, as opposed to allowing PUT for either full or partial updates, feels like it has no benefit, but I could be persuaded.
Related
http://restcookbook.com/HTTP%20Methods/idempotency/ - this implies you don't have control over the server software that may cache requests.
What's the justification behind disallowing partial PUT? - no clear answer given, only a reference to what HTTP defines for PUT vs PATCH.
http://groups.yahoo.com/neo/groups/rest-discuss/conversations/topics/17415 - shows the divide in opinions on this.

Who says? The guy who invented REST says:
@mnot Oy, yes, PATCH was something I created for the initial HTTP/1.1 proposal because partial PUT is never RESTful. ;-)
https://twitter.com/fielding/status/275471320685367296
First of all, REST is an architectural style, and one of its principles is to leverage the standardized behavior of the protocol underlying it, so if you want to implement a RESTful API over HTTP, you have to follow HTTP strictly for it to be RESTful. You're free not to do so if you think it's not adequate for your needs, nobody will curse you for that, but then you're not doing REST. You'll have to document where and how you deviate from the standard, creating a strong coupling between client and server implementations, and the whole point of using REST is precisely to avoid that and focus on your media types.
So, based on RFC 7231, PUT should be used only for complete replacement of a representation, in an idempotent operation. PATCH should be used for partial updates, which aren't required to be idempotent, though it's good practice to make them idempotent by requiring a precondition or validating the current state before applying the diff. If you need to do non-idempotent updates, partial or not, use POST. Simple. Everyone using your API who knows how PUT and PATCH work expects them to work that way, and you don't have to document or explain what the methods should do for a given resource. You're free to make PUT act in any other way you see fit, but then you'll have to document that for your clients, and you'll have to find another buzzword for your API, because that's not RESTful.
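To make the division of labor concrete, here is a minimal sketch using Python's requests library against a hypothetical API (the host, paths and field names are made up for illustration):

    import requests

    BASE = "https://api.example.com"  # hypothetical API

    # PUT: send the complete representation; the server replaces the resource.
    # Repeating the request leaves the server in the same state (idempotent).
    requests.put(f"{BASE}/users/42",
                 json={"name": "Alice", "email": "alice@example.com", "age": 31})

    # PATCH: send only what changes, here as a JSON merge patch (RFC 7396).
    requests.patch(f"{BASE}/users/42",
                   json={"email": "alice@new.example.com"},
                   headers={"Content-Type": "application/merge-patch+json"})

    # POST: anything non-idempotent, e.g. appending to a collection.
    requests.post(f"{BASE}/users/42/logins",
                  json={"at": "2024-01-01T00:00:00Z"})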
Keep in mind that REST is an architectural style focused on the long-term evolution of your API. Doing it right will add more work now, but will make changes easier and less traumatic later. That doesn't mean REST is adequate for everything and everyone. If your focus is ease of implementation and short-term usage, just use the methods as you want. You can do everything through POST if you don't want to bother with clients choosing the right methods.

To extend on the existing answer, PUT is supposed to perform a complete update (overwrite) of the resource state simply because HTTP defines the method that way. The original RFC 2616 about HTTP/1.1 is not very explicit about this; RFC 7231 adds semantic clarifications:
4.3.4 PUT
The PUT method requests that the state of the target resource be created or replaced with the state defined by the representation enclosed in the request message payload. A successful PUT of a given representation would suggest that a subsequent GET on that same target resource will result in an equivalent representation being sent in a 200 (OK) response.
As stated in the other answer, adhering to this convention simplifies the understanding and usage of APIs, and there is no need to explicitly document the behavior of the PUT method.
However, partial updates are not disallowed because of idempotency. I find this important to highlight, as these concepts are often confused, even in many StackOverflow answers (e.g. here).
Idempotent solely means that applying a request once or many times results in the same effect on the server. To quote RFC 7231 once more:
4.2.2 Idempotent methods
A request method is considered "idempotent" if the intended effect on the server of multiple identical requests with that method is the same as the effect for a single such request.
As long as a partial update contains only new values of the resource state and does not depend on previous values (i.e. those values are overwritten), the requirement of idempotency is fulfilled. Independently of how many times such a partial update is applied, the server's state will always hold the values specified in the request.
Whether an intermediate request from another client can change a different part of the resource is not relevant, because idempotency refers to the operation (i.e. the PUT method), not the state itself. And with respect to the operation of a partial overwriting update, its application yields the same effect after being applied once or many times.
On the contrary, an operation that is not idempotent depends on the current server state, and therefore leads to different results depending on how many times it is executed. The simplest example of this is incrementing a number (non-idempotent) vs. setting it to an absolute value (idempotent).
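A toy sketch of the distinction in plain Python (no HTTP involved; the function names are made up):

    # Idempotent: the effect depends only on the request, not on the current state.
    def set_counter(state, value):
        state["counter"] = value      # applying this twice has the same effect as once
        return state

    # Non-idempotent: the effect depends on the current state.
    def increment_counter(state):
        state["counter"] += 1         # applying this twice increments twice
        return state

    state = {"counter": 0}
    set_counter(set_counter(state, 5), 5)         # counter == 5, no matter how often
    increment_counter(increment_counter(state))   # counter == 7 after two calls, 6 after one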
For non-idempotent changes, HTTP provides the methods POST and PATCH: PATCH is explicitly designed to carry modifications to an existing resource, whereas POST can be interpreted much more freely regarding the relation of request URI, body content and side effects on the server.
What does this mean in practice? REST is a paradigm for implementing APIs over the HTTP protocol -- a convention that many people have considered reasonable and that is thus likely to be adopted or understood. Still, there are controversies regarding what is RESTful and what isn't, but even leaving those aside, REST is not the only correct or meaningful way to build HTTP APIs.
The HTTP protocol itself puts constraints on what you may and may not do, and many of them have actual practical impact. For example, disregarding idempotency may result in cache servers changing the number of requests actually issued by the client, which can subsequently disrupt the logic expected by applications. It is thus crucial to be aware of the implications when deviating from the standard.
Strictly conforming to REST, there is no completely satisfying solution for partial updates (some even say this need alone goes against REST). The problem is that PATCH, which at first appears to be made just for this purpose, is not required to be idempotent. Thus, by using PATCH for idempotent partial updates, you lose the advantages of idempotency (an arbitrary number of automatic retries, simpler logic, potential for optimizations in client, server and network). As such, you may ask yourself whether using PUT is really the worst idea, as long as the behavior is clearly documented and nothing breaks because users (and intermediate network nodes) rely on a certain behavior.
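For what it's worth, the precondition approach mentioned in the first answer gives back most of that retry safety for PATCH. A hedged sketch (hypothetical endpoint; it assumes the server emits ETags and honors If-Match per RFC 7232):

    import requests

    BASE = "https://api.example.com"  # hypothetical

    r = requests.get(f"{BASE}/users/42")
    etag = r.headers["ETag"]          # the state the diff was computed against

    # The PATCH applies only if the resource is still in the state we read;
    # otherwise the server answers 412 instead of re-applying the diff.
    resp = requests.patch(f"{BASE}/users/42",
                          json={"email": "alice@new.example.com"},
                          headers={"Content-Type": "application/merge-patch+json",
                                   "If-Match": etag})
    if resp.status_code == 412:
        print("Resource changed since we read it; re-fetch and retry.")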

Partial updates are allowed by PUT (according to RFC 7231 https://www.rfc-editor.org/rfc/rfc7231#section-4.3.4).
",... PUT request is defined as replacing the state of the target resource." - replacing part of object basically change state of it.
"Partial content updates are possible by targeting a separately identified resource with state that overlaps a portion of the larger resource, ..."
According to that RFC, the following request is valid: PUT /resource/123/name with the new name as the payload. It fully replaces the name, a separately identified resource whose state overlaps a portion of /resource/123, and thereby performs a partial update of the larger resource. Sending a partial payload to /resource/123 itself would be incorrect, as PUT does not allow partial replacement of the resource it targets.
PS: Below is an example of when PATCH is useful.
Consider an object that has an array inside. With PUT you can't update a specific value; you can only replace the whole list with a new one. With PATCH, you can replace one value with another. With maps and more complex objects the benefit is even bigger.
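For illustration, such an array update could be expressed as a JSON Patch (RFC 6902) document; a sketch with Python's requests (the endpoint and data are hypothetical):

    import requests

    # Assume the resource looks like: {"id": 123, "tags": ["a", "b", "c"]}
    patch = [
        {"op": "replace", "path": "/tags/1", "value": "B"},  # change one element
    ]
    requests.patch("https://api.example.com/resource/123",
                   json=patch,
                   headers={"Content-Type": "application/json-patch+json"})
    # With PUT, the whole "tags" array (or the whole resource) would have to be sent.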


Generate temporary URIs

I have an application that, for a specific resource at URI x, generates a system of URIs such as x#a, x#b and so on, and stores them in an RDF graph at any position in a triple, even as predicates.
This is fine but the issue arises when the resource that should be described in this way is provided "inline" i.e. without a URI. In RDF I have the option to use blank nodes, but there are two issues I have with them:
Blank nodes are mostly regarded with existential meaning, i.e. they denote some node. The nodes produced by the application are not just some nodes – they are exactly the nodes produced from one specific resource that was given to them at one specific time. In contrast, a blank node can be "satisfied" by a concrete URI node with the same relations to other nodes, becoming redundant in presence of such a node.
Unfortunately, blank nodes cannot be used as predicates in RDF (or at least in some serialization formats).
This means that I have to find a URI, but I am wondering what scheme to choose. So far I have considered these options:
Generate a new UUID and use urn:uuid:. This is sufficient, but I would prefer something that expresses the temporary status of the URI better, so that it would be apparent it was generated for this one case only. In comparison, UUIDs can be used as stable identifiers for more permanent resources as well, and there is no inherent distinction.
Use a proprietary URI, like nodeID: which is sometimes used for blank nodes, but is still a valid URI nonetheless. I would append a new UUID to that as well. This seems a bit better to me since it mimics blank nodes but without their different semantics, but I would still prefer something more standardized.
Use tag: URIs. Tags have a time component to them, so the temporal notion is expressed sufficiently. However, I am not sure about the authority since whoever runs the application will in essence become the authority. Another option is to use my own domain and somehow "grant" the rights of minting tags to anyone running the application (or just to anyone in general). This way, I can express everything I need (the temporary notion of the resource, the uniqueness since I would also include a new UUID in the URI, and a description of the purpose and use of the tags on the website itself).
This is in my opinion not a bad solution, but it somewhat limits the usage of the application: once my control of the domain expires, users of the application will no longer have the rights to mint new tags under the original authority, and they will have to find a new one (unless I leave the authority blank; while it is not strictly stated by the specification, anything is technically valid since implementations must not reject non-conforming authorities). I am also concerned about the time component itself, as it might pose a security risk (although it can be made less specific). I am also not overly accustomed to the concept of tags, so I am not sure whether this is against their spirit.
I can provide all of these methods as options in the application, but I would like to know if there are perhaps better options.
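For concreteness, here is a sketch of how options 1 and 3 could be minted in Python (the tag: authority example.org and the date handling are placeholders, not a recommendation):

    import uuid
    from datetime import date

    # Option 1: a bare UUID URN (RFC 4122) -- standard, but nothing about it
    # signals that the identifier was minted for one temporary occasion.
    temp_uri = f"urn:uuid:{uuid.uuid4()}"

    # Option 3: a tag: URI (RFC 4151) -- the date records when the authority
    # held minting rights; it could be made less specific (year only) if the
    # full date feels like a security concern.
    tag_uri = f"tag:example.org,{date.today().isoformat()}:{uuid.uuid4()}"

    print(temp_uri)  # urn:uuid: followed by a fresh random UUID
    print(tag_uri)   # tag:example.org,<today>:<fresh random UUID>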

Etag: weak vs strong example

I have been reading about ETags, and I understand that there are two ways of generating an ETag: weak and strong. Weak ETags are computationally easier to generate than strong ones. I have also come to know that weak ETags are sufficient in practice for most use cases.
From MDN:
Weak validators are easy to generate but are far less useful for comparisons. Strong validators are ideal for comparisons but can be very difficult to generate efficiently.
Another snippet:
Weak ETag values of two representations of the same resources might be semantically equivalent, but not byte-for-byte identical.
I am finding it hard to understand what it means for a resource to be semantically equivalent but not byte-for-byte identical. It would be great to see some examples.
EDIT: I found an example here, but I don't get it:
Weak Validation: The two resource representations are semantically equivalent, e.g. some of the content differences are not important from the business logic perspective, e.g. the current date displayed on the page might not be important for updating the entire resource for it.
Is it that, while generating the ETag, you can decide that certain changes in content are not important for the functionality (e.g. a CSS property change for font-size) and respond with 304? If so, when is the resource updated on the browser? As long as the ETag is the same, the browser will not get the latest version. In this case it might mean that when a major change happens and a new ETag is created, the CSS property change would only then be sent to the browser along with the major change.
My suggestion is to look at the specification, RFC 7232, section 2.1. It's only a couple of pages long and may answer all of your questions.
You asked for examples, here are some from the specification:
For example, the representation of a weather report that changes in content every second, based on dynamic measurements, might be grouped into sets of equivalent representations (from the origin server's perspective) with the same weak validator in order to allow cached representations to be valid for a reasonable period of time.
A representation's modification time, if defined with only one-second resolution, might be a weak validator if it is possible for the representation to be modified twice during a single second and retrieved between those modifications.
If the origin server sends the same validator for a representation with a gzip content coding applied as it does for a representation with no content coding, then that validator is weak.
That last one represents what is probably the most common use of weak ETags: servers converting strong ETags into weak ones when they gzip the content. Nginx does this, for example.
The specification also explains when to change a weak ETag:
An origin server SHOULD change a weak entity-tag whenever it considers prior representations to be unacceptable as a substitute for the current representation.
In other words, it's up to you to decide if two representations of a resource are acceptable substitutions or not. If they are, you can improve caching performance by giving them the same weak ETag.
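As a rough illustration of "acceptable substitution", a server might derive the strong ETag from the exact bytes of the representation and the weak one only from the fields it considers meaningful. This is a sketch of the idea, not any particular server's algorithm:

    import hashlib
    import json

    def strong_etag(body: bytes) -> str:
        # Changes whenever any byte changes (whitespace, content coding, ...).
        return '"%s"' % hashlib.sha256(body).hexdigest()[:16]

    def weak_etag(record: dict) -> str:
        # Ignores presentation details (here: a timestamp) and hashes only
        # the fields that matter from the business-logic perspective.
        significant = {k: v for k, v in record.items() if k != "generated_at"}
        payload = json.dumps(significant, sort_keys=True).encode()
        return 'W/"%s"' % hashlib.sha256(payload).hexdigest()[:16]

    report = {"city": "Oslo", "temp_c": 21.4, "generated_at": "12:00:01"}
    later  = {"city": "Oslo", "temp_c": 21.4, "generated_at": "12:00:02"}
    assert weak_etag(report) == weak_etag(later)  # semantically equivalent

Two representations that differ only in the timestamp get the same weak ETag, so a conditional request can be answered with 304 even though the bytes differ.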

What are the Spring Web Flow advantages?

Can someone help me understand the advantages of Spring Web Flow?
What I have understood is
All the flows can be centrally configured in an XML file.
Need not have an overhead of carrying over the data from one request to another as it can be done by flow scope.
It helps especially in cases like Breadcrumbs navigation.
Flows can be further divided into Sub Flows to reduce the complexity.
Are there any others that I have not looked into?
I'm going to play devil's advocate and say don't use it for anything other than simple use cases: no AJAX calls, no modal dialogs, no partial updates, just standard HTML forms/flows for simple persistence (i.e. page A -> page B -> page C, where each 'page' maps to a view-state definition in a 1-to-1 relationship, all defined in the same flow XML file).
Spring Web Flow cons:
Yes, everything is in XML files. In theory this is supposed to be simple, but when you have multiple flow XML files, each with multiple state definitions and possibly subflow definitions, it can become cumbersome to maintain or to determine what the sequential logic of a flow is (kind of like the old GOTO operator, where any part of the flow logic can jump to any earlier or later defined part, making the flow logic, although seemingly "sequential" in XML, unintuitive to follow).
Some features of Spring Web Flow are unintuitive or flat out undocumented, leading to hours of trial and error. For instance: exception handling; usage of the 'output' tag (it only works from a subflow back to the parent caller, which is undocumented); and sending flash view responses back to a user, which is also unintuitive and uses a different container than Spring MVC (many times when a flow ends you want to send the user a message that is defined in a controller outside of Web Flow, but since the flow has ended you can't do it within Spring Web Flow using the flashScope container); etc.
Adding subflows sounds good but does not reduce complexity; it actually increases it, due to the way subflows are defined. Definitions are long and complex and can be confusing when you have many end-states in both the main parent flow and the child subflows.
Initial setup and configuration can be painful if integrating with certain 3rd party view frameworks like Apache Tiles or Thymeleaf; I recall spending a few hours, if not days, on this.
State snapshots (saving the user's input between pages) are a powerful feature between Flow A's view-state_1 <-> Flow A's view-state_2 and vice versa, but this does not work between Main Flow A <-> Sub Flow B and vice versa, forcing the developer to manually bind (or rather hack) the saving of a user's state between parent main flows <-> subflows.
Debugging application logic placed inside Web Flow can be difficult. For instance, in Web Flow you can assign variables and perform condition checks using SpEL inside the XML, but this tends to be a pitfall. Over time you learn to avoid placing application logic inside the actual Web Flow XML and to use the XML only to call service class methods and to place the returned values in the various scopes (again, this hard-learned lesson/best practice is undocumented). Also, because you are executing logic using SpEL, refactoring classes, method names, or variables sometimes silently breaks your application, significantly increasing your development time.
Fragment rendering: a powerful but unintuitive feature of Web Flow. Setting up fragment rendering was one of the most painful things I had to do with Web Flow; the documentation was lacking. I think this feature could easily go on the pros side if it were better documented and easier to set up. I actually documented how to use this feature on Stack Overflow: How to include a pop-up dialog box in subflow
Static URLs per flow. If multiple views are defined within one flow, your URL will not change when navigating from view-state to view-state. This can be limiting if you want to control or match the content of the page with a dynamic URL.
If your flow is defined in "/WEB-INF/flows/doSumTing/sumting-flow.xml" and your "base-path" is set to "WEB-INF/flows", then to navigate to your flow you go to http://<your-host>/<your-webapp-name-if-defined>/doSumTing . The flow file name is completely ignored and not used in the URL at all. Although clear to me now, I found this unintuitive when I first started.
Spring Web Flow pros:
concept of "scope" containers flowScope, viewScope, flashScope, sessionScope and having easy access to these containers WITH IN A FLOW gives the developer flexibility as these are accessible from anywhere and are mutable.
Easily defined states (view-state, action-state, decision-state, end-state) clearly show what each state is doing, but as mentioned in the cons, if your application is complex and has MANY different states and transitions constantly going back and forth, this can clutter your -flow.xml file and make it hard to read or follow the sequential logic. It's only easy if you have simple use cases with a small number of state definitions.
A seldom used but powerful feature of Web Flow is flow inheritance. Common flow functionality across multiple flows can be defined in a single abstract parent flow and extended by child flows (similar to an abstract class definition in Java). This feature is nice with regard to the DRY principle if you have many flows that share common logic.
Easily defined validation rules and support for JSR-303 (but Spring MVC has this as well)
The output tag can be used to send POJOs back and forth between main flow <-> subflow. This feature is nice because the parameters don't need to be passed through the URL via GET/POST, and you can pass as many POJOs as you wish.
Clearly defined views: what the view name is and which model variable it is mapped to (e.g. <view-state id="edit" view="#{flowScope.modelPathName}/editView" model="modelObj">). As the example just demonstrated, you can use preprocessing expressions for view names or most arguments in Web Flow; a nice feature, though not very well documented :/
Conclusion: The Spring Web Flow project was a good idea and sounds great on paper, but the cons make it cumbersome to work with in complex use cases, increasing development time significantly. Since a better solution exists for complex use cases (Spring MVC), to me it is not worth investing heavily in Web Flow, because you can achieve the same results with Spring MVC, and with faster development time for both complex and simple use cases. Moreover, Spring MVC is actively maintained, has better documentation, and has a larger user community. Maybe if some of my cons were addressed it would tip the scales in Web Flow's favor, but until then I will recommend Spring MVC over Web Flow.
Note: I might be missing some things but this is what I came up with off the top of my head.
Additionally, you can use the back button and retain state up to the number of snapshots you choose to store.
You also may find this related question useful.

Why do Request.Cookies and Response.Cookies both use the same object?

Request.Cookies and Response.Cookies both contain a collection of HttpCookies; however, the usage of the Cookie object differs in each. For example, the value contained in Request.Cookies["MyCookie"].Expires seems to be useless, since browsers don't actually send the expiration date back to the server with the request. But since this field exists, it causes a lot of confusion: developers assume the field has meaning, try to use it, and then inevitably search to find out why the expiration date is always 1/1/0001. There are other unused fields as well when looking at a cookie in the Response vs. the Request, because they are used in different ways, so I wonder:
What are the potential design reasons why a single class (HttpCookie) is used for both a request cookie and a response cookie, given the usage concerns noted above?
Edit: I see some people have voted to close this question because it is too opinion-based. Someone certainly might know the answer to this, e.g. it was designed this way because of X. I would also be interested in someone's best guess, if no one outside of MS knows what X is.
Edit 2: Another valid answer would be that it was probably an oversight and they should be different objects.
I never found this in my original searching, but I'm guessing Anthony's response to this question is probably the best I'm going to get. He proposes:
Strictly speaking .NET ought to have used two different types (RequestCookie and ResponseCookie) but instead chose to use the same type for both circumstances.
I'll happily accept an answer that offers valid reasons (or conjecture) for why that choice was made, if it was intentional.

Assigning URIs to RDF Resources

I'm writing a desktop app using Gnome technologies, and I have reached the stage where I am planning Semantic Desktop support.
After a lot of brainstorming, sketching ideas and models, writing notes and reading a lot about RDF and related topics, I finally came up with a draft plan.
The first thing I decided to do is to define the way I give URIs to resources, and this is where I'd like to hear your advice.
My program consists of two parts:
1) On the lower level, an RDF schema is defined. It's a standard set of classes and properties, possibly extended by users who want more options (using a definition language translated to RDF).
2) On the higher level, the user defines resources using those classes and properties.
There's no problem with the lower level, because the data model is public: even if a user decides to add new content, she's very welcome to share it and make other people's apps have more features. The problem is with the second part. On the higher level, the user defines tasks, meetings, appointments, plans and schedules. These may be private, and the user may prefer not to have any info in the URI revealing the source of the information.
So here are the questions I have on my mind:
1) Which URI scheme should I use? I don't have a website or any web pages, so using http doesn't make sense. It also doesn't seem to make sense to use any other standard IANA-registered URI scheme. I've been considering two options: use some custom URI scheme name of my own for public resources, and use a bare URN for private ones, something like this:
urn:random_name_i_made_up:some_private_resource_uuid
But I was wondering whether a custom URI scheme is a good decision, so I'm open to hear ideas from you :)
2) How to hide the private resources? On one hand, it may be very useful for the URI to tell where a task came from, especially when tasks are shared and delegated between people. On the other hand, it doesn't consider privacy. Then I was thinking: can I/should I use two different URI styles depending on user settings? This would create some inconsistency. I'm not sure what to do here, since I don't have any experience with URIs. Hopefully you have some advice for me.
1) Which URI scheme should I use?
I would advise the standard urn:uuid: followed by your resource UUID. Using standards is generally to be preferred over home-grown solutions!
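For example, with Python's rdflib (just one way to do it; the schema URI below is a placeholder), minting such an identifier looks like this:

    import uuid
    from rdflib import Graph, URIRef, Literal

    g = Graph()
    task = URIRef(f"urn:uuid:{uuid.uuid4()}")          # standard, opaque, unique
    title = URIRef("http://example.org/schema#title")  # placeholder property
    g.add((task, title, Literal("Dentist appointment")))
    print(g.serialize(format="turtle"))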
2) How to hide the private resources?
Don't use different identifier schemes. Trying to bake authorization and access control into the identity scheme is mixing the layers in a way that's bound to cause you pain in the future. For example, what happens if a user makes some currently private content (e.g. a draft) public (it's now in its publishable form)?
Have a single, uniform identifier solution, then provide one or more services that may or may not resolve a given identifier to a document, depending on context (user identity, metadata about the content itself, etc.). Yes, this is much like what an HTTP server would do, so you may want to reconsider whether to have an embedded HTTP service in your architecture. If not, the service you need will have many similarities to HTTP; you just need to be clear about the circumstances in which an identifier may be resolved to a document, what happens when that is either not possible or not permitted, and so on.
You may also want to consider where you're going to add the most value. Re-inventing the basic service access protocols may be a fun exercise, but your users may get more value if you re-use standard components at the basic service level, and concentrate instead on innovating and adding features once the user actually has access to the content objects.
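To sketch what such a resolution service could look like (all names here are illustrative, not a prescribed design):

    # Map an identifier to a document, or explain why resolution failed.
    def resolve(uri, user, store, acl):
        if uri not in store:
            return None, "unknown identifier"   # analogous to HTTP 404
        if user not in acl.get(uri, set()):
            return None, "not permitted"        # analogous to HTTP 403
        return store[uri], None

    store = {"urn:uuid:1b4e28ba-2fa1-11d2-883f-0016d3cca427": "draft contents"}
    acl   = {"urn:uuid:1b4e28ba-2fa1-11d2-883f-0016d3cca427": {"alice"}}
    print(resolve("urn:uuid:1b4e28ba-2fa1-11d2-883f-0016d3cca427", "alice", store, acl))
    print(resolve("urn:uuid:1b4e28ba-2fa1-11d2-883f-0016d3cca427", "bob", store, acl))

The same identifier resolves or not depending on who is asking, while the identity scheme itself stays uniform.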
