Proper way to include data with an HTTP PATCH request - http

When I'm putting together an HTTP PATCH request, what are my options to include data outside of URL parameters?
Will any of the following work, and what's the most common choice?
multipart/form-data
application/x-www-form-urlencoded
Raw JSON
...any others?

There are no restrictions on the entity bodies of HTTP PATCH requests as defined in RFC 5789. So in theory, your options in this area are unlimited.
In my opinion the only sensible choice is to use the same Content-Type used to originally create the resource. The most common choice is application/json simply because most modern APIs utilize JSON as their preferred data transfer format.
The only relevent statement RFC 5789 makes in regard to what should and shouldn't be part of your PATCH entity body is silent on the matter of Content-Type:
the enclosed entity contains a set of instructions describing how a resource currently residing on the origin server should be modified to produce a new version.
In summary, how you choose to modify resources in your application is entirely up to you.

As rdlowrey writes, RFC 5789 does not mandate specific content types, so the choice of format is up to you.
However, using the general formats you listed or making up your own format is not interoperable, and developers could have a hard time figuring out the semantics you chose. An official erratum to the RFC states this in a more formal way:
The means of applying a PATCH request to a resource's state is
determined by the request's media type. If a server receives a PATCH
request with a media type whose specification does not define
semantics specific to PATCH, the server SHOULD reject the request by
returning the 415 Unsupported Media Type status code, unless a more
specific error status code takes priority.
In particular, servers SHOULD NOT assume PATCH semantics for generic
media types that don't define them, such as application/xml or
application/json. Doing so will cause interoperability issues,
because the semantics of PATCH become specific to that resource,
rather than general.
(Quote formatted for readability, but unchanged otherwise)
One media type whose specification defines PATCH semantics is application/json-patch+json, also called JSON Patch: RFC 6902. I suppose it could be considered the "standard" choice (at least) when dealing with data originally posted as JSON.

The PATCH method is defined in the RFC 5789. This document, however, doesn't enforce any media type for the payload:
The PATCH method requests that a set of changes described in the request entity be applied to the resource identified by the Request-URI. The set of changes is represented in a format called a "patch document" identified by a media type.
Other RFCs, released years later, define some media types for describing a set of changes to the applied to a resource, suitable for PATCHing:
application/json-patch+json
Defined in the RFC 6902:
JSON Patch defines a JSON document structure for expressing a sequence of operations to apply to a JavaScript Object Notation (JSON) document; it is suitable for use with the HTTP PATCH method. The application/json-patch+json media type is used to identify such patch documents.
application/merge-patch+json
Defined in the RFC 7396:
This specification defines the JSON merge patch format and processing rules. The merge patch format is primarily intended for use with the HTTP PATCH method as a means of describing a set of modifications to a target resource's content.

Related

Why does 'x-www-form-urlencoded' begin with 'x-www', when other standard content types do not?

I understand that in the past, it was standard for custom headers names to use the prefix "X-" (I'm aware it no longer is considered standard to do this), but I've been unable to find if there is any relationship between this naming convention and the value ("application/x-www-form-urlencoded"). Did it start out as a custom content-type value that was later adopted or something?
I found this link here, which certainly was interesting, but have been unable to find the answer to my question.
Does anybody know the reason this prefix was chosen, and what it signifies?
it was standard for custom headers names to use the prefix "X-"
Actually … no, not at all. To be precise: It has never been a standard, just a best practice. It allowed implementors to introduce new content types and codings without the need to write an entire RFC for it. Nowadays the IANA Media Type Registry is good for that. RFC 6648 put an end to this practice.
The reason application/x-www-form-urlencoded is prefixed in this way (it is listed as a proper MIME type in said registry, btw)) stems from the fact that it is a "custom" method of structuring the query string in a URL. That part has never seen proper regulation. The people behind HTML just went and did it, which fully justified the prefix.
As far as the history: it has the x- prefix because it originated in a proposal from Mosaic—and since it was just a proposal, they used that x- extension prefix to initially define it. But then other browsers implemented it that way too, and nobody ever got around to taking the time to properly standardize an unprefixed alternative, so it just stuck that way, and here were are now.
It can be traced back to a 1993 thread on the www-talk mailing list titled “Submitting input-form data to server”, and in that thread, a September 1993 message from Marc Andreessen:
This is what we're doing in Mosaic 2.0… See
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/fill-out-forms/overview.html
...for details on what we're up to
That link is broken now but the document, titled “Mosaic for X version 2.0 Fill-Out Form Support” is archived at archive.org. Here’s the relevant excerpt:
ENCTYPE specifies the encoding for the fill-out form contents. This attribute only applies if METHOD is set to POST -- and even then, there is only one possible value (the default, application/x-www-form-urlencoded) so far.
Anyway, application/x-www-form-urlencoded is now formally defined in the URL spec, with algorithms for parsing and serializing it—though the section it’s all defined in has this note:
The application/x-www-form-urlencoded format is in many ways an aberrant monstrosity, the result of many years of implementation accidents and compromises leading to a set of requirements necessary for interoperability, but in no way representing good design practices. In particular, readers are cautioned to pay close attention to the twisted details involving repeated (and in some cases nested) conversions between character encodings and byte sequences. Unfortunately the format is in widespread use due to the prevalence of HTML forms.

Using "application/octet-stream" as a content type in POST

I'm thinking about implementing an RPC mechanism over HTTP. The POST method seems to be suitable for the calls. However, since each call comes with the binary payload, there's a decision to be made about how to attach that binary data to the POST request. It seems there are two content types for POSTs of use today: application/x-www-form-urlencoded and multipart/form-data. The former seems to require percent-escaping binary data, while the latter seems to add some overhead with the boundaries/content-disposition fields.
Therefore my question is: how good is it to just use application/octet-stream as a POST content type and just include the binary payload afterwards as is? Will it go through all proxies? Will all HTTP servers be able to handle this? Is it standards-compliant? In other words, should I go for it?
Yes, you can do that; but it would be better to use a more specific type that makes the message self-descriptive.

What RFC defines arrays transmitted over HTTP?

What RFC defines the passing arrays over HTTP? Most web application platforms allow you to supply an array of arguments over GET or POST. The following URL is an example:
http://localhost/?var[1]=one&var[2]=two&var[3]=three
RFC1738 defines URLs, however the bracket is missing from the Backus–Naur Form(BNF) definition of the URL. Also this RFC doesn't cover POST. Ideally I would like to get the BNF for this feature as defined in the RFC.
According to Wikipedia, there is no single spec:
While there is no definitive standard, most web frameworks allow multiple values to be associated with a single field (eg. field1=value1&field1=value2&field2=value3)
That Wikipedia article links to the following Stack Overflow post, which covers a similar question: Authoritative position of duplicate HTTP GET query keys
The issue here is that form parameters can be whatever you want them to be. Some web frameworks have settled on key[number]=value for arrays, others haven't. Interestingly, RFC1866 section 8.2.4, page 48 (note: this RFC is historical and not current) shows an example with the same key used twice in a form POST:
name=John+Doe
&gender=male
&family=5
&city=kent
&city=miami
&other=abc%0D%0Adef
&nickname=J%26D
On the W3C side of things, HTML 4.01 has some information about how to encode form parameters. Sadly this doesn't cover arrays.
At the time of writing, I don't think there is a correct answer to your question - no IETF RFC or W3C spec defines the behavior that you're interested in.
(As a side note, the W3C HTML JSON form submission draft spec covers posting arrays, thank goodness.)
URIs are defined by RFC 3986.
However, what you're asking about is encoding of form parameters. You need to look up the HTML spec for that.

Purpose of +xml in HTTP MIME type

What's the significance of the +xml in the following HTTP Accept Header:
application/vnd.google-earth.kml+xml
Is that just to denote it's an XML based format, or that it's suitable for XML editors, or something completely different?
http://www.iana.org/assignments/media-types/application/vnd.google-earth.kml+xml
Check the RFC:
Appendix A. Why Use the '+xml' Suffix for XML-Based MIME Types?
Although the use of a suffix was not considered as part of the
original MIME architecture, this choice is considered to provide the
most functionality with the least potential for interoperability
problems or lack of future extensibility. The alternatives to the
'+xml' suffix and the reason for its selection are described below.
There is a whole list of reasons listed underneath, but I don't think copying them verbatim falls under fair-use. So check the RFC for their full (long!) story of how this came to be.

POSTing to a URI with GET query params?

I stumbled upon some code the other day that was making use of query params specified in the URI while at the same time being an HTTP POST.
I was just wondering, is the interpretation of these fields vendor specific? Do the RFCs say anything specific about it? And if a parameter exists in both, which one wins out?
To illustrate better, the query looked something like this:
POST /posts/?user=bob HTTP/1.1
user=bill&title=Test&content=Testing+Content
Thanks
This is perfect legal. Many frameworks have support for it for example the Servlet API even specifies the priority (order) of the arguments as they appear in getParameters(String) which will provide the query parameter first. For example this is also legal, not the parameter names are the same.
POST /path?param1=value HTTP/1.1
Host: localhost
param1=value&param2=value
This is also valid according to the HTTP/1.1 RFC, a look at RFC 2616.
It should not be vendor specific, and most comprehensive frameworks will support it.
There is no trumping. The GET and POST values are passed as separate collections.
I do this occasionally. Usually i'll put the actual update fields in the post data, with query data used to format the response

Resources