POSTing to a URI with GET query params? - http

I stumbled upon some code the other day that was making use of query params specified in the URI while at the same time being an HTTP POST.
I was just wondering, is the interpretation of these fields vendor specific? Do the RFCs say anything specific about it? And if a parameter exists in both, which one wins out?
To illustrate better, the query looked something like this:
POST /posts/?user=bob HTTP/1.1
user=bill&title=Test&content=Testing+Content
Thanks

This is perfectly legal. Many frameworks support it; for example, the Servlet API even specifies the order in which the values are returned by getParameterValues(String), with the query-string values first. For example, this is also legal; note that the parameter names are the same.
POST /path?param1=value HTTP/1.1
Host: localhost
param1=value&param2=value
This is also valid according to the HTTP/1.1 RFC; take a look at RFC 2616.
It should not be vendor specific, and most comprehensive frameworks will support it.

There is no trumping. The GET and POST values are passed as separate collections.
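To make the "separate collections" point concrete, here is a minimal Python sketch, not tied to any particular framework, that parses the query string and the form body from the question's example independently. The helper name split_params is hypothetical, and the merge at the end only imitates the query-first ordering mentioned above for the Servlet API.

```python
from urllib.parse import urlsplit, parse_qs

def split_params(request_target, body):
    """Return (query_params, body_params) as two separate dicts of lists."""
    query_params = parse_qs(urlsplit(request_target).query)
    body_params = parse_qs(body)
    return query_params, body_params

query, form = split_params(
    "/posts/?user=bob",
    "user=bill&title=Test&content=Testing+Content",
)

# Neither value overwrites the other: each collection keeps its own "user".
print(query["user"])
print(form["user"])

# A framework that merges both sources has to pick an order.
# Query-first ordering puts the URL value ahead of the body value.
merged = {k: query.get(k, []) + form.get(k, []) for k in {**query, **form}}
print(merged["user"])
```

Whether a given framework exposes the two sources separately or merged is framework-specific, so check its documentation before relying on either behavior.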

I do this occasionally. Usually I'll put the actual update fields in the POST data, with the query data used to format the response.


Documentation for Rebol2's read/custom?

I've been trying to update Ross-Gill's Twitter API for REBOL2 to support uploading media. From looking at its source, the REBOL cookbook, the codeconscious site, and other questions here, my understanding is that read/custom is the preferred way to POST data to websites.
However, I haven't been able to find any real documentation on read/custom. For example: Does it support sending multipart/form-data? (I've managed to work around this by manually composing each part, but it doesn't seem to work for all image files on Twitter's end and is a bit of a hack). Does read/custom only return text on an HTTP/1.0 200 OK response? (It appears so, which is problematic when I receive HTTP/1.0 202 Accepted and need to read the resulting data). Is there a reason that read/custom/binary doesn't appear to send binary data correctly without converting the data using to-string?
TL;DR: Is there good documentation on REBOL2's read/custom somewhere? Alternatively, is read/custom only meant for basic POSTs and I should be using ports and handling the HTTP responses manually?
You guessed right, read/custom is meant for simple HTTP posts, handling web forms data only (that is why it will fail on binary data). No official documentation for it. But that is not an issue as you can access the source code of the HTTP implementation:
probe system/schemes/HTTP
There you can see that /custom refinement supports two keywords, post and header (for setting custom HTTP headers). It also appears that even if you use both keywords, Content-Type will be forced to application/x-www-form-urlencoded no matter what (which is probably the reason why your binary data gets rejected by the server, as the provided mime type is wrong).
In order to work around that, you can save the HTTP object, modify its implementation to fit your needs and reload it.
Saving:
save %http-scheme.r system/schemes/HTTP
Reloading:
system/schemes/HTTP: do load %http-scheme.r
If you just disable the hard-coded Content-Type setting in the HTTP code, and then provide your own one using header keyword, it should work fine, even with binary data:
read/custom <url> [header [Content-Type: <...>] post <data>]
Hope this helps.

Is it true that POST can be used instead of GET in all scenarios?

I've read lots of articles about the differences between GET and POST. Lots of them are available here at StackOverflow.
A summary of the important differences is:
POST can send its information in the body, while GET should not (though I think it can be done in practice).
Some browsers cache GET results and rely on the idempotent behavior of GET requests.
Using GET is much easier than using POST for most developers.
To conclude this summary: using GET in POST situations is bad and dangerous.
But is it true that, ease of use aside, POST can be used as a replacement for GET requests, since it seems to cover all of GET's requirements?
To be clear, I'm not crazy! I'm not going to use POST instead of GET. This question is just to check whether I understand the difference between GET and POST correctly.
No, POST is not a replacement of GET requests. There are two important things that a POST request cannot do that a GET request can.
You cannot generate a POST request simply by typing a URL in the address bar of the browser. This always generates a GET request.
You cannot generate a POST request using an ordinary link in HTML. This has far-reaching consequences. You cannot find a page that is only accessible using a POST request with any search engine, and you cannot link to it unless it is done by an HTML form or using JavaScript.
It's good practice to classify your transactions. These methods are especially important when you are developing an API, a service-oriented architecture, or even single-page applications.
GET - used to retrieve a dataset. (also has a limitation for url length. parameters are exposed and urlencoded.)
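POST - saving/adding (the parameters travel in the request body rather than the URL, so they typically stay out of browser history and server access logs; the body is still sent in the clear unless you use HTTPS)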
EX:
GET /items - means you are getting the list of items.
POST /items - means you are saving/adding item(s)
and later you might need to learn PUT and DELETE too.
But for now, always use POST in your form or Ajax request when saving/adding data, and GET when retrieving data.
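As a rough illustration of the GET /items versus POST /items split, the following Python sketch builds both requests with the standard library without sending them. The host example.com and the page and name parameters are assumptions made up for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://example.com/items"  # hypothetical endpoint

# Retrieval: parameters travel in the URL and there is no request body.
get_req = Request(BASE + "?" + urlencode({"page": 2}), method="GET")

# Saving/adding: parameters travel in the body as form data.
post_req = Request(
    BASE,
    data=urlencode({"name": "widget"}).encode("ascii"),
    method="POST",
)

print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Passing either object to urllib.request.urlopen would actually send it; that step is left out here because the endpoint is made up.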

Using "application/octet-stream" as a content type in POST

I'm thinking about implementing an RPC mechanism over HTTP. The POST method seems to be suitable for the calls. However, since each call comes with the binary payload, there's a decision to be made about how to attach that binary data to the POST request. It seems there are two content types for POSTs of use today: application/x-www-form-urlencoded and multipart/form-data. The former seems to require percent-escaping binary data, while the latter seems to add some overhead with the boundaries/content-disposition fields.
Therefore my question is: how good is it to just use application/octet-stream as a POST content type and just include the binary payload afterwards as is? Will it go through all proxies? Will all HTTP servers be able to handle this? Is it standards-compliant? In other words, should I go for it?
Yes, you can do that; but it would be better to use a more specific type that makes the message self-descriptive.
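For what it's worth, here is a minimal Python sketch of what such a call could look like on the client side: the raw bytes go into the body as-is, with no percent-escaping and no multipart boundaries. The URL and the helper name build_rpc_request are hypothetical, and the request is only constructed, not sent.

```python
from urllib.request import Request

def build_rpc_request(url, payload):
    """Build (but do not send) a POST carrying raw bytes as the body."""
    if not isinstance(payload, (bytes, bytearray)):
        raise TypeError("payload must be bytes")
    return Request(
        url,
        data=bytes(payload),
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )

req = build_rpc_request("http://example.com/rpc", b"\x00\x01\xfe\xff")

# urllib stores header names capitalized as "Content-type".
print(req.get_method(), req.get_header("Content-type"), len(req.data))
```

If you later define a more specific media type for your RPC payload, as suggested above, only the Content-Type value in this sketch would need to change.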

HTTP requests and querystring vs headers?

I'm trying to understand the difference between querystrings and headers. Where do you use each?
Query strings might be more useful in making URLs human readable I suppose, but other than that, wouldn't it be easier to just embed that in your own custom HTTP header (side question, but how this relate to cookies?)? What's the distinction between the two?
See the similar question Adding Custom HTTP Headers.
Why would I prefer the query string over HTTP header fields?
It is easy.
I don't need any additional API.
It is also recommended in the HTTP RFC to "follow common-forms" when it comes to header fields.

Do search engines respect the HTTP header field “Content-Location”?

I was wondering whether search engines respect the HTTP header field Content-Location.
This could be useful, for example, when you want to remove the session ID argument out of the URL:
GET /foo/bar?sid=0123456789 HTTP/1.1
Host: example.com
…
HTTP/1.1 200 OK
Content-Location: http://example.com/foo/bar
…
Clarification:
I don’t want to redirect the request, as removing the session ID would lead to a completely different request and thus probably also a different response. I just want to state that the enclosed response is also available under its “main URL”.
Maybe my example was not a good representation of the intent of my question. So please take a look at What is the purpose of the HTTP header field “Content-Location”?.
I think Google just announced the answer to my question: the canonical link relation for declaring the canonical URL.
Maile Ohye from Google wrote:
MickeyC said...
You should have used the Content-Location header instead, as per:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
"14.14 Content-Location"
@MikeyC: Yes, from a theoretical standpoint that makes sense and we certainly considered it. A few points, however, led us to choose rel="canonical":
Our data showed that the "Content-Location" header is configured improperly on many web sites. Sometimes webmasters provide long, ugly URLs that aren’t even duplicates -- it's probably unintentional. They're likely unaware that their webserver is even sending the Content-Location header.
It would've been extremely time consuming to contact site owners to clean up the Content-Location issues throughout the web. We realized that if we started with a clean slate, we could provide the functionality more quickly. With Microsoft and Yahoo! on-board to support this format, webmasters need to only learn one syntax.
Often webmasters have difficulty configuring their web server headers, but can more easily change their HTML. rel="canonical" seemed like a friendly attribute.
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html?showComment=1234714860000#c8376597054104610625
Most decent crawlers do follow Content-Location. So, yes, search engines respect the Content-Location header, although that is no guarantee that the URL having the sid parameter will not be on the results page.
In 2009 Google started looking at URIs qualified as rel=canonical in the response body.
Looks like since 2011, links formatted as per RFC5988 are also parsed from the header field Link:. It is also clearly mentioned in the Webmaster Tools FAQ as a valid option.
I guess this is the most up-to-date way of giving search engines some extra hypermedia breadcrumbs to follow, which lets you keep them out of the response body when you don't actually need to serve them as content.
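As a sketch of what that Link header value could look like for the session-ID example from the question, the Python snippet below strips the sid parameter and formats the remainder as a rel="canonical" link. The function name canonical_link_header and the choice of which parameters to drop are assumptions for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def canonical_link_header(url, drop=("sid",)):
    """Return a Link header value pointing at url without the dropped params."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    canonical = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
    return f'<{canonical}>; rel="canonical"'

value = canonical_link_header("http://example.com/foo/bar?sid=0123456789")
print("Link: " + value)
```

The server would send this alongside the normal 200 response, so the request is not redirected and the session-specific content is still served.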
In addition to using 'Location' rather than 'Content-Location', use the proper HTTP status code in your response depending on your reason for the redirect. Search engines tend to favor a permanent redirect (301) over a temporary one (302).
Try the "Location:" header instead.
