I've been trying to update Ross-Gill's Twitter API for REBOL2 to support uploading media. From looking at its source, the REBOL cookbook, the codeconscious site, and other questions here, my understanding is that read/custom is the preferred way to POST data to websites.
However, I haven't been able to find any real documentation on read/custom. For example:
Does it support sending multipart/form-data? (I've managed to work around this by manually composing each part, but it doesn't seem to work for all image files on Twitter's end and is a bit of a hack.)
Does read/custom only return text on an HTTP/1.0 200 OK response? (It appears so, which is problematic when I receive HTTP/1.0 202 Accepted and need to read the resulting data.)
Is there a reason that read/custom/binary doesn't appear to send binary data correctly without converting the data using to-string?
TL;DR: Is there good documentation on REBOL2's read/custom somewhere? Alternatively, is read/custom only meant for basic POSTs and I should be using ports and handling the HTTP responses manually?
You guessed right: read/custom is meant for simple HTTP POSTs, handling web form data only (which is why it fails on binary data). There is no official documentation for it, but that is not a big issue, since you can inspect the source code of the HTTP implementation:
probe system/schemes/HTTP
There you can see that /custom refinement supports two keywords, post and header (for setting custom HTTP headers). It also appears that even if you use both keywords, Content-Type will be forced to application/x-www-form-urlencoded no matter what (which is probably the reason why your binary data gets rejected by the server, as the provided mime type is wrong).
In order to work around that, you can save the HTTP object, modify its implementation to fit your needs and reload it.
Saving:
save %http-scheme.r system/schemes/HTTP
Reloading:
system/schemes/HTTP: do load %http-scheme.r
If you just disable the hard-coded Content-Type setting in the HTTP code and then provide your own using the header keyword, it should work fine, even with binary data:
read/custom <url> [header [Content-Type: <...>] post <data>]
Hope this helps.
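For readers who want to see what "manually composing each part" amounts to on the wire, here is a minimal sketch in Python (not REBOL, and not taken from the Twitter API script discussed above). The function name encode_multipart and the field names "status" and "media" are assumptions made up for illustration. The point it shows is that the file bytes are appended untouched, never passed through a string conversion, and that the Content-Type header must carry the same boundary as the body.

```python
import uuid


def encode_multipart(fields, files):
    """Build a multipart/form-data body by hand.

    fields: dict mapping field name -> text value
    files:  dict mapping field name -> (filename, content_type, raw_bytes)
    Returns (content_type_header_value, body_bytes).
    """
    boundary = uuid.uuid4().hex          # random delimiter, assumed absent from the data
    delimiter = b"--" + boundary.encode("ascii")
    lines = []

    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(('Content-Disposition: form-data; name="%s"' % name).encode("utf-8"))
        lines.append(b"")                # blank line separates part headers from content
        lines.append(value.encode("utf-8"))

    for name, (filename, content_type, data) in files.items():
        lines.append(delimiter)
        lines.append(
            ('Content-Disposition: form-data; name="%s"; filename="%s"'
             % (name, filename)).encode("utf-8")
        )
        lines.append(("Content-Type: %s" % content_type).encode("ascii"))
        lines.append(b"")
        lines.append(data)               # raw bytes, no text conversion

    lines.append(delimiter + b"--")      # closing delimiter
    lines.append(b"")                    # body ends with CRLF
    body = b"\r\n".join(lines)
    return "multipart/form-data; boundary=" + boundary, body


# Hypothetical usage: one text field and one binary "file".
content_type, body = encode_multipart(
    {"status": "hello"},
    {"media": ("a.png", "image/png", b"\x89PNG\x00\xff")},
)
```

The returned content_type value is what would go into the custom header, and body is what would be posted as the request data.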
By browsing the source code and playing with some toy examples I came to the conclusion that Netty currently (as of 5.0.0 alpha2) supports only multipart/form-data, but not multipart/mixed, at least not as specified in RFC 1341 (sec. 7.2). It looks like mixed is supported inside a part in multipart/form-data though.
Is that really the case or am I missing something?
Since I got the very same question, I'm posting here what could be the beginning of an answer...
However, the current implementation seems to have 2 limitations:
1) it supports only multipart/form-data. I would like to also be able
to use multipart/mixed, which is very similar on the wire (see
http://www.w3.org/Protocols/rfc1341/7_2_Multipart.html ). I think that
the encoder/decoder could be extended to understand multipart/mixed
and still create the same kinds of HttpDatas.
Yes, the current codec is focused on multipart/form-data. It should be possible to extend it, or to propose a new one (probably based on it), to enable support for multipart/mixed.
The current codec was written based on user needs (mine in the beginning, others following). Since no one has yet requested support for multipart/mixed, it was not coded, except for the internal multipart/mixed code.
The reference is RFC1867.
As Netty loves contributions, you are more than welcome to propose yours ;-)
2) it seems that it is only possible to use efficient HttpDatas like
FileUpload if you are in multipart/form-data. I would like to be able
to add a FileUpload to the request, and by this way make the contents
of the file be the body of the request, without making it a multipart
request. I think this could be done by extending the Standard Post
Encoder to understand FileUploads.
This could be a bit more complicated, since it has to be done without multipart, which currently holds the FileUpload class.
Maybe a good direction would be to switch to ChunkedFile or ChunkedNioFile and to combine it with "your" HttpCodec, or use it in your "HttpHandler" when writing the request body, in order to pass the content through the chunked file.
Hoping this helps you in the right direction...
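To illustrate the claim that multipart/mixed and multipart/form-data are very similar on the wire, here is a small sketch in Python. It uses the standard library's email parser as a stand-in, not Netty's codec, and the helper name split_multipart, the boundary XYZ and the sample body are all made up for the example. The same bytes parse into the same parts whichever subtype is declared; only the Content-Type subtype (and, for form-data, the Content-Disposition headers that real requests carry) differs.

```python
import email


def split_multipart(content_type, body):
    """Parse a multipart body.

    Returns (subtype, [(part_content_type, payload_bytes), ...]).
    """
    # Prepend the Content-Type header so the MIME parser knows the boundary.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = email.message_from_bytes(raw)
    if not message.is_multipart():
        raise ValueError("not a multipart body")
    parts = [
        (part.get_content_type(), part.get_payload(decode=True))
        for part in message.get_payload()
    ]
    return message.get_content_subtype(), parts


# Hypothetical two-part body using the boundary "XYZ".
BODY = (
    b"--XYZ\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"hello\r\n"
    b"--XYZ\r\n"
    b"Content-Type: application/json\r\n"
    b"\r\n"
    b'{"ok": true}\r\n'
    b"--XYZ--\r\n"
)

mixed = split_multipart("multipart/mixed; boundary=XYZ", BODY)
form = split_multipart("multipart/form-data; boundary=XYZ", BODY)
```

Here mixed and form differ only in the reported subtype, which is why extending a form-data decoder to accept mixed looks feasible in principle.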
I've read lots of articles about the differences between GET and POST. Lots of them are available here at StackOverflow.
A summary of the important differences is:
POST can send its information via the body while GET should not (although I think it can be done in practice).
Some browsers cache the GET results and rely on the idempotent behavior of GET requests.
Using GET is much easier than using POST for most developers.
To conclude this summary: using GET in POST situations is bad and dangerous.
But is it true that, ignoring the ease of use, POST can be used as a replacement for GET requests, since it seems to cover all of GET's requirements?
To clarify that I'm not crazy: I'm not going to use POST instead of GET. This question is just to check whether I understand the difference between GET and POST correctly.
No, POST is not a replacement of GET requests. There are two important things that a POST request cannot do that a GET request can.
You cannot generate a POST request simply by typing a URL in the address bar of the browser. This always generates a GET request.
You cannot generate a POST request using an ordinary link in HTML. This has far-reaching consequences. You cannot find a page that is only accessible using a POST request with any search engine, and you cannot link to it unless it is done by an HTML form or using JavaScript.
It's good practice to classify your transactions. These methods are very important, especially when you are developing an API, a service-oriented architecture, or even single-page applications.
GET - used to retrieve a dataset. (It also has a limit on URL length; parameters are exposed in the URL and URL-encoded.)
POST - saving/adding. (This is more secure in the sense that the parameters travel in the request body rather than in the URL.)
EX:
GET /items - means you are getting the list of items.
POST /items - means you are saving/adding item(s)
and later you might need to learn PUT and DELETE too.
But for now, always use POST in your form or AJAX request when saving/adding data, and GET when retrieving data.
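As a small illustration of the GET /items versus POST /items distinction, here is a sketch using Python's standard urllib. The endpoint http://example.com/items and the parameter names are hypothetical, and nothing is sent over the network; the code only builds the two request objects to show where the parameters end up.

```python
import urllib.parse
import urllib.request

BASE = "http://example.com/items"  # hypothetical endpoint


def build_get(params):
    """Retrieval: parameters are URL-encoded into the query string."""
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(BASE + "?" + query)


def build_post(params):
    """Saving/adding: parameters are URL-encoded into the request body."""
    body = urllib.parse.urlencode(params).encode("ascii")
    return urllib.request.Request(BASE, data=body)


get_request = build_get({"q": "red shoes"})
post_request = build_post({"name": "red shoes"})

# The GET request carries everything in its URL and has no body;
# the POST request keeps the URL clean and carries the data in the body.
print(get_request.get_method(), get_request.full_url, get_request.data)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Because the GET request is fully described by its URL, it can be typed, linked and bookmarked, which is exactly what the POST request cannot offer.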
I'm thinking about implementing an RPC mechanism over HTTP. The POST method seems to be suitable for the calls. However, since each call comes with the binary payload, there's a decision to be made about how to attach that binary data to the POST request. It seems there are two content types for POSTs of use today: application/x-www-form-urlencoded and multipart/form-data. The former seems to require percent-escaping binary data, while the latter seems to add some overhead with the boundaries/content-disposition fields.
Therefore my question is: how good is it to just use application/octet-stream as a POST content type and just include the binary payload afterwards as is? Will it go through all proxies? Will all HTTP servers be able to handle this? Is it standards-compliant? In other words, should I go for it?
Yes, you can do that; but it would be better to use a more specific type that makes the message self-descriptive.
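A minimal sketch of the idea, assuming a local echo server stands in for the RPC endpoint: the payload is posted as-is with Content-Type: application/octet-stream and read back unchanged. The handler class, the /rpc path and the function name post_octets are made up for the example; it uses only Python's standard library and talks to 127.0.0.1, so it says nothing about how intermediaries on a real network behave.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Toy endpoint: returns the POSTed bytes unchanged."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)           # raw bytes, no decoding
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass                                     # keep the example quiet


def post_octets(payload):
    """POST payload to a throwaway local server and return the response body."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)   # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/rpc" % server.server_address[1]
        request = urllib.request.Request(
            url,
            data=payload,
            headers={"Content-Type": "application/octet-stream"},
        )
        # Bypass any proxy configured in the environment for this local call.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request) as response:
            return response.read()
    finally:
        server.shutdown()
        server.server_close()


echoed = post_octets(bytes(range(256)))
```

No percent-escaping and no boundaries are involved: Content-Length alone frames the body, so every byte value survives the round trip.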
In J2ME, which request type is better: GET or POST? Which one is faster? Which one uses less bandwidth? Which one is supported by most handsets? What are the advantages and disadvantages of both?
Also, see Is there a limit to the length of a GET request? which may be relevant if you plan to abuse GET.
Be aware that network operators (certainly in the UK) have caching schemes in place that may affect your traffic.
If you look at what Opera Mini does, they only use HTTP POST in their HTTP mode.
I think this is a great idea because of the following reasons:
POSTs are never cached (according to the HTTP spec at least) - this saves you from operator caching etc.
It seems some operators do better with POSTs than GETs - this is a feeling I get from what some Nigerian users mention.
Opera Mini most probably has the most installations of any J2ME app in the world, and if they do it, it's probably safer.
No problems with HTTP GET limits on query length.
You can use a more flexible data format if you like that uses less data (no encoding needed on the data as with GET)
I think it's much cleaner, but does require some extra work, e.g. if you are using your HTTP web logs to parse out number of requests per "?type=blah" for example, then you'll have to move that into your site's logic.
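To make the "no encoding needed on the data as with GET" point concrete, here is a rough sketch in Python (used here only for illustration, not J2ME code). It compares the size of a binary payload sent raw in a POST body with the size of the same payload percent-encoded into a GET query string. The parameter name "data" and both function names are assumptions for the example.

```python
import urllib.parse


def get_query_size(payload):
    """Characters needed to carry payload in a query string as data=<percent-encoded>."""
    return len("data=" + urllib.parse.quote_from_bytes(payload, safe=""))


def post_body_size(payload):
    """Bytes needed to carry payload as a raw POST body."""
    return len(payload)


payload = bytes(range(256))  # every possible byte value once
print("POST body:", post_body_size(payload), "bytes")
print("GET query:", get_query_size(payload), "characters")
```

Most byte values are not URL-safe and expand to three characters each, so for arbitrary binary data the query string comes out well over twice the size of the raw body, before any URL length limit is considered.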
If you follow the standards, GET should be used only for data retrieval and POST for adding new items. Which one is faster or slower depends on the server-side handler implementation.
I was wondering whether search engines respect the HTTP header field Content-Location.
This could be useful, for example, when you want to remove the session ID argument out of the URL:
GET /foo/bar?sid=0123456789 HTTP/1.1
Host: example.com
…
HTTP/1.1 200 OK
Content-Location: http://example.com/foo/bar
…
Clarification:
I don’t want to redirect the request, as removing the session ID would lead to a completely different request and thus probably also a different response. I just want to state that the enclosed response is also available under its “main URL”.
Maybe my example was not a good representation of the intent of my question. So please take a look at What is the purpose of the HTTP header field “Content-Location”?.
I think Google just announced the answer to my question: the canonical link relation for declaring the canonical URL.
Maile Ohye from Google wrote:
MickeyC said...
You should have used the Content-Location header instead, as per:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
"14.14 Content-Location"
@MikeyC: Yes, from a theoretical standpoint that makes sense and we certainly considered it. A few points, however, led us to choose :
Our data showed that the "Content-Location" header is configured improperly on many web sites. Sometimes webmasters provide long, ugly URLs that aren’t even duplicates -- it's probably unintentional. They're likely unaware that their webserver is even sending the Content-Location header.
It would've been extremely time consuming to contact site owners to clean up the Content-Location issues throughout the web. We realized that if we started with a clean slate, we could provide the functionality more quickly. With Microsoft and Yahoo! on-board to support this format, webmasters need to only learn one syntax.
Often webmasters have difficulty configuring their web server headers, but can more easily change their HTML. rel="canonical" seemed like a friendly attribute.
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html?showComment=1234714860000#c8376597054104610625
Most decent crawlers do follow Content-Location. So, yes, search engines respect the Content-Location header, although that is no guarantee that the URL having the sid parameter will not be on the results page.
In 2009 Google started looking at URIs qualified as rel=canonical in the response body.
Looks like since 2011, links formatted as per RFC5988 are also parsed from the header field Link:. It is also clearly mentioned in the Webmaster Tools FAQ as a valid option.
I guess this is the most up-to-date way of giving search engines some extra hypermedia breadcrumbs to follow, which lets you keep them out of the response body when you don't actually need to serve them as content.
In addition to using 'Location' rather than 'Content-Location', use the proper HTTP status code in your response depending on your reason for the redirect. Search engines tend to favor a permanent redirect (301) status over a temporary (302) one.
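For reference, a Link header field in the RFC 5988 format takes the shape <URI>; rel="canonical". The sketch below, in Python, builds such a header for the example URL from the question; the function name canonical_link_header is an assumption for the example, and how a given server framework attaches the header to a response is left out.

```python
def canonical_link_header(url):
    """Return a (name, value) pair for a Link header declaring url as canonical."""
    # RFC 5988 syntax: the target URI in angle brackets, then the relation type.
    return ("Link", '<%s>; rel="canonical"' % url)


name, value = canonical_link_header("http://example.com/foo/bar")
print("%s: %s" % (name, value))
```

A response to GET /foo/bar?sid=0123456789 could then carry this header to point crawlers at the session-free URL without touching the response body.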
Try the "Location:" header instead.