Google search by image "image_content" format? - http

I'm trying to create an application that can upload an image to https://www.google.de/searchbyimage/upload. I got the upload itself working (posting multipart/form-data via C#).
The only thing I still need to know is:
How does the browser usually send the image? In a sniffed request I found a multipart/form-data field called "image_content", which stores the image data.
But I don't know in which format the image is stored.
------WebKitFormBoundaryumAjUbPr6ymfh8hM
Content-Disposition: form-data; name="image_content"

------WebKitFormBoundaryumAjUbPr6ymfh8hM
Any suggestions?

The default encoding is base64. You should form a request that matches your sniffed request, except for the following:
The WebKitFormBoundaryumAj... string should have a random string appended to ensure its uniqueness.
The _9j_ line should be replaced with the base64-encoded contents of the image you are uploading.
The server will automatically detect the type of file (JPG, PNG, etc.), so you shouldn't need to worry about that.
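
To make the two substitutions above concrete, here is a minimal sketch in Python (not the asker's C#) that builds such a request body. The function name build_image_content_body is hypothetical. One assumption worth checking against your own sniffed request: standard base64 of a JPEG starts with "/9j/", so the "_9j_" mentioned above suggests the URL-safe base64 alphabet ("-" and "_" instead of "+" and "/"), which is what this sketch uses.

```python
import base64
import secrets


def build_image_content_body(image_bytes: bytes):
    """Return (content_type_header, body) for a one-field multipart upload."""
    # A random suffix makes it very unlikely that the boundary collides
    # with anything inside the payload.
    boundary = "----WebKitFormBoundary" + secrets.token_hex(8)

    # Assumption: the URL-safe alphabet, inferred from the "_9j_" prefix.
    encoded = base64.urlsafe_b64encode(image_bytes).decode("ascii")

    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="image_content"\r\n'
        "\r\n"
        f"{encoded}\r\n"
        f"--{boundary}--\r\n"
    ).encode("ascii")
    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, body


# The first bytes of a JPEG file (FF D8 FF E0) stand in for a real image here.
content_type, body = build_image_content_body(b"\xff\xd8\xff\xe0")
```

If the server turns out to expect the standard alphabet, swapping urlsafe_b64encode for b64encode is the only change needed.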

This is a base64-encoded image. You can use the same encoding in many places, such as CSS and JavaScript: a data URI can go almost anywhere a regular URI is expected. Many other things can be encoded this way too (typefaces used in @font-face, for example).
Most modern programming languages have built-in base64 encoding; search for the one in C# if that's what you're using.
You can read more on the usage of data URIs here: https://developer.mozilla.org/en-US/docs/data_URIs and perhaps here: http://css-tricks.com/data-uris/

Related

What does this response mean?

This is a server response containing a video file. In Chrome's preview it shows up as strange characters (I'm not sure what kind of characters they are; if someone knows what those symbols are called, please let me know). The same video response in Firefox is shown as base64. So is the video transferred to the browser as a base64 string even though the content type is set to video/mp4? I noticed this when downloading a PDF file as well. Please explain. Thanks.
You're looking at binary data, not text, so it doesn't show up as ASCII characters that make any sense.
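
A small Python sketch of the point above: the same bytes can be shown as "strange characters" or as base64, depending only on how a tool chooses to render them. The sample bytes are a made-up stand-in for the start of an MP4 file, and which rendering each browser's developer tools use is an assumption based on the question.

```python
import base64

# Hypothetical first bytes of an MP4 file: a 4-byte box size, then "ftypmp42".
raw = bytes([0x00, 0x00, 0x00, 0x18]) + b"ftypmp42"

# Rendering 1: force every byte into a character (what a text preview does).
as_text = raw.decode("latin-1")

# Rendering 2: show the same bytes as base64 (a display choice, not the wire format).
as_base64 = base64.b64encode(raw).decode("ascii")

print(repr(as_text))
print(as_base64)
```

Either way, the bytes on the wire are the raw binary; decoding the base64 view gives back exactly the same data.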

How to Encode a web address

data:text/html;base64,77u/data:text/html;base64,77u/PCFET0NUWVBFIGh0bWw+CjxodG1sIGxhbmc9ImVuLVVTIiBjbGFzcz0ia
I saw a URL like the one above.
What is it called?
How do I encrypt a URL to look like that?
It's called base64 encoding; what's encoded in your example is not a web address, but actual HTML content itself.
Note that it's trivially easy for anyone to decode. Base64 is not suitable for "encrypting" resources. You cannot hide content or URLs from your visitors this way.
Using this makes sense only in a very limited set of situations, like when you want to reduce the number of HTTP requests and store multiple resources inside one HTML page.
If you still want to use it, there are online base64 encoders and decoders. You'd take what you have above and replace everything after "base64," with your own encoded content.
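
As a rough illustration of both points (how such a URI is built, and how easily it is decoded), here is a short Python sketch. The helper names to_data_uri and from_data_uri are made up for this example.

```python
import base64


def to_data_uri(content: str, mime: str = "text/html") -> str:
    """Wrap text content in a base64 data URI."""
    payload = base64.b64encode(content.encode("utf-8")).decode("ascii")
    return f"data:{mime};base64,{payload}"


def from_data_uri(uri: str) -> str:
    """Recover the original text; anyone can do this, so it hides nothing."""
    _header, payload = uri.split(",", 1)
    return base64.b64decode(payload).decode("utf-8")


uri = to_data_uri("<!DOCTYPE html>")
print(uri)                 # data:text/html;base64,PCFET0NUWVBFIGh0bWw+
print(from_data_uri(uri))  # <!DOCTYPE html>
```

The encoded payload matches the "PCFET0NUWVBFIGh0bWw+" fragment in the URL from the question, which is simply the text "<!DOCTYPE html>".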

How to detect wrong encoding declaration?

I am building an ASP.NET web service that loads other web pages and hands them to clients.
I have been doing quite well with character encoding: I read the meta tag from the HTML and then use that charset to read the file.
Nevertheless, some users don't understand charsets. They declare a specific encoding, e.g. "gb2312", when in fact the page is plain UTF-8. When I use gb2312 to decode the text, everything turns into a mess.
How can I detect whether the text was decoded properly? I loaded that page in IE, which correctly uses UTF-8 to decode it. How does it achieve that?
If the file starts with a BOM, you can tell from it which encoding is used.
BOM and encoding
If you want to detect the character set, you could use the C# port of Mozilla's character set detector.
CharDetSharp
If you want to be extra sure that you are using the correct one, you could look for character sequences that are not supposed to be there. A real page is not very likely to include "óké". So you could look for such characters and, if you find them, try a different encoding/character set to process the file.
In practice, it is really hard to make your application completely "fool-proof".
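
One simple heuristic along these lines, sketched in Python rather than C#: try strict UTF-8 first (text in a legacy encoding rarely happens to be valid UTF-8), and fall back to the declared charset only if that fails. This is an assumption-laden sketch, not a description of what IE or the Mozilla detector does, and decode_page is a hypothetical helper name.

```python
from typing import Tuple


def decode_page(raw: bytes, declared: str) -> Tuple[str, str]:
    """Return (text, encoding_used) for a downloaded page."""
    try:
        # "utf-8-sig" also strips a UTF-8 BOM if one is present.
        return raw.decode("utf-8-sig"), "utf-8"
    except UnicodeDecodeError:
        pass
    try:
        # Not valid UTF-8, so trust the page's own declaration.
        return raw.decode(declared), declared
    except (UnicodeDecodeError, LookupError):
        # Unknown or wrong charset: decode lossily instead of failing.
        return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"


# A page that claims gb2312 but is really UTF-8 is still decoded correctly.
text, used = decode_page("中文".encode("utf-8"), "gb2312")
print(text, used)
```

A genuine gb2312 page fails the strict UTF-8 attempt and goes through the second branch, so correctly declared pages keep working.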

Letters becoming "ë"

I have a website with a few textboxes. If a user fills in something that contains the letter "ë", it becomes something like:
&#235;
How can I store it as ë in the database?
My website is built on .NET and I am using C#.
Both ASP.Net (your server-side application) and SQL Server are Unicode-aware. They can handle different languages, and different character sets:
http://msdn.microsoft.com/en-us/library/39d1w2xf.aspx
Internally, the code behind ASP.NET Web pages handles all string data
as Unicode. You can set how the page encodes its response, which sets
the CharSet attribute on the Content-Type part of the HTTP header.
This enables browsers to determine the encoding without a meta tag or
having to deduce the correct encoding from the content. You can also
set how the page interprets information that is sent in a request.
Finally, you can set how ASP.NET interprets the content of the page
itself — in other words, the encoding of the physical .aspx file on
disk. If you set the file encoding, all ASP pages must use that
encoding. Notepad.exe can save files that are encoded in the current
system ANSI codepage, in UTF-8, or in UTF-16 (also called Unicode).
The ASP.NET runtime can distinguish between these three encodings. The
encoding of the physical ASP.NET file must match the encoding that is
specified in the file in the @ Page encoding attributes.
This article is also helpful:
http://support.microsoft.com/kb/893663
This "Joel-on-Software" article is an absolute must-read
The Absolute Minimum Every Software Developer Absolutely Positively Must Know About Unicode (No Excuses!)
Please read all three articles, and let us know if that helps.
You need the HtmlEncode and HtmlDecode functions.
SQL Server is fine with ë and any other local or 'unusual' characters; the encoded form comes from HTML encoding. Some characters have special meanings in HTML. The best examples are < and >, which are essential to HTML syntax, but there are more. ë is not part of HTML syntax, but HtmlEncode also converts characters like it into numeric entities. To be displayed reliably, such characters are encoded before being transmitted as HTML, and that includes sending them to a browser.
So, although you see ë in a browser, your app is handling an encoded version, &#235;, and it stays in this form everywhere, including the database. If you want ë to be saved in SQL Server as ë, you need to decode it first. Remember to encode it back to &#235; before displaying it on your page.
Use these functions to decode/encode all your texts before saving/displaying respectively. They convert only special characters and leave everything else alone:
string encoded = HttpUtility.HtmlEncode("Noël"); // "No&#235;l"
string decoded = HttpUtility.HtmlDecode("No&#235;l"); // "Noël"
There is another important reason to encode text: JavaScript injection. This is an attack on your site in which someone places chunks of JavaScript into edit/memo boxes, hoping they will be executed at some point in someone else's browser. If you encode all text you get from the UI, that JavaScript will never run, because it will be treated as text rather than as executable code.
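
The same encode/decode round trip can be sketched with Python's standard library, for readers who want to try it outside .NET. The helper names are hypothetical, and the numeric-entity step is added here to imitate entity-style output; it is not something html.escape does on its own.

```python
import html


def encode_for_html(text: str) -> str:
    """Escape HTML syntax characters, then turn non-ASCII into numeric entities."""
    escaped = html.escape(text)  # handles <, >, & and quotes
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


def decode_from_html(text: str) -> str:
    """Turn entities back into plain characters, e.g. before storing in the database."""
    return html.unescape(text)


print(encode_for_html("Noël"))           # No&#235;l
print(decode_from_html("No&#235;l"))     # Noël
print(encode_for_html("<script>x</script>"))  # &lt;script&gt;x&lt;/script&gt;
```

The last line shows why encoding on output blunts script injection: the markup arrives in the browser as inert text.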

How to send MIME over HTTP?

I need to send certain data to the server in a .zip archive, over an HTTP POST request, MIME encoded. I take it that this only means I need to specify the MIME type in a request header. But I'm confused as to what I should put in the request body. So far I can see two ways to do it:
Usually, as I understand it (sorry, I'm not a web coder, so I'm a bit weak on HTTP), a POST request body consists of parameter_name=some+data pairs separated by '&'. Should I do it the same way and write the contents of my file in base64 in one of the parameters? That would also let me provide supplemental parameters.
Or should I just fill the POST body with the contents of my file (in base64, right?)? If so, is there any way to provide additional info about the file?
Is only one of these ways acceptable, or are both? If both, what would be the best practice?
Also, a code sample in C++ for Qt would be very much appreciated, but is not at all necessary :)
The whole key=value body in POST requests is just for when you are sending form-data to your server. If you want to POST only the contents of a .zip file you can just send that as the body of your POST, no need to set it up like a form post as you describe. You can set the following headers in the request:
Content-Type: application/zip
Content-Disposition: attachment; filename=myzip.zip
You don't necessarily have to base64-encode the body, although you should if that's what your server is expecting.
The Content-Disposition header is what you use to describe your file upload further. You can find some details about it here:
http://en.wikipedia.org/wiki/MIME#Content-Disposition
and here
http://www.ietf.org/rfc/rfc2183.txt
At the server end, you just need to write some code that reads the request body in its entirety (which is straightforward, although YMMV depending on language and framework) and handles it however you want.
For a real world example, you might find it useful to look at, say, AtomPub for how this is done:
http://bitworking.org/projects/atom/rfc5023.html
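
For what it's worth, the raw-body approach described in this answer might look like the following in Python (not the Qt/C++ the asker wanted). The function name build_zip_upload and the example.invalid URL are placeholders; the request is only built here, not sent.

```python
import io
import urllib.request
import zipfile


def build_zip_upload(url: str, files: dict, filename: str = "myzip.zip"):
    """Build a POST request whose body is the raw bytes of a zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)

    return urllib.request.Request(
        url,
        data=buffer.getvalue(),  # binary body, no base64 and no key=value pairs
        headers={
            "Content-Type": "application/zip",
            "Content-Disposition": f"attachment; filename={filename}",
        },
        method="POST",
    )


request = build_zip_upload("http://example.invalid/upload", {"hello.txt": b"hi"})
# urllib.request.urlopen(request) would actually send it to a real server.
```

If supplemental parameters are needed alongside the file, multipart/form-data is the usual alternative to packing everything into headers.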

Resources