Base64 string Compression - asp.net

I have got an ActiveX Control that gets an image from a fingerprint device as base64 string. The Active works great and I could transfer the returned base64 string to the server to be converted to a binary data and then to be saved to a database. I use ASP.NET as server side technology and JavaScript as client side technology. The problem is that the base64 string is tool large and it would take from 30 to 40 seconds for the string to be transferred to the server. My question is: Is there any way to compress this base64 string on client (Browser) and deflate it back on server.

If the base64 image is really a jpeg or some other compressed image format, I wouldn't expect you to be able to get much extra compression out of it in the first place. You'd also need to work out a way of posting the binary compressed data afterwards in a portable way, which may not be terribly easy. (You may be able to pretend it's a file or something like that.)
Just how large is this image? 30 seconds is a very long time...

on my linux system, using the bzip2 utility (which uses burrows-wheeler transform and then compresses), I reduce a jpg encoded in Base64 from 259.6KB to 194.5KB.
Using gzip (which uses an LZ algorithm of some variety), I reduce it to 194.4KB.
So, yes you can compress it. the question is why do you want to? It sounds as though your problems are really lying elsewhere.

Base64 is a format that is usually only used to get around technical limitaions of sending binary data. For example, a base64 string could be used in a GET request, like "website.com/page?q=BASE64ENCODED".
You have two options:
Figure out how to send/recieve binary data in a POST request, then send the data in a different form and convert it appropriately on the server.
-OR-
Since this is a fingerprint device, I assume you're using it as a password, so you actually don't have to send the raw fingerprint data every time. Just make an SHA-1 hash of it, and compare it to a stored hash on the server. This is just as secure and will take a fraction of a second to upload.
Hope I helped!

Related

Base64 string and Image

When I use images database for my asp.net web application , i found two methods !
First method is using "Image" data type , second is convert image to base64 string and store in "byte" data type .
I want to know which method is better in process and why ?
Image type is deprecated so is not an option. There is no byte type in SQL Server, there is a type called tinyint which is equivalent to a byte type but I'm pretty sure that's not what you're asking.
So here is your actual question, corrected of wrong terminology:
When I use images database for my asp.net web application , i found two methods ! First method is using VARBINARY(max) data type , second is convert image to base64 string and store in VARCHAR(max) data type.
Base64 encoding adds significant size to binary data, it converts 3 octets into 4 octets, so it adds 33% overhead. That is real estate consumed in your storage (be it database or file).
However a web application will serve media files overwhelmingly as base64 encoded responses. Having the file already encoded in storage saves CPU on the web server as it can stream the storage content directly, w/o going through base64 encoding. Not that base64 encoding is expensive (is very cheap and very much cache friendly) when done properly, using streaming semantics.
So choose your poison: save binary content for smaller disk space used, or take the 33% size penalty for faster response write and less CPU on your web tier.
For an example of properly storing and serving media files from the database see Download and Upload images from aSP.Net
As for the neverending discussion about the storage be database or file system, I point you toward To BLOB or Not To BLOB: Large Object Storage in a Database or a Filesystem? and its conclusion:
The study indicates that if objects are larger than
one megabyte on average, NTFS has a clear advantage
over SQL Server. If the objects are under 256
kilobytes, the database has a clear advantage. Inside
this range, it depends on how write intensive the
workload is.
This study doe snot consider the administrative problems of an out-of database storage (consistent backup-restore, failover) but those kind of problems are only discussed in v2 of products.

How to send chunks of video for streaming using HTTP protocol?

I am creating an app which uses sockets to send data to other devices. I am using Http protocol to send and receive data. Now the problem is, i have to stream a video and i don't know how to send a video(or stream a video).
If the user directly jump to the middle of video then how should i send data.
Thanks...
HTTP wasn't really designed with streaming in mind. Honestly the best protocol is something UDP-based (SCTP is even better in some ways, but support is sketchy). However, I appreciate you may be constrained to HTTP so I'll answer your question as written.
I should also point out that streaming video is actually quite a deep topic and all I can do here is try to touch on some of the approaches that you might want to investigate. If you have control of the end-to-end solution then you have some choices to make - if you only control one end, then your choices are more or less dictated by what's available at the other end.
If you only want to play from the start of the file then it's fairly straightforward - make a standard HTTP request and just start playing as soon as you've buffered up enough video that you can finish downloading the file before you catch up with your download rate. You don't need any special server support for this and any video format will work.
Seeking is trickier. You could take the approach that sites like YouTube used to take which is to simply not allow the user to seek until the file has downloaded enough to reach that point in the video (or just leave them looking at a spinner until that point is reached). This is not the user experience that most people will expect these days, however.
To do better you need to be in control of the streaming client. I would suggest treating the file in chunks and making byte range requests for one chunk at a time. When the user seeks into the middle of the file, you can work out the byte offset into the file and start making byte range requests from that point.
If the video format contains some sort of index at the start then you can use this to work out file offsets - so, your video client would have to request at least enough to get the index before doing any seeking.
If the format doesn't have any form of index but it's encoded at a constant bit rate (CBR) then you can do an initial HEAD request and look at the Content-Length header to find the size of the file. Then, if the use seeks 40% of the way through the video, for example, you just seek to 40% of the way through the encoded frames. This relies on knowing enough about the file format that you can calculate an appropriate seek point so that you can identify framing information and the like (or at least an encoding format which allows you to resynchonise with both the audio and video streams even if you jump in at an arbitrary point in the file). This approach might also work with variable bit rate (VBR) as long as the format is such that you can recover from an arbitrary seek.
It's not ideal but as I said, HTTP wasn't really designed for streaming.
If you have control of the file format and the server, you could make life easier by making each chunk a separate resource. This is how Apple HTTP live streaming and Microsoft smooth streaming both work. They need tool support to pre-process the video, and I don't know if you have control of the server end. Might be worth looking into, however. These also do more clever tricks such as allowing a client to switch between multiple versions of the stream encoded at different bit rates to cope with differences in bandwidth.

How to distribute init. vector with data encrypted via a symmetric cipher?

I want to provide for the user a service of encrypting some data via symmetric cipher to a file. The user simply provide a key and he/she may provide an initialize vector for the cipher.
Is there a standard how the file should look like? It makes sense to fill the file with the encrypted data and show the corresponding initialize vector in a dialog window. It may seem reasonable to someone else that the initialize vector should be stored in the file with the encrypted data.
The important thing for me is that the result is useful for a user and he/she won't need to bother with adjustment of the result.
Thank for a comment!
It is common practice to provide the IV as the first block of the cyphertext file. That way the receiver just treats the first 8 bytes (DES) or 16 bytes (AES) as the IV and the rest of the file as the actual cyphertext.
Use the same format for the IV as you are using for the cyphertext: Base64, hex, byte data or whatever.
In principle, you can use any format you want, as long as the decrypting part of the program knows how to read it. For efficiency, having the initialization vector before the data seems a good idea.
If you want to encrypt files, a good idea would be to not create your own format (which leads to you having to do decisions like the one here), but use an existing file format (which then also is a cryptographic protocol).
I recommend the OpenPGP message format, as defined in RFC 4880 (or some subset thereof, if you don't need all features). This also has the advantage that your clients then can decrypt your files using any OpenPGP implementation (like pgp or gpg), if your program somehow ceases to work (of course, only if they have the key/password).
you should be fine if you store the IV together with the encrypted data in the file ...

application/x-www-form-urlencoded or multipart/form-data?

In HTTP there are two ways to POST data: application/x-www-form-urlencoded and multipart/form-data. I understand that most browsers are only able to upload files if multipart/form-data is used. Is there any additional guidance when to use one of the encoding types in an API context (no browser involved)? This might e.g. be based on:
data size
existence of non-ASCII characters
existence on (unencoded) binary data
the need to transfer additional data (like filename)
I basically found no formal guidance on the web regarding the use of the different content-types so far.
TL;DR
Summary; if you have binary (non-alphanumeric) data (or a significantly sized payload) to transmit, use multipart/form-data. Otherwise, use application/x-www-form-urlencoded.
The MIME types you mention are the two Content-Type headers for HTTP POST requests that user-agents (browsers) must support. The purpose of both of those types of requests is to send a list of name/value pairs to the server. Depending on the type and amount of data being transmitted, one of the methods will be more efficient than the other. To understand why, you have to look at what each is doing under the covers.
For application/x-www-form-urlencoded, the body of the HTTP message sent to the server is essentially one giant query string -- name/value pairs are separated by the ampersand (&), and names are separated from values by the equals symbol (=). An example of this would be:
MyVariableOne=ValueOne&MyVariableTwo=ValueTwo
According to the specification:
[Reserved and] non-alphanumeric characters are replaced by `%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character
That means that for each non-alphanumeric byte that exists in one of our values, it's going to take three bytes to represent it. For large binary files, tripling the payload is going to be highly inefficient.
That's where multipart/form-data comes in. With this method of transmitting name/value pairs, each pair is represented as a "part" in a MIME message (as described by other answers). Parts are separated by a particular string boundary (chosen specifically so that this boundary string does not occur in any of the "value" payloads). Each part has its own set of MIME headers like Content-Type, and particularly Content-Disposition, which can give each part its "name." The value piece of each name/value pair is the payload of each part of the MIME message. The MIME spec gives us more options when representing the value payload -- we can choose a more efficient encoding of binary data to save bandwidth (e.g. base 64 or even raw binary).
Why not use multipart/form-data all the time? For short alphanumeric values (like most web forms), the overhead of adding all of the MIME headers is going to significantly outweigh any savings from more efficient binary encoding.
READ AT LEAST THE FIRST PARA HERE!
I know this is 3 years too late, but Matt's (accepted) answer is incomplete and will eventually get you into trouble. The key here is that, if you choose to use multipart/form-data, the boundary must not appear in the file data that the server eventually receives.
This is not a problem for application/x-www-form-urlencoded, because there is no boundary. x-www-form-urlencoded can also always handle binary data, by the simple expedient of turning one arbitrary byte into three 7BIT bytes. Inefficient, but it works (and note that the comment about not being able to send filenames as well as binary data is incorrect; you just send it as another key/value pair).
The problem with multipart/form-data is that the boundary separator must not be present in the file data (see RFC 2388; section 5.2 also includes a rather lame excuse for not having a proper aggregate MIME type that avoids this problem).
So, at first sight, multipart/form-data is of no value whatsoever in any file upload, binary or otherwise. If you don't choose your boundary correctly, then you will eventually have a problem, whether you're sending plain text or raw binary - the server will find a boundary in the wrong place, and your file will be truncated, or the POST will fail.
The key is to choose an encoding and a boundary such that your selected boundary characters cannot appear in the encoded output. One simple solution is to use base64 (do not use raw binary). In base64 3 arbitrary bytes are encoded into four 7-bit characters, where the output character set is [A-Za-z0-9+/=] (i.e. alphanumerics, '+', '/' or '='). = is a special case, and may only appear at the end of the encoded output, as a single = or a double ==. Now, choose your boundary as a 7-bit ASCII string which cannot appear in base64 output. Many choices you see on the net fail this test - the MDN forms docs, for example, use "blob" as a boundary when sending binary data - not good. However, something like "!blob!" will never appear in base64 output.
I don't think HTTP is limited to POST in multipart or x-www-form-urlencoded. The Content-Type Header is orthogonal to the HTTP POST method (you can fill MIME type which suits you). This is also the case for typical HTML representation based webapps (e.g. json payload became very popular for transmitting payload for ajax requests).
Regarding Restful API over HTTP the most popular content-types I came in touch with are application/xml and application/json.
application/xml:
data-size: XML very verbose, but usually not an issue when using compression and thinking that the write access case (e.g. through POST or PUT) is much more rare as read-access (in many cases it is <3% of all traffic). Rarely there where cases where I had to optimize the write performance
existence of non-ascii chars: you can use utf-8 as encoding in XML
existence of binary data: would need to use base64 encoding
filename data: you can encapsulate this inside field in XML
application/json
data-size: more compact less that XML, still text, but you can compress
non-ascii chars: json is utf-8
binary data: base64 (also see json-binary-question)
filename data: encapsulate as own field-section inside json
binary data as own resource
I would try to represent binary data as own asset/resource. It adds another call but decouples stuff better. Example images:
POST /images
Content-type: multipart/mixed; boundary="xxxx"
... multipart data
201 Created
Location: http://imageserver.org/../foo.jpg
In later resources you could simply inline the binary resource as link:
<main-resource&gt
...
<link href="http://imageserver.org/../foo.jpg"/>
</main-resource>
I agree with much that Manuel has said. In fact, his comments refer to this url...
http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4
... which states:
The content type
"application/x-www-form-urlencoded" is
inefficient for sending large
quantities of binary data or text
containing non-ASCII characters. The
content type "multipart/form-data"
should be used for submitting forms
that contain files, non-ASCII data,
and binary data.
However, for me it would come down to tool/framework support.
What tools and frameworks do you
expect your API users to be building
their apps with?
Do they have
frameworks or components they can use
that favour one method over the
other?
If you get a clear idea of your users, and how they'll make use of your API, then that will help you decide. If you make the upload of files hard for your API users then they'll move away, of you'll spend a lot of time on supporting them.
Secondary to this would be the tool support YOU have for writing your API and how easy it is for your to accommodate one upload mechanism over the other.
Just a little hint from my side for uploading HTML5 canvas image data:
I am working on a project for a print-shop and had some problems due to uploading images to the server that came from an HTML5 canvas element. I was struggling for at least an hour and I did not get it to save the image correctly on my server.
Once I set the
contentType option of my jQuery ajax call to application/x-www-form-urlencoded everything went the right way and the base64-encoded data was interpreted correctly and successfully saved as an image.
Maybe that helps someone!
If you need to use Content-Type=x-www-urlencoded-form then DO NOT use FormDataCollection as parameter: In asp.net Core 2+ FormDataCollection has no default constructors which is required by Formatters. Use IFormCollection instead:
public IActionResult Search([FromForm]IFormCollection type)
{
return Ok();
}
In my case the issue was that the response contentType was application/x-www-form-urlencoded but actually it contained a JSON as the body of the request. Django when we access request.data in Django it cannot properly converted it so access request.body.
Refer this answer for better understanding:
Exception: You cannot access body after reading from request's data stream

Send fodder bytes until actual response data is ready?

I need to serve MP3 content that is generated dynamically during the request. My clients (podcatchers I can't configure) are timing out before I'm able to generate the first byte of the response data.
Is there a way to send fodder/throwAway data while I'm generating the real data, to prevent/avoid the timeout, but in a way that allows me to instruct the client to ignore/discard the fodder data once I'm ready to start sending the "real" data?
If the first few bytes of the encoded content are always the same then you could very slowly send back those bytes. I'm not familiar with the MP3 file format, but if the first few bytes are always some magic (and constant) header, this technique could work.
Once the file encoding gets started you could then skip the first few bytes (since you already sent them) and continue from there.
You could have a default, static "hi, welcome to Lance's stream!" stream go out while you're generating the real deal.

Resources