Encoding, compression, encryption

I wanted to know whether a file encryption algorithm can be designed in such a way that it performs file compression as well (any live example?).
Also, can I integrate it into a mobile SMS service, i.e. for text messages?
I also wanted to know about binary files: if a plain text file is encoded in binary, does its size reduce? And is it better to encode a plain text file to binary rather than to any other format (in case anyone wants to encode it for some purpose)?

In fact, all decent encryption programs (take PGP, for example) compress data before encryption. They use something mainstream like ZIP for compression. The reason is that once data is encrypted it looks like random noise and becomes incompressible, so it can only be compressed before encryption. You likely can't do that for SMS, though: you have to obey the SMS specifications, so you'd better check those.
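A minimal sketch of that compress-then-encrypt order, using zlib plus AES-GCM from the third-party `cryptography` package as a stand-in for whatever PGP actually does internally (the exact byte counts will vary):

```python
import os
import zlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # assumed dependency

message = b"the quick brown fox jumps over the lazy dog " * 200  # 9000 redundant bytes
key = AESGCM.generate_key(bit_length=128)

# Compress first: the plaintext still has patterns the compressor can exploit.
# (A real implementation must keep the nonce for decryption; this is a size demo.)
compressed = zlib.compress(message)
ciphertext = AESGCM(key).encrypt(os.urandom(12), compressed, None)
print(len(message), len(ciphertext))         # 9000 vs. roughly a hundred bytes

# The reverse order gains nothing: the ciphertext looks like random noise,
# so zlib cannot shrink it (the "compressed" result is usually a bit larger).
ciphertext_first = AESGCM(key).encrypt(os.urandom(12), message, None)
print(len(zlib.compress(ciphertext_first)))  # slightly more than 9016
```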

Compression removes redundant information. Redundant information makes breaking an encryption easier. So yes, encryption and compression are compatible. I don't know of an algorithm designed to do both things at once, though.
Yes, a binary file will usually be smaller than a plain text file. For instance, the number 34 written out in text takes 2 bytes (at least), whereas in those same 2 bytes you could store any number up to 65,535.
What makes an encoding "better" than another is the purpose to which it is put. If you are optimizing for size, binary is probably better. If you are optimizing for readability or graceful failure, text might be better.
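To make the size point concrete, here is a small standard-library sketch (an illustration, not a claim about any particular file format):

```python
import struct

n = 34
as_text = str(n).encode("ascii")   # b"34": 2 bytes, and 2 text digits top out at 99
as_binary = struct.pack(">H", n)   # 2 bytes as an unsigned 16-bit integer,
                                   # which can hold any value up to 65535

print(len(as_text), len(as_binary))       # 2 2
print(struct.unpack(">H", as_binary)[0])  # 34
```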

First question: no reason why it shouldn't be possible. Since you can still encrypt compressed data and vice versa, you can test the benefits by trying. I don't know whether it can be used for text messaging; you'd have to explain what you're trying to do if you want a reasonable and fitting answer.
Second question: plain text is still binary; it's just readable and encoded in a certain character set. The main difference is that plain text usually conforms to a particular encoding, say UTF-8 or ISO-8859-1.
You can still compress plain text, which makes it "binary" in the traditional sense that you only see weird bytes. :-)
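A small sketch of that "plain text is still binary" point: the same characters become different bytes under different encodings, and compressing them yields bytes that no longer look like text at all (standard library only):

```python
import zlib

text = "naïve café"

utf8_bytes = text.encode("utf-8")         # accented letters take two bytes each
latin1_bytes = text.encode("iso-8859-1")  # same characters, one byte each

print(len(utf8_bytes), len(latin1_bytes))  # 12 10

# Compressed text is still "just bytes", but no longer readable as characters.
print(zlib.compress(utf8_bytes))
```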

Related

Why do we still use base64 but only in limited contexts, like SMTP?

I'm trying to explain base64 to someone, but I'm not grasping the fundamental need for it. And/or it seems like all network protocols would need it, meaning base64 wouldn't be a special thing.
My understanding is that base64 is used to convert binary data to text data so that the protocol sending the text data can use certain bytes as control characters.
This suggests we'll still be using base64 as long as we're sending text data.
But why do I only see base64 being used for SMTP and a few other contexts? Don't most commonly-used protocols need to:
- support sending of arbitrary binary data
- reserve some bytes for themselves as control chars
Maybe I should specify that I'm thinking of TCP/IP, FTP, SSH, etc. Yet base64 is not used in those protocols, to my knowledge. Why not? How are those protocols solving the same problems? Which then raises the reverse question: why doesn't SMTP use that solution instead of base64?
Note: obviously I have tried looking at Wikipedia, other Stack Overflow questions about base64, etc., and not found an answer to this specific question.
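For what it's worth, a minimal standard-library sketch of what base64 does for a text-only protocol: bytes that would collide with control characters come out as plain printable ASCII, and the round trip is lossless:

```python
import base64

# Raw bytes including values a text protocol might treat as control
# characters (NUL, CR, LF) plus a byte outside 7-bit ASCII.
raw = bytes([0x00, 0x0D, 0x0A, 0xFF, 0x41, 0x42])

encoded = base64.b64encode(raw)      # b'AA0K/0FC' -- safe, printable ASCII
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == raw)                # True
```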

Why is HTTP/2 binary?

Why is HTTP/2 binary, and HTTP/1.1 text? In my opinion, they are both binary, because the computer does not have artificial intelligence and does not understand text as such.
A binary protocol is a protocol which is intended to be read by a machine rather than a human being, as opposed to a plain text protocol such as IRC, SMTP, or HTTP. Binary protocols have the advantage of terseness, which translates into speed of transmission and interpretation.
Source: https://en.wikipedia.org/wiki/Binary_protocol
Ultimately everything a (non-quantum) computer holds in its memory is expressed in binary. This very sentence is stored as bits in some SQL server somewhere in a data center.
However, some bytes can be expressed as characters, while others cannot. A textual format (such as a source code file or other plain-text file format) is entirely human-readable, because all of its bytes compose characters.
This is different for a binary format. Binary is not meant to be human-readable, so if you opened an HTTP/2 stream in a text editor such as Notepad, you'd see a lot of question marks or black squares.
That's because HTTP/2 uses binary framing and header compression, both of which try to make the most of every available bit, whereas readable text doesn't.

Can ciphertext output of AES or RSA encryption be limited to the ASCII-128 character table?

I'm doing an R&D project focused on encrypting information into cipher-text and then printing it as a Code-128 barcode.
It is my understanding that Code-128 barcodes can only encode the ASCII table up to 128 characters. Therefore, I need to know whether the ciphertext output of modern algorithms such as AES or RSA can be restricted to only the ASCII-128 table.
I am not doing any programming at this point; I am mainly trying to find precedent that this can be done or has been done. If anyone knows the answer to this question, and better yet could provide a reference to an example, I would be very grateful.
Bonus question: if restricting the ciphertext to ASCII-128 is possible, how much could it affect encryption strength?
Encode the resulting ciphertext, which is binary, with Base64, which is ASCII. Then you will be able to print the Base64-encoded ciphertext as a barcode.
Since Base64 encoding/decoding doesn't change the content, only its presentation, it doesn't affect the strength of the underlying cryptography.
If the resulting data won't fit into the 103 (255) characters of Code 128, consider using a QR code instead.
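A sketch of that suggestion, assuming the third-party `cryptography` package for AES-GCM: encrypt, then Base64-encode, and every character of the payload falls within printable ASCII that a Code 128 barcode (or any other text-only carrier) can hold:

```python
import base64
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # assumed dependency

key = AESGCM.generate_key(bit_length=128)
nonce = os.urandom(12)

ciphertext = AESGCM(key).encrypt(nonce, b"serial=12345;lot=A7", None)

# Ciphertext is arbitrary bytes; Base64 turns it into printable ASCII
# suitable for a barcode payload. The nonce is shipped alongside it.
payload = base64.b64encode(nonce + ciphertext).decode("ascii")
print(payload)

# Reverse the steps to decrypt.
raw = base64.b64decode(payload)
print(AESGCM(key).decrypt(raw[:12], raw[12:], None))
```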

Data Encryption Standard (DES) zipped is bigger?

I kind of know why DES-encrypted files end up bigger when they are zipped, but can anyone give me a proper reason why, or a link? I can't seem to find anything explaining it.
Thanks :)
A decently encrypted ciphertext looks very much like a sequence of random numbers to the compression program. Without the patterns that'd be present in cleartext, the compressor can't find much redundancy to remove, and thus can't compress very well at all. Add in the data the compressor needs in order to be able to decode, and it's entirely possible that the "compressed" file will be bigger than the original.
If you want to compress, you might consider doing so before you encrypt.
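A small standard-library sketch of why that happens: good ciphertext is statistically indistinguishable from random bytes, and random bytes don't compress (here `os.urandom` stands in for DES output):

```python
import os
import zlib

plaintext = b"ATTACK AT DAWN. " * 1000      # 16000 highly redundant bytes
random_like = os.urandom(len(plaintext))    # stand-in for well-encrypted data

print(len(zlib.compress(plaintext)))        # a tiny fraction of 16000
print(len(zlib.compress(random_like)))      # slightly MORE than 16000
```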

Does using Base64 over Hex for encoding my SHA-1 hash make it less secure?

A lot of the examples online show the hash as a hex representation, and they are typically custom implementations. Is there anything wrong with, or less secure about, using the Apache Commons Base64 encoding instead? When reading about encoding, it is usually within the realm of how to represent binary as text in XML, but does not necessarily discuss security concerns... just how the compression works.
On a related issue, why encode it at all, since databases have binary types that could probably hold the hash as binary? So if I'm storing a password, why not just store it in its native type?
An encoding affects only the representation of the data, not its security. So, if you send an unencrypted password and use some form of encoding, you've not made it any more secure; likewise, if you take some highly encrypted text and then represent it in some encoding scheme, that won't make it any less secure, either. Typically, the reason to use this form of encoding is to send binary data using a protocol (such as SMTP), where the protocol is only capable of supporting 7-bit ASCII text. Another use is in URLs, where the set of characters that a URL can support is limited, but you might want to put arbitrarily complicated binary data in that URL (such as a validation signature of some sort).
Not at all. It's just an encoding that represents the same bits in ASCII. It is mostly useful when you have to store or transmit binary data over communication paths designed to handle only text; modern examples are email and web interfaces. You also can't send the binary form to a terminal to view it, since it would result in garbage or strange terminal behavior.
If you can safely store the bits as a binary blob in a database, there is no reason to encode them in Base64. It would just be harder to view them, since you would have to convert back to a text form first.
Well, we typically don't do too well reading binary, and hex is a better substitute for that. I wish you had linked to the articles you were referencing, so others could have a direct line on what you're forming an opinion from.
I don't understand why they would use Base64 over hex, but I'm assuming it's because hex has 16 symbols and Base64 has 64, thus producing a more compact version of the same hash ;) We humans tend to do better absorbing a small amount of data at a time.
No, because Base64 is a 1:1 encoding (that is, for every input there is exactly one base64 encoded output, and vice-versa), base64 encoding a SHA1 hash is just as “secure” as a hex-encoded (or binary-encoded, for that matter) hash.
The encoding would only make a hash insecure if the encoding made it possible for multiple hashes to encode to the same string, or multiple strings to decode to the same hash.
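A quick standard-library sketch of that point: hex and Base64 are just two reversible spellings of the same 20 SHA-1 bytes, so neither representation is more or less secure than the other:

```python
import base64
import hashlib

digest = hashlib.sha1(b"correct horse battery staple").digest()  # 20 raw bytes

hex_form = digest.hex()                        # 40 characters
b64_form = base64.b64encode(digest).decode()   # 28 characters

print(hex_form)
print(b64_form)

# Both decode back to the identical bytes.
print(bytes.fromhex(hex_form) == base64.b64decode(b64_form) == digest)
```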
