Terminology : encoding without decoding? - encryption

I am having trouble understanding such basic concept.
I did some research about cryptography and manipulated few concepts (RSA key pair, AES/DES/whatever secret key, hash functions ...). But I would like to understand more deeply one basic thing :
Encoding is transforming a message into an other form.
Decoding is giving a message its original form.
Well, for me it sounds like encryption is encoding. And I think (please correct me) that encrytion is a way of encoding (for a very particular purpose : increasing the confidence of having a known list of person who can decode).
But what about hash function ? Since there is no decoding function, when we hash a message, can we say :
"this text is this message encoded with SHA-1 algorithm",
as we can surely say :
"this digest is this message hashed with SHA-1 algorithm" ?
Thank you !

Encoding, and its reverse decoding, are mere transformations of the data into some alternative form. Each form expresses the exact same data, just written differently. The transformation is well known and can be carried out by anyone.
Encryption, and its reverse decryption, is the encoding of data using a secret. The ciphertext (the encrypted data), is for all intents and purposes random noise. The ciphertext does not express the plaintext in some alternative format, the plaintext is hidden inside the ciphertext. The transformation is not well known, as it requires a secret key which, supposedly, only specific entities are in possession of.
In that way, yes, encryption is a specialised form of encoding, but in usage "encoding" typically means a transformation that can be carried out by anyone, while "encryption" specifically involves preventing unauthorised parties from carrying out the transformation.
Hashing is a one-way operation (there's no dehashing) and thereby entirely distinct from the other two operations.

Related

Can I use the word 'encrypt' as hashing by cryptographic hash function in English? [duplicate]

My question is simple: encryption hides information to the point where only the people intended to see the information can actually see it. You can encrypt information and then decrypt it. Why is hashing is considered encryption if it cannot be decrypted?
I say hashing is considered encryption because you call MD5 a cryptographic hash function.
Correct, Hashing is a one way function, but Hashing isn't considered Encryption. A good explanation of the difference Hashing vs Encryption and Fundamental difference between Hashing and Encryption algorithms.
Cryptography is broad field of study which covers both encryption and secure hashing. It also encompasses a variety of other topics, including secret sharing, public-key systems, and random number generation. One might summarize it broadly as the science of working with secrets.
Hashing is not considered a form of encryption. There are some relationships between certain algorithms used for hashing and encryption, but the two are not interchangeable.
It is all about the purpose. Hashing is not about 'find out what the original message is' but about an unique sequence of bits, in other words unique id, that identifies the original message.
Even very small changes in the source message would drastically change the resulting output bit sequence, by the so-called avalanche effect.
En and De prefix many English word pairs, such as en code and de code. Or en able and dis able. En generally means 'put into', and De generally means 'remove'
en cryption is the act of applying cryptography. de cryption is the act of removing cryptography.
To en crypt infers that de crypting must be possible.
With hashes ('one way cryptography') this is not possible. So to say you en crypt a hash makes no sense, as it cannot be de crypted.
Cryptography is called as "The Art of Secret Writing" includes Encryption and Decryption in which Encryption means conversion for Plain Text into Cipher Text and Decryption means Conversion of Cipher Text to Plain Text. It is a two-way process since the encrypted message can be decrypted and viewed by the authorized person or the intended person. So only Cryptography is used for Secure Communication.
But in Hashing, it is a one-way mechanism. Here the converted value is called as the Message Digest like Cipher in Cryptography.Hashing mechanism converts the data into Hash Value (or) Message Digest by using Hashing Algorithms like SHA(Secure Hash Algorithm), MD5(Message Digest v5) etc.
In this the data converted into Hash Value can't be converted back as it's the main purpose is to Validate and enhance the Security.
For example:-All your passwords for your Online Accounts are stored in the form of Hash since even it is hacked it can't be viewed. Every time when you enter your password it is converted into a hash value and it is checked with the existing hash value of your password.
Hashes can be decrypted given enough time and computing power, that's why we have to change to better and better standards. Now this is probably done through brute force which is mathematically inelegant, but the point remains that they can be decrypted.
While there are differences in the usage of the words as pointed to above, we shouldn't be so arrogant as to think that hashes can't be decrypted (made unsecret), they just usually aren't used for that purpose. They are still encrypted text.

Is Base64 an encryption or encoding algorithm?

I have to use an encryption algorithm using Base64 but when I researched online I find forums state it is an encoding algorithm. This has me confused. :(
Is Base64 an encryption or encoding algorithm? How do we differentiate between the two except for the fact that one is publicly decipherable while the other needs a key for that?
It's an encoding algorithm (hence "Base64 encoding") to allow people to move data in an ASCII friendly environment (i.e. no control characters or anything non-printable). It should give you good portability with XML and JSON etc.
The encoding is entirely well known, the algorithm is simple and as it has not "mutability" of the algorithm or concept of keys etc. it is not considered as "encryption".
In summary, anybody can Base64 decode your content, so it's not encryption. At least not useful as encryption. It may keep a four year old stumped, but that's it.
An encoding algorithm merely presents data in an alternative format. It does not in any way attempt to hide data, it merely expresses the same data in an alternative syntax. Base64 is such an encoding algorithm. It merely encodes arbitrary data using only ASCII characters, which is useful in many situations in which non-ASCII characters may not be handled correctly. You can encode and decode Base64 back and forth all day long; there's no secret, no protection, no encryption.
The difference between encoding and encrypting is in whether you need to know a secret in order to get back the original form. base64 is an encoding because all you need to know is the algorithm to encode/decode.
When something is encrypted, there's a secret key that's used, and you need to know the key in order to decrypt it. There's two general types of encryption:
symmetric encryption = the same key is used to encrypt and decrypt. The correspondents using this encryption both need to know this key.
asymmetric encryption = different keys are used to encrypt and decrypt. This is also called public key encryption because you can make one of the keys well known (public), while keeping the other one secret (private). This allows anyone to encrypt a message that using the public key, while only the person who knows the private key can decrypt it, or vice versa.
One can certainly see Base64 as a substitution cipher with a pre-set/fixed key which also blows up the ciphertext by roughly 4/3, but this is not a very useful thought process. The main property of it is that it transforms some data into another format without some additional information. So it is an encoding algorithm.
Note that there are different variants of Base64 with different alphabets such as the one that is URL-safe (table 2 of the RFC4648). If you can set the alphabet with positions, then it will be an encryption algorithm, but it shouldn't be called Base64 anymore.

Difference between encoding and encryption

What is the difference between encoding and encryption?
Encoding transforms data into another format using a scheme that is publicly available so that it can easily be reversed.
Encryption transforms data into another format in such a way that only specific individual(s) can reverse the transformation.
For Summary -
Encoding is for maintaining data usability and uses schemes that are publicly available.
Encryption is for maintaining data confidentiality and thus the ability to reverse the transformation (keys) are limited to certain people.
More details in SOURCE
Encoding:
Purpose: The purpose of encoding is to transform data so that it can be properly (and safely) consumed by a different type of system.
Used for: Maintaining data usability i.e., to ensure that it is able to be properly consumed.
Data Retrieval Mechanism: No key and can be easily reversed provided we know what algorithm was used in encoding.
Algorithms Used: ASCII, Unicode, URL Encoding, Base64.
Example: Binary data being sent over email, or viewing special characters on a web page.
Encryption:
Purpose: The purpose of encryption is to transform data in order to keep it secret from others.
Used for: Maintaining data confidentiality i.e., to ensure the data cannot be consumed by anyone other than the intended recipient(s).
Data Retrieval Mechanism: Original data can be obtained if we know the key and encryption algorithm used.
Algorithms Used: AES, Blowfish, RSA.
Example: Sending someone a secret letter that only they should be able to read, or securely sending a password over the Internet.
Reference URL: http://danielmiessler.com/study/encoding_vs_encryption/
Encoding is the process of transforming data so that it may be transmitted without danger over a communication channel or stored without danger on a storage medium. For instance, computer hardware does not manipulate text, it merely manipulates bytes, so a text encoding is a description of how text should be transformed into bytes. Similarly, HTTP does not allow all characters to be transmitted safely, so it may be necessary to encode data using base64 (uses only letters, numbers and two safe characters).
When encoding or decoding, the emphasis is placed on everyone having the same algorithm, and that algorithm is usually well-documented, widely distributed and fairly easily implemented. Anyone is eventually able to decode encoded data.
Encryption, on the other hand, applies a transformation to a piece of data that can only be reversed with specific (and secret) knowledge of how to decrypt it. The emphasis is on making it hard for anyone but the intended recipient to read the original data. An encoding algorithm that is kept secret is a form of encryption, but quite vulnerable (it takes skill and time to devise any kind of encryption, and by definition you can't have someone else create such an encoding algorithm for you - or you would have to kill them). Instead, the most used encryption method uses secret keys : the algorithm is well-known, but the encryption and decryption process requires having the same key for both operations, and the key is then kept secret. Decrypting encrypted data is only possible with the corresponding key.
Encoding is the process of putting a sequence of characters into a special format for transmission or storage purposes
Encryption is the process of translation of data into a secret code. Encryption is the most effective way to achieve data security. To read an encrypted file, you must have access to a secret key or password that enables you to decrypt it. Unencrypted data is called plain text ; encrypted data is referred to as cipher text
Encoding is for maintaining data usability and can be reversed by employing the same algorithm that encoded the content, i.e. no key is used.
Encryption is for maintaining data confidentiality and requires the use of a key (kept secret) in order to return to plaintext.
Also there are two major terms that brings confusion in the world of security Hashing and Obfuscation
Hashing is for validating the integrity of content by detecting all modification thereof via obvious changes to the hash output.
Obfuscation is used to prevent people from understanding the meaning of something, and is often used with computer code to help prevent successful reverse engineering and/or theft of a product’s functionality.
Read more # Danielmiessler article
See encoding as a way to store or communicate data between different systems. For example, if you want to store text on a hard drive, you're going to have to find a way to convert your characters to bits. Alternatively, if all you have is a flash light, you might want to encode your text using Morse. The result is always "readable", provided you know how it's stored.
Encryption means you want to make your data unreadable, by encrypting it using an algorithm. For example, Caesar did this by substituting each letter by another. The result here is unreadable, unless you know the secret "key" with which is was encrypted.
I'd say that both operations transform information from one form to another, the difference being:
Encoding means transforming information from one form to another, in most cases it is easily reversible
Encryption means that the original information is obscured and involves encryption keys which must be supplied to the encryption / decryption process to do the transformation.
So, if it involves (symmetric or asymmetric) keys (aka a "secret"), it's encryption, otherwise it's encoding.
Encoding -》 example data is 16
Then encoding is 10000 means it's binary format or ASCII or UNCODED etc
Which can be read by any system eassily and eassy to understand it's real meaning
Encryption -》 example data is 16
Then encryprion is 3t57 or may be anything depend upon which algo is used to encryption
Which can be read by any system eassily BUT ony who can understand it's real meaning who has it's decryption key
These are little bit different from each other. The encoding used when we want to convert text in a specific computer coding technique and in the encryption we hide data between a specific key or text.
Encoding is process of transforming given set of characters in relevant accepted format, take this question's URL,
This is what we see -->
hhttps://stackoverflow.com/questions/4657416/difference-between-encoding-and-encryption
Over transmission this will be transformed to -->
https%3A%2F%2Fstackoverflow.com%2Fquestions%2F4657416%2Fdifference-between-encoding-and-encryption
^ is example of URL encoding using ASCII char set where,
: = %3A
/ = %2F
The reverse of Encoding is Decoding to original form and with given ASCII standard.
Encryption is process of converting plane text to cipher text so only authorized party can decipher it.
For example a simple HELLO is encrypted into KHOOR if just 3 characters are shifted.
p.s. Encoding (to code in some form) is form of encryption. :)
what-is-encryption
Encryption converts data to non-readable format (Possibly containing special non-readable characters).
Encoding helps to convert that data to readable format (characters) so that it can be stored for future use i.e. possibly during decryption.

Identifying An Encryption Algorithm

First off, I would like to ask if any of you know of an encryption algorithm that uses a key to encrypt the data, but no key to decrypt the data. This seems highly unlikely, if not impossible to me, so sorry if it's a stupid question.
My final question is, say you have access to the plain text data before it is encrypted, the key used to encrypt the plain text data, and the resulting encrypted data, would figuring out which algorithm used to encrypt the data be feasible?
First off, I would like to ask
if any of you know of an encryption
algorithm that uses a key to encrypt
the data, but no key to decrypt the
data.
No. There are algorithms that use a different key to decrypt than to encrypt, but a keyless method would rely on secrecy of the algorithm, generally regarded as a poor idea.
My final question is, say you have
access to the plain text data before
it is encrypted, the key used to
encrypt the plain text data, and the
resulting encrypted data, would
figuring out which algorithm used to
encrypt the data be feasible?
Most likely yes, especially given the key. A good crypto algorithm relies on the secrecy of the key, and the key alone. See kerckhoff's principle.
Also if a common algorithm is used it would be a simple matter of trial and error, and besides cryptotext often is accompanied by metadata which tells you algorithm details.
edit: as per comments, you may be thinking of digital signature (which requires a secret only on the sender side), a hash algorithm (which requires no key but isn't encryption), or a zero-knowledge proof (which can prove knowledge of a secret without revealing it).
Abstractly, we can think of the encryption system this way:
-------------------
plaintext ---> | algorithm & key | ---> ciphertext
-------------------
The system must guarantee the following:
decrypt(encrypt(plaintext, algorithm, key), algorithm, key) = plaintext
First off, I would like to ask
if any of you know of an encryption
algorithm that uses a key to encrypt
the data, but no key to decrypt the
data.
Yes, in such a system the key is redundant; all the "secrecy" lies in the algorithm.
My final question
is, say you have access to the plain
text data before it is encrypted, the
key used to encrypt the plain text
data, and the resulting encrypted
data, would figuring out which
algorithm used to encrypt the data be
feasible?
In practice, you'll probably have a small space of algorithms, so a simple brute-force search is feasible. However, there may be more than one algorithm that fits the given information. Consider the following example:
We define the following encryption and decryption operations, where plaintext, ciphertext, algorithm, and key are real numbers (assume algorithm is nonzero):
encrypt(plaintext, algorithm, key) = algorithm x (plaintext + key) = ciphertext
decrypt(ciphertext, algorithm, key) = ciphertext/algorithm - key = plaintext
Now, suppose that plaintext + key = 0. We have ciphertext = 0 for any choice of algorithm. Hence, we cannot deduce the algorithm used.
First off, I would like to ask if any of you know of an encryption algorithm that uses a key to encrypt the data, but no key to decrypt the data.
What are you getting at? It's trivial to come up with a pair of functions that fits the letter of the specification, but without knowing the intent it's hard to give a more helpful answer.
say you have access to the plain text data before it is encrypted, the key used to encrypt the plain text data, and the resulting encrypted data, would figuring out which algorithm used to encrypt the data be feasible?
If the algorithm is any good the output will be indistinguishable from random noise, so there is no analytic solution to this. As a practical matter, there are only so many trusted algorithms in wide use. Trying each one in turn would be quick, but would be complicated by the fact that an implementation has some freedom with regard to things like byte order (little-endian vs big-endian), key derivation (if you had a pass-phrase instead of the actual cryptographic key itself), encryption modes and padding.
As frankodwyer points out, this situation is not part of usual threat models. This would work in your favor, as it makes it more likely that the algorithm is a well-known one.
The best you could do without a known key in the decoder would be to add a bit of obscurity. For example, if the first step of the decode algorythm is to strip out everything except for every tenth character, then your encode key may be used to seed some random garbage for nine out of every ten characters. Thus, with different keys you could achieve different encoded results which would be decoded to the same message, with no key necessary for the decoder.
However, this does not add much real security and should not be solely relied on to protect crucial data. I'm just thinking of a case where it would be possible to do so yes I suppose it could - if you were just trying to prove a point or add one more level of security.
I don't believe that there is such an algorithm that would use a key to encrypt, but not to decrypt. (Silly answers like a 26 character Caesar cipher aside...)
To your second question, yes; it just depends on how much time you're willing to spend on it. In theoretical cryptography it is assumed that the algorithm can always be determined. Whether that be through theft of the algorithm or a physical machine, or as in your case having a plain text and cipher text pair.

What is the difference between Obfuscation, Hashing, and Encryption?

What is the difference between Obfuscation, Hashing, and Encryption?
Here is my understanding:
Hashing is a one-way algorithm; cannot be reversed
Obfuscation is similar to encryption but doesn't require any "secret" to understand (ROT13 is one example)
Encryption is reversible but a "secret" is required to do so
Hashing is a technique of creating semi-unique keys based on larger pieces of data. In a given hash you will eventually have "collisions" (e.g. two different pieces of data calculating to the same hash value) and when you do, you typically create a larger hash key size.
obfuscation generally involves trying to remove helpful clues (i.e. meaningful variable/function names), removing whitespace to make things hard to read, and generally doing things in convoluted ways to make following what's going on difficult. It provides no serious level of security like "true" encryption would.
Encryption can follow several models, one of which is the "secret" method, called private key encryption where both parties have a secret key. Public key encryption uses a shared one-way key to encrypt and a private recipient key to decrypt. With public key, only the recipient needs to have the secret.
That's a high level explanation. I'll try to refine them:
Hashing - in a perfect world, it's a random oracle. For the same input X, you always recieve the same output Y, that is in NO WAY related to X. This is mathematically impossible (or at least unproven to be possible). The closest we get is trapdoor functions. H(X) = Y for with H-1(Y) = X is so difficult to do you're better off trying to brute force a Z such that H(Z) = Y
Obfuscation (my opinion) - Any function f, such that f(a) = b where you rely on f being secret. F may be a hash function, but the "obfuscation" part implies security through obscurity. If you never saw ROT13 before, it'd be obfuscation
Encryption - Ek(X) = Y, Dl(Y) = X where E is known to everyone. k and l are keys, they may be the same (in symmetric, they are the same). Y is the ciphertext, X is the plaintext.
A hash is a one way algorithm used to compare an input with a reference without compromising the reference.
It is commonly used in logins to compare passwords and you can also find it on your reciepe if you shop using credit-card. There you will find your credit-card-number with some numbers hidden, this way you can prove with high propability that your card was used to buy the stuff while someone searching through your garbage won't be able to find the number of your card.
A very naive and simple hash is "The first 3 letters of a string".
That means the hash of "abcdefg" will be "abc". This function can obviously not be reversed which is the entire purpose of a hash. However, note that "abcxyz" will have exactly the same hash, this is called a collision. So again: a hash only proves with a certain propability that the two compared values are the same.
Another very naive and simple hash is the 5-modulus of a number, here you will see that 6,11,16 etc.. will all have the same hash: 1.
Modern hash-algorithms are designed to keep the number of collisions as low as possible but they can never be completly avoided. A rule of thumb is: the longer your hash is, the less collisions it has.
Obfuscation in cryptography is encoding the input data before it is hashed or encrypted.
This makes brute force attacks less feasible, as it gets harder to determine the correct cleartext.
That's not a bad high-level description. Here are some additional considerations:
Hashing typically reduces a large amount of data to a much smaller size. This is useful for verifying the contents of a file without having to have two copies to compare, for example.
Encryption involves storing some secret data, and the security of the secret data depends on keeping a separate "key" safe from the bad guys.
Obfuscation is hiding some information without a separate key (or with a fixed key). In this case, keeping the method a secret is how you keep the data safe.
From this, you can see how a hash algorithm might be useful for digital signatures and content validation, how encryption is used to secure your files and network connections, and why obfuscation is used for Digital Rights Management.
This is how I've always looked at it.
Hashing is deriving a value from
another, using a set algorithm. Depending on the algo used, this may be one way, may not be.
Obfuscating is making something
harder to read by symbol
replacement.
Encryption is like hashing, except the value is dependent on another value you provide the algorithm.
A brief answer:
Hashing - creating a check field on some data (to detect when data is modified). This is a one way function and the original data cannot be derived from the hash. Typical standards for this are SHA-1, SHA256 etc.
Obfuscation - modify your data/code to confuse anyone else (no real protection). This may or may not loose some of the original data. There are no real standards for this.
Encryption - using a key to transform data so that only those with the correct key can understand it. The encrypted data can be decrypted to obtain the original data. Typical standards are DES, TDES, AES, RSA etc.
All fine, except obfuscation is not really similar to encryption - sometimes it doesn't even involve ciphers as simple as ROT13.
Hashing is one-way task of creating one value from another. The algorithm should try to create a value that is as short and as unique as possible.
obfuscation is making something unreadable without changing semantics. It involves value transformation, removing whitespace, etc. Some forms of obfuscation can also be one-way,so it's impossible to get the starting value
encryption is two-way, and there's always some decryption working the other way around.
So, yes, you are mostly correct.
Obfuscation is hiding or making something harder to understand.
Hashing takes an input, runs it through a function, and generates an output that can be a reference to the input. It is not necessarily unique, a function can generate the same output for different inputs.
Encryption transforms the input into an output in a unique manner. There is a one-to-one correlation so there is no potential loss of data or confusion - the output can always be transformed back to the input with no ambiguity.
Obfuscation is merely making something harder to understand by intruducing techniques to confuse someone. Code obfuscators usually do this by renaming things to remove anything meaningful from variable or method names. It's not similar to encryption in that nothing has to be decrypted to be used.
Typically, the difference between hashing and encryption is that hashing generally just employs a formula to translate the data into another form where encryption uses a formula requiring key(s) to encrypt/decrypt. Examples would be base 64 encoding being a hash algorithm where md5 being an encryption algorithm. Anyone can unhash base64 encoded data, but you can't unencrypt md5 encrypted data without a key.

Resources