Dictionary-based SMS encryption

I am trying to create an end-to-end SMS encryption application, but I don't want to use a standard encryption algorithm. The idea is to convert the text of the message into completely different but meaningful text, so that over the network it doesn't look encrypted. I am assuming messages are in English only.
For the implementation, I have first compressed the message using Huffman encoding, which gives me a compressed stream of bits. For the encryption step I don't have any idea. Is it possible to build a dictionary from some random text, or what other approach could be used to get this kind of encryption? Decryption at the receiving end is also required.

Are you asking for a codebook? If you have access to an English-language dictionary of 65536 entries, for example, you can take every 16 bits of your message as an index into this table to get a word. Good luck converting this into a real cryptosystem.
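The codebook idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word list is a generated placeholder (a real one would be 65536 distinct English words), and the function names are made up for this example. It is an encoding of bits as words, not a cipher.

```python
# Toy codebook: maps every 16 bits of input to one "word" and back.
# NOT a cryptosystem; anyone with the same word list can reverse it.

def make_wordlist(size=65536):
    # Hypothetical stand-in for a real 65536-entry English dictionary.
    return [f"word{i:05d}" for i in range(size)]

def encode(data: bytes, words):
    # Pad to an even length so the bytes split cleanly into 16-bit chunks.
    # A real format would also have to record the original length.
    if len(data) % 2:
        data += b"\x00"
    return [words[int.from_bytes(data[i:i + 2], "big")]
            for i in range(0, len(data), 2)]

def decode(tokens, words):
    # Reverse lookup: word -> 16-bit index -> two bytes.
    index = {w: i for i, w in enumerate(words)}
    return b"".join(index[t].to_bytes(2, "big") for t in tokens)

words = make_wordlist()
tokens = encode(b"hi!!", words)
print(tokens)
print(decode(tokens, words))
```

The input here would be the Huffman-compressed bit stream, packed into bytes. Any secrecy would have to come from somewhere else, for example from encrypting the bytes before they are mapped to words.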

Related

File Delimiters on AES 256 Encrypted fields

I have a requirement for one of my projects in which I am expecting a few of the incoming fields to be AES-256 encrypted when sent to us by upstream. The incoming file is comma delimited. Is there a possibility that the AES-encrypted fields may contain ",", shifting the values into the wrong fields? What if it is pipe delimited or uses some other delimiter?
Also, what should the datatype of these encrypted fields be in order to read them using an ETL tool?
Thanks in Advance
AES, as a block cipher, is a family of permutations selected by the key. The output is expected to look random (more precisely, we believe that AES is a pseudo-random permutation).
AES (like any block cipher) outputs binary data, usually as a byte array, and each byte can take any value from 0 to 255 with equal probability. So yes, a ciphertext byte can coincide with a comma, a pipe, or any other delimiter.
You are not alone;
Transmitting binary data can create problems, especially in protocols that are designed to deal with textual data. To avoid it altogether, we don't transmit binary data. Many of the programming errors related to encryption on Stack Overflow are due to sending binary data over text-based protocols. Most of the time this works, but occasionally it fails and the coders wonder about the problem. The binary data corrupts the network protocol.
Therefore hex, Base64, or similar encodings are needed to mitigate this. Standard Base64 is not fully URL safe, but a URL-safe variant exists and takes little work to adopt.
Note that this has nothing to do with security; it is about visibility and interoperability.
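A small Python sketch of the point above. It does not call AES; the `ciphertext` value is a stand-in that simply contains every possible byte value, which is enough to show the delimiter problem and how Base64 or hex avoids it.

```python
import base64

# Stand-in for AES output: ciphertext bytes can take any value, so use a
# buffer that contains every byte value, including b"," and b"|".
ciphertext = bytes(range(256))
assert b"," in ciphertext and b"|" in ciphertext

# Text encodings of the same bytes.
b64_field = base64.b64encode(ciphertext).decode("ascii")  # A-Z a-z 0-9 + / =
hex_field = ciphertext.hex()                              # 0-9 a-f

# Neither encoding can ever produce a comma or a pipe.
assert "," not in b64_field and "|" not in b64_field
assert "," not in hex_field and "|" not in hex_field

# The encoded field survives a round trip through a comma-delimited row.
row = ",".join(["42", b64_field, "ok"])
fields = row.split(",")
recovered = base64.b64decode(fields[1])
print(recovered == ciphertext)
```

With the field encoded this way, it can be read as an ordinary text column and decoded back to bytes just before decryption.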

Can a brute force attack reliably detect whether a key was right?

My understanding is that for most encryption algorithms there is always an output, regardless of the key. A wrong key will of course produce a wrong output. So when using brute force to decrypt encrypted data, how do hackers know when the key was correct? Is there a way other than analyzing the output data?
If this is the only way, I have this thought: when encrypting text, wouldn't it be safer to encrypt at the word level using a dictionary rather than at the bit level as is done today? Then the output would always consist of words. Hackers would need to use complicated and slow algorithms to check the grammar of the output words to determine whether this could be the real written text.
To answer the first part, I will simply quote my old answer to the Super User question "How does Truecrypt know it has the correct password?"
It knows the correct password because within that encrypted container
there is a known header.
When TrueCrypt decrypts a blob of data and the header matches what it
was expecting, it reports back that the decryption was successful. If
you use an incorrect password it will still "decrypt" the text, but it
will decrypt the header into gibberish and fail the decryption check.
Here is a link to the specification; you can see there are many
things that must be true for it to be a valid header (bytes 64-67
after decryption should always be the ASCII value TRUE, bytes
132-251 must all be 0's, etc.). If you decrypt a blob of data
and it does not match that header format, you know the decryption
failed.
So they already do what you were suggesting about "checking the grammar": they attempt to decrypt the message, and if the message "has proper grammar" (the data follows the spec of the encrypted file format), the message was successfully decrypted.
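The header check can be illustrated with a toy example. Everything here is an assumption made for illustration: the "cipher" is a simple XOR stream built from SHA-256 (not what TrueCrypt uses, and not secure), the 4-byte `MAGIC` marker only loosely imitates the TRUE field described above, and the key space is four decimal digits so the brute force finishes instantly.

```python
import hashlib

MAGIC = b"TRUE"  # hypothetical known header placed before the message

def keystream(key: bytes, n: int) -> bytes:
    # Toy keystream: SHA-256 of key plus a counter, concatenated.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]

def toy_encrypt(key: bytes, message: bytes) -> bytes:
    plain = MAGIC + message
    return bytes(a ^ b for a, b in zip(plain, keystream(key, len(plain))))

def try_decrypt(key: bytes, blob: bytes):
    # Every key produces *some* output; only the header tells us if it is right.
    plain = bytes(a ^ b for a, b in zip(blob, keystream(key, len(blob))))
    return plain[len(MAGIC):] if plain.startswith(MAGIC) else None

blob = toy_encrypt(b"1234", b"We attack tomorrow!")

# Brute force over all 4-digit keys, keeping those that pass the header check.
candidates = (f"{i:04d}".encode() for i in range(10000))
found = [k for k in candidates if try_decrypt(k, blob) is not None]
print(found)
```

A wrong key still "decrypts" the blob, but the first four bytes come out as gibberish, so the attacker never needs to look at the message body to reject it.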
For your second part, "using a dictionary", there are a few important issues.
First, this would only work on plain unformatted text; no binary data or text metadata allowed. More importantly, how do you "create" this dictionary? If you create the dictionary on the fly using the words in the document, tell me what the dictionary would be for the following message:
We attack tomorrow!
You could pad the dictionary with extra words, but how do you choose the padding? If you used an existing fixed dictionary, what do you do when a word is not in the dictionary? What about misspellings?
I have not even begun to touch on how this method is very likely to leak information. As you said, English has grammar rules: some words appear more often near the end of sentences and some more often near the start. By looking at the numbers used as indexes, you could potentially do a statistical analysis and rule out a portion of the dictionary as "unlikely" words.
I am sure there are many other problems with this, but I am only a beginner in crypto and I cannot think of any others off the top of my head.
There is an adage in cryptography: "It is easy for you to create a cipher that you yourself cannot break; it is quite hard to make a cipher that other people cannot break."

AES-128 encrypted string differs from the one produced on .NET

I am implementing AES-128 encryption/decryption in an iOS application for sending data to and receiving data from a .NET server. I am almost done, but during unit testing I found a problem with the encrypted strings: some of them do not match the ones produced on the .NET server. About 98 percent of the strings are identical on both sides, but the remaining 2 percent differ. When I compare them, the string generated on iOS is a little shorter and the .NET one is longer; in fact the iOS string is a substring of the .NET string. When I try to decrypt the iOS-generated string, decryption fails and returns null, but when I decrypt the .NET-generated string (the longer one), I can see the decrypted text.
Both sides use the same key (16 characters long).
Could you please suggest a solution or tell me where I am going wrong?
Thanks a lot to all.
Original string: "custId=10&mode=1"
KEY= "PasswordPassword"
Encrypted string on iOS:
r51TbJpBLYDkcPC+Ei6Rmg==
Encrypted string on .NET:
r51TbJpBLYDkcPC+Ei6RmtY2fuzv3RsHzsXt/RpFxAs=
Padding option used for encryption: kCCOptionPKCS7Padding
I followed this tutorial.
http://automagical.rationalmind.net/2009/02/12/aes-interoperability-between-net-and-iphone/
A similar question can be found on Crypto.SE.
My version (TL;DR):
Essentially, .NET and iOS have different implementations, and since the guide you are following is from 2009, I would expect it to be rather out of date by now, given that there has been at least one major revision of each platform since then.
The original answer there says:
I can immediately think of four reasons:
1. They're not both using AES256. I see in the Obj-C document a direct statement that they are using AES256 (unless you deliberately change it); I don't see any statement in the Visual Basic document that says what key size they're using (unless that's what they mean by "Block Bits").
2. Different keys. AES256 takes a key of 256 bits; there's no standard method to take a five-character string and convert it into a 256-bit value. There are a lot of possible methods, and there's no particular assurance that both sides use the same one.
3. Different modes of operation. The AES block cipher takes 128-bit values and translates them into 128-bit values. However, not all our messages fit into 128 bits, and sometimes there are other things we'd like to do besides message encryption. A mode of operation is a method that takes a block cipher and uses it as a tool to perform some more generally useful function (such as encrypting a much longer message). There are a number of standard modes of operation; the Obj-C document states that it is using CBC mode, while the Visual Basic document has scary-sounding words which might be a garbled explanation of CBC mode.
4. IVs. Some modes of operation (such as CBC mode) have the encryptor select an "Initialization Vector" randomly; it can be transmitted along with the encrypted message (because the decryptor will need that value). One of the things this Initialization Vector does is that if you encrypt the same message a second time, the second ciphertext will not resemble the first ciphertext at all; that way, someone listening will not be able to deduce that you've just repeated a message. The Obj-C document specifically says that it will pick a random IV (unless you give it one yourself).
As you can see, there are a bunch of reasons why the two ciphertexts may be different. One thing you can try: hand the ciphertext from one to the other, and ask them to decrypt it; if they can, you can be pretty sure that both sides are doing basically the same thing.
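One check that needs no key is to compare the two ciphertexts from the question at the byte level. The sketch below only decodes the Base64 strings given in the question; `pkcs7_pad` is a small helper written for this example to show what PKCS#7 padding does to a 16-byte input.

```python
import base64

ios = base64.b64decode("r51TbJpBLYDkcPC+Ei6Rmg==")
net = base64.b64decode("r51TbJpBLYDkcPC+Ei6RmtY2fuzv3RsHzsXt/RpFxAs=")
plaintext = b"custId=10&mode=1"

def pkcs7_pad(data: bytes, block: int = 16) -> bytes:
    # PKCS#7 always adds 1..block bytes, so input that is already a
    # multiple of the block size gains one full extra block.
    n = block - len(data) % block
    return data + bytes([n]) * n

print(len(plaintext), len(pkcs7_pad(plaintext)))  # plaintext vs padded length
print(len(ios), len(net))                         # ciphertext lengths
print(net[:16] == ios)                            # is iOS a prefix of .NET?
```

The plaintext is exactly 16 bytes, so a PKCS#7-padded ciphertext should be 32 bytes, which is what .NET produced. The iOS output is 16 bytes and equals the first block of the .NET output. That is consistent with both sides agreeing on key, mode and IV, and with the iOS side dropping the final padding block when the input length is a multiple of 16. It would also fit the observation that only a small share of strings is affected, although the question does not give enough detail to confirm the cause.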

How to distribute init. vector with data encrypted via a symmetric cipher?

I want to provide users with a service that encrypts some data to a file using a symmetric cipher. The user simply provides a key and may optionally provide an initialization vector for the cipher.
Is there a standard for how the file should look? It makes sense to me to fill the file with the encrypted data and show the corresponding initialization vector in a dialog window. To someone else it may seem more reasonable to store the initialization vector in the file together with the encrypted data.
The important thing for me is that the result is useful to the user and that he/she won't need to bother adjusting it.
Thanks for any comments!
It is common practice to provide the IV as the first block of the cyphertext file. That way the receiver just treats the first 8 bytes (DES) or 16 bytes (AES) as the IV and the rest of the file as the actual cyphertext.
Use the same format for the IV as you are using for the cyphertext: Base64, hex, byte data or whatever.
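A minimal sketch of that layout in Python, assuming a 16-byte IV (AES). The `pack`/`unpack` names and the placeholder ciphertext are made up for this example; the actual encryption step is left out.

```python
import os

IV_LEN = 16  # AES block size in bytes; it would be 8 for DES

def pack(iv: bytes, ciphertext: bytes) -> bytes:
    # File layout: IV first, ciphertext after it.
    if len(iv) != IV_LEN:
        raise ValueError("unexpected IV length")
    return iv + ciphertext

def unpack(blob: bytes):
    # The receiver treats the first IV_LEN bytes as the IV.
    if len(blob) < IV_LEN:
        raise ValueError("file too short to contain an IV")
    return blob[:IV_LEN], blob[IV_LEN:]

iv = os.urandom(IV_LEN)            # fresh random IV per message
ciphertext = b"\x01\x02\x03\x04"   # placeholder for real cipher output
blob = pack(iv, ciphertext)

iv_out, ct_out = unpack(blob)
print(iv_out == iv, ct_out == ciphertext)
```

The IV is not secret, so storing it in the clear at the start of the file is fine; the user then only has to keep track of the key.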
In principle, you can use any format you want, as long as the decrypting part of the program knows how to read it. For efficiency, having the initialization vector before the data seems a good idea.
If you want to encrypt files, a good idea would be not to create your own format (which forces you to make decisions like the one here), but to use an existing file format (which then also is a cryptographic protocol).
I recommend the OpenPGP message format, as defined in RFC 4880 (or some subset thereof, if you don't need all features). This also has the advantage that your clients then can decrypt your files using any OpenPGP implementation (like pgp or gpg), if your program somehow ceases to work (of course, only if they have the key/password).
You should be fine if you store the IV together with the encrypted data in the file.

Difference between encoding and encryption

What is the difference between encoding and encryption?
Encoding transforms data into another format using a scheme that is publicly available so that it can easily be reversed.
Encryption transforms data into another format in such a way that only specific individual(s) can reverse the transformation.
In summary:
Encoding is for maintaining data usability and uses schemes that are publicly available.
Encryption is for maintaining data confidentiality, and thus the ability to reverse the transformation (the keys) is limited to certain people.
More details from the source:
Encoding:
Purpose: The purpose of encoding is to transform data so that it can be properly (and safely) consumed by a different type of system.
Used for: Maintaining data usability i.e., to ensure that it is able to be properly consumed.
Data Retrieval Mechanism: No key and can be easily reversed provided we know what algorithm was used in encoding.
Algorithms Used: ASCII, Unicode, URL Encoding, Base64.
Example: Binary data being sent over email, or viewing special characters on a web page.
Encryption:
Purpose: The purpose of encryption is to transform data in order to keep it secret from others.
Used for: Maintaining data confidentiality i.e., to ensure the data cannot be consumed by anyone other than the intended recipient(s).
Data Retrieval Mechanism: Original data can be obtained if we know the key and encryption algorithm used.
Algorithms Used: AES, Blowfish, RSA.
Example: Sending someone a secret letter that only they should be able to read, or securely sending a password over the Internet.
Reference URL: http://danielmiessler.com/study/encoding_vs_encryption/
Encoding is the process of transforming data so that it may be transmitted without danger over a communication channel or stored without danger on a storage medium. For instance, computer hardware does not manipulate text, it merely manipulates bytes, so a text encoding is a description of how text should be transformed into bytes. Similarly, HTTP does not allow all characters to be transmitted safely, so it may be necessary to encode data using base64 (uses only letters, numbers and two safe characters).
When encoding or decoding, the emphasis is placed on everyone having the same algorithm, and that algorithm is usually well-documented, widely distributed and fairly easily implemented. Anyone is eventually able to decode encoded data.
Encryption, on the other hand, applies a transformation to a piece of data that can only be reversed with specific (and secret) knowledge of how to decrypt it. The emphasis is on making it hard for anyone but the intended recipient to read the original data. An encoding algorithm that is kept secret is a form of encryption, but quite a vulnerable one (it takes skill and time to devise any kind of encryption, and by definition you can't have someone else create such an encoding algorithm for you, or you would have to kill them). Instead, the most widely used encryption method relies on secret keys: the algorithm is well known, but the encryption and decryption process requires having the same key for both operations, and the key is kept secret. Decrypting encrypted data is only possible with the corresponding key.
Encoding is the process of putting a sequence of characters into a special format for transmission or storage purposes.
Encryption is the process of translating data into a secret code. Encryption is the most effective way to achieve data security. To read an encrypted file, you must have access to a secret key or password that enables you to decrypt it. Unencrypted data is called plaintext; encrypted data is referred to as ciphertext.
Encoding is for maintaining data usability and can be reversed by employing the same algorithm that encoded the content, i.e. no key is used.
Encryption is for maintaining data confidentiality and requires the use of a key (kept secret) in order to return to plaintext.
There are also two other terms that cause confusion in the world of security: hashing and obfuscation.
Hashing is for validating the integrity of content: any modification shows up as an obvious change in the hash output.
Obfuscation is used to prevent people from understanding the meaning of something, and is often used with computer code to help prevent successful reverse engineering and/or theft of a product’s functionality.
Read more in the Daniel Miessler article.
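The hashing point can be shown with Python's standard `hashlib`. The messages are made-up examples; SHA-256 is just one common choice of hash function.

```python
import hashlib

original = b"transfer 100 to alice"
tampered = b"transfer 900 to alice"  # a single character changed

digest_original = hashlib.sha256(original).hexdigest()
digest_tampered = hashlib.sha256(tampered).hexdigest()

print(digest_original)
print(digest_tampered)

# Same input always gives the same digest; a small change gives a
# completely different one, which is what makes tampering visible.
print(hashlib.sha256(original).hexdigest() == digest_original)
print(digest_original != digest_tampered)
```

Unlike encoding and encryption, hashing is not meant to be reversed: the digest lets you verify content, not recover it.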
See encoding as a way to store or communicate data between different systems. For example, if you want to store text on a hard drive, you're going to have to find a way to convert your characters to bits. Alternatively, if all you have is a flash light, you might want to encode your text using Morse. The result is always "readable", provided you know how it's stored.
Encryption means you want to make your data unreadable to others by transforming it with an algorithm. For example, Caesar did this by substituting each letter with another. The result is unreadable unless you know the secret "key" with which it was encrypted.
I'd say that both operations transform information from one form to another, the difference being:
Encoding means transforming information from one form to another, in most cases it is easily reversible
Encryption means that the original information is obscured and involves encryption keys which must be supplied to the encryption / decryption process to do the transformation.
So, if it involves (symmetric or asymmetric) keys (aka a "secret"), it's encryption, otherwise it's encoding.
Encoding: suppose the data is 16.
Encoded in binary it becomes 10000; other encodings such as ASCII or Unicode work in a similar way.
Any system can read it easily and recover its real meaning.
Encryption: suppose the data is 16.
Encrypted, it might become 3t57 or anything else, depending on which algorithm is used.
Any system can read the result, but only someone who has the decryption key can understand its real meaning.
The two are a little different from each other. Encoding is used when we want to convert text into a specific computer representation; with encryption we hide the data behind a specific key.
Encoding is the process of transforming a given set of characters into an accepted format. Take this question's URL.
This is what we see -->
https://stackoverflow.com/questions/4657416/difference-between-encoding-and-encryption
For transmission this will be transformed to -->
https%3A%2F%2Fstackoverflow.com%2Fquestions%2F4657416%2Fdifference-between-encoding-and-encryption
^ This is an example of URL encoding using the ASCII character set, where
: = %3A
/ = %2F
The reverse of encoding is decoding back to the original form, following the same ASCII standard.
Encryption is the process of converting plaintext to ciphertext so that only an authorized party can decipher it.
For example, a simple HELLO is encrypted into KHOOR if each letter is shifted by 3 positions.
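The URL example can be reproduced with Python's standard library. Passing `safe=""` is a choice made here so that `/` is percent-encoded as well, matching the encoded form shown above.

```python
from urllib.parse import quote, unquote

url = "https://stackoverflow.com/questions/4657416/difference-between-encoding-and-encryption"

# safe="" makes quote() percent-encode "/" and ":" too.
encoded = quote(url, safe="")
print(encoded)

# Decoding needs no key, only knowledge of the scheme.
decoded = unquote(encoded)
print(decoded == url)
```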
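The shift example can be written as a short Python function. This is a sketch that handles uppercase A-Z only; the shift amount plays the role of the key.

```python
def caesar(text: str, shift: int) -> str:
    # Shift each uppercase letter by `shift` positions, wrapping around Z.
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # leave everything else unchanged
    return "".join(out)

print(caesar("HELLO", 3))    # encrypt with key 3
print(caesar("KHOOR", -3))   # decrypt with the same key
```

With only 25 useful keys this is trivially breakable, but it shows the difference from encoding: reversing it requires knowing the shift.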
p.s. Encoding (to code in some form) is form of encryption. :)
Encryption converts data to a non-readable format (possibly containing special non-readable characters).
Encoding helps to convert that data to a readable format (characters) so that it can be stored for future use, i.e. possibly for later decryption.

Resources