lua aes encryption - encryption

I found a "lua aes" solution on the web a while ago, and I have some concerns about its safety.
It states that:
-- Do not use for real encryption, because the password is easily viewable while encrypting.
It says this in its "file encryption test" script.
My questions are:
Why is that? How is it any different from encrypting a string and writing it to a file?
How could it be viewable during encryption? Is it viewable after encryption too?
Basically, is it safe to use or not?
Is there anyone who can confirm this who has used it? I mailed the original developer but the email address was invalid.
Should I be using it at all?

I assume there are two reasons why that recommendation was made:
1. Strings are immutable in Lua, so there is no way to overwrite a string with different data once it's created.
2. In Lua, objects are garbage collected. The garbage collector runs only at certain points in the program, and the application has no way of telling when the garbage collector will run after there are no more references to the object. Until then, the password string will remain in memory by point 1.
See Java's case, which is similar to Lua:
Why is char[] preferred over String for passwords?
As you can see there, using char arrays instead of strings is a better way to store passwords, since arrays are mutable and can be reinitialized to zero when done.
The closest Lua equivalent to a char array is a table filled with numbers. Here the password is stored as a table, rather than a string, where each element in the table consists of the integer representation of each character. For example, "pass" becomes {0x70,0x61,0x73,0x73}. After the table containing the password is used to encrypt or decrypt, it is filled with zeros before it's unreachable by the program and eventually gets garbage collected.
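The same idea can be sketched in Python, where a bytearray plays the role of the Lua table of numbers. This is only an illustration under assumptions: use_password is a hypothetical stand-in for the real encryption call, and wiping clears this one buffer only, not any copies made elsewhere.

```python
def use_password(password: bytearray) -> int:
    """Hypothetical stand-in for the real encrypt/decrypt call."""
    return sum(password)


def wipe(buffer: bytearray) -> None:
    """Overwrite every byte in place so the secret no longer sits in this buffer."""
    for i in range(len(buffer)):
        buffer[i] = 0


# "pass" stored as mutable bytes rather than as an immutable string
password = bytearray([0x70, 0x61, 0x73, 0x73])
try:
    checksum = use_password(password)
finally:
    # Zero the buffer before it becomes unreachable and is garbage collected
    wipe(password)
```

The try/finally makes sure the buffer is zeroed even if the operation that uses the password raises an error.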
According to your comment, I may have misunderstood. Maybe the "file encryption test" stores the password in plain text along with the encrypted file, allowing anyone with access to the file, even attackers, the ability to trivially decrypt it. The points above still apply, though. This is still only a guess, however; I can't know exactly what you mean unless you provide a link to the encryption library you mention.
I've taken a look at the AES library and the concern about the password being "easily viewable" occurs because the user types the password in plain text, through the command line or terminal, in order to start the Lua program, even though the output of the program contains only cipher text. A slightly more secure way of providing the password would be not to show the input (as is done in sudo) or to mask the input with dots or stars (as is done in many Web pages).
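As an illustration of not echoing the input, here is a minimal Python sketch using the standard getpass module. The function name read_password and the injectable prompt_fn parameter are assumptions made for this example; they are not part of the Lua library discussed here.

```python
import getpass


def read_password(prompt_fn=getpass.getpass) -> bytearray:
    """Read a password without echoing it to the terminal.

    prompt_fn defaults to getpass.getpass, which hides the typed characters.
    It is a parameter only so the function can be exercised non-interactively.
    """
    secret = prompt_fn("Password: ")
    # Return mutable bytes so the caller can overwrite them when done
    return bytearray(secret.encode("utf-8"))
```

Calling read_password() with no arguments prompts on the terminal. Note that the intermediate string returned by the prompt is still an immutable string, so this hides the password from onlookers and shell history, not from memory inspection.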
Either that or the points given above are perhaps the only logical explanation.

You may also try out alternate methods, like LuaCrypto, which is a binding to OpenSSL and is able to encrypt data using the AES standard.

Related

Can a brute force attack reliably detect whether a key was right?

My understanding is that for most encryption algorithms there is always an output, regardless of the key. A wrong key will of course produce a wrong output. So when using brute force to decrypt encrypted data, how do hackers know when the key was correct? Is there a way other than analyzing the output data?
If this is the only way, I have these thoughts. When encrypting text, wouldn't it be safer to encrypt at the word level using a dictionary rather than at the bit level as is done today? Then the output would always consist of words. Hackers would need to use complicated and slow algorithms to check the grammar of the output words to determine whether this could be the real written text.
To answer the first part, I will simply quote my old answer to the Super User question "How does Truecrypt know it has the correct password?"
It knows the correct password because within that encrypted container
there is a known header.
When TrueCrypt decrypts a blob of data and the header matches what it
was expecting, it reports back that the decryption was successful. If
you use an incorrect password it will still "decrypt" the text, but it
will decrypt the header into gibberish and fail the decryption check.
Here is a link to the specification, you can see there are many
things that must be true for it to be a valid header (bytes 64-67
after decryption should always be the ASCII value TRUE, bytes
132-251 must all be 0's, etc.). If you decrypt a blob of data
and it does not match that header format, you know the decryption
failed.
So they already do what you were suggesting about "checking the grammar": they attempt to decrypt the message, and if the message "has proper grammar" (the data follows the spec of the encrypted file format), the message was successfully decrypted.
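The known-header check can be sketched as follows. This is a toy, not TrueCrypt's format or algorithm: the XOR "cipher" built from SHA-256 is an insecure stand-in chosen so the example needs only the standard library, and the names MAGIC, seal, and try_open are made up for the illustration.

```python
import hashlib

MAGIC = b"TRUE"  # known marker, echoing the "TRUE" field mentioned in the quote


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts. NOT secure."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def seal(key: bytes, payload: bytes) -> bytes:
    """Encrypt the payload with the known marker placed in front of it."""
    return toy_crypt(key, MAGIC + payload)


def try_open(key: bytes, blob: bytes):
    """Decrypt with any key; accept the result only if the marker is intact."""
    plain = toy_crypt(key, blob)
    if plain[:len(MAGIC)] != MAGIC:
        return None  # wrong key: the header decrypted to gibberish
    return plain[len(MAGIC):]
```

Every key produces some output, but only a key that restores the marker is accepted, which is how a brute-force loop can tell a hit from a miss without reading the text.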
For your 2nd part of "using a dictionary" there are a few important issues.
First, this would only work on plain unformatted text; no binary data or text metadata would be allowed. Second, and more importantly, how do you "create" this dictionary? If you create the dictionary on the fly using the words in the document, tell me what the dictionary would be for the following message:
We attack tomorrow!
You could pad the dictionary with extra words, but how do you choose the padding? If you used an existing fixed dictionary, what if a word is not in the dictionary? What do you do then? What about misspellings?
I have not even begun to touch on how this method is very likely to leak information. As you said, English has rules for grammar: some words occur more often near the end of sentences and some more often near the start. By looking at the numbers used as indexes, you could potentially do a statistical analysis and rule out a portion of the dictionary as words "unlikely" to be used.
I am sure there are many, many other problems with this, but I am only a beginner in crypto and cannot think of any others off the top of my head.
There is an adage in cryptography: "It is easy to create a cypher that you yourself cannot break; it is quite hard to make a cypher that other people cannot break."

AES128 bit encryption string is not similar as on .net

I am implementing AES-128 encryption/decryption in an iOS application for sending data to and receiving data from a .NET server. I am almost done, but during unit testing I ran into an issue: some encrypted strings do not match the ones produced on the .NET server. About 98 percent of the strings are identical on both sides, but the remaining 2 percent differ. When I compare the two, the string generated on iOS is a little shorter and the .NET one is longer. I also found that the iOS string is a substring of the .NET string. When I try to decrypt the iOS-generated string, decryption fails and returns null, but when I decrypt the .NET-generated string (the longer one), I can see the decrypted string.
I am using the same key (16 characters long) on the server and on iOS.
Could you please suggest a solution or point out where I am going wrong?
Thanks a lot to all.
Original string: "custId=10&mode=1"
KEY= "PasswordPassword"
Encrypted string on iOS:
r51TbJpBLYDkcPC+Ei6Rmg==
Encrypted string on .NET:
r51TbJpBLYDkcPC+Ei6RmtY2fuzv3RsHzsXt/RpFxAs=
padding for encryption = kCCOptionPKCS7Padding;
I followed this tutorial.
http://automagical.rationalmind.net/2009/02/12/aes-interoperability-between-net-and-iphone/
A similar question was asked on CryptoSE.
My version (TL;DR):
Essentially, .NET and iOS have different implementations, and since the guide you are following is from 2009, I would expect it to be rather out of date by now, given that there has been at least one major revision of each platform since then.
The original answer there says:
I can immediately think of four reasons:
They're not both using AES256. I see in the Obj-C document a direct statement that they are using AES256 (unless you deliberately change it); I don't see any statement in the Visual Basic document that says what key size they're using (unless that's what they mean by "Block Bits").
Different keys. AES256 takes a key of 256 bits; there's no standard method to take a five-character string and convert that into a 256-bit value. Now, there are a lot of possible methods; there's no particular assurance that they both use the same one.
Different modes of operation. The AES block cipher takes 128-bit values and translates them into 128-bit values. However, not all our messages can fit into 128 bits, and in addition, sometimes there are other things we'd like to do other than message encryption. A mode of operation is a method that takes a block cipher and uses it as a tool to perform some more generally useful function (such as encrypting a much longer message). There are a number of standard modes of operation; the Obj-C document states that it is using CBC mode, while the Visual Basic document has scary-sounding words which might be a garbled explanation of CBC mode.
IVs. Some modes of operation (such as CBC mode) have the encryptor select an "Initialization Vector" randomly; that can be transmitted along with the encrypted message (because the decryptor will need that value). One thing the Initialization Vector does is ensure that if you encrypt the same message a second time, the second ciphertext will not resemble the first ciphertext at all; that way, someone listening will not be able to deduce that you've just repeated a message. The Obj-C document specifically says that it will pick a random IV (unless you give it one yourself).
As you can see, there are a bunch of reasons why the two ciphertexts may be different. One thing you can try: hand the ciphertext from one to the other, and ask them to decrypt it; if they can, you can be pretty sure that both sides are doing basically the same thing.
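The two sample ciphertexts from the question can also be compared directly. The Python sketch below only decodes the Base64 strings and looks at their lengths; it does not run AES. The pkcs7_pad helper is written here for illustration and shows how long the padded plaintext would be under PKCS7. Treat the interpretation as a guess, not a diagnosis.

```python
import base64

BLOCK = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK) -> bytes:
    """PKCS7 always appends 1..block_size bytes, each equal to the pad length."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


plaintext = b"custId=10&mode=1"
ios = base64.b64decode("r51TbJpBLYDkcPC+Ei6Rmg==")
dotnet = base64.b64decode("r51TbJpBLYDkcPC+Ei6RmtY2fuzv3RsHzsXt/RpFxAs=")

print(len(plaintext), len(pkcs7_pad(plaintext)))  # 16 32
print(len(ios), len(dotnet))                      # 16 32
print(dotnet[:BLOCK] == ios)                      # True
```

The sample plaintext is exactly one block long, so PKCS7 would add a whole extra block of padding. The .NET output has that second block and the iOS output does not, while the first block is identical on both sides. That is consistent with the key, mode, and IV matching and with the iOS side dropping or never writing the final padding block, though only inspecting the iOS code can confirm it.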

How to distribute init. vector with data encrypted via a symmetric cipher?

I want to offer users a service that encrypts some data to a file using a symmetric cipher. The user simply provides a key and may also provide an initialization vector for the cipher.
Is there a standard for what the file should look like? One option is to fill the file with only the encrypted data and show the corresponding initialization vector in a dialog window. Someone else might find it more reasonable to store the initialization vector in the file together with the encrypted data.
The important thing for me is that the result is useful to the user and that he/she won't need to bother with adjusting it.
Thanks for any comments!
It is common practice to provide the IV as the first block of the cyphertext file. That way the receiver just treats the first 8 bytes (DES) or 16 bytes (AES) as the IV and the rest of the file as the actual cyphertext.
Use the same format for the IV as you are using for the cyphertext: Base64, hex, byte data or whatever.
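A minimal Python sketch of that layout, assuming a 16-byte IV (the AES block size) and raw bytes as the format. The cipher itself is left out: the ciphertext below is a placeholder, and the function names pack and unpack are made up for this example.

```python
import os


def pack(iv: bytes, ciphertext: bytes) -> bytes:
    """Write the IV as the first block, followed by the ciphertext."""
    return iv + ciphertext


def unpack(blob: bytes, iv_size: int = 16):
    """Split a file's contents back into (iv, ciphertext)."""
    if len(blob) < iv_size:
        raise ValueError("file is too short to contain an IV")
    return blob[:iv_size], blob[iv_size:]


iv = os.urandom(16)          # fresh random IV for each file
ciphertext = b"\x00" * 32    # placeholder standing in for real cipher output
blob = pack(iv, ciphertext)  # this is what would be written to the file
```

For DES the same code applies with iv_size=8.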
In principle, you can use any format you want, as long as the decrypting part of the program knows how to read it. For efficiency, having the initialization vector before the data seems a good idea.
If you want to encrypt files, a good idea would be to not create your own format (which forces you to make decisions like the one here), but to use an existing file format (which then is also a cryptographic protocol).
I recommend the OpenPGP message format, as defined in RFC 4880 (or some subset thereof, if you don't need all features). This also has the advantage that your clients then can decrypt your files using any OpenPGP implementation (like pgp or gpg), if your program somehow ceases to work (of course, only if they have the key/password).
you should be fine if you store the IV together with the encrypted data in the file ...

Can anyone describe the difference between password encryption and hash (sha-256)?

I need to save a password to a database. I am confused about encryption, hashing with SHA-256, and salt generation. If anyone could explain the basic concepts behind these, it would be helpful.
The following is a very basic explanation, anyway...
Encryption is a reversible method of transforming data. So if you have "password", an encryption method converts it into (for example) "ufmehlejw", and you are then able to get "password" back again.
A hash function (SHA-256 is one of them) is a function that, once it has been applied to a string, gives you no way to recover the original string.
A salt is a string which programmers (and not only programmers, of course) usually mix with the given password. It's usually randomly generated. A salt is used to extend the original data before applying a hash function. The goal of the salt is to prevent attackers from recovering a user's original password from a stolen hash by using rainbow tables.
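A minimal Python sketch of salting and hashing a password for storage. One substitution should be named plainly: instead of a single SHA-256 call, it uses PBKDF2 with SHA-256 from the standard hashlib module, which applies the hash many times over the salted password. The iteration count and the function names are assumptions for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your hardware


def hash_password(password: str, salt=None):
    """Return (salt, digest). A fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

The database stores the salt and the digest, never the password. Because each user gets a different salt, the same password produces different digests for different users, which is what defeats precomputed tables.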
In short:
Encryption is a process with an inverse. In other words: If I encrypt some text, there is a process which is able to convert the new text back to the original, called decryption.
Hashing is fundamentally different from encryption, because it does not have such a process. What a hash is meant to do is provide you with a result, which is unique for that given input text (well, almost unique, let's keep it at unique). This way, people can verify if two input texts were equal, without knowing what the actual input text was. So, if people get their hands on your hashed password, they still cannot decrypt it. SHA is a family of methods which provide hashing.
Salts and peppers are merely additional techniques used with hashing: they describe the process of adding something before and after the input text before hashing it. This increases the difficulty of brute-force cracking hashes back to text.
Brute force cracking means simply trying all possible inputs (aa, ab, ac, etc...) and see if you can generate a hash which matches the hash you have gotten via hacking some website or whatever. You can find more on that here: https://security.stackexchange.com/questions/3272/password-hashing-add-salt-pepper-or-is-salt-enough
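The brute-force loop described above can be sketched in a few lines of Python. The target here is an unsalted SHA-256 hash of a two-letter lowercase input, chosen so the search finishes instantly; the function name brute_force and its parameters are made up for the example.

```python
import hashlib
import itertools
import string


def brute_force(target_hex: str, max_len: int = 3, alphabet: str = string.ascii_lowercase):
    """Try every input up to max_len characters; return the one whose hash matches."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            candidate = "".join(combo)
            if hashlib.sha256(candidate.encode("utf-8")).hexdigest() == target_hex:
                return candidate
    return None  # nothing in the search space matched


# A stolen hash whose input the attacker does not know
target = hashlib.sha256(b"ac").hexdigest()
found = brute_force(target)
```

The number of candidates grows exponentially with the input length, which is why long passwords and deliberately slow hash functions make this attack impractical.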

Cryptography: hash of string correlates to hash of substrings of string

I'm processing human-written text documents and I do a dictionary based string matching to find specific strings in the document.
For security reasons, I cannot input the document in unencrypted text format, but only in a strongly encrypted format. I cannot allow developers working on the unit to access the unencrypted input string, but they can access the matched strings.
To make it clearer:
Dictionary = {"Apple", "Apple pie", "World War II"}
Document1 = "apple is my favorite fruit." -> Should match "apple"
Document2 = "apple pie was invented during world war II" -> Should match "apple pie" and "world war II"
So the string matching is case-insensitive and only matches longest occurrence (I'm using Aho-Corasick).
The options I see are:
Find an encryption function F where F("ABCD") = F("A")+F("B")+F("C")+F("D") = F("AB")+F("CD").
Chunk the document by whitespace, hash both the chunks and the dictionary and then look for similarities. (complicated)
Make a separate unit responsible for encryption and string matching with obfuscated code. (most obvious way)
As I'm not good at cryptography, I might be missing something here. Can anyone see a better way of achieving this?
Firstly, any encryption function that satisfies your condition:
F("ABCD") = F("A")+F("B")+F("C")+F("D")
is inherently not strong encryption (assuming + here means concatenation). The problem is that this condition implies that F("A") is invariant, which means that the encryption is equivalent to a simple substitution cipher, vulnerable to frequency analysis.
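To see why, here is a small Python sketch of one function F that satisfies the condition: a fixed letter-for-letter substitution. The seed, the sample text, and the helper names are arbitrary choices for the illustration.

```python
import random
import string
from collections import Counter


def make_substitution(seed: int) -> dict:
    """Build a fixed random mapping from each lowercase letter to another letter."""
    rng = random.Random(seed)
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    rng.shuffle(shuffled)
    return dict(zip(letters, shuffled))


def F(text: str, table: dict) -> str:
    """Encrypt character by character; characters outside the table pass through."""
    return "".join(table.get(ch, ch) for ch in text)


table = make_substitution(1234)
plain = "apple pie was invented during world war ii"
cipher = F(plain, table)

# The condition from the question holds...
assert F("abcd", table) == F("a", table) + F("b", table) + F("c", table) + F("d", table)
# ...and so the letter frequencies survive encryption unchanged
assert sorted(Counter(cipher).values()) == sorted(Counter(plain).values())
```

Because each character always maps to the same output, the frequency profile of the ciphertext is the frequency profile of the plaintext with the labels renamed, and that is exactly what frequency analysis exploits.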
A bigger problem however is that any solution is going to be vulnerable to a dictionary attack. If you can determine that a word in the unknown document is a particular word in your limited dictionary, then you can also search for it in a complete dictionary - in this way, you can quickly discover the entire plaintext.
If I understand correctly, the goal is to prevent someone who has physical access to the machine and access to the processes running on it from being able to determine the contents of the document. I don't think that is possible if the "bad guy" is extremely dedicated. He will be able to extract key information necessary to decrypt the document from the process space. As a general rule, if the attacker has physical access, then there is not a lot that can be done.
If the program can match parts of text of a document to known text, then the attacker will be able to observe that and extract the information. Obfuscation of the code may make it harder, but if the information is valuable enough, then the attacker will just work harder.
It seems that it would be better if the server can be run in a secure fashion and limiting physical access as much as possible. There are, of course, still a lot of issues involved (code would need to be audited for malicious code for example since the developers are apparently not trusted) but that at least gets you to a position that has a chance of being defended.
Edit: A couple of thoughts about encryption in the context of what you are trying to do. If you are using, for example, AES encryption in CBC (cipher block chaining) mode, then it is not possible to decrypt a single word in isolation (assuming the document is encrypted as a whole). Each block of cipher text is chained to the preceding block, so decrypting a block requires the key and the preceding cipher text block, and the words you want to match could be anywhere. In other words, you would have to decrypt the entire document to search it.
Another encryption possibility would be to use AES in CTR mode. CTR mode generates a cipher stream (based on the key, an initialization vector, and a block counter) and XORs that against the plain text to produce the cipher text. In this mode, it is possible to decrypt a portion in the middle of the document without decrypting the previous section, because the stream for any block can be generated directly from its counter value. But that does not help much here. To search the document it is still necessary to generate the cipher stream for every part that is searched. And from an attacker's standpoint, that would be the same as decrypting the document, since the attacker would have access to the encrypted text (presumably in the situation you describe) and the generated XOR stream, which together yield the plain text.
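The structure of a counter-mode stream can be sketched in Python. This is a toy and not AES: SHA-256 over key, nonce, and block index stands in for the block cipher so that the example needs only the standard library, and all names are made up for the illustration.

```python
import hashlib

BLOCK = 32  # size of one keystream block (SHA-256 output), not the AES block size


def keystream_block(key: bytes, nonce: bytes, index: int) -> bytes:
    """Keystream for one block depends only on key, nonce, and the block's counter."""
    return hashlib.sha256(key + nonce + index.to_bytes(8, "big")).digest()


def ctr_xor(key: bytes, nonce: bytes, data: bytes, offset: int = 0) -> bytes:
    """XOR data with the keystream, starting at byte position `offset` of the stream.

    The same call encrypts and decrypts. NOT secure; structure only.
    """
    out = bytearray()
    current_index, block = None, b""
    for i, byte in enumerate(data):
        pos = offset + i
        if pos // BLOCK != current_index:
            current_index = pos // BLOCK
            block = keystream_block(key, nonce, current_index)
        out.append(byte ^ block[pos % BLOCK])
    return bytes(out)


key, nonce = b"demo key", b"demo nonce"
plain = b"apple pie was invented during world war II"
cipher = ctr_xor(key, nonce, plain)

# Decrypt only the word "world" from the middle of the ciphertext
start = plain.index(b"world")
middle = ctr_xor(key, nonce, cipher[start:start + 5], offset=start)
```

Whoever can run the last line holds the key and therefore the keystream, which is the point made above: being able to decrypt a slice does not keep the rest of the document from that party.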
Your proposed solution #1 is a very very difficult problem - known to be solvable, but almost certainly not worth your while to solve.
The technique you would want for it is Homomorphic Encryption. It was first demonstrated in 2009 by Craig Gentry of IBM that arbitrary computation can be performed without revealing the plaintext.
The state-of-the-art is probably too inefficient for almost all applications - while exponential security can be obtained with "polynomial" computation (which is all the theorists really care about), the polynomial is enormous enough to be not valuable. This might change in the near future.
With that said, I don't see any reason why you can't:
hash each entry in the dictionary
(split each entry on whitespace, multiword entries are tuples of hashes)
split document on whitespace, hash each word
do the matching with the hashes
Essentially, you're matching arbitrary items, not inherently words. The client can produce the words-items map, and pass the items to the server. The server doesn't need to know anything about the items, just that an item from the dictionary appears in the text.
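The hash-matching steps above can be sketched in Python as follows. Several things are assumptions for the example: unkeyed SHA-256 as the hash, lowercasing for case-insensitive matching, a simple longest-first scan in place of Aho-Corasick, and no punctuation handling. Unkeyed word hashes also remain open to the dictionary attack mentioned in the other answer.

```python
import hashlib


def h(word: str) -> str:
    """Hash one word; lowercasing first gives case-insensitive matching."""
    return hashlib.sha256(word.lower().encode("utf-8")).hexdigest()


def hash_entry(entry: str) -> tuple:
    """A dictionary entry becomes a tuple of word hashes."""
    return tuple(h(w) for w in entry.split())


def hash_document(text: str) -> list:
    """Client side: split on whitespace and hash each word."""
    return [h(w) for w in text.split()]


def match(dictionary_hashes: list, document_hashes: list) -> list:
    """Server side: return indexes of matched entries, preferring the longest match."""
    by_length = sorted(range(len(dictionary_hashes)),
                       key=lambda i: -len(dictionary_hashes[i]))
    found, pos = [], 0
    while pos < len(document_hashes):
        for idx in by_length:
            entry = dictionary_hashes[idx]
            if tuple(document_hashes[pos:pos + len(entry)]) == entry:
                found.append(idx)
                pos += len(entry)
                break
        else:
            pos += 1  # no entry starts at this word
    return found


dictionary = ["Apple", "Apple pie", "World War II"]
dictionary_hashes = [hash_entry(e) for e in dictionary]

doc1 = hash_document("apple is my favorite fruit.")
doc2 = hash_document("apple pie was invented during world war II")
print(match(dictionary_hashes, doc1))  # [0]
print(match(dictionary_hashes, doc2))  # [1, 2]
```

The server sees only hashes and reports which dictionary indexes occurred; mapping an index back to "Apple pie" requires the dictionary, which the developers are allowed to see.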
