I'm the author of a couple of Evernote extensions, and I want to use encrypted text: create it on my end and add it to a note using the ENML structure and the en-crypt element.
However, I'm unable to figure out the structure of the en-crypt CDATA.
If you look at the .enex file after exporting the note, the enml element of an encrypted text looks like this:
<en-crypt hint="My Cat's Name">NKLHX5yK1MlpzemJQijAN6C4545s2EODxQ8Bg1r==</en-crypt>
I've read through this
https://help.evernote.com/hc/en-us/articles/208314128-What-type-of-encryption-does-Evernote-use
which explains that the key is derived with PBKDF2 and that the encryption uses AES-256 in CBC mode with a unique salt. However, looking at the example above, I can't work out how that data is stored there.
If I count correctly, there are two unique salts and one IV, as well as the encrypted text, to store there. So my question is: how can one make use of that block? There might be a standard way of storing all that information in one Base64 block, but I'm no encryption expert, so any help is appreciated.
Thanks.
For old, RC2-based encryption (and if your <en-crypt> tag doesn't have the cipher="AES" attribute, this is the RC2-based encryption), the algorithm roughly looks like this:
decodedString = RC2.decrypt(Base64.decode(encodedString), MD5.hash(UTF8.encode(passphrase)), 64);
In the decoded string, the first 4 characters are the hex representation of the upper bytes of the CRC32 of the rest of the decoded string (4 hex digits cover two bytes).
One caveat: when calculating CRC32, you may need to XOR it with -1 (0xFFFFFFFF) as there are different implementations of CRC32 in the wild that may or may not do this final XOR.
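The integrity check described above can be sketched in Python. This covers only the CRC32 step, not RC2 itself (Python's standard library has no RC2). The function names are hypothetical, and the sketch assumes the 4-character prefix holds the upper 16 bits of the CRC32 as hex. Note that `zlib.crc32` already applies the final XOR mentioned in the caveat.

```python
import zlib

def crc_prefix(body: bytes) -> str:
    """Upper 16 bits of CRC32(body) as 4 hex characters (assumed layout)."""
    crc = zlib.crc32(body) & 0xFFFFFFFF  # zlib already applies the final XOR
    return format(crc >> 16, "04X")

def check_decrypted(decoded: bytes) -> bool:
    """Return True if the 4-character hex prefix matches the CRC32 of the rest."""
    prefix, body = decoded[:4], decoded[4:]
    # Compare case-insensitively, since the hex casing is an assumption.
    return prefix.decode("ascii", errors="replace").upper() == crc_prefix(body)

# Simulate a correctly decrypted string: prefix followed by the note body.
body = b"<div>secret note</div>"
good = crc_prefix(body).encode("ascii") + body
print(check_decrypted(good))
```

If the check fails for every passphrase, it may be worth trying the CRC without the final XOR, as the caveat suggests.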
My understanding is that for most encryption algorithms there is always an output, regardless of the key. A wrong key will of course produce a wrong output. So when using brute force to decrypt encrypted data, how do hackers know when the key was correct? Is there a way other than analyzing the output data?
If this is the only way, I have this thought: when encrypting text, wouldn't it be safer to encrypt at the word level using a dictionary rather than at the bit level, as is done today? Then the output would always consist of words. Hackers would need to use complicated and slow algorithms to check grammar in the output words to determine whether this could be the real written text.
To answer the first part, I'll simply quote my old answer to the Super User question "How does Truecrypt know it has the correct password?"
It knows the correct password because within that encrypted container
there is a known header.
When TrueCrypt decrypts a blob of data and the header matches what it
was expecting, it reports back that the decryption was successful. If
you use an incorrect password it will still "decrypt" the text, but it
will decrypt the header into gibberish and fail the decryption check.
Here is a link to the specification; you can see there are many
things that must be true for it to be a valid header (bytes 64-67
after decryption should always be the ASCII value TRUE, bytes
132-251 must all be 0's, etc.). If you decrypt a blob of data
and it does not match that header format, you know the decryption
failed.
So they already do what you were suggesting about "checking the grammar": they attempt to decrypt the message, and if the message "has proper grammar" (the data follows the spec of the encrypted file format), the message was successfully decrypted.
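A toy sketch of the known-header check, in Python. The "cipher" here is a throwaway XOR keystream built from SHA-256, used purely for illustration and not secure; the `MAGIC` marker, the function names, and the candidate passwords are all hypothetical.

```python
import hashlib

MAGIC = b"TRUE"  # hypothetical known header, in the spirit of TrueCrypt's marker

def keystream(password: str, length: int) -> bytes:
    """Derive a toy keystream from the password (illustration only, not secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = password.encode("utf-8") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]

def xor_crypt(data: bytes, password: str) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(password, len(data))))

def try_decrypt(blob: bytes, password: str):
    """Always produces output, but only accepts it if the header matches."""
    plain = xor_crypt(blob, password)
    if plain.startswith(MAGIC):
        return plain[len(MAGIC):]
    return None  # header decrypted to gibberish: wrong password

blob = xor_crypt(MAGIC + b"We attack tomorrow!", "hunter2")
for guess in ["password", "letmein", "hunter2"]:
    print(guess, "->", try_decrypt(blob, guess))
```

Every guess yields some bytes; the header comparison is what tells the attacker (or the legitimate program) which guess was right.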
For your 2nd part of "using a dictionary" there are a few important issues.
First, this would only work on plain unformatted text; no binary data or text metadata would be allowed. Second, and more importantly, how do you "create" this dictionary? If you create the dictionary on the fly using the words in the document, tell me what the dictionary would be for the following message:
We attack tomorrow!
You could pad the dictionary with extra words, but how do you choose the padding? If you used an existing fixed dictionary, what if a word is not in the dictionary? What do you do then? What about misspellings?
I have not even begun to touch on how this method is very likely to leak information. As you said, English has a set of grammar rules: some words occur more often near the end of sentences and some more often near the start. By looking at the numbers used as the indexes, you could potentially do a statistical analysis and rule out a portion of the dictionary as "unlikely" to be used.
I am sure there are many other problems with this, but I am only a beginner in crypto and I cannot think of any others off the top of my head.
There is an adage in cryptography: "It is easy for you to create a cypher that you yourself cannot break; it is quite hard for you to make a cypher that other people cannot break."
I'm trying to figure something out. I have a legacy system in place and I'm not using all of it. There are business reasons why we use things this way.
Some fields in the system get encrypted by a piece of middleware that I ultimately would like to replace. I can't replace this part of the system because I can't decrypt the values properly.
For example I have a field that contains the word:
ferret
This is encrypted and becomes:
^ADFJBLFOHLOJFNHHKFJLHFJNPCJFJCPFBAPEKDKM
The words
wellington boot
become
^KOKFDEJPAAPFJHPOIGOICOAHKFLNFHMIOJNHAAHF
I can see the unencrypted data and I can see the resulting encrypted data, but I am trying to find what algorithm was used to turn the field value into the encrypted version. The main reason is that I have a requirement to massively increase the number of fields that contain encrypted data. At the moment I can't, because I cannot replace the existing encryption mechanism without knowing what was used to encrypt the data.
There is simply too much data in the system to go through, load up each record, and make a note of the unencrypted data so that I can build a new encryption mechanism.
If I knew how the existing data was encrypted I could use the same method to encrypt my new fields. The system encrypts certain fields only and my extension to the system needs to encrypt others using the same method.
How can I do this? Is it even possible to find out how the data was encrypted and what method was used?
It is SHA1, translated into A for 0, B for 1, C for 2, etc. For example, your "wellington boot" example has the SHA1 hash of "aea5349f00f..." which is clearly "KOKFDEJPAAP..."
So you can just use SHA1 and do the same translation to continue the pattern.
To check this, try the phrase "test phrase" - the SHA1 of this is "ab8f37d89b1154ba18c78a7e4b8eef2acdfec1eb", which becomes "KLIPDHNIJL..." in your system.
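A minimal sketch of that translation in Python. The function names are hypothetical, and two details are assumptions based only on the samples in the question: that the text is hashed as UTF-8 and that the stored value carries a leading `^`.

```python
import hashlib

def hex_to_letters(hex_digest: str) -> str:
    """Map each hex digit to a letter: 0 -> A, 1 -> B, ..., f -> P."""
    return "".join(chr(ord("A") + int(digit, 16)) for digit in hex_digest)

def legacy_encode(value: str) -> str:
    """SHA-1 of the text, letter-mapped, with the '^' prefix seen in the samples."""
    digest = hashlib.sha1(value.encode("utf-8")).hexdigest()  # UTF-8 is assumed
    return "^" + hex_to_letters(digest)

# The hex fragments quoted in the answer map to the letters it gives.
print(hex_to_letters("aea5349f00f"))  # KOKFDEJPAAP
print(hex_to_letters("ab8f37d89b"))   # KLIPDHNIJL
print(legacy_encode("wellington boot"))
```

Comparing the last line against a few known field values from the live system would confirm or refute the encoding and prefix assumptions.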
I found a "lua aes" solution on the web a while ago, and I have some concerns about its safety.
It states that:
-- Do not use for real encryption, because the password is easily viewable while encrypting.
It says this in its "file encryption test" script.
My questions are:
Why is that, how is it any different from encrypting a string and writing it to a file?
How could it be viewable during encryption? Is it viewable after encryption too?
Basically, is it safe to use or not?
Is there anyone who can confirm this who has used it? I mailed the original developer but the email address was invalid.
Should I be using it at all?
I assume there are two reasons why that recommendation was made:
1. Strings are immutable in Lua, so there is no way to overwrite a string with different data once it's created.
2. In Lua, objects are garbage collected. The garbage collector runs only at certain points in the program, and the application has no way of telling when the garbage collector will run after there are no more references to the object. Until then, the password string will remain in memory by point 1.
See Java's case, which is similar to Lua:
Why is char[] preferred over String for passwords?
As you can see there, using char arrays instead of strings is a better way to store passwords, since arrays are mutable and can be reinitialized to zero when done.
The closest Lua equivalent to a char array is a table filled with numbers. Here the password is stored as a table, rather than a string, where each element in the table consists of the integer representation of each character. For example, "pass" becomes {0x70,0x61,0x73,0x73}. After the table containing the password is used to encrypt or decrypt, it is filled with zeros before it's unreachable by the program and eventually gets garbage collected.
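The same pattern can be sketched in Python rather than Lua (a substitution made only for illustration; the discussion above concerns Lua tables). A mutable `bytearray` plays the role of the table of numbers, and the hypothetical `wipe` helper zeroes it in place once it has been used. This does not guarantee that no other copy of the secret exists elsewhere in the process.

```python
def wipe(buffer: bytearray) -> None:
    """Overwrite every byte in place so the secret no longer sits in this buffer."""
    for i in range(len(buffer)):
        buffer[i] = 0

# "pass" stored as mutable byte values, as in the example above.
password = bytearray([0x70, 0x61, 0x73, 0x73])
try:
    # Stand-in for handing the buffer to an encrypt/decrypt routine.
    used_length = len(password)
finally:
    wipe(password)  # zero the buffer before it becomes unreachable

print(password)
```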
According to your comment, I may have misunderstood. Maybe the "file encryption test" stores the password in plain text along with the encrypted file, allowing anyone with access to the file, even attackers, the ability to trivially decrypt it. The points above still apply, though. This is still only a guess, however; I can't know exactly what you mean unless you provide a link to the encryption library you mention.
I've taken a look at the AES library and the concern about the password being "easily viewable" occurs because the user types the password in plain text, through the command line or terminal, in order to start the Lua program, even though the output of the program contains only cipher text. A slightly more secure way of providing the password would be not to show the input (as is done in sudo) or to mask the input with dots or stars (as is done in many Web pages).
Either that or the points given above are perhaps the only logical explanation.
You may also try out alternate methods, like LuaCrypto, which is a binding to OpenSSL and is able to encrypt data using the AES standard.
I want to provide the user with a service for encrypting some data to a file with a symmetric cipher. The user simply provides a key and may also provide an initialization vector for the cipher.
Is there a standard for what the file should look like? It would make sense to fill the file with the encrypted data and show the corresponding initialization vector in a dialog window. Someone else might find it more reasonable to store the initialization vector in the file with the encrypted data.
The important thing for me is that the result is useful to the user, who shouldn't need to bother adjusting the result.
Thanks for any comments!
It is common practice to provide the IV as the first block of the cyphertext file. That way the receiver just treats the first 8 bytes (DES) or 16 bytes (AES) as the IV and the rest of the file as the actual cyphertext.
Use the same format for the IV as you are using for the cyphertext: Base64, hex, byte data or whatever.
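A minimal sketch of that layout in Python, assuming a 16-byte IV as for AES. The `pack` and `unpack` names are hypothetical, and the ciphertext is a placeholder, since the actual encryption would come from a cipher library.

```python
import os
from typing import Tuple

IV_SIZE = 16  # AES block size in bytes; use 8 for DES

def pack(iv: bytes, ciphertext: bytes) -> bytes:
    """Write the IV as the first block, followed by the ciphertext."""
    if len(iv) != IV_SIZE:
        raise ValueError("IV must be exactly one block long")
    return iv + ciphertext

def unpack(blob: bytes) -> Tuple[bytes, bytes]:
    """Split a file's contents back into (iv, ciphertext)."""
    if len(blob) < IV_SIZE:
        raise ValueError("file too short to contain an IV")
    return blob[:IV_SIZE], blob[IV_SIZE:]

iv = os.urandom(IV_SIZE)
ciphertext = b"\x00" * 32  # placeholder; real ciphertext comes from the cipher
blob = pack(iv, ciphertext)
recovered_iv, recovered_ct = unpack(blob)
print(recovered_iv == iv, recovered_ct == ciphertext)
```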
In principle, you can use any format you want, as long as the decrypting part of the program knows how to read it. For efficiency, having the initialization vector before the data seems a good idea.
If you want to encrypt files, a good idea would be to not create your own format (which forces you to make decisions like the one here), but to use an existing file format (which then is also a cryptographic protocol).
I recommend the OpenPGP message format, as defined in RFC 4880 (or some subset thereof, if you don't need all features). This also has the advantage that your clients then can decrypt your files using any OpenPGP implementation (like pgp or gpg), if your program somehow ceases to work (of course, only if they have the key/password).
You should be fine if you store the IV together with the encrypted data in the file.
What is the difference between encoding and encryption?
Encoding transforms data into another format using a scheme that is publicly available so that it can easily be reversed.
Encryption transforms data into another format in such a way that only specific individual(s) can reverse the transformation.
In summary:
Encoding is for maintaining data usability and uses schemes that are publicly available.
Encryption is for maintaining data confidentiality, and thus the ability to reverse the transformation (the keys) is limited to certain people.
More details in SOURCE
Encoding:
Purpose: The purpose of encoding is to transform data so that it can be properly (and safely) consumed by a different type of system.
Used for: Maintaining data usability i.e., to ensure that it is able to be properly consumed.
Data Retrieval Mechanism: No key and can be easily reversed provided we know what algorithm was used in encoding.
Algorithms Used: ASCII, Unicode, URL Encoding, Base64.
Example: Binary data being sent over email, or viewing special characters on a web page.
Encryption:
Purpose: The purpose of encryption is to transform data in order to keep it secret from others.
Used for: Maintaining data confidentiality i.e., to ensure the data cannot be consumed by anyone other than the intended recipient(s).
Data Retrieval Mechanism: Original data can be obtained if we know the key and encryption algorithm used.
Algorithms Used: AES, Blowfish, RSA.
Example: Sending someone a secret letter that only they should be able to read, or securely sending a password over the Internet.
Reference URL: http://danielmiessler.com/study/encoding_vs_encryption/
Encoding is the process of transforming data so that it may be transmitted without danger over a communication channel or stored without danger on a storage medium. For instance, computer hardware does not manipulate text, it merely manipulates bytes, so a text encoding is a description of how text should be transformed into bytes. Similarly, HTTP does not allow all characters to be transmitted safely, so it may be necessary to encode data using base64 (uses only letters, numbers and two safe characters).
When encoding or decoding, the emphasis is placed on everyone having the same algorithm, and that algorithm is usually well-documented, widely distributed and fairly easily implemented. Anyone is eventually able to decode encoded data.
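As a small illustration of that point, here is Base64 in Python: no key is involved, and anyone with the standard algorithm can reverse it.

```python
import base64

raw = b"hello"
encoded = base64.b64encode(raw)      # bytes made safe for text-only channels
decoded = base64.b64decode(encoded)  # reversible by anyone, no secret needed

print(encoded)  # b'aGVsbG8='
print(decoded)  # b'hello'
```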
Encryption, on the other hand, applies a transformation to a piece of data that can only be reversed with specific (and secret) knowledge of how to decrypt it. The emphasis is on making it hard for anyone but the intended recipient to read the original data. An encoding algorithm that is kept secret is a form of encryption, but quite vulnerable (it takes skill and time to devise any kind of encryption, and by definition you can't have someone else create such an encoding algorithm for you - or you would have to kill them). Instead, the most used encryption method uses secret keys : the algorithm is well-known, but the encryption and decryption process requires having the same key for both operations, and the key is then kept secret. Decrypting encrypted data is only possible with the corresponding key.
Encoding is the process of putting a sequence of characters into a special format for transmission or storage purposes.
Encryption is the process of translating data into a secret code. Encryption is the most effective way to achieve data security. To read an encrypted file, you must have access to a secret key or password that enables you to decrypt it. Unencrypted data is called plain text; encrypted data is referred to as cipher text.
Encoding is for maintaining data usability and can be reversed by employing the same algorithm that encoded the content, i.e. no key is used.
Encryption is for maintaining data confidentiality and requires the use of a key (kept secret) in order to return to plaintext.
There are also two other terms that cause confusion in the world of security: hashing and obfuscation.
Hashing is for validating the integrity of content by detecting all modification thereof via obvious changes to the hash output.
Obfuscation is used to prevent people from understanding the meaning of something, and is often used with computer code to help prevent successful reverse engineering and/or theft of a product’s functionality.
Read more: Daniel Miessler's article
See encoding as a way to store or communicate data between different systems. For example, if you want to store text on a hard drive, you're going to have to find a way to convert your characters to bits. Alternatively, if all you have is a flash light, you might want to encode your text using Morse. The result is always "readable", provided you know how it's stored.
Encryption means you want to make your data unreadable by encrypting it with an algorithm. For example, Caesar did this by substituting each letter with another. The result here is unreadable unless you know the secret "key" with which it was encrypted.
I'd say that both operations transform information from one form to another, the difference being:
Encoding means transforming information from one form to another, in most cases it is easily reversible
Encryption means that the original information is obscured and involves encryption keys which must be supplied to the encryption / decryption process to do the transformation.
So, if it involves (symmetric or asymmetric) keys (aka a "secret"), it's encryption, otherwise it's encoding.
Encoding -> example: the data is 16.
Its encoding is 10000 in binary format (or it could be ASCII, Unicode, etc.),
which any system can read easily and whose real meaning is easy to understand.
Encryption -> example: the data is 16.
Its encrypted form might be 3t57, or anything else, depending on which algorithm is used for encryption.
Any system can read it easily, BUT only someone who has the decryption key can understand its real meaning.
These are a little bit different from each other: encoding is used when we want to convert text using a specific computer coding technique, while with encryption we hide data using a specific key or text.
Encoding is the process of transforming a given set of characters into a relevant accepted format. Take this question's URL:
This is what we see -->
https://stackoverflow.com/questions/4657416/difference-between-encoding-and-encryption
Over transmission this will be transformed to -->
https%3A%2F%2Fstackoverflow.com%2Fquestions%2F4657416%2Fdifference-between-encoding-and-encryption
The above is an example of URL encoding using the ASCII character set, where:
: = %3A
/ = %2F
The reverse of encoding is decoding back to the original form, using the same ASCII standard.
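The same transformation can be reproduced with Python's standard library. Passing `safe=""` is a choice made here so that `/` is escaped too, matching the encoded form shown above.

```python
from urllib.parse import quote, unquote

url = "https://stackoverflow.com/questions/4657416/difference-between-encoding-and-encryption"

encoded = quote(url, safe="")  # escape ':' and '/' as %3A and %2F
decoded = unquote(encoded)     # decoding needs no key, only the standard

print(encoded)
print(decoded == url)
```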
Encryption is the process of converting plain text to cipher text so that only an authorized party can decipher it.
For example, a simple HELLO is encrypted into KHOOR if each letter is shifted by 3 positions.
P.S. Encoding (to code in some form) is a form of encryption. :)
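The HELLO to KHOOR example above is a Caesar shift, where the shift amount acts as the key. A minimal sketch in Python (the `caesar` function name is hypothetical, and this cipher is trivially breakable, so it is for illustration only):

```python
def caesar(text: str, shift: int) -> str:
    """Shift each ASCII letter by `shift` positions, wrapping around the alphabet."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)  # leave digits, spaces and punctuation unchanged
    return "".join(out)

print(caesar("HELLO", 3))   # KHOOR
print(caesar("KHOOR", -3))  # HELLO
```

Decrypting requires knowing the shift, which is what separates this from an encoding, however weak the secret is.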
Encryption converts data to non-readable format (Possibly containing special non-readable characters).
Encoding helps to convert that data to readable format (characters) so that it can be stored for future use i.e. possibly during decryption.