Encrypting SQLite

I am going to write my own encryption, but would like to discuss some internals first. It should work on several mobile platforms (iOS, Android, WP7), with the desktop serving more or less as a test platform.
Let's start with a brief characterization of the existing solutions:
SQLite's standard (commercial) SEE extension: I have no idea how it works internally or how it cooperates with the mobile platforms mentioned above.
System.data.sqlite (Windows only): RC4 encryption of the complete DB, ECB mode. They also encrypt the DB header, which occasionally (0.01% chance) leads to DB corruption.*) Additional advantage: they use the SQLite amalgamation distribution.
SqlCipher (OpenSSL-based, i.e. available on several platforms): Selectable encryption scheme. They encrypt the whole DB. CBC mode (I think), random IV. Because of this, they must modify the page parameters (size + reserved space to store the IV). They realized the problems related to unencrypted reading of the DB header and tried to introduce workarounds, yet the solution is unsatisfactory. Additional disadvantage: they use the SQLite3 source tree. (Which, on the other hand, enables additional features, e.g. fine-tuning of the encryption parameters using special pragmas.)
Based on my own analysis, I think the following could be a good solution that would not suffer from the above-mentioned problems:
Encrypting whole DB except the DB header.
ECB mode: Sounds risky, but after briefly looking at the DB format I cannot imagine how this could be exploited for an attack.
AES128?
Implementation on top of the SQLite amalgamation (similarly to system.data.sqlite)
I'd like to discuss possible problems of this encryption scheme.
*) Caused by SQLite reading the DB header without decryption. Because RC4 is a stream cipher, this problem manifests only at the very first use. AES would be a lot more dangerous, as every "live" DB would sooner or later face this problem.
EDITED - case of VFS-based encryption
The above-mentioned methods use the codec-based methodology endorsed by sqlite.org. It is a set of three callbacks, the most important being this one:
void *(*xCodec)(void *iCtx, void *data, Pgno pgno, int mode)
This callback is used at SQLite's discretion for encrypting/decrypting data read from/written to the disk. The data is exchanged page by page. (A page is a multiple of 512 bytes.)
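To make the call pattern concrete, here is a minimal codec skeleton, written as if compiled into the amalgamation (so SQLite's internal Pgno type and headers are in scope). The mode values (3 = decrypt a page just read, 6 = encrypt a page written to the DB file, 7 = encrypt a page written to the journal) follow the convention of SQLCipher-style codecs, and page_encrypt/page_decrypt are hypothetical page ciphers; treat this as a sketch, not a reference implementation:

typedef struct CodecCtx CodecCtx;
struct CodecCtx {
  unsigned char key[16];        /* key material */
  unsigned char *write_buf;     /* page-sized scratch buffer for write modes */
  int page_size;
};

/* Hypothetical page ciphers -- any cipher operating on a whole page would do. */
void page_encrypt(CodecCtx *ctx, unsigned char *page, unsigned pgno);
void page_decrypt(CodecCtx *ctx, unsigned char *page, unsigned pgno);

static void *my_codec(void *pCtx, void *data, Pgno pgno, int mode) {
  CodecCtx *ctx = (CodecCtx *)pCtx;
  unsigned char *page = (unsigned char *)data;
  if (pgno == 1) return data;               /* simplification: leave page 1 (DB header) alone */
  switch (mode) {
    case 3:                                 /* a page was just read from disk: decrypt in place */
      page_decrypt(ctx, page, pgno);
      return data;
    case 6:                                 /* a page is about to be written to the DB file */
    case 7:                                 /* a page is about to be written to the journal */
      memcpy(ctx->write_buf, page, ctx->page_size);
      page_encrypt(ctx, ctx->write_buf, pgno);
      return ctx->write_buf;                /* encrypt a copy; the cached page stays plaintext */
    default:
      return data;
  }
}

Note that the write modes return an encrypted copy rather than touching the page in place, because the page cache must keep the plaintext. Wiring a codec in typically also means building with SQLITE_HAS_CODEC and supplying the sqlite3_key()/sqlite3CodecAttach() family that the pager then expects.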
An alternative option is to use a VFS. A VFS is a set of callbacks used for low-level OS services. Among them are several file-related services, e.g. xOpen/xRead/xWrite/xClose. In particular, here are the methods used for data exchange:
int (*xRead)(sqlite3_file*, void*, int iAmt, sqlite3_int64 iOfst);
int (*xWrite)(sqlite3_file*, const void*, int iAmt, sqlite3_int64 iOfst);
The data size in these calls ranges from 4 bytes (a frequent case) to the DB page size. If you want to use a block cipher (what else would you use?), then you need to organize an underlying block cache. I cannot imagine an implementation that would be as safe and as efficient as SQLite's built-in transactions.
A second problem: the VFS implementation is platform-dependent. Android/iOS/WP7/desktop all use different sources, i.e. VFS-based encryption would have to be implemented platform by platform.
The next problem is more subtle: a platform may use VFS calls to implement file locks. These uses must not be encrypted. Moreover, shared locks must not be buffered. In other words, encryption at the VFS level might compromise locking functionality.
EDITED - plaintext attack on VFS-based encryption
I realized this later: the DB header starts with the fixed string "SQLite format 3" and contains a lot of other fixed byte values. This opens the door to known-plaintext attacks (KPA).
This is mainly a problem for VFS-based encryption, as it has no information that it is the DB header it is encrypting.
System.data.sqlite has this problem as well, since it also encrypts the DB header (with RC4).
SqlCipher overwrites the header string with the salt used to derive the key from the password. Moreover, it uses AES by default, hence a KPA presents no danger.

You don't need to hack the DB format or the SQLite source code. SQLite exposes a virtual file-system (VFS) API, which can be used to wrap the file system (or another VFS) with an encryption layer that encrypts/decrypts pages on the fly. When I did that, it turned out to be a very simple task, just a hundred lines of code or so. This way the whole DB will be encrypted, including the journal file, and it is completely transparent to any client code. With the typical page size of 1024, almost any known block cipher can be used. From what I can conclude from their docs, this is exactly what SQLCipher does.
Regarding the 'problems' you see:
You don't need to reimplement file-system support; you can wrap around the default VFS. So there are no problems with locks or platform dependence.
SQLite's default OS backend is also a VFS, so there is no overhead in using a VFS beyond what you add yourself.
You don't need a block cache. Of course you will have to read a whole block when it asks for just 4 bytes, but don't cache it; it will never be read again. SQLite has its own cache to prevent that (the Pager module).
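To give an idea of what such a wrapping VFS looks like, here is a heavily abridged sketch of its read/write half. The struct and helper names are made up, the placeholder XOR "cipher" only marks where a real offset-addressable cipher would go, and the xOpen/xClose plumbing plus the sqlite3_vfs_register() call are omitted:

#include <string.h>
#include "sqlite3.h"

typedef struct EncFile EncFile;
struct EncFile {
  sqlite3_file base;           /* must come first: SQLite treats this as the file handle */
  sqlite3_file *real;          /* the file handle opened by the underlying (default) VFS */
  unsigned char key[16];
};

/* Placeholder cipher keyed by absolute file offset. NOT secure -- it only shows
 * where a real cipher that can start at any offset (e.g. a block cipher in CTR
 * mode with the offset folded into the counter) would be applied. */
static void offset_cipher(EncFile *f, unsigned char *buf, int amt, sqlite3_int64 ofst) {
  for (int i = 0; i < amt; i++) buf[i] ^= f->key[(ofst + i) % (sqlite3_int64)sizeof f->key];
}

static int encRead(sqlite3_file *pFile, void *buf, int amt, sqlite3_int64 ofst) {
  EncFile *f = (EncFile *)pFile;
  int rc = f->real->pMethods->xRead(f->real, buf, amt, ofst);
  if (rc == SQLITE_OK) offset_cipher(f, (unsigned char *)buf, amt, ofst);  /* decrypt */
  return rc;
}

static int encWrite(sqlite3_file *pFile, const void *buf, int amt, sqlite3_int64 ofst) {
  EncFile *f = (EncFile *)pFile;
  unsigned char tmp[65536];                /* >= maximum SQLite page size */
  if (amt > (int)sizeof tmp) return SQLITE_IOERR_WRITE;
  memcpy(tmp, buf, amt);
  offset_cipher(f, tmp, amt, ofst);        /* encrypt a copy, never the caller's buffer */
  return f->real->pMethods->xWrite(f->real, tmp, amt, ofst);
}

A real shim forwards all the other sqlite3_io_methods (xLock, xSync, xFileSize, ...) to the wrapped handle unchanged, which is what keeps locking and the rest of the platform behaviour intact.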

Didn't get much response, so here is my decision:
Own encryption (AES128), CBC mode
Codec interface (same as used by SqlCipher or system.data.sqlite)
DB header unencrypted
Page headers unencrypted as well and used for IV generation
Using amalgamation SQLite distribution
AFAIK this solution should be better than either SqlCipher or system.data.sqlite.
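For concreteness, here is one way to read that decision as a per-page routine, sketched with OpenSSL's EVP API. The 16-byte unencrypted prefix, the IV derivation, and the helper names are assumptions made for illustration; a real implementation would sit behind the codec callback discussed above and should derive the IV with a proper keyed hash:

#include <string.h>
#include <openssl/evp.h>

/* Number of leading bytes of each page left unencrypted ("page header").
 * 16 is an assumption for this sketch: SQLite b-tree page headers are 8 or 12
 * bytes, and page 1 additionally starts with the 100-byte DB file header. */
#define PAGE_HDR 16

/* Derive a 16-byte IV from the page number and the plaintext page header.
 * Illustrative only -- a real scheme should use a keyed hash (PRF) here. */
static void derive_iv(unsigned char iv[16], unsigned pgno, const unsigned char *hdr) {
  memcpy(iv, hdr, 16);
  for (int i = 0; i < 4; i++) iv[i] ^= (unsigned char)(pgno >> (8 * i));
}

/* Encrypt one page in place with AES-128-CBC, leaving its first PAGE_HDR bytes
 * in the clear. The encrypted body is already a multiple of the AES block size
 * for all SQLite page sizes (512..65536), so padding is disabled.
 * Returns 0 on success, -1 on error. */
static int encrypt_page(const unsigned char key[16], unsigned pgno,
                        unsigned char *page, int page_size) {
  unsigned char iv[16];
  int len = 0, fin = 0, rc = -1;
  EVP_CIPHER_CTX *c = EVP_CIPHER_CTX_new();
  if (!c) return -1;
  derive_iv(iv, pgno, page);
  if (EVP_EncryptInit_ex(c, EVP_aes_128_cbc(), NULL, key, iv) == 1) {
    EVP_CIPHER_CTX_set_padding(c, 0);
    if (EVP_EncryptUpdate(c, page + PAGE_HDR, &len,
                          page + PAGE_HDR, page_size - PAGE_HDR) == 1 &&
        EVP_EncryptFinal_ex(c, page + PAGE_HDR + len, &fin) == 1) {
      rc = 0;
    }
  }
  EVP_CIPHER_CTX_free(c);
  return rc;
}

Decryption is the mirror image (EVP_DecryptInit_ex/Update/Final with the same derived IV), which works precisely because the page header used for the IV stays in the clear.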

Related

SonarQube: Make sure that encrypting data is safe here. AES/GCM/NoPadding, RSA/ECB/PKCS1Padding

I'm using:
1. RSA/ECB/PKCS1Padding
2. AES/GCM/NoPadding
to encrypt my data in my Android (Java) application. The SonarQube documentation states:
The Advanced Encryption Standard (AES) encryption algorithm can be used with various modes. Galois/Counter Mode (GCM) with no padding should be preferred to the following combinations which are not secured:
Electronic Codebook (ECB) mode: Under a given key, any given
plaintext block always gets encrypted to the same ciphertext block.
Thus, it does not hide data patterns well. In some senses, it doesn't
provide serious message confidentiality, and it is not recommended
for use in cryptographic protocols at all.
Cipher Block Chaining (CBC) with PKCS#5 padding (or PKCS#7) is
susceptible to padding oracle attacks.
So, as recommended, I use AES/GCM/NoPadding as:
Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
But it still gives me the warning "Make sure that encrypting data is safe here".
The same for:
Cipher c = Cipher.getInstance("RSA/ECB/PKCS1Padding");
Why does SonarQube throw that warning?
Aren't these uses safe anymore?
AES in GCM mode is secure as an encryption algorithm, but that doesn't guarantee that the code that encrypts data using AES (in GCM mode) is secure. Several things can go wrong, leaving the code vulnerable to attacks. It is the developers' responsibility to code it in the right way to get the desired level of security. Some examples where things can go wrong are:
The IV repeats for a given key
The key or the raw data is stored in a String, which keeps lingering in the heap
The secret key is stored in clear text in a property file that goes in the code repository
and so on.
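The question is about Java, but the first pitfall is language-independent: never reuse an IV under the same key. As a rough sketch of the pattern (shown here in C with OpenSSL; the Java answer linked in the P.S. below covers the same ground for javax.crypto), a fresh random 12-byte IV is generated for every message and shipped alongside the ciphertext and tag:

#include <openssl/evp.h>
#include <openssl/rand.h>

/* Encrypts ptlen bytes with AES-256-GCM. Output layout (an assumption of this
 * sketch, not a standard): [12-byte IV][ciphertext][16-byte tag], so the caller
 * must provide at least 12 + ptlen + 16 bytes in out. Returns the total output
 * length, or -1 on error. */
static int gcm_encrypt(const unsigned char key[32],
                       const unsigned char *pt, int ptlen,
                       unsigned char *out) {
  unsigned char *iv = out, *ct = out + 12, *tag = out + 12 + ptlen;
  int len = 0, fin = 0, total = -1;
  EVP_CIPHER_CTX *c;

  if (RAND_bytes(iv, 12) != 1) return -1;        /* fresh random IV for every call */
  if ((c = EVP_CIPHER_CTX_new()) == NULL) return -1;
  if (EVP_EncryptInit_ex(c, EVP_aes_256_gcm(), NULL, key, iv) == 1 &&
      EVP_EncryptUpdate(c, ct, &len, pt, ptlen) == 1 &&
      EVP_EncryptFinal_ex(c, ct + len, &fin) == 1 &&
      EVP_CIPHER_CTX_ctrl(c, EVP_CTRL_GCM_GET_TAG, 16, tag) == 1) {
    total = 12 + len + fin + 16;
  }
  EVP_CIPHER_CTX_free(c);
  return total;
}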
Now, SonarQube cannot identify all these vulnerabilities and hence they've come up with a new concept called Hotspot which is described here as:
Unlike Vulnerabilities, Security Hotspots aren't necessarily issues that are open to attack. Instead, Security Hotspots highlight security-sensitive pieces of code that need to be manually reviewed. Upon review, you'll either find a Vulnerability that needs to be fixed or that there is no threat.
Hotspots have a separate life cycle which is explained in the link given above.
P.S. This answer explains how to encrypt a string in Java with AES in GCM mode in a secure way: https://stackoverflow.com/a/53015144/1235935
Seems like it's a general warning about encrypting any data. There shouldn't be an issue with "AES/GCM/NoPadding", as shown in their test code.

How to encrypt files with AES256-GCM in golang?

AES-256-GCM can be implemented in Go as in https://gist.github.com/cannium/c167a19030f2a3c6adbb5a5174bea3ff
However, the Seal method of the cipher.AEAD interface has the signature:
Seal(dst, nonce, plaintext, additionalData []byte) []byte
So for very large files, one must read all file contents into memory, which is unacceptable.
A possible way is to implement Reader/Writer interfaces on top of Seal and Open, but shouldn't that be solved by the block cipher "modes" behind AEAD? So I wonder whether this is a design mistake in the Go cipher library, or whether I missed something important about GCM.
AEADs should not be used to encrypt large amounts of data in one go. The API is designed to discourage this.
Encrypting large amounts of data in a single operation means that a) either all the data has to be held in memory or b) the API has to operate in a streaming fashion, by returning unauthenticated plaintext.
Returning unauthenticated data is dangerous and it's not
hard to find people on the internet suggesting things like
gpg -d your_archive.tgz.gpg | tar xz because the gpg command also
provides a streaming interface.
With constructions like AES-GCM it's, of course, very easy to
manipulate the plaintext at will if the application doesn't
authenticate it before processing. Even if the application is careful
not to "release" plaintext to the UI until the authenticity has been
established, a streaming design exposes more program attack surface.
By normalising large ciphertexts and thus streaming APIs, the next
protocol that comes along is more likely to use them without realising
the issues and thus the problem persists.
Preferably, plaintext inputs would be chunked into reasonably large
parts (say 16KiB) and encrypted separately. The chunks only need to be
large enough that the overhead from the additional authenticators is
negligible. With such a design, large messages can be incrementally
processed without having to deal with unauthenticated plaintext, and
AEAD APIs can be safer. (Not to mention that larger messages can be
processed since AES-GCM, for one, has a 64GiB limit for a single
plaintext.)
Some thought is needed to ensure that the chunks are in the correct
order, i.e. by counting nonces, that the first chunk should be first, i.e. by starting the nonce at zero, and that the last chunk should be
last, i.e. by appending an empty, terminator chunk with special
additional data. But that's not hard.
For an example, see the chunking used in miniLock.
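As an illustration of that chunked design (the question is about Go, but the structure is language-independent; this sketch uses C with OpenSSL, and the chunk framing, nonce layout and AAD flag are assumptions of the sketch, not a standard format):

#include <stdint.h>
#include <openssl/evp.h>

#define CHUNK  16384          /* 16 KiB plaintext chunks, as suggested above */
#define TAGLEN 16

/* Encrypts one chunk with AES-256-GCM. The 12-byte nonce is the chunk counter
 * (big-endian), so chunks cannot be silently reordered; the "last" flag is fed
 * in as additional authenticated data, so the stream cannot be truncated
 * without detection. Output: [ptlen bytes of ciphertext][16-byte tag].
 * Returns 0 on success, -1 on error. */
static int encrypt_chunk(const unsigned char key[32], uint64_t counter, int last,
                         const unsigned char *pt, int ptlen, unsigned char *out) {
  unsigned char nonce[12] = {0};
  unsigned char aad[1] = { (unsigned char)(last ? 1 : 0) };
  int len = 0, fin = 0, rc = -1;
  EVP_CIPHER_CTX *c = EVP_CIPHER_CTX_new();
  if (!c) return -1;

  for (int i = 0; i < 8; i++)                      /* nonce = 0 || counter, big-endian */
    nonce[11 - i] = (unsigned char)(counter >> (8 * i));

  if (EVP_EncryptInit_ex(c, EVP_aes_256_gcm(), NULL, key, nonce) == 1 &&
      EVP_EncryptUpdate(c, NULL, &len, aad, (int)sizeof aad) == 1 &&   /* AAD only */
      EVP_EncryptUpdate(c, out, &len, pt, ptlen) == 1 &&
      EVP_EncryptFinal_ex(c, out + len, &fin) == 1 &&
      EVP_CIPHER_CTX_ctrl(c, EVP_CTRL_GCM_GET_TAG, TAGLEN, out + len + fin) == 1) {
    rc = 0;
  }
  EVP_CIPHER_CTX_free(c);
  return rc;
}

The writer reads the input CHUNK bytes at a time and calls encrypt_chunk with counter 0, 1, 2, ..., passing last = 1 for the final (possibly short or empty) chunk. Decryption mirrors this chunk by chunk and must reject the stream if any tag check fails, a counter is skipped, or the final-chunk flag never appears; and because the nonce is just a counter, the key must not be reused for another file.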
Even with such a design it's still the case that an attacker can cause
the message to be detectably truncated. If you want to aim higher, an
all-or-nothing transform can be used, although that requires two
passes over the input and isn't always viable.
It's not a design mistake. It's just that the API is incomplete in that regard.
GCM is a streaming mode of operation and is therefore able to handle encryption and decryption on demand without stopping the stream. It seems that you cannot reuse the same AEAD instance while keeping the previous MAC state, so you cannot directly use this API for streaming GCM encryption.
You could implement your own GCM on top of cipher.NewCTR (from crypto/cipher) and your own implementation of GHASH.

Adding Encryption to Solr/lucene indexes

I am currently using Solr to perform search services over some sensitive records.
As Solr/Lucene provides fast searching by storing inverted indexes of the sensitive information in plain text on disk, there is a requirement to encrypt these index files so that unauthorized people can't get access to them by bypassing the system's security.
I found that there are similar patches open on Apache JIRA: AES encrypted directory and Codec for index-level encryption.
AES encrypted directory looks promising, but that patch was implemented for Lucene 3.1; as I am using a newer version, I am not sure whether it can be used with Lucene version 5 or higher.
I was wondering if there is a way to implement a security measure that encrypts the indexes, or if it is possible to write some custom plugin which can encrypt/decrypt the indexes at the I/O level (i.e. FsDirectory)?
The discussion in the comment section of LUCENE-6966 you have shared is really interesting. I would conclude from this quote by Robert Muir that there is nothing baked into Solr and probably never will be:
More importantly, with file-level encryption, data would reside in an unencrypted form in memory which is not acceptable to our security team and, therefore, a non-starter for us.
This speaks volumes. You should fire your security team! You are wasting your time worrying about this: if you are using lucene, your data will be in memory, in plaintext, in ways you cannot control, and there is nothing you can do about that!
Trying to guarantee anything better than "at rest" is serious business, sounds like your team is over their head.
So you should consider encrypting the storage Solr uses at the OS level. This should be transparent to Solr, but if someone gets into your system, they should not be able to copy the Solr data.
This is also the conclusion drawn at the end of the article Encrypting Solr/Lucene indexes by Erick Erickson of Lucidworks:
The short form is that this is one of those ideas that doesn't stand up to scrutiny. If you're concerned about security at this level, it's probably best to consider other options, from securing your communications channels to using an encrypting file system to physically divorcing your system from public networks. Of course, you should never, ever, let your working Solr installation be accessible directly from the outside world, just consider the following: http://server:port/solr/update?stream.body=<delete><query>*:*</query></delete>!

How to obfuscate key for encryption function?

If an encryption function requires a key, how do you obfuscate the key in your source so that decompilation will not reveal the key and thereby enable decryption?
The answer to a large extent depends on the platform and development tool, but in general there's no reliable solution. The encryption function is the point at which the key must be present in its "natural" form. So all the attacker needs to do is put a breakpoint there and dump the key; there's no need to even decompile anything. Consequently, any obfuscation is only good against newbies or when debugging is not possible for whatever reason. Using a text string that already exists in the application as the key is one of the variants.
But the best approach is not to have the key inside at all, of course. Depending on your usage scenario, you can sometimes use some system information (e.g. a smartphone's IMEI) as the key. In other cases you can generate the key when the application is installed and store it as an integral part of your application data (e.g. use the column names of your DB as the key, or something similar).
Still, as said, all of this is defeated relatively easily when one can run a debugger.
There's one thing that counteracts debugging: offload the decryption to a third party. This can be done by employing an external crypto device (a USB cryptotoken or smartcard) or by calling a web service to decrypt certain parts of the information. Of course, these methods are also suitable only for a limited set of scenarios.
Encryption is built into the .NET configuration system. You can encrypt chunks of your app/web.config file, including where you store your private key.
http://www.dotnetprofessional.com/blog/post/2008/03/03/Encrypt-sections-of-WebConfig-or-AppConfig.aspx

Blackberry Content Protection and the Persistent Store

I have an application which stores data in the persistent store by setting the contents of a PersistentObject to a Hashtable; e.g. saving preferences is done by entering strings as the keys and values of the Hashtable and then calling setContents on the PersistentObject with the Hashtable passed as the parameter.
I understand that the data is saved unencrypted. If I enable content protection in the IT policy for the device, will this implementation of persistent storage automatically start encrypting the data, or do I have to change the implementation to use, for example, ContentProtectedHashtable for saving the contents?
All the information I have found so far about content protection has been with regard to the BES IT policy and nothing about the implementation in the application, which makes me think that the standard implementation (i.e. just committing a Persistable object to a PersistentObject) is automatically adapted to encrypt the data?
Any ideas?? Thanks.
See the documentation for net.rim.device.api.util.ContentProtectedHashtable for one way to implement content protection.
Also see this document for a more in depth discussion of content protection.
I don't think it has anything to do with the IT policy; it's rather PersistentContent that provides the encryption/decryption functionality:
This API was designed to allow applications to protect data in a database if the user has enabled Content Protection/Compression in their device's security settings. It consists of two main methods (encode and decode), as well as a number of helper methods.
...
Note that encoding can be performed anytime, whether the device is locked or unlocked. However, an object that was encoded using encryption can only be decoded if the device is unlocked. This can pose a problem if the device locks while an application is performing a potentially long operation during which it requires the ability to decode encrypted data, such as sorting encrypted records. In this case, the application can obtain a ticket. So long as a strong reference to a ticket exists, decoding encrypted data is allowed. Thus, applications should release tickets as soon as possible to allow the device to reach a locked and secure state.
See riccomini - code blackberry persistent store for encryption implementation.
