Firebase - GDPR+Schrems requires end-to-end encryption, but how? - firebase

I need to protect data shared by a group of users, i.e. to let multiple invited users read and contribute to protected data.
My understanding is that I need to implement the following design:
1. Each user gets authenticated with Firebase Auth (OAuth2 or email/password).
2. Each user creates a public/private key pair (how?). The public keys can be shared in Firebase. How should the private key be stored? It seems natural to store it in Firebase in encrypted form (passcode-protected), but how?
2a. Either the users must manually store the passcode in e.g. a "personal password manager" (see: Password protect private key).
2b. Or could the passcode come from e.g. the OAuth third party?
3. Encrypt/decrypt data (with user-group support). Multi-user data requires an extra layer with a common (symmetric) key (how?), so that all invited users can encrypt and decrypt the data. The group manager stores this common key in Firebase for each user, encrypted with that user's individual public key (see: Virgil's story about the Creator).
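For step 2a, one common building block is to stretch the passcode into a key that then encrypts ("wraps") the private key before it is uploaded. The following is only a minimal sketch under assumptions: the function name `derive_wrapping_key`, the iteration count and the example passcode are illustrative, and the actual wrapping step (for example with an authenticated cipher such as AES-GCM from a crypto library) is deliberately left out because it is not part of the Python standard library.

```python
import hashlib
import secrets


def derive_wrapping_key(passcode: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Stretch a user passcode into a 32-byte key using PBKDF2-HMAC-SHA256.

    The result is meant to be used as the key that encrypts the user's
    private key; it is never stored anywhere itself.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passcode.encode("utf-8"), salt, iterations, dklen=32
    )


# The salt is not secret. It can be stored next to the wrapped private key,
# so that any device that knows the passcode can re-derive the same key.
salt = secrets.token_bytes(16)

key = derive_wrapping_key("correct horse battery staple", salt)
same_key = derive_wrapping_key("correct horse battery staple", salt)
other_key = derive_wrapping_key("wrong passcode", salt)
```

With this approach only the salt and the wrapped private key would live in Firebase; the passcode and the derived key stay on the user's device, which is what keeps the provider out of the loop.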
So do the above requirements make sense(!), and how do I go about compliance? How should such security measures be documented? Protecting your customers' data might help you sleep at night, but you also need to convince future customers that their data will be safe.
I wonder why Firebase does not have a guide on all this, to safely facilitate GDPR. The topic seems such a show stopper nowadays.
All constructive input is highly appreciated!

Related

Firebase custom authentication how to choose unique user ID?

Firebase is great as it offers a lot of authentication providers. In one of my apps, I use four different providers provided by Firebase (Email, Twitter, Facebook and Google), but I also need to let users sign in via LinkedIn.
As the Firebase SDK does not offer LinkedIn, I need to implement the login flow manually, which doesn't seem to be difficult, but there is one huge issue which I see. During the creation of a custom JWT token, I need to assign a user ID. And I have no idea how to generate one while making sure that my approach will not conflict with the user IDs which Firebase generates on its own for other providers.
For example, let's imagine that a user Andriy Gordiychuk signs in via LinkedIn and his email address is andriy#gordiychuk.com. A simple way to create a user ID would be to take an email address (andriy#gordiychuk.com) and to randomise it using some hashing function. I would get some random id such as aN59nlphs... which I would be able to recreate as long as the same user signs in. So far, so good.
However, how can I be sure that the ID which I get is not already used by another user who signed in via Twitter, for example?
One way to mitigate this issue is to store LinkedIn user IDs in a Firestore collection. Then, when I need to create a token, I first check whether I already have an ID for this user. If not, I would hash the email address, and I would try to create a user with this ID. If this ID is already occupied, I would then try to create another ID until I stumble upon an ID which is not occupied, and I would then use it.
I don't like this approach for two reasons:
Although the chance that I would generate an already occupied ID is small, theoretically the process of finding an "available ID" can take a lot of steps (an infinite loop in a worst-case scenario).
Once I find an available ID, I must store it. Given that all these calls are asynchronous there is a real chance that I would create a user with a suitable ID, but because the save operation fails, I would not be able to use this ID.
So, does anyone know how to choose user IDs correctly for such a use case?
It's fairly common to generate a string with enough entropy (randomness) to statistically guarantee it will never be duplicated. This is for example behind the UUID generators that exist in many platforms, and similarly behind Firebase Realtime Database's push keys, and Cloud Firestore's add() keys. If there's one in your platform, I recommend starting with that.
Also see:
The 2^120 Ways to Ensure Unique Identifiers, which explains how Firebase Realtime Database's push() works.
Universally unique identifier, Version 4 on Wikipedia
the uuid npm module
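As a rough illustration of the random-ID suggestion, here is a minimal Python sketch. `random_uid` follows the UUID approach from the answer; `provider_uid` is an additional, hypothetical alternative that is not part of the answer: it prefixes the provider's own stable user ID with the provider name, on the assumption that Firebase-generated UIDs never contain such a prefix. The 128-character check reflects the documented limit for custom-token UIDs.

```python
import uuid

MAX_UID_LENGTH = 128  # custom-token UIDs must be 1-128 characters long


def random_uid() -> str:
    """Random version-4 UUID as 32 hex characters (122 random bits).

    Collisions are statistically negligible, but the mapping from the
    LinkedIn account to this UID still has to be stored somewhere.
    """
    return uuid.uuid4().hex


def provider_uid(provider: str, provider_user_id: str) -> str:
    """Deterministic UID built from the provider name and its stable user ID.

    Hypothetical alternative: nothing needs to be stored, because the same
    LinkedIn account always maps to the same UID.
    """
    uid = f"{provider}:{provider_user_id}"
    if not 1 <= len(uid) <= MAX_UID_LENGTH:
        raise ValueError("UID must be between 1 and 128 characters long")
    return uid


example_random = random_uid()
example_linkedin = provider_uid("linkedin", "abc123")  # "linkedin:abc123"
```

The deterministic variant sidesteps the "create user, then fail to save the ID" race described in the question, at the cost of relying on the provider's ID staying stable.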

Protecting sensitive customer data in cloud based Multi-Tenant environment

We are building a multi-tenant cloud-based web product where customer data is stored in a single database instance. A certain portion of the customer-specific business data is highly sensitive. This sensitive data should be protected such that nobody can access it except the authorized users of the customer (neither through the application nor by accessing the database directly). The customer wants to make sure that even the platform provider (us) is not able to access this data by any means. They want us to clearly demonstrate data security in this context. I am looking for specific guidance in the following areas:
How do I make sure the data is protected at the database level such that even the platform provider cannot access the data?
Even if we encrypt the data, the concern is that anyone with the decryption key can decrypt the data.
What is the best way to solve this problem?
Appreciate your feedback.
"How do I make sure the data is protected at the database level such that even the platform provider cannot access the data"
-- As you are in a multi-tenant environment, first of all you would have to "single-tenant your databases", i.e. one DB per customer. Then you need to modify the application to pick up the database from some form of config.
For encryption, as you are in Azure, you would have to use Azure Key Vault with your own keys or the customer's own keys. You then configure SQL to use these keys to encrypt the data. See here and here.
If you want the database to stay multi-tenant, you would need to do the encryption at the application level. However, this would require the application to know about the customer keys, hence I don't think that this would be a valid solution.
"Even if we encrypt the data, the concern is that anyone with the decryption key can decrypt the data" - yep, anyone with the keys can access the data. For this you would need to set the access controls appropriately on your key vault, so that each customer can see only their own keys.
In the end, as you are the service provider, the customers would have to trust you somewhat :)

Is it good idea to store sensitive info in firebase?

In my Android application I have an idea to store serial keys in a database. If a user enters a correct key, he gets the full version of the application and the key is disabled on the server to avoid multiple uses of the same key; otherwise he can buy the app on Google Play without a key.
For this I thought of using Firebase Database, but after reading this I have some doubts:
Firebase Realtime Database
Store and sync data with our NoSQL cloud database. Data is synced across all clients in realtime, and remains available when your app goes offline.
Does it mean that Firebase will duplicate the table with all available keys to all application users, and some smart user can read the list from this copy on his phone?
Not all data is automatically duplicated to all clients. Only data that the client subscribes to is received by that client.
You can control what data each client can see through Firebase's server-side security rules. For example, you'll typically want to ensure that each user can only read their own data.
It probably isn't a good idea to store super-sensitive data like social security numbers or credit card numbers. But https://firebase.google.com/docs/database/security/ shows that you can control access to data and use validation, and since you can regenerate the keys if they become compromised, it wouldn't be the worst option. https://firebase.google.com/docs/database/security/user-security shows that it's possible to write an app that uses it like Google Drive with a smartphone-based client.
Personally, my answer would be no. You may want to think about Google Play Subscriptions and In-App Purchases.
If you really have to then:
Create a key as a user buys the upgrade (server-side).
Store the device id/account id (hashed) and timestamp with the key.
Credit card details and expiry dates should be combined into one hash.
Just encrypt everything.
It's better to have a banned list than a list of approved keys. Eventually you have to create more keys, and it's easier just to maintain a list of banned keys.
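The server-side part of such a scheme could look roughly like the sketch below. It is only an illustration under assumptions: the `KeyStore` class and its method names are hypothetical, an in-memory set and dict stand in for the real database, and in a real backend the "check and mark as used" step would have to run inside a database transaction so that two clients cannot redeem the same key at once.

```python
import hashlib
import secrets
import time


def sha256_hex(value: str) -> str:
    """Hex SHA-256 digest, so raw keys and account IDs are never stored."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


class KeyStore:
    """In-memory stand-in for a server-side table of one-time serial keys."""

    def __init__(self) -> None:
        self._unused = set()   # hashes of keys that were issued but not used yet
        self._redeemed = {}    # key hash -> (hashed account ID, timestamp)

    def issue(self) -> str:
        """Create a new random serial key and remember only its hash."""
        serial = secrets.token_hex(8).upper()
        self._unused.add(sha256_hex(serial))
        return serial

    def redeem(self, serial: str, account_id: str) -> bool:
        """Accept a key at most once; unknown or already used keys fail."""
        key_hash = sha256_hex(serial.strip().upper())
        if key_hash not in self._unused:
            return False
        self._unused.remove(key_hash)
        self._redeemed[key_hash] = (sha256_hex(account_id), time.time())
        return True


store = KeyStore()
serial = store.issue()
first_try = store.redeem(serial, "account-1")    # accepted
second_try = store.redeem(serial, "account-2")   # rejected: key already used
```

Because clients only ever send a key to the server and get back yes or no, the list of valid keys never has to be readable from the app at all.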

DPAPI Key Storage and Restoration

In light of the upcoming GDPR regulations, the company I work for is looking at upgrading their encryption algorithms and encrypting significantly more data than before. As the one appointed to take care of this, I have replaced our old CAST-128 encryption (I say encryption but it was more like hashing, no salt and resulting in the same ciphertext every time) with AES-256 and written the tools to migrate the data. However, the encryption key is still hardcoded in the application, and extractable within a couple of minutes with a disassembler.
Our product is a desktop application, which most of our clients have installed in-house. Most of them are also hosting their own DBs. Since they have the entirety of the product locally, securing the key seems like a pretty difficult task.
After some research, I've decided to go with the following approach. During the installation, a random 256-bit key will be generated for every customer and used to encrypt their data with AES encryption. The key itself will then be encrypted with DPAPI in user mode, where the only user who can access the data will be a newly created, locked-down domain service account with limited permissions, which is unable to actually log in to the machine. The encrypted key will then be stored in an ACL-ed part of the registry. The encryption module will then impersonate that user to perform its functions.
The problem is that since the key will be randomly generated at install time, and encrypted immediately, not even we will have it. If customers happen to delete this account, reinstall the server OS, or manage to lose the key in some other manner, the data will be unrecoverable. So after all that exposition, here comes the actual question:
I am thinking of having customers back up the registry where the key is stored and assuming that even after a reinstall or user deletion, as long as the same user account is created with the same password, on the same machine, it will create the same DPAPI secrets and be able to decrypt the key. However, I do not know whether or not that is the case since I'm not sure how these secrets are generated in the first place. Can anyone confirm whether or not this is actually the case? I'm also open to suggestions for a completely different key storage approach if you can think of a better one.
I don't see the link with GDPR but let's say this is just context.
It takes more than the user account, its password and the machine: DPAPI adds more entropy when encrypting the data.
See : https://msdn.microsoft.com/en-us/library/ms995355.aspx#windataprotection-dpapi_topic02
A small drawback to using the logon password is that all applications running under the same user can access any protected data that they know about. Of course, because applications must store their own protected data, gaining access to the data could be somewhat difficult for other applications, but certainly not impossible. To counteract this, DPAPI allows an application to use an additional secret when protecting data. This additional secret is then required to unprotect the data. Technically, this "secret" should be called secondary entropy. It is secondary because, while it doesn't strengthen the key used to encrypt the data, it does increase the difficulty of one application, running under the same user, to compromise another application's encryption key. Applications should be careful about how they use and store this entropy. If it is simply saved to a file unprotected, then adversaries could access the entropy and use it to unprotect an application's data. Additionally, the application can pass in a data structure that will be used by DPAPI to prompt the user. This "prompt structure" allows the user to specify an additional password for this particular data. We discuss this structure further in the Using DPAPI section.

Encrypting DB with multi user access

I am working on a project that will be sold to government entities. Because they will be storing sensitive lists of employees, they do not want us to have access to their DB.
I am not an encryption specialist, but I was thinking of encrypting the DB the app uses in such a way that we do not have access to it, but many users in their organisation (users they gave permissions to) must be able to read the data from their app.
How does that work? I read about public/private keys, symmetric/asymmetric encryption, but I'm having a hard time understanding how all of that fits in.
