I am currently working on an iOS App that uses Cloud Firestore from Firebase.
I was wondering: what is the best way (cost, efficiency and security-wise) to upload some data to multiple Firestore documents simultaneously (or almost simultaneously)?
* The data I have to upload consists of the following: there are two users (User A is the user currently using the app, User B is the one whose profile is currently being seen by User A). If User A saves User B's profile, I must upload User B's UID to User A's Firestore Document. Then, I have to increase a counter in User A's Firestore Document. Finally, I must add User A's UID to User B's Firestore Document. - Note that with Firestore Document I mean either a document Field or a document Subcollection.
The choices are:
Upload everything from the client: seems the best method, cost-wise: it doesn't require extra Cloud Functions usage. I would create a Batch Operation and upload all the data from there*. The downside is that the client must be able to access multiple unrelated collections and documents.
Update one document from the client, then update everything else from Cloud Functions: this method is the best one efficiency and security-wise; the client only uploads data to the user's document*, without accessing unrelated collections and documents. Also, the client only has to upload a fraction of the data that it had to upload in the previous method, saving bandwidth and cellular data / WiFi usage. The downside is that the usage of Cloud Functions would increase, eventually resulting in more costs.
Update one document from the client, update the counter* from the client and then update everything else from Cloud Functions: this method is something of a hybrid between the first two, as I think that updating the counter from the client is more secure (could Cloud Functions' .onWrite trigger fire twice or more, increasing the counter multiple times?).
My first thought was to go with method 2, as it's far more secure and efficient, but I would like to have someone else's advice too, before "wasting" too much time coding something wrong.
I hope this isn't any kind of duplicate, as I couldn't find anything that answered my question with enough specificity.
Any advice would be much appreciated. Thank you.
I would follow the third approach: update the current user's data from the client (the saved_profiles collection and the counter field), which is private and accessible only by this user (configure Firestore Security Rules accordingly), and update the other user's collection (users_who_saved_my_profile) with a triggered Cloud Function. Because Cloud Functions use the Admin SDK, their operations are not controlled by security rules and can access any part of the database. This way no unnecessary permissions are granted to any user.
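On the worry that a trigger may fire more than once: one way to make that harmless is to have the function write to a document whose path is derived from both UIDs, so a repeated delivery just overwrites the same document. Below is a minimal sketch of that idea. The Map is only an in-memory stand-in for Firestore, and the handler name and document paths are assumptions for illustration, not anything prescribed by Firebase.

```javascript
// In-memory stand-in for Firestore, used only to illustrate the idea.
const fakeDb = new Map(); // document path -> document data

// Hypothetical handler for a trigger on User A's saved_profiles collection.
// The target path is built from both UIDs, so running it twice for the
// same event writes the same document instead of adding a second entry.
function onProfileSaved(userA, userB) {
  const path = `users/${userB}/users_who_saved_my_profile/${userA}`;
  fakeDb.set(path, { uid: userA });
}

onProfileSaved("userA", "userB");
onProfileSaved("userA", "userB"); // duplicate delivery of the same event

console.log(fakeDb.size); // 1
```

The counter is a different case, since an increment is not idempotent by itself; keeping it on the client side of the batch, as in the third approach, avoids that problem.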
Firebase Firestore: How to monitor read document count by collection?
First off, a similar question was asked almost a year ago, so please don't mark this as a duplicate; I need more detailed suggestions.
Here is the scenario:
Let's say I have some collections in my database, and in the future I might need to perform some ML on the data based on document visits,
that is, how many times a specific document is visited and for how long.
I know the solution mentioned above indirectly suggests performing a read followed by a write to increment the read count every time I visit the database. But it seems this needs to be done from the client side.
Now, let's say I have some documents and the client is only allowed to read them, with no access for writing or updating. In this case I will either have to maintain a separate collection just for the count, which is of course written from the client side, or I will have to make one specific field in the parent document (the actual document whose data I show to clients) writable and keep the remaining fields protected.
But collecting this data from the client side raises a lot of concerns, because I want to collect it even if the client is not authenticated.
I looked at the Cloud Functions documentation, and it seems there is no trigger that acts as a watchdog listening for a document being fetched.
So I would like some suggestions on how to do this in GCP by creating a custom trigger or hook on a server.
Just a head start would be very useful.
You cannot keep track of read counts if you are using the Client SDKs. You would have to fetch data from Firestore with some secure env in the middle (Cloud Functions or your own server).
A callable function like this may be useful:
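const functions = require("firebase-functions");
const admin = require("firebase-admin");
admin.initializeApp();

// Returns data for the path passed in the data object
exports.getData = functions.https.onCall(async (data, context) => {
  // context.auth is undefined when the caller is not signed in
  if (!context.auth) {
    throw new functions.https.HttpsError("unauthenticated", "Sign-in required.");
  }
  const snapshot = await admin.firestore().doc(data.path).get();
  // Increment the caller's read count
  await admin.firestore().collection("users").doc(context.auth.uid).update({
    reads: admin.firestore.FieldValue.increment(1)
  });
  return snapshot.data();
});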
Do note that you are using the Firebase Admin SDK in this case, which has complete access to all Firebase resources (it bypasses all security rules). So you'll need to authorize the user yourself. You can get the UID of the user calling the function with context.auth.uid, and then some simple if-else logic may be enough.
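As a rough illustration of that "simple if-else logic", here is a minimal sketch of an authorization check you could run inside the function before reading. The rule itself (a caller may only read documents under their own users/{uid} path) and the function name canRead are assumptions made up for the example; substitute whatever policy your app needs.

```javascript
// Hypothetical rule: a caller may only read documents under their own
// users/{uid} subtree.
function canRead(uid, path) {
  if (!uid) return false; // unauthenticated callers are rejected
  const segments = path.split("/");
  return segments[0] === "users" && segments[1] === uid;
}

console.log(canRead("alice", "users/alice/notes/1")); // true
console.log(canRead("alice", "users/bob/notes/1"));   // false
console.log(canRead(undefined, "users/alice"));       // false
```

In the callable function you would call something like this with the caller's UID and data.path, and throw an error instead of reading when it returns false.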
One solution would be to use a Cloud Function to read from or write to Firestore instead of interacting with Firestore directly from your front-end (with one of the Client SDKs).
This way you can keep one or more counters of the number of reads as well as calculate and apply specific access rights and everything is done in the back-end, not in the front-end. With a Callable Cloud Function you can get the user ID of authenticated users out of the box.
Note that by going through a Cloud Function you will lose some advantages of the Client SDKs, for example the ability to use a listener for real-time updates or the possibility to declare access rights through standard security rules. The following article covers the advantages and drawbacks of such an approach.
I know there are several questions regarding this (e.g. https://stackoverflow.com/a/52808572/3481904), but I still don't have a good solution for my case.
My application has Groups, which are created/removed dynamically, and members (users) can be added/removed at anytime.
Each Group has 0..N private files (Firebase Storage), saved in different paths (all having the prefix groups/{groupId}/...).
In Firestore Security Rules, I use get() & exists() to know if the signed-in-user is part of a group. But I cannot do this in the Firebase Storage Security Rules.
The 2 proposed solutions are:
User Claims:
but the token needs to be refreshed (signing out/in, or renewing an expired token), which is not acceptable for my use case, because users need to have access immediately once invited. Also, a user can be part of many groups, so the claims could potentially grow beyond 1000 bytes.
File Metadata:
but Groups can have N files in different paths, so I will need to loop-list all files of a group, and set the userIds of the group-members in the metadata of each file, allowing access to it. This would be an action triggered by Firestore (a Firebase Function), when a member is added/removed.
I don't like this approach because:
needs to loop-list N files and set metadata for each one (not very performant)
To add new files, I think I would need to set create to public (as there is no metadata to check against yet), and then a Function would need to be triggered to add the userIds to the metadata
there might be some seconds of delay to give files access, which could cause problems in my case if the user opens the group page before that time, having a bad experience
So, my questions are:
Is there a better way?
If I only allow the client to get and create all files when authenticated (disallowing delete and list), would this be enough for security? I think that there might be a chance that malicious hackers can upload anything with an anonymous user, or potentially read all private group files if they know the path...
Thanks!
If custom claims don't work for you, there is really no "good" way to implement this. Your only real options are:
Make use of Cloud Functions in some way to mirror the relevant data from Firestore into Storage, placing Firestore document data into Storage object metadata to be checked by rules.
Route all access to Storage through a backend you control (could also be Cloud Functions) that performs all the relevant security checks. If you use Cloud Functions, this will not work for files whose content is greater than 10MB, as that's the limit for the size of the request and response with Cloud Functions.
Please file a feature request with Firebase support to allow the use of Firestore documents in Storage rules - it's a common request. https://support.google.com/firebase/contact/support
I had a similar use case; here's another way to go about it without using file metadata.
1. Create a private bucket.
2. Upload files to this bucket via a Cloud Function.
2a. Validate group membership here, then upload to the above bucket.
2b. Generate a signed URL for the uploaded file.
2c. Put this signed URL in Firestore where only the group members can read it (e.g. /groups/id/urls).
3. In the UI, get the signed URL from Firestore for the given image id in a group and render the image.
Because we generate the signed URL and upload the file together, there will be no delay in using the image. (The upload might take longer, but we can show a spinner.)
Also, we generate the URL once, so we don't incur any Class B operations or extra function runs every time we add new members to groups.
If you want to be more secure, you could set the expiry of the signed URLs quite short and rotate them periodically.
I have a user-profile collection. Currently it is writable by only the user whose profile it is.
Now I want to record the number of times a profile is visited, let's say profileVisitedCount. It should also count visits from users who are not signed in.
If I store the count in the documents of the user-profile collection itself from the Firebase JS client library, I will have to make it publicly writable.
The other option I am considering is a Cloud Function. It would only increment profileVisitedCount, without the need to make the document publicly writable. But I'm not sure it is the correct approach, as the Cloud Function endpoint still seems vulnerable and can be called by a bot.
Also, yes, data like the profile visit count should normally be recorded in an analytics tool like GA, but I need this count for business logic such as displaying the top visited profiles.
So, any guidance on how the data should be structured? Thanks!
You could have another collection called, for example, profileVisitsCounters in which you store one document per user with a document Id corresponding to the user Id. In this user document, you maintain a dedicated profileVisitedCount field that you update with increment() each time a user reads the corresponding profile.
You assign full read and write access to this collection with allow read, write: if true;.
In your question, while mentioning the Cloud Function solution, you write that "the cloud function endpoint seems still vulnerable and can be called by bot". Be aware that in the case of an extra collection with full write access, as detailed above, it will also be the case: for example, someone who knows the collection name and user uid(s) could call the update() method of the JavaScript SDK or, even easier, an endpoint of the Cloud Firestore API.
If you want to avoid this risk you could use a callable Cloud Function to read the User Profiles, as you have mentioned. This Cloud Function will:
Fetch the User Profile data;
Increment the profileVisitedCount field (in the User Profile document);
Send back the User Profile data to the client.
You need to deny read access to the user-profile collection, in order to force the users to "read" it through the Cloud Function.
This way you are sure that the profileVisitedCount fields are only incremented when there is a "real" User Profile read.
Note also that you could still keep the profileVisitsCounters collection if having two different collections brings some extra advantages for your business case. In this case, the Cloud Function would increment the counter in this collection, instead of incrementing it in the User Profile itself. You would restrict the access right of the profileVisitsCounters collection to read only since the Cloud Function bypasses the security rules. (allow read: if true; allow write: if false;).
Finally, note that it might be interesting to read this article, which, among others, details the pros and cons of querying Firebase databases with Cloud Functions.
Update: Editing the question title/body based on the suggestion.
Firebase store makes everything that is publicly readable also publicly accessible to the browser with a script, so nothing stops any user from just saying db.get('collection') and saving all the data as theirs.
In a more traditional db setup, an app's frontend pulls data from a backend, and the user would at least have to go through the extra trouble of tweaking the UI and then scraping the front end to pull more and more data (think Twitter's load more button).
My question was whether it was possible to limit users from accessing the entire database in a click, while also keeping the data publicly available.
Old:
From what I understand, any user who can see data coming out of a Firebase datastore can also run a query to extract all of that data. That is not desirable when data itself is of any value, and yet Firebase is such an easy to use tool, it's great for pretty much everything else.
Is there a way, or a best practice, for how to structure the data or access rules s.t. users see the data, but can't just run a script to download all of it entirely?
Thanks!
Kato once implemented a simplistic rate limit for writes in Realtime Database security rules: Firebase rate limiting in security rules?. Something similar could be possible in Cloud Firestore rules. But this approach won't work for reads, since you can't update the timestamp at the same time the read is performed.
You can however limit what queries a user can perform on your database. For example, to limit them to reading 50 documents at a time:
allow list: if request.query.limit <= 50;
I need to get a user profile document, which then needs to access two other documents in separate collections, before it returns. At the moment I have implemented this client side but it takes a while. Should I/Can I run this using Cloud Functions, so that I just call one GET and retrieve everything in one go, rather than calling separate get functions sequentially from within my app?
The database retrieval from separate collections would take a similar amount of time whether it's done from the client or Cloud Function.
Collection queries should be very fast on your indexed fields, so probably your problem is the way you are handling asynchronicity. Are you waiting for the result from the first collection before starting the second query? You could dispatch both queries at the same time to cut your waiting time.
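To illustrate dispatching both queries at once, here is a minimal sketch using Promise.all. The fakeGet helper, the settings/ and stats/ paths, and the delays are hypothetical stand-ins for the two real Firestore lookups, so the snippet runs without Firebase.

```javascript
// Hypothetical stand-in for a Firestore lookup; resolves after a delay.
const started = [];
function fakeGet(name, ms) {
  started.push(name); // record when the request is dispatched
  return new Promise((resolve) => setTimeout(() => resolve({ name }), ms));
}

// Once the profile is known, both dependent reads are started together,
// so the total wait is roughly the slower of the two, not their sum.
async function loadRelated(profile) {
  const [settings, stats] = await Promise.all([
    fakeGet(`settings/${profile.uid}`, 20),
    fakeGet(`stats/${profile.uid}`, 20),
  ]);
  return { profile, settings, stats };
}

const pending = loadRelated({ uid: "abc" });
pending.then((result) => console.log(Object.keys(result)));
```

If the second lookup actually depends on the result of the first, the two cannot be run in parallel, and the sequential wait is unavoidable whether it happens on the client or in a Cloud Function.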
You can store all your documents in Firebase Storage and then concatenate the references from the files and download all the documents at the same time, plus you can access them quicker because you can store them into your SD card or internal storage.
Then, if the documents need to be rewritten, there is no problem, because if you download them again from Storage they will automatically be replaced and the user will still have access to the documents. I tell you this because I'm doing something similar and it's working great!
Edit: As Sujil says, first make an authentication between the user and the database structure with Firebase, so only people logged in or authenticated in your app can read/write files.