My users can create documents (let's say tasks) in a subcollection, with a bunch of security rules checking for authentication, permissions and data validity. They can even select multiple tasks and copy them within the same collection.
Now, a regular user will likely create at most a hundred tasks at once, but what if someone with bad intentions manages to obtain my database credentials, authenticates and tries to create a huge number of valid documents programmatically? Firestore will scale without problems, and I will get an unpleasant surprise on my Firebase bill.
This is my first concern, but I'm also thinking about limiting a collection's size for other reasons, which would at the same time solve the problem described.
I read about the techniques for counting documents in a collection described in the Firestore documentation, but I did not find a solution.
Keeping a counter in a document field, updated with a transaction in a Cloud Function, would be inefficient in my case. Distributed counters increase the complexity of my data model a bit, and I would not know how to properly read those counters in security rules for every task creation, or whether that would even be an efficient solution.
Does anyone have suggestions?
I believe the only ways for a person to gain read/write access to your database would be either to hack Google's servers, in which case no one is safe and it doesn't really matter what you do, or to guess the exact names of your collections and documents.
As for the latter case, what I have done in my project is to append a random 10-character string (mixing letters and digits, for example Users-x5NfaS1jCb) to the name I wanted for each collection and document. These strings serve as independent, separate passwords every step of the way. This, at least, makes it difficult to guess the names of the collections and documents.
(As mentioned in the question) If using authentication does not cause any complications for your project, you can use it to further raise the security of your database by limiting access to users who authenticate through your app.
I guess (I have never tried it) you can use Firebase Functions to limit the number of documents in any given collection based on the criteria you want. The function would be invoked every time a document is created in the database.
If by "obtain my database credentials" you mean finding the username and password of your Firebase account, then again it doesn't really matter what you do. If they know what they are doing, they can exploit that access in so many ways that this particular issue will be the least of your problems.
All in all, if you ask me, your database is safe unless either someone guesses your collection and document names, or gains access to your Firebase account.
These are the only things I can think of for now. I'll try to update my answer later.
Hey, so with my current feed database design, I am using Redis as the cache for super-fast reads, which are routed through my Google Cloud Functions. The Redis database handles all post data and timeline updates, which is great and all, but I forgot one of the most considerable caveats to this. Firestore only sustains roughly one write per second to a single document, meaning that if I have a document that stores the post data (post_id, user_id, content, like_count), the like_count would be impossible to keep up to date when many likes can arrive per second. Does anyone have any solutions to this?
You can shard your counter among multiple documents and query them in aggregate as needed.
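To illustrate the sharding idea, here is a minimal sketch in TypeScript. It uses an in-memory array as a stand-in for the shard documents, so the `ShardedCounter` class and its method names are hypothetical and not part of any Firebase SDK. In a real app, each array slot would presumably be its own Firestore document with a numeric count field.

```typescript
// In-memory stand-in for a set of Firestore shard documents.
// Assumption: each array slot represents one shard document holding a count.
class ShardedCounter {
  private readonly shards: number[];

  constructor(numShards: number) {
    if (!Number.isInteger(numShards) || numShards < 1) {
      throw new Error("numShards must be a positive integer");
    }
    this.shards = new Array<number>(numShards).fill(0);
  }

  // Each increment touches one randomly chosen shard, so writes are spread
  // out and no single document has to absorb every like.
  increment(by: number = 1): void {
    const index = Math.floor(Math.random() * this.shards.length);
    this.shards[index] += by;
  }

  // Reading the total means reading every shard and summing the counts.
  total(): number {
    return this.shards.reduce((sum, count) => sum + count, 0);
  }
}

const likes = new ShardedCounter(10);
for (let i = 0; i < 1000; i++) {
  likes.increment();
}
console.log(likes.total()); // 1000
```

The trade-off is that more shards allow a higher write rate, while every read of the total has to fetch all shards.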
You can also try a Cloud Tasks queue to smooth out the write frequency. It will add considerable complexity to the system, but it is really the only generic way in GCP to manage the rate of some work. This might not work out the way you need, however.
If you use Cloud Tasks, your queue will need to be configured with a rate limit, and each task will have to deliver the document data to yet another function or other HTTP endpoint that will perform the write.
I'm using Firestore and I have these questions about how user behavior will affect app costs:
What is more cost-effective:
To use a realtime form that saves to the database while the user is typing in a web form
To save all the fields in the form at once using a Firebase function
questions:
is it overkill to proxy with cloud functions? (just to avoid costs)
when the user types (realtime updates) is it considered as a new write to the database every time?
What is more cost-effective:
To use a realtime form that saves to the database while the user is typing in a web form
This is going to cost you a write each time the form is saved in realtime.
To save all the fields in the form at once using a firebase function
This is going to cost you a single write.
The difference in cost between the two should be obvious - multiple writes vs. a single write.
questions:
is it overkill to proxy with cloud functions? (just to avoid costs)
If you're proxying for no other reason than to save costs, it's overkill. The function invocation will cost you money, in addition to the document write, which will cost the same no matter where it originates.
when the user types (realtime updates) is it considered as a new write to the database every time?
As I said before, yes, it is.
The only real reason to send form submissions through a function is the ability to do deep, secure checking of the validity of the form fields. Client-side checks are not secure. You could use security rules to perform checks, but those are limited. If you need to make sure the form fields have strictly checked values, a Cloud Function might be your best choice. But it's not possible to tell given the information in your question.
There's no particular reason you need to use a function to save all at the same time -- at whatever point you would call the function, instead call a single update to the database. Using a function here is going to be strictly more expensive (assuming it provides no functionality other than the database write), since you incur the cost of the write and you incur the cost of a function execution.
Of course, it's possible you have some other reason to call a Cloud Function to do the write beyond a simple proxy, such as enforcing constraints that cannot be enforced by security rules alone. In that case, the cost may be worth the added functionality.
As for whether it is better to batch or to write in real time, it will certainly be cheaper to write all at once, as you are charged for every document write to Firestore. More specifically, each set or update is charged as a single write. So it's definitely going to be less expensive to write the document once for many fields than to write it in real time (or per field) as the user is entering data.
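The write counts can be made concrete with a small TypeScript sketch. The `billedWrites` function, the `SaveStrategy` names and the sample form values are hypothetical, and the sketch assumes the realtime form issues one update per keystroke, which the question does not state.

```typescript
// Each set() or update() on a document is billed as one write, so counting
// how often the form saves gives the number of billed writes.
type SaveStrategy = "perKeystroke" | "perField" | "onSubmit";

function billedWrites(strategy: SaveStrategy, fieldValues: string[]): number {
  switch (strategy) {
    case "perKeystroke":
      // Assumption: one update per typed character, no debouncing.
      return fieldValues.reduce((total, value) => total + value.length, 0);
    case "perField":
      // One update each time the user finishes a field.
      return fieldValues.length;
    case "onSubmit":
      // A single update carrying every field.
      return 1;
  }
}

// Hypothetical form contents: name, email, city.
const form = ["Ada Lovelace", "ada@example.com", "London"];

console.log(billedWrites("perKeystroke", form)); // 33
console.log(billedWrites("perField", form));     // 3
console.log(billedWrites("onSubmit", form));     // 1
```

Routing the single submit through a Cloud Function would not change the write count. It would only add the cost of one function invocation on top.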
Update: Editing the question title/body based on the suggestion.
Firebase store makes everything that is publicly readable also accessible to a script running in the browser, so nothing stops any user from just calling db.get('collection') and saving all the data as their own.
In a more traditional DB setup, where an app's frontend pulls data from a backend, the user would at least have to go through the extra trouble of tweaking the UI and then scraping the frontend to pull more and more data (think of Twitter's "load more" button).
My question was whether it was possible to limit users from accessing the entire database in a click, while also keeping the data publicly available.
Old:
From what I understand, any user who can see data coming out of a Firebase datastore can also run a query to extract all of that data. That is not desirable when data itself is of any value, and yet Firebase is such an easy to use tool, it's great for pretty much everything else.
Is there a way, or a best practice, to structure the data or access rules so that users can see the data, but can't just run a script to download all of it?
Thanks!
Kato once implemented a simplistic rate limit for writes in Realtime Database security rules (see "Firebase rate limiting in security rules?"). Something similar could be possible in Cloud Firestore rules. But this approach won't work for reads, since you can't update the timestamp at the same time the read is performed.
You can however limit what queries a user can perform on your database. For example, to limit them to reading 50 documents at a time:
allow list: if request.query.limit <= 50;
I am using the Firebase database and my question is, for example, how fast can I reach 1 GB if I have 100 users, each storing 10 Microsoft Word pages' worth of text every day for one month?
Word documents would be stored in Firebase Storage, not the Realtime Database. Realistically, the only way you will be billed anything for using the Firebase platform is if your app gets a significant amount of usage. I suspect that 99% of Firebase apps do not generate any billing whatsoever... that's just a hunch.
If you do run into billing issues, that will/would be a good thing.
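For a rough sense of scale, here is a back-of-the-envelope estimate in TypeScript for the numbers in the question. The figure of about 3,000 bytes of plain text per page is purely an assumption, as are the helper names. Real .docx files carry formatting overhead and can be considerably larger, so treat the result as a lower bound.

```typescript
// Assumption: roughly 3,000 bytes of plain text per page (not from the question).
const ASSUMED_BYTES_PER_PAGE = 3000;
const ONE_GB = 1000000000; // decimal gigabyte

function bytesStoredPerDay(
  users: number,
  pagesPerUserPerDay: number,
  bytesPerPage: number
): number {
  return users * pagesPerUserPerDay * bytesPerPage;
}

function daysToReach(targetBytes: number, bytesPerDay: number): number {
  return Math.ceil(targetBytes / bytesPerDay);
}

// 100 users, 10 pages each per day, as in the question.
const perDay = bytesStoredPerDay(100, 10, ASSUMED_BYTES_PER_PAGE);

console.log(perDay * 30);                 // 90000000 bytes, about 90 MB after 30 days
console.log(daysToReach(ONE_GB, perDay)); // 334
```

Under that assumption, one month of usage stays well below 1 GB, and reaching 1 GB would take close to a year.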
Although this question is too broad, since it lacks variables like the number of users, the size of the files and how this data is presented in the app, I will try to give my $0.02 in a very generic way, which can also be read as how not to end up with a huge bill while using Firebase.
Even though Firebase provides sufficient space to test the app in production, there are a lot of ways in which things can go bad real quick, such as:
1) Since Firebase handles the sync automatically, these additional read/write calls come out of your quota on top of the calls you trigger yourself. Check out how one app's developers found this out the hard way.
2) If you have a bad DB schema/design that you have not addressed, you end up making multiple calls to the server to fetch the data, which again bloats the number of calls you make. Read about this here.
3) Not setting spending limits and alerts. This should be a mandatory step to avoid a lot of the above problems, and the docs clearly indicate how to set it up.
These are some of the cases that I have come across. I hope this serves as a guideline for setting up your app.
Unfortunately, when using the amazing Firebase Realtime Database (i.e., traditional Firebase) and its Cloud Functions, there's really no concept of locking available, other than the base transaction concept. (Which is awesome as far as it goes.) For example, you can't do, say, a read, delete, insert.
We haven't used the new Firestore in a project yet; I'm wondering if it solves that particular problem?
This would make it tremendously useful for things like, well almost anything really, transactional game currencies, logic, etc.
Is this an advantage of Firestore?
Transactions in Firestore are more flexible than those in Realtime Database. With Realtime Database transactions, you had to choose a single location for the transaction, and you could only modify children under that location. All clients have to be using transactions to safely modify the data at that location.
With Firestore transactions, you can transact using any arbitrary set of documents across any set of collections in your database, and you have atomicity on changes made to those documents. You're not obliged to choose just one collection or just one document.
There is no such thing as a "lock" in either product. Locks are not provided because they're difficult to manage correctly (avoiding deadlock) while also being scalable to millions of concurrent writers.