Understanding Firebase and the purpose of Google Cloud Functions

Let's say I'm developing an app like Instagram for iOS, Android and Web. I decided to use Google Firebase as it really seems to simplify the work.
The features users need in the app are:
Authorization/Registration
Uploading photos
Searching for other people, following them and seeing their photos
I come from traditional "own-backend" development, where I need to set up a server, create a database and finally write the API that lets the frontend retrieve data from the server. That's why it's unclear to me how it all works in Firebase.
So the question is how I can create such an app:
Should I create my own API with Cloud Functions? Or is it OK to work with the database directly from the client side?
If I work with the database directly why do I need cloud functions? Should I use them?
Sorry for such silly questions, but it is really hard to grasp when starting from scratch.

The main difference between Firebase and the traditional setup you describe is that with Firebase, as far as the app developer is concerned, the client has direct access to the database, without the need for an intermediate custom API layer. Firebase provides SDKs in various languages that you would typically use to fetch the data you need / commit data updates.
You also have admin SDKs that you can use server-side, but these are meant for running custom business logic (analytics or caching in an external service, for example), not for implementing a data-fetching API layer.
This has 2 important consequences:
You must define security rules to control who is allowed to read/write at what paths in your database. These security rules are defined at the project level, and rely on the authenticated user (using Firebase Authentication). Typically, if you store the user profile at the path users/$userId, you would define a rule saying that this node can be written to only if the authenticated user has an id of $userId.
You must structure your data in a way that makes it easily readable, without the need for complex database operations such as JOINs, which are not supported by Firebase (you do have some limited querying options though).
These 2 points allow you to skip the 2 main roles of traditional APIs: validating access and fetching/formatting the data.
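To make the "structure your data for easy reads" point more concrete, here is a minimal sketch of a fan-out write for a new photo. The paths (photos/, userPhotos/, feeds/) and the function name are illustrative assumptions, not something prescribed by Firebase; the idea is only that each screen can then read one path instead of joining several.

```javascript
// Hypothetical data layout: photos/, userPhotos/ and feeds/ are made-up path names.
// Builds one object whose keys are database paths and whose values are the data
// to store there, so every reader can fetch what it needs from a single path.
function buildPhotoFanOut(photoId, photo, followerIds) {
  const updates = {};
  // Canonical copy of the photo.
  updates[`photos/${photoId}`] = photo;
  // Index of the owner's own photos (for the profile screen).
  updates[`userPhotos/${photo.ownerId}/${photoId}`] = true;
  // Denormalized copy in each follower's feed (for the home screen).
  for (const followerId of followerIds) {
    updates[`feeds/${followerId}/${photoId}`] = {
      ownerId: photo.ownerId,
      url: photo.url,
    };
  }
  return updates;
}

const updates = buildPhotoFanOut(
  'p1',
  { ownerId: 'alice', url: 'https://example.com/p1.jpg' },
  ['bob', 'carol']
);
console.log(Object.keys(updates));
```

With the Realtime Database SDK, an object of this shape can be passed to update() on the root reference, which applies all paths as one atomic multi-path update.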
Cloud functions allow you to react to data changes. Let's say that every time a new user is created, you want to send them a welcome email: you could define a cloud function that sends this email every time a new node is appended to the users path. Cloud functions let you run the code you would typically run server-side when writes happen, so they have a very broad range of use cases: side effects (such as sending an email), caching data in an external service, caching data within Firebase for easier reads, analytics, etc.
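As a rough sketch of the welcome-email case, the handler logic can be kept separate from the Firebase wiring so it runs without the SDK. The names buildWelcomeEmail, onUserCreated and sendEmail are hypothetical, and the mail sender is injected by the caller rather than being any particular mail service.

```javascript
// Pure helper: turns a user record into an email message, or null if there
// is no address to send to. The record shape ({ name, email }) is an assumption.
function buildWelcomeEmail(user) {
  if (!user || !user.email) return null;
  return {
    to: user.email,
    subject: 'Welcome!',
    body: `Hi ${user.name || 'there'}, thanks for signing up.`,
  };
}

// Handler: `sendEmail` is a hypothetical async function supplied by the caller.
async function onUserCreated(user, sendEmail) {
  const message = buildWelcomeEmail(user);
  if (message === null) return false;
  await sendEmail(message);
  return true;
}

// In a deployed function (firebase-functions v1 API) the wiring would look roughly like:
//   exports.welcome = functions.database.ref('/users/{userId}')
//     .onCreate((snapshot) => onUserCreated(snapshot.val(), sendEmail));

// Local demonstration with a fake mailer that only records messages.
const outbox = [];
onUserCreated({ name: 'Ana', email: 'ana@example.com' }, async (msg) => {
  outbox.push(msg);
}).then((sent) => console.log(sent, outbox.length));
```

Keeping the handler free of SDK calls also makes it easier to test locally, which helps with the debugging difficulty mentioned in another answer.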

You don't really need a server, you can access the database directly from the client, as long as your users are authenticated and you have defined reasonable security rules on Firebase.
In your use case you could, for example, use cloud functions to create a thumbnail when someone uploads a photo (Firebase Cloud Functions has ImageMagick included for that), or to denormalize your data so your application is faster, or to generate logs. So, basically you can use them whenever you need to do some server side processing when something changes on your database or storage. But I find cloud functions hard to develop and debug, and there are alternatives such as creating a Node application that subscribes to real time changes in your data and processes it. The downside is that you need to host it outside Firebase.

My answer is definitely NOT complete or professional, but here are the reasons why I choose Cloud Functions
Performance
You mentioned that you're writing an Instagram-like mobile app, so I assume that people can comment on others' pictures and view those comments. How would you download comments from the database and display them on users' devices? There could be hundreds, maybe thousands of comments on a single post, so you'll need to paginate your results. Why not let the server do all the hard work and have users' devices simply wait for the results? This may not seem like a big improvement, but if your app is incredibly successful you'll have millions of users and millions of comments to deal with, and a server will do those heavy jobs far better than a mobile phone.
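A minimal sketch of what server-side pagination of comments could look like, using a plain array as a stand-in for the database. The function name, the comment shape ({ id, createdAt }) and the cursor convention are assumptions for illustration.

```javascript
// Returns one page of comments that come after `cursor` (a comment id),
// ordered by creation time, plus the cursor to pass in for the next page.
function getCommentPage(comments, pageSize, cursor = null) {
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError('pageSize must be a positive integer');
  }
  const sorted = [...comments].sort((a, b) => a.createdAt - b.createdAt);
  // An unknown cursor makes findIndex return -1, so paging restarts at the beginning.
  const start = cursor === null ? 0 : sorted.findIndex((c) => c.id === cursor) + 1;
  const items = sorted.slice(start, start + pageSize);
  const hasMore = start + pageSize < sorted.length;
  return { items, nextCursor: hasMore ? items[items.length - 1].id : null };
}

const comments = [
  { id: 'c3', createdAt: 3 },
  { id: 'c1', createdAt: 1 },
  { id: 'c2', createdAt: 2 },
];
const firstPage = getCommentPage(comments, 2);
const secondPage = getCommentPage(comments, 2, firstPage.nextCursor);
console.log(firstPage.items.map((c) => c.id), secondPage.items.map((c) => c.id));
```

The client only ever receives pageSize comments per request, no matter how many exist on the post.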
Security
If your project is small, then it's true that you won't worry about performance, but what about security? If you do everything on the client side, you're basically allowing every device to connect to your database, meaning that every device can read from and write to your database. Once a malicious user has found out your database URL, all they have to do is run
firebase.database().ref(...).remove();
With one line of code, you'll lose all your data. Okay, you might say, then I'll just come up with some good security rules, for example rules under which only the owner of a post can read it or make changes to it, and everyone else is forbidden from doing anything. That's good, but not realistic. People are supposed to be able to comment on a post, and that modifies the post, so such a rule doesn't fit the situation. But if you let everybody read and write, it's not safe either. Then why not just set .read and .write to false for the whole database?
It's 100% safe, because no client can do anything to anything in your database. Then you write an API that performs all the operations on your database (server code using the Admin SDK is not subject to security rules). The API limits the operations that can be performed. And since you have experience writing APIs, I'm sure you can make it strong in terms of security. For example, if a user wants to delete a post that they created, your deletePost API is supposed to authenticate the user first. This way, 'nobody' can cause any damage to your database.
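Here is a rough sketch of the checks such a deletePost API might perform. It uses an in-memory Map in place of the real database, and it assumes authUid comes from an ID token that has already been verified elsewhere; the post shape and status codes are illustrative choices.

```javascript
// In-memory stand-in for the database; a real API would talk to Firebase
// from server code instead.
const posts = new Map([['post1', { ownerId: 'alice', text: 'hello' }]]);

// `authUid` is assumed to come from an already verified ID token
// (token verification is not shown here).
function deletePost(db, authUid, postId) {
  if (!authUid) return { status: 401, error: 'not authenticated' };
  const post = db.get(postId);
  if (!post) return { status: 404, error: 'post not found' };
  // Only the owner may delete the post.
  if (post.ownerId !== authUid) return { status: 403, error: 'not the owner' };
  db.delete(postId);
  return { status: 200 };
}

console.log(deletePost(posts, 'bob', 'post1').status);   // 403
console.log(deletePost(posts, 'alice', 'post1').status); // 200
```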

Related

Should I query my Firebase database directly, or use Cloud Functions?

I am still new to writing a back-end for my app on my own, so I have some concerns.
The one I am asking about here is a security concern about exposing my database structure in the client app.
As is well known, all code written on the client side is "not safe from interested clients".
I read this Medium post by Doug Stevenson from the Firebase team.
What I am looking for is an answer to the title of my question (which is the same as the title of the Medium post):
Should I query my Firebase database directly, or use Cloud Functions?
but I didn't really get an answer, as he said that it depends on the situation and the requirements of my app.
So can anyone tell me whether it is OK, from a security perspective, to run direct queries on the client side that expose the structure of the data in my database (Firestore), or should I use only Cloud Functions for this?
notes:
I am aware that real-time data can only be achieved using client-sdks and thus I should give up that feature if I don't want to share my database structure in the client app
Allowing direct client access is as safe as you choose to make it.
There's nothing about the structure of data that's not secure. Your implementation lacks security only if users are able to do things that you didn't intend for them to do. That's entirely up to you to implement with security rules. If your rules accurately express what users should and should not be able to do, you will have no problem. If you are unable to use security rules to meet your needs, then you should force access through a backend.

Does Firebase have a way to limit access to all public data in the security rules?

Update: Editing the question title/body based on the suggestion.
Firebase store makes everything that is publicly readable also publicly accessible to the browser with a script, so nothing stops any user from just saying db.get('collection') and saving all the data as theirs.
In a more traditional DB setup, where an app's frontend pulls data from a backend, the user would at least have to go through the extra trouble of tweaking the UI and then scraping the frontend to pull more and more data (think of Twitter's "load more" button).
My question was whether it was possible to limit users from accessing the entire database in a click, while also keeping the data publicly available.
Old:
From what I understand, any user who can see data coming out of a Firebase datastore can also run a query to extract all of that data. That is not desirable when data itself is of any value, and yet Firebase is such an easy to use tool, it's great for pretty much everything else.
Is there a way, or a best practice, for structuring the data or access rules such that users see the data but can't just run a script to download all of it?
Thanks!
Kato once implemented a simplistic rate limit for writes in Realtime Database security rules (see "Firebase rate limiting in security rules?"). Something similar could be possible in Cloud Firestore rules. But this approach won't work for reads, since you can't update the timestamp at the same time the read is performed.
You can however limit what queries a user can perform on your database. For example, to limit them to reading 50 documents at a time:
allow list: if request.query.limit <= 50;

How to secure database without authentication?

I am creating a Unity game where I want to have a global top 50 score list with usernames. I use the Firebase Realtime Database. There is no need for users to authenticate. I am not that familiar with database security and am pretty much a beginner with this concept. I am using a REST API package from the Unity Asset Store because it made it pretty easy to send data to and get data from the database.
How can I be sure that every score sent to database is from my app?
Add a dedicated user with password to your database
Somewhere in your app, add those credentials, e.g. in a ScriptableObject or in some component
Always use those credentials to authenticate
Note that your app can still be decompiled and thereby cheated.
You can at least make it more difficult by encrypting the data etc.
The only real way around this is to have an account and session server to ensure that a user is logged in with a valid session.
If you don't use Firebase Authentication, you can't restrict who can access your database. Anyone will be able to issue a query, and they can even do it using the Realtime Database REST API. All they have to know is the name of your project.
Even if you do use Firebase Authentication, anyone may still effectively authenticate and access the database outside of your app using other public APIs.
My experience is that you can't stop dedicated "users" from cheating at global high scores. I made a small handful of trivial games for Windows Phone with a global top 50. Even if your game is unpopular, and you obfuscate your code, and you are on an unpopular platform, and you encrypt your network traffic, somebody is going to jailbreak their phone, decompile your app, and inject their own high score into your game before high scores are sent to the global list. The only way I ever came up with to combat this was to keep track of play sessions on the server, to make sure scores were theoretically possible based on how long the player had been playing.
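The session-based check described above could be sketched like this. MAX_POINTS_PER_SECOND, the session shape and the function name are made-up, game-specific assumptions; the only idea taken from the answer is comparing a submitted score against what is possible in the elapsed play time.

```javascript
// Hypothetical game-specific limit: the most points a player could earn per second.
const MAX_POINTS_PER_SECOND = 10;

// `session` is assumed to be a server-side record created when play started.
// Returns true only if the score could have been earned in the elapsed time.
function isScorePlausible(session, score, nowMs) {
  if (!session || !Number.isFinite(score) || score < 0) return false;
  const elapsedSeconds = (nowMs - session.startedAtMs) / 1000;
  if (elapsedSeconds <= 0) return false;
  return score <= elapsedSeconds * MAX_POINTS_PER_SECOND;
}

const session = { startedAtMs: 0 };
console.log(isScorePlausible(session, 500, 60000));  // true: 60 s allows up to 600
console.log(isScorePlausible(session, 5000, 60000)); // false: too many points for 60 s
```

This does not stop cheating outright, it only bounds how far a forged score can stray from what the play time allows.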
Disclaimer: I don't know anything about Firebase
From what I can tell, you will need to set up access for Default and Public sections of your configuration to tell the database who can and cannot access your database. Here's their documentation on Get Started with Database Rules.
In general database access, no one should know the details of your connection to a database, so all calls should only ever come from your app.

can I post data to my server directly instead of firebase

I am considering firebase for an app - mainly for the real-time but other features like the analytics and authentication (and price) are other bonuses.
I have my own database and I want everything saved in there. Firebase will have a small portion of the dataset I push as it's needed.
So I'm basically thinking that the Firebase data will be read-only for users. If a user comments, that will actually go to my server, where I'll authenticate, clean, whatever, and then push to that feed.
Are there problems with this approach? Are there other (better) ways to solve the problem?
This is a completely valid approach. Firebase is designed so you can use specific features that suit your needs.

Firebase and indexing/search

I am considering using Firebase for an application that should allow people to use full-text search over a collection of a few thousand objects. I like the idea of delivering a client-only application (not having to worry about hosting the data), but I am not sure how to handle search. The data will be static, so the indexing itself is not a big deal.
I assume I will need some additional service that runs queries and returns Firebase object handles. I can spin up such a service at some fixed location, but then I have to worry about its availability and scalability. Although I don't expect too much traffic for this app, it can peak at a couple of thousand concurrent users.
Architectural thoughts?
Long-term, Firebase may have more advanced querying, so hopefully it'll support this sort of thing directly without you having to do anything special. Until then, you have a few options:
Write server code to handle the searching. The easiest way would be to run some server code responsible for the indexing/searching, as you mentioned. Firebase has a Node.JS client, so that would be an easy way to interface the service into Firebase. All of the data transfer could still happen through Firebase, but you would write a Node.JS service that watches for client "search requests" at some designated location in Firebase and then "responds" by writing the result set back into Firebase, for the client to consume.
Store the index in Firebase with clients automatically updating it. If you want to get really clever, you could try implementing a server-less scheme where clients automatically index their data as they write it... So the index for the full-text search would be stored in Firebase, and when a client writes a new item to the collection, it would be responsible for also updating the index appropriately. And to do a search, the client would directly consume the index to build the result set. This actually makes a lot of sense for simple cases where you want to index one field of a complex object stored in Firebase, but for full-text-search, this would probably be pretty gnarly. :-)
Store the index in Firebase with server code updating it. You could try a hybrid approach where the index is stored in Firebase and is used directly by clients to do searches, but rather than have clients update the index, you'd have server code that updates the index whenever new items are added to the collection. This way, clients could still search for data when your server is down. They just might get stale results until your server catches up on the indexing.
Until Firebase has more advanced querying, #1 is probably your best bet if you're willing to run a little server code. :-)
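Whichever option is chosen, the index itself is the same kind of structure. Below is a minimal sketch of an inverted index and an AND-style search over it, held in memory; the item shape ({ text }) and the function names are assumptions, and the tokenizer only handles ASCII letters and digits.

```javascript
// Splits text into lowercase ASCII words; a real tokenizer would need more care.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Builds an inverted index: word -> Set of ids of the items containing that word.
function buildIndex(items) {
  const index = new Map();
  for (const [id, item] of Object.entries(items)) {
    for (const word of tokenize(item.text)) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(id);
    }
  }
  return index;
}

// Returns the sorted ids of items that contain every word of the query.
function search(index, query) {
  const words = tokenize(query);
  if (words.length === 0) return [];
  let result = null;
  for (const word of words) {
    const ids = index.get(word) || new Set();
    result = result === null
      ? new Set(ids)
      : new Set([...result].filter((id) => ids.has(id)));
  }
  return [...result].sort();
}

const index = buildIndex({
  a: { text: 'Red bicycle for sale' },
  b: { text: 'Blue bicycle' },
  c: { text: 'Red car' },
});
console.log(search(index, 'red bicycle')); // [ 'a' ]
console.log(search(index, 'bicycle'));     // [ 'a', 'b' ]
```

In option 1 this would live inside the server process; in options 2 and 3 the word-to-ids mapping would be written into Firebase so that clients can read it directly.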
Google's current method to do full text search seems to be syncing with either Algolia or BigQuery with Cloud Functions for Firebase.
Here's Firebase's Algolia Full-text search integration example, and their BigQuery integration example that could be extended to support full search.

Resources