Ingest messages from mobile client app, through BEAM, to Firestore - firebase

I'm working on a personal project to get familiar with a number of Google/GCP projects and services (Flutter/Dart, Firestore, Cloud Functions, PubSub, Dataflow/BEAM).
For context on the "toy" problem I'm solving, imagine a mobile/cloud-based Ouija board:
a board printed with letters, numbers, and other signs, to which a planchette or movable indicator points, supposedly in answer to questions from people at a seance.
Mobile users can create a board, share the link with friends, and collectively ask the board simple questions (e.g., "Does Suzie have a crush on Johnny?"). Players on the same board gesture in the UI to coax the Ouija "indicator" to the answer in a "crowd-sourced" way. Apache Beam gives me the tools to group/window/process the streams of user data before writing to Firestore.
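To make the group/window step concrete, here is a minimal sketch in plain Python of what a fixed-window average per board could compute. It is only a stand-in for the Beam windowing transforms, not Beam code; the `Gesture` tuple layout, the function name, and the one-second window are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical gesture record: (timestamp_seconds, board_id, dx, dy)
Gesture = Tuple[float, str, float, float]


def average_per_window(
    gestures: Iterable[Gesture], window_seconds: float = 1.0
) -> Dict[Tuple[str, int], Tuple[float, float]]:
    """Average the (dx, dy) nudges per board within fixed, non-overlapping windows."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # key -> [sum_dx, sum_dy, count]
    for ts, board_id, dx, dy in gestures:
        # Window index is the timestamp divided by the window size, rounded down.
        key = (board_id, int(ts // window_seconds))
        acc = sums[key]
        acc[0] += dx
        acc[1] += dy
        acc[2] += 1
    return {key: (sx / n, sy / n) for key, (sx, sy, n) in sums.items()}
```

Each (board, window) pair yields one averaged nudge, which is the value the pipeline would write back to the board's Firestore document.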
The current technical question I'm wrestling with is: what is a "low code", lowest-latency pattern for sending messages from the mobile app into a Beam pipeline for processing (which ultimately changes Document properties in a Firestore database)?
I have 3 Options in mind depicted in the following diagram:
OPTION 1: Mobile app uses Cloud Functions, which accept the payload and publish to a PubSub topic. This was my initial idea, but as I started to look at it closer, I questioned whether the Cloud Function is even necessary. Enter option 2
OPTION 2: Remove the Cloud Function and have the UI directly publish to the PubSub topic. As I looked into the docs, it seemed like PubSub was meant to be harnessed server-side, so maybe I still need the Cloud Function? Or maybe I lean into Firebase more... Enter option 3
OPTION 3: Have the client only post messages through Firebase, and use Cloud Functions to pick these messages up and push into PubSub and/or BEAM. Advantage is I can benefit from the "offline" handling Firestore client gives me, but this feels like the worst-case for latency.
Am I missing something? Is this question too much of an "opinion" question?
Thanks in advance.
Here's a link to the Google Diagram
EDITS
So after thinking things through more, Authentication/Authorization is another consideration:
OPTION 2 - I would have to ask Google Account users to give permission to use "Pub/Sub", which feels like a bad UX. Users don't know what "Pub/Sub" is!! Option 1 would allow me to put in a service account from Cloud Function to Pub/Sub... and also use other Authentication providers besides Google.
OPTION 1 - I still need some kind of security at the Cloud Function layer to prevent bad actors.
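As a rough sketch of the Option 1 layer, the function below validates an incoming gesture and forwards it to a topic. It is written under assumptions: `publish` stands in for whatever Pub/Sub publisher call the function would use, `uid` stands in for the user ID from an already-verified auth token, and the payload fields (`boardId`, `direction`) are hypothetical.

```python
import json
from typing import Callable, Optional, Tuple

# Hypothetical gesture vocabulary for the board UI.
ALLOWED_DIRECTIONS = {"up", "down", "left", "right"}


def handle_gesture(
    payload: dict, uid: Optional[str], publish: Callable[[bytes], None]
) -> Tuple[int, str]:
    """Validate a gesture and forward it; returns an HTTP-style (status, message)."""
    if uid is None:
        # Token verification failed upstream: reject before touching the topic.
        return 401, "unauthenticated"
    board_id = payload.get("boardId")
    direction = payload.get("direction")
    if not isinstance(board_id, str) or not board_id:
        return 400, "missing boardId"
    if not isinstance(direction, str) or direction not in ALLOWED_DIRECTIONS:
        return 400, "invalid direction"
    # Attach the verified uid server-side so clients cannot spoof it.
    message = {"boardId": board_id, "direction": direction, "uid": uid}
    publish(json.dumps(message, sort_keys=True).encode("utf-8"))
    return 204, ""
```

Injecting `publish` keeps the validation logic testable without a real topic, and the service account credentials stay on the server side.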

Related

Deciding when to use firebase cloud functions instead of only firestore client sdk [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
I know there’s no correct answer, but I would like to hear your thoughts on a specific example:
Say you have a restaurant, and this restaurant has an ordering system with 3 applications:
1. for the restaurant ‘boss’ (for lack of a better word).
2. for the delivery guys.
3. for the client (ordering food).
The ‘boss’ app works more like an admin account. Therefore I have decided to use Cloud Functions with the Admin SDK to add new users, i.e. delivery guys' accounts. The construction of menus and dishes all happens here. I’m currently using only the client SDK for this, simply because it's fast for such simple tasks and the cache is a big plus. Now, security-wise, my thought is that the ‘boss’ wouldn’t want to tamper with his own documents, so allowing him read and write access to all documents here via security rules seems fine. Would like to hear your thoughts, though.
The delivery app is a bit harder, because I’ve been using only the client SDK here and allowing read and write access to order documents for delivery guys. This may be a bit naive, since allowing write access to a delivery guy opens up the possibility of tampering with specific fields on the order doc. I decided to do it that way simply for the speed and cache, but now I’m thinking that I should change it. There is also a chat functionality between the ‘boss’ and the delivery guys. I’m using one document per conversation, and I have a Cloud Function that archives older messages when the document approaches the 1 MB limit. But since all reads and writes to the chat doc can happen from the delivery guy via the client SDK, this leaves it a bit open security-wise as well (even with more complex security rules). I might be able to solve this if I stopped using one doc and had a doc per message. And with some good security rules I could make this a bit more secure, but not as secure as using a Cloud Function.
For the client app, I have not coded anything on the front end at all; I’m thinking of using Cloud Functions for security when order requests are submitted.
I really like the speed (for simple queries) and cache when using the client only. If I have something a little more compute-heavy or complex I have used Cloud Functions, which works fine. I also like using Cloud Functions to trigger stuff after the client SDK has done something. My big concern with using Cloud Functions as a sort of wrapper or middleman for simple queries too is the speed on cold starts (and sometimes it’s a bit slow even after it's spun up, compared to using only the client SDK). I’m not a big fan of losing the cache and streams that I have with the client SDK. But since security is very important, I’m a bit torn on what to do.
I’m not asking for any code or the perfect answer, just your thoughts and if you have had any experience with this. I’m leaning towards sticking to the client sdk even for the delivery guys and try to moderate this through even better and complex security rules. And then, maybe try to exploit it myself to see how good it is. What are your thoughts?
I’ve read this article which I found useful: https://medium.com/firebase-developers/should-i-query-my-firebase-database-directly-or-us
And watched the video series on cloud functions by firebase, and I found the docs really helpful (both cloud functions and firestore).
Whenever you need to handle sensitive information that the client could compromise or manipulate beyond what Security Rules can prevent, you should use Cloud Functions as a source of authority.
1. Firestore + Security Rules
2. Firestore + Security Rules
3. Firestore + Security Rules
Cloud Functions only work with the Admin SDK; you can't use the client modules successfully in a Node.js environment.
Security Rules are your friend: you can deny writes and updates specifically. Chat should be managed through the Realtime Database.
Use Cloud Functions to finalize orders, cross-reference prices, etc., but the rest can be done with Firestore and smart Security Rules to prevent illegal edits.
Additionally, you can use Custom Claims to denote who has what role and validate who can do what with rules, so only bosses can issue refunds, waive costs, etc.
Reference:
https://firebase.google.com/docs/firestore/security/rules-conditions#data_validation
https://firebase.google.com/docs/auth/admin/custom-claims
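A minimal sketch of those two server-side checks, written in plain Python rather than against the Admin SDK: recomputing an order total from a trusted price table, and gating refunds on a role claim. The price table, the `role` claim name, and both function names are assumptions for illustration.

```python
from typing import Dict, List

# Hypothetical server-side price table; the client never gets to set prices.
MENU_PRICES_CENTS = {"pizza": 1200, "soda": 250}


def finalize_order(items: List[Dict], claimed_total_cents: int) -> int:
    """Recompute the total from trusted prices and reject a mismatching client total."""
    total = 0
    for item in items:
        name, qty = item["name"], item["qty"]
        if name not in MENU_PRICES_CENTS:
            raise ValueError(f"unknown dish: {name}")
        if not isinstance(qty, int) or qty <= 0:
            raise ValueError(f"invalid quantity for {name}")
        total += MENU_PRICES_CENTS[name] * qty
    if total != claimed_total_cents:
        raise ValueError("client total does not match server prices")
    return total


def can_issue_refund(claims: Dict) -> bool:
    """Only callers whose token carries the (assumed) 'boss' role claim may refund."""
    return claims.get("role") == "boss"
```

The point of the design is that the client only proposes an order; the authoritative total and the role check both live where the client cannot edit them.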

Is there a Google API answering about Firestore database either Metrics or Health Checks or Current Active Connections or Exceptions or Performance

Context: I am a total Google Cloud beginner and I have just convinced my company's leadership to use Firestore Realtime Database for pushing transaction status to our mobile application. We have around 4 million users who will use our application heavily for small money transfers. Nowadays we poll our microservice endpoints from Android/iOS; this will be replaced by the Firebase SDK imported into our mobile app, which will listen to/observe our Firestore collection under a few Firestore rules. Since every money transfer will be confirmed or denied in a short time (from a few seconds to 1 or 2 minutes), the idea of replacing polling with a truly reactive approach straight from Firestore sounded good, and the coding is already under way.
The issue: Firstly, I don't want to compare solutions. It is just my reality: the production support operators must work from our internal dashboard. They are not allowed to look at the Google Dashboard Console (please accept this for this question). I need to get on-demand metrics for our Firestore. It has nothing to do with Google pricing. It is just our requirement: they want to see metrics like:
how many users listening at the same time now
how many users took some exception during connection
is there any user holding connection for more than X minute
when was the connection peak this morning
any exception of any type surrounding our Firestore database
I read the Code Samples carefully and followed a sample step by step, trying to figure out whether there is some API providing the answers I am looking for.
So, my straight question is: is there such a Google API providing metrics about my Firestore database? Maybe something following the same idea as Performance Monitoring, which works on the mobile side, but with a similar approach on the Firestore side.
*** Edited
Future readers may also find it worth reading about a way to get Firestore metrics info straight from curl/Postman.
A couple of things: You mentioned both Firestore and Realtime Database; just wanted to make sure that you are aware that those are two different databases offered under the Firebase umbrella.
how many users listening at the same time now
is there any user holding connection for more than X minute
Yes, there's a dashboard: https://support.google.com/firebase/answer/6317517?hl=en. Including lots of options, like users active in the last 30 mins.
how many users took some exception during connection
any exception of any type surrounding our Firestore database
Yes, you can track errors and other logging via Stackdriver Logging. These can give you reports on your Cloud Functions.
https://cloud.google.com/functions/docs/monitoring
Where can I find Stackdriver in Firebase console?
when was the connection peak this morning
For this one, I'm not sure if you mean A. when did somebody log on in the morning, or B. at what time was there peak usage. If B, see the dashboard mentioned in the first answer above. If A,
Realtime Database has the concept of presence, which lets you know if a user is currently logged in or not. See examples here from the official documentation:
https://firebase.google.com/docs/firestore/solutions/presence
and this post
How to make user presence mechanism using Firebase?
Also applies to your
is there any user holding connection for more than X minute
..............
Edit in response to comments: I believe you are experiencing the XY problem https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem where you are focused on a particular solution, even though your problem has other solutions. User metrics, database events, and errors are all accessible through both dashboards and cloud functions. You can cURL cloud functions if you wish, or set up cron functions to auto report, or set up database trigger functions to log errors. So, while the exact way you want this to work may not exist, you just need to connect existing tools to get the result you want.
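As one way of connecting those tools, suppose presence writes or trigger functions log a connect and disconnect timestamp per listener. A small sketch like the following could then answer the "peak" and "longer than X minutes" questions for an internal dashboard. The `(start, end)` session tuples and both function names are assumptions, not part of any Firebase API.

```python
from typing import Iterable, List, Tuple

Session = Tuple[float, float]  # hypothetical (connected_at, disconnected_at) in seconds


def peak_concurrency(sessions: Iterable[Session]) -> Tuple[int, float]:
    """Return (max simultaneous listeners, time at which that maximum was first reached)."""
    events = []
    for start, end in sessions:
        events.append((start, 1))   # a listener connects
        events.append((end, -1))    # a listener disconnects
    # Sort by time; at equal timestamps the -1 events sort before the +1 events.
    events.sort()
    current = peak = 0
    peak_time = 0.0
    for t, delta in events:
        current += delta
        if current > peak:
            peak, peak_time = current, t
    return peak, peak_time


def sessions_longer_than(sessions: Iterable[Session], seconds: float) -> List[Session]:
    """Return the sessions that were held open for more than `seconds`."""
    return [(s, e) for s, e in sessions if e - s > seconds]
```

A scheduled function could run these over the logged sessions and push the results to the operators' dashboard.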

Firebase, Client Server-Side vs Cloud Functions Server-Side

Assume there is a chat app that needs to delete chat message documents
when the total number of documents reaches 5.
Yes, I saw this example in the guidelines,
but can I do this from the client app on Android (not in Cloud Functions)?
Like this:
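db.collection("chat").orderBy("something").get().addOnCompleteListener(task -> {
    // Only act if the query succeeded and returned more than 5 documents.
    if (task.isSuccessful() && task.getResult().getDocuments().size() > 5) {
        db.collection("blahblah").document("blahblah").delete()....
    }
});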
Is there any disadvantage to this,
if I do these things outside of Cloud Functions?
Thank you. (I also saw a question that looks similar to this one, but that's not my case.)
The disadvantage is that you're making the client app do the work, when you could instead do it more efficiently in Cloud Functions. The user pays the cost against their data plan by downloading all the documents in "chat", then deleting each document (requiring more round trips with the server). Sure, you could make the client do this work, but do you want them to pay for it in terms of data usage and speed? And what if other clients are each also trying to do the same thing?
See also my blog: Should I query my Firebase database directly or use Cloud Functions?
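For illustration, the server-side trimming logic could be as small as the sketch below: given the messages in a chat, pick the oldest ones beyond the newest few for deletion. It is plain Python rather than Cloud Functions code; the `(doc_id, created_at)` tuples, the function name, and the default of keeping 5 are assumptions.

```python
from typing import List, Tuple


def messages_to_delete(messages: List[Tuple[str, float]], keep: int = 5) -> List[str]:
    """Return the IDs of the oldest messages beyond the newest `keep`.

    `messages` is a list of (doc_id, created_at) pairs in any order.
    """
    if len(messages) <= keep:
        return []
    newest_first = sorted(messages, key=lambda m: m[1], reverse=True)
    return [doc_id for doc_id, _ in newest_first[keep:]]
```

Run from a trigger on new messages, this decision is made once on the server instead of by every client that happens to have the chat open.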

Understanding the Firebase and purpose of google cloud functions

Let's say I'm developing app like Instagram: for iOS, Android and Web. I decided to use Google Firebase as it really seems to simplify the work.
The features user needs in the app are:
Authorization/Registration
Uploading photos
Searching for other people, following them and see their photos
I come from traditional "own-backend" development where I do need to setup a server, create database and finally write the API to let the frontend retrieve the data from the server. That's the reason why it's unclear to me how it all works in Firebase.
So the question is how can I create such app:
Should I create my own API with cloud functions? Or it's ok to work with the database directly from the client-side?
If I work with the database directly why do I need cloud functions? Should I use them?
Sorry for such silly questions, but it is really hard to get from scratch.
The main difference between Firebase and the traditional setup you describe is that with Firebase, as far as the app developer is concerned, the client has direct access to the database, without the need for an intermediate custom API layer. Firebase provides SDKs in various languages that you would typically use to fetch the data you need / commit data updates.
You also have Admin SDKs that you can use server-side, but these are meant for running custom business logic (such as analytics or caching in an external service, for example), not for implementing a data-fetching API layer.
This has 2 important consequences:
You must define security rules to control who is allowed to read/write at what paths in your database. These security rules are defined at the project level, and rely on the authenticated user (using Firebase Authentication). Typically, if you store the user profile at the path users/$userId, you would define a rule saying that this node can be written to only if the authenticated user has an id of $userId.
You must structure your data in a way that makes it easily readable, without the need for complex database operations such as JOINs, which are not supported by Firebase (you do have some limited querying options, though).
These 2 points allow you to skip the 2 main roles of traditional APIs: validating access and fetching/formatting the data.
Cloud Functions allow you to react to data changes. Let's say every time a new user is created, you want to send them a welcome email: you could define a Cloud Function that sends this email every time a new node is appended to the users path. They allow you to run the code you would typically run server-side when writes happen, so they can have a very broad range of use cases: side effects (such as sending an email), caching data in an external service, caching data within Firebase for easier reads, analytics, etc.
You don't really need a server, you can access the database directly from the client, as long as your users are authenticated and you have defined reasonable security rules on Firebase.
In your use case you could, for example, use cloud functions to create a thumbnail when someone uploads a photo (Firebase Cloud Functions has ImageMagick included for that), or to denormalize your data so your application is faster, or to generate logs. So, basically you can use them whenever you need to do some server side processing when something changes on your database or storage. But I find cloud functions hard to develop and debug, and there are alternatives such as creating a Node application that subscribes to real time changes in your data and processes it. The downside is that you need to host it outside Firebase.
My answer is definitely NOT complete or professional, but here are the reasons why I choose Cloud Functions
Performance
You mentioned that you're writing an instagram-like mobile device app, then I assume that people can comment on others' pictures, as well as view those comments. How would you like to download comments from database and display them on users' devices? I mean, there could be hundreds, maybe thousands of comments on 1 post, you'll need to paginate your results. Why not let the server do all the hard work, free up users' devices and wait for the results? This doesn't seem like a lot better, but let's face it, if your app is incredibly successful, you'll have millions of users, millions of comments that you need to deal with, server will do those hard jobs way better than a mobile phone.
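To show what paginating comments means in practice, here is a small cursor-based sketch in plain Python. It sorts an in-memory list for simplicity, whereas a real backend would issue an ordered, limited query; the `Comment` tuple, the function name, and the assumption that `created_at` values are unique are all illustrative.

```python
from typing import List, Optional, Tuple

Comment = Tuple[int, str]  # hypothetical (created_at, text); created_at assumed unique


def comments_page(
    comments: List[Comment], page_size: int, after: Optional[int] = None
) -> Tuple[List[Comment], Optional[int]]:
    """Return one page of comments, oldest first, plus a cursor for the next page.

    `after` is the created_at of the last comment already shown; the returned
    cursor is None when there are no further pages.
    """
    ordered = sorted(comments, key=lambda c: c[0])
    if after is not None:
        ordered = [c for c in ordered if c[0] > after]
    page = ordered[:page_size]
    next_cursor = page[-1][0] if len(ordered) > page_size else None
    return page, next_cursor
```

The device only ever downloads one page at a time and passes the cursor back to ask for the next one.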
Security
If your project is small, then it's true that you won't worry about performance, but what about security? If you do everything on the client side, you're basically allowing every device to connect to your database, meaning that every device can read from and write to your database. Once a malicious user has found out your database URL, all they have to do is
firebase.database().ref(...).remove();
With 1 line of code, you'll lose all your data. Okay, if you say, then I'll just come up with some good security rules like the one below:
This means that for each post, only the owner of that post can make any changes to it or read from it, other people are forbidden to do anything. It's good, but not realistic. People are supposed to be able to comment on the post, that's modifying the post, this rule will not apply to the situation. But again, if you let everybody read/write, it's not safe again. Then, why not just make .read and .write false, like this:
It's 100% safe, because nobody can do anything about anything in your database. Then, you write an API to do all the operations to your database. API limits the operations that can be done to your database. And you have experience in writing APIs, I'm sure you can do something to make your API strong in terms of security, for example, if a user wants to delete a post that he created, in your deletePost API, you're supposed to authenticate the user first. This way, 'nobody' can cause any damage to your database.
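A minimal sketch of the kind of check a deletePost endpoint would perform, assuming the caller's ID has already been taken from a verified auth token. The in-memory `posts` dict, the `owner` field, and the `Forbidden` exception are hypothetical stand-ins for the real database and error handling.

```python
from typing import Dict, Optional


class Forbidden(Exception):
    """Raised when the caller is not allowed to perform the operation."""


def delete_post(posts: Dict[str, Dict], post_id: str, caller_uid: Optional[str]) -> None:
    """Delete a post only if the authenticated caller owns it."""
    if caller_uid is None:
        raise Forbidden("not authenticated")
    post = posts.get(post_id)
    if post is None:
        raise KeyError(post_id)
    if post["owner"] != caller_uid:
        raise Forbidden("only the owner may delete this post")
    del posts[post_id]
```

Because the database itself accepts no direct client writes in this setup, this function is the only path by which a post can disappear.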

Validate data before insertion in Firebase

I'm building an app which uses user contributed content.
The contribution by each user should be available to all others in real time.
I was looking into firebase Realtime database for this.
However, when a user contributes content, there are quite heavy validations and calculations (read server side) to be done on the data before making it available to others.
Is it possible to have a server side validation in firebase ? Or should I look for alternatives ?
Initially, Firebase did not have a feature to implement server-side processing/calculations. All your processing had to be done on the client side.
Now, they've recently introduced a new feature called Cloud Functions for Firebase. It's a really useful new addition where you can write server-side code without the hassle of managing servers or instances. Read up more about it from the above link.
Also, this YouTube playlist by Jen Person is a great start. And you can find examples similar to your use case here.
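One common shape for this, sketched here in plain Python under assumptions: clients write contributions to a pending area that other users cannot read, and a server-side function validates each item before copying it to the published area. The two dicts stand in for database paths, and `validate_text` with its 280-character limit is a hypothetical placeholder for the heavy validation described in the question.

```python
from typing import Callable, Dict


def validate_text(item: Dict) -> bool:
    """Placeholder validation: non-blank text of at most 280 characters."""
    text = item.get("text")
    return isinstance(text, str) and 0 < len(text.strip()) <= 280


def process_contribution(
    pending: Dict[str, Dict],
    published: Dict[str, Dict],
    key: str,
    validate: Callable[[Dict], bool] = validate_text,
) -> bool:
    """Move one pending contribution to the published area if it passes validation.

    The item is removed from `pending` either way; returns True if it was published.
    """
    item = pending.pop(key)
    if not validate(item):
        return False
    published[key] = item
    return True
```

Other users only ever listen to the published area, so they see a contribution in real time as soon as it has passed validation.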

Resources