Firestore operations with cloud functions - firebase

I am developing a mobile app with the Flutter framework, in which I am using the Firestore database and Firebase Cloud Functions. I have successfully implemented both in my project. However, I have some difficulty understanding how to run functions that are independent from my app. For example, say I have a Firestore database of 100 people and want to calculate the average user age for my own analysis. I know that I can call the function calculateAverageAge() from my app, but is there a way to call this function without adding it to my app's code? Should I create a separate project (possibly in a different language) and call those functions from there, or can I somehow trigger this function from the Firebase website? I'm not sure what the right approach is here.
Hopefully that makes sense.
Thanks!

In general there are three common ways to trigger a function:
Explicitly call it from your client-side application code, which applies to either HTTP functions or Callable functions.
Run them periodically on a schedule.
Run them as background functions when something changes in the data (or in another Firebase feature, or in Eventarc for v2 functions).
For your example of calculating the average age of your users, this means that you could combine the last two types and:
Recalculate the average whenever a new user gets added to the database, or a user gets removed from the database.
And then also run a scheduled function daily to update the average for any birthdays.
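As a rough illustration, here is a minimal sketch of both pieces using first-generation Cloud Functions in TypeScript. The users collection, the age field and the stats/averages document are assumptions for the example, not names from your project:

```ts
import * as functions from "firebase-functions";
import * as admin from "firebase-admin";

admin.initializeApp();

// Shared helper: read every user, compute the average age, store the result.
// Collection, field and document names here are assumptions for the example.
async function recalculateAverageAge(): Promise<void> {
  const snapshot = await admin.firestore().collection("users").get();
  const ages = snapshot.docs
    .map((doc) => doc.get("age"))
    .filter((age): age is number => typeof age === "number");
  const average = ages.length ? ages.reduce((a, b) => a + b, 0) / ages.length : 0;
  await admin.firestore().doc("stats/averages").set({ averageAge: average });
}

// Background trigger: runs whenever a user document is created, updated or deleted.
export const onUserWrite = functions.firestore
  .document("users/{userId}")
  .onWrite(() => recalculateAverageAge());

// Scheduled trigger: runs once a day to pick up birthdays.
export const dailyAverageAge = functions.pubsub
  .schedule("every 24 hours")
  .onRun(() => recalculateAverageAge());
```

Note that rereading the whole collection on every write gets expensive as the collection grows; for anything beyond a small data set you'd typically maintain a running aggregate instead.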

Related

Why use Firebase Functions SDK for Unity, if UnityWebRequest exists?

At first, I thought the advantage was being able to call HttpsCallables, but now I know that you can call these with some special format and parameters from Postman (it is also possible using UnityWebRequest, and if not, I could just change them from onCall to onRequest).
Then, I thought that it might include some special authorization info from the client to the server. But context.auth (from https.onCall(data, context)) appears to be undefined. Plus, I can still call the functions from Postman.
Important note: I am not registering users, so I don't need Firebase Auth specifically. But I imagined Firebase added something to verify that the function call was coming from an authorized client (e.g. the app).
I am still using the Functions SDK, but I am wondering: what are the advantages of using this SDK for Unity when UnityWebRequest exists? Why should I add a package when I can perform the same call using a UnityWebRequest? Am I missing something obvious?
Additional information on how I am using Firebase Functions:
I have a level editor where people can contribute with levels. I use a function to add these levels to Firestore.
When these levels are created, a database trigger runs and checks if that level was already created.
Getting levels from the database to replay.
Finally, in the future I plan to create a voting system to help me curate the levels.
Auth state should be passed along from the Firebase Functions client to your callable Cloud Functions automatically. If that is not the case, I'd report that as a bug.
But outside of that: there's indeed nothing the SDK does that you can't also do yourself. Using it is a matter of choosing between greater convenience and more fine-grained control.
If you use the Firebase-provided SDK, you won't have to build anything yourself. But on the other hand, if you build your own client-side implementation of the wire protocols, you have full control over what you do, and don't, implement.
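For context, the callable "wire protocol" that the SDK implements is just an HTTPS POST with a small JSON envelope. Here is a minimal sketch of a raw call, shown with fetch in TypeScript rather than UnityWebRequest; the URL and function name are placeholders:

```ts
// Callable functions expect a POST with a JSON body of the form { "data": ... }
// and respond with { "result": ... } on success or { "error": ... } on failure.
// The region, project and function name below are placeholders for your own.
async function callAddLevel(level: unknown): Promise<unknown> {
  const response = await fetch(
    "https://us-central1-<your-project>.cloudfunctions.net/addLevel",
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // When Firebase Auth is in use, the SDK also sends:
        // "Authorization": `Bearer ${idToken}`,
      },
      body: JSON.stringify({ data: level }),
    }
  );
  const json = await response.json();
  if (json.error) throw new Error(json.error.message);
  return json.result;
}
```

That is roughly what the SDK (or a hand-rolled UnityWebRequest) does for you; the SDK's main added value is attaching the auth token and handling serialization and error mapping automatically.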

Firebase - Perform Analytics from database/firestore data

I am using Firebase as my authentication and database platform in my React Native-Expo app. I have not yet decided if I will be using the Realtime Database or Firestore.
I need to perform statistical analysis on daily data gathered from my users, which is stored in the database. For example, users type in their daily protein intake; from that I would like to calculate their weekly average and expected monthly average, and provide suggestions for types of food if the protein intake is too low, etc.
What would be the best approach in order to achieve the result wanted in my specific situation?
I am really unfamiliar with this and stepping into uncharted territory regarding how I can accomplish it. I have read that Firebase Analytics generates various basic analytics regarding usage of the app, the number of crash-free users, etc. But can it perform analytics on custom events? Can I create a custom event for Firebase Analytics to keep track of a certain node in my database, and output analytics from that? And if yes, does it work with React Native-Expo or do I need to detach from Expo? In addition, I have read that Firebase Analytics can be combined with Google BigQuery. Would this be an alternative for my case?
Are there any other ways of performing such data analysis on my data stored in Firebase database? For example, export the data and use Python and SciKit Learn?
Whatever opinion or advice you may have, I would be grateful if you could share it!
You're not alone - many people building web apps on GCP have this question, and there is no single answer.
I'm not too familiar with Firebase Analytics, but I can answer the question for Firestore and for your custom analytics (e.g. weekly average protein consumption).
The first thing to point out is that Firestore, unlike other NoSQL databases, is storage only. You can't perform aggregations in real time like you can with MongoDB, so the calculations have to be done somewhere else.
The best practice recommended by GCP in this case is indeed to do a regular export of your Firestore data into BQ (BigQuery), and run your analytical calculations there. You could also, when a user inputs some data, send it to Pub/Sub and use one of GCP Dataflow's streaming templates to stream the data into BQ, and have everything in near real time.
Here's the issue with that, however: while this solution gives you real time and is very scalable, it gets expensive fast, and if you're more used to Python than SQL for running analytics, it can be a steep learning curve. Here's an alternative I've used for smaller web apps, which scales well for <100k users and costs <$20 a month at GCP's current pricing:
Write a Python script that grabs the data from Firestore (using the Firestore Python SDK), generates the analytics you need on it, and writes the results back to a Firestore collection
Create an endpoint for that function using Flask or Django
Deploy that server application on Cloud Run, preventing unauthenticated invocations (you'll only be calling it from within GCP) - see this article, steps 1 and 2 only. You can also deploy the Python script(s) to GCP's Vertex AI or hosted Jupyter notebooks if you're more comfortable with that
Use Cloud Scheduler to call that function every x minutes - see these docs for authentication
Have your React app query the "analytics results" collection to get the results
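To make step 1 more concrete, here is a minimal sketch of the aggregation endpoint. The answer above describes a Python script behind Flask or Django; to keep a single language across the snippets on this page, the sketch below uses TypeScript with Express on Cloud Run instead, and the intake and analytics_results collection names and fields are assumptions:

```ts
import express from "express";
import { Firestore } from "@google-cloud/firestore";

const app = express();
const db = new Firestore();

// Called by Cloud Scheduler. Reads the last 7 days of intake entries per user,
// computes a weekly protein average, and writes it to an analytics collection.
// Collection and field names (intake, userId, protein, analytics_results) are assumed.
app.post("/recalculate", async (_req, res) => {
  const since = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000);
  const snapshot = await db
    .collection("intake")
    .where("createdAt", ">=", since)
    .get();

  const totals = new Map<string, { sum: number; count: number }>();
  for (const doc of snapshot.docs) {
    const { userId, protein } = doc.data() as { userId: string; protein: number };
    const entry = totals.get(userId) ?? { sum: 0, count: 0 };
    entry.sum += protein;
    entry.count += 1;
    totals.set(userId, entry);
  }

  // Firestore batches are limited to 500 writes; chunk this for larger user counts.
  const batch = db.batch();
  for (const [userId, { sum, count }] of totals) {
    batch.set(db.collection("analytics_results").doc(userId), {
      weeklyAvgProtein: sum / count,
      updatedAt: new Date(),
    });
  }
  await batch.commit();
  res.status(200).send(`Updated ${totals.size} users`);
});

app.listen(Number(process.env.PORT) || 8080);
```

Your React app then only ever reads the analytics_results collection (step 5), so the expensive reads happen on the schedule rather than on every client load.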
My solution is a Flutter Web based dashboard that displays relevant data in (near) real time, like the regular Flutter iOS/Android app does, plus some aggregated data.
The aggregated data is compiled using a few Node.js based triggers on the database that do the analytic lifting, and hence it is also near real time. If you study the pricing you will learn that function invocations are pretty cheap, unless of course you happen to make a 'desphew' :)
I came up with a great solution.
I used the built-in Firebase BigQuery plugin. Then I used Cube.js (deployed on GCP - Cloud Run with Docker) on top of BigQuery.
Cube.js just makes everything so easy. You don't need to hand-write the queries; it tries to optimize them for you. On top of that, it uses caching, so you won't get big bills on GCP. I think this is the best solution I was able to find. And it is infinitely scalable and totally real time.
Also, if you are a small startup then it is mostly free with GCP - free limits on Cloud Run and BigQuery.
Note: I am not affiliated in any way with Cube.js.

I'd like to set a value in the Firebase database to true just for a certain amount of time

I can't figure out how to change a value in the Firebase database and then change it back after a certain amount of time (30 min), doing everything on the server side rather than relying on the device's date.
I'm assuming I need Firebase Functions.
In case I can't do it that way, is there any other way, keeping Firebase as the main database?
I don't really need any code, just the logic behind it.
I would question your data model. Instead of using a boolean, you may want to consider using a timestamp.
For example, if your data model is currently something along the lines of:
Permissions
- user_id
- is_allowed (boolean)
You may want to use this instead:
Permissions
- user_id
- allow_until (timestamp)
Your application code can then just check whether the current time is earlier than the allow_until timestamp.
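For illustration, here is a minimal sketch of that model with the Firebase JS SDK and the Realtime Database, assuming allow_until is stored as epoch milliseconds under the Permissions path shown above:

```ts
import { getDatabase, ref, get, set } from "firebase/database";

// Grant access for 30 minutes by storing "now + 30 minutes" as epoch milliseconds.
async function allowFor30Minutes(userId: string): Promise<void> {
  await set(
    ref(getDatabase(), `Permissions/${userId}/allow_until`),
    Date.now() + 30 * 60 * 1000
  );
}

// The check is then a plain comparison against the current time.
async function isAllowed(userId: string): Promise<boolean> {
  const snapshot = await get(ref(getDatabase(), `Permissions/${userId}/allow_until`));
  const allowUntil = snapshot.val() as number | null;
  return allowUntil !== null && Date.now() < allowUntil;
}
```

If you don't want to trust the device clock, you can perform the same comparison in a security rule (which evaluates against the server's now) or in a Cloud Function, as the next answer describes.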
There is no logic in the Firebase Realtime Database to automatically change a value after a certain amount of time. You'll typically run such code in Cloud Functions, or in the apps in your client devices.
In both cases you can keep using the Firebase Realtime Database, as you'll just be interacting with that. From Cloud Functions you'll do that through the Admin SDK.
It's a few steps:
Create a Cloud Function that queries the database to find expired items and changes the value on them. A sketch using the Admin SDK for Node.js is shown below; it is very similar to what you'd otherwise run in a web client.
Tie that Cloud Function to a cron job that runs every minute or so (depending on how accurate you want the time-out to be). For some options, see Cloud Functions for Firebase trigger on time?
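A minimal sketch of such a scheduled cleanup function with the Node.js Admin SDK; the Permissions path, the allow_until field and the every-minute schedule are assumptions carried over from the model above:

```ts
import * as functions from "firebase-functions";
import * as admin from "firebase-admin";

admin.initializeApp();

// Runs every minute, finds entries whose allow_until has passed, and resets them.
// Path and field names are assumptions for the example.
export const expirePermissions = functions.pubsub
  .schedule("every 1 minutes")
  .onRun(async () => {
    const expired = await admin
      .database()
      .ref("Permissions")
      .orderByChild("allow_until")
      .startAt(0)            // skip entries that have no allow_until value
      .endAt(Date.now())     // only entries whose allow_until is in the past
      .once("value");

    const updates: Record<string, unknown> = {};
    expired.forEach((child) => {
      updates[`${child.key}/allow_until`] = null; // or flip is_allowed back to false
      return false; // keep iterating
    });
    if (Object.keys(updates).length > 0) {
      await admin.database().ref("Permissions").update(updates);
    }
  });
```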
I recommend you also check out these similar questions:
Delete firebase data older than 2 hours
How to delete firebase data after "n" days (doing the same from an Android client)
How to purge old content in firebase realtime database

Can I use Firebase Cloud Functions for search engine?

Firebase recently released an integration with Cloud Functions that allows us to upload JavaScript functions to run without needing our own servers.
Is it possible to build a search engine using those functions? My idea is to use the local disk (tmpfs volume) to keep indexed data in memory, and for each write event I would index the new data. Does tmpfs keep data between function calls (instances)?
Can Cloud Functions be used for this purpose, or should I use a dedicated server for indexing data?
Another related question: when Cloud Functions get data from the Firebase Realtime Database, does that consume network bandwidth or just disk reads? How is it counted in pricing?
Thanks
You could certainly try that. Cloud Functions have a local file system that typically is used to maintain state during a run. See this answer for more: Write temporary files from Google Cloud Function
But there are (as far as I know) no guarantees that state will be maintained between runs of your function. Or even that the function will be running on the same container next time. You may be running on a newly created container next time. Or when there's a spike in invocations, your function may be running on multiple containers at once. So you'd potentially have to rebuild the search index for every run of your function.
I would instead look at integrating an external dedicated search engine, such as Algolia in this example: https://github.com/firebase/functions-samples/tree/master/fulltext-search. Have a look at the code: even with comments and license it's only 55 lines!
Alternatively you could find a persistent storage service (Firebase Database and Firebase Storage being two examples) and use that to persist the search index. So you'd run the code to update the search index in Cloud Functions, but would store the resulting index files in a more persistent location.
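For a sense of how small the external-search route is, here is a rough sketch of keeping an Algolia index in sync from a database trigger. The posts path, the index name, and reading the keys from environment variables are assumptions for the example, not the exact sample linked above:

```ts
import * as functions from "firebase-functions";
import algoliasearch from "algoliasearch";

// The Algolia credentials would come from environment/functions config, not literals.
const client = algoliasearch(process.env.ALGOLIA_APP_ID!, process.env.ALGOLIA_API_KEY!);
const index = client.initIndex("posts");

// Keep the external search index in sync with the Realtime Database.
export const indexPost = functions.database
  .ref("/posts/{postId}")
  .onWrite(async (change, context) => {
    const postId = context.params.postId;
    if (!change.after.exists()) {
      // Deleted in the database -> remove from the search index.
      await index.deleteObject(postId);
      return;
    }
    await index.saveObject({ objectID: postId, ...change.after.val() });
  });
```

The Cloud Function only forwards writes; the index itself lives in the external service, which is exactly the durable-storage separation the next answer argues for.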
GCF team member and former Google Search member here. Cloud Functions would not be suitable for an in-memory search engine, for a few reasons.
A search engine is very wise to separate its indexing and serving machines. At scale you'll want to worry about read and write hot-spotting differently.
As Frank alluded to, you're not guaranteed to get the same instance across multiple requests. I'd like to strengthen his concern: you will never get the same instance across two different Cloud Functions. Each Cloud Function has its own backend infrastructure that is provisioned and scaled independently.
I know it's tempting to cut dependencies, but cutting out persistent storage isn't the way. Your serving layer can use caching to speed up requests, but durable storage makes sure you don't have to reindex the whole corpus if your Cloud Function crashes or you deploy an update (either of which guarantees the whole instance is scrapped and recreated).

Can I trigger creation of tables on Analytics Export completion?

I have set up Analytics Export to BigQuery. Every time a new ga_sessions_yyyymmdd table gets created, I would like to run some queries aggregating some data for future use.
I can't figure out how to do this. Do I have to create a job and trigger it from outside, or is there a way to trigger this in BigQuery directly (preferably using the Web UI)?
You cannot schedule queries to run via the Web UI. You'll need to write a small piece of software to do this using the BigQuery API and a cron job.
You may also want to check out Cloud Functions - bearing in mind that it's still in Alpha.
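As a sketch of what that small piece of software could look like, here is a cron-driven script using the BigQuery Node.js client; the dataset, destination table and the aggregation itself are placeholders:

```ts
import { BigQuery } from "@google-cloud/bigquery";

const bigquery = new BigQuery();

// Aggregates one daily Analytics export table into a summary table.
// Dataset, table names and the aggregation itself are placeholders.
async function aggregateDailySessions(date: string): Promise<void> {
  const [job] = await bigquery.createQueryJob({
    query: `
      SELECT trafficSource.source AS source, COUNT(*) AS sessions
      FROM \`analytics_dataset.ga_sessions_${date}\`
      GROUP BY source
    `,
    destination: bigquery.dataset("analytics_dataset").table(`daily_summary_${date}`),
  });
  await job.getQueryResults(); // wait for the query job to finish
}

// Run from cron with yesterday's date, e.g.:
aggregateDailySessions("20170101").catch(console.error);
```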
