I am building a web app with the following stack:
UI - React
Backend framework - NestJS
Infrastructure - Google Firestore document DB, services deployed in Heroku
I need to calculate finance portfolio metrics on a daily basis for all users and display them when the user logs in. I am in a bit of a dilemma about which approach to take and I have several ideas, so I hope you can give me some guidance.
Scheduled microservice
I can build and schedule a microservice in Python (the finance framework is in Python) that will run every day and calculate the needed metrics for the users and update the database. Seems straightforward but it might consume a lot of compute resources, especially when the user base grows large.
Cloud Functions
Google Firestore supports cloud functions that can trigger on specific events. I can leverage that and run the calculation microservice when the data is requested - that way I will calculate the information only on-demand. The downside is that if the data has not been requested for a long time, I will have to calculate the metrics for a larger period of time and this might take a while.
P.S. Just saw that there are also scheduled cloud functions - possible implementation might check if the data is calculated today (user has logged in at least once) and if not, calculate it.
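To make the P.S. idea concrete, here is a minimal sketch of the "calculate only if not already calculated today" check. It uses a plain dictionary in place of Firestore, and the names `users_needing_calculation`, `run_daily_job` and the `calculate` callback are made up for illustration; they are not part of any Firebase or NestJS API.

```python
from datetime import date
from typing import Callable, Dict, List, Optional


def users_needing_calculation(
    last_calculated: Dict[str, Optional[date]], today: date
) -> List[str]:
    """Return the IDs of users whose metrics were not calculated today."""
    return sorted(
        uid
        for uid, last in last_calculated.items()
        if last is None or last < today
    )


def run_daily_job(
    last_calculated: Dict[str, Optional[date]],
    today: date,
    calculate: Callable[[str], None],
) -> List[str]:
    """Calculate metrics for stale users only and record today's date.

    `calculate` is a hypothetical stand-in for the Python finance code.
    In a real deployment `last_calculated` would be read from and
    written back to the database instead of a dict.
    """
    processed = []
    for uid in users_needing_calculation(last_calculated, today):
        calculate(uid)
        last_calculated[uid] = today
        processed.append(uid)
    return processed
```

Running the job a second time on the same day does nothing, so it is safe to combine with an on-demand calculation at login.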
I will be happy to discuss any other options that might be available.
Related
I have an app (Flutter using Android Studio) I am on the final stages of and I would, in an ideal world, want to include a feature to notify the user via a mobile notification if a date held against their UID is equal to today (let's use a birthday as an example).
I've spent 2 days looking into all the options, and was very close to using Cloud Functions to run a once-a-day cron function that notifies all users, using FCM, based on the condition above - but something stopped me.
I'm very new to app building. So new that I cannot confidently say I don't have a bug or infinite loop somewhere that would rack up a huge bill after upgrading to the Blaze plan - without which I cannot use Functions (I literally had my credit card in hand on the upgrade page and stopped).
After 3 months of app building I feel I'm between a rock and a hard place. I don't want to launch without auto-notifications (as they're pretty key to the slickness of the app) BUT I cannot risk a sky's-the-limit, no-cap, no-protection Blaze account if the worst were to happen.
It seems crazy that Google, after all the effort it has put into Firebase (which, to be fair, helps new developers code and launch apps), would put them unnecessarily at risk of cost without automated protection. At least the Flame plan capped your spend - but I can see this is a real concern to new app developers such as myself (I've developed for the web for years). I just can't risk Blaze. I am more than happy to pay for things I use, but not to put myself at risk. Anyway, I digress...
Without upgrading to Blaze - is there any way that a newbie such as myself, who is still learning the ropes, can use FCM and a cron job to check Cloud Firestore every day for users where a certain condition applies (i.e. UID date = today) and send a notification to their mobile device?
I would recommend using Google Cloud Functions and Cloud Scheduler to accomplish this.
It is worth noting that Firebase + Google Cloud provide an amount of free usage per month. It is quite likely that you could keep your usage under the limits, at least initially. Also, if you are a new Google Cloud customer, there may currently be a trial offer you can redeem for things not covered in the free tier.
https://cloud.google.com/free
https://firebase.google.com/pricing
https://cloud.google.com/scheduler
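For what it's worth, the daily check itself is small. Below is a rough sketch of the selection logic only, written against an in-memory dictionary rather than Firestore. The names `users_to_notify` and `notify_all` and the `send` callback are hypothetical; in a real function `send` would wrap whatever FCM call you use. It assumes "date = today" means the same month and day (as with a birthday), and it does not handle 29 February specially.

```python
from datetime import date
from typing import Callable, Dict, List


def users_to_notify(user_dates: Dict[str, date], today: date) -> List[str]:
    """Return UIDs whose stored date falls on today's month and day."""
    return sorted(
        uid
        for uid, stored in user_dates.items()
        if (stored.month, stored.day) == (today.month, today.day)
    )


def notify_all(
    user_dates: Dict[str, date], today: date, send: Callable[[str], None]
) -> int:
    """Call `send` once per matching user and return how many were sent.

    `send` is a hypothetical callback standing in for the push
    notification; it is not an FCM API.
    """
    matches = users_to_notify(user_dates, today)
    for uid in matches:
        send(uid)
    return len(matches)
```

Because the job runs once a day and only touches the matching users, the number of invocations is small and predictable, which helps with staying inside the free usage limits.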
Context: I am a total Google Cloud beginner, and I have just convinced my company heads to use Firestore Realtime Database for pushing transaction status to our mobile application. We have around 4 million users who will make heavy use of our application for small money transfers. Currently the Android/iOS apps poll our microservice endpoints; this will be replaced by the Firebase SDK imported into our mobile app, which will listen to/observe our Firestore collection subject to a few Firestore rules. Since every money transfer will be confirmed or denied within a short time (from a few seconds to 1 or 2 minutes), the idea of replacing polling with a truly reactive approach straight from Firestore sounded good, and the coding is already under way.
The issue: First, I don't want to compare solutions. This is simply my reality: the production support operators must use our internal dashboard. They are not allowed to look at the Google Dashboard Console (please accept this for the purposes of this question). I need to get on-demand metrics for our Firestore. It has nothing to do with Google pricing. It is just our requirement: they want to see metrics like:
how many users listening at the same time now
how many users took some exception during connection
is there any user holding connection for more than X minutes
when was the connection peak this morning
any exception of any type surrounding our Firestore database
I read the Code Samples carefully and followed the samples step by step, trying to figure out whether there is some API providing the answers I am looking for.
So, my direct question is: is there a Google API that provides metrics about my Firestore database? Maybe something along the lines of Performance Monitor, which works on the mobile side, with a similar approach on the Firestore side.
*** Edited
Future readers may also find it worth reading about a way to get Firestore metrics info straight from curl/Postman
A couple of things: You mentioned both Firestore and Realtime Database; just wanted to make sure that you are aware that those are two different databases offered under the Firebase umbrella.
how many users listening at the same time now
is there any user holding connection for more than X minutes
Yes, there's a dashboard: https://support.google.com/firebase/answer/6317517?hl=en. Including lots of options, like users active in the last 30 mins.
how many users took some exception during connection
any exception of any type surrounding our Firestore database
Yes, you can track errors and other logging via Stackdriver Logging. This can give you reports on your cloud functions.
https://cloud.google.com/functions/docs/monitoring
Where can I find Stackdriver in Firebase console?
when was the connection peak this morning
For this one, I'm not sure if you mean A. when did somebody log on in the morning, or B. what was the time of peak / most usage. If B, see 1. If A,
Realtime Database has the concept of presence, which lets you know if a user is currently connected or not. See examples here from the official documentation:
https://firebase.google.com/docs/firestore/solutions/presence
and this post
How to make user presence mechanism using Firebase?
Also applies to your
is there any user holding connection for more than X minutes
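If you record presence as connect/disconnect timestamps, the metrics you listed can be derived from those records. Here is a minimal sketch over in-memory data; the `Session` shape and the function names are assumptions for illustration and are not part of the Firebase presence API.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Optional, Tuple


class Session(NamedTuple):
    uid: str
    connected_at: datetime
    disconnected_at: Optional[datetime]  # None means still connected


def connected_now(sessions: List[Session]) -> int:
    """Count sessions that have not been closed yet."""
    return sum(1 for s in sessions if s.disconnected_at is None)


def held_longer_than(
    sessions: List[Session], now: datetime, limit: timedelta
) -> List[str]:
    """Return UIDs of open sessions that have lasted longer than `limit`."""
    return sorted(
        s.uid
        for s in sessions
        if s.disconnected_at is None and now - s.connected_at > limit
    )


def peak_concurrency(
    sessions: List[Session], now: datetime
) -> Tuple[int, Optional[datetime]]:
    """Return (peak number of simultaneous sessions, time it was reached)."""
    events = []
    for s in sessions:
        end = s.disconnected_at if s.disconnected_at is not None else now
        events.append((s.connected_at, 1))
        events.append((end, -1))
    # At equal timestamps, process disconnects (-1) before connects (+1).
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    peak_at = None
    for when, delta in events:
        current += delta
        if current > peak:
            peak, peak_at = current, when
    return peak, peak_at
```

A scheduled function could run these over the morning's presence records and write the results wherever your internal dashboard reads from.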
..............
Edit in response to comments: I believe you are experiencing the XY problem https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem where you are focused on a particular solution, even though your problem has other solutions. User metrics, database events, and errors are all accessible through both dashboards and cloud functions. You can cURL cloud functions if you wish, or set up cron functions to auto report, or set up database trigger functions to log errors. So, while the exact way you want this to work may not exist, you just need to connect existing tools to get the result you want.
So up until mid-2018 there have been complaints about performance issues with Firebase Cloud Functions and Google Cloud Functions (which are the same under the hood, I believe). Like these ones:
https://github.com/googleapis/google-cloud-node/issues/2374
https://github.com/firebase/firebase-functions/issues/161
I remember seeing that a simple Hello World example had a response time of 500ms - 800ms. EDIT: I know about cold starts, but as described in the GitHub issues cold starts were not the main problem. A Firebase Cloud Function would randomly take up to 10s to respond, which looked like a problem within Firebase.
I am currently considering building a project with Firebase and would like to build a REST API with Firebase cloud functions - but bad performance would be a deal breaker.
What's the current status? Do these problems still occur? None of these GitHub issues were properly answered by Google, but also no more users have complained ever since …
Cold start times are a fact of life for serverless backends such as Cloud Functions. It's due to the way server instances are automatically scaled up and down to handle load in a cost-effective way. You can always expect that the first request to a new server instance will take some amount of time longer than the subsequent requests that get directed to that same server instance. That amount of time will be variable depending on a number of factors, including the type of trigger, and what all needs to happen with the first request.
If you want to learn more about Cloud Functions scale, what you can expect as a result, and what you can do to mitigate cold starts, watch my video series on the matter.
Cloud Functions for Firebase are Google Cloud Functions with a wrapper that lets them integrate better with other Firebase products. Therefore, a small loss of performance is to be expected.
The important factor in deciding which one to use is what you are integrating with the most.
If your project is running in Firebase, uses Firebase Authentication, etc., then Cloud Functions for Firebase is the best choice.
On the other hand, if you are using Google Cloud Platform products, then Google Cloud Functions is the best choice.
I am using Firebase as my authentication and database platform in my React Native-Expo app. I have not yet decided if I will be using the realtime-database or Firestore database.
I need to perform statistical analysis on daily data gathered from my users, which is stored in the database. For example, the users type in their daily intake of protein, and from it I would like to calculate their weekly average and expected monthly average, provide suggestions of types of food if protein intake is too low, etc.
What would be the best approach in order to achieve the result wanted in my specific situation?
I am really unfamiliar with this and stepping into uncharted territory regarding how I can accomplish it. I have read that Firebase Analytics generates different basic analytics regarding usage of the app, number of crash-free users, etc. But can it perform analytics on custom events? Can I create a custom event for Firebase Analytics to keep track of a certain node in my database, and output analytics from that? And then of course, if yes, does it work with React Native-Expo or do I need to detach from Expo? In addition, I have read that Firebase Analytics can be combined with Google BigQuery. Would this be an alternative for my case?
Are there any other ways of performing such data analysis on my data stored in Firebase database? For example, export the data and use Python and SciKit Learn?
Whatever opinion or advice you may have, I would be grateful if you could share it!
You're not alone - many people building web apps on GCP have this question, and there is no single answer.
I'm not too familiar with Firebase Analytics, but can answer the question for Firestore and for your custom analytics (e.g. weekly avg protein consumption)
The first thing to point out is that Firestore, unlike other NoSQL databases, is essentially storage only. It has nothing comparable to MongoDB's aggregation pipeline, so calculations beyond simple aggregates have to be done somewhere else.
The best practice recommended by GCP in this case is indeed to do a regular export of your Firestore data into BQ (BigQuery), and you can run analytical calculations there in the meantime. You could also, when a user inputs some data, send that to Pub/Sub and use one of GCP Dataflow's streaming templates to stream the data into BQ, and have everything in near real time.
Here's the issue with that however: while this solution gives you real time, and is very scalable, it gets expensive fast, and if you're more used to Python than SQL to run analytics it can be a steep learning curve. Here's an alternative I've used for smaller webapps, which scales well for <100k users and costs <$20 a month on GCP's current pricing:
Write a Python script that grabs the data from Firestore (using the Firestore Python SDK), generates the analytics you need on it, and writes the results back to a Firestore collection
Create an endpoint for that function using Flask or Django
Deploy that server application on Cloud Run, preventing unauthenticated invocations (you'll only be calling it from within GCP) - see this article, steps 1 and 2 only. You can also deploy the Python script(s) to GCP's Vertex AI or hosted Jupyter notebooks if you're more comfortable with that
Use Cloud Scheduler to call that function every x minutes - see these docs for authentication
Have your React app query the "analytics results" collection to get the results
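As a rough illustration of the "generates the analytics" step above, here is a minimal sketch of the weekly-average calculation using plain Python data instead of the Firestore SDK. The function names, the result field names and the `daily_target` parameter are all made up for this example; reading the input from Firestore and writing the result back is left out.

```python
from datetime import date, timedelta
from statistics import mean
from typing import Dict, Optional


def weekly_average(
    intake_by_day: Dict[date, float], week_end: date
) -> Optional[float]:
    """Average the logged values in the 7 days ending on `week_end`.

    Days without an entry are skipped rather than counted as zero.
    Returns None when nothing was logged in that window.
    """
    week_start = week_end - timedelta(days=6)
    values = [
        grams
        for day, grams in intake_by_day.items()
        if week_start <= day <= week_end
    ]
    return mean(values) if values else None


def build_result(
    uid: str,
    intake_by_day: Dict[date, float],
    week_end: date,
    daily_target: float,
) -> dict:
    """Build the document that would be written to the results collection."""
    avg = weekly_average(intake_by_day, week_end)
    return {
        "uid": uid,
        "week_end": week_end.isoformat(),
        "weekly_avg_protein_g": avg,
        "below_target": avg is not None and avg < daily_target,
    }
```

The front end then only has to read one small document per user, and the `below_target` flag can drive the food suggestions.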
My solution is a Flutter Web based dashboard that displays relevant data in (near) real time, like the regular Flutter iOS/Android app, and likewise some aggregated data.
The aggregated data is compiled using a few Node.js based triggers in the database that do any analytic lifting, and hence is also near real time. If you study the pricing you will learn that function invocations are pretty cheap, unless of course you happen to make a 'desphew' :)
I came up with a great solution.
I used the built-in Firebase BigQuery plugin. Then I used Cube.js (deployed on GCP - Cloud Run on Docker) on top of BigQuery.
Cube.js just makes everything so easy. You don't need to write queries by hand; it tries to optimize queries for you. On top of that, it uses caching so you won't get big bills on GCP. I think this is the best solution I was able to find. And this is infinitely scalable and totally real-time.
Also if you are a small startup then it is mostly free with GCP - free limits on cloud run and BigQuery.
Note: I am not affiliated in any way with Cube.js.
I've been putting together a mechanism to sync activity data collected by the MS Band with our backend via the cloud API and getting all the boilerplate setup for the OAuth flows... The intent being to periodically run this data through our backend processes to categorise periods of meaningful walk based activity.
I've been experimenting with the data available and as far as I can tell we cannot get access to the raw step data (or data at a fine-grained level). We have successfully been able to request summary info by hour/day, however this is not fit for our purpose.
What I'd like is to access step data in the form [startTimeStamp,endTimeStamp,stepsTaken,...] where each record represents a continuous period of movement by the wearer.
We would also be able to work with data summarised by minute as this would give enough context to our use case.
Is this possible via the cloud API? Or are there any plans to implement the Period "Minute" on the summary API endpoint?
https://api.microsofthealth.net/v1/me/Summaries/Minute?startTime=2015-12-09T14%3A00%3A00.369Z
If this isn't possible perhaps there is another way to make this data available? (via HealthKit on iOS or Fit on Android?)
As a complete alternative, perhaps it might be possible to get the accumulated step data detail from the band via Bluetooth in a similar fashion to the native MS Health app?
We already use the SDK to stream realtime Heart Rate data during user cardio sessions, but there appears to be no way to extract the historical step info from the band directly.
Thanks!
The Band itself monitors and logs the steps over time. When syncing, that log is transferred to the Cloud via the Microsoft Health app. The app then pulls the "steps for the day" from the Health service.
These logs are not exposed to apps via the SDK. The only way to calculate steps per custom short period yourself is to have your app sample the counter in the background on a frequent enough basis in order to do the calculation.
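As a sketch of that calculation: assuming your app has collected samples of the cumulative step counter as (timestamp, count) pairs, continuous periods of movement can be derived from the differences between consecutive samples. The sample format and the `movement_periods` name are assumptions for illustration, not part of the Band SDK, and the resolution is only as fine as your sampling interval.

```python
from typing import List, Tuple

# (timestamp in seconds, cumulative step count read from the counter)
Sample = Tuple[float, int]
# (start timestamp, end timestamp, steps taken in that period)
Period = Tuple[float, float, int]


def movement_periods(samples: List[Sample]) -> List[Period]:
    """Group consecutive sampling intervals with steps into periods.

    An interval with no increase in the counter ends the current period.
    A decrease (for example a counter reset) is treated the same way.
    """
    periods: List[Period] = []
    start = end = None
    steps = 0
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        delta = c1 - c0
        if delta > 0:
            if start is None:
                start, steps = t0, 0
            end = t1
            steps += delta
        elif start is not None:
            periods.append((start, end, steps))
            start = None
    if start is not None:
        periods.append((start, end, steps))
    return periods
```

With one sample per minute this gives records close to the [startTimeStamp, endTimeStamp, stepsTaken] form asked for in the question.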