Firestore database reads are increasing without even running the application - firebase

I have noticed an increase in the number of reads in Firestore while testing my application on localhost.
Today I decided to take a closer look at the number of reads and started from zero. I waited around 3 minutes without running my application or performing any read operation, and the count went up to 210 reads and 2 writes. That seems odd, given that the application wasn't even running and the counts were all zero when I started.
I have tried to avoid valueChanges() and snapshotChanges() since they generate a lot of reads.
Below is the service that I call in the home page.
Does anyone have an idea of what's going on?
Thanks in advance.
export class CatService {
    lastVisibleCat: any;
    cats = [];
    fecha = new Date().setHours(23, 59, 5, 9);
    todaysDate = new Date(this.fecha);
    constructor(public afs: AngularFirestore) {
    }
    getCats() {
        const reference = this.afs.collection('Cats');
        const query1 = reference.ref.where('timeStampEndDate', '>=', this.todaysDate);
        return query1.get().then(snapShot => {
            snapShot.forEach(cat => {
                this.cats.push({ ...cat.data(), id: cat.id });
            });
            return this.cats;
        });
    }
}

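A couple of things could be at play here:
Read counts in the console are not updated in real time, so the reads you see may come from earlier activity and only show up with a delay.
Reads made from the Firebase console (for example, browsing your collections there) also count toward billed read operations.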

Related

Incremental Static Regeneration doesn't work without redeploying

This is my very first time building and deploying a website so bear with me if this is a dumb question. I'm building a Heardle style clone. The idea is that every day there's a new song and people have 6 guesses to figure out which song it is from short clips of the song. Every part of this seems to work with one major exception -- I can't seem to reload today's song dynamically.
I have a function:
export async function getStaticProps() {
    const allSearchableSongs = songHelper.getSongData()
    const todaysSong = await songHelper.getTodaysSong()
    const songURLs: string[] = (todaysSong) ? await songHelper.getTodaysSongClips(todaysSong!) : []
    let valid = false
    if (songURLs) {
        valid = true
    }
    return {
        props: {
            allSearchableSongs,
            todaysSong,
            songURLs,
            valid
        },
        revalidate: 10
    };
}
Note that getTodaysSong() and getTodaysSongClips() both make calls to AWS to get data from S3. Whenever I rebuild the website this works well. However, I would like for this to refresh after 60 seconds so that nobody is ever looking at a stale website. But this doesn't change ever. The song is always out of date until I redeploy. I've checked to make sure that the data is changing daily and that's all well and good -- but the website doesn't ever reload.
I'm currently hosting this on Vercel.
What am I doing wrong? How can I ensure that this reloads after 60 seconds?

How to improve performance of firestore cache query

I am developing a PWA that displays a list of transactions (a transaction is an object with ~10 fields). I am using Firestore for storage and realtime updates, and I have also enabled persistence.
I want my application to hold all the data in memory, and I want to take care of displaying only the necessary information myself (e.g. using virtual scrolling for the transaction list). For this reason I listen to the whole collection (i.e. all the transactions).
At the start of the app, I want to make sure the data is loaded, so I use a one-time cache query to get the transactions. I would expect the query to be nearly instantaneous, but on a laptop it takes around 1 second to get the initial data (I also have another collection which I fetch from the cache, and this resolves about 2 seconds after the transactions request). On mobile it takes around 9 seconds.
I want my app to feel instantaneous, but it takes a few seconds until the data is in place. Note that I am not doing any advanced queries (I just want to load the data into memory).
Am I doing something wrong? I have read the Firestore docs, but I don't think the amount of data that I have in the cache should cause such bad performance.
UPDATE: Even if I limit the initial query to just 20 documents, it still takes around 2 seconds to retrieve them.
UPDATE 2: The code looks like this:
export const initializeFirestore = (): Thunk => (dispatch) => {
    const initialQueries: Array<Promise<unknown>> = []
    getQueries().forEach((query) => {
        const q = query.createFirestoneQuery()
        initialQueries.push(
            q
                .get({
                    source: 'cache',
                })
                .then((snapshot) =>
                    dispatch(firestoneChangeAction(query, snapshot, true)),
                ),
        )
        q.onSnapshot((change) => {
            dispatch(firestoneChangeAction(query, change))
        })
    })
    console.log('Now I am just waiting for initial data...')
    return Promise.all(initialQueries)
}
You may be interested by the smart approach presented by Firebase engineers during the "Faster web apps with Firebase" Session of the Firebase Summit 2019 (You can watch the video here: https://www.youtube.com/watch?v=DHbVyRLkX4c).
In a nutshell, their idea is to use the Firestore REST API to make the first query to the database (which does not need to download any SDK), and in parallel, dynamically import the Web SDK in order to use it for the subsequent queries.
The github repository is here: https://github.com/hsubox76/fireconf-demo
I paste below the content of the key js file (https://github.com/hsubox76/fireconf-demo/blob/master/src/dynamic.js) for further reference.
import { firebaseConfigDynamic as firebaseConfig } from "./shared/firebase-config";
import { renderPage, logPerformance } from "./shared/helpers";
let firstLoad = false;
// Firestore REST URL for "current" collection.
const COLLECTION_URL =
`https://firestore.googleapis.com/v1/projects/exchange-rates-adcf6/` +
`databases/(default)/documents/current`;
// STEPS
// 1) Fetch REST data
// 2) Render data
// 3) Dynamically import Firebase components
// 4) Subscribe to Firestore
// HTTP GET from Firestore REST endpoint.
fetch(COLLECTION_URL)
.then(res => res.json())
.then(json => {
// Format JSON data into a tabular format.
const stocks = formatJSONStocks(json);
// Measure time between navigation start and now (first data loaded)
performance && performance.measure("initialDataLoadTime");
// Render using initial REST data.
renderPage({
title: "Dynamic Loading (no Firebase loaded)",
tableData: stocks
});
// Import Firebase library.
dynamicFirebaseImport().then(firebase => {
firebase.initializeApp(firebaseConfig);
firebase.performance(); // Use Firebase Performance - 1 line
subscribeToFirestore(firebase);
});
});
/**
* FUNCTIONS
*/
// Dynamically imports firebase/app, firebase/firestore, and firebase/performance.
function dynamicFirebaseImport() {
const appImport = import(
/* webpackChunkName: "firebase-app-dynamic" */
"firebase/app"
);
const firestoreImport = import(
/* webpackChunkName: "firebase-firestore-dynamic" */
"firebase/firestore"
);
const performanceImport = import(
/* webpackChunkName: "firebase-performance-dynamic" */
"firebase/performance"
);
return Promise.all([appImport, firestoreImport, performanceImport]).then(
([dynamicFirebase]) => {
return dynamicFirebase;
}
);
}
// Subscribe to "current" collection with `onSnapshot()`.
function subscribeToFirestore(firebase) {
firebase
.firestore()
.collection(`current`)
.onSnapshot(snap => {
if (!firstLoad) {
// Measure time between navigation start and now (first data loaded)
performance && performance.measure("realtimeDataLoadTime");
// Log to console for internal development
logPerformance();
firstLoad = true;
}
const stocks = formatSDKStocks(snap);
renderPage({
title: "Dynamic Loading (Firebase now loaded)",
tableData: stocks
});
});
}
// Format stock data in JSON format (returned from REST endpoint)
function formatJSONStocks(json) {
const stocks = [];
json.documents.forEach(doc => {
const pathParts = doc.name.split("/");
const symbol = pathParts[pathParts.length - 1];
stocks.push({
symbol,
value: doc.fields.closeValue.doubleValue || 0,
delta: doc.fields.delta.doubleValue || 0,
timestamp: parseInt(doc.fields.timestamp.integerValue)
});
});
return stocks;
}
// Format stock data in Firestore format (returned from `onSnapshot()`)
function formatSDKStocks(snap) {
const stocks = [];
snap.forEach(docSnap => {
if (!docSnap.data()) return;
const symbol = docSnap.id;
const value = docSnap.data().closeValue;
stocks.push({
symbol,
value,
delta: docSnap.data().delta,
timestamp: docSnap.data().timestamp
});
});
return stocks;
}
You're not doing anything wrong. The query will take as much time as it needs to finish. This is why many sites use a loading indicator.
For the first query in your app, the time includes fully initializing the SDK, which might involve asynchronous work beyond the query itself. Also bear in mind that reading and sorting data from local disk isn't necessarily "fast", and that for larger numbers of documents, the local disk cache read might even be more expensive than fetching the same documents over the network.
Since we don't have any indication of how many documents you have, and how much total data you're trying to transfer, and the code you're using for this, all we can do is guess. But there's really not much you can do to speed up the initial query, other than perhaps limiting the size of the result set.
If you think that what you're experiencing is a bug, then please file a bug report on GitHub.

minimize time operation in firebase/firestore

I am building a React Native app with Firebase and Firestore.
What I want to do is set the user's status to 'online' when they open the app and to 'offline' when they close it (a kind of presence system).
I did it with the Realtime Database's onDisconnect(), and it works fine.
This is the function:
async signupAnonymous() {
    const user = await firebase.auth().signInAnonymouslyAndRetrieveData();
    this.uid = firebase.auth().currentUser.uid
    this.userStatusDatabaseRef = firebase.database().ref(`UserStatus/${this.uid}`);
    this.userStatusFirestoreRef = firebase.firestore().doc(`UserStatus/${this.uid}`);
    firebase.database().ref('.info/connected').on('value', async connected => {
        if (connected.val() === false) {
            // this.userStatusFirestoreRef.set({ state: 'offline', last_changed: firebase.firestore.FieldValue.serverTimestamp()},{merge:true});
            return;
        }
        await firebase.database().ref(`UserStatus/${this.uid}`).onDisconnect().set({ state: 'offline', last_changed: firebase.firestore.FieldValue.serverTimestamp() },{merge:true});
        this.userStatusDatabaseRef.set({ state: 'online', last_changed: firebase.firestore.FieldValue.serverTimestamp() },{merge:true});
        // this.userStatusFirestoreRef.set({ state: 'online',last_changed: firebase.firestore.FieldValue.serverTimestamp() },{merge:true});
    });
}
After that, I added a trigger to copy the data into Firestore (because I want to work with Firestore). This is the function (it works fine, BUT it takes 3-4 seconds):
module.exports.onUserStatusChanged = functions.database
    .ref('/UserStatus/{uid}').onUpdate((change,context) => {
        const eventStatus = change.after.val();
        const userStatusFirestoreRef = firestore.doc(`UserStatus/${context.params.uid}`);
        return change.after.ref.once("value").then((statusSnapshot) => {
            return statusSnapshot.val();
        }).then((status) => {
            console.log(status, eventStatus);
            if (status.last_changed > eventStatus.last_changed) return status;
            eventStatus.last_changed = new Date(eventStatus.last_changed);
            //return userStatusFirestoreRef.set(eventStatus);
            return userStatusFirestoreRef.set(eventStatus,{merge:true});
        });
    });
After that, I want to count the online users in the app, so I added a trigger that runs whenever a document is written to that Firestore collection and counts the online users with a query (it works fine but takes 4-7 seconds):
module.exports.countOnlineUsers = functions.firestore.document('/UserStatus/{uid}').onWrite((change,context) => {
    console.log('userStatus')
    const userOnlineCounterRef = firestore.doc('Counters/onlineUsersCounter');
    const docRef = firestore.collection('UserStatus').where('state','==','online').get().then(e=>{
        let count = e.size;
        console.log('count',count)
        return userOnlineCounterRef.update({count})
    })
    return Promise.resolve({success:'added'})
})
Then, in my React Native app, I get the count of online users:
this.unsubscribe = firebase.firestore().doc(`Counters/onlineUsersCounter`).onSnapshot(doc=>{
    console.log('count',doc.data().count)
})
All the operations together take about 12 seconds. That's too much for me; it's an online app.
my firebase structure
What am I doing wrong? Maybe there is an unnecessary function or something?
My final goals:
Minimize the operation time.
Get the online users count (with a listener, so each change updates in the app).
Update the user status.
If there are other ways to do this, I would love to know.
Cloud Functions are subject to 'cold starts': an instance that has not run recently takes some time to boot up. This is the only reason I can think of that it would take that long. Stack Overflow: Firebase Cloud Functions Is Very Slow
But your cloud function only needs to write to Firestore on log out, to catch the case where your user closes the app. On log in, you can write to Firestore directly from your client with auth().onAuthStateChanged().
You could also just always read who is logged in or out directly from the Realtime Database and use Firestore for the rest of your data.
You can rearrange your data so that instead of a 'UserStatus' collection you have an 'OnlineUsers' collection containing only online users, kept in sync by deleting the documents on log out. Then it won't take a query operation to get them. The query's impact on your performance is likely minimal, but this would perform better with a large number of users.
The documentation also has a guide that may be useful: Firebase Docs: Build Presence in Cloud Firestore
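The 'OnlineUsers' rearrangement suggested above can be sketched roughly as follows. This is only an in-memory illustration: a Map stands in for the collection, and the function names (setOnline, setOffline, onlineCount) are made up for the example, not Firebase APIs.

```javascript
// A Map stands in for the hypothetical 'OnlineUsers' collection:
// one entry per user who is currently online.
const onlineUsers = new Map();

function setOnline(uid) {
  // Setting the same uid twice overwrites the entry instead of adding a second one.
  onlineUsers.set(uid, { lastChanged: Date.now() });
}

function setOffline(uid) {
  // Deleting the entry keeps the collection limited to online users.
  onlineUsers.delete(uid);
}

function onlineCount() {
  // No filtering by state is needed; every entry is an online user.
  return onlineUsers.size;
}

setOnline("alice");
setOnline("bob");
setOnline("alice"); // signing in twice does not double count
setOffline("bob");
console.log(onlineCount()); // 1
```

With real documents, the same idea would mean creating the document on log in and deleting it on disconnect, so the count is simply the number of documents in the collection.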

Firestore transactions getting triggered multiple times resulting in wrong data

I have a cloud function that is triggered each time a transaction is liked/unliked. This function increments/decrements the likesCount, and I've used Firestore transactions to do so. I think the problem is that the code inside the transaction block is getting executed multiple times, which may be correct as per the documentation.
But my likes counts are sometimes updated incorrectly.
return firestore.runTransaction(function (transaction) {
    return transaction.get(transRef).then(function (transDoc) {
        let currentLikesCount = transDoc.get("likesCount");
        if (event.data && !event.data.previous) {
            newLikesCount = currentLikesCount == 0 || isNaN(currentLikesCount) ? 1 : transDoc.get("likesCount") + 1;
        } else {
            newLikesCount = currentLikesCount == 0 || isNaN(currentLikesCount) ? 0 : transDoc.get("likesCount") - 1;
        }
        transaction.update(transRef, { likesCount: newLikesCount });
    });
});
Has anyone had a similar experience?
I finally found the cause of this unexpected behaviour.
Firestore isn't suitable for maintaining counters if your application is going to be traffic intensive. This is mentioned in the documentation, and the solution suggested there is to use a distributed counter.
Many realtime apps have documents that act as counters. For example,
you might count 'likes' on a post, or 'favorites' of a specific item.
In Cloud Firestore, you can only update a single document about once
per second, which might be too low for some high-traffic applications.
https://cloud.google.com/firestore/docs/solutions/counters
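To give a feel for the distributed counter idea, here is a minimal in-memory sketch. It is not the code from the linked guide: the array of shards stands in for a set of shard documents, and NUM_SHARDS, createCounter, increment and getCount are names chosen for this illustration.

```javascript
// In-memory stand-in for a set of shard documents.
// NUM_SHARDS and the function names are illustrative assumptions.
const NUM_SHARDS = 10;

function createCounter(numShards = NUM_SHARDS) {
  // Each shard holds a partial count; all start at zero.
  return { shards: new Array(numShards).fill(0) };
}

function increment(counter, delta = 1, random = Math.random) {
  // Pick one shard at random so concurrent writers rarely touch the same one.
  const shardId = Math.floor(random() * counter.shards.length);
  counter.shards[shardId] += delta;
  return shardId;
}

function getCount(counter) {
  // Reading the total means summing every shard.
  return counter.shards.reduce((sum, n) => sum + n, 0);
}

const likes = createCounter();
for (let i = 0; i < 25; i++) increment(likes, 1);
increment(likes, -1); // an "unlike"
console.log(getCount(likes)); // 24
```

The trade-off is visible even in this toy version: writes are spread over several shards, so no single one is updated too often, but every read of the total has to touch all shards.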
I wasn't convinced by that approach, as it's too complex for a simple use case, which is when I stumbled across the following blog post:
https://medium.com/evenbit/on-collision-course-with-cloud-firestore-7af26242bc2d
The authors used a combination of Firestore and the Firebase Realtime Database, so that each covers the other's weaknesses.
Cloud Firestore is sitting conveniently close to the Firebase Realtime
Database, and the two are easily available to use, mix and match
within an application. You can freely choose to store data in both
places for your project, if that serves your needs.
So, why not use the Realtime database for one of its strengths: to
manage fast data streams from distributed clients. Which is the one
problem that arises when trying to aggregate and count data in the
Firestore.
It's not correct to say that Firestore is an upgrade to the Realtime Database (as it is advertised); it is a different database with different purposes, and both can and should coexist in a large-scale application. That's my thought.
It might have something to do with what you're returning from the function, as you have
return transaction.get(transRef).then(function (transDoc) { ... })
And then another return inside that callback, but no return inside the inner-most nested callback. So it might not be executing the transaction.update. Try removing the first two return keywords and add one before transaction.update:
firestore.runTransaction(function (transaction) {
    transaction.get(transRef).then(function (transDoc) {
        let currentLikesCount = transDoc.get("likesCount");
        if (event.data && !event.data.previous) {
            newLikesCount = currentLikesCount == 0 || isNaN(currentLikesCount) ? 1 : transDoc.get("likesCount") + 1;
        } else {
            newLikesCount = currentLikesCount == 0 || isNaN(currentLikesCount) ? 0 : transDoc.get("likesCount") - 1;
        }
        return transaction.update(transRef, { likesCount: newLikesCount });
    });
});
Timeouts
First of all, check your Cloud Functions logs to see if you get any timeout messages, such as:
Function execution took 60087 ms, finished with status: 'timeout'
If so, fix your function so that it returns a promise (for example, Promise.resolve()). The log should then show something like:
Function execution took 344 ms, finished with status: 'ok'
Idempotency
Secondly, write your data so that the function is idempotent. When your function runs, write a value to the document that you are reading. You can then check if that value exists before running the function again.
See this example for ensuring that functions are only run once.
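As a rough illustration of the idempotency check described above, the sketch below records which events were already applied and skips duplicates. It runs purely in memory: the post object stands in for the document, and applyLikeEvent and processedEvents are hypothetical names. In a real function the check and the write would have to happen atomically, for example inside a transaction.

```javascript
// In-memory stand-in for the document being updated.
function applyLikeEvent(doc, eventId, delta) {
  // If this event was already applied, do nothing: a retry must not count twice.
  if (doc.processedEvents.includes(eventId)) {
    return false;
  }
  doc.likesCount += delta;
  doc.processedEvents.push(eventId);
  return true;
}

const post = { likesCount: 0, processedEvents: [] };
applyLikeEvent(post, "evt-1", 1);
applyLikeEvent(post, "evt-1", 1); // duplicate delivery, ignored
applyLikeEvent(post, "evt-2", 1);
console.log(post.likesCount); // 2
```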

Dealing with lots of data in Firebase for a recommender system

I am building a recommender system where I use Firebase to store and retrieve data about movies and user preferences.
Each movie can have several attributes, and the data looks as follows:
{
"titanic":
{"1997": 1, "english": 1, "dicaprio": 1, "romance": 1, "drama": 1 },
"inception":
{ "2010": 1, "english": 1, "dicaprio": 1, "adventure": 1, "scifi": 1}
...
}
To make the recommendations, my algorithm requires all the data (movies) as input, which is matched against a user profile.
However, in production mode I need to retrieve more than 10,000 movies. While the algorithm can handle this relatively fast, it takes a lot of time to load this data from Firebase.
I retrieve the data as follows:
firebase.database().ref(moviesRef).on('value', function(snapshot) {
    // snapshot.val();
}, function(error){
    console.log(error)
});
I am therefore wondering if you have any thoughts on how to speed things up. Are there any plugins or techniques known to solve this?
I am aware that denormalization could help split the data up, but the problem is really that I need ALL movies and ALL the corresponding attributes.
My suggestion would be to use Cloud Functions to handle this.
Solution 1 (Ideally)
If you can calculate suggestions every hour / day / week
You can use a scheduled (cron-style) Cloud Function that fires daily or weekly and calculates recommendations for each user. This way you can achieve a result more or less similar to what Spotify does with their weekly playlists / recommendations.
The main advantage of this is that your users wouldn't have to wait for all 10,000 movies to be downloaded. Instead, this would happen in a cloud function, say every Sunday night, which compiles a list of 25 recommendations and saves it into the user's data node. You can then download that list when the user accesses their profile.
Your cloud functions code would look like this :
var movies, allUsers;
exports.weekly_job = functions.pubsub.topic('weekly-tick').onPublish((event) => {
    // Return the promise so the function stays alive until the work is done.
    return getMoviesAndUsers();
});
function getMoviesAndUsers () {
    // once() reads the data a single time instead of leaving a listener attached.
    return firebase.database().ref(moviesRef).once('value').then(function(snapshot) {
        movies = snapshot.val();
        return firebase.database().ref(allUsersRef).once('value');
    }).then(function(snapshot) {
        allUsers = snapshot.val();
        return createRecommendations();
    });
}
function createRecommendations () {
    // do something magical with movies and allUsers here.
    // then write the recommendations to each user's profile, kind of like this, etc.
    return userRef.update({"userRecommendations" : {"reco1" : "Her", "reco2" : "Black Mirror"}});
}
Forgive the pseudo-code. I hope this gives an idea though.
Then on your frontend you would only have to get the userRecommendations for each user. This way you shift the bandwidth and computing from the user's device to a cloud function. As for efficiency, without knowing how you calculate recommendations, I can't make any suggestions.
Solution 2
If you can't calculate suggestions every hour / day / week, and you have to do it each time user accesses their recommendations panel
Then you can trigger a cloud function every time the user visits their recommendations page. A quick cheat solution I use for this is to write a value into the user's profile like : {getRecommendations:true}, once on pageload, and then in cloud functions listen for changes in getRecommendations. As long as you have a structure like this :
userID > getRecommendations : true
And if you have proper security rules so that each user can only write to their own path, this method also gives you the correct userID of the user making the request, so you know which user to calculate recommendations for. A cloud function can most likely pull 10,000 records faster than the client and saves the user bandwidth; it then writes only the recommendations to the user's profile (similar to Solution 1 above). Your setup would look like this:
[Frontend Code]
//on pageload
userProfileRef.update({"getRecommendations" : true});
userRecommendationsRef.on('value', function(snapshot) { gotUserRecos(snapshot.val()); });
[Cloud Functions (Backend Code)]
exports.userRequestedRecommendations = functions.database.ref('/users/{uid}/getRecommendations').onWrite(event => {
    const uid = event.params.uid;
    // Return the promise chain so the function is not terminated before the writes finish.
    // once() reads each node a single time instead of leaving a listener attached.
    return firebase.database().ref(moviesRef).once('value').then(function(snapshot) {
        movies = snapshot.val();
        return firebase.database().ref(userRefFromUID).once('value');
    }).then(function(snapshot) {
        usersMovieTasteInformation = snapshot.val();
        // do something magical with movies and user's preferences here.
        // then
        return userRecommendationsRef.update({"getRecommendations" : {"reco1" : "Her", "reco2" : "Black Mirror"}});
    });
});
Since your frontend will be listening for changes at userRecommendationsRef, as soon as your cloud function is done, your user will see the results. This might take a few seconds, so consider using a loading indicator.
P.S 1: I ended up using more pseudo-code than originally intended, and removed error handling etc. hoping that this generally gets the point across. If there's anything unclear, comment and I'll be happy to clarify.
P.S. 2: I'm using a very similar flow for a mini-internal-service I built for one of my clients, and it's been happily operating for longer than a month now.
Firebase NoSQL JSON structure best practice is to "Avoid nesting data", but you said you don't want to change your data. So, in your situation, you can make a REST call to any particular node of Firebase (the node of each movie).
Solution 1) You can create a fixed number of threads via a ThreadPoolExecutor. From each worker thread, you can make an HTTP request (REST call) as below. Based on your device's performance and memory, you can decide how many worker threads to use. The code could look something like this:
/* Reuse a single HTTP client across all requests */
OkHttpClient client = new OkHttpClient();
/* Creates a thread pool that reuses a fixed number of worker threads */
ExecutorService threadPoolExecutor = Executors.newFixedThreadPool(10); /* you have 10 different worker threads */
for (int i = 0; i < 100; i++) { /* you can load first 100 movies */
    final int index = i; /* a lambda can only capture (effectively) final variables */
    /* the 10 worker threads read the movies concurrently */
    threadPoolExecutor.execute(() -> {
        /* OkHttp request */
        /* urlStr can be something like "https://earthquakesenotifications.firebaseio.com/movies" */
        /* The Firebase REST API expects ".json" at the end of the path */
        Request request = new Request.Builder().url(urlStr + "/" + index + ".json?print=pretty").build();
        /* Note: Firebase, by default, stores an index for every array element.
           Since you are storing all your movies in the movies JSON array,
           it is easy to read the first (0) from one task,
           the second (1) from the next task, and so on. */
        try (Response response = client.newCall(request).execute()) {
            String str = response.body().string(); /* JSON of the movie at this index */
        } catch (IOException e) {
            e.printStackTrace();
        }
    });
}
threadPoolExecutor.shutdown();
Solution 2) Solution 1 is not based on the Listener-Observer pattern. Firebase actually uses PUSH technology. This means that whenever a particular node changes in the Firebase NoSQL JSON, every client that has a listener attached to that node will get the new data via onDataChange(DataSnapshot dataSnapshot) { }. For this you can attach a listener to the movies node like below:
DatabaseReference moviesRef = FirebaseDatabase.getInstance().getReference().child("movies");
moviesRef.addValueEventListener(new ValueEventListener() {
    @Override
    public void onDataChange(DataSnapshot dataSnapshot) {
        for (DataSnapshot movie : dataSnapshot.getChildren()) {
            /* show your ith movie in ListView. But even if you use RecyclerView, showing each movie in your RecyclerView's item one by one is still slow. */
            /* so you can store each movie in a Movies ArrayList. When everything completes, then you can update RecyclerView */
        }
    }
    @Override
    public void onCancelled(DatabaseError databaseError) {
    }
});
Although you stated your algorithm needs all the movies and all attributes, it does not mean that it processes them all at once. Any computation unit has its limits, and within your algorithm, you probably chunk the data into smaller parts that your computation unit can handle.
Having said that, if you want to speed things up, you can modify your algorithm to parallelize fetching and processing of the data/movies:
                | fetch  | -> |process | -> | fetch  | ...
                |chunk(1)|    |chunk(1)|    |chunk(3)|
(in parallel)   | fetch  | -> |process | ...
                |chunk(2)|    |chunk(2)|
With this approach, you can save almost the whole processing time (all but the last chunk) if processing is really faster than fetching (though you have not said how "relatively fast" your algorithm runs compared to fetching all the movies).
This "high level" approach to your problem is probably your best chance if fetching the movies is really slow, although it requires more work than simply activating a hypothetical "speed up" button of a library. It is a sound approach when dealing with large amounts of data.
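A simplified version of the fetch/process overlap could look like the sketch below. It keeps only one fetch in flight while one chunk is being processed, rather than the two parallel lanes in the diagram. fetchChunk and processChunk are hypothetical placeholders: the first simulates a slow read with a timer, and the second stands in for the real recommender step.

```javascript
// fetchChunk and processChunk are hypothetical placeholders: swap in the real
// database read and the real recommender step.
async function fetchChunk(index, chunkSize) {
  // Simulates a slow network read returning `chunkSize` fake movie ids.
  await new Promise(resolve => setTimeout(resolve, 10));
  const start = index * chunkSize;
  return Array.from({ length: chunkSize }, (_, i) => `movie-${start + i}`);
}

function processChunk(movies) {
  // Stand-in for the real algorithm; here it just counts the movies.
  return movies.length;
}

async function runPipeline(totalChunks, chunkSize) {
  let processed = 0;
  // Start fetching the first chunk before entering the loop.
  let pending = fetchChunk(0, chunkSize);
  for (let i = 0; i < totalChunks; i++) {
    const current = await pending;
    // Kick off the next fetch so it overlaps with processing the current chunk.
    if (i + 1 < totalChunks) pending = fetchChunk(i + 1, chunkSize);
    processed += processChunk(current);
  }
  return processed;
}

runPipeline(4, 250).then(total => console.log(total)); // 1000
```

How much this helps depends on the ratio between fetch time and processing time, and on whether the data can actually be requested in chunks (for example by key range).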

Resources