How to improve performance of Firestore cache query - firebase

I am developing a PWA that displays a list of transactions (a transaction is an object with ~10 fields). I am using Firestore for storage and realtime updates, and I have also enabled persistence.
I want my application to have all the data in memory, and I want to take care of displaying only the necessary information myself (e.g. using virtual scrolling for the transaction list). For this reason I listen to the whole collection (i.e. all the transactions).
At the start of the app, I want to make sure the data is loaded, so I use a one-time cache query to get the transactions. I would expect the query to be nearly instantaneous, but on a laptop it takes about 1 second to get the initial data (I also fetch another collection from cache, which resolves about 2 seconds after the transactions request). On mobile it takes about 9 seconds.
I want my app to feel instantaneous, but it takes a few seconds until the data is in place. Note that I am not doing any advanced queries (I just want to load the data into memory).
Am I doing something wrong? I have read Firestore docs, but I don't think the amount of data that I have in cache should cause such bad performance.
UPDATE: Even if I limit the initial query to load just 20 documents, it still takes about 2 seconds to retrieve them.
UPDATE 2: The code looks like this:
export const initializeFirestore = (): Thunk => (dispatch) => {
  const initialQueries: Array<Promise<unknown>> = []
  getQueries().forEach((query) => {
    const q = query.createFirestoneQuery()
    initialQueries.push(
      q
        .get({
          source: 'cache',
        })
        .then((snapshot) =>
          dispatch(firestoneChangeAction(query, snapshot, true)),
        ),
    )
    q.onSnapshot((change) => {
      dispatch(firestoneChangeAction(query, change))
    })
  })
  console.log('Now I am just waiting for initial data...')
  return Promise.all(initialQueries)
}

You may be interested in the smart approach presented by Firebase engineers during the "Faster web apps with Firebase" session of the Firebase Summit 2019 (you can watch the video here: https://www.youtube.com/watch?v=DHbVyRLkX4c).
In a nutshell, their idea is to use the Firestore REST API to make the first query to the database (which does not need to download any SDK), and in parallel, dynamically import the Web SDK in order to use it for the subsequent queries.
The github repository is here: https://github.com/hsubox76/fireconf-demo
I paste below the content of the key js file (https://github.com/hsubox76/fireconf-demo/blob/master/src/dynamic.js) for further reference.
import { firebaseConfigDynamic as firebaseConfig } from "./shared/firebase-config";
import { renderPage, logPerformance } from "./shared/helpers";
let firstLoad = false;
// Firestore REST URL for "current" collection.
const COLLECTION_URL =
  `https://firestore.googleapis.com/v1/projects/exchange-rates-adcf6/` +
  `databases/(default)/documents/current`;
// STEPS
// 1) Fetch REST data
// 2) Render data
// 3) Dynamically import Firebase components
// 4) Subscribe to Firestore
// HTTP GET from Firestore REST endpoint.
fetch(COLLECTION_URL)
  .then(res => res.json())
  .then(json => {
    // Format JSON data into a tabular format.
    const stocks = formatJSONStocks(json);
    // Measure time between navigation start and now (first data loaded)
    performance && performance.measure("initialDataLoadTime");
    // Render using initial REST data.
    renderPage({
      title: "Dynamic Loading (no Firebase loaded)",
      tableData: stocks
    });
    // Import Firebase library.
    dynamicFirebaseImport().then(firebase => {
      firebase.initializeApp(firebaseConfig);
      firebase.performance(); // Use Firebase Performance - 1 line
      subscribeToFirestore(firebase);
    });
  });
/**
 * FUNCTIONS
 */
// Dynamically imports firebase/app, firebase/firestore, and firebase/performance.
function dynamicFirebaseImport() {
  const appImport = import(
    /* webpackChunkName: "firebase-app-dynamic" */
    "firebase/app"
  );
  const firestoreImport = import(
    /* webpackChunkName: "firebase-firestore-dynamic" */
    "firebase/firestore"
  );
  const performanceImport = import(
    /* webpackChunkName: "firebase-performance-dynamic" */
    "firebase/performance"
  );
  return Promise.all([appImport, firestoreImport, performanceImport]).then(
    ([dynamicFirebase]) => {
      return dynamicFirebase;
    }
  );
}
// Subscribe to "current" collection with `onSnapshot()`.
function subscribeToFirestore(firebase) {
  firebase
    .firestore()
    .collection(`current`)
    .onSnapshot(snap => {
      if (!firstLoad) {
        // Measure time between navigation start and now (first data loaded)
        performance && performance.measure("realtimeDataLoadTime");
        // Log to console for internal development
        logPerformance();
        firstLoad = true;
      }
      const stocks = formatSDKStocks(snap);
      renderPage({
        title: "Dynamic Loading (Firebase now loaded)",
        tableData: stocks
      });
    });
}
// Format stock data in JSON format (returned from REST endpoint)
function formatJSONStocks(json) {
  const stocks = [];
  json.documents.forEach(doc => {
    const pathParts = doc.name.split("/");
    const symbol = pathParts[pathParts.length - 1];
    stocks.push({
      symbol,
      value: doc.fields.closeValue.doubleValue || 0,
      delta: doc.fields.delta.doubleValue || 0,
      timestamp: parseInt(doc.fields.timestamp.integerValue)
    });
  });
  return stocks;
}
// Format stock data in Firestore format (returned from `onSnapshot()`)
function formatSDKStocks(snap) {
  const stocks = [];
  snap.forEach(docSnap => {
    if (!docSnap.data()) return;
    const symbol = docSnap.id;
    const value = docSnap.data().closeValue;
    stocks.push({
      symbol,
      value,
      delta: docSnap.data().delta,
      timestamp: docSnap.data().timestamp
    });
  });
  return stocks;
}

You're not doing anything wrong. The query will take as much time as it needs to finish. This is why many sites use a loading indicator.
For the first query in your app, the measured time includes the time it takes to fully initialize the SDK, which might involve asynchronous work beyond just the query itself. Also bear in mind that reading and sorting data from local disk isn't necessarily "fast", and that for larger numbers of documents, the local disk cache read might even be more expensive than fetching the same documents over the network.
Since we don't have any indication of how many documents you have, and how much total data you're trying to transfer, and the code you're using for this, all we can do is guess. But there's really not much you can do to speed up the initial query, other than perhaps limiting the size of the result set.
If you think that what you're experiencing is a bug, then please file a bug report on GitHub.

Related

How do I know if there are more documents left to get from a firestore collection?

I'm using Flutter and Firebase. I use pagination, with a maximum of 5 documents per page. How do I know if there are more documents left to get from a Firestore collection? I want to use this information to enable/disable a "next page" button presented to the user.
limit: 5 (5 documents each time)
orderBy: "date" (newest first)
startAfterDocument: latestDocument (just a variable that holds the latest document)
This is how I fetch the documents.
collection.limit(5).orderBy("date", descending: true).startAfterDocument(latestDocument).get()
I thought about checking whether the number of docs received from Firestore is equal to 5 and then assuming there are more docs to get. But this will not work if there are a total of n * 5 docs in the collection.
I thought about getting the last document in the collection, storing it, and comparing it to every doc in the batches I get; if there is a match, I know I've reached the end. But this means one excess read.
Or maybe I could keep on getting docs until I get an empty list and assume I've reached the end of the collection.
I still feel there is a much better solution to this.
Let me know if you need more info; this is my first question on this account.
There is no flag in the response to indicate there are more documents. The common solution is to request one more document than you need/display, and then use the presence of that last document as an indicator that there are more documents.
This is also what the database would have to do to include such a flag in its response, which is probably why this isn't an explicit option in the SDK.
You might also want to check the documentation on keeping a distributed count of the number of documents in a collection as that's another way to determine whether you need to enable the UI to load a next page.
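The "request one more document" idea can be sketched as a small helper. This is only an illustration in plain JavaScript: `splitPage` is a hypothetical name, and it assumes the query was issued with a limit of `pageSize + 1` (6 in the question's case) so that the extra document, if present, signals another page.

```javascript
// Hypothetical helper: `docs` is the array returned by a query that was
// issued with limit(pageSize + 1).
function splitPage(docs, pageSize) {
  // More results than we display means at least one further page exists.
  const hasMore = docs.length > pageSize;
  // Only the first `pageSize` documents are shown to the user.
  const items = docs.slice(0, pageSize);
  return { items, hasMore };
}

// Example with plain values standing in for document snapshots.
const page = splitPage(["d1", "d2", "d3", "d4", "d5", "d6"], 5);
console.log(page.hasMore, page.items.length);
```

When paginating this way, the cursor for the next request should be the last displayed item, not the extra document, so that the extra document becomes the first item of the following page.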
Here is a way to fetch a large Firestore collection page by page:
let latestDoc = null; // last document of the previous page, used as the cursor
// result
const dataArr = []; // accumulates the data fetched from Firestore
let loadMore = true; // becomes false once a query returns no documents
const initialQuery = async () => {
  const first = db
    .collection("recipes-test")
    .orderBy("title")
    .startAfter(latestDoc || 0)
    .limit(10);
  const data = await first.get();
  data.docs.forEach((doc) => {
    // console.log("doc.data", doc.data());
    dataArr.push(doc.data()); // push the data into the array
  });
  // update the cursor to the last document of this page
  latestDoc = data.docs[data.docs.length - 1];
  // stop paging once a query comes back empty
  if (data.empty) {
    loadMore = false;
  }
};
// wrap the loop in an async function so that each page can be awaited
const run = async () => {
  // loop until all docs have been fetched
  while (loadMore) {
    console.log({ loadMore });
    await initialQuery();
  }
};

TTFB is taking so long (15s-20s) for simple NextJS page in Firebase production

I have a simple page that is applying SSR as follows:
const page = ({initProps}) => {
  // render some static texts
  // render images
};
page.getInitialProps = async (ctx) => {
  // get id from ctx
  // get data from Firestore (get by id, no aggregation)
  const firebaseRes = await db.collection("organizations")
    .doc(id)
    .get();
  // return data
}
Currently, in the production environment, it takes around 15s for TTFB.
I tried a lot of things (use next/image, reduce the data amount returned by getInitialProps...) to reduce the latency time but no luck.
Is there anything else I can check/improve for my case?
==========
Add more information:
I run my app as a Firebase function
My page is a landing page (static text, static images, dynamic image loading, one Lottie animation)
I'm using TailwindCSS
My NextJS version is 12.x
Inside the initialProp function, I connect to Firestore directly to get the data.
Inside the initialProp function, besides querying the data, I have a signInWithEmailAndPassword to get the token.

How to check if client's contacts are using my app?

I'm currently developing an app using Firebase.
My Firestore Database looks like below:
Once the user passes the Firebase authentication procedure, I create a user document with a field "Phone" that contains their phone number. Basically, everyone who uses the app will be listed in the database.
And here is my challenge:
I'm using the plugin easy_contact_picker to store all the contacts of the users device to a List.
How can I find out whether the user's contacts are using the app, i.e. whether they are listed in the database?
My goal is to create a contact list widget that shows all my contacts, with those who use the app (i.e. who are listed in the database) highlighted or specially marked.
What is the best way to do this, keeping computation to a minimum, given that millions of users may be listed in the database?
Anyone has an idea?
Thanks a lot
First of all, try to avoid giving everyone access to read all users. That is what most people do when handling such a problem, because a query over all users won't work unless you grant the right to read all of them.
For security reasons, I would move the logic for checking whether a user exists into a callable function (not an HTTP function!). That way you can call it from inside your app and check for a single user or for several of them in an array, depending on how your frontend handles it.
It is very important to store all phone numbers in exactly the same format so that you can query for them. Regardless of the number of users, you can then always find a specific one, like here:
var citiesRef = db.collection("users");
var query = citiesRef.where("Phone", "==", "+4912345679");
The numbers need to match exactly, without any spaces or extra characters, and the country prefix (+49 or 0049) must also be written the same way.
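One way to enforce a single format is to normalize every number before it is stored and before it is queried. The following is a minimal sketch under stated assumptions, not a full phone-number parser: `normalizePhone` is a hypothetical helper, the `+49` default country code is taken from the example above, and numbers without any prefix are assumed to be national numbers.

```javascript
// Hypothetical helper: brings a phone number into the form "+<country><number>".
function normalizePhone(raw, defaultCountryCode = "+49") {
  const trimmed = String(raw).trim();
  const hasPlus = trimmed.startsWith("+");
  // Drop spaces, dashes, brackets and every other non-digit character.
  const digits = trimmed.replace(/\D/g, "");
  if (hasPlus) return "+" + digits;
  // "0049..." is the international prefix written with a leading "00".
  if (digits.startsWith("00")) return "+" + digits.slice(2);
  // A single leading "0" is assumed to be a national trunk prefix.
  if (digits.startsWith("0")) return defaultCountryCode + digits.slice(1);
  // Assumption: anything else is a national number without a trunk prefix.
  return defaultCountryCode + digits;
}

console.log(normalizePhone("0049 123-456 79"));
```

Applying the same function on write and on read means that the equality query only ever compares values of one shape. For production use, a dedicated phone-number library would handle country-specific rules that this sketch ignores.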
You could create two callable functions: one to check whether a single user exists in your app, and another that takes an array of phone numbers and returns an array. The Cloud Function can use Promise.all to perform such queries in parallel, so you get your response quite fast.
I'm using a similar approach to add users of my app as admins to specific groups: you just enter the user's email, and if they are in the app they are added. If not, they get an email invitation to join the app.
With the help of Tarik's answer, Ayrix and I came up with the following solution.
Important: Read Tarik's answer for more information.
Client: callable_compare_contacts.dart
import 'package:cloud_functions/cloud_functions.dart';
Future<List<Object>> getMembersByPhoneNumber(List<String> allPhoneNumbers) async {
  HttpsCallable callable = FirebaseFunctions.instance.httpsCallable('membersByPhoneNumber');
  final results = await callable.call(<String, dynamic>{'allPhoneNumbers': allPhoneNumbers});
  return results.data;
}
Server: index.js
const functions = require("firebase-functions");
const admin = require("firebase-admin");
if (admin.apps.length === 0) {
  admin.initializeApp({
    credential: admin.credential.applicationDefault(),
  });
}
exports.membersByPhoneNumber = functions.https.onCall(async (data, context) => {
  // Guard against a missing or empty list of phone numbers.
  if (!data || !Array.isArray(data.allPhoneNumbers) || !data.allPhoneNumbers.length) {
    return [];
  }
  // Copy the input so that splice() below does not mutate the request data.
  const phoneNumbers = [...data.allPhoneNumbers];
  const collectionRef = admin.firestore().collection("User");
  const batches = [];
  while (phoneNumbers.length) {
    // An "in" filter accepts a limited number of values (10 when this was written).
    const batch = phoneNumbers.splice(0, 10);
    // Queue one query per batch; each resolves to the phone numbers that matched.
    batches.push(
      collectionRef.where("Phone", "in", batch).get()
        .then((results) => results.docs.map((doc) => doc.data().Phone))
    );
  }
  // Run all queries in parallel and return one flat array to the client.
  // A failed query now rejects the call instead of leaving it pending forever.
  const content = await Promise.all(batches);
  return content.flat();
});
Note: This is our first callable/Cloud Function, so suggestions for changes are welcome.

Why is it not possible to orderBy on different fields in Cloud Firestore and how can I work around it?

I have a collection in firebase cloud firestore called 'posts' and I want to show the most liked posts in the last 24h on my web app.
The post documents have a field called 'like_count' (number) and another field called 'time_posted' (timestamp).
I also want to be able to limit the results to apply pagination.
I tried to apply a filter to only get the posts posted in the last 24 hours and then ordering them by the 'like_count' and then the 'time_posted' since I want the posts with the most likes to appear first.
postsRef.where("time_posted", ">", twentyFourHoursAgo)
.orderBy("like_count", "desc")
.orderBy("time_posted", "desc")
.limit(10)
However, I quickly found out that it is not possible to filter and then sort by two different fields.
(See the Limitations part of the documentation for Order and limit data with Cloud Firestore)
It states:
Invalid: Range filter and first orderBy on different fields
I thought about sorting the results by 'like_count' in the frontend, but this won't work properly because I don't have all the documents. And getting all the documents is infeasible for a large number of daily posts.
Is there an easy work-around I am missing or how can I go about this?
When performing a query, Firestore must be able to traverse an index in a continuous fashion.
This introduction video is a little outdated (because "OR" queries are now possible using the "in" operator) but it does give a good visualization of what Firestore is doing as it runs a query.
If your query was just postsRef.orderBy("like_count", "desc").limit(10), Firestore would load up the index it has for a descending "like_count", pluck the first 10 entries and return them.
To handle your query, it would have to pluck an entry off the descending "like_count" index, compare it to your "time_posted" requirement, and either discard it or add it to a list of valid entries. Once it has all of the recent posts, it then needs to sort the results as you specified. As these steps don't make use of a continuous read of an index, it is disallowed.
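The cost of such a scan can be illustrated with a toy model. This is not how Firestore is implemented; it is a hypothetical in-memory stand-in for an index sorted by "like_count" (descending), with `time_posted` simplified to a number, meant only to show that entries matching the time filter are scattered through the index rather than stored contiguously.

```javascript
// Hypothetical model: walk an index ordered by like_count and keep the entries
// that pass the time filter, counting how many entries had to be inspected.
function scanLikeIndex(indexEntries, cutoffMs, limit) {
  const matches = [];
  let scanned = 0;
  for (const entry of indexEntries) {
    if (matches.length >= limit) break;
    scanned += 1;
    if (entry.time_posted > cutoffMs) matches.push(entry);
  }
  return { matches, scanned };
}

// Already sorted by like_count, descending; the recent posts are "b" and "d".
const likeIndex = [
  { id: "a", like_count: 90, time_posted: 10 },
  { id: "b", like_count: 80, time_posted: 500 },
  { id: "c", like_count: 70, time_posted: 20 },
  { id: "d", like_count: 60, time_posted: 600 },
];
const result = scanLikeIndex(likeIndex, 100, 2);
console.log(result.scanned, result.matches.map((e) => e.id));
```

In this model, finding two recent posts requires inspecting all four entries, and in the worst case the whole index. A contiguous read, by contrast, inspects exactly as many entries as it returns.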
The solution would be to build your own index from the recent posts and then pluck the results off of that. Because doing this on the client is ill-advised, you should instead make use of a Cloud Function to do the work for you. The following code makes use of a Callable Cloud Function.
import * as functions from "firebase-functions";
import * as admin from "firebase-admin";
admin.initializeApp(); // skip this if the Admin SDK is already initialized elsewhere
const MS_TWENTY_FOUR_HOURS = 24 * 60 * 60 * 1000;
export const getRecentTopPosts = functions.https.onCall(async (data, context) => {
  // unless otherwise stated, return only 10 entries
  const limit = Number(data.limit) || 10;
  const postsRef = admin.firestore().collection("posts");
  // OPTIONAL CODE SEGMENT: Check Cached Index
  // "time_posted" is a timestamp field, so compare it against a Date, not a number
  const twentyFourHoursAgo = new Date(Date.now() - MS_TWENTY_FOUR_HOURS);
  const recentPostsSnapshot = await postsRef
    .where("time_posted", ">", twentyFourHoursAgo)
    .get();
  const orderedPosts = recentPostsSnapshot.docs
    .map(postDoc => ({
      snapshot: postDoc,
      like_count: postDoc.get("like_count"),
      // convert the Timestamp to milliseconds so it can be subtracted when sorting
      time_posted: postDoc.get("time_posted").toMillis()
    }))
    .sort((p1, p2) => {
      const deltaLikes = p2.like_count - p1.like_count; // descending sort based on like_count
      if (deltaLikes !== 0) {
        return deltaLikes;
      }
      return p2.time_posted - p1.time_posted; // descending sort based on time_posted
    });
  // OPTIONAL CODE SEGMENT: Save Cached Index
  return orderedPosts
    .slice(0, limit)
    .map(post => ({
      _id: post.snapshot.id,
      ...post.snapshot.data()
    }));
});
If this code is expected to be called by many clients, you may wish to cache the index to save it getting constantly rebuilt by inserting the following segments into the function above.
// OPTIONAL CODE SEGMENT: Check Cached Index
if (!data.skipCache) { // allow option to bypass cache
  const cachedIndexSnapshot = await admin.firestore()
    .doc("_serverCache/topRecentPosts")
    .get();
  const oneMinuteAgo = Date.now() - 60000;
  // if the index was created in the past minute, reuse it
  if (cachedIndexSnapshot.get("timestamp") > oneMinuteAgo) {
    const recentPostMetadataArray = cachedIndexSnapshot.get("posts");
    const recentPostIdArray = recentPostMetadataArray
      .slice(0, limit)
      .map((postMeta) => postMeta.id);
    const postDocs = await fetchDocumentsWithId(postsRef, recentPostIdArray); // see https://gist.github.com/samthecodingman/aea3bc9481bbab0a7fbc72069940e527
    // postDocs is not ordered, so we need to be able to find each entry by its ID
    const postDocsById = {};
    for (const doc of postDocs) {
      postDocsById[doc.id] = doc;
    }
    return recentPostIdArray
      .map(id => {
        // may be undefined if not found (i.e. recently deleted)
        const postDoc = postDocsById[id];
        if (!postDoc) {
          return null; // deleted post, up to you how to handle
        } else {
          return {
            _id: postDoc.id,
            ...postDoc.data()
          };
        }
      });
  }
}
// OPTIONAL CODE SEGMENT: Save Cached Index
if (!data.skipCache) { // allow option to bypass cache
  await admin.firestore()
    .doc("_serverCache/topRecentPosts")
    .set({
      timestamp: Date.now(),
      posts: orderedPosts
        .slice(0, 25) // cache the maximum expected amount
        .map(post => ({
          id: post.snapshot.id,
          like_count: post.like_count,
          time_posted: post.time_posted,
        }))
    });
}
Other improvements you could add to this function include:
A field mask - i.e. instead of returning every part of the post documents, return just the title, like count, time posted and the author.
Variable post age (instead of 24 hours)
Variable minimum likes count
Filter by author
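Two of the improvements listed above, the field mask and the minimum like count, can be sketched on plain objects. This is only an illustration: `rankPosts` and its option names are hypothetical, `time_posted` is assumed to be a number of milliseconds, and the `title` and `author` field names come from the list above rather than from the question's data model.

```javascript
// Hypothetical helper: filter, sort and trim already-fetched post objects.
function rankPosts(posts, options = {}) {
  const {
    limit = 10,
    minLikes = 0,
    fields = ["title", "like_count", "time_posted", "author"],
  } = options;
  return posts
    // filter() returns a new array, so the caller's array is not reordered.
    .filter((post) => post.like_count >= minLikes)
    // Most likes first; ties are broken by the newest post.
    .sort((a, b) => b.like_count - a.like_count || b.time_posted - a.time_posted)
    .slice(0, limit)
    // Field mask: copy only the requested fields that exist on the post.
    .map((post) =>
      Object.fromEntries(
        fields.filter((field) => field in post).map((field) => [field, post[field]])
      )
    );
}

const samplePosts = [
  { title: "a", like_count: 5, time_posted: 1, author: "x", body: "long text" },
  { title: "b", like_count: 9, time_posted: 2, author: "y", body: "long text" },
  { title: "c", like_count: 5, time_posted: 3, author: "z", body: "long text" },
];
console.log(rankPosts(samplePosts, { limit: 2 }));
```

Inside the Cloud Function, the same shaping would be applied to the documents just before they are returned, which keeps the response small for clients that only render a list.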

minimize time operation in firebase/firestore

I am building a React Native app with Firebase and Firestore.
What I want to do is set the user's status to 'online' when the user opens the app (a kind of presence system) and to 'offline' when the user closes it.
I did it with firebase.database.onDisconnect(), and it works fine.
This is the function:
async signupAnonymous() {
  const user = await firebase.auth().signInAnonymouslyAndRetrieveData();
  this.uid = firebase.auth().currentUser.uid
  this.userStatusDatabaseRef = firebase.database().ref(`UserStatus/${this.uid}`);
  this.userStatusFirestoreRef = firebase.firestore().doc(`UserStatus/${this.uid}`);
  firebase.database().ref('.info/connected').on('value', async connected => {
    if (connected.val() === false) {
      // this.userStatusFirestoreRef.set({ state: 'offline', last_changed: firebase.firestore.FieldValue.serverTimestamp()},{merge:true});
      return;
    }
    await firebase.database().ref(`UserStatus/${this.uid}`).onDisconnect().set({ state: 'offline', last_changed: firebase.firestore.FieldValue.serverTimestamp() },{merge:true});
    this.userStatusDatabaseRef.set({ state: 'online', last_changed: firebase.firestore.FieldValue.serverTimestamp() },{merge:true});
    // this.userStatusFirestoreRef.set({ state: 'online',last_changed: firebase.firestore.FieldValue.serverTimestamp() },{merge:true});
  });
}
After that, I added a trigger that copies the data into Firestore (because I want to work with Firestore). This is the function (it works fine, BUT it takes 3-4 seconds):
module.exports.onUserStatusChanged = functions.database
  .ref('/UserStatus/{uid}').onUpdate((change,context) => {
    const eventStatus = change.after.val();
    const userStatusFirestoreRef = firestore.doc(`UserStatus/${context.params.uid}`);
    return change.after.ref.once("value").then((statusSnapshot) => {
      return statusSnapshot.val();
    }).then((status) => {
      console.log(status, eventStatus);
      if (status.last_changed > eventStatus.last_changed) return status;
      eventStatus.last_changed = new Date(eventStatus.last_changed);
      //return userStatusFirestoreRef.set(eventStatus);
      return userStatusFirestoreRef.set(eventStatus,{merge:true});
    });
  });
Then I want to count the online users in the app, so I added a trigger that fires when new data is written to the Firestore document and counts the online users with a query (it works fine but takes 4-7 seconds):
module.exports.countOnlineUsers = functions.firestore.document('/UserStatus/{uid}').onWrite((change,context) => {
  console.log('userStatus')
  const userOnlineCounterRef = firestore.doc('Counters/onlineUsersCounter');
  const docRef = firestore.collection('UserStatus').where('state','==','online').get().then(e=>{
    let count = e.size;
    console.log('count',count)
    return userOnlineCounterRef.update({count})
  })
  return Promise.resolve({success:'added'})
})
Then, in my React Native app, I get the count of online users:
this.unsubscribe = firebase.firestore().doc(`Counters/onlineUsersCounter`).onSnapshot(doc=>{
  console.log('count',doc.data().count)
})
All the operations together take about 12 seconds. That is too much for me; it's an online app.
my firebase structure
What am I doing wrong? Maybe there is an unnecessary function or something?
My final goals:
minimize the operation time.
get the online users count (with a listener: on each change, it will update in the app)
update the user status.
If there is another way to do that, I would love to know.
Cloud Functions are subject to 'cold starts': a function that has been idle takes some time to boot up when it is next invoked. This is the only reason I can think of that it would take that long. Stack Overflow: Firebase Cloud Functions Is Very Slow
But your Cloud Function only needs to write to Firestore on log out, to catch the case where your user closes the app. On log in, you can write to Firestore directly from your client with auth().onAuthStateChanged().
You could also just always read who is logged in or out directly from the Realtime Database and use Firestore for the rest of your data.
You can rearrange your data so that instead of a 'UserStatus' collection you have an 'OnlineUsers' collection containing only online users, kept in sync by deleting the documents on log out. Then it won't take a query operation to get them. The query's impact on your performance is likely minimal, but this would perform better with a large number of users.
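The 'OnlineUsers' layout can be sketched as follows. This is a hedged illustration, not a drop-in replacement for the triggers in the question: `syncOnlineUser` is a hypothetical helper, `firestore` is assumed to be an initialized Firestore instance using the namespaced API (for example `firebase.firestore()` or `admin.firestore()`), and the `since` field is an invented example payload.

```javascript
// Hypothetical helper: keep one document per online user and delete it on log out.
// Returns the promise of the underlying write so the caller can await it.
function syncOnlineUser(firestore, uid, state) {
  const ref = firestore.doc(`OnlineUsers/${uid}`);
  return state === "online"
    ? ref.set({ since: Date.now() }) // assumption: client clock is good enough here
    : ref.delete();
}

// Minimal in-memory stand-in for Firestore, used only to exercise the helper.
const fakeStore = new Map();
const fakeFirestore = {
  doc: (path) => ({
    set: (value) => { fakeStore.set(path, value); return Promise.resolve(); },
    delete: () => { fakeStore.delete(path); return Promise.resolve(); },
  }),
};
syncOnlineUser(fakeFirestore, "u1", "online");
console.log(fakeStore.has("OnlineUsers/u1"));
```

With this layout, a client that listens to the 'OnlineUsers' collection can read the number of online users from the size of each snapshot, without a separate counter document or a filtered query.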
The documentation also has a guide that may be useful: Firebase Docs: Build Presence in Cloud Firestore
