Firestore cloud functions for getting aggregated values in a collection - Firebase

Let's say I have an app that stores users' bills; the user adds or deletes bills. The structure of the data is:
users/user_id/bills/bill_id
The bill document structure is:
{ bill_name: string, amount: number }
I want to show the user aggregated values grouped by bill name.
Let's say I have 2 entries in the bills collection:
{ bill_name: 'amazon', amount: 1000 }
{ bill_name: 'amazon', amount: 2000 }
My output should be:
{ bill_name: 'amazon', amount: 3000 }
My question is: what is the best way to get the aggregated values?
One option: create a cloud function that triggers onWrite at /users/{user_id}/bills/{bill_id}
and creates or updates an entry in /users/{user_id}/aggregated_bills/ whenever the user adds or deletes a bill. What this function does is read the entry from /users/{user_id}/aggregated_bills/ whose bill name is 'amazon', do the math on it, and store the new value back in the aggregated_bills collection.
I also want to know whether we can read from or write to parts of the database other than the document the function was triggered on, i.e. other than the path referenced in functions.firestore.document('/users/{user_id}/bills/{bill_id}').onWrite( ...
Another option: create a cloud function that triggers on an HTTPS request, reads the data from /users/{user_id}/bills/ where the bill name is 'amazon', calculates the aggregated values there, and returns the response.
Or maybe there is some other solution to this problem.
I should also mention that getting aggregated values will not be for just one bill name, but for multiple bill names at the same time. Let's say I want to show the user a dashboard where he sees the aggregated values of his top 20 bills.

The onWrite trigger would make the most sense in this situation. Consider writing the aggregate results as an object/map on the parent document because then you will only need one read operation to consume the data - faster and cheaper.
Your cloud function would look something like this:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

exports.aggregateBills = functions.firestore
  .document('users/{user_id}/bills/{bill_id}')
  .onWrite((change, context) => {
    const user_id = context.params.user_id;
    // read the bill name from the written document (fall back to the
    // previous version when the bill was deleted)
    const bill_name = change.after.exists
      ? change.after.data().bill_name
      : change.before.data().bill_name;
    // ref to the parent user document
    const docRef = admin.firestore().collection('users').doc(user_id);
    // get all bills with this name and aggregate them
    return docRef.collection('bills')
      .where('bill_name', '==', bill_name)
      .get()
      .then(querySnapshot => {
        // sum the amount of every matching bill
        let total = 0;
        querySnapshot.forEach(doc => {
          total += doc.data().amount;
        });
        // store the result in an aggregates map on the user document
        return docRef.update({ [`aggregates.${bill_name}`]: total });
      });
  });
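On the client, consuming the aggregates is then a single document read. A minimal sketch with the web SDK (v8 style, matching the other examples here), assuming the function above stores the totals in an aggregates map on the user document:
firebase.firestore()
  .collection('users')
  .doc(userId)
  .get()
  .then(userSnap => {
    // e.g. { amazon: 3000, ... } - ready to render on the dashboard
    const aggregates = userSnap.data().aggregates || {};
    console.log(aggregates);
  });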

Related

How do I know if there are more documents left to get from a firestore collection?

I'm using Flutter and Firebase. I use pagination, with a maximum of 5 documents per page. How do I know if there are more documents left to get from a Firestore collection? I want to use this information to enable/disable a next-page button presented to the user.
limit: 5 (5 documents each time)
orderBy: "date" (newest first)
startAfterDocument: latestDocument (just a variable that holds the latest document)
This is how I fetch the documents.
collection.limit(5).orderBy("date", descending: true).startAfterDocument(latestDocument).get()
I thought about checking whether the number of docs received from Firestore equals 5 and, if so, assuming there are more docs to get. But this will not work if there are a total of n * 5 docs in the collection.
I thought about getting the last document in the collection, storing it, and comparing it to every doc in the batches I get; if there is a match then I know I've reached the end, but this means one excess read.
Or maybe I could keep on getting docs until I get an empty list and assume I've reached the end of the collection.
I still feel there is a much better solution to this.
Let me know if you need more info; this is my first question on this account.
There is no flag in the response to indicate there are more documents. The common solution is to request one more document than you need/display, and then use the presence of that last document as an indicator that there are more documents.
This is also what the database would have to do to include such a flag in its response, which is probably why this isn't an explicit option in the SDK.
You might also want to check the documentation on keeping a distributed count of the number of documents in a collection as that's another way to determine whether you need to enable the UI to load a next page.
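A minimal sketch of that limit-plus-one approach, written against the Node.js SDK used in the other answers here (the same idea carries over to the Flutter SDK, where startAfter becomes startAfterDocument):
const PAGE_SIZE = 5;

// Returns one page of documents plus a flag telling the UI whether a next page exists.
async function fetchPage(collectionRef, lastVisibleDoc) {
  let query = collectionRef.orderBy('date', 'desc').limit(PAGE_SIZE + 1); // ask for one extra doc
  if (lastVisibleDoc) {
    query = query.startAfter(lastVisibleDoc);
  }
  const snapshot = await query.get();
  const hasMore = snapshot.docs.length > PAGE_SIZE;   // extra doc present => more pages exist
  const pageDocs = snapshot.docs.slice(0, PAGE_SIZE); // drop the extra doc from the visible page
  return {
    docs: pageDocs,
    lastVisibleDoc: pageDocs[pageDocs.length - 1],
    hasMore,
  };
}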
Here's a way to get a large amount of data from a Firestore collection:
let latestDoc = null; // stores the last doc from the previous query result
const dataArr = [];   // stores the data fetched from Firestore
let loadMore = true;  // tracks whether there is more data to fetch

const initialQuery = async () => {
  const first = db
    .collection("recipes-test")
    .orderBy("title")
    .startAfter(latestDoc || 0)
    .limit(10);
  const data = await first.get();
  data.docs.forEach((doc) => {
    dataArr.push(doc.data()); // push each document's data into the array
  });
  // remember the latest doc so the next query starts after it
  latestDoc = data.docs[data.docs.length - 1];
  // stop fetching once a query comes back empty
  if (data.empty) {
    loadMore = false;
  }
};

// run the queries through this function so we can await each batch from Firestore
const run = async () => {
  // loop until we have fetched all the docs
  while (loadMore) {
    console.log({ loadMore });
    await initialQuery();
  }
};

Why is it not possible to orderBy on different fields in Cloud Firestore and how can I work around it?

I have a collection in Firebase Cloud Firestore called 'posts', and I want to show the most-liked posts from the last 24 hours on my web app.
The post documents have a field called 'like_count' (number) and another field called 'time_posted' (timestamp).
I also want to be able to limit the results to apply pagination.
I tried to apply a filter to get only the posts created in the last 24 hours, and then order them by 'like_count' and then by 'time_posted', since I want the posts with the most likes to appear first.
postsRef.where("time_posted", ">", twentyFourHoursAgo)
.orderBy("like_count", "desc")
.orderBy("time_posted", "desc")
.limit(10)
However, I quickly found out that it is not possible to apply a range filter on one field and order first by a different field.
(See the Limitations part of the documentation for Order and limit data with Cloud Firestore)
It states:
Invalid: Range filter and first orderBy on different fields
I thought about sorting the results by 'like_count' in the frontend, but this won't work properly because I don't have all the documents. And getting all the documents is infeasible for a large number of daily posts.
Is there an easy work-around I am missing or how can I go about this?
When performing a query, Firestore must be able to traverse an index in a continuous fashion.
This introduction video is a little outdated (because "OR" queries are now possible using the "in" operator) but it does give a good visualization of what Firestore is doing as it runs a query.
If your query was just postsRef.orderBy("like_count", "desc").limit(10), Firestore would load up the index it has for a descending "like_count", pluck the first 10 entries and return them.
To handle your query, it would have to pluck an entry off the descending "like_count" index, compare it to your "time_posted" requirement, and either discard it or add it to a list of valid entries. Once it has all of the recent posts, it then needs to sort the results as you specified. As these steps don't make use of a continuous read of an index, it is disallowed.
The solution would be to build your own index from the recent posts and then pluck the results off of that. Because doing this on the client is ill-advised, you should instead make use of a Cloud Function to do the work for you. The following code makes use of a Callable Cloud Function.
const MS_TWENTY_FOUR_HOURS = 24 * 60 * 60 * 1000;

export const getRecentTopPosts = functions.https.onCall(async (data, context) => {
  // unless otherwise stated, return only 10 entries
  const limit = Number(data.limit) || 10;
  const postsRef = admin.firestore().collection("posts");

  // OPTIONAL CODE SEGMENT: Check Cached Index

  const twentyFourHoursAgo = Date.now() - MS_TWENTY_FOUR_HOURS;
  const recentPostsSnapshot = await postsRef
    .where("time_posted", ">", twentyFourHoursAgo)
    .get();

  const orderedPosts = recentPostsSnapshot.docs
    .map(postDoc => ({
      snapshot: postDoc,
      like_count: postDoc.get("like_count"),
      time_posted: postDoc.get("time_posted")
    }))
    .sort((p1, p2) => {
      const deltaLikes = p2.like_count - p1.like_count; // descending sort based on like_count
      if (deltaLikes !== 0) {
        return deltaLikes;
      }
      return p2.time_posted - p1.time_posted; // descending sort based on time_posted
    });

  // OPTIONAL CODE SEGMENT: Save Cached Index

  return orderedPosts
    .slice(0, limit)
    .map(post => ({
      _id: post.snapshot.id,
      ...post.snapshot.data()
    }));
});
If this code is expected to be called by many clients, you may wish to cache the index to save it from being constantly rebuilt, by inserting the following segments into the function above.
// OPTIONAL CODE SEGMENT: Check Cached Index
if (!data.skipCache) { // allow option to bypass cache
  const cachedIndexSnapshot = await admin.firestore()
    .doc("_serverCache/topRecentPosts")
    .get();

  const oneMinuteAgo = Date.now() - 60000;

  // if the index was created in the past minute, reuse it
  if (cachedIndexSnapshot.get("timestamp") > oneMinuteAgo) {
    const recentPostMetadataArray = cachedIndexSnapshot.get("posts");
    const recentPostIdArray = recentPostMetadataArray
      .slice(0, limit)
      .map((postMeta) => postMeta.id);

    const postDocs = await fetchDocumentsWithId(postsRef, recentPostIdArray); // see https://gist.github.com/samthecodingman/aea3bc9481bbab0a7fbc72069940e527

    // postDocs is not ordered, so we need to be able to find each entry by its ID
    const postDocsById = {};
    for (const doc of postDocs) {
      postDocsById[doc.id] = doc;
    }

    return recentPostIdArray
      .map(id => {
        // may be undefined if not found (i.e. recently deleted)
        const postDoc = postDocsById[id];
        if (!postDoc) {
          return null; // deleted post, up to you how to handle
        } else {
          return {
            _id: postDoc.id,
            ...postDoc.data()
          };
        }
      });
  }
}
// OPTIONAL CODE SEGMENT: Save Cached Index
if (!data.skipCache) { // allow option to bypass cache
  await admin.firestore()
    .doc("_serverCache/topRecentPosts")
    .set({
      timestamp: Date.now(),
      posts: orderedPosts
        .slice(0, 25) // cache the maximum expected amount
        .map(post => ({
          id: post.snapshot.id,
          like_count: post.like_count,
          time_posted: post.time_posted,
        }))
    });
}
Other improvements you could add to this function include:
A field mask - i.e. instead of returning every part of the post documents, return just the title, like count, time posted, and author.
Variable post age (instead of 24 hours)
Variable minimum likes count
Filter by author
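For reference, calling this Callable Function from a web client would look something like the following sketch (Firebase JS SDK, v8 style, matching the other examples; error handling omitted):
const getRecentTopPosts = firebase.functions().httpsCallable('getRecentTopPosts');

getRecentTopPosts({ limit: 10 })
  .then(result => {
    // result.data is the ordered array of post objects returned by the function
    result.data.forEach(post => console.log(post._id, post.like_count));
  });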

How to automatically update same field in different collections in Firestore

I am using Cloud Firestore as my database, and I have a collection of users where basic information about each user is stored, such as id, name, last name, email, and company id.
I also have a collection of companies, and each company has a subcollection of tasks.
Each task has one user assigned from the users collection (the user data is replicated, so the task holds the same data for that user as the users collection does).
The problem is that when I update a user (change name, email, ...) in the users collection, that data is not changed in the tasks collections for that specific user, because the data is replicated.
Is there any way, using Firestore, to automatically update the user data in the tasks collections when a user in the users collection is updated?
This is quite a standard case in NoSQL databases, where we often denormalize data and need to keep this data in sync.
Basically you have two possible main approaches:
#1 Update from the client
When you update the "user" document, update at the same time the other documents (i.e. "tasks") which contain the user's details. You should use a batched write to do so: A batch of writes completes atomically and can write to multiple documents.
Something along the following lines:
// Get a new write batch
var batch = db.batch();
var userRef = db.collection('users').doc('...');
batch.update(userRef, {name: '....', foo: '....'});
let userTaskRef = db.collection('companies').doc('...').collection('tasks').doc('taskId1');
batch.update(userTaskRef, {name: '....'});
userTaskRef = db.collection('companies').doc('...').collection('tasks').doc('taskId2');
batch.update(userTaskRef, {name: '....'});
// ...
// Commit the batch
batch.commit().then(function () {
  // ...
});
Note that you need to know which other ("tasks") documents contain the user's details: you may need to do a query to get these documents (and their DocumentReferences).
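One way to do that query, sketched here under the assumption that each task document embeds the replicated user data in a user map (including the user's id) and that a collection group index exists for 'tasks':
// Sketch: find every task document (across all companies) that embeds this user,
// then update just the replicated fields in one batch.
async function updateUserInTasks(userId, newName) {
  const tasksSnapshot = await db.collectionGroup('tasks')
    .where('user.id', '==', userId)
    .get();

  const batch = db.batch();
  tasksSnapshot.forEach(taskDoc => {
    batch.update(taskDoc.ref, { 'user.name': newName });
  });
  return batch.commit();
}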
#2 Update in the back-end via a Cloud Function
Write and deploy a Cloud Function that is triggered when any "user" document is updated and which takes the value of this "user" document and update the "tasks" documents which contain the user's details.
As with the first approach, you also need, in this case, to know which ("tasks") documents contain the user's details.
Following your comment ("Is there any option to reference to another table or put foreign key?") here is a Cloud Function that will update all the ("tasks") documents that have their DocumentReference contained in a dedicated Array field taskRefs in the "user" doc. The Array members are of data type Reference.
exports.updateUser = functions.firestore
  .document('users/{userId}')
  .onUpdate((change, context) => {
    const newValue = change.after.data();
    const name = newValue.name;
    const taskRefs = newValue.taskRefs;
    // return the update promise from map() so Promise.all can wait for every write
    const promises = taskRefs.map(ref => ref.update({ name: name, foo: "bar" }));
    return Promise.all(promises);
  });
You would most probably set the value of this taskRefs field in the "user" doc from your frontend. Something along the following lines with the JS SDK:
const db = firebase.firestore();
db.collection('users').doc('...').set({
  field1: "foo",
  field2: "bar",
  taskRefs: [ // <= this is an Array of References
    db.collection('companies').doc('...').collection('tasks').doc('....'),
    db.collection('companies').doc('...').collection('tasks').doc('....')
  ]
});

Cloud Functions Update Sub-collections

I'm trying to create a cloud function that triggers every time a product in my project gets updated. Here is the idea:
I have 2 collections, stores and products.
Inside the stores collection, there is a sub-collection called products that contains all the products that the store sells. The data 'gets fed' by copying specific items from the main products root collection.
The idea is that my project gets good performance and stays cost effective.
In order for this to work, I need to create a cloud function that is triggered every time a product gets modified, queries all the stores that have that same product id, and updates the data.
I'm having a really hard time with that. Can anybody shine a light on this for me? This is my cloud function:
// Exporting the function
export const onProductChange = functions.firestore
  .document('products/{productId}')
  // Call the update method
  .onUpdate(async (snap, context) => {
    // Get the product ID
    const productID = context.params.productID;
    // Query for all the collections with the specific product ID.
    const resultSnapshot = await db.collectionGroup('products')
      .where('id', '==', productID).get();
    // Filter for the collections with the 'products' root and return an array.
    const snaphotsInStoreSubcollection = resultSnapshot.docs.filter(
      (snapshot: any) => {
        return snapshot.ref.parent === 'products';
      });
    const batch = db.batch();
    // Takes each product and update
    snaphotsInStoreSubcollection.forEach((el: any) => {
      batch.set(el.ref, snaphotsInStoreSubcollection.product);
    });
    await batch.commit();
  });
Error on the cloud function console:
Error: Value for argument "value" is not a valid query constraint. Cannot use "undefined" as a Firestore value. at Object.validateUserInput (/srv/node_modules/@google-cloud/firestore/build/src/serializer.js:273:15) at validateQueryValue (/srv/node_modules/@google-cloud/firestore/build/src/reference.js:1844:18) at Query.where (/srv/node_modules/@google-cloud/firestore/build/src/reference.js:956:9) at exports.onProductChange.functions.firestore.document.onUpdate (/srv/lib/product-update.js:29:10) at cloudFunction (/srv/node_modules/firebase-functions/lib/cloud-functions.js:131:23) at /worker/worker.js:825:24 at at process._tickDomainCallback (internal/process/next_tick.js:229:7)
I would suggest you take a look at this documentation, and especially at the section on Event Triggers.
Let me know if this helps.
I think snaphotsInStoreSubcollection.product is undefined in
batch.set(el.ref, snaphotsInStoreSubcollection.product);
A snapshot is a document, and its data is snapshot.data().
You cannot set undefined as a value in Firestore, and that is what you are attempting to do.
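Putting those fixes together (matching the wildcard casing and reading the updated data from the change object), the handler could look roughly like this sketch; it keeps the original code's assumption that each copied product document carries an id field matching the root product's ID:
export const onProductChange = functions.firestore
  .document('products/{productId}')
  .onUpdate(async (change, context) => {
    // the wildcard is {productId}, so read it with the same casing
    const productId = context.params.productId;
    // the updated product data lives on change.after
    const updatedProduct = change.after.data();

    // find every copy of this product across all 'products' subcollections
    // (db = admin.firestore(), as in the original snippet)
    const resultSnapshot = await db.collectionGroup('products')
      .where('id', '==', productId)
      .get();

    const batch = db.batch();
    resultSnapshot.docs.forEach(doc => {
      // skip the root 'products' document that triggered this function
      if (doc.ref.path === change.after.ref.path) {
        return;
      }
      batch.set(doc.ref, updatedProduct);
    });
    return batch.commit();
  });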

Firestore Cloud Functions - Keeping Count of Amount of Documents in Collection

I am trying to write a cloud function that will keep track of the number of documents in a collection. There isn't a ton of documentation on this, probably because Firestore is so new, so I was trying to think of the best way to do this. This is the solution I came up with, but I can't figure out how to return the count.
Document 1 -> Collection - > Documents
Document 1 would ideally store the count of documents in the collection, but I can't seem to figure out how to relate the two.
Let's just assume Document1 is a blog post and the subcollection is comments.
Trigger the function on comment doc create.
Read the parent doc and increment its existing count.
Write the data to the parent doc.
Note: If the count value changes faster than once per second, you may need a distributed counter: https://firebase.google.com/docs/firestore/solutions/counters
exports.aggregateComments = functions.firestore
  .document('posts/{postId}/comments/{commentId}')
  .onCreate((snap, context) => {
    const postId = context.params.postId;
    // ref to the parent document
    const docRef = admin.firestore().collection('posts').doc(postId);
    return docRef.get().then(postSnap => {
      // get the total comment count and add one (default to 0 if the field is missing)
      const commentCount = (postSnap.data().commentCount || 0) + 1;
      const data = { commentCount };
      // run update
      return docRef.update(data);
    });
  });
I put together a detailed firestore aggregation example if you need to run advanced aggregation calculations beyond a simple count.
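As a side note, a sketch of the same counter using FieldValue.increment() avoids reading the parent document first, and with it the race between the get and the update (the distributed-counter caveat above still applies for very high write rates):
exports.aggregateComments = functions.firestore
  .document('posts/{postId}/comments/{commentId}')
  .onCreate((snap, context) => {
    const postId = context.params.postId;
    // atomically add one to the existing count without reading it first
    return admin.firestore()
      .collection('posts')
      .doc(postId)
      .update({ commentCount: admin.firestore.FieldValue.increment(1) });
  });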

Resources