Firestore efficient/fast query posts a user has liked - firebase

I have a collection with users and a collection with posts.
In every post there is a sub collection of user ids that liked that post. Additionally I also have a sub collection in the users where I save the id of the posts that a user liked.
Now my question is, what would be the most efficient query to get all the posts a specific user liked.
The ways I know are: (examples are in Javascript)
Query the sub collection of a user to get all the post ids that the user liked. Then get all the documents with a query and a query constraint.
The problem is that I need to split this into multiple queries, because the "in" query constraint only accepts a limited number of values in the array (10 at the time of writing):
const data = query(collection(db, "posts"), where("postId", "in", postIds))
Query the sub collection of a user to get all the post ids that the user liked. Then get all the documents (posts) individually.
So for every post id I need to do this:
const data = await getDoc(doc(db, "posts", postId))
Is there any other, more efficient and faster way of achieving this?
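The first option can be sketched as follows: split the liked post ids into chunks and run one query per chunk in parallel. This is only a sketch under assumptions, not a definitive answer. runQuery is a hypothetical callback that wraps the real Firestore call (in the Web v9 SDK that could be getDocs(query(collection(db, "posts"), where("postId", "in", ids))) mapped to plain objects), and the default chunk size of 10 is simply the limit quoted in the question.

```javascript
// Split an array into consecutive chunks of at most `size` elements.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Run one query per chunk in parallel and flatten the results.
// `runQuery` is a hypothetical callback (not a Firestore API): it receives
// an array of ids and must return a promise of an array of post objects.
async function fetchPostsByIds(postIds, runQuery, maxPerQuery = 10) {
  const batches = chunk(postIds, maxPerQuery);
  const results = await Promise.all(batches.map((ids) => runQuery(ids)));
  return results.flat();
}
```

The number of document reads is the same as fetching each post individually; what changes is the number of round trips.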

Related

Firebase how to filter collection with a sub collection

I have a collection called "posts".
Posts has sub collection called "feedback".
When a user gives feedback on a post, their id and comment get added to the feedback sub collection.
Now I want to find the posts that a user has not given feedback on.
Something like the following SQL query:
select * from posts where userId not in (select userId from feedback)
Can someone provide advice on how to do this?
Firestore doesn't support joins between collections or subqueries. You won't be able to perform any queries that use data from more than one collection.
Also, querying for non-existence isn't supported by Firestore. So, you won't be able to query for the absence of data in a field. Firestore requires that all queries be able to use a highly performant index, which only tracks data present in documents.
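Since the exclusion cannot be expressed as a query, one workaround is to do it on the client: read the feedback documents written by the user (for example with a collection group query on "feedback"), derive the parent post ids from their paths, and filter the list of posts. A minimal sketch, assuming feedback documents live at posts/{postId}/feedback/{feedbackId}; both function names are made up for illustration.

```javascript
// Extract the parent post id from a feedback document path such as
// "posts/p1/feedback/f9". Returns null when the path has another shape.
function postIdFromFeedbackPath(path) {
  const parts = path.split("/");
  const matches =
    parts.length === 4 && parts[0] === "posts" && parts[2] === "feedback";
  return matches ? parts[1] : null;
}

// Keep only the posts that do not appear among the user's feedback paths.
function postsWithoutFeedback(allPostIds, feedbackPaths) {
  const reviewed = new Set(feedbackPaths.map(postIdFromFeedbackPath));
  return allPostIds.filter((id) => !reviewed.has(id));
}
```

Keep in mind that this still reads every candidate post, so it only makes sense when the list of posts is reasonably small or already paginated.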

Is it possible to fetch all documents whose sub-collection contains a specific document ID?

I am trying to fetch all documents whose sub-collection contain a specific document ID. Is there any way to do this?
For example, if the boxed document under 'enquiries' sub-collection exists, then I need the boxed document ID from 'books' collection. I couldn't figure out how to go backwards to get the parent document ID.
I make the assumption that all the sub-collections have the same name, i.e. enquiries. Then, you could do as follows:
Add a field docId in your enquiries document that contains the document ID.
Execute a Collection Group query in order to get all the documents with the desired docId value (Firestore.instance.collectionGroup("enquiries").where("docId", isEqualTo: "ykXB...").getDocuments()).
Then, loop over the results of the query and, for each DocumentReference, call the parent() method twice (the first call returns the CollectionReference and the second returns the DocumentReference of the parent document).
You just have to use the id property and you are done.
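In JavaScript the same steps could look roughly like this. It is a sketch, assuming the documents come from a collection group query such as getDocs(query(collectionGroup(db, "enquiries"), where("docId", "==", someId))) in the Web v9 SDK, where ref.parent is the sub-collection and ref.parent.parent is the parent document. The function name parentBookIds is made up for illustration.

```javascript
// Collect the distinct parent document ids for a list of document snapshots.
// Each snapshot only needs the shape { ref: { parent: { parent: { id } } } },
// which is what Firestore document snapshots expose.
function parentBookIds(snapshotDocs) {
  const ids = new Set();
  for (const docSnap of snapshotDocs) {
    // First parent: the "enquiries" collection. Second parent: the book.
    const parentDoc = docSnap.ref.parent.parent;
    if (parentDoc) ids.add(parentDoc.id);
  }
  return [...ids];
}
```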
Try the following:
Firestore.instance.collection("books").where("author", isEqualTo: "Arumugam").getDocuments().then((value) {
  value.documents.forEach((book) {
    // ID of the parent document in the books collection
    var id = book.documentID;
    Firestore.instance.collection("books").document(id).collection("enquiries").getDocuments().then((querySnapshot) {
      querySnapshot.documents.forEach((enquiry) {
        print(enquiry.data);
      });
    });
  });
});
First you need to retrieve the id under the books collection. To do that, you have to run a query, for example where("author", isEqualTo: "Arumugam"). After retrieving the id, you can run a second query to retrieve the documents inside the enquiries collection.
For example, if the boxed document under 'enquiries' sub-collection exists, then I need the boxed document ID from 'books' collection.
There is no way you can do that in a single go.
I couldn't figure out how to go backwards to get the parent document ID.
There is no going back in Firestore as you probably were thinking. In Firebase Realtime Database we have a method named getParent(), which does exactly what you want but in Firestore we don't.
Queries in Firestore are shallow, meaning that they only get items from the collection that the query is run against. Firestore doesn't support queries across different collections in one go. A single query may only use the properties of documents in a single collection. So the solution to your problem is to perform two get() calls. The first one checks whether that document exists in the enquiries subcollection and, if it does, a second get() call fetches the document from the books collection.
Renaud Tarnec's answer is great for fetching the IDs of the relevant books.
If you need to fetch more than the ID, there is a trick you could use in some scenarios. I imagine your goal is to show some sort of an index of all books associated with a particular enquiry ID. If the data you'd like to show in that index is not too long (can be serialized in less than 1500 bytes) and if it is not changing frequently, you could try to use the document ID as the placeholder for that data.
For example, let's say you wanted to display a list of book titles and authors corresponding to some enquiryId. You could create the book ID in the collection with something like so:
// Assuming admin SDK
const bookId = nanoid();
const author = 'Brandon Sanderson';
const title = 'Mistborn: The Final Empire';
// If title + author are not unique, you could add the bookId to the array
const uniquePayloadKey = Buffer.from(JSON.stringify([author, title])).toString('base64url');
booksColRef.doc(uniquePayloadKey).set({ bookId })
booksColRef.doc(uniquePayloadKey).collection('enquiries').doc(enquiryId).set({ enquiryId })
Then, after running the collection group query per Renaud Tarnec's answer, you could extract that serialized information with a regexp on the path, and deserialize. E.g.:
// Assuming Web 9 SDK
const books = query(collectionGroup(db, 'enquiries'), where('enquiryId', '==', enquiryId));
return getDocs(books).then(snapshot => {
  const data = []
  snapshot.forEach(doc => {
    // The parent document ID is the base64url-encoded payload
    const payload = doc.ref.path.match(/books\/(.*)\/enquiries/)[1];
    // atob expects standard base64, so map the URL-safe characters back first
    const [author, title] = JSON.parse(atob(payload.replace(/-/g, '+').replace(/_/g, '/')));
    data.push({ author, title })
  });
  return data;
});
The "store payload in ID" trick can be used only to present some basic information for your child-driven search results. If your book document has a lot of information you'd like to display once the user clicks on one of the books returned by the enquiry, you may want to store this in separate documents whose IDs are the real bookIds. The bookId field added under the unique payload key allows such lookups when necessary.
You can reuse the same data structure for returning book results from different starting points, not just enquiries, without duplicating this structure. If you stored many authors per book, for example, you could add an authors sub-collection to search by. As long as the information you want to display in the resulting index page is the same and can be serialized within the 1500-byte limit, you should be good.
The (quite substantial) downside of this approach is that it is not possible to rename document IDs in Firestore. If some of the details in the payload change (e.g. an admin fixes a book title), you will need to recreate all the sub-collections under the new ID and delete the old data. This can be quite costly - at least 1 read, 1 write, and 1 delete for every document in every sub-collection. So keep in mind it may not be practical for fast-changing data.
The 1500-byte limit for key names is documented in Usage and Limits.
If you are concerned about potential hotspots this can generate per Best Practices for Cloud Firestore, I imagine that adding the bookId as a prefix to the uniquePayloadKey (with a delimiter that allows you to throw it away) would do the trick - but I am not certain.
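To make the "payload in ID" idea easier to check, here is a small round-trip sketch for Node.js. It is an illustration under assumptions rather than the exact code from the answer: the helper names are made up, the 1500-byte constant is the limit quoted above, and the path layout books/{payloadKey}/enquiries/{enquiryId} follows the example.

```javascript
const MAX_ID_BYTES = 1500; // limit quoted in the answer above

// Serialize a small array of fields into a URL-safe document ID.
function encodePayloadKey(fields) {
  const key = Buffer.from(JSON.stringify(fields), "utf8").toString("base64url");
  if (Buffer.byteLength(key, "utf8") > MAX_ID_BYTES) {
    throw new RangeError("payload too large for a document ID");
  }
  return key;
}

// Recover the fields from the path of a document in the enquiries
// sub-collection. Returns null when the path has another shape.
function decodePayloadKey(path) {
  const match = path.match(/^books\/([^/]+)\/enquiries\//);
  if (!match) return null;
  return JSON.parse(Buffer.from(match[1], "base64url").toString("utf8"));
}
```

Because base64url never produces a "/", the encoded key cannot be mistaken for a path separator.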

Which is a more optimal Firestore schema for getting a Social Media feed?

I'm toying with several ideas for using Firestore for a social media feed. So far, the ideas I've had haven't panned out, so for this one I'm hoping to get the community's feedback.
The idea is to allow users to post information, or to record their activity, and to any user following/subscribed to that information, display it. The posts information would be in a root collection called posts.
The approaches, as far as I can tell, require roughly the same number of reads and writes.
One idea is to have, within users/{userId}, a field called posts, which is an array of the documentIds that I'm interested in pulling for the user. This would allow me to pull directly from posts and get the most up-to-date version of the data.
Another approach seems more Firebasey, which is to store documents within users/{userId}/feeds that are copies of the posts themselves. I can use the same postID as the data in posts. Presumably, if I need to update the data for any review, I can use a collection group query to get all collections called feeds, where the docID is equal (or just create a field to do a proper "where", "==", docId).
Third approach is all about updating the list of people who should view the posts. This seems better as long as the list of posts is shorter than the lists of followers. Instead of maintaining all posts on every follower, you're maintaining all followers on each post. For every new follower, you need to update all posts.
This list would not be a user's own posts. Instead it would be a list of all the posts to show that user.
Three challengers:
users/{userId} with field called feed - an array of doc Ids that point to the global posts. Get that feed, get all docs by ID. Every array would need to be updated for every single follower each time a user has activity.
users (coll)
-> uid (doc)
-> uid.feed: [postId1, postId2, postId3, ...] (field)
posts (coll)
-> postId (doc)
Query (pseudo):
doc(users/{uid}).get(doc)
feed = doc.feed
for postId in feed:
doc(posts/{postId}).get(doc)
users/{userId}/feed which has a copy of all posts that you would want this user to see. Every activity/post would need to be added to every relevant feed list.
users (coll)
-> uid (doc)
-> feed: (coll)
-> postId1 (doc)
-> postId2
-> postId3
posts (coll)
-> postId (doc)
Query (pseudo):
collection(users/{uid}/feed).get(docs)
for post in docs:
doc(posts/{post}).get(doc)
posts/{postId} with a field called followers - an array of the ids of the users who should see the post. For every new follower, every post authored by the followed user would need to be updated.
users (coll)
-> uid (doc)
posts (coll)
-> postId (doc)
-> postId.followers: [followerId, followerId2, ...] (field)
Query (pseudo):
collection(posts).where(followers, 'array-contains', uid).get(docs)
Reads/Writes
1. Updating the Data
For the author of every activity, find all users following that user. Currently, the users are stored as documents in a collection, so this is followerNumber document reads. For each of those users, update their array by prepending the postId; this would be followerNumber document writes.
1. Displaying the Data/Feed
For each fetch of the feed: get the array from the user document (1 document read). For each postId, fetch posts/{postId}.
This would be numberOfPostsCalled document reads.
2. Updating the Data
For the author of every activity, find all users following that user. Currently, the users are stored as documents in a collection, so this is followerNumber document reads. For each of those users, add a new document with ID postId to users/{userId}/feed; this would be followerNumber document writes.
2. Displaying the Data/Feed
For each fetch of the feed: get a certain number of posts from users/{userId}/feed
This would be numberOfPostsCalled document reads.
This second approach requires me to keep all of the documents up to date in the event of an edit. So despite this approach seeming more firebase-esque, the approach of holding a postId and fetching that directly seems slightly more logical.
3. Updating the Data
For every new follower, each post authored by the person being followed needs to be updated. The new follower is appended to an array called followers.
3. Displaying the Data
For each fetch of the feed: get a certain number of posts from posts where the followers array contains viewerUid.
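The read/write counts above can be put side by side with a small helper. This is only a back-of-the-envelope sketch: the function and property names are made up, authorPosts (the number of posts by the followed author) is an extra input that the third approach needs, and the display cost of the third approach is assumed to be one read per post shown.

```javascript
// Estimate document reads/writes for the three feed layouts described above.
function estimateCosts({ followers, postsShown, authorPosts }) {
  return {
    // 1. users/{uid}.feed holds an array of post ids
    arrayOfIds: {
      publish: { reads: followers, writes: followers },
      display: { reads: 1 + postsShown }, // user doc + each post
    },
    // 2. users/{uid}/feed holds copies of the posts
    feedCopies: {
      publish: { reads: followers, writes: followers },
      display: { reads: postsShown },
    },
    // 3. posts/{postId}.followers holds the viewers (display cost assumed)
    followersOnPost: {
      newFollower: { writes: authorPosts },
      display: { reads: postsShown },
    },
  };
}
```

For example, with 100 followers and 20 posts per page, the first two layouts cost 100 reads and 100 writes per new post, and the first one needs one extra read per feed fetch.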
Nice. When I talk about what is more optimal, I really need a point or a quality attribute to compare. I will assume you care about speed (not necessarily performance) and costs.
This is how I would solve the problem, it involves several collections but my goal is 1 query only.
user (col)
{
"abc": {},
"qwe": {}
}
posts (col)
{
"123": {},
"456": {}
}
users_posts (col)
{
"abc": {
"posts_ids": ["123"]
}
}
So far so good, the problem is, I need to do several queries to get all the posts information... This is where cloud functions get into the game. You can create a 4th collection where you can pre-calculate your feed
users_dashboard
{
"abc": {
posts: [
{
id: "123", /.../
}, {
id: "456", /.../
}
]
}
}
The cloud function would look like this:
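/* on your front end you can manage adding or deleting ids from the user's posts */
// Assumes the firebase-functions v1 API and the admin SDK
import * as functions from 'firebase-functions'
import * as admin from 'firebase-admin'
admin.initializeApp()
export const calculateDashboard = functions.firestore.document('users_posts/{userId}').onWrite(async (change, context) => {
  const firestore = admin.firestore()
  const dashboardRef = firestore.collection('users_dashboard')
  const postRef = firestore.collection('posts')
  // The document ID of users_posts/{userId} is the user ID
  const userId = context.params.userId
  // change.after has no data when the document was deleted
  if (!change.after.exists) return dashboardRef.doc(userId).delete()
  const user = change.after.data()
  const payload = []
  for (const postId of user.posts_ids || []) {
    const doc = await postRef.doc(postId).get()
    // Skip posts that no longer exist
    if (doc.exists) payload.push({ id: doc.id, ...doc.data() })
  }
  // Maybe you want to expose only certain props... you can do that here
  return dashboardRef.doc(userId).set({ posts: payload })
})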
The maximum document size is 1 MiB (1,048,576 bytes), which is plenty of room, so you can store a lot of posts here. Let's talk about costs: I used to think Firestore was better suited to many small docs, but I've found that in practice it works equally well with big docs.
Now on your dashboard you only need one query:
const dashboard = await firestore.collection('users_dashboard').doc(userID).get()
This is a very opinionated way to solve this problem. You could avoid using users_posts, but maybe you don't want to trigger this process for changes that are not related to posts.
It looks like your second approach is best in this situation. I don't really understand what @andresmijares was trying to do; he mentioned storing posts in a single document, which is not a good approach. Imagine you have more than 20K posts (which is what I think a document can hold): the document won't be able to store any more data. A better approach is to store each post as a document inside a collection (just like in your 2nd option). So let's recap the best approach.
1) You share a post in the posts collection and in the feed collection of each user who follows you. This can be done with a Cloud Function, and let's not forget to aggregate (also with Cloud Functions) the number of posts that needs to appear in the user's profile.
2) You follow a user and copy all of their posts from the posts collection into your feed collection. This way you get to see all of their posts on your feed.
With this approach there will be a lot of writes up front, but the reads will be fast. If your app reads more than it writes, there's nothing to worry about, unless I'm wrong.
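The fan-out in step 1 can be planned as plain data before touching Firestore. The sketch below is an assumption-laden illustration: planFeedFanOut is a made-up helper, the users/{followerId}/feed/{postId} path follows the question's second option, and the default batch size of 500 is a conservative assumption that should be checked against the current Firestore limits. Each returned group could then be written with one batched write inside a Cloud Function.

```javascript
// Build the list of feed writes for one new post and split it into groups
// that are small enough to commit one batch at a time.
function planFeedFanOut(post, followerIds, batchSize = 500) {
  const writes = followerIds.map((followerId) => ({
    path: `users/${followerId}/feed/${post.id}`, // same id as in posts
    data: { ...post },                           // copy of the post
  }));
  const batches = [];
  for (let i = 0; i < writes.length; i += batchSize) {
    batches.push(writes.slice(i, i + batchSize));
  }
  return batches;
}
```

Using the post id as the feed document id keeps the operation idempotent: running the fan-out twice overwrites the same documents instead of duplicating them.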

Store and Query Posts in Firestore in a performant way

So I need to store posts that are created by users, and the data model is the problem. Putting all existing posts in a Posts collection with a creatorUserID field makes it possible to show the posts belonging to a user.
Now a user has a subcollection called Followers with the IDs of the people following them. The problem is that I'm not sure what a query would look like that shows only the posts of people that the user follows.
I'm also worried about performance when there are 10 million+ posts in the collection.
In order to query a document in Firestore the data you want to query by needs to be on the Document you want to query, there is no way of querying a collection by the data of a document from another collection. This is why your use-case is a bit tricky. It might not seem very elegant, but this is a way of solving it:
We use two collections, users and posts.
User
- email: string
- followingUserIDs: array with docIds of users you are following
Posts
- postName: string
- postText: string
- creatorUserID: string
To find all the posts belonging to all the users the logged in user is following, we can do the following in the code:
1 Retrieve the logged in user document
2 For each id in the "followingUserIDs" array I would query Firestore for the Posts. In JavaScript it would be something like:
followingUserIDs.map(followingUserId => {
return firestore.collection('Posts', ref => ref.where('creatorUserID',
'==', followingUserId));
})
3 Combine the result from all the queries into one array of posts
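Step 3 can be sketched as a small merge function. It assumes each query result has already been turned into an array of plain objects with an id, and it sorts by a hypothetical createdAt field (a number such as epoch milliseconds) that is not part of the schema above.

```javascript
// Merge the results of several per-user queries into one feed:
// de-duplicate by post id, then sort newest first.
function mergePosts(resultLists) {
  const byId = new Map();
  for (const list of resultLists) {
    for (const post of list) {
      byId.set(post.id, post);
    }
  }
  // createdAt is an assumed numeric timestamp field
  return [...byId.values()].sort((a, b) => b.createdAt - a.createdAt);
}
```

Note that this runs one query per followed user, so the cost grows with the length of followingUserIDs.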

Firestore social media posts table

I want to create a sort of social media application and use Firestore as the main database.
The goal is to create a "Facebook"-style news feed.
Each user will have a list of friends, and each user will be able to create posts.
Each post can be set to be visible to all the users of the application or just to the author's friends, so each user will be able to post to all of their friends or to everyone in the application.
Also, users can "save" posts they liked in the news feed (LikedPosts subcollection).
USERS (collection)
    DocumentID - userId gathered from Authentication uid
    firstName: String
    lastName: String
    username: String
    birthdate: Timestamp
    accountCreationDate: Timestamp
    FRIENDS (subcollection)
        DocumentID - userId of the friend
        username: String
    LikedPosts (subcollection)
        authorUserId: String
        authorUsername: String
        text: String
        imagePath: String
POSTS (collection)
    DocumentID - postId randomly generated
    authorUserId: String
    authorUsername: String
    text: String
    imagePath: String
    likesCount: Number
    forFriendsOnly:yes
    LIKES (subcollection)
        DocumentID - userID of the liker
        username: String
Now, in the news feed for a user, how can I query for all the publicly visible posts (forFriendsOnly:no) and also for all the friends-only posts where the current user is in the author's FRIENDS subcollection?
Also, if a user changes their name, how can I change the name accordingly on all of their previous posts and on all the saved posts related to that user (located in the users' LikedPosts subcollections)?
I guess you were asking 2 questions.
First, Firestore recommends data duplication instead of join queries across collections. The way you designed posts and users relies on the query concepts of SQL.
It is still possible to achieve that, if you don't mind having all of the author's friend ids as an array inside each post document. You would then have to sync the author's friend array through a trigger function whenever the author adds or deletes friends.
I wouldn't really recommend this solution because, on a social platform, a user's friends might change constantly, and then you have to keep updating the friend array on all of their posts.
There is another solution, which is to add one more subcollection under each user as their visible "feeds". Then, whenever an author creates a post, a trigger function writes a summary of this post to the visible "feeds" collection of each of their friends.
However, neither of the above solutions is perfect if you are concerned about accuracy, realtime updates, cost, etc. I guess that is the drawback we have to bear with. If you have to achieve the same thing as SQL, I guess the only option is to use another solution for the query part, such as Elasticsearch, MySQL, Neo4j, etc. PS: You can still wrap it with Cloud Functions.
Regarding your 2nd question, one way is to not duplicate the username if you think your users will change their names frequently, and always look up the username by user id from the user collection. The other way is to use a trigger function to update the duplicated username when a user changes their name. I would recommend the second way, since users won't change their names frequently.
Not necessarily related to your original question, but your LikedPosts subcollection likely needs a restructuring. If you can ensure uniqueness on your postId, then it should probably be something like:
LikedPosts (subcollection)
postId: Unique identifier for liked post
authorUserId: String
authorUsername: String
text: String
imagePath: String
The current structure only allows for one liked post, so you'll need to change it to be one document per liked post, or a document containing a list of all of the liked post ids.
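The "one document per liked post" variant could be sketched like this, using the postId as the document ID so that liking the same post twice overwrites the same document. likedPostWrite is a made-up helper that only builds the path and data; the actual write (for example a set() on that path) is left out.

```javascript
// Build the path and data for one liked post, keyed by postId.
function likedPostWrite(userId, postId, post) {
  return {
    path: `USERS/${userId}/LikedPosts/${postId}`,
    data: {
      postId,
      authorUserId: post.authorUserId,
      authorUsername: post.authorUsername,
      text: post.text,
      imagePath: post.imagePath,
    },
  };
}
```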
