Store and Query Posts in Firestore in a performant way - firebase

I need to store Posts that are created by Users, and the data model is the problem. Putting all existing Posts in a Posts collection with a creatorUserID field makes it possible to show the posts belonging to a user.
Each User has a subcollection called Followers that holds the IDs of the people following them. The problem is that I'm not sure what a query would look like that shows only the Posts of people the User follows.
I'm also worried about performance when there are 10 million+ Posts in the collection.

To query a document in Firestore, the data you want to query by needs to be on the document itself; there is no way to query a collection by the data of a document in another collection. This is why your use case is a bit tricky. It might not seem very elegant, but this is a way of solving it:
We use two collections, users and posts.
User
- email: string
- followingUserIDs: array with docIds of users you are following
Posts
- postName: string
- postText: string
- creatorUserID: string
To find all the posts belonging to the users the logged-in user is following, we can do the following in code:
1 Retrieve the logged-in user's document
2 For each ID in the "followingUserIDs" array, query Firestore for that user's Posts. In JavaScript it would be something like:
const postQueries = followingUserIDs.map(followingUserId => {
  // One query per followed user, filtered on the creatorUserID field
  return firestore.collection('Posts', ref => ref.where('creatorUserID', '==', followingUserId));
});
3 Combine the results of all the queries into one array of posts
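Steps 2 and 3 can be sketched as one small helper. This is only an illustration under stated assumptions: `loadFeed` and the `queryPostsByCreator` callback are hypothetical names, and the callback stands in for whatever Firestore query your SDK uses to fetch one user's posts.

```javascript
// Sketch only: `queryPostsByCreator` is a hypothetical callback that takes a
// user ID and resolves to an array of that user's post objects. With the
// Firebase web SDK it could wrap a query on Posts filtered by
// where('creatorUserID', '==', id).
async function loadFeed(followingUserIDs, queryPostsByCreator) {
  // Step 2: one query per followed user, all started in parallel.
  const postsPerUser = await Promise.all(
    followingUserIDs.map((id) => queryPostsByCreator(id))
  );
  // Step 3: combine the per-user results into one array of posts.
  return postsPerUser.flat();
}
```

Running the queries in parallel keeps latency close to that of the slowest single query, but the number of queries still grows with the number of followed users.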

Related

Firestore efficient/fast query posts a user has liked

I have a collection with users and a collection with posts.
In every post there is a sub collection of user ids that liked that post. Additionally I also have a sub collection in the users where I save the id of the posts that a user liked.
Now my question is, what would be the most efficient query to get all the posts a specific user liked.
The ways I know are (examples are in JavaScript):
Query the sub collection of a user to get all the post IDs that the user liked. Then get all the documents with a query and a query constraint.
The problem is that I need to split this into multiple queries, because the query constraint "in" can only take up to 10 elements in the array:
const data = query(collection(db, "posts"), where("postId", "in", postIds))
Query the sub collection of a user to get all the post IDs that the user liked. Then get all the documents (posts) individually.
So for every document I need to do this:
const data = await getDoc(doc(db, "posts", postId))
Is there any other more efficient and faster way of achieving this?
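For the first option, the splitting into multiple queries could be sketched as below. `chunk` is a hypothetical helper, and the group size of 10 follows the limit quoted in the question; the documented limit for "in" has changed over time, so check the current Firestore documentation before relying on a number.

```javascript
// Split an array of IDs into groups of at most `size` elements, so that
// each group can be passed to one "in" query.
function chunk(ids, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const groups = [];
  for (let i = 0; i < ids.length; i += size) {
    groups.push(ids.slice(i, i + size));
  }
  return groups;
}

// Assumed usage: one query per group, e.g.
//   query(collection(db, "posts"), where("postId", "in", group))
// with all groups run in parallel and the results concatenated.
```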

Cloud Firestore rule for Read - on the basis of where clause [duplicate]

I'm trying to secure requests to a collection to allow any single get, but to allow list only if a specific key is matched.
Database structure is like this:
posts
  post1
    content: "Post 1 content"
    uid: "uid1"
  post2
    content: "Post 2 content"
    uid: "uid1"
  post3
    content: "Post 3 content"
    uid: "uid2"
The Firestore query I'm making from Vue:
// Only return posts matching the requested uid
db
  .collection("posts")
  .where("uid", "==", this.uid)
The security rules I'd like to have would be something like this:
match /posts/{post} {
  allow get: if true // this works
  allow list: if [** the uid in the query **] != null
}
I want to do this so you can list the posts of a specific user if you know their uid but can't list all posts of the system.
Is there a way to access the requested .where() in the security rules or how can I write such rule or structure my data in this case?
Relevant & credits:
Seemingly, I can make a request on a query's limit, offset, and orderBy. But there's nothing on where. See: #1 & #2.
I copy-pasted much from this question. I don't see how the accepted answer answers the question. It seems like it answers another case where a user is allowed to list some other users' posts. That is not my case; in my case, what's public is public. So, it doesn't answer the main question in my case, it seems.
There's currently no way, using security rules, to check whether a field is being used in a query. The only thing you can do is verify that a document field is being used as a filter using only values you allow.
Instead, consider duplicating enough data into another collection organized like this:
user-posts (collection)
  {uid} (document using UID as document ID)
    posts (subcollection)
      {postId} (documents using post ID as document ID)
This will require the client to call out a UID to query in order to get all the posts associated with that user. You can store as much information about the post documents as you like, for the purpose of satisfying the query.
Duplicating data like this is common in NoSQL databases. You might even want to make this your new default structure if you don't want your users to query across all posts at any given moment. Note that a collection group query naming the "posts" subcollection would still query across all posts for all users, so you'd have to make sure your security rules are set up so that this is enabled only when you allow it to happen.
Also note that UIDs are typically not hidden from users, especially if your web site is collaborative in nature, and you combine multiple users' data on a single page.
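As a rough sketch of the duplication described above, the helper below plans the two writes for a single post: the original document and its copy under the author's UID. `planPostWrites` is a hypothetical name, and copying only `content` and `uid` is an assumption; copy whichever fields your query needs.

```javascript
// Plan the writes for one post: the canonical document in "posts" and the
// duplicate under user-posts/{uid}/posts/{postId}.
// Returns plain { path, data } objects; actually writing them (for example
// with a batched write) is left to the caller.
function planPostWrites(postId, post) {
  return [
    { path: `posts/${postId}`, data: post },
    {
      path: `user-posts/${post.uid}/posts/${postId}`,
      // Assumption: only these fields are needed to render a user's posts.
      data: { content: post.content, uid: post.uid },
    },
  ];
}
```

A client that knows a UID can then list `user-posts/{uid}/posts` directly, which is the path a list rule can be scoped to.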

Which is a more optimal Firestore schema for getting a Social Media feed?

I'm toying with several ideas for using Firestore for a social media feed. So far, the ideas I've had haven't panned out, so for this one I'm hoping to get the community's feedback.
The idea is to allow users to post information, or to record their activity, and to any user following/subscribed to that information, display it. The posts information would be in a root collection called posts.
The approaches, as far as I can tell, require roughly the same number of reads and writes.
One idea is to have, within users/{userId}, a field called posts, which is an array of the document IDs that I'm interested in pulling for the user. This would allow me to pull directly from posts and get the most up-to-date version of the data.
Another approach seems more Firebasey which is to store documents within users/{userId}/feeds that are copies of the posts themselves. I can use the same postID as the data in posts. Presumably, if I need to update the data for any review, I can use a group collection query to get all collections called feeds, where the docID is equal (or just create a field to do a proper "where", "==", docId).
Third approach is all about updating the list of people who should view the posts. This seems better as long as the list of posts is shorter than the lists of followers. Instead of maintaining all posts on every follower, you're maintaining all followers on each post. For every new follower, you need to update all posts.
This list would not be a user's own posts. Instead it would be a list of all the posts to show that user.
Three challengers:
1. users/{userId} with a field called feed: an array of doc IDs that point to the global posts. Get that feed, then get all docs by ID. The array on every single follower would need to be updated each time a user has activity.
users (coll)
-> uid (doc)
-> uid.feed: [postId1, postId2, postId3, ...] (field)
posts (coll)
-> postId (doc)
Query (pseudo):
doc(users/{uid}).get(doc)
feed = doc.feed
for postId in feed:
doc(posts/{postId}).get(doc)
2. users/{userId}/feed, which has a copy of all posts that you would want this user to see. Every activity/post would need to be added to every relevant feed list.
users (coll)
-> uid (doc)
-> feed: (coll)
-> postId1 (doc)
-> postId2
-> postId3
posts (coll)
-> postId (doc)
Query (pseudo):
collection(users/{uid}/feed).get(docs)
for post in docs:
doc(posts/{post}).get(doc)
3. posts/{postId} with a field called followers: an array of the IDs of every user who should see that post. Every post by an author would need to be updated each time that author gains a follower.
users (coll)
-> uid (doc)
posts (coll)
-> postId (doc)
-> postId.followers: [followerId, followerId2, ...] (field)
Query (pseudo):
collection(posts).where(followers, 'array-contains', uid).get(docs)
Reads/Writes
1. Updating the Data
For the author of every activity, find all users following that user. Currently, the followers are stored as documents in a collection, so this is followerNumber document reads. For each of those users, update their array by prepending the postId; this would be followerNumber document writes.
1. Displaying the Data/Feed
For each fetch of the feed: get the array from the user document (1 doc read). For each postId, call posts/{postId}.
This would be numberOfPostsCalled document reads.
2. Updating the Data
For the author of every activity, find all users following that user. Currently, the followers are stored as documents in a collection, so this is followerNumber document reads. For each of those users, add a new document with ID postId to users/{userId}/feed; this would be followerNumber document writes.
2. Displaying the Data/Feed
For each fetch of the feed: get a certain number of posts from users/{userId}/feed.
This would be numberOfPostsCalled document reads.
This second approach requires me to keep all of the copies up to date in the event of an edit. So although it seems more Firebase-esque, the approach of holding a postId and fetching the post directly seems slightly more logical.
3. Updating the Data
For every new follower, each post authored by the person being followed needs to be updated: the new follower is appended to an array called followers.
3. Displaying the Data
For each fetch of the feed: get a certain number of posts from posts where the followers array contains viewerUid.
Nice. When I talk about what is more optimal I really need a criterion or quality attribute to compare against; I will assume you care about speed (not necessarily performance) and costs.
This is how I would solve the problem. It involves several collections, but my goal is a single query.
user (col)
{
  "abc": {},
  "qwe": {}
}
posts (col)
{
  "123": {},
  "456": {}
}
users_posts (col)
{
  "abc": {
    "posts_ids": ["123"]
  }
}
So far so good. The problem is that I need to do several queries to get all the post information... This is where Cloud Functions come into play. You can create a fourth collection where you pre-calculate your feed:
users_dashboard (col)
{
  "abc": {
    posts: [
      {
        id: "123", /.../
      }, {
        id: "456", /.../
      }
    ]
  }
}
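The cloud function would look like this:
/* On your front end you can manage adding or deleting IDs in users_posts */
import * as functions from 'firebase-functions/v1'
import * as admin from 'firebase-admin'

admin.initializeApp()

export const calculateDashboard = functions.firestore.document('users_posts/{doc}').onWrite(async (change, context) => {
  const firestore = admin.firestore()
  const dashboardRef = firestore.collection('users_dashboard')
  const postRef = firestore.collection('posts')
  // The {doc} wildcard is the user ID; the document data itself has no id field
  const userId = context.params.doc
  // change.after.data() is undefined when the users_posts document was deleted
  const user = change.after.exists ? change.after.data() : { posts_ids: [] }
  const payload = []
  for (const postId of user.posts_ids || []) {
    const doc = await postRef.doc(postId).get()
    // Skip posts that no longer exist. Maybe you want to expose only certain props... you can do that here
    if (doc.exists) payload.push({ id: doc.id, ...doc.data() })
  }
  // set() needs an object, so the array is stored under a posts field
  return dashboardRef.doc(userId).set({ posts: payload })
})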
The maximum document size is 1 MiB (1,048,576 bytes), which is plenty of room, so you can keep a lot of posts here. As for costs: I used to think Firestore was better suited to many small docs, but I've found that in practice it works equally well with large docs as with a large number of docs.
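Because the whole feed lives in one document, it may be worth capping the payload before writing it. The sketch below is an assumption-laden illustration: `capDashboardPosts` is a hypothetical helper, and the JSON byte length is only a rough stand-in for how Firestore actually measures document size, hence the safety margin.

```javascript
const MAX_DOC_BYTES = 1048576; // the 1 MiB document limit quoted above

// Keep posts (in the order given) until an approximate byte budget is used up.
// The default budget leaves 20% headroom because JSON length is only an
// estimate of the stored size.
function capDashboardPosts(posts, budgetBytes = MAX_DOC_BYTES * 0.8) {
  const kept = [];
  let usedBytes = 0;
  for (const post of posts) {
    const size = Buffer.byteLength(JSON.stringify(post), 'utf8');
    if (usedBytes + size > budgetBytes) break; // stop at the first post that does not fit
    kept.push(post);
    usedBytes += size;
  }
  return kept;
}
```

If the posts are sorted newest first before calling it, the dashboard keeps the most recent posts and silently drops the oldest ones.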
Now on your dashboard you only need one query:
const dashboard = await firestore.collection('users_dashboard').doc(userID).get()
This is a very opinionated way to solve this problem. You could avoid using users_posts, but maybe you don't want to trigger this process for anything other than post-related changes.
It looks like your second approach is best in this situation. I don't really understand what @andresmijares was trying to do; he mentioned something like storing posts in a single document, which is not a good approach. Imagine you have more than 20K posts (which is about what I think a document can hold): the document won't be able to store any more data. A better approach is to store each post as a document inside a collection (just like in your 2nd option). So let's recap the best approach:
1) You share a post in the posts collection and in the feed collection of each user following you. This can be done with a cloud function; don't forget to also aggregate (again with cloud functions) the number of posts that needs to appear in the user's profile.
2) You follow a user and copy all of their posts from the posts collection into your feed collection; this way you get to see all of their posts in your feed.
With this approach there will be a lot of writes up front, but reads will be fast. If your app reads more than it writes, then there's nothing to worry about, unless I'm wrong.
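The write fan-out in step 1 could be planned roughly as follows. This is a sketch, not a definitive implementation: `planFeedFanOut` is a hypothetical helper, the `users/{followerId}/feed/{postId}` path follows the question's second option, and the default batch size of 500 is an assumed conservative value that should be checked against the current limits for batched writes.

```javascript
// Plan one feed write per follower and group the writes into batches.
// Returns an array of batches, each an array of { path, data } objects;
// committing each batch (for example from a cloud function) is up to the caller.
function planFeedFanOut(postId, post, followerIds, batchSize = 500) {
  const writes = followerIds.map((followerId) => ({
    path: `users/${followerId}/feed/${postId}`,
    data: post,
  }));
  const batches = [];
  for (let i = 0; i < writes.length; i += batchSize) {
    batches.push(writes.slice(i, i + batchSize));
  }
  return batches;
}
```

The number of writes equals the number of followers, which is the "lot of writes" traded for a single cheap feed read.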

Firestore social media posts table

I want to create a sort of social media application and use Firestore as the main database.
The goal is to create a "Facebook"-style news feed.
Each user will have a list of friends, and each user will be able to create posts.
Each post can be set to be visible to all users of the application or only to the user's friends, so each user will be able to post to all of their friends or to everyone in the application.
Users can also "save" posts they liked in the news feed (LikedPosts subcollection).
USERS (collection)
  DocumentID - userId gathered from Authentication uid
  firstName: String
  lastName: String
  username: String
  birthdate: Timestamp
  accountCreationDate: Timestamp
  FRIENDS (subcollection)
    DocumentID - userId of the friend
    username: String
  LikedPosts (subcollection)
    authorUserId: String
    authorUsername: String
    text: String
    imagePath: String
POSTS (collection)
  DocumentID - postId randomly generated
  authorUserId: String
  authorUsername: String
  text: String
  imagePath: String
  likesCount: Number
  forFriendsOnly: yes
  LIKES (subcollection)
    DocumentID - userID of the liker
    username: String
Now, in the news feed for a user, how can I query for all the publicly visible posts (forFriendsOnly: no) and also for all the friends-only posts where the current user is in the author's FRIENDS subcollection?
Also, if a user changes their name, how can I change the name accordingly on all of their previous posts and on all the saved posts related to that user (located in users' LikedPosts subcollections)?
I guess you are asking two questions.
First, Firestore recommends data duplication instead of join-style queries across collections. The way you designed posts and users relies on the query concepts of SQL.
It is still possible to achieve this if you don't mind keeping all of the author's friend IDs as an array inside each post document. You then have to sync the author's friend array through a trigger function whenever the author adds or deletes friends.
I wouldn't really recommend this solution because, on a social platform, a user's friends may change constantly, and you would have to keep updating the friend array on all of their posts.
There is another solution, which is to add one more subcollection under each user as their visible "feeds". Whenever an author creates a post, a trigger function writes a summary of the post to the visible "feeds" collection of each of their friends.
However, neither of the above solutions is perfect if you are concerned about accuracy, real-time behavior, cost, etc. I guess that is the drawback we have to live with. If you have to achieve the same thing as in SQL, I guess the only option is to use another system for the query part, such as Elasticsearch, MySQL, Neo4j, etc. PS: You can still wrap it with cloud functions.
Regarding your 2nd question, one way is to not duplicate the username at all if you think your users will change their names frequently, and always look up the username by user ID from the users collection. The other way is to use a trigger function to update the duplicated username when a user changes their name. I would recommend the second way, since users wouldn't change their names frequently.
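The core of such a trigger function could be sketched like this. `planUsernameUpdates` is a hypothetical helper and the `{ path, data }` document shape is an assumption; the function only decides which duplicated documents need a new authorUsername, while loading the candidates (for example by querying on authorUserId) and writing the updates are left to the surrounding trigger.

```javascript
// Given the documents that duplicate an author's name (posts and LikedPosts
// entries), return the updates needed after the user renames themselves.
// Each input item is assumed to look like { path, data: { authorUserId, authorUsername } }.
function planUsernameUpdates(userId, newUsername, docs) {
  return docs
    // Only documents by this author that still carry a different name.
    .filter((d) => d.data.authorUserId === userId && d.data.authorUsername !== newUsername)
    .map((d) => ({ path: d.path, update: { authorUsername: newUsername } }));
}
```

Skipping documents that already have the new name makes the function safe to re-run if the trigger fires more than once.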
Not necessarily related to your original question, but your LikedPosts subcollection likely needs a restructuring. If you can ensure uniqueness on your postId, then it should probably be something like:
LikedPosts (subcollection)
  postId: Unique identifier for liked post
  authorUserId: String
  authorUsername: String
  text: String
  imagePath: String
The current structure only allows for one liked post, so you'll need to change it to be one document per liked post, or a document containing a list of all of the liked post ids.
