Limiting Children of Object in Firebase - firebase

I am looking to get back my whole object, but limit one of my children objects.
For example, say you take a chat app like firebase does and you do "rooms".
So you might have
rooms: {
mainroom:{
name: something,
otherAttrs: mfasfd,
messages: {
0: {
message: something
},
1: {
message: something else
}
}
}
I may have 300 messages in that mainroom, but I want to limit it to 30 say. This example is basic, but in my actual application my objects are very related so I don't want to denormalize any further.
I could do a mainroom call, and then do another child call off of that, but I am wondering if I would get dinged twice. in the initial call it would load all messages anyways, and then I would load 30 of them with the child call. Was just hoping someone would have a better recommendation.

Start by reading up about denormalization. This is a concept which is enforced in SQL by table structures, but also important in NoSQL, although you're given enough rope to tangle yourself up and have a bad day.
So the first step is to split messages into its own path:
URL/rooms
URL/messages
Now you can grab your meta data and messages separately, and call limit to set the number loaded:
var fbRef = new Firebase(URL);
var roomRef = fbRef.child('rooms/'+roomId);
var chatRef = fbRef.child('messages/'+roomId).limit(30);
In case you're not convinced that these should be split up, you're going to run into this same issue when you want to create a dropdown containing a list of room names (you have to load all your messages in the current data structure, just to get the room names).
For great justice, split meta data and detailed records into their own paths. Otherwise, all your base are belong to bandwidth.

Related

Firebase RTD, atomic "move" ... delete and add from two "tables"?

In Firebase Realtime Database, it's a pretty common transactional thing that you have
"table" A - think of it as "pending"
"table" B - think of it as "results"
Some state happens, and you need to "move" an item from A to B.
So, I certainly mean this would likely be a cloud function doing this.
Obviously, this operation has to be atomic and you have to be guarded against racetrack effects and so on.
So, for item 123456, you have to do three things
read A/123456/
delete A/123456/
write the value to B/123456
all atomically, with a lock.
In short what is the Firebase way to achieve this?
There's already the awesome ref.transaction system, but I don't think it's relevant here.
Perhaps using triggers in a perverted manner?
IDK
Just for anyone googling here, it's worth noting that the mind-boggling new Firestore (it's hard to imagine anything being more mind-boggling than traditional Firebase, but there you have it...), the new Firestore system has built-in .......
This question is about good old traditional Firebase Realtime.
Gustavo's answer allows the update to happen with a single API call, which either complete succeeds or fails. And since it doesn't have to use a transaction, it has much less contention issues. It just loads the value from the key it wants to move, and then writes a single update.
The problem is that somebody might have modified the data in the meantime. So you need to use security rules to catch that situation and reject it. So the recipe becomes:
read the value of the source node
write the value to its new location while deleting the old location in a single update() call
the security rules validate the operation, either accepting or rejecting it
if rejected, the client retries from #1
Doing so essentially reimplements Firebase Database transactions with client-side code and (some admittedly tricky) security rules.
To be able to do this, the update becomes a bit more tricky. Say that we have this structure:
"key1": "value1",
"key2": "value2"
And we want to move value1 from key1 to key3, then Gustavo's approach would send this JSON:
ref.update({
"key1": null,
"key3": "value1"
})
When can easily validate this operation with these rules:
".validate": "
!data.child("key3").exists() &&
!newData.child("key1").exists() &&
newData.child("key3").val() === data.child("key1").val()
"
In words:
There is currently no value in key3.
There is no value in key1 after the update
The new value of key3 is the current value of key1
This works great, but unfortunately means that we're hardcoding key1 and key3 in our rules. To prevent hardcoding them, we can add the keys to our update statement:
ref.update({
_fromKey: "key1",
_toKey: "key3",
key1: null,
key3: "value1"
})
The different is that we added two keys with known names, to indicate the source and destination of the move. Now with this structure we have all the information we need, and we can validate the move with:
".validate": "
!data.child(newData.child('_toKey').val()).exists() &&
!newData.child(newData.child('_fromKey').val()).exists() &&
newData.child(newData.child('_toKey').val()).val() === data.child(newData.child('_fromKey').val()).val()
"
It's a bit longer to read, but each line still means the same as before.
And in the client code we'd do:
function move(from, to) {
ref.child(from).once("value").then(function(snapshot) {
var value = snapshot.val();
updates = {
_fromKey: from,
_toKey: to
};
updates[from] = null;
updates[to] = value;
ref.update(updates).catch(function() {
// the update failed, wait half a second and try again
setTimeout(function() {
move(from, to);
}, 500);
});
}
move ("key1", "key3");
If you feel like playing around with the code for these rules, have a look at: https://jsbin.com/munosih/edit?js,console
There are no "tables" in Realtime Database, so I'll use the term "location" instead to refer to a path that contains some child nodes.
Realtime Database provides no way to atomically transaction on two different locations. When you perform a transaction, you have to choose a single location, and you may only make changes under that single location.
You might think that you could just transact at the root of the database. This is possible, but those transactions may fail in the face of concurrent non-transaction write operations anywhere within the database. It's a requirement that there must be no non-transactional writes anywhere at the location where transactions take place. In other words, if you want to transact at a location, all clients must be transacting there, and no clients may write there without a transaction.
This rule is certainly going to be problematic if you transact at the root of your database, where clients are probably writing data all over the place without transactions. So, if you want perform an atomic "move", you'll either have to make all your clients use transactions all the time at the common root location for the move, or accept that you can't do this truly atomically.
Firebase works with Dictionaries, a.k.a, key-value pair. And to change data in more than one table on the same transaction you can get the base reference, with a dictionary containing "all the instructions", for instance in Swift:
let reference = Database.database().reference() // base reference
let tableADict = ["TableA/SomeID" : NSNull()] // value that will be deleted on table A
let tableBDict = ["TableB/SomeID" : true] // value that will be appended on table B, instead of true you can put another dictionary, containing your values
You should then merge (how to do it here: How do you add a Dictionary of items into another Dictionary) both dictionaries into one, lets call it finalDict,
then you can update those values, and both tables will be updated, deleting from A and "moving to" B
reference.updateChildValues(finalDict) // update everything on the same time with only one transaction, w/o having to wait for one callback to update another table

Reactively show number of unread comments in a thread?

I'm making a forum type app with Threads and Comments within a Thread. I'm trying to figure out how to show the total number of unread comments within a thread to each user.
I considered publishing all the Comments for every Thread, but this seems like excessive data to be publishing to the client when all I want is a single number showing the unread Comments. But if I start adding metadata to the Thread collection (such as numComments, numCommentsUnread...), this adds extra moving parts to the app (i.e. I have to track every time a different user adds a Comment to a Thread, etc...).
What are some of the best practices for dealing with this?
I would recommend using the Publish-Counts package (https://github.com/percolatestudio/publish-counts) if all you need is the count. If you need the actual related comments take a look at the meteor-composite-publish (https://github.com/englue/meteor-publish-composite) package.
This sounds like a database design problem.
You will have to keep a collection of UserThreads, which tracks when the last time the user checked the thread. It has the userId, the threadId, and the lastViewed date(or whatever sensible alternatives you might use).
IF the user has never checked the thread then do not have an object in the UserThreads then the unread count would be the comment count.
WHEN the user views the thread for the first time, create a UserThread object for him.
UPDATE the lastViewed on the UserThread whenever he views the thread.
The UnreadCommentCount will be calculated reactively. It is the sum of comments on the thread where the comment's createdAt is newer than the lastViewed on the UserThread. This can be a template helper function that is executed in the view on an as needed basis. For example, when listing Threads in a subforum view, then it would only calculate for the Threads being viewed in that list at that time.
Alternatively, you could keep an unreadCommentCount attribute on the UserThread. Every time a comment is posted to the thread, then you would iterate through that Thread's UserThreads, updating the unreadCommentCount. When the user later visits that thread, you then reset the unreadCommentCount to zero and updated the lastViewed. The user would then subscribe to a publication of his own UserThreads, which would update reactively.
It seems that in building a forum type site that UserThread object would be indispensable for tracking how a User interacts with Threads. If he had viewed it, ignored it, has commented in it, wants to subscribe to it but has not commented yet, etc.
Based on #datacarl answer, you can modify your thread publication to integrate additional data, such as a count of your unread comments. Here is how you can achieve it, using Cursor.observe().
var self = this;
// Modify the document we are sending to the client.
function filter(doc) {
var length = doc.item.length;
// White list the fields you want to publish.
var docToPublish = _.pick(doc, [
'someOtherField'
]);
// Add your custom fields.
docToPublish.itemLength = length;
return docToPublish;
}
var handle = myCollection.find({}, {fields: {item:1, someOtherField:1}})
// Use observe since it gives us the the old and new document when something is changing.
// If this becomes a performance issue then consider using observeChanges,
// but its usually a lot simpler to use observe in cases like this.
.observe({
added: function(doc) {
self.added("myCollection", doc._id, filter(doc));
},
changed: function(newDocument, oldDocument)
// When the item count is changing, send update to client.
if (newDocument.item.length !== oldDocument.item.length)
self.changed("myCollection", newDocument._id, filter(newDocument));
},
removed: function(doc) {
self.removed("myCollection", doc._id);
});
self.ready();
self.onStop(function () {
handle.stop();
});
I guess you can adapt this example to your case. You can remove the white list part if you need to. The count part will be covered using a request such as post.find({"unread":true, "thread_id": doc._id}).count()
Another way to achieve that is to use collection hooks. Each time you insert a comment, you hook on after the insert and you update a dedicated field "unread comments count" in your related thread document. Each time, the user read a post, you update the value.

Meteor Deps.Autorun know when data has been fully fetched

Is there a way to know when data has been initially fully fetched from the server after running Deps.autorun for the first time?
For example:
Deps.autorun(function () {
var data = ItemsCollection.find().fetch();
console.log(data);
});
Initially my console log will show Object { items=[0] } as the data has not yet been fetched from the server. I can handle this first run.
However, the issue is that the function will be rerun whenever data is received which may not be when the full collection has been loaded. For example, I sometimes received Object { items=[12] } quickly followed by Object { items=[13] } (which isn't due to another client changing data).
So - is there a way to know when a full load has taken place for a certain dependent function and all collections within it?
You need to store the subscription handle somewhere and then use the ready method to determine whether the initial data load has been completed.
So if you subscribe to the collection using:
itemSub = Meteor.subscribe('itemcollections', blah blah...)
You can then surround your find and console.log statements with:
if (itemSub.ready()) { ... }
and they will only be executed once the initial dataset has been received.
Note that there are possible ocassions when the collection handle will return ready marginally before some of the items are received if the collection is large and you are dealing with significant latency, but the problem should be very minor. For more on why and how the ready () method actually works, see this.
Meteor.subscribe returns a handle with a reactive ready method, which is set to true when "an initial, complete snapshot of the record set has been sent" (see http://docs.meteor.com/#publish_ready)
Using this information you can design something simple such as :
var waitList=[Meteor.subscribe("firstSub"),Meteor.subscribe("secondSub"),...];
Deps.autorun(function(){
// http://underscorejs.org/#every
var waitListReady=_.every(waitList,function(handle){
return handle.ready();
});
if(waitListReady){
console.log("Every documents sent in publications is now available.");
}
});
Unless you're prototyping a toy project, this is not a solid design and you probably want to use iron-router (http://atmospherejs.com/package/iron-router) which provides great design patterns to address this kind of problems.
In particular, take a moment and have a look at these 3 videos from the main iron-router contributor :
https://www.eventedmind.com/feed/waiting-on-subscriptions
https://www.eventedmind.com/feed/the-reactive-waitlist-data-structure
https://www.eventedmind.com/feed/using-wait-waiton-and-ready-in-routes

Is there a way to tell meteor a collection is static (will never change)?

On my meteor project users can post events and they have to choose (via an autocomplete) in which city it will take place. I have a full list of french cities and it will never be updated.
I want to use a collection and publish-subscribes based on the input of the autocomplete because I don't want the client to download the full database (5MB). Is there a way, for performance, to tell meteor that this collection is "static"? Or does it make no difference?
Could anyone suggest a different approach?
When you "want to tell the server that a collection is static", I am aware of two potential optimizations:
Don't observe the database using a live query because the data will never change
Don't store the results of this query in the merge box because it doesn't need to be tracked and compared with other data (saving memory and CPU)
(1) is something you can do rather easily by constructing your own publish cursor. However, if any client is observing the same query, I believe Meteor will (at least in the future) optimize for that so it's still just one live query for any number of clients. As for (2), I am not aware of any straightforward way to do this because it could potentially mess up the data merging over multiple publications and subscriptions.
To avoid using a live query, you can manually add data to the publish function instead of returning a cursor, which causes the .observe() function to be called to hook up data to the subscription. Here's a simple example:
Meteor.publish(function() {
var sub = this;
var args = {}; // what you're find()ing
Foo.find(args).forEach(function(document) {
sub.added("client_collection_name", document._id, document);
});
sub.ready();
});
This will cause the data to be added to client_collection_name on the client side, which could have the same name as the collection referenced by Foo, or something different. Be aware that you can do many other things with publications (also, see the link above.)
UPDATE: To resolve issues from (2), which can be potentially very problematic depending on the size of the collection, it's necessary to bypass Meteor altogether. See https://stackoverflow.com/a/21835534/586086 for one way to do it. Another way is to just return the collection fetch()ed as a method call, although this doesn't have the benefits of compression.
From Meteor doc :
"Any change to the collection that changes the documents in a cursor will trigger a recomputation. To disable this behavior, pass {reactive: false} as an option to find."
I think this simple option is the best answer
You don't need to publish your whole collection.
1.Show autocomplete options only after user has inputted first 3 letters - this will narrow your search significantly.
2.Provide no more than 5-10 cities as options - this will keep your recordset really small - thus no need to push 5mb of data to each user.
Your publication should look like this:
Meteor.publish('pub-name', function(userInput){
var firstLetters = new RegExp('^' + userInput);
return Cities.find({name:firstLetters},{limit:10,sort:{name:1}});
});

How to avoid race conditions on cursor.observe?

Race condiditions
In my Meteor application, I made an observe within a publish, that insert some new data in certain conditions. The point is that sometimes we have duplicated subscriptions, and race condition leads us to duplicate inserted data.
If it is not possible to have "singleton observers":
How can we avoid race conditions and duplicated inserted data on database?
Example:
Meteor.publish("fortuneUpdate", function () {
var selector = {user: this.userId, seen:false};
DailyFortunes.find(selector).observe({
removed: function(doc, beforeIndex){
if(DailyFortunes.find(selector).count()<1)
createDailyFortune(this.userId);
}
});
}
This question has been moved from How cursor.observe works and how to avoid multiple instances running?
According to Tom, it is not possible, for now, to ensure that calls to subscribe that have the same arguments are shared.
So, if you are having the same problem I had, of redundant data created inside observers, I suggest you, as workaround, to:
Create robust indexes that prevent repeted data creating. Compound Keys is probable what you need here.
Treat duplicate key error exceptions inside your observer ignoring race conditions.
example:
Collection.find(selector).observe({
removed: function(document){
try {
// Workaround to avoid race conditions > https://stackoverflow.com/q/13095647/599991
createNewDocument();
} catch (e) {
// XXX string parsing sucks, maybe
// https://jira.mongodb.org/browse/SERVER-3069 will get fixed one day
if (e.name !== 'MongoError') throw e;
var match = e.err.match(/^E11000 duplicate key error index: ([^ ]+)/);
if (!match) throw e;
//if match, just do nothing.
}
self.flush();
}
});
This is an odd pattern. Can you share some example code?
Generally I'd either expect to see mutations in a method, or setting up an observe inside Meteor.startup() on the server. (The latter is tricky if you're running multiple server processes, but so are many other things in a multi process regime. We'll have a better pattern down the line.)
Because it can be arbitrary JS, a publish function has to run once per subscribing client. It may log new subscriptions, set up per-client server state, or vary its behavior based on this.userId or even a random source. For example, consider a subscription that returns 10 randomly selected documents from a DB collection to each subscribed client!
So the place to optimize the case of many clients subscribing to the same data set is at the DB query layer: if a thousand clients are subscribed to the same DB query, we'll just run that underlying query once.

Resources