Event Sourcing and synchronous reads. Is it possible?

I just can't find a definitive answer online. I'd like to employ Event Sourcing (ES) for my project, but what bugs me is its asynchronous nature.
Consider a collaborative blog site (I'm making stuff up for the sake of simplicity as my domain is far more complex).
Users can create blog posts and edit them. That's it, nothing more.
So, I've just created a blog entry with
createBlogEntryCommand = new CreateBlogEntryCommand(body, tags)
createBlogEntryCommand.execute()
With ES I'd store BlogEntryCreatedEvent in the ES store, something like
eventStore.append({
  "type": "BlogEntryCreatedEvent",
  "id": "1d11071c-33c6-4621-bb86-cafcc3ca23a6",
  "body": "Lorem ipsum dolor sit amet....",
  "tags": ["awesome-reading", "awesome-writing"],
})
Now, I have no idea when a consumer will pick this event up and process it. Of course, I can have a heuristic metric and guarantee to some degree that the event will be processed within X ms, but what if the consumer is down for maintenance, for instance? How do I make the blog entry available to the author straight away?
Sure enough, I can poll the database for blog id 1d11071c-33c6-4621-bb86-cafcc3ca23a6 (the id is pre-generated, so issuing ids is not a DB concern) and, once the consumer has picked up the event and created the materialised view in the database, make the entry available to the user. But is that the only way?
P.S. I have watched a lot of videos and read a ton of blog posts on the subject, but every source seems to circumvent this caveat of ES without explaining any approaches. More often than not, the answer I've heard is "this can be resolved at the UI/UX level". So if there are good books / articles / videos that discuss how to overcome ES pitfalls in detail, please share them in the comments.
UPDATE
Reading the event history for synchronisation straight away... I actually thought about this, but rejected the idea really quickly.
Considering a simple todo app where a user can sign up and create a todo list, I envision having events like
UserRegistered
TodoCreated
TodoUpdated
TodoCompleted
TodoDeleted
TodoUnDeleted
When reading events from the event store, I'm not interested in other users' todo items, only in the current user's. Which means having a store of events like so:
UserRegistered {name: Bob, id: 1}
UserRegistered {name: Alice, id: 2}
TodoCreated {id: 123, todo: Buy milk, user: 1}
TodoUpdated {id: 123, todo: Buy skim milk, user: 1}
TodoCompleted {id: 123}
TodoCreated {id: 456, todo: Pay the dues, user: 2}
TodoCompleted {id: 456}
TodoDeleted {id: 123}
TodoUnDeleted {id: 123}
TodoUpdated {id: 123, todo: Buy full cream milk!, user: 1}
If I were required to get the latest state of Bob's todo id 123 (buy milk), I'd need to read ALL the events, even though most of them have nothing to do with Bob's list of items. So I'd be traversing heaps of events for todos created by other users just to filter out and apply Bob's.
Does it mean I will be required to have a special "channel" in my event store that only contains Bob's actions upon todo items?
In addition, what if somebody else is able to manage Bob's todo list items? What if Alice has access to modify Bob's todos? Won't it greatly increase the event storage schema?

is it the only way?
No.
There is nothing wrong with reading a history of events out of the event store, and then using those events to compute your view on demand.
View v = View.from(events);
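In JavaScript terms, that on-demand fold might look like the sketch below. The event type names (BlogEntryCreated, BlogEntryEdited) are assumptions based on the question, not a prescribed schema.

```javascript
// Sketch: replay a single stream's events into a read model, synchronously.
// No consumer or materialised view is involved; the view is computed on demand.
function viewFrom(events) {
  return events.reduce(function (view, e) {
    switch (e.type) {
      case 'BlogEntryCreated':
        return { id: e.id, body: e.body, tags: e.tags };
      case 'BlogEntryEdited':
        return Object.assign({}, view, { body: e.body, tags: e.tags || view.tags });
      default:
        return view; // ignore event types this view doesn't care about
    }
  }, null);
}
```

Because the command handler appended the event before acknowledging, a read that replays the stream immediately afterwards will always see the author's own write.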
Does it mean I will be required to have a special "channel" in my event store that only contains Bob's actions upon todo items? In addition, what if somebody else is able to manage Bob's todo list items? What if Alice has access to modify Bob's todos? Won't it greatly increase the event storage schema?
The usual answer is that events that belong together will share a correlation id, and the message store allows you to specify which key to use.
For example, if you were using an RDBMS as your store, you might have a single blob column for all of your event data, and then a number of additional columns to store bits of meta data that are useful for retrieval.
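As a plain-JavaScript sketch of that idea (the meta/streamId field names are assumptions, not a fixed store schema): once each event carries its stream's correlation id as metadata, reading "Bob's stream" reduces to a keyed lookup rather than a scan over everyone's events.

```javascript
// Sketch: each stored event has a payload plus a metadata record; the store
// indexes on meta.streamId, so this filter stands in for an indexed query.
function eventsForStream(allEvents, streamId) {
  return allEvents.filter(function (e) {
    return e.meta.streamId === streamId;
  });
}
```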
Access rules (who can make changes) are a separate concern from the context of the events themselves (who the events are about).

Related

Optimize Firebase database design

I am having trouble designing the database of my app. In the app users are allowed to create jobs and then using GeoFire I find people nearby.
This is my design for the jobs so far:
As you can see, there are the users and then the workers. After pushing the new job to the user's Unique ID (UID) under serviceUsers, I use GeoFire to find the workerUsers that are nearby. I then push the jobs under the UIDs of those workerUsers.
Now here are my questions:
I am basically creating copies of these jobs: once for the person who created the job (under serviceUsers) and once for every nearby workerUser.
Is this inefficient? Should I rather pass some kind of pointer instead of the whole job object to the nearby users?
And here the more important question: if the design is fine as it is, how would I go about it when the creator of the job deletes it? I would then need to find each job in workerUsers and delete the job with the job's UID. Does Firebase support queries for this?
Thank you very much in advance!
I am basically creating copies of these jobs. Once for the person who
created it (under serviceUsers) and once for every nearby workerUsers.
Is this inefficient? Should I rather pass some kind of pointer instead
of the whole job object to the nearby users?
Every job should have a UUID which can act as a "pointer" (I'd rather call it a key). Then every user should include a job UUID, not a whole copy, so you can refer to it. I won't completely replicate your use case, but you should get the idea.
{
  users: {
    exampleUserId: {
      jobs: ['exampleUUID']
    }
  },
  jobs: {
    exampleUUID: {
      name: 'awesome job'
    }
  }
}
If the design is fine as it is, how would I go on about when the
creator of the job deletes it? I would then need to find each job in
workerUsers and delete the job with the jobs UID. Does Firebase
support queries for this?
It does support this, but you should implement my suggestion from above to do it in a sane way. After that, you can create a cloud function whose job is: "When a job with a given UUID is removed, go through every user and remove the reference to it if it exists."
exports.checkReferences = functions.database.ref('/jobs/{uuid}').onWrite(event => {
  // check information here
  if (!event.data.val()) {
    // job was removed! get its uuid and iterate through users and remove the uuid from them
  }
});
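The removal step that the comment above leaves open can be sketched as a pure helper that builds a single multi-path update map (the users shape follows the earlier example; the function and field names are assumptions for illustration):

```javascript
// Sketch: given the users node and the removed job's key, build one
// multi-path update object that strips every reference to that job.
function removalUpdates(users, jobKey) {
  var updates = {};
  Object.keys(users).forEach(function (uid) {
    var jobs = users[uid].jobs || [];
    if (jobs.indexOf(jobKey) !== -1) {
      updates['users/' + uid + '/jobs'] = jobs.filter(function (k) { return k !== jobKey; });
    }
  });
  return updates;
}
```

The resulting map could then be applied atomically with a single ref.update(updates) call inside the cloud function.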

A loose user creation for temporary game room (no account, no password)

I'm trying to create a simple game where rooms are generated for game sessions and destroyed after the game ends. I would like to have users join this room to play the game by just entering a nickname.
Would I need to use the Meteor user accounts, or is there a simpler way to do this, as I won't need any authentication or passwords of sorts?
I am thinking of creating just a player collection and inserting the nickname when they click to join the room, but I don't know how I can keep track of who "I am", or keep them tracked if they lose connection and have to rejoin.
But if I were to use user accounts, I'm not sure how to customise it so that a casual/loose user can be easily created and probably destroyed after game sessions without any password or email, etc.
Thanks in advance!
You could just use a collection, but as you say, you would still need to track the user's id, so that would take some effort.
Personally, I would use the Meteor accounts system, using the accounts-token package:
https://atmospherejs.com/andrei/accounts-token
You can generate a token on the server, and then log the user in. They don't need to know that they are logged in.
Using Meteor's accounts system has many advantages in tracking the user, and if at a later stage you do want people to login using accounts-password, accounts-facebook etc you won't need to restructure your app.
You don't need to use Meteor's account system.
Just create a Rooms collection; each document inside that collection will be a "game room," or whatever you want to call it. Then inside, add a "players" key and store all the players or whatever you need ($pull a player who leaves the game, $push a player who joins). When the game ends, you can completely remove that MongoDB document via its _id field.
Something like:
db.rooms = {
  _id: "xzu90udfmd",
  players: [
    {
      tempUsername: "username1",
      tempWhatever: "whatever",
      token: "3zz97hrnw"
    }, {
      tempUsername: "username2",
      tempWhatever: "whatever",
      token: "8aa41kpmv"
    }, {
      tempUsername: "username3",
      tempWhatever: "whatever",
      token: "5qq12bdfc"
    }
  ]
};
Then, you can add all the customization in either the individual user objects, or outside that could apply to the entire game room. Something like this would be super easy with React.
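As a plain-object sketch of that join/leave flow (in Meteor these would be $push and $pull updates on the Rooms collection; the function names are assumptions): the per-player token is what lets you recognise "who I am" on a rejoin.

```javascript
// Sketch: the token identifies a player across reconnects; join appends a
// player entry, leave removes the entry with that token. Pure functions
// standing in for $push / $pull updates, for illustration only.
function joinRoom(room, nickname, token) {
  return Object.assign({}, room, {
    players: room.players.concat({ tempUsername: nickname, token: token })
  });
}

function leaveRoom(room, token) {
  return Object.assign({}, room, {
    players: room.players.filter(function (p) { return p.token !== token; })
  });
}
```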

Template-level subscriptions re-run a lot of times... Should I use them?

I'm building my Meteor app and it has 1 collection: Students.
On the server I made a publication that receives 3 params: query, limit and skip, to avoid the client subscribing to all the data when it only shows the top 10.
I also have 3 paths:
student/list -> Bring top 10, based on search input and pagination (using find);
student/:id -> Show the student (using findOne)
student/:id/edit -> Edit the student (using findOne)
Each template subscribes to the Students collection, but every time the user moves between these paths, my template re-renders and re-subscribes.
Should I instead make just one subscription, and run the find against this "global" subscription?
I see a lot of people talking about template-level subscriptions, but I don't know whether they're the better choice.
I've also seen people recommend querying on the server in the publication, rather than sending all the data, to reduce traffic...
In this case, when I have just 1 collection, is a "global" subscription better?
You're following a normal pattern, although it's a bit hard to tell without the code. If there are many students then you don't really want to publish them all, only what is really necessary for the current route. What you should do is figure out why your pub-sub is slow. Is it the find() on the server? Do you have very large student objects? (In which case you will probably want to limit which fields are returned.) Is the search you're running hitting Mongo indexes?
Your publication for a list view can have different fields than one for an individual document view, for example:
Meteor.publish('studentList', function() {
  let fields = { field1: 1, field2: 1 }; // only include two fields
  return Students.find({}, { fields: fields });
});

Meteor.publish('oneStudent', function(_id) {
  return Students.find(_id); // here all fields will be included
});

Should I be further denormalizing? [duplicate]

I've read the Firebase docs on Structuring Data. Data storage is cheap, but the user's time is not. We should optimize for get operations, and write in multiple places.
So then I might store a list node and a list-index node, with some duplicated data between the two, at very least the list name.
I'm using ES6 and promises in my javascript app to handle the async flow, mainly of fetching a ref key from firebase after the first data push.
let addIndexPromise = new Promise((resolve, reject) => {
  let newRef = ref.child('list-index').push(newItem);
  resolve(newRef.key()); // ignore reject() for brevity
});

addIndexPromise.then(key => {
  ref.child('list').child(key).set(newItem);
});
How do I make sure the data stays in sync in all places, knowing my app runs only on the client?
As a sanity check, I put a setTimeout in my promise and shut my browser before it resolved, and indeed my database was no longer consistent, with an extra index saved without a corresponding list.
Any advice?
Great question. I know of three approaches to this, which I'll list below.
I'll take a slightly different example for this, mostly because it allows me to use more concrete terms in the explanation.
Say we have a chat application, where we store two entities: messages and users. In the screen where we show the messages, we also show the name of the user. So to minimize the number of reads, we store the name of the user with each chat message too.
users
  so:209103
    name: "Frank van Puffelen"
    location: "San Francisco, CA"
    questionCount: 12
  so:3648524
    name: "legolandbridge"
    location: "London, Prague, Barcelona"
    questionCount: 4
messages
  -Jabhsay3487
    message: "How to write denormalized data in Firebase"
    user: so:3648524
    username: "legolandbridge"
  -Jabhsay3591
    message: "Great question."
    user: so:209103
    username: "Frank van Puffelen"
  -Jabhsay3595
    message: "I know of three approaches, which I'll list below."
    user: so:209103
    username: "Frank van Puffelen"
So we store the primary copy of the user's profile in the users node. In the message we store the uid (so:209103 and so:3648524) so that we can look up the user. But we also store the user's name in the messages, so that we don't have to look this up for each user when we want to display a list of messages.
So now what happens when I go to the Profile page on the chat service and change my name from "Frank van Puffelen" to just "puf"?
Transactional update
Performing a transactional update is probably the approach that first pops into most developers' minds. We always want the username in messages to match the name in the corresponding profile.
Using multipath writes (added on 20150925)
Since Firebase 2.3 (for JavaScript) and 2.4 (for Android and iOS), you can achieve atomic updates quite easily by using a single multi-path update:
function renameUser(ref, uid, name) {
  var updates = {}; // all paths to be updated and their new values
  updates['users/'+uid+'/name'] = name;
  var query = ref.child('messages').orderByChild('user').equalTo(uid);
  query.once('value', function(snapshot) {
    snapshot.forEach(function(messageSnapshot) {
      updates['messages/'+messageSnapshot.key()+'/username'] = name;
    });
    ref.update(updates);
  });
}
This will send a single update command to Firebase that updates the user's name in their profile and in each message.
Previous atomic approach
So when the user changes the name in their profile:
var ref = new Firebase('https://mychat.firebaseio.com/');
var uid = "so:209103";
var nameInProfileRef = ref.child('users').child(uid).child('name');
nameInProfileRef.transaction(function(currentName) {
  return "puf";
}, function(error, committed, snapshot) {
  if (error) {
    console.log('Transaction failed abnormally!', error);
  } else if (!committed) {
    console.log('Transaction aborted by our code.');
  } else {
    console.log('Name updated in profile, now update it in the messages');
    var query = ref.child('messages').orderByChild('user').equalTo(uid);
    query.on('child_added', function(messageSnapshot) {
      messageSnapshot.ref().update({ username: "puf" });
    });
  }
  console.log("User's data: ", snapshot.val());
}, false /* don't apply the change locally */);
Pretty involved, and the astute reader will notice that I cheat in the handling of the messages. The first cheat is that I never call off() for the listener, but I also don't use a transaction.
If we want to securely do this type of operation from the client, we'd need:
1. security rules that ensure the names in both places match. But the rules need to allow enough flexibility for them to temporarily be different while we're changing the name. So this turns into a pretty painful two-phase commit scheme:
   - change all username fields for messages by so:209103 to null (some magic value)
   - change the name of user so:209103 to 'puf'
   - change the username in every message by so:209103 that is null to puf
   The last step requires a query with an AND of two conditions, which Firebase queries don't support. So we'd end up with an extra property uid_plus_name (with value so:209103_puf) that we can query on.
2. client-side code that handles all these transitions transactionally.
This type of approach makes my head hurt. And usually that means that I'm doing something wrong. But even if it's the right approach, with a head that hurts I'm way more likely to make coding mistakes. So I prefer to look for a simpler solution.
Eventual consistency
Update (20150925): Firebase released a feature to allow atomic writes to multiple paths. This works similar to approach below, but with a single command. See the updated section above to read how this works.
The second approach depends on splitting the user action ("I want to change my name to 'puf'") from the implications of that action ("We need to update the name in profile so:209103 and in every message that has user = so:209103").
I'd handle the rename in a script that we run on a server. The main method would be something like this:
function renameUser(ref, uid, name) {
  ref.child('users').child(uid).update({ name: name });
  var query = ref.child('messages').orderByChild('user').equalTo(uid);
  query.once('value', function(snapshot) {
    snapshot.forEach(function(messageSnapshot) {
      messageSnapshot.ref().update({ username: name });
    });
  });
}
Once again I take a few shortcuts here, such as using once('value') (which is in general a bad idea for optimal performance with Firebase). But overall the approach is simpler, at the cost of not having all data completely updated at the same time. Eventually, though, the messages will all be updated to match the new value.
Not caring
The third approach is the simplest of all: in many cases you don't really have to update the duplicated data at all. In the example we've used here, you could say that each message recorded the name as I used it at that time. I didn't change my name until just now, so it makes sense that older messages show the name I used at that time. This applies in many cases where the secondary data is transactional in nature. It doesn't apply everywhere of course, but where it applies "not caring" is the simplest approach of all.
Summary
While the above are just broad descriptions of how you could solve this problem and they are definitely not complete, I find that each time I need to fan out duplicate data it comes back to one of these basic approaches.
To add to Frank's great reply, I implemented the eventual consistency approach with a set of Firebase Cloud Functions. The functions get triggered whenever a primary value (e.g. the user's name) gets changed, and then propagate the change to the denormalized fields.
It is not as fast as a transaction, but for many cases it does not need to be.
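The propagation step such a function performs can be sketched as a pure helper (the messages shape follows the example above; the function name is an assumption for illustration):

```javascript
// Sketch: build the multi-path update that rewrites the denormalized
// username in every message written by the renamed user.
function usernameFanout(messages, uid, newName) {
  var updates = {};
  Object.keys(messages).forEach(function (key) {
    if (messages[key].user === uid) {
      updates['messages/' + key + '/username'] = newName;
    }
  });
  return updates;
}
```

Inside a Cloud Function triggered on the user's name field, the returned map would be applied with one database update, so readers never see a half-renamed set of messages from a single write.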

Meteor - subscribe to same collection twice - keep results separate?

I have a situation in which I need to subscribe to the same collection twice. The two publish methods in my server-side code are as follows:
Meteor.publish("selected_full_mycollection", function (important_id_list) {
  check(important_id_list, Match.Any); // should do a better check
  // this will return the full doc, including a very long array it contains
  return MyCollection.find({
    important_id: {$in: important_id_list}
  });
});

Meteor.publish("all_brief_mycollection", function() {
  // this will return all documents, but only the id and first item in the array
  return MyCollection.find({}, {fields: {
    important_id: 1,
    very_long_array: {$slice: 1}
  }});
});
My problem is that I am not seeing the full documents on the client end after I subscribe to them. I think this is because they are being over-written by the method that publishes only the brief versions.
I don't want to clog up my client memory with long arrays when I don't need them, but I do want them available when I do need them.
The brief version is subscribed to on startup. The full version is subscribed to when the user visits a template that drills down for more insight.
How can I properly manage this situation?
TL/DR - skip to the third paragraph.
I'd speculate that this is because the publish function thinks that the very_long_array field has already been sent to the client, so it doesn't send it again. You'd have to fiddle around a bit to confirm this, but sending different data on the same field is bound to cause some problems.
In terms of subscribing on two collections, you're not supposed to be able to do this as the unique mongo collection name needs to be provided to the client and server-side collections object. In practice, you might be able to do something really hacky by making one client subscription a fake remote subscription via DDP and having it populate a totally separate Javascript object. However, this cannot be the best option.
This situation would be resolved by publishing your summary on something other than the same field. Unfortunately, you can't use transforms when returning cursors from a publish function (which would be the easiest way), but you have two options:
Use the low-level publications API as detailed in this answer.
Use collection hooks to populate another field (like very_long_array_summary) with the first item in the array whenever very_long_array changes and publish just the summary field in the former publication.
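Option 2's idea, stripped of the collection-hooks plumbing, is just deriving a summary field from the long array so the two publications never publish different data on the same field (the very_long_array_summary name follows the answer; the helper itself is an illustrative assumption):

```javascript
// Sketch: keep a one-element summary alongside the full array. The brief
// publication then selects very_long_array_summary instead of $slice-ing
// the original field, and the full publication selects very_long_array.
function withSummary(doc) {
  return Object.assign({}, doc, {
    very_long_array_summary: (doc.very_long_array || []).slice(0, 1)
  });
}
```

A hook (or the code path that writes very_long_array) would run this on every insert/update so the summary never drifts from the array's first element.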
A third option might be publishing the long version to a different collection that exists for this purpose on the client only. You might want to check the "Advanced Pub/Sub" Chapter of Discover Meteor (last sub chapter).
