How to handle required relationships in Firebase Realtime Database? - firebase

Local SQL
Locally in my application I have data models Foo and Bar, each stored in a separate SQL table.
Foo references Bar by its id, and it would be an error to have a Foo without its counterpart: Bar.
Sometimes they are shown together, other times only one is shown at a time, and it's possible to edit one without it affecting the other.
Firebase Realtime Database
I am now investigating how I can best integrate it into my application to ensure a smooth sync & backup experience. Everything is pretty much straightforward, but I cannot wrap my head around how I can ensure that the contract mentioned above stays true.
There will be cases where Foo is synchronized and Bar is only synchronized some time later (consider a simple drop of network connectivity). During this period of time, the Realtime Database for this particular user would be considered to be in an illegal state. When connectivity returns, both Foo and Bar will eventually be synchronized and the entire thing can continue flowing - but what do I do until then? How can I safeguard against it? Do I need to, or am I looking for the answers in all the wrong places?
Some ideas that I've been exploring:
Allow Bar to be missing locally. Consider Foo to be in a partial state whenever its Bar is missing, e.g. it could be hidden in the UI until it's "ready".
Somehow combine Foo and Bar locally, only save Foo when its Bar is available, etc.
Nest Bar inside Foo such that they're always synchronized together. (This seems troublesome; I have a couple of relationships like Foo->Bar in my application, and I would have to nest Bar inside all of them if this were the 'correct' way of handling it.)
Please enlighten me if you can. Firebase isn't new to me, but synchronizing this kind of complex data, with a multitude of relationships, is.
Additional context
While I can't share the actual Foo & Bar, I have an example which I believe resembles the challenge well.
Consider a chat application where there are messages, rooms and users.
You can post a message to a room, and you can edit a room's name and color. A message cannot exist without a user and a room. When looking at it from a sync perspective, the user, room and message now have to be synchronized "together" (or at the very least, a message would be invalid if it came through without the others).
Handling this with multi-path updates, I would update all 3 every time a message is posted/edited. Have I understood this correctly? If so, that would mean updating 1+N+N nodes in my own scenario. The N+N nodes wouldn't have any actual updated data, but would be included in the update to ensure all of it is synchronized together.
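Roughly, I picture that multi-path update looking something like this (sketched with the Web SDK; the /messages, /rooms and /users paths are just placeholders for my real structure):

```typescript
import { getDatabase, ref, update } from "firebase/database";

const db = getDatabase();

// One atomic multi-path update: the message plus the (unchanged) room and user
// are all written in a single call, so they can never arrive separately.
function postMessage(messageId: string, message: object,
                     roomId: string, room: object,
                     userId: string, user: object) {
  return update(ref(db), {
    [`/messages/${messageId}`]: message,
    [`/rooms/${roomId}`]: room,   // unchanged, included only so it syncs together
    [`/users/${userId}`]: user,   // unchanged, included only so it syncs together
  });
}
```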
Further clarity
First, going back to the root of the problem - Foo references Bar by its id locally. If my application ends up in a state where either exists without the other, it would cause issues pretty much everywhere (this is what I'm referring to as an illegal state; it cannot happen).
Considering ways to safeguard against it, I've been recommended to use a multi-path update in Firebase (I believe this is similar/identical to fan-out?). I'll try this approach shortly even though I'm feeling reserved about it, just to verify whether what I'm feeling is simply inexperience.
I'm having a very hard time wrapping my head around the fact that if Foo changes locally, it would now also result in a query happening for Bar, such that both can be synchronized together to Firebase. It sounds OK for such a simple scenario; however, in practice it would mean something along the lines of: you make an edit to X and it gets saved locally, shortly thereafter a sync happens to Firebase, and now hundreds of Y and Z are queried as well and inserted into Firebase in the same operation. Y & Z are likely already up to date in Firebase at this point, and they haven't changed locally.
I don't have much experience with these kinds of scenarios when it comes to Firebase; my thinking goes along the lines of it being a huge waste of resources. I'm sure Firebase does a lot of work locally so that unnecessary synchronization work won't happen for data that hasn't actually changed, but I can't escape the thought that I would still be querying for hundreds and hundreds of complex models locally, in cases where it doesn't seem necessary (surely another approach must exist where I tackle the problem from another angle?).

There will be cases where Foo is synchronized and Bar is only synchronized some time later (consider a simple drop of network connectivity).
If both writes come from the same client, consider using multi-path updates to prevent such partial updates.
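A minimal sketch of what that looks like with the Web SDK, assuming flat /foos and /bars locations. Both nodes are written in one atomic call, so the database never ends up holding only one of them:

```typescript
import { getDatabase, ref, update } from "firebase/database";

const db = getDatabase();

// Write Foo and Bar together in a single atomic multi-path update.
// Either both paths are committed or neither is.
function saveFooWithBar(fooId: string, foo: object, barId: string, bar: object) {
  return update(ref(db), {
    [`/foos/${fooId}`]: foo,
    [`/bars/${barId}`]: bar,
  });
}
```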

I finally managed to solve this; hopefully it can help someone else out there who runs into the same challenge further down the line!
First and foremost, when registering a ChildEventListener on a DatabaseReference, I get one notification with the latest version of the data, and then notifications each time the data changes.
Using that, I filter data by its "version" (basically a timestamp of when it was edited) so that only data that is newer than the latest local version of it is actually processed.
Any time a Foo comes through and its Bar counterpart hasn't arrived yet, I'll mark it as a deferred task that should be retried after the next sync operation is complete - in almost all cases, Bar will come through after Foo and the task will succeed on its next retry. In some cases a couple of other sync operations will come through before Bar, but my retry logic is all local and cheap, so I don't mind that.
If no Bar shows up during the lifetime of the application, it's not a problem, as the latest version of Foo will come through the next time the application is started, and the cycle just repeats itself until both Foo & Bar are available, at which point they're inserted into my local database.
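A stripped-down sketch of the idea (shown here with the Web SDK's child listeners rather than the Android ChildEventListener; localDb and the deferred queue are placeholders for whatever local storage you use):

```typescript
import { getDatabase, ref, onChildAdded, onChildChanged, DataSnapshot } from "firebase/database";

// Placeholders for the local database and retry queue -- not a real API.
declare const localDb: {
  latestVersion(kind: string, id: string): number;      // 0 if unknown locally
  find(kind: string, id: string): unknown | undefined;
  save(kind: string, id: string, value: unknown): void;
};
const deferred: Array<() => boolean> = [];

const db = getDatabase();

function handleFoo(snap: DataSnapshot) {
  const foo = snap.val();
  // Skip anything that isn't newer than the latest local version.
  if (foo.version <= localDb.latestVersion("foo", snap.key!)) return;

  const tryInsert = () => {
    const bar = localDb.find("bar", foo.barId);
    if (!bar) return false;               // Bar hasn't arrived yet, retry later
    localDb.save("foo", snap.key!, foo);  // both exist, safe to store locally
    return true;
  };

  if (!tryInsert()) deferred.push(tryInsert);  // retried after the next sync pass
}

const foosRef = ref(db, "foos");
onChildAdded(foosRef, handleFoo);
onChildChanged(foosRef, handleFoo);

// Called after each sync pass: retry deferred tasks, drop the ones that succeed.
function drainDeferred() {
  for (let i = deferred.length - 1; i >= 0; i--) {
    if (deferred[i]()) deferred.splice(i, 1);
  }
}
```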
I don't know if this is the "correct" way to solve it. But using my knowledge, it is the best I could come up with that works, and I don't see any real downsides to it. Feel free to criticize it in the comments, or provide another answer if you think you have a better solution!
Thank you.

Related

Something in ngrx (Redux pattern) that I still don't get for large applications

I've been building data-driven applications for about 18 years, and for the past two I've been successfully using Angular for my large forms/CRUD-based apps. You know, the classic SQL Server DB with hundreds of tables and millions of records. So far, so good.
Now I'm porting/re-engineering a desktop app with about 50 forms, all complex, all fully functional, "smart". My approach for the last couple of years was to simply work tightly with the backend REST API to retrieve, insert or update data as needed, and everything works fine.
Then I stumbled across ngrx and I understand exactly how it works, what it does and why it is good for a "reactive" app.
My problem is the following: in the usual lifecycle of the kind of systems I mentioned, you always have to deal with fresh data and always have to tell everything to the server. Almost no data in such apps can be safely "stored" locally, since transactional systems rely on centralized data interactions. There's no such thing as "hey, let's keep this employee's sales here for later use".
So why would it be so important to manage a local 'store' when most of my data is volatile? I understand why it would be useful for global app data like the user profile or general UI-related state, but for the core data itself? I don't get it. You query for data, plug that data into the form, it gets processed by the user and sent back to the server. That data is no longer needed, and if you do need it, you ask for it again, as it could have changed its state since the last time you interacted with it.
I do not understand the great lengths I have to go to in order to maintain a local store, and all the boilerplate, if that state is so volatile.
They say change detection does not scale, but I've built some really large web apps with a simple "http service" pattern and it works just fine, because most of the component tree is destroyed anyway as you go somewhere else in the app, and any previous subscriptions become useless. Even with large, bulky, kinky forms, the inner workings of a form are never such a big problem as to require external "aid" from a store. The way I see it, the "state" of a form is a concern of that form in that moment alone. Is it to keep the component tree in sync? I never had problems with that before... even for complicated trees with lots of shared data, master-detail is kind of a flat pattern in the end if all the data is there.
For other components, such as grids, charts, reports, etc., the same thing applies. They get the data they need and then "poof", gone.
So now you see my mindset. I AM trying to change it to something better. What am I missing about the Redux pattern?
I have a bit of experience here! It's all subjective, so what I've done may not suit you. My system is a complex system that sounds like it's on a similar scale as yours. I battled at first with the same issues of "why build complex logic on the front end and back end", and "why bother keeping stuff in state".
A redux/NGRX approach works for me because there are multiple ways data can be changed - perhaps it's a single user using the front end, perhaps it's another user making a change and I want to respond to that change straight away to avoid concurrency issues down the track. Perhaps there are multiple parts within my front end that can manipulate the same data.
At the back end, I use a CQRS pattern instead of a traditional REST API. Typically, one might suggest re-implementing the commands/queries to "reduce" changes to the state; however, I opted for a different approach. I don't just want to send a big object graph back to the server and have it blindly insert it, and I don't want to re-implement logic on the client and server.
My basic "use case" life cycle looks a bit like this (a rough sketch of the state shape follows the list):
Load a list of data (limited size, not all attributes).
User selects item from list
Client requests "full" object/view/dto from server
Client stores response in object entity state
User starts modifying data
These changes are stored as "in progress" changes in a different part of state. The system is now responding to the data in the "in progress" part
If another change comes in from server, it doesn't overwrite the "in progress" data, but it does replace what is in the object entity state.
If required, UI shows that the underlying data has changed / is different to what user has entered / whatever.
User clicks on the "perform action" button, or otherwise triggers a command to be sent to server
Server performs the command; errors, or success, are returned.
Server notifies the client that the change was successful, and the client clears the "in progress" information.
Server notifies clients that Entity X has been updated; each client re-requests Entity X and puts it into the object entity state. This notification is sent to all connected clients, so they can all behave appropriately.
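A rough sketch of what that split can look like in NGRX (the Employee entity, field names and action names are made up for illustration; this is the state-shape idea, not my actual code):

```typescript
import { createAction, createReducer, on, props } from "@ngrx/store";

// Illustrative entity -- a stand-in for whatever your forms edit.
interface Employee { id: number; name: string; salary: number; }

interface EmployeeState {
  // Latest known server version of each loaded entity.
  entities: { [id: number]: Employee };
  // Unsaved edits the user is currently making, kept separately so a
  // server push never silently overwrites what the user has typed.
  inProgress: { [id: number]: Partial<Employee> };
}

const initialState: EmployeeState = { entities: {}, inProgress: {} };

const loadSuccess   = createAction("[Employee] Load Success", props<{ employee: Employee }>());
const editField     = createAction("[Employee] Edit", props<{ id: number; changes: Partial<Employee> }>());
const saveSucceeded = createAction("[Employee] Save Succeeded", props<{ id: number }>());

export const employeeReducer = createReducer(
  initialState,
  // Server data (initial load or a push about another user's change)
  // only replaces the `entities` copy, never the in-progress edits.
  on(loadSuccess, (state, { employee }) => ({
    ...state,
    entities: { ...state.entities, [employee.id]: employee },
  })),
  // User edits accumulate in the in-progress slice.
  on(editField, (state, { id, changes }) => ({
    ...state,
    inProgress: { ...state.inProgress, [id]: { ...state.inProgress[id], ...changes } },
  })),
  // After the command succeeds on the server, clear the in-progress edits;
  // the fresh entity arrives later via loadSuccess.
  on(saveSucceeded, (state, { id }) => {
    const { [id]: _cleared, ...rest } = state.inProgress;
    return { ...state, inProgress: rest };
  }),
);
```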

Looking for an efficient way to update the data

I'm writing a small game for Android in Unity. Basically the person has to guess what's on the photo. Now my boss wants me to add an additional function: after a successful/unsuccessful guess, the player will get a panel to rate the photo (basically like or dislike), because we want to track which photos are not good / remove the photos after a couple of successful guesses.
My understanding is that if we want to add +1 to a variable in Firebase, I first have to make a call to get the current value, and then we have to make a separate call writing back that value plus 1. I was wondering if there is a more efficient way to do it?
Thanks for any suggestions!
Instead of requesting Firebase when you want to add, you can request Firebase at the beginning (in an onCreate-like method), save the object, and then use it when you want to update it.
thanks
Well, one thing you can do is store your data temporarily in some object, but NOT send it to Firebase right away. Instead, you can send the data to Firebase when the app/game is about to get paused/minimized, hence reducing potential lag and increasing player satisfaction. OnApplicationPause(bool) is one such function that gets called when the game is minimized.
To do what you want, I would recommend using a Transaction instead of just doing a SetValueAsync. This lets you change values in your large shared database atomically, by first running your transaction against the local cache and later against the server data if it differs (see this question/answer).
This gets into some larger interesting bits of the Firebase Unity plugin. Reads/writes will run against your local cache, so you can do things like attach a listener to the "likes" node of a picture. As your cache syncs online and your transaction runs, this callback will be asynchronously triggered letting you keep the value up to date without worrying about syncing during app launch/shutdown/doing your own caching logic. This also means that generally, you don't have to worry too much about your online/offline state throughout your game.
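As a rough illustration of the transaction idea (sketched with the Web SDK rather than the Unity plugin; the photos/{photoId}/likes path is just an assumption for this example):

```typescript
import { getDatabase, ref, runTransaction, onValue } from "firebase/database";

const db = getDatabase();

// Atomically add one like: the update function runs against the local cache
// first and is re-run against the server value if they differ.
function likePhoto(photoId: string) {
  const likesRef = ref(db, `photos/${photoId}/likes`);
  return runTransaction(likesRef, (current) => (current ?? 0) + 1);
}

// Keep the displayed counter in sync without any manual refresh logic.
function watchLikes(photoId: string, render: (likes: number) => void) {
  onValue(ref(db, `photos/${photoId}/likes`), (snap) => render(snap.val() ?? 0));
}
```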

Firebase - Multi Path Updates or Cloud Function Listeners

After watching a fair number of YouTube videos, it seems that Google is advocating for multi-path updates when changing data stored in multiple places. However, the more I've messed with Cloud Functions, the more they seem like an even more viable option, as they can just sit in the back and listen for changes to a specific reference and push changes as needed to the other references in real time. Is there a con to going this route? Just curious as to why Google doesn't recommend them for this use case.
NEWER UPDATE: Literally as I was writing this, I received a response from Google regarding my issues. It's too late to turn our app's direction around at this point, but it may be useful for someone else.
If your function doesn't return a value, then the server doesn't know how long to wait before giving up and terminating it. I'd wager a quick guess that this might be why the DB calls aren't getting invoked.
Note that since DatabaseReference.set() returns a promise, you can simply return that if you want.
Also, you may want to add a .catch() and log the output to verify the set() op isn't failing.
~firebase-support#google.com
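For what it's worth, that advice boils down to something like the following (a sketch using the first-generation database trigger API; the /foos and /foosMirror paths are made up for the example):

```typescript
import * as functions from "firebase-functions/v1";
import * as admin from "firebase-admin";

admin.initializeApp();

// Mirror writes under /foos to a denormalized copy. Returning the promise from
// set() tells the Functions runtime how long to wait before terminating,
// which is what the support reply above is pointing at.
export const mirrorFoo = functions.database
  .ref("/foos/{fooId}")
  .onWrite((change, context) => {
    const fooId = context.params.fooId;
    const mirrorRef = admin.database().ref(`/foosMirror/${fooId}`);

    return mirrorRef
      .set(change.after.val())   // a null value deletes the mirror copy
      .catch((err) => {
        console.error(`Failed to mirror foo ${fooId}`, err);
        throw err;               // re-throw so the invocation is marked as failed
      });
  });
```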
UPDATE: My experience with cloud functions in the last month or so has been sort of a love-hate. A lot of our denormalized data relied on Cloud Functions to keep everything in sync. Unfortunately (and this was a bad idea from the start) we were dealing with transactional/money data and storing that in multiple areas was uncomfortable. When we started having issues with Cloud Functions, i.e. the execution of them on a DB listener was not 100% reliable, we knew that Firebase would not work at least for our transaction data.
Overall the concept is awesome. They work amazingly well when they trigger, but due to some inconsistencies in triggering the functions, they weren't reliable enough for our use case.
We're currently using SQL for our transactional data, and then store user data and other objects that need to be maintained real-time in Firebase. So far that's working pretty well for us.

Using Realm with an infinite scrolling table view in Swift

I'm learning about using NSOperation and NSOperationQueue for my networking calls to deliver a more responsive UI in my app's table view.
The result of the networking operation gets stored in the Realm and displayed in the table view.
This is an infinite-scroll table view, and as the user gets to the end, more data is pulled into the app.
I am wondering what the best design paradigm to use here is, and what the best spot to clear the Realm is. I don't want to inflate the app with useless data. I just want them to have data if they log back in with no network (airplane mode).
I would also like to know where the best spot to trigger these networking operations is. cellForRowAtIndexPath perhaps? I am not too sure, since I usually just use Alamofire and trigger a network request in viewDidLoad. But these are not cancellable calls.
I've gone through the great tutorials on Ray Wenderlich, but other than the playground examples, I am still not getting a real-world application tutorial. If anyone knows of a good one on this subject, let me know.
thanks
This might be tricky to answer since it all depends on your app, the size/type of data it's displaying and how often you want to perform network fetches. In the end, it will be most likely be a compromise between what 'feels good' and how many system resources need to be consumed to make it happen.
In this particular scenario, Realm is being used as a caching mechanism and nothing more, so when to clear it should probably depend on how aggressively you wish to clear it.
If I was building a system like this, I would decide on a set number of the latest items I would always want to have available and save them in Realm. If the user then decided to start scrolling down beyond that limit, more data would be downloaded and appended to the Realm database as they went. Eventually the user will get tired and scroll back to the top (Or they might even just quit the app and restart from the top). At that point, it would be appropriate to trigger an operation to review the size of the Realm cache and remove as many items as necessary to bring it back to the desired size. If they start scrolling down again, then it's appropriate to just re-download that data.
Unlike SQLite, where items are copied into memory, Realm is very good at lazy-loading resources mapped from disk, so it's not necessary to worry about the number of Realm items in memory, more just the size of the Realm file on disk, which again depends on how big the data you're downloading is.
As for when to trigger another network operation to request more data, it's probably best to do it in tableView(_:willDisplay:forRowAt:). Depending on how large the data to download is (and the size of your table cells are), you should play with it until it feels natural when scrolling at a pretty normal speed. As a starting point, I'd recommend starting at maybe a whole screen-worth of table cells before hitting the bottom of the scroll view.
Good luck!

Preventing Users from Working on the Same Row

I have a web application at work that is similar to a ticket working system. Some users enter new issues. Other workers choose and resolve issues. All of the data is maintained in MS SQL Server 2005.
The users working to resolve issues go to a page where they can view open issues. Because up to twenty people can be looking at this page at the same time, one potential problem I had to address was what happens if someone picks an issue that someone else picked just after their page loaded.
To address this, I did two things. First, the gridview displaying the issues to select uses an AJAX timer to update every second. Once an issue has been selected, it disappears one second later at most. In case they select one within this second, they get a message asking them to choose another.
The problem is that the AJAX part of this is sending too many updates (this is what I am assuming) and it is affecting the performance of the page and database. In addition, the updates are not happening every second; I find the timer to be unreliable when used to trigger stored procedures.
There has to be a better way, but I can't seem to find one. Does anyone have experience with a situation like this or have suggestions to keep multiple users from selecting the same record to maintain? I really do not want to disable the AJAX part entirely because I feel the message alone would make the application frustrating to use.
Thanks,
Put a lock timestamp field on the row in the database. Write a stored proc that returns true or false depending on whether the expiration timestamp is older than a specific time. Set the sessions in your web app to expire in the same time, a minute or two. When a user selects a row, they hit the stored proc, which helps the app decide whether it should let the user modify it.
Hope that makes sense....
Two things can help mitigate your problem.
First, after-selection notification that the case has been taken is needed regardless of your ajax update time frame. Even checking every second doesn't mean two people cannot click the same case at what they perceive to be the same time. In such cases, one of the users needs to be notified that their selection is invalid even though it appeared valid when selected. This notification doesn't need to be elaborate; keeping a light, helpful tone can improve user perception even in the light of disappointment. And if you identify the user who selected that record already, that will not only help your users coordinate in future but also divert attention from your program to the user who snaked the juicy case. (indeed, management may like giving your users the occasional collision as it will motivate them to select cases faster)
Second, a small tweak to how you display your cases can reduce selection collisions. Adding a random element to display order and/or filtering out every other case on display will help your users select different cases naturally. Human pattern recognition and task selection isn't really random so small changes to presentation can equal big changes to selection behavior. Reductions in collision chance keeps your collision notifications rare (and thus less frustrating to your users). This is even better if your users can be separated into classifications that can help determine useful case ordering/filtering.
Okay, a third thing that will help you over time is if you keep a log of when collisions occur (with helpful meta data about the collision—like who was involved and selection timing). Armed with solid collision data, you can find what works and what doesn't. Over time, you can hone your application to your actual use cases as well as identify potential problems early. Nothing reassures your users more than being on top of a problem (and able to explain your plans to solve it) before they're even aware it exists.
With these mitigating patterns, you'll probably find you can safely reduce your ajax query timeframe without affecting user experience. And with useful logging, you'll have the assurance that any tweaks you put in place are actually working (or not—which is maybe even more useful to know).
I did something similar where, once a user opened a ticket (row), it assigned that ticket to that user and set a value on that record, like an FK to that particular user, so if anyone else tried to open that ticket (row), it would let them know it had already been assigned to someone else.
If possible, limit the system so that they just get the next open issue off the work queue, as opposed to having them be able to choose from all open issues.
If that isn't possible, I suppose you could check upon the choosing of an issue to see if it is still available. If it's not available, then make it disappear after the user clicks on it. This way you are only requesting when they actually click on something as opposed to constant polling of the data.
Have you tried increasing the time between refreshes? I would expect that once per 30 seconds would be sufficient: 40 requests/minute is a lot less load than 1200/minute. Your users may not even notice the difference.
If they do, how about providing a refresh button on the page so the users can manually refresh the list just prior to selecting an item to avoid the annoying message if they choose.
I'm failing to see the issue, especially after you mentioned you are already flagging tickets as in progress/being maintained and have a timestamp/version on the item.
Isn't the following enough:
1. The user browses the tickets and sees a list of available tickets, i.e. this excludes the ones that are flagged in the DB as in progress. If you want the users to also see tickets in progress, you indicate it clearly in the ticket status and disable the option to take them.
2. The user flags a ticket as in progress, either explicitly or implicitly by opening the ticket (depends on the user experience / how it's presented to the users).
3. The user explicitly moves the ticket to a different status, i.e. completed, invalid, awaiting feedback, etc.
When the items are retrieved at 1, you include a timestamp/version. When 2 happens, you use an optimistic concurrency approach to make sure that if two people try to take the ticket at the same time, only the first one will be successful.
What will happen is that, for the second person, the update ... where ... timestamp = #timestamp will not find any records to update, and you will report back that the ticket was already taken.
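To illustrate, the "take the ticket" step could look something like this (dbExecute is a stand-in for whatever data access layer you use, not a real API):

```typescript
// Hypothetical data-access helper: runs a parameterized statement and
// resolves to the number of rows affected. Not a real library API.
declare function dbExecute(sql: string, params: Record<string, unknown>): Promise<number>;

// Try to take a ticket using the timestamp/version read in step 1.
// If someone else took it in the meantime, the WHERE clause matches nothing,
// zero rows are updated, and we report the ticket as already taken.
async function takeTicket(ticketId: number, userId: number, rowVersion: string): Promise<boolean> {
  const affected = await dbExecute(
    `UPDATE Tickets
        SET Status = 'InProgress', AssignedTo = @userId
      WHERE TicketId = @ticketId
        AND RowVersion = @rowVersion`,
    { ticketId, userId, rowVersion },
  );
  return affected === 1; // false -> "this ticket was already taken"
}
```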
If you want, you can build on top of the above to update the UI as tickets are grabbed. This could be just doing a full refresh of the current page of tickets after x time (maybe alerting/prompting the user), or even retrieving, via AJAX, a list of changed tickets for the page being shown. You still have the earlier steps in place, as this modification is just a convenience for the users.
