Google Firebase concerns and mixing database technologies in a web application - firebase

Hi, I am working with a small start-up that is building a web application. The initial tech stack we chose was a React JS front end, Python on the server side to handle some external data requests, and Google's Firebase (Realtime Database) as the back end.
We looked at Firebase specifically because of its documented integrations with the Google suite of tools, including Google Analytics and BigQuery, and the built-in user authentication that comes with the Firebase console.
However, since engaging a group of developers, they have raised concerns in two areas about using Firebase for our application. First, Firebase has documented limitations on the depth and complexity of the search it supports:
for queries that require complex joins across tables, or search criteria that is partial or needs similar/LIKE-style matching, Firebase is said to have either no capability or very limited capability.
Second, with regard to users, the product is said to have limited support for building an environment that requires user groups and roles.
It has therefore been suggested that we look at moving away from Firebase, or that we reduce Firebase to the simpler elements of our application, moving critical data, and data that is found and displayed via complex searching and querying, onto alternative database technologies with greater support for data and search complexity.
To that end, I am looking to understand whether anyone else runs their entire web application back end on either of the two Firebase database offerings (Realtime Database or Cloud Firestore), and whether you have faced issues around performance, missing functionality, or lack of capability when trying to do complex things in your back end.
If you did, how did you resolve the issue? Did you extend Firebase with third-party plugins, move part or all of your data onto alternative database technologies, or move away from Firebase altogether?
And lastly, I would like to know whether using Firebase in a more limited way, for example to manage user access to the application while the critical data resides in another database (for example MongoDB or SQL), is feasible, or whether we are over-complicating the infrastructure build by running two different database technologies.
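To make that last question concrete, the rough shape I have in mind is sketched below. This is purely illustrative, shown with the Node Admin SDK for brevity even though our server side is Python (which has an equivalent verify_id_token); all routes, collection names and fields are placeholders, not existing code.
// Firebase handles sign-in on the client; the API verifies the Firebase ID
// token, and the critical data lives in another database (MongoDB here as an example).
const express = require('express');
const admin = require('firebase-admin');
const { MongoClient } = require('mongodb');

admin.initializeApp();
const app = express();
const mongo = new MongoClient(process.env.MONGO_URL);

app.get('/api/projects', async (req, res) => {
  try {
    // The React client sends the ID token it received from Firebase Auth.
    const idToken = (req.headers.authorization || '').replace('Bearer ', '');
    const decoded = await admin.auth().verifyIdToken(idToken);

    // Complex querying happens in the non-Firebase database.
    const projects = await mongo
      .db('app')
      .collection('projects')
      .find({ ownerUid: decoded.uid })
      .toArray();

    res.json(projects);
  } catch (err) {
    res.status(401).json({ error: 'invalid or missing token' });
  }
});

mongo.connect().then(() => app.listen(3000));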
Thanks to anyone who offers their advice. Duncan

This isn't exactly an answer, but reading the comments shows that one requirement in the post was a partial string query. According to the comment, this is the concern:
One user types in “Al” in an attempt to search for his friend “Alex”. You want to fetch all users in the database whose name contains the text “Al” to help the user narrow down his search. This task is not possible through Cloud Firestore alone.
So, examining that one requirement, let's suppose we want to perform that task: return all users whose name starts with 'Al'.
func findPartialString(startingWith: String) {
    let usersColl = self.db.collection("users")
    // "\u{f8ff}" is a very high code point, so the range [startingWith, endingWith)
    // matches every name that begins with the given prefix.
    let endingWith = startingWith + "\u{f8ff}"
    let query = usersColl
        .whereField("name", isGreaterThanOrEqualTo: startingWith)
        .whereField("name", isLessThan: endingWith)

    query.getDocuments { querySnapshot, err in
        if let err = err {
            print("there was an error: \(err.localizedDescription)")
            return
        }

        if let snap = querySnapshot {
            if snap.count > 0 {
                for document in snap.documents {
                    let key = document.documentID
                    let name = document.get("name") as! String
                    print(key, name)
                }
            } else {
                print("no matches")
            }
        } else {
            print("no data returned in snapshot")
        }
    }
}
and would be called like this
findPartialString(startingWith: "Al")
and the result is
uid_5 Alex
uid_4 Albert
There are several other ways to perform this same task, so clearly it's not only possible to do, the code is also compact and easy to maintain.
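For a web front end like the one described in the question, the same prefix trick looks roughly like this with the Firestore JavaScript SDK (a sketch in v9 modular syntax; the collection and field names are assumed from the example above):
import { getFirestore, collection, query, where, getDocs } from 'firebase/firestore';

// Same idea as the Swift version: '\uf8ff' is a very high code point, so the
// range ["Al", "Al\uf8ff") matches every name that starts with "Al".
async function findPartialString(startingWith) {
  const db = getFirestore();
  const q = query(
    collection(db, 'users'),
    where('name', '>=', startingWith),
    where('name', '<', startingWith + '\uf8ff')
  );

  const snap = await getDocs(q);
  snap.forEach((doc) => console.log(doc.id, doc.data().name));
}

findPartialString('Al'); // uid_5 Alex, uid_4 Albert (given the sample data above)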

Related

How do people build a friend presence system? (not a global user presence system)

There are several articles (for Firestore and the Firebase Realtime Database) explaining how to build a user presence system, but I cannot find a resource for a friend presence system.
A simple user presence system is not a good fit for some applications, such as chat apps, where there are millions of users and each user only wants to listen to his/her friends. I've found similar questions:
exact same question on stackoverflow
exact same issue on github
Two OK solutions with a Realtime Database are (both from the Stack Overflow post above):
Use many listeners (one for each friend) with a collection of users; possibly cap the number of friends to keep track of (a rough sketch of this is below the list).
Each user has a friends collection, and whenever a user's status changes, it is also updated wherever that user appears in other users' friends collections.
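A minimal sketch of the first option with the Realtime Database web SDK (the /status path, the shape of friendIds, and how the cap is enforced are assumptions, not part of the question):
import { getDatabase, ref, onValue } from 'firebase/database';

// Option 1: one Realtime Database listener per friend.
function watchFriends(friendIds, onChange) {
  const db = getDatabase();
  // onValue returns an unsubscribe function, so keep them to detach later.
  return friendIds.map((friendId) =>
    onValue(ref(db, `status/${friendId}`), (snap) => onChange(friendId, snap.val()))
  );
}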
Is there a better way to do this? What kind of databases do chat apps like Discord, WhatsApp, etc. use to build their friend presence systems?
I came up with two approaches that might be worth looking into. Note that I have not tested how they scale longer term, as I just pushed to prod. First step: write a user's presence onto their user document (you will need Firebase, Cloud Functions, and Cloud Firestore, per https://firebase.google.com/docs/firestore/solutions/presence).
Then take either approach:
Create an array field called friends on your user documents (users > {userID}). Every time you add a friend, add your ID to their array, and vice versa. Then, on the client, run a query like:
db.collection(users).where("friends", "array-contains", clientUserId).onSnapshot(...)
This way, all documents whose friends field contains the clientUserId will be listened to for real-time updates. For some reason my team didn't approve of this design, but it works. If anyone can share an opinion as to why, I'd appreciate it.
Create a friends sub-collection like so: users > {userID} > friends. When you add a friend, add a document to your friends sub-collection whose ID equals your friend's userID. When a user logs on, run a get query for all documents in this collection, get the doc IDs, and store them in an array (call it friendIDs). Now for the tricky part. It would be ideal if you could use the in operator with unlimited comparison values, because then you could just run a single onSnapshot like this:
this.unSubscribeFriends = db.collection(users).where(firebase.firestore.FieldPath.documentId(), "in", friendIDs).onSnapshot((querySnapshot) => { /* get presence data */ })
Since the unsubscribe function returned by onSnapshot is stored in this.unSubscribeFriends, you just need to call it once to detach the listener:
componentWillUnmount() {
    this.unSubscribeFriends && this.unSubscribeFriends()
}
Because a given user's friends can definitely increase into the hundreds, I had to create a new array called chunkedFriendsArray consisting of a chunked version of friendIDs (chunked as in: every 10 string IDs I splice into a new array, to get around the in operator's limit of 10 comparison values). Thus, I had to map over chunkedFriendsArray and set an onSnapshot like the one above for every array of at most 10 IDs inside chunkedFriendsArray. The problem with this is that all the listeners are attached to the same const (this.unSubscribeFriends in my case). I have to call this.unSubscribeFriends as many times as there are chunked arrays in chunkedFriendsArray:
componentWillUnmount() {
    this.state.chunkedFriendsArray.forEach((doc) => {
        this.unSubscribeFriends && this.unSubscribeFriends()
    })
}
It feels weird having many listeners attached to the same const (this.unSubscribeFriends) and calling the exact same one to stop listening to all of them. I'm sure this will lead to bugs in my production code.
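For reference, here is a rough sketch of that chunked pattern which keeps one unsubscribe function per chunk in an array rather than a single const (names are illustrative, same v8-style API as the snippets above):
// Split friendIDs into arrays of at most 10 to stay under the "in" limit.
const chunkedFriendsArray = [];
for (let i = 0; i < friendIDs.length; i += 10) {
  chunkedFriendsArray.push(friendIDs.slice(i, i + 10));
}

// One onSnapshot per chunk; collect every unsubscribe function it returns.
this.unsubscribeFns = chunkedFriendsArray.map((chunk) =>
  db.collection('users')
    .where(firebase.firestore.FieldPath.documentId(), 'in', chunk)
    .onSnapshot((querySnapshot) => {
      // ...merge presence data for this chunk into state...
    })
);

// In componentWillUnmount, detach each listener exactly once:
// this.unsubscribeFns.forEach((unsubscribe) => unsubscribe());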
There are other, more decentralized approaches, but the two I listed are my best attempts at avoiding a bunch of decentralized presence data.

Best practice Firebase search and query

Current State
I have a Flutter app which shows me a list of data from Firebase in a list view.
return new ListView(
    children: snapshot.data.docs.map((DocumentSnapshot documentSnapshot) {
      return _createRows(
        documentSnapshot.data()['id'],
        documentSnapshot.reference.id,
      );
    }).toList());
Problem/Question
But the list will get bigger, and therefore the loading times will increase; much more importantly, the number of billed document reads will keep growing as well. I also plan to add a search function.
Firebase docs:
[...] downloading an entire collection to search for fields client-side isn't practical.
Is there a way to query only the data actually needed by the ListView.builder, and to do the search via Firebase?
(One possibility is shown here; however, it is not very favourable in terms of data storage usage.)
Also, there are a few third-party services, e.g. Elastic, but I couldn't find any free ones; in addition, I'm not sure whether the effort to integrate them with Flutter is worth it.
I am curious to hear your suggestions
I decided to download all the data at the start of the app and then pass it around inside the app; since I have comparatively little data to download and users stay in the app for a long time, this is the most worthwhile approach for me.
I implemented the search with the TypeAhead plugin.
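If downloading everything up front isn't an option, Firestore's query cursors let the list load data page by page instead. A rough sketch with the web SDK is below (the Flutter cloud_firestore plugin exposes equivalent limit and startAfterDocument methods; collection and field names are assumptions):
import { getFirestore, collection, query, orderBy, limit, startAfter, getDocs } from 'firebase/firestore';

// Load 20 documents per page; pass the last document of the previous page
// as the cursor for the next one.
async function loadPage(lastDoc) {
  const db = getFirestore();
  const base = collection(db, 'items');
  const q = lastDoc
    ? query(base, orderBy('id'), startAfter(lastDoc), limit(20))
    : query(base, orderBy('id'), limit(20));

  const snap = await getDocs(q);
  return { docs: snap.docs, lastDoc: snap.docs[snap.docs.length - 1] };
}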

Cloud Firestore rules on subcollection

I'm working on an iOS app which has (whoa, surprise!) chat functionality. The whole app makes heavy use of the Firebase tools; for the database I'm using the new Cloud Firestore solution.
Currently I'm in the process of tightening the security using the database rules, but I'm struggling a bit with my own data model :) This could mean that my data model is poorly chosen, but I'm really happy with it, except for implementing the rules part.
The conversation part of the model looks like this. At the root of my database I have a conversations collection:
/conversations/$conversationId
- owner // id of the user that created the conversation
- ts // timestamp when the conversation was created
- members: {
$user_id_1: true // usually the same as 'owner'
$user_id_2: true // the other person in this conversation
...
}
- memberInfo: {
// some extra info about user typing, names, last message etc.
...
}
And then I have a subcollection on each conversation called messages. A message document is a very simple and just holding information about each sent message.
/conversations/$conversationId/messages/$messageId
- body
- sender
- ts
The rules on the conversation documents are fairly straightforward and easy to implement:
match /conversations/{conversationId} {
  allow read, write: if resource.data.members[(request.auth.uid)] == true;

  match /messages/{messageId} {
    allow read, write: if get(/databases/$(database)/documents/conversations/$(conversationId)).data.members[(request.auth.uid)] == true;
  }
}
Problem
My problem is with the messages subcollection in that conversation. The above works, but I don’t like using the get() call in there.
Each get() call performs a read action, and therefore affects my bill at the end of the month, see documentation.
This might become a problem if the app I'm building becomes a success; the document reads are of course minimal, but performing one every time a user opens a conversation seems a bit inefficient. I really like the subcollection solution in my model, but I'm not sure how to implement the rules efficiently here.
I'm open to any data model change; my goal is to evaluate the rules without these get() calls. Any idea is very welcome.
Honestly, I think you're okay with your structure and get call as-is. Here's why:
If you're fetching a bunch of documents in a subcollection, Cloud Firestore is usually smart enough to cache values as needed. For example, if you were to ask to fetch all 200 items in "conversations/chat_abc/messages", Cloud Firestore would only perform that get operation once and re-use it for the entire batch operation. So you'll end up with 201 reads, and not 400.
As a general philosophy, I'm not a fan of optimizing for pricing in your security rules. Yes, you can end up with one or two extra reads per operation, but it's probably not going to cause you trouble the same way, say, a poorly written Cloud Function might. Those are the areas where you're better off optimizing.
If you want to save those extra reads, you can actually implement a "cache" based on custom claims.
You can, for example, save the chats the user has access to in the custom claims under an object called "conversations". Keep in mind that custom claims have a limit of 1000 bytes, as mentioned in the documentation.
One workaround to the limit is to just save the most recent conversations in the custom claims, like the top 50. Then in the security rules you can do this:
allow read, write: if request.auth.token.conversations[conversationId] || get(/databases/$(database)/documents/conversations/$(conversationId)).data.members[(request.auth.uid)] == true;
This is especially convenient if you're already using Cloud Functions to moderate messages after they are posted; all you need to do is update the custom claims as well.
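A rough sketch of what that Cloud Function side could look like (the function name and trigger are illustrative, not from the question):
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

// When a conversation is created, mirror its id into each member's custom
// claims so the rules can check the "cache" before falling back to get().
exports.cacheConversationClaim = functions.firestore
  .document('conversations/{conversationId}')
  .onCreate(async (snap, context) => {
    const conversationId = context.params.conversationId;
    const memberIds = Object.keys(snap.data().members || {});

    await Promise.all(memberIds.map(async (uid) => {
      const user = await admin.auth().getUser(uid);
      const claims = user.customClaims || {};
      const conversations = claims.conversations || {};
      conversations[conversationId] = true;
      // Custom claims are capped at 1000 bytes, so trim old entries here if needed.
      await admin.auth().setCustomUserClaims(uid, { ...claims, conversations });
    }));
  });
Keep in mind that the client has to refresh its ID token before the new claim shows up in request.auth.token.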

Azure Cache/DataCache style Regions in Redis

I am in the planning process of moving a C# ASP.Net web application over to Azure (currently hosted on a single dedicated server) and am looking at caching options. Currently, because we only have one instance of the application running at one time, we have an 'in process' memory cache to relieve the SQL DB of some identical requests.
The process at the moment is to clear certain parts of the cache when the managers/services make a change to those parts of the database. For example, we have a users table and keys like "User.{0}" (returning a single User record/object) and "Users.ForeignKey.{0}" (returning all users related to that foreign key). If we update a single user record, we remove the "User.1" key (if the userId = 1) and, for ease, all of the list collections, as they could have changed. We do this by removing keys by pattern, which means that only the affected keys are removed and all others persist.
We've been planning this move to Azure for a while now, and when we first started looking at everything the Azure Redis Cache service wasn't available (or at least not supported), so we looked at the Azure Cache service based on AppFabric. Using that, we decided we would use DataCache regions to separate the different object types and then just flush the region that was affected; not quite as precise as our current method, but OK. Now that Redis has come onto the scene, we've been looking at it and would prefer to use it if possible. However, it seems that to achieve the same thing we would have to have a separate Redis cache for each 'region'/section, which, as I understand it, would mean paying for lots of small instances of the Azure Redis Cache service; that would cost quite a lot given that we would need 10+ separately flushable sections of the cache.
Does anyone know how to achieve something similar to Azure DataCache regions with Redis, or can you suggest something glaringly obvious that I'm probably missing?
Sorry for such a long question/explanation but I found it difficult to explain what I'm trying to achieve without background/context.
Thanks,
Gareth
Update:
I've found a few command-line approaches that can do the job of deleting keys by pattern, including using the KEYS command here and a Lua script via the EVAL command here.
I'm planning on using the StackExchange.Redis client; does anyone know how to use these commands, or alternatives to them, to delete keys by pattern when using StackExchange.Redis?
Thanks for reading, Gareth
You can use this method, which leverages async/await and Redis pipelining to delete keys by pattern using the StackExchange.Redis client:
private static Task DeleteKeysByPatternAsync(string pattern)
{
    IDatabase cache1 = Connection.GetDatabase();
    var redisServer1 = Connection.GetServer(Connection.GetEndPoints().First());
    var deleteTasks = new List<Task>();
    var counter = 0;

    foreach (var key in redisServer1.Keys(pattern: pattern, database: 0, pageSize: 5000))
    {
        deleteTasks.Add(cache1.KeyDeleteAsync(key));
        counter++;
        if (counter % 1000 == 0)
            Console.WriteLine($"Delete key tasks created: {counter}");
    }

    return Task.WhenAll(deleteTasks);
}
Then you can use it like this:
DeleteKeysByPatternAsync("*user:*").Wait(); // If you are calling from a Main method, for example, where you can't use await.
or
await DeleteKeysByPatternAsync("*user:*"); //If you run from async method
You can tweak the pageSize or receive it as a method parameter.
From what I understand of your question, you need to group your data according to some criteria (the user, in your case), so that whenever a record related to that criteria changes, all data related to it is also invalidated in the cache with a single cache API call.
You can achieve this in Azure using NCache for Azure, a distributed caching solution by Alachisoft that has a rich feature set along with multiple caching topologies.
NCache provides multiple ways to perform this type of operation. One suitable for your use case is the data grouping feature, which allows you to group data into groups/subgroups when it is added. Data can later be fetched or removed on the basis of those groups/subgroups.
NCache also lets you add tags to items as they are added. These tags can then be used to remove or fetch all data carrying one or more specified tags. The querying feature (delete queries) provided by NCache can also be used to remove data satisfying particular criteria.

Firebase filter records using the REST API (active vs inactive)

My company is considering using Firebase for a particular project dealing with massive dynamic forms. Some flags have been raised about the Firebase service that I would like to clear up before we start.
First off I've read through the blog posts here:
https://www.firebase.com/blog/2013-10-01-queries-part-one.html
https://www.firebase.com/blog/2014-01-02-queries-part-two.html
The basic queries they outline are pretty self-explanatory and I believe will be enough for the project's reporting inquiries; however, is this only available in the Web API? Is there a way to search or get filtered data from Firebase via the REST API, or is it up to the developers here to filter through a bunch of data?
My problem stems from a couple of models that need to show reports based on some of the data we plan to offload to Firebase.
For example:
Site (Hosted internally)
activeSurveyId: *firebase survey id*
Survey (Hosted through Firebase)
siteId: *site this firebase survey belongs to*
status: "In Progress"
...
If I just wanted the survey for a site, that's simple; however, if I wanted to compile a report of all surveys with the status of "In Progress", is there a simple way to do that from the REST API?
I'd love to use this product as it's quite slick. We just need to make sure it meets our requirements without doing a bunch of extra work on our end for filtering / searching.
As is the case with a lot of questions about querying in Firebase (and other NoSQL databases), the answer is to avoid the need for complex queries: structure your data to reduce or eliminate the need for queries.
Firebase has a blog entry about this, Denormalizing Your Data is Normal, which I recommend you read.
To resolve your specific issue, your best bet is to store the surveys that are in progress in a different root node than those that are completed. Your structure might look something like this:
"surveys": {
"in-progress": {
"-JTICjjZfm8faeMo11FF": { ... data for one survey ... },
...
}
"completed": {
"-JTICdawddwMo11DWCwd": { ... data for one survey ... },
...
}
}
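With that structure, the report no longer needs a query at all; a single REST GET on the in-progress node returns everything. A minimal sketch is below (the database URL is a placeholder, and whether the call needs an auth token depends on your security rules):
// Illustrative only: fetch every in-progress survey in one REST call.
const url = 'https://<your-firebase-db>.firebaseio.com/surveys/in-progress.json';

fetch(url)
  .then((response) => response.json())
  .then((surveys) => {
    // surveys is an object keyed by push id, e.g. "-JTICjjZfm8faeMo11FF"
    Object.entries(surveys || {}).forEach(([id, survey]) => {
      console.log(id, survey.siteId, survey.status);
    });
  });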
