Firebase data structure - is the Firefeed structure relevant? - firebase

Firefeed is a very nice example of what can be achieved with Firebase - a fully client side Twitter clone. So there is this page : https://firefeed.io/about.html where the logic behind the adopted data structure is explained. It helps a lot to understand Firebase security rules.
By the end of the demo, there is this snippet of code :
var userid = info.id; // info is from the login() call earlier.
var sparkRef = firebase.child("sparks").push();
var sparkRefId = sparkRef.name();
// Add spark to global list.
sparkRef.set(spark);
// Add spark ID to user's list of posted sparks.
var currentUser = firebase.child("users").child(userid);
currentUser.child("sparks").child(sparkRefId).set(true);
// Add spark ID to the feed of everyone following this user.
currentUser.child("followers").once("value", function(list) {
list.forEach(function(follower) {
var childRef = firebase.child("users").child(follower.name());
childRef.child("feed").child(sparkRefId).set(true);
});
});
It's showing how the writing is done in order to keep the read simple - as stated :
When we need to display the feed for a particular user, we only need to look in a single place
So I do understand that. But if we take a look at Twitter, we can see that some accounts has several millions followers (most followed is Katy Perry with over 61 millions !). What would happen with this structure and this approach ? Whenever Katy would post a new tweet, it would make 61 millions Write operations. Wouldn't this simply kill the app ? And even more, isn't it consuming a lot of unnecessary space ?

With denormalized data, the only way to connect data is to write to every location its read from. So yeah, to publish a tweet to 61 million followers would require 61 million writes.
You wouldn't do this in the browser. The server would listen for child_added events for new tweets, and then a cluster of workers would split up the load paginating a subset of followers at a time. You could potentially prioritize online users to get writes first.
With normalized data, you write the tweet once, but pay for the join on reads. If you cache the tweets in feeds to avoid hitting the database for each request, you're back to 61 million writes to redis for every Katy Perry tweet. To push the tweet in real time, you need to write the tweet to a socket for every online follower anyway.

Related

How to do pattern searching in fire base real time DB [duplicate]

I am using firebase for data storage. The data structure is like this:
products:{
product1:{
name:"chocolate",
}
product2:{
name:"chochocho",
}
}
I want to perform an auto complete operation for this data, and normally i write the query like this:
"select name from PRODUCTS where productname LIKE '%" + keyword + "%'";
So, for my situation, for example, if user types "cho", i need to bring both "chocolate" and "chochocho" as result. I thought about bringing all data under "products" block, and then do the query at the client, but this may need a lot of memory for a big database. So, how can i perform sql LIKE operation?
Thanks
Update: With the release of Cloud Functions for Firebase, there's another elegant way to do this as well by linking Firebase to Algolia via Functions. The tradeoff here is that the Functions/Algolia is pretty much zero maintenance, but probably at increased cost over roll-your-own in Node.
There are no content searches in Firebase at present. Many of the more common search scenarios, such as searching by attribute will be baked into Firebase as the API continues to expand.
In the meantime, it's certainly possible to grow your own. However, searching is a vast topic (think creating a real-time data store vast), greatly underestimated, and a critical feature of your application--not one you want to ad hoc or even depend on someone like Firebase to provide on your behalf. So it's typically simpler to employ a scalable third party tool to handle indexing, searching, tag/pattern matching, fuzzy logic, weighted rankings, et al.
The Firebase blog features a blog post on indexing with ElasticSearch which outlines a straightforward approach to integrating a quick, but extremely powerful, search engine into your Firebase backend.
Essentially, it's done in two steps. Monitor the data and index it:
var Firebase = require('firebase');
var ElasticClient = require('elasticsearchclient')
// initialize our ElasticSearch API
var client = new ElasticClient({ host: 'localhost', port: 9200 });
// listen for changes to Firebase data
var fb = new Firebase('<INSTANCE>.firebaseio.com/widgets');
fb.on('child_added', createOrUpdateIndex);
fb.on('child_changed', createOrUpdateIndex);
fb.on('child_removed', removeIndex);
function createOrUpdateIndex(snap) {
client.index(this.index, this.type, snap.val(), snap.name())
.on('data', function(data) { console.log('indexed ', snap.name()); })
.on('error', function(err) { /* handle errors */ });
}
function removeIndex(snap) {
client.deleteDocument(this.index, this.type, snap.name(), function(error, data) {
if( error ) console.error('failed to delete', snap.name(), error);
else console.log('deleted', snap.name());
});
}
Query the index when you want to do a search:
<script src="elastic.min.js"></script>
<script src="elastic-jquery-client.min.js"></script>
<script>
ejs.client = ejs.jQueryClient('http://localhost:9200');
client.search({
index: 'firebase',
type: 'widget',
body: ejs.Request().query(ejs.MatchQuery('title', 'foo'))
}, function (error, response) {
// handle response
});
</script>
There's an example, and a third party lib to simplify integration, here.
I believe you can do :
admin
.database()
.ref('/vals')
.orderByChild('name')
.startAt('cho')
.endAt("cho\uf8ff")
.once('value')
.then(c => res.send(c.val()));
this will find vals whose name are starting with cho.
source
The elastic search solution basically binds to add set del and offers a get by wich you can accomplish text searches.
It then saves the contents in mongodb.
While I love and reccomand elastic search for the maturity of the project, the same can be done without another server, using only the firebase database.
That's what I mean:
(https://github.com/metaschema/oxyzen)
for the indexing part basically the function:
JSON stringifies a document.
removes all the property names and JSON to leave only the data
(regex).
removes all xml tags (therefore also html) and attributes (remember
old guidance, "data should not be in xml attributes") to leave only
the pure text if xml or html was present.
removes all special chars and substitute with space (regex)
substitutes all instances of multiple spaces with one space (regex)
splits to spaces and cycles:
for each word adds refs to the document in some index structure in
your db tha basically contains childs named with words with childs
named with an escaped version of "ref/inthedatabase/dockey"
then inserts the document as a normal firebase application would do
in the oxyzen implementation, subsequent updates of the document ACTUALLY reads the index and updates it, removing the words that don't match anymore, and adding the new ones.
subsequent searches of words can directly find documents in the words child. multiple words searches are implemented using hits
SQL"LIKE" operation on firebase is possible
let node = await db.ref('yourPath').orderByChild('yourKey').startAt('!').endAt('SUBSTRING\uf8ff').once('value');
This query work for me, it look like the below statement in MySQL
select * from StoreAds where University Like %ps%;
query = database.getReference().child("StoreAds").orderByChild("University").startAt("ps").endAt("\uf8ff");

How to create an alert to notify an user when some amount % of threshold reached DailyAsyncApex Executions

On 2 occasions in the past month, we have managed to hit our daily limit on asynchronous apex executions. Salesforce temporarily increased our limit to 425000 but it will be scaled down to 250000 in a week's time. Once we reach the limit, a lot of the SF functions will fail and this has tremendously impacted both internal staff and external customers.
So to prevent this from happening in the future, we need to create some kind of alert in Salesforce to monitor our daily asynchronous apex method executions. Our maximum daily limit is 250000. The alert will need to create a P3 helpdesk ticket and notify couple of users say USER A and USER B once it reaches 70% threshold.
Kindly advise what is possible to achieve the same
Thanks & Regards,
Harjeet
There's a promising Limits method but it doesn't seem to work currently ("reserved for future use"): System.debug(Limits.getAsyncCalls() + ' / ' + Limits.getLimitAsyncCalls());
There's an idea you can upvote: https://success.salesforce.com/ideaView?id=0873A0000003VIFQA2 ;)
You could query SELECT COUNT() FROM AsyncApexJob WHERE ... but that sounds like a bad idea ;)
I think your best course of action is to use SF REST API. There's a "limits" resource you can fetch. You could do it from SF itself (bad idea because if you'd schedule it to run every hour then well, of course it will contribute to the limit consumption too ;)) or from some external app that'd connect to your SF...
https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/dome_limits.htm
You can quickly try it out for example in workbench.developerforce.com before you decide you do want to deep dive into coding it.
Of course if you have control over your batch jobs, queuable, schedulable & #future calls you could implement some rough counter of executions in a helper object for example... won't help you much if most of the jobs are coming from managed packages though...
Got 1 more idea but it's pretty hardcore - you should be able to make a REST API call from javascript. so you could create a simple VF page (even without any apex controller), put JS callout on it, have it check every 5 mins and do something if threshold is hit... But that means IT person would have to have this page open all the time (perhaps as a home page component)... Messy :)
I was having the exact same issue so I created a simple JsForce script in NodeJS to monitor the call to the /limits endpoint.
You can connect a Free Monitoring service like UpTimerobot.com or PingDom.com and get an email when you find the Word "Warning" >50% or "Error" > 80%.
async function getSfLimits() {
try {
//Let's login into salesforce
const login = await conn.login(SF_USERNAME, SF_PASSWORD+SF_SECURITY_TOKEN);
//Call the API
const sfLimits = await conn.requestGet('/services/data/v51.0/limits');
return sfLimits;
} catch(err) {
console.log(err);
}
}
https://github.com/carlosdevia/salesforcelimits

Having quite a lot of issues when write massive record to firebase database

From my last question let me decided to write huge amount of data to
firebase databases for testing purpose.
Here is the outcome
1000 record: Nothing significant happen it work fine.
10000 record: Response from all other read operation from firebase return only after the write operation complete.
100000 record : Same as the result of 10000 record but took more longer, I can't perform any firebase operation unless force close the app and reopen it. The screen started hang for some time might because I perform loop in main thread(ios).
1 million record : I'm afraid so never try.
The reason I need to write such amount of data is because I building some social app(android,ios,web) it use SQL before but I think it is time to switch to firebase. By studying this I having the idea on how to build a user feed without using the IN clause. The data structure look like this
users
user1
name: bob
user2
name: alice
follows:
user1: true
posts
post1
author: user1
text: 'Hi there'
feeds
user2
post1: true
As the example If one of the user having 61 million follower it will need to insert record to 61 million feeds/$uid/. Which the write operation barely survive with 100k. On this link it suggest not to do it in the client side but big point of firebase is it is backendless how I suppose to write beside from client side.
So my question is there any efficient way to achieve on how to not get other read operation interupt by this kind of massive write operation? Or there is way better data modeling for this?
I really need help. I really apperciated even just a comment.
Lets start with an example Firebase structure using observers and events to capture feeds from specific users.
We have a users node which stores data about all of the users, in this case just their name. We also have a feeds node which stores the feeds. In this case we could also call it messages as we are just storing messages users create.
users
uid_0
name: "Scott"
uid_1
name: "Frank"
uid_2
name: "Leroy"
feeds
feed_0
uid: uid_1
msg: "some message from uid_1"
feed_1
uid: uid_2
msg: "a message from uid_2"
Assume that Scott (uid_0) wants to 'subscribe' to any feeds from Frank and Leroy.
func addSomeFeeds() {
self.addFeed(uidFeed: "uid_1")
self.addFeed(uidFeed: "uid_2")
}
func addFeed(uidFeed: String) {
let feedsRef = self.ref.child("feeds")
let feedQuery = feedsRef.queryOrdered(byChild: "uid").queryEqual(toValue: uidFeed)
feedQuery.observe(.childAdded, with: { snapshot in
let feedDict = snapshot.value as! [String: Any]
let msg = feedDict["msg"] as! String
print(msg)
})
}
and the output
some message from uid_1
a message from uid_2
and then if uid_2 adds another feed (with a message)
Feed From uid_2
The above code is run on Scott's device and attaches an observer to the Feeds node that 'watches' for any feeds added by Frank or Leroy.
This could expanded and watch for changes via .childChanged so if the feed has multiple 'posts' in it any time a post is updated in a particular feed, Scott's app will be notified of the changes within that feed.
Another option is to add a more generic observer to the feeds node wheras the app would be notified for any feed added, changed or removed. In that case, when the app receives a snapshot of the feed, simply compare it to an array of feeds the user is interested in and if doesn't match one of those, ignore it, otherwise, notify the user.

Most efficient method of pulling in weather data for multiple locations

I'm working on a meteor mobile app that displays information about local places of interest and one of the things that I want to show is the weather in each location. I've currently got my locations stored with latlng coordinates and they're searchable by radius. I'd like to use the openweathermap api to pull in some useful 'current conditions' information so that when a user looks at an entry they can see basic weather data. Ideally I'd like to limit the number of outgoing requests to keep the pages snappy (and API requests down)
I'm wondering if I can create a server collection of weather data that I update regularly, server-side (hourly?) that my clients then query (perhaps using a mongo $near lookup?) - that way all of my data is being handled within meteor, rather than each client going out to grab the latest data from the API. I don't want to have to iterate through all of the locations in my list and do a separate call out to the api for each as I have approx. 400 locations(!). I'm afraid I'm new to API requests (and meteor itself) so apologies if this is a poorly phrased question.
I'm not entirely sure if this is doable, or if it's even the best approach - any advice (and links to any useful code snippets!) would be greatly appreciated!
EDIT / UPDATE!
OK I haven't managed to get this working yet but I some more useful details on the data!
If I make a request to the openweather API I can get data back for all of their locations (which I would like to add/update to a collection). I could then do regular lookup, instead of making a client request straight out to them every time a user looks at a location. The JSON data looks like this:
{
"message":"accurate",
"cod":"200",
"count":50,
"list":[
{
"id":2643076,
"name":"Marazion",
"coord":{
"lon":-5.47505,
"lat":50.125561
},
"main":{
"temp":292.15,
"pressure":1016,
"humidity":68,
"temp_min":292.15,
"temp_max":292.15
},
"dt":1403707800,
"wind":{
"speed":8.7,
"deg":110,
"gust":13.9
},
"sys":{
"country":""
},
"clouds":{
"all":75
},
"weather":[
{
"id":721,
"main":"Haze",
"description":"haze",
"icon":"50d"
}
]
}, ...
Ideally I'd like to build my own local 'weather' collection that I can search using mongo's $near (to keep outbound requests down, and speed up), but I don't know if this will be possible because the format that the data comes back in - I think I'd need to structure my location data like this in order to use a geo search:
"location": {
"type": "Point",
"coordinates": [-5.47505,50.125561]
}
My questions are:
How can I build that collection (I've seen this - could I do something similar and update existing entries in the collection on a regular basis?)
Does it just need to live on the server, or client too?
Do I need to manipulate the data in order to get a geo search to work?
Is this even the right way to approach it??
EDIT/UPDATE2
Is this question too long/much? It feels like it. Maybe I should split it out.
Yes easily possible. Because your question is so large I'll give you a high level explanation of what I think you need to do.
You need to create a collection where you're gonna save the weather data in.
A request worker that requests new data and updates the collection on a set interval. Use something like cron-tick for scheduling the interval.
Requesting data should only happen server side and I can recommend the request npm package for that.
Meteor.publish the weather collection and have the client subscribe to that, with optionally a filter for it's location.
You should now be getting the weather data on your client and should be able to get freaky with it.

Limit reading children from a location through client side

var commentsRef = new Firebase('https://test.firebaseio.com/comments');
var last10Comments = commentsRef.limit(10);
//Rendering last 10 comments
last10Comments.on('child_added', function (snapshot) {
});
From the client side a user can change the limit number and can render all comments from comments reference.
Is there any way to restrict reading limit to some number at any point of time for a location?
No, there isn't currently a way to put Firebase security rules around that type of limiting of data. Another approach that would work would be to have another section of the tree that contains a denormalized portion of the data that just contains the last 10 comments and nothing more.
Thanks for bringing this up. I've added this to our internal tracker to keep it in mind when we design V2 of our security API.

Resources