Firebase Realtime Database is taking a long time to load my data. What can I do to optimize the loading? Also, are there other places where I can store the data besides Firebase? The data is 3.8 MB in size and has the following structure:
{"10-happier": {body: "test"}, "zero-to-one": {body: "test2"}}
Here's my code
var defer = Q.defer();
app.database().ref('content').once('value').then(snapshot => {
  if (snapshot && snapshot.val()) {
    defer.resolve(snapshot.val());
  } else {
    defer.resolve({});
  }
}).catch(error => {
  defer.reject(error);
});
return defer.promise;
There's nothing you can do to optimize a query like this. When you fetch an entire node:
app.database().ref('content').once('value')
The SDK will download everything, and it will take as long as it takes to get the whole thing. The total performance is going to be determined by the speed of the client's connection to the server. If you want the query to be faster, your only viable option is to get a faster connection to Realtime Database.
Alternatively, you can bypass the use of a database altogether and use a different method of storage that involves some form of compression or CDN to deliver the content more efficiently for the end user. Since recommendations for software and services are off-topic for Stack Overflow, you will have to do some research to figure out what your options are and what will work best for your specific situation.
Duplicate of: Firebase storage URL keeps changing with new token
When a user uploads a profile pic I store this in firebase storage with the file name as the uid.
Let's say the user then makes, say, 100 posts and 500 comments and then updates their profile image.
Currently I have a trigger which goes and updates the profile image url in all of the post and comment documents. The reason I have to do this is that when the image is changed in storage the access token is changed and this is part of the url so the old url no longer works.
What I want to do is not have the access token change. If I can do this I can avoid the mass updates that will massively increase my firestore writes.
Is there any way to do this? or an alternative?
Edit:
Another solution if you don't mind making the file public.
Add this storage rule and you won't have to use a token to access the file.
This will allow read access to "mydir" globally in any subfolder.
match /{path=**}/mydir/{doc} {
allow read: if true;
}
There are only two options here:
You store the profile image URL only once, probably in the user's profile document, and look it up every time it is needed. In return you only have to write it once.
You store the profile image URL in every post, in which case you only have to load the post documents and not the profile URL for each. In return, you have to write the profile URL into each post document and update it whenever it changes.
For smaller networks the former is more common: you're more likely to see multiple posts from the same user, so you amortize the cost of the extra lookup over multiple posts.
The bigger the network of users, the more interesting the second approach becomes, as you'll care about read performance and simplicity more than the writes you're focusing on right now.
In the end, there's no singular right answer here though. You'll have to decide for yourself what performance and cost profile you want your app to have.
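To make the "amortized lookup" of the first option concrete, here is a minimal sketch of a per-user cache. It assumes a function `fetchAvatarUrl(uid)` that returns a promise for the URL (for example by reading the user's profile document); that function and the names below are hypothetical, not part of any Firebase API.

```javascript
// Caches one in-flight or completed lookup per user id, so a feed with many
// posts by the same author triggers only a single read for that author.
// `fetchAvatarUrl` is a hypothetical function: (uid) => Promise<string>.
function createAvatarCache(fetchAvatarUrl) {
  const cache = new Map();
  return function getAvatarUrl(uid) {
    if (!cache.has(uid)) {
      const pending = fetchAvatarUrl(uid).catch((err) => {
        cache.delete(uid); // allow a retry after a failed lookup
        throw err;
      });
      cache.set(uid, pending);
    }
    return cache.get(uid);
  };
}

// Attaches an avatar URL to every post, reusing cached lookups per author.
async function resolveFeedAvatars(posts, getAvatarUrl) {
  return Promise.all(
    posts.map(async (post) => ({
      ...post,
      avatarUrl: await getAvatarUrl(post.authorId),
    }))
  );
}
```

With this in place, a feed of 50 posts by 5 authors costs 5 profile reads rather than 50, and changing an avatar still costs a single write.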
Answer provided by @Prodigy here: https://stackoverflow.com/a/64129850/10222449
I tried this and it works well.
This will save millions of writes.
var storage = firebase.storage();
var pathReference = storage.ref('users/' + userId + '/avatar.jpg');
pathReference.getDownloadURL().then(function (url) {
  $("#large-avatar").attr('src', url);
}).catch(function (error) {
  // Handle any errors
});
I have to delete multiple fields in different locations.
I have millions of "messages" with multiple fields like:
messagesId: {
text: "...",
datetime: "...",
unusedField: "...", <-- REMOVE
...
}, ...
To save storage space, I want to delete old, unused fields from each message. That would free tens of gigabytes and save real money (we currently pay hundreds of dollars for storage beyond the first GB that Firebase includes for free).
Problem #1 - Database peak: deleting a huge amount of data programmatically pushes the database load to 100%, temporarily blocking it.
To solve this, Firebase support suggested using the CLI (firebase database:remove <path>).
Once I've listed millions of lines:
firebase database:remove /root/messages/messageid_1/field --confirm &&
firebase database:remove /root/messages/messageid_2/field --confirm &&
...
even considering a few milliseconds of execution for each line, the overall execution may take an unacceptably long time (days).
Problem #2 - Delete locally and re-upload the DB: another solution suggested, is to download the entire database, remove the json paths and re-upload it.
Currently, the entire database weighs 60GB.
Is it possible to reload the entire database from the Firebase console? (Given the fact that I would have to suspend any writes in the meantime, to avoid data loss)
Are there any other possible solutions?
The common path for this is:
Enable automated backups for your database, and download the JSON from the Storage bucket.
Process the JSON locally, and determine the exact path of all nodes to remove.
Process the paths in reasonably sized chunks through the API, using multi-location updates.
Removing each chunk of nodes would be something like:
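var nodesToRemove = ["/root/messages/messageid_1/field", "/root/messages/messageid_2/field"];
// update() expects a single object that maps each path to its new value,
// so collect all paths into one object; a null value deletes the node.
var updates = {};
nodesToRemove.forEach(function(path) {
  updates[path] = null;
});
firebase.database().ref().update(updates);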
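The "reasonably sized chunks" step could be sketched roughly as follows. The helper names and the `applyUpdate` callback are assumptions for illustration; in a real script the callback would wrap `firebase.database().ref().update(...)`, and the right chunk size has to be found by watching the database load.

```javascript
// Splits a list of paths into multi-location update objects of at most
// `chunkSize` paths each. Setting a path to null deletes that node.
function buildRemovalChunks(paths, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < paths.length; i += chunkSize) {
    const updates = {};
    for (const path of paths.slice(i, i + chunkSize)) {
      updates[path] = null;
    }
    chunks.push(updates);
  }
  return chunks;
}

// Applies the chunks one after another. `applyUpdate` is a hypothetical
// callback (updates) => Promise, for example
// (u) => firebase.database().ref().update(u).
async function removeInChunks(paths, chunkSize, applyUpdate) {
  for (const updates of buildRemovalChunks(paths, chunkSize)) {
    await applyUpdate(updates); // sequential, to avoid a load spike
  }
}
```

Running the chunks sequentially rather than in parallel is a deliberate choice here: the point of chunking is to keep the database below its peak, so each update waits for the previous one to finish.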
I'm seeing slow performance with Firestore when retrieving basic data stored in a document: it is roughly ten times slower than the Realtime Database.
Using Firestore, it takes an average of 3000 ms on the first call
this.db.collection('testCol')
  .doc('testDoc')
  .valueChanges().forEach((data) => {
    console.log(data); // 3000 ms later
  });
Using the realtime database, it takes an average of 300 ms on the first call
this.db.database.ref('/test').once('value').then(data => {
  console.log(data); // 300 ms later
});
This is a screenshot of the network console :
I'm running the Javascript SDK v4.50 with AngularFire2 v5.0 rc.2.
Has anyone else experienced this issue?
UPDATE: 12th Feb 2018 - iOS Firestore SDK v0.10.0
Similar to some other commenters, I've also noticed a slower response on the first get request (with subsequent requests taking ~100ms). For me it's not as bad as 30s, but maybe around 2-3s when I have good connectivity, which is enough to provide a bad user experience when my app starts up.
Firebase have advised that they're aware of this "cold start" issue and they're working on a long term fix for it - no ETA unfortunately. I think it's a separate issue that when I have poor connectivity, it can take ages (over 30s) before get requests decide to read from cache.
Whilst Firebase fix all these issues, I've started using the new disableNetwork() and enableNetwork() methods (available in Firestore v0.10.0) to manually control the online/offline state of Firebase. Though I've had to be very careful where I use it in my code, as there's a Firestore bug that can cause a crash under certain scenarios.
UPDATE: 15th Nov 2017 - iOS Firestore SDK v0.9.2
It seems the slow performance issue has now been fixed. I've re-run the tests described below and the time it takes for Firestore to return the 100 documents now seems to be consistently around 100ms.
Not sure if this was a fix in the latest SDK v0.9.2 or if it was a backend fix (or both), but I suggest everyone updates their Firebase pods. My app is noticeably more responsive - similar to the way it was on the Realtime DB.
I've also discovered Firestore to be much slower than Realtime DB, especially when reading from lots of documents.
Updated tests (with latest iOS Firestore SDK v0.9.0):
I set up a test project in iOS Swift using both RTDB and Firestore and ran 100 sequential read operations on each. For the RTDB, I tested the observeSingleEvent and observe methods on each of the 100 top level nodes. For Firestore, I used the getDocument and addSnapshotListener methods at each of the 100 documents in the TestCol collection. I ran the tests with disk persistence on and off. Please refer to the attached image, which shows the data structure for each database.
I ran the test 10 times for each database on the same device and a stable wifi network. Existing observers and listeners were destroyed before each new run.
Realtime DB observeSingleEvent method:
func rtdbObserveSingle() {
    let start = UInt64(floor(Date().timeIntervalSince1970 * 1000))
    print("Started reading from RTDB at: \(start)")
    for i in 1...100 {
        Database.database().reference().child(String(i)).observeSingleEvent(of: .value) { snapshot in
            let time = UInt64(floor(Date().timeIntervalSince1970 * 1000))
            let data = snapshot.value as? [String: String] ?? [:]
            print("Data: \(data). Returned at: \(time)")
        }
    }
}
Realtime DB observe method:
func rtdbObserve() {
    let start = UInt64(floor(Date().timeIntervalSince1970 * 1000))
    print("Started reading from RTDB at: \(start)")
    for i in 1...100 {
        Database.database().reference().child(String(i)).observe(.value) { snapshot in
            let time = UInt64(floor(Date().timeIntervalSince1970 * 1000))
            let data = snapshot.value as? [String: String] ?? [:]
            print("Data: \(data). Returned at: \(time)")
        }
    }
}
Firestore getDocument method:
func fsGetDocument() {
    let start = UInt64(floor(Date().timeIntervalSince1970 * 1000))
    print("Started reading from FS at: \(start)")
    for i in 1...100 {
        Firestore.firestore().collection("TestCol").document(String(i)).getDocument() { document, error in
            let time = UInt64(floor(Date().timeIntervalSince1970 * 1000))
            guard let document = document, document.exists && error == nil else {
                print("Error: \(error?.localizedDescription ?? "nil"). Returned at: \(time)")
                return
            }
            let data = document.data() as? [String: String] ?? [:]
            print("Data: \(data). Returned at: \(time)")
        }
    }
}
Firestore addSnapshotListener method:
func fsAddSnapshotListener() {
    let start = UInt64(floor(Date().timeIntervalSince1970 * 1000))
    print("Started reading from FS at: \(start)")
    for i in 1...100 {
        Firestore.firestore().collection("TestCol").document(String(i)).addSnapshotListener() { document, error in
            let time = UInt64(floor(Date().timeIntervalSince1970 * 1000))
            guard let document = document, document.exists && error == nil else {
                print("Error: \(error?.localizedDescription ?? "nil"). Returned at: \(time)")
                return
            }
            let data = document.data() as? [String: String] ?? [:]
            print("Data: \(data). Returned at: \(time)")
        }
    }
}
Each method essentially prints the unix timestamp in milliseconds when the method starts executing and then prints another unix timestamp when each read operation returns. I took the difference between the initial timestamp and the last timestamp to return.
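The same measurement (time from the start until the last read returns) can be reproduced outside Swift. Below is a rough JavaScript sketch of the idea; `readOne` is a hypothetical function that performs one read and returns a promise, and the clock is injectable only so the logic can be checked without a network.

```javascript
// Starts `count` reads at once and returns the time between the start and
// the moment the last read returned (successfully or with an error).
// `readOne` is a hypothetical function: (i) => Promise.
// `now` defaults to Date.now and can be replaced with a fake clock.
async function timeReads(count, readOne, now = Date.now) {
  const start = now();
  let lastReturn = start;
  const record = () => {
    lastReturn = now();
  };
  const reads = [];
  for (let i = 1; i <= count; i++) {
    // Record the timestamp on success and on failure alike.
    reads.push(readOne(i).then(record, record));
  }
  await Promise.all(reads);
  return lastReturn - start;
}
```

Usage would look like `timeReads(100, (i) => db.collection('TestCol').doc(String(i)).get())`, where `db` is assumed to be an initialized Firestore instance.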
RESULTS - Disk persistence disabled:
RESULTS - Disk persistence enabled:
Data Structure:
When the Firestore getDocument / addSnapshotListener methods get stuck, it seems to get stuck for durations that are roughly multiples of 30 seconds. Perhaps this could help the Firebase team isolate where in the SDK it's getting stuck?
Update Date March 02, 2018
It looks like this is a known issue and the engineers at Firestore are working on a fix. After a few email exchanges and code sharing with a Firestore engineer on this issue, this was his response as of today.
"You are actually correct. Upon further checking, this slowness on getDocuments() API is a known behavior in Cloud Firestore beta. Our engineers are aware of this performance issue tagged as "cold starts", but don't worry as we are doing our best to improve Firestore query performance.
We are already working on a long-term fix but I can't share any timelines or specifics at the moment. While Firestore is still on beta, expect that there will be more improvements to come."
So hopefully this will get knocked out soon.
Using Swift / iOS
After dealing with this for about 3 days it seems the issue is definitely the get() ie .getDocuments and .getDocument. Things I thought were causing the extreme yet intermittent delays but don't appear to be the case:
Not so great network connectivity
Repeated calls via looping over .getDocument()
Chaining get() calls
Firestore Cold starting
Fetching multiple documents (Fetching 1 small doc caused 20sec delays)
Caching (I disabled offline persistence but this did nothing.)
I was able to rule all of these out as I noticed this issue didn't happen with every Firestore database call I was making. Only retrievals using get(). For kicks I replaced .getDocument with .addSnapshotListener to retrieve my data and voila. Instant retrieval each time including the first call. No cold starts. So far no issues with the .addSnapshotListener, only getDocument(s).
For now, I'm simply dropping the .getDocument() where time is of the essence and replacing it with .addSnapshotListener then using
for document in querySnapshot!.documents {
    // do some magical unicorn stuff here with my document.data()
}
... in order to keep moving until this gets worked out by Firestore.
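For anyone trying the same workaround from the JavaScript SDK, a one-shot read built on a listener might look like the sketch below. It relies only on the listener shape "subscribe with two callbacks, get an unsubscribe function back", which is what `docRef.onSnapshot(onNext, onError)` provides; the `firstSnapshot` helper itself is hypothetical, and whether it avoids the delay in your setup is something you would have to measure.

```javascript
// Resolves with the first snapshot a listener delivers, then detaches it.
// `subscribe` is a function (onNext, onError) => unsubscribe, for example
// (next, err) => docRef.onSnapshot(next, err).
function firstSnapshot(subscribe) {
  return new Promise((resolve, reject) => {
    let unsubscribe = null;
    let settled = false;
    const finish = (settle, value) => {
      if (settled) return; // ignore everything after the first event
      settled = true;
      if (unsubscribe) unsubscribe();
      settle(value);
    };
    unsubscribe = subscribe(
      (snapshot) => finish(resolve, snapshot),
      (error) => finish(reject, error)
    );
    // The listener may have fired synchronously, before `unsubscribe`
    // was assigned; detach it now in that case.
    if (settled) unsubscribe();
  });
}
```

Detaching the listener after the first event matters: otherwise every "get" would leave a live listener behind.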
Almost 3 years later, with Firestore well out of beta, I can confirm that this horrible problem still persists ;-(
On our mobile app we use the javascript / node.js firebase client. After a lot of testing to find out why our app's startup time is around 10sec we identified what to attribute 70% of that time to... Well, to firebase's and firestore's performance and cold start issues:
firebase.auth().onAuthStateChanged() fires approx. after 1.5 - 2sec, already quite bad.
If it returns a user, we use its ID to get the user document from firestore. This is the first call to firestore and the corresponding get() takes 4 - 5sec. Subsequent get() of the same or other documents take approx. 500ms.
So in total the user initialization takes 6 - 7 sec, completely unacceptable. And we can't do anything about it. We can't test disabling persistence, since in the javascript client there's no such option, persistence is always enabled by default, so not calling enablePersistence() won't change anything.
I had this issue until this morning. My Firestore query via iOS/Swift would take around 20 seconds to complete a simple, fully indexed query - with non-proportional query times for 1 item returned - all the way up to 3,000.
My solution was to disable offline data persistence. In my case, it didn't suit the needs of our Firestore database - which has large portions of its data updated every day.
iOS & Android users have this option enabled by default, whilst web users have it disabled by default. It makes Firestore seem insanely slow if you're querying a huge collection of documents. Basically it caches a copy of whichever data you're querying (and whichever collection you're querying - I believe it caches all documents within) which can lead to high Memory usage.
In my case, it caused a huge wait for every query until the device had cached the data required - hence the non-proportional query times for the increasing numbers of items to return from the exact same collection. This is because it took the same amount of time to cache the collection in each query.
Offline Data - from the Cloud Firestore Docs
I performed some benchmarking to display this effect (with offline persistence enabled) from the same queried collection, but with different amounts of items returned using the .limit parameter:
Now at 100 items returned (with offline persistence disabled), my query takes less than 1 second to complete.
My Firestore query code is below:
let db = Firestore.firestore()
self.date = Date()
let ref = db.collection("collection").whereField("Int", isEqualTo: SomeInt).order(by: "AnotherInt", descending: true).limit(to: 100)
ref.getDocuments() { (querySnapshot, err) in
    if let err = err {
        print("Error getting documents: \(err)")
    } else {
        for document in querySnapshot!.documents {
            let data = document.data()
            // Do things
        }
        print("QUERY DONE")
        let currentTime = Date()
        let components = Calendar.current.dateComponents([.second], from: self.date, to: currentTime)
        let seconds = components.second!
        print("Elapsed time for Firestore query -> \(seconds)s")
        // Benchmark result
    }
}
Well, from my own testing and research, using a Nexus 5X in the emulator and a real Android phone (Huawei P8), Firestore and Cloud Storage both give me the headache of a slow response on the first document.get() and the first storage.getDownloadUrl().
Each of those requests takes more than 60 seconds to respond. The slow response only happens on the real Android phone, not in the emulator, which is another strange thing.
After the first request, the rest are smooth.
Here is the simple code where I see the slow response:
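var dbuserref = dbFireStore.collection('user').where('email', '==', email);
const querySnapshot = await dbuserref.get();
// Take the first matching user document (assumes the query returned at least one).
const document = querySnapshot.docs[0];
var url = await defaultStorage.ref(document.data().image_path).getDownloadURL();
I also found a link to someone researching the same issue: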
https://reformatcode.com/code/android/firestore-document-get-performance
I am using firebase for data storage. The data structure is like this:
products: {
  product1: {
    name: "chocolate",
  },
  product2: {
    name: "chochocho",
  }
}
I want to perform an autocomplete operation on this data. Normally I would write the query like this:
"select name from PRODUCTS where productname LIKE '%" + keyword + "%'";
So, in my situation, if the user types "cho", I need to return both "chocolate" and "chochocho" as results. I thought about fetching all the data under the "products" node and then doing the query on the client, but that may need a lot of memory for a big database. So, how can I perform a SQL LIKE operation?
Thanks
Update: With the release of Cloud Functions for Firebase, there's another elegant way to do this as well by linking Firebase to Algolia via Functions. The tradeoff here is that the Functions/Algolia is pretty much zero maintenance, but probably at increased cost over roll-your-own in Node.
There are no content searches in Firebase at present. Many of the more common search scenarios, such as searching by attribute will be baked into Firebase as the API continues to expand.
In the meantime, it's certainly possible to grow your own. However, searching is a vast topic (think creating a real-time data store vast), greatly underestimated, and a critical feature of your application--not one you want to ad hoc or even depend on someone like Firebase to provide on your behalf. So it's typically simpler to employ a scalable third party tool to handle indexing, searching, tag/pattern matching, fuzzy logic, weighted rankings, et al.
The Firebase blog features a blog post on indexing with ElasticSearch which outlines a straightforward approach to integrating a quick, but extremely powerful, search engine into your Firebase backend.
Essentially, it's done in two steps. Monitor the data and index it:
var Firebase = require('firebase');
var ElasticClient = require('elasticsearchclient')
// initialize our ElasticSearch API
var client = new ElasticClient({ host: 'localhost', port: 9200 });
// listen for changes to Firebase data
var fb = new Firebase('<INSTANCE>.firebaseio.com/widgets');
fb.on('child_added', createOrUpdateIndex);
fb.on('child_changed', createOrUpdateIndex);
fb.on('child_removed', removeIndex);
function createOrUpdateIndex(snap) {
client.index(this.index, this.type, snap.val(), snap.name())
.on('data', function(data) { console.log('indexed ', snap.name()); })
.on('error', function(err) { /* handle errors */ });
}
function removeIndex(snap) {
client.deleteDocument(this.index, this.type, snap.name(), function(error, data) {
if( error ) console.error('failed to delete', snap.name(), error);
else console.log('deleted', snap.name());
});
}
Query the index when you want to do a search:
<script src="elastic.min.js"></script>
<script src="elastic-jquery-client.min.js"></script>
<script>
ejs.client = ejs.jQueryClient('http://localhost:9200');
client.search({
index: 'firebase',
type: 'widget',
body: ejs.Request().query(ejs.MatchQuery('title', 'foo'))
}, function (error, response) {
// handle response
});
</script>
There's an example, and a third party lib to simplify integration, here.
I believe you can do :
admin
.database()
.ref('/vals')
.orderByChild('name')
.startAt('cho')
.endAt("cho\uf8ff")
.once('value')
.then(c => res.send(c.val()));
This will find the entries under vals whose name starts with cho.
source
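To see why the startAt/endAt pair behaves as a prefix match and not as a substring match, here is a small in-memory imitation of the range filter. It is only a sketch: it uses JavaScript's own string comparison (by UTF-16 code unit), which I am assuming is close enough to the server's ordering for plain text like this, and the function name is made up.

```javascript
// Mimics orderByChild(...).startAt(prefix).endAt(prefix + "\uf8ff") on an
// in-memory list: keeps every value that sorts between the two bounds.
// "\uf8ff" is a very high code point, so prefix + "\uf8ff" sorts after
// nearly every string that begins with the prefix.
function filterByPrefixRange(values, prefix) {
  const lower = prefix;
  const upper = prefix + "\uf8ff";
  return values.filter((value) => value >= lower && value <= upper);
}

const names = ["chocolate", "chochocho", "cherry", "hot chocolate", "vanilla"];
console.log(filterByPrefixRange(names, "cho")); // [ 'chocolate', 'chochocho' ]
```

Note that "hot chocolate" is not returned even though it contains "cho": a range over sorted values can only express LIKE 'cho%', never LIKE '%cho%'.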
The Elasticsearch solution basically binds to add, set and delete events and offers a get through which you can perform text searches.
It then saves the contents in MongoDB.
While I love and recommend Elasticsearch for the maturity of the project, the same can be done without another server, using only the Firebase database.
This is what I mean:
(https://github.com/metaschema/oxyzen)
For the indexing part, the function basically:
JSON-stringifies a document.
Removes all the property names and JSON syntax to leave only the data (regex).
Removes all XML tags (therefore also HTML) and attributes (remember the old guidance, "data should not be in XML attributes") to leave only the pure text if XML or HTML was present.
Replaces all special characters with a space (regex).
Collapses runs of multiple spaces into one space (regex).
Splits on spaces and loops over the words:
For each word, adds a reference to the document in an index structure in your database, which basically contains children named after the words, each with children named after an escaped version of "ref/inthedatabase/dockey".
Then inserts the document as a normal Firebase application would do.
In the oxyzen implementation, subsequent updates of the document actually read the index and update it, removing the words that no longer match and adding the new ones.
Subsequent searches for a word can find the documents directly under that word's child. Multi-word searches are implemented using hits.
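As a rough illustration of the indexing idea (not the oxyzen code itself), the sketch below builds a word index in memory. It walks the document's values directly instead of stringifying to JSON and stripping property names with a regex, and it uses a Map where the real thing would write child nodes to the database; all names here are made up for the example.

```javascript
// Collects every primitive value in a document, ignoring property names.
function collectValues(value) {
  if (value === null || value === undefined) return [];
  if (typeof value === "object") {
    return Object.values(value).flatMap(collectValues);
  }
  return [String(value)];
}

// Returns the distinct lowercase words found in a document's values.
function extractWords(doc) {
  const text = collectValues(doc)
    .join(" ")
    .replace(/<[^>]*>/g, " ")          // drop XML/HTML tags and attributes
    .replace(/[^\p{L}\p{N}]+/gu, " ")  // special characters and runs of spaces -> one space
    .trim()
    .toLowerCase();
  return text === "" ? [] : [...new Set(text.split(" "))];
}

// Adds the document reference under every word it contains.
// `index` is a Map from word to a Set of document references.
function indexDocument(index, ref, doc) {
  for (const word of extractWords(doc)) {
    if (!index.has(word)) index.set(word, new Set());
    index.get(word).add(ref);
  }
}

// Returns the references of all documents that contain the word.
function searchWord(index, word) {
  return [...(index.get(word.toLowerCase()) || [])];
}
```

For example, after `indexDocument(index, "products/p1", { name: "<b>Dark</b> chocolate!" })`, a call to `searchWord(index, "chocolate")` returns the reference "products/p1". Like the approach described above, this matches whole words only, so it is a keyword search and not a LIKE '%cho%'.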
A prefix-only form of the SQL LIKE operation is possible on Firebase (the equivalent of LIKE 'SUBSTRING%'):
let node = await db.ref('yourPath').orderByChild('yourKey').startAt('SUBSTRING').endAt('SUBSTRING\uf8ff').once('value');
This query works for me. It is similar to the MySQL statement below (a prefix match; the Realtime Database cannot match a substring in the middle of a value):
select * from StoreAds where University like 'ps%';
query = database.getReference().child("StoreAds").orderByChild("University").startAt("ps").endAt("ps\uf8ff");
I have a simple feature in an ASP.NET MVC3 application hosted on Azure.
Step 1: the user uploads a picture.
Step 2: the user crops the uploaded picture.
Step 3: the system saves the cropped picture and deletes the temp file, which is the original uploaded picture.
Here is the problem I am facing now: where should I store the temp file?
I tried somewhere on the Windows file system, and LocalResources. The problem is that these resources are per instance, so there is no guarantee that the instance that shows the picture to crop is the same instance that saved the temp file.
Do you have any ideas on this temp file issue?
Normally the file exists only for a short while before it is deleted.
The temp file needs to be instance independent.
Ideally the file would have an expiry setting (for example, 1 hour) so that it deletes itself in case the code crashes somewhere.
OK. So what you're after is basically something that is shared storage but expires. Amazon have just announced a rather nice setting called object expiration (https://forums.aws.amazon.com/ann.jspa?annID=1303). There is nothing like this for Windows Azure storage yet, unfortunately, but that doesn't mean we can't come up with some other approach, or indeed a better (more cost-effective) one.
You say that it needs to be instance independent, which means using a local temp drive is out of the picture. As others have said, my initial leaning would be towards Blob storage, but you will have cleanup effort there. If you are working with large images (>1MB) or low throughput (<100rps) then I think Blob storage is the only option. If you are working with smaller images AND high throughput then the transaction costs for Blob storage will start to really add up (I have a white paper coming out soon which shows some modelling of this, but some quick thoughts are below).
For a scenario with small images and high throughput, a better option might be to use the Windows Azure Cache as your temporary storage area. At first glance it will be eye-wateringly expensive on a per GB basis (110GB/month for Cache, 12c/GB for Storage). But with storage your transactions are paid for, whereas with Cache they are 'free'. (Quotas are here: http://msdn.microsoft.com/en-us/library/hh697522.aspx#C_BKMK_FAQ8) This can really add up; e.g. using 100kb temp files held for 20 minutes with a system throughput of 1500rps, using Cache is about $1000 per month vs $15000 per month for storage transactions.
The Azure Cache approach is well worth considering, but to be sure it is the 'best' approach I'd really want to know:
Size of images
Throughput per hour
A bit more detail on the actual client interaction with the server during the crop process. Is it an interactive process where the user will pull the image into their browser and crop visually? Or is it just a simple crop?
Here is what I see as a possible approach:
1. The user uploads the picture.
2. Your code saves it to a blob and keeps a record in some data backend that relates the user session to the uploaded image (marked as a temp image).
3. Display the image in the cropping user interface.
4. When the user is done cropping on the client:
4.1. Retrieve the original from the blob.
4.2. Crop it according to the data sent from the user.
4.3. Delete the original from the blob and the record in the data backend used in step 2.
4.4. Save the final image to another blob (the final blob).
And have one background process check for "expired" temp images in the data backend (used in step 2), deleting both the images and their records in the data backend.
Please note that even in a WebRole you still have the RoleEntryPoint descendant, and you can still override the Run method. By implementing an infinite loop in Run() (that method must never exit!), you can check every N seconds whether there is anything to delete (the interval depends on your Thread.Sleep() in Run()).
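The expiry check that the background process performs is simple enough to sketch. The version below is in JavaScript purely to show the logic; in the WebRole it would live inside the Run() loop. The record shape ({ blobName, uploadedAt }) and the three `store` methods are assumptions made up for this example, not an Azure API.

```javascript
// Returns the temp-image records that are at least `ttlMs` old.
// Each record is assumed to look like { blobName, uploadedAt },
// with uploadedAt in milliseconds since the epoch.
function findExpired(records, nowMs, ttlMs) {
  return records.filter((record) => nowMs - record.uploadedAt >= ttlMs);
}

// One pass of the cleanup loop. `store` is a hypothetical object with
// listTempRecords(), deleteBlob(name) and deleteRecord(name), all async.
async function cleanupOnce(store, nowMs, ttlMs) {
  const expired = findExpired(await store.listTempRecords(), nowMs, ttlMs);
  for (const record of expired) {
    await store.deleteBlob(record.blobName);   // remove the image first
    await store.deleteRecord(record.blobName); // then forget about it
  }
  return expired.length;
}
```

Deleting the blob before its record means that if the process crashes in between, the record is still there and the next pass retries the deletion instead of leaving an orphaned blob.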
You can use the Azure blob storage. Have look at this tutorial.
The sample below may help you.
https://code.msdn.microsoft.com/How-to-store-temp-files-in-d33bbb10
You have two ways to handle temp files in Azure:
1. You can use the Path.GetTempPath() and Path.GetTempFileName() functions for the temp file name.
2. You can use an Azure blob to simulate it.
private long TotalLimitSizeOfTempFiles = 100 * 1024 * 1024;
private async Task SaveTempFile(string fileName, long contentLength, Stream inputStream)
{
    try
    {
        // First, check whether the container exists; create it if it does not.
        await container.CreateIfNotExistsAsync();
        // Get a reference to the blob.
        CloudBlockBlob tempFileBlob = container.GetBlockBlobReference(fileName);
        // If a blob with this name already exists, delete the old one.
        await tempFileBlob.DeleteIfExistsAsync();
        // If the new file would push total usage over the limit, remove the oldest blobs.
        await CleanStorageIfReachLimit(contentLength);
        // Upload the new file.
        await tempFileBlob.UploadFromStreamAsync(inputStream);
    }
    catch (Exception ex)
    {
        if (ex.InnerException != null)
        {
            throw ex.InnerException;
        }
        else
        {
            throw; // rethrow, preserving the original stack trace
        }
    }
}
// If the total size of the blobs would exceed the limit, delete the oldest ones.
private async Task CleanStorageIfReachLimit(long newFileLength)
{
    List<CloudBlob> blobs = container.ListBlobs()
        .OfType<CloudBlob>()
        .OrderBy(m => m.Properties.LastModified)
        .ToList();
    // Get the total size of all blobs.
    long totalSize = blobs.Sum(m => m.Properties.Length);
    // Work out the size limit that leaves room for the new file.
    long realLimitSize = TotalLimitSizeOfTempFiles - newFileLength;
    // Delete the oldest blobs first; stop as soon as enough space is free.
    foreach (CloudBlob item in blobs)
    {
        if (totalSize <= realLimitSize)
        {
            break;
        }
        await item.DeleteIfExistsAsync();
        totalSize -= item.Properties.Length;
    }
}