I'm using Firebase Firestore for my React Native app. The app sends the user's geolocation to Firestore every 5 minutes and generates a heatmap from those points. My data looks like this:
Right now I have about 1,000 documents, and every time the app refreshes it fetches all of the coordinates to generate the heatmap.
The problem is that generating the heatmap reads all 1,000 documents. What if I have 5,000 coordinates/documents and 10 users? I would quickly hit the read limit on the Firebase free plan, which is 50K document reads per day.
I know I can pay to raise the limit, but I'm wondering if anyone has run into this and found a way to optimize it. Thanks!
I don't know all the constraints of your application, but you could store all the coordinates for one month in one document, in arrays, reducing the number of document reads by a factor of 8,928 (288 readings per day × 31 days).
If I did the maths correctly, based on the Storage size calculations explained at https://firebase.google.com/docs/firestore/storage-size, a document under your coords collection with three arrays named lat, long and ts storing 288 × 31 = 8,928 triplets (288 = one reading every 5 minutes for a day) will have a maximum size of 857,088 bytes, which is under the maximum possible document size of 1,048,576 bytes presented here: https://firebase.google.com/docs/firestore/quotas
Of course, you'll have to deal with the array fields, but for that you can use firebase.firestore.FieldValue.arrayUnion(); see https://firebase.google.com/docs/firestore/manage-data/add-data#update_elements_in_an_array
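For concreteness, here's a minimal sketch with the Python server SDK (the JS client equivalent is the firebase.firestore.FieldValue.arrayUnion() call above; the collection name, monthly document IDs, and point layout are illustrative). One caveat: arrayUnion only appends elements that aren't already present, so three parallel lat/long/ts arrays could fall out of alignment when a value repeats; bundling each reading into a single map keeps every entry unique via its timestamp.

from google.cloud import firestore

db = firestore.Client()

def append_coord(lat, lng, ts):
    # One document per month, e.g. coords/2024-05 (illustrative ID scheme).
    month_ref = db.collection(u'coords').document(ts.strftime(u'%Y-%m'))
    # merge=True creates the document on first write; ArrayUnion appends
    # the new reading without rewriting the whole array.
    month_ref.set({
        u'points': firestore.ArrayUnion([{u'lat': lat, u'lng': lng, u'ts': ts}]),
    }, merge=True)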
I have around 25 devices located in different parts of my city. Each device writes temperature and humidity readings to the Firestore database. Every hour I want to average the temperature and humidity and store/append the result as a map in a single document:
{ DateTime: { temp: x, humidity: y }}
{ DateTime: { temp: x1, humidity: y1 }}
I'm using Python to fetch the data, average it, and then write it to the document.
Then I want to load this data in the front end to show the user analytics by hour, day, week, and month. But Firestore cannot store more than 5 MB in a single document, so I will run out of space fast. Is there any better way of generating analytics from data stored in collections?
Most of the people I talk to recommend switching away from Firebase for such use cases, but I do believe there is a way of doing this that is generally good practice.
Firestore cannot store more than 5 MB in a single document
According to the docs, the maximum size is actually 1 MiB (1,048,576 bytes). But that is more than enough to store such data as yours.
Since you cannot store the data of all devices in a single document, I recommend storing only the data that corresponds to a single day. If you're only storing numbers, one document might even fit the data for a week, or for a month. You'll have to measure that.
This way there are no limitations: you can create as many documents as you need, and of course you won't run "out of space".
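Since the question already uses Python, a rough sketch of that per-day layout could look like this (the collection and field names are illustrative, not prescribed by Firestore):

from google.cloud import firestore

db = firestore.Client()

def store_hourly_average(device_id, when, temp, humidity):
    # One document per device per day, e.g. devices/{id}/daily/2024-05-17.
    day_ref = (db.collection(u'devices').document(device_id)
                 .collection(u'daily').document(when.strftime(u'%Y-%m-%d')))
    # merge=True adds this hour's map without rewriting the rest of the
    # day, so a document never holds more than 24 small entries.
    day_ref.set({when.strftime(u'%H:00'): {u'temp': temp, u'humidity': humidity}},
                merge=True)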
I am working on a more complicated database where I want to store lots of data. The issue with Firestore is the 1 MB limit per document. I am splitting my data into different documents, but according to my calculations some single documents could still hit 6-9 MB when scaling up. I cannot find the equivalent limit for the Realtime Database, and I want to be sure before switching to it. At first I wanted to go with MongoDB, but I wanted to try the Google Cloud services. Any idea if the document size limit is the same for both the Realtime Database and Firestore?
Documents are part of Firestore (each with a 1 MiB maximum size), while the Realtime Database is essentially one large JSON tree. You can find the Realtime Database limits in the documentation:
Maximum depth of child nodes: 32. Each path in your data tree must be less than 32 levels deep.
Length of a key: 768 bytes. Keys are UTF-8 encoded and can't contain newlines or any of the following characters: . $ # [ ] / or any ASCII control characters (0x00-0x1F and 0x7F).
Maximum size of a string: 10 MB. Data is UTF-8 encoded.
There isn't a limit on the number of child nodes you can have; just keep the maximum depth in mind. It might also help if you could share a sample of what currently takes over 6 MB in Firestore, so the database can be restructured.
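If restructuring still leaves a single payload bigger than 1 MiB, one possible workaround (an illustration, not something the limits above prescribe) is to shard it across numbered chunk documents in a subcollection:

from google.cloud import firestore

db = firestore.Client()

CHUNK = 900_000  # stay safely under the 1,048,576-byte document ceiling

def write_sharded(doc_id, blob):
    parent = db.collection(u'payloads').document(doc_id)
    chunks = [blob[i:i + CHUNK] for i in range(0, len(blob), CHUNK)]
    parent.set({u'chunk_count': len(chunks)})
    for n, chunk in enumerate(chunks):
        # bytes values are stored as Firestore "bytes" fields.
        parent.collection(u'chunks').document(str(n)).set({u'data': chunk})

def read_sharded(doc_id):
    parent = db.collection(u'payloads').document(doc_id)
    count = parent.get().get(u'chunk_count')
    return b''.join(
        parent.collection(u'chunks').document(str(n)).get().get(u'data')
        for n in range(count))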
I have the longitude and latitude of all users in Cloud Firestore, and I want to loop over the data to get the distance between the current user and the other users. If the distance between them is less than 200 m, a notification should appear notifying the user that someone is close to them. Can anyone suggest tutorial videos that teach this?
While Firestore can store lat/lon data, it has no built-in functionality to query the pairs. If you want to be able to query for documents within a certain range from a specific point, you will need to:
Add a so-called geohash to each document, which encodes the lat/lon into a single value.
Then query on that geohash value to find documents that may be in range.
Post-process the documents in your application code, to verify their actual distance.
This entire process is quite well documented nowadays in the Firebase solution page for running geoqueries on Firestore. If you have a bit more time, I also recommend watching the video of my talk on the topic from a few years ago: Querying Firebase and Firestore based on geographic location or distance.
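A minimal sketch of those three steps in Python, assuming each user document stores lat, lng, and a precomputed geohash field (pygeohash is a third-party encoder; for production-grade query bounds that also cover the neighboring cells, follow the geofire helpers in the linked solution page):

import math

import pygeohash
from google.cloud import firestore

db = firestore.Client()

def haversine_m(lat1, lng1, lat2, lng2):
    # Great-circle distance in meters.
    r = 6371000
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lng2 - lng1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearby_users(my_lat, my_lng, radius_m=200):
    # Step 1 happens at write time: store pygeohash.encode(lat, lng) on
    # each user document. Step 2: a range query over the cell the user is
    # in (a 6-character cell is roughly 1.2 km wide, so this over-fetches
    # candidates but can miss matches near cell edges).
    prefix = pygeohash.encode(my_lat, my_lng, precision=6)
    query = (db.collection(u'users')
               .where(u'geohash', u'>=', prefix)
               .where(u'geohash', u'<=', prefix + u'\uf8ff'))
    # Step 3: post-process to keep only documents truly within range.
    return [doc for doc in query.stream()
            if haversine_m(my_lat, my_lng,
                           doc.get(u'lat'), doc.get(u'lng')) <= radius_m]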
I have a Cloud Function in Python 3.7 that writes/updates small documents to Firestore. Each document has a user_id as its document ID, and two fields: a timestamp and a map (a dictionary) with three key-value objects, all of them very small.
This is the code I'm using to write/update Firestore:
from datetime import datetime

doc_ref = db.collection(u'my_collection').document(user['user_id'])
# Normalize the date to midnight so it is stored as a full timestamp.
date_last_seen = datetime.combine(date_last_seen, datetime.min.time())
doc_ref.set({u'map_field': map_value, u'date_last_seen': date_last_seen})
My goal is to call this function once every day and write/update ~500K documents. I have tried the following tests; for each one I include the execution time:
Test A: Process the output to 1000 documents. Don't write/update Firestore -> ~ 2 seconds
Test B: Process the output to 1000 documents. Write/update Firestore -> ~ 1 min 3 seconds
Test C: Process the output to 5000 documents. Don't write/update Firestore -> ~ 3 seconds
Test D: Process the output to 5000 documents. Write/update Firestore -> ~ 3 min 12 seconds
My conclusion here: writing/updating Firestore is consuming more than 99% of my compute time.
Question: How can I write/update ~500K documents every day efficiently?
It's not possible to prescribe a single course of action without knowing details about the data you're actually trying to write. I strongly suggest you read the documentation about best practices for Firestore. It will give you a sense of what things you can do to avoid problems with heavy write loads.
Basically, you will want to avoid these situations, as described in that doc:
High read, write, and delete rates to a narrow document range
Avoid high read or write rates to lexicographically close documents, or your application will experience contention errors. This issue is known as hotspotting, and your application can experience hotspotting if it does any of the following:
Creates new documents at a very high rate and allocates its own monotonically increasing IDs.
Cloud Firestore allocates document IDs using a scatter algorithm. You should not encounter hotspotting on writes if you create new documents using automatic document IDs.
Creates new documents at a high rate in a collection with few documents.
Creates new documents with a monotonically increasing field, like a timestamp, at a very high rate.
Deletes documents in a collection at a high rate.
Writes to the database at a very high rate without gradually increasing traffic.
I won't repeat all the advice in that doc. What you do need to know is this: because Firestore is built to scale massively, limits are placed on how quickly you can write data into it. The requirement to increase traffic gradually is probably the one constraint you can't engineer around.
I achieved my needs with batched writes. But according to the Firestore documentation there is another, faster way:
Note: For bulk data entry, use a server client library with parallelized individual writes. Batched writes perform better than serialized writes but not better than parallel writes. You should use a server client library for bulk data operations and not a mobile/web SDK.
I also recommend taking a look at this post on Stack Overflow, which includes examples in Node.js.
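In Python (the language of the question), "parallelized individual writes" with the server client library might look roughly like the sketch below; the worker count and field names are assumptions, not something the docs prescribe:

from concurrent.futures import ThreadPoolExecutor
from google.cloud import firestore

db = firestore.Client()

def write_user(user):
    doc_ref = db.collection(u'my_collection').document(user['user_id'])
    doc_ref.set({u'map_field': user['map_value'],
                 u'date_last_seen': user['date_last_seen']})

def write_all(users, max_workers=50):
    # Each .set() is an independent RPC, so a thread pool keeps many
    # writes in flight at once; ramp max_workers up gradually to respect
    # the traffic-increase guidance from the best-practices doc.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(write_user, users))  # list() surfaces any write error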
Firestore costs are based on document operations and on the size of stored data.
In the Firebase console, we can easily track the number of document operations, but I can't find any place to track the size of stored data.
I have only found, in the Google Cloud Console (under App Engine > Quotas), a metric for the gigabytes stored on the current day, but not the total amount of stored data.
Is there a way to monitor the total size of stored data (ideally with indexes included)?
It seems that the only available option at this moment is to calculate the storage size for Cloud Firestore in Native mode manually.
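As a starting point, here is a rough estimator that follows the rules on the storage-size page linked above (string = UTF-8 bytes + 1, number/timestamp = 8 bytes, boolean and None = 1 byte, arrays and maps recurse, plus the document-name size and 32 bytes of overhead per document); treat it as a back-of-the-envelope estimate, not an official API:

import datetime

def value_size(v):
    # Sizes per https://firebase.google.com/docs/firestore/storage-size
    if isinstance(v, bool):  # check before int: bool is an int subclass
        return 1
    if isinstance(v, str):
        return len(v.encode('utf-8')) + 1
    if isinstance(v, (int, float, datetime.datetime)):
        return 8
    if v is None:
        return 1
    if isinstance(v, list):
        return sum(value_size(x) for x in v)
    if isinstance(v, dict):
        return sum(value_size(k) + value_size(x) for k, x in v.items())
    raise TypeError('unhandled type: %r' % type(v))

def document_size(path_segments, fields):
    # e.g. document_size(['users', 'jeff'], {'age': 31})
    name = sum(len(s.encode('utf-8')) + 1 for s in path_segments) + 16
    return name + value_size(fields) + 32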
I have submitted a feature request asking to implement a solution that would display the size. I'd recommend you star that request to be notified once there is an update in the thread.