Realtime Database CPU usage at 100% - Firebase

Whatever method I try, the database locks up.
CPU usage stays at 100% for at least 5 minutes.
I used this data for logging.
Example structure (there are at least 10 million records here):
RoomName:
  playcount:
    user_id:
      proccess:
        value:
        time:
123232132321312313443:
  gmc_10:
    xasdddfdsdffdsdfff:
      remove_begin:
        value: 1200
        timestamp: 9888439944
What I tried:
1- I tried deleting nodes. I can't delete even a single node, because I can't read a single node.
FirebaseDatabase.DefaultInstance.RootReference.Child ("player_room").LimitToFirst(1).
Result: CPU usage 100%
2- I tried OrderByKey with StartAt and EndAt.
FirebaseDatabase.DefaultInstance.RootReference.Child ("player_room").OrderByKey().StartAt("0").EndAt("100").LimitToFirst(1).GetValueAsync().ContinueWith(task=>{
Result: CPU usage 100%
3- I tried to export from the Firebase console.
Result: CPU usage 100%
How can I read any single node?
I want to read the "123232132321312313443" wildcard nodes.
Ordering, sorting, and equality filters are not important.
I just want to read any node.

If you have a list of 10 million nodes, you're going to have a hard time reading that list, and can only read data if you know its exact path. So it is somewhat expected that the database will be at 100% capacity for a while when you try to access that list. But I would expect it to eventually complete the operation, as long as you request a reasonable subset of the data. If that doesn't happen for you, add the exact error you get to the question, or reach out to Firebase support for personalized help in troubleshooting.
Aside from that: you might want to enable Firebase's nightly backups, and use those to locally determine the exact path of the specific node you want to read/delete from the online database.
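As a rough illustration of the backup idea, the sketch below walks a locally loaded JSON export and lists the first few node paths under a root, so that a single node can then be addressed by its exact path in the online database. The helper names (iter_paths, first_paths) and the in-memory sample are assumptions made for illustration; a real export of 10 million records would likely need a streaming JSON parser rather than json.loads.

```python
import json
from itertools import islice


def iter_paths(node, depth, prefix=()):
    """Yield key paths that sit `depth` levels below `node`."""
    if depth == 0 or not isinstance(node, dict):
        yield "/".join(prefix)
        return
    for key, child in node.items():
        yield from iter_paths(child, depth - 1, prefix + (key,))


def first_paths(backup, root, depth, limit):
    """Return up to `limit` paths under `root`, each `depth` levels deep."""
    subtree = backup.get(root, {})
    return [f"{root}/{p}" for p in islice(iter_paths(subtree, depth), limit)]


# Tiny stand-in for a downloaded backup (shape taken from the question).
sample = json.loads("""
{"player_room": {"123232132321312313443": {"gmc_10": {"xasdddfdsdffdsdfff":
  {"remove_begin": {"value": 1200, "timestamp": 9888439944}}}}}}
""")

print(first_paths(sample, "player_room", depth=1, limit=1))
```

Each returned path can then be used for a direct read or delete, which avoids asking the server to scan the whole list.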

Related

How to delete a large amount of data in "Realtime Database"

I want to delete all the data in my "Realtime Database" without increasing its "Usage Load".
Any ideas for deleting that data?
There are 420,000+ records in the Realtime Database.
If you can help me, it would be very useful.
[Image: Usage Load]
There is support for deleting large nodes built into the Firebase CLI these days, as explained in the blog post "How to Perform Large Deletes in the Realtime Database":
If you want to delete a large node, the new recommended approach is to use the Firebase CLI (> v6.4.0). The CLI automatically detects a large node and performs a chunked delete efficiently.
$ firebase database:remove /path/to/delete
My initial write-up is below. 👇 I'm pretty sure the CLI mentioned above implements precisely this approach, so that's likely a faster way to accomplish this, but I'm still leaving this explanation as it may be useful as background.
Deleting data is a write operation, so it's by definition going to put load on the database. Deleting a lot of data causes a lot of load, either as a spike in a short period or (if you spread it out) as an elevated load over a longer period. Spreading the load out is the best way to minimize impact for your regular users.
The best way to delete a long, flat list of keys (as you seem to have) is to:
1. Get a list of those keys, either from a backup of your database (which happens out of band), or by using the shallow parameter on the REST API.
2. Delete the data in reasonable batches, where reasonable depends on the amount of data you store per key. If each key is just a few properties, you could start by deleting 100 keys per batch, and check how that impacts the load to determine if you can ramp up to more keys per batch.
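The batching step can be sketched as follows. This only builds the payloads: each one maps a batch of keys to null (None in Python), which is the shape of a multi-key update in which a null value removes the key. Sending each payload to the parent path and pausing between batches is left out, and the key names are made up for the example.

```python
from itertools import islice


def batched(keys, size):
    """Yield successive lists of at most `size` keys."""
    it = iter(keys)
    while batch := list(islice(it, size)):
        yield batch


def delete_payloads(keys, size=100):
    """Build one multi-key update per batch; a null value removes that key."""
    for batch in batched(keys, size):
        yield {key: None for key in batch}


# Hypothetical keys, e.g. collected from a backup or a shallow REST read.
keys = [f"key_{i}" for i in range(250)]
payloads = list(delete_payloads(keys, size=100))
print([len(p) for p in payloads])  # [100, 100, 50]
```

Starting with a small batch size and increasing it while watching the load graph matches the ramp-up advice above.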

Huge amount of RU to write a document of 400 KB - 600 KB on Azure Cosmos DB

This is the log of my Azure Cosmos DB for the latest write operations:
Is it possible that write operations for documents between 400 KB and 600 KB in size cost this much?
Here is my document (a list of coordinates):
At first I thought it was a hot partition problem, but I later understood (I hope) that the problem lies in loading documents ranging from 400 KB to 600 KB in size. I want to understand whether something is wrong in the database settings, the indexing policy, or elsewhere, because it seems anomalous to me that about 3000 RU are used to load a 400 KB JSON document when the documentation indicates that loading a 100 KB file takes about 50 RU. The document to be loaded is a road route, so I don't know how else to model it.
This is my indexing policy:
Thanks, everybody. I have spent months on this problem without finding a solution...
It's hard to know for sure what the expected RU/s cost should be to ingest a 400KB-600KB item. The cost of this operation will depend on the size of the item, your indexing policy and the structure of the item itself. Greater hierarchy depth is more expensive to index.
You can get a good estimate for what the cost for a single write for an item will be using the Cosmos Capacity Calculator. In the calculator, click Sign-In, cut/paste your index policy, upload a sample document, reduce the writes per second to 1, then click calculate. This should give you the cost to insert a single item.
One thing to note here is that if you have frequent updates to a small number of properties, I would recommend splitting the document into two: one with the static properties, and another with the frequently updated ones. This can drastically reduce the cost of updates on large documents.
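A minimal sketch of that split, assuming a route document with hypothetical field names (id, coordinates, status, lastUpdated) that are not taken from the question. The large coordinate list stays in the static item, while the small item carries only the fields that change often plus a reference back to the route.

```python
def split_document(doc, hot_fields, id_field="id"):
    """Split `doc` into a static part and a small, frequently updated part."""
    hot = {k: v for k, v in doc.items() if k in hot_fields}
    static = {k: v for k, v in doc.items() if k not in hot_fields}
    # Give the small item its own id and a pointer back to the large one.
    hot[id_field] = f"{doc[id_field]}-status"
    hot["routeId"] = doc[id_field]
    return static, hot


# Hypothetical route document; the real one holds a long coordinate list.
route = {
    "id": "route-1",
    "coordinates": [[45.0, 9.0], [45.1, 9.1]],
    "status": "active",
    "lastUpdated": "2020-01-01T00:00:00Z",
}

static_part, hot_part = split_document(route, {"status", "lastUpdated"})
print(sorted(static_part), sorted(hot_part))
```

With this layout, an update to status rewrites only the small item, so the large coordinate list is not written again.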
Hope this is helpful.
You can also pull the RU cost for a write using the SDK.
Check storage consumed
To check the storage consumption of an Azure Cosmos container, you can run a HEAD or GET request on the container, and inspect the x-ms-resource-quota and the x-ms-resource-usage headers. Alternatively, when working with the .NET SDK, you can use the DocumentSizeQuota and DocumentSizeUsage properties to get the storage consumed.

Unity + Firebase: is it possible to append data to a key's value, or do I have to retrieve the key's data every time?

I'm a bit worried that I will reach the free data limits of Firebase in a student project.
Basically my question is:
Is it possible to append to the end of the string, instead of retrieving the key and value, appending, and uploading again?
What I want to achieve:
I have to create statistics of user right/wrong answers for particular questions.
I want to have a key-value pair:
answers: 1r/5w/3r
Here the number is the number of the user's guesses, and r/w means right/wrong. Whenever a guessing session ends, I want to add /numberOfGuesses+RightOrWrongAnswer at the end.
I'm using Unity 2018.
Thank you in advance for all the help!
I don't know how your game is architected or how many people are playing, but I'd be surprised if you hit your free limit on a student project (you can store 1GB and download 10GB). That string is 8 characters; let's assume the worst case scenario: as a UTF-32 string, that would be 32 bytes of data. You'd have to pull that down 312 million times to hit the cap (there'll be some overhead, but I can't imagine it being hugely impactful). If you're afraid of being charged, you can opt not to have a credit card on file to be doubly sure you stay on a student budget.
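The estimate above can be checked with a few lines of arithmetic. This assumes a decimal gigabyte (10^9 bytes) and ignores protocol overhead, so it is an upper bound rather than an exact figure.

```python
text = "1r/5w/3r"

# UTF-32 uses 4 bytes per character; the "-le" codec omits the byte-order mark.
bytes_per_read = len(text.encode("utf-32-le"))

download_cap = 10 * 10**9  # 10 GB download allowance, decimal gigabytes
reads = download_cap // bytes_per_read

print(bytes_per_read, reads)  # 32 312500000
```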
If you want to reduce the amount of reading/writing though, I might suggest that instead of:
key: <value_string> (so, instead of session_id: "1r/5w/3r")
you structure more like:
key:
- wrong: 5
- right: 3
So have two more values nested under your key. One for all the wrong answers, just an incrementing integer. Then one for all the right answers: just an incrementing integer.
The mechanism to "append" would be a transaction, and you should use these whether you're mutating a string or counter. Firebase tries to be smart with data usage and offline caching, but you don't get much more control other than that.
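To illustrate the kind of update function a transaction runs, here is a plain-Python sketch of the logic only. It is not the Unity SDK API: the function names are made up, and the assumption is that the handler receives the current value (possibly empty) and returns the new one. Because a transaction may call the handler again after a conflict, the functions avoid side effects and do not modify their input.

```python
def record_answer(current, is_right):
    """Counter variant: return updated counters; `current` may be None."""
    counters = dict(current or {"right": 0, "wrong": 0})
    field = "right" if is_right else "wrong"
    counters[field] = counters.get(field, 0) + 1
    return counters


def append_answer(current, guesses, is_right):
    """String variant from the question: append '/<guesses><r|w>'."""
    entry = f"{guesses}{'r' if is_right else 'w'}"
    return f"{current}/{entry}" if current else entry


print(record_answer(None, True))         # {'right': 1, 'wrong': 0}
print(append_answer("1r/5w", 3, True))   # 1r/5w/3r
```

The counter variant keeps the stored value at a constant size, whereas the string variant grows with every session.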
If order really matters, you might want to get cleverer. You'll generally want to work with the abstractions Realtime Database gives you though to maximize any inherent optimizations (it likes to think in terms of JSON documents, so think about your data layout similarly). This may not be as data optimal, but you may want to consider instead using a ledger of some kind (perhaps using ServerValue.Timestamp to record a single right or wrong answer, and having a cloud function listening to sum up the results in the background after a game - this would be especially useful if you plan on having a lot of users trying to write the same key at the same time).

Azure Cosmos DB aggregation and indexes

I'm trying to use Cosmos DB and I'm having some trouble making a simple count in a collection.
My collection schema is below, and I have 80,000 documents in this collection.
{
"_id" : ObjectId("5aca8ea670ed86102488d39d"),
"UserID" : "5ac161d742092040783a4ee1",
"ReferenceID" : 87396,
"ReferenceDate" : ISODate("2018-04-08T21:50:30.167Z"),
"ElapsedTime" : 1694,
"CreatedDate" : ISODate("2018-04-08T21:50:30.168Z")
}
If I run the command below to count all documents in the collection, I get the result very quickly:
db.Tests.count()
But when I run the same command for a specific user, I get the message "Request rate is large".
db.Tests.find({UserID:"5ac161d742092040783a4ee1"}).count()
In the Cosmos DB documentation I found this scenario, and the suggestion is to increase the RU. Currently I have 400 RU/s; when I increase to 10,000 RU/s I am able to run the command with no errors, but it takes 5 seconds.
I already tried to create the index explicitly, but it seems Cosmos DB doesn't use the index for the count.
I do not think it is reasonable to have to pay for 10,000 RU/s for a simple count in a collection with approximately 100,000 documents, especially when it still takes about 5 seconds.
Count by filter queries ARE using indexes if they are available.
If you try a count by filter on a non-indexed column, the query would not time out, but fail. Try it. You should get an error along the lines of:
{"Errors":["An invalid query has been specified with filters against path(s) excluded from indexing. Consider adding allow scan header in the request."]}
So definitely add a suitable index on UserID.
If you don't have index coverage and don't get the above error then you probably have set the enableScanInQuery flag. This is almost always a bad idea, and full scan would not scale. Meaning - it would consume increasingly large amounts of RU as your dataset grows. So make sure it is off and index instead.
When you DO have an index on the selected column, your query should run. You can verify that the index is actually being used by sending the x-ms-documentdb-populatequerymetrics header, which should return confirmation in the indexLookupTimeInMs and indexUtilizationRatio fields. Example output:
"totalExecutionTimeInMs=8.44;queryCompileTimeInMs=8.01;queryLogicalPlanBuildTimeInMs=0.04;queryPhysicalPlanBuildTimeInMs=0.06;queryOptimizationTimeInMs=0.00;VMExecutionTimeInMs=0.14;indexLookupTimeInMs=0.11;documentLoadTimeInMs=0.00;systemFunctionExecuteTimeInMs=0.00;userFunctionExecuteTimeInMs=0.00;retrievedDocumentCount=0;retrievedDocumentSize=0;outputDocumentCount=1;outputDocumentSize=0;writeOutputTimeInMs=0.01;indexUtilizationRatio=0.00"
It also provides you some insight where the effort has gone if you feel like RU charge is too large.
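To see where the time went, the metrics string can be split into name/value pairs. The sketch below assumes the semicolon-separated name=value layout shown in the example output and uses an excerpt of it; the function name is made up for illustration.

```python
def parse_query_metrics(header):
    """Parse 'name=value;name=value' into a dict of floats."""
    metrics = {}
    for pair in header.strip('"').split(";"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        metrics[name] = float(value)
    return metrics


# Excerpt of the example output above.
raw = ("totalExecutionTimeInMs=8.44;queryCompileTimeInMs=8.01;"
       "indexLookupTimeInMs=0.11;documentLoadTimeInMs=0.00;"
       "retrievedDocumentCount=0;outputDocumentCount=1")

metrics = parse_query_metrics(raw)
compile_share = metrics["queryCompileTimeInMs"] / metrics["totalExecutionTimeInMs"]
print(metrics["indexLookupTimeInMs"], round(compile_share, 2))
```

In this particular example almost all of the time is query compilation, and the index lookup itself is negligible.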
If the index lookup time itself is too high, consider whether your index is selective enough and whether the index settings are suitable. Look at your UserID values and their distribution, and adjust the index accordingly.
Another wild guess to consider is to check if the API you are using would defer executing find(..) until it knows that count() is really what you are after. It is unclear which API you are using. If it turns out it is fetching all matching documents to client side before doing the counting then that would explain unexpectedly high RU cost, especially if there are large amount of matching documents or large documents involved. Check the API documentation.
I also suggest executing the same query directly in Azure Portal to compare the RU cost and verify if the issue is client-related or not.
I think it just doesn't work.
The index seems to be used when selecting the documents to be counted, but then the count is done by reading each document, so effectively consuming a lot of RU.
This query is cheap and fast:
db.Tests.count({ UserID: { '$eq': '5ac161d742092040783a4ee1' }})
but this one is slow and expensive:
db.Tests.count({ ReferenceID: { '$gt': 10 }})
even though this query is fast:
db.Tests.find({ ReferenceID: { '$gt': 10 }}).sort({ ReferenceID: 1 })
I also found this: https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/36142468-make-count-aware-of-indexes. Note the status: "We have started work on this feature. Will update here when this becomes generally available."
Pretty disappointing to be honest, especially since this limitation hasn't been addressed for almost 2 years. Note - I am not an expert in this matter and I'd love to be proven wrong, since I also need this feature.
BTW: I noticed that simple indexes seem to be created automatically for each individual field, so no need to create them manually.

Firebase database high delay after a long standby

I'm currently testing Firebase on a non-production Firebase app which I am the only one who works on.
When I query the database to retrieve data after there has not been any query during the last 24 hours, the query takes about 8 seconds. After that query is done, the next ones take a normal amount of time (about 100ms).
This is not about caching the queries, by "next queries" I mean new queries which are not the same.
To reproduce it:
1. Create a database node called users, whose children are user data (first name, last name, age, gender, etc.)
2. Add 500,000 users to this node
3. Get a user by its UID and measure the time. (It should take about 100ms)
4. Wait 24 hours (I don't know the exact time, but I'm sure about 24 hours)
5. Get any user by its UID and measure the time. (It should take about 8sec)
6. Get any user by its UID and measure the time. (It should take about 100ms)
I want to know whether this is a known issue with the Firebase Realtime Database.
I reached Firebase support, they were able to recreate the issue and faced a wait time of about 6 seconds. Here is their answer after the investigation:
It looks like this is intended behavior. The realtime database queries work by building the index in-memory, which takes time linear to the number of nodes at that location. Once the index is built things are very fast, but the initial build can take a bit to build, especially for large locations.
If you want the index to stay in memory on the database, you should have a listener always listening for this query.
So basically the database takes a long time to process the query because of indexing the large database.
The problem can be solved by keeping a listener on the database or querying the database every few hours.
In production you are not very likely to face this problem, because the database is being accessed by users all the time. But if your database is not accessed all the time and you don't want your users to experience that long wait, you should use the solution discussed above.
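The "query every few hours" variant could look roughly like the loop below. It is a sketch under assumptions: run_query stands for whatever call reads a user by UID in your client (it is not a Firebase API), the three-hour interval is an arbitrary reading of "every few hours", and the sleep function is injectable so the example runs instantly.

```python
import time


def keep_warm(run_query, interval_seconds, rounds, sleep=time.sleep):
    """Call `run_query` `rounds` times, pausing `interval_seconds` in between.

    Returns the measured duration of each call in seconds.
    """
    durations = []
    for i in range(rounds):
        start = time.monotonic()
        run_query()
        durations.append(time.monotonic() - start)
        if i < rounds - 1:
            sleep(interval_seconds)
    return durations


# Demo with stand-ins: a no-op query and a sleep that only records the pauses.
pauses = []
timings = keep_warm(lambda: None, 3 * 60 * 60, rounds=3, sleep=pauses.append)
print(len(timings), pauses)  # 3 [10800, 10800]
```

Logging the returned durations would also show whether the slow first query still occurs at the chosen interval.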
Firebase keeps recently used data in its internal cache. This cache is cleared after a few minutes.
But the exact numbers depend on how much data you're loading and how you're loading that data. Without seeing a specific setup that shows how to reproduce these numbers there really isn't much anyone can say.

Resources