I am planning to use LokiJS with a Node server. I am not sure how much load it can handle in comparison with MongoDB.
I would like to know how smoothly it can run with 100,000 documents in it?
Thanks
Vasanth
I am mid-flow on an app where I have about 7 collections and 60+ million docs combined ... so yes, 100,000 is fine, but it will vary with doc size, I suppose.
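As a rough gauge of what 100,000 in-memory documents cost, here is a plain-Node sketch. This is not LokiJS itself — just an array of objects plus a `Map` index on one field, which is similar in spirit to what an indexed in-memory collection maintains; all names are illustrative:

```javascript
// Rough gauge of 100k in-memory docs: an array plus a Map index.
// Not LokiJS API — just a plain-JS approximation of the workload.
const N = 100000;

const docs = [];
const emailIndex = new Map(); // email -> array offset

const t0 = Date.now();
for (let i = 0; i < N; i++) {
  const doc = { id: i, email: `user${i}@example.com`, score: i % 100 };
  emailIndex.set(doc.email, docs.length);
  docs.push(doc);
}
const insertMs = Date.now() - t0;

// An indexed lookup is a single Map hit, independent of collection size.
const hit = docs[emailIndex.get('user99999@example.com')];

console.log(`inserted ${docs.length} docs in ${insertMs} ms`);
console.log(`lookup found id ${hit.id}`);
```

On commodity hardware this typically finishes in well under a second, which matches the answer above: 100,000 documents is comfortable in memory, and document size, not document count, is the thing to watch.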
I'm making a Flutter app that is like social media, so it uses pictures a lot. With that in mind, looking at the Firestore data, the reads with about 15 users look like this:
But they are far outpaced by my storage bandwidth usage:
I'm thinking about possible reasons why this is the case. First, I am not sure if this could potentially be a problem, but I save each user's images into a respective folder:
Furthermore, looking through the images, I notice that file sizes average around a few hundred kilobytes, with a few being sized in megabytes; the largest I found is 9 MB. Are these too large?
Are there any other ideas I am missing? I am trying to implement caching on my front end to help with this problem, but I am open to any other possible reasons and solutions.
Furthermore, looking through the images, I notice that file sizes average around a few hundred kilobytes.
If you take a look, at the major social media apps, the average size is around a few tens of kilobytes and not a few hundred kilobytes. If you look at Instagram for instance, all images are around 40-50 kilobytes.
With a few being sized in megabytes, the largest I found is 9 MB. Are these too large?
Way too large.
Are there any other ideas I am missing?
Yes, resize the images before uploading them to storage.
I am trying to implement caching on my front end to help with this problem, but I am open to any other possible reasons and solutions.
A caching mechanism will help you avoid re-reading an image from the server each time the user uses your app. However, it will not help the first time the user opens the app, when the image does indeed need to be downloaded.
Instead of having a single 9 MB image, it's better to have 180 images of 50 KB.
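The arithmetic behind that comparison, as a quick sketch (the 50 KB target is the Instagram-like figure mentioned above; the helper name is made up):

```javascript
// How many target-size images fit in the bandwidth of one oversized image?
// Sizes in kilobytes; 1 MB = 1000 KB here.
function equivalentImages(imageKB, targetKB) {
  return Math.floor(imageKB / targetKB);
}

console.log(equivalentImages(9 * 1000, 50)); // one 9 MB image = 180 images at 50 KB
```

So every un-resized multi-megabyte upload costs you the bandwidth of dozens or hundreds of properly sized images, which is consistent with storage bandwidth far outpacing reads.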
Going through the documentation, I implemented pagination, but I am confused about how Firestore's local cache behaves with pagination. Suppose my query gets the first 20 documents. For testing, I change the page size to 25 and restart the application, so the query returns exactly the same 20 documents (cached before) plus 5 new ones. How will the cache mechanism behave with respect to the number of reads in this case? Will it cost 5 new reads or 25 new reads? I tried several times to see if the Firebase console stats could help, but the read counts there made no sense.
Console stats before the call show 68 reads; after the second query it should show either 68+5 or 68+25, but instead it shows 76 read operations. These stats didn't help me figure out the behavior.
The cache only has an effect for any query when:
The client is offline
The query specifically uses the cache as a source
In all other cases the cache is not used, and the server sends all documents. Each document is read and sent to the client, and you will be billed for all of those document reads. Pagination doesn't change this behavior at all.
Read this to learn more about how the cache works.
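Applied to the pagination example above, the billing rule can be sketched like this (a hypothetical helper, not Firestore API — it just encodes the rule stated in this answer):

```javascript
// Model of Firestore read billing for a query.
// Cached copies do not reduce the bill: every document the server
// returns is a billed read, regardless of what the local cache holds.
// Billed reads are 0 only when the query is served purely from cache
// (offline, or explicitly sourced from the cache).
function billedReads(docsReturned, servedFromCacheOnly) {
  return servedFromCacheOnly ? 0 : docsReturned;
}

// Second run with page size 25, 20 of those docs already cached:
console.log(billedReads(25, false)); // 25 billed reads, not 5
```

The console figure of 76 instead of 68+25 is likely just the stats lagging or aggregating other traffic; the billing rule itself is the simple one above.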
I am trying to use the set API to set an object in Firebase. The object is fairly large; the serialized JSON is 2.6 MB in size. The root node has around 90 children, and in all there are around 10,000 nodes in the JSON tree.
The set api seems to hang and does not call the callback.
It also seems to cause problems with the firebase instance.
Any ideas on how to work around this?
Since this is a commonly requested feature, I'll go ahead and merge Robert and Puf's comments into an answer for others.
There are some tools available to help with big data imports, like firebase-streaming-import. What they do internally can also be engineered fairly easily for the do-it-yourselfer:
1) Get a list of keys without downloading all the data, using a GET request and shallow=true. Possibly do this recursively depending on the data structure and dynamics of the app.
2) In some sort of throttled fashion, upload the "chunks" to Firebase using PUT requests or the API's set() method.
The critical components to keep in mind here are that the number of bytes in a request and the frequency of requests will have an impact on performance for others viewing the application, and will also count against your bandwidth.
A good rule of thumb is that you don't want to do more than ~100 writes per second during your import, preferably lower than 20 to maximize your realtime speeds for other users, and that you should keep the data chunks in low MBs--certainly not GBs per chunk. Keep in mind that all of this has to go over the internets.
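Steps 1) and 2) can be sketched in plain Node as follows. This is a hedged outline: `chunkKeys` and `throttleSchedule` are illustrative names, and the actual upload call — a PUT against each key's `.json` endpoint, or the SDK's `set()` — is deliberately left out:

```javascript
// Step 1 yields a shallow key list. Step 2 splits it into chunks and
// schedules uploads at a bounded rate (here <= 20 writes/second, per
// the rule of thumb above).
function chunkKeys(keys, chunkSize) {
  const chunks = [];
  for (let i = 0; i < keys.length; i += chunkSize) {
    chunks.push(keys.slice(i, i + chunkSize));
  }
  return chunks;
}

function throttleSchedule(chunks, maxWritesPerSecond) {
  const intervalMs = 1000 / maxWritesPerSecond;
  return chunks.map((chunk, i) => ({ chunk, delayMs: Math.round(i * intervalMs) }));
}

// 90 root children, 10 keys per chunk -> 9 PUTs spread over ~0.45 s at 20/s.
const keys = Array.from({ length: 90 }, (_, i) => `node${i}`);
const schedule = throttleSchedule(chunkKeys(keys, 10), 20);
console.log(schedule.length);     // 9
console.log(schedule[8].delayMs); // 400
```

Each scheduled entry would then fire its upload after `delayMs` (e.g. via `setTimeout`), keeping the write rate under the limit so other clients stay responsive.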
I'm writing an iPhone app, which needs to store about 10 records, each takes up 20-200 KB space. I need to choose between NSKeyedArchiver and SQLite. My questions are:
Which takes up less space to store?
Which one is faster to read into memory? Which is faster to write back?
Thanks!
For only 10 records, it will be much easier to use NSKeyedArchiver. SQLite is usually for more intense database tasks in the hundreds of thousands of records. I'd say it may not be worth the time to code that up unless you really need it.
We've developed a system with a search screen that looks a little something like this:
(source: nsourceservices.com)
As you can see, there is some fairly serious search functionality. You can use any combination of statuses, channels, languages, campaign types, and then narrow it down by name and so on as well.
Then, once you've searched and the leads pop up at the bottom, you can sort the headers.
The query uses ROWNUM to do a paging scheme, so we only return something like 70 rows at a time.
The Problem
Even though we're only returning 70 rows, an awful lot of IO and sorting is going on. This makes sense of course.
This has always caused some minor spikes to the Disk Queue. It started slowing down more when we hit 3 million leads, and now that we're getting closer to 5, the Disk Queue pegs for up to a second or two straight sometimes.
That would actually still be workable, but this system has another area with a time-sensitive process — let's say for simplicity that it's a web service — that needs to serve up responses very quickly or it will cause a timeout on the other end. The Disk Queue spikes are causing that part to bog down, which is causing timeouts downstream. The end result is actually dropped phone calls in our automated VoiceXML-based IVR, and that's very bad for us.
What We've Tried
We've tried:
Maintenance tasks that reduce the number of leads in the system to the bare minimum.
Added the obvious indexes to help.
Ran the index tuning wizard in profiler and applied most of its suggestions. One of them was going to more or less reproduce the entire table inside an index so I tweaked it by hand to do a bit less than that.
Added more RAM to the server. It was a little low, but now it always has something like 8 gigs idle, and SQL Server is configured to use no more than 8 gigs; however, it never uses more than 2 or 3. I found that odd. Why isn't it just putting the whole table in RAM? It's only 5 million leads and there's plenty of room.
Pored over query execution plans. I can see that at this point the indexes seem to be mostly doing their job -- about 90% of the work is happening during the sorting stage.
Considered partitioning the Leads table out to a different physical drive, but we don't have the resources for that, and it seems like it shouldn't be necessary.
In Closing...
Part of me feels like the server should be able to handle this. Five million records is not so many given the power of that server, which is a decent quad core with 16 gigs of ram. However, I can see how the sorting part is causing millions of rows to be touched just to return a handful.
So what have you done in situations like this? My instinct is that we should maybe slash some functionality, but if there's a way to keep this intact that will save me a war with the business unit.
Thanks in advance!
Database bottlenecks can frequently be improved by improving your SQL queries. Without knowing what those look like, consider creating an operational data store or a data warehouse that you populate on a scheduled basis.
Sometimes flattening out your complex relational databases is the way to go. It can make queries run significantly faster, and make it a lot easier to optimize your queries, since the model is very flat. That may also make it easier to determine if you need to scale your database server up or out. A capacity and growth analysis may help to make that call.
Transactional/highly normalized databases are not usually as scalable as an ODS or data warehouse.
Edit: Your ORM may have optimizations as well that it may support, that may be worth looking into, rather than just looking into how to optimize the queries that it's sending to your database. Perhaps bypassing your ORM altogether for the reports could be one way to have full control over your queries in order to gain better performance.
Consider how your ORM is creating the queries.
If you're having poor search performance perhaps you could try using stored procedures to return your results and, if necessary, multiple stored procedures specifically tailored to which search criteria are in use.
Determine which ad-hoc queries will most likely be run, or limit the search criteria with stored procedures. Can you summarize data? Treat this app like a data warehouse.
Create indexes on each column involved in the search to avoid table scans.
Create fragments on expressions.
Periodically reorg the data and update statistics as more leads are loaded.
Put the temporary files created by queries (result sets) on a ramdisk.
Consider migrating to a high-performance RDBMS engine like Informix OnLine.
Initiate another thread to start displaying N rows from the result set while the query continues to execute.