Firebase: Cloud Storage high bandwidth

I'm making a Flutter app that works like social media, so it uses pictures a lot. With that in mind, looking at the Firestore data, the reads with about 15 users look like this: [screenshot of Firestore read counts]
But they are far outpaced by my Storage bandwidth usage: [screenshot of Storage bandwidth usage]
I'm thinking about possible reasons why this is the case. First, I am not sure if this could potentially be a problem, but I save each user's images into a respective folder: [screenshot of the Storage folder structure]
Furthermore, looking through the images, I notice that file sizes average around a few hundred kilobytes, with a few being sized in megabytes; the largest I found is 9 MB. Are these too large?
Are there any other ideas I am missing? I am trying to implement caching on my front end to help with this problem, but I am open to any other possible reasons and solutions.

Furthermore, looking through the images, I notice that file sizes average around a few hundred kilobytes.
If you take a look at the major social media apps, the average size is around a few tens of kilobytes, not a few hundred. On Instagram, for instance, images are around 40-50 kilobytes.
With a few being sized in megabytes; the largest I found is 9 MB. Are these too large?
Way too large.
Are there any other ideas I am missing?
Yes, resize the images before uploading them to storage (see the sketch below).
I am trying to implement caching on my front end to help with this problem, but I am open to any other possible reasons and solutions.
A caching mechanism will help you avoid re-reading an image from the server each time the user opens your app. However, it will not help the first time the user opens the app, when the image does indeed need to be downloaded.
Instead of having a single 9 MB image, it's better to have 180 images of 50 KB.
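To make the resizing advice concrete, here is a minimal server-side sketch using the sharp and firebase-admin npm packages (in a Flutter app you would more likely resize on-device before uploading, but the principle is the same). The bucket name, target width, and JPEG quality are assumptions, not values from the question.

```typescript
// Sketch only: re-encode an image to web-friendly dimensions before uploading.
// Bucket name, target width, and JPEG quality are assumed values.
import sharp from "sharp";
import { initializeApp } from "firebase-admin/app";
import { getStorage } from "firebase-admin/storage";

initializeApp({ storageBucket: "my-app.appspot.com" }); // hypothetical bucket

async function uploadResized(localPath: string, userId: string, fileName: string) {
  // Cap the width at 1080px and re-encode as JPEG; this typically turns a
  // multi-megabyte photo into a far smaller file.
  const resized = await sharp(localPath)
    .resize({ width: 1080, withoutEnlargement: true })
    .jpeg({ quality: 80 })
    .toBuffer();

  // Keep the per-user folder layout described in the question.
  await getStorage()
    .bucket()
    .file(`images/${userId}/${fileName}`)
    .save(resized, { contentType: "image/jpeg" });
}
```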

Related

Barcode that can hold a large amount of data (50 MB)

I am creating an app that allows you to quickly share a file (like a photo or video). It will work by generating something like a barcode, but it can be animated or anything; just a scannable thing that can be used to share large amounts of data. I know a QR code can only hold up to about 3 KB, but in this case I'm not limited to a static image. Anything that can be used to transfer data from a screen to a high-resolution camera works. (However, I don't want it to upload the file to a server and then generate a QR code link to that file.)
Thanks!
You need to have a storage account somewhere, hold the data in that storage, and have your QR code redirect to that storage.
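A minimal sketch of what that answer describes, assuming Firebase Storage as the storage account and the qrcode npm package for rendering; the bucket and file names are placeholders:

```typescript
// Sketch only: upload a file, then render a QR code pointing at it.
// Uses firebase-admin and the `qrcode` npm package; names are placeholders.
import { initializeApp } from "firebase-admin/app";
import { getStorage } from "firebase-admin/storage";
import QRCode from "qrcode";

initializeApp({ storageBucket: "my-app.appspot.com" }); // hypothetical bucket

async function shareViaQr(localPath: string, remoteName: string) {
  const bucket = getStorage().bucket();
  await bucket.upload(localPath, { destination: remoteName });

  // A time-limited signed URL keeps the file private while still scannable.
  const [url] = await bucket.file(remoteName).getSignedUrl({
    action: "read",
    expires: Date.now() + 60 * 60 * 1000, // valid for one hour
  });

  // Write the QR image that the other device will scan.
  await QRCode.toFile("share-code.png", url);
}
```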

Loading Bulk data in Firebase

I am trying to use the set API to set an object in Firebase. The object is fairly large; the serialized JSON is 2.6 MB in size. The root node has around 90 children, and in all there are around 10,000 nodes in the JSON tree.
The set API seems to hang and does not call the callback.
It also seems to cause problems with the Firebase instance.
Any ideas on how to work around this?
Since this is a commonly requested feature, I'll go ahead and merge Robert and Puf's comments into an answer for others.
There are some tools available to help with big data imports, like firebase-streaming-import. What they do internally can also be engineered fairly easily for the do-it-yourselfer:
1) Get a list of keys without downloading all the data, using a GET request and shallow=true. Possibly do this recursively depending on the data structure and dynamics of the app.
2) In some sort of throttled fashion, upload the "chunks" to Firebase using PUT requests or the API's set() method.
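A rough sketch of those two steps against the Realtime Database REST API (the database URL, auth token, and throttle rate are assumptions):

```typescript
// Sketch only: the two import steps against the Realtime Database REST API.
// The database URL, token, and ~20 writes/sec throttle are assumed values.
const DB = "https://my-app.firebaseio.com"; // hypothetical database URL
const AUTH = "?auth=YOUR_TOKEN";            // hypothetical auth token

async function importChunks(path: string, data: Record<string, unknown>) {
  // Step 1: list the keys under `path` without downloading their values.
  const res = await fetch(`${DB}/${path}.json${AUTH}&shallow=true`);
  const existing: Record<string, true> | null = await res.json();

  // Step 2: PUT one child ("chunk") at a time, throttled.
  for (const [key, value] of Object.entries(data)) {
    if (existing?.[key]) continue; // skip chunks that are already uploaded
    await fetch(`${DB}/${path}/${key}.json${AUTH}`, {
      method: "PUT",
      body: JSON.stringify(value),
    });
    await new Promise((r) => setTimeout(r, 50)); // ~20 requests per second
  }
}
```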
The critical things to keep in mind here are that the number of bytes in a request and the frequency of requests will have an impact on performance for others using the application, and will also count against your bandwidth.
A good rule of thumb is that you don't want to do more than ~100 writes per second during your import, preferably lower than 20 to maximize your realtime speeds for other users, and that you should keep the data chunks in low MBs--certainly not GBs per chunk. Keep in mind that all of this has to go over the internets.

Serving Lazy Thumbnail Images from Azure Blob Storage - What is the overhead of Exists?

I have a website where users upload images. These images are shown on various sections of the site with various thumbnail dimensions. Since the site is still under rapid development, I don't yet want to commit to a set number of thumb sizes. Thus I believe I should be generating thumbnails on a lazy basis.
Of the two options, which is the more performant way to do this?
1) When I go to serve the thumbnail, convert the dimensions into a canonical filename (like "bighouse-thumb-160x120"). Check if the file exists in blob storage using client.GetContainerReference(containerName).GetBlockBlobReference(key).Exists(); if it does not exist, generate it and save it.
2) When I go to serve the thumbnail, query my SQL database to see if the thumbnail exists. If it exists, get the blob URI from the DB and emit that as HTML. If it does not exist, generate it and update the SQL database.
I've used #2 in the past, but design-wise it duplicates state, which is bad. If querying Azure for the existence of blobs is scalable, I'd rather do that. I don't really understand the threading model in ASP.NET: if I have 200 users requesting thumbs, will my Azure Exists calls all happen in parallel? Even if they do, two round trips seem like a lot of overhead. I assume round-tripping the database is faster and lends itself more easily to generic caching solutions.
What is the right answer?
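For concreteness, a rough sketch of option 1 using the current JavaScript Azure SDK (@azure/storage-blob) and sharp, rather than the older client shown in the question; the connection string and container name are placeholders:

```typescript
// Sketch only: canonical blob name + exists() probe, generating on a miss.
// Connection string and container name are placeholders.
import { BlobServiceClient } from "@azure/storage-blob";
import sharp from "sharp";

const container = BlobServiceClient.fromConnectionString(
  process.env.AZURE_STORAGE_CONNECTION_STRING! // hypothetical setting
).getContainerClient("images");

async function thumbnailUrl(name: string, w: number, h: number): Promise<string> {
  // Canonical name like "bighouse-thumb-160x120", as in the question.
  const thumb = container.getBlockBlobClient(`${name}-thumb-${w}x${h}`);

  if (!(await thumb.exists())) {
    // First view: pull the original, resize it, and store the result.
    const original = await container.getBlockBlobClient(name).downloadToBuffer();
    const data = await sharp(original).resize(w, h).jpeg().toBuffer();
    await thumb.uploadData(data, {
      blobHTTPHeaders: { blobContentType: "image/jpeg" },
    });
  }
  return thumb.url;
}
```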
Regardless of the overhead, I would pre-generate thumbnails when you upload/store the image. This way you move the burden of generating thumbnails from something that is done many times (retrieving an image) to one that is much less often executed (storing an image).
Consider the following scenario, when you lazily generate thumbnails on the first view:
Check for an existing thumbnail (is false, first view remember ;))
Generate a thumbnail
Store the thumbnail
Send the thumbnail to the client
With pre-generated thumbnails the process is much shorter:
Send the thumbnail to the client
Done.
With 'lazy generating', the check for existence can be expensive due to network overhead (on every hit!), generating the thumbnail can be hugely expensive memory- and CPU-wise, and then you have to store it, with network overhead again. You can even offload generating the thumbnail(s) to a separate process, possibly started by queue messages, to move the burden of generating images even further away from your web servers.
However, this brings up the question of what you should do when you introduce a new thumbnail/image size. When you pre-generate the thumbnails, you can write a simple tool to create the new sizes and store them, and if you went the separate-process route it's even simpler: upgrade the separate process, generate a queue message for every existing image, and let it do its work.
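One possible shape for that queue-driven variant, assuming Azure Storage Queues (@azure/storage-queue); the queue name and message format are made up for illustration:

```typescript
// Sketch only: enqueue one job per image and let a worker generate thumbnails.
// Queue name, connection string, and message shape are made up for illustration.
import { QueueClient } from "@azure/storage-queue";

const queue = new QueueClient(
  process.env.AZURE_STORAGE_CONNECTION_STRING!, // hypothetical setting
  "thumbnail-jobs"
);

// Called after an image upload, or by a backfill tool when a new size is added.
async function enqueueThumbnailJob(imageName: string, sizes: string[]) {
  // Queue messages are plain strings; JSON keeps the job self-describing.
  await queue.sendMessage(JSON.stringify({ imageName, sizes }));
}

// Example: backfill a newly introduced 320x240 size for one image.
// await enqueueThumbnailJob("bighouse.jpg", ["320x240"]);
```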

Is it good to store images in the DB and retrieve them? [duplicate]

Possible Duplicate:
Storing Images in DB - Yea or Nay?
I have a requirement to show dynamically changing images in a DataList. What I did was store the images in the DB as the image datatype and retrieve them from there. Is storing images in the DB a good technique?
FYI, users can upload the images.
The answer is - it depends... Studies have been done (http://research.microsoft.com/pubs/64525/tr-2006-45.pdf) which have basically concluded if objects are larger than one megabyte on average, NTFS has a clear advantage over SQL Server. If the objects are under 256 kilobytes, the database has a clear advantage. Inside this range, it depends on how write intensive the workload is, and the storage age of a typical replica in the system.
I would store them as physical files on the server, but store the file path in the database, not the actual image. Storing the image in the database will increase its size dramatically over time.
Image data will increase your DB size unnecessarily, so it is not good practice to store it there; instead, store the file path in your DB, which is not that big.
Storing images in the DB should only be done if you have a strong requirement or use case for it.
The thing you have to address if you store paths, etc., is maintaining referential integrity with your images. What if somebody moves files? What if somebody uploads a new file with the same name? (I'd suggest that uploads get renamed to reflect some kind of key rather than keeping their original name of bob.jpg.) You'll also need to look at segmenting your directories to keep the listings sensible. Finding the images may be harder than if you store them in a DB, too.
However, on the upside, you can form a CDN based on distributing your images over diverse servers, subdomains, cloud storage, etc., if you don't jam them all into your database.
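A minimal sketch of the path-in-DB approach, including the key-based rename and directory segmentation suggested above; it uses better-sqlite3, and the table and directory layout are illustrative:

```typescript
// Sketch only: key-based renaming plus directory segmentation, with the path
// recorded in the DB. Uses better-sqlite3; table and layout are illustrative.
import Database from "better-sqlite3";
import { randomUUID } from "node:crypto";
import { mkdirSync, copyFileSync } from "node:fs";
import { join, extname } from "node:path";

const db = new Database("app.db");
db.exec(`CREATE TABLE IF NOT EXISTS images (
  id TEXT PRIMARY KEY,
  original_name TEXT NOT NULL,
  path TEXT NOT NULL
)`);

function storeImage(uploadedFile: string, originalName: string): string {
  // Rename to a key so two users uploading "bob.jpg" cannot collide.
  const id = randomUUID();
  // Segment by the key's first two characters to keep directory listings small.
  const dir = join("images", id.slice(0, 2));
  mkdirSync(dir, { recursive: true });

  const path = join(dir, id + extname(originalName));
  copyFileSync(uploadedFile, path);
  db.prepare("INSERT INTO images (id, original_name, path) VALUES (?, ?, ?)")
    .run(id, originalName, path);
  return id;
}
```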
Depends on the size of the images and the DB you use.
For SQL Server it is a pretty bad idea if they are larger than 1 MB and you do not use NTFS FILESTREAM storage for your BLOB fields.
See for example http://www.simple-talk.com/sql/learn-sql-server/an-introduction-to-sql-server-filestream/
If you have a document-oriented database like CouchDB, it might be OK.
Storing them in the database is also useful if you need to scale your site across multiple web servers.
If they are static, then there is no need, as they can be deployed with your site; but things like avatars are generally better stored in the DB so they are available to all cluster members.

Optimal ASP.Net cache duration for a large site?

I've read lots of material on how to do ASP.Net caching but little on the optimal duration that pages should be cached for.
Let's say that I have a popular site with 50,000 pages. The content does not change frequently, so I could cache pages for up to an hour if I wanted. The server has 16 GB of RAM, but database connections are limited.
How long should pages be cached for?
My thinking is that if I set the cache duration too high (let's say 60 minutes), I will fill up memory with a fraction of the total content, which will continually be shuffled in and out of memory.
Furthermore, let's say that 10% of the pages are responsible for 90% of traffic. If the popular pages are hit every second, and the unpopular ones every hour, then a 60-second cache would keep only the load-intensive content cached without sacrificing freshness.
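To put assumed numbers on that (the question doesn't give a page size): if a rendered page averages, say, 300 KB, caching all 50,000 pages would need roughly 15 GB, nearly all of the 16 GB of RAM, while caching just the hot 10% (5,000 pages) would need about 1.5 GB. And with a 60-second cache, a page hit once per second is rebuilt once a minute instead of 60 times, while a page hit once an hour expires long before its next request and gains nothing.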
Should numerous but rarely-accessed content be cached at all?
I don't think you'll ever find a published formula or strict guideline on optimizing your cache. Every situation is going to be radically different, and even with the number of variables you're talking about here, it's impossible to quantify.
It'll be a combination of experience, guessing, monitoring and incremental adjustments to find your caching sweet spot for any given application.
It sounds like you've already got a handle on what might happen, so I would start with caching the 10% that represent your high traffic, and optimize from there. You may or may not find any performance gains further down the road with your less-used content, but put your effort into the major optimizations first.
One option would be to utilize some sort of disk caching, rather than memory caching, at least for infrequently visited pages. But in general, the point of caching is to support a request that is likely to be made again in a short time. So I would certainly give preference to the frequently requested pages.
