Safe to Save Binary Data in Cloud Firestore Database? - firebase

I've always used the Cloud Firestore database (and the old Realtime Database) to store text, and Storage for images.
While using SurveyJS and AngularFirestore, I discovered I can push binary files into and out of the Firestore database with the attached code. My question is: is this OK? It works great, but I don't want to incur extra cost or a network slowdown. Thanks.
var resultAsString = JSON.stringify(this.survey.data);
this.qs.saveSupplierQuestionnaire(this.companyid, this.id, this.survey.data);
...
saveSupplierQuestionnaire(userid: string, questionnaireid: string, questionnaireData: any) {
  var resultAsString = JSON.stringify(questionnaireData);
  var numCompleted = 0; // test grading
  const dbRef = this.afs.collection<questionnaire>('companies/' + userid + '/questionnaires/')
    .doc(questionnaireid)
    .update({ results: resultAsString });
}

If it meets the needs of your application, then it's OK.
You should be aware that any time a document is read, the entire document is transferred to the client. So even if you don't use the field with the binary data, you are still making the user wait for the entire contents to download. This is true for all fields of a document, regardless of their type; there is really nothing special about binary fields, other than how the data is typed.
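For illustration only, one way to act on that is to keep the bulky serialized results out of the document that gets read most often. This is a sketch against the AngularFirestore code in the question; the "payload" subcollection and the field names are made up for the example:

saveSupplierQuestionnaire(userid: string, questionnaireid: string, questionnaireData: any) {
  const resultAsString = JSON.stringify(questionnaireData);
  const base = 'companies/' + userid + '/questionnaires/' + questionnaireid;
  // Small, frequently read document: status/metadata only.
  const meta = this.afs.doc(base).update({ completed: true });
  // Large, rarely read document: the full serialized results, fetched only when needed.
  const results = this.afs.doc(base + '/payload/results').set({ results: resultAsString });
  return Promise.all([meta, results]);
}

That way a list of questionnaires can be rendered without downloading every respondent's full answer set.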

Related

Firebase storage url, new file keep same access token

Duplicate of: Firebase storage URL keeps changing with new token
When a user uploads a profile pic, I store it in Firebase Storage with the file name set to the uid.
Let's say the user then makes, say, 100 posts and 500 comments, and then updates their profile image.
Currently I have a trigger that updates the profile image URL in all of the post and comment documents. I have to do this because when the image is changed in Storage the access token changes, and since the token is part of the URL the old URL no longer works.
What I want is for the access token not to change. If I can do that, I can avoid the mass updates that would massively increase my Firestore writes.
Is there any way to do this, or an alternative?
Edit:
Another solution, if you don't mind making the file public:
Add this storage rule and you won't have to use a token to access the file.
This allows read access to "mydir" globally, in any subfolder.
match /{path=**}/mydir/{doc} {
  allow read: if true;
}
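Once reads are public, a download URL without a token can also be built by hand from the REST endpoint, so nothing stored in your documents has to change when the file is replaced. A rough sketch; the bucket name and path below are placeholders:

// Assumption: default bucket name and a file stored under a "mydir" folder.
const bucket = 'my-project.appspot.com';
const path = 'users/' + userId + '/mydir/avatar.jpg';
const publicUrl =
  'https://firebasestorage.googleapis.com/v0/b/' + bucket +
  '/o/' + encodeURIComponent(path) + '?alt=media';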
There are only two options here:
You store the profile image URL only once, probably in the user's profile document, and look it up every time it is needed. In return you only have to write it once.
You store the profile image URL for every post, in which case you only have to load the post documents and not the profile URL for each. In return, you have to write the profile URL into each post document, and update it there whenever it changes.
For smaller networks the former is more common, since you're more likely to see multiple posts from the same user, so you amortize the cost of the extra lookup over multiple posts.
The bigger the network of users, the more interesting the second approach becomes, as you'll care about read performance and simplicity more than the writes you're focusing on right now.
In the end, there's no singular right answer here though. You'll have to decide for yourself what performance and cost profile you want your app to have.
Answer provided by #Prodigy here: https://stackoverflow.com/a/64129850/10222449
I tried this and it works well.
This will save millions of writes.
var storage = firebase.storage();
var pathReference = storage.ref('users/' + userId + '/avatar.jpg');
pathReference.getDownloadURL().then(function (url) {
  $("#large-avatar").attr('src', url);
}).catch(function (error) {
  // Handle any errors
});

Updating all the references to an image url in Cloud Firestore when the image is updated using Flutter

In my Flutter app, I have a userData collection in Cloud Firestore where I store the user's data, including name, image URL, etc. The user can create posts, add comments to posts, and so on, similar to other social apps, so I have multiple other collections where the user's info is stored, including the link to their profile image.
Let's say the user adds a comment to a post: I save their name, profile image URL and comment text as a document inside the "postComment" collection, and then I display their profile image, name and comment text on screen by reading this collection and document.
Now, if the user updates their profile image or even their name, which is reflected in the userData collection, I need to make sure that their name and image URL are updated in all the other collections as well.
What's the easiest and least costly way to do that? Do I need to loop through all my collections and their documents and update the field values, or is there a simple Cloud Function that can handle this?
Thanks!
I also store user profile images in Firebase Storage, BUT I use a very consistent schema to make the images easy to "guess":
When I have a document such as "/People/{userID}", and within the document is a field "image" which stores the URL to the image...
...then I store the file in Storage at the path "People/{userID}/image/image.jpg" (for example). This way it is trivial to generate a StorageRef to it, and a download URL.
All other uses of the image then point to that now-standardized URL. Change the image in Storage, and all references update.
For most "user" applications, the only use of the image is to feed it to a web page, so just the URL is needed; let the browser do the rest of the work.
As Fattie somewhat more aggressively stated, generally all you need is the URL. On its own, though, that still means you would have to find all the references and update them if the user changes the URL. Saving a copy in Firebase Storage, and using that consistent URL, means all references are "updated" just by changing what is stored at that location. The disadvantage is that it counts as a Storage read when fetched.
I'm finding that duplicating data in NoSQL is great when it's fairly static: created once and not dynamically changed (which covers a LOT of cases). If your application doesn't fit that, it's better to store a reference to the source of truth and incur the cost of the "lookup".
Here's a couple utilities I use to make this easier:
export const makeStorageRefFromRecord = (
record,
key = null,
filename = null
) => {
return FirebaseStorage.ref(
record.ref.path + (key ? "/" + key : "") + (filename ? "/" + filename : "")
);
};
export const makeFileURLFromRecord = (record, key = null, filename = null) => {
return FirebaseStorage.ref(
record.ref.path + (key ? "/" + key : "") + (filename ? "/" + filename : "")
).getDownloadURL();
};
("key" is essentially the fieldname)
remember the refpath is a string of the "/" separated collection/document path to the record, and is completely knowable in a simple situation, such as "People/{userID}". If you keep this internal, you can use "filename" as simple as "image.jpg" so it's always the same - it's unique, because of the path.
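A usage sketch for the helper above (the record variable and element id are just examples; assume record.ref.path is something like "People/someUserId"):

makeFileURLFromRecord(record, "image", "image.jpg")
  .then(url => {
    // bind the resolved download URL to an <img> element
    document.getElementById("avatar").setAttribute("src", url);
  })
  .catch(error => {
    // handle a missing file or insufficient Storage rules
  });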
Do I need to loop through all my collections and their documents and update the field values
Minimally, yes, that's what you have to do.
or is there like a simple cloud function that can handle this?
You can certainly write your own Cloud Function to do this as well. There is not an existing function that will just do what you want - you have to code it.
Alternatively, you can store the URL in just one document, store the ID of that document in the other documents that need to refer to it, and have the client make a query for the single document containing the URL you need.
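A minimal sketch of that lookup, using the v8-style web SDK seen elsewhere in this thread (the question itself is Flutter, and the field name is an assumption):

// Each comment stores only the author's user ID; the profile image URL is
// looked up from the single userData document when rendering.
firebase.firestore().collection('userData').doc(comment.userId).get()
  .then(snapshot => {
    const imageUrl = snapshot.get('imageUrl'); // assumed field name
    // render the comment with the freshly looked-up image URL
  });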
There are multiple ways to do that.
The best way is, instead of storing the profile picture again and again, to store document references. If you are storing the images as Base64, this also saves a lot of space and is cost efficient.
Another, less efficient way is to store the image itself in Firestore and refer to it from there.
Both of these rely on references.
The last and probably most inefficient way is querying. You can go to the posts collection (or, if you store each post as a collection, loop through all of them), add a where filter to search for the imageURL (or, more safely, a unique ID), and then change them all one by one.
These are the ways that I know of.

How to store keywords in firebase firestore

My application uses keywords extensively; everything is tagged with keywords, so whenever the user wants to search or add data I have to show keywords in an autocomplete box.
As of now, I am storing keywords in a separate collection, as below:
export interface IKeyword {
  Id: string;
  Name: string;
  CreatedBy: IUserMin;
  CreatedOn: firestore.Timestamp;
}
export interface IUserMin {
  UserId: string;
  DisplayName: string;
}
export interface IKeywordMin {
  Id: string;
  Name: string;
}
My main document holds an array of keywords:
export interface MainDocument {
  Field1: string;
  Field2: string;
  // ... other fields ...
  Keywords: IKeywordMin[];
}
But the problem is that autocomplete reads data frequently, so my document read count increases very fast.
Is there a way to implement this without increasing reads for keywords? The keywords are not the real data we need to fetch.
Below is my query to get the main documents:
query = query.where("Keywords", "array-contains-any", keywords)
I use the query below to get keywords for the autocomplete text box:
query = query.orderBy("Name").startAt(searchTerm).endAt(searchTerm+ '\uf8ff').limit(20)
This query runs many times as the user types in the autocomplete search, which causes more document reads.
Does this answer your question?
https://fireship.io/lessons/typeahead-autocomplete-with-firestore/
Though the recommended solution is to use a third-party tool:
https://firebase.google.com/docs/firestore/solutions/search
To reduce document reads:
A solution that comes to mind (though I'm not sure whether it suits your use case) is Firestore's caching feature. By default, the Firestore client always tries to reach the server to get the latest changes to your documents, and only falls back to the cached data on the client device if it cannot reach the server. You can take advantage of this feature by using the cache first and reaching the server only when you want. For web applications, this feature is disabled by default; you can enable it as shown in
https://firebase.google.com/docs/firestore/manage-data/enable-offline
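A hedged sketch of that idea, using the v8-style web SDK seen elsewhere in this thread (the collection reference is an assumption; check the linked docs for the exact options):

// Enable offline persistence once, at app startup.
firebase.firestore().enablePersistence()
  .catch(err => {
    // 'failed-precondition' (multiple tabs open) or 'unimplemented' (unsupported browser)
  });

// Later, the autocomplete query can be answered from the local cache only,
// which does not count as billed document reads.
firebase.firestore().collection('keywords')
  .orderBy('Name').startAt(searchTerm).endAt(searchTerm + '\uf8ff').limit(20)
  .get({ source: 'cache' })
  .then(snapshot => { /* render cached keywords in the autocomplete box */ });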
I found a solution; thought I would share it here.
Create a new collection named typeaheads in the format below:
export interface ITypeAHead {
  Prefix: string;
  CollectionName: string;
  FieldName: string;
  MatchingValues: ILookupItem[];
}
export interface ILookupItem {
  Key: string;
  Value: string;
}
Depending on the minimum number of letters, add either 2 or 3 letters to Prefix, and search based on the prefix, collection and field. That way you will most likely end up with 2 or 3 document reads per search.
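A sketch of the read side of this scheme, using the same plain query style as the question (the collection being searched is an assumption):

// One document read per prefix: fetch the typeahead entry for the first
// 3 letters typed, then filter its cached MatchingValues locally.
const prefix = searchTerm.substring(0, 3).toLowerCase();
firebase.firestore().collection('typeaheads')
  .where('Prefix', '==', prefix)
  .where('CollectionName', '==', 'MainDocument') // assumed collection name
  .where('FieldName', '==', 'Name')
  .limit(1)
  .get()
  .then(snapshot => {
    const matches = snapshot.empty ? [] :
      snapshot.docs[0].data().MatchingValues
        .filter(item => item.Value.toLowerCase().startsWith(searchTerm.toLowerCase()));
    // show "matches" in the autocomplete box
  });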
Hope this helps someone else.

Cosmos DB image/file attachment

I have created a document with a PDF attachment using the code below, and it's working (I'm able to retrieve the attached file).
var myDoc = new { id = "42", Name = "Max", City = "Aberdeen" }; // this is the document you are trying to save
var attachmentStream = File.OpenRead("c:/Path/To/File.pdf"); // this is the document stream you are attaching
var client = await GetClientAsync();
var createUrl = UriFactory.CreateDocumentCollectionUri(DatabaseName, CollectionName);
Document document = await client.CreateDocumentAsync(createUrl, myDoc);
await client.CreateAttachmentAsync(document.SelfLink, attachmentStream, new MediaOptions()
{
    ContentType = "application/pdf", // your application type
    Slug = "78", // this is actually attachment ID
});
I can upload a file directly to Blob Storage and put that blob URL in the document.
Can anyone help me understand the value of the built-in attachment feature? How is this better than Blob Storage and other options? Where does Cosmos DB keep attachments?
I want to understand in which scenarios we should consider this option (I know about the 2 GB per account limitation).
Can anyone help me understand the value of the built-in attachment feature? How is this better than Blob Storage and other options?
Based on the official doc, you can get the answer to your question.
You can store two types of data:
1. binary blobs/media
2. metadata (for example, location, author, etc.) of media stored in remote media storage
In addition, attachments have a garbage collection mechanism, which differs from Azure Blob Storage, I think.
Azure Cosmos DB will ensure to garbage collect the media when all of the outstanding references are dropped. Azure Cosmos DB automatically generates the attachment when you upload the new media and populates the _media to point to the newly added media. If you choose to store the media in a remote blob store managed by you (for example, OneDrive, Azure Storage, DropBox, etc.), you can still use attachments to reference the media. In this case, you will create the attachment yourself and populate its _media property.
So, per my understanding, if your resource data will be frequently added or deleted, you could consider using attachments. You just need to store the remote URL in the _media property.
Where does Cosmos DB keep attachments?
An attachment is stored in the collection as a JSON-format document; it can be created, replaced, deleted, read, or enumerated easily using either the REST APIs or any of the client SDKs. As far as I know, it can't be displayed in the portal so far.
BTW, Azure Cosmos DB is usually more expensive than Blob Storage; I think cost is an important factor to consider. For more details, refer to the pricing docs.
Hope I'm clear on this.

Please suggest a way to store a temp file in Windows Azure

I have a simple feature in an ASP.NET MVC 3 app hosted on Azure.
1st step: the user uploads a picture
2nd step: the user crops the uploaded picture
3rd step: the system saves the cropped picture and deletes the temp file, which is the uploaded original picture
Here is the problem I am facing now: where to store the temp file?
I tried somewhere on the Windows file system, and LocalResources: the problem is that these resources are per instance, so there is no guarantee that the instance showing the picture to crop will be the same instance that saved the temp file.
Do you have any ideas for this temp file issue?
Normally the file exists only for a short while before it is deleted.
The temp file needs to be instance-independent.
Ideally the file can have an expiry setting (for example, 1 hour) so it deletes itself, in case the code crashed somewhere.
OK. So what you're after is basically shared storage that expires. Amazon has just announced a rather nice setting called object expiration (https://forums.aws.amazon.com/ann.jspa?annID=1303). Nothing like this exists for Windows Azure storage yet, unfortunately, but that doesn't mean we can't come up with another approach; indeed, perhaps even a better (more cost-effective) one.
You say it needs to be instance-independent, which means using a local temp drive is out of the picture. As others have said, my initial leaning would be towards Blob storage, but you will have some cleanup effort there. If you are working with large images (>1 MB) or low throughput (<100 rps), then I think Blob storage is the only option. If you are working with smaller images AND high throughput, then the transaction costs for Blob storage will start to really add up (I have a white paper coming out soon which shows some modelling of this, but some quick thoughts are below).
For a scenario with small images and high throughput, a better option might be to use the Windows Azure Cache as your temporary storage area. At first glance it will be eye-wateringly expensive on a per-GB basis (110GB/month for Cache, 12c/GB for Storage). But with Storage your transactions are paid for, whereas with Cache they are 'free' (quotas are here: http://msdn.microsoft.com/en-us/library/hh697522.aspx#C_BKMK_FAQ8). This can really add up; e.g. using 100 KB temp files held for 20 minutes with a system throughput of 1500 rps, Cache works out to about $1000 per month vs $15000 per month for Storage transactions.
The Azure Cache approach is well worth considering, but to be sure it is the 'best' approach I'd really want to know:
Size of images
Throughput per hour
A bit more detail on the actual client interaction with the server during the crop process. Is it an interactive process where the user will pull the image into their browser and crop visually? Or is it just a simple crop?
Here is what I see as a possible approach:
the user uploads the picture
your code saves it to a blob and records in some data backend the relation between the user session and the uploaded image (marking it as a temp image)
display the image in the cropping user interface
when the user is done cropping on the client:
4.1. retrieve the original from the blob
4.2. crop it according to the data sent from the user
4.3. delete the original from the blob and the record in the data backend used in step 2
4.4. save the final to another blob (final blob).
And have one background process checking for "expired" temp images in the data backend (used in step 2) to delete the images and the records in the data backend.
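A sketch of that background cleanup, shown with the JavaScript/TypeScript Azure Storage SDK rather than the .NET code used elsewhere in this thread; the container name and the one-hour expiry are assumptions:

import { BlobServiceClient } from "@azure/storage-blob";

// Delete temp blobs older than one hour; call this periodically from a background process.
async function deleteExpiredTempImages(connectionString: string): Promise<void> {
  const container = BlobServiceClient
    .fromConnectionString(connectionString)
    .getContainerClient("temp-images"); // assumed container name
  const cutoff = Date.now() - 60 * 60 * 1000; // 1 hour
  for await (const blob of container.listBlobsFlat()) {
    const lastModified = blob.properties.lastModified;
    if (lastModified && lastModified.getTime() < cutoff) {
      await container.deleteBlob(blob.name); // also remove the matching record in the data backend
    }
  }
}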
Please note that even in a WebRole you still have a RoleEntryPoint descendant, and you can still override the Run method. Implementing an infinite loop in the Run() method (that method shall never exit!), you can check every N seconds whether there is anything to delete (depending on your Thread.Sleep() in Run()).
You can use Azure Blob storage. Have a look at this tutorial.
The sample below will help you:
https://code.msdn.microsoft.com/How-to-store-temp-files-in-d33bbb10
You have two ways to handle temp files in Azure:
1. You can use the Path.GetTempPath() and Path.GetTempFileName() functions for the temp file name.
2. You can use an Azure blob to simulate it.
private long TotalLimitSizeOfTempFiles = 100 * 1024 * 1024;

private async Task SaveTempFile(string fileName, long contentLength, Stream inputStream)
{
    try
    {
        // firstly, check whether the container exists, and if not, create it
        await container.CreateIfNotExistsAsync();
        // init a blob reference
        CloudBlockBlob tempFileBlob = container.GetBlockBlobReference(fileName);
        // if the blob reference exists, delete the old blob
        tempFileBlob.DeleteIfExists();
        // check whether the total size of blobs is over the limit, and if so, clear them
        await CleanStorageIfReachLimit(contentLength);
        // and upload the new file
        tempFileBlob.UploadFromStream(inputStream);
    }
    catch (Exception ex)
    {
        if (ex.InnerException != null)
        {
            throw ex.InnerException;
        }
        else
        {
            throw;
        }
    }
}

// check whether the total size of blobs is over the limit, and if so, clear them
private async Task CleanStorageIfReachLimit(long newFileLength)
{
    List<CloudBlob> blobs = container.ListBlobs()
        .OfType<CloudBlob>()
        .OrderBy(m => m.Properties.LastModified)
        .ToList();
    // get total size of all blobs
    long totalSize = blobs.Sum(m => m.Properties.Length);
    // calculate the real limit size available before upload
    long realLimitSize = TotalLimitSizeOfTempFiles - newFileLength;
    // delete oldest blobs first; when enough space is free, break the loop and stop deleting
    foreach (CloudBlob item in blobs)
    {
        if (totalSize <= realLimitSize)
        {
            break;
        }
        await item.DeleteIfExistsAsync();
        totalSize -= item.Properties.Length;
    }
}
