MeteorJS: Store/Call MongoDB Document Size (bsonsize) - meteor

In my MeteorJS app, documents grow very rapidly. When a document reaches MongoDB's standard 16MB document size limit, my application starts throwing errors telling me the document is too large to update any further:
exception: BSONObj size: 16895320 (0xEEEEEEEE) is invalid. Size must be between 0 and 16793600(16MB)
To prevent my application from reaching this error state, I want to be able to look up the size (bsonsize) of the document ahead of time. If the bsonsize is above, say, 15.8MB, I can create a new document and start logging data there instead, so no errors are encountered.
Using the Mongo shell, bsonsize can be determined via Object.bsonsize(db.collection.findOne({_id:'document id here'})). But Object.bsonsize() does not appear to be available within Meteor's JavaScript environment:
TypeError: Object function Object() { [native code] } has no method 'bsonsize'
How can this be done in JavaScript/MeteorJS? Thanks!
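One possible approach from server-side JavaScript (just a sketch, not something Meteor provides out of the box) is to compute the size with the bson npm package's calculateObjectSize() and roll the log over to a new document once it gets close to the limit. The Logs collection, the entries field, and the 15.8MB threshold below are hypothetical, and depending on the bson package version the function may live on a BSON instance rather than being a named export:
// A minimal sketch, assuming the bson npm package has been added to the
// project (meteor npm install bson). Names here are illustrative only.
import { calculateObjectSize } from 'bson';

const MAX_SAFE_SIZE = 15.8 * 1024 * 1024; // stay safely under the 16MB hard limit

function appendLogEntry(docId, entry) {
  const doc = Logs.findOne({ _id: docId });
  const currentSize = calculateObjectSize(doc);

  if (currentSize >= MAX_SAFE_SIZE) {
    // The document is close to the limit: start a new one instead of updating it.
    return Logs.insert({ entries: [entry], continuedFrom: docId });
  }
  return Logs.update(docId, { $push: { entries: entry } });
}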

Related

How to store keywords in firebase firestore

My application uses keywords extensively; everything is tagged with keywords, so whenever a user wants to search or add data I have to show keywords in an autocomplete box.
As of now I am storing keywords in a separate collection, as below:
export interface IKeyword {
  Id: string;
  Name: string;
  CreatedBy: IUserMin;
  CreatedOn: firestore.Timestamp;
}
export interface IUserMin {
  UserId: string;
  DisplayName: string;
}
export interface IKeywordMin {
  Id: string;
  Name: string;
}
My main document holds an array of keywords:
export interface MainDocument {
  Field1: string;
  Field2: string;
  ........
  other fields
  ........
  Keywords: IKeywordMin[];
}
But the problem is that autocomplete reads data frequently, so my document-read quota increases very fast.
Is there a way to implement this without increasing reads for keywords? The keywords are not the real data we need to fetch.
Below is my query to get the main documents:
query = query.where("Keywords", "array-contains-any", keywords)
I use the query below to get keywords for the autocomplete text box:
query = query.orderBy("Name").startAt(searchTerm).endAt(searchTerm+ '\uf8ff').limit(20)
This query runs many times as the user types, which causes more document reads.
Does this answer your question? https://fireship.io/lessons/typeahead-autocomplete-with-firestore/
Though the recommended solution there is to use a third-party tool: https://firebase.google.com/docs/firestore/solutions/search
To reduce document reads:
A solution that comes to mind, though I'm not sure it suits your use case, is Firestore's caching feature. By default, the Firestore client always tries to reach the server to get the latest changes to your documents, and only falls back to the cached data on the client device when it cannot reach the server. You can take advantage of this by reading from the cache first and reaching the server only when you want to. For web applications this feature is disabled by default; you can enable it as described in https://firebase.google.com/docs/firestore/manage-data/enable-offline, which also explains the feature in more detail.
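A rough sketch of that idea with the namespaced (v8-style) Firebase web SDK is below; the keywords collection name is assumed, and the query mirrors the one from the question:
// Sketch only: enable offline persistence once at startup.
firebase.firestore().enablePersistence()
  .catch(function (err) {
    // 'failed-precondition' (multiple tabs) or 'unimplemented' (unsupported browser)
    console.log('Persistence not enabled:', err.code);
  });

// Read keyword suggestions from the local cache first and fall back to the
// server only when the cache has nothing for this prefix.
var keywordQuery = firebase.firestore().collection('keywords')
  .orderBy('Name').startAt(searchTerm).endAt(searchTerm + '\uf8ff').limit(20);

keywordQuery.get({ source: 'cache' })
  .then(function (snapshot) {
    return snapshot.empty ? keywordQuery.get({ source: 'server' }) : snapshot;
  })
  .then(function (snapshot) {
    snapshot.forEach(function (doc) {
      // feed doc.data().Name into the autocomplete box
    });
  });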
I found a solution, thought I would share it here.
Create a new collection named typeaheads in the format below:
export interface ITypeAHead {
  Prefix: string;
  CollectionName: string;
  FieldName: string;
  MatchingValues: ILookupItem[];
}
export interface ILookupItem {
  Key: string;
  Value: string;
}
Depending on the minimum number of letters, store either 2 or 3 letters in Prefix, and search based on the prefix, collection, and field (a sketch is below). That way you will most likely end up with 2 or 3 document reads per search.
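A sketch of what the lookup could look like (the field names come from the interfaces above; the 3-letter prefix and the 'MainDocument'/'Keywords' values are just examples):
// Sketch: fetch the precomputed matches for the typed prefix.
// One document per (prefix, collection, field) keeps it to a few reads per search.
var prefix = searchTerm.substring(0, 3).toLowerCase();

db.collection('typeaheads')
  .where('Prefix', '==', prefix)
  .where('CollectionName', '==', 'MainDocument')
  .where('FieldName', '==', 'Keywords')
  .get()
  .then(function (snapshot) {
    snapshot.forEach(function (doc) {
      var matches = doc.data().MatchingValues; // [{ Key, Value }, ...]
      // filter matches client-side against the full searchTerm
      // and show them in the autocomplete box
    });
  });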
Hope this helps someone else.

How to read documents from Change Feed in Azure Cosmos DB since last checkpoint after restart?

I am using the Change Feed processor library to read the Change Feed on a partitioned collection; below is how I have configured it. I am using mostly the default options.
ChangeFeedProcessorOptions feedProcessorOptions = new ChangeFeedProcessorOptions
{
    LeaseRenewInterval = TimeSpan.FromSeconds(15),
};
var docObserverFactory = DocumentFeedObserverFactory.Create(this.destinationCollectionInfo, this.dbRepository);
this.builder
    .WithHostName(hostName)
    .WithFeedCollection(this.monitoredCollectionInfo)
    .WithLeaseCollection(this.leaseCollectionInfo)
    .WithProcessorOptions(feedProcessorOptions)
    .WithObserverFactory(docObserverFactory);
This runs fine as long as the Change Feed application is running and documents are being inserted/updated in the collection and the Change Feed app picks them up as expected.
The problem happens when I stop the Change Feed app for some time and insert/update a few documents in the collection. When I then start the Change Feed app, it doesn't pick up changes from where it last left off. The changes made while the Change Feed app was stopped are lost. But when I set the StartFromBeginning flag to true, it picks up everything from the start, including the changes made while the Change Feed app was stopped.
My understanding of reading from the current point (StartFromBeginning set to false) is that the Change Feed returns documents from where it last left off. But that doesn't seem to happen. Please help.
There are two ways to continue from exactly where you left off.
The first, and more accurate, one is to store the continuation token of the last thing you read. That way you can specify it when you start again, and it will win over both the StartTime and the StartFromBeginning flags.
The second is to provide the StartTime property, which will try to find the continuation token for a given time automatically. It has roughly 5-second precision, so there is a chance you might miss some documents.

Increase firebase cloud firestore string upload size

Hello, I am using Firebase and I am running into an error when trying to store a (large) data object in the database. Here is the exact error I am getting:
ERROR Error: Uncaught (in promise): FirebaseError: [code=invalid-argument]: The value of property "values" is longer than 1048487 bytes.
This is because Firebase doesn't allow storage of strings larger than 1048487 bytes (see the documentation).
Has anyone experienced this before and found a way to increase the storage limit? From what I have read, it doesn't seem like purchasing a plan will fix this.
Here is the code I am using to store the data, just in case it's needed:
storeSearchItem(searchItem, userID, data) {
  this.db.collection('users').doc(userID).collection('searches').add({
    searchTerm: searchItem.searchTerm,
    searchDate: new Date().toString(),
    values: data, // this values property is huge, a couple of megabytes
  });
}
The documentation is telling you there is a hard limit. It can't be changed. You will have to find a different way to store your data.
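One common workaround (not part of the answer above, just a sketch) is to put the large payload in Cloud Storage and keep only a reference to it in the Firestore document; the path format and field names below are illustrative:
// Sketch: upload the large `data` string to Cloud Storage (which has no 1MB
// document limit) and store only its path in Firestore.
async storeSearchItem(searchItem, userID, data) {
  const path = `users/${userID}/searches/${Date.now()}.json`;

  await firebase.storage().ref(path).putString(data);

  await this.db.collection('users').doc(userID).collection('searches').add({
    searchTerm: searchItem.searchTerm,
    searchDate: new Date().toString(),
    valuesPath: path, // fetch the payload later via storage().ref(path)
  });
}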

Firebase database change automatically [duplicate]

Given the starting time/date and duration, how can I make a server-side calculation that determines whether an object is "finished", "in progress", or "upcoming"?
--Show
    --duration: "144"
    --startDate: "2015-11-10"
    --startTime: "14:00"
    --status: "?"
Client-side javascript to determine if the show has started yet:
// if negative, then show hasn't started yet
var time = (-(startdate.getTime() - currentdate.getTime()) / 1000 / 60);
Client-side javascript to determine if the show has finished running yet:
// if negative, then show has finished
var timeLeft = channelDuration - timerStartTime;
There is no way to run your own server-side code on Firebase. See:
Common Firebase application architectures
Firebase Hosting with own server node.js
How would I run server-side code in Firebase?
How to write custom code (logic) when using firebase
But you can store a server-side timestamp, which seems to be what you're trying to do:
ref.child('Show/startTimestamp').set(Firebase.ServerValue.TIMESTAMP);
You can then get the shows that haven't started yet with:
var shows = ref.child('Shows');
shows.orderByChild('startTimestamp').startAt(Date.now()).on(...
For anyone passing by: Firebase now lets you do this with Cloud Functions. In this case you can create a function that determines the show's status from its other fields when the data is added to your database.
Please check out
https://firebase.google.com/docs/functions/
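A rough sketch of that idea with a Realtime Database trigger from firebase-functions is below; the /Shows path and field names follow the question, and the status logic (which ignores time zones) is only illustrative:
// Sketch: compute a show's status on the server when it is first written.
const functions = require('firebase-functions');

exports.setShowStatus = functions.database.ref('/Shows/{showId}')
  .onCreate((snapshot, context) => {
    const show = snapshot.val();
    const start = new Date(show.startDate + 'T' + show.startTime + ':00').getTime();
    const end = start + Number(show.duration) * 60 * 1000; // duration in minutes

    const now = Date.now();
    let status = 'upcoming';
    if (now >= end) {
      status = 'finished';
    } else if (now >= start) {
      status = 'in progress';
    }
    return snapshot.ref.child('status').set(status);
  });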

Please suggest a way to store a temp file in Windows Azure

Here I have a simple feature in an ASP.NET MVC3 app hosted on Azure.
1st step: the user uploads a picture
2nd step: the user crops the uploaded picture
3rd: the system saves the cropped picture and deletes the temp file, which is the originally uploaded picture
Here is the problem I am facing now: where to store the temp file?
I tried somewhere on the Windows file system and in LocalResources: the problem is that these resources are per instance, so there is no guarantee that the code showing the picture to crop runs on the same instance as the code that saved the temp file.
Do you have any idea on this temp file issue?
Normally the file exists just for a while before it is deleted.
The temp file needs to be instance-independent.
Ideally the file should have an expiry setting (for example, 1 hour) so it deletes itself in case the code crashes somewhere.
OK. So what you're after is basically something that is shared storage but expires. Amazon have just announced a rather nice setting called object expiration (https://forums.aws.amazon.com/ann.jspa?annID=1303). There is nothing like this for Windows Azure storage yet, unfortunately, but that doesn't mean we can't come up with another approach; indeed we may even come up with a better (more cost-effective) one.
You say that it needs to be instance-independent, which means using a local temp drive is out of the picture. As others have said, my initial leaning would be towards Blob storage, but you will have some cleanup effort there. If you are working with large images (>1MB) or low throughput (<100rps) then I think Blob storage is the only option. If you are working with smaller images AND high throughput, then the transaction costs for Blob storage will start to really add up (I have a white paper coming out soon which shows some modelling of this, but some quick thoughts are below).
For a scenario with small images and high throughput, a better option might be to use the Windows Azure Cache as your temporary storage area. At first glance it will be eye-wateringly expensive on a per-GB basis (110GB/month for Cache, 12c/GB for Storage). But with Storage your transactions are paid for, whereas with Cache they are 'free' (quotas are here: http://msdn.microsoft.com/en-us/library/hh697522.aspx#C_BKMK_FAQ8). This can really add up; e.g. with 100kb temp files held for 20 minutes at a system throughput of 1500rps, using Cache is about $1000 per month vs $15000 per month for Storage transactions.
The Azure Cache approach is well worth considering, but to be sure it is the 'best' approach I'd really want to know:
Size of images
Throughput per hour
A bit more detail on the actual client interaction with the server during the crop process. Is it an interactive process where the user will pull the image into their browser and crop visually? Or is it just a simple crop?
Here is what I see as a possible approach:
1. The user uploads the picture.
2. Your code saves it to a blob and records in a data backend the relation between the user session and the uploaded image (marking it as a temp image).
3. Display the image in the cropping user interface.
4. When the user is done cropping on the client:
4.1. retrieve the original from the blob
4.2. crop it according to the data sent by the user
4.3. delete the original from the blob and the record in the data backend used in step 2
4.4. save the final image to another blob (the final blob)
And have one background process checking for "expired" temp images in the data backend (used in step 2) in order to delete those images and their records.
Please note that even in a WebRole you still have the RoleEntryPoint descendant, and you can still override the Run method. By implementing an infinite loop in Run() (that method shall never exit!), you can check whether there is anything to delete every N seconds (depending on your Thread.Sleep() in Run()).
You can use Azure Blob storage. Have a look at this tutorial.
The sample below may help you:
https://code.msdn.microsoft.com/How-to-store-temp-files-in-d33bbb10
You have two ways to handle temp files in Azure.
1. You can use the Path.GetTempPath() and Path.GetTempFileName() functions for the temp file name.
2. You can use an Azure blob to simulate it.
private long TotalLimitSizeOfTempFiles = 100 * 1024 * 1024;

private async Task SaveTempFile(string fileName, long contentLength, Stream inputStream)
{
    try
    {
        // First, check whether the container exists; if not, create it.
        await container.CreateIfNotExistsAsync();
        // Initialize a blob reference.
        CloudBlockBlob tempFileBlob = container.GetBlockBlobReference(fileName);
        // If the blob already exists, delete the old blob.
        tempFileBlob.DeleteIfExists();
        // Check whether the total size of blobs is over the limit; if so, clear some out.
        await CleanStorageIfReachLimit(contentLength);
        // Then upload the new file.
        tempFileBlob.UploadFromStream(inputStream);
    }
    catch (Exception ex)
    {
        if (ex.InnerException != null)
        {
            throw ex.InnerException;
        }
        else
        {
            throw ex;
        }
    }
}

// Check whether the total size of blobs is over the limit; if so, delete the oldest ones.
private async Task CleanStorageIfReachLimit(long newFileLength)
{
    List<CloudBlob> blobs = container.ListBlobs()
        .OfType<CloudBlob>()
        .OrderBy(m => m.Properties.LastModified)
        .ToList();
    // Get the total size of all blobs.
    long totalSize = blobs.Sum(m => m.Properties.Length);
    // Calculate the effective limit, leaving room for the new upload.
    long realLimitSize = TotalLimitSizeOfTempFiles - newFileLength;
    // Delete the oldest blobs; once enough space is free, break the loop and stop deleting.
    foreach (CloudBlob item in blobs)
    {
        if (totalSize <= realLimitSize)
        {
            break;
        }
        await item.DeleteIfExistsAsync();
        totalSize -= item.Properties.Length;
    }
}
