Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I'm not sure if this is the place to ask this, but I have a best practices question.
I have a dashboarding service fed by Salesforce data that displays the number of Task X performed this week (X being Opportunities Closed - Won, Leads Created, etc).
Currently, the data is being pulled regularly and stored in a SQL database, which is mapped to a REST API that the Client App calls to get the aggregations between two date values, and will be fed additionally by Webhook calls via SF's Insert Triggers.
I want to know if having a Firestore Collection as a Cache for Aggregated SQL is a good idea, or if there is a better approach. The benefits I see are reduced traffic on my SQL server and instant updates (if the "cache" (Firestore) is updated, the client's value updates instantly as well).
When data is pulled from SF or a new record is received via the Insert Trigger/Webhook, I can update the Firestore record and the client will receive the change immediately.
My idea for a Firestore Document would be
{
  user: "123",
  sfOwnerId: "124",
  sfTaskType: "Opportunities Closed Won This Week",
  count: 23
}
Is this a good idea? Is there a better one out there?
Thank you in advance!
Your strategy of storing the aggregated data is what the Firestore documentation suggests for aggregation, so I think it's a pretty solid idea.
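As a rough sketch of the aggregated-document approach from the question, the logic for a new Salesforce record could look something like this. A plain dictionary stands in for the Firestore collection, and `make_doc_id` and `apply_new_record` are hypothetical names, not part of any SDK. In a real setup the increment would be done with Firestore's atomic increment operation rather than a read-modify-write.

```python
def make_doc_id(user, sf_owner_id, sf_task_type):
    """Build a deterministic id so each aggregate lives in exactly one document."""
    return f"{user}_{sf_owner_id}_{sf_task_type}"


def apply_new_record(store, user, sf_owner_id, sf_task_type):
    """Create the aggregate document if needed, then bump its count by one.

    `store` is an in-memory dict standing in for the Firestore collection
    (an assumption made so that the sketch is runnable on its own).
    """
    key = make_doc_id(user, sf_owner_id, sf_task_type)
    doc = store.setdefault(key, {
        "user": user,
        "sfOwnerId": sf_owner_id,
        "sfTaskType": sf_task_type,
        "count": 0,
    })
    doc["count"] += 1
    return doc
```

Calling `apply_new_record` once per webhook call or pulled record keeps the document shape from the question up to date, and any client listening to that document would see the new count.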
An alternative strategy would be to only store the Salesforce data in Firestore as it comes in, not aggregated, and let the client perform the aggregation. This can be achieved by subscribing to real-time updates on a query against the collection. In this setup, you would perform the calculation within the onSnapshot callback (assuming you're using the Web environment).
The advantage here is a possible performance gain: the aggregation no longer depends on a Cloud Function, and Cloud Functions often suffer from "cold start" latency.
Note: Several of the recommendations in this document center around
what is known as a cold start. Functions are stateless, and the
execution environment is often initialized from scratch, which is
called a cold start. Cold starts can take significant amounts of time
to complete. It is best practice to avoid unnecessary cold starts, and
to streamline the cold start process to whatever extent possible (for
example, by avoiding unnecessary dependencies).
Source
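The client-side alternative described above boils down to recomputing the counts every time the snapshot changes. Here is a minimal sketch of that calculation, written in Python for brevity; in a web client the same logic would sit inside the snapshot callback. The `createdAt` field and the function name are assumptions, since the question does not show the raw record shape.

```python
from datetime import date


def aggregate_snapshot(records, start, end):
    """Count records per (sfOwnerId, sfTaskType) between two dates, inclusive.

    `records` is the list of raw documents delivered by the listener.
    Each record is assumed to carry a `createdAt` date (hypothetical field).
    """
    counts = {}
    for record in records:
        if start <= record["createdAt"] <= end:
            key = (record["sfOwnerId"], record["sfTaskType"])
            counts[key] = counts.get(key, 0) + 1
    return counts
```

The trade-off is that every raw record in the date range is a billed document read, whereas the pre-aggregated approach reads one document per counter.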
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 1 year ago.
I have been developing an app which has a feature to chat with other users. I have been really confused for the past few days trying to find out which would be the better database to use in terms of pricing. I am talking about going at scale here.
Whenever a user sends a message he creates a document and while doing so I am also checking if the other user is online or not, for which I read another document. On average I have to make around 7-8 writes per message and 8-10 reads per message. Also whenever a user opens a conversation he sees the last 15 messages, and if he scrolls then he sees more. This increases the reads as well.
Also, suppose the other user has already read 15 documents to see the last 15 messages. When I send a new message and they receive it, the 15th message gets replaced with the new one. Do I get charged for the 15 document reads again?
Pricing is the main concern here, please help me find the best approach here.
Firebase has a database recommender and a pricing calculator in its documentation to help you answer this question. I'd expect it to point to the Realtime Database here, primarily based on the fact that you'll have many smaller write operations.
For your second question (please limit yourself to a single question per post going forward): if your listener remains active, or if your local cache already contains the 14 unmodified documents, you will be charged only one document read (for the new document that needs to be read on the server to return it to the client).
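To make the read accounting above concrete, here is a deliberately simplified model: a read is billed only for documents in the new result set that the client does not already hold. The function names are hypothetical, and the model ignores modified documents and listeners that have been disconnected, so treat it as an illustration of the reasoning rather than the billing rules themselves.

```python
def latest(messages, limit=15):
    """Return the ids of the most recent `limit` messages (a limit-15 query)."""
    return messages[-limit:]


def billed_reads(cached_ids, result_ids):
    """Count documents in the result set that are not already cached locally."""
    return sum(1 for doc_id in result_ids if doc_id not in cached_ids)


# Opening the conversation: nothing is cached yet, so all 15 documents are read.
messages = [f"m{i}" for i in range(1, 16)]
first_view = latest(messages)
initial_reads = billed_reads(set(), first_view)

# A new message arrives: the window slides by one and only the new document is read.
messages.append("m16")
second_view = latest(messages)
follow_up_reads = billed_reads(set(first_view), second_view)
```

Under these assumptions `initial_reads` is 15 and `follow_up_reads` is 1, which matches the answer: the 14 unmodified documents are not charged again.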
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I watched this video on data modelling in Cosmos DB.
In the video, it is explained that if you can model your data such that your most common queries are in-partition queries, then you can minimize RUs, which in turn minimizes cost and maximizes performance.
The example used in the video is a blogging system. They showed that by moving things around such that Blog Posts and Comments are stored as separate entities in the same collection all partitioned by blogId they could achieve a low RU for a common query.
They then showed that searching for all blog posts by a specific user, being a cross-partition query, is very expensive. So they then duplicate all blog post data and add each blog post as a separate entity to the users collection, which is already partitioned by userId. Searching for posts by a user is now cheap. The argument is that storage is much cheaper than CPU time, so this is a fine thing to do.
My question is: do I continue to follow this pattern when I want to make more things efficiently searchable? For example, I want to be able to search on blog topic (of which there could be many per blog post), a discrete blog rating, and so on.
I feel like extending this pattern for each search term is unsustainable. In these cases, do I just have to live with high RU searches or is there some clever way of making things efficient?
This essentially comes down to knowing whether the cost of using change feed to copy data from one container to another is less than the cost of doing cross-partition queries. This requires knowing the access patterns of your application and also requires measuring the average cost of these queries versus the cost of using change feed to make another copy. Change feed consumes 2 RU/s when it polls the container, then 1 RU for each 1 KB or less read from the source container and ~8 RU for each 1 KB or less inserted into the target container, depending on your index policy. Multiply that by the rate at which data is inserted or updated. Then calculate this per day or per month to compare cost.
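The comparison described above can be put into a small calculator. This is only a sketch: the RU figures are the ones quoted in this answer, polling is assumed to run continuously at 2 RU/s (an upper bound), and the function names and any workload numbers you feed in are hypothetical. You would replace them with measurements from your own containers.

```python
import math

SECONDS_PER_DAY = 86_400


def change_feed_ru_per_day(writes_per_day, doc_kb,
                           poll_ru_per_s=2, read_ru_per_kb=1, insert_ru_per_kb=8):
    """Estimate the daily RUs spent keeping a second container in sync.

    Each document is charged per started 1 KB unit, for both the read from
    the source container and the insert into the target container.
    """
    kb_units = max(1, math.ceil(doc_kb))
    polling = poll_ru_per_s * SECONDS_PER_DAY
    copying = writes_per_day * kb_units * (read_ru_per_kb + insert_ru_per_kb)
    return polling + copying


def cross_partition_extra_ru_per_day(queries_per_day, cross_ru, in_partition_ru):
    """Estimate the daily RUs saved by turning a cross-partition query
    into an in-partition query against the copy."""
    return queries_per_day * (cross_ru - in_partition_ru)


def copy_is_cheaper(writes_per_day, doc_kb, queries_per_day, cross_ru, in_partition_ru):
    """True when maintaining the copy costs fewer RUs than it saves."""
    saved = cross_partition_extra_ru_per_day(queries_per_day, cross_ru, in_partition_ru)
    return change_feed_ru_per_day(writes_per_day, doc_kb) < saved
```

For instance, with 10,000 writes of 1 KB documents per day the copy costs about 262,800 RU per day under these assumptions, so it only pays off for a search term that is queried often enough to save more than that.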
If what you're looking for is free text search on your data, you may want to look at using Azure Search. This is simpler than the change feed approach, but Azure Search can be quite expensive as well.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I'm doing a fair bit of work on a set of Firestore collections and documents. It amounts to a good amount of writes and reads, as I'm setting two-way refs and whatnot. Multiple documents are being written to multiple times.
Since Firestore offers offline capability, is it possible to reduce the number of writes by preparing all the data locally, then sending it to the server?
I'm using the Admin SDK.
It depends on what you mean. One document write is always going to cost one document write, no matter when or how that document was written. Batch writes don't in any way reduce the number of documents written, they just make all the document writes take effect at the exact same moment in time.
If you're staging lots of changes to a single document to take effect later, then feel free to do that. Just write the document whenever you've figured out what the final document looks like, and no sooner.
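One way to stage changes as described above is to merge every pending update per document in memory and write each document once at the end. The sketch below assumes a hypothetical `StagedWrites` helper and takes the actual write as a callable, so it does not depend on any SDK; with the Admin SDK you would pass in a function that performs the real document write.

```python
class StagedWrites:
    """Collects field updates per document path and writes each document once."""

    def __init__(self, write_fn):
        # write_fn(path, fields) performs the real write (hypothetical hook).
        self._write = write_fn
        self._pending = {}

    def update(self, path, fields):
        """Merge `fields` into the staged state for `path`; nothing is written yet."""
        self._pending.setdefault(path, {}).update(fields)

    def flush(self):
        """Write every staged document exactly once and return the write count."""
        count = 0
        for path, fields in self._pending.items():
            self._write(path, fields)
            count += 1
        self._pending.clear()
        return count
```

Five staged updates spread over two documents then cost two document writes instead of five, which is the only kind of saving available here, since each write that does reach the server is still billed as one write.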
I'm moving away from Google App Engine standard Python 2.7 NDB to Svelte, Firestore and RxFire.
I was able to dramatically reduce the number of reads and writes by batching hundreds of appengine NDB entities (datastore / data objects) into a single document using a data object map.
Every data object has a batchId prop to optimize (batched) batch writes / document writes. (batchId = docId)
Most of the querying is now done in the client using filters. This resulted in very simple reactive Firestore queries using RxFire observables. This also dramatically reduced the number of composite indexes.
doc:
  batchId: docId
  map: data objects
    batchId: docId
    other props ...
    ....
I also used maps of data objects for putting all kinds of configuration and definition data into single documents. This setup is easy to maintain and available with a single doc read. The doc reads are part of observables to react to doc changes.
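The layout described above, with many data objects packed into one document's map and filtered on the client, might be sketched like this. The function names and the `batch-N` id scheme are assumptions for illustration; the original post does not say how batch ids are chosen. Keep in mind that a Firestore document has a size limit, so the batch size has to be chosen with the object size in mind.

```python
def pack_into_docs(entities, batch_size):
    """Group data objects into documents of at most `batch_size` objects.

    `entities` maps object id -> data object (a dict). Every packed object
    gets a `batchId` equal to the id of the document that holds it.
    """
    docs = {}
    for index, (entity_id, obj) in enumerate(entities.items()):
        doc_id = f"batch-{index // batch_size}"  # hypothetical id scheme
        doc = docs.setdefault(doc_id, {"batchId": doc_id, "map": {}})
        doc["map"][entity_id] = {**obj, "batchId": doc_id}
    return docs


def filter_client_side(doc, predicate):
    """Apply a filter to the objects of one already-loaded document."""
    return {key: obj for key, obj in doc["map"].items() if predicate(obj)}
```

With 250 objects and a batch size of 100, loading everything takes three document reads instead of 250, and further querying happens locally through `filter_client_side`.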
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I am aware that for Cloud Firestore a read is a document (whether the document has 5 or 50 nodes). How does this compare to the RTDB?
If I have a query that has a limit of 25, is this going to be 25 reads, or 25 times the number of items in each node?
Cheers.
Your question is a bit of a non-sequitur, as realtime database doesn't bill by reads, it bills by data transferred (and storage, of course). So, the thing that affects your cost is the size of the items transferred, which is only indirectly based on the number of items due to a limit on the query. Currently, the costs are about US $1 per GB downloaded assuming you are on the Blaze plan.
To compare this with the costs for Firestore would require knowing a lot more about the shape of your traffic -- how many reads and writes, average size of a read, etc. Note that Cloud Firestore also indirectly charges for data transferred, but at a much lower rate, as it is only the Google Cloud Network pricing.
This means that you can generally get quite a large number of Firestore document reads for the cost that RTDB charges for transferring 1 GB (e.g. at current prices for egress to most of the internet, excluding some Asia/Pacific destinations, the $1 that RTDB charges for 1 GB of transfer would buy 1 GB of transfer plus over 1.4M Firestore document reads).
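The arithmetic behind that estimate can be reproduced with a few lines. The default prices below ($0.12 per GB of egress and $0.06 per 100,000 document reads) are assumptions chosen to be consistent with the figure in this answer, not quoted from it; prices vary by region and change over time, so check the current pricing pages before relying on them.

```python
def firestore_reads_for_budget(budget_usd, gb_transferred,
                               egress_usd_per_gb=0.12, usd_per_100k_reads=0.06):
    """Return how many document reads fit in a budget after paying for egress.

    Both price defaults are assumed values, for illustration only.
    """
    remaining = budget_usd - gb_transferred * egress_usd_per_gb
    if remaining <= 0:
        return 0
    return int(remaining / usd_per_100k_reads * 100_000)


# The $1 that RTDB would charge for 1 GB, spent on Firestore instead.
reads_for_one_dollar = firestore_reads_for_budget(1.0, 1.0)
```

With these assumed prices, `reads_for_one_dollar` comes out a little above 1.4 million, in line with the estimate given above.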
The documentation references several things you can do to help control costs, including (but not limited to):
Prefer the native SDKs to the REST API
Monitor your data usage and use the profiler tool to measure read operations.
Use fewer, longer lived connections, as SSL and connection overhead can contribute to your costs (but generally are not the bulk of your cost).
Ensure your listeners are limited to only the data you care about, and are as low in the database tree as possible, and only download updates, (e.g. on() vs once()).
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I'm halfway through coding a basic multi-tenant SaaS ECM solution. Each client has its own instance of the database / datastore, but the .NET app is single instance. The documents are pretty much read-only (i.e. an image archive of TIFFs or PDFs).
I've used MSSQL so far, but then started thinking this might be viable in a NoSQL DB (e.g. MongoDB, CouchDB). The basic premise is that it stores documents, each with their own particular indexes. Each tenant can have multiple document types.
e.g. One tenant might have an invoice type, which has Customer ID, Invoice Number and Invoice Date. Another tenant might have an application form, which has Member Number, Application Number, Member Name, and Application Date.
So far I've used the old method which SharePoint uses (or used to use), and created a document table which has int_field_1, int_field_2, date_field_1, date_field_2, etc. Then, I've got a "mapping" table which stores the customer-specific index name and the database field it maps to. I've avoided the key-value pair model in the DB due to the volume of documents.
This way, we can support multiple document types in the one table, and get reasonably high performance out of it, and allow for custom document type searches (i.e. user selects a document type, then they're presented with a list of search fields).
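To illustrate the mapping-table approach described above, here is a small runnable sketch using SQLite from the Python standard library in place of MSSQL. The table layout follows the question, but the `FIELD_MAP` contents, the `doc_type` column, the function names, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical mapping: (document type, tenant-facing index name) -> generic column.
FIELD_MAP = {
    ("invoice", "Customer ID"): "int_field_1",
    ("invoice", "Invoice Number"): "int_field_2",
    ("invoice", "Invoice Date"): "date_field_1",
}


def build_search(doc_type, criteria):
    """Translate tenant-facing search fields into a parameterized SQL query.

    Column names come only from FIELD_MAP, never from user input; an unknown
    field raises KeyError. Values are always passed as query parameters.
    """
    clauses = ["doc_type = ?"]
    params = [doc_type]
    for field, value in criteria.items():
        column = FIELD_MAP[(doc_type, field)]
        clauses.append(f"{column} = ?")
        params.append(value)
    return "SELECT id FROM document WHERE " + " AND ".join(clauses), params


def demo():
    """Create a tiny in-memory document table and run one mapped search."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE document (id INTEGER PRIMARY KEY, doc_type TEXT, "
        "int_field_1 INTEGER, int_field_2 INTEGER, date_field_1 TEXT)"
    )
    conn.execute("INSERT INTO document VALUES (1, 'invoice', 42, 1001, '2010-05-01')")
    conn.execute("INSERT INTO document VALUES (2, 'invoice', 43, 1002, '2010-05-02')")
    sql, params = build_search("invoice", {"Customer ID": 42})
    rows = [row[0] for row in conn.execute(sql, params)]
    conn.close()
    return rows
```

Here `demo()` finds the one invoice whose "Customer ID" is 42. In a document database the mapping step would disappear, because each document could simply carry its own field names.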
However, a NoSQL DB might make this a lot simpler, as I don't need to worry about denormalizing the document. My concern is with the rest of the data around a document. We store an "action history" against the document. This tracks views, whether someone emails the document from within the system, and other "future" functionality (e.g. faxing).
We have control over the document load process, so we can manipulate the data however it needs to be to get it in the document store (e.g. assign unique IDs). Users will not be adding in their own documents, so we shouldn't need to worry about ACID compliance, as the documents are relatively static.
So, my questions, I guess:
Is a NoSQL DB a good fit?
Is MongoDB the best for ASP.NET? (I saw Raven and Velocity, but they're still kinda beta.)
Can I store a key for each document, and then store the action history in a MSSQL DB with this key? I don't need to do joins; it would only be read if a person clicks "View History" against a document.
How would performance compare between the two (NoSQL DB vs denormalized "document" table)?
Volumes would be up to 200,000 new documents per month for a single tenant. My current scaling plan with the SQL DB involves moving the SQL DB into a cluster when certain thresholds are reached, and then reviewing partitioning and indexing structures.
Answer:
For the document-oriented portions, and likely for the whole thing, a NoSQL solution should work fine.
I've played with, and heard good things about, MongoDB, and would probably recommend it first for a .NET project.
For the document-oriented portion, the performance should be excellent compared to a SQL database. At smaller scale it should be equivalent, but the ability to scale out later will be a huge benefit.