I seem to remember that in the old datastore, each property cost something like 2 datastore write operations (DWOs) per put, and perhaps a few more depending on how it was indexed.
In the old datastore, I would therefore frequently pack everything I didn't need to index into a JSON string and store that as a single TextProperty, to save on multiples of writes.
Having gotten used to saving and working with everything as JSON straight from the datastore, I naturally reached for the NDB JsonProperty when switching to NDB for a new app.
As usual, I became paranoid about optimization the first time I checked my quota limits (the quintessential free-quota-limit user experience?), and noticed that my datastore writes (coming entirely from models containing only JsonProperties) were clocking up a lot of DWO quota.
Immediately I wondered: does the GAE datastore perform multiple writes depending on the structure of the JsonProperty, or does it just store the whole property in as few DWOs as a "blob" store requires?
I thought the latter, and remembered reading as much in the docs, but the heavy quota consumption (the quintessential free-quota-limit user paranoia?) made me wonder whether JsonProperty might be less efficient than the old approach of saving JSON strings in a TextProperty, which is certainly an unstructured blob.
It would be good if this could be cleared up definitively, so I can get back to the "appengine promise" of focusing only on the app. :)
The datastore is runtime-agnostic and has no idea there is such a thing as Python, ndb, or JSON, so it can't index or write differently based on your data. In the implementation, JsonProperty is a BlobProperty that simply uses json to serialize and deserialize the data:
class JsonProperty(BlobProperty):
    def __init__(self, name=None, compressed=False, json_type=None, **kwds):
A BlobProperty can be indexed or not and also compressed or not:
class BlobProperty(Property):
    _indexed = False
    _compressed = False

    def __init__(self, name=None, compressed=False, **kwds):
It seems you may be comparing a situation where compressed was True to the default of False. Try setting it to True, and maybe post some raw numbers to compare (even some numbers from the db case, to get a sense).
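For illustration, a minimal sketch of what that would look like (the model and field names here are made up):

from google.appengine.ext import ndb

class Record(ndb.Model):
    # Stored as a single opaque blob; compressed=True gzips the serialized
    # JSON before it is written, which saves storage and bandwidth (it does
    # not change the number of write ops).
    payload = ndb.JsonProperty(compressed=True)

Record(payload={'answers': [1, 2, 3], 'done': False}).put()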
UPDATE:
I wasn't sure I was clear enough on this, and after Guido's comment it's clear I wasn't. The datastore writes for your ndb blob property will be exactly the same as the datastore writes for your db blob property. The numbers change only based on whether the entity already exists and whether the properties are indexed. My comment about compressed was to address any other performance/bandwidth/size issues you may have been confused by.
If you check out the billing page, there is a mapping from high-level operations to low-level operations. Relevant to what you are asking, we have:
New Entity Put (per entity, regardless of entity size): 2 writes + 2 writes per indexed property value + 1 write per composite index value
Existing Entity Put (per entity): 1 write + 4 writes per modified indexed property value + 2 writes per modified composite index value
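To make that concrete with some example numbers: putting a brand-new entity that holds a single unindexed JsonProperty costs just the flat 2 writes; a new entity with two indexed properties instead costs 2 + 2×2 = 6 writes; and re-putting an existing entity in which one indexed property value changed costs 1 + 4 = 5 writes.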
The documentation in Google Cloud's Datastore section is very confusing to me.
I was under the impression that Datastore could be used as key-value storage (similar to MongoDB). But I guess I misunderstood how its keys work, since I can't use string keys outright; I need to transform a series of strings into a key via some list => dataStore.key(list) transformation.
That's weird: I don't understand why it takes a list instead of a string, or why I can't use the list directly but have to call datastore.key first, but I can live with that. However, after playing with it a little, I discovered that datastore.key(list) gives me different values for the same list if I run it repeatedly!
So, now I need to somehow remember this key somewhere, but the reason I wanted to use datastore was that I was running in a service with no persistent memory to begin with.
Am I using datastore wrong? Can I use it at all for simple persistent key-value storage?
It appears that the issue was that I used a list that was too short.
The datastore expects collections, which contain documents, each of which is a key-value mapping. So instead of having my key point at a mapping directly, I needed to set up a collection and, within it, have keys mapped to mappings. In other words, I needed one more element in the key list: with only a single element, the path is just a kind with no name, so the key is incomplete and Datastore assigns a fresh auto-generated ID on every save.
results = await this.dataStore.get(this.dataStore.key([collectionId, documentId]));
and
await this.dataStore.save({ key: this.dataStore.key([collectionId, documentId]), data: someKeyValueObject });
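Putting it together, a minimal self-contained sketch with the Node.js client (the kind 'users' and name 'alice' are made-up examples):

const { Datastore } = require('@google-cloud/datastore');
const datastore = new Datastore();

async function demo() {
  // The key path alternates kind and name: 'users' acts like the collection,
  // 'alice' like the document id. Both elements make the key complete and stable.
  const key = datastore.key(['users', 'alice']);
  await datastore.save({ key, data: { score: 10, active: true } });
  const [entity] = await datastore.get(key); // { score: 10, active: true }
}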
I have a hashmap in a document. Let's say it looks like:
userHasFinished: {
'user1': false,
'user2': false,
'user3': false,
}
If I'm updating specific fields in this hashmap from false to true, and I know that only one user can initiate a write for a particular field (this is guarded by authentication), do I need a transaction for this update?
Put another way, do I need a transaction to make concurrent updates to a hashmap even though those concurrent updates will always be to different keys in the hashmap?
I'm assuming not, because an entire Firestore document is essentially a hashmap, and you certainly don't need transactions to update individual fields in a document.
You only need to use a transaction if the data that you write depends on the current data in the same document.
A user adding their own UID to a map does not depend on the existing data in the document, so it can be done safely (and more efficiently) with a merging set or an update call, as long as you address the specific subfield with dot notation. For example: { "userHasFinished.user1": false }.
Also see the documentation on updating fields in nested objects, which contains example code for many supported languages.
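With the Node.js SDK that would look something like this (a minimal sketch; the collection and document names are made up, and db is assumed to be an initialized Firestore instance):

// Dot notation targets only this one map key; every other entry in
// userHasFinished is left untouched, so concurrent writers to different
// keys don't clobber each other.
await db.collection('sessions').doc('session1').update({
  'userHasFinished.user1': true,
});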
I am using DocumentDb and would like to replace a document only if a property of the document has a certain value. Note that all stored documents have this property (and its value can never be empty).
The only way I found is to do this in 3 steps:
1) Read the document with ReadDocumentAsync
2) Check if the resource response has the property value I expect
3) If step 2 returns true then do the replace with ReplaceDocumentAsync, otherwise do something else
I am concerned about the additional request charge and latency, as this makes 2 round trips to the db. Is that the only way with the current .NET SDK, or is there a cleverer way?
Thank you
You could optimize this by using a stored procedure that executes directly in the database. The order of operations would be the same, and you would include your document as part of the payload to the sproc, but there would be no extra round trips or latency.
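A minimal sketch of what such a sproc might look like (DocumentDB stored procedures are written in JavaScript; the property name "status" and the condition are made-up placeholders):

function replaceIfMatch(newDoc, expectedValue) {
  var collection = getContext().getCollection();
  var response = getContext().getResponse();

  // Look up the current version of the document by id (one server-side read).
  var accepted = collection.queryDocuments(
    collection.getSelfLink(),
    {
      query: 'SELECT * FROM c WHERE c.id = @id',
      parameters: [{ name: '@id', value: newDoc.id }]
    },
    function (err, docs) {
      if (err) throw err;
      if (docs.length === 0 || docs[0].status !== expectedValue) {
        response.setBody(false); // condition not met; leave the document alone
        return;
      }
      // Condition holds: replace within the same server-side call.
      collection.replaceDocument(docs[0]._self, newDoc, function (err2) {
        if (err2) throw err2;
        response.setBody(true);
      });
    });
  if (!accepted) throw new Error('Query was not accepted by the server.');
}

The client then makes a single ExecuteStoredProcedureAsync call with the replacement document and expected value, and gets back true or false.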
I've read almost everywhere about structuring one's Firebase Database for efficient querying, but I am still a little confused between two alternatives that I have.
For example, let's say I want to get all of a user's "maxBenchPressSessions" from the past 7 days or so.
I'm stuck between picking between these two structures:
In the first structure, I use the user's id as the attribute name, with true or false as its value. In the second, I use userId as the attribute name, with the user's id as its value.
Is one faster than the other, or would they be indexed in relatively the same manner? I'm kind of new to database design, so I want to make sure that I'm following correct practices.
PROGRESS
I have come up with a solution that will both flatten my database AND allow me to attach a ListenerForSingleValueEvent using orderBy only once, but it only works when I want to check whether a user has a session saved for a specific day.
I can give each maxBenchPressSession object a key in the format userId_dateString. However, if I want to get all of the user's sessions from the last 7 days, I don't know how to do that in one query.
Any ideas?
I recommend watching these videos; they explain how to structure the data very well.
Links to the Firebase 3.0 playlist:
Firebase 3.0: Data Modelling
Firebase 3.0: Node Client
As I understand it, the principle for using Firebase effectively is to keep each query for data as small as possible; it doesn't matter how many requests you make.
But a query like the following will work for your case. You'll have to add another field, "negativeDate", to the database.
This field lets you fetch the last seven entries. Here's a video:
https://www.youtube.com/watch?v=nMR_JPfL4qg&feature=youtu.be&t=4m36s
.limitToLast(7) - take the last 7 entries
.orderByChild('negativeDate') - sort by the negated date
Example query:
const ref = firebase.database().ref('maxBenchPressSession');
ref.orderByChild('negativeDate').limitToLast(7).on('value', function(snap){ })
Then add the user to the path, and it returns only that user's sessions:
const ref = firebase.database().ref('maxBenchPressSession/' + userId);
ref.orderByChild('negativeDate').limitToLast(7).on('value', function(snap){ })
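For this to work, each session has to be saved with the extra negativeDate field, along these lines (a sketch; the other fields are made up):

const session = {
  weight: 100,
  date: Date.now(),
  negativeDate: -Date.now() // negated timestamp, used only for ordering
};
firebase.database().ref('maxBenchPressSession/' + userId).push(session);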
I would like to find the date and time that any schema modification has taken place on a particular database. Modifications are things like tables or columns that have been created, altered, or dropped. It does not include any data that has been inserted, updated, or deleted.
The reason why I need this is that I am writing a .NET utility that depends heavily on the data returned from dbc.tables, dbc.columns, and dbc.indices. Since querying these views can be a very expensive operation, I want to read it all into custom business objects and then serialize the objects to an XML file stored on disk. This way, I can just deserialize the data when I need it, unless a schema modification has happened since the XML file was written, at which point I'll refresh the local file with the updated schema.
LastAlterTimestamp - if it is equal to CreateTimestamp, then the object has not been modified since being created or replaced. It is updated when an attribute specific to that data dictionary object is updated.
For example, DBC.Databases.LastAlterTimestamp is not updated when a child object (table, view, macro, stored procedure, function, etc.) is added, removed, or altered; it is updated in situations such as when the password, default role, profile, or account is changed. So to detect schema changes, look at LastAlterTimestamp on the individual objects themselves, e.g. the rows in DBC.Tables.
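Something along these lines could drive the cache check (a sketch; it assumes the objects you care about all appear in DBC.TablesV):

SELECT MAX(LastAlterTimestamp)
FROM DBC.TablesV            -- rows for tables, views, macros, etc.
WHERE DatabaseName = 'MyDatabase';

If the result is later than the timestamp of your local XML file, re-query dbc.tables, dbc.columns, and dbc.indices and rebuild the cache.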