Datastore - Allocate ids with max length - google-cloud-datastore

I want to auto-generate IDs in Cloud Datastore that don't exceed 7 digits in length.
In a previous version of the Python Cloud Datastore library, one could provide a max parameter to ensure that the auto-generated ID didn't exceed a maximum limit.
Ref: https://cloud.google.com/appengine/docs/standard/python/ndb/creating-entity-keys
This feature has now been removed from the latest Python Datastore library:
Ref: https://googleapis.dev/python/datastore/latest/_modules/google/cloud/datastore/client.html#Client.key
How can one achieve the same thing?

Unfortunately, there isn't a way to achieve that in the newest version. As the official documentation confirms here, there is no replacement for the method max_allocate_ids_keys(), so it's not possible to limit the size of the ID, which, as you probably know, can be up to 16 digits.
If you would find it useful, you can raise a Feature Request here, on Google's Issue Tracker, so they can consider implementing a similar feature in a future version.
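This doesn't restore the removed feature, but one possible workaround is to pick the ID in your own code and pass it explicitly when building the key (for example client.key(kind, some_id)). The sketch below only shows the ID-picking part. The names generate_bounded_id and exists are hypothetical, and exists stands in for whatever lookup you use to check whether an ID is already taken; in a real app the check and the write would need to happen in one transaction to avoid races.

```python
import random
from typing import Callable

MAX_ID = 9_999_999  # largest 7-digit integer


def generate_bounded_id(
    exists: Callable[[int], bool],
    max_id: int = MAX_ID,
    max_attempts: int = 20,
) -> int:
    """Return a random integer ID in [1, max_id] that `exists` reports as free.

    `exists` is a hypothetical callback; in practice it would look the
    candidate key up in Datastore.
    """
    for _ in range(max_attempts):
        candidate = random.randint(1, max_id)
        if not exists(candidate):
            return candidate
    raise RuntimeError("no free ID found; the ID space may be nearly full")


# Usage with an in-memory stand-in for the Datastore lookup.
taken = {42, 1234567}
new_id = generate_bounded_id(lambda i: i in taken)
print(new_id)
```

With only about ten million possible values, collisions become more likely as the kind grows, which is why the retry loop gives up after a fixed number of attempts.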

Related

Has firestore removed the soft limit of 1 write per second to a single document?

Firestore has always had a soft limit of 1 write per second to a single document. That meant that for doing things like a counter that updates more frequently than once per second, the recommended solution was sharded counters.
Looking at the Firestore Limits documentation, this limit appears to have disappeared. The Firebase summit presentation mentioned that Firestore is now more scalable, but only mentioned hard limits being removed.
Can anyone confirm whether this limit has indeed been removed, and whether we can remove all our sharded counters in favor of writing to a single count document tens or hundreds of times per second?
firebaser here
This was indeed removed from the documentation. It was always a soft limit that was not enforced in the code, but instead an estimate for the physical limitation of how long it takes to synchronize the changes to indexes and multiple data centers.
We've significantly improved the infrastructure used for these write operations, and now provide tools such as the Key Visualizer to better analyze performance and look for hot spots in the read and write behavior of your app. While there's still some physical limit, we recommend using these tools rather than depending on a single documented value to analyze your app's database performance.
For most use-cases I'd recommend using the new COUNT() operator nowadays. But if you want to continue using write-time aggregation counters, it is still recommended to use a sharded counter for high-volume count operations. We've just stopped giving a hard number for when to use it.
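For readers who haven't seen the pattern, here is a minimal in-memory sketch of the idea behind a sharded counter. It is not Firestore code: the class ShardedCounter is hypothetical, and each list slot stands in for a separate shard document. Writes go to a randomly chosen shard so that no single document takes every write, and reading the count means summing all shards.

```python
import random


class ShardedCounter:
    """In-memory model of a sharded counter; each slot stands in for one shard document."""

    def __init__(self, num_shards: int = 10) -> None:
        self.shards = [0] * num_shards

    def increment(self, amount: int = 1) -> None:
        # Pick a random shard so concurrent writers rarely touch the same one.
        index = random.randrange(len(self.shards))
        self.shards[index] += amount

    def total(self) -> int:
        # Reading the count means summing every shard.
        return sum(self.shards)


counter = ShardedCounter(num_shards=5)
for _ in range(1000):
    counter.increment()
print(counter.total())  # 1000
```

The trade-off is that more shards spread writes further but make every read of the total more expensive.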

Firestore unused Index limitation?

I have an application that uses Firebase Firestore as a database. Currently, I am out of composite indexes in the database. It has reached the limit of 200 and I can't add more.
I have manually deleted some indexes. Since my application is pretty large, it is very difficult to find the unused indexes manually. It also takes time to recheck the same thing in multiple parts of the application.
I am looking for a solution: either a better way to identify the unused indexes than searching manually, or an option to raise the index limit.
You may file a support ticket with Google Cloud Support or Firebase Support to get the unused indexes; there is no automated way to do this.
Additionally, there is no way to increase the limit.
Changing the way your data is modeled so that you need fewer composite indexes to support your queries might also be an alternative, as mentioned here.

How to find which kinds are not being used in Google Datastore

Is there any way to list the kinds that are not being used in Google's Datastore by our App Engine app without having to look into our code and/or logic? : )
I'm not talking about indexes, which I can list by issuing a
gcloud datastore indexes list
and then compare with the datastore-indexes.xml or index.yaml.
I tried to check datastore kinds statistics and other metadata but I could not find anything useful to help me on this matter.
Should I give up on getting useful stats from Datastore itself and instead code something that keeps collecting Datastore statistics (like data size) over a long period, so that I have at least a clue about which kinds are not being used, and only then look into our app code to see whether the kind's Model was removed?
Example:
select bytes from __Stat_Kind__
Store it somewhere and keep updating it for a period. If the Kind's byte size does not change, then the kind is probably not being used anymore.
The idea is to do some cleaning in datastore.
I would like to find which kinds are not being used anymore, maybe for a long time, or were created manually to be used once... You know, like a table in Oracle that no one knows what it is used for, and then if we look into the statistics of that table we see that it was only used once 5 years ago. I'm trying to achieve the same in Datastore: I want to know which kinds are not being used anymore or were last used a while ago, then ask around and back up/delete them if no owner is found.
It's an interesting question.
I think you would be best-placed to audit your code and instill organizational practice that requires this documentation to be performed in future as a business|technical pre-prod requirement.
IIRC, Datastore doesn't automatically timestamp Entities and keys (rightly) aren't incremental. So there appears no intrinsic mechanism to track changes short of taking a snapshot (expensive) and comparing your in-flight and backup copies for changes (also expensive and inconclusive).
One challenge with identifying a Kind that appears to be non-changing is that it could be referenced (rarely) by another Kind and so, while it does not change, it is required.
Auditing your code and documenting it for posterity should not only provide you with a definitive answer (and identify owners), but it also pays off a significant technical debt that has been incurred and avoids this and probably future problems (e.g. GDPR-like requirements) that will arise.
Assuming you are referring to records being created/updated, I can think of the following options:
Via the Cloud Console (Datastore > Dashboard) - This lists all your 'Kinds' and the number of records in each Kind. Theoretically, you can take a screen shot and compare the counts so that you know which one has experienced an increase or not.
Use of Created/LastModified Date columns - I usually add these 2 columns to most of my datastore tables. If you have them, then you can have a stored function that queries them. For example, you run a query to sort all of your Kinds in descending order of creation (or last modified date) and you only pull the first record from each one. This tells you the last time a record was created or modified.
I would write a function as part of my App, put it behind a page which requires admin privilege (only app creator can run it) and then just clicking a link on my App would give me the information.
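The snapshot idea from the question (record each kind's byte size, then compare later) can be sketched as below. This is only an illustration under assumptions: the function unchanged_kinds and the sample kind names and sizes are hypothetical, and the snapshots are assumed to be plain {kind: bytes} dictionaries that you have already collected, for example from the __Stat_Kind__ query mentioned above.

```python
from typing import Dict, List


def unchanged_kinds(earlier: Dict[str, int], later: Dict[str, int]) -> List[str]:
    """Return kinds present in both snapshots whose byte size did not change."""
    return sorted(
        kind
        for kind, size in earlier.items()
        if kind in later and later[kind] == size
    )


# Hypothetical snapshots of {kind_name: bytes}, taken some weeks apart.
snapshot_jan = {"User": 10_240, "Order": 55_000, "LegacyImport": 2_048}
snapshot_mar = {"User": 11_500, "Order": 61_300, "LegacyImport": 2_048}

print(unchanged_kinds(snapshot_jan, snapshot_mar))  # ['LegacyImport']
```

As noted in the first answer, an unchanged size only shows that nothing was written; a kind that is still read but never updated would also appear in this list, so the result is a list of candidates to ask about, not a list of kinds that are safe to delete.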

What is the "offset clause" in Firestore?

In the latest Firecast, Doug Stevenson mentioned request.query; however, he only discussed request.query.limit and request.query.orderBy (here is the timestamp).
The documentation names a third property, i.e. request.query.offset:
offset - query offset clause.
In all the time I have been using Cloud Firestore, I have never seen the "offset clause".
Can someone explain what this offset clause is and how the request.query.offset property is implemented?
That currently doesn't do anything. Offset is currently only available to server SDKs (for example: here); it's not an option in web and mobile client SDKs. Since server SDKs always bypass security rules, there's nothing you can do with request.query.offset that would affect the way rules evaluate.
The reference to this should actually be removed from the documentation altogether.
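To illustrate what an offset clause means in general, here is a small pure-Python sketch, not Firestore code. The function run_query and the sample documents are hypothetical; the point is only that an offset skips the first N results of the ordered query before any limit is applied.

```python
from typing import Any, Dict, List, Optional


def run_query(
    docs: List[Dict[str, Any]],
    order_by: str,
    offset: int = 0,
    limit: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Order the documents, skip `offset` of them, then apply `limit`."""
    ordered = sorted(docs, key=lambda d: d[order_by])
    skipped = ordered[offset:]
    return skipped if limit is None else skipped[:limit]


docs = [{"name": n, "rank": r} for n, r in [("a", 3), ("b", 1), ("c", 2), ("d", 4)]]
page = run_query(docs, order_by="rank", offset=1, limit=2)
print([d["name"] for d in page])  # ['c', 'a']
```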

Is there a Common Format for Auto Id's?

The auto id's generated by an Android client in a Firestore collection seem to all meet certain criteria for me:
20 characters of length
Start with a - dash
Seem to cycle through characters based on time?
With the last point I mean that the first characters will look very similar if the creation happened in a similar time frame, e.g. -LZ.., -L_.., and -La... This describes the Flutter implementation.
However, looking at the Javascript implementation of auto id, I would assume that the only common criterion of all clients is the length of 20 characters. Is this assumption correct?
Across all clients, the auto id has a length of 20 characters:
iOS
Android
JavaScript (Web)
Flutter
You're referring to two types of IDs:
The push IDs as they are generated by the Firebase Realtime Database SDK when you call DatabaseReference.push() (or childByAutoId in iOS). These are described in The 2^120 Ways to Ensure Unique Identifiers, and a JavaScript implementation can be found here.
The auto IDs that are generated by the Cloud Firestore SDK when you call add(..) or doc() (without arguments). The JavaScript implementation of this can indeed be found in the Firestore SDK repo.
The only things these two IDs have in common are that they're designed to ensure enough entropy that realistically they will be globally unique, and that they're both 20 characters long.
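As a rough illustration of the second kind of ID, the sketch below generates a 20-character string of random letters and digits. It is an approximation of the approach as I understand it, not the SDK's actual code: the names auto_id, ALPHABET, and AUTO_ID_LENGTH are hypothetical, and the 62-character alphabet is an assumption. Push IDs are built differently, with a time-based prefix, which is why they look similar when created close together.

```python
import secrets
import string

# Assumed alphabet: upper-case letters, lower-case letters, and digits (62 symbols).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits
AUTO_ID_LENGTH = 20


def auto_id() -> str:
    """Return a random 20-character ID drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(AUTO_ID_LENGTH))


print(auto_id())
```

Because every character is random, IDs created at the same moment share no common prefix, unlike the push IDs described above.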
