Is there a Common Format for Auto Id's?

The auto id's generated by an Android client in a Firestore collection seem to all meet certain criteria for me:
20 characters of length
Start with a - dash
Seem to cycle through characters based on time?
With the last point I mean that the first characters will look very similar if the creation happened in a similar time frame, e.g. -LZ.., -L_.., and -La... This describes the Flutter implementation.
However, looking at the JavaScript implementation of auto ID, I would assume that the only criterion common to all clients is the length of 20 characters. Is this assumption correct?

Across all clients, the auto ID has a length of 20 characters:
iOS
Android
JavaScript (Web)
Flutter

You're referring to two types of IDs:
The push IDs as they are generated by the Firebase Realtime Database SDK when you call DatabaseReference.push() (or childByAutoId in iOS). These are described in The 2^120 Ways to Ensure Unique Identifiers, and a JavaScript implementation can be found here.
The auto IDs that are generated by the Cloud Firestore SDK when you call add(..) or doc() (without arguments). The JavaScript implementation of this can indeed be found in the Firestore SDK repo.
The only things these two IDs have in common are that they're designed to have enough entropy that they will realistically be globally unique, and that they're both 20 characters long.
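To make the second kind of ID concrete, here is a minimal Python sketch of a Firestore-style auto ID. It assumes the 62-character alphanumeric alphabet that the JavaScript SDK appears to use; the name auto_id is hypothetical, and the real SDKs may draw their random bytes differently.

```python
import secrets

# Assumed alphabet: upper case, lower case, digits (62 symbols, no dash).
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
)


def auto_id(length: int = 20) -> str:
    """Return a purely random ID; there is no timestamp component."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Because the alphabet contains no dash and no time prefix, IDs produced this way never start with "-" and do not sort by creation time, which is what separates them from push IDs.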

Related

Best way to filter out a subset of document IDs with firebase/firestore JS SDK?

I wanted to get some community consensus on how to achieve the following with the Firebase JS SDK (e.g., in React):
Suppose I have a collection users and I wanted to paginate users that do not have document IDs matching a subset of IDs (O(100-1000)). This subset of excluded IDs is dynamic based on the authenticated user.
It seems the not in query only supports up to 10 entries, so this is out of the question.
It also seems it's not possible to fetch all document IDs and filter on the client side, at least not in the 'firebase' JS SDK.
The only workaround I can think of is to have a document that keeps an array of all users document IDs, pull that document locally and perform the filtering/pagination logic locally. The limitation here is that a document can be at most 1MB, so realistically the single document can store at most O(10K) IDs.

Firestore has a set of methods for pagination that may be useful for you. They are called "query cursors".
You can use them to define a start point with startAt() or startAfter() and an end point with endAt() or endBefore(). If needed, they can be combined with the limit() method.
I strongly encourage you to check this tutorial. There you can find a quick video explaining the matter and lots of examples in all popular languages.
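One way to combine cursors with a large exclusion set is to fetch fixed-size batches and drop the excluded IDs on the client until a page is full. The sketch below only simulates this in memory: fetch_page is a hypothetical stand-in for a query ordered by document ID with startAfter() and limit(), and it makes no Firestore calls.

```python
from bisect import bisect_right


def fetch_page(sorted_ids, start_after, limit):
    """Hypothetical stand-in for an ID-ordered query with startAfter + limit."""
    start = 0 if start_after is None else bisect_right(sorted_ids, start_after)
    return sorted_ids[start:start + limit]


def next_visible_page(sorted_ids, excluded, page_size, cursor=None):
    """Collect up to page_size non-excluded IDs, returning them and the new cursor."""
    visible = []
    while len(visible) < page_size:
        batch = fetch_page(sorted_ids, cursor, page_size)
        if not batch:
            break  # no more documents
        for doc_id in batch:
            cursor = doc_id  # advance the cursor past every ID we inspect
            if doc_id not in excluded:
                visible.append(doc_id)
                if len(visible) == page_size:
                    break
    return visible, cursor
```

The trade-off is extra reads: every excluded document that falls inside a batch is still fetched before it is discarded.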

When using db.collection('...').add({...}) can bad words appear in the random id?

I'm using Firebase Firestore for a project and I use the unique IDs Firestore generates in my URLs. So I was wondering whether, when using the add operator, bad words might appear in the ID. I could not find any notes in the Firestore documentation that mention this.

The ID generation for Cloud Firestore is purely random. So it is indeed possible that a sequence of characters appears in the key that may be offensive or unwanted for another reason.
If this is a concern for your application, you'll have to check the generated IDs in your application code, or use an alternative method to render IDs for new documents that don't have this risk.
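Checking generated IDs in application code could look roughly like the following sketch. The generator, the blocklist, and the names auto_id, contains_blocked_word, and safe_auto_id are all hypothetical; the alphabet is an assumption about the SDK's generator rather than a confirmed detail.

```python
import secrets

# Assumed 62-symbol alphabet for illustration.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
)


def auto_id(length=20):
    """Generate a random candidate ID."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def contains_blocked_word(candidate, blocklist):
    """Case-insensitive substring check against the blocklist."""
    lowered = candidate.lower()
    return any(word.lower() in lowered for word in blocklist)


def safe_auto_id(blocklist, generate=auto_id, max_attempts=100):
    """Regenerate until a candidate passes the blocklist check."""
    for _ in range(max_attempts):
        candidate = generate()
        if not contains_blocked_word(candidate, blocklist):
            return candidate
    raise RuntimeError("no acceptable ID generated")
```

An ID produced this way would then be passed to doc(id) and written with set(), instead of letting add() pick the ID.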

What are the chances for firestore to generate two identical random keys?

I am working on a project in which Firestore random keys are fairly important, so my question is: what are the chances that Firebase Firestore or the Realtime Database generates two or more identical random keys?

According to this blog post, The 2^120 Ways to Ensure Unique Identifiers:
How Push IDs are Generated
Push IDs are string identifiers that are generated client-side. They
are a combination of a timestamp and some random bits. The timestamp
ensures they are ordered chronologically, and the random bits ensure
that each ID is unique, even if thousands of people are creating push
IDs at the same time.
What's in a Push ID?
A push ID contains 120 bits of information. The first 48 bits are a
timestamp, which both reduces the chance of collision and allows
consecutively created push IDs to sort chronologically. The timestamp
is followed by 72 bits of randomness, which ensures that even two
people creating push IDs at the exact same millisecond are extremely
unlikely to generate identical IDs. One caveat to the randomness is
that in order to preserve chronological ordering if a client creates
multiple push IDs in the same millisecond, we just 'increment' the
random bits by one.
To turn our 120 bits of information (timestamp + randomness) into an
ID that can be used as a Firebase key, we basically base64 encode it
into ASCII characters, but we use a modified base64 alphabet that
ensures the IDs will still sort correctly when ordered
lexicographically (since Firebase keys are ordered lexicographically).
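The timestamp part of the description can be checked with a small Python sketch. It assumes the 64-character alphabet from the public push ID gist, in ASCII order, and assumes the first 8 characters carry the 48-bit millisecond timestamp; the function names are hypothetical.

```python
# Modified base64 alphabet, assumed from the push ID gist; it is in ASCII order.
PUSH_CHARS = "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"


def encode_timestamp(ms):
    """Encode a millisecond timestamp as 8 characters (6 bits each)."""
    chars = []
    for _ in range(8):
        chars.append(PUSH_CHARS[ms % 64])
        ms //= 64
    return "".join(reversed(chars))


def push_id_timestamp_ms(push_id):
    """Recover the millisecond timestamp from the first 8 characters."""
    ms = 0
    for ch in push_id[:8]:
        ms = ms * 64 + PUSH_CHARS.index(ch)
    return ms
```

Under these assumptions, a timestamp from early 2019 such as 1550000000000 encodes to a prefix starting with "-L", which matches the observation that IDs created around the same time look alike.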

While Gastón Saillén's answer is 100% correct regarding the pushed key from Firebase realtime database, I'll try to add a few more details.
When you use DatabaseReference's push() method, it generates a key that has a time component. Two events can theoretically take place within the same millisecond, but there is only an astronomically small chance that two users generate a key at the exact same moment and with the exact same randomness. Please also note that these keys are generated entirely on the client, without consulting the Firebase server. If you are interested, here is the algorithm that generates those keys. Finally, I haven't heard of anyone reporting a problem with key collisions so far.
Unlike Firebase Realtime Database keys, Cloud Firestore IDs are purely random; there's no time component. The built-in generator that Firestore uses when you call CollectionReference's add() method, or its document() method without passing any parameters, produces random and highly unpredictable IDs, which prevents hitting certain hotspots in the backend infrastructure. That's also the reason why the documents in a collection show no particular order in the Firebase console. Collisions of IDs in this case are incredibly unlikely, and you can and should assume they'll be unique. That's what they were designed for. Regarding the algorithm, you can check Frank van Puffelen's answer from this post. So you don't have to be concerned about these IDs.
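To put "incredibly unlikely" into numbers, here is a rough Python sketch using the standard birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)). It assumes IDs are drawn uniformly and independently, that Firestore IDs are 20 characters from a 62-symbol alphabet (an assumption about the SDK), and that push IDs created in the same millisecond differ only in their 72 random bits.

```python
import math


def collision_probability(n_ids, space_size):
    """Birthday approximation for at least one collision among n_ids draws."""
    exponent = n_ids * (n_ids - 1) / (2 * space_size)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-exponent)


# Assumed ID spaces (see the lead-in for where these come from).
FIRESTORE_SPACE = 62 ** 20        # roughly 2^119
PUSH_ID_RANDOM_SPACE = 2 ** 72    # random part of a push ID

# A billion Firestore IDs in one collection.
p_firestore = collision_probability(10 ** 9, FIRESTORE_SPACE)

# A thousand push IDs generated in the very same millisecond.
p_push = collision_probability(1000, PUSH_ID_RANDOM_SPACE)
```

Under these assumptions both probabilities are many orders of magnitude below anything an application would observe in practice.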

Firebase/GeoFire - Most popular item at location

I am currently in the evaluation process for a database that should serve as a backend for a mobile application.
Right now I am looking at Firebase, and so far I like it very much.
It is a requirement to have the possibility to fetch the
most popular items
at a certain location
(possibly in the future: additionally for a certain time range that would be an attribute of the item)
from the database.
So naturally I stumbled upon GeoFire that provides location based query possibilities for Firebase.
Unfortunately - at least as far as I understood - there is no possibility to order the results by an attribute other than the distance. (correct me if I am wrong).
So what do I do if I am not interested in the distance (I only want to have items in a certain radius, no matter how far from the center) but in the popularity factor (e.g. for the sake of simplicity a simple number that symbolizes popularity)?
IMPORTANT:
Filtering/Sorting on the client-side is not an option (or at least the least preferred one), as the result set could potentially grow to an infinite amount.
First version of the application will be for android, so the Firebase Java Client Library would be used in the first step.
Are there possibilities to solve this or is Firebase out of the race and not the right candidate for the job?

There is no way to add an extra condition to the server-side query of Geofire.
The query model of the Firebase database allows filtering only on a single property. Geofire already performs a seemingly impossible feat of filtering on both latitude and longitude. It does this through the magic of Geohashes, which combine latitude and longitude into a single string.
Strictly speaking you could find a way to extend the magic of Geohashes to add a third number into the mix. But while possible, I doubt it's feasible for most of us.
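How a Geohash folds latitude and longitude into one string can be sketched in a few lines of Python. This follows the standard Geohash scheme (interleaved bits, longitude first, base32 output); GeoFire's own implementation may differ in detail, and the function name is hypothetical.

```python
# Standard geohash base32 alphabet (no a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=10):
    """Interleave longitude/latitude bisection bits into a base32 string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # even bits refine longitude, odd bits refine latitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Points that are close together usually share a common prefix, which is what lets a location query be expressed as a range filter on a single string property.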

How are Firebase IDs generated?

Can we assume anything about them? Are they globally unique (across all of Firebase)? Is there any sort of ordering? Does the client matter?
Is there a public library / documentation so I can generate those IDs as well?
I am referring to the ones generated by push

There is a blog post on it, as well as a Gist.
From the blog post, here's the core of What's in a Push Id:
A push ID contains 120 bits of information. The first 48 bits are a
timestamp, which both reduces the chance of collision and allows
consecutively created push IDs to sort chronologically. The timestamp
is followed by 72 bits of randomness, which ensures that even two
people creating push IDs at the exact same millisecond are extremely
unlikely to generate identical IDs. One caveat to the randomness is
that in order to preserve chronological ordering if a client creates
multiple push IDs in the same millisecond, we just ‘increment’ the
random bits by one.
To turn our 120 bits of information (timestamp + randomness) into an
ID that can be used as a Firebase key, we basically base64 encode it
into ASCII characters, but we use a modified base64 alphabet that
ensures the IDs will still sort correctly when ordered
lexicographically (since Firebase keys are ordered lexicographically).
Also worth noting are the community ports to several different languages:
Ruby
PHP
Python
Java
Nimrod
Go
Lua
Swift
So perhaps the best way to learn is to pick a language not on that list and port it!
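As an illustration of the scheme in the quoted post, here is a minimal Python sketch of a push-ID-style generator: 8 timestamp characters followed by 12 random characters, with the random part incremented when two IDs fall in the same millisecond. The alphabet is assumed from the public gist, the class name and the now_ms parameter are hypothetical, and secrets is used for randomness here, which is not necessarily what the SDKs use.

```python
import secrets
import time

# Modified base64 alphabet in ASCII order (assumed from the push ID gist).
PUSH_CHARS = "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz"


class PushIdGenerator:
    def __init__(self):
        self._last_ms = -1
        self._last_rand = [0] * 12  # 12 values of 6 bits = 72 random bits

    def next_id(self, now_ms=None):
        """Return a 20-character ID; now_ms can be injected for testing."""
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        duplicate = now_ms == self._last_ms
        self._last_ms = now_ms

        # 48-bit timestamp as 8 characters, most significant first.
        ts_chars = []
        remaining = now_ms
        for _ in range(8):
            ts_chars.append(PUSH_CHARS[remaining % 64])
            remaining //= 64
        ts_chars.reverse()

        if not duplicate:
            self._last_rand = [secrets.randbelow(64) for _ in range(12)]
        else:
            # Same millisecond: increment the previous random part by one.
            i = 11
            while i >= 0 and self._last_rand[i] == 63:
                self._last_rand[i] = 0
                i -= 1
            if i >= 0:  # a full wrap-around is astronomically unlikely
                self._last_rand[i] += 1

        rand_chars = "".join(PUSH_CHARS[v] for v in self._last_rand)
        return "".join(ts_chars) + rand_chars
```

Because the alphabet is in ASCII order, IDs produced by one generator sort lexicographically in creation order, including several IDs created within the same millisecond.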