How are firestore arrays implemented? - firebase

I need to store a list of strings in a field and I need to be able to easily access each string by its value, but there isn't a "set" or "1d map" option in Firestore. As a hack, I used a map and just stored each of the strings as "value": value. But Firestore's array methods and functionality seem to behave like a set, and if so it would be perfect for storing this type of data. Are Firestore arrays implemented as sets or more as traditional arrays?

Firestore arrays are just arrays, but come with some special operators that allow you to use them as sets too.
Specifically, you can perform unions on arrays, which add an item to the array if it is not in there yet.
You can also query for documents where an array contains a specific value, or any one of a number of values.
Note that in all these cases (unions and queries), you need to specify the entire array item and cannot specify only one property of it. With simple string values like yours that is all there is to match, but when you store objects in arrays the same rule applies: you must specify the entire object for the query to match it.
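To make the set-like behaviour concrete, here is a minimal sketch in plain Python that models the semantics described above. It is a local model, not the Firestore API: the function names array_union and array_contains and the sample data are made up for illustration.

```python
# Local model of the semantics described above; this is NOT the Firestore API.

def array_union(existing, *items):
    """Return a new list with each item appended only if it is not already present."""
    result = list(existing)
    for item in items:
        if item not in result:
            result.append(item)
    return result


def array_contains(array, value):
    """True only if some element equals `value` as a whole (no partial matches)."""
    return any(element == value for element in array)


tags = array_union(["red", "green"], "green", "blue")
print(tags)  # ['red', 'green', 'blue']

people = [{"name": "Ann", "age": 30}]
print(array_contains(people, {"name": "Ann", "age": 30}))  # True
print(array_contains(people, {"name": "Ann"}))             # False
```

The last call shows why a query on one property of a stored object does not match: the comparison is against the whole element.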

Related

firebase query - find doc where map value is in an array

I'm trying to find the id of a doc where a map value in an array of maps equals "x".
In the following case, I'm trying to find which rep owns the cause with code "hog".
I'll likely be going down the denormalizing route, but is this possible?
Firestore has an array-contains operator that you can use to query whether a certain item exists in an array field, but that operator only works if you specify the exact, complete value of the field. It can't test for a partial match.
The common approach to your use-case is to add an additional array field with just the values you want to query on, i.e.
cause-codes: ["hog"]
Once you have modified your documents with this additional field, you can then use a query like:
repsRef.where('cause-codes', 'array-contains', 'hog')
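The denormalization step can be sketched locally in Python. The document shape below (a "causes" array of maps with "code" and "name" keys) is an assumption, since the question's data is not shown, and reps_with_cause is only a local stand-in for the array-contains query.

```python
# Hypothetical document shape; the original question's data is not shown.
reps = {
    "rep1": {"causes": [{"code": "hog", "name": "Hedgehogs"}]},
    "rep2": {"causes": [{"code": "cat", "name": "Cats"}]},
}


def add_cause_codes(doc):
    """Return a copy of `doc` with a flat 'cause-codes' list derived from 'causes'."""
    updated = dict(doc)
    updated["cause-codes"] = [cause["code"] for cause in doc["causes"]]
    return updated


reps = {rep_id: add_cause_codes(doc) for rep_id, doc in reps.items()}


def reps_with_cause(code):
    """Local stand-in for where('cause-codes', 'array-contains', code)."""
    return [rep_id for rep_id, doc in reps.items() if code in doc["cause-codes"]]


print(reps_with_cause("hog"))  # ['rep1']
```

The extra field has to be kept in sync whenever the array of maps changes, which is the usual cost of this approach.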

For arrays in Firestore, is there any downside to searching it for string vs int values?

I noticed that Firestore allows arrays and some operations on them like containsAny([...]).
I'm thinking of having a array of values, but the values I'll be putting in are UUIDs (strings). So, it may look like this:
MyCollection {
categoryIds List<String>
}
And I'll do operations like where(categoryIds, containsAny(uuid1, uuid2))
Is there a performance hit vs. if I had stored numbers instead of strings? Does it matter?
Firestore queries are generally based on indexes, so I doubt there's any performance difference between the two.
Also note: Firestore "arrays" are ABSOLUTELY NOT ARRAYS. They are ORDERED LISTS, generally in the order they were added to the array. The SDK presents them to the CLIENT as arrays, but Firestore itself does not STORE them as actual arrays - THE NUMBER YOU SEE IN THE CONSOLE is the order, not an index. Matching elements in an array (arrayContains, e.g.) requires matching the WHOLE element - if you store an ordered list of objects, you CANNOT query the "array" on sub-elements.
The client SDKs generally present the values in the arrays/"ordered lists" to you as an array - which has more to do with most languages not having a primitive element that is an ordered list.

How can I limit and sort on document ID in firestore?

I have a collection where the documents are uniquely identified by a date, and I want to get the n most recent documents. My first thought was to use the date as a document ID, and then my query would sort by ID in descending order. Something like .orderBy(FieldPath.documentId, descending: true).limit(n). This does not work, because it requires an index, which can't be created because __name__ only indexes are not supported.
My next attempt was to use .limitToLast(n) with the default sort, which is documented here.
By default, Cloud Firestore retrieves all documents that satisfy the query in ascending order by document ID
According to that snippet from the docs, .limitToLast(n) should work. However, because I didn't specify a sort, it says I can't limit the results. To fix this, I tried .orderBy(FieldPath.documentId).limitToLast(n), which should be equivalent. This, for some reason, gives me an error saying I need an index. I can't create it for the same reason I couldn't create the previous one, but I don't think I should need to because they must already have an index like that in order to implement the default ordering.
Should I just give up and copy the document ID into the document as a field, so I can sort that way? I know it should be easy from an algorithms perspective to do what I'm trying to do, but I haven't been able to figure out how to do it using the API. Am I missing something?
Edit: I didn't realize this was important, but I'm using the flutterfire firestore library.
A few points. It is ALWAYS a good practice to use random, well-distributed document IDs in Firestore for scale and efficiency. Related to that, there is effectively NO WAY to query by documentId, apart from a few circumstances (especially a range, which is possible but VERY tricky, as it requires inequalities, and you can only do inequalities on one field). IF there's a reason to search on an ID, yes, it is PERFECTLY appropriate to store it in the document as well - in fact, my wrapper library always does this.
The correct notation, btw, would be FieldPath.documentId() (method, not constant) - alternatively, __name__ - but I believe this only works in Queries. The reason it requested a new index is that without the () it assumed you had a field named FieldPath with a subfield named documentid.
Further: FieldPath.documentId() does NOT generate the documentId at the server - it generates the FULL PATH to the document - see Firestore collection group query on documentId for a more complete explanation.
So net:
=> documentId's should be as random as possible within a collection; it's generally best to let Firestore generate them for you.
=> a valid exception is when you have ONE AND ONLY ONE sub-document under another - for example, every "user" document might have one and only one "forms of Id" document as a subcollection. It is valid to use the SAME ID as the parent document in this exceptional case.
=> anything you want to query should be a FIELD in a document, and generally simple fields.
=> WORD TO THE WISE: Firestore "arrays" are ABSOLUTELY NOT ARRAYS. They are ORDERED LISTS, generally in the order they were added to the array. The SDK presents them to the CLIENT as arrays, but Firestore itself does not STORE them as ACTUAL ARRAYS - THE NUMBER YOU SEE IN THE CONSOLE is the order, not an index. Matching elements in an array (arrayContains, e.g.) requires matching the WHOLE element - if you store an ordered list of objects, you CANNOT query the "array" on sub-elements.
From what I've found:
FieldPath.documentId does not match on the documentId, but on the refPath (which it gets automatically if passed a document reference).
As such, since the documents are to be sorted by timestamp, it would be better to store a timestamp field value such as createdAt rather than a human-readable string, which is prone to being sorted character by character rather than by the date it represents.
From there, you can simply sort by date and limit to last. You can keep the document ID's as you intend.
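A small local illustration in Python of why the timestamp field helps. The document IDs, the created_at field name and the most_recent helper are assumptions for the example; it only shows the ordering difference and does not call Firestore.

```python
from datetime import datetime

# Hypothetical documents keyed by a human-readable, unpadded date string.
docs = {
    "2021-9-5": {"created_at": datetime(2021, 9, 5)},
    "2021-10-1": {"created_at": datetime(2021, 10, 1)},
    "2021-11-20": {"created_at": datetime(2021, 11, 20)},
}

# Sorting by ID compares text character by character.
by_id = sorted(docs)
# Sorting by the timestamp field compares actual dates.
by_timestamp = sorted(docs, key=lambda doc_id: docs[doc_id]["created_at"])

print(by_id)         # ['2021-10-1', '2021-11-20', '2021-9-5']
print(by_timestamp)  # ['2021-9-5', '2021-10-1', '2021-11-20']


def most_recent(n):
    """Local stand-in for ordering by the timestamp and limiting to the last n."""
    return by_timestamp[-n:]


print(most_recent(2))  # ['2021-10-1', '2021-11-20']
```

Zero-padded strings such as 2021-09-05 would sort correctly as text too, but a real timestamp field avoids depending on the formatting.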

DynamoDB Set order

From DynamoDB docs:
An attribute of type String Set. For example:
"SS": ["Giraffe", "Hippo" ,"Zebra"]
Type: Array of strings
Required: No
This is all I could find. I did some testing but that's clearly not enough for production environments and I would like to get a confirmation/confutation from people who have actually worked with these Sets.
Do DynamoDB Sets maintain insertion order? Can I count on that fact & build logic around that?
I'm mainly interested in String Set, but it probably applies to all of them (String, Number, Binary).
Here is the documentation. SET data type doesn't preserve the order.
SET : The order of the values within a set are not preserved;
therefore, your applications must not rely on any particular order of
elements within the set.
LIST - A list type attribute can store an ordered collection of values
Similar discussion on AWS forum
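If order matters, the quoted documentation points to a List rather than a Set. The sketch below builds both shapes in DynamoDB's low-level attribute-value notation using plain Python dicts; the helper names are made up, and nothing here talks to DynamoDB.

```python
def as_string_list(values):
    """Encode strings as a List attribute value, which keeps element order."""
    return {"L": [{"S": value} for value in values]}


def as_string_set(values):
    """Encode strings as a String Set attribute value.

    Duplicates are dropped. The result is sorted here only to make this
    example deterministic; the stored order must not be relied on.
    """
    return {"SS": sorted(set(values))}


animals = ["Zebra", "Giraffe", "Hippo"]
print(as_string_list(animals))
# {'L': [{'S': 'Zebra'}, {'S': 'Giraffe'}, {'S': 'Hippo'}]}
print(as_string_set(animals))
# {'SS': ['Giraffe', 'Hippo', 'Zebra']}
```

A List gives up the set's uniqueness guarantee, so any deduplication would have to happen in application code.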

In eXist, how can I run an expression on a dynamically determined collection?

I have a query that dynamically determines a collection name and an expression to evaluate on that collection that return a boolean. Say:
$my-collection points to the collection, e.g. containing the string /db/my/collection.
The boolean expression is exists(/foo/bar).
I can run exists(/foo/bar) on the collection itself, which will return either true or false, depending on whether a document in the collection contains a /foo/bar. But how can I do the same when the collection name isn't known in advance?
Naively, I tried collection($my-collection)/exists(/foo/bar). But since collection() returns the document nodes in the collection, this would return as many booleans as there are documents in the collection, instead of just one boolean. This isn't what I want, plus it can be extremely slow as my collection can contain several tens of thousands of documents.
So, how should I write this instead?
You can rewrite your expression to this:
exists(collection($my-collection)/foo/bar)
Or perhaps this, which - depending on the eXist query optimizer - might perform better:
exists((collection($my-collection)/foo/bar)[1])
HTH!
