Firebase better way of getting total number of records - firebase

From the Transactions doc, second paragraph:
The intention here is for the client to increment the total number of
chat messages sent (ignore for a moment that there are better ways of
implementing this).
What are some standard "better ways" of implementing this?
Specifically, I'm looking at trying to do things like retrieve the most recent 50 records. This requires that I start from the end of the list, so I need a way to determine what the last record is.
The options as I see them:
use a transaction to update a counter each time a record is added, use the counter value with setPriority() for ordering
forEach() the parent and read all records, do my own sorting/filtering at client
write server code to analyze Firebase tables and create indexed lists like "mostRecent Messages" and "totalNumberOfMessages"
Am I missing obvious choices?

To view the last 50 records in a list, simply call "limit()" as shown:
var data = new Firebase(...);
data.limit(50).on(...);
Firebase elements are ordered first by priority and, if priorities match (or none is set), lexicographically by name. The push() command automatically creates elements that are ordered chronologically, so if you're using push(), then no additional work is needed to use limit().
To count the elements in a list, I would suggest adding a "value" callback and then iterating through the snapshot (or doing the transaction approach we mention). The note in the documentation actually refers to some upcoming features we haven't released yet which will allow you to count elements without loading them first.
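A rough sketch of the transaction approach, for illustration only. The update function is plain JavaScript and runs on its own; counterRef in the comment is a hypothetical reference to wherever the counter is stored, not a name from the question.

```javascript
// Update function for a counter transaction: it receives the currently
// stored value (null if nothing is stored yet) and returns the new value.
function incrementCounter(current) {
  return (current || 0) + 1;
}

// Assumed usage with the web client shown above, where counterRef is a
// hypothetical Firebase reference to the counter location:
//   counterRef.transaction(incrementCounter);

// Local check of the update function itself:
console.log(incrementCounter(null)); // 1
console.log(incrementCounter(41));   // 42
```

Because the function only depends on the current value, the client can safely re-run it if another writer changes the counter first.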

Related

How can I get a document at a specific index after orderBy

I have some code like this:
...
const snapshot = firestore().collection("orders").orderBy("deliveryDate")
...
I want to access only the 100th order in the returned documents. So far, the only way I have found to achieve this is firestore().collection("orders").orderBy("deliveryDate").limit(100), which returns the first 100 documents so that I can read the last one. But I end up fetching 99 unwanted documents, and this could become considerably slower if I want the 200th document or higher.
So, I basically want to know if there's a possible way of getting just the index I want after sorting.
As far as I know, startAt() and startAfter() only accept a doc reference or field values, not an index/offset
Firestore does not offer any way to offset by some numeric amount to web and mobile clients (and doing so would end up having the exact same cost as what you're doing now).
If you need to impose some sort of offset into your collection, you will need to maintain that in the document itself for querying, or use some other type of storage that gives you fast cheap access by index.

How to query among two fields in firestore?

Consider I have an Events collection where it has startTimestamp and endTimestamp indicating when the event starts, ends respectively.
How to query in firestore to find out if the Event is live/finished/upcoming?
If both startTimestamp and endTimestamp properties exist in the database and are of type Date and not String or Number, then you can simply use a query to check if a particular date is within the bounds or not.
For example in Android, if you want to check if a particular date is within the bounds, you might think that a query like the one below will work:
eventsRef.whereGreaterThanOrEqualTo("startTimestamp", yourDate)
.whereLessThanOrEqualTo("endTimestamp", yourDate);
But it won't. You'll get an Exception with the following message:
All where filters other than whereEqualTo() must be on the same field. But you have filters on 'startTimestamp' and 'endTimestamp'
The only solution you have is to create three separate queries.
Edit:
According to your comment, one query should check whether yourDate is before startTimestamp, that is, whether startTimestamp is greater than yourDate:
eventsRef.whereGreaterThan("startTimestamp", yourDate);
If it is, it means it's an upcoming event.
The second one would be to see whether yourDate is greater than or equal to startTimestamp:
eventsRef.whereLessThanOrEqualTo("startTimestamp", yourDate);
Here we have two cases. To tell them apart, you perform a new (third) query to check whether yourDate is less than or equal to endTimestamp:
eventsRef.whereGreaterThanOrEqualTo("endTimestamp", yourDate);
If an event is returned by both the second and the third query, it is within the bounds, so it's a live event. Otherwise yourDate is greater than endTimestamp, which means that the event is finished.
To get that data in realtime, you should use a snapshot listener for every query.
Here are the cases to handle this scenario. I'm pretty sure this is a very common problem but didn't find effective solutions for this anywhere.
Solution 1: Have all documents in a single collection called subscribedEvents
As Alex suggested, we need the following conditions for each status.
Upcoming : currentTimestamp < startTimestamp
Finished : currentTimestamp > endTimestamp
Live : currentTimestamp > startTimestamp in 1st Query and currentTimestamp < endTimestamp in second query.
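Once the documents are on the client, the three conditions above reduce to a small check per event. A minimal sketch, assuming the timestamps are plain millisecond numbers; the function name and sample values are made up for illustration, and an event exactly on a boundary is treated as live here.

```javascript
// Classify one event against the current time.
// All three arguments are assumed to be millisecond timestamps.
function eventStatus(currentTimestamp, startTimestamp, endTimestamp) {
  if (currentTimestamp < startTimestamp) return "upcoming";
  if (currentTimestamp > endTimestamp) return "finished";
  return "live";
}

// Hypothetical event running from t=100 to t=200:
console.log(eventStatus(50, 100, 200));  // upcoming
console.log(eventStatus(150, 100, 200)); // live
console.log(eventStatus(250, 100, 200)); // finished
```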
Problem : I can have lots of documents (nearly 10,000) in subscribedEvents, and the Live condition is not scalable because I can't limit the results while querying. Since the results of the two queries have to be intersected, I need to run both without a limit filter.
Solution 2: This is a bit of a hack, but it is scalable. Don't keep all the documents in a single sub-collection. Separate the Upcoming events and put those documents in the subscribedEvents/others/Upcoming collection.
When a user subscribes, if it's an upcoming event, you can store it directly in the subscribedEvents/others/Upcoming collection.
Rest of the documents can go directly into subscribedEvents collection.
Upcoming : Query all the documents with a limit filter from subscribedEvents/others/Upcoming collection.
Finished : currentTimestamp > endTimestamp
Live : currentTimestamp < endTimestamp
The benefit with this structure is we can apply limit filter and lots of documents don't need to be read for your query and there will be only one query required for each status.
This approach additionally needs a cron job to make sure that events in the Upcoming sub-collection are moved to subscribedEvents once they have started.
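The selection step of such a cron job might look roughly like this. It is only a sketch over in-memory objects: the function name and sample documents are invented for illustration, and the actual reads, writes and deletes against the two collections are left out.

```javascript
// Pick the events whose start time has passed, so a scheduled job can move
// them from the Upcoming sub-collection into subscribedEvents.
function selectStartedEvents(upcomingEvents, currentTimestamp) {
  return upcomingEvents.filter(
    (event) => event.startTimestamp <= currentTimestamp
  );
}

// Hypothetical documents read from subscribedEvents/others/Upcoming:
const upcoming = [
  { id: "a", startTimestamp: 100, endTimestamp: 200 },
  { id: "b", startTimestamp: 300, endTimestamp: 400 },
];
console.log(selectStartedEvents(upcoming, 150).map((e) => e.id)); // [ 'a' ]
```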
However, if you have fewer documents, Solution 1 is the way to go. But not in my case.
Hope it helps someone where they have to scale efficiently.

Using timestamp as an Attribute in DynamoDB

I'm quite new to DynamoDB, but have some experience in Cassandra. I'm trying to adapt a pattern I followed in Cassandra, where each column represented a timestamped event, and wondering if it will carry over gracefully into DynamoDB or if I need to change my approach.
My goal is to query a set of documents within a date range by using the milliseconds-since-epoch timestamp as an Attribute name. I'm successfully storing the following as each report is generated with each new report being added under its own column:
{ PartitionKey:customerId,
SortKey:reportName_yyyymm,
'#millis_1#':{'report':doc_1},
'#millis_2#':{'report':doc_2},
. . .
'#millis_n#':{'report':doc_n}
}
My question is, given a millisecond-based date range, and the accompanying Partition and Sort keys, is it possible to query the set of Attributes that fall within that range or must I retrieve all columns for the matching keys and filter them at the client?
Welcome to the most powerful NoSQL database ;)
To kick off with the positive news, there is no way to query out specific attributes. You can project certain attributes in a query. But you would have to write your own logic to determine which attributes or columns should be included in the projected query. To get close to your solution you could use a map attribute inside an item with the milliseconds as a key. But there is another thing you have to be aware of when starting on this path.
There is a maximum total item size of 400 KB for each item in DynamoDB, including key and attribute names (see Limits in DynamoDB Items). This means you can only store so many attributes in an item, especially if you intend to put the actual report inside the attribute. I would advise against that, also because you will be burning up read capacity units for the whole item every time you get one attribute out of it. You would be better off putting this data in a separate table with the keys in the map. But truthfully, in DynamoDB I would split this whole thing up: just add the milliseconds to the sort key and make every document its own item. That way you can query these items directly and use the "between" condition on the sort key to select specific date-time ranges. Please let me know if you meant something else.
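A sketch of what that could look like, building only the Query parameters as a plain object. The table name Reports, the attribute names customerId and reportKey, and the "name#millis" key layout are all assumptions for illustration; the values are written in the plain JavaScript form a document client would accept. The timestamp is zero-padded so that string ordering of the sort key matches numeric ordering.

```javascript
// Build a sort key of the form "<reportName>#<millis>", zero-padding the
// timestamp to 13 digits so string order matches numeric order.
function reportSortKey(reportName, millis) {
  return `${reportName}#${String(millis).padStart(13, "0")}`;
}

// Build Query parameters selecting one customer's reports of one name
// within [fromMillis, toMillis]. Table and attribute names are assumptions.
function buildReportRangeQuery(customerId, reportName, fromMillis, toMillis) {
  return {
    TableName: "Reports",
    KeyConditionExpression:
      "customerId = :cid AND reportKey BETWEEN :start AND :end",
    ExpressionAttributeValues: {
      ":cid": customerId,
      ":start": reportSortKey(reportName, fromMillis),
      ":end": reportSortKey(reportName, toMillis),
    },
  };
}

console.log(
  buildReportRangeQuery("customer-1", "daily", 1500000000000, 1500003600000)
);
```

With one item per report, only the items inside the range are read, instead of one large item holding every report.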

Get auto-Id by time

In my app I use Firebase's childByAutoId() (swift) or .push() (web) to insert some data in the following format:
- events
- $autoId
- time:
- name:
- $autoId
- time:
- name:
Where $autoId are the randomly generated keys Firebase makes. time is the epoch time of when the data was pushed.
I want to allow users to modify each inserted entry's time. However, I want to keep the nodes under events sorted by their key and by time which Firebase naturally does when you use .push(). But if they modify the time so that it should actually be in a different order, the entries won't be sorted correctly.
Is there a way to generate an id by the modified time so that if it were inserted into events it would be in the right order? That way I could just delete the old entry and insert the new one while just duplicating the data.
Since the algorithm for Firebase's push IDs is well documented, you could easily modify the function to generate them based on a specific timestamp.
But I'd recommend instead keeping the necessary values as named properties for each child node. If you need to be able to sort by both creation and modification time, keep two separate properties. That way you won't have to depend on the behavior of the push IDs, but instead use more explicitly named properties to accomplish what you need.
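For reference, a timestamp-based variant of the push ID generator might look like the sketch below. It follows the published layout (8 characters encoding the time, then 12 random characters) but is simplified: it leaves out the logic that keeps IDs generated within the same millisecond in order, and pushIdForTime is a made-up name.

```javascript
// Alphabet used by Firebase push IDs, in ascending ASCII order.
const PUSH_CHARS =
  "-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz";

// Generate a push-style ID for the given millisecond timestamp:
// 8 characters encoding the time, followed by 12 random characters.
// Simplification: IDs created for the same millisecond are not ordered.
function pushIdForTime(millis) {
  let remaining = millis;
  const timeChars = new Array(8);
  for (let i = 7; i >= 0; i--) {
    timeChars[i] = PUSH_CHARS.charAt(remaining % 64);
    remaining = Math.floor(remaining / 64);
  }
  let id = timeChars.join("");
  for (let i = 0; i < 12; i++) {
    id += PUSH_CHARS.charAt(Math.floor(Math.random() * 64));
  }
  return id;
}

// An ID for an earlier time sorts before an ID for a later time.
const earlier = pushIdForTime(1500000000000);
const later = pushIdForTime(1500000060000);
console.log(earlier < later); // true
```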

Voting on items - how to design database/aws-lambda to minimize AWS costs

I'm working on a website that mostly displays items created by registered users. So I'd say 95% of API calls are to read a single item and 5% are to store a single item. System is designed with AWS API Gateway that calls AWS Lambda function which manipulates data in DynamoDB.
My next step is to implement a voting system (upvote/downvote) with basic features:
Each registered user can vote only once per item, and later is only allowed to change that vote.
number of votes needs to be displayed to all users next to every item.
items have only single-item views, and are (almost) never displayed in a list view.
only list view I need is "top 100 items by votes" but it is ok to calculate this once per day and serve cached version
My goal is to design a database/lambda to minimize costs of AWS. It's easy to make the logic work but I'm not sure if my solution is the optimal one:
My items table currently has hashkey slug and sortkey version
I created items-votes table with hashkey slug and sortkey user and also voted field (containing -1 or 1)
I added field votes to items table
API call to upvote/downvote inserts into the items-votes table, but first checks that the user has not already voted that way. Then a second query updates the items table with the new votes count. (so 1 API call and 2 db queries)
old API call to show an item stays the same but grabs new votes count too (1 API call and 1 db query)
I was wondering if this can be done even better by avoiding the new items-votes table and storing user votes inside the items table. It looks like it is possible to save one query that way and halve the Lambda execution time, but I'm worried it might make that table too big/complex. Each user field is a 10-character user ID, so if an item gets thousands of votes I'm not sure how Lambda/DynamoDB will behave compared to the original solution.
I don't expect thousands of votes any time soon, but it is not impossible to happen to a few items and I'd like to avoid situation where I need to migrate to different solution in the near future.
I would suggest using a DynamoDB string set (SS) attribute to maintain the list of users who voted on the item. Something like below:
upvotes : ['user1', 'user2']
downvotes : ['user3', 'user4']
When you update the votes using UpdateExpression, you can use the ADD operator, which adds a user to the set only if that user is not already in it.
ADD - Adds the specified value to the item, if the attribute does not
already exist. If the attribute does exist, then the behavior of ADD
depends on the data type of the attribute:
If the existing data type is a set and if Value is also a set, then
Value is added to the existing set. For example, if the attribute
value is the set [1,2], and the ADD action specified [3], then the
final attribute value is [1,2,3]. An error occurs if an ADD action is
specified for a set attribute and the attribute type specified does
not match the existing set type. Both sets must have the same
primitive data type. For example, if the existing data type is a set
of strings, the Value must also be a set of strings.
This way you don't need to check whether the user has already upvoted or downvoted the item.
The only thing you may need to ensure is that the same user is not present in both the upvotes and the downvotes set. You can probably use the DELETE action (which removes elements from a set) or a ConditionExpression to achieve this.
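A sketch of such an update, using ADD for the set the user joins and DELETE (the update action that removes elements from a set) for the opposite one. It only builds the UpdateItem parameters in the low-level attribute-value format; the function name is invented, and treating the version sort key as a string is an assumption.

```javascript
// Build UpdateItem parameters that record a vote: the user is added to one
// set and deleted from the opposite set in a single update.
// Assumptions: table "items", string keys slug and version.
function buildVoteUpdate(slug, version, userId, direction) {
  const addTo = direction === "up" ? "upvotes" : "downvotes";
  const deleteFrom = direction === "up" ? "downvotes" : "upvotes";
  return {
    TableName: "items",
    Key: { slug: { S: slug }, version: { S: version } },
    UpdateExpression: `ADD ${addTo} :voter DELETE ${deleteFrom} :voter`,
    ExpressionAttributeValues: { ":voter": { SS: [userId] } },
  };
}

console.log(buildVoteUpdate("my-item", "1", "user1", "up").UpdateExpression);
// ADD upvotes :voter DELETE downvotes :voter
```

Changing a vote is then the same call with the other direction, so no separate read is needed before the write.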

Resources