I have no idea what to call what I'm trying to do, but I can explain it quite well. I have two tables with the following structure in my WebSQL database. This is being used in my mobile application (Hybrid app) to keep storage of user messages. These are my tables:
message_threads [ thread_id, user_id, last_seen, last_active ]
messages [ message_id, thread_id, message_type, message_content, message_date ]
I already have all of the logic handled for adding messages to the database, and creating new message threads, but the problem I have is ordering them when trying to retrieve them.
I would like to order my results (of messages) by the last_active field of the message_thread with the corresponding thread_id. On top of that, I also want to order all of the messages within each thread by the message's message_date field.
So, basically, I want to group all of my messages by their thread, order the thread by the last_active field, then order the messages inside of that thread by the message_date field. I assume there's a way to do this in SQLite, and if not I can just do a lot of loop logic on the front-end, which really won't hurt anything, but it's always nice to know the tricks of the query language.
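A single JOIN should handle this, since SQLite lets you ORDER BY columns from either table. A minimal sketch, using the column names from the schemas above (the extra m.thread_id term just keeps threads from interleaving if two threads share a last_active value):

SELECT m.message_id, m.thread_id, m.message_type, m.message_content, m.message_date
FROM messages AS m
JOIN message_threads AS t ON t.thread_id = m.thread_id
ORDER BY t.last_active DESC, m.thread_id, m.message_date ASC;

In WebSQL you would pass that string to tx.executeSql inside a transaction; rows come back grouped by thread (most recently active thread first) with each thread's messages in chronological order.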
I've been building a serverless app using DynamoDB as the database, and have been following the single table design pattern (e.g. https://www.alexdebrie.com/posts/dynamodb-single-table/). Something that I'm starting to come up against is the use of DynamoDB streams - I want to be able to use a DynamoDB stream to keep an Elasticsearch instance up to date.
At the minute the single DynamoDB table holds about 10 different types of items (and this will continue to expand). One of these item types, 'event' (as in a sporting event), will be sent to the Elasticsearch instance for complex querying/searching. Therefore any changes to an 'event' item will need to be updated in Elasticsearch via a lambda function triggered by the stream.
What I am struggling with is that I will have a lambda triggered on 'update' of any of the table items, but that could also be an update of one of the other 9+ item types. I get that inside the lambda I can check for the item that was updated and check its type etc., but it seems wasteful that pretty much any update to any item type will trigger the lambda, potentially far more often than needed.
Is there a better way to handle this that is less wasteful and more targeted to only one item type? I'm thinking that as the app grows and more stream triggers are needed, at least there would be an 'update' lambda already being triggered in which I could run some logic to see what type of item was updated, but I'm just concerned I've missed a point on something.
You can use Lambda Event Filtering. This will allow you to prevent specific events from ever invoking your function. In the case of your single-table DynamoDB design pattern, you can filter so that only records with type: EVENT invoke the function.
If you happen to be using the Serverless Framework, the following YAML snippet shows how you can easily implement this feature.
functionName:
  handler: src/functionName/function.handler
  # other properties
  events:
    - stream:
        type: dynamodb
        arn: !GetAtt DynamoDbTable.StreamArn
        maximumRetryAttempts: 1
        batchSize: 1
        filterPatterns:
          - eventName: [MODIFY]
            dynamodb:
              NewImage:  # stream records expose the updated item under NewImage
                type:
                  S: [EVENT]
Note that multiple comparison operators exist, such as begins-with, i.e. [{"prefix":"EVENT"}] ~ see Filter rule syntax for more.
Source: Pawel Zubkiewicz on Dev.to
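If you are not using the Serverless Framework, the same pattern can be attached directly to the event source mapping as FilterCriteria. A rough boto3 sketch (the ARN and function name are placeholders):

import json
import boto3

lambda_client = boto3.client('lambda')
lambda_client.create_event_source_mapping(
    EventSourceArn='arn:aws:dynamodb:eu-west-1:123456789012:table/MyTable/stream/2021-01-01T00:00:00.000',
    FunctionName='functionName',
    StartingPosition='LATEST',
    BatchSize=1,
    # only MODIFY records whose new image has type = EVENT reach the function
    FilterCriteria={'Filters': [{'Pattern': json.dumps({
        'eventName': ['MODIFY'],
        'dynamodb': {'NewImage': {'type': {'S': ['EVENT']}}},
    })}]},
)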
Unfortunately, the approach you are describing is the only way to process DynamoDb streams. I went down the same path myself, thinking it could not be the correct usage, but it is the only way you can process streams.
I am creating a leave tracker app where I want to store the user ID along with the from date and to date. I am using Amazon's DynamoDB as the database, and the user enters a leave through a custom command.
E.g.: apply-leave from-date to-date
I want to avoid duplicate entries in the database. For example, if a user has already applied for a leave between 06-10-2019 to 10-10-2019 and applies for a leave between the same dates again, they should get a message saying that this already exists and a new record should not be created for the same.
However, a user can apply for multiple leaves and two users can take a leave between the same dates.
I tried using a conditional statement as follows:
table.put_item(
    Item={
        'leave_id': leave_id,
        'user_id': user_id,
        'from_date': from_date,
        'to_date': to_date,
    },
    ConditionExpression='attribute_not_exists(user_id) AND attribute_not_exists(from_date) AND attribute_not_exists(to_date)'
)
where leave_id is the partition key. However, this does not work: a new row is added every time, even for the same dates. I have looked through other similar questions, but haven't been able to understand how to get this configured correctly.
Any ideas on how I should go about this, or if there is a different design that I should follow?
If you call your code with a leave_id that doesn't yet exist in the table, the item will always be inserted. If you call your code with a leave_id that already exists in your table, you should be getting an "An error occurred (ConditionalCheckFailedException) when calling the PutItem operation: The conditional request failed" error message.
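To turn that error into the "already exists" message, catch it around the put_item call. A minimal sketch, reusing the table object and variables from the question:

from botocore.exceptions import ClientError

try:
    table.put_item(
        Item={
            'leave_id': leave_id,
            'user_id': user_id,
            'from_date': from_date,
            'to_date': to_date,
        },
        # only guards the item addressed by this exact primary key
        ConditionExpression='attribute_not_exists(leave_id)',
    )
except ClientError as e:
    if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
        print('A leave with this id already exists')
    else:
        raise

Keep in mind the condition is evaluated only against the item with that leave_id; it cannot detect a different leave_id with the same dates, which is why the suggestions below restructure the data.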
I have two suggestions:
If you don't want to change your table, you can create a secondary index with user_id as the partition key and then query the index for all the items where the given user has some from_date and to_date attributes.
Like this:
from boto3.dynamodb.conditions import Attr, Key

table.query(
    IndexName='user_id-index',
    KeyConditionExpression=Key('user_id').eq(user_id),
    FilterExpression=Attr('from_date').exists() & Attr('to_date').exists()
)
Then you will need to check for overlapping leave requests, etc. (e.g. a leave request that starts before an existing one finishes). After deciding that the leave request is a valid one, you call put_item.
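The overlap test itself is simple: two inclusive date ranges overlap exactly when each one starts before the other ends. A sketch with a hypothetical helper:

from datetime import date

def overlaps(new_from, new_to, existing_from, existing_to):
    # True when the two inclusive date ranges share at least one day
    return new_from <= existing_to and existing_from <= new_to

# the 06-10-2019..10-10-2019 request from the question conflicts with itself
assert overlaps(date(2019, 10, 6), date(2019, 10, 10),
                date(2019, 10, 6), date(2019, 10, 10))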
Another suggestion, and probably a better one, would be to create a composite primary key on your table, with user_id as the partition key and leave_id as the sort key. That way you could run a query for all leave requests from a particular user without the need to create a secondary index.
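With that composite key, gathering everything needed for the duplicate/overlap check becomes a single Query on the base table. A sketch, assuming the table has been recreated with user_id as the partition key and leave_id as the sort key:

from boto3.dynamodb.conditions import Key

response = table.query(
    KeyConditionExpression=Key('user_id').eq(user_id)
)
existing_leaves = response['Items']  # every leave request for this user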
I've read almost everywhere about structuring one's Firebase Database for efficient querying, but I am still a little confused between two alternatives that I have.
For example, let's say I want to get all of a user's "maxBenchPressSessions" from the past 7 days or so.
I'm stuck between picking between these two structures:
In the first structure, I use the user's id as the attribute name, with true or false as the value to index on. In the second, I use userId as the attribute name, whose value would be the user's id.
Is one faster than the other, or would they be indexed in relatively the same manner? I'm kind of new to database design, so I want to make sure that I'm following correct practices.
PROGRESS
I have come up with a solution that will both flatten my database AND allow me to add a ListenerForSingleValueEvent using orderBy ONLY once, but only when I want to check if a user has a session saved for a specific day.
I can have each maxBenchPressSession object have a key in the format of userId_dateString. However, if I want to get all the user's sessions from the last 7 days, I don't know how to do it in one query.
Any ideas?
I recommend watching these videos; they explain how to structure the data very well.
Links to the Firebase 3.0 playlist:
Firebase 3.0: Data Modelling
Firebase 3.0: Node Client
As I understand it, the principle for using Firebase effectively is to make each query fetch as little data as possible; it doesn't matter how many requests that takes.
This approach will work for your request, but you'll have to add another field to the database, "negativeDate".
This field allows you to get the last seven entries. Here's a video -
https://www.youtube.com/watch?v=nMR_JPfL4qg&feature=youtu.be&t=4m36s
.limitToLast(7) - 7 entries
.orderByChild('negativeDate') - sort by date
Example of a request:
const ref = firebase.database().ref('maxBenchPressSession');
ref.orderByChild('negativeDate').limitToLast(7).on('value', function(snap){ })
Then, to get a single user's sessions, add the user ID to the path and run the same query:
const ref = firebase.database().ref('maxBenchPressSession/' + userId);
ref.orderByChild('negativeDate').limitToLast(7).on('value', function(snap){ })
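For completeness, a sketch of what writing a session with that extra field might look like (the payload fields are made up):

const now = Date.now();
firebase.database().ref('maxBenchPressSession/' + userId).push({
  weight: 225,         // hypothetical session data
  date: now,
  negativeDate: -now   // negated timestamp, so ordering by it reverses chronology
});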
I'm designing a chat app much like Facebook Messenger. My two current root nodes are chats and users. A user has an associated list of chats users/user/chats, and the chats are added by autoID in the chats node chats/a151jl1j6. That node stores information such as a list of the messages, time of the last message, if someone is typing, etc.
What I'm struggling with is where to make the definition of which two users are in the chat. Originally, I put a reference to the other user as the value of the chatId key in the users/user/chats node, but I thought that was a bad idea in case I ever wanted group chats.
What seems more logical is to have a chats/chat/members node in which I define userId: true, user2id: true. My issue with this is how to efficiently query it. For example, if the user is going to create a new chat with a user, we want to check if a chat already exists between them. I'm not sure how to do the query of "Find chat where members contains currentUserId and friendUserId" or if this is an efficient denormalized way of doing things.
Any hints?
Although the idea of having ids in the format id1---||---id2 definitely gets the job done, it may not scale if you expect to have large groups, and you have to account for id2---||---id1 comparisons, which get more complicated when you have more people in a conversation. You should go with that approach if you don't need to worry about large groups.
I'd actually go with using the autoId chats/a151jl1j6 since you get it for free. The recommended way to structure the data is to make the autoId the key in the other nodes with related child objects. So chats/a151jl1j6 would contain the conversation metadata, members/a151jl1j6 would contain the members in that conversation, messages/a151jl1j6 would contain the messages and so on.
"chats":{
"a151jl1j6":{}}
"members":{
"a151jl1j6":{
"user1": true,
"user2": true
}
}
"messages":{
"a151jl1j6":{}}
The part where this gets a little "inefficient" is querying for conversations that include both user1 and user2. The recommended way is to create an index of conversations for each user and then query the members data.
"user1":{
"chats":{
"a151jl1j6":true
}
}
This is a trade-off when it comes to querying relationships with a flattened data structure. The queries are fast since you are only dealing with a subset of the data, but you end up with a lot of duplicate data that needs to be accounted for when modifying/deleting, i.e. when a user leaves a chat conversation, you have to update multiple structures.
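Concretely, the "does a chat between these two users already exist?" check becomes: read the current user's chat index, then test friendUserId against each members node. A sketch against the structure above (paths and variable names are assumptions):

const db = firebase.database();
db.ref('users/' + currentUserId + '/chats').once('value').then(function (snap) {
  const chatIds = Object.keys(snap.val() || {});
  return Promise.all(chatIds.map(function (chatId) {
    return db.ref('members/' + chatId + '/' + friendUserId).once('value')
      .then(function (memberSnap) { return memberSnap.exists() ? chatId : null; });
  }));
}).then(function (results) {
  const existingChatId = results.find(function (id) { return id !== null; });
  // existingChatId is undefined when no shared chat exists yet
});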
Reference: https://firebase.google.com/docs/database/ios/structure-data#flatten_data_structures
I remember I had a similar issue some time ago. This is the way I solved it:
user 1 has a unique ID id1
user 2 has a unique ID id2
Instead of adding a new chat by autoId chats/a151jl1j6, the ID of the chat was id1---||---id2 (super-original human-readable delimiter)
(which is exactly what you've originally suggested)
Originally, I put a reference to the other user as the value of the chatId key in the users/user/chats node, but I thought that was a bad idea in case I ever wanted group chats.
There is a saying: https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it
There might be a limitation on how many userIDs can live in the path - you can always hash the value...
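If you do go this route, sorting the two IDs before joining them makes the key order-independent, so you never need the reverse id2---||---id1 comparison. A tiny sketch:

function chatIdFor(userA, userB) {
  // sorted so (a, b) and (b, a) produce the same chat key
  return [userA, userB].sort().join('---||---');
}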
I am working on a card game platform with a lot (10,000+) of cards dynamically updated with real-world data. Cards are populated/updated once a day.
I have two basic collections at the foundation (aside from users):
1) data - all individual items with different data values for the same data fields/parameters (for example, various car models with their specifications). I update this collection once a day from a JSON API I have on another server for another purpose.
2) cards - "printed" cards with unique IDs, but duplicates are of course possible (so we can have ten Ford Focus 2010 cards).
The cards collection has a couple of the most important fields from the data collection (model, brand, top performance parameter(s) of the card) to provide efficient user card browsing, and a "dataId" field which links it to the data collection for detailed info.
Cards in the "cards" collection should be inserted ("issued" or "printed") with functions/methods on the server side, but in response to client-side events (such as new-game etc.). When a new card is inserted/dispatched, it first gets a unique "admin-owner" with a user _id from the users table for a one-to-one relationship, which is later updated to transfer ownership.
So, on the client side, the cards collection is like a user "deck" (all cards where the owner is the user). If I am correct, it should be written on the server side as:
Meteor.publish('cards', function() {
    return Cards.find({"userID": this.userId});
});
This is all quite clear and up to that point Meteor is fantastic as it saves me months of work!
But, I am not sure about:
1) I would like to have a client-side data collection publication to cover the client's detailed card view (by linking cards with data). It should of course have only the data items from the data collection with details for each card in the client cards collection ("deck"). I see it as something like:
Meteor.publish('data', function (dataIds) {
    // dataIds: array with all unique data item ids in the client card collection
    return Data.find({dataID: {$in: dataIds}});
});
2) I need a server/client method to add/insert new cards from data items ("create 10 new Ford Focus 2010 cards") with an empty/admin user, by executing Meteor.call methods from the client console as an "admin" user, and a server/client method to change the ownership of a random card so that it becomes part of a client's cards collection ("cast random card to user").
Where would I place those methods? How can I access server methods from the client console (if a certain admin user is logged in)?
4) I need a clever way of handling a server publication/client subscription of the data collection so that it contains only the data used by cards in the client's cards collection.
Should I use a client-side minimongo query to create an array with all the dataIds needed to cover the local cards collection? I am new to Mongo, so I am not sure how I would write something like SELECT DISTINCT or GROUP BY to get that. Also, I'm not sure if that is the best way to handle it, or whether I should do something server side as a publication.
Having a clear idea on 1-4 would get me going and then I guess I would dig my way around (and under :)
1) The publish function you wrote makes perfect sense. Of course, there's a bit of confusion in the term "client-side data collection publication": publications are on the server side, while on the client side you've got subscriptions. Also, while you didn't specify your schema, I suppose you've got a dataID field in the cards collection that joins with _id in the data collection, so your find should say {_id: {$in: dataIds}}.
2) Read this carefully; it has all you need for that. Remember to check user privileges within the server-side method. A rule of thumb for security is that you should never trust the client.
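As a rough sketch of what such a server-side method could look like (the method name, fields and privilege check are all hypothetical - adapt them to your schema):

// server side
Meteor.methods({
  issueCards: function (dataId, count) {
    check(dataId, String);
    check(count, Number);
    if (!this.userId) {  // replace with a real admin-privilege check
      throw new Meteor.Error('not-authorized');
    }
    for (var i = 0; i < count; i++) {
      Cards.insert({dataId: dataId, userID: this.userId, issuedAt: new Date()});
    }
  }
});

From the browser console of a logged-in admin you would then call Meteor.call('issueCards', someDataId, 10).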
3) There's no point 3?
4) I'm not sure how this question is different from 1). However, you should probably familiarize yourself with this method, which you can use in your subscription to ensure the _ids in the array are unique.
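For the client side of 4), a minimal sketch of collecting the distinct dataIds from the local cards collection and subscribing with them - effectively your SELECT DISTINCT (underscore's _.uniq is available in Meteor by default):

// client side
var dataIds = _.uniq(Cards.find({}, {fields: {dataId: 1}}).map(function (c) {
  return c.dataId;
}));
Meteor.subscribe('data', dataIds);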