How to search for unique users in this DynamoDB table?

How does one return a list of unique users from a DynamoDB table with the following (simplified) schema? Does it require a GSI? This is for an app with a small number of users, and I can think of ways that will work for my needs without creating a GSI (like scanning and filtering on SK, or creating a new item with a list of user IDs inside). But what is the scalable solution?
-------------------------------------------------------
| pk      | sk                     | amount | balance |
-------------------------------------------------------
| "user1" | "2021-01-01T12:00:00Z" | 7      |         |
| "user1" | "2021-01-03T12:00:00Z" | 5      |         |
| "user2" | "2021-01-01T12:00:00Z" | 3      |         |
| "user2" | "2021-01-03T12:00:00Z" | 2      |         |
| "user1" | "user1"                |        | 12      |
| "user2" | "user2"                |        | 5       |
-------------------------------------------------------

Your data model isn't designed to fetch all unique users efficiently.
You certainly could use a scan operation and filter with your current data model, but that is inefficient.
If you want to fetch all users in a single query, you'll need to get all user information into a single partition. As you've identified, you could do this with a GSI. You could also re-organize your data model to accommodate this access pattern.
For example, you mentioned that the application has a small number of users. If the number of users is small enough, you could create a partition that stores a list of all users (e.g. PK=USERS). If you can keep that item under DynamoDB's 400 KB item size limit, that may be a viable solution.
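A minimal sketch of that single-item approach, assuming a boto3 Table resource; the table name, key values, and the user_ids attribute are made up for illustration. One item keeps a string set of user IDs, and you add to it whenever a user is created:

    import boto3

    # Hypothetical table and attribute names; adjust to your schema.
    table = boto3.resource("dynamodb").Table("my-table")

    def register_user(user_id: str) -> None:
        # ADD on a string set is idempotent, so registering the same user twice is harmless.
        table.update_item(
            Key={"pk": "USERS", "sk": "USERS"},
            UpdateExpression="ADD user_ids :u",
            ExpressionAttributeValues={":u": {user_id}},
        )

    def list_users() -> list:
        item = table.get_item(Key={"pk": "USERS", "sk": "USERS"}).get("Item", {})
        return sorted(item.get("user_ids", []))

The whole set has to stay under the 400 KB item limit, which is why this only works for a small user base.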
The idiomatic solution is to create a global secondary index.
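A rough sketch of the GSI route, under the assumption that you add a constant attribute (say entity = "USER") only to the per-user summary items (the rows where sk equals pk) and create a GSI on it (here called entity-index; both names are invented). Because only those items carry the attribute, the index is sparse and contains exactly one entry per user:

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("my-table")  # hypothetical table name

    def list_users() -> list:
        users = []
        kwargs = {
            "IndexName": "entity-index",                      # hypothetical GSI name
            "KeyConditionExpression": Key("entity").eq("USER"),
        }
        while True:
            resp = table.query(**kwargs)
            users.extend(item["pk"] for item in resp["Items"])
            if "LastEvaluatedKey" not in resp:                # keep paging until done
                break
            kwargs["ExclusiveStartKey"] = resp["LastEvaluatedKey"]
        return users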

Related

DynamoDB access pattern for storing shopping history

What is a solid DynamoDB access pattern for storing data from a bunch of receipts of identical format? I would use SQL for maximum flexibility on more advanced analytics, but as a learning exercise I want to see how far one can go with DynamoDB here. For starters I'd like to query for aggregate overall and per-product spending for a given time range, track product price history, sort receipts by total, and things along those lines. But I also want it to be as flexible as possible for future queries I haven't thought of yet. Would something like this, plus some GSIs, work?
------------------------------------------------------------------------------------------------------------
| pk          | sk                     | unit $ | qty | total $ | receipt total | items                    |
------------------------------------------------------------------------------------------------------------
| "product a" | "2021-01-01T12:00:00Z" | 2      | 2   | 4       |               |                          |
| "product b" | "2021-01-01T12:00:00Z" | 2      | 3   | 6       |               |                          |
| "receipt"   | "2021-01-01T12:00:00Z" |        |     |         | 10            | array of above item data |
| "product a" | "2021-01-02T12:00:00Z" | 1.75   | 3   | 5.25    |               |                          |
| "product c" | "2021-01-02T12:00:00Z" | 2      | 2   | 4       |               |                          |
| "receipt"   | "2021-01-02T12:00:00Z" |        |     |         | 9.25          | array of above item data |
------------------------------------------------------------------------------------------------------------
You have to decide your access patterns first and build the DynamoDB design off of them, not the other way around. No one outside your team/product can tell you what your access patterns are; that depends entirely on your product's needs.
You have to ask: what pieces of information do you have, and what do you need to retrieve when you have those pieces of information? Then decide which of those queries will be run most often and craft your PK/SK combinations around them. If you can't fit all your queries into just one or two pieces of information, you may want to set up an index - but indexes should be reserved for queries that are run far less often.
If you need to, it's also accepted practice to store the same information twice - in two items in the table - since writes are easier/cheaper than multiple reads (a write is roughly one WCU per item, while any query/scan can consume multiple RCUs even if you only need part of the result; also, because indexes are replications of the table, there is a chance of reading stale data if you write and read too quickly, or write/read the same item in parallel calls).
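As a rough illustration of that duplication (the table and attribute names here are invented, not taken from your schema), you could write the per-product items and a receipt item that repeats the same line data in one pass with a batch writer, paying extra WCUs up front so a single read can return a whole receipt later:

    from decimal import Decimal  # DynamoDB rejects floats, so use Decimal for prices
    import boto3

    table = boto3.resource("dynamodb").Table("receipts")  # hypothetical table name

    def save_receipt(timestamp: str, lines: list) -> None:
        # lines: [{"name": "product a", "unit": Decimal("2"), "qty": 2}, ...]
        with table.batch_writer() as batch:
            for line in lines:
                batch.put_item(Item={
                    "pk": line["name"],
                    "sk": timestamp,
                    "unit_price": line["unit"],
                    "qty": line["qty"],
                    "total": line["unit"] * line["qty"],
                })
            # The same line data is duplicated onto the receipt item so one read
            # can fetch the whole receipt without touching the product items.
            batch.put_item(Item={
                "pk": "receipt",
                "sk": timestamp,
                "receipt_total": sum(line["unit"] * line["qty"] for line in lines),
                "items": lines,
            })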
Take your time now to sit down and consider everything your app will need to query DynamoDB for. The more you can figure out now, the better, and if you can set your PK to something that will almost always be available to the calling function doing the query, you will be in a much better position.

Firestore, follower data structure

I'm creating an Instagram clone app for testing.
My data structure is below:
--- users (root collection)
     |
     --- uid (one of documents)
          |
          --- name: "name"
          |
          --- email: "email@email.com"
          |
          --- following (sub collection)
          |    |
          |    --- uid (one of documents)
          |         |
          |         --- customUserId: "blahblah"
          |         |
          |         --- name: "name"
          |         |
          |         --- pictureStorageUrl: "https://~~"
          |
          --- followers (sub collection)
               |
               --- uid (one of documents)
                    |
                    --- customUserId: "blahblah"
                    |
                    --- name: "name"
                    |
                    --- pictureStorageUrl: "https://~~"
Assume user A has 1 million followers. If user A edits their picture, name, or customUserId, should the document in each of those 1 million followers' "following" subcollections be modified?
Should there be 1 million updates? Are there any more efficient ways to handle this? And if there is no better way, is it appropriate to batch the data modification through a Cloud Functions database trigger?
Should the document in each of the 1 million followers' "following" subcollections be modified? Should there be 1 million updates?
That's entirely up to you to decide. If you don't want to update them, then don't. But if you want the data to stay in sync, then you will have to find and update all of the documents where that data is copied.
Are there any more efficient ways to handle this?
To update 1 million documents? No. If you have 1 million documents to update, then you will have to find and update them each individually.
And if there is no better way, is it appropriate to batch the data modification through a Cloud Functions database trigger?
Doing the updates in Cloud Functions still costs 1 million updates. There aren't any shortcuts to this work - it's the same on both the frontend and the backend. Cloud Functions will just let you trigger that work to happen on the backend automatically.
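For what it's worth, here is a rough sketch of that fan-out with the Python Admin SDK (google-cloud-firestore), using the collection layout from the question. Whether this runs on a client or inside a Cloud Function, it is still one write per copied document, just grouped into batched writes:

    from google.cloud import firestore

    db = firestore.Client()

    def fan_out_profile_update(user_id: str, changed_fields: dict) -> None:
        """Copy updated profile fields into every follower's 'following' entry."""
        followers = db.collection("users").document(user_id).collection("followers")
        batch = db.batch()
        pending = 0
        for snap in followers.stream():
            target = (db.collection("users").document(snap.id)
                        .collection("following").document(user_id))
            batch.update(target, changed_fields)
            pending += 1
            if pending == 500:      # commit in chunks so each batch stays small
                batch.commit()
                batch = db.batch()
                pending = 0
        if pending:
            batch.commit()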
If you want to avoid 1 million updates, then you should instead not copy the data 1 million times. Just store a UID, and do a second query to look up information about that user.
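A minimal sketch of that leaner model with the same Python client (collection names taken from the structure above): each "following" document stores only the followed user's UID (here as the document ID), and the profile is resolved with a second lookup when it is actually needed:

    from google.cloud import firestore

    db = firestore.Client()

    def get_following_profiles(user_id: str) -> list:
        """Resolve the profiles of everyone `user_id` follows via a second lookup per UID."""
        following = db.collection("users").document(user_id).collection("following")
        profiles = []
        for snap in following.stream():
            # Nothing is duplicated here: the subcollection document only names the UID.
            profile = db.collection("users").document(snap.id).get()
            if profile.exists:
                profiles.append(profile.to_dict())
        return profiles

The trade-off is extra reads at display time instead of a massive write fan-out whenever a profile changes.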

Put key on wrong record

I have a problem connecting App Maker with Google Cloud Platform. First I created two tables with a one (department) to many (employee) relation. I also created an employee form.
The datasource on the panel uses employee, and the form uses inherited: employee (create). The submit button uses SaveChanges To Datasource. My problem is that when I create the first record, it only saves the employee data, while department_fk is empty.
When I create the second record, it does create the data, but the department_fk is placed on the first record.
+------+--------+---------------+-----------------------+-----------+
| id   | name   | department_fk | email                 | phone     |
+------+--------+---------------+-----------------------+-----------+
| 150  | john   | 262           | bermuara@gmail.com    | 3393939   |
| 151  | brian  | NULL          | takdungder@gmail.com  | 03030303  |
+------+--------+---------------+-----------------------+-----------+
In the records above, brian should have department_fk 262, but it gets assigned to john's record instead. How do I fix this?

How does Telegram check if a newly joined user's number exists in another user's contact list

I've been trying to research this for a while now, and what I want is very simple. I'm trying to compare two phone numbers and check whether they match, because I'm trying to implement something similar to Telegram: notify a user when someone in his contact list creates an account.
My problem is the following:
If I saved my contact using the format 0791234567 and my contact joined using the number +962791234567, both numbers are the same, but the first uses the local format and the second the international format. Does Telegram treat these two numbers as a match and send me a notification indicating that my contact has joined the network?
I tried to use Google's library for parsing the numbers, but unfortunately the library doesn't always parse numbers in every format, especially if the region is not provided.
Any hints? Or is this just not possible, and must all numbers be in a specific format to be able to find a match?
I think you should have two fields, country_code and phone_number, and when registering, logging in, changing the mobile number, etc., collect each of the fields individually.
For example:
id | first_name | last_name | password | country_code | phone_number | ...
----------------------------------------------------------------------------
1  | alihossein | shahabi   | XXXXX    | +98          | 9377548654   |
Or use two tables, users and phone_numbers:
id | first_name | last_name | password |
-----------------------------------------
1  | alihossein | shahabi   | XXXXX    |

id | user_id | country_code | phone_number | active
----------------------------------------------------
1  | 1       | +98          | 9377541258   | 1
2  | 1       | +98          | 9377543333   | 0
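Whichever storage layout you choose, the matching itself usually comes down to normalizing every number to E.164 before comparing. A minimal sketch using the Python port of Google's libphonenumber (the phonenumbers package), assuming you can supply a default region for locally formatted numbers, e.g. the country the user registered from:

    import phonenumbers

    def to_e164(raw: str, default_region: str):
        """Return the number in E.164 form (e.g. +962791234567), or None if it can't be parsed."""
        try:
            parsed = phonenumbers.parse(raw, default_region)
        except phonenumbers.NumberParseException:
            return None
        if not phonenumbers.is_valid_number(parsed):
            return None
        return phonenumbers.format_number(parsed, phonenumbers.PhoneNumberFormat.E164)

    # The local and international forms from the question normalize to the same key:
    print(to_e164("0791234567", "JO"))      # assuming a Jordanian default region
    print(to_e164("+962791234567", "JO"))

Store that normalized value (or the country_code/phone_number split suggested above) and match on it.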

How to handle additional columns in join tables when using Symfony?

Let's assume I have two entities in my Symfony2 bundle, User and Group, associated by a many-to-many relationship.
┌────────────────┐          ┌────────────────┐          ┌────────────────┐
| USER           |          | USER_GROUP_REL |          | GROUP          |
├────────────────┤          ├────────────────┤          ├────────────────┤
| id#            ├----------┤ user_id#       |     ┌----┤ id#            |
| username       |          | group_id#      ├-----┘    | groupname      |
| email          |          | created_date   |          |                |
└────────────────┘          └────────────────┘          └────────────────┘
What would be a good practice or a good approach to add additional columns to the join table, like a created date which represents the date when User joined Group?
I know that I could use the QueryBuilder to write an INSERT statement.
But so far I have not seen any INSERT example with QueryBuilder or native SQL, which makes me believe that ORM/Doctrine tries to avoid direct INSERT statements (e.g. for security reasons). Plus, as far as I have understood Symfony and Doctrine, I would be taken aback if such a common requirement weren't covered by the framework.
You want to set a property of the relation. This is how it's done in Doctrine:
doctrine 2 many to many (Products - Categories)
I answered that question with a use case (like yours).
This is an additional question / answer which considers the benefits and use cases: Doctrine 2 : Best way to manage many-to-many associations
