DynamoDB: How to set up a "Reverse Lookup GSI" - amazon-dynamodb

I'm trying to figure out how to implement a reverse lookup GSI in DynamoDB. I attended an amazing talk about DynamoDB at re:Invent this year (https://youtu.be/HaEPXoXVf2k?t=2674). Around 44 minutes into the talk, the idea of a reverse lookup GSI is presented, but I can't figure out how to implement it in DynamoDB.
I want to add a single GSI to do a reverse lookup.
My current schema looks like:
I would like to be able to query on just the CXSK. I'm planning on overloading the CXSK and would love to be able to run a query with a begins_with condition on that key.
I'm not sure what I'm missing when I go to create the GSI, or what should go in the following fields. I'm also curious whether it makes sense to have an overloaded sort key.

Let's say this is your original table
| pk | sk | prop1 | prop2 | ...
| a | b | xyz | abc
| a | c | lmn | opq
| b | x | rst | lme
| b | b | tuv | opq
On the above table you can run queries like:
select * where pk = a (returns rows 1 and 2)
select * where pk = a and sk = b (returns row 1)
A reverse lookup means you want to query the same data by some other attribute.
Let's say we want to look items up by sk. To do this, we create a GSI with sk as the partition key and pk as the sort key. The GSI is a view of the same table with the two keys swapped.
This will be your GSI1 table, where the pk column holds the original sk and the sk column holds the original pk:
| pk | sk | prop1 | prop2 | ...
| b | a | xyz | abc
| c | a | lmn | opq
| x | b | rst | lme
| b | b | tuv | opq
On the GSI you can run queries like:
select * where pk = b (returns rows 1 and 4)
select * where pk = b and sk = a (returns row 1)
Given the above, in your case you should create a GSI with CXSK as the partition key and USERId as the sort key. Keep in mind that a Query needs an exact match on the partition key; begins_with conditions are only supported on the sort key.
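The key swap can be illustrated without any AWS calls. The following is a minimal in-memory sketch in Python, not DynamoDB code: `build_index` and `query` are hypothetical helpers that only mimic how a table or GSI groups items by partition key and orders them by sort key. In a real table the same idea is expressed by declaring the GSI's KeySchema with the old sort key as HASH and the old partition key as RANGE.

```python
from collections import defaultdict

# The four items from the example table above.
ITEMS = [
    {"pk": "a", "sk": "b", "prop1": "xyz", "prop2": "abc"},
    {"pk": "a", "sk": "c", "prop1": "lmn", "prop2": "opq"},
    {"pk": "b", "sk": "x", "prop1": "rst", "prop2": "lme"},
    {"pk": "b", "sk": "b", "prop1": "tuv", "prop2": "opq"},
]


def build_index(items, partition_attr, sort_attr):
    """Group items by partition_attr; keep each group ordered by sort_attr."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_attr]].append(item)
    for group in index.values():
        group.sort(key=lambda it: it[sort_attr])
    return index


def query(index, sort_attr, partition_value, sort_prefix=""):
    """Exact match on the partition key, optional prefix match on the sort key."""
    return [
        it for it in index.get(partition_value, [])
        if it[sort_attr].startswith(sort_prefix)
    ]


# Base table: partition key = pk, sort key = sk.
base = build_index(ITEMS, "pk", "sk")
# Reverse lookup "GSI1": partition key = sk, sort key = pk.
gsi1 = build_index(ITEMS, "sk", "pk")

rows_for_b = query(gsi1, "pk", "b")                   # all items whose sk is "b"
row_for_b_a = query(gsi1, "pk", "b", sort_prefix="a")  # narrowed by original pk
```

Here `rows_for_b` holds the items with prop1 values xyz and tuv (rows 1 and 4 of the GSI view), and `row_for_b_a` holds only the xyz item.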

Related

How to scan DynamoDB table for retrieving only one item in each partition key

Let's say I have a table with partition key "ID" and range key "Time" with the following items:
ID | Time | Data
------------------
A | 1 | abc
A | 2 | def
B | 2 | ghi
B | 3 | jkl
I want to scan only one item per partition: the one with the highest Time value in that partition. So the outcome of the scan should look like:
ID | Time | Data
------------------
A | 2 | def
B | 3 | jkl
Is this possible with DynamoDB's scan feature?
(I want to avoid scanning everything and doing the filtering myself.)
If you only need to fetch a few IDs along with their highest Time, you can run a Query per ID in descending sort-key order (ScanIndexForward set to false) with a Limit of 1, so only one item is read for each ID. This requires that you already have a list of IDs.
So for each ID, there will be:
1 query
1 item read
Otherwise, unfortunately, the only way is to scan everything.
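The per-ID pattern can be sketched in plain Python. This is an in-memory simulation under stated assumptions, not a DynamoDB client: the `query` helper is hypothetical, and its `scan_index_forward` and `limit` arguments only imitate the Query parameters ScanIndexForward and Limit.

```python
# Items from the question as (ID, Time, Data) tuples.
ITEMS = [
    ("A", 1, "abc"),
    ("A", 2, "def"),
    ("B", 2, "ghi"),
    ("B", 3, "jkl"),
]


def query(items, partition_value, scan_index_forward=True, limit=None):
    """Return the items of one partition, ordered by the range key (Time)."""
    matches = sorted(
        (it for it in items if it[0] == partition_value),
        key=lambda it: it[1],
        reverse=not scan_index_forward,
    )
    return matches if limit is None else matches[:limit]


def latest_per_id(items, ids):
    """One descending query with limit 1 per known ID."""
    latest = []
    for id_ in ids:
        latest.extend(query(items, id_, scan_index_forward=False, limit=1))
    return latest


result = latest_per_id(ITEMS, ["A", "B"])
```

`result` contains the A/2/def and B/3/jkl items, matching the desired output in the question. The list of IDs has to come from somewhere else, which is the limitation the answer points out.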

Inserting Duplicate Records into a Temporary Table

I have a table ABC with duplicate records. I want to insert only the duplicate records into another table, ABC_DUPE, in the same schema using BTEQ.
Any suggestions?
Thanks,
Mukesh
You can use the QUALIFY clause to identify and output duplicates.
Since you didn't share your table, consider the following ABC table:
+----+----+----+
| f1 | f2 | f3 |
+----+----+----+
| 1 | a | x |
| 1 | b | y |
| 2 | a | z |
| 2 | b | w |
| 2 | a | n |
+----+----+----+
Here a unique record is identified by the fields f1 and f2. In this example, the record with f1=2 and f2='a' is duplicated, with f3 values z and n. To output these rows we use QUALIFY:
SELECT *
FROM ABC
QUALIFY COUNT(*) OVER (PARTITION BY f1, f2) > 1;
QUALIFY filters on the result of window functions to decide which records to include in the result set. Here we use the window function COUNT(*), partitioned by our composite key f1, f2, and keep only the records where COUNT(*) over that partition is greater than 1.
This will output:
+----+----+----+
| f1 | f2 | f3 |
+----+----+----+
| 2 | a | z |
| 2 | a | n |
+----+----+----+
You can use this in a CREATE TABLE ... AS statement (WITH DATA copies the selected rows into the new table):
CREATE TABLE ABC_DUPE AS
(
SELECT *
FROM ABC
QUALIFY COUNT(*) OVER (PARTITION BY f1, f2) > 1
) WITH DATA PRIMARY INDEX (f1, f2);
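For readers without a Teradata system at hand, the same logic can be tried in SQLite through Python's sqlite3 module. SQLite has no QUALIFY clause, so this sketch moves the window function into a subquery and filters on it in the outer WHERE; that rewrite is a substitution, not Teradata syntax. It assumes SQLite 3.25 or newer (window function support), and `find_duplicates` is a hypothetical helper name.

```python
import sqlite3

ROWS = [(1, "a", "x"), (1, "b", "y"), (2, "a", "z"), (2, "b", "w"), (2, "a", "n")]


def find_duplicates(rows):
    """Copy rows whose (f1, f2) key occurs more than once into ABC_DUPE."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ABC (f1 INTEGER, f2 TEXT, f3 TEXT)")
    conn.executemany("INSERT INTO ABC VALUES (?, ?, ?)", rows)
    # Emulate QUALIFY: compute the window count in a subquery, filter outside.
    conn.execute(
        """
        CREATE TABLE ABC_DUPE AS
        SELECT f1, f2, f3
        FROM (
            SELECT f1, f2, f3,
                   COUNT(*) OVER (PARTITION BY f1, f2) AS cnt
            FROM ABC
        ) AS counted
        WHERE cnt > 1
        """
    )
    dupes = conn.execute(
        "SELECT f1, f2, f3 FROM ABC_DUPE ORDER BY f3"
    ).fetchall()
    conn.close()
    return dupes


dupes = find_duplicates(ROWS)
```

With the sample data, `dupes` holds the two rows with f1=2 and f2='a' (f3 values n and z, ordered by f3).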

Query to get the list of users who belong to a group

I can select the users that have 'sex = 2' with this query builder call:
createQueryBuilder('s')
->where('s.sex = 2');
How can I select the users who belong to group A?
My tables are below.
My user table:
ID | name |sex
1 | bob |1
2 | kayo |2
3 | ken |1
my fos_group table
ID | name
1 | student
2 | teacher
my fos_user_user_group
user_id | group_id
1 | 1
2 | 2
3 | 1
This means that:
Bob and Ken belong to group 1 (student)
Kayo belongs to group 2 (teacher)
I would like to select the rows from the user table that belong to a given group, such as 'student' or 'teacher'.
For example, this is the list of users belonging to student that I want:
ID | name | sex
1 | bob |1
3 | ken |1
You need to do a join first, and then filter on the association property.
$users = $entityRepository
->createQueryBuilder('s')
->join('s.groups', 'g') // Assuming the association on your user entity is 'groups'
->where('g.name = :group')
->setParameter('group', 'student')
->getQuery()
->getResult();
See http://docs.doctrine-project.org/en/2.0.x/reference/dql-doctrine-query-language.html#joins for examples of filtering on associations with DQL.
In plain SQL, you can list the users of each group with GROUP_CONCAT:
SELECT
g.name AS GroupName,
GROUP_CONCAT(u.name) AS Users
FROM fos_group AS g
INNER JOIN fos_user_user_group AS ug ON ug.group_id = g.ID
INNER JOIN user AS u ON u.id = ug.user_id
GROUP BY g.name;
Output:
GroupName | Users
---------------------------
student | bob,ken
teacher | kayo
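The join-then-filter idea from the answers can be checked against the sample tables with SQLite and Python's sqlite3 module. This is only a sketch of the underlying SQL, not Doctrine code; `users_in_group` is a hypothetical helper, and the table name user is quoted as a precaution because it is a reserved word in some databases.

```python
import sqlite3


def users_in_group(group_name):
    """Return (ID, name, sex) for every user that belongs to group_name."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE "user" (ID INTEGER PRIMARY KEY, name TEXT, sex INTEGER);
        CREATE TABLE fos_group (ID INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE fos_user_user_group (user_id INTEGER, group_id INTEGER);
        INSERT INTO "user" VALUES (1, 'bob', 1), (2, 'kayo', 2), (3, 'ken', 1);
        INSERT INTO fos_group VALUES (1, 'student'), (2, 'teacher');
        INSERT INTO fos_user_user_group VALUES (1, 1), (2, 2), (3, 1);
        """
    )
    # Join through the link table first, then filter on the group name.
    rows = conn.execute(
        """
        SELECT u.ID, u.name, u.sex
        FROM "user" AS u
        JOIN fos_user_user_group AS ug ON ug.user_id = u.ID
        JOIN fos_group AS g ON g.ID = ug.group_id
        WHERE g.name = ?
        ORDER BY u.ID
        """,
        (group_name,),
    ).fetchall()
    conn.close()
    return rows


students = users_in_group("student")
```

`students` contains bob and ken, the rows asked for in the question. The bound parameter plays the same role as `:group` in the query builder version.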

How to add a new column in a View in sqlite?

I have this table in SQLite (table1):
+-----+-------+-------+
| _id | name | level |
+-----+-------+-------+
| 1 | Mike | 3 |
| 2 | John | 2 |
| 3 | Bob | 2 |
| 4 | David | 1 |
| 5 | Tom | 2 |
+-----+-------+-------+
I want to create a view with all elements of level 2 and then to add a new column indicating the order of the row in the new table. That is, I would want this result:
+-------+------+
| index | name |
+-------+------+
| 1 | John |
| 2 | Bob |
| 3 | Tom |
+-------+------+
I have tried:
CREATE VIEW words AS SELECT _id as index, name FROM table1;
But then I get:
+-------+------+
| index | name |
+-------+------+
| 2 | John |
| 3 | Bob |
| 5 | Tom |
+-------+------+
I suppose it should be something like:
CREATE VIEW words AS SELECT XXXX as index, name FROM table1;
What should I use instead of XXXX?
When ordered by _id, the number of rows up to and including this one is the same as the number of rows where the _id value is less than or equal to this row's _id:
CREATE VIEW words AS
SELECT (SELECT COUNT(*)
FROM table1 b
WHERE level = 2
AND b._id <= a._id) AS "index",
name
FROM table1 a
WHERE level = 2;
(The computation itself does not actually require ORDER BY _id because the order of the rows does not matter when we're just counting them.)
Please note that words is not guaranteed to be sorted; add ORDER BY "index" if needed.
And this is, of course, not very efficient.
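The view above can be tried end to end with Python's sqlite3 module. This is a small sketch using the sample data from the question; the helper name `numbered_level2` is hypothetical, and the subquery's `level` is written as `b.level` only to make the scoping explicit.

```python
import sqlite3


def numbered_level2():
    """Create the words view and return its rows ordered by the new index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (_id INTEGER PRIMARY KEY, name TEXT, level INTEGER)"
    )
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?)",
        [(1, "Mike", 3), (2, "John", 2), (3, "Bob", 2), (4, "David", 1), (5, "Tom", 2)],
    )
    # The index of a row is the number of level-2 rows with _id <= its own _id.
    conn.execute(
        """
        CREATE VIEW words AS
        SELECT (SELECT COUNT(*)
                FROM table1 b
                WHERE b.level = 2
                  AND b._id <= a._id) AS "index",
               name
        FROM table1 a
        WHERE a.level = 2
        """
    )
    rows = conn.execute('SELECT "index", name FROM words ORDER BY "index"').fetchall()
    conn.close()
    return rows


rows = numbered_level2()
```

`rows` comes back as John, Bob, Tom numbered 1 to 3. Note that index has to be quoted, since it is an SQL keyword.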
You have two options. First, you could simply add a new column with the following:
ALTER TABLE {tableName} ADD COLUMN COLNew {type};
The second option is more complicated, but it puts the column where you want it. Start by renaming the table:
ALTER TABLE {tableName} RENAME TO TempOldTable;
Then create the new table with the missing column:
CREATE TABLE {tableName} (name TEXT, COLNew {type} DEFAULT {defaultValue}, qty INTEGER, rate REAL);
And populate it with the old data:
INSERT INTO {tableName} (name, qty, rate) SELECT name, qty, rate FROM TempOldTable;
Then delete the old table:
DROP TABLE TempOldTable;
I much prefer the second option, as it also lets you rename anything else while you are at it.
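The rename-and-recreate steps can be sketched with Python's sqlite3 module. The concrete names here are assumptions standing in for the placeholders above: a table called items, a new column called discount of type REAL with default 0.0, and the helper `add_column_in_position`.

```python
import sqlite3


def add_column_in_position(conn):
    """Rebuild the items table with a discount column between name and qty."""
    conn.execute("ALTER TABLE items RENAME TO TempOldTable")
    conn.execute(
        "CREATE TABLE items "
        "(name TEXT, discount REAL DEFAULT 0.0, qty INTEGER, rate REAL)"
    )
    # Copy the old data; discount is filled in from its DEFAULT.
    conn.execute(
        "INSERT INTO items (name, qty, rate) "
        "SELECT name, qty, rate FROM TempOldTable"
    )
    conn.execute("DROP TABLE TempOldTable")


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT, qty INTEGER, rate REAL)")
    conn.execute("INSERT INTO items VALUES ('widget', 3, 1.5)")
    add_column_in_position(conn)
    columns = [row[1] for row in conn.execute("PRAGMA table_info(items)")]
    rows = conn.execute("SELECT * FROM items").fetchall()
    conn.close()
    return columns, rows


columns, rows = demo()
```

After the rebuild, `columns` lists name, discount, qty, rate in that order and the existing row is preserved. Indexes, triggers, and views defined on the old table are not carried over by these statements and would need to be recreated separately.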

Pull a row from SQL database based on if the value of a column is changed

I need to pull a row in a select statement from a SQL database if a certain value in a table has changed.
For example, I have a column called price in a Price table. If the user changes the value of price (through an ASP.NET app), I want to select that entire row. This will be done in a workflow, and an email is sent to the user with the row that was changed, AFTER it was changed.
Does this make sense? Can someone point me in the right direction of a procedure or function to use? Thanks.
You could use an SQL trigger to accomplish this.
There is a tutorial (using Price as you described) that shows how to accomplish this here: http://benreichelt.net/blog/2005/12/13/making-a-trigger-fire-on-column-change/
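As a rough illustration of the trigger approach, here is a sketch using SQLite through Python's sqlite3 module. Trigger syntax differs between database engines, so this is not the syntax from the linked tutorial; the price_audit table, the price_changed trigger, and the `demo` helper are all hypothetical names.

```python
import sqlite3


def demo():
    """Record a row in price_audit only when the price value really changes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Price (id INTEGER PRIMARY KEY, price REAL);
        CREATE TABLE price_audit (id INTEGER, old_price REAL, new_price REAL);

        -- Fires after an UPDATE that touches price, but only if the value differs.
        CREATE TRIGGER price_changed AFTER UPDATE OF price ON Price
        WHEN OLD.price <> NEW.price
        BEGIN
            INSERT INTO price_audit (id, old_price, new_price)
            VALUES (NEW.id, OLD.price, NEW.price);
        END;

        INSERT INTO Price VALUES (1, 2.5), (2, 4.0);
        """
    )
    conn.execute("UPDATE Price SET price = 42 WHERE id = 1")   # real change
    conn.execute("UPDATE Price SET price = 4.0 WHERE id = 2")  # same value
    audit = conn.execute(
        "SELECT id, old_price, new_price FROM price_audit"
    ).fetchall()
    conn.close()
    return audit


audit = demo()
```

Only the update on id 1 is logged, because the second update writes the same value. A workflow could then poll the audit table and select the changed rows from Price to build the email.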
Well, in order to update a row, you have to run the update with "WHERE uniqueID = [someid]". Can't you simply run a select immediately after that? (SELECT * FROM [table] WHERE uniqueID = [someid])
Without knowing what your data looks like or what database this is, it's a little difficult, but assuming you have a history table with a date and an ID that stays the same, like this:
+----+-------+------------+
| ID | PRICE | CHNG_DATE |
+----+-------+------------+
| 1 | 2.5 | 2001-01-01 |
| 1 | 42 | 2001-01-01 |
| 2 | 4 | 2001-01-01 |
| 2 | 4 | 2001-01-01 |
| 3 | 4 | 2001-01-01 |
| 3 | 3 | 2001-01-01 |
| 3 | 2 | 2001-01-01 |
+----+-------+------------+
and your database supports WITH and ROW_NUMBER, you could write the following:
WITH data
AS (SELECT id,
price,
chng_date,
Row_number()
OVER (
partition BY id
ORDER BY chng_date) rn
FROM price)
SELECT data.id,
data.price new,
data_prv.price old,
data.chng_date
FROM data
INNER JOIN data data_prv
ON data.id = data_prv.id
AND data.rn = data_prv.rn + 1
WHERE data.price <> data_prv.price
That would produce this
+----+-----+-----+------------+
| ID | NEW | OLD | CHNG_DATE |
+----+-----+-----+------------+
| 1 | 42 | 2.5 | 2001-01-01 |
| 3 | 3 | 4 | 2001-01-01 |
| 3 | 2 | 3 | 2001-01-01 |
+----+-----+-----+------------+
If your database supports LAG(), it's even easier:
WITH data
AS (SELECT id,
price new,
chng_date,
Lag(price)
OVER (
partition BY id
ORDER BY chng_date) old
FROM price)
SELECT id,
new,
old,
chng_date
FROM data
WHERE new <> old
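The LAG() variant can be run in SQLite (3.25 or newer) via Python's sqlite3 module. In this sketch the change dates are made-up, distinct values, because ordering by a date that is identical for every row would leave the row order within an ID undefined; the aliases are also renamed to new_price and old_price to stay clear of keywords. `price_changes` is a hypothetical helper.

```python
import sqlite3

# Hypothetical history rows: (id, price, chng_date) with distinct dates per id.
HISTORY = [
    (1, 2.5, "2001-01-01"), (1, 42, "2001-01-02"),
    (2, 4, "2001-01-01"), (2, 4, "2001-01-02"),
    (3, 4, "2001-01-01"), (3, 3, "2001-01-02"), (3, 2, "2001-01-03"),
]


def price_changes(history):
    """Return (id, new_price, old_price, chng_date) for every real change."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE price (id INTEGER, price REAL, chng_date TEXT)")
    conn.executemany("INSERT INTO price VALUES (?, ?, ?)", history)
    rows = conn.execute(
        """
        WITH data AS (
            SELECT id,
                   price AS new_price,
                   chng_date,
                   LAG(price) OVER (PARTITION BY id ORDER BY chng_date) AS old_price
            FROM price
        )
        SELECT id, new_price, old_price, chng_date
        FROM data
        WHERE new_price <> old_price  -- first row per id has NULL old_price, so it drops out
        ORDER BY id, chng_date
        """
    ).fetchall()
    conn.close()
    return rows


changes = price_changes(HISTORY)
```

`changes` has three rows: one for ID 1 (2.5 to 42) and two for ID 3 (4 to 3, then 3 to 2). ID 2 never changes price, so it does not appear.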
