DynamoDB intersection select with pagination - amazon-dynamodb

I have following DB schema and I'd like to find the best way how to select list of Sorted keys which are common for PK_A and PK_B:
+---------------+---------+
| PK | SortKey |
+---------------+---------+
| | SK_A |
| PK_A | SK_B |
| | SK_C |
| - - - - - - - | |
| | SK_B |
| PK_B | SK_C |
| | SK_D |
+---------------+---------+
so when I do select by PK_A and PK_B it should return me only SK_B and SK_C?
Any help is appreciated.

Simple answer, you can't do it (in one call).
Dynamo is not a relational database, operations such as intersection are not supported.
You'd need to query() once for each partition key and then calculate the intersect yourself.

Related

MariaDB DATETIME Index not working with Between FROM_UNIXTIME()

I have a table with DATETIME field, which is indexed by a BTree. Now i want to query it with following statement:
SELECT
count(us.CITY) as metric,
us.CITY as Name,
us.LATITUDE as latitude,
us.LONGITUDE as longitude
FROM
FACT
LEFT JOIN
USER us
ON
us.ID_USER = FACT.USER
WHERE
ASSESSMENT_DATE BETWEEN FROM_UNIXTIME(1601568552) AND FROM_UNIXTIME(1604028277)
GROUP BY us.CITY, us.LATITUDE, us.LONGITUDE;
EXPLAIN:
+------+-------------+-------+--------+----------------------------+---------+---------+------------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+--------+----------------------------+---------+---------+------------------------------+--------+----------------------------------------------+
| 1 | SIMPLE | FACT | ALL | INDEX_FACT_ASSESSMENT_DATE | NULL | NULL | NULL | 762621 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | us | eq_ref | PRIMARY | PRIMARY | 46 | dwh0.FACT.USER,dwh0.FACT.ENV | 1 | |
+------+-------------+-------+--------+----------------------------+---------+---------+------------------------------+--------+----------------------------------------------+
2 rows in set (0.001 sec)
Interestingly, by only changing the dates manually into the DATETIME Format string it uses the index. But the FROM_UNIXTIME() function should in my opinion return the exactly same thing...
SELECT
count(us.CITY) as metric,
us.CITY as Name,
us.LATITUDE as latitude,
us.LONGITUDE as longitude
FROM
FACT
LEFT JOIN
USER us
ON
us.ENV = FACT.ENV AND us.ID_USER = FACT.USER
WHERE
-- ASSESSMENT_DATE BETWEEN FROM_UNIXTIME(1596649101) AND FROM_UNIXTIME(1599108827)
ASSESSMENT_DATE BETWEEN '2020-08-05 11:30:11.987' AND '2020-09-03 11:30:11.987'
GROUP BY us.CITY, us.LATITUDE, us.LONGITUDE;
EXPLAIN:
+------+-------------+-------+--------+----------------------------+----------------------------+---------+------------------------------+--------+--------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
|
+------+-------------+-------+--------+----------------------------+----------------------------+---------+------------------------------+--------+--------------------------------------------------------+
| 1 | SIMPLE | FACT | range | INDEX_FACT_ASSESSMENT_DATE | INDEX_FACT_ASSESSMENT_DATE | 5 | NULL | 132008 | Using index condition; Using temporary; Using filesort |
| 1 | SIMPLE | us | eq_ref | PRIMARY | PRIMARY | 46 | dwh0.FACT.USER,dwh0.FACT.ENV | 1 |
|
+------+-------------+-------+--------+----------------------------+----------------------------+---------+------------------------------+--------+--------------------------------------------------------+
2 rows in set (0.001 sec)
Can anyone refer to such a problem? the where clause is generated by grafana, so i can not change that, but the rest i can change if it changes something.
Thanks for suggestions!
Sorry for bothering.. after around 10^5 more inserts, it works for both cases... Maybe it was just bad luck

SQLite Versioning. Is it possible to use EXCEPT to show differences between rows where only one column changes?

I'm quite new to SQLite and I'm trying to use an EXCEPT statement in order to compare two tables with very similar data. The data comes from a CSV file I download daily, and within the file new rows are added and deleted, and old rows can have one or more columns change each day. I'm trying to find a way to select rows that have had a column's data change, when I am unable to predict which column's data will change.
Say for example I have:
TABLE contracts:
|ID|Description|Name|Contract Type|
|1 |Plumbing |Bob |Paper |
|2 |Cooking |Ryan|Paper |
|3 |Driving |Eric|Paper |
|4 |Dancing |Emma|Paper |
and:
TABLE updated_contracts:
|ID|Description|Name|Contract Type|
|1 |Hiking |Bob |Paper |
|2 |Cooking |Ryan|Paper |
|3 |Driving |Eric|Paper |
|4 |Dancing |Emma|Digital |
I'd like it to return:
|1 |Hiking |Bob |Paper |
|4 |Dancing |Emma|Digital |
because contract 1 has changed the description and contract 4 has changed the contract type.
Is it possible to do this in SQLite?
One way to do it is with a LEFT join of updated_contracts to contracts where the matching rows are filtered out:
select uc.*
from updated_contracts uc left join contracts c
using(id, Description, Name, `Contract Type`)
where c.id is null
EXCEPT can also be used like this:
select * from updated_contracts
except
select * from contracts
This will work only if the tables have the same number of columns and its advantage is that it compares null values in columns and returns true if they are both null.
See the demo.
Results:
| ID | Description | Name | Contract Type |
| --- | ----------- | ---- | ------------- |
| 1 | Hiking | Bob | Paper |
| 4 | Dancing | Emma | Digital |

WSO2 analytics server database is growing

I am using WSO2 API Manager along with it's analytics server. I configured MySQL as it's database.
After a year of PROD use, I found that there are couple of tables from Analytics module, which consumes most of the DB space, around 95%.
Would like to know the significance of these tables. As well the challenges if we delete those tables.
Table names are
+--------------------------------+------------------------------------------------------+------------+
| Database | Table | Size in MB |
+--------------------------------+------------------------------------------------------+------------+
| wso2_analytics_event_store | anx___7lsekeca_ | 665.03 |
| wso2_analytics_event_store | anx___7lmnf2xa_ | 638.00 |
| wso2_analytics_event_store | anx___7lqcf_8o_ | 636.14 |
| wso2_analytics_event_store | anx___7lmk3tr0_ | 398.13 |
| analytics_processed_data_store | anx___7lpteea4_ | 282.75 |
| analytics_processed_data_store | anx___7lsn7ita_ | 249.97 |
| wso2_analytics_event_store | anx___7lsgqyce_ | 209.25 |
| wso2_analytics_event_store | anx___7lmno15m_ | 207.25 |
| wso2_analytics_event_store | anx___7lver1fy_ | 191.16 |
You can enable data purging for analytics tables. See below section taken from the docs.
Ref: https://docs.wso2.com/display/AM220/Purging+Analytics+Data

Modeling Relational Data in DynamoDB (nested relationship)

Entity Model:
I've read AWS Guide about create a Modeling Relational Data in DynamoDB. It's so confusing in my access pattern.
Access Pattern
+-------------------------------------------+------------+------------+
| Access Pattern | Params | Conditions |
+-------------------------------------------+------------+------------+
| Get TEST SUITE detail and check that |TestSuiteID | |
| USER_ID belongs to project has test suite | &UserId | |
+-------------------------------------------+------------+------------+
| Get TEST CASE detail and check that | TestCaseID | |
| USER_ID belongs to project has test case | &UserId | |
+-------------------------------------------+------------+------------+
| Remove PROJECT ID, all TEST SUITE | ProjectID | |
| AND TEST CASE also removed | &UserId | |
+-------------------------------------------+------------+------------+
So, I model a relational entity data as guide.
+-------------------------+---------------------------------+
| Primary Key | Attributes |
+-------------------------+ +
| PK | SK | |
+------------+------------+---------------------------------+
| user_1 | USER | FullName | |
+ + +----------------+----------------+
| | | John Doe | |
+ +------------+----------------+----------------+
| | prj_01 | JoinedDate | |
+ + +----------------+----------------+
| | | 2019-04-22 | |
+ +------------+----------------+----------------+
| | prj_02 | JoinedDate | |
+ + +----------------+----------------+
| | | 2019-05-26 | |
+------------+------------+----------------+----------------+
| user_2 | USER | FullName | |
+ + +----------------+----------------+
| | | Harry Potter | |
+ +------------+----------------+----------------+
| | prj_01 | JoinedDate | |
+ + +----------------+----------------+
| | | 2019-04-25 | |
+------------+------------+----------------+----------------+
| prj_01 | PROJECT | Name | Description |
+ + +----------------+----------------+
| | | Facebook Test | Do some stuffs |
+ +------------+----------------+----------------+
| | t_suite_01 | | |
+ + +----------------+----------------+
| | | | |
+------------+------------+----------------+----------------+
| prj_02 | PROJECT | Name | Description |
+ + +----------------+----------------+
| | | Instagram Test | ... |
+------------+------------+----------------+----------------+
| t_suite_01 | TEST_SUITE | Name | |
+ + +----------------+----------------+
| | | Test Suite 1 | |
+ +------------+----------------+----------------+
| | t_case_1 | | |
+ + +----------------+----------------+
| | | | |
+------------+------------+----------------+----------------+
| t_case_1 | TEST_CASE | Name | |
+ + +----------------+----------------+
| | | Test Case 1 | |
+------------+------------+----------------+----------------+
If I just have UserID and TestCaseId as a parameter, how could I get TestCase Detail and verify that UserId has permission.
I've thought about storing complex hierarchical data within a single item. Something likes this
+------------+-------------------------+
| t_suite_01 | user_1#prj_1 |
+------------+-------------------------+
| t_suite_02 | user_1#prj_2 |
+------------+-------------------------+
| t_case_01 | user_1#prj_1#t_suite_01 |
+------------+-------------------------+
| t_case_02 | user_2#prj_1#t_suite_01 |
+------------+-------------------------+
Question: What is the best way for this case? I appreciate if you could give me some suggestion for this approach (bow)
I think the schema below does what you want. Create a Partition Key only GSI on the "GSIPK" attribute and query as follows:
Get Test Suite Detail and Validate User: Query GSI - PK == ProjectId, FilterCondition [SK == TestSuiteId || PK == UserId]
Get Test Case Detail and Validate User: Query GSI - PK == TestCaseId, FilterCondition [SK = TestSuiteId:TestCaseId || PK = UserId]
Remove Project: Query GSI - PK == ProjectId, remove all items returned.
Queries 1 and 2 come back with 1 or 2 items. One is the detail item and the other is the user permissions for the test suite or test case. If only one item returns then its the detail item and the user has no access.
The first question you should ask is: why do I want to use key-value document DB over relational DB when I clearly have strong relations in my data?
The answer might be: I need a single-digit millisecond queries at any scale (millions of records). Or, I want to save money using dynamodb on-demand. If this is not the case, you might be better with a relational DB.
Let’s say you have to go for dynamodb. If so, most of patterns applicable for relational DBs are anti-patterns when it comes to NoSQL. There is a useful talk from last re-invent about design patterns for dynamodb and advice to watch it https://youtu.be/HaEPXoXVf2k.
For your data I’d think about taking similar approach, and having two tables: users and projects.
Projects should store sub-set of test suits as map of array of objects and test cases as map of array of objects. Plus you could add list of user ids in the map of strings. Of course you will need to maintain this list when users join or leave the project/s.
This should satisfy your access patterns.

SQLite3: dynamic between query

I have this sqlite3 table (simplified):
+--------+----------+-------+
| ROUTE | WPNumber | WPID |
+--------+----------+-------+
| A123 | 1 | WP001 |
| A123 | 2 | WP002 |
| A123 | 3 | WP003 |
| [...] | [...] | [...] |
| A123 | 20 | WP020 |
+--------+----------+-------+
Lets say I want to travel this route in the reverse direction (020 to 001).
How do I get all the WPID's in between? I know it is possible to build a query using BETWEEN and DESC, but then I'd have to build two seperate queries and have Python check when to use which query. Is it possible to have sqlite3 do the work, independent of the direction (reverse or not).
You can reverse the sorting order by reversing the number used in the ORDER BY clause.
Set the parameter ? to either 1 or -1:
SELECT WPID
FROM ThisTable
WHERE ROUTE = 'A123'
ORDER BY WPNumber * ?
If you would just use a similar query with DESC, the database would have a better opportunity to optimize the sorting with an index.

Resources