I have a table with a DATETIME field, which is indexed with a BTREE index. Now I want to query it with the following statement:
SELECT
    count(us.CITY) as metric,
    us.CITY as Name,
    us.LATITUDE as latitude,
    us.LONGITUDE as longitude
FROM
    FACT
LEFT JOIN
    USER us
ON
    us.ID_USER = FACT.USER
WHERE
    ASSESSMENT_DATE BETWEEN FROM_UNIXTIME(1601568552) AND FROM_UNIXTIME(1604028277)
GROUP BY us.CITY, us.LATITUDE, us.LONGITUDE;
EXPLAIN:
+------+-------------+-------+--------+----------------------------+---------+---------+------------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+--------+----------------------------+---------+---------+------------------------------+--------+----------------------------------------------+
| 1 | SIMPLE | FACT | ALL | INDEX_FACT_ASSESSMENT_DATE | NULL | NULL | NULL | 762621 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | us | eq_ref | PRIMARY | PRIMARY | 46 | dwh0.FACT.USER,dwh0.FACT.ENV | 1 | |
+------+-------------+-------+--------+----------------------------+---------+---------+------------------------------+--------+----------------------------------------------+
2 rows in set (0.001 sec)
Interestingly, just by changing the dates manually into DATETIME-format strings, it uses the index. But in my opinion the FROM_UNIXTIME() function should return exactly the same thing...
SELECT
    count(us.CITY) as metric,
    us.CITY as Name,
    us.LATITUDE as latitude,
    us.LONGITUDE as longitude
FROM
    FACT
LEFT JOIN
    USER us
ON
    us.ENV = FACT.ENV AND us.ID_USER = FACT.USER
WHERE
    -- ASSESSMENT_DATE BETWEEN FROM_UNIXTIME(1596649101) AND FROM_UNIXTIME(1599108827)
    ASSESSMENT_DATE BETWEEN '2020-08-05 11:30:11.987' AND '2020-09-03 11:30:11.987'
GROUP BY us.CITY, us.LATITUDE, us.LONGITUDE;
EXPLAIN:
+------+-------------+-------+--------+----------------------------+----------------------------+---------+------------------------------+--------+--------------------------------------------------------+
| id   | select_type | table | type   | possible_keys              | key                        | key_len | ref                          | rows   | Extra                                                  |
+------+-------------+-------+--------+----------------------------+----------------------------+---------+------------------------------+--------+--------------------------------------------------------+
|    1 | SIMPLE      | FACT  | range  | INDEX_FACT_ASSESSMENT_DATE | INDEX_FACT_ASSESSMENT_DATE | 5       | NULL                         | 132008 | Using index condition; Using temporary; Using filesort |
|    1 | SIMPLE      | us    | eq_ref | PRIMARY                    | PRIMARY                    | 46      | dwh0.FACT.USER,dwh0.FACT.ENV |      1 |                                                        |
+------+-------------+-------+--------+----------------------------+----------------------------+---------+------------------------------+--------+--------------------------------------------------------+
2 rows in set (0.001 sec)
Has anyone come across such a problem? The WHERE clause is generated by Grafana, so I cannot change that, but I can change the rest if it makes a difference.
Thanks for any suggestions!
Sorry for bothering... after around 10^5 more inserts, it works in both cases. Maybe it was just bad luck.
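If this happens to someone else before it "fixes itself": the optimizer's row estimates for the index may simply have been stale, and the extra inserts likely refreshed them. A minimal, hedged sketch of what to try first, reusing the table and index names from the EXPLAIN output above:
-- Refresh the index statistics so the optimizer re-estimates the range scan
ANALYZE TABLE FACT;
-- Or pin the index for this one query as a last resort
SELECT count(*)
FROM FACT FORCE INDEX (INDEX_FACT_ASSESSMENT_DATE)
WHERE ASSESSMENT_DATE BETWEEN FROM_UNIXTIME(1601568552) AND FROM_UNIXTIME(1604028277);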
I have the following DB schema, and I'd like to find the best way to select the list of sort keys that are common to PK_A and PK_B:
+---------------+---------+
| PK | SortKey |
+---------------+---------+
| | SK_A |
| PK_A | SK_B |
| | SK_C |
| - - - - - - - | |
| | SK_B |
| PK_B | SK_C |
| | SK_D |
+---------------+---------+
So when I select by PK_A and PK_B, it should return only SK_B and SK_C.
Any help is appreciated.
Simple answer: you can't do it (in one call).
DynamoDB is not a relational database; operations such as intersection are not supported.
You'd need to call query() once for each partition key and then compute the intersection yourself.
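As an illustration, in PartiQL (DynamoDB's SQL-compatible query language) that is one statement per partition key; the table name MyTable is made up:
SELECT SortKey FROM "MyTable" WHERE PK = 'PK_A';
SELECT SortKey FROM "MyTable" WHERE PK = 'PK_B';
The intersection of the two result sets is then computed client-side; here the sort keys present in both are SK_B and SK_C.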
I've read the AWS guide about modeling relational data in DynamoDB. It's quite confusing for my access patterns.
Access Patterns
+-------------------------------------------+------------+------------+
| Access Pattern | Params | Conditions |
+-------------------------------------------+------------+------------+
| Get TEST SUITE detail and check that |TestSuiteID | |
| USER_ID belongs to project has test suite | &UserId | |
+-------------------------------------------+------------+------------+
| Get TEST CASE detail and check that | TestCaseID | |
| USER_ID belongs to project has test case | &UserId | |
+-------------------------------------------+------------+------------+
| Remove PROJECT ID, all TEST SUITE | ProjectID | |
| AND TEST CASE also removed | &UserId | |
+-------------------------------------------+------------+------------+
So I modeled the relational entity data as in the guide:
+-------------------------+---------------------------------+
| Primary Key | Attributes |
+-------------------------+ +
| PK | SK | |
+------------+------------+---------------------------------+
| user_1 | USER | FullName | |
+ + +----------------+----------------+
| | | John Doe | |
+ +------------+----------------+----------------+
| | prj_01 | JoinedDate | |
+ + +----------------+----------------+
| | | 2019-04-22 | |
+ +------------+----------------+----------------+
| | prj_02 | JoinedDate | |
+ + +----------------+----------------+
| | | 2019-05-26 | |
+------------+------------+----------------+----------------+
| user_2 | USER | FullName | |
+ + +----------------+----------------+
| | | Harry Potter | |
+ +------------+----------------+----------------+
| | prj_01 | JoinedDate | |
+ + +----------------+----------------+
| | | 2019-04-25 | |
+------------+------------+----------------+----------------+
| prj_01 | PROJECT | Name | Description |
+ + +----------------+----------------+
| | | Facebook Test | Do some stuffs |
+ +------------+----------------+----------------+
| | t_suite_01 | | |
+ + +----------------+----------------+
| | | | |
+------------+------------+----------------+----------------+
| prj_02 | PROJECT | Name | Description |
+ + +----------------+----------------+
| | | Instagram Test | ... |
+------------+------------+----------------+----------------+
| t_suite_01 | TEST_SUITE | Name | |
+ + +----------------+----------------+
| | | Test Suite 1 | |
+ +------------+----------------+----------------+
| | t_case_1 | | |
+ + +----------------+----------------+
| | | | |
+------------+------------+----------------+----------------+
| t_case_1 | TEST_CASE | Name | |
+ + +----------------+----------------+
| | | Test Case 1 | |
+------------+------------+----------------+----------------+
If I just have UserId and TestCaseId as parameters, how could I get the test case detail and verify that the UserId has permission?
I've thought about storing the complex hierarchical data within a single item. Something like this:
+------------+-------------------------+
| t_suite_01 | user_1#prj_1 |
+------------+-------------------------+
| t_suite_02 | user_1#prj_2 |
+------------+-------------------------+
| t_case_01 | user_1#prj_1#t_suite_01 |
+------------+-------------------------+
| t_case_02 | user_2#prj_1#t_suite_01 |
+------------+-------------------------+
Question: What is the best way to handle this case? I'd appreciate any suggestions on this approach. (bow)
I think the schema below does what you want. Create a partition-key-only GSI on the "GSIPK" attribute and query as follows:
1. Get Test Suite Detail and Validate User: Query GSI - PK == ProjectId, FilterCondition [SK == TestSuiteId || PK == UserId]
2. Get Test Case Detail and Validate User: Query GSI - PK == TestCaseId, FilterCondition [SK == TestSuiteId:TestCaseId || PK == UserId]
3. Remove Project: Query GSI - PK == ProjectId, remove all items returned.
Queries 1 and 2 come back with one or two items. One is the detail item and the other is the user's permissions for the test suite or test case. If only one item comes back, it's the detail item and the user has no access.
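As a sketch only, query 1 could be written in PartiQL; the table name TestTable and index name GSIPK-index are assumptions, as are the concrete ids:
SELECT * FROM "TestTable"."GSIPK-index"
WHERE GSIPK = 'prj_01' AND (SK = 't_suite_01' OR PK = 'user_1');
The GSIPK = 'prj_01' part is the key condition on the GSI; the parenthesized OR acts as the filter that keeps either the test suite detail item or the requesting user's permission item.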
The first question you should ask is: why do I want to use a key-value document DB over a relational DB when I clearly have strong relations in my data?
The answer might be: I need single-digit-millisecond queries at any scale (millions of records). Or: I want to save money by using DynamoDB on-demand. If this is not the case, you might be better off with a relational DB.
Let's say you have to go with DynamoDB. If so, most of the patterns applicable to relational DBs are anti-patterns when it comes to NoSQL. There is a useful talk from last re:Invent about design patterns for DynamoDB, and I advise you to watch it: https://youtu.be/HaEPXoXVf2k
For your data, I'd think about taking a similar approach and having two tables: users and projects.
Projects would store their subset of test suites as a map of arrays of objects, and their test cases the same way. On top of that, you could add the list of user ids as a map of strings. Of course, you will need to maintain this list when users join or leave a project.
This should satisfy your access patterns.
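To make that concrete, a hypothetical projects item under this approach could be created with a PartiQL insert; every attribute name here is illustrative, not prescribed by the answer:
INSERT INTO "projects" VALUE {
    'id'         : 'prj_01',
    'name'       : 'Facebook Test',
    'users'      : {'user_1' : '2019-04-22', 'user_2' : '2019-04-25'},
    'testSuites' : {'t_suite_01' : [{'id' : 't_case_1', 'name' : 'Test Case 1'}]}
}
Each project item then carries its members, test suites and test cases together.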
I have defined a schema in BigQuery as follows:
+------------------+----------+----------+
| name | type | mode |
+------------------+----------+----------+
| warehouse | INTEGER | NULLABLE |
| transaction_date | DATETIME | NULLABLE |
| style | STRING | NULLABLE |
| piece | STRING | NULLABLE |
| fabric_1 | STRING | NULLABLE |
| fabric_2 | STRING | NULLABLE |
| serial | STRING | NULLABLE |
| customer_po | STRING | NULLABLE |
| order_number | STRING | NULLABLE |
+------------------+----------+----------+
The two fields I'm focusing on are serial and order_number, which, when previewed in R, look like this:
+-----------+------------------+--------+-------+-----------+----------+------------+--------------+--------------+
| warehouse | transaction_date | style | piece | fabric_1 | fabric_2 | serial | customer_po | order_number |
+-----------+------------------+--------+-------+-----------+----------+------------+--------------+--------------+
| 80 | 4/3/19 | K28300 | ARMH | ALL CHAR | NA | 8040418253 | 1486838165 | 464374 |
| 80 | 4/3/19 | K28300 | ARMH | ALL CHAR | NA | 9040542252 | 1485798731-P | 464069 |
| 80 | 4/3/19 | K28300 | ARMH | ELEG NAVY | NA | 8040355550 | 1486826068 | 464369 |
| 80 | 4/3/19 | K28300 | ARMH | ELEG NAVY | NA | 8040532364 | 1485366411-R | 464071 |
+-----------+------------------+--------+-------+-----------+----------+------------+--------------+--------------+
Within R, those two fields are read as characters in the data frame I'm uploading, which is what I'm looking for. Yet when I push the data to BigQuery, those two fields end up like this:
+-----------+------------------+--------+-------+-----------+----------+--------------+--------------+--------------+
| warehouse | transaction_date | style  | piece | fabric_1  | fabric_2 | serial       | customer_po  | order_number |
+-----------+------------------+--------+-------+-----------+----------+--------------+--------------+--------------+
| 80        | 4/3/19           | K28300 | ARMH  | ALL CHAR  | NA       | 8040418253.0 | 1486838165   | 464374.0     |
| 80        | 4/3/19           | K28300 | ARMH  | ALL CHAR  | NA       | 9040542252.0 | 1485798731-P | 464069.0     |
| 80        | 4/3/19           | K28300 | ARMH  | ELEG NAVY | NA       | 8040355550.0 | 1486826068   | 464369.0     |
| 80        | 4/3/19           | K28300 | ARMH  | ELEG NAVY | NA       | 8040532364.0 | 1485366411-R | 464071.0     |
+-----------+------------------+--------+-------+-----------+----------+--------------+--------------+--------------+
Why is this happening, and how can I change it? For reference, my code to upload it:
bqr_upload_data(projectId = "project-test",
                datasetId = "orders",
                tableId = "daily_orders",
                upload_data = df_daily_orders,
                maxBadRecords = 1000,
                overwrite = TRUE)
The upload from R looks at the class of each column to decide the best BigQuery schema for it. Try changing the class of those data frame columns to character, so the upload stops converting them to float, which is what appears to be happening. Something like:
df_daily_orders$serial <- as.character(df_daily_orders$serial)
df_daily_orders$order_number <- as.character(df_daily_orders$order_number)
I'm not completely sure of my answer, as I am still a beginner, but it may help you. I would add this as a comment, but I don't have enough reputation yet.
If I understood properly, you are actually doing an implicit cast from a numeric value to a string value, and BigQuery keeps the decimal point to be sure it captures the whole value.
Check BigQuery's conversion rules (second table, FLOAT64 to STRING).
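You can see that conversion rule directly (a quick check using one of the order numbers from the question):
-- FLOAT64 -> STRING keeps the decimal part, which is where the trailing '.0' comes from
SELECT CAST(CAST(464374 AS FLOAT64) AS STRING);  -- '464374.0'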
In your place, and depending on what you need to do with the table, I would either:
Recreate the table, but change the schema of the serial and order_number columns to an integer type,
or
Update the already-created table with an UPDATE query that trims the '.0' at the end of every string value (see the sketch below).
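A hedged sketch of that second option, assuming the project, dataset, and table names from the upload call in the question:
-- Strip the trailing '.0' that the float conversion left behind
UPDATE `project-test.orders.daily_orders`
SET serial       = REGEXP_REPLACE(serial, r'\.0$', ''),
    order_number = REGEXP_REPLACE(order_number, r'\.0$', '')
WHERE REGEXP_CONTAINS(serial, r'\.0$')
   OR REGEXP_CONTAINS(order_number, r'\.0$');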
I'm having some trouble getting App Maker to respect the order of a many-to-many relation.
Let's say I have two models: model 1 has an ID and a many-to-many relation to model 2, which also has an ID.
App Maker generates three tables:
DESCRIBE model_1;
+--------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------+----------------+
| Id | int(11) | NO | PRI | NULL | auto_increment |
+--------------------+--------------+------+-----+---------+----------------+
DESCRIBE model_2;
+--------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------+----------------+
| Id | int(11) | NO | PRI | NULL | auto_increment |
+--------------------+--------------+------+-----+---------+----------------+
DESCRIBE model_1_Has_model_2;
+------------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+---------+------+-----+---------+-------+
| parentModel1_fk | int(11) | NO | MUL | NULL | |
| childModel2_fk | int(11) | NO | MUL | NULL | |
+------------------+---------+------+-----+---------+-------+
Now let's say I have a model_1 object with ID 1 and three model_2 objects with IDs 1, 2, 3. If I assign model_1.childModel_2 to [model_2_ID_1, model_2_ID_2], the model_1_Has_model_2 table will contain:
parentModel1_fk | childModel2_fk
--------------------------------
1 | 1
1 | 2
Now let's say I splice model_1.childModel_2 using model_1.childModel_2.splice(0, 1) and then insert model_2 ID 3 at index 0 using model_1.childModel_2.splice(0, 0, model_2_ID_3). I would expect my table to contain the following:
parentModel1_fk | childModel2_fk
--------------------------------
1 | 3
1 | 1
However, it contains the opposite:
parentModel1_fk | childModel2_fk
--------------------------------
1 | 1
1 | 3
Is there any way I can stop this behavior short of clearing the entire relation and then setting it to my new expected order?
The short answer is no. App Maker is just creating a new record, not rearranging the table. Otherwise it would have to edit all the records below the desired insertion point (which could be a prohibitively time-consuming transaction). If this is the desired functionality, you'll have to do it manually.
I would seriously consider creating your own join table that will allow you to have additional columns, where you can store the desired sort order.
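For example, a minimal sketch of such a join table, mirroring the generated one (names are illustrative):
-- Same shape as model_1_Has_model_2, plus an explicit position column
CREATE TABLE model_1_Has_model_2_ordered (
    parentModel1_fk INT NOT NULL,
    childModel2_fk  INT NOT NULL,
    sort_order      INT NOT NULL,
    PRIMARY KEY (parentModel1_fk, childModel2_fk)
);

-- Read the relation back in the stored order
SELECT childModel2_fk
FROM model_1_Has_model_2_ordered
WHERE parentModel1_fk = 1
ORDER BY sort_order;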
I'm trying to change the type of two columns. The first works, but the second gives a syntax error for the same command:
> show full columns from KernelParams;
+-------+------------------+-------------------+------+-----+---------+----------------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+-------+------------------+-------------------+------+-----+---------+----------------+---------------------------------+---------+
| id | int(10) unsigned | NULL | NO | PRI | NULL | auto_increment | select,insert,update,references | |
| param | varchar(256) | latin1_swedish_ci | YES | UNI | NULL | | select,insert,update,references | |
| desc | varchar(256) | latin1_swedish_ci | YES | | NULL | | select,insert,update,references | |
+-------+------------------+-------------------+------+-----+---------+----------------+---------------------------------+---------+
> ALTER TABLE KernelParams MODIFY param varchar(128);
Query OK, 6 rows affected (0.08 sec)
Records: 6 Duplicates: 0 Warnings: 0
> ALTER TABLE KernelParams MODIFY desc varchar(128);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'desc varchar(128)' at line 1
Any ideas what is wrong there?
DESC is a reserved word, so you need to quote the column name, like OTTA said in their comment. The table and column quoting character in MySQL and MariaDB is the backtick (`).
ALTER TABLE KernelParams MODIFY `desc` varchar(128);
This works as expected:
MariaDB [test]> describe new_table;
+-------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+-------+
| idnew_table | int(11) | NO | PRI | NULL | |
| desc | varchar(45) | YES | | NULL | |
+-------------+-------------+------+-----+---------+-------+
2 rows in set (0.02 sec)
MariaDB [test]> ALTER TABLE new_table MODIFY `desc` varchar(128);
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [test]> describe new_table;
+-------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+-------+
| idnew_table | int(11) | NO | PRI | NULL | |
| desc | varchar(128) | YES | | NULL | |
+-------------+--------------+------+-----+---------+-------+
2 rows in set (0.02 sec)