one to many unique "set" oracle db - oracle11g

I am not sure how to create a one to many relationship, but restrict the many items as a "set" to each unique primary key.
DB: Oracle 11g
Example:
PK Table:
CUST(PK)
100
200
Valid FK Table Data:
CUST(FK) | ITEM
100 | 101
100 | 102
200 | 101
200 | 102
Invalid FK Table Data:
CUST(FK) | ITEM
100 | 101
100 | 101
200 | 104
200 | 104
Any suggestions on how to set up such a relationship? I'd like to enforce uniqueness so that it is not possible to add a value to the FK table that violates the above "set" uniqueness.
Can this be done purely on the Oracle DB end, or must I enforce this from the accessing Java code?

Just create a unique constraint on the two columns CUST and ITEM, for example:
ALTER TABLE secondtable
ADD CONSTRAINT custItem UNIQUE (CUST, ITEM);
Create this constraint in addition to your foreign key.
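To see how a two-column unique constraint behaves alongside a foreign key, here is a minimal sketch. It uses SQLite through Python's standard library as a stand-in for Oracle, so the DDL syntax differs slightly, and the table name cust_item and constraint name cust_item_uq are made up for illustration.

```python
import sqlite3

# SQLite stands in for Oracle here; table and constraint names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE cust (cust INTEGER PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE cust_item (
        cust INTEGER NOT NULL REFERENCES cust (cust),
        item INTEGER NOT NULL,
        CONSTRAINT cust_item_uq UNIQUE (cust, item)
    )
    """
)
conn.executemany("INSERT INTO cust VALUES (?)", [(100,), (200,)])
conn.executemany(
    "INSERT INTO cust_item VALUES (?, ?)",
    [(100, 101), (100, 102), (200, 101), (200, 102)],
)


def try_insert(cust, item):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO cust_item VALUES (?, ?)", (cust, item))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert(100, 101)  # repeats an existing (cust, item) pair
new_pair_accepted = try_insert(200, 104)   # new pair for an existing customer
orphan_accepted = try_insert(300, 101)     # customer 300 does not exist
```

The duplicate pair is rejected by the unique constraint and the orphan row by the foreign key, while a genuinely new pair goes through, so the rule lives in the database rather than in the Java code.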

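This might help you. Create your tables the following way (ts1 and schema2 are placeholders for your own tablespace and schema):
CREATE TABLE cust_id
(cus_id NUMBER PRIMARY KEY)
TABLESPACE ts1;
-- The composite primary key makes each (cus_id, item) pair unique.
CREATE TABLE Valid_FK_Tabl
(cus_id NUMBER, item NUMBER, CONSTRAINT pk1 PRIMARY KEY (cus_id, item))
TABLESPACE ts1;
-- The foreign key ties every row to an existing customer.
ALTER TABLE Valid_FK_Tabl
ADD CONSTRAINT fk1 FOREIGN KEY (cus_id)
REFERENCES schema2.cust_id (cus_id);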

Related

Dynamo DB - get only unique values from a table column

I have a table with 2 columns, user id & book id.
userId | bookId |
-----------------------
12 | 3
23 | 4
34 | 2
56 | 1
45 | 4
345 | 1
Is there a way to get only the unique values of bookId? like GROUP BY in sql.
Meaning query and get - [1,2,3,4]
Thanks.
DynamoDB doesn't have "columns" like a SQL table. Instead, it has documents (called items in DynamoDB terminology) which are indexed by a key (either simple or composite). And these items have attributes, but for the purposes of retrieval it's useful to imagine the items as being arbitrary payloads.
As such, there are no aggregate query APIs for DynamoDB tables. So you can't ask Dynamo to compute aggregations over multiple items.
If you need to identify unique items in a table you'll have to scan and perform the aggregation in your application. It's useful to think about how you might need to query the data upfront and use secondary indexes, or precompute aggregations as you update the data in your table.
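The "scan and aggregate in your application" step can be sketched roughly as follows. The scan_pages list is a hypothetical stand-in for the pages of items a full table scan would return; no real DynamoDB client is used here.

```python
from typing import Dict, Iterable, List


def unique_book_ids(pages: Iterable[List[Dict[str, int]]]) -> List[int]:
    """Collect the distinct bookId values from pages of scanned items."""
    seen = set()
    for page in pages:  # a real scan is paginated, so work page by page
        for item in page:
            seen.add(item["bookId"])
    return sorted(seen)


# Hypothetical stand-in for the pages a full table scan would return.
scan_pages = [
    [{"userId": 12, "bookId": 3}, {"userId": 23, "bookId": 4}, {"userId": 34, "bookId": 2}],
    [{"userId": 56, "bookId": 1}, {"userId": 45, "bookId": 4}, {"userId": 345, "bookId": 1}],
]

print(unique_book_ids(scan_pages))  # [1, 2, 3, 4]
```

Note that this still reads every item in the table, which is why precomputing the set of book IDs at write time is usually the better option for large tables.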

How to do a basic sort in DynamoDb?

Dynamodb can make the simplest of database operations difficult. I have the following table and all I want to do is simply sort by the due column. How is this achieved in DynamoDb? I read everything I could find online and there doesn't seem to be a straightforward walk through anywhere.
payor | amount | due | paid
----------------------------------
Ally | 200.00 | 13 | 1
Chase | 80.00 | 2 | 0
Wells | 30.00 | 17 | 1
Directv | 150.00 | 5 | 0
So without considering the payor, amount or paid columns, how can I simply sort on the due column.
Simply put, this can't be achieved in DynamoDB unless the due attribute is defined as a sort key. Even then, the ordering applies only within a single partition key value; items cannot be ordered across partition keys.
Assuming you have defined due as the sort key of the table, you can set ScanIndexForward to true or false in a query to return the items in ascending or descending order.
Data modeling in DynamoDB involves designing the partition key and then determining the sort key for a use case. A partition key value is compulsory for any query. This is a basic design premise of a key-value NoSQL store, which is completely different from a relational store.
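To illustrate the "ordering only within a partition" rule, here is a toy in-memory model, not DynamoDB itself. The TinyTable class and the account attribute are invented for this sketch; the assumption is that every bill shares one partition key value so that all of them can be sorted by due.

```python
from collections import defaultdict


class TinyTable:
    """Toy in-memory model of a table with a partition key and a sort key."""

    def __init__(self, partition_key, sort_key):
        self.partition_key = partition_key
        self.sort_key = sort_key
        self.partitions = defaultdict(list)

    def put(self, item):
        self.partitions[item[self.partition_key]].append(item)

    def query(self, partition_value, scan_index_forward=True):
        # Ordering is only available inside one partition, by the sort key.
        items = self.partitions.get(partition_value, [])
        return sorted(
            items,
            key=lambda item: item[self.sort_key],
            reverse=not scan_index_forward,
        )


# "account" is a hypothetical partition key shared by all bills.
bills = TinyTable(partition_key="account", sort_key="due")
for payor, amount, due, paid in [
    ("Ally", 200.00, 13, 1),
    ("Chase", 80.00, 2, 0),
    ("Wells", 30.00, 17, 1),
    ("Directv", 150.00, 5, 0),
]:
    bills.put({"account": "household", "payor": payor,
               "amount": amount, "due": due, "paid": paid})

ascending = [b["payor"] for b in bills.query("household")]
descending = [b["payor"] for b in bills.query("household", scan_index_forward=False)]
```

Putting every item under one partition key value makes the sort possible but concentrates all traffic on that key, so it only suits small tables.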

Determining a partition key in Dynamo DB for GSI

I am new to DynamoDB and I am finding it hard to think of how I should decide my partition key. I am using a condensed version of my use case:
I have an attribute which is a boolean value => B
For a given ID, I need to return all the data for it. The ID is either X or Y attribute. For the given ID, if B is true, I need to read attribute X, else Y.
While inserting into the table I know the value of B, and hence I can fill in either X or Y depending on it.
However while fetching, I just am given an ID, and I need to figure out whether it exists in column X or column Y ( I won't be getting the value of B in the input).
In a RDBMS I could run a query like select * from tab where (B == true && X == ID) || (B==false && Y == ID).
I think creating a GSI in DynamoDB will be the way to go about solving this in Dynamo. However I am not able to figure out the best way to approach this. Could I get suggestions?
Not sure if I got your use case correctly, but why not just swap the target columns based on the value of B while inserting a row?
Consider the following input:
+-----+------+--------+
| X | Y | B |
+-----+------+--------+
| ID1 | ID2 | true |
+-----+------+--------+
| ID3 | ID4 | true |
+-----+------+--------+
| ID5 | ID6 | false |
+-----+------+--------+
| ID7 | ID8 | false |
+-----+------+--------+
What if you store the values like this:
+-----------+-------------------------+
| id | opposite id |
|(hash key) | or whatever you call it |
+-----------+-------------------------+
| ID1 | ID2 |
+-----------+-------------------------+
| ID3 | ID4 |
+-----------+-------------------------+
| ID6 | ID5 |
+-----------+-------------------------+
| ID8 | ID7 |
+-----------+-------------------------+
This way, to fetch an item by an ID value you only need to query the single column id.
UPD: Note that if your use case allows multiple records with the same id, you will need another field to serve as the range key. This holds true whether or not you swap the columns as shown above.
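The column swap described above can be sketched in a few lines. The dictionary named table is a stand-in for a table keyed on id, and the helper names to_item and fetch are made up for this example.

```python
def to_item(x, y, b):
    """Build the stored item: the lookup ID goes in 'id', the other in 'opposite_id'."""
    if b:
        return {"id": x, "opposite_id": y}
    return {"id": y, "opposite_id": x}


rows = [
    ("ID1", "ID2", True),
    ("ID3", "ID4", True),
    ("ID5", "ID6", False),
    ("ID7", "ID8", False),
]

# A plain dict stands in for a table whose hash key is 'id'.
table = {}
for x, y, b in rows:
    item = to_item(x, y, b)
    table[item["id"]] = item


def fetch(lookup_id):
    """Single-key lookup; no knowledge of B is needed at read time."""
    return table.get(lookup_id)
```

With this layout fetch("ID6") finds the third input row because B was false there, while fetch("ID5") finds nothing, which matches the original condition (B and X == ID) or (not B and Y == ID).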
As per the AWS DynamoDB blog post Choosing the Right DynamoDB Partition Key:
Choosing the Right DynamoDB Partition Key is an important step in the
design and building of scalable and reliable applications on top of
DynamoDB.
What is a partition key?
DynamoDB supports two types of primary keys:
Partition key: Also known as a hash key, the partition key is composed of a single attribute. Attributes in DynamoDB are similar in
many ways to fields or columns in other database systems.
Partition key and sort key: Referred to as a composite primary key or hash-range key, this type of key is composed of two attributes. The
first attribute is the partition key, and the second attribute is the
sort key. Here is an example:
Why do I need a partition key?
DynamoDB stores data as groups of attributes, known as items. Items
are similar to rows or records in other database systems. DynamoDB
stores and retrieves each item based on the primary key value which
must be unique. Items are distributed across 10 GB storage units,
called partitions (physical storage internal to DynamoDB). Each table
has one or more partitions, as shown in Figure 2. For more
information, see the Understand Partition Behavior in the DynamoDB
Developer Guide.
DynamoDB uses the partition key’s value as an input to an internal
hash function. The output from the hash function determines the
partition in which the item will be stored. Each item’s location is
determined by the hash value of its partition key.
All items with the same partition key are stored together, and for
composite partition keys, are ordered by the sort key value. DynamoDB
will split partitions by sort key if the collection size grows bigger
than 10 GB.
Recommendations for partition keys
Use high-cardinality attributes. These are attributes that have
distinct values for each item like e-mail id, employee_no,
customerid, sessionid, orderid, and so on.
Use composite attributes. Try to combine more than one attribute to
form a unique key, if that meets your access pattern. For example,
consider an orders table with customerid+productid+countrycode as the
partition key and order_date as the sort key.
Cache the popular items when there is a high volume of read traffic.
The cache acts as a low-pass filter, preventing reads of unusually
popular items from swamping partitions. For example, consider a table
that has deals information for products. Some deals are expected to be
more popular than others during major sale events like Black Friday or
Cyber Monday.
Add random numbers/digits from a predetermined range for write-heavy
use cases. If you expect a large volume of writes for a partition key,
use an additional prefix or suffix (a fixed number from predetermined
range, say 1-10) and add it to the partition key. For example,
consider a table of invoice transactions. A single invoice can contain
thousands of transactions per client.
Read More # Choosing the Right DynamoDB Partition Key
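The last recommendation in the quoted post, adding a suffix from a predetermined range for write-heavy keys, could look roughly like this. SHARD_COUNT, the "#" separator and both function names are assumptions made for the sketch, not anything prescribed by DynamoDB.

```python
import random

SHARD_COUNT = 10  # size of the predetermined suffix range (1 to 10)


def sharded_key(invoice_id, rng=random):
    """Return a partition key with a random suffix to spread writes."""
    suffix = rng.randint(1, SHARD_COUNT)
    return f"{invoice_id}#{suffix}"


def all_shard_keys(invoice_id):
    """Every key a reader must query to see all writes for one invoice."""
    return [f"{invoice_id}#{n}" for n in range(1, SHARD_COUNT + 1)]


write_key = sharded_key("INV-1001")
read_keys = all_shard_keys("INV-1001")
```

The trade-off is on the read side: writes are spread over ten partition key values, but reading one invoice now means querying all ten keys and merging the results.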

Deleting multiple items based on global secondary index in DynamoDB

I have an existing table which has two fields - primary key and a global secondary index:
----------------------------
primary key | attributeA(GSI)
----------------------------
1 | id1
2 | id1
3 | id2
4 | id2
5 | id1
Since attributeA is a global secondary index, can I delete all items by specifying a value for it? That is, I want to delete all records whose attributeA is id1. Is this possible in DynamoDB?
The DynamoDB documentation covers deleting the index itself, but not whether a GSI can be used to delete multiple items.
As of now, you cannot delete an item by passing only non-key attributes or GSI keys.
The simplest way to do this is to query the GSI to get the primary key (the hash key of the table) of each matching item, and then delete those items in a follow-up request.
You can refer to this answer if you want to do a batch deletion.
Hope that helps
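The query-then-delete flow from this answer can be sketched with an in-memory stand-in for the table. The names query_gsi and delete_by_attribute are hypothetical helpers rather than DynamoDB API calls; the batch size of 25 reflects the per-request item limit of BatchWriteItem.

```python
def chunked(keys, size=25):
    """Split keys into batches; 25 is the BatchWriteItem per-request limit."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


# A plain dict stands in for the table, keyed on the primary key.
table = {
    1: {"pk": 1, "attributeA": "id1"},
    2: {"pk": 2, "attributeA": "id1"},
    3: {"pk": 3, "attributeA": "id2"},
    4: {"pk": 4, "attributeA": "id2"},
    5: {"pk": 5, "attributeA": "id1"},
}


def query_gsi(value):
    """Stand-in for a query on the GSI: primary keys of the matching items."""
    return [pk for pk, item in table.items() if item["attributeA"] == value]


def delete_by_attribute(value):
    """Step 1: look up the keys via the index. Step 2: delete them in batches."""
    keys = query_gsi(value)
    for batch in chunked(keys):  # each batch would be one delete request
        for pk in batch:
            del table[pk]
    return keys


deleted = delete_by_attribute("id1")
```

Because the lookup and the deletes are separate requests, items written between the two steps are not covered, so a second pass may be needed on a busy table.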

Analyze a scenario performance?

I want to design something like a dynamic form in which an admin defines each form's fields.
I designed 3 tables: a mainform table for shared properties, then a formfield table, which has mainformID as a foreign key and defines each form's fields,
e.g:
AutoID | FormID | FieldName
_____________________________
100 | Form1 | weight
101 | Form1 | height
102 | Form1 | color
103 | Form2 | Size
104 | Form2 | Type
....
and lastly a formvalues table like below:
FormFieldID | Value | UniqueResponseID
___________________________________________
100 | 50px | 200
101 | 60px | 200
102 | Red | 200
100 | 30px | 201
101 | 20px | 201
102 | Black | 201
103 | 20x10 | 201
104 | Y | 201
....
For each form I have to join these 3 tables to fetch all fields and values. I wonder if this is the only way to design such a scenario. Does it decrease SQL performance? Or is there a faster and better way?
This is a form of EAV, and I'm gonna assume you absolutely have to do it instead of the "static" design.
does it decrease sql performance?
Yes, getting a bunch of rows (under EAV) is always going to be slower than getting just one (under the static design).
or is there any fast and better way?
Not from the logical standpoint, but there are significant optimizations (for query performance at least) that can be done at the physical level. Specifically, you can carefully design your keys to minimize the I/O (by putting related data close together) and even eliminate the JOIN itself.
For example:
This model migrates keys through FOREIGN KEY hierarchy all the way down to the ATTRIBUTE_VALUE table. The resulting natural composite key in ATTRIBUTE_VALUE table enables us to:
Get all attributes [1] of a given form by a single index range scan + table heap access on ATTRIBUTE_VALUE table, and without doing any JOINs at all. In addition to that, you can cluster [2] it, eliminating the table heap access and leaving you with only the index range scan [3].
If you need to only get the data for a specific response, change the order of the fields in the composite key, so the RESPONSE_ID is at the leading edge.
If you need both "by form" and "by response" queries, you'll need both indexes, at which point I'd recommend that the secondary index also cover [4] the VALUE field.
For example:
-- Since we haven't used NONCLUSTERED clause, this is a B-tree
-- that covers all fields. Table heap doesn't exist.
CREATE TABLE ATTRIBUTE_VALUE (
FORM_ID INT,
ATTRIBUTE_NAME VARCHAR(50),
RESPONSE_ID INT,
VALUE VARCHAR(50),
PRIMARY KEY (FORM_ID, ATTRIBUTE_NAME, RESPONSE_ID)
-- FOREIGN KEYs omitted for brevity.
);
-- We have included VALUE, so this B-tree covers all fields as well.
CREATE UNIQUE INDEX ATTRIBUTE_VALUE_IE1 ON
ATTRIBUTE_VALUE (RESPONSE_ID, FORM_ID, ATTRIBUTE_NAME)
INCLUDE (VALUE);
[1] Or a specific attribute, or a specific response for a specific attribute.
[2] MS SQL Server clusters all tables by default, unless you specify NONCLUSTERED clause.
[3] Friendliness to clustering and elimination of JOINs are some of the main strengths of natural keys (as opposed to surrogate keys). But they also make tables "fatter" and don't isolate from ON UPDATE CASCADE. I believe pros outweigh cons in this particular case. For more info on natural vs. surrogate keys, look here.
[4] Fortunately, MS SQL Server supports including fields in index solely for covering purposes (as opposed to actually searching through the index). This makes the index leaner than a "normal" index on the same fields.
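For readers who want to try the composite-key idea without a SQL Server instance, here is a rough equivalent using SQLite from Python's standard library. This is a swapped-in engine: WITHOUT ROWID is SQLite's way of storing the table in its primary key B-tree, roughly playing the role of the clustered key, and the integer form IDs 1 and 2 are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE attribute_value (
        form_id        INTEGER NOT NULL,
        attribute_name TEXT    NOT NULL,
        response_id    INTEGER NOT NULL,
        value          TEXT,
        PRIMARY KEY (form_id, attribute_name, response_id)
    ) WITHOUT ROWID
    """
)
conn.executemany(
    "INSERT INTO attribute_value VALUES (?, ?, ?, ?)",
    [
        (1, "weight", 200, "50px"),
        (1, "height", 200, "60px"),
        (1, "color", 200, "Red"),
        (2, "Size", 201, "20x10"),
        (2, "Type", 201, "Y"),
    ],
)


def values_for_form(form_id):
    """All values of one form from a single table, with no JOIN."""
    cur = conn.execute(
        "SELECT attribute_name, response_id, value FROM attribute_value "
        "WHERE form_id = ? ORDER BY attribute_name, response_id",
        (form_id,),
    )
    return cur.fetchall()


form_one = values_for_form(1)
```

Because form_id leads the key, the lookup is a range scan over adjacent entries, which is the access pattern the answer is aiming for.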
I like Branko's approach, and it is quite similar to metadata models i have created in the past, so this post is by way of extension to his. you may want to add a datatype table, which can work both for native types (int,varchar,bit,datetime etc.) and your own definitions (although i don't see the necessity off the cuff).
thence, Branko's "value" column becomes:
value_tinyint tinyint
value_int int
value_varchar varchar(xx)
etc.
with a datatype_id (probably tinyint) as a foreign key into the "mydatatype" table.
[excuse the lack of pretty ER diagrams like BD's]
mydatatype
datatype_id tinyint
code varchar(16)
description varchar(64) -- for reference purposes
This extension should:
a. save you a good deal of casting when reading or writing your data
b. allow both reads and writes with some easily constructed dynamic SQL
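As a rough idea of what "easily constructed dynamic SQL" could mean here, the sketch below picks the typed value column from a fixed mapping. It assumes the datatype_id column sits on the value table next to the typed columns, and the table name attribute_value and helper name build_insert are illustrative only.

```python
# Maps a datatype code (as stored in mydatatype.code) to its typed column.
VALUE_COLUMNS = {
    "tinyint": "value_tinyint",
    "int": "value_int",
    "varchar": "value_varchar",
}


def build_insert(datatype_code):
    """Return a parameterized INSERT that targets the matching typed column."""
    column = VALUE_COLUMNS[datatype_code]  # KeyError for an unknown type code
    return (
        "INSERT INTO attribute_value "
        f"(form_id, attribute_name, response_id, datatype_id, {column}) "
        "VALUES (?, ?, ?, ?, ?)"
    )


int_insert = build_insert("int")
```

Only the column name is spliced into the statement, and it comes from a fixed whitelist, so the values themselves still go through bind parameters.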
Furthermore (and maybe this is out of scope), you may want to store the order in which these objects are created/saved, as well as conditional display based on button push/checkbox/radio button selection etc.
I won't go into detail here, since i'm not sure you need these things, but if you do i'll check this every so often and respond with stuff.
