I have tried creating a GSI with a PK that uses a composite value of the business_id, type_id, and partner_id fields. I did it in two different ways in the AWS console:
First: business_id#type_id#partner_id
Second: [business_id]#[type_id]#[partner_id]
and sort key: updated
Here is the query:
SELECT *
FROM "items"."composite_key-index"
WHERE business_id = 435634652 AND type_id = 2 AND partner_id = 69992528
ORDER BY updated ASC
In both cases it throws this error:
ValidationException: Must have at least one non-optional hash key condition in WHERE clause when using ORDER BY clause.
And if I run it without the order by:
SELECT *
FROM "items"."composite_key-index"
WHERE business_id = 435634652 AND type_id = 2 AND partner_id = 69992528
it doesn't return any items, even though there is data matching those values.
What am I doing wrong here?
To use a composite value as a key, you have to build the values yourself.
Your application would have to store the value in a single attribute, e.g. GSI_PK, as 435634652#2#69992528
Then your query would look like
SELECT *
FROM "items"."composite_key-index"
WHERE GSI_PK = '435634652#2#69992528'
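A minimal sketch of building that value on the application side, in Python. The helper names build_gsi_pk and build_statement are hypothetical, and the GSI_PK attribute is assumed to be a string. The statement uses a positional ? parameter; a statement and parameter list in this shape can be handed to boto3's DynamoDB client execute_statement call, which is not invoked here.

```python
def build_gsi_pk(business_id: int, type_id: int, partner_id: int) -> str:
    """Join the three fields into the single string stored in GSI_PK."""
    return f"{business_id}#{type_id}#{partner_id}"


def build_statement() -> str:
    """PartiQL statement with a positional parameter instead of an inlined literal."""
    return 'SELECT * FROM "items"."composite_key-index" WHERE GSI_PK = ?'


gsi_pk = build_gsi_pk(435634652, 2, 69992528)
# Parameter list in the low-level DynamoDB attribute-value format ("S" = string).
parameters = [{"S": gsi_pk}]
print(build_statement(), parameters)
```

The same helper has to be used when writing items, otherwise the stored key and the queried key will not match.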
I'm familiar with MySQL and am starting to use Amazon DynamoDB for a new project.
Assume I have a MySQL table like this:
CREATE TABLE foo (
id CHAR(64) NOT NULL,
scheduledDelivery DATETIME NOT NULL,
-- ...other columns...
PRIMARY KEY(id),
INDEX schedIndex (scheduledDelivery)
);
Note the secondary Index schedIndex which is supposed to speed-up the following query (which is executed periodically):
SELECT *
FROM foo
WHERE scheduledDelivery <= NOW()
ORDER BY scheduledDelivery ASC
LIMIT 100;
That is: Take the 100 oldest items that are due to be delivered.
With DynamoDB I can use the id column as primary partition key.
However, I don't understand how I can avoid full-table scans in DynamoDB. When adding a secondary index I must always specify a "partition key". However, (in MySQL words) I see these problems:
the scheduledDelivery column is not unique, so it can't be used as a partition key itself AFAIK
adding id as unique partition key and using scheduledDelivery as "sort key" sounds like a (id, scheduledDelivery) secondary index to me, which makes that index practically useless
I understand that MySQL and DynamoDB require different approaches, so what would be an appropriate solution in this case?
It's not possible to avoid a full table scan with this kind of query.
However, you may be able to disguise it as a Query operation, which would allow you to sort the results (not possible with a Scan).
You must first create a GSI. Let's name it scheduled_delivery-index.
We will specify our index's partition key to be an attribute named fixed_val, and our sort key to be scheduled_delivery.
fixed_val will contain any value you want, but it must always be that value, and you must know it from the client side. For the sake of this example, let's say that fixed_val will always be 1.
GSI keys do not have to be unique, so don't worry if there are two duplicated scheduled_delivery values.
You would query the table like this:
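var now = Date.now(); // epoch milliseconds; assumes scheduled_delivery is stored as a number
//...
{
    TableName: "foo",
    IndexName: "scheduled_delivery-index",
    ExpressionAttributeNames: {
        "#f": "fixed_val", // must match the index's partition key attribute
        "#d": "scheduled_delivery"
    },
    ExpressionAttributeValues: {
        ":f": 1,
        ":d": now
    },
    KeyConditionExpression: "#f = :f and #d <= :d",
    ScanIndexForward: true // ascending by sort key, i.e. oldest first
}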
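For comparison, a sketch of the same request built in Python in the low-level attribute-value format, the shape accepted by boto3's DynamoDB client query call (not invoked here). The function name build_due_items_query is hypothetical, scheduled_delivery is assumed to hold epoch milliseconds as a number, and the Limit of 100 is an addition mirroring the LIMIT 100 of the MySQL query.

```python
import time


def build_due_items_query(now_ms: int, limit: int = 100) -> dict:
    """Build Query parameters for items whose scheduled_delivery is due."""
    return {
        "TableName": "foo",
        "IndexName": "scheduled_delivery-index",
        # Partition key is the constant fixed_val; sort key gives the ordering.
        "KeyConditionExpression": "#f = :f AND #d <= :d",
        "ExpressionAttributeNames": {"#f": "fixed_val", "#d": "scheduled_delivery"},
        # Numbers are sent as strings in the low-level format.
        "ExpressionAttributeValues": {":f": {"N": "1"}, ":d": {"N": str(now_ms)}},
        "ScanIndexForward": True,  # ascending, oldest first
        "Limit": limit,            # assumption: cap matching the MySQL LIMIT 100
    }


params = build_due_items_query(int(time.time() * 1000))
print(params["KeyConditionExpression"])
```

Keep in mind that every item shares one partition key value here, so all reads and writes on the index land on a single partition.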
Consider a simple self-referencing table in SQLite with the following fields
CREATE TABLE Test (
    Id INTEGER PRIMARY KEY, -- alias for the rowid
    Name TEXT NOT NULL CHECK(length(Name) > 0),
    ParentId INTEGER REFERENCES Test(Id)
);
CREATE UNIQUE INDEX IX_UniqueNamePerLevel ON Test(ParentId, Name);
We're trying to set a uniqueness constraint on Name for all items which share the same ParentId. In other words, you can have two items with the name 'Joe' provided those items do not have the same ParentId.
The problem is that SQLite seems to treat NULLs as distinct, meaning the constraint works for every level except the root: you can have fifteen 'Joe' entries all with a ParentId of NULL.
Bonus points if you can show how to make that constraint trim leading and trailing whitespace on insert/update, and ignore case for the uniqueness constraint too.
SQLite treats NULLs in UNIQUE constraints as distinct for compatibility with other databases.
It is not possible to use a CHECK constraint for this because it would have to access other rows from the table, but subqueries are not allowed in CHECK.
You would have to use a trigger:
CREATE TRIGGER Test_ParentName_unique_insert_check
AFTER INSERT ON Test
FOR EACH ROW
WHEN NEW.ParentId IS NULL
BEGIN
SELECT RAISE(ABORT, 'root items must be unique, too')
FROM Test
WHERE ParentId IS NULL
AND Name = NEW.Name
AND Id <> NEW.Id;
END;
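To see the trigger in action, here is a small sketch using Python's sqlite3 module with an in-memory database. The try_insert helper is hypothetical. Only the INSERT case from the trigger above is covered; an UPDATE that moves a row to the root level would need a similar trigger of its own.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Test (
    Id INTEGER PRIMARY KEY,
    Name TEXT NOT NULL CHECK(length(Name) > 0),
    ParentId INTEGER REFERENCES Test(Id)
);
CREATE UNIQUE INDEX IX_UniqueNamePerLevel ON Test(ParentId, Name);
CREATE TRIGGER Test_ParentName_unique_insert_check
AFTER INSERT ON Test
FOR EACH ROW
WHEN NEW.ParentId IS NULL
BEGIN
    SELECT RAISE(ABORT, 'root items must be unique, too')
    FROM Test
    WHERE ParentId IS NULL
      AND Name = NEW.Name
      AND Id <> NEW.Id;
END;
"""


def try_insert(conn, name, parent_id):
    """Return True if the row was inserted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Test (Name, ParentId) VALUES (?, ?)", (name, parent_id)
            )
        return True
    except sqlite3.DatabaseError:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first_root = try_insert(conn, "Joe", None)   # first root-level Joe
second_root = try_insert(conn, "Joe", None)  # duplicate root: stopped by the trigger
first_child = try_insert(conn, "Joe", 1)     # same name, different level
second_child = try_insert(conn, "Joe", 1)    # duplicate child: stopped by the index
print(first_root, second_root, first_child, second_child)
```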
CREATE TABLE "DEPARTMENT"
( "DEP_NO" NUMBER(*,0) NOT NULL ENABLE,
"SSN" NUMBER(*,0),
"STREET" CHAR(40) NOT NULL ENABLE,
"CITY" CHAR(25) NOT NULL ENABLE,
"NAME" CHAR(50) NOT NULL ENABLE,
"BUDGET" NUMBER(8,2),
CONSTRAINT "PK_DEPARTMENT" PRIMARY KEY ("DEP_NO") ENABLE
) ;
ALTER TABLE "DEPARTMENT" ADD CONSTRAINT "FK_DEPARTMENT_EMPLOYEE" FOREIGN KEY ("SSN")
REFERENCES "EMPLOYEE" ("SSN") ENABLE;
ALTER TABLE "DEPARTMENT" ADD CONSTRAINT "FK_DEPARTMENT_LOCATION" FOREIGN KEY ("STREET", "CITY")
REFERENCES "LOCATION" ("STREET", "CITY") ENABLE;
What is the correct way to build a database? Is it better to create the tables with their primary keys, insert the data, and then link the tables to each other with foreign keys, or is it better to create all the tables, link them together, and then insert the required data?
There is no correct way. Both approaches can be used.
The simpler approach is to first create all the tables, indexes and constraints and then to insert the data.
For maximum performance, first create just the tables and the primary key indexes, then insert the data and finally create the additional indexes and the constraints.
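The load-first ordering can be sketched as follows. This uses Python's sqlite3 module as a stand-in because it ships with Python; it is not Oracle, and the table, column, and index names are made up for illustration. SQLite also cannot add a foreign key to an existing table, so only the "index after load" part is shown, and no timing is measured.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: create the table with only its primary key.
conn.execute(
    "CREATE TABLE department (dep_no INTEGER PRIMARY KEY, city TEXT NOT NULL, budget REAL)"
)

# Step 2: bulk-load the data in a single transaction.
rows = [(n, f"City {n % 10}", 1000.0 * n) for n in range(1, 1001)]
with conn:
    conn.executemany("INSERT INTO department VALUES (?, ?, ?)", rows)

# Step 3: build the secondary index once, after the load.
conn.execute("CREATE INDEX ix_department_city ON department(city)")

row_count = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]
index_names = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'department'"
    )
]
print(row_count, index_names)
```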
I've found a few "would be" solutions for the classic "How do I insert a new record or update one if it already exists" but I cannot get any of them to work in SQLite.
I have a table defined as follows:
CREATE TABLE Book (
    ID INTEGER PRIMARY KEY AUTOINCREMENT,
    Name VARCHAR(60) UNIQUE,
    TypeID INTEGER,
    Level INTEGER,
    Seen INTEGER
);
What I want to do is add a record with a unique Name. If the Name already exists, I want to modify the fields.
Can somebody tell me how to do this please?
Have a look at http://sqlite.org/lang_conflict.html.
You want something like:
insert or replace into Book (ID, Name, TypeID, Level, Seen) values
((select ID from Book where Name = 'SearchName'), 'SearchName', ...);
Note that any field not in the insert list will be reset to its default value (NULL unless the column declares one) if the row already exists in the table. This is why there's a subselect for the ID column: without it, the replacement case would set ID to NULL and a fresh ID would be allocated.
This approach can also be used if you want to leave particular field values alone in the replacement case but set them to NULL in the insert case.
For example, assuming you want to leave Seen alone:
insert or replace into Book (ID, Name, TypeID, Level, Seen) values (
    (select ID from Book where Name = 'SearchName'),
    'SearchName',
    5,
    6,
    (select Seen from Book where Name = 'SearchName'));
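A runnable sketch of this statement with Python's sqlite3 module. The upsert_book helper and the sample values are hypothetical; named parameters replace the hard-coded 'SearchName' literal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Book (
           ID INTEGER PRIMARY KEY AUTOINCREMENT,
           Name VARCHAR(60) UNIQUE,
           TypeID INTEGER,
           Level INTEGER,
           Seen INTEGER)"""
)


def upsert_book(conn, name, type_id, level):
    """Insert or replace a book, carrying over ID and Seen if the name exists."""
    with conn:
        conn.execute(
            """INSERT OR REPLACE INTO Book (ID, Name, TypeID, Level, Seen) VALUES (
                   (SELECT ID FROM Book WHERE Name = :name),
                   :name, :type_id, :level,
                   (SELECT Seen FROM Book WHERE Name = :name))""",
            {"name": name, "type_id": type_id, "level": level},
        )


conn.execute("INSERT INTO Book (Name, TypeID, Level, Seen) VALUES ('SearchName', 1, 1, 7)")
upsert_book(conn, "SearchName", 5, 6)  # existing row: ID and Seen are kept
upsert_book(conn, "OtherName", 2, 3)   # new row: Seen starts out NULL
books = conn.execute(
    "SELECT ID, Name, TypeID, Level, Seen FROM Book ORDER BY ID"
).fetchall()
print(books)
```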
You should use the INSERT OR IGNORE command followed by an UPDATE command:
In the following example name is a primary key:
INSERT OR IGNORE INTO my_table (name, age) VALUES ('Karen', 34);
UPDATE my_table SET age = 34 WHERE name = 'Karen';
The first command will insert the record. If the record exists, it will ignore the error caused by the conflict with an existing primary key.
The second command will update the record (which now definitely exists).
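A short sketch of the two-statement pattern wrapped in one transaction, using Python's sqlite3 module. The insert_or_update helper is hypothetical; the table follows the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT PRIMARY KEY, age INTEGER)")


def insert_or_update(conn, name, age):
    """INSERT OR IGNORE followed by UPDATE, both inside one transaction."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO my_table (name, age) VALUES (?, ?)", (name, age)
        )
        conn.execute("UPDATE my_table SET age = ? WHERE name = ?", (age, name))


insert_or_update(conn, "Karen", 34)  # row does not exist yet: inserted
insert_or_update(conn, "Karen", 35)  # row exists: insert ignored, age updated
people = conn.execute("SELECT name, age FROM my_table").fetchall()
print(people)
```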
You need to set a constraint on the table to trigger a "conflict" which you then resolve by doing a replace:
CREATE TABLE data (id INTEGER PRIMARY KEY, event_id INTEGER, track_id INTEGER, value REAL);
CREATE UNIQUE INDEX data_idx ON data(event_id, track_id);
Then you can issue:
INSERT OR REPLACE INTO data VALUES (NULL, 1, 2, 3);
INSERT OR REPLACE INTO data VALUES (NULL, 2, 2, 3);
INSERT OR REPLACE INTO data VALUES (NULL, 1, 2, 5);
Then "SELECT * FROM data" will give you:
2|2|2|3.0
3|1|2|5.0
Note that the data.id is "3" and not "1" because REPLACE does a DELETE and INSERT, not an UPDATE. This also means that you must ensure that you define all necessary columns or you will get unexpected NULL values.
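The same sequence can be reproduced with Python's sqlite3 module. This sketch only replays the statements above against an in-memory database and adds an ORDER BY so the row order is explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE data (id INTEGER PRIMARY KEY, event_id INTEGER, track_id INTEGER, value REAL);
    CREATE UNIQUE INDEX data_idx ON data(event_id, track_id);
    INSERT OR REPLACE INTO data VALUES (NULL, 1, 2, 3);
    INSERT OR REPLACE INTO data VALUES (NULL, 2, 2, 3);
    INSERT OR REPLACE INTO data VALUES (NULL, 1, 2, 5);
    """
)
# The third statement conflicts with the first on (event_id, track_id),
# so the old row is deleted and a new row with a new id is inserted.
rows = conn.execute("SELECT * FROM data ORDER BY id").fetchall()
print(rows)
```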
INSERT OR REPLACE will reset the other fields to their default values.
sqlite> CREATE TABLE Book (
ID INTEGER PRIMARY KEY AUTOINCREMENT,
Name TEXT,
TypeID INTEGER,
Level INTEGER,
Seen INTEGER
);
sqlite> INSERT INTO Book VALUES (1001, 'C++', 10, 10, 0);
sqlite> SELECT * FROM Book;
1001|C++|10|10|0
sqlite> INSERT OR REPLACE INTO Book(ID, Name) VALUES(1001, 'SQLite');
sqlite> SELECT * FROM Book;
1001|SQLite|||
If you want to preserve the other fields:
Method 1
sqlite> SELECT * FROM Book;
1001|C++|10|10|0
sqlite> INSERT OR IGNORE INTO Book(ID) VALUES(1001);
sqlite> UPDATE Book SET Name='SQLite' WHERE ID=1001;
sqlite> SELECT * FROM Book;
1001|SQLite|10|10|0
Method 2
Using UPSERT (syntax was added to SQLite with version 3.24.0 (2018-06-04))
INSERT INTO Book (ID, Name)
VALUES (1001, 'SQLite')
ON CONFLICT (ID) DO
UPDATE SET Name=excluded.Name;
The excluded. prefix refers to the value that would have been inserted, here the 'SQLite' from VALUES.
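Method 2 can be tried out from Python's sqlite3 module, provided the SQLite library bundled with your Python is 3.24.0 or newer. The upsert_name helper and the second book are hypothetical.

```python
import sqlite3

if sqlite3.sqlite_version_info < (3, 24, 0):
    raise RuntimeError("UPSERT needs SQLite 3.24.0 or newer")

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Book (
           ID INTEGER PRIMARY KEY AUTOINCREMENT,
           Name TEXT, TypeID INTEGER, Level INTEGER, Seen INTEGER)"""
)
conn.execute("INSERT INTO Book VALUES (1001, 'C++', 10, 10, 0)")


def upsert_name(conn, book_id, name):
    """Insert the book, or only change Name if the ID already exists."""
    with conn:
        conn.execute(
            """INSERT INTO Book (ID, Name) VALUES (?, ?)
               ON CONFLICT (ID) DO UPDATE SET Name = excluded.Name""",
            (book_id, name),
        )


upsert_name(conn, 1001, "SQLite")  # existing ID: other columns stay untouched
upsert_name(conn, 1002, "Python")  # new ID: other columns default to NULL
books = conn.execute("SELECT * FROM Book ORDER BY ID").fetchall()
print(books)
```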
First, update it. If the affected row count is 0, insert it. It's the easiest approach and works in all RDBMSs.
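A sketch of that update-then-insert idea in Python with sqlite3, relying on the cursor's rowcount to see whether the UPDATE touched anything. The person table and the update_or_insert helper are hypothetical. Both statements run inside one transaction on this connection; with several concurrent writers you would still need to think about locking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT PRIMARY KEY, age INTEGER)")


def update_or_insert(conn, name, age):
    """Try UPDATE first; INSERT only if no row was affected."""
    with conn:
        cur = conn.execute("UPDATE person SET age = ? WHERE name = ?", (age, name))
        if cur.rowcount == 0:
            conn.execute("INSERT INTO person (name, age) VALUES (?, ?)", (name, age))
            return "inserted"
        return "updated"


first = update_or_insert(conn, "Karen", 34)
second = update_or_insert(conn, "Karen", 35)
people = conn.execute("SELECT name, age FROM person").fetchall()
print(first, second, people)
```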
Upsert is what you want. UPSERT syntax was added to SQLite with version 3.24.0 (2018-06-04).
CREATE TABLE phonebook2(
name TEXT PRIMARY KEY,
phonenumber TEXT,
validDate DATE
);
INSERT INTO phonebook2(name,phonenumber,validDate)
VALUES('Alice','704-555-1212','2018-05-08')
ON CONFLICT(name) DO UPDATE SET
phonenumber=excluded.phonenumber,
validDate=excluded.validDate
WHERE excluded.validDate>phonebook2.validDate;
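The effect of the WHERE clause on DO UPDATE can be checked with a small sketch in Python's sqlite3 module (SQLite 3.24.0 or newer required). The upsert_entry helper and the extra phone numbers and dates are made up; dates are stored as ISO-8601 text, so they compare correctly as strings.

```python
import sqlite3

if sqlite3.sqlite_version_info < (3, 24, 0):
    raise RuntimeError("UPSERT needs SQLite 3.24.0 or newer")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phonebook2(name TEXT PRIMARY KEY, phonenumber TEXT, validDate DATE)"
)

UPSERT_SQL = """
INSERT INTO phonebook2(name, phonenumber, validDate) VALUES (?, ?, ?)
ON CONFLICT(name) DO UPDATE SET
    phonenumber = excluded.phonenumber,
    validDate = excluded.validDate
WHERE excluded.validDate > phonebook2.validDate
"""


def upsert_entry(conn, name, phone, valid_date):
    """Insert the entry, or overwrite it only if valid_date is newer."""
    with conn:
        conn.execute(UPSERT_SQL, (name, phone, valid_date))
    return conn.execute(
        "SELECT name, phonenumber, validDate FROM phonebook2 WHERE name = ?", (name,)
    ).fetchone()


initial = upsert_entry(conn, "Alice", "704-555-1212", "2018-05-08")
after_older = upsert_entry(conn, "Alice", "704-555-0000", "2018-01-01")  # older date
after_newer = upsert_entry(conn, "Alice", "704-555-9999", "2019-01-01")  # newer date
print(initial, after_older, after_newer)
```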
Be warned that at this point the actual word "UPSERT" is not part of the upsert syntax.
The correct syntax is
INSERT INTO ... ON CONFLICT(...) DO UPDATE SET...
and if you are doing INSERT INTO SELECT ... your select needs at least WHERE true to solve parser ambiguity about the token ON with the join syntax.
Be warned that INSERT OR REPLACE... will delete the record before inserting a new one if it has to replace, which could be bad if you have foreign key cascades or other delete triggers.
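A sketch of that pitfall using Python's sqlite3 module. The author and book tables are hypothetical, and foreign key enforcement is switched on explicitly because SQLite leaves it off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
conn.executescript(
    """
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES author(id) ON DELETE CASCADE,
        title TEXT
    );
    INSERT INTO author VALUES (1, 'Ann');
    INSERT INTO book VALUES (10, 1, 'First');
    """
)


def count_books(conn):
    return conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]


before = count_books(conn)
with conn:
    # Looks like a harmless rename, but REPLACE deletes author 1 first,
    # and the delete cascades to the books that reference it.
    conn.execute("INSERT OR REPLACE INTO author (id, name) VALUES (1, 'Ann B.')")
after_replace = count_books(conn)
print(before, after_replace)
```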
If you have no primary key, you can insert the row if it does not exist and then do an update. The table must contain at least one entry before using this, because the SELECT reads FROM Test and an empty table yields no row to insert.
INSERT INTO Test
(id, name)
SELECT
101 as id,
'Bob' as name
FROM Test
WHERE NOT EXISTS(SELECT * FROM Test WHERE id = 101 and name = 'Bob') LIMIT 1;
Update Test SET id='101' WHERE name='Bob';
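A variation on the insert step, sketched with Python's sqlite3 module: dropping FROM Test from the SELECT avoids the requirement that the table already contain a row, since SQLite allows a SELECT with a WHERE clause and no FROM. The insert_if_missing helper is hypothetical, and this is a different statement from the one above rather than a replay of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (id INTEGER, name TEXT)")  # no primary key


def insert_if_missing(conn, row_id, name):
    """Insert (row_id, name) unless an identical row is already present."""
    with conn:
        conn.execute(
            """INSERT INTO Test (id, name)
               SELECT :id, :name
               WHERE NOT EXISTS (
                   SELECT 1 FROM Test WHERE id = :id AND name = :name)""",
            {"id": row_id, "name": name},
        )


insert_if_missing(conn, 101, "Bob")  # works even though the table is empty
insert_if_missing(conn, 101, "Bob")  # second call inserts nothing
test_rows = conn.execute("SELECT id, name FROM Test").fetchall()
print(test_rows)
```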
I believe you want UPSERT.
"INSERT OR REPLACE" without the additional trickery in that answer will reset any fields you don't specify to NULL or other default value. (This behavior of INSERT OR REPLACE is unlike UPDATE; it's exactly like INSERT, because it actually is INSERT; however if what you wanted is UPDATE-if-exists you probably want the UPDATE semantics and will be unpleasantly surprised by the actual result.)
The trickery from the suggested UPSERT implementation is basically to use INSERT OR REPLACE, but specify all fields, using embedded SELECT clauses to retrieve the current value for fields you don't want to change.
I think it's worth pointing out that there can be some unexpected behaviour here if you don't thoroughly understand how PRIMARY KEY and UNIQUE interact.
As an example, if you want to insert a record only if the NAME field isn't currently taken, and if it is, you want a constraint exception to fire to tell you, then INSERT OR REPLACE will not throw an exception; instead it will resolve the UNIQUE constraint itself by replacing the conflicting record (the existing record with the same NAME). Gaspard demonstrates this really well in his answer above.
If you want a constraint exception to fire, you have to use an INSERT statement, and rely on a separate UPDATE command to update the record once you know the name isn't taken.
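A sketch of the difference in Python's sqlite3 module, where a violated UNIQUE constraint on a plain INSERT surfaces as sqlite3.IntegrityError. The claim_name helper and the trimmed-down Book table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Book (
           ID INTEGER PRIMARY KEY AUTOINCREMENT,
           Name VARCHAR(60) UNIQUE,
           Level INTEGER)"""
)


def claim_name(conn, name, level):
    """Plain INSERT: returns False when the name is already taken."""
    try:
        with conn:
            conn.execute("INSERT INTO Book (Name, Level) VALUES (?, ?)", (name, level))
        return True
    except sqlite3.IntegrityError:
        return False


first_claim = claim_name(conn, "SQLite", 1)   # name is free
second_claim = claim_name(conn, "SQLite", 2)  # name is taken: exception caught
kept = conn.execute("SELECT ID, Name, Level FROM Book").fetchall()

with conn:
    # INSERT OR REPLACE raises nothing and silently swaps the row out.
    conn.execute("INSERT OR REPLACE INTO Book (Name, Level) VALUES ('SQLite', 3)")
replaced = conn.execute("SELECT Name, Level FROM Book").fetchall()
print(first_claim, second_claim, kept, replaced)
```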