Enforce uniqueness within a date range or based on the value of another column - mariadb

I have a table with a large amount of data; moving forward, I would like to enforce uniqueness for a given column in this table. However, the table contains a large amount of rows where that column is non-unique. I am not able to delete or alter these rows.
Is it possible to enforce uniqueness over a given date range, or since a specific date, or based on the value of another column (or something else like that) in MariaDB?

You can create a UNIQUE index on multiple columns, where one column is nullable. MariaDB will see each column with NULL values as a different value regarding the UNIQUE index, even if the other column values of the UNIQUE index are the same. Check the MariaDB documentation Getting Started with Indexes - Unique Index:
The fact that a UNIQUE constraint can be NULL is often overlooked. In SQL any NULL is never equal to anything, not even to another NULL. Consequently, a UNIQUE constraint will not prevent one from storing duplicate rows if they contain null values:
CREATE TABLE t1 (a INT NOT NULL, b INT, UNIQUE (a,b));
INSERT INTO t1 values (3,NULL), (3, NULL);
SELECT * FROM t1;
+---+------+
| a | b |
+---+------+
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |
| 3 | NULL |
| 3 | NULL |
+---+------+
You can create such a UNIQUE index on the date column you already have and a new column which indicates if the date value should be unique or not:
CREATE TABLE Foobar(
id INT AUTO_INCREMENT PRIMARY KEY NOT NULL,
createdAt DATE NOT NULL,
dateUniqueMarker BIT NULL DEFAULT 0,
UNIQUE KEY uq_createdAt(createdAt, dateUniqueMarker)
);
INSERT INTO Foobar(createdAt) VALUES ('2021-11-04'),('2021-11-05'),('2021-11-06');
SELECT * FROM Foobar;
+----+------------+------------------------------------+
| id | createdAt | dateUniqueMarker |
+----+------------+------------------------------------+
| 1 | 2021-11-04 | 0x00 |
| 2 | 2021-11-05 | 0x00 |
| 3 | 2021-11-06 | 0x00 |
+----+------------+------------------------------------+
INSERT INTO Foobar(createdAt) VALUES ('2021-11-05');
ERROR 1062 (23000): Duplicate entry '2021-11-05-\x00' for key 'Foobar.uq_createdAt'
UPDATE Foobar SET dateUniqueMarker = NULL WHERE createdAt = '2021-11-05';
INSERT INTO Foobar(createdAt, dateUniqueMarker) VALUES ('2021-11-05', NULL);
SELECT * FROM Foobar;
+----+------------+------------------------------------+
| id | createdAt | dateUniqueMarker |
+----+------------+------------------------------------+
| 1 | 2021-11-04 | 0x00 |
| 2 | 2021-11-05 | NULL |
| 5 | 2021-11-05 | NULL |
| 3 | 2021-11-06 | 0x00 |
+----+------------+------------------------------------+

Without any data example and scenario illustration, it's hard to know. If you can update your question with those information, please do.
"Is it possible to enforce uniqueness over a given date range, or since a specific date, or based on the value of another column (or something else like that) in MariaDB?"
If by "enforce" you mean to create a new column then populate it with unique identifier, then yes it is possible. If what you really mean is to generate a unique value based on other column, that's also possible. Question is, how unique do you want it to be?
Is it like this unique?
column1
column2
column3
unique_val
2021-02-02
ABC
DEF
1
2021-02-02
CBD
FEA
1
2021-02-03
BED
GER
2
2021-02-04
ART
TOY
3
2021-02-04
ZSE
KSL
3
Whereby if it's the same date (on column1), it should have the same unique value regardless of column2 & column3 data.
Or like this?
column1
column2
column3
unique_val
2021-02-02
ABC
DEF
1
2021-02-02
CBD
FEA
2
2021-02-03
BED
GER
3
2021-02-04
ART
TOY
4
2021-02-04
ZSE
KSL
5
Taking all (or certain) columns to consider the unique value.
Both of the scenario above can be achieved in query without the need to alter the table, adding and populate a new column but of course, the latter is also possible.

Related

global or local indexes on column with duplicate value in Oracle 19C?

I have below table on Oracle19c(I am an oracle newbie). 4 million rows are inserted into the table daily and for now this table have 40 column and 240 million rows.
I usually search the table with user_id and MyTimestamp columns filter query and it takes 10 minutes to return the answer.
Example:
select * from table where user_id=123581 and MyTimestamp between 1657640396 and 1657777396
Note: Duplicate values are stored in the user_id and MyTimestamp columns.
I want partition monthly on MyTimestamp and index on user_id but which global or local indexes is suitable for indexing and how do I do it?
----------------------------------------------------------------------------------------------------
| id | MyTimestamp | Name | user_id ...
----------------------------------------------------------------------------------------------------
| 0 | 1657640396 | John | 123581 ...
| 1 | 1657638832 | Tom | 168525 ...
| 2 | 1657640265 | Tom | 168525 ...
| 3 | 1657640292 | John | 123581 ...
| 4 | 1657640005 | Jack | 896545 ...
--------------------------------------------------------------------------------------------------
If the majority of your queries contain the partition key, then better create LOCAL indexes:
CREATE INDEX index_name ON table_name (MyTimestamp, user_id) LOCAL;
Local indexes are smaller (i.e. the index partition) and thus faster and you don't have to rebuild the index when you drop an outdated partition.

How to scan DynamoDB table for retrieving only one item in each partition key

Let say I have a table with partition key "ID" and range key "Time" with the following items:
ID | Time | Data
------------------
A | 1 | abc
A | 2 | def
B | 2 | ghi
B | 3 | jkl
And I want to scan only one item in each partition that has the highest time value in each partition. So the outcome of the scan should look like:
ID | Time | Data
------------------
A | 2 | def
B | 3 | jkl
Is this possible with the DynamoDB's scan feature?
(I want to avoid scan all and do such filtering by myself).
If you want to fetch just a few IDs along with their highest Time, you can query with reverse index, so for every ID, you will have only 1 item read. But for this you need an existing list of IDs.
So for each ID, there will be:
1 query
1 item read
Otherwise, the only way is to scan everything unfortunately.

How to rank rows in a table in sqlite?

How can I create a column that has ranked the information of the table based on two or three keys?
For example, in this table the rank variable is based on Department and Name:
Dep | Name | Rank
----+------+------
1 | Jeff | 1
1 | Jeff | 2
1 | Paul | 1
2 | Nick | 1
2 | Nick | 2
I have found this solution but it's in SQL and I don't think it applies to my case as all information is in one table and the responses seem to SELECT and JOIN combine information from different tables.
Thank you in advance
You can count how many rows come before the current row in the current group:
UPDATE MyTable
SET Rank = (SELECT COUNT(*)
FROM MyTable AS T2
WHERE T2.Dep = MyTable.Dep
AND T2.Name = MyTable.Name
AND T2.rowid <= MyTable.rowid);
(The rowid column is used to differentiate between otherwise identical rows. Use the primary key, if you have one.)

selecting a row based on a number of column values in SQLite

I have a table with this structure:
id | IDs | Name | Type
1 | 10 | A | 1
2 | 11 | B | 1
3 | 12 | C | 2
4 | 13 | D | 3
except id nothing else is a FOREIGN or PRIMARY KEY. I want to select a row based on it's column values that are not PRIMARY KEY. I have tried the following syntax but it yields no results.
SELECT * FROM MyTable WHERE Name = 'A', Type = 1;
what am I doing wrong? What is exactly returned by a SELECT statement? I'm totally new to Data Base and I'm currently experimenting and trying to learn it. so far my search has not yield any results regarding this case.
Use and to add multiple conditions to your query
SELECT *
FROM MyTable
WHERE Name = 'A'
AND Type = 1;

How to add a new column in a View in sqlite?

I have this database in sqlite (table1):
+-----+-------+-------+
| _id | name | level |
+-----+-------+-------+
| 1 | Mike | 3 |
| 2 | John | 2 |
| 3 | Bob | 2 |
| 4 | David | 1 |
| 5 | Tom | 2 |
+-----+-------+-------+
I want to create a view with all elements of level 2 and then to add a new column indicating the order of the row in the new table. That is, I would want this result:
+-------+------+
| index | name |
+-------+------+
| 1 | John |
| 2 | Bob |
| 3 | Tom |
+-------+------+
I have tried:
CREATE VIEW words AS SELECT _id as index, name FROM table1;
But then I get:
+-------+------+
| index | name |
+-------+------+
| 2 | John |
| 3 | Bob |
| 5 | Tom |
+-------+------+
I suppose it should be something as:
CREATE VIEW words AS SELECT XXXX as index, name FROM table 1;
What should I use instead of XXXX?
When ordered by _id, the number of rows up to and including this one is the same as the number of rows where the _id value is less than or equal to this row's _id:
CREATE VIEW words AS
SELECT (SELECT COUNT(*)
FROM table1 b
WHERE level = 2
AND b._id <= a._id) AS "index",
name
FROM table1 a
WHERE level = 2;
(The computation itself does not actually require ORDER BY _id because the order of the rows does not matter when we're just counting them.)
Please note that words is not guaranteed to be sorted; add ORDER BY "index" if needed.
And this is, of course, not very efficient.
You have two options. First, you could simply add a new column with the following:
ALTER TABLE {tableName} ADD COLUMN COLNew {type};
Second, and more complicatedly, but would actually put the column where you want it, would be to rename the table:
ALTER TABLE {tableName} RENAME TO TempOldTable;
Then create the new table with the missing column:
CREATE TABLE {tableName} (name TEXT, COLNew {type} DEFAULT {defaultValue}, qty INTEGER, rate REAL);
And populate it with the old data:
INSERT INTO {tableName} (name, qty, rate) SELECT name, qty, rate FROM TempOldTable;
Then delete the old table:
DROP TABLE TempOldTable;
I'd much prefer the second option, as it will allow you to completely rename everything if need be.

Resources