How to restrict string and date values in a Teradata table - teradata

Is there any way to impose a restriction on a Teradata table fields, something like this
Column Education can only contain: High School, BS, MS, PhD and nothing else. If somebody tries to INSERT any other string, the Referential Integrity mechanism would throw an exception
Also, is there any way to impose a restriction to the DATE field, such that a newly inserted date must >= CURRENT_DATE ?
Thank you experts

It's the same as any other RDBMS (besides the COMPRESS part, which is Teradata only to save disk space):
datecol DATE CHECK(datecol >= CURRENT_DATE)
Maybe add a NOT NULL constraint.
education VARCHAR(11) CHECK (education IN ('High School', 'BS', 'MS', 'PhD'))
In Teradata you should add COMPRESS ('High School', 'BS', 'MS', 'PhD')to save disk space.
Of course you could also insert the four values in a table, add a Primary Key and then a Foreign Key referencing that column...

Related

Teradata: How to extend the range partition of a non-empty partitioned table?

I have create a table, mydb.mytable, with essentially the following SQL, say last week:
CREATE MULTISET TABLE mydb.mytable ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
master_transaction_header VARCHAR(64) CHARACTER SET LATIN NOT CASESPECIFIC,
demand_date DATE FORMAT 'YY/MM/DD',
item_id BIGINT,
QTY INTEGER,
price DECIMAL(15,2))
PRIMARY INDEX ( master_transaction_header )
PARTITION BY RANGE_N(demand_date BETWEEN DATE '2018-01-01' AND CURRENT_DATE EACH INTERVAL '1' DAY );
When I try to insert data into it, for say yesterday, teradata gives me the following error message
Partitioning violation for table mydb.mytable
When I try to extend the partition using:
ALTER TABLE mydb.mytable MODIFY PRIMARY INDEX (master_transaction_header) ADD RANGE BETWEEN DATE '2018-03-15' AND CURRENT_DATE EACH INTERVAL '1' DAY;
I get the following error message from teradata:
The altering of RANGE_N definition with CURRENT_DATE/CURRENT_TIMESTAMP is not allowed.
I understand that I could:
Create a copy with PARTITION BY RANGE_N(demand_date BETWEEN DATE '2018-01-01' AND '9999-12-31' EACH INTERVAL '1' DAY );
Insert all the data from the old table into the new one
drop the old table
rename the new table
but I am hoping that teradata provides a more elegant way to add partitions to an existing partitioned table.
I have already consulted the following stackoverflow posts:
Range partition table creation with large number of paritions
Teradata: How to add range partition to non empty table?
They were enlightening, but I could not conjure an answer from the discussion therein.
Using CURRENT_DATE for partitioning is possible, but I never found a use case for it.
When you create that table it is resolved to the current date, but not changed afterwards, check the ResolvedCurrent_Date column in dbc.PartitioningConstraintsV. When you submit an ALTER TABLE mydb.mytable TO CURRENT it's resolved again and the range modified.
But there's no reason to do this, simply define the range large enough, so you never have to modify it again, e.g.
PARTITION BY RANGE_N(demand_date
BETWEEN DATE '2018-01-01'
AND DATE '2040-01-01' EACH INTERVAL '1' DAY);
Unused partitions have zero overhead in Teradata.

Limiting the number of rows a table can contain based on the value of a column - SQLite

Since SQLite doesn't support TRUE and FALSE, I have a boolean keyword that stores 0 and 1. For the boolean column in question, I want there to be a check for the number of 1's the column contains and limit the total number for the table.
For example, the table can have columns: name, isAdult. If there are more than 5 adults in the table, the system would not allow a user to add a 6th entry with isAdult = 1. There is no restriction on how many rows the table can contain, since there is no limit on the amount of entries where isAdult = 0.
You can use a trigger to prevent inserting the sixth entry:
CREATE TRIGGER five_adults
BEFORE INSERT ON MyTable
WHEN NEW.isAdult
AND (SELECT COUNT(*)
FROM MyTable
WHERE isAdult
) >= 5
BEGIN
SELECT RAISE(FAIL, "only five adults allowed");
END;
(You might need a similar trigger for UPDATEs.)
The SQL-99 standard would solve this with an ASSERTION— a type of constraint that can validate data changes with respect to an arbitrary SELECT statement. Unfortunately, I don't know any SQL database currently on the market that implements ASSERTION constraints. It's an optional feature of the SQL standard, and SQL implementors are not required to provide it.
A workaround is to create a foreign key constraint so isAdult can be an integer value referencing a lookup table that contains only values 1 through 5. Then also put a UNIQUE constraint on isAdult. Use NULL for "false" when the row is for a user who is not an adult (NULL is ignored by UNIQUE).
Another workaround is to do this in application code. SELECT from the database before changing it, to make sure your change won't break your app's business rules. Normally in a multi-user RDMS this is impossible due to race conditions, but since you're using SQLite you might be the sole user.

Calculating the percentage of dates (SQL Server)

I'm trying to add an auto-calculated field in SQL Server 2012 Express, that stores the % of project completion, by calculating the date difference by using:
ALTER TABLE dbo.projects
ADD PercentageCompleted AS (select COUNT(*) FROM projects WHERE project_finish > project_start) * 100 / COUNT(*)
But I am getting this error:
Msg 1046, Level 15, State 1, Line 2
Subqueries are not allowed in this context. Only scalar expressions are allowed.
What am I doing wrong?
Even if it would be possible (it isn't), it is anyway not something you would want to have as a caculated column:
it will be the same value in each row
the entire table would need to be updated after every insert/update
You should consider doing this in a stored procedure or a user defined function instead.Or even better in the business logic of your application,
I don't think you can do that. You could write a trigger to figure it out or do it as part of an update statement.
Are you storing "percentageCompleted" as a duplicated column value in the same table as your project data?
If this is the case, I would not recommend this, because it would duplicate the data.
If you don't care about duplicate data, try something separating the steps out like this:
ALTER TABLE dbo.projects
ADD PercentageCompleted decimal(2,2) --You could also store it as a varchar or char
declare #percentageVariable decimal(2,2)
select #percentageVariable = (select count(*) from projects where Project_finish > project_start) / (select count(*) from projects) -- need to get ratio by completed/total
update projects
set PercentageCompleted = #percentageVariable
this will give you a decimal value in that table, then you can format it on select if you desire to % + PercentageCompleted * 100

Speed up SQL select in SQLite

I'm making a large database that, for the sake of this question, let's say, contains 3 tables:
A. Table "Employees" with fields:
id = INTEGER PRIMARY INDEX AUTOINCREMENT
Others don't matter
B. Table "Job_Sites" with fields:
id = INTEGER PRIMARY INDEX AUTOINCREMENT
Others don't matter
C. Table "Workdays" with fields:
id = INTEGER PRIMARY INDEX AUTOINCREMENT
emp_id = is a foreign key to Employees(id)
job_id = is a foreign key to Job_Sites(id)
datew = INTEGER that stands for the actual workday, represented by a Unix date in seconds since midnight of Jan 1, 1970
The most common operation in this database is to display workdays for a specific employee. I perform the following select statement:
SELECT * FROM Workdays WHERE emp_id='Actual Employee ID' AND job_id='Actual Job Site ID' AND datew>=D1 AND datew<D2
I need to point out that D1 and D2 are calculated for the beginning of the month in search and for the next month, respectively.
I actually have two questions:
Should I set any fields as indexes besides primary indexes? (Sorry, I seem to misunderstand the whole indexing concept)
Is there any way to re-write the Select statement to maybe speed it up. For instance, most of the checks in it would be to see that the actual employee ID and job site ID match. Maybe there's a way to split it up?
PS. Forgot to say, I use SQLite in a Windows C++ application.
If you use the above query often, then you may get better performance by creating a multicolumn index containing the columns in the query:
CREATE INDEX WorkdaysLookupIndex ON Workdays (emp_id, job_id, datew);
Sometimes you just have to create the index and try your queries to see what is faster.

Using SQLite how do I index columns in a CREATE TABLE statement?

How do I index a column in my CREATE TABLE statement? The table looks like
command.CommandText =
"CREATE TABLE if not exists file_hash_list( " +
"id INTEGER PRIMARY KEY, " +
"hash BLOB NOT NULL, " +
"filesize INTEGER NOT NULL);";
command.ExecuteNonQuery();
I want filesize to be index and would like it to be 4 bytes
You can't do precisely what you're asking, but unlike some RDBMSs, SQLite is able to perform DDL inside of a transaction, with the appropriate result. This means that if you're really concerned about nothing ever seeing file_hash_list without an index, you can do
BEGIN;
CREATE TABLE file_hash_list (
id INTEGER PRIMARY KEY,
hash BLOB NOT NULL,
filesize INTEGER NOT NULL
);
CREATE INDEX file_hash_list_filesize_idx ON file_hash_list (filesize);
COMMIT;
or the equivalent using the transaction primitives of whatever database library you've got there.
I'm not sure how necessary that really is though, compared to just doing the two commands outside of a transaction.
As others have pointed out, SQLite's indexes are all B-trees; you don't get to choose what the type is or what portion of the column is indexed; the entire indexed column(s) are in the index. It will still be efficient for range queries, it just might take up a little more disk space than you'd really like.
I don't think you can. Why can't you use a second query? And specifying the size of the index seems to be impossible in SQLite. But then again, it is SQ_Lite_ ;)
create index
myIndex
on
myTable (myColumn)

Resources