Mismatched data between source table and target table - Teradata

I loaded the table, but the source count and target count do not match.
The source data is in Oracle and the target data is in Teradata. How can I find the missing data in the target?

I'm not sure what approach/utility/tool you are using for migrating from Oracle to Teradata. First verify whether bad records were captured anywhere (error tables, reject files, etc.).
If you are unable to find anything that way, then do a count of records on a yearly/monthly basis in both databases. Export the counts to Excel and compare them. From this you will find the range that contains the missing records, and from there you can root-cause further.
st_date end_date count(1)
1-1-1999 31-12-1999 10000
1-1-2000 31-12-2000 10000
SELECT MIN(u_date) AS st_date, MIN(u_date)+365 AS end_date, COUNT(1) FROM table;
Loop the above query, advancing the date range each iteration, until you reach MAX(date).
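The bucketed comparison described above can be sketched in Python. The dictionaries of yearly counts are hypothetical stand-ins for the results of a count-per-year query run against Oracle and against Teradata:

```python
# Sketch: compare per-year row counts from two sources to locate the
# range where rows are missing. The counts below are illustrative
# stand-ins for real "SELECT ... COUNT(1) ... per year" query results.

def find_mismatched_ranges(source_counts, target_counts):
    """Return the year buckets where source and target row counts differ."""
    mismatches = {}
    for year, src in source_counts.items():
        tgt = target_counts.get(year, 0)
        if src != tgt:
            mismatches[year] = src - tgt  # positive = rows missing in target
    return mismatches

source = {1999: 10000, 2000: 10000, 2001: 9850}  # Oracle (hypothetical)
target = {1999: 10000, 2000: 9992, 2001: 9850}   # Teradata (hypothetical)

print(find_mismatched_ranges(source, target))  # {2000: 8}
```

Once the offending bucket is known, the same comparison can be repeated at month or day granularity within that range.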

The target table in Teradata is probably a SET table, which disallows duplicate rows.
Even if there's a Unique/Primary Key constraint in Oracle, rows might still be considered duplicates by Teradata:
if a [Var]Char column is defined as NOT CASESPECIFIC, then 'a' and 'A' are equal;
if there are trailing spaces, 'a' and 'a ' are equal.
Change the character columns to CASESPECIFIC and/or the table to MULTISET and try again.
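One way to predict how many rows a SET table would silently drop is to reproduce the comparison rules in a script. This Python sketch approximates Teradata's NOT CASESPECIFIC, trailing-space-insensitive equality (it is a simplification, not Teradata's exact collation):

```python
# Sketch: approximate Teradata's NOT CASESPECIFIC comparison in Python
# to predict which source rows a SET table would treat as duplicates.
# NOT CASESPECIFIC strings compare case-insensitively, and trailing
# spaces are ignored, so we normalize the same way before deduplicating.

def teradata_key(value: str) -> str:
    return value.rstrip(" ").upper()

rows = ["a", "A", "a ", "b"]          # all distinct under Oracle's comparison
distinct_for_teradata = {teradata_key(r) for r in rows}

print(len(rows))                      # 4 rows loaded from the source
print(len(distinct_for_teradata))     # 2 rows a SET table would keep
```

Running this normalization over the extracted key columns should reproduce the row-count gap if case or trailing spaces are the cause.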

Related

In Teradata, get the columns/fields used by join and WHERE conditions, and their respective tables, without parsing the query

I am trying to automate some performance checks on queries in Teradata.
As part of that, I want to check whether the columns used in join conditions are the primary index of their respective tables, and similarly whether the columns used in WHERE conditions are partitioning columns of their respective tables. Is there a direct Teradata query that can give this information without parsing the whole query?
Yes, there are two DBC views you can query:
dbc.ColumnsV
dbc.IndicesV
Primary index information is stored in the second view; just search by your table name and database name.
Partitioning information is stored in ColumnsV; a column there carries the flag value 'Y' for partitioning columns.
Example:
SELECT DatabaseName, TableName, ColumnName FROM DBC.ColumnsV WHERE PartitioningColumn = 'Y' AND TableName = <> AND DatabaseName = <>;
SELECT * FROM DBC.IndicesV WHERE TableName = <> AND DatabaseName = <>;

How to handle date type in SQLite?

I've created a table with a "Date of birth" column of date type. The problem is that I can insert anything into it and the insert succeeds. I want that field to reject unrelated values, such as arbitrary strings.
I've searched for a solution, but I could only find code for getting the current time in different formats. I also don't understand exactly how modifiers work (https://www.sqlite.org/lang_datefunc.html).
Barring the rowid column (or an alias of the rowid column), any type of value can be stored in any type of column. That is, the declared type of a column does not restrict/constrain the data that can be stored in it.
P.S. There is no DATE type; rather, due to SQLite's flexibility, DATE actually has a type affinity of NUMERIC (not that that matters much). You might find Datatypes In SQLite Version 3 an interesting read, or perhaps How flexible/restrictive are SQLite column types?.
The rowid column (and therefore an alias thereof) MUST be an integer, although typically you let SQLite assign its value.
You should either check the data programmatically or, alternatively, use a CHECK constraint when defining the column in the CREATE TABLE SQL.
A CHECK constraint may be attached to a column definition or specified
as a table constraint. In practice it makes no difference. Each time a
new row is inserted into the table or an existing row is updated, the
expression associated with each CHECK constraint is evaluated and cast
to a NUMERIC value in the same way as a CAST expression. If the result
is zero (integer value 0 or real value 0.0), then a constraint
violation has occurred. If the CHECK expression evaluates to NULL, or
any other non-zero value, it is not a constraint violation. The
expression of a CHECK constraint may not contain a subquery.
SQL As Understood By SQLite - CREATE TABLE
Example
Consider the following code :-
DROP TABLE IF EXISTS mychecktable ;
CREATE TABLE IF NOT EXISTS mychecktable (mycolumn BLOB CHECK(substr(mycolumn,3,1) = '-'));
INSERT INTO mychecktable VALUES('14-03-1900');
INSERT INTO mychecktable VALUES('1900-03-14'); -- ouch 3rd char not -
This will result in :-
DROP TABLE IF EXISTS mychecktable
> OK
> Time: 0.187s
CREATE TABLE IF NOT EXISTS mychecktable (mycolumn BLOB CHECK(substr(mycolumn,3,1) = '-'))
> OK
> Time: 0.084s
INSERT INTO mychecktable VALUES('14-03-1900')
> Affected rows: 1
> Time: 0.206s
INSERT INTO mychecktable VALUES('1900-03-14')
> CHECK constraint failed: mychecktable
> Time: 0s
i.e. the first insert is successful, the second insert fails.
Usually you would enforce the correct format in your application, but you can also add constraints to your table definition to prevent this, e.g.,
CREATE TABLE users(...,
DoB TEXT CHECK(DATE(DoB) NOT NULL AND DATE(DoB)=DoB)
)
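For completeness, the CHECK-constraint approach can be demonstrated end to end with Python's built-in sqlite3 module. The users/DoB names follow the answer above; DATE() returns NULL for unparsable input, and DATE(DoB) = DoB only holds when DoB is already in canonical YYYY-MM-DD form:

```python
import sqlite3

# Demonstration: a CHECK constraint that only admits ISO-8601 dates.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        name TEXT,
        DoB  TEXT CHECK(DATE(DoB) IS NOT NULL AND DATE(DoB) = DoB)
    )
""")

conn.execute("INSERT INTO users VALUES ('ok', '1990-03-14')")    # accepted

try:
    conn.execute("INSERT INTO users VALUES ('bad', 'not a date')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)                                        # CHECK fails
```

Note the IS NOT NULL term: without it, a malformed date would make the CHECK expression evaluate to NULL, which SQLite does not treat as a violation.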

sqlite3 - the philosophy behind sqlite design for this scenario

Suppose we have a database file with just one table, named TableA, and this table has just one column, named Text.
Let's say we populate TableA with 3,000,000 strings like these (each line a record):
Many of our patients are incontinent.
Many of our patients are severely disturbed.
Many of our patients need help with dressing.
If I save the file at this point, it is ~326 MB.
Now let's say we want to speed up our queries, so we make the Text column the primary key (or create an index on it).
If I save the file at this point, it is ~700 MB.
our query:
SELECT Text FROM "TableA" where Text like '% home %'
for the table without index: ~5.545s
for the indexed table: ~2.231s
As far as I know, when we create an index on a column (or make it the primary key), the SQLite engine doesn't need to refer to the table itself (if no other column is requested in the query); it answers the query from the index alone, which is why query execution gets faster.
My question is: in the scenario above, where we have just one column and that column is also the primary key, why does SQLite hold what seems like unnecessary duplicate data (in this case ~326 MB)? Why not keep just the index/primary-key data?
In SQLite, table rows are stored in the order of the internal rowid column.
Therefore, indexes must be stored separately.
In SQLite 3.8.2 or later, you can create a WITHOUT ROWID table which is stored in order of its primary key values.
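A small sqlite3 session (the table and column names echo the question) shows the WITHOUT ROWID behavior:

```python
import sqlite3

# Demonstration: a WITHOUT ROWID table is stored as a b-tree keyed on its
# primary key, so the primary key IS the table -- no separate rowid-ordered
# copy of the data is kept.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (Text TEXT PRIMARY KEY) WITHOUT ROWID")
conn.executemany("INSERT INTO TableA VALUES (?)",
                 [("b line",), ("a line",), ("c line",)])

# The rows live in the primary-key b-tree:
print([r[0] for r in conn.execute("SELECT Text FROM TableA")])

# And there is no rowid column at all:
try:
    conn.execute("SELECT rowid FROM TableA")
except sqlite3.OperationalError as e:
    print("no rowid:", e)
```

For the single-column scenario in the question, WITHOUT ROWID therefore avoids storing the data twice.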

How to use one sequence for all tables in SQLite

When I'm creating tables in an SQLite database, separate sequences are automatically created for each table.
However, I want to use one sequence for all tables in my SQLite database, and I also need to set the min and max values of that sequence (e.g. min=10000 and max=999999), where min is the starting value and max is the largest value the sequence can reach.
I know this can be done in an Oracle database, but don't know how to do it in SQLite.
Is there any way to do this?
Unfortunately, you cannot do this: SQLite automatically creates a sequence for each table in the special sqlite_sequence service table.
And even if you somehow forced it to use a single sequence as the source for all your tables, it would not work the way you expect. For example, in PostgreSQL or Oracle, if you set a sequence to, say, 1 while the table is already filled with rows 1, 2, ..., any attempt to insert new rows using that sequence as a source would fail with a unique-constraint violation.
In SQLite, however, if you manually reset a sequence using something like:
UPDATE sqlite_sequence SET seq = 1 WHERE name = 'mytable'
and you already have rows 1, 2, 3, ..., new inserts will NOT fail; SQLite automatically bumps the sequence back up to the maximum existing value of the auto-increment column, so it keeps counting upward.
Note that you can use this hack to assign a starting value for the sequence, but you cannot make it stop incrementing at some maximum value (unless you enforce that by other means).
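This reset-and-recover behavior is easy to verify with Python's sqlite3 module; mytable mirrors the UPDATE statement above:

```python
import sqlite3

# Demonstration: resetting sqlite_sequence below the table's maximum id
# does not cause collisions -- AUTOINCREMENT picks a rowid larger than
# both the stored sequence value and the largest existing rowid.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable(id INTEGER PRIMARY KEY AUTOINCREMENT, val TEXT)")
conn.executemany("INSERT INTO mytable(val) VALUES (?)",
                 [("a",), ("b",), ("c",)])          # ids 1, 2, 3

# Manually reset the sequence below the existing maximum:
conn.execute("UPDATE sqlite_sequence SET seq = 1 WHERE name = 'mytable'")

# Unlike Oracle/PostgreSQL, this insert does not collide with id 1:
conn.execute("INSERT INTO mytable(val) VALUES ('d')")
print(conn.execute("SELECT MAX(id) FROM mytable").fetchone()[0])  # 4
```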
First of all, this is a bad idea.
The performance of database queries depends on predictability, and by fiddling with the sequence values you introduce gaps and offsets that can confuse the database engine.
However, to achieve this you could determine the lowest sequence number that is greater than or equal to every existing sequence number:
SELECT MAX(seq) FROM sqlite_sequence
This needs to be done after each INSERT query, followed by an update of the sequences for all tables:
UPDATE sqlite_sequence SET seq = determined_max
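A sketch of this workaround with Python's sqlite3 module (table names t1/t2 are hypothetical; note that a table only gets its sqlite_sequence row after its first insert, so rows are seeded manually here):

```python
import sqlite3

# Sketch: level every table's sqlite_sequence entry up to the global
# maximum after each INSERT, so all AUTOINCREMENT tables effectively
# draw from one shared counter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1(id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT)")
conn.execute("CREATE TABLE t2(id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT)")

# Seed sequence rows up front; sqlite_sequence is an ordinary writable table.
for t in ("t1", "t2"):
    conn.execute("INSERT INTO sqlite_sequence(name, seq) VALUES (?, 0)", (t,))

def insert(table, value):
    conn.execute(f"INSERT INTO {table}(v) VALUES (?)", (value,))
    determined_max = conn.execute(
        "SELECT MAX(seq) FROM sqlite_sequence").fetchone()[0]
    conn.execute("UPDATE sqlite_sequence SET seq = ?", (determined_max,))

insert("t1", "a")   # t1.id = 1
insert("t2", "b")   # t2.id = 2: t2's sequence was leveled up to 1
insert("t1", "c")   # t1.id = 3
print(conn.execute("SELECT id FROM t1 ORDER BY id").fetchall())  # [(1,), (3,)]
print(conn.execute("SELECT id FROM t2").fetchall())              # [(2,)]
```

This only fakes a shared starting point; as the answer above says, nothing stops the counter at a maximum value.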

BTEQ The activity count returned by DBS does not match the actual number of rows returned

When I export a table from Teradata using BTEQ, the output row count does not match the count from the SELECT query. The following is the warning shown by BTEQ:
Warning: The activity count returned by DBS does not match
the actual number of rows returned.
Activity Count=495294, Total Rows Returned=495286
Here is the select query,
SELECT CUST_ID, SPEC1_CODE FROM Table
GROUP BY 1,2
Here is the create table script,
CREATE MULTISET TABLE Table ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT
(
RECORD_KEY DECIMAL(20,0) NOT NULL,
CUST_ID VARCHAR(40) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
SPEC1_CODE VARCHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC)
PRIMARY INDEX ( RECORD_KEY );
When we contacted Teradata support, they asked us to run the following query.
DIAGNOSTIC NOAGGRENH ON FOR SESSION;
So, if we run the above statement and then run our SELECT/BTEQ export, it works fine.
I was hoping you would answer my questions in the comments sooner, but I'm going to throw this out as a possible reason for the discrepancy you are seeing in the warning message.
Your table is defined as MULTISET with a non-unique primary index, or possibly as a NOPI table in Teradata 13.x. There are no additional unique constraints or unique indexes on the table, and it has been loaded with 8 duplicate rows of data.
For reasons I cannot pinpoint based on your description, BTEQ returned a unique set of records although the optimizer indicated that the activity count for the statement was greater, hence the warning message you are seeing.
