Data load with Teradata TPT failing

Here I'm trying to load a CSV file into a Teradata table using the TPT utility, but it is failing with an error.
Here is my TPT script:
DEFINE JOB test_tpt
DESCRIPTION 'Load a Teradata table from a file'
(
DEFINE SCHEMA SCHEMA_EMP_NAME
(
NAME VARCHAR(50),
AGE VARCHAR(50)
);
DEFINE OPERATOR od_EMP_NAME
TYPE DDL
ATTRIBUTES
(
VARCHAR PrivateLogName = 'tpt_log',
VARCHAR LogonMech = 'LDAP',
VARCHAR TdpId = 'TeraDev',
VARCHAR UserName = 'user',
VARCHAR UserPassword = 'pwd',
VARCHAR ErrorList = '3807'
);
DEFINE OPERATOR op_EMP_NAME
TYPE DATACONNECTOR PRODUCER
SCHEMA SCHEMA_EMP_NAME
ATTRIBUTES
(
VARCHAR DirectoryPath= '/home/hadoop/retail/',
VARCHAR FileName = 'emp_age.csv',
VARCHAR Format = 'Delimited',
VARCHAR OpenMode = 'Read',
VARCHAR TextDelimiter =','
);
DEFINE OPERATOR ol_EMP_NAME
TYPE LOAD
SCHEMA *
ATTRIBUTES
(
VARCHAR LogonMech = 'LDAP',
VARCHAR TdpId = 'TeraDev',
VARCHAR UserName = 'user',
VARCHAR UserPassword = 'pwd',
VARCHAR LogTable = 'EMP_NAME_LG',
VARCHAR ErrorTable1 = 'EMP_NAME_ET',
VARCHAR ErrorTable2 = 'EMP_NAME_UV',
VARCHAR TargetTable = 'EMP_NAME'
);
STEP stSetup_Tables
(
APPLY
('DROP TABLE EMP_NAME_LG;'),
('DROP TABLE EMP_NAME_ET;'),
('DROP TABLE EMP_NAME_UV;'),
('DROP TABLE EMP_NAME;'),
('CREATE TABLE EMP_NAME(NAME VARCHAR(50), AGE VARCHAR(2));')
TO OPERATOR (od_EMP_NAME);
);
STEP stLOAD_FILE_NAME
(
APPLY
('INSERT INTO EMP_NAME
(Name,Age)
VALUES
(:Name,:Age);
')
TO OPERATOR (ol_EMP_NAME)
SELECT * FROM OPERATOR(op_EMP_NAME);
);
);
Call TPT:
tbuild -f test_tpt.sql
The above TPT script fails with the following error:
Teradata Parallel Transporter Version 15.10.01.02 64-Bit
TPT_INFRA: Syntax error at or near line 6 of Job Script File 'test_tpt.sql':
TPT_INFRA: At "NAME" missing RPAREN_ in Rule: Explicit Schema Element List
TPT_INFRA: Syntax error at or near line 8 of Job Script File 'test_tpt.sql':
TPT_INFRA: TPT03020: Rule: DEFINE SCHEMA
Compilation failed due to errors. Execution Plan was not generated.
Job script compilation failed .
Am I missing any detail here?

The messages certainly could be clearer, but the issue is that NAME is a restricted word in TPT scripts.
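A minimal sketch of the fix, renaming the schema fields (TPT schema field names don't have to match the target table's column names; only the :placeholders in the APPLY have to match the schema):
DEFINE SCHEMA SCHEMA_EMP_NAME
(
EMP_NAME VARCHAR(50), /* was NAME, which the TPT script parser rejects */
EMP_AGE VARCHAR(50)
);
and in the load step:
APPLY
('INSERT INTO EMP_NAME
(Name,Age)
VALUES
(:EMP_NAME,:EMP_AGE);
')
TO OPERATOR (ol_EMP_NAME)
SELECT * FROM OPERATOR(op_EMP_NAME);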

Related

Issue with importing Image into CLOB data using TPT script

I have a simple TPT script (given below) to load an image into a CLOB column in an empty table.
USING CHARACTER SET UTF8
DEFINE JOB LoadingtableData
DESCRIPTION 'Loading data into table using TPT'
(
DEFINE SCHEMA TableStaging
DESCRIPTION 'SYS FILE Staging Table'
(
Col_Colb CLOB(131072) AS DEFERRED BY NAME
,Col_FNAME VARCHAR(100)
,Col_ID VARCHAR(50)
);
DEFINE OPERATOR FileReader()
DESCRIPTION 'Read file with list'
TYPE DATACONNECTOR PRODUCER
SCHEMA TableStaging
ATTRIBUTES (
VARCHAR TraceLevel = 'None'
, VARCHAR PrivateLogName = 'read_log'
, VARCHAR FileName = 'datafile.txt'
, VARCHAR OpenMode = 'Read'
, VARCHAR Format = 'Delimited'
, VARCHAR TextDelimiter = ',');
DEFINE OPERATOR SQLInserter()
DESCRIPTION 'Insert from files into table'
TYPE INSERTER
INPUT SCHEMA TableStaging
ATTRIBUTES (
VARCHAR TraceLevel = 'None'
, VARCHAR PrivateLogName = '#LOG'
, VARCHAR TdpId = '#TdpId'
, VARCHAR UserName = '#UserName'
, VARCHAR UserPassword = '#UserPassword');
STEP LoadData (
APPLY ('INSERT INTO table_A(Col_Colb,Col_FNAME,Col_ID) VALUES (:Col_Colb,:Col_FNAME,:Col_ID);')
TO OPERATOR (SQLInserter[1])
SELECT * FROM OPERATOR (FileReader());
);
);
To load the data into the table I'm using two text files:
One file has all the VARCHAR column values plus the location of the CLOB data.
Example data in file: <Clob_File_Location>,Name,123
The other file holds the CLOB column value itself.
Example data in file: Image.png
After executing the above TPT job, I get the message "data loaded successfully". But when I check the table, the CLOB column contains text in place of the image.
Can someone help me figure out what I might be doing wrong?

Teradata Parallel Transporter DDL Operator - missing { EXTENDED_LITERAL_ CHAR_STRING_LITERAL_ } in Rule: Character String Literal ERROR

What I want to do is check in my database whether my table exists, and if so, drop it. Here is my .tpt:
DEFINE JOB DELETE_ET_TABLES
DESCRIPTION 'Delete ET tables'
(
DEFINE OPERATOR DDL_OPERATOR
DESCRIPTION 'Teradata Parallel Transporter DDL Operator'
TYPE DDL
ATTRIBUTES
(
varchar TdpId = #TERADATA_TDP,
varchar UserName = #User,
varchar UserPassword = #Pwd
);
APPLY
'SELECT (CASE WHEN TableName = ''Test_Del''
THEN (''DROP TABLE #Table;'')
ELSE NULL
END)
FROM dbc.TablesV WHERE databasename = #Db;' TO OPERATOR(DDL_OPERATOR);
And this is the error message I am getting:
Running "tbuild" command: tbuild -f /$HOME/loaders/test_deleteETTables.tpt -u TERADATA_TDP=$TDP, TERADATA_DATABASE=$DB -L /$LOG/
Teradata Parallel Transporter Version 16.20.00.09 64-Bit
TPT_INFRA: Syntax error at or near line 18 of Job Script File '/$HOME/loaders/test_deleteETTables.tpt':
TPT_INFRA: At "(" missing { EXTENDED_LITERAL_ CHAR_STRING_LITERAL_ } in Rule: Character String Literal
Compilation failed due to errors. Execution Plan was not generated.
Do you have any idea? I have tried multiple things, such as:
SELECT 1 FROM dbc.TablesV WHERE databasename = #Db AND TABLENAME ='TEST_DEL';
CASE WHEN ACTIVITYCOUNT = 1
THEN (DROP TABLE #Table)
ELSE ( QUIT )
END;
All my variables have been declared. I feel that it is a problem with using single quotes inside the statement, but I am not sure and I don't know how to resolve it. Thank you very much for your time.
The solution that Fred recommended in the comments worked just fine:
I think this is due to the use of NULL, but in any case SELECT is not valid for the DDL operator. The recommended way to do this is simply to pass a DROP to the operator and tell it to ignore "not found" (3807 is Teradata's "object does not exist" error) and consider that success, i.e. ErrorList = '3807':
DEFINE JOB DELETE_ET_TABLES
DESCRIPTION 'Delete ET tables'
(
DEFINE OPERATOR DDL_OPERATOR
DESCRIPTION 'Teradata Parallel Transporter DDL Operator'
TYPE DDL
ATTRIBUTES
(
varchar TdpId = #TERADATA_TDP,
varchar UserName = #USERDB,
varchar UserPassword = #PWD,
VARCHAR ErrorList = '3807'
);
APPLY
('DROP TABLE #TABLENAME;')
TO OPERATOR(DDL_OPERATOR);
);
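As an aside, the #-prefixed names above are placeholders resolved by external templating before tbuild runs; TPT's native job variables use the @name syntax and are supplied with tbuild -u. A sketch of the same APPLY under that convention (TargetTable is an illustrative variable name):
APPLY
/* e.g. tbuild -f delete_et.tpt -u "TargetTable='MyDb.Test_Del'" */
('DROP TABLE ' || @TargetTable || ';')
TO OPERATOR(DDL_OPERATOR);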

HSQL postgres dialect not recognized

I want to use HSQL for integration tests, so I want to set up the test schema with exactly the same script I use for production, which is in the PostgreSQL dialect. In the test script I tried to set the dialect, but it doesn't seem to work.
At least for the uuid data type and constraints I get syntax error exceptions. E.g. I get:
CREATE TABLE testtable ( id bigint NOT NULL, some_uuid uuid NOT NULL,
name character varying(32) NOT NULL, CONSTRAINT testtable PRIMARY KEY
(id) ) WITH ( OIDS=FALSE ); nested exception is
java.sql.SQLSyntaxErrorException: type not found or user lacks
privilege: UUID
for the following script:
SET DATABASE SQL SYNTAX PGS TRUE;
CREATE TABLE testtable
(
id bigint NOT NULL,
some_uuid uuid NOT NULL,
name character varying(32) NOT NULL,
CONSTRAINT testtable PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
And I get:
Failed to execute SQL script statement #2 of class path resource
[setupTestData.sql]: CREATE TABLE testtable ( id bigint NOT NULL, name
character varying(32) NOT NULL, CONSTRAINT testtable PRIMARY KEY (id)
) WITH ( OIDS=FALSE ); nested exception is
java.sql.SQLSyntaxErrorException: unexpected token: (
for this script:
SET DATABASE SQL SYNTAX PGS TRUE;
CREATE TABLE testtable
(
id bigint NOT NULL,
--some_uuid uuid NOT NULL,
name character varying(32) NOT NULL,
CONSTRAINT testtable PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
HSQLDB 2.3.4 and later supports UUID.
HSQLDB does not currently support the PostgreSQL extension WITH (OIDS=FALSE).
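A sketch of the test script adjusted accordingly, assuming HSQLDB 2.3.4 or later; it is identical to the original except that the unsupported WITH clause is dropped:
SET DATABASE SQL SYNTAX PGS TRUE;
CREATE TABLE testtable
(
id bigint NOT NULL,
some_uuid uuid NOT NULL,
name character varying(32) NOT NULL,
CONSTRAINT testtable PRIMARY KEY (id)
);
-- WITH ( OIDS=FALSE ) removed; HSQLDB does not recognize this PostgreSQL storage clause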

How can I use IF statements in Teradata without using BTEQ

I'm trying to create some deployment tools and I don't want to use BTEQ. I've been trying to work with the Teradata.Client.Provider in PowerShell but I'm getting syntax errors on the creation of a table.
[Teradata Database] [3706] Syntax error: expected something between
';' and the 'IF' keyword.
SELECT * FROM DBC.TablesV WHERE DatabaseName = DATABASE AND TableName = 'MyTable';
IF ACTIVITYCOUNT > 0 THEN GOTO EndStep1;
CREATE MULTISET TABLE MyTable ,
NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
MyColId INTEGER GENERATED ALWAYS AS IDENTITY
(START WITH 1
INCREMENT BY 1
MINVALUE 0
MAXVALUE 2147483647
NO CYCLE)
NOT NULL,
MyColType VARCHAR(50) NULL,
MyColTarget VARCHAR(128) NULL,
MyColScriptName VARCHAR(256) NULL,
MyColOutput VARCHAR(64000) NULL,
isMyColException BYTEINT(1) NULL,
ExceptionOutput VARCHAR(64000) NULL,
MyColBuild VARCHAR(128) NULL,
MyColDate TIMESTAMP NOT NULL
)
PRIMARY INDEX PI_MyTable_MyColLogId(MyColLogId);
LABEL EndStep1;
I would rather not use BTEQ, as I've found it doesn't work well in the other deployment tools we have created and requires a few hacks. Is there anything I can use that would avoid that tool?
What parse error? The CREATE will fail due to the double INTEGER in MyColId and the VARCHAR(max) in ExceptionOutput; VARCHAR(max) is an unknown data type in Teradata.
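Beyond that, the DDL as posted has two further visible problems: BYTEINT takes no length specification, and the PRIMARY INDEX references MyColLogId, which is not a declared column. A sketch of the tail of the statement with both fixed, assuming MyColId is the intended index column:
isMyColException BYTEINT NULL, /* BYTEINT takes no (1) length in Teradata */
ExceptionOutput VARCHAR(64000) NULL,
MyColBuild VARCHAR(128) NULL,
MyColDate TIMESTAMP NOT NULL
)
PRIMARY INDEX PI_MyTable_MyColId (MyColId); /* MyColLogId is not declared above */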

How to suppress Teradata warnings with CAST

Let's say I need to CAST(birth_date AS DATE FORMAT 'MM/DD/YYYY').
If the birth_date field contains nulls or invalid characters, it throws an untranslatable character error.
Of course I can use regex or oTranslate, but all that overcomplicates the SQL.
Is there any way to suppress all these errors? CAST if you can, otherwise make it NULL?
The burden of checking whether the data fits the data type you wish to store it in must reside somewhere. You could use CASE {regular expression matching} THEN CAST() ELSE NULL END, which may be the cleanest way to address the data quality validation in your SQL.
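For example, a minimal sketch of that pattern, assuming birth_date holds MM/DD/YYYY text (my_table and birth_date_clean are illustrative names):
SELECT CASE
WHEN REGEXP_SIMILAR(birth_date, '[0-9]{2}/[0-9]{2}/[0-9]{4}') = 1
THEN CAST(birth_date AS DATE FORMAT 'MM/DD/YYYY')
ELSE NULL /* NULLs and non-date strings fall through instead of erroring */
END AS birth_date_clean
FROM my_table;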
Otherwise, pre-process your data file to replace bad data with a token you can swap for NULL in your SQL. You could do this in PowerShell, UNIX shell scripting, or perhaps a third-party tool (e.g. address cleansing/formatting, etc.).
Since there is no built-in way to say CAST(<field> AS <datatype>) IGNORING ERRORS AS <alias>, you could use a TPT script instead.
In TPT, APPLY lets an INSERT statement route errors into two different error tables.
Something like the following would get you close. It is something you would run after your dirty date table is loaded, to move the rows into a clean date table.
DEFINE JOB DATA_insert_Example
(
DEFINE OPERATOR data_insert_Example
TYPE UPDATE
SCHEMA *
ATTRIBUTES
(
VARCHAR UserName,
VARCHAR UserPassword,
VARCHAR LogTable,
VARCHAR TargetTable,
INTEGER BufferSize,
INTEGER ErrorLimit = 5,
INTEGER MaxSessions = 4,
INTEGER MinSessions = 1,
INTEGER TenacityHours,
INTEGER TenacitySleep,
VARCHAR AccountID,
VARCHAR AmpCheck,
VARCHAR DeleteTask,
VARCHAR ErrorTable1 = '<yourdatabase>.<yourcleantable>'||'_ET',
VARCHAR ErrorTable2 = '<yourdatabase>.<yourcleantable>'||'_UV',
VARCHAR NotifyExit,
VARCHAR NotifyExitIsDLL,
VARCHAR NotifyLevel,
VARCHAR NotifyMethod,
VARCHAR NotifyString,
VARCHAR PauseAcq,
VARCHAR PrivateLogName,
VARCHAR TdpId,
VARCHAR TraceLevel,
VARCHAR WorkingDatabase = '<yourdatabase>',
VARCHAR WorkTable = '<yourdatabase>.<yourcleantable>'||'_Work'
);
DEFINE SCHEMA data_insert_schema
(
field1 VARCHAR(20),
field2 VARCHAR(20),
field3 VARCHAR(20),
field4 VARCHAR(20)
);
DEFINE OPERATOR data_insert_export
TYPE EXPORT
SCHEMA data_insert_schema
ATTRIBUTES
(
VARCHAR UserName,
VARCHAR UserPassword,
VARCHAR SelectStmt,
VARCHAR TdpId
);
STEP UPS
(
APPLY
(
'INSERT INTO <yourdatabase>.<yourcleantable>
(
field1,
field2,
field3,
field4
)
VALUES (
:field1,
:field2,
:field3,
:field4
);'
)
TO OPERATOR
(
data_insert_Example[1]
ATTRIBUTES
(
UserName = '<yourusername>',
UserPassword = '<yourpass>',
LogTable = '<yourdatabase>.<yourcleantable>'||'_LOG',
TargetTable = '<yourdatabase>.<yourcleantable>',
TdpId = '<yourserverip/address>'
)
)
SELECT * FROM OPERATOR
(
data_insert_export[1]
ATTRIBUTES
(
UserName = '<yourusername>',
UserPassword = '<yourpassword>',
SelectStmt = 'SELECT field1,field2,field3,field4 FROM <yourdatabase>.<yourtable> ;',
TdpId = '<yourserverip/address>'
)
);
);
);
Obviously, though, this is quite a bit of overkill compared to a simple regex. Regex feels overwhelming when you first start using it, but I think this is a completely reasonable use case: checking dates stored as string literals before trying to convert them to their proper data type.
Overall, it sounds like you have garbage data, so I totally get the frustration. Unfortunately, for garbage data there is no magic bullet. You'll need some decent ETL between the garbage and your clean output.
