I have a simple TPT script (given below) to load an image into a CLOB column in an empty table.
USING CHARACTER SET UTF8
DEFINE JOB LoadingtableData
DESCRIPTION 'Loading data into table using TPT'
(
DEFINE SCHEMA TableStaging
DESCRIPTION 'SYS FILE Staging Table'
(
Col_Colb CLOB(131072) AS DEFERRED BY NAME
,Col_FNAME VARCHAR(100)
,Col_ID VARCHAR(50)
);
DEFINE OPERATOR FileReader()
DESCRIPTION 'Read file with list'
TYPE DATACONNECTOR PRODUCER
SCHEMA TableStaging
ATTRIBUTES (
VARCHAR TraceLevel = 'None'
, VARCHAR PrivateLogName = 'read_log'
, VARCHAR FileName = 'datafile.txt'
, VARCHAR OpenMode = 'Read'
, VARCHAR Format = 'Delimited'
, VARCHAR TextDelimiter = ',');
DEFINE OPERATOR SQLInserter()
DESCRIPTION 'Insert from files into table'
TYPE INSERTER
INPUT SCHEMA TableStaging
ATTRIBUTES (
VARCHAR TraceLevel = 'None'
, VARCHAR PrivateLogName = '#LOG'
, VARCHAR TdpId = '#TdpId'
, VARCHAR UserName = '#UserName'
, VARCHAR UserPassword = '#UserPassword');
STEP LoadData (
APPLY ('INSERT INTO table_A(Col_Colb,Col_FNAME,Col_ID) VALUES (:Col_Colb,:Col_FNAME,:Col_ID);')
TO OPERATOR (SQLInserter[1])
SELECT * FROM OPERATOR (FileReader());
);
);
To load data into the table I'm using two text files:
The first file has all the VARCHAR column values and the CLOB data location.
Example data in that file: <Clob_File_Location>,Name,123
The second file holds the CLOB column value.
Example data in that file: Image.png
After executing the above TPT script, I get the message "data loaded successfully". But when I check the table, text is loaded into the CLOB column in place of the image.
Can someone help me figure out what I might be doing wrong?
Related
I created two tables in two different databases (TestDb1 and TestDb2) on the same server. I wrote an AFTER DELETE trigger on the "ERROR_MASTER" table: when I delete a record in the "ERROR_MASTER" table in TestDb1, the trigger should insert a record into the "ERROR_MASTER_LOG" table, which exists in TestDb2.
My dblink->dblink('dbname=TestDb2 port=5432 host=192.168.0.48 user=postgres password=soft123')
DB->TestDb1
CREATE TABLE public."ERROR_MASTER"
(
"MARKERID" integer NOT NULL,
"FILENAME" character varying,
"RECNO" integer,
"ERRORCODE" character varying,
"USERID" character varying,
"ID" integer NOT NULL,
CONSTRAINT "ERR_MASTER_pkey" PRIMARY KEY ("ID")
)
WITH (
OIDS=FALSE
);
DB->TestDb2
CREATE TABLE public."ERROR_MASTER_LOG"
(
"MARKERID" integer NOT NULL,
"FILENAME" character varying,
"RECNO" integer,
"ERRORCODE" character varying,
"USERID" character varying,
"ID" integer NOT NULL,
CONSTRAINT "ERR_MASTER_Log_pkey" PRIMARY KEY ("ID"),
"Machine_IP" character varying,
"DELETED_AT" timestamp
)
WITH (
OIDS=FALSE
);
ALTER TABLE public."ERROR_MASTER_LOG"
OWNER TO postgres;
GRANT ALL ON TABLE public."ERROR_MASTER_LOG" TO postgres;
CREATE INDEX "IDX_ERROR_MASTER_LOG_MARKERID_RECNO"
ON public."ERROR_MASTER_LOG"
USING btree
("MARKERID" COLLATE pg_catalog."default", "RECNO" COLLATE pg_catalog."default", round("X1"::numeric, 2));
I tried the trigger below in TestDb1 to insert a record into a table that exists in another database (TestDb2) using dblink. It fails with: schema "old" does not exist. Please help.
CREATE OR REPLACE FUNCTION mdp_error_master_after_delete()
RETURNS trigger AS
$BODY$
BEGIN
perform dblink_connect('host=localhost user=postgres password=postgres dbname=TestDB2');
perform dblink_exec('INSERT INTO "ERROR_MASTER_LOG"("MARKERID","ID")
values('||OLD."MARKERID"||','||OLD."ID"||')');
perform dblink_disconnect();
RETURN OLD;
EXCEPTION WHEN OTHERS THEN
RAISE NOTICE 'insert_new_sessions SQL ERROR: %', SQLERRM;
RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION mdp_error_master_after_delete()
OWNER TO postgres;
CREATE TRIGGER ERROR_MASTER_CHANGES
AFTER DELETE
ON "ERROR_MASTER"
FOR EACH ROW
EXECUTE PROCEDURE mdp_error_master_after_delete();
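For reference, a minimal sketch (not the original code) of the same remote insert built with format(), so the OLD values are substituted and quoted locally before dblink_exec sends the statement to TestDb2:
-- Sketch only: assemble the remote INSERT text with format(), then run it over dblink.
PERFORM dblink_exec(
    'host=localhost user=postgres password=postgres dbname=TestDb2',
    format('INSERT INTO "ERROR_MASTER_LOG"("MARKERID","ID") VALUES (%s, %s)',
           OLD."MARKERID", OLD."ID")
);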
Here I'm trying to load a CSV file into Teradata tables using the TPT utility, but it is failing with an error.
Here is my TPT script:
DEFINE JOB test_tpt
DESCRIPTION 'Load a Teradata table from a file'
(
DEFINE SCHEMA SCHEMA_EMP_NAME
(
NAME VARCHAR(50),
AGE VARCHAR(50)
);
DEFINE OPERATOR od_EMP_NAME
TYPE DDL
ATTRIBUTES
(
VARCHAR PrivateLogName = 'tpt_log',
VARCHAR LogonMech = 'LDAP',
VARCHAR TdpId = 'TeraDev',
VARCHAR UserName = 'user',
VARCHAR UserPassword = 'pwd',
VARCHAR ErrorList = '3807'
);
DEFINE OPERATOR op_EMP_NAME
TYPE DATACONNECTOR PRODUCER
SCHEMA SCHEMA_EMP_NAME
ATTRIBUTES
(
VARCHAR DirectoryPath= '/home/hadoop/retail/',
VARCHAR FileName = 'emp_age.csv',
VARCHAR Format = 'Delimited',
VARCHAR OpenMode = 'Read',
VARCHAR TextDelimiter =','
);
DEFINE OPERATOR ol_EMP_NAME
TYPE LOAD
SCHEMA *
ATTRIBUTES
(
VARCHAR LogonMech = 'LDAP',
VARCHAR TdpId = 'TeraDev',
VARCHAR UserName = 'user',
VARCHAR UserPassword = 'pwd',
VARCHAR LogTable = 'EMP_NAME_LG',
VARCHAR ErrorTable1 = 'EMP_NAME_ET',
VARCHAR ErrorTable2 = 'EMP_NAME_UV',
VARCHAR TargetTable = 'EMP_NAME'
);
STEP stSetup_Tables
(
APPLY
('DROP TABLE EMP_NAME_LG;'),
('DROP TABLE EMP_NAME_ET;'),
('DROP TABLE EMP_NAME_UV;'),
('DROP TABLE EMP_NAME;'),
('CREATE TABLE EMP_NAME(NAME VARCHAR(50), AGE VARCHAR(2));')
TO OPERATOR (od_EMP_NAME);
);
STEP stLOAD_FILE_NAME
(
APPLY
('INSERT INTO EMP_NAME
(Name,Age)
VALUES
(:Name,:Age);
')
TO OPERATOR (ol_EMP_NAME)
SELECT * FROM OPERATOR(op_EMP_NAME);
);
);
Call TPT:
tbuild -f test_tpt.sql
Above TPT script is failing with following error:
Teradata Parallel Transporter Version 15.10.01.02 64-Bit
TPT_INFRA: Syntax error at or near line 6 of Job Script File 'test_tpt.sql':
TPT_INFRA: At "NAME" missing RPAREN_ in Rule: Explicit Schema Element List
TPT_INFRA: Syntax error at or near line 8 of Job Script File 'test_tpt.sql':
TPT_INFRA: TPT03020: Rule: DEFINE SCHEMA
Compilation failed due to errors. Execution Plan was not generated.
Job script compilation failed .
Am I missing any detail here?
The messages certainly could be clearer, but the issue is that NAME is a restricted word.
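For example, a sketch of one way around it (not tested here): rename the field in the TPT schema and use the new name in the VALUES placeholders; the restriction applies to the script's schema, not to the SQL text sent to Teradata, so the target table can keep its NAME column:
DEFINE SCHEMA SCHEMA_EMP_NAME
(
EMP_NAME VARCHAR(50),  /* renamed from NAME, a TPT restricted word */
AGE VARCHAR(50)
);
and in the load step:
APPLY
('INSERT INTO EMP_NAME
(Name,Age)
VALUES
(:EMP_NAME,:AGE);
')
TO OPERATOR (ol_EMP_NAME)
SELECT * FROM OPERATOR(op_EMP_NAME);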
I know we can use SHOW TABLE DATABASENAME.TABLENAME to get the DDL of a certain table in Teradata, for example:
show table customerservice.employee;
result:
CREATE SET TABLE customerservice.employee ,FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
employee_number INTEGER,
manager_employee_number INTEGER,
department_number INTEGER,
job_code INTEGER,
last_name CHAR(20) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
first_name VARCHAR(30) CHARACTER SET LATIN NOT CASESPECIFIC NOT NULL,
hire_date DATE FORMAT 'YY/MM/DD' NOT NULL,
birthdate DATE FORMAT 'YY/MM/DD' NOT NULL,
salary_amount DECIMAL(10,2) NOT NULL)
UNIQUE PRIMARY INDEX ( employee_number );
If I want to show the DDL of a whole database, is it possible?
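One workaround that may help (a sketch, assuming you have SELECT access to the DBC.TablesV dictionary view): generate a SHOW TABLE statement for every table in the database, then run the generated statements:
SELECT 'SHOW TABLE ' || TRIM(DatabaseName) || '.' || TRIM(TableName) || ';'
FROM DBC.TablesV
WHERE DatabaseName = 'customerservice'
AND TableKind = 'T';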
I have two questions about MariaDB utf8mb4 with dynamic columns.
First of all, I use MariaDB version 10.0 and connect via JDBC.
To save emoji characters, I modified MariaDB as follows.
Edited /etc/my.cnf:
[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
Edited the DB table charset:
CREATE TABLE `MEMBER` (
`name` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`regdate` datetime DEFAULT NULL,
`sso_json` blob,
..(skip)..
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Removed the characterEncoding parameter from the JDBC connection string.
Deleted: characterEncoding=utf-8
With that, it worked: emoji characters were saved accurately in a varchar column. But not in a dynamic column: in both ad-hoc queries and procedures, column_create() saves question marks instead of the emoji.
Here is a sample procedure:
CREATE DEFINER=`sample`@`%` PROCEDURE `SP_INSERT`(
inName varchar(500) CHARACTER SET utf8mb4
)
BEGIN
SET @pSql = CONCAT( ' INSERT INTO SAMPLE_TBL ( '
, ' name, sso_json '
, ' ) VALUES ( '
, ' ?, COLUMN_CREATE(?, ?) '
, ' ) '
);
-- variables bind
SET @pName = inName;
SET @pKey = 'title';
-- prepare stmt
PREPARE pstmt FROM @pSql;
EXECUTE pstmt USING @pName, @pKey, @pName;
END
Procedure Result : {'title', '?????'}.
And in an ad-hoc query:
set names utf8mb4 collate 'utf8mb4_unicode_ci';
select 'testππππ', column_json(column_create('name','testππππ'));
Result:
testππππ | {"name":"test????"}
The string column comes back accurately, but the column_json output does not.
set names utf8;
select 'testππππ', column_json(column_create('name','testππππ'));
Result:
testππππ | {"name":"testππππ"}
I don't know why. Help me, please.
The sso_json blob column acquires the table's DEFAULT CHARACTER SET utf8; you need utf8mb4 for emoji, as you did with name.
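For instance, a sketch of one way to apply that, assuming the MEMBER table from the question (this changes only the table's default character set; existing columns keep their own):
ALTER TABLE `MEMBER` DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;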
Let's say I need to CAST(birth_date AS DATE FORMAT 'MM/DD/YYYY').
If the birth_date field contains nulls or invalid characters, it throws an untranslatable character error.
Of course I can use regular expressions, OTRANSLATE, and so on, but all that overcomplicates the SQL.
Is there any way to suppress all these errors? CAST if you can, otherwise make it NULL?
The burden of checking whether the data fits the data type you wish to store it in must reside somewhere. You could use CASE {regular expression matching} THEN CAST() ELSE NULL END, which may be the cleanest way to address the data quality validation in your SQL.
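For example, a rough sketch of that pattern (assuming Teradata 14+ where REGEXP_SIMILAR is available; the pattern and table name are placeholders you would tighten for your data):
SELECT CASE
         WHEN REGEXP_SIMILAR(birth_date, '[0-9]{2}/[0-9]{2}/[0-9]{4}', 'c') = 1
         THEN CAST(birth_date AS DATE FORMAT 'MM/DD/YYYY')
         ELSE NULL
       END AS birth_date_clean
FROM my_dirty_table;  /* placeholder table name */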
Otherwise, pre-process your data file to replace bad data with a token you can replace with NULL in your SQL. You can consider doing this in PowerShell, UNIX shell scripting, or perhaps a third-party tool (e.g. address cleansing/formatting, etc.).
Since there is no built-in way to say CAST(<field> AS <datatype>) IGNORING ERRORS AS <alias>, you could use a TPT script instead.
In TPT APPLY you can have an INSERT statement route errors into two different Error tables.
Something like the following would get you close. This is something that you would run after your dirty date table is loaded to get them into a clean date table.
DEFINE JOB DATA_insert_Example
(
DEFINE OPERATOR data_insert_Example
TYPE UPDATE
SCHEMA *
ATTRIBUTES
(
VARCHAR UserName,
VARCHAR UserPassword,
VARCHAR LogTable,
VARCHAR TargetTable,
INTEGER BufferSize,
INTEGER ErrorLimit = 5,
INTEGER MaxSessions = 4,
INTEGER MinSessions = 1,
INTEGER TenacityHours,
INTEGER TenacitySleep,
VARCHAR AccountID,
VARCHAR AmpCheck,
VARCHAR DeleteTask,
VARCHAR ErrorTable1 = '<yourdatabase>.<yourcleantable>'||'_ET',
VARCHAR ErrorTable2 = '<yourdatabase>.<yourcleantable>'||'_UV',
VARCHAR NotifyExit,
VARCHAR NotifyExitIsDLL,
VARCHAR NotifyLevel,
VARCHAR NotifyMethod,
VARCHAR NotifyString,
VARCHAR PauseAcq,
VARCHAR PrivateLogName,
VARCHAR TdpId,
VARCHAR TraceLevel,
VARCHAR WorkingDatabase = '<yourdatabase>',
VARCHAR WorkTable = '<yourdatabase>.<yourcleantable>'||'_Work'
);
DEFINE SCHEMA data_insert_schema
(
field1 VARCHAR(20),
field2 VARCHAR(20),
field3 VARCHAR(20),
field4 VARCHAR(20)
);
DEFINE OPERATOR data_insert_export
TYPE EXPORT
SCHEMA data_insert_schema
ATTRIBUTES
(
VARCHAR UserName,
VARCHAR UserPassword,
VARCHAR SelectStmt,
VARCHAR TdpId
);
STEP UPS
(
APPLY
(
'INSERT INTO <yourdatabase>.<yourcleantable>
(
field1,
field2,
field3,
field4
)
VALUES (
:field1,
:field2,
:field3,
:field4
);'
)
TO OPERATOR
(
data_insert_Example[1]
ATTRIBUTES
(
UserName = '<yourusername>',
UserPassword = '<yourpass>',
LogTable = '<yourdatabase>.<yourcleantable>'||'_LOG',
TargetTable = '<yourdatabase>.<yourcleantable>',
TdpId = '<yourserverip/address>'
)
)
SELECT * FROM OPERATOR
(
data_insert_export[1]
ATTRIBUTES
(
UserName = '<yourusername>',
UserPassword = '<yourpassword>',
SelectStmt = 'SELECT field1,field2,field3,field4 FROM <yourdatabase>.<yourtable> ;',
TdpId = '<yourserverip/address>'
)
);
);
)
Obviously, though, this is quite a bit of overkill compared to a simple regex. Regex feels overwhelming when you first start using it, but I think this is a completely reasonable use case for checking dates stored as string literals before trying to convert them to their proper data type.
Overall, it sounds like you have garbage data, so I totally get the frustration. Unfortunately, for garbage data there is no magic bullet; you'll need some decent ETL between the garbage and your clean output.