Oracle: is it possible to compare values in Long column with values in Clob column - oracle11g

There is a table X having a "Long" Column and Table Y having a "CLOB" Column. Data has been migrated from Table X to Table Y. Now I need to verify if the data got converted correctly. I had the idea of using casting, but seems "Long" values cannot be converted to "Varchar" in select statements. Any ideas are highly appreciated.
Eg:
SELECT LONG_COLUMN FROM TABLE_X
Minus
SELECT CLOB_COLUMN FROM TABLE_Y

You have two problems: long can't easily be converted/compared to other datatypes and clob can't be used in a minus operation!
Fortunately, both of these can be overcome by using PL/SQL. Longs and clobs can be implicitly converted into varchar2 when they are selected in PL/SQL blocks. You can load these into a nested-table, then use the multiset except operator to find the differences between them:
create table long_t ( x long );
create table lob_t ( x clob );
insert into long_t values ('1');
insert into long_t values ('2');
insert into lob_t values ('1');
declare
type t is table of varchar2(32767);
longs t;
clobs t;
diff t;
begin
select x bulk collect into longs from long_t;
select x bulk collect into clobs from lob_t;
diff := longs multiset except clobs;
for i in 1 .. diff.count loop
dbms_output.put_line(diff(i));
end loop;
diff := clobs multiset except longs;
for i in 1 .. diff.count loop
dbms_output.put_line(diff(i));
end loop;
end;
/
anonymous block completed
2
If your tables contain more than a couple of thousand rows, you're likely to run out of memory using the above as-is, as the whole table will be loaded at once. If you've got an id or similar column on each table, then it would be best to fetch and compare rows in ranges, e.g. 1-1000, 1001-2000 and so on.

Related

Insert array / nested table into table

I really apologise if answer already was given, but I could only find answers for php.
My problem is that I got nested table array "test_nested_table" that got values ('a','b','c'). I also got table "test_table" in the DB that got three columns col1, col2, col3.
All I want to do is something like
insert into test_table values (test_nested_table);
I understand I can do that:
insert into test_table values (test_nested_table(1), test_nested_table(2), test_nested_table(3));
However, my actual real life table might be very big and I would be very surprised if I really need to type all 100 elements to insert.
I have come up with the following solution:
declare
type test_array is varray(3) of varchar2(10);
ta test_array := test_array ('a','b','c');
sql_txt varchar2(1000) := 'insert into test_table_1 values (''';
begin
for n in 1 .. ta.count loop
sql_txt := sql_txt || ta(n)||''',''';
end loop;
sql_txt := rtrim(sql_txt,',''')||''')';
execute immediate sql_txt;
end;
/
So, if you have an array or any other collection type, you can use dynamic approach to insert a lot of values without writing them all in the insert statement; however, I believe there still should be a more usual way to do that.

SUM totals by FOR ALL ENTRIES itab keys

I want to execute a SELECT query on a database table that has 6 key fields, let's assume they are keyA, keyB, ..., keyF.
As input parameters to my ABAP function module I do receive an internal table with exactly that structure of the key fields, each entry in that internal table therefore corresponds to one tuple in the database table.
Thus I simply need to select all tuples from the database table that correspond to the entries in my internal table.
Furthermore, I want to aggregate an amount column in that database table in exactly the same query.
In pseudo SQL the query would look as follows:
SELECT SUM(amount) FROM table WHERE (keyA, keyB, keyC, keyD, keyE, keyF) IN {internal table}.
However, this representation is not possible in ABAP OpenSQL.
Only one column (such as keyA) is allowed to state, not a composite key. Furthermore I can only use 'selection tables' (those with SIGN, OPTIOn, LOW, HIGH) after they keyword IN.
Using FOR ALL ENTRIES seems feasible, however in this case I cannot use SUM since aggregation is not allowed in the same query.
Any suggestions?
For selecting records for each entry of an internal table, normally the for all entries idiom in ABAP Open SQL is your friend. In your case, you have the additional requirement to aggregate a sum. Unfortunately, the result set of a SELECT statement that works with for all entries is not allowed to use aggregate functions. In my eyes, the best way in this case is to compute the sum from the result set in the ABAP layer. The following example works in my system (note in passing: using the new ABAP language features that came with 7.40, you could considerably shorten the whole code).
report zz_ztmp_test.
start-of-selection.
perform test.
* Database table ZTMP_TEST :
* ID - key field - type CHAR10
* VALUE - no key field - type INT4
* Content: 'A' 10, 'B' 20, 'C' 30, 'D' 40, 'E' 50
types: ty_entries type standard table of ztmp_test.
* ---
form test.
data: lv_sum type i,
lt_result type ty_entries,
lt_keys type ty_entries.
perform fill_keys changing lt_keys.
if lt_keys is not initial.
select * into table lt_result
from ztmp_test
for all entries in lt_keys
where id = lt_keys-id.
endif.
perform get_sum using lt_result
changing lv_sum.
write: / lv_sum.
endform.
form fill_keys changing ct_keys type ty_entries.
append :
'A' to ct_keys,
'C' to ct_keys,
'E' to ct_keys.
endform.
form get_sum using it_entries type ty_entries
changing value(ev_sum) type i.
field-symbols: <ls_test> type ztmp_test.
clear ev_sum.
loop at it_entries assigning <ls_test>.
add <ls_test>-value to ev_sum.
endloop.
endform.
I would use FOR ALL ENTRIES to fetch all the related rows, then LOOP round the resulting table and add up the relevant field into a total. If you have ABAP 740 or later, you can use REDUCE operator to avoid having to loop round the table manually:
DATA(total) = REDUCE i( INIT sum = 0
FOR wa IN itab NEXT sum = sum + wa-field ).
One possible approach is simultaneous summarizing inside SELECT loop using statement SELECT...ENDSELECT statement.
Sample with calculating all order lines/quantities for the plant:
TYPES: BEGIN OF ls_collect,
werks TYPE t001w-werks,
menge TYPE ekpo-menge,
END OF ls_collect.
DATA: lt_collect TYPE TABLE OF ls_collect.
SELECT werks UP TO 100 ROWS
FROM t001w
INTO TABLE #DATA(lt_werks).
SELECT werks, menge
FROM ekpo
INTO #DATA(order)
FOR ALL ENTRIES IN #lt_werks
WHERE werks = #lt_werks-werks.
COLLECT order INTO lt_collect.
ENDSELECT.
The sample has no business sense and placed here just for educational purpose.
Another more robust and modern approach is CTE (Common Table Expressions) available since ABAP 751 version. This technique is specially intended among others for total/subtotal tasks:
WITH
+plants AS (
SELECT werks UP TO 100 ROWS
FROM t011w ),
+orders_by_plant AS (
SELECT SUM( menge )
FROM ekpo AS e
INNER JOIN +plants AS m
ON e~werks = m~werks
GROUP BY werks )
SELECT werks, menge
FROM +orders_by_plant
INTO TABLE #DATA(lt_sums)
ORDER BY werks.
cl_demo_output=>display( lt_sums ).
The first table expression +material is your internal table, the second +orders_by_mat quantities totals selected by the above materials and the last query is the final output query.

Query a manual list of data items

I would like to run a query involving joining a table to a manually generated list but am stuck trying to generate the manual list. There is an example of what I am attempting to do below:
SELECT
*
FROM
('29/12/2014', '30/12/2014', '30/12/2014') dates
;
Ideally I would want my output to look like:
29/12/2014
30/12/2014
31/12/2014
What's your Teradata release?
In TD14 there's STRTOK_SPLIT_TO_TABLE:
SELECT *
FROM TABLE (STRTOK_SPLIT_TO_TABLE(1 -- any dummy value
,'29/12/2014,30/12/2014,30/12/2014' -- any delimited string
,',' -- delimiter
)
RETURNS (outkey INTEGER
,tokennum INTEGER
,token VARCHAR(20) CHARACTER SET UNICODE) -- modify to match the actual size
) AS d
You can easily put this in a Derived Table and then join to it.
inkey (here the dummy value 1) is a numeric or string column, usually a key. Can be used for joining back to the original row.
outkey is the same as inkey.
tokennum is the ordinal position of the token in the input string.
token is the extracted substring.
Try this:
select '29/12/2014'
union
select '30/12/2014'
union
...
It should work in Teradata as well as in MySql.

PostgreSQL, R and timestamps with no time zone

I am reading a big csv (>1GB big for me!). It contains a timestamp field.
I read it (100 rows to start with ) with fread from the excellent data.table package.
ddfr <- fread(input="~/file1.csv",nrows=100, header=T)
Problem 1 (RESOLVED): the timestamp fields (called "ts" and "update"), e.g. "02/12/2014 04:40:00 AM" is converted to string. I convert the fields back to timestamp with lubridate package mdh_hms. Splendid.
ddfr$ts <- data.frame( mdy_hms(ddfr$ts))
Problem 2 (NOT RESOLVED): The timestamp is created with time zone as per POSIXlt.
How do I create in R a timestamp with NO TIME ZONE? is it possible??
Now I use another (new) great package, PivotalR to write the dataframe to PostGreSQL 9.3 using as.db.data.frame. It works as a charm.
x <- as.db.data.frame(ddfr, table.name= "tbl1", conn.id = 1)
Problem 3 (NOT RESOLVED): As the original dataframe timestamp fields had time zones, a table is created with the fields "timestamp with time zone". Ultimately the data needs to be stored in a table with fields configured as "timestamp without time zone".
But in my table in Postgres the data is stored as "2014-02-12 04:40:00.0", where the .0 at the end is the UTC offset. I think I need to have "2014-02-12 04:40:00".
I tried
ALTER TABLE tbl ALTER COLUMN ts type timestamp without time zone;
Then I copied across. While Postgres accepts the ALTER COLUMN command, when I try to copy (using INSERT INTO tbls SELECT ...) I get an error:
"column "ts" is of type timestamp without time zone but expression is of type text
Hint: You will need to rewrite or cast the expression."
Clearly the .0 at the end is not liked (but why then Postgres accepts the ALTER COLUMN? boh!).
I tried to do what the error suggested using CAST in the INSERT INTO query:
INSERT INTO tbl2 SELECT CAST(ts as timestamp without time zone) FROM tbl1
But I get the same error (including the suggestion to use CAST aargh!)
The table directly created by PivotalR (based on the dataframe) has this CREATE script:
CREATE TABLE tbl2
(
businessid integer,
caseno text,
ts timestamp with time zone
)
WITH (
OIDS=FALSE
);
ALTER TABLE tbl1
OWNER TO mydb;
The table I'm inserting into has this CREATE script:
CREATE TABLE tbl1
(
id integer NOT NULL DEFAULT nextval('bus_seq'::regclass),
businessid character varying,
caseno character varying,
ts timestamp without time zone,
updated timestamp without time zone,
CONSTRAINT busid_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE tbl1
OWNER TO postgres;
My apologies for the convoluted explanation, but potentially a solution could be found at any step in the chain, so I preferred to put all my steps in one question. I am sure there has to be a simpler method...
I think you're confused about copying data between tables.
INSERT INTO ... SELECT without a column list expects the columns from source and destination to be the same. It doesn't magically match up columns by name, it'll just assign columns from the SELECT to the INSERT from left to right until it runs out of columns, at which point any remaining cols are assumed to be null. So your query:
INSERT INTO tbl2 SELECT ts FROM tbl1;
isn't doing this:
INSERT INTO tbl2(ts) SELECT ts FROM tbl1;
it's actually picking the first column of tbl2, which is businessid, so it's really attempting to do:
INSERT INTO tbl2(businessid) SELECT ts FROM tbl1;
which is clearly nonsense, and no casting will fix that.
(Your error in the original question doesn't match your tables and queries, so the details might be different as you've clearly made a mistake in mangling/obfuscating your tables or posted a newer version of the tables than the error. The principle remains.)
It's generally a really bad idea to assume your table definitions won't change and column order won't change anyway. So always be explicit about columns. In this case I think your intention might have actually been:
INSERT INTO tbl2(businessid, caseno, ts)
SELECT CAST(businessid AS integer), caseno, ts
FROM tbl1;
Note the cast, because the type of businessid is different between the two tables.

tSQLt AssertEqualsTable - unexpected results when table schema doesn't match

I noticed the other day that you can write a test where there are more columns in the Actual table that in the Expected table and the test will still pass if the the data matches in the columns that exist in both.
Here is an example:
if exists(select * from INFORMATION_SCHEMA.ROUTINES where ROUTINE_SCHEMA='UnitTests_FirstTry' and ROUTINE_NAME='test_AssertEqualsTable_IgnoresExtraColumnsInActual')
begin
drop procedure UnitTests_FirstTry.test_AssertEqualsTable_IgnoresExtraColumnsInActual
end
go
create procedure UnitTests_FirstTry.test_AssertEqualsTable_IgnoresExtraColumnsInActual
as
begin
IF OBJECT_ID(N'tempdb..#Expected') > 0 DROP TABLE [#Expected];
IF OBJECT_ID(N'tempdb..#Actual') > 0 DROP TABLE [#Actual];
create table #expected( a int null) --, b int null, c varchar(10) null)
create table #actual(a int, x money null)
insert #expected (a) values (1)
insert #actual (a, x) values (1, 22.51)
--insert #expected (a, b, c) values (1,2,'test')
--insert #actual (a, x) values (1, 22.51)
exec tSQLt.AssertEqualsTable '#expected', '#actual'
end
go
exec tSQLt.Run 'UnitTests_FirstTry.test_AssertEqualsTable_IgnoresExtraColumnsInActual'
go
I noticed this when I removed some extra columns from the Expected table of a test that no longer needed those columns, but I forgot to remove the same columns from the Actual table and my test still passed, which to me was a bit off putting.
This only happens when the Actual table has more columns. If the expected has more columns an error is generated. Is this correct? Does anyone know what the reasoning is behind this behavior?
Although not particularly well documented in this respect, the AssertEqualsTable routine only looks at the data in the table - not that the columns are the same. To check whether the table structures are the same, use AssertResultSetsHaveSameMetaData. I wrote a bit about this in this
article.
You can of course run both in the same test, and the test will only pass if both checks pass.
I guess the reason for the split would be because there may be rare instances where you care about either the data or the metadata being consistent for your test, but not both.

Resources