Query a manual list of data items - Teradata

I would like to run a query involving joining a table to a manually generated list but am stuck trying to generate the manual list. There is an example of what I am attempting to do below:
SELECT
*
FROM
('29/12/2014', '30/12/2014', '31/12/2014') dates
;
Ideally I would want my output to look like:
29/12/2014
30/12/2014
31/12/2014

What's your Teradata release?
In TD14 there's STRTOK_SPLIT_TO_TABLE:
SELECT *
FROM TABLE (STRTOK_SPLIT_TO_TABLE(1                                  -- any dummy value
                                 ,'29/12/2014,30/12/2014,31/12/2014' -- any delimited string
                                 ,','                                -- delimiter
                                 )
     RETURNS (outkey INTEGER
             ,tokennum INTEGER
             ,token VARCHAR(20) CHARACTER SET UNICODE) -- modify to match the actual size
     ) AS d
You can easily put this in a Derived Table and then join to it; a sketch follows the column notes below.
inkey (here the dummy value 1) is a numeric or string column, usually a key; it can be used for joining back to the original row.
outkey is the same as inkey.
tokennum is the ordinal position of the token in the input string.
token is the extracted substring.
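For instance (mytable and date_col are hypothetical names; the CAST assumes the DD/MM/YYYY format from the question):
SELECT t.*
FROM mytable t
JOIN (
    SELECT token
    FROM TABLE (STRTOK_SPLIT_TO_TABLE(1, '29/12/2014,30/12/2014,31/12/2014', ',')
         RETURNS (outkey INTEGER
                 ,tokennum INTEGER
                 ,token VARCHAR(20) CHARACTER SET UNICODE)
         ) AS d
) dates
ON t.date_col = CAST(dates.token AS DATE FORMAT 'DD/MM/YYYY');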

Try this:
select '29/12/2014'
union
select '30/12/2014'
union
...
It should work in Teradata as well as in MySQL.
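To use this as the manual list from the question, wrap the UNION in a derived table and join to it (mytable and date_col are again hypothetical):
SELECT t.*
FROM mytable t
JOIN (
    SELECT '29/12/2014' AS dt
    UNION
    SELECT '30/12/2014'
    UNION
    SELECT '31/12/2014'
) dates
ON t.date_col = dates.dt;
Note that UNION also removes duplicate rows; use UNION ALL if you want to keep them.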


How to convert a number with decimal values to a float in PL/SQL?

The issue is that I need to insert this number into JSON, and because the number contains a comma, the JSON becomes invalid. A float would work because it contains a period, not a comma.
I have tried using replace(v_decimalNumber,',','.') and it kind of works, except that the JSON property is converted to a string. I need it to remain some type of numerical value.
How can this be achieved?
I am using Oracle 11g.
You just need the to_number() function.
select to_number(replace('1,2', ',', '.')) float_nr from dual;
Result:
FLOAT_NR
1.2
Note that if your number ends in .0, as in 1.0, the function will drop it and return just 1.
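A quick check of that edge case:
select to_number(replace('1,0', ',', '.')) num from dual;
Result:
NUM
1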
The data type of v_decimalNumber must be some character type, since it can contain commas. Your contention is that it contains a number once the commas are removed. However, there is no such thing until that contention has been validated: being character, the column can hold any characters at all, subject only to its length restriction. Think of a spreadsheet column that is supposed to contain numeric data: where a value doesn't apply, users will often type N/A into it, and Oracle will happily load that into your v_decimalNumber. (And that's one of many, many ways non-numeric data can get into your column.) So before processing the value as numeric you must validate that it is in fact valid numeric data. The following demonstrates one such way.
with some_numbers (n) as
     ( select '123,4456,789.00' from dual union all
       select '987654321.00'    from dual union all
       select '1928374655'      from dual union all
       select '1.2'             from dual union all
       select '.1'              from dual union all
       select '1..1'            from dual union all
       select 'N/A'             from dual
     )
   , rx as (select '^[0-9]*\.?[0-9]*$' regexp from dual)
select n
     , case when regexp_like(replace(n, ',', null), regexp)
            then to_number(replace(n, ',', null))
            else null
       end num_value
     , case when regexp_like(replace(n, ',', null), regexp)
            then null
            else 'Not valid number'
       end msg
  from some_numbers, rx;
Takeaway: never trust a character-type column to contain anything more specific than arbitrary characters. Always validate the data, then put it into appropriately typed columns.
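Combining the validation with the question's comma-as-decimal-separator conversion, a PL/SQL sketch might look like this (v_num and the sample value are made up):
declare
  v_decimalNumber varchar2(30) := '1,2';
  v_num           number;
begin
  -- validate first, then convert; the comma is the decimal separator here
  if regexp_like(replace(v_decimalNumber, ',', '.'), '^[0-9]*\.?[0-9]*$') then
    v_num := to_number(replace(v_decimalNumber, ',', '.'));
    dbms_output.put_line(v_num);
  else
    dbms_output.put_line('Not a valid number');
  end if;
end;
/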

Show negative real in SQLite table

I have a column C of type REAL in table F in SQLite. I want to join another table to F wherever it contains the negative value of C (along with some other fields).
However, -C or 0-C etc. all return a truncated value of C, e.g. when C contains "123,456" then -C returns "-123".
Should I cast this via a string first, or is the syntax different?
Looks like the , in 123,456 is meant to be a decimal separator, but SQLite treats the whole thing as a string (i.e. '123,456' rather than 123.456). Keep in mind that SQLite's type system is a little different from SQL's, as values have types but columns don't:
[...] In SQLite, the datatype of a value is associated with the value itself, not with its container. [...]
So you can quietly put a string (that looks like a real number in some locales) into a REAL column and nothing bad happens until later.
You could fix the import process to interpret the decimal separator as desired before the data gets into SQLite or you could use replace to fix them up as needed:
sqlite> select -'123,45';
-123
sqlite> select -replace('123,45', ',', '.');
-123.45
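If the data is already in the table, a one-off cleanup along the same lines might be (a sketch, assuming the F/C names from the question and that every value uses the comma separator):
UPDATE F SET C = CAST(replace(C, ',', '.') AS REAL);
After that, plain -C behaves as expected.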

Adding value to existing database table in RSQLite

I am new to RSQLite.
I have an input document in text format in which values are separated by '|'.
I created a table with the required variables (dummy code as follows)
db <- dbConnect(SQLite(), dbname = "test.sqlite")
dbSendQuery(conn = db,
            "CREATE TABLE TABLE1(
             MARKS INTEGER,
             ROLLNUM INTEGER,
             NAME CHAR(25),
             DATED DATE)"
)
However, I am stuck at how to import values into the created table.
I cannot use the INSERT INTO ... VALUES command, as there are thousands of rows and more than 20 columns in the original data file, and it is impossible to manually type in each data point.
Can someone suggest an alternative efficient way to do so?
You are using a scripting language. The whole point of that is literally to avoid typing each data point manually. Sorry.
You have two routes:
1: You have correctly created a database connection and an empty table in your SQLite database. Nice!
To load data into the table, load your text file into R using e.g. df <- read.table('textfile.txt', sep='|') (modify the arguments to fit your text file).
To have a 'dynamic' INSERT statement, you can use placeholders. RSQLite allows for both named and positional placeholders. To insert a single row, you can do:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (?, ?, ?);', list(1, 16, 'Big fellow'))
You see? The first ? got the value 1, the second ? got 16, and the last ? got the string Big fellow. Also note that you do not enclose placeholders for text in quotation marks (' or ")!
Now, you have thousands of rows. Or just more than one. Either way, you can send in your whole data frame. dbSendQuery has two requirements: 1) each vector must have the same number of entries (not an issue when providing a data.frame), and 2) you may only submit as many vectors as you have placeholders.
I assume your data frame df contains the columns mark, roll, and name, corresponding to the table's columns. Then you may run:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (:mark, :roll, :name);', df)
This will execute an INSERT statement for each row in df!
Tip! Because an INSERT statement is executed for each row, inserting thousands of rows can take a long time: after each insert, data is written to file and indices are updated. Instead, enclose the inserts in a transaction:
dbBegin(db)
res <- dbSendQuery(db, 'INSERT ...;', df)
dbClearResult(res)
dbCommit(db)
and SQLite will write the data to a journal file, only committing the result when you execute dbCommit(db). Try both methods and compare the speed!
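Putting route 1 together, a minimal sketch (textfile.txt and its column names are assumptions; params = is the explicit form of what the examples above pass positionally):
library(DBI)
library(RSQLite)

db <- dbConnect(SQLite(), dbname = "test.sqlite")

# read the pipe-delimited file; adjust header/col.names to your file
df <- read.table("textfile.txt", sep = "|", header = TRUE)
names(df) <- c("mark", "roll", "name")  # must match the named placeholders

dbBegin(db)  # one transaction for all the inserts
res <- dbSendQuery(db,
  "INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (:mark, :roll, :name);",
  params = df)
dbClearResult(res)
dbCommit(db)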
2: Ah, yes, the second way. This can be done entirely in SQLite.
With the SQLite command-line utility (sqlite3 from your command line, not R), you can import a text file as a table and simply run an INSERT INTO ... SELECT ...; command. Alternatively, read the text file in sqlite3 into a temporary table and run an INSERT INTO ... SELECT ...; from there.
Useful site to remember: http://www.sqlite.com/lang.html
A little late to the party, but DBI provides dbAppendTable(), which writes the contents of a data frame to an SQL table. Column names in the data frame must match the field names in the database. For your example, the following code would insert the contents of my random data frame into your newly created table.
library(DBI)
db <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")
dbExecute(db,
"CREATE TABLE TABLE1(
MARKS INTEGER,
ROLLNUM INTEGER,
NAME TEXT
)"
)
df <- data.frame(MARKS = sample(1:100, 10),
                 ROLLNUM = sample(1:100, 10),
                 NAME = stringi::stri_rand_strings(10, 10))
dbAppendTable(db, "TABLE1", df)
I don't think there is a nice way to do a large number of inserts directly from R. SQLite does have a bulk insert functionality, but the RSQLite package does not appear to expose it.
From the command line you may try the following:
.separator |
.import your_file.csv your_table
where your_file.csv is the CSV (or pipe delimited) file containing your data and your_table is the destination table.
See the documentation under CSV Import for more information.

SUM totals by FOR ALL ENTRIES itab keys

I want to execute a SELECT query on a database table that has 6 key fields, let's assume they are keyA, keyB, ..., keyF.
As input parameters to my ABAP function module I do receive an internal table with exactly that structure of the key fields, each entry in that internal table therefore corresponds to one tuple in the database table.
Thus I simply need to select all tuples from the database table that correspond to the entries in my internal table.
Furthermore, I want to aggregate an amount column in that database table in exactly the same query.
In pseudo SQL the query would look as follows:
SELECT SUM(amount) FROM table WHERE (keyA, keyB, keyC, keyD, keyE, keyF) IN {internal table}.
However, this representation is not possible in ABAP OpenSQL.
Only a single column (such as keyA) may be stated there, not a composite key. Furthermore, I can only use 'selection tables' (those with SIGN, OPTION, LOW, HIGH) after the keyword IN.
Using FOR ALL ENTRIES seems feasible; however, in that case I cannot use SUM, since aggregation is not allowed in the same query.
Any suggestions?
For selecting records for each entry of an internal table, the FOR ALL ENTRIES idiom in ABAP Open SQL is normally your friend. In your case, you have the additional requirement to aggregate a sum. Unfortunately, the result set of a SELECT statement that works with FOR ALL ENTRIES is not allowed to use aggregate functions. In my eyes, the best way in this case is to compute the sum from the result set in the ABAP layer. The following example works in my system (note in passing: with the new ABAP language features that came with 7.40, you could shorten the whole code considerably).
report zz_ztmp_test.

start-of-selection.
  perform test.

* Database table ZTMP_TEST:
* ID    - key field    - type CHAR10
* VALUE - non-key field - type INT4
* Content: 'A' 10, 'B' 20, 'C' 30, 'D' 40, 'E' 50

types: ty_entries type standard table of ztmp_test.

* ---
form test.
  data: lv_sum    type i,
        lt_result type ty_entries,
        lt_keys   type ty_entries.

  perform fill_keys changing lt_keys.

  if lt_keys is not initial.
    select * into table lt_result
      from ztmp_test
      for all entries in lt_keys
      where id = lt_keys-id.
  endif.

  perform get_sum using lt_result
                  changing lv_sum.

  write: / lv_sum.
endform.

form fill_keys changing ct_keys type ty_entries.
  append:
    'A' to ct_keys,
    'C' to ct_keys,
    'E' to ct_keys.
endform.

form get_sum using it_entries type ty_entries
             changing value(ev_sum) type i.
  field-symbols: <ls_test> type ztmp_test.
  clear ev_sum.
  loop at it_entries assigning <ls_test>.
    add <ls_test>-value to ev_sum.
  endloop.
endform.
I would use FOR ALL ENTRIES to fetch all the related rows, then LOOP over the resulting table and add up the relevant field into a total. If you have ABAP 7.40 or later, you can use the REDUCE operator to avoid having to loop over the table manually:
DATA(total) = REDUCE i( INIT sum = 0
FOR wa IN itab NEXT sum = sum + wa-field ).
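Putting both steps together in 7.40 syntax (using the ZTMP_TEST table and lt_keys from the first answer):
IF lt_keys IS NOT INITIAL.
  SELECT * FROM ztmp_test
    FOR ALL ENTRIES IN @lt_keys
    WHERE id = @lt_keys-id
    INTO TABLE @DATA(lt_result).
ENDIF.

DATA(lv_sum) = REDUCE i( INIT sum = 0
                         FOR wa IN lt_result NEXT sum = sum + wa-value ).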
One possible approach is to aggregate on the fly inside a SELECT loop, using the SELECT ... ENDSELECT statement.
A sample that collects all order quantities per plant:
TYPES: BEGIN OF ls_collect,
         werks TYPE t001w-werks,
         menge TYPE ekpo-menge,
       END OF ls_collect.
DATA: lt_collect TYPE TABLE OF ls_collect.

SELECT werks
  FROM t001w
  INTO TABLE @DATA(lt_werks)
  UP TO 100 ROWS.

SELECT werks, menge
  FROM ekpo
  FOR ALL ENTRIES IN @lt_werks
  WHERE werks = @lt_werks-werks
  INTO @DATA(order).

  COLLECT order INTO lt_collect.
ENDSELECT.
The sample has no business sense and is placed here just for educational purposes.
Another, more robust and modern approach is CTEs (Common Table Expressions), available since ABAP 7.51. This technique is specially intended, among other things, for total/subtotal tasks:
WITH
+plants AS (
  SELECT werks
    FROM t001w
    UP TO 100 ROWS ),
+orders_by_plant AS (
  SELECT e~werks, SUM( e~menge ) AS menge
    FROM ekpo AS e
    INNER JOIN +plants AS m
      ON e~werks = m~werks
    GROUP BY e~werks )
SELECT werks, menge
  FROM +orders_by_plant
  ORDER BY werks
  INTO TABLE @DATA(lt_sums).

cl_demo_output=>display( lt_sums ).
The first table expression, +plants, represents your internal table; the second, +orders_by_plant, holds the quantity totals selected for those plants; and the last query is the final output query.

PostgreSQL, R and timestamps with no time zone

I am reading a big CSV (>1 GB; big for me!). It contains a timestamp field.
I read it (100 rows to start with) with fread from the excellent data.table package.
ddfr <- fread(input="~/file1.csv",nrows=100, header=T)
Problem 1 (RESOLVED): the timestamp fields (called "ts" and "update"), e.g. "02/12/2014 04:40:00 AM", are read as strings. I convert the fields back to timestamps with the lubridate function mdy_hms. Splendid.
ddfr$ts <- mdy_hms(ddfr$ts)
Problem 2 (NOT RESOLVED): the timestamp is created with a time zone, as per POSIXlt.
How do I create a timestamp with NO TIME ZONE in R? Is it possible?
Now I use another great (new) package, PivotalR, to write the data frame to PostgreSQL 9.3 using as.db.data.frame. It works like a charm.
x <- as.db.data.frame(ddfr, table.name= "tbl1", conn.id = 1)
Problem 3 (NOT RESOLVED): As the original dataframe timestamp fields had time zones, a table is created with the fields "timestamp with time zone". Ultimately the data needs to be stored in a table with fields configured as "timestamp without time zone".
But in my table in Postgres the data is stored as "2014-02-12 04:40:00.0", where I take the .0 at the end to be the UTC offset. I think I need to have "2014-02-12 04:40:00".
I tried
ALTER TABLE tbl ALTER COLUMN ts type timestamp without time zone;
Then I copied across. While Postgres accepts the ALTER COLUMN command, when I try to copy (using INSERT INTO tbl2 SELECT ...) I get an error:
"column "ts" is of type timestamp without time zone but expression is of type text
Hint: You will need to rewrite or cast the expression."
Clearly the .0 at the end is not liked (but then why does Postgres accept the ALTER COLUMN? Who knows!).
I tried to do what the error suggested using CAST in the INSERT INTO query:
INSERT INTO tbl2 SELECT CAST(ts as timestamp without time zone) FROM tbl1
But I get the same error (including the suggestion to use CAST, aargh!).
The table directly created by PivotalR (based on the dataframe) has this CREATE script:
CREATE TABLE tbl2
(
businessid integer,
caseno text,
ts timestamp with time zone
)
WITH (
OIDS=FALSE
);
ALTER TABLE tbl1
OWNER TO mydb;
The table I'm inserting into has this CREATE script:
CREATE TABLE tbl1
(
id integer NOT NULL DEFAULT nextval('bus_seq'::regclass),
businessid character varying,
caseno character varying,
ts timestamp without time zone,
updated timestamp without time zone,
CONSTRAINT busid_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
ALTER TABLE tbl1
OWNER TO postgres;
My apologies for the convoluted explanation, but potentially a solution could be found at any step in the chain, so I preferred to put all my steps in one question. I am sure there has to be a simpler method...
I think you're confused about copying data between tables.
INSERT INTO ... SELECT without a column list expects the columns from source and destination to be the same. It doesn't magically match up columns by name; it just assigns columns from the SELECT to the INSERT from left to right until it runs out of columns, at which point any remaining columns are assumed to be null. So your query:
INSERT INTO tbl2 SELECT ts FROM tbl1;
isn't doing this:
INSERT INTO tbl2(ts) SELECT ts FROM tbl1;
it's actually picking the first column of tbl2, which is businessid, so it's really attempting to do:
INSERT INTO tbl2(businessid) SELECT ts FROM tbl1;
which is clearly nonsense, and no casting will fix that.
(Your error in the original question doesn't match your tables and queries, so the details might be different as you've clearly made a mistake in mangling/obfuscating your tables or posted a newer version of the tables than the error. The principle remains.)
It's generally a really bad idea to assume your table definitions won't change and that column order will stay the same, so always be explicit about columns. In this case I think your intention might actually have been:
INSERT INTO tbl2(businessid, caseno, ts)
SELECT CAST(businessid AS integer), caseno, ts
FROM tbl1;
Note the cast, because the type of businessid is different between the two tables.
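As for the trailing .0 that looked suspicious: it is fractional seconds, not a UTC offset, and it casts to timestamp without time zone cleanly once the column lists line up. A quick check:
SELECT CAST('2014-02-12 04:40:00.0' AS timestamp without time zone);
-- 2014-02-12 04:40:00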
