Airflow: Redshift: passing a dynamic value into a SQL file

Requirement: pass a dynamic date, abc_date, into the SQL file from the DAG.
Error: redshift_connector.error.InterfaceError: Only %s and %% are supported in the query.
Below is the query in the SQL file:
SET abc_date = (SELECT %(abc_date) s :: date);
unload (
'select distinct
a,
b
from table_A
where date_trunc(\'day\', date_col) = '$abc_date'
)
'
) TO 's3://{{var.value.bucket}}/redshift/' IAM_ROLE '{{var.value.redshift_iam}}' JSON GZIP
DAG code:
def get_params_list(dq_start_date, dq_end_date):
    start_dt = datetime.now().date() - timedelta(days=2)
    end_dt = datetime.now().date() + timedelta(days=1)
    date_list = []
    for i in range((end_dt - start_dt).days):
        abc_date = datetime.strftime(start_dt + timedelta(days=0 + i), "%Y-%m-%d")
        date_list.append(
            {
                "abc_date": abc_date
            }
        )
    return date_list
redshif_sql = RedshiftSQLOperator.partial(
    task_id=f"redshif_sql",
    sql='dq_redshift_data.sql',
    redshift_conn_id='redshift_conn_id',
).expand (
    parameters = get_params_list(start_date, end_date)
)

Related

- ORA-01407: cannot update ("PSOWNER"."PS_VCHR_LINE_STG"."CLASS_FLD") to NULL Failed SQL stmt: UPDATE

Error Message:- ORA-01407: cannot update ("PSOWNER"."PS_VCHR_LINE_STG"."CLASS_FLD") to NULL Failed SQL stmt: UPDATE
When I generate the report, it says No Success in PeopleSoft.
Below is the code for the UPDATE statement.
Please help me overcome this problem.
UPDATE %Table(VCHR_LINE_STG) A
SET A.CLASS_FLD = (
SELECT SUBSTR(DCP_FLD49
,3
,4)
FROM %Table(DCP_AP11_TMP2)
WHERE VCHR_BLD_KEY_C1 = A.VCHR_BLD_KEY_C1
AND DCP_FLD34= A.VOUCHER_LINE_NUM),A.BUSINESS_UNIT =(
SELECT D.CF_ATTRIB_VALUE
FROM %Table(CF_ATTRIB_TBL) D
, %Table(DEPT_TBL) E
WHERE ( D.EFFDT = (
SELECT MAX(D_ED.EFFDT)
FROM %Table(CF_ATTRIB_TBL) D_ED
WHERE D.SETID = D_ED.SETID
AND D.CHARTFIELD_VALUE = D_ED.CHARTFIELD_VALUE
AND D_ED.EFFDT <= SYSDATE)
AND E.EFFDT=D.EFFDT
AND D.CHARTFIELD_VALUE = (
SELECT M.DCP_FLD41
FROM %Table(DCP_AP11_TMP2) M
WHERE M.VCHR_BLD_KEY_C1 = A.VCHR_BLD_KEY_C1
AND M.DCP_FLD34= A.VOUCHER_LINE_NUM)
AND D.SETID = E.SETID
AND D.SETID = 'DCPID'
AND D.CF_ATTRIBUTE='AP_BUSN_UNIT'
AND E.EFFDT = (
SELECT MAX(E_ED.EFFDT)
FROM %Table(DEPT_TBL) E_ED
WHERE E.SETID = E_ED.SETID
AND E.DEPTID = E_ED.DEPTID
AND E_ED.EFFDT <= SYSDATE)
AND E.DEPTID = D.CHARTFIELD_VALUE
AND E.SETID = D.SETID
AND E.EFF_STATUS='A')),A.BUSINESS_UNIT_GL=(
SELECT D.CF_ATTRIB_VALUE
FROM %Table(CF_ATTRIB_TBL) D
, %Table(DEPT_TBL) E
WHERE ( D.EFFDT = (
SELECT MAX(D_ED.EFFDT)
FROM %Table(CF_ATTRIB_TBL) D_ED
WHERE D.SETID = D_ED.SETID
AND D.CHARTFIELD_VALUE = D_ED.CHARTFIELD_VALUE
AND D_ED.EFFDT <= SYSDATE)
AND E.EFFDT=D.EFFDT
AND D.CHARTFIELD_VALUE = (
SELECT M.DCP_FLD41
FROM %Table(DCP_AP11_TMP2) M
WHERE M.VCHR_BLD_KEY_C1 = A.VCHR_BLD_KEY_C1
AND M.DCP_FLD34= A.VOUCHER_LINE_NUM)
AND D.SETID = E.SETID
AND D.SETID = 'DCPID'
AND D.CF_ATTRIBUTE='GL_BUSN_UNIT'
AND E.EFFDT = (
SELECT MAX(E_ED.EFFDT)
FROM %Table(DEPT_TBL) E_ED
WHERE E.SETID = E_ED.SETID
AND E.DEPTID = E_ED.DEPTID
AND E_ED.EFFDT <= SYSDATE)
AND E.DEPTID = D.CHARTFIELD_VALUE
AND E.SETID = D.SETID
AND E.EFF_STATUS='A'))
WHERE EXISTS (
SELECT 'X'
FROM %Table(DCP_AP11_TMP2)
WHERE VCHR_BLD_KEY_C1 = A.VCHR_BLD_KEY_C1
AND VOUCHER_LINE_NUM = A.VOUCHER_LINE_NUM)
The code above is the UPDATE statement in the App Engine program.
Thanks in advance.
The sub-selects populating each field are not returning values, so the database is trying to update the field to NULL. In PeopleSoft, null values are not allowed in character fields. A field with no value needs to be set to a single space, like ' '.
You will need to wrap each sub-select in a COALESCE() function, with a non-null alternative for when the sub-select does not return a value. Character fields need to be set to ' ' and numbers to 0 if no value is returned. Date fields can be null. Here is an example using the first few lines of the code provided.
UPDATE %Table(VCHR_LINE_STG) A
SET A.CLASS_FLD =
COALESCE(
(SELECT SUBSTR(DCP_FLD49,3,4)
FROM %Table(DCP_AP11_TMP2)
WHERE VCHR_BLD_KEY_C1 = A.VCHR_BLD_KEY_C1
AND DCP_FLD34= A.VOUCHER_LINE_NUM), ' ')
, A.BUSINESS_UNIT_GL=(SELECT
...
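The effect is easy to reproduce outside PeopleSoft. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for the Oracle database, and the tables stg and src and their columns are hypothetical. A correlated sub-select that matches no row yields NULL, which violates the NOT NULL column; wrapping it in COALESCE substitutes a single space instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg (k INTEGER, class_fld TEXT NOT NULL DEFAULT ' ');
    CREATE TABLE src (k INTEGER, fld TEXT);
    INSERT INTO stg (k) VALUES (1), (2);   -- row 2 has no match in src
    INSERT INTO src VALUES (1, 'xxABCDyy');
""")

def update_class_fld(conn, use_coalesce):
    # Correlated sub-select: returns NULL for stg rows without a src match.
    subselect = "(SELECT substr(fld, 3, 4) FROM src WHERE src.k = stg.k)"
    expr = f"COALESCE({subselect}, ' ')" if use_coalesce else subselect
    conn.execute(f"UPDATE stg SET class_fld = {expr}")

# Without COALESCE the update tries to store NULL and is rejected.
try:
    update_class_fld(conn, use_coalesce=False)
    plain_update_failed = False
except sqlite3.IntegrityError:
    plain_update_failed = True

# With COALESCE the unmatched row falls back to a single space.
update_class_fld(conn, use_coalesce=True)
rows = conn.execute("SELECT k, class_fld FROM stg ORDER BY k").fetchall()
print(plain_update_failed, rows)
```

The same wrapping would need to be applied to every sub-select in the SET list, since any one of them returning no row is enough to trigger the error.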

FakeFunction Results based on Test

I'm using tSQLt to unit test a stored procedure. This stored proc joins to a table-valued function; the function takes no parameters, and the results are filtered via the join's ON clause.
I'm writing multiple tests for the stored proc. Is there a way to fake the function so that it returns different results depending on the test being run?
The only solution I can think of is to create a fake per test, which is possible but more than a little clunky.
I imagine an ideal solution would be some sort of variable exposed in tSQLt that would let me determine which test I'm in, so I could use a CASE statement or something similar.
I use the following procedure for that. It is not ideal, but it works:
CREATE PROCEDURE [tSQLt].[FakeFunction2]
@FunctionName VARCHAR(200)
, @SchemaName VARCHAR(200) = 'dbo'
, @tmpTableName VARCHAR(200)
AS
BEGIN
DECLARE @Params VARCHAR(2000);
DECLARE @NewName VARCHAR(MAX) = @FunctionName + REPLACE(CAST(NEWID() AS VARCHAR(100)), '-', '');
DECLARE @FunctionNameWithSchema VARCHAR(MAX) = @SchemaName + '.' + @FunctionName;
DECLARE @RenameCmd VARCHAR(MAX) = 'EXEC sp_rename ''' + @FunctionNameWithSchema + ''', ''' + @NewName + ''';';
DECLARE @newTbleName VARCHAR(200) = @SchemaName + '.tmp' + REPLACE(CAST(NEWID() AS VARCHAR(100)), '-', '');
DECLARE @newTblStmt VARCHAR(2000) = 'SELECT * INTO ' + @newTbleName + ' FROM ' + @tmpTableName;
EXEC tSQLt.SuppressOutput @command = @newTblStmt;
SELECT @Params = p.params
FROM
( SELECT DISTINCT ( SELECT p1.name + ' ' + type1.name + b.brk + ',' AS [text()]
FROM sys.types type1
JOIN sys.parameters p1 ON p1.system_type_id = type1.system_type_id
CROSS APPLY
( SELECT CASE WHEN type1.name LIKE '%char' OR type1.name = 'varbinary' THEN
REPLACE(
'(' + CAST(p1.max_length AS VARCHAR(5)) + ')', '-1', 'MAX')
WHEN type1.name IN ('decimal', 'numeric') THEN
'(' + CAST(p1.precision AS VARCHAR(5)) + ', '
+ CAST(p1.scale AS VARCHAR(5)) + ')'
WHEN type1.name IN ('datetime2') THEN
'(' + CAST(p1.scale AS VARCHAR(5)) + ')'
ELSE ''
END AS brk) b
WHERE p1.object_id = p.object_id
ORDER BY p1.parameter_id
FOR XML PATH('')) [parameters]
FROM sys.objects AS o
LEFT JOIN sys.parameters AS p ON p.object_id = o.object_id
LEFT JOIN sys.types AS t ON t.system_type_id = p.system_type_id
WHERE o.name = @FunctionName AND o.schema_id = SCHEMA_ID(@SchemaName)) [Main]
CROSS APPLY
(SELECT LEFT(Main.[parameters], LEN(Main.[parameters]) - 1) params) AS p;
EXEC tSQLt.SuppressOutput @command = @RenameCmd;
DECLARE @newFunctionStmt VARCHAR(MAX) = '';
SET @newFunctionStmt = 'CREATE FUNCTION [' + @SchemaName + '].[' + @FunctionName + '](' + COALESCE(@Params,'') + ')';
SET @newFunctionStmt = @newFunctionStmt + ' RETURNS TABLE AS RETURN (SELECT * FROM ' + @newTbleName + ');';
EXEC tSQLt.SuppressOutput @command = @newFunctionStmt;
END;
and usage:
INSERT INTO #table
(col1
, col2
, col3)
VALUES
('a', 'b', 'c'),
('d', 'e', 'f');
EXEC tSQLt.FakeFunction2 @FunctionName = 'function_name'
, @SchemaName = 'dbo'
, @tmpTableName = '#table';
Now, whatever parameters are passed to that function, it will always return the rows from the #table temp table.
I thought of one potential solution.
I create a table within the test class schema and populate it with the results I wish to be returned per test.
CREATE TABLE testcalass.fakefunction_Results
(
ID INT,
Value NUMERIC(12, 5)
)
GO
CREATE FUNCTION testcalass.fakefunction()
RETURNS @results TABLE
(
ID INT,
Value NUMERIC(12, 5)
)
AS
BEGIN
INSERT INTO @results
SELECT ID, Value FROM testcalass.fakefunction_Results
RETURN
END
GO
So basically, I can populate this function's results at the top of each test, during the assemble section.

How to delete columns where all values are null?

How to delete columns with all null values in SQLite? I've got nearly 200 columns and don't want to list them all.
For SQLite you will want to try something along the lines of:
DELETE FROM myTable WHERE myColumn IS NULL OR trim(myColumn) = '';
You have to use another language to automate it.
## pip install sqlite_utils
import argparse
import sqlite_utils
def tracer(sql, params) -> None:
    print("SQL: {} - params: {}".format(sql, params))
def connect(args) -> sqlite_utils.Database:
    db = sqlite_utils.Database(args.database, tracer=tracer if args.verbose >= 2 else None)
    db.execute("PRAGMA main.cache_size = 8000")
    return db
def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("database")
    parser.add_argument("table")
    parser.add_argument("--verbose", "-v", action="count", default=0)
    args = parser.parse_args()
    return args
def remove_empty_cols() -> None:
    args = parse_args()
    db = connect(args)
    total_rows = db[args.table].count
    for col in [col.name for col in db[args.table].columns if col.type == 'TEXT']:
        details = db[args.table].analyze_column(col, total_rows=total_rows)
        if details.num_null == total_rows and details.num_distinct == 0:
            with db.conn:
                db.execute(f'alter table "{args.table}" drop column "{col}"')
if __name__ == "__main__":
    remove_empty_cols()
Run like this:
python remove_empty_cols.py video.db reddit_posts
Using a subquery like this did not seem to work:
SELECT 'alter table reddit_posts drop column ' || name || ';' ddl
FROM pragma_table_info('reddit_posts') t
WHERE "notnull"=0
AND (
SELECT count(t.name) FROM reddit_posts
) = 0
If you do not want to use Python, you could run the first query below, execute each statement it generates, and then manually fill in the columns whose count is 0 in the second query:
SELECT 'select count(' || name || ') from reddit_posts;' dml
FROM pragma_table_info('reddit_posts') t
WHERE "notnull"=0 AND "type"='TEXT';
SELECT 'alter table reddit_posts drop column ' || name || ';' ddl
FROM pragma_table_info('reddit_posts') t
WHERE name IN (
...
);
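The manual steps above can also be scripted with only the standard library. This is a minimal sketch, not a tested tool: the function name all_null_columns and the posts table are made up for illustration. It counts the non-NULL values in each column and prints the ALTER TABLE statements rather than running them, since DROP COLUMN needs SQLite 3.35.0 or newer.

```python
import sqlite3

def quote_ident(name):
    # Double-quote an identifier, escaping embedded double quotes.
    return '"' + name.replace('"', '""') + '"'

def all_null_columns(conn, table):
    """Return the names of columns in `table` that hold no non-NULL value."""
    qtable = quote_ident(table)
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    names = [row[1] for row in conn.execute(f"PRAGMA table_info({qtable})")]
    empty = []
    for name in names:
        # count(column) counts only the non-NULL values of that column.
        (non_null,) = conn.execute(
            f"SELECT count({quote_ident(name)}) FROM {qtable}"
        ).fetchone()
        if non_null == 0:
            empty.append(name)
    return empty

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER, title TEXT, flair TEXT, notes TEXT);
    INSERT INTO posts VALUES (1, 'a', NULL, NULL), (2, NULL, NULL, NULL);
""")
empty_cols = all_null_columns(conn, "posts")
ddl = [f'ALTER TABLE "posts" DROP COLUMN "{c}";' for c in empty_cols]
print(ddl)
```

Note that on an empty table every column would be reported, so it is worth reviewing the generated statements before executing them.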

Stored Procedures and asp.net programmability; variable or SQL

I am trying to display a user's Lastname, Firstname --- Website,
and I need to insert a comma and space after Lastname in a GridView.
I am trying to add a CASE statement in SQL and having trouble figuring it out.
Perhaps I need to use @parameter (scalar variable?) to abstract the
memory read from the CASE statement; or my syntax is wrong and I just don't
understand.
SELECT
CASE
WHEN IsNull(people_Table.firstName, '') = ''
THEN CONCAT(people_Table.lastName, ', ', people_Table.firstName)
ELSE people_Table.lastName
END as fullName,
people_Table.website
FROM
people_Table
INNER JOIN
membership_Table on people_Table.ID = membership_Table.personID
WHERE
rectype = 'Master'
AND membershipType = 'Business'
AND expirationDate > GetDate()
ORDER BY
people_Table.lastName
Getting SQL Server error:
Msg 208, Level 16, State 1, Line 1
Invalid object name 'people_Table'.
Otherwise I suppose I should use an asp databoundevent in the template.
What is better for performance and security?
SELECT ISNULL(people_Table.lastName + ', ', '')
+ ISNULL(people_Table.firstName , '') as fullName
, people_Table.website
FROM people_Table INNER JOIN membership_Table on people_Table.ID =
membership_Table.personID
WHERE rectype = 'Master'
AND membershipType = 'Business'
AND expirationDate > GetDate()
ORDER BY people_Table.lastName
OR
SELECT COALESCE(people_Table.lastName + ', ', '')
+ COALESCE(people_Table.firstName , '') as fullName
, people_Table.website
FROM people_Table INNER JOIN membership_Table on people_Table.ID =
membership_Table.personID
WHERE rectype = 'Master'
AND membershipType = 'Business'
AND expirationDate > GetDate()
ORDER BY people_Table.lastName

How to get a list of column names

Is it possible to get a row with all column names of a table like this?
|id|foo|bar|age|street|address|
I would rather not use PRAGMA table_info(bla).
SELECT sql FROM sqlite_master
WHERE tbl_name = 'table_name' AND type = 'table'
Then parse this value with a regular expression (it's easy), which could look similar to this: [(.*?)]
Alternatively you can use:
PRAGMA table_info(table_name)
If you are using the SQLite command-line shell, run .headers on before you perform your query. You only need to do this once in a given session.
You can use PRAGMA-related commands in SQLite, as below:
pragma table_info("table_name")
--Alternatively
select * from pragma_table_info("table_name")
If you require column names like id|foo|bar|age|street|address, the query below gives you exactly that.
select group_concat(name,'|') from pragma_table_info("table_name")
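For reference, the same PRAGMA can be read from application code. The following is a small sketch using Python's built-in sqlite3 module; the helper name column_header and the people table are hypothetical. It joins the names in Python rather than with group_concat, which keeps the column order tied to the order of the PRAGMA rows.

```python
import sqlite3

def column_header(conn, table, sep="|"):
    """Build a header line such as |id|foo|bar| from PRAGMA table_info."""
    quoted = '"' + table.replace('"', '""') + '"'
    rows = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); index 1 is the name.
    names = [row[1] for row in rows]
    return sep + sep.join(names) + sep

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, foo TEXT, bar TEXT, age INTEGER)")
header = column_header(conn, "people")
print(header)
```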
Yes, you can achieve this by using the following commands:
sqlite> .headers on
sqlite> .mode column
The result of a select on your table will then look like:
id foo bar age street address
---------- ---------- ---------- ---------- ---------- ----------
1 val1 val2 val3 val4 val5
2 val6 val7 val8 val9 val10
This helps for HTML5 SQLite:
tx.executeSql('SELECT name, sql FROM sqlite_master WHERE type="table" AND name = "your_table_name";', [], function (tx, results) {
var columnParts = results.rows.item(0).sql.replace(/^[^\(]+\(([^\)]+)\)/g, '$1').split(','); ///// RegEx
var columnNames = [];
for(var i in columnParts) {
if(typeof columnParts[i] === 'string')
columnNames.push(columnParts[i].split(" ")[0]);
}
console.log(columnNames);
///// Your code which uses the columnNames;
});
You can reuse the regex in your language to get the column names.
Shorter Alternative:
tx.executeSql('SELECT name, sql FROM sqlite_master WHERE type="table" AND name = "your_table_name";', [], function (tx, results) {
var columnNames = results.rows.item(0).sql.replace(/^[^\(]+\(([^\)]+)\)/g, '$1').replace(/ [^,]+/g, '').split(',');
console.log(columnNames);
///// Your code which uses the columnNames;
});
Use a recursive query. Given
create table t (a int, b int, c int);
Run:
with recursive
a (cid, name) as (select cid, name from pragma_table_info('t')),
b (cid, name) as (
select cid, '|' || name || '|' from a where cid = 0
union all
select a.cid, b.name || a.name || '|' from a join b on a.cid = b.cid + 1
)
select name
from b
order by cid desc
limit 1;
Alternatively, just use group_concat:
select '|' || group_concat(name, '|') || '|' from pragma_table_info('t')
Both yield:
|a|b|c|
The result set of a query in PHP offers a couple of functions allowing just that:
numCols()
columnName(int $column_number )
Example
$db = new SQLite3('mysqlite.db');
$table = 'mytable';
$tableCol = getColName($db, $table);
for ($i=0; $i<count($tableCol); $i++){
    echo "Column $i = ".$tableCol[$i]."\n";
}
function getColName($db, $table){
    $qry = "SELECT * FROM $table LIMIT 1";
    $result = $db->query($qry);
    $nCols = $result->numCols();
    // Collect the name of every column in the result set.
    $colName = [];
    for ($i = 0; $i < $nCols; $i++) {
        $colName[$i] = $result->columnName($i);
    }
    return $colName;
}
<?php
$db = sqlite_open('mysqlitedb');
$cols = sqlite_fetch_column_types('form name', $db, SQLITE_ASSOC);
foreach ($cols as $column => $type) {
    echo "Column: $column Type: $type\n";
}
Using @Tarkus's answer, here are the regexes I used in R:
getColNames <- function(conn, tableName) {
x <- dbGetQuery( conn, paste0("SELECT sql FROM sqlite_master WHERE tbl_name = '",tableName,"' AND type = 'table'") )[1,1]
x <- str_split(x,"\\n")[[1]][-1]
x <- sub("[()]","",x)
res <- gsub( '"',"",str_extract( x[1], '".+"' ) )
x <- x[-1]
x <- x[-length(x)]
res <- c( res, gsub( "\\t", "", str_extract( x, "\\t[0-9a-zA-Z_]+" ) ) )
res
}
Code is somewhat sloppy, but it appears to work.
Try this SQLite table schema parser; I implemented it to parse table definitions in PHP.
It returns the full definitions (unique, primary key, type, precision, not null, references, table constraints... etc)
https://github.com/maghead/sqlite-parser
The easiest way to get the column names of the most recently executed SELECT is to use the cursor's description property. A Python example:
print_me = "("
for description in cursor.description:
    print_me += description[0] + ", "
print(print_me[0:-2] + ')')
# Example output: (inp, output, reason, cond_cnt, loop_likely)
