Simple Examples to Use Insert_Rows DB hook in Airflow - airflow

Can someone help me with simple examples to use Insert_Rows DB hook in Airflow?
I have a requirement to make an insert into a table.
How do I do that and make commit to the database.
Starting with airflow, so simple examples will help understand in a better way.

There are many ways; it depends on your preferred approach.
Based on your description, I think the simplest is to use a database operator plus SQL. It needs solid database administration experience and a little Airflow experience. For example:
process_order_fact = PostgresOperatorWithTemplatedParams(
    task_id='process_order_fact',
    postgres_conn_id='postgres_dwh',
    sql='process_order_fact.sql',
    parameters={"window_start_date": "{{ ds }}", "window_end_date": "{{ tomorrow_ds }}"},
    dag=dag,
    pool='postgres_dwh')
The code above was copied from https://gtoonstra.github.io/etl-with-airflow/etlexample.html
Good Luck.

Here is another method. Suppose you read rows from database A and want to insert them into a similar database B.
Here is an example of INSERT:
cursor.execute("SELECT * FROM A WHERE ID > 5")
connection_to_B.insert_rows(table="B", rows=cursor)
UPSERT:
cursor.execute("SELECT * FROM A WHERE Id > 5")
connection_to_B.insert_rows(table="B",
                            rows=cursor,
                            replace=True,
                            replace_index="id",
                            target_fields=['Id','memberId'])
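To see the INSERT versus UPSERT behaviour without an Airflow installation, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. It is not Airflow's API: the helper copy_rows, the in-memory databases, and the table layouts are all assumptions made for illustration. As far as I know, Airflow's insert_rows does the commit for you (it takes a commit_every argument), whereas this sketch commits explicitly.

```python
import sqlite3

def copy_rows(src, dst, replace=False):
    """Copy rows with id > 5 from table A (src) into table B (dst).

    Hypothetical helper: replace=True uses SQLite's REPLACE INTO, which
    overwrites a row that has the same primary key (a simple upsert).
    """
    rows = src.execute("SELECT id, memberId FROM A WHERE id > 5").fetchall()
    verb = "REPLACE" if replace else "INSERT"
    dst.executemany(f"{verb} INTO B (id, memberId) VALUES (?, ?)", rows)
    dst.commit()  # make the inserted rows permanent
    return len(rows)

# Source database A with three rows, two of which match id > 5.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, memberId TEXT)")
src.executemany("INSERT INTO A VALUES (?, ?)", [(5, "m5"), (6, "m6"), (7, "m7")])

# Target database B already holds an outdated row with id 6.
dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE B (id INTEGER PRIMARY KEY, memberId TEXT)")
dst.execute("INSERT INTO B VALUES (6, 'stale')")
dst.commit()

copied = copy_rows(src, dst, replace=True)
result = dst.execute("SELECT id, memberId FROM B ORDER BY id").fetchall()
print(copied, result)
```

With replace=False the same call would fail on the existing id 6 because of the primary key, which is the case the UPSERT variant is meant to handle.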

Related

Error with SQLite query, What am I missing?

I've been attempting to increase my knowledge by trying out some challenges. I've been going at this for a solid two weeks now and have finished most of the challenge, but this one part remains. The error is shown below; what am I not understanding?
Error in sqlite query: update users set last_browser= 'mozilla' + select sql from sqlite_master'', last_time= '13-04-2019' where id = '14'
edited for clarity:
I'm trying a CTF challenge and I'm completely new to this kind of thing so I'm learning as I go. There is a login page with test credentials we can use for obtaining many of the flags. I have obtained most of the flags and this is the last one that remains.
After I login on the webapp with the provided test credentials, the following messages appear: this link
The question for the flag is "What value is hidden in the database table secret?"
So from the previous image, I have attempted to use SQL injection to obtain the value. This is done by using Burp Suite and attempting to inject through the User-Agent header.
I have tried many variants of the injection attempt shown above. I'm struggling to find out where I am going wrong, especially since the second single quote is added automatically in the query. I've gone through the SQLite documentation and examples of SQL injection, but I cannot seem to understand what I am doing wrong or how to get this to work.
A subquery such as select sql from sqlite_master should be enclosed in brackets.
So you'd want
update user set last_browser= 'mozilla' + (select sql from sqlite_master''), last_time= '13-04-2019' where id = '14';
Although I don't think that will achieve what you want, which isn't clear. A simple test results in :-
You may want a concatenation of the strings, so instead of + use ||. e.g.
update user set last_browser= 'mozilla' || (select sql from sqlite_master''), last_time= '13-04-2019' where id = '14';
In which case you'd get something like :-
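The two operators can be compared directly with a quick check from Python's sqlite3 module. This is only a sketch: a literal string 'abc' stands in for the subquery result, and no users table is involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# '+' is numeric addition in SQLite; '||' is string concatenation.
plus_result, concat_result = conn.execute(
    "SELECT 'mozilla' + 'abc', 'mozilla' || 'abc'"
).fetchone()

print(plus_result)    # 0, because non-numeric strings are treated as 0
print(concat_result)  # mozillaabc
```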
Thanks for everyone's input, I've worked this out.
The sql query was set up like this:
update users set last_browser= '$user-agent', last_time= '$current_date' where id = '$id_of_user'
I edited the User-Agent with Burp Suite to be:
Mozilla', last_browser=(select sql from sqlite_master where type='table' limit 0,1), last_time='13-04-2019
Iterating with that, I found all the tables, columns and flags. It was rather time-consuming, but I could not find a way to optimise it.
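Why that payload works can be reproduced locally. The sketch below assumes the application builds the query by plain string substitution, as the template above suggests; the table layouts and the in-memory database are made up for the demonstration. It also shows that the same payload is harmless when the value is bound as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, loosely modelled on the challenge.
conn.execute("CREATE TABLE users (id TEXT, last_browser TEXT, last_time TEXT)")
conn.execute("CREATE TABLE secret (flag TEXT)")
conn.execute("INSERT INTO users VALUES ('14', '', '')")

template = ("update users set last_browser= '{ua}', "
            "last_time= '{date}' where id = '{uid}'")
payload = ("Mozilla', last_browser=(select sql from sqlite_master "
           "where type='table' limit 0,1), last_time='13-04-2019")

# Vulnerable: the payload closes the string literal and adds its own
# assignments. SQLite keeps the rightmost assignment to a repeated column.
conn.execute(template.format(ua=payload, date="13-04-2019", uid="14"))
leaked = conn.execute(
    "SELECT last_browser FROM users WHERE id = '14'").fetchone()[0]

# Safe: with bound parameters the payload is stored as ordinary text.
conn.execute(
    "update users set last_browser = ?, last_time = ? where id = ?",
    (payload, "13-04-2019", "14"),
)
stored = conn.execute(
    "SELECT last_browser FROM users WHERE id = '14'").fetchone()[0]

print(leaked)
print(stored == payload)
```

Changing limit 0,1 to limit 1,1, limit 2,1 and so on walks through the remaining sqlite_master rows one at a time, which matches the iteration described above.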

How to prepare a statement from the CLI interpreter?

How does one prepare a statement from the SQLite CLI? I have found the page Compiling An SQL Statement but it is geared more towards the ODBC interface, not the CLI interpreter. I'm hoping for something akin to the following:
sqlite> pq = prepare(SELECT * FROM Users WHERE username=?)
sqlite> run(pq, 'jeffatwood')
0 | jeffatwood | hunter2 | admin
sqlite>
Does the SQLite CLI have any such functionality? Note that I am not referring to the Bash CLI but rather SQLite's CLI interpreter or the excellent LiteCLI alternative.
Perhaps SQL Parameters using named parameters would do the trick
sqlite> .param set :user 'jeffatwood'
sqlite> select * from Users where username = :user;
should return the desired row.
The CLI was not designed for this. For that you must use an SQLite API from a programming language.
You may also write a batch/shell file to wrap the CLI call.
E.g., on Windows, a file named User.bat like the following:
@SQLITE3.EXE some.db "SELECT * FROM Users WHERE username='%~1'"
may be called like this:
User "jeffatwood"
and will produce the desired result.
EDIT:
About prepared/compiled statements: with those you can bind parameters, fetch query results row by row, and repeat the same command faster.
The sqlite3 CLI tool wouldn't gain anything from them:
all parameters must be typed into the SQL statement, making binding useless;
all query rows are returned at once, so there is no need to fetch row by row;
repeated commands must be retyped, so precompiled statements would bring only a small speed improvement.
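As a minimal sketch of the API route mentioned above, Python's built-in sqlite3 module lets you write the statement once with a ? placeholder and bind a different value on each call. The Users table and its columns are guessed from the sample output in the question, and the run helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout, based on the sample row in the question.
conn.execute(
    "CREATE TABLE Users (id INTEGER, username TEXT, password TEXT, role TEXT)")
conn.execute("INSERT INTO Users VALUES (0, 'jeffatwood', 'hunter2', 'admin')")

QUERY = "SELECT * FROM Users WHERE username = ?"

def run(connection, username):
    """Execute the parameterized query with one bound value."""
    return connection.execute(QUERY, (username,)).fetchall()

rows = run(conn, "jeffatwood")
missing = run(conn, "nobody")
print(rows)
print(missing)
```

The module keeps an internal cache of compiled statements, so reusing the same SQL text should avoid recompiling it on every call.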

Oracle 11g data pump 10 column limit

I am using an Oracle data pump to do a schema "rename." There is a primary key column on all (2000) tables. For example, I need to run this on all tables:
update mytable set mykey='foo2' where mykey='foo';
I would use the remap_data option of expdp to do this. The problem is that there are some tables where I would need to do the rename on 10+ columns. Has anyone had a problem like this and found a way to handle it?
Previously, I had tried using "Create Table As." The problem would be having to recreate the schema structure for all of the tables (views/triggers/grants/indexes/constraints). I am aware of the DBMS_METADATA.GET_DDL package. Offhand, doing a diff of the database schema before and after and recreating the diffs seems ugly.
I have also tried doing inserts on the table without any constraints or indexes, so I would only have to re-enable constraints and recreate the indexes, but I would like to try something faster.
I am using Oracle 11.2.0.3.0.
If I understand correctly, your real problem (or goal) is to 'RENAME' a schema.
You chose to export and import with Oracle Data Pump, using a different name on import to achieve the rename.
Then you DROP the old schema (if you consider it redundant).
If this is correct, here are the steps you can take to achieve your goal. I did it successfully on my DEV environment. All objects (including PKs and FKs) were imported successfully.
-- Export RMCORE_QA
expdp DIRECTORY=DMPDIR DUMPFILE=RMCORE_QA.dmp SCHEMAS='RMCORE_QA' LOG=RMCORE_QA_EXP_DP.lst
-- Import using RMCORE_QA3
impdp DIRECTORY=DMPDIR DUMPFILE=RMCORE_QA.dmp REMAP_SCHEMA='RMCORE_QA:RMCORE_QA3' SCHEMAS='RMCORE_QA' LOG=RMCORE_QA_IMP_DP.lst TRANSFORM=OID:N
You can also compare objects between the two schemas with:
SELECT OBJECT_NAME, STATUS, object_type FROM dba_objects WHERE owner LIKE 'RMCORE_QA'
MINUS
select OBJECT_NAME, STATUS, object_type from dba_objects where owner like 'RMCORE_QA3';
HTH. Let me know if I did not understand your problem.

SQL Server 2005 - Pass In Name of Table to be Queried via Parameter

Here's the situation. Due to the design of the database I have to work with, I need to write a stored procedure in such a way that I can pass in the name of the table to be queried against, if at all possible. The program in question does its processing by jobs, and each job gets its own table created in the database, i.e. table-jobid1, table-jobid2, table-jobid3, etc. Unfortunately, there's nothing I can do about this design; I'm stuck with it.
However, now, I need to do data mining against these individualized tables. I'd like to avoid doing the SQL in the code files at all costs if possible. Ideally, I'd like to have a stored procedure similar to:
SELECT *
FROM @TableName AS tbl
WHERE @Filter
Is this even possible in SQL Server 2005? Any help or suggestions would be greatly appreciated. Alternate ways to keep the SQL out of the code behind would be welcome too, if this isn't possible.
Thanks for your time.
The best solution I can think of is to build your SQL in the stored proc, such as:
DECLARE @query nvarchar(max)
SET @query = 'SELECT * FROM ' + @TableName + ' as tbl WHERE ' + @Filter
exec(@query)
Probably not an ideal solution, but it works.
The best answer I can think of is to build a view that unions all the tables together, with an id column in the view telling you where the data in the view came from. Then you can simply pass that id into a stored proc which will go against the view. This is assuming that the tables you are looking at all have identical schema.
example:
create view test1 as
select * , 'tbl1' as src
from [job-1]
union all
select * , 'tbl2' as src
from [job-2]
union all
select * , 'tbl3' as src
from [job-3]
Now you can select * from test1 where src = 'tbl3' and you will only get records from the table job-3. (The brackets are needed because the table names contain a hyphen.)
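The view idea can be tried out quickly with Python's sqlite3 module standing in for SQL Server. This is only a sketch: the single value column and the sample rows are invented, and SQLite quotes hyphenated names with double quotes rather than square brackets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three per-job tables with an identical (hypothetical) schema.
for n in (1, 2, 3):
    conn.execute(f'CREATE TABLE "job-{n}" (value INTEGER)')
    conn.execute(f'INSERT INTO "job-{n}" VALUES (?)', (n * 10,))

# One view over all of them, tagging each row with its source table.
conn.execute("""
    CREATE VIEW test1 AS
    SELECT *, 'tbl1' AS src FROM "job-1"
    UNION ALL
    SELECT *, 'tbl2' AS src FROM "job-2"
    UNION ALL
    SELECT *, 'tbl3' AS src FROM "job-3"
""")

# The table choice is now an ordinary bound parameter.
rows = conn.execute(
    "SELECT value, src FROM test1 WHERE src = ?", ("tbl3",)).fetchall()
print(rows)
```

The obvious cost is that the view has to be recreated whenever a new job table appears.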
This would be a meaningless stored proc. Select from some table using some parameters? You are basically defining the entire query again in whatever you are using to call this proc, so you may as well generate the sql yourself.
The only reason I would write a proc that builds dynamic SQL is if you want something you can change without redeploying your codebase.
But, in this case, you are just SELECT *'ing. You can't define the columns, where clause, or order by differently since you are trying to use it for multiple tables, so there is no meaningful change you could make to it.
In short: it's not even worth doing. Just slop down your table specific sprocs or write your sql in strings (but make sure it's parameterized) in your code.
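For the "SQL in strings, but parameterized" route, a table name cannot be bound as a parameter, so one common approach is to check it against the tables that actually exist and bind only the filter value. The sketch below uses Python's sqlite3 as a stand-in for SQL Server; the function query_job_table, the value column, and the sample data are assumptions for illustration.

```python
import sqlite3

def query_job_table(conn, job_id, min_value):
    """Query one per-job table, chosen by job id (hypothetical helper)."""
    # int() rejects anything that is not a plain number.
    table = f"table-jobid{int(job_id)}"
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown job table: {table}")
    # The table name is validated above; the filter value is bound.
    sql = (f'SELECT * FROM "{table}" AS tbl '
           'WHERE tbl.value >= ? ORDER BY tbl.value')
    return conn.execute(sql, (min_value,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "table-jobid1" (value INTEGER)')
conn.execute('CREATE TABLE "table-jobid2" (value INTEGER)')
conn.executemany('INSERT INTO "table-jobid2" VALUES (?)', [(5,), (10,), (25,)])

rows = query_job_table(conn, 2, 10)
print(rows)
```

Asking for a job id with no matching table raises ValueError instead of reaching the database.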

Fun with Database Triggers and Recursion in RDB

I had a problem this week (which thankfully I've solved in a much better way);
I needed to keep a couple of fields in a database constant.
So, I knocked up a script to place a Trigger on the table, that would set the value back to a preset number when either an insert, or update took place.
The database is RDB running on VMS (but I'd be interested to know the similarities for SQL Server).
Here are the triggers:
drop trigger my_ins_trig;
drop trigger my_upd_trig;
!
!++ Create triggers on MY_TABLE
CREATE TRIGGER my_ins_trig AFTER INSERT ON my_table
WHEN somefield = 2
(UPDATE my_table table1
SET table1.field1 = 0.1,
table1.field2 = 1.2
WHERE my_table.dbkey = table1.dbkey)
FOR EACH ROW;
CREATE TRIGGER my_upd_trig AFTER UPDATE ON my_table
WHEN somefield = 2
(UPDATE my_table table1
SET table1.field1 = 0.1,
table1.field2 = 1.2
WHERE my_table.dbkey = table1.dbkey)
FOR EACH ROW;
Question Time
I would expect this to form an infinite recursion, but it doesn't seem to?
Can anyone explain to me how RDB deals with this one way or another...or how other databases deal with it.
[NOTE: I know this is an awful approach but various problems and complexities meant that even though this is simple in the code - it couldn't be done the best/easiest way. Thankfully I haven't implemented it in this way but I wanted to ask the SO community for its thoughts on this. ]
Thanks in advance
edit: It seems Oracle RDB simply doesn't execute nested triggers that result in recursion. From the paper: 'A trigger can nest other triggers as long as recursion does not take place.' I'll leave the rest of the answer here for anyone else wondering about recursive triggers in other DBs.
Well, firstly, to answer your question: it depends on the database. It's entirely possible that trigger recursion is turned off on the instance you are working on. As you can imagine, trigger recursion could cause all kinds of chaos if handled incorrectly, so SQL Server allows you to disable it altogether.
Secondly, I would suggest that perhaps there is a better way to get this functionality without triggers. You can get view-based row-level security with SQL Server. The same outcome can be achieved with Oracle VPDs.
Alternatively, if it's configuration values you are trying to protect, I would group them all into a single table and apply permissions on that (simpler than row-based security).
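As one more data point on how other databases behave, SQLite also leaves recursive triggers switched off unless you enable them with PRAGMA recursive_triggers. The sketch below, run through Python's sqlite3 module, recreates a trigger similar to my_upd_trig; the id column and the sample row are assumptions, and the trigger syntax is SQLite's rather than RDB's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Set explicitly so the result does not depend on build defaults.
conn.execute("PRAGMA recursive_triggers = OFF")

conn.execute("""
    CREATE TABLE my_table (
        id INTEGER PRIMARY KEY,
        somefield INTEGER,
        field1 REAL,
        field2 REAL
    )
""")

# The trigger updates the same table it is defined on.
conn.execute("""
    CREATE TRIGGER my_upd_trig AFTER UPDATE ON my_table
    WHEN NEW.somefield = 2
    BEGIN
        UPDATE my_table SET field1 = 0.1, field2 = 1.2 WHERE id = NEW.id;
    END
""")

conn.execute("INSERT INTO my_table VALUES (1, 2, 9.9, 9.9)")
conn.execute("UPDATE my_table SET field1 = 5.0 WHERE id = 1")

row = conn.execute(
    "SELECT field1, field2 FROM my_table WHERE id = 1").fetchone()
print(row)
```

The trigger's own UPDATE does not fire the trigger again, so the statement finishes and the two fields end up reset to their fixed values.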
