How to delete columns where all values are null? - sqlite

How to delete columns with all null values in SQLite? I've got nearly 200 columns and don't want to list them all.

For SQLite you can try something along the lines of the following. Note that it deletes the rows (not the column) where myColumn is NULL or blank:
DELETE FROM myTable WHERE myColumn IS NULL OR trim(myColumn) = '';

You have to use another language to automate it.
# pip install sqlite_utils
import argparse
import sqlite_utils

def tracer(sql, params) -> None:
    print("SQL: {} - params: {}".format(sql, params))

def connect(args) -> sqlite_utils.Database:
    db = sqlite_utils.Database(args.database, tracer=tracer if args.verbose >= 2 else None)
    db.execute("PRAGMA main.cache_size = 8000")
    return db

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("database")
    parser.add_argument("table")
    parser.add_argument("--verbose", "-v", action="count", default=0)
    args = parser.parse_args()
    return args

def remove_empty_cols() -> None:
    args = parse_args()
    db = connect(args)
    total_rows = db[args.table].count
    for col in [col.name for col in db[args.table].columns if col.type == 'TEXT']:
        details = db[args.table].analyze_column(col, total_rows=total_rows)
        if details.num_null == total_rows and details.num_distinct == 0:
            with db.conn:
                db.execute(f'alter table "{args.table}" drop column "{col}"')

if __name__ == "__main__":
    remove_empty_cols()
Run like this:
python remove_empty_cols.py video.db reddit_posts
Using a subquery like this did not seem to work:
SELECT 'alter table reddit_posts drop column ' || name || ';' ddl
FROM pragma_table_info('reddit_posts') t
WHERE "notnull"=0
AND (
SELECT count(t.name) FROM reddit_posts
) = 0
If you do not want to use Python, you could run the first query below, execute the statements it generates, and then manually list the columns whose count is 0 in the second query:
SELECT 'select count(' || name || ') from reddit_posts;' dml
FROM pragma_table_info('reddit_posts') t
WHERE "notnull"=0 AND "type"='TEXT';
SELECT 'alter table reddit_posts drop column ' || name || ';' ddl
FROM pragma_table_info('reddit_posts') t
WHERE name IN (
...
);
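The manual two-step approach above can also be automated with Python's standard sqlite3 module instead of sqlite_utils. The following is a minimal sketch, not the answer author's method: the helper names all_null_columns and drop_statements and the demo table are made up for illustration. It relies on COUNT(column) counting only non-NULL values, and it only prints the DDL rather than running it, because ALTER TABLE ... DROP COLUMN needs SQLite 3.35.0 or newer.

```python
import sqlite3


def all_null_columns(conn, table):
    """Return the names of columns in `table` whose values are all NULL."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    empty = []
    for name in columns:
        # COUNT(col) ignores NULLs, so 0 means no row has a value in this column.
        (non_null,) = conn.execute(
            f'SELECT COUNT("{name}") FROM "{table}"'
        ).fetchone()
        if non_null == 0:
            empty.append(name)
    return empty


def drop_statements(table, columns):
    """Build (but do not run) one ALTER TABLE ... DROP COLUMN per column."""
    return [f'ALTER TABLE "{table}" DROP COLUMN "{col}"' for col in columns]


if __name__ == "__main__":
    # Hypothetical demo data: the "unused" column is never populated.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (id INTEGER, title TEXT, unused TEXT)")
    conn.executemany(
        "INSERT INTO demo (id, title) VALUES (?, ?)", [(1, "a"), (2, "b")]
    )
    for stmt in drop_statements("demo", all_null_columns(conn, "demo")):
        print(stmt)
```

Be aware that an empty table reports every column as all-NULL, so you may want to check the row count before acting on the result.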

Related

xcom value are overwritten by else statement in if/else loop

I'm looping through a folder containing some SQL files. I want to push each file as an XCom value, formatted with the specific values each query needs.
The code below mostly works, but not once I add the else statement: the 'not set' value overwrites everything.
directory = r'airflow_home/dags/sql'
for filename in os.listdir(directory):
    with open(os.path.join(directory, filename), 'r') as file:
        sqlFile = file.read()
        file.close()
    if filename == 'api_params.sql':
        query = sqlFile.format(partitioned_key,execution_date_second,partitioned_key,next_execution_date_second)
    if filename == 'create_fact_table.sql':
        query = sqlFile.format(fact_table_dest)
    if filename == 'create_geo_table.sql':
        query = sqlFile.format(fact_table_dest)
    if filename == f'{geo_type}'+'.sql':
        query = sqlFile.format(execution_date)
        filename = 'geo_query'
    if filename == 'schema_' + f'{schema}' + '.sql':
        query = sqlFile.format(fact_table_dest,raw_table_dest,execution_date,next_execution_date)
        filename = 'production_query'
    if filename == 'insert_key.sql':
        query = sqlFile.format(raw_table_dest,execution_date,next_execution_date)
    else:
        query = 'not set'
    task_instance.xcom_push(key=filename, value=query)
Can someone explain what's happening here?
You are using multiple independent if statements, which are evaluated one after the other. The else belongs only to the last if statement, so it runs for every file except insert_key.sql and overwrites any query set earlier. What you are actually looking for is elif - see the Python docs.
directory = r'airflow_home/dags/sql'
for filename in os.listdir(directory):
    with open(os.path.join(directory, filename), 'r') as file:
        sqlFile = file.read()
        file.close()
    if filename == 'api_params.sql':
        query = sqlFile.format(partitioned_key,execution_date_second,partitioned_key,next_execution_date_second)
    elif filename == 'create_fact_table.sql':
        query = sqlFile.format(fact_table_dest)
    elif filename == 'create_geo_table.sql':
        query = sqlFile.format(fact_table_dest)
    elif filename == f'{geo_type}'+'.sql':
        query = sqlFile.format(execution_date)
        filename = 'geo_query'
    elif filename == 'schema_' + f'{schema}' + '.sql':
        query = sqlFile.format(fact_table_dest,raw_table_dest,execution_date,next_execution_date)
        filename = 'production_query'
    elif filename == 'insert_key.sql':
        query = sqlFile.format(raw_table_dest,execution_date,next_execution_date)
    else:
        query = 'not set'
    task_instance.xcom_push(key=filename, value=query)
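The difference between a series of separate if statements and a single if/elif/else chain can be seen in isolation, without Airflow. This is a small sketch with made-up function names and file names, only meant to reproduce the overwriting behaviour described in the question:

```python
def classify_with_if(filename):
    """Separate if statements: the else is paired only with the last if."""
    query = "unset"
    if filename == "a.sql":
        query = "query A"
    if filename == "b.sql":
        query = "query B"
    else:
        # Runs whenever filename != "b.sql", including for "a.sql",
        # so the value assigned by the first if is overwritten.
        query = "not set"
    return query


def classify_with_elif(filename):
    """One if/elif/else chain: exactly one branch runs."""
    if filename == "a.sql":
        query = "query A"
    elif filename == "b.sql":
        query = "query B"
    else:
        # Runs only when none of the conditions above matched.
        query = "not set"
    return query


if __name__ == "__main__":
    for name in ("a.sql", "b.sql", "other.sql"):
        print(name, classify_with_if(name), classify_with_elif(name))
```

With separate if statements, "a.sql" ends up as "not set"; with the elif chain it keeps "query A".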

MS SQL Case in Where Clause testing against NULL or Argument

I have a query against a UDF where I want to allow the user to pass in either ALL or a specific EType.
If they pass in ALL, I want to accept all ETypes where it is not null.
I have searched through SO for examples, but none seem to match my particular situation.
Where am I going wrong?
Declare
@company varchar(4),
@charge_cov bit,
@EType varchar(8);
set @company = '123'
set @charge_cov =1
set @EType = 'ALL'
select e.emp_id,
dbo.format_emp_number(pd.EN) as EN,
dbo.format_emp_number(pd.MEN) as MEN,
pd.EType
from dbo.employee_payroll_data(NULL) pd
inner join employee e on (e.emp_id=pd.emp_id)
where pd.EType = case when @EType='ALL' then pd.EType
else @EType ) END
and pd.EType is not null
and e.emp_number is not null
and e.charge_cov = 1
and lc.pr_co_code = @company
Try the code below:
WHERE (((1 = (CASE WHEN @EType = 'ALL' THEN 1 ELSE 0 END)))
OR ((pd.Etype = (CASE WHEN @EType <> 'ALL' THEN @EType ELSE '' END))))
AND pd.Etype IS NOT NULL
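The same filter can usually be written without CASE as "parameter is 'ALL' OR column equals parameter"; in T-SQL that would be (@EType = 'ALL' OR pd.EType = @EType) AND pd.EType IS NOT NULL. The sketch below checks that logic with Python's sqlite3 module as a stand-in for SQL Server. The payroll table, its rows, and the find_by_etype helper are hypothetical, and the named parameter :etype plays the role of the T-SQL variable.

```python
import sqlite3

# :etype stands in for the T-SQL variable; 'ALL' disables the equality filter.
QUERY = """
SELECT emp_id, etype
FROM payroll
WHERE (:etype = 'ALL' OR etype = :etype)
  AND etype IS NOT NULL
ORDER BY emp_id
"""


def find_by_etype(conn, etype):
    """Return (emp_id, etype) rows for one EType, or every non-NULL EType for 'ALL'."""
    return conn.execute(QUERY, {"etype": etype}).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payroll (emp_id INTEGER, etype TEXT)")
    conn.executemany(
        "INSERT INTO payroll VALUES (?, ?)", [(1, "FT"), (2, "PT"), (3, None)]
    )
    print(find_by_etype(conn, "ALL"))  # rows 1 and 2; the NULL row is excluded
    print(find_by_etype(conn, "PT"))   # only row 2
```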

Get Installation Sequence of Oracle Objects

OK, I have a complex recursion problem. I want to get a dependency installation sequence for all of my objects (the all_objects table) in my Oracle 11g database.
First I have created a view holding all dependencies
create or replace
view REALLY_ALL_DEPENDENCIES as
select *
from ALL_DEPENDENCIES
union
select owner, index_name, 'INDEX', table_owner, table_name, table_type, null, null
from all_indexes
union
select p.owner, p.table_name, 'TABLE', f.owner, f.table_name, 'TABLE', null, null
from all_constraints p
join all_constraints f
on F.R_CONSTRAINT_NAME = P.CONSTRAINT_NAME
and F.CONSTRAINT_TYPE = 'R'
and p.constraint_type='P'
;
/
EDIT
I have tried to concatenate all dependencies by using this function:
create
or replace
function dependency(
i_name varchar2
,i_type varchar2
,i_owner varchar2
,i_level number := 0
,i_token clob := ' ') return clob
is
l_token clob := i_token;
l_exist number := 0;
begin
select count(*) into l_exist
from all_objects
where object_name = i_name
and object_type = i_type
and owner = i_owner;
if l_exist > 0 then
l_token := l_token || ';' || i_level || ';' ||
i_name || ':' || i_type || ':' || i_owner;
else
-- if not exist function recursion is finished
return l_token;
end if;
for tupl in (
select distinct
referenced_name
,referenced_type
,referenced_owner
from REALLY_ALL_DEPENDENCIES
where name = i_name
and type = i_type
and owner = i_owner
)
loop
-- if cyclic dependency stop and shout!
if i_token like '%' || tupl.referenced_name || ':' || tupl.referenced_type || ':' || tupl.referenced_owner || '%' then
select count(*) into l_exist
from REALLY_ALL_DEPENDENCIES
where name = tupl.referenced_name
and type = tupl.referenced_type
and owner = tupl.referenced_owner;
if l_exist > 0 then
return '!!!CYCLIC!!! (' || i_level || ';' || tupl.referenced_name || ':' || tupl.referenced_type || ':' || tupl.referenced_owner || '):' || l_token;
end if;
end if;
-- go into recursion
l_token := dependency(
tupl.referenced_name
,tupl.referenced_type
,i_owner /* I just want my own sources */
,i_level +1
,l_token);
end loop;
-- no cyclic condition and loop is finished
return l_token;
end;
/
And I can query through
select
object_name
,object_type
,owner
,to_char(dependency(object_name, object_type, owner)) as dependecy
from all_objects
where owner = 'SYSTEM'
;
OK, maybe it is something like "cheating", but you cannot create cyclic dependencies at creation time. As a human being, at least, I can only create one object after another :-) And this sequence should be possible to reverse engineer.
Now I am more interested in a solution than before ;-) And it is still about the tricky part: "How can I select all sources from a schema ordered by their installation sequence (dependent objects listed before the object that uses them)?"
It is just some kind of sorting problem, isn't it?
Usually you "cheat" by creating the objects in a particular order. For example, you might make sequences first (they have zero dependencies). Then you might do tables. After that, package specs, then package bodies, and so on.
Keep in mind that it is possible to have cyclic dependencies between packages, so there are cases where it will be impossible to satisfy all dependencies at creation anyway.
What's the business case here? Is there a real "problem" or just an exercise?
EDIT
The export tool we use exports objects in the following order:
Database Links
Sequences
Types
Tables
Views
Primary Keys
Indexes
Foreign Keys
Constraints
Triggers
Materialized Views
Materialized View Logs
Package Specs
Package Bodies
Procedures
Functions
At the end, we run the dbms_utility.compile_schema procedure to make sure everything is valid and no dependencies are missed. If you use other object types than these, I'm not sure where they'd go in this sequence.
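The "sorting problem" raised in the question is a topological sort of the dependency graph, and cyclic dependencies are exactly the case where no such order exists. As an illustration outside the database, here is a minimal Python sketch using the standard library's graphlib module (Python 3.9+). The install_order helper and the object names are invented for the example; in practice the mapping would be built from rows of ALL_DEPENDENCIES.

```python
from graphlib import CycleError, TopologicalSorter


def install_order(dependencies):
    """Order objects so that everything an object references comes before it.

    `dependencies` maps each object name to the set of names it references.
    Raises CycleError if the references form a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # Hypothetical schema: a view on two tables, one FK, one package.
    deps = {
        "EMP_VIEW": {"EMP", "DEPT"},
        "EMP": {"DEPT"},
        "DEPT": set(),
        "EMP_PKG": {"EMP_VIEW"},
    }
    print(install_order(deps))

    try:
        install_order({"A": {"B"}, "B": {"A"}})
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle.
        print("cyclic dependency:", exc.args[1])
```

A cycle has to be broken by hand, for example by installing package specs before package bodies, which is what the fixed export order above does implicitly.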
OK, I had some time to look at the job again and I want to share the results. Maybe someone else will come across this thread while searching for a solution. First of all, I ran the SQL as SYS, but I think you can do it in any schema using public synonyms.
The procedure call "exec obj_install_seq.make_install('SCOTT');" builds a CLOB containing a SQL*Plus-compatible SQL file, assuming your sources are named "object_name.object_type.sql". Just spool it out.
Cheers
Chris
create global temporary table DEPENDENCIES on commit delete rows as
select * from ALL_DEPENDENCIES where 1=2 ;
/
create global temporary table install_seq(
idx number
,seq number
,iter number
,owner varchar2(30)
,name varchar2(30)
,type varchar2(30)
) on commit delete rows;
/
create global temporary table loop_chk(
iter number
,lvl number
,owner varchar2(30)
,name varchar2(30)
,type varchar2(30)
) on commit delete rows;
/
create or replace package obj_install_seq is
procedure make_install(i_schema varchar2 := 'SYSTEM');
end;
/
create or replace package body obj_install_seq is
subtype install_seq_t is install_seq%rowtype;
type dependency_list_t is table of DEPENDENCIES%rowtype;
procedure set_list_data(i_schema varchar2 := user)
is
l_owner varchar2(30) := i_schema;
begin
-- collect all dependencies
insert into DEPENDENCIES
select *
from (select *
from ALL_DEPENDENCIES
where owner = l_owner
and referenced_owner = l_owner
union
select owner, index_name, 'INDEX', table_owner, table_name, table_type, null, null
from all_indexes
where owner = l_owner
and table_owner = l_owner
union
select p.owner, p.table_name, 'TABLE', f.owner, f.table_name, 'TABLE', null, null
from all_constraints p
join all_constraints f
on F.R_CONSTRAINT_NAME = P.CONSTRAINT_NAME
and F.CONSTRAINT_TYPE = 'R'
and p.constraint_type='P'
and p.owner = f.owner
where p.owner = l_owner
) all_dep_tab;
-- collect all objects
insert into install_seq
select rownum, null,null, owner, object_name, object_type
from (select distinct owner, object_name, object_type, created
from all_objects
where owner = l_owner
order by created) objs;
end;
function is_referencing(
i_owner varchar2
,i_name varchar2
,i_type varchar2
,i_iter number
,i_level number := 0
) return boolean
is
l_cnt number;
begin
select count(*) into l_cnt
from loop_chk
where name = i_name
and owner = i_owner
and type = i_type
and iter = i_iter
and lvl < i_level;
insert into loop_chk values(i_iter,i_level,i_owner,i_name,i_type);
if l_cnt > 0 then
return true;
else
return false;
end if;
end;
procedure set_seq(
i_owner varchar2
,i_name varchar2
,i_type varchar2
,i_iter number
,i_level number := 0)
is
-- l_dep all_dependencies%rowtype;
l_idx number;
l_level number := i_level +1;
l_dep_list dependency_list_t;
l_cnt number;
begin
-- check for dependent sources
begin
select * bulk collect into l_dep_list
from dependencies
where name = i_name
and owner = i_owner
and type = i_type;
if l_dep_list.count <= 0 then
-- recursion finished
return;
end if;
end;
for i in 1..l_dep_list.count loop
if is_referencing(
l_dep_list(i).referenced_owner
,l_dep_list(i).referenced_name
,l_dep_list(i).referenced_type
,i_iter
,i_level
) then
-- cyclic dependency
update install_seq
set seq = 999
,iter = i_iter
where name = l_dep_list(i).referenced_name
and owner = l_dep_list(i).referenced_owner
and type = l_dep_list(i).referenced_type;
else
-- check if sequence is earlier
select count(*) into l_cnt
from install_seq
where name = l_dep_list(i).referenced_name
and owner = l_dep_list(i).referenced_owner
and type = l_dep_list(i).referenced_type
and seq > l_level *-1;
-- set sequence
if l_cnt > 0 then
update install_seq
set seq = l_level *-1
,iter = i_iter
where name = l_dep_list(i).referenced_name
and owner = l_dep_list(i).referenced_owner
and type = l_dep_list(i).referenced_type;
end if;
-- go into recursion
set_seq(
l_dep_list(i).referenced_owner
,l_dep_list(i).referenced_name
,l_dep_list(i).referenced_type
,i_iter + (i-1)
,l_level
);
end if;
end loop;
end;
function get_next_idx return number
is
l_idx number;
begin
select min(idx) into l_idx
from install_seq
where seq is null;
return l_idx;
end;
procedure make_install(i_schema varchar2 := 'SYSTEM')
is
l_obj install_seq_t;
l_idx number;
l_iter number := 0;
l_install_clob clob := chr(10);
begin
set_list_data(i_schema);
l_idx := get_next_idx;
while l_idx is not null loop
l_iter := l_iter +1;
select * into l_obj from install_seq where idx = l_idx;
update install_seq set iter = l_iter where idx = l_idx;
update install_seq set seq = 0 where idx = l_idx;
set_seq(l_obj.owner,l_obj.name,l_obj.type,l_iter);
l_idx := get_next_idx;
end loop;
for tupl in ( select * from install_seq order by seq, iter, idx ) loop
l_install_clob := l_install_clob || '@' ||
replace(tupl.name,' ' ,'') || '.' ||
replace(tupl.type,' ' ,'') || '.sql' ||
chr(10);
end loop;
l_install_clob := l_install_clob ||
'exec dbms_utility.compile_schema(''' || upper(i_schema) || ''');';
-- do with the install file what you want
DBMS_OUTPUT.PUT_LINE(dbms_lob.substr(l_install_clob,4000));
end;
end;
/

How to get a list of column names

Is it possible to get a row with all column names of a table like this?
|id|foo|bar|age|street|address|
I don't like to use Pragma table_info(bla).
SELECT sql FROM sqlite_master
WHERE tbl_name = 'table_name' AND type = 'table'
Then parse this value with a regular expression (it's easy), which could look similar to this: [(.*?)]
Alternatively you can use:
PRAGMA table_info(table_name)
If you are using the SQLite command-line shell, run .headers on before you perform your query. You only need to do this once in a given session.
You can use pragma related commands in sqlite like below
pragma table_info("table_name")
--Alternatively
select * from pragma_table_info("table_name")
If you require column names like id|foo|bar|age|street|address, basically your answer is in below query.
select group_concat(name,'|') from pragma_table_info("table_name")
Yes, you can achieve this by using the following commands:
sqlite> .headers on
sqlite> .mode column
The result of a select on your table will then look like:
id foo bar age street address
---------- ---------- ---------- ---------- ---------- ----------
1 val1 val2 val3 val4 val5
2 val6 val7 val8 val9 val10
This helps for HTML5 SQLite:
tx.executeSql('SELECT name, sql FROM sqlite_master WHERE type="table" AND name = "your_table_name";', [], function (tx, results) {
var columnParts = results.rows.item(0).sql.replace(/^[^\(]+\(([^\)]+)\)/g, '$1').split(','); ///// RegEx
var columnNames = [];
for(i in columnParts) {
if(typeof columnParts[i] === 'string')
columnNames.push(columnParts[i].split(" ")[0]);
}
console.log(columnNames);
///// Your code which uses the columnNames;
});
You can reuse the regex in your language to get the column names.
Shorter Alternative:
tx.executeSql('SELECT name, sql FROM sqlite_master WHERE type="table" AND name = "your_table_name";', [], function (tx, results) {
var columnNames = results.rows.item(0).sql.replace(/^[^\(]+\(([^\)]+)\)/g, '$1').replace(/ [^,]+/g, '').split(',');
console.log(columnNames);
///// Your code which uses the columnNames;
});
Use a recursive query. Given
create table t (a int, b int, c int);
Run:
with recursive
a (cid, name) as (select cid, name from pragma_table_info('t')),
b (cid, name) as (
select cid, '|' || name || '|' from a where cid = 0
union all
select a.cid, b.name || a.name || '|' from a join b on a.cid = b.cid + 1
)
select name
from b
order by cid desc
limit 1;
Alternatively, just use group_concat:
select '|' || group_concat(name, '|') || '|' from pragma_table_info('t')
Both yield:
|a|b|c|
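If the query is issued from application code, the same row can be assembled on the client side. The sketch below uses Python's sqlite3 module and orders by cid explicitly, because group_concat does not formally guarantee the order in which it concatenates values. The column_row helper is a made-up name, and the pragma_table_info table-valued function requires SQLite 3.16 or newer.

```python
import sqlite3


def column_row(conn, table):
    """Return the column names of `table` as a single '|a|b|c|' style string."""
    # cid is the column's position in the table definition.
    rows = conn.execute(
        "SELECT name FROM pragma_table_info(?) ORDER BY cid", (table,)
    )
    names = [row[0] for row in rows]
    return "|" + "|".join(names) + "|"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a int, b int, c int)")
    print(column_row(conn, "t"))
```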
The result set of a query in PHP offers a couple of functions allowing just that:
numCols()
columnName(int $column_number )
Example
$db = new SQLite3('mysqlite.db');
$table = 'mytable';
$tableCol = getColName($db, $table);
for ($i = 0; $i < count($tableCol); $i++) {
    echo "Column $i = " . $tableCol[$i] . "\n";
}
function getColName($db, $table) {
    $qry = "SELECT * FROM $table LIMIT 1";
    $result = $db->query($qry);
    $nCols = $result->numCols();
    $colName = [];
    // Variable names are case-sensitive in PHP, so the loop must use $nCols.
    for ($i = 0; $i < $nCols; $i++) {
        $colName[$i] = $result->columnName($i);
    }
    return $colName;
}
<?php
$db = sqlite_open('mysqlitedb');
$cols = sqlite_fetch_column_types('form name', $db, SQLITE_ASSOC);
foreach ($cols as $column => $type) {
    echo "Column: $column Type: $type\n";
}
Using @Tarkus's answer, here are the regexes I used in R:
getColNames <- function(conn, tableName) {
x <- dbGetQuery( conn, paste0("SELECT sql FROM sqlite_master WHERE tbl_name = '",tableName,"' AND type = 'table'") )[1,1]
x <- str_split(x,"\\n")[[1]][-1]
x <- sub("[()]","",x)
res <- gsub( '"',"",str_extract( x[1], '".+"' ) )
x <- x[-1]
x <- x[-length(x)]
res <- c( res, gsub( "\\t", "", str_extract( x, "\\t[0-9a-zA-Z_]+" ) ) )
res
}
Code is somewhat sloppy, but it appears to work.
Try this SQLite table schema parser. I implemented it to parse SQLite table definitions in PHP.
It returns the full definitions (unique, primary key, type, precision, not null, references, table constraints... etc)
https://github.com/maghead/sqlite-parser
The easiest way to get the column names of the most recently executed SELECT is to use the cursor's description property. A Python example:
print_me = "("
for description in cursor.description:
    print_me += description[0] + ", "
print(print_me[0:-2] + ')')
# Example output: (inp, output, reason, cond_cnt, loop_likely)
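For completeness, here is a self-contained version of the cursor.description approach using the standard sqlite3 module. The table and the column_names helper are invented for the example. The names are available as soon as the SELECT has been executed, even if the table has no rows, while description is None for statements that return no result set.

```python
import sqlite3


def column_names(cursor):
    """Return the column names of the cursor's most recent SELECT."""
    # Each description entry is a 7-item tuple; sqlite3 only fills in the
    # first item, which is the column name.
    return [entry[0] for entry in cursor.description]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, foo TEXT, bar TEXT)")
    cur = conn.execute("SELECT * FROM t")
    print("|" + "|".join(column_names(cur)) + "|")
```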

Search specific value in all field in oracle table

I want to search for a keyword in a table, but I don't know which column it belongs to. I found the following query for that:
variable val varchar2(10)
exec :val := 'KING'
PL/SQL procedure successfully completed.
SELECT DISTINCT SUBSTR (:val, 1, 11) "Searchword",
SUBSTR (table_name, 1, 14) "Table",
SUBSTR (column_name, 1, 14) "Column" FROM cols,
TABLE (xmlsequence (dbms_xmlgen.getxmltype ('select '
|| column_name
|| ' from '
|| table_name
|| ' where upper('
|| column_name
|| ') like upper(''%'
|| :val
|| '%'')' ).extract ('ROWSET/ROW/*') ) ) t
ORDER BY "Table"
Searchword Table Column
KING EMP ENAME
But I am not getting the expected output. The only output I get is:
PL/SQL procedure successfully completed.
I have tried, but I didn't get a satisfactory answer. Can anybody please help?
The easiest query I can write for such scope is something like:
SELECT *
FROM <table>
WHERE UPPER(column1) LIKE UPPER('%' || :val || '%')
OR UPPER(column2) LIKE UPPER('%' || :val || '%')
OR UPPER(column3) LIKE UPPER('%' || :val || '%')
OR UPPER(column4) LIKE UPPER('%' || :val || '%');
This query searches for the value :val in every listed column using OR conditions, so a row is fetched if at least one column contains the value.
If you have many columns you can write a query that builds the final query for you, like the following:
SELECT 'SELECT * FROM <table> WHERE ' || LISTAGG(column_name || ' LIKE ''%' || :val || '%''', ' OR ') WITHIN GROUP (ORDER BY column_name)
FROM dba_tab_columns
WHERE table_name = '<table>'
The result of this query is the query to execute. Note that Oracle has a limit of 4000 characters for a string field built in a query. If your where condition is too big the query will fail.
In this case, the only alternative is to write a stored procedure that builds the query and returns it in a CLOB variable, here's an example:
CREATE OR REPLACE FUNCTION build_query(in_table_name IN VARCHAR2, in_search IN VARCHAR2) RETURN CLOB IS
lc_query CLOB := 'SELECT * FROM ' || in_table_name || ' WHERE 1=0';
BEGIN
FOR c IN (
SELECT *
FROM user_tab_columns
WHERE table_name = in_table_name
ORDER BY column_name
) LOOP
lc_query := lc_query || ' OR ' || c.column_name || ' LIKE ''%' || in_search || '%''';
END LOOP;
RETURN lc_query;
END;
This function works and can generate strings longer than 4000 characters.
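The query text can also be generated on the client side, which avoids the length limit inside SQL altogether. The following Python sketch is only a string builder under stated assumptions: build_search_query is a hypothetical helper, the column list is assumed to come from a trusted source such as user_tab_columns, and the search term is left as the bind variable :val instead of being concatenated into the statement.

```python
def build_search_query(table, columns, bind_name="val"):
    """Build a SELECT that matches a bind variable against every given column.

    Table and column names are interpolated as-is, so they must come from a
    trusted source (for example the data dictionary), never from user input.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # One case-insensitive LIKE condition per column, all sharing one bind.
    conditions = [
        f"UPPER({col}) LIKE UPPER('%' || :{bind_name} || '%')" for col in columns
    ]
    return f"SELECT * FROM {table} WHERE " + "\n   OR ".join(conditions)


if __name__ == "__main__":
    print(build_search_query("emp", ["ename", "job"]))
```

The resulting statement can then be executed with any Oracle client that supports named bind variables, passing the keyword as the value of val.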

Resources