I have written an SQL query which amalgamates data from two separate tables with the following query:
SELECT * FROM table 1
UNION ALL
SELECT * FROM table 2
ORDER BY column 1
What I'd like to be able to do is to add a column or 'stamp' in a newly created column which details the table which each text entry originally came from. So my output would have a column which detailed the table which each row was originally from.
Essentially, the tables I have are made up of large quantities of numeric data and are hard to distinguish upon completing the Union command.
Thanks for any help.
Regards,
CJW.
You can select a constant value in each SELECT, listing the columns explicitly instead of using *:
SELECT col1, col2, 'TABLE1' AS source_table FROM table1
UNION ALL
SELECT col1, col2, 'TABLE2' AS source_table FROM table2
ORDER BY col1
You can simply add any expression(s) anywhere in the SELECT clause:
SELECT *, 1 AS SourceTable FROM Table1
UNION ALL
SELECT *, 2 AS SourceTable FROM Table2
ORDER BY Column1;
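A minimal runnable sketch of the tagging approach, using Python's built-in sqlite3 module so it can be tried without a database server. The table contents, the column names column1/column2 and the source_table alias are made up for illustration; other engines may need small syntax changes.

```python
import sqlite3

# In-memory database with two hypothetical tables of numeric data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column1 INTEGER, column2 REAL);
    CREATE TABLE table2 (column1 INTEGER, column2 REAL);
    INSERT INTO table1 VALUES (1, 10.5), (3, 30.5);
    INSERT INTO table2 VALUES (2, 20.5), (4, 40.5);
""")

# Each branch of the UNION ALL adds a constant naming its source table.
rows = conn.execute("""
    SELECT column1, column2, 'table1' AS source_table FROM table1
    UNION ALL
    SELECT column1, column2, 'table2' AS source_table FROM table2
    ORDER BY column1
""").fetchall()

for row in rows:
    print(row)
```

The last element of each printed tuple tells you which table the row came from, even though the rows are interleaved by the ORDER BY.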
Consider the following schema and table:
CREATE TABLE IF NOT EXISTS `names` (
`id` INTEGER,
`name` TEXT,
PRIMARY KEY(`id`)
);
INSERT INTO `names` VALUES (1,'zulu');
INSERT INTO `names` VALUES (2,'bene');
INSERT INTO `names` VALUES (3,'flip');
INSERT INTO `names` VALUES (4,'rossB');
INSERT INTO `names` VALUES (5,'albert');
INSERT INTO `names` VALUES (6,'zuse');
INSERT INTO `names` VALUES (7,'rossA');
INSERT INTO `names` VALUES (8,'juss');
I access this table with the following query:
SELECT *
FROM names
ORDER BY name
LIMIT 10
OFFSET 4;
OFFSET 4 is used because that is the position (in the ordered list) of the first occurrence of a name matching 'R%'. This returns:
1="7" "rossA"
2="4" "rossB"
3="1" "zulu"
4="6" "zuse"
My question is: is there an SQL statement which can return the OFFSET value (in the R case above it's 4) given a starting first letter? (I don't really want to resort to stepping() through results, counting rows, until the first 'R%' is reached!)
I've tried the following without success:
SELECT MIN(ROWID)
FROM
(
SELECT *
FROM names
ORDER BY name
)
WHERE name LIKE 'R%'
It always returns a single row of NULL data.
As background, this table is a phone book list and I want to provide a subset of results (from the main table) back to the caller, starting at an initial letter offset.
Just count the rows before the string of interest:
select count(*) from names where name < 'r';
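As a sketch of how that count can feed straight into the paged query, here is the same idea with Python's built-in sqlite3 module and the names table from the question. The helper name offset_for is hypothetical. Note that with SQLite's default BINARY collation the `<` comparison is case-sensitive, so the prefix has to match the case of the stored names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [(1, "zulu"), (2, "bene"), (3, "flip"), (4, "rossB"),
     (5, "albert"), (6, "zuse"), (7, "rossA"), (8, "juss")],
)

def offset_for(conn, prefix):
    """Count the rows that sort before `prefix`; that count is the OFFSET."""
    return conn.execute(
        "SELECT COUNT(*) FROM names WHERE name < ?", (prefix,)
    ).fetchone()[0]

offset = offset_for(conn, "r")

# Use the computed offset in the original paged query.
page = conn.execute(
    "SELECT id, name FROM names ORDER BY name LIMIT 10 OFFSET ?", (offset,)
).fetchall()

print(offset)
print(page)
```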
The following has a number of options. Basically your issue is that the sub-query doesn't return the rowid, hence NULL as the minimum. However, there is no need to use the rowid directly, as the id column is an alias of the rowid, so that could be used:-
SELECT name, id, MIN(rowid), min(id) -- shows how rowid and id are the same
FROM
(
SELECT rowid, * -- returns rowid from the subquery so min(rowid) now works
FROM names
ORDER BY name
)
WHERE name LIKE 'R%' ORDER BY id ASC LIMIT 1 -- ORDER BY id ASC LIMIT 1 effectively does the same (no need for the sub-query)
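Extra columns added for demonstration.
As such your query could be one of the following (note that these return the smallest id among the matching rows, which is not, in general, the row's position in the name-ordered list) :-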
SELECT min(rowid) FROM names where name LIKE 'R%';
Or :-
SELECT min(id) FROM names where name LIKE 'R%';
You could also use :-
SELECT id FROM names WHERE name LIKE 'R%' ORDER BY id ASC LIMIT 1;
Or :-
SELECT rowid FROM names WHERE name LIKE 'R%' ORDER BY id ASC LIMIT 1;
I have created a subset of the pg_table_def table with table_name, col_name and data_type. I have also added a column, active, with 'Y' as the value for some of the rows. Let us call this table config. The config table looks like this:
table_name           column_name
interaction_summary  name_id
tag_transaction      name_id
interaction_summary  direct_preference
bulk_sent            email_image_click
crm_dm               web_le_click
Now I want to be able to map the table names from this table to the actual table and fetch values for the corresponding column. name_id will be the key here which will be available in all tables. My output should look like below:
name_id  direct_preference  email_image_click  web_le_click
1        Y                  1                  2
2        N                  1                  2
The solution needs to be dynamic, so that if the table list grows tomorrow, the new tables are picked up automatically. Since I am new to Redshift, any help is appreciated. I am also considering doing the same in R using the dplyr package.
I understand that dynamic queries don't work with Redshift.
My objective was to pull in any new table that arrives and use its columns for regression analysis in R.
I got this working by using the listagg function and string concatenation, and then wrote the output to a data frame in R. This data frame has n select queries, one per row.
Below is the format:
df <- as.data.frame(tbl(conn,sql("select 'select ' || col_names|| ' from ' || table_name as q1 from ( select distinct table_name, listagg(col_name,',') within group (order by col_name)
over (partition by table_name) as col_names
from attribute_config
where active = 'Y'
order by table_name )
group by 1")))
Once done, I assigned each row of this data frame to a new data frame and fetched the output as follows:
df1 <- tbl(conn,sql(df[1,]))
I know this is a roundabout solution, but it works! It fetches about 17M records in under 1 second.
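The query-building step above (group the active columns per table, then concatenate them into one select statement per table) can be sketched in plain Python. The rows below reuse the table and column names from the question, but the active flags and the function name build_queries are assumptions made for illustration; in practice the rows would be read from the config table.

```python
from collections import defaultdict

# (table_name, col_name, active): illustrative rows, active flags are made up.
config_rows = [
    ("interaction_summary", "name_id", "Y"),
    ("tag_transaction", "name_id", "Y"),
    ("interaction_summary", "direct_preference", "Y"),
    ("bulk_sent", "email_image_click", "N"),
    ("crm_dm", "web_le_click", "Y"),
]

def build_queries(rows):
    """Return one 'select <cols> from <table>' string per table with active columns."""
    cols_by_table = defaultdict(set)
    for table_name, col_name, active in rows:
        if active == "Y":
            cols_by_table[table_name].add(col_name)
    # Sort tables and columns so the output is deterministic, like the
    # ORDER BY inside listagg in the SQL version.
    return [
        "select " + ",".join(sorted(cols)) + " from " + table
        for table, cols in sorted(cols_by_table.items())
    ]

queries = build_queries(config_rows)
for q in queries:
    print(q)
```

Because identifiers are concatenated into SQL text, this is only reasonable when the names come from a trusted catalog table, as they do here.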
Suppose I have a column with distinct values (a,b,c,d,e,f) as values.
In PL/SQL, how can I compare this column with a set, say, (a,b,d,f) and output an indicator?
My approach was:
select case
when values in (a,b,d,f) then 'yes'
else 'no'
end
However, this approach takes one value at a time and checks whether it is in (a,b,d,f).
If you want to compare all values at once, you can use Oracle's MINUS operator:
select values
from your_table minus
(
select 'a' from dual union all
select 'b' from dual union all
select 'd' from dual union all
select 'f' from dual
)
This will retrieve all of the values from your_table that are not in (a,b,d,f).
You can also use the SYS.DBMS_DEBUG_VC2COLL collection type to turn a comma-separated list of values into a table (instead of chaining select ... from dual union all ...).
You can read more about SYS.DBMS_DEBUG_VC2COLL here.
You could try to select from your set of values, see Oracle Documentation: Object Tables, and use a join-command.
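For readers without an Oracle instance, the same set comparison can be tried with Python's built-in sqlite3 module. This is a sketch under stated assumptions: SQLite spells the operator EXCEPT rather than MINUS, the column is called val here because values is a reserved word, and the helper table wanted is a hypothetical stand-in for the select ... from dual chain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (val TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?)", [(v,) for v in "abcdef"])

# The comparison set (a, b, d, f), stored as rows so it can be used like a table.
conn.execute("CREATE TABLE wanted (val TEXT)")
conn.executemany("INSERT INTO wanted VALUES (?)", [(v,) for v in "abdf"])

# Set difference: values in your_table that are not in the comparison set.
missing = conn.execute(
    "SELECT val FROM your_table EXCEPT SELECT val FROM wanted ORDER BY val"
).fetchall()

# Per-row indicator, as in the CASE expression from the question.
flags = conn.execute("""
    SELECT val,
           CASE WHEN val IN (SELECT val FROM wanted) THEN 'yes' ELSE 'no' END
    FROM your_table
    ORDER BY val
""").fetchall()

print(missing)
print(flags)
```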
I am trying to run some analysis on sales data using SQLite.
At the moment, my table has several columns including a unique transaction ID, product name, quantity of that product and value of that product. For each transaction, there can be several records, because each distinct type of product in the basket has its own entry.
I would like to add two new columns to the table. The first one would be a total for each transaction ID which summed up the total quantity of all products in that basket.
I realize that there would be duplication in the table, as the repeated transaction IDs would all have the total. The second one would be similar but in value terms.
I unfortunately cannot do this by creating a new table with the values I want calculated in Excel, and then joining it to the original table, because there are too many records for Excel.
Is there a way to get SQL to do the equivalent of a sumif in Excel?
I was thinking something along the lines of:
select sum(qty) where uniqID = ...
But I am stumped by how to express that it needs to sum all quantities where the uniqID is the same as the one in that record.
You wouldn't create a column like that in SQL. You would simply query for the total on the fly. If you really wanted a table-like object, you could create a view that held 2 columns; uniqID and the sum for that ID.
Let's set up some dummy data in a table; column a is your uniqID, b is the values you're summing.
create table tab1 (a int, b int);
insert into tab1 values (1,1);
insert into tab1 values (1,2);
insert into tab1 values (2,10);
insert into tab1 values (2,20);
Now you can do simple queries for individual uniqIDs like this:
select sum(b) from tab1 where a = 2;
30
Or sum for all uniqIDs (the 'group by' clause might be all you're looking for):
select a, sum(b) from tab1 group by a;
1|3
2|30
Which could be wrapped as a view:
create view totals as select a, sum(b) from tab1 group by a;
select * from totals;
1|3
2|30
The view will update on the fly:
insert into tab1 values (2,30);
select * from totals;
1|3
2|60
In further queries, for analysis, you can use 'totals' just like you would a table.
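If you do want the per-transaction totals repeated on every row, as the question describes, one option is to join the table to its own grouped totals at query time. The sketch below uses Python's built-in sqlite3 module; the sales table, its columns and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (uniqID INTEGER, product TEXT, qty INTEGER, value REAL);
    INSERT INTO sales VALUES (1, 'apple', 2, 1.0);
    INSERT INTO sales VALUES (1, 'pear',  3, 4.5);
    INSERT INTO sales VALUES (2, 'milk',  1, 0.5);
""")

# Join every row to the totals of its own transaction (the SUMIF equivalent).
rows = conn.execute("""
    SELECT s.uniqID, s.product, s.qty, s.value, t.total_qty, t.total_value
    FROM sales AS s
    JOIN (
        SELECT uniqID, SUM(qty) AS total_qty, SUM(value) AS total_value
        FROM sales
        GROUP BY uniqID
    ) AS t ON t.uniqID = s.uniqID
    ORDER BY s.uniqID, s.product
""").fetchall()

for row in rows:
    print(row)
```

The grouped sub-query plays the same role as the totals view; nothing is stored, so the totals stay correct as rows are added.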
I have the following query:
SELECT rowid FROM table1 ORDER BY RANDOM() LIMIT 1
And as well I have another table (table3). In that table I have columns table1_id and table2_id. table1_id is a link to a row in table1 and table2_id is a link to a row in another table.
I want my query to return only those rows that are referenced in table3, that is, only those whose table1 rowid appears in the table1_id column. There may not be any rows at all referring to a certain table1 rowid, in which case I don't want to receive it.
How can I achieve this goal?
Update: I tried the following query, which doesn't work:
SELECT rowid FROM table1
WHERE rowid IN (SELECT table1_id FROM table3 WHERE table1_id = table1.rowid)
ORDER BY RANDOM() LIMIT 1
SELECT rowid FROM table1
WHERE rowid IN ( SELECT DISTINCT table1_id FROM table3 )
ORDER BY RANDOM() LIMIT 1;
This query means "choose at random a row from table1 which has an entry in table3".
Every row in table1 has an equal likelihood of being selected (DISTINCT) as long as it is referenced at least once in table3.
If you are trying to get more than one result, then you should remove the "ORDER BY RANDOM() LIMIT 1" clause.
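Assuming you want to select more than just a rowid, you need to SELECT from a JOIN between the tables you're interested in. SQLite has historically lacked some standard JOIN types (RIGHT and FULL OUTER JOIN were only added in version 3.39), so write the query so that it can use a LEFT OUTER JOIN. Note that a table1 row referenced several times in table3 appears several times in the join, so it is proportionally more likely to be picked.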
SELECT table1.rowid, table1.other_field
FROM table3
LEFT OUTER JOIN table1 ON table3.table1_id = table1.rowid
ORDER BY RANDOM()
LIMIT 1;
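A small check of the IN-based query with Python's built-in sqlite3 module. The sample rows are invented: table1 gets rowids 1, 2 and 3, and table3 references only rowids 1 and 3, so the row with rowid 2 should never be returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (other_field TEXT);
    CREATE TABLE table3 (table1_id INTEGER, table2_id INTEGER);
    INSERT INTO table1 (other_field) VALUES ('a'), ('b'), ('c');  -- rowids 1, 2, 3
    INSERT INTO table3 VALUES (1, 10), (1, 11), (3, 12);          -- rowid 2 is unreferenced
""")

QUERY = """
    SELECT rowid, other_field FROM table1
    WHERE rowid IN (SELECT table1_id FROM table3)
    ORDER BY RANDOM() LIMIT 1
"""

# Draw repeatedly and collect the distinct rows that were picked.
picks = {conn.execute(QUERY).fetchone() for _ in range(50)}
print(picks)
```

Only rows referenced in table3 can show up in picks, and a row referenced twice is no more likely to be chosen than one referenced once, because IN only tests membership.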