SQLite 3 CROSS or INTERSECT complex Subqueries

I have 3 related tables. One table has the rows that I am actually looking for, another has the data that I need to search, and the third describes what data I am looking for. I am getting undesired results from the following query:
SELECT * FROM names WHERE namesKey IN ( SELECT namesKey FROM data WHERE
( dataType IS 3 AND data IS 'COINCIDENCE' )
AND ( dataType IS 2 AND data IS 'STATE' )
AND ( dataType IS 1 AND data IS 'COUNTRY' ) );
I need help making a query based on multiple rows from the filter table. I need the rows which correspond to the keys from the second table that exist on multiple rows... I am explaining this badly, so here is an example:
DROP TABLE IF EXISTS names ;
CREATE TABLE names (
namesKey INTEGER PRIMARY KEY ASC,
name TEXT NOT NULL
);
DROP TABLE IF EXISTS data ;
CREATE TABLE data (
dataKey INTEGER PRIMARY KEY ASC,
namesKey INTEGER NOT NULL,
dataType INTEGER NOT NULL,
data TEXT NOT NULL,
FOREIGN KEY(namesKey) REFERENCES names(namesKey)
);
DROP TABLE IF EXISTS filter ;
CREATE TABLE filter (
filterKey INTEGER PRIMARY KEY ASC,
dataType INTEGER NOT NULL,
data TEXT NOT NULL
);
INSERT INTO names( name ) VALUES ( 'name1' );
INSERT INTO names( name ) VALUES ( 'name2' );
INSERT INTO names( name ) VALUES ( 'name3' );
INSERT INTO names( name ) VALUES ( 'name4' );
INSERT INTO names( name ) VALUES ( 'name5' );
INSERT INTO names( name ) VALUES ( 'name6' );
INSERT INTO names( name ) VALUES ( 'name7' );
INSERT INTO names( name ) VALUES ( 'name8' );
INSERT INTO names( name ) VALUES ( 'name9' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 1, 1, 'COUNTRY' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 1, 2, 'STATE' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 1, 3, 'CITY' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 2, 1, 'COUNTRY' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 2, 2, 'STATE' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 2, 3, 'OTHERCITY' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 3, 1, 'COUNTRY' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 3, 2, 'STATE' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 3, 3, 'COINCIDENCE' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 4, 1, 'COUNTRY' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 4, 2, 'OTHERSTATE' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 4, 3, 'COINCIDENCE' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 5, 1, 'OTHERCOUNTRY' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 5, 2, 'RANDOM' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 5, 3, 'COINCIDENCE' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 6, 1, 'OTHERCOUNTRY' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 6, 2, 'OTHERSTATE' );
INSERT INTO data( namesKey, dataType, data ) VALUES ( 6, 3, 'COINCIDENCE' );
INSERT INTO filter( dataType, data ) VALUES ( 1, 'COUNTRY' );
INSERT INTO filter( dataType, data ) VALUES ( 2, 'STATE' );
INSERT INTO filter( dataType, data ) VALUES ( 3, 'COINCIDENCE' );
Now what I need is to be able to run 3 different types of queries relatively reliably.
I need to search for "No Data" and get names 7, 8, and 9
This one is easy:
SELECT * FROM names WHERE namesKey NOT IN ( SELECT namesKey FROM data ) ;
I need to search based on a single type of data from the data table.
Also easy; the desired result is names 3, 4, 5, and 6:
SELECT * FROM names WHERE
namesKey IN ( SELECT namesKey FROM data WHERE
( dataType IS 3 AND data IS 'COINCIDENCE' ) )
;
I need to search based on multiple rows from the filter table. This one I don't know how to do...
The desired result is the name3 row ONLY.
I could do it by:
SELECT * FROM names WHERE
namesKey IN ( SELECT namesKey FROM data WHERE
( dataType IS 3 AND data IS 'COINCIDENCE' ) )
AND
namesKey IN ( SELECT namesKey FROM data WHERE
( dataType IS 2 AND data IS 'STATE' ) )
AND
namesKey IN ( SELECT namesKey FROM data WHERE
( dataType IS 1 AND data IS 'COUNTRY' ) )
;
But that is just Ugly with a capital UGH!
And even worse with that approach, the dataType is theoretically arbitrarily large, so I might end up stringing together dozens or even hundreds of subqueries... I could run out of RAM just composing my string before even trying to put it into the SQL.
So I am looking for a more elegant solution. Any suggestions?

If I understand you correctly, you could use:
SELECT *
FROM names
WHERE namesKey IN (SELECT namesKey
FROM data
WHERE dataType IS 3 AND data IS 'COINCIDENCE'
INTERSECT
SELECT namesKey
FROM data
WHERE dataType IS 2 AND data IS 'STATE'
INTERSECT
SELECT namesKey
FROM data
WHERE dataType IS 1 AND data IS 'COUNTRY'
);
SqlFiddleDemo
Output:
╔═══════════╦═══════╗
║ namesKey ║ name ║
╠═══════════╬═══════╣
║ 3 ║ name3 ║
╚═══════════╩═══════╝
Or using aggregation:
SELECT *
FROM names
WHERE namesKey IN (SELECT namesKey
FROM data
GROUP BY namesKey
HAVING SUM(dataType IS 3 AND data IS 'COINCIDENCE') > 0
AND SUM(dataType IS 2 AND data IS 'STATE') > 0
AND SUM(dataType IS 1 AND data IS 'COUNTRY') > 0
)
SqlFiddleDemo2

You can join the filter table directly with the actual table to get rows with matches, and then check for only those name keys where all three search terms are matching, i.e., groups whose number of matching rows is the same as the number of all search values:
SELECT namesKey
FROM data
JOIN filter USING (dataType, data)
GROUP BY namesKey
HAVING COUNT(*) = (SELECT COUNT(*) FROM filter);
Then use these name keys as usual:
SELECT *
FROM names
WHERE namesKey IN (SELECT namesKey...);
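Putting the two pieces together (a minimal sketch, simply substituting the grouping query above into the IN clause):
SELECT *
FROM names
WHERE namesKey IN (SELECT namesKey
                   FROM data
                   JOIN filter USING (dataType, data)
                   GROUP BY namesKey
                   HAVING COUNT(*) = (SELECT COUNT(*) FROM filter));
With the sample data from the question this should return only the name3 row, and it keeps working however many rows the filter table holds.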

Related

Insert Data into SQL Table with primary key column

I have a table in a SQL Server database and an R script that appends data to that table.
The db table contains a primary key ("ID"), which is just a scope_identity field.
When I try to append the data into that table, I keep running into the following error:
> sqlSave(Conn[["DbCon"]],
+ dat = OutputDataFinal,
+ tablename = "DataSci_StandardTransferPriority",
+ verbose = TRUE,
+ append = TRUE,
+ rownames = FALSE)
Query:
INSERT INTO "DataSci_StandardTransferPriority" (
"ID", "LeadSourceName", "AgeCategory", "ZipColor", "LeadCount_Sum",
"OB_TotalDials_Sum", "ContactRate", "TransferRate", "HypTransfers", "LaborCPT",
"MarketingCpt", "CloseRate", "PDLTR", "Policy_Count_Sum", "InboundDials_Sum",
"LeadCost_Sum", "PPT", "PPH", "ContactRateXCloseRate", "ContactRateXCloseRateTarget",
"ModelValue", "SourcePriority", "InsertTS"
)
VALUES ( ?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,? )
Error in odbcUpdate(channel, query, mydata, coldata[m, ], test = test, :
missing columns in 'data'
How can I append and ignore the issue with the primary key?
In a standard INSERT operation, don't specify the identity column; SQL Server automatically provides the next incremental identity value. For example:
CREATE TABLE #TempTable (
[ID] [bigint] NOT NULL IDENTITY(1, 1) PRIMARY KEY,
[Column1] [varchar](50) NULL,
[Column2] [decimal](19, 3) NOT NULL
);
INSERT INTO #TempTable (
[Column1],
[Column2]
)
VALUES
('first insert', 50.2),
('second insert', 84.2);
SELECT *
FROM #TempTable
DROP TABLE #TempTable;
The result was:
ID  Column1        Column2
1   first insert   50.200
2   second insert  84.200
If you want to insert into the identity column anyway, enable IDENTITY_INSERT. But be careful about data integrity issues.
CREATE TABLE #TempTable (
[ID] [bigint] NOT NULL IDENTITY(1, 1) PRIMARY KEY,
[Column1] [varchar](50) NULL,
[Column2] [decimal](19, 3) NOT NULL
);
INSERT INTO #TempTable (
[Column1],
[Column2]
)
VALUES ('first insert', 50.2);
SET IDENTITY_INSERT #TempTable ON;
INSERT INTO #TempTable (
[ID],
[Column1],
[Column2]
)
VALUES
(4, 'second insert', 84.63),
(2, 'third insert', 99.56);
SET IDENTITY_INSERT #TempTable OFF;
INSERT INTO #TempTable (
[Column1],
[Column2]
)
VALUES ('four insert', 100.32);
SELECT *
FROM #TempTable
DROP TABLE #TempTable;
And the result:
ID  Column1        Column2
1   first insert   50.200
2   third insert   99.560
4   second insert  84.630
5   four insert    100.320
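The answer above covers the SQL side; translated to the table from the question, the generated INSERT would need to omit the "ID" column so the identity value is produced automatically. A sketch of what that statement would look like (placeholders for the remaining 22 columns):
INSERT INTO "DataSci_StandardTransferPriority" (
"LeadSourceName", "AgeCategory", "ZipColor", "LeadCount_Sum",
"OB_TotalDials_Sum", "ContactRate", "TransferRate", "HypTransfers", "LaborCPT",
"MarketingCpt", "CloseRate", "PDLTR", "Policy_Count_Sum", "InboundDials_Sum",
"LeadCost_Sum", "PPT", "PPH", "ContactRateXCloseRate", "ContactRateXCloseRateTarget",
"ModelValue", "SourcePriority", "InsertTS"
)
VALUES ( ?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,? );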

How to get the last updated value of a value in SQL

In my sample data I have 2 columns, old_store_id and changed_new_store_id, and there are cases when the changed_new_store_id value will itself get updated to a new value. How can I traverse the DB (Teradata) to get the last value (changed_new_store_id) for a given old_store_id?
Let's say the 1st row contains
old_store_id = A ;
changed_new_store_id = B
and the 5th row contains
old_store_id = B ;
changed_new_store_id = C
and in some other nth row C is changed to X, etc.
How do I get the final value of A, which is X?
I could try using multiple self joins or a stored procedure, but that would not be efficient (for many reasons).
Is there any way to do this? Any suggestions are appreciated.
This assumes no "loops", and uses "bottom-up" recursion. Something very similar could be done "top-down", limiting the seed query to rows where the "old" value doesn't appear anywhere as a "new" value.
CREATE VOLATILE TABLE #Example (
Old_Store_ID VARCHAR(8),
New_Store_ID VARCHAR(8)
)
PRIMARY INDEX(Old_Store_ID)
ON COMMIT PRESERVE ROWS;
INSERT INTO #Example VALUES ('A', 'B');
INSERT INTO #Example VALUES ('D', 'c');
INSERT INTO #Example VALUES ('B', 'F');
INSERT INTO #Example VALUES ('c', 'FF');
INSERT INTO #Example VALUES ('FF', 'GG');
INSERT INTO #Example VALUES ('F', 'X');
WITH RECURSIVE #Traverse(Old_Store_ID,New_Store_ID,Final_ID)
AS
(
--Seed Query - start with only the rows having no further changes
SELECT Old_Store_ID
,New_Store_ID
,New_Store_ID as Final_ID
FROM #Example as This
WHERE NOT EXISTS (
SELECT 1 FROM #Example AS Other WHERE This.New_Store_ID = Other.Old_Store_ID
)
UNION ALL
--Recursive Join
SELECT NewRow.Old_Store_ID
,NewRow.New_Store_ID
,OldRow.Final_ID
FROM #Example AS NewRow
INNER JOIN #Traverse AS OldRow
ON NewRow.New_Store_ID = OldRow.Old_Store_ID
)
SELECT *
FROM #Traverse
;
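To answer the original question directly (A should end up as X), you can replace the final SELECT above with a filtered one, for example:
SELECT Old_Store_ID
,Final_ID
FROM #Traverse
WHERE Old_Store_ID = 'A'
;
With the sample rows above this should resolve A through B and F to X.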
A recursive answer:
CREATE VOLATILE TABLE #SearchList (
SearchID CHAR(2),
ParentSearchID CHAR(2)
)
PRIMARY INDEX(SearchID)
ON COMMIT PRESERVE ROWS;
INSERT INTO #SearchList VALUES ('A', 'B');
INSERT INTO #SearchList VALUES ('D', 'c');
INSERT INTO #SearchList VALUES ('B', 'F');
INSERT INTO #SearchList VALUES ('c', 'FF');
INSERT INTO #SearchList VALUES ('FF', 'GG');
INSERT INTO #SearchList VALUES ('F', 'X');
CREATE VOLATILE TABLE #IntermediateResults(
SearchID CHAR(2),
ParentSearchID CHAR(2),
SearchLevel INTEGER
)
ON COMMIT PRESERVE ROWS;
INSERT INTO #IntermediateResults
WITH RECURSIVE RecursiveParent(SearchID,ParentSearchID,SearchLevel)
AS
(
--Seed Query
SELECT SearchID
,ParentSearchID
,1
FROM #SearchList
UNION ALL
--Recursive Join
SELECT a.SearchID
,b.ParentSearchID
,SearchLevel+1
FROM #SearchList a
INNER JOIN RecursiveParent b
ON a.ParentSearchID = b.SearchID
)
SELECT SearchID
,ParentSearchID
,MAX(SearchLevel)
FROM RecursiveParent
GROUP BY SearchID
,ParentSearchID
;
SELECT RESULTS.*
FROM #IntermediateResults RESULTS
INNER JOIN (SELECT RESULTS_MAX.SearchID
,MAX(RESULTS_MAX.SearchLevel) MaxSearchLevel
FROM #IntermediateResults RESULTS_MAX
GROUP BY RESULTS_MAX.SearchID
) GROUPED_RESULTS
ON RESULTS.SearchID = GROUPED_RESULTS.SearchID
AND RESULTS.SearchLevel = GROUPED_RESULTS.MaxSearchLevel
ORDER BY RESULTS.SearchID ASC
,RESULTS.SearchLevel ASC
;
Output:
SearchID ParentSearchID SearchLevel
-------- -------------- -----------
A X 3
B X 2
c GG 2
D GG 3
F X 1
FF GG 1

Simple Split function in SQL Server 2012 with explanation pls

I have two tables Procedures and ProcedureTypes.
Procedures has a column Type which is a varchar with the values (1, 2), (3, 4), (4, 5) etc...
ProcedureType has a primary key 'ID' 1 to 9.
ID Description
1 Drug
2 Other-Drug
etc...
ID is an integer value and Type is a varchar value.
Now I need to join these two tables to show the values
ID in the Procedures table
ProcedureType in the Procedures table
Description in the ProcedureTypes table, with the values separated by a "-".
For example, if the value in Type is (1,2), the new table after the join should show the descriptions like (Drug-Other Drug).
I have used this query but to no avail:
SELECT * FROM dbo.[Split]((select RequestType from GPsProcedures), ',')
Can anyone tell me how to do it and why the above query is not working?
with Procedures as (
select 1 as ID, '1,2,3' as Typ
),
ProcedureTypes as (
select 1 as TypeID, 'Drug' as Name
union select 2 , 'Other-Drug'
union select 3 , 'Test 3'
)
/*Get one extra column of type xml*/
,Procedures_xml as (
select id,CONVERT(xml,' <root> <s>' + REPLACE(Typ,',','</s> <s>') + '</s> </root> ') as Typ_xml
from Procedures
)
/*Convert the field string to multiple rows then join to procedure types*/
, Procdure_With_Type as (
select ID,T.c.value('.','varchar(20)') as TypeID,
ProcedureTypes.Name
from Procedures_xml
CROSS APPLY Typ_xml.nodes('/root/s') T(c)
INNER JOIN ProcedureTypes ON T.c.value('.','varchar(20)') = ProcedureTypes.TypeID
)
/*Finally, group the procedures type names by procedure id*/
select id,
STUFF((
SELECT ', ' + [Name]
FROM Procdure_With_Type inn
WHERE (Procdure_With_Type.ID = inn.ID)
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)')
,1,2,'') AS NameValues
from Procdure_With_Type
group by ID
You can't have a select statement as a parameter for a function, so instead of this:
SELECT * FROM dbo.[Split]((select RequestType from GPsProcedures), ',')
Use this:
select S.*
from GPsProcedures P
cross apply dbo.[Split](P.RequestType, ',') S
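Note that dbo.[Split] is not built in; it has to exist in your database. If you do not have one yet, a minimal sketch of such a table-valued function for SQL Server 2012 (no STRING_SPLIT there) could look like the following, using the same XML trick as the CTE answer above. The function name and the Item column are assumptions, not a standard API:
CREATE FUNCTION dbo.[Split]
(
    @List NVARCHAR(MAX),
    @Delimiter NVARCHAR(5)
)
RETURNS TABLE
AS
RETURN
(
    -- Turn 'a,b,c' into '<s>a</s><s>b</s><s>c</s>' and shred it into one row per value.
    -- Assumes the list values contain no XML special characters (&, <, >).
    SELECT Item = T.c.value('.', 'NVARCHAR(MAX)')
    FROM (SELECT CAST('<s>' + REPLACE(@List, @Delimiter, '</s><s>') + '</s>' AS XML) AS x) AS src
    CROSS APPLY src.x.nodes('/s') AS T(c)
);
With that in place, the cross apply query above returns one row per comma-separated value in RequestType.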

SQLite cross reference unique combinations

I've got two tables already populated with data with the given schemas:
CREATE TABLE objects
(
id BIGINT NOT NULL,
latitude BIGINT NOT NULL,
longitude BIGINT NOT NULL,
PRIMARY KEY (id)
)
CREATE TABLE tags
(
id BIGINT NOT NULL,
tag_key VARCHAR(100) NOT NULL,
tag_value VARCHAR(500),
PRIMARY KEY (id , tag_key)
)
objects.id and tags.id refer to the same object.
I'd like to populate a third table with the unique combinations of tag_key and tag_value. For example:
INSERT OR REPLACE INTO objects (id) VALUES (0);
INSERT OR REPLACE INTO tags (id, tag_key, tag_value) VALUES (0, 'a', 'x');
INSERT OR REPLACE INTO objects (id) VALUES (1);
INSERT OR REPLACE INTO tags (id, tag_key, tag_value) VALUES (1, 'a', 'y');
INSERT OR REPLACE INTO objects (id) VALUES (2);
INSERT OR REPLACE INTO tags (id, tag_key, tag_value) VALUES (2, 'a', 'x');
INSERT OR REPLACE INTO tags (id, tag_key, tag_value) VALUES (2, 'a', 'y');
INSERT OR REPLACE INTO objects (id) VALUES (3);
INSERT OR REPLACE INTO tags (id, tag_key, tag_value) VALUES (3, 'a', 'x');
INSERT OR REPLACE INTO objects (id) VALUES (4);
INSERT OR REPLACE INTO tags (id, tag_key, tag_value) VALUES (4, 'a', 'y');
Should result in 3 entries of
0: ([a,x])
1: ([a,y])
3: ([a,x][a,y])
Currently I have:
CREATE TABLE tags_combinations
(
id INTEGER PRIMARY KEY,
tag_key VARCHAR(100) NOT NULL,
tag_value VARCHAR(500)
);
The id shouldn't be related to the original id of the object, just something to group unique combinations.
This is the query I have so far:
SELECT
t1.tag_key, t1.tag_value
FROM
tags t1
WHERE
t1.id
IN
(
/* select ids who's every tags entry is not under one id in tags_combinations */
SELECT
t2.id
FROM
tags t2
WHERE
t2.tag_key, t2.tag_value
NOT IN
(
)
);
The part with the comment is what I am not sure about, how would I select every id from tags that does not have all of the corresponding tag_key and tag_value entries already under one id in tags_combinations?
To clarify exactly the result I am after: From the sample data given, it should return 4 rows with:
row id tag_key tag_value
0 0 a x
1 1 a y
2 2 a x
3 2 a y
SQL is a set-based language. If you reformulate your question in the language of set theory, you can directly translate it into SQL:
You want all rows of the tags table, except those from duplicate objects.
Objects are duplicates if they have exactly the same key/value combinations. However, we still want to return one object from each such group, so we drop an object only if another object with the same key/value combinations and a smaller ID exists.
Two objects A and B have exactly the same key/value combinations if
all key/value combinations in A also exist in B, and
all key/value combinations in B also exist in A.
All key/value combinations in A also exist in B if there is no key/value combination in A that does not exist in B (note: double negation).
SELECT id, tag_key, tag_value
FROM tags
WHERE NOT EXISTS (SELECT 1
FROM tags AS dup
WHERE dup.id < tags.id
AND NOT EXISTS (SELECT 1
FROM tags AS A
WHERE A.id = tags.id
AND NOT EXISTS (SELECT 1
FROM tags AS B
WHERE B.id = dup.id
AND B.tag_key = A.tag_key
AND B.tag_value = A.tag_value)
)
AND NOT EXISTS (SELECT 1
FROM tags AS B
WHERE B.id = dup.id
AND NOT EXISTS (SELECT 1
FROM tags AS A
WHERE A.id = tags.id
AND A.tag_key = B.tag_key
AND A.tag_value = B.tag_value)
)
)
ORDER BY id, tag_key;
This is not easy in SQLite. We want to identify groups of tag key/value pairs. So we could group by id and get a string of the associated pairs with group_concat. This would be the way to do it in another DBMS. SQLite, however, cannot order in group_concat, so we might end up with 2: 'a/x,a/y' and 5: 'a/y,a/x'. Two different strings for the same pairs.
Your best bet may be to write a program and find the distinct pairs iteratively.
In SQLite you may want to try this:
insert into tags_combinations (id, tag_key, tag_value)
select id, tag_key, tag_value
from tags
where id in
(
select min(id)
from
(
select id, group_concat(tag_key || '/' || tag_value) as tag_pairs
from
(
select id, tag_key, tag_value
from tags
order by id, tag_key, tag_value
) ordered_data
group by id
) aggregated_data
group by tag_pairs
);
Ordering the data before applying group_concat is likely to get the tag pairs ordered, but in no way guaranteed! If this is something you want to do only once, it may be worth a try, though.
To merge multiple rows into one value, you need a function like group_concat().
The ORDER BY is needed to ensure a consistent order of the rows within a group:
SELECT DISTINCT group_concat(tag_key) AS tag_keys,
group_concat(tag_value) AS tag_values
FROM (SELECT id,
tag_key,
tag_value
FROM tags
ORDER BY id,
tag_key,
tag_value)
GROUP BY id;
If you want to have keys and values interleaved, as shown in the question, you need to do more string concatenation:
SELECT DISTINCT group_concat(tag_key || ',' || tag_value, ';') AS keys_and_values
FROM (...
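Written out in full (the inner query is the same ordered SELECT already used above), the interleaved version would be:
SELECT DISTINCT group_concat(tag_key || ',' || tag_value, ';') AS keys_and_values
FROM (SELECT id,
             tag_key,
             tag_value
      FROM tags
      ORDER BY id,
               tag_key,
               tag_value)
GROUP BY id;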

MDX calculate count distinct

I want to write an MDX script that displays the count of rows I have for a member.
This is my initial script:
SELECT NON EMPTY { [Measures].[I_OPC_ATTEINT]
and 6 measures } ON COLUMNS
, NON EMPTY { ([Axe_Temps].[MOIS_ANNEE].[MOIS_ANNEE].ALLMEMBERS
* [Axe_ORGANISATION].[Structure].[EQUIPE].ALLMEMBERS
* [Axe_OPC].[TYPE_REGROUPEMENT].[TYPE_REGROUPEMENT].ALLMEMBERS
* [Axe_OPC].[COMPOSITION].[COMPOSITION].ALLMEMBERS
* [Axe_OPC].[OPC].[OPC].ALLMEMBERS ) } DIMENSION PROPERTIES MEMBER_CAPTION, MEMBER_UNIQUE_NAME ON ROWS
FROM ( SELECT ( STRTOMEMBER('[Axe_ORGANISATION].[CODE_EQUIPE].&[E_1001]') ) ON COLUMNS
FROM ( SELECT ( STRTOMEMBER('[Axe_ORGANISATION].[CODE_PLATEAU].&[D_1000]') ) ON COLUMNS
FROM ( SELECT ( STRTOMEMBER('[Axe_ORGANISATION].[CODE_UNITE].&[U_107864]') ) ON COLUMNS
FROM ( SELECT ( STRTOMEMBER('[Axe_ORGANISATION].[CODE_CANAL].&[AVSC]') ) ON COLUMNS
FROM ( SELECT ( STRTOMEMBER('[Axe_Temps].[MOIS_ANNEE].&[201306]') ) ON COLUMNS
FROM [PVC_Reporting])))))
I want to display 2 calculated measures:
The count of rows of my result (count distinct of [Axe_OPC].[COMPOSITION].[COMPOSITION].ALLMEMBERS)
The count of rows where [Measures].[I_OPC_ATTEINT] <> 0
Thank you.
I would create a measure of type 'distinct count' within the cube and create a simple yes/no (oui|non) dimension for I_OPC_ATTEINT.
