I have this current example data set
NEW_ID Name OLD_ID New_Name
123 Hello XYZ
124 How XYZ
125 Are XYZ
126 My ABC
127 Name ABC
128 Is ABC
129 Alex ABC
My objective is to rewrite the Name field under a new naming convention and store the result in New_Name, i.e. Hello_Part_1, How_Part_2, Are_Part_3 for the records that share an OLD_ID (in this case XYZ). Similarly, My_Part_1, Name_Part_2, Is_Part_3, Alex_Part_4 etc. for the records whose OLD_ID equals ABC.
I'm using SQLite with an import of a .CSV file.
The naming convention is as follows - NAME_PART_X where X increments on the number of records within that 'Group' of OLD_IDs.
SQL does not work sequentially; you have to express the operation independently for each row.
The number you want is the count of rows that have the same old ID and a new ID less than or equal to the new ID of the current row.
This can be computed with a correlated subquery:
UPDATE MyTable
SET New_Name = Name || '_Part_' ||
(SELECT COUNT(*)
FROM MyTable AS T2
WHERE T2.OLD_ID = MyTable.OLD_ID
AND T2.NEW_ID <= MyTable.NEW_ID);
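To try the statement end to end, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module and runs the same UPDATE. The function name number_parts and the column types are assumptions made for the illustration; a real import would come from the .CSV file.

```python
import sqlite3

# Sample rows from the question: (NEW_ID, Name, OLD_ID)
SAMPLE = [
    (123, "Hello", "XYZ"), (124, "How", "XYZ"), (125, "Are", "XYZ"),
    (126, "My", "ABC"), (127, "Name", "ABC"), (128, "Is", "ABC"),
    (129, "Alex", "ABC"),
]

def number_parts(rows):
    """Fill New_Name with Name_Part_X, where X counts rows per OLD_ID in NEW_ID order."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumed; the question does not state them.
    conn.execute(
        "CREATE TABLE MyTable (NEW_ID INTEGER, Name TEXT, OLD_ID TEXT, New_Name TEXT)"
    )
    conn.executemany(
        "INSERT INTO MyTable (NEW_ID, Name, OLD_ID) VALUES (?, ?, ?)", rows
    )
    # Same correlated subquery as in the answer above.
    conn.execute("""
        UPDATE MyTable
        SET New_Name = Name || '_Part_' ||
            (SELECT COUNT(*)
             FROM MyTable AS T2
             WHERE T2.OLD_ID = MyTable.OLD_ID
               AND T2.NEW_ID <= MyTable.NEW_ID)
    """)
    result = conn.execute(
        "SELECT NEW_ID, New_Name FROM MyTable ORDER BY NEW_ID"
    ).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for new_id, new_name in number_parts(SAMPLE):
        print(new_id, new_name)
```

The numbering restarts at 1 for each OLD_ID because the subquery only counts rows of the same group.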
I am trying to build an sqlite query against a database which I have no control over, i.e. I can't change the odd way data is currently stored.
My data is spread across several tables, and one in particular is causing me issues; contacts.
The tables I am interested in are structured as follows (examples just include what I care about);
Results
id  state  account
5   0      102
11  2      62

Data
id  results_id  type  date        contact_id
1   5           1033  1596664666  360
2   11          1034  1596452446  32

Contacts_list
id   contact_id  key    type
32   12          test   email
360  110         test2  email
360  5           test2  phone

Contacts
id   results_id  name             email
12   231         Test Account     test1#gmail.com
110  726         Test Account 2   test2#gmail.com
5    6           Test Account no  01234567890
So what I want to do is:
Query the data table to grab all the values in that table, lookup type from the relevant table.
join the results table (Data.results_id = Results.id) and lookup the relevant account and state details
Get the contacts name and email from the contacts table.
The last bit is what is throwing me. Data.contact_id = Contacts_list.id and then Contacts_list.contact_id = Contacts.id
If I do a left join on Contacts_list I get additional rows, because Contacts_list can have multiple rows for the same ID, and I just want as many rows as the Data table has. What I want is to concatenate multiple contacts into a single cell, e.g.:
account  state  type  date        contact
102      0      1033  1596664666  Test Account (test1#gmail.com)
62       2      1034  1596452446  Test Account 2 (test2#gmail.com), Test Account no (01234567890)
I feel like there will be an easy solution, but I am scratching my head at the minute... Any ideas?
You must join all tables, group by data.id and use GROUP_CONCAT() to collect the contacts:
SELECT r.account, r.state, d.type, d.date,
GROUP_CONCAT(c.name || ' (' || c.email || ')', ', ') contact
FROM Data d
LEFT JOIN Contacts_list cl ON cl.id = d.contact_id
LEFT JOIN Contacts c ON c.id = cl.contact_id
LEFT JOIN Results r ON r.id = d.results_id
GROUP BY d.id
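As a quick check, the query can be run against the sample rows with Python's sqlite3 module. This is only a sketch: the column types are assumed, the function name contacts_per_data_row is made up for the illustration, and the ', ' separator passed to GROUP_CONCAT is added to match the comma-and-space layout the question asks for. With the rows exactly as listed in the question, it is Data row 1 (contact_id 360) that collects two contacts.

```python
import sqlite3

# Sample rows from the question; column types are assumptions.
SCHEMA_AND_ROWS = """
CREATE TABLE Results (id INTEGER, state INTEGER, account INTEGER);
CREATE TABLE Data (id INTEGER, results_id INTEGER, type INTEGER,
                   date INTEGER, contact_id INTEGER);
CREATE TABLE Contacts_list (id INTEGER, contact_id INTEGER, "key" TEXT, type TEXT);
CREATE TABLE Contacts (id INTEGER, results_id INTEGER, name TEXT, email TEXT);
INSERT INTO Results VALUES (5, 0, 102), (11, 2, 62);
INSERT INTO Data VALUES (1, 5, 1033, 1596664666, 360),
                        (2, 11, 1034, 1596452446, 32);
INSERT INTO Contacts_list VALUES (32, 12, 'test', 'email'),
                                 (360, 110, 'test2', 'email'),
                                 (360, 5, 'test2', 'phone');
INSERT INTO Contacts VALUES (12, 231, 'Test Account', 'test1#gmail.com'),
                            (110, 726, 'Test Account 2', 'test2#gmail.com'),
                            (5, 6, 'Test Account no', '01234567890');
"""

def contacts_per_data_row():
    """Return one row per Data row, with all of its contacts in a single cell."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_ROWS)
    rows = conn.execute("""
        SELECT r.account, r.state, d.type, d.date,
               GROUP_CONCAT(c.name || ' (' || c.email || ')', ', ') AS contact
        FROM Data d
        LEFT JOIN Contacts_list cl ON cl.id = d.contact_id
        LEFT JOIN Contacts c ON c.id = cl.contact_id
        LEFT JOIN Results r ON r.id = d.results_id
        GROUP BY d.id
        ORDER BY d.id
    """).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in contacts_per_data_row():
        print(row)
```

GROUP BY d.id is what keeps the result at one row per Data row. The order of the names inside the concatenated cell is not guaranteed.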
I have to create a population of the people who have only one product association (ABC), using a QUALIFY statement.
For example I have the data
Id Code Prod Date
101 202 ABC 2017-05-31
101 203 DEF 2017-04-30
102 302 ABC 2018-06-30
From the above data I need the data for Id=102, because this id has only one prod relation, whereas id 101 has both ABC and DEF and should be excluded.
I tried the following
Select id,prod from table1
Qualify row_number() over (partition by id order by Date)=1
Where prod=‘ABC’
With this, I get the two records in my data which I don’t want. Appreciate your help.
Select *
from table1
Qualify min(Prod) over (partition by id)='ABC'
and max(Prod) over (partition by id)='ABC'
Both MIN and MAX return the same value, ABC, so the id has no other Prod value.
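The MIN/MAX idea can be tried outside Teradata as well. The sketch below is an assumption-laden stand-in, not the QUALIFY query itself: SQLite has no QUALIFY, so the same test is written with GROUP BY and HAVING, and it returns only the ids rather than the full rows. The function name ids_with_only and the column types are hypothetical.

```python
import sqlite3

# Sample rows from the question: (Id, Code, Prod, Date)
SAMPLE_ROWS = [
    (101, 202, "ABC", "2017-05-31"),
    (101, 203, "DEF", "2017-04-30"),
    (102, 302, "ABC", "2018-06-30"),
]

def ids_with_only(prod, rows):
    """Return the ids whose smallest and largest Prod both equal `prod`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (Id INTEGER, Code INTEGER, Prod TEXT, Date TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)
    # If MIN and MAX agree, every Prod value for that id is the same one.
    result = conn.execute(
        """
        SELECT Id
        FROM table1
        GROUP BY Id
        HAVING MIN(Prod) = ? AND MAX(Prod) = ?
        ORDER BY Id
        """,
        (prod, prod),
    ).fetchall()
    conn.close()
    return [r[0] for r in result]

if __name__ == "__main__":
    print(ids_with_only("ABC", SAMPLE_ROWS))
```

Id 101 drops out because its MAX(Prod) is DEF.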
If you want to return the id's that have one prod value (ABC) in the table, you can do something like this:
SELECT id, prod
FROM (
SELECT id, prod
FROM table1
GROUP BY id, prod -- Get unique (id, prod) combinations
QUALIFY COUNT(prod) OVER(PARTITION BY id) = 1 -- Get id's with only one prod
) src
WHERE prod = 'ABC' -- Only get rows with "ABC" prod
The key here is the order in which Teradata processes the query:
Aggregate - GROUP BY
OLAP - COUNT(prod) OVER()
QUALIFY
You may be able to move the WHERE prod = 'ABC' into the QUALIFY clause and get rid of the outer SELECT, not 100% sure.
Just use HAVING instead of QUALIFY; I don't see any need for window functions. Group by id alone and keep the ids whose only distinct prod is ABC. Something like:
Select id,
max(prod) as prod
from
table1
group by
id
having count(distinct prod) = 1
and max(prod) = 'ABC'
I am trying out SQLite and encountered a problem. There are 3 Tables A, B, and C.
I want to update Table A using the difference between the values in B and C.
Table A.
James null.
Table B.
James 5.
Table C
James 2
so with the update, I want table A to have
James 3. (5-2)
Thank You
SQLite does not support joins in an UPDATE statement (UPDATE ... FROM was only added in version 3.33.0), so you can do it by reading the corresponding rows of tables B and C directly with correlated subqueries, like this:
update A
set value =
(select value from B where name = A.name) -
(select value from C where name = A.name)
If you want to update only the row with name = 'James' then add:
where name = 'James'
Works in every DB:
UPDATE
"A"
SET
"x" =
(
SELECT
SUM("x")
FROM "B"
WHERE "B"."id"="A"."id"
) -
(
SELECT
SUM("x")
FROM "C"
WHERE "C"."id"="A"."id"
)
I believe the following demonstrates that Yes you can:-
DROP TABLE IF EXISTS ta;
DROP TABLE IF EXISTS tb;
DROP TABLE IF EXISTS tc;
CREATE TABLE IF NOT EXISTS ta (name TEXT, numb INTEGER);
CREATE TABLE IF NOT EXISTS tb (name TEXT, numb INTEGER);
CREATE TABLE IF NOT EXISTS tc (name TEXT, numb INTEGER);
INSERT INTO ta VALUES ('JAMES',null),('Mary',100);
INSERT INTO tb VALUES ('JAMES',5),('Sue',33);
INSERT INTO tc VALUES ('JAMES',2),('Anne',45);
UPDATE ta SET numb =
(SELECT sum(numb) FROM tb WHERE name = 'JAMES')
-
(SELECT sum(numb) FROM tc WHERE name = 'JAMES')
WHERE name = 'JAMES';
SELECT * FROM ta;
SELECT * FROM tb;
SELECT * FROM tc;
This :-
Drops the tables if they exist allowing it to be rerun (simplifies modifications if need be).
column names name and numb have been assumed as they weren't given.
Creates the 3 tables (note table names used for the demo are ta, tb and tc)
Adds some data (note that additional rows have been added to show how to distinguish (at least to a fashion))
Updates column numb of table A (ta) where the name column has a value of JAMES according to the sum of the numb column from all rows with the same name (JAMES) from table tb minus the sum of the numb column from all rows with the same name (JAMES) from table tc
This may not be exactly what you want, as it assumes that you want to sum all rows with the same name per table (tb and tc)
Queries all the tables (first is shown below as that is the table that has been updated.)
The first result shows that the row has been updated from null to 3 (5 - 2) and that the row for Mary has remained as it was.
The following change to the UPDATE takes the name from the row of ta being updated rather than hard-coding 'JAMES' in each subquery (the hard-coded version above is perhaps easier to follow):
UPDATE ta SET numb = (SELECT sum(numb) FROM tb WHERE name = ta.name) - (SELECT sum(numb) FROM tc WHERE name = ta.name) WHERE name = 'JAMES';
Note that should there not be an associated row (i.e. with the same name) in either tb or tc then the result will be null (whether or not sum is used).
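The null behaviour is easy to see by dropping the WHERE clause so that Mary's row, which has no match in tb or tc, is updated too. The sketch below reuses the ta, tb and tc tables from the demo through Python's sqlite3 module. Wrapping each subquery in COALESCE(..., 0) is one possible way to treat a missing row as zero; that choice, and the function name update_all, are assumptions for the illustration and not part of the answer above.

```python
import sqlite3

def update_all(treat_missing_as_zero):
    """Update every row of ta from tb and tc; optionally treat missing rows as 0."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ta (name TEXT, numb INTEGER);
        CREATE TABLE tb (name TEXT, numb INTEGER);
        CREATE TABLE tc (name TEXT, numb INTEGER);
        INSERT INTO ta VALUES ('JAMES', NULL), ('Mary', 100);
        INSERT INTO tb VALUES ('JAMES', 5), ('Sue', 33);
        INSERT INTO tc VALUES ('JAMES', 2), ('Anne', 45);
    """)
    if treat_missing_as_zero:
        # SUM over no rows is NULL; COALESCE turns that NULL into 0.
        sql = """
            UPDATE ta SET numb =
                COALESCE((SELECT SUM(numb) FROM tb WHERE name = ta.name), 0)
              - COALESCE((SELECT SUM(numb) FROM tc WHERE name = ta.name), 0)
        """
    else:
        # NULL - NULL is NULL, so rows without a match end up as NULL.
        sql = """
            UPDATE ta SET numb =
                (SELECT SUM(numb) FROM tb WHERE name = ta.name)
              - (SELECT SUM(numb) FROM tc WHERE name = ta.name)
        """
    conn.execute(sql)
    rows = conn.execute("SELECT name, numb FROM ta ORDER BY name").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    print(update_all(False))
    print(update_all(True))
```

Either way Mary's original 100 is overwritten, which is why the demo restricts the UPDATE with WHERE name = 'JAMES'.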
I'm wondering if I can use an OLAP Function to filter irrelevant rows like this:
If one row has a matching value (the fourth field), none of the rows with the same key (the first 3 fields) must be displayed.
In this example, the matching value would be 'C':
Entities product ID Solde
997 0050 123 D
997 0050 123 D
997 0050 123 C
899 0124 125 D
899 0124 125 D
So here my key is composed of Entities/product/ID, and the value of "Solde" decides whether I display the rows or not.
Here the undesired value is Solde = C.
In this example only the last row should be displayed, because the key 899/0124/125 has only rows with solde = 'D'.
The key 997/0050/123 has one row with solde = 'C', so I don't want to display it.
Thanks in advance for your help
Christophe
Updated answer
The more traditional way to solve this is to first select the Entities/Product/ID records that you DON'T want.
SELECT Entities, Product, ID FROM table WHERE Solde<>'D';
Use that result in a subquery in your WHERE clause to exclude those:
SELECT DISTINCT Entities, Product, ID, Solde
FROM table
WHERE (Entities, Product, ID) NOT IN ( SELECT Entities, Product, ID FROM table WHERE Solde<>'D');
Alternatively, aggregate and use a HAVING clause that keeps only the keys where every row has Solde = 'D':
SELECT Entities, Product, ID
FROM table
GROUP BY 1,2,3
HAVING COUNT(*) = SUM(CASE WHEN Solde = 'D' THEN 1 ELSE 0 END)
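The conditional-aggregate check can be tried on the sample rows from the question. This is a sketch run on SQLite through Python's sqlite3 module, not on Teradata; the table name soldes, the column types and the function name keys_with_only_d are hypothetical, since the question does not name the table.

```python
import sqlite3

# Sample rows from the question: (Entities, Product, ID, Solde)
SAMPLE_ROWS = [
    (997, "0050", 123, "D"),
    (997, "0050", 123, "D"),
    (997, "0050", 123, "C"),
    (899, "0124", 125, "D"),
    (899, "0124", 125, "D"),
]

def keys_with_only_d(rows):
    """Return the (Entities, Product, ID) keys where every row has Solde = 'D'."""
    conn = sqlite3.connect(":memory:")
    # 'soldes' is a made-up table name for this sketch.
    conn.execute(
        "CREATE TABLE soldes (Entities INTEGER, Product TEXT, ID INTEGER, Solde TEXT)"
    )
    conn.executemany("INSERT INTO soldes VALUES (?, ?, ?, ?)", rows)
    # A key qualifies when its row count equals its count of 'D' rows.
    result = conn.execute("""
        SELECT Entities, Product, ID
        FROM soldes
        GROUP BY 1, 2, 3
        HAVING COUNT(*) = SUM(CASE WHEN Solde = 'D' THEN 1 ELSE 0 END)
    """).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    print(keys_with_only_d(SAMPLE_ROWS))
```

The key 997/0050/123 has three rows but only two with 'D', so the counts differ and it is filtered out.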
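I guess you are looking for something like the below, which drops every key that has at least one row with Solde = 'C':
SELECT Entities, Product, ID, Solde
FROM yourtable
QUALIFY SUM(CASE WHEN Solde = 'C' THEN 1 ELSE 0 END) OVER (PARTITION BY Entities, Product, ID) = 0;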
I found a good article on converting adjacency to nested sets at http://dataeducation.com/the-hidden-costs-of-insert-exec/
The SQL language used is Microsoft SQL Server (I think) and I am trying to convert the examples given in the article to sqlite (as this is what I have easy access to on my Macbook).
The problem I appear to be having is converting the part of the overall CTE query to do with the Employee Rows
EmployeeRows AS
(
SELECT
EmployeeLevels.*,
ROW_NUMBER() OVER (ORDER BY thePath) AS Row
FROM EmployeeLevels
)
I converted this to
EmployeeRows AS
(
SELECT
EmployeeLevels.*,
rowid AS Row
FROM EmployeeLevels
ORDER BY thePath
)
and the CTE query runs (no syntax errors) but the output I get is a table without the Row and Lft and Rgt columns populated
ProductName ProductID ParentProductID TreePath HLevel Row Lft Rgt
----------- ---------- --------------- ---------- ---------- ---------- ---------- ----------
Baby Goods 0 0 1
Baby Food 10 0 0.10 2
All Ages Ba 100 10 0.10.100 3
Strawberry 200 100 0.10.100.2 4
Baby Cereal 250 100 0.10.100.2 4
Beginners 150 10 0.10.150 3
Formula Mil 300 150 0.10.150.3 4
Heinz Formu 310 300 0.10.150.3 5
Nappies 20 0 0.20 2
Small Pack 400 20 0.20.400 3
Bulk Pack N 450 20 0.20.450 3
I think the start of the problem is the Row is not getting populated and therefore the Lft and Rgt columns do not get populated by the following parts of the query.
Are there any sqlite experts out there to tell me:
am I translating the rowid part of the query correctly
does sqlite support a rowid in a part of a CTE query
is there a better way? :)
Any help appreciated :)
am I translating the rowid part of the query correctly
No.
The SQL:
SELECT
EmployeeLevels.*,
rowid AS Row
FROM EmployeeLevels
ORDER BY thePath
defines Row as the rowid of table EmployeeLevels in SQLite, ignoring the ORDER BY clause. That is different from the intention of ROW_NUMBER() OVER (ORDER BY thePath) AS Row.
does sqlite support a rowid in a part of a CTE query
Unfortunately no. I assume you mean this:
WITH foo AS (
SELECT * FROM bar ORDER BY col_a
)
SELECT rowid, *
FROM foo
but SQLite will report "no such column: rowid" for foo.
is there a better way?
Not sure it is better, but at least it works. SQLite has temp tables, which exist for as long as your connection stays open and you don't drop them deliberately. Rewriting the SQL from my example above:
CREATE TEMP TABLE foo AS
SELECT * FROM bar ORDER BY col_a
;
SELECT rowid, *
FROM foo
;
DROP TABLE foo
;
This one will run without SQLite complaining.
update:
As of SQLite version 3.25.0, window functions are supported. Hence you can use the row_number() over (order by x) expression in your CTE if you are on a newer SQLite.
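A minimal sketch of that newer route, using Python's sqlite3 module: the EmployeeRows CTE keeps the original ROW_NUMBER() expression unchanged. The small EmployeeLevels table, its two columns and the function name rows_by_path are made up for the illustration (in the question EmployeeLevels is itself an earlier CTE), and the code assumes the SQLite library bundled with Python is 3.25.0 or newer.

```python
import sqlite3

def rows_by_path(levels):
    """Number rows by thePath with ROW_NUMBER() inside a CTE."""
    if sqlite3.sqlite_version_info < (3, 25, 0):
        raise RuntimeError("window functions need SQLite 3.25.0 or newer")
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for the EmployeeLevels step of the original query.
    conn.execute("CREATE TABLE EmployeeLevels (name TEXT, thePath TEXT)")
    conn.executemany("INSERT INTO EmployeeLevels VALUES (?, ?)", levels)
    result = conn.execute("""
        WITH EmployeeRows AS (
            SELECT EmployeeLevels.*,
                   ROW_NUMBER() OVER (ORDER BY thePath) AS Row
            FROM EmployeeLevels
        )
        SELECT name, thePath, Row
        FROM EmployeeRows
        ORDER BY Row
    """).fetchall()
    conn.close()
    return result

# Rows are inserted out of path order on purpose.
SAMPLE_LEVELS = [
    ("Nappies", "0.20"),
    ("Baby Goods", "0"),
    ("Baby Food", "0.10"),
    ("All Ages", "0.10.100"),
]

if __name__ == "__main__":
    for row in rows_by_path(SAMPLE_LEVELS):
        print(row)
```

Unlike rowid, the Row value here follows thePath regardless of the order in which the rows were inserted.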