Copy SEVERAL Column Values from One table into Another Matching IDs - SQLite

I'm elaborating on this question (Copy Column Value from One table into Another Matching IDs - SQLite) by adding an extra challenge.
The idea is to copy the values of several columns from one table to another when the ids match. The aforementioned question addresses how to copy the content of one column when a matching id is found. Here is the code as posted by @scaisEdge:
UPDATE t1
SET value = (
SELECT value
FROM t2
WHERE t1.ID = t2.ID)
But what if we want to update several columns from that same row where t1.ID = t2.ID? Of course, one could run the statement several times, once for each column to be updated, but that is extraordinarily inefficient, as I have to update millions of rows. I guess that the fewer logical comparisons the query has to do, the faster it will be. Any other ways of optimizing this task are welcome: the IDs are unique, both tables have the same number of rows, and the exact same ID values are found in both tables. So sorting the tables is not out of the question.

If your version of SQLite is 3.15.0+ you can do it with Row Values:
update t1 set
(col1, col2) = (
select col1, col2
from t2
where t2.id = t1.id
)
See the demo.
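As a minimal sketch of the row-value update above, the following Python script uses the standard sqlite3 module against an in-memory database. The two-row sample data and the helper name copy_columns are assumptions made for illustration; only the table and column names come from the answer. It needs an SQLite library of version 3.15.0 or later.

```python
import sqlite3


def copy_columns(conn):
    """Copy col1 and col2 from t2 into t1 for matching ids in a single UPDATE."""
    conn.execute(
        """
        UPDATE t1
        SET (col1, col2) = (
            SELECT col1, col2
            FROM t2
            WHERE t2.id = t1.id
        )
        """
    )


conn = sqlite3.connect(":memory:")
# Hypothetical sample data: t1 starts empty, t2 holds the values to copy.
conn.executescript(
    """
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    INSERT INTO t1 VALUES (1, NULL, NULL), (2, NULL, NULL);
    INSERT INTO t2 VALUES (1, 'a', 'b'), (2, 'c', 'd');
    """
)
copy_columns(conn)
rows = conn.execute("SELECT id, col1, col2 FROM t1 ORDER BY id").fetchall()
print(rows)
```

Declaring id as INTEGER PRIMARY KEY makes it an alias for the rowid, so the correlated subquery becomes a direct key lookup rather than a scan of t2, which matters when millions of rows are updated.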

Related

SQLITE select unique rows

I have a table where rows appear to be "duplicates" but actually are not (they have different dates).
Suppose each record has a column A that is supposed to be unique. However, because a record may appear again later with updated information (with column A unchanged), column A is no longer unique even though it should be.
Therefore I want a table with the latest information only. Currently this table contains 500k entries, but the "true" number of unique entries is less than half of that.
I have tried
SELECT *
FROM TABLE
WHERE A = A
AND Date = (SELECT MAX(Date) from TABLE)
ORDER BY DATE
However this only returns 2 results. How do I achieve that?
The subquery on the date is the correct idea, but you must include the column A in the subquery and relate it back to the main table. I prefer to use explicit joins rather than embedding the subquery in the WHERE clause. This is usually more efficient anyway.
SELECT TABLE.*
FROM TABLE INNER JOIN
(SELECT A, MAX(Date) AS MaxDate FROM TABLE GROUP BY A) AS latest
ON TABLE.A = latest.A AND TABLE.date = latest.MaxDate
ORDER BY A, date
Or even better, I prefer CTE (Common Table Expression) syntax, since it makes the individual queries easier to read:
WITH latest AS (
SELECT A, MAX(Date) AS MaxDate
FROM TABLE
GROUP BY A
)
SELECT TABLE.*
FROM TABLE INNER JOIN latest
ON TABLE.A = latest.A AND TABLE.date = latest.MaxDate
ORDER BY TABLE.A, TABLE.date
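The CTE version can be tried out with a short Python script using the standard sqlite3 module. This is only a sketch: the table name records (standing in for the placeholder TABLE, which is a reserved word), the Info column, and the sample rows are assumptions, not part of the original question.

```python
import sqlite3

# Same shape as the CTE query above, with "records" standing in for TABLE.
LATEST_PER_A = """
WITH latest AS (
    SELECT A, MAX(Date) AS MaxDate
    FROM records
    GROUP BY A
)
SELECT records.A, records.Date, records.Info
FROM records INNER JOIN latest
    ON records.A = latest.A AND records.Date = latest.MaxDate
ORDER BY records.A, records.Date
"""

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: 'x' appears twice, 'y' once.
conn.executescript(
    """
    CREATE TABLE records (A TEXT, Date TEXT, Info TEXT);
    INSERT INTO records VALUES
        ('x', '2020-01-01', 'old'),
        ('x', '2020-03-01', 'new'),
        ('y', '2020-02-01', 'only');
    """
)
latest_rows = conn.execute(LATEST_PER_A).fetchall()
print(latest_rows)
```

Dates stored as ISO 8601 text (YYYY-MM-DD) compare correctly as strings, which is what MAX(Date) relies on here.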
Comparison to other answer
The answer by MikeT relies on a non-standard feature of SQLite. That is okay in itself, as long as you are aware that the solution is not compatible with other database engines/servers and SQL dialects.
The next possible gotcha depends on your actual data and table schema (neither of which you shared in the question details). If your data allows multiple rows with the same date for a single A column value, then the conditions in your question are not enough to definitively remove all duplicates. You would need to identify another column by which to resolve any remaining duplicates, but once again your question did not do that.
However, in testing, I found that my solution allows unresolved duplicates to remain in the results. MikeT's solution eliminates all duplicates, but it does so by arbitrarily excluding one of those duplicates. There are ways to fix either solution to definitively select which duplicate to keep, but I will not even attempt that unless you post actual data and the table schema, so that my answer is not mere guessing. I'm glad that my answer was useful thus far, but you need to understand your data better (than revealed in the question) to decide which solution is actually best.
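To make the tie problem concrete, here is a small Python sketch using the standard sqlite3 module. The records table, the Info column, and the sample rows are hypothetical, and using the rowid as the tie-breaker is only one possible choice, picked here because the question names no other column; it is not the method of either answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two rows for 'x' share the same latest date.
conn.executescript(
    """
    CREATE TABLE records (A TEXT, Date TEXT, Info TEXT);
    INSERT INTO records VALUES
        ('x', '2020-03-01', 'first'),
        ('x', '2020-03-01', 'second'),
        ('y', '2020-02-01', 'only');
    """
)

# The join on MAX(Date) keeps every row that ties for the latest date.
join_rows = conn.execute(
    """
    SELECT r.A, r.Date, r.Info
    FROM records AS r
    JOIN (SELECT A, MAX(Date) AS MaxDate FROM records GROUP BY A) AS latest
        ON r.A = latest.A AND r.Date = latest.MaxDate
    ORDER BY r.A, r.Info
    """
).fetchall()

# Assumed tie-breaker: among equal dates, keep the row with the highest rowid.
tiebreak_rows = conn.execute(
    """
    SELECT r.A, r.Date, r.Info
    FROM records AS r
    WHERE r.rowid = (
        SELECT r2.rowid
        FROM records AS r2
        WHERE r2.A = r.A
        ORDER BY r2.Date DESC, r2.rowid DESC
        LIMIT 1
    )
    ORDER BY r.A
    """
).fetchall()

print(join_rows)
print(tiebreak_rows)
```

Any column that orders the tied rows unambiguously could replace rowid in the inner ORDER BY.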
Bonus
Against my better judgement to just keep expanding on answers... since you should really research this separately... here's an example of how you would continue joining this with other queries...
WITH latest AS (
SELECT A, MAX(Date) AS MaxDate
FROM TABLE
GROUP BY A
),
firstResults AS (
SELECT TABLE.*
FROM TABLE INNER JOIN latest
ON TABLE.A = latest.A AND TABLE.date = latest.MaxDate
ORDER BY TABLE.A, TABLE.date
)
SELECT otherTable.*
FROM firstResults JOIN otherTable
ON firstResults.A = otherTable.A
WHERE somecondition = 'foobar'
Another approach if you're using a somewhat recent version of sqlite (3.25 or newer), using the row_number() window function to rank groups of the same a value by date and picking the first one:
WITH cte AS
(SELECT a, date, row_number() OVER (PARTITION BY a ORDER BY date DESC) AS rn
FROM yourtable)
SELECT a, date
FROM cte
WHERE rn = 1;
One important thing to note, since I noticed you mentioning that another answer was slow: an index on yourtable(a, date DESC) will be needed for this query for best results, and an index on (a, date) will speed up the other answers given.
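A minimal sketch of the window-function approach together with the suggested index, run through Python's standard sqlite3 module. The sample rows and the index name yourtable_a_date are assumptions for illustration, and the underlying SQLite library must be 3.25 or newer for row_number() to be available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data plus the (a, date DESC) index mentioned above.
conn.executescript(
    """
    CREATE TABLE yourtable (a TEXT, date TEXT);
    CREATE INDEX yourtable_a_date ON yourtable(a, date DESC);
    INSERT INTO yourtable VALUES
        ('x', '2020-01-01'),
        ('x', '2020-03-01'),
        ('y', '2020-02-01');
    """
)

# Rank rows within each value of a by date, newest first, and keep rank 1.
newest_rows = conn.execute(
    """
    WITH cte AS (
        SELECT a, date,
               row_number() OVER (PARTITION BY a ORDER BY date DESC) AS rn
        FROM yourtable
    )
    SELECT a, date
    FROM cte
    WHERE rn = 1
    ORDER BY a
    """
).fetchall()
print(newest_rows)
```

Prefixing the query with EXPLAIN QUERY PLAN is one way to check whether the index is actually used on a given SQLite version.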
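I believe, if I understand what you have written, that you could use :-
SELECT a,max(date), other FROM mytable GROUP BY a ORDER BY date;
Note that the other column represents any other columns (if present).
However, per the documentation quoted below, the other column is taken from an arbitrarily chosen row of the group, which may well be the required value (in the example it is). In SQLite 3.7.11 and later, a query with a single min() or max() aggregate takes its bare columns from the row that holds the minimum or maximum, which would explain why the example returns the required value.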
As per :-
Each expression in the result-set is then evaluated once for each
group of rows. If the expression is an aggregate expression, it is
evaluated across all rows in the group. Otherwise, it is evaluated
against a single arbitrarily chosen row from within the group. If
there is more than one non-aggregate expression in the result-set,
then all such expressions are evaluated for the same row.
SQL As Understood By SQLite - SELECT
More correctly, to eliminate an arbitrary value for the other column, you could use :-
SELECT
a /* will always be the same and isn't arbitrary */,
max(date) /* will be the maximum date */ AS date,
(SELECT other FROM mytable WHERE a = m.a AND date = m.date) AS other
FROM mytable AS m /* AS m allows the outer query to be distinguished from the inner query */
GROUP BY a /* this effectively removes duplicates on the a column */
ORDER BY date
;
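The behaviour of the shorter GROUP BY query can be checked with a small Python sketch using the standard sqlite3 module. The three sample rows are hypothetical, with the newest row deliberately placed in the middle. The sketch relies on SQLite's documented handling of bare columns next to a single max() aggregate (3.7.11 and later), which is specific to SQLite and not portable to other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: the row with the latest date is neither first nor last.
conn.executescript(
    """
    CREATE TABLE mytable (a TEXT, date TEXT, other);
    INSERT INTO mytable VALUES
        ('1', '2020-01-01', 'old'),
        ('1', '2020-03-01', 'latest'),
        ('1', '2020-02-01', 'middle');
    """
)

# With exactly one max() aggregate, SQLite takes the bare column "other"
# from the same row that supplied the maximum date.
short_rows = conn.execute(
    "SELECT a, max(date), other FROM mytable GROUP BY a"
).fetchall()
print(short_rows)
```

If a second min() or max() aggregate were added to the select list, this guarantee would no longer identify a single source row, so the explicit subquery form remains the safer choice there.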
The example below appears to produce the same result.
Example :-
Using the following to populate the table with some generated testing data :-
CREATE TABLE IF NOT EXISTS mytable (a TEXT, date TEXT, other);
WITH cte(count,a,date,other) AS
(
SELECT 1,1,date('now','+'||(abs(random()) % 30)||' days'),'other1'
UNION ALL SELECT count+1,abs(random()) % 20,date('now','+'||(abs(random()) % 30)||' days'), 'other'||(count+1) FROM cte LIMIT 100
)
INSERT INTO mytable (a,date,other) SELECT a,date,other FROM cte
;
SELECT * FROM mytable ORDER BY DATE DESC;
In this case the resulting rows were shown in a screenshot (not reproduced here), with the rows that need to be extracted highlighted.
Then after the above has been run the following is run
SELECT * FROM mytable WHERE a = a AND date = (SELECT MAX(date) FROM mytable);
SELECT * FROM mytable WHERE /*a = a AND*/ date = (SELECT MAX(date) FROM mytable);
/* Will only select 1 row per unique value of a BUT other will be an arbitrary value, not necessarily the latest */
SELECT a,max(date), other FROM mytable GROUP BY a /* GROUP BY effectively returns one row per unique a */;
SELECT
a /* will always be the same and isn't arbitrary */,
max(date) /* will be the maximum date */ AS date,
(SELECT other FROM mytable WHERE a = m.a AND date = m.date) AS other
FROM mytable AS m
GROUP BY a
;
The first two results show that a = a does nothing, as it will always be true.
The third query produces one row per value of a, unordered (result screenshot not reproduced here). In that screenshot, ticks were assigned by checking the value of other against the previous result.
In this case the shorter query works OK even though the values of other are arbitrary (they aren't really, as it depends upon how the query planner plans the query).
The fourth, the more correct, produces the same results.
Results 2 (your original query) and 3 (original without a = a) each contain only the rows that have the overall maximum date (result screenshots not reproduced here).

How can I save row numbers to a table in SQLite?

I have added row numbers to a table (merged) thus:
SELECT ROW_NUMBER() OVER (ORDER BY Pclass) RowNum, *
FROM merged;
Which returns:
1|1|0|58|0|0|146.5208|0|20|0|1|1|0.53043592
2|1|0|31|1|0|113.275|0|23|0|1|1|0.671198682
3|1|0|38|0|0|227.525|0|29|0|1|1|0.888825796
4|1|0|36|0|2|71|0|23|1|0|1|0.49853335
However, when I then check merged, the row numbers are no longer present (note that this produces unordered results, but nevertheless shows the point I am making):
SELECT * FROM merged;
2|0|24|0|0|13|0|38|1|0|0|0.505845678
3|1|61|0|0|6.2375|0|25|1|0|0|0.128146005
2|0|17|0|0|12|0|21|0|1|1|0.465261004
2|1|18|0|0|11.5|0|26|1|0|0|0.458356337
I suspect that the way to achieve this is to update merged by adding a new column and then adding the row numbers to said column, but I don't know how to go about it.
As such, my question is this: how can I save row numbers to merged?
A SELECT statement won't change the merged table. Updating the existing table in place is a bit complicated, and I guess there is no trivial way to do that. So the easier way is to create a new table, drop the previous one, and rename the new one to the old name.
This code should work:
CREATE TABLE new_merged
AS SELECT ROW_NUMBER() OVER (ORDER BY Pclass) RowNum, * FROM merged;
DROP TABLE merged;
ALTER TABLE new_merged RENAME TO merged;
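The create, drop, and rename steps can be exercised end to end with Python's standard sqlite3 module. This is a sketch under stated assumptions: the two-column merged table (Pclass plus a hypothetical Age column) and its four rows are invented for illustration, and the SQLite library must be 3.25 or newer for ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, cut-down stand-in for the real merged table.
    CREATE TABLE merged (Pclass INTEGER, Age INTEGER);
    INSERT INTO merged VALUES (2, 24), (3, 61), (1, 58), (1, 31);

    -- Materialise the row numbers in a new table, then swap it in.
    CREATE TABLE new_merged AS
        SELECT ROW_NUMBER() OVER (ORDER BY Pclass) AS RowNum, * FROM merged;
    DROP TABLE merged;
    ALTER TABLE new_merged RENAME TO merged;
    """
)

# Column names of the rebuilt table, in declaration order.
columns = [row[1] for row in conn.execute("PRAGMA table_info(merged)")]
numbered = conn.execute(
    "SELECT RowNum, Pclass FROM merged ORDER BY RowNum"
).fetchall()
print(columns)
print(numbered)
```

A table created with CREATE TABLE ... AS SELECT does not inherit indexes, primary keys, or other constraints from the source table, so those would need to be recreated afterwards.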

sqlite3: Intersect two tables where one value BETWEEN two others

I have two tables, one has single entries like this:
'rs47' 1027
The other has ranges:
'gene1' 1000 1500
These tables are huge, so I am trying to figure out the most efficient way to get all entries from table 1 where the entries are within any range in table 2.
I don't think that INTERSECT can be used like this. I know how to use SELECT to do this for a single entry:
SELECT name FROM 'table2' INDEXED BY 'start_end' WHERE 1027 BETWEEN start AND end
But I am not sure how to do that for every record in a table. Any ideas?
To check whether corresponding rows exist in the other table, you can use a correlated subquery:
SELECT *
FROM Table1
WHERE EXISTS (SELECT 1
FROM Table2
WHERE Table1.Value BETWEEN Table2.StartValue AND Table2.EndValue);
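A short Python sketch (standard sqlite3 module) shows the correlated EXISTS query on the two example entries from the question. The Name column, the extra rs99 row, and the exact table layout are assumptions; Value, StartValue, EndValue, and the index name start_end follow the question and answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table1 (Name TEXT, Value INTEGER);
    CREATE TABLE Table2 (Name TEXT, StartValue INTEGER, EndValue INTEGER);
    CREATE INDEX start_end ON Table2(StartValue, EndValue);
    -- 'rs47' falls inside gene1's range; the hypothetical 'rs99' does not.
    INSERT INTO Table1 VALUES ('rs47', 1027), ('rs99', 2000);
    INSERT INTO Table2 VALUES ('gene1', 1000, 1500);
    """
)

# Keep each Table1 row for which at least one Table2 range contains its value.
matches = conn.execute(
    """
    SELECT Name, Value
    FROM Table1
    WHERE EXISTS (SELECT 1
                  FROM Table2
                  WHERE Table1.Value BETWEEN Table2.StartValue
                                         AND Table2.EndValue)
    ORDER BY Name
    """
).fetchall()
print(matches)
```

EXISTS stops at the first matching range, so a Table1 row is returned once even when several ranges overlap it.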

Can you only use one Select command w/SqlDataSource

This is a pretty simple question that I haven't been able to find an answer for. Is it possible to have two separate SELECT commands (from the same table) in the same SqlDataSource command to populate two different cells in a given GridView?
I haven't been able to find current information so far.
::EDIT::
The challenge is that I'm attempting to populate one cell with a COUNT command and the second cell with a numerical grand total from the same information.
You can combine the results from two separate SELECT statements by doing something like this:
SELECT X.A , Y.B
FROM (SELECT Column1 AS A FROM TableName) X, (SELECT Column2 AS B FROM TableName) Y
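Applied to the COUNT and grand-total case from the question, the same pattern might look like the sketch below. It runs against SQLite through Python's standard sqlite3 module purely for illustration; the question concerns SqlDataSource, so the database engine, the sample values, and the use of Column2 as the column being totalled are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data for the placeholder table used in the answer.
conn.executescript(
    """
    CREATE TABLE TableName (Column1 TEXT, Column2 INTEGER);
    INSERT INTO TableName VALUES ('a', 10), ('b', 20), ('c', 30);
    """
)

# Each derived table yields exactly one row, so the cross join yields one row
# holding both values side by side.
count_and_total = conn.execute(
    """
    SELECT X.A, Y.B
    FROM (SELECT COUNT(*) AS A FROM TableName) X,
         (SELECT SUM(Column2) AS B FROM TableName) Y
    """
).fetchone()
print(count_and_total)
```

Because both values come from the same table, a single SELECT COUNT(*), SUM(Column2) FROM TableName would return the same row with one scan.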

SQLite - Selecting rows from table using a comparer function

I have a table in my SQLite database where one of the columns is just a free text.
Also, I have a custom function defined that calculates the Levenshtein distance between two given strings. Basically it is just a comparer function that returns an integer value (the distance between both strings).
My goal is to retrieve ALL the rows from that table that share a distance lower than a given value D between them.
Is this possible using queries? I thought GROUP BY would be the answer but I haven't gotten any semi decent results I can share.
Thanks in advance for any help provided.
You have to join the table with itself:
SELECT *
FROM MyTable T1 JOIN
MyTable T2 ON T1.ID < T2.ID AND
LDist(T1.TextColumn, T2.TextColumn) < 42
(The ID comparison prevents returning two result records for the same pair.)
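The self-join can be tried with Python's standard sqlite3 module, which lets a Python function be registered as an SQL function via create_function. In this sketch the levenshtein implementation, the sample strings, and the threshold of 3 are assumptions; LDist, MyTable, ID, and TextColumn follow the names in the answer.

```python
import sqlite3


def levenshtein(s, t):
    """Edit distance between two strings, computed row by row."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the name used in the answer.
conn.create_function("LDist", 2, levenshtein)
conn.executescript(
    """
    CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, TextColumn TEXT NOT NULL);
    INSERT INTO MyTable VALUES
        (1, 'kitten'), (2, 'sitting'), (3, 'mitten'), (4, 'banana');
    """
)

# Each unordered pair is compared once thanks to the T1.ID < T2.ID condition.
close_pairs = conn.execute(
    """
    SELECT T1.ID, T2.ID
    FROM MyTable T1 JOIN
         MyTable T2 ON T1.ID < T2.ID AND
                       LDist(T1.TextColumn, T2.TextColumn) < 3
    ORDER BY T1.ID, T2.ID
    """
).fetchall()
print(close_pairs)
```

The join evaluates the distance for every pair of rows, so the work grows quadratically with the table size; that cost is inherent to comparing all rows against each other.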
