I have a scenario where multiple columns hold similar content, and I want to count how many distinct values there are across all of those columns. This is slightly different from the linked case below, where the contents of two columns are treated as a single attribute/element.
Counting DISTINCT over multiple columns
My table is as above. I need to go through all the club columns and count how many distinct clubs there are.
The code below, which is as far as I got, only counts distinct rows, not the individual distinct elements in each column.
select count(*) from( select distinct Club1 Club2 from StudentClubs) as ClubCount
The above returns 6
I need it to output 12 as there are 12 clubs in total.
Thanks in advance.
Actually, I found the solution below as an interim measure: first stack the columns, then use a count distinct.
It seems to work, at least in this example.
select count(distinct A.Club1)
from (
select Club1 from StudentClubs as A
union all
select Club2 from StudentClubs as A
union all
select Club3 from StudentClubs as A
union all
select Club4 from StudentClubs as A
) as A
The above outputs 12.
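For anyone who wants to try the stacking idea without the original table, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the Student column are made up for illustration; only the Club1..Club4 column names come from the question. It also shows a variant that assumes plain UNION (without ALL) is acceptable, since UNION removes duplicates while stacking.

```python
import sqlite3

# Hypothetical sample data; only the Club1..Club4 column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE StudentClubs "
    "(Student TEXT, Club1 TEXT, Club2 TEXT, Club3 TEXT, Club4 TEXT)"
)
conn.executemany(
    "INSERT INTO StudentClubs VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", "Chess", "Drama", "Golf", None),
        ("s2", "Drama", "Chess", None, None),
        ("s3", "Rugby", "Golf", "Chess", "Music"),
    ],
)

# Approach from the post: stack every column with UNION ALL, then COUNT(DISTINCT ...).
count_union_all = conn.execute(
    """
    SELECT COUNT(DISTINCT Club) FROM (
        SELECT Club1 AS Club FROM StudentClubs
        UNION ALL SELECT Club2 FROM StudentClubs
        UNION ALL SELECT Club3 FROM StudentClubs
        UNION ALL SELECT Club4 FROM StudentClubs
    ) AS Stacked
    """
).fetchone()[0]

# Variant: UNION already de-duplicates, so a plain COUNT is enough.
# COUNT(Club) skips the NULL row that empty club slots produce.
count_union = conn.execute(
    """
    SELECT COUNT(Club) FROM (
        SELECT Club1 AS Club FROM StudentClubs
        UNION SELECT Club2 FROM StudentClubs
        UNION SELECT Club3 FROM StudentClubs
        UNION SELECT Club4 FROM StudentClubs
    ) AS Stacked
    """
).fetchone()[0]

print(count_union_all, count_union)
```

Both queries count each club once no matter which column it appears in, and both ignore NULLs.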
Please do share if you have better ways to handle this.
Thanks
Related
I have a database table with two integer columns, and I'm trying to find a way to select the top 'x' rows with the highest sums of these two columns. I'd like to avoid creating a third column that stores the sum of the two, unless there's a way to automatically update this column every time one of the other two is altered. I'm using SQLite, by the way, as I know there are some slight differences here and there between standard SQL and SQLite syntax.
Any help is appreciated.
Something like
SELECT a, b
FROM yourtable
ORDER BY a + b DESC
LIMIT :x
should do it.
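A small sketch of how that query could be run from Python's sqlite3 module, binding :x as a named parameter. The table contents are invented for illustration. The expression index is an optional extra, not something the answer mentions; it assumes a SQLite build new enough to support indexes on expressions (3.9.0 or later), and it may let SQLite avoid sorting the whole table without storing a third column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (a INTEGER, b INTEGER)")
# Made-up rows for illustration.
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, 2), (10, 5), (7, 7), (0, 3), (4, 20)],
)

# Optional (an assumption, not part of the answer): an index on the expression
# a + b, maintained automatically by SQLite whenever a or b changes.
conn.execute("CREATE INDEX idx_sum ON yourtable (a + b)")

# :x is bound as a named parameter at execution time.
top_rows = conn.execute(
    "SELECT a, b FROM yourtable ORDER BY a + b DESC LIMIT :x",
    {"x": 3},
).fetchall()

print(top_rows)
```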
In ClickHouse, is there any way to use the topK function on more than one column?
For example:
select topK(10)(AGE,COUNTRY) ...
meaning I want the top 10 combinations of AGE + COUNTRY.
I only found a workaround that uses concat on the fields and topK on the result, and wondered if there is any other way.
You can pass an array (or tuple) of columns to topK:
SELECT topK(10)([Age, Country])
FROM table
Or use the straightforward calculation (it is much slower but provides the exact result):
SELECT
Age,
Country
FROM table
GROUP BY
Age,
Country
ORDER BY count() DESC
LIMIT 10
I'm elaborating on this question (Copy Column Value from One table into Another Matching IDs - SQLite) by adding an extra challenge.
The idea is to copy the values of several columns from one table to another when an id matches. The aforementioned question addresses how to copy the content of one column when a matching id is found. Here is the code as posted by @scaisEdge:
UPDATE t1
SET value = (
SELECT value
FROM t2
WHERE t1.ID = t2.ID)
But what if we want to update several columns from that same row where t1.ID = t2.ID? Of course, one could run the statement several times, once for each column to be updated, but that is extraordinarily inefficient since I have to update millions of rows. I guess that the fewer logical comparisons the query has to do, the faster it will be. Any other ways of optimizing this task are welcome. The IDs are unique, both tables have the same number of rows, and exactly the same ID values are found in both tables, so sorting the tables is not out of the question.
If your version of SQLite is 3.15.0+ you can do it with Row Values:
update t1 set
(col1, col2) = (
select col1, col2
from t2
where t2.id = t1.id
)
See the demo.
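In case the demo link is unavailable, here is a minimal sketch of the row-value update run through Python's sqlite3 module. The rows are invented, and declaring id as INTEGER PRIMARY KEY is an assumption on my part (it makes the correlated lookup a direct rowid lookup, which should matter with millions of rows). It needs the SQLite library bundled with Python to be 3.15.0 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema; id as INTEGER PRIMARY KEY is an assumption.
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT);
    INSERT INTO t1 VALUES (1, NULL, NULL), (2, NULL, NULL);
    INSERT INTO t2 VALUES (1, 'a', 'b'), (2, 'c', 'd');
    """
)

# One pass over t1: both columns are filled from the same matching t2 row.
conn.execute(
    """
    UPDATE t1 SET
        (col1, col2) = (
            SELECT col1, col2
            FROM t2
            WHERE t2.id = t1.id
        )
    """
)
conn.commit()

rows = conn.execute("SELECT id, col1, col2 FROM t1 ORDER BY id").fetchall()
print(rows)
```

Note that a t1 row with no match in t2 would get NULL in both columns; that cannot happen under the question's assumption that both tables hold exactly the same IDs.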
I have two tables, one has single entries like this:
'rs47' 1027
The other has ranges:
'gene1' 1000 1500
These tables are huge, so I am trying to figure out the most efficient way to get all entries from table 1 where the entries are within any range in table 2.
I don't think that INTERSECT can be used like this. I know how to use SELECT to do this for a single entry:
SELECT name FROM 'table2' INDEXED BY 'start_end' WHERE 1027 BETWEEN start AND end
But I am not sure how to do that for every record in a table. Any ideas?
To check whether corresponding rows exist in the other table, you can use a correlated subquery:
SELECT *
FROM Table1
WHERE EXISTS (SELECT 1
FROM Table2
WHERE Table1.Value BETWEEN Table2.StartValue AND Table2.EndValue);
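Here is a small sketch of that EXISTS query run through Python's sqlite3 module. The table and column names follow the answer; the Name columns, the sample rows, and the composite index are assumptions added for illustration (the index is named start_end after the one in the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema and data modelled on the question's examples.
    CREATE TABLE Table1 (Name TEXT, Value INTEGER);
    CREATE TABLE Table2 (Name TEXT, StartValue INTEGER, EndValue INTEGER);
    CREATE INDEX start_end ON Table2 (StartValue, EndValue);
    INSERT INTO Table1 VALUES ('rs47', 1027), ('rs48', 1600), ('rs49', 2100);
    INSERT INTO Table2 VALUES ('gene1', 1000, 1500), ('gene2', 2000, 2500);
    """
)

# Keep every Table1 row whose Value falls inside at least one Table2 range.
matches = conn.execute(
    """
    SELECT Name, Value
    FROM Table1
    WHERE EXISTS (SELECT 1
                  FROM Table2
                  WHERE Table1.Value BETWEEN Table2.StartValue AND Table2.EndValue)
    ORDER BY Value
    """
).fetchall()

print(matches)
```

EXISTS stops at the first matching range, so an entry covered by several overlapping ranges is still returned only once.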
I have two tables: one, called watch_list, contains a list of items with some important attributes, and the other, called price_history, is just a list of prices. What I would like to do is group the 10 lowest prices into a single column with a group_concat operation, and then create a row with the item attributes from watch_list along with the 10 lowest prices for each item in watch_list. First I tried joins, but then I realized that the operations were happening in the wrong order, so there was no way I could get the desired result with a join. Then I tried the obvious thing: I queried price_history for every row in watch_list and glued everything together in the host environment, which worked but seemed very inefficient. Now I have the following query, which looks like it should work but is not giving me the results that I want. I would like to know what is wrong with the following statement:
select w.asin,w.title,
(select group_concat(lowest_used_price) from price_history as p
where p.asin=w.asin limit 10)
as lowest_used
from watch_list as w
Basically I want the limit operation to happen before group_concat does anything but I can't think of a sql statement that will do that.
Figured it out. As somebody once said, "All problems in computer science can be solved by another level of indirection." In this case an extra select subquery did the trick:
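select w.asin,w.title,
(select group_concat(lowest_used_price)
from (select lowest_used_price from price_history as p
where p.asin=w.asin
-- order by makes limit keep the 10 lowest prices rather than an arbitrary 10
order by lowest_used_price limit 10)) as lowest_used
from watch_list as w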
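For reference, a minimal sketch of the nested-subquery approach run through Python's sqlite3 module. The sample rows are invented, the limit is bound as a hypothetical :n parameter (3 here, to keep the data small), and the inner ORDER BY is my assumption about the intent ("the lowest prices"), since LIMIT without ORDER BY keeps an arbitrary set of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical data; table and column names follow the question.
    CREATE TABLE watch_list (asin TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE price_history (asin TEXT, lowest_used_price INTEGER);
    INSERT INTO watch_list VALUES ('A1', 'Widget'), ('B2', 'Gadget');
    INSERT INTO price_history VALUES
        ('A1', 30), ('A1', 10), ('A1', 20), ('A1', 40), ('B2', 5);
    """
)

# The innermost SELECT picks the n lowest prices per item first;
# group_concat then only sees those rows.
rows = conn.execute(
    """
    SELECT w.asin, w.title,
           (SELECT group_concat(lowest_used_price)
            FROM (SELECT lowest_used_price
                  FROM price_history AS p
                  WHERE p.asin = w.asin
                  ORDER BY lowest_used_price
                  LIMIT :n)) AS lowest_used
    FROM watch_list AS w
    ORDER BY w.asin
    """,
    {"n": 3},
).fetchall()

# group_concat does not promise an output order, so sort after splitting.
lowest = {asin: sorted(int(p) for p in prices.split(","))
          for asin, _title, prices in rows}
print(lowest)
```

The original statement failed because the LIMIT was applied to the single row that group_concat produces, not to the rows feeding it.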