I'm running the following query with SQLite against a geo database:
SELECT name, max(inhabitants) FROM countries GROUP BY continent
I'm consistently getting the names and inhabitants of the countries with the largest population, per continent. Is this by chance or some kind of expected (documented?) behavior that I can rely on?
In this statement:
SELECT name, max(inhabitants) FROM countries GROUP BY continent
the column name, although it is in the SELECT list, neither appears in the GROUP BY clause nor is aggregated.
For SQLite this is a "bare" column, and because it is used alongside the MAX() aggregate function, the value returned for name comes from the row that holds the maximum value of inhabitants for each continent.
And yes, this behavior is documented in the SQLite documentation: Simple Select Processing (at the end of the section).
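A quick way to check the bare-column behavior described above is to run the query against a tiny in-memory database. This is a minimal sketch using Python's built-in sqlite3 module; the sample rows (Alpha, Beta, Gamma, Delta and their figures) are made up for illustration and are not from the original database.

```python
import sqlite3

# In-memory database with invented sample data (assumption, not the asker's data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (name TEXT, continent TEXT, inhabitants INTEGER)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?, ?)",
    [
        ("Alpha", "North", 10),
        ("Beta", "North", 30),
        ("Gamma", "South", 25),
        ("Delta", "South", 5),
    ],
)

# "name" is a bare column: SQLite takes it from the row that supplied max().
rows = conn.execute(
    "SELECT name, max(inhabitants) FROM countries "
    "GROUP BY continent ORDER BY continent"
).fetchall()
print(rows)  # [('Beta', 30), ('Gamma', 25)]
```

Note that the SQLite documentation describes this special case only for min() and max(); if two rows tie for the maximum, the row that supplies the bare column is chosen arbitrarily.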
In ClickHouse, is there any way to use topK on more than one column,
for example:
select topK(10)(AGE,COUNTRY) ...
meaning I want the top 10 combinations of AGE + COUNTRY?
I only found a workaround that concatenates the fields and runs topK on the result, and I wondered whether there is another way.
You can pass an array (or tuple) of columns to topK:
SELECT topK(10)([Age, Country])
FROM table
Or use the straightforward calculation (it is much slower but provides the exact result):
SELECT
    Age,
    Country
FROM table
GROUP BY
    Age,
    Country
ORDER BY count() DESC
LIMIT 10
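The exact GROUP BY variant is plain SQL, so its shape can be tried outside ClickHouse. The sketch below runs it with Python's sqlite3 module as a stand-in; the table name people and the sample rows are assumptions, count(*) replaces ClickHouse's count(), and topK itself is ClickHouse-specific and is not reproduced here.

```python
import sqlite3

# SQLite stand-in for the exact "GROUP BY ... ORDER BY count DESC LIMIT n" query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (Age INTEGER, Country TEXT)")  # hypothetical table

# Invented sample: (30, DE) occurs 3 times, (25, FR) twice, (30, FR) once.
sample = [(30, "DE")] * 3 + [(25, "FR")] * 2 + [(30, "FR")]
conn.executemany("INSERT INTO people VALUES (?, ?)", sample)

top = conn.execute(
    """
    SELECT Age, Country, count(*) AS n
    FROM people
    GROUP BY Age, Country
    ORDER BY n DESC
    LIMIT 2
    """
).fetchall()
print(top)  # [(30, 'DE', 3), (25, 'FR', 2)]
```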
Say I have a table with three fields message, environment and function.
I want to count up the records by message, environment and function, and then select the highest scoring row for any combination.
Getting the counts is easy
Table
| summarize count() by message, environment, function
...but how do I get just one row with the top count? My solution so far is to create a new table that tallies the counts, then tally max() by environment, function and then do a join, but this seems like an expensive and complicated workaround.
If I understand your original question correctly, you may want to look into summarize arg_max() as well: https://learn.microsoft.com/en-us/azure/kusto/query/arg-max-aggfunction
Ah, just modify the solution from "Add column of totals pr. field value" to use max instead of sum.
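To make the intent of the arg_max() suggestion concrete, here is a small Python illustration of the semantics only: count per (message, environment, function), then keep the top message for each (environment, function) pair. It is not Kusto code, the records are invented, and it assumes the asker wants one winner per environment and function. In Kusto this would probably look something like summarize arg_max(count_, message) by environment, function applied to the counts, but treat that exact form as an assumption to check against the linked documentation.

```python
from collections import Counter

# Invented sample records: (message, environment, function).
records = [
    ("timeout", "prod", "login"),
    ("timeout", "prod", "login"),
    ("bad input", "prod", "login"),
    ("timeout", "test", "search"),
]

# Step 1: the equivalent of "summarize count() by message, environment, function".
counts = Counter(records)

# Step 2: for each (environment, function), keep the message with the highest count.
best = {}
for (message, environment, function), n in counts.items():
    key = (environment, function)
    if key not in best or n > best[key][1]:
        best[key] = (message, n)

print(best)
```

Both steps happen in a single pass over the counts, which is the point of arg_max: no second table and no join are needed.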
This SQLite query returns zero rows, as expected:
SELECT 1 FROM tbl WHERE 0;
This query, on the other hand, returns one row containing a null column:
SELECT MAX(1) FROM tbl WHERE 0;
Why does the second query return one row, rather than zero rows?
'Normal' queries return one result for each (filtered) record of the table in the FROM clause.
However, when you are using some aggregate function, the result has one record for each group in the source table. And if there is no GROUP BY clause, the entire table is one group.
With GROUP BY, there are no results for empty groups.
However, without GROUP BY, you always have exactly one group and get one result record, even if that group ends up being empty.
Your query would get zero rows with MAX if it were also using GROUP BY (with a constant value, which would put all records, if there were any, into a single group):
SELECT MAX(1) FROM tbl WHERE 0 GROUP BY NULL;
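The three cases discussed above can be compared side by side. This is a minimal sketch with Python's sqlite3 module; the single-column definition of tbl is an assumption, since the original table is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (x INTEGER)")  # assumed schema
conn.execute("INSERT INTO tbl VALUES (1)")

# No aggregate: one result per row that passes the filter, so nothing here.
plain = conn.execute("SELECT 1 FROM tbl WHERE 0").fetchall()

# Aggregate without GROUP BY: the whole (empty) input is one group.
aggregated = conn.execute("SELECT MAX(1) FROM tbl WHERE 0").fetchall()

# Aggregate with GROUP BY: empty input means no groups, hence no rows.
grouped = conn.execute("SELECT MAX(1) FROM tbl WHERE 0 GROUP BY NULL").fetchall()

print(plain, aggregated, grouped)  # [] [(None,)] []
```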
I'm working through the Oracle PDFs to learn PL/SQL.
There is an exercise where I have to create a new table with data from
two other, already existing tables. I thought this would do the trick:
CREATE TABLE new_depts
AS SELECT d.department_id, d.department_name, sum(e.salary) dept_sal
FROM employees e, departments d
WHERE e.department_id = d.department_id;
But this raises the following error:
SQL Error: ORA-00937: not a single-group group function
00937. 00000 - "not a single-group group function"
I can't find anything useful about this error. From what I know so far
about SQL, my code should work fine!
Am I wrong?
Try adding a GROUP BY clause:
CREATE TABLE new_depts
AS SELECT d.department_id, d.department_name, sum(e.salary) dept_sal
FROM employees e, departments d
WHERE e.department_id = d.department_id
group by d.department_id,d.department_name
Update 1
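You need a GROUP BY clause in your select query because the select list mixes an aggregate function, sum(e.salary), with the non-aggregated columns d.department_id and d.department_name. Whenever an aggregate function appears next to non-aggregated columns, every one of those columns has to be listed in the GROUP BY clause. See the documentation on the GROUP BY clause for more information.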
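The grouped statement can be tried on a small scale. The sketch below uses Python's sqlite3 module as a stand-in for Oracle, with invented sample rows and an explicit JOIN in place of the comma join. One caveat: SQLite tolerates bare columns, so it would not raise ORA-00937 for the ungrouped version; the example only shows what the corrected query produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Invented sample data; only the columns used in the question are modeled.
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    CREATE TABLE employees (employee_id INTEGER, department_id INTEGER, salary INTEGER);
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'IT');
    INSERT INTO employees VALUES (1, 10, 1000), (2, 10, 1500), (3, 20, 2000);

    -- Every non-aggregated column in the select list is repeated in GROUP BY.
    CREATE TABLE new_depts AS
    SELECT d.department_id, d.department_name, sum(e.salary) AS dept_sal
    FROM employees e
    JOIN departments d ON e.department_id = d.department_id
    GROUP BY d.department_id, d.department_name;
    """
)

new_depts = conn.execute(
    "SELECT * FROM new_depts ORDER BY department_id"
).fetchall()
print(new_depts)  # [(10, 'Sales', 2500), (20, 'IT', 2000)]
```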
The key to understanding why aggregate functions, or columns specified in the GROUP BY clause, cannot be mixed with other non-aggregate expressions in the select list is the level of detail of the values they produce. The select list of a SELECT statement can include only expressions whose values are all at the same level of detail.
Example 1: incorrect
SELECT avg(col1) --> level of detail of the value is aggregated
,col2 --> level of detail of the value is only for one row
FROM table_a;
Example 2: correct
SELECT avg(col1) --> level of detail of the value is aggregated
,col2 --> level of detail of the value is aggregated
FROM table_a
GROUP BY col2;
By including a column in the GROUP BY clause you aggregate the specified column and change its level of detail from single row to aggregate.
So I know enough SQL just to be really dangerous (I don't normally work the back-end) but cannot get the following view to be created successfully ;) The result set I'm after is a data set that has rows assigned as a column alias from multiple tables (instead of a 1xN flat of all columns). There is a many-to-one relationship when looking at the main table, based on foreign keys associated to the row id of the appropriate related table.
Ideally I'd like a data set that looks like this in the return:
dataset.transaction_row[n]: col1, col2, col3, coln... (columns from the transaction table)
dataset.category_row[n]: col1, col2, col3, coln... (columns from the category table)
and so on...
I get the following error:
Query Error: near "AS": syntax error Unable to execute statement
From:
CREATE VIEW view_unreconciled_transactions
AS SELECT account_transaction.* AS transaction_row,
category.* AS category_row,
memorized.name_rule_replace OR account_transaction.name AS payee
FROM account_transaction
LEFT JOIN memorized ON account_transaction.memorized_key = memorized.id
LEFT JOIN category ON account_transaction.category_key = category.id
WHERE status != 2
ORDER BY account_transaction.dt_posted DESC
It seems easy enough since the result-column selector is repeatable which includes expressions (referencing sqlite's syntax diagrams). In reference to the error, I'm assuming it's complaining about the 2nd 'AS' where I'm trying to get table.* assigned as an alias. Any help in the right direction is appreciated. If I had to, I suppose I could explicitly state all columns but that feels like a kludge.
The AS modifier can only be applied to a single column or expression, not to a collection such as the * you used. You will have to break them out into specific column names (which is best practice IMHO anyway).
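Both halves of that answer can be checked quickly. The sketch below uses Python's sqlite3 module with cut-down versions of the tables: the columns id, name and category_key appear in the question, while category.label, the view name view_transactions and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Reduced, partly hypothetical schema for illustration only.
    CREATE TABLE category (id INTEGER, label TEXT);
    CREATE TABLE account_transaction (id INTEGER, name TEXT, category_key INTEGER);
    INSERT INTO category VALUES (1, 'Groceries');
    INSERT INTO account_transaction VALUES (7, 'Corner shop', 1);
    """
)

# Aliasing table.* is rejected by the parser.
try:
    conn.execute("SELECT category.* AS category_row FROM category")
    star_alias_ok = True
except sqlite3.OperationalError:
    star_alias_ok = False

# Aliasing each column individually works and gives unambiguous names.
conn.execute(
    """
    CREATE VIEW view_transactions AS
    SELECT account_transaction.id   AS transaction_id,
           account_transaction.name AS transaction_name,
           category.id              AS category_id,
           category.label           AS category_label
    FROM account_transaction
    LEFT JOIN category ON account_transaction.category_key = category.id
    """
)
cur = conn.execute("SELECT * FROM view_transactions")
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()
print(star_alias_ok, column_names, rows)
```

A prefix such as transaction_ or category_ on each alias gives roughly the per-table grouping the question was after, while keeping the result a flat row.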
It looks like you want to make a "pivot table". They can be tricky to make in a database. If you want a result in which each row comes from a different source table, and the columns from each table are IDENTICAL, then you could try using a UNION statement to join the different results together as if they were one dataset.
NOTE that the columns all take their names from the first dataset in a UNION, and the data types all need to match.
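The naming rule for UNION is easy to observe. This is a minimal sketch with Python's sqlite3 module; the two tables, their columns and the extra source column that tags each row's origin are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical tables with the same shape: an id and a text column.
    CREATE TABLE transactions (id INTEGER, name TEXT);
    CREATE TABLE categories (id INTEGER, title TEXT);
    INSERT INTO transactions VALUES (1, 'Corner shop');
    INSERT INTO categories VALUES (2, 'Groceries');
    """
)

cur = conn.execute(
    """
    SELECT 'transaction' AS source, id AS row_id, name AS label FROM transactions
    UNION ALL
    SELECT 'category', id, title FROM categories
    ORDER BY row_id
    """
)
# Result column names come from the first SELECT of the compound query.
union_columns = [d[0] for d in cur.description]
union_rows = cur.fetchall()
print(union_columns)  # ['source', 'row_id', 'label']
print(union_rows)     # [('transaction', 1, 'Corner shop'), ('category', 2, 'Groceries')]
```

SQLite itself does not enforce matching column types across the branches, but other databases do, so keeping them aligned is the safer habit.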