Query returning incorrect count

I have the following query.
SELECT a.link_field1 AS journo,
       count(a.link_id) AS articles,
       AVG(b.vote_value) AS score
FROM dan_links a
LEFT JOIN dan_votes b ON link_id = vote_link_id
WHERE link_field1 <> ''
  AND link_status NOT IN ('discard', 'spam', 'page')
GROUP BY link_field1
ORDER BY link_field1, link_id
This query is returning a count of 3 for the first item in the list. What should be returned is
Journo | count | score
John S | 2 | 6.00
Joe B | 1 | 4
However for the first one John S, it returns a count of 3.
If I directly query
select * from dan_links where link_field1 = 'John S'
I get 2 records returned, as I would expect. I can't for the life of me figure out why the count is wrong, unless for some reason it is counting the records from the dan_votes table.
How can I get the correct count, or is my query completely wrong?
EDIT: Contents of the tables
dan_links:
link_id | link_field1 | link | source | link_status
1 | John S | http://test.com | test.com | approved
2 | John S | http://google.com | google | approved
3 | Joe B | http://facebook.com | facebook | approved
dan_votes:
vote_id | link_id | vote_value
1 | 1 | 5
2 | 1 | 8
3 | 2 | 4
4 | 3 | 1
EDIT: it looks like it is counting the rows in the votes table for some reason

A left outer join on link_id = vote_link_id produces one row per matching vote (and one row for each link with no votes), so the joined result looks something like this:
link_id | link_field1 | link | source | link_status|vote_id|vote_value
1 | John S | http://test.com | test.com | approved|1|5
1 | John S | http://test.com | test.com | approved|2|8
2 | John S | http://google.com | google | approved|3|4
3 | Joe B | http://facebook.com | facebook | approved|4|1
When you then group by link_field1, John S has three joined rows, so count(a.link_id) returns 3.
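To see the row multiplication concretely, here is a minimal sketch using Python's built-in sqlite3 module and the sample data from the question. It assumes the votes column is named vote_link_id, as in the query (the table listing calls it link_id). COUNT(DISTINCT a.link_id) is shown only as one possible way to count each article once; it is not part of the original answer.

```python
import sqlite3

# In-memory copy of the question's sample data.
# Assumption: the votes column is called vote_link_id, as in the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dan_links (link_id INTEGER PRIMARY KEY, link_field1 TEXT, link_status TEXT);
CREATE TABLE dan_votes (vote_id INTEGER PRIMARY KEY, vote_link_id INTEGER, vote_value INTEGER);
INSERT INTO dan_links VALUES
    (1, 'John S', 'approved'), (2, 'John S', 'approved'), (3, 'Joe B', 'approved');
INSERT INTO dan_votes VALUES (1, 1, 5), (2, 1, 8), (3, 2, 4), (4, 3, 1);
""")

rows = conn.execute("""
SELECT a.link_field1 AS journo,
       COUNT(a.link_id) AS joined_rows,          -- one per joined row
       COUNT(DISTINCT a.link_id) AS articles     -- one per article
FROM dan_links a
LEFT JOIN dan_votes b ON a.link_id = b.vote_link_id
WHERE a.link_field1 <> ''
  AND a.link_status NOT IN ('discard', 'spam', 'page')
GROUP BY a.link_field1
ORDER BY a.link_field1
""").fetchall()

for journo, joined_rows, articles in rows:
    print(journo, joined_rows, articles)
```

For John S the plain count follows the three joined rows, while the distinct count stays at the two articles.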
A nested query might work: aggregate per link first, then per journalist.
SELECT journo, COUNT(linkid) AS articles, AVG(score) AS score
FROM (
    SELECT a.link_field1 AS journo, AVG(b.vote_value) AS score, a.link_id AS linkid
    FROM dan_links a
    LEFT JOIN dan_votes b ON a.link_id = b.vote_link_id
    WHERE a.link_field1 <> ''
      AND a.link_status NOT IN ('discard', 'spam', 'page')
    GROUP BY a.link_id, a.link_field1
) AS per_link
GROUP BY journo
ORDER BY journo
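However, the query above averages the per-link averages, which is not the same as the overall average: ((n1+n2)/2+n3)/2 != (n1+n2+n3)/3. To get the overall average, sum the votes and divide by the number of votes instead: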
SELECT journo, COUNT(linkid) AS articles, SUM(vote_sum) / SUM(vote_count) AS score
FROM (
    SELECT a.link_field1 AS journo, a.link_id AS linkid,
           SUM(b.vote_value) AS vote_sum,
           COUNT(b.vote_value) AS vote_count  -- counts votes only, 0 for links without votes
    FROM dan_links a
    LEFT JOIN dan_votes b ON a.link_id = b.vote_link_id
    WHERE a.link_field1 <> ''
      AND a.link_status NOT IN ('discard', 'spam', 'page')
    GROUP BY a.link_id, a.link_field1
) AS per_link
GROUP BY journo
ORDER BY journo
Hope this helps.
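The difference between the two averages can be checked on the question's sample data. The following is a rough sketch with Python's sqlite3 module, again assuming the votes column is named vote_link_id. The aliases per_link, vote_count, avg_of_avgs and overall_avg are illustrative names, and the 1.0 factor is there because SQLite divides integers as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dan_links (link_id INTEGER PRIMARY KEY, link_field1 TEXT, link_status TEXT);
CREATE TABLE dan_votes (vote_id INTEGER PRIMARY KEY, vote_link_id INTEGER, vote_value INTEGER);
INSERT INTO dan_links VALUES
    (1, 'John S', 'approved'), (2, 'John S', 'approved'), (3, 'Joe B', 'approved');
INSERT INTO dan_votes VALUES (1, 1, 5), (2, 1, 8), (3, 2, 4), (4, 3, 1);
""")

rows = conn.execute("""
SELECT journo,
       COUNT(linkid) AS articles,
       AVG(link_avg) AS avg_of_avgs,                        -- average of per-link averages
       1.0 * SUM(vote_sum) / SUM(vote_count) AS overall_avg -- all votes weighted equally
FROM (
    SELECT a.link_field1 AS journo, a.link_id AS linkid,
           SUM(b.vote_value) AS vote_sum,
           COUNT(b.vote_value) AS vote_count,
           AVG(b.vote_value) AS link_avg
    FROM dan_links a
    LEFT JOIN dan_votes b ON a.link_id = b.vote_link_id
    WHERE a.link_field1 <> ''
      AND a.link_status NOT IN ('discard', 'spam', 'page')
    GROUP BY a.link_id, a.link_field1
) AS per_link
GROUP BY journo
ORDER BY journo
""").fetchall()

for row in rows:
    print(row)
```

For John S (votes 5 and 8 on one link, 4 on the other) the average of averages is (6.5 + 4) / 2 = 5.25, while the overall average is 17 / 3, roughly 5.67. The article count is 2 in both cases.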


Sqlite / populate new column that ranks the existing rows

I've a SQLite database table with the following columns:
| day | place | visitors |
| 2021-05-01 | AAA | 20 |
| 2021-05-01 | BBB | 10 |
| 2021-05-01 | CCC | 3 |
| 2021-05-02 | AAA | 5 |
| 2021-05-02 | BBB | 7 |
| 2021-05-02 | CCC | 2 |
Now I would like to introduce a column 'rank' which indicates the rank according to the visitors each day. Expected table would look like:
| day | place | visitors | Rank |
| 2021-05-01 | AAA | 20 | 1 |
| 2021-05-01 | BBB | 10 | 2 |
| 2021-05-01 | CCC | 3 | 3 |
| 2021-05-02 | AAA | 5 | 2 |
| 2021-05-02 | BBB | 7 | 1 |
| 2021-05-02 | CCC | 2 | 3 |
Populating the new Rank column can be done with a program like this (pseudocode):
for each i_day in all_days:
    SELECT ROW_NUMBER() OVER (ORDER BY `visitors` DESC) AS Day_Rank, place
    FROM mytable
    WHERE `day` = 'i_day'
    for each i_place in all_places:
        UPDATE mytable
        SET rank = Day_Rank
        WHERE `day` = 'i_day'
        AND place = 'i_place'
Since this line-by-line update is quite inefficient, I'm looking for a way to optimize it with a SQL subquery in combination with the UPDATE.
(This does not work so far...)
for each i_day in all_days:
    UPDATE mytable
    SET rank = (
        SELECT ROW_NUMBER() OVER (ORDER BY `visitors` DESC) AS Day_Rank
        FROM mytable
        WHERE `day` = 'i_day'
    )
Typically, this can be done with a subquery that counts the number of rows with visitors greater than the value of visitors of the current row:
UPDATE mytable
SET Day_Rank = (
  SELECT COUNT(*) + 1
  FROM mytable m
  WHERE m.day = mytable.day AND m.visitors > mytable.visitors
);
Note that the result is actually what RANK() would return, if there are ties in the values of visitors.
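The tie behaviour can be checked with a small sketch in Python's sqlite3 module. It loads the question's rows plus a hypothetical third day (2021-05-03, not in the original data) where two places have the same number of visitors, and assumes a Day_Rank column has already been added to the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (day TEXT, place TEXT, visitors INTEGER, Day_Rank INTEGER);
INSERT INTO mytable (day, place, visitors) VALUES
    ('2021-05-01', 'AAA', 20), ('2021-05-01', 'BBB', 10), ('2021-05-01', 'CCC', 3),
    ('2021-05-02', 'AAA', 5),  ('2021-05-02', 'BBB', 7),  ('2021-05-02', 'CCC', 2),
    -- hypothetical extra day with a tie, added for illustration only
    ('2021-05-03', 'AAA', 9),  ('2021-05-03', 'BBB', 9),  ('2021-05-03', 'CCC', 4);
""")

# Rank = 1 + number of rows on the same day with strictly more visitors.
conn.execute("""
UPDATE mytable
SET Day_Rank = (
    SELECT COUNT(*) + 1
    FROM mytable m
    WHERE m.day = mytable.day AND m.visitors > mytable.visitors
)
""")

ranks = conn.execute(
    "SELECT day, place, Day_Rank FROM mytable ORDER BY day, place"
).fetchall()
for row in ranks:
    print(row)
```

On the tied day both AAA and BBB get rank 1 and CCC gets rank 3, which is the gap-leaving behaviour of RANK() rather than ROW_NUMBER().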
Or, you could calculate the rankings with ROW_NUMBER() in a CTE and use it in a subquery:
WITH cte AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY day ORDER BY visitors DESC) rn
  FROM mytable
)
UPDATE mytable
SET Day_Rank = (SELECT rn FROM cte c WHERE (c.day, c.place) = (mytable.day, mytable.place));
Or, if your version of SQLite is 3.33.0+, you can use the join-like UPDATE...FROM... syntax:
UPDATE mytable AS m
SET Day_Rank = t.rn
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY day ORDER BY visitors DESC) rn
  FROM mytable
) t
WHERE (t.day, t.place) = (m.day, m.place);
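Because the UPDATE...FROM form depends on the SQLite version, a caller may want to pick the statement at runtime. The following is only a sketch under that assumption, using Python's sqlite3 module: it checks sqlite3.sqlite_version_info and falls back to the CTE form on older libraries. Window functions themselves still need SQLite 3.25 or later, and a Day_Rank column is assumed to exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (day TEXT, place TEXT, visitors INTEGER, Day_Rank INTEGER);
INSERT INTO mytable (day, place, visitors) VALUES
    ('2021-05-01', 'AAA', 20), ('2021-05-01', 'BBB', 10), ('2021-05-01', 'CCC', 3),
    ('2021-05-02', 'AAA', 5),  ('2021-05-02', 'BBB', 7),  ('2021-05-02', 'CCC', 2);
""")

UPDATE_FROM = """
UPDATE mytable AS m
SET Day_Rank = t.rn
FROM (
    SELECT day, place,
           ROW_NUMBER() OVER (PARTITION BY day ORDER BY visitors DESC) AS rn
    FROM mytable
) AS t
WHERE (t.day, t.place) = (m.day, m.place)
"""

CTE_SUBQUERY = """
WITH cte AS (
    SELECT day, place,
           ROW_NUMBER() OVER (PARTITION BY day ORDER BY visitors DESC) AS rn
    FROM mytable
)
UPDATE mytable
SET Day_Rank = (SELECT rn FROM cte c
                WHERE (c.day, c.place) = (mytable.day, mytable.place))
"""

# UPDATE ... FROM was added in SQLite 3.33.0; otherwise use the CTE form.
if sqlite3.sqlite_version_info >= (3, 33, 0):
    conn.execute(UPDATE_FROM)
else:
    conn.execute(CTE_SUBQUERY)

ranks = conn.execute(
    "SELECT day, place, Day_Rank FROM mytable ORDER BY day, place"
).fetchall()
for row in ranks:
    print(row)
```

Both statements should give the same ranks on this data, since there are no ties within a day.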

SQLite : DELETE on multiple WHERE criteria from query

I have a SQLite database where I need to delete records from a many-to-many table, based on query results where 2 criteria for each row must be met.
As an example, take 2 tables :
oldEvents
user_id | event_id
---------+----------
1 | aaa
2 | aaa
2 | bbb
3 | bbb
1 | ccc
3 | ccc
<select_query>
qry_user_id | qry_event_id
-------------+-------------
2 | aaa
3 | bbb
1 | ccc
From table oldEvents, I want to delete each row that appears in the query, so as to end up with:
user_id | event_id
1 | aaa
2 | bbb
3 | ccc
So far, I have used a cumbersome DELETE query that concatenates the qry_user_id and qry_event_id, and uses them in an EXISTS sub-clause:
SELECT user_id||event_id AS deleteCombo
FROM oldEvents
WHERE deleteCombo IN
SELECT qry_user_id||qry_event_id
It works, but is hardly readable, and wouldn't scale once more variables enter the scene.
I can't repeat the select_query inline in an AND-clause, because it is itself a rather complicated query (triple-JOIN).
I could write the query data to a temporary table, but would rather not do that.
Does anyone have a suggestion on how to write a DELETE that accepts multiple WHERE criteria from a query?
Enclose your query inside a CTE and compare both columns at once with a row value:
WITH cte(user_id, event_id) AS (
  <your query here>
)
DELETE FROM oldEvents
WHERE (user_id, event_id) IN (SELECT user_id, event_id FROM cte);
After the DELETE, the table contains the expected rows:
SELECT * FROM oldEvents;
| user_id | event_id |
| ------- | -------- |
| 1 | aaa |
| 2 | bbb |
| 3 | ccc |
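For a quick check of the row-value approach, here is a minimal sketch using Python's sqlite3 module. Since the real select_query is not shown, a VALUES list stands in for it; that stand-in is an assumption made purely for illustration. Row values need SQLite 3.15 or later. Unlike the concatenation trick, comparing the columns as a pair cannot confuse, say, user 1 with event '1aaa' and user 11 with event 'aaa'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE oldEvents (user_id INTEGER, event_id TEXT);
INSERT INTO oldEvents VALUES
    (1, 'aaa'), (2, 'aaa'), (2, 'bbb'), (3, 'bbb'), (1, 'ccc'), (3, 'ccc');
""")

# The VALUES list is a stand-in for the question's <select_query>.
conn.execute("""
WITH cte(user_id, event_id) AS (
    VALUES (2, 'aaa'), (3, 'bbb'), (1, 'ccc')
)
DELETE FROM oldEvents
WHERE (user_id, event_id) IN (SELECT user_id, event_id FROM cte)
""")

remaining = conn.execute(
    "SELECT user_id, event_id FROM oldEvents ORDER BY user_id"
).fetchall()
for row in remaining:
    print(row)
```

Adding a third criterion would only mean adding a third column to the CTE and to both sides of the IN comparison.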

SQLite - Update a column based on values from two other tables' columns

I am trying to update Data1's ID to Record2's ID when:
Record1's and Record2's Name are the same, and
Weight is greater in Record2.
Record1:
| ID | Weight | Name |
| 1 | 10 | a |
| 2 | 10 | b |
| 3 | 10 | c |
Record2:
| ID | Weight | Name |
| 4 | 20 | a |
| 5 | 20 | b |
| 6 | 20 | c |
Data1:
| ID | Weight |
| 4 | 40 |
| 5 | 40 |
I have tried the following SQLite query:
update data1
set id =
(select record2.id
from record2,record1
where record1.name=record2.name
and record1.weight<record2.weight)
where id in
(select record1.id
from record1, record2
where record1.name=record2.name
and record1.weight<record2.weight)
Using the above query Data1's id is updated to 4 for all records.
NOTE: Record1's ID is the foreign key for Data1.
For the given data set the following seems to do the job. The subquery in SET has to be correlated with the row being updated (data1.id = record1.id); otherwise it returns the same value for every row:
update data1
set id =
(select record2.id
from record2,record1
where data1.id = record1.id
and record1.name=record2.name
and record1.weight<record2.weight)
where id in
(select record1.id
from record1, record2
where record1.id in (select id from data1)
and record1.name=record2.name
and record1.weight<record2.weight)
See it in action: SQL Fiddle.
Please comment if this needs adjustment or further detail.
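The effect of correlating the subquery can be reproduced with a small sketch in Python's sqlite3 module. It assumes Data1 initially holds IDs 1 and 2 with weight 40, which is a guess at the pre-update state based on the foreign-key note. The helper make_db is a hypothetical name used only here.

```python
import sqlite3

def make_db():
    """Build a fresh in-memory database with the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE record1 (id INTEGER, weight INTEGER, name TEXT);
    CREATE TABLE record2 (id INTEGER, weight INTEGER, name TEXT);
    CREATE TABLE data1 (id INTEGER, weight INTEGER);
    INSERT INTO record1 VALUES (1, 10, 'a'), (2, 10, 'b'), (3, 10, 'c');
    INSERT INTO record2 VALUES (4, 20, 'a'), (5, 20, 'b'), (6, 20, 'c');
    -- assumed starting contents of data1
    INSERT INTO data1 VALUES (1, 40), (2, 40);
    """)
    return conn

MATCH = """
    from record1, record2
    where record1.name = record2.name
      and record1.weight < record2.weight
"""

# Question's version: the SET subquery does not refer to the updated row,
# so every updated row receives the same id.
conn = make_db()
conn.execute(f"""
update data1
set id = (select record2.id {MATCH})
where id in (select record1.id {MATCH})
""")
broken = conn.execute("select id, weight from data1 order by rowid").fetchall()

# Answer's version: the SET subquery is tied to the current data1 row.
conn = make_db()
conn.execute(f"""
update data1
set id = (select record2.id {MATCH} and data1.id = record1.id)
where id in (select record1.id {MATCH}
             and record1.id in (select id from data1))
""")
fixed = conn.execute("select id, weight from data1 order by rowid").fetchall()

print("uncorrelated:", broken)
print("correlated:  ", fixed)
```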

How to join these two tables properly?

I have these two tables:
customers
id | name | address
1 | company 1 | some address information
2 | company 2 | another address
3 | yet another company | no address here
orders
id | customer_id | date
1 | 2 | 20151209
2 | 2 | 20151211
3 | 3 | 20151210
4 | 1 | 20151223
Now I want to get a resulting table with each customer on the left and the number of orders within an arbitrary period of time on the right.
For example, given this period to be 20151207 <= date <= 20151211, the resulting table should look like this:
name | orders count
company 1 | 0
company 2 | 2
yet another company | 1
Note: date = 20151207 means the 7th of December 2015.
How to join them?
SELECT c.name,
       COUNT(CASE WHEN (o.date BETWEEN 20151207 AND 20151211) OR (o.date ISNULL)
                  THEN o.customer_id END) AS "Total Sales"
FROM customers AS c
LEFT JOIN orders o ON c.id == o.customer_id
GROUP BY c.name
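As a rough check against the sample data, the sketch below uses Python's sqlite3 module. It shows a variant rather than the query above: the date filter is moved into the ON clause so that customers without orders in the period still appear with a count of 0. The function name order_counts and its parameters are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, date INTEGER);
INSERT INTO customers VALUES
    (1, 'company 1', 'some address information'),
    (2, 'company 2', 'another address'),
    (3, 'yet another company', 'no address here');
INSERT INTO orders VALUES
    (1, 2, 20151209), (2, 2, 20151211), (3, 3, 20151210), (4, 1, 20151223);
""")

def order_counts(conn, start, end):
    """Orders per customer with start <= date <= end, keeping zero counts."""
    return conn.execute("""
        SELECT c.name, COUNT(o.id) AS orders_count
        FROM customers AS c
        LEFT JOIN orders AS o
               ON o.customer_id = c.id
              AND o.date BETWEEN ? AND ?   -- filter here, not in WHERE
        GROUP BY c.id, c.name
        ORDER BY c.id
    """, (start, end)).fetchall()

counts = order_counts(conn, 20151207, 20151211)
for name, n in counts:
    print(name, n)
```

Putting the date condition in WHERE instead would drop company 1 entirely, because its only order falls outside the period.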

distinct sum does not distinct values

I have 2 tables, Reservations and Articles:
Reservations
Id | Name | City |
1 | Mike | Stockholm
2 | Daniel | Gothenburg
2 | Daniel | Gothenburg
3 | Andre | Gothenburg (Majorna)
Articles
ArticleId | Name | Amount | ReservationId |
10 | Coconuts | 1 | 1
10 | Coconuts | 4 | 2
11 | Apples | 2 | 2
12 | Oranges | 2 | 3
I want to select Articles Name and the sum of Articles.Amount per Articles.ArticleId and Reservations.City.
My code:
SELECT distinct r.ID,a.Name as ArticleName,
sum(a.Amount) as ArticlesAmount,
substr(r.City,1,3) as ToCityName
FROM Reservations r
INNER JOIN Articles a
on r.Id = a.ReservationId
WHERE a.Name <> ''
GROUP BY ToCityName,a.ArticleId,a.Name
This gives me following result:
Id | ArticleName | ArticlesAmount | ToCityName
2 | Coconuts | 8 | Got
2 | Apples | 4 | Got
3 | Oranges | 2 | Got
1 | Coconuts | 1 | Sto
But I want:
Id | ArticleName | ArticlesAmount | ToCityName
2 | Coconuts | 4 | Got
2 | Apples | 2 | Got
3 | Oranges | 2 | Got
1 | Coconuts | 1 | Sto
Help would be appreciated, and an explanation please :)
Have a look at SQLFiddle
SELECT distinct r.ID,a.Name as ArticleName,
sum(distinct a.Amount) as ArticlesAmount,
substr(r.City,1,3) as ToCityName
FROM Reservations r
INNER JOIN Articles a
on r.Id = a.ReservationId
WHERE a.Name <> ''
GROUP BY ToCityName,a.ArticleId,a.Name
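The duplicated row for reservation 2 makes the join return each of its articles twice, so SUM adds every amount twice. SUM(DISTINCT a.Amount) adds each distinct amount only once per group. Note that it would also collapse two genuinely different rows that happen to have the same amount within one group.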
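The doubling can be reproduced with a short sketch in Python's sqlite3 module. As an alternative to SUM(DISTINCT ...), it removes the duplicate reservation rows in a subquery before joining; this is a different technique from the answer above, shown only for comparison. The Id column is left out of the select list to keep the grouping explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Reservations (Id INTEGER, Name TEXT, City TEXT);
CREATE TABLE Articles (ArticleId INTEGER, Name TEXT, Amount INTEGER, ReservationId INTEGER);
INSERT INTO Reservations VALUES
    (1, 'Mike', 'Stockholm'), (2, 'Daniel', 'Gothenburg'),
    (2, 'Daniel', 'Gothenburg'), (3, 'Andre', 'Gothenburg (Majorna)');
INSERT INTO Articles VALUES
    (10, 'Coconuts', 1, 1), (10, 'Coconuts', 4, 2),
    (11, 'Apples', 2, 2), (12, 'Oranges', 2, 3);
""")

QUERY = """
SELECT a.Name AS ArticleName,
       SUM(a.Amount) AS ArticlesAmount,
       substr(r.City, 1, 3) AS ToCityName
FROM {reservations} AS r
INNER JOIN Articles a ON r.Id = a.ReservationId
WHERE a.Name <> ''
GROUP BY ToCityName, a.ArticleId, a.Name
ORDER BY ToCityName, a.ArticleId
"""

# Joining the raw table counts reservation 2 twice.
doubled = conn.execute(QUERY.format(reservations="Reservations")).fetchall()

# Removing duplicate reservation rows first gives one joined row per article.
deduped = conn.execute(QUERY.format(
    reservations="(SELECT DISTINCT Id, Name, City FROM Reservations)"
)).fetchall()

print("raw join:    ", doubled)
print("deduplicated:", deduped)
```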
I joined Articles a second time to select the requested rows. Here is the query:
a.`Name` AS ArticleName,
substr(r.City, 1, 3) AS ToCityName
Reservations r
INNER JOIN Articles a ON r.Id = a.ReservationId
INNER JOIN Articles ON a.ReservationId = Articles.ReservationId
AND a.ArticleId = Articles.ArticleId
a. NAME <> ''
ToCityName ASC
