Selecting all max values of column for each distinct value of other column - sqlite

I am trying to get a list of most used tags for posts on a website on a given day. I currently have this query:
SELECT posts.pdate, tags.tag, count(posts.pid) as post_count
FROM posts, tags
WHERE posts.pid = tags.pid
GROUP BY posts.pdate, tags.tag
ORDER BY posts.pdate;
This returns each distinct tag, the date it was used on, and the number of posts that used it:
2020-09-10|CMPUT291|1
2020-09-10|computing|1
2020-09-10|database|2
2020-09-10|frequentTag1|2
2020-09-10|relational|2
2020-09-10|sql|1
2020-09-10|tieTag1|2
2020-09-11|Database|1
2020-09-11|data|1
2020-09-11|relational|1
2020-09-11|sql|1
2020-09-13|Database|1
2020-09-13|Sql language|1
2020-09-13|access|1
2020-09-13|frequentTag3|2
2020-09-13|query|3
2020-09-13|relational|3
2020-09-13|sql|1
2020-09-17|Database|1
2020-09-17|frequentTag3|3
2020-09-17|query|1
2020-09-17|relational|1
2020-09-17|sql|1
2020-09-17|sql language|1
2020-09-20|RELATIONAL|1
2020-09-20|database|1
2020-09-20|query|1
2020-09-20|sql language|1
2020-09-25|database|1
2020-09-25|sql language|1
2020-09-30|boring|2
2020-09-30|extra tag|1
2020-09-30|fun|3
2020-09-30|just here|1
2020-09-30|more tag|1
2020-09-30|sleep|3
2020-09-30|tag tag|1
2020-09-30|tag test|1
2020-09-30|test tag|1
But I now need it to return only the rows with the maximum count for each date (or all of the tied rows in case of a tie).
I would like to use MAX(count(posts.pid)), but I know that nesting aggregates like that doesn't work, so I need an alternative.
The final result should be this:
2020-09-10|database|2
2020-09-10|frequentTag1|2
2020-09-10|relational|2
2020-09-10|tieTag1|2
2020-09-11|Database|1
2020-09-11|data|1
2020-09-11|relational|1
2020-09-11|sql|1
2020-09-13|query|3
2020-09-13|relational|3
2020-09-17|frequentTag3|3
2020-09-20|RELATIONAL|1
2020-09-20|database|1
2020-09-20|query|1
2020-09-20|sql language|1
2020-09-25|database|1
2020-09-25|sql language|1
2020-09-30|fun|3
2020-09-30|sleep|3
Any help would be greatly appreciated.
APPLICABLE SCHEMA:
create table posts (
pid char(4),
pdate date,
title text,
body text,
poster char(4),
primary key (pid),
foreign key (poster) references users
);
create table tags (
pid char(4),
tag text,
primary key (pid,tag),
foreign key (pid) references posts
);

You can use the RANK() window function (window functions require SQLite 3.25.0 or later):
SELECT pdate, tag, post_count
FROM (
SELECT p.pdate,
t.tag,
COUNT(*) post_count,
RANK() OVER (PARTITION BY p.pdate ORDER BY COUNT(*) DESC) rnk
FROM posts p INNER JOIN tags t
ON p.pid = t.pid
GROUP BY p.pdate, t.tag
)
WHERE rnk = 1
ORDER BY pdate, tag;
You should use a proper JOIN with an ON clause instead of the outdated comma-separated table list with the join condition in the WHERE clause.
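If your SQLite build predates window functions, the same result can be obtained by comparing each count against the per-date maximum. The following is only a sketch using Python's sqlite3 module; the sample rows are made up for illustration and are not the asker's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (pid CHAR(4) PRIMARY KEY, pdate DATE);
    CREATE TABLE tags  (pid CHAR(4), tag TEXT, PRIMARY KEY (pid, tag));
    -- Hypothetical sample data
    INSERT INTO posts VALUES ('p1', '2020-09-10'), ('p2', '2020-09-10'), ('p3', '2020-09-11');
    INSERT INTO tags VALUES ('p1', 'sql'), ('p2', 'sql'), ('p1', 'database'),
                            ('p2', 'database'), ('p1', 'computing'), ('p3', 'data');
""")

query = """
WITH counts AS (              -- one row per (date, tag) with its post count
    SELECT p.pdate, t.tag, COUNT(*) AS post_count
    FROM posts p INNER JOIN tags t ON p.pid = t.pid
    GROUP BY p.pdate, t.tag
)
SELECT c.pdate, c.tag, c.post_count
FROM counts c
WHERE c.post_count = (        -- keep only rows that reach the maximum for their date
    SELECT MAX(c2.post_count) FROM counts c2 WHERE c2.pdate = c.pdate
)
ORDER BY c.pdate, c.tag;
"""
top_tags = conn.execute(query).fetchall()
print(top_tags)
```

Ties are kept automatically, because every row whose count equals the maximum for its date passes the filter.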

Related

How to do arithmetic operations with an alias in sqlite

I want to calculate with an alias in sqlite (the example is modified from http://www.sqlitetutorial.net):
If I do it like this, I get the error message "no such column: tracks_count"
SELECT albumid,
title,
(
SELECT count(trackid)
FROM tracks
WHERE tracks.AlbumId = albums.AlbumId
)
tracks_count, tracks_count * album_nr
FROM albums
ORDER BY tracks_count DESC;
If I do it like this, I get zero for the multiplication
SELECT albumid,
title,
(
SELECT count(trackid)
FROM tracks
WHERE tracks.AlbumId = albums.AlbumId
)
tracks_count, "tracks_count" * album_nr
FROM albums
ORDER BY tracks_count DESC;
Table data for the example:
table albums
table tracks
You don't even need a subquery here:
SELECT
a.albumid,
a.title,
COUNT(t.albumid) AS tracks_count,
COUNT(t.albumid) * a.album_nr AS other_count
FROM albums a
LEFT JOIN tracks t
ON a.albumid = t.albumid
GROUP BY
a.albumid,
a.title;
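If you want to make your current approach work: the problem is that you refer to the tracks_count alias in the same SELECT list in which it is defined. This isn't allowed, because the alias may not have been computed yet. The second attempt returns zero because SQLite, when it cannot resolve the double-quoted "tracks_count" as a column, falls back to treating it as a string literal, and that string converts to 0 in arithmetic. Still, I would recommend using the query I gave above.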
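For completeness, one way to keep the correlated subquery and still reuse the alias is to define it in an inner query and reference it from an outer one. This is a sketch run through Python's sqlite3 module; the table contents are invented, and only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE albums (albumid INTEGER PRIMARY KEY, title TEXT, album_nr INTEGER);
    CREATE TABLE tracks (trackid INTEGER PRIMARY KEY, albumid INTEGER);
    -- Hypothetical sample data
    INSERT INTO albums VALUES (1, 'First', 2), (2, 'Second', 3), (3, 'Empty', 5);
    INSERT INTO tracks VALUES (10, 1), (11, 1), (12, 1), (13, 2);
""")

query = """
SELECT albumid, title, tracks_count,
       tracks_count * album_nr AS other_count   -- alias is visible here
FROM (
    -- The alias is defined one level down, so the outer SELECT can use it.
    SELECT a.albumid, a.title, a.album_nr,
           (SELECT COUNT(t.trackid) FROM tracks t
            WHERE t.albumid = a.albumid) AS tracks_count
    FROM albums a
)
ORDER BY tracks_count DESC;
"""
rows = conn.execute(query).fetchall()
print(rows)
```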

SQL Select Query Asp.Net

I have a product page on a webpage that shows categories of products. This is done with a listview populated from a database. The issue that I have is that the main supplier has demanded that their products are first in the category list. So what I need to do is run a query that will return the results, display those two categories first and then display the rest alphabetically.
So I've been trying to do this using a UNION ALL query like this:
SELECT cat, cat_id, image FROM prod_categories WHERE cat_id = 19 OR cat_id = 65
UNION ALL
SELECT cat, cat_id, image FROM prod_categories WHERE cat_id <> 19 AND cat_id <> 65
I thought with a union like this it would display the results of the first select query first, but it's not doing that.
I can add an 'order by cat' clause on the end, but obviously that only displays them in the correct order if the two categories I want to display come first alphabetically, which they don't.
If anyone has any ideas how to do this it would be greatly appreciated.
Thanks
How about this:
SELECT cat, cat_id, image FROM prod_categories
order by case when cat_id in (19, 65) then 1 else 2 end, cat
This cuts out the need for a UNION altogether, and it might even produce a more efficient execution plan.
(This uses Transact-SQL for SQL Server; the exact syntax may need adjusting for MySQL etc.)
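The CASE-based ordering is plain enough SQL that it also runs on SQLite, which makes it easy to try out. Below is a small sketch via Python's sqlite3 module; the category names and image files are hypothetical, only the ids 19 and 65 come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod_categories (cat_id INTEGER PRIMARY KEY, cat TEXT, image TEXT);
    -- Hypothetical sample data
    INSERT INTO prod_categories VALUES
        (19, 'Widgets', 'w.png'), (65, 'Sprockets', 's.png'),
        (3, 'Anvils', 'a.png'), (7, 'Cables', 'c.png');
""")

rows = conn.execute("""
    SELECT cat, cat_id
    FROM prod_categories
    -- Bucket 1 holds the supplier's categories, bucket 2 everything else;
    -- within each bucket the rows are sorted alphabetically by name.
    ORDER BY CASE WHEN cat_id IN (19, 65) THEN 1 ELSE 2 END, cat
""").fetchall()
print(rows)
```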
Try something like this.
SELECT cat, cat_id, image, 1 as [srt]
FROM prod_categories WHERE cat_id = 19 OR cat_id = 65
UNION ALL
SELECT cat, cat_id, image, 2 as [srt]
FROM prod_categories WHERE cat_id <> 19 AND cat_id <> 65
ORDER BY srt ASC, cat
Don't hard-code this into your query. What happens when the next supplier wants to come second? Or last? For that matter, you may want to list categories in some sort of "group", anyways.
Instead, you should be using an ordering table (or multiple). Something simple to get you started:
CREATE TABLE Category_Order (categoryId INTEGER, -- fk to category.id, unique
priority INTEGER) -- when to display category
Then you want to insert the values for the current "special" categories:
INSERT INTO Category_Order (categoryId, priority) VALUES (19, 2147483647), (65, 0)
You'll also need an entry for rows that are not currently prioritized:
INSERT INTO Category_Order (categoryId, priority)
SELECT cat_id, -2147483648
FROM prod_categories
WHERE cat_id NOT IN (19, 65)
Which can then be queried like this:
SELECT cat, cat_id, image
FROM prod_categories
JOIN Category_Order
ON categoryId = cat_id
ORDER BY priority DESC, cat
If you write a small maintenance program for this table, you can then push re-ordering duties off onto the correct business department. Reordering of entries can be accomplished by splitting the difference between existing entries, although you'll want a procedure to re-distribute if things get too crowded.
Note that, in the event your db supports a clause like ORDER BY priority NULLS LAST, the entries for non-prioritized categories are unnecessary, and you can simply LEFT JOIN to the ordering table.
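Where NULLS LAST is not available, sorting on "priority IS NULL" first gives the same effect. Here is a rough sketch of the LEFT JOIN variant on SQLite through Python's sqlite3 module; the category rows are hypothetical, and the ordering table only holds the two prioritized entries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod_categories (cat_id INTEGER PRIMARY KEY, cat TEXT, image TEXT);
    CREATE TABLE Category_Order (categoryId INTEGER UNIQUE, priority INTEGER);
    -- Hypothetical sample data
    INSERT INTO prod_categories VALUES
        (19, 'Widgets', 'w.png'), (65, 'Sprockets', 's.png'),
        (3, 'Anvils', 'a.png'), (7, 'Cables', 'c.png');
    -- Only the prioritized categories get a row in the ordering table.
    INSERT INTO Category_Order VALUES (19, 2147483647), (65, 0);
""")

rows = conn.execute("""
    SELECT cat, cat_id
    FROM prod_categories
    LEFT JOIN Category_Order ON categoryId = cat_id
    -- "priority IS NULL" is 0 for prioritized rows and 1 for the rest,
    -- so unprioritized categories sort last, then alphabetically.
    ORDER BY priority IS NULL, priority DESC, cat
""").fetchall()
print(rows)
```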

sqlite include 0 in count

I've got two tables in a SQLite database, and I'm attempting to calculate the count of the routes by rating. It works, but doesn't return 0 for when there isn't a route with that rating. An example rating would be 8, or 11b, or V3.
The query I'm using right now is:
select routes.rating, count(routes.rating) from routes
left join orderkeys on routes.rating = orderkeys.rating
group by routes.rating
order by orderkeys.key
This doesn't return 0 for the ratings that don't have any routes for them, though. The output I get is:
10d|3
7|3
8|2
9|9
10a|5
10b|4
10c|2
11a|3
12b|1
V0|5
V1|7
V2|5
V3|8
V4|3
V5|1
V6|2
V7|3
V8|2
V9|1
What I expect to get is:
7|3
8|2
9|9
10a|5
10b|4
10c|2
10d|3
11a|3
11b|0
11c|0
11d|0
12a|0
12b|1
12c|0
12d|0
V0|5
V1|7
V2|5
V3|8
V4|3
V5|1
V6|2
V7|3
V8|2
V9|1
Here's the schema:
CREATE TABLE routes (
id integer PRIMARY KEY,
type text, rating text,
rope integer,
name text,
date text,
setter text,
color_1 text,
color_2 text,
special_base text,
blurb text,
updater text
);
CREATE TABLE orderkeys (
rating TEXT,
key INTEGER PRIMARY KEY
);
A left join returns all records from the left table, but what you want is all ratings, i.e., all records from the orderkeys table:
SELECT orderkeys.rating,
COUNT(routes.id)
FROM orderkeys
LEFT JOIN routes USING (rating)
GROUP BY orderkeys.rating
ORDER BY orderkeys.key
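To see the zero counts appear, here is a small sketch of that query run with Python's sqlite3 module. The routes table is cut down to the two columns the query needs, and the rows are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE routes (id INTEGER PRIMARY KEY, rating TEXT);
    CREATE TABLE orderkeys (rating TEXT, key INTEGER PRIMARY KEY);
    -- Hypothetical sample data: no route is rated 11b
    INSERT INTO orderkeys VALUES ('9', 1), ('10a', 2), ('11b', 3), ('V0', 4);
    INSERT INTO routes (rating) VALUES ('9'), ('9'), ('10a'), ('V0'), ('V0'), ('V0');
""")

rows = conn.execute("""
    SELECT orderkeys.rating,
           COUNT(routes.id)          -- counts only matched routes; NULLs are skipped
    FROM orderkeys                   -- every rating appears at least once
    LEFT JOIN routes USING (rating)
    GROUP BY orderkeys.rating
    ORDER BY orderkeys.key
""").fetchall()
print(rows)
```

Counting routes.id rather than * matters here: COUNT(*) would count the NULL-padded row and report 1 for ratings without routes.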
Try this. I am not very fond of explicit joins, although they are quite useful when there are a lot of tables:
select routes.rating, count(routes.rating) from routes, orderkeys
where routes.rating = orderkeys.rating
group by routes.rating
order by orderkeys.key

"Insert if not exists" statement in SQLite

I have an SQLite database. I am trying to insert values (users_id, lessoninfo_id) into the table bookmarks, only if that pair does not already exist in a row.
INSERT INTO bookmarks(users_id,lessoninfo_id)
VALUES(
(SELECT _id FROM Users WHERE User='"+$('#user_lesson').html()+"'),
(SELECT _id FROM lessoninfo
WHERE Lesson="+lesson_no+" AND cast(starttime AS int)="+Math.floor(result_set.rows.item(markerCount-1).starttime)+")
WHERE NOT EXISTS (
SELECT users_id,lessoninfo_id from bookmarks
WHERE users_id=(SELECT _id FROM Users
WHERE User='"+$('#user_lesson').html()+"') AND lessoninfo_id=(
SELECT _id FROM lessoninfo
WHERE Lesson="+lesson_no+")))
This gives an error saying:
db error near where syntax.
If you never want to have duplicates, you should declare this as a table constraint:
CREATE TABLE bookmarks(
users_id INTEGER,
lessoninfo_id INTEGER,
UNIQUE(users_id, lessoninfo_id)
);
(A primary key over both columns would have the same effect.)
It is then possible to tell the database that you want to silently ignore records that would violate such a constraint:
INSERT OR IGNORE INTO bookmarks(users_id, lessoninfo_id) VALUES(123, 456)
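A quick way to check this behaviour is a sketch like the following, using Python's sqlite3 module. The ids are placeholder values, and the "?" parameters stand in for the values that the question builds by string concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bookmarks (
        users_id INTEGER,
        lessoninfo_id INTEGER,
        UNIQUE (users_id, lessoninfo_id)
    )
""")

# Bound parameters instead of concatenating values into the SQL string.
insert_sql = "INSERT OR IGNORE INTO bookmarks (users_id, lessoninfo_id) VALUES (?, ?)"
for pair in [(123, 456), (123, 456), (123, 789)]:  # the second pair is a duplicate
    conn.execute(insert_sql, pair)

rows = conn.execute(
    "SELECT users_id, lessoninfo_id FROM bookmarks ORDER BY lessoninfo_id"
).fetchall()
print(rows)  # the duplicate pair was silently skipped
```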
If you have a table called memos that has two columns, id and text, you should be able to do it like this:
INSERT INTO memos(id,text)
SELECT 5, 'text to insert'
WHERE NOT EXISTS(SELECT 1 FROM memos WHERE id = 5 AND text = 'text to insert');
If the table already contains a row where text is equal to 'text to insert' and id is equal to 5, then the insert operation will be skipped.
I don't know if this will work for your particular query, but perhaps it gives you a hint on how to proceed.
I would advise that you instead design your table so that no duplicates are allowed, as explained in @CL's answer.
For a unique column, use this:
INSERT OR REPLACE INTO tableName (...) values(...);
For more information, see: sqlite.org/lang_insert
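One caveat worth knowing before choosing REPLACE over IGNORE: on a conflict SQLite deletes the existing row and inserts a new one, so columns that are not listed in the statement fall back to their defaults. A small sketch with Python's sqlite3 module follows; the settings table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table with a unique column and an extra column
    CREATE TABLE settings (name TEXT UNIQUE, value TEXT, note TEXT);
    INSERT INTO settings VALUES ('theme', 'light', 'set by admin');
""")

# The conflicting row is deleted and a fresh one inserted; "note" is not
# listed, so it is reset to its default (NULL) rather than preserved.
conn.execute("INSERT OR REPLACE INTO settings (name, value) VALUES ('theme', 'dark')")

rows = conn.execute("SELECT name, value, note FROM settings").fetchall()
print(rows)
```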
insert into bookmarks (users_id, lessoninfo_id)
select 1, 167
EXCEPT
select users_id, lessoninfo_id
from bookmarks
where users_id=1
and lessoninfo_id=167;
This is the fastest way.
For some other SQL engines, you can use a Dummy table containing 1 record.
e.g:
select 1, 167 from ONE_RECORD_DUMMY_TABLE

Creating a VIEW from multiple tables each with a different number of columns

I want to combine multiple tables into one VIEW.
My understanding is that if the number of columns is different we cannot use UNION.
How do I solve this?
I have the following three TABLES:
1. Table name: Albums
2. Table name: AlbumPictures
3. Table name: Stories
I want to have 3 result sets as follows (I can do this part using INNER JOINs; kindly correct me if I am wrong):
For Stories: StoryID,AlbumID,StoryTitle,AlbumCover,Votes
For Albums: AlbumID,AlbumName,AlbumCover,Votes
For Pictures: AlbumPictureID,Votes
I want to merge all the rows retrieved from the above queries into one VIEW and shuffle them. As the number of columns is different in each of the result sets above, am I able to combine them into one VIEW?
So in your UNION sql, either remove the extra columns from the sql for the table with too many, or add extra columns with constant default values to the sql for the table with fewer columns.
Based on your example output, adding extra constant values might look like this...
Select StoryID id, AlbumID,
StoryTitle name, AlbumCover, Votes
From Stories
UNION
Select AlbumID id, AlbumID,
AlbumName name, AlbumCover, Votes
From Albums
UNION
Select AlbumPictureID id, null AlbumId,
null name, null AlbumCover, Votes
From pictures
Order By id, Votes, name
But this makes me want to ask WHY???
EDIT: To sort, just add an order by using output column names, as shown above....
In order to use a UNION or UNION ALL operator, the number of columns and datatypes of the columns returned by each query have to be the same.
One trick you can use is to return a NULL value for the columns that are "missing" from some of the queries.
For performance, I recommend you use the UNION ALL operator in place of the UNION operator, if removing duplicates is not a requirement.
Whenever I need to do something like this, I usually include a literal in each query, as an identifier of which query the row came from.
e.g.
SELECT 'a' AS source
, a.id AS id
, a.name AS name
FROM table_a a
UNION ALL
SELECT 'b' AS source
, b.id AS id
, NULL AS name
FROM table_b b
ORDER BY 1,2
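The NULL-padding trick with a source marker can be tried out on SQLite as in the sketch below, run through Python's sqlite3 module. The tables table_a and table_b and their rows are placeholders, as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables with a different number of columns
    CREATE TABLE table_a (id INTEGER, name TEXT);
    CREATE TABLE table_b (id INTEGER);
    INSERT INTO table_a VALUES (2, 'beta'), (1, 'alpha');
    INSERT INTO table_b VALUES (5), (3);
""")

rows = conn.execute("""
    SELECT 'a' AS source, a.id AS id, a.name AS name
    FROM table_a a
    UNION ALL
    SELECT 'b' AS source, b.id AS id, NULL AS name   -- pad the missing column
    FROM table_b b
    ORDER BY 1, 2                                    -- by source, then id
""").fetchall()
print(rows)
```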
You can do something like this. All three queries return the same set of columns, padded with nulls where a table has no such column, and the tableName column identifies which query each row came from.
EDIT: I have to say, this is not the right approach. I wanted to show you how to union tables but I think now it is getting ugly when editing it according to your comments.
--Note: Vote is on all three table, I have selected from Stories
select s.storyId, a.albumId, s.storyTitle, null albumName,
ap.albumCover, s.votes , null albumPictureId, 'stories-albums-albumPics' tableName
from Stories s join Albums a on s.albumId = a.albumId
join AlbumPictures ap on a.albumid = ap.albumId
UNION ALL
select null storyId, a.albumID, null storyTitle, a.albumName,
ap.albumCover, a.votes, null albumPictureId, 'albums-albumPics' tableName
from Albums a join AlbumPictures ap on a.albumid = ap.albumId
UNION ALL --use required table here as well
select null storyId, null albumId, null storyTitle, null albumName,
null albumCover, votes, albumPictureId, 'pictures' tableName
from Pictures
I guess something like this makes a little more sense:
Select StoryID+'SID' id, AlbumID,
StoryTitle name, AlbumCover, Votes
From Stories
UNION
Select AlbumID+'AID' id, AlbumID,
AlbumName name, AlbumCover, Votes
From Albums
UNION
Select AlbumPictureID+'APID' id, null AlbumId,
null name, null AlbumCover, Votes
From pictures
Appending 'SID', 'AID' and 'APID' to the ids marks which table each row came from, which will make more sense once you see the data in the UI.
select * from Stories as s
inner join Albums as a on a.AccountID = s.AccountID
inner join Pictures as p on p.AccountID = s.AccountID
This will return all columns, as long as AccountID is defined in all 3 tables.
To get only the columns you need, replace * with the columns you want.
Why on earth would you need the data to be all in the same view? Just return 3 sets of data. If for example you are using a web browser as the front end, you could perform three queries and return them as a single set of JSON, for example:
{
Albums: [
{AlbumID: 1, AlbumName: 'blah', ... },
{AlbumID: 2, AlbumName: 'gorp', ... }
],
AlbumPictures: [
{AlbumID: 1, URL: 'http://fun.jpg'},
{AlbumID: 1, URL: 'http://fun2.jpg'}
],
Stories: [
{StoryID: 3, StoryTitle: 'Jack & Jill', ... },
{ etc. }
]
}
There is absolutely no programming architectural constraint forcing you to put everything together in a single view.
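As an illustration of the "three queries, one JSON payload" idea, here is a minimal sketch in Python with an in-memory SQLite database. The column lists are trimmed to the fields shown in the JSON above, the sample rows are invented, and fetch_all is a hypothetical helper, not part of any library.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.executescript("""
    -- Hypothetical, simplified schema and data
    CREATE TABLE Albums (AlbumID INTEGER PRIMARY KEY, AlbumName TEXT);
    CREATE TABLE AlbumPictures (AlbumPictureID INTEGER PRIMARY KEY, AlbumID INTEGER, URL TEXT);
    CREATE TABLE Stories (StoryID INTEGER PRIMARY KEY, StoryTitle TEXT);
    INSERT INTO Albums VALUES (1, 'blah'), (2, 'gorp');
    INSERT INTO AlbumPictures VALUES (1, 1, 'http://fun.jpg'), (2, 1, 'http://fun2.jpg');
    INSERT INTO Stories VALUES (3, 'Jack & Jill');
""")

def fetch_all(sql):
    """Run one query and return its rows as a list of plain dicts."""
    return [dict(row) for row in conn.execute(sql)]

# Three independent result sets, each keeping its own columns.
payload = {
    "Albums": fetch_all("SELECT AlbumID, AlbumName FROM Albums ORDER BY AlbumID"),
    "AlbumPictures": fetch_all("SELECT AlbumID, URL FROM AlbumPictures ORDER BY AlbumPictureID"),
    "Stories": fetch_all("SELECT StoryID, StoryTitle FROM Stories ORDER BY StoryID"),
}
body = json.dumps(payload, indent=2)
print(body)
```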
