I am trying to get a list of most used tags for posts on a website on a given day. I currently have this query:
SELECT posts.pdate, tags.tag, count(posts.pid) as post_count
FROM posts, tags
WHERE posts.pid = tags.pid
GROUP BY posts.pdate, tags.tag
ORDER BY posts.pdate;
This gives me each distinct tag, the date it was used on, and how many posts used it on that date. It returns:
2020-09-10|CMPUT291|1
2020-09-10|computing|1
2020-09-10|database|2
2020-09-10|frequentTag1|2
2020-09-10|relational|2
2020-09-10|sql|1
2020-09-10|tieTag1|2
2020-09-11|Database|1
2020-09-11|data|1
2020-09-11|relational|1
2020-09-11|sql|1
2020-09-13|Database|1
2020-09-13|Sql language|1
2020-09-13|access|1
2020-09-13|frequentTag3|2
2020-09-13|query|3
2020-09-13|relational|3
2020-09-13|sql|1
2020-09-17|Database|1
2020-09-17|frequentTag3|3
2020-09-17|query|1
2020-09-17|relational|1
2020-09-17|sql|1
2020-09-17|sql language|1
2020-09-20|RELATIONAL|1
2020-09-20|database|1
2020-09-20|query|1
2020-09-20|sql language|1
2020-09-25|database|1
2020-09-25|sql language|1
2020-09-30|boring|2
2020-09-30|extra tag|1
2020-09-30|fun|3
2020-09-30|just here|1
2020-09-30|more tag|1
2020-09-30|sleep|3
2020-09-30|tag tag|1
2020-09-30|tag test|1
2020-09-30|test tag|1
Now I need it to return only the rows with the maximum count for each date (or all of them in case of a tie).
I WANT to be able to use MAX(count(posts.pid)), but I know that doesn't work, so I need to find an alternative.
I should get a final result of this:
2020-09-10|database|2
2020-09-10|frequentTag1|2
2020-09-10|relational|2
2020-09-10|tieTag1|2
2020-09-11|Database|1
2020-09-11|data|1
2020-09-11|relational|1
2020-09-11|sql|1
2020-09-13|query|3
2020-09-13|relational|3
2020-09-17|frequentTag3|3
2020-09-20|RELATIONAL|1
2020-09-20|database|1
2020-09-20|query|1
2020-09-20|sql language|1
2020-09-25|database|1
2020-09-25|sql language|1
2020-09-30|fun|3
2020-09-30|sleep|3
Any help would be greatly appreciated.
APPLICABLE SCHEMA:
create table posts (
pid char(4),
pdate date,
title text,
body text,
poster char(4),
primary key (pid),
foreign key (poster) references users
);
create table tags (
pid char(4),
tag text,
primary key (pid,tag),
foreign key (pid) references posts
);
You can use the RANK() window function:
SELECT pdate, tag, post_count
FROM (
SELECT p.pdate,
t.tag,
COUNT(*) post_count,
RANK() OVER (PARTITION BY p.pdate ORDER BY COUNT(*) DESC) rnk
FROM posts p INNER JOIN tags t
ON p.pid = t.pid
GROUP BY p.pdate, t.tag
)
WHERE rnk = 1
ORDER BY pdate, tag;
You should use a proper JOIN with an ON clause instead of that outdated syntax with the WHERE clause.
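If window functions are not available (SQLite only has them from version 3.25), the same per-date maximum can be sketched with a CTE and a correlated subquery. This is a minimal sketch run through Python's sqlite3 module; the rows are made up for illustration and only the posts/tags columns used by the query are created.

```python
import sqlite3

# Hypothetical miniature of the posts/tags schema with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (pid TEXT PRIMARY KEY, pdate TEXT);
    CREATE TABLE tags  (pid TEXT, tag TEXT, PRIMARY KEY (pid, tag));
    INSERT INTO posts VALUES ('p1', '2020-09-10'), ('p2', '2020-09-10'),
                             ('p3', '2020-09-11');
    INSERT INTO tags VALUES ('p1', 'database'), ('p1', 'sql'),
                            ('p2', 'database'),
                            ('p3', 'sql'), ('p3', 'data');
""")

query = """
    WITH counts AS (
        -- one row per (date, tag) with the number of posts using the tag
        SELECT p.pdate, t.tag, COUNT(*) AS post_count
        FROM posts p INNER JOIN tags t ON p.pid = t.pid
        GROUP BY p.pdate, t.tag
    )
    SELECT c.pdate, c.tag, c.post_count
    FROM counts c
    -- keep only the rows that reach the maximum count of their date (ties included)
    WHERE c.post_count = (SELECT MAX(c2.post_count)
                          FROM counts c2
                          WHERE c2.pdate = c.pdate)
    ORDER BY c.pdate, c.tag;
"""
top_tags = conn.execute(query).fetchall()
for row in top_tags:
    print(row)
```

On 2020-09-10 only 'database' reaches the maximum of 2, while on 2020-09-11 both tags tie at 1 and are both returned.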
The question is probably quite confusing.
In effect, I have the following:
WatchList table
UserId | FilmId
3      | 77
etc.   | etc.
These are foreign keys to the following tables:
FilmDB - Film_title, Film_plot, Film_Id etc.
and
aspnet_memberships - UserId, Username etc..
Now, I presume I will need to use a join, but I am struggling with the syntax.
I would like to use 'Count' on the 'WatchList' and return the most frequent FilmIds and their counterpart information, but I'd then like to return the REST of the FilmDB results, essentially giving me a list of ALL films, with those found in the WatchList most frequently sorted to the top.
Does that make sense? Thanks.
SELECT *
FROM filmdb
LEFT JOIN (
SELECT filmid, count(*) AS cnt
FROM watch_list
GROUP BY filmid) AS a
ON filmdb.film_id = a.filmid
ORDER BY isnull(cnt, 0) DESC;
http://sqlfiddle.com/#!3/46b16/10
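To see the LEFT JOIN approach end to end, here is a small sketch run through Python's sqlite3 module. The film titles and watch-list rows are invented, the table and column names follow the question rather than the fiddle, and COALESCE stands in for SQL Server's ISNULL because SQLite does not have the two-argument form.

```python
import sqlite3

# Made-up data: film 2 is on two watch lists, film 3 on one, film 1 on none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FilmDB (Film_Id INTEGER PRIMARY KEY, Film_title TEXT);
    CREATE TABLE WatchList (UserId INTEGER, FilmId INTEGER);
    INSERT INTO FilmDB VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO WatchList VALUES (10, 2), (11, 2), (10, 3);
""")

query = """
    SELECT f.Film_Id, f.Film_title, COALESCE(w.cnt, 0) AS watch_count
    FROM FilmDB f
    LEFT JOIN (SELECT FilmId, COUNT(*) AS cnt
               FROM WatchList
               GROUP BY FilmId) AS w
           ON f.Film_Id = w.FilmId
    -- films nobody has listed get a count of 0 and sort last
    ORDER BY watch_count DESC, f.Film_Id;
"""
films = conn.execute(query).fetchall()
for row in films:
    print(row)
```

Every film appears exactly once, and the ones missing from WatchList are kept by the LEFT JOIN instead of being dropped.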
You did not specify whether the query should be grouped by film_id or user_id. The example I have provided is grouped by user; if you change that to film_id, you will get the watch count across all users per film.
You need to use a subquery to get the count and then order the results by the count descending to get an ordered list.
SELECT
*
FROM
(
SELECT
WatchList.Film_Id,
WatchCount=COUNT(*),
FilmDB.Film_Title
FROM
WatchList
INNER JOIN FilmDB ON FilmDB.Film_Id=WatchList.Film_Id
GROUP BY
WatchList.UserID,
WatchList.Film_Id,
FilmDB.Film_Title
)AS X
ORDER BY
WatchCount DESC
I have a web service that generates radio station playlists and I'm trying to ensure that playlists never have tracks from the same artist more than n times.
So, for example (unless it is Mandatory Metallica, haha), no artist should ever dominate any 8-hour programming segment.
Today we use a query similar to this which generates smaller randomized playlists out of existing very large playlists:
SELECT FilePath FROM vwPlaylistTracks
WHERE Owner='{0}' COLLATE NOCASE AND
Playlist='{1}' COLLATE NOCASE
ORDER BY RANDOM()
LIMIT {2};
Someone then has to manually review the playlists and do some manual editing if the same artist appears consecutively or more than the desired limit.
Supposing the producer wants to ensure that no artist appears more than twice in the span of the playlist generated in this query (and assuming there is an artist field in the vwPlaylistTracks view; which there is) is GROUP BY the correct way to accomplish this?
I've been messing around with the view trying to accomplish this but this query always only returns 1 track from each artist.
SELECT
a.Name as 'Artist',
f.parentPath || '\' || f.fileName as 'FilePath',
p.name as 'Playlist',
u.username as 'Owner'
FROM mp3_file f,
mp3_track t,
mp3_artist a,
mp3_playlist_track pt,
mp3_playlist p,
mp3_user u
WHERE f.file_id = t.track_id
AND t.artist_id = a.artist_id
AND t.track_id = pt.track_id
AND pt.playlist_id = p.playlist_id
AND p.user_id = u.user_id
--AND p.Name = 'Alternative Rock'
GROUP BY a.Name
--HAVING Count(a.Name) < 3
--ORDER BY RANDOM()
--LIMIT 50;
GROUP BY creates exactly one result record for each distinct value in the grouped column, so this is not what you want.
You have to count any previous records with the same artist, which is not easy because the random ordering is not stable.
However, this is possible with a temporary table, which is ordered by its rowid:
CREATE TEMPORARY TABLE RandomTracks AS
SELECT a.Name as Artist, parentPath, name, username
FROM ...
WHERE ...
ORDER BY RANDOM();
CREATE INDEX RandomTracks_Artist on RandomTracks(Artist);
SELECT *
FROM RandomTracks AS r1
WHERE -- filter out if there are any two previous records with the same artist
(SELECT COUNT(*)
FROM RandomTracks AS r2
WHERE r2.Artist = r1.Artist
AND r2.rowid < r1.rowid
) < 2
AND -- filter out if the directly previous record has the same artist
r1.Artist IS NOT (SELECT Artist
FROM RandomTracks AS r3
WHERE r3.rowid = r1.rowid - 1)
LIMIT 50;
DROP TABLE RandomTracks;
It might be easier and faster to just read the entire playlist and to filter and reorder it in your code.
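A minimal sketch of that in-code alternative, assuming the playlist has already been read into a list of (artist, file_path) tuples. The function name pick_tracks and its parameters are hypothetical, and the greedy pass simply discards any track that would break a rule instead of trying to reorder it.

```python
import random
from collections import Counter


def pick_tracks(tracks, limit, max_per_artist=2, seed=None):
    """Return up to `limit` shuffled (artist, file_path) tuples.

    No artist appears more than `max_per_artist` times, and the same
    artist never appears twice in a row.
    """
    rng = random.Random(seed)
    shuffled = list(tracks)
    rng.shuffle(shuffled)

    picked = []
    per_artist = Counter()
    for artist, path in shuffled:
        if len(picked) == limit:
            break
        if per_artist[artist] >= max_per_artist:
            continue  # artist is already at the cap
        if picked and picked[-1][0] == artist:
            continue  # would play the same artist back to back
        picked.append((artist, path))
        per_artist[artist] += 1
    return picked


# Hypothetical playlist: two artists with many tracks and one with a single track.
playlist = ([("A", f"a{i}.mp3") for i in range(10)]
            + [("B", f"b{i}.mp3") for i in range(10)]
            + [("C", "c0.mp3")])
for artist, path in pick_tracks(playlist, limit=5, seed=1):
    print(artist, path)
```

Because skipped tracks are thrown away, the result can be shorter than the limit when the source playlist is dominated by a few artists.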
I want to combine multiple tables into one VIEW.
My understanding is that if the number of columns differs, we cannot use UNION.
How do I solve this?
I have the following three TABLES:
1. Albums
2. AlbumPictures
3. Stories
I want to have 3 tables as follows (I can do this part using INNER JOINs; kindly correct me if I am wrong):
For Stories: StoryID,AlbumID,StoryTitle,AlbumCover,Votes
For Albums: AlbumID,AlbumName,AlbumCover,Votes
For Pictures: AlbumPictureID,Votes
I want to merge all the rows retrieved from the above queries into one VIEW and shuffle them. As the number of columns is different in each of the result sets above, am I able to combine them into one VIEW?
In your UNION SQL, either remove the extra columns from the query that returns too many, or add extra columns with constant default values to the query that returns too few.
Based on your example output, adding extra constant values might look like this...
Select StoryID id, AlbumID,
StoryTitle name, AlbumCover, Votes
From Stories
UNION
Select AlbumID id, AlbumID,
AlbumName name, AlbumCover, Votes
From Albums
UNION
Select AlbumPictureID id, null AlbumId,
null name, null AlbumCover, Votes
From pictures
Order By id, Votes, name
But this makes me want to ask WHY???
EDIT: To sort, just add an order by using output column names, as shown above....
In order to use a UNION or UNION ALL operator, each query has to return the same number of columns, and the datatypes of corresponding columns have to be compatible.
One trick you can use is to return a NULL value for the columns that are "missing" from some of the queries.
For performance, I recommend you use the UNION ALL operator in place of the UNION operator, if removing duplicates is not a requirement.
Whenever I need to do something like this, I usually include a literal in each query, as an identifier of which query the row came from.
e.g.
SELECT 'a' AS source
, a.id AS id
, a.name AS name
FROM table_a a
UNION ALL
SELECT 'b' AS source
, b.id AS id
, NULL AS name
FROM table_b b
ORDER BY 1,2
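The NULL-padding and source-literal ideas can be combined into a view. The sketch below runs through Python's sqlite3 module, with two cut-down tables and invented rows; the view name AllItems and the 'album'/'picture' labels are assumptions for illustration only.

```python
import sqlite3

# Cut-down, hypothetical versions of two of the tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Albums (AlbumID INTEGER, AlbumName TEXT, Votes INTEGER);
    CREATE TABLE AlbumPictures (AlbumPictureID INTEGER, Votes INTEGER);
    INSERT INTO Albums VALUES (1, 'Holiday', 5);
    INSERT INTO AlbumPictures VALUES (7, 3);

    -- Both branches return four columns; the missing name is padded with NULL.
    CREATE VIEW AllItems AS
        SELECT 'album' AS source, AlbumID AS id, AlbumName AS name, Votes
        FROM Albums
        UNION ALL
        SELECT 'picture' AS source, AlbumPictureID AS id, NULL AS name, Votes
        FROM AlbumPictures;
""")

rows = conn.execute(
    "SELECT source, id, name, Votes FROM AllItems ORDER BY source, id"
).fetchall()
for row in rows:
    print(row)
```

In SQLite, replacing the ORDER BY with ORDER BY RANDOM() in the outer query would shuffle the combined rows.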
You can do something like this. All three queries return the same set of columns, padded with null values where needed, and the tableName column identifies which tables each row came from.
EDIT: I have to say, this is not the right approach. I wanted to show you how to union tables but I think now it is getting ugly when editing it according to your comments.
--Note: Vote is on all three table, I have selected from Stories
select s.storyId, a.albumId, s.storyTitle, null albumName,
ap.albumCover, s.votes , null albumPictureId, 'stories-albums-albumPics' tableName
from Stories s join Albums a on s.albumId = a.albumId
join AlbumPictures ap on a.albumid = ap.albumId
UNION ALL
select null storyId, a.albumID, null storyTitle, a.albumName,
ap.albumCover, a.votes, null albumPictureId, 'albums-albumPics' tableName
from Albums a join AlbumPictures ap on a.albumid = ap.albumId
UNION ALL --use required table here as well
select null storyId, null albumId, null storyTitle, null albumName,
null albumCover, votes, albumPictureId, 'pictures' tableName
from Pictures
I guess this makes a little sense:
Select StoryID+'SID' id, AlbumID,
StoryTitle name, AlbumCover, Votes
From Stories
UNION
Select AlbumID+'AID' id, AlbumID,
AlbumName name, AlbumCover, Votes
From Albums
UNION
Select AlbumPictureID+'APID' id, null AlbumId,
null name, null AlbumCover, Votes
From pictures
Concatenating 'SID', 'AID' and 'APID' onto the ids will make more sense when you see the data in the UI.
select * from Stories as s
inner join Albums as a on a.AccountID = s.AccountID
inner join Pictures as p on p.AccountID = s.AccountID
This will return everything, as long as AccountID is defined in all 3 tables.
To obtain only specific columns, replace * with the columns you want.
Why on earth would you need the data to be all in the same view? Just return 3 sets of data. If for example you are using a web browser as the front end, you could perform three queries and return them as a single set of JSON, for example:
{
Albums: [
{AlbumID: 1, AlbumName: 'blah', ... },
{AlbumID: 2, AlbumName: 'gorp', ... }
],
AlbumPictures: [
{AlbumID: 1, URL: 'http://fun.jpg'},
{AlbumID: 1, URL: 'http://fun2.jpg'}
],
Stories: [
{StoryID: 3, StoryTitle: 'Jack & Jill', ... },
{ etc. }
]
}
There is absolutely no programming architectural constraint forcing you to put everything together in a single view.
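As a rough sketch of the three-queries-one-response idea, the following uses Python's sqlite3 and json modules. The column sets are trimmed down and the rows reuse the placeholder values from the JSON above; the helper fetch_all is hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.executescript("""
    CREATE TABLE Albums (AlbumID INTEGER, AlbumName TEXT);
    CREATE TABLE AlbumPictures (AlbumID INTEGER, URL TEXT);
    CREATE TABLE Stories (StoryID INTEGER, StoryTitle TEXT);
    INSERT INTO Albums VALUES (1, 'blah'), (2, 'gorp');
    INSERT INTO AlbumPictures VALUES (1, 'http://fun.jpg'), (1, 'http://fun2.jpg');
    INSERT INTO Stories VALUES (3, 'Jack & Jill');
""")


def fetch_all(table):
    # Table names come from the fixed tuple below, never from user input.
    return [dict(row) for row in conn.execute(f"SELECT * FROM {table}")]


# One query per table, combined into a single JSON document.
payload = {name: fetch_all(name) for name in ("Albums", "AlbumPictures", "Stories")}
body = json.dumps(payload, indent=2)
print(body)
```

Each table keeps its own columns, so nothing has to be padded with NULLs to make the shapes match.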
I have an SQL Function with the following SQL within:
SELECT StockID FROM (SELECT DISTINCT StockID,
ROW_NUMBER() OVER(ORDER BY DateAdded DESC) AS RowNum
FROM Stock
WHERE CategoryCode LIKE #CategoryID) AS Info
WHERE RowNum BETWEEN #startRowIndex AND (#startRowIndex + #maximumRows) - 1
I have a parameter #CategoryID. However, I need to take in a category ID such as "BA" and translate it to a list of category IDs such as "IE", "EG", etc., so that my WHERE clause looks like:
WHERE (CategoryCode LIKE 'IE' OR CategoryCode LIKE 'EG') AS Info
I have a Lookup Table which contains the "BA" code and then all the real category codes this means such as "IE" and "EG".
How do I have the CategoryID expand to multiple "OR" statements in my SQL Function?
I am unsure how to do this, can anyone solve this problem?
At the moment the query as shown can cope with one CategoryID such as "IE". I want a category page such as category.aspx that is passed a parameter such as "BA" (category.aspx?category=BA) and lists all items with the category codes "EG" and "IE".
The reason I need this is there is a "parent" category code which has multiple "children" category codes which are different to the parent code. I am using ASP.NET and .NET 3.5 on the front-end if this helps.
Try using
WHERE CategoryCode IN (
SELECT LookupCategoryCode
FROM LookupTable
WHERE LookupCategoryId = #CategoryId
)
Replace "LookupCategoryCode", "LookupTable", and "LookupCategoryId" with the respective names in your lookup table.
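A small sketch of the IN-subquery lookup, run through Python's sqlite3 module. The lookup table and column names are the placeholder names from the answer, the stock rows are invented, and a ? placeholder stands in for the function parameter.

```python
import sqlite3

# Hypothetical data: parent code 'BA' expands to the real codes 'IE' and 'EG'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stock (StockID INTEGER, CategoryCode TEXT);
    CREATE TABLE LookupTable (LookupCategoryId TEXT, LookupCategoryCode TEXT);
    INSERT INTO Stock VALUES (1, 'IE'), (2, 'EG'), (3, 'ZZ');
    INSERT INTO LookupTable VALUES ('BA', 'IE'), ('BA', 'EG');
""")

query = """
    SELECT StockID
    FROM Stock
    WHERE CategoryCode IN (SELECT LookupCategoryCode
                           FROM LookupTable
                           WHERE LookupCategoryId = ?)
    ORDER BY StockID;
"""
stock_ids = [row[0] for row in conn.execute(query, ("BA",))]
print(stock_ids)
```

The subquery does the expansion, so the outer query never needs a hand-built chain of OR conditions.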
Assuming the parameter is a comma-delimited list of category IDs, try:
WHERE charindex(','+CategoryCode+',',','+#CatParam+',') > 0
Performance won't be great, but it should do the trick for you
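The delimiter trick can be tried out in SQLite, where instr() plays the role of SQL Server's charindex with the arguments in the opposite order. This is only a sketch with made-up stock rows; wrapping both sides in commas is what stops the code 'E' from matching inside 'IE'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stock (StockID INTEGER, CategoryCode TEXT);
    INSERT INTO Stock VALUES (1, 'IE'), (2, 'EG'), (3, 'E');
""")

# instr(haystack, needle) returns 0 when the needle is not found.
query = """
    SELECT StockID
    FROM Stock
    WHERE instr(',' || ? || ',', ',' || CategoryCode || ',') > 0
    ORDER BY StockID;
"""
matched = [row[0] for row in conn.execute(query, ("IE,EG",))]
print(matched)
```

The expression is evaluated for every row and cannot use an index on CategoryCode, which is why performance suffers on large tables.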