Group By breaks Having clause - count

I am querying the DB to get all venues within a radius X of a given latitude and longitude.
Everything works well until I add a count and a GROUP BY, at which point the HAVING clause no longer filters out records that are
SELECT
    locations.id AS 'locations_id_c',
    locations.city AS 'locations_city_c',
    locations.province AS 'locations_state_c',
    locations.latitude AS 'latitude_c',
    locations.longitude AS 'longitude_c',
    venues.id AS 'venues_id_c',
    venues.name AS 'venues_name_c',
    venues.venue_rank,
    venues.median_fb_likes,
    COUNT(gigs.show_time),
    COUNT(band_profiles.name) AS 'Related Band',
    # San Francisco lat & long = 37.7833° N, 122.4167° W
    # ( 3959 * acos( cos( radians(37.7833) ) * cos( radians( locations.latitude) ) * cos( radians( locations.longitude ) - radians(-122.4167) ) + sin( radians(37.7833) ) * sin(radians(locations.latitude)) ) ) AS 'distance'
    3956 * 2 * ASIN(SQRT( POWER(SIN((37.7833 - abs(locations.latitude)) * pi()/180/2),2) +COS(37.7833 * pi()/180 )*COS(abs(locations.latitude)*pi()/180) *POWER(SIN((locations.longitude + 122.4167)*pi()/180/2),2))) AS 'distance'
FROM locations
INNER JOIN venues
    ON locations.locatable_id = venues.id
    AND venues.venue_rank >= 0
LEFT JOIN gigs
    ON gigs.venue_id = venues.id
    AND gigs.show_time > NOW()
LEFT JOIN band_profiles
    ON gigs.account_id = band_profiles.account_id
    AND band_profiles.name IS NOT NULL
WHERE
    locations.locatable_type = 'Venue'
GROUP BY venues.id
HAVING distance < 50
ORDER BY venues.venue_rank DESC, distance DESC;

Got it working.
It seems the GROUP BY must be on the main table, and not on one of the joined tables.
Does anyone have an explanation for why this is so?
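
One possible way to keep the distance filter independent of the grouping is to compute the distance per row in a derived table, filter on it with WHERE, and only then join and aggregate. The sketch below is not the asker's fix and does not explain the MySQL behaviour. It assumes a simplified schema (coordinates stored directly on venues, no locations table), made-up rows, and a hypothetical distance_miles helper registered from Python, and it uses SQLite in place of MySQL.

```python
import math
import sqlite3


def distance_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles (radius 3956, as in the query above)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 3956 * 2 * math.asin(math.sqrt(a))


conn = sqlite3.connect(":memory:")
# Hypothetical helper: expose the Python function to SQL.
conn.create_function("distance_miles", 4, distance_miles)
conn.executescript("""
CREATE TABLE venues (id INTEGER PRIMARY KEY, name TEXT, latitude REAL, longitude REAL);
CREATE TABLE gigs (id INTEGER PRIMARY KEY, venue_id INTEGER);
INSERT INTO venues VALUES
    (1, 'Near', 37.80, -122.40),
    (2, 'Far', 34.05, -118.24),
    (3, 'Near, no gigs', 37.70, -122.45);
INSERT INTO gigs (venue_id) VALUES (1), (1), (2);
""")

# Step 1 (inner query): one distance per venue row.
# Step 2 (WHERE): drop venues outside the radius before any grouping.
# Step 3 (GROUP BY): count gigs for the venues that remain.
rows = conn.execute("""
SELECT v.id, v.name, COUNT(g.id) AS gig_count
FROM (
    SELECT id, name,
           distance_miles(37.7833, -122.4167, latitude, longitude) AS distance
    FROM venues
) AS v
LEFT JOIN gigs AS g ON g.venue_id = v.id
WHERE v.distance < 50
GROUP BY v.id, v.name
ORDER BY v.id
""").fetchall()

for row in rows:
    print(row)
```

Because the filter runs before the aggregation, the counts are computed only for venues inside the radius, and the filter does not depend on which row of a group supplies the distance value.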

Related

How to get the sum of rows then get the average of the calculate sum

I have a database that represents music shop data.
The tables I want to work with are artist, album, and track.
Every album has an albumID, title, and artistID,
and every track has a TrackID, Milliseconds, albumID, and name.
As you know, every album has many tracks.
I want to calculate the total length of all songs in each album (in milliseconds), then take the average of those totals to find the albums that are longer than average.
I managed to calculate the length of every album, but I struggled to get the average.
I want to generate a table like this:
AlbumID | Title | Milliseconds
--------+-------+-------------
1       |       |
2       |       |
3       |       |
...
10      |       |
Sample rows from the Track table:
TrackID | Name                    | AlbumId | Milliseconds
--------+-------------------------+---------+-------------
1       | For Those About To Rock | 1       | 343719
6       | Put The Finger On You   | 1       | 205662
7       | Let's Get It Up         | 1       | 233926
85      | Cochise                 | 10      | 222380
86      | Show Me How to Live     | 10      | 277890
87      | Gasoline                | 10      | 279457
88      | What You Are            | 10      | 249391
89      | Like a Stone            | 10      | 294034
99      | Your Time Has Come      | 11      | 255529
100     | Out Of Exile            | 11      | 291291
101     | Be Yourself             | 11      | 279484
102     | Doesn't Remind Me       | 11      | 255869
111     | Money                   | 12      | 147591
112     | Long Tall Sally         | 12      | 106396
So the result should be limited to 10 rows (LIMIT 10), and Milliseconds is the total length of the album's tracks.

I believe that the following may be what you wish :-
WITH albumsums(id,asum) AS (
SELECT albumid, sum(milliseconds)
FROM track
GROUP BY albumid
)
SELECT album.albumid, album.title, asum
FROM album
JOIN albumsums
ON album.albumid = albumsums.id
WHERE asum > (SELECT avg(asum) FROM albumsums)
LIMIT 10;
Consider the following demo :-
DROP TABLE IF EXISTS track;
DROP TABLE IF EXISTS artist;
DROP TABLE IF EXISTS album;
CREATE TABLE IF NOT EXISTS track (trackid INTEGER PRIMARY KEY, name TEXT, milliseconds INTEGER, albumid);
CREATE TABLE IF NOT EXISTS artist (artistid INTEGER PRIMARY KEY, artistname TEXT);
CREATE TABLE IF NOT EXISTS album (albumid INTEGER PRIMARY KEY, title TEXT, artistid INTEGER);
INSERT INTO artist (artistname) VALUES ('Pink Floyd'),('Genesis'),('Deep Purple');
INSERT INTO album (title,artistid) VALUES('Dark side of the moon',1),('Fireball',3),('Foxtrot',2);
INSERT INTO track (name,milliseconds,albumid) VALUES
('Supper''s Ready',((22 * 60) + 57) * 1000,3),
('Watcher of the Skies',((7 * 60) + 21) + 1000,3),
('Time Table',((4 * 60) + 47) * 1000,3),
('Get ''Em Out by Friday',((8 * 60) + 35) * 1000,3),
('Can-Utility and the Coastliners',((5 * 60) + 45 ) * 1000,3),
('Speak to me /Breath',((3 * 60) + 58) * 1000,1),
('On the Run',((3 * 60) + 35) * 1000,1),
('Time',((7 * 60) + 5) * 1000,1),
('The Great Gig in the Sky',((4 * 60) + 44) * 1000,1),
('Money',((6 * 60) + 23) * 1000,1),
('Use and Them',((7 * 60) + 50) * 1000,1),
('Any Colour you Like',((3 * 60) + 26) * 1000,1),
('Brain Damage',((3 * 60) + 47) * 1000,1),
('Fireball',((3 * 60) + 24) * 1000,2),
('No No No',((6 * 60) + 54) * 1000,2),
('Demon''s Eye',((5 * 60) + 21) * 1000,2),
('Anyone''s Daughter',((4 * 60) + 43) * 1000,2),
('The Mule',((5 * 60) + 21) * 1000,2),
('Fools',((8 * 60) + 19) * 1000,2),
('No One Came',((6 * 60) + 34) * 1000,1),
('Strange Kind of Woman',((4 * 60) + 07) * 1000,1)
;
SELECT * FROM artist;
SELECT * FROM album;
SELECT * FROM track;
SELECT albumid, sum(milliseconds)
FROM track
GROUP BY albumid
;
WITH albumsums(id,asum) AS (
SELECT albumid, sum(milliseconds)
FROM track
GROUP BY albumid
)
SELECT album.albumid, album.title, asum, (SELECT avg(asum) FROM albumsums) AS album_average_for_demo
FROM album
JOIN albumsums
ON album.albumid = albumsums.id
WHERE asum > (SELECT avg(asum) FROM albumsums);
The CTE (common table expression) albumsums, as shown by the standalone GROUP BY query before it, produces 3 rows (one per album) containing the album id and the sum of its track lengths.
The average album length is therefore 2552147, so only 1 album is above the average (as would be expected from such a limited amount of data).
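
As a further check, the same CTE can be run against the sample Track rows given in the question. This is a minimal sketch using Python's sqlite3 module; the album table is left out because the question does not list album titles, so only ids and sums are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE track (trackid INTEGER PRIMARY KEY, name TEXT, "
    "albumid INTEGER, milliseconds INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO track VALUES (?, ?, ?, ?)",
    [
        (1, "For Those About To Rock", 1, 343719),
        (6, "Put The Finger On You", 1, 205662),
        (7, "Let's Get It Up", 1, 233926),
        (85, "Cochise", 10, 222380),
        (86, "Show Me How to Live", 10, 277890),
        (87, "Gasoline", 10, 279457),
        (88, "What You Are", 10, 249391),
        (89, "Like a Stone", 10, 294034),
        (99, "Your Time Has Come", 11, 255529),
        (100, "Out Of Exile", 11, 291291),
        (101, "Be Yourself", 11, 279484),
        (102, "Doesn't Remind Me", 11, 255869),
        (111, "Money", 12, 147591),
        (112, "Long Tall Sally", 12, 106396),
    ],
)

# Sum per album in the CTE, then keep albums whose sum exceeds the average sum.
rows = conn.execute("""
WITH albumsums(id, asum) AS (
    SELECT albumid, SUM(milliseconds)
    FROM track
    GROUP BY albumid
)
SELECT id, asum, (SELECT AVG(asum) FROM albumsums) AS avg_asum
FROM albumsums
WHERE asum > (SELECT AVG(asum) FROM albumsums)
ORDER BY id
""").fetchall()

for row in rows:
    print(row)
# (10, 1323152, 860654.75)
# (11, 1082173, 860654.75)
```

With these rows the four album sums are 783307, 1323152, 1082173 and 253987, so albums 10 and 11 are the ones above the average of 860654.75.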

calculate percentages with postgresql join queries

I am trying to calculate percentages by joining data from 3 tables, to get the percentages of positive, negative, and neutral tweets for each user. I have succeeded in getting the positive, negative, and neutral counts, but I am failing to get them as percentages instead of counts. Here is the query to get the counts:
SELECT
t1.u_id,count(*) as total_tweets_count ,
(
SELECT count(*) from t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id AND
t3.sentiment='Positive'
) as pos_count ,
(
SELECT count(*) from t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id AND
t3.sentiment='Negative'
) as neg_count ,
(
SELECT count(*) from t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id AND
t3.sentiment='Neutral'
) as neu_count
FROM t1,t2,t3
WHERE
t1.u_id='18839785' AND
t1.u_id=t2.u_id AND
t2.ts_id=t3.ts_id
GROUP BY t1.u_id;
OUTPUT:
u_id | total_tweets_count | pos_count | neg_count | neu_count
-----------------+--------------------+-----------+-----------+-------
18839785| 88 | 38 | 25 | 25
(1 row)
Now I want the same in percentages instead of counts. I have written the query in the following way but failed.
SELECT
total_tweets_count,pos_count,
round((pos_count * 100.0) / total_tweets_count, 2) AS pos_per,neg_count,
round((neg_count * 100.0) / total_tweets_count, 2) AS neg_per,
neu_count, round((neu_count * 100.0) / total_tweets_count, 2) AS neu_per
FROM (
SELECT
count(*) as total_tweets_count,
count(
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id AND
c.sentiment='Positive'
) AS pos_count,
count(
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id AND
c.sentiment='Negative'
) AS neg_count,
count(
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id AND
c.sentiment='Neutral') AS neu_count
FROM t1 a, t2 b, t3 c
WHERE
a.u_id='18839785' AND
a.u_id=b.u_id AND
b.ts_id=c.ts_id
GROUP BY a.u_id
) sub;
Can anyone help me get percentages for each user's data, as below?
u_id | total_tweets_count | pos_count | neg_count | neu_count
------------------+--------------------+-----------+-----------+-----
18839785| 88 | 43.18 | 28.4 | 28.4
(1 row)

I am not entirely sure what you are looking for.
For starters, you can simplify your query by using conditional aggregation instead of three scalar subqueries (which btw. do not need to repeat the where condition on a.u_id)
You state you want to "count for all users", so you need to remove the WHERE clause in the main query. The simplification also gets rid of the repeated WHERE condition.
select u_id,
total_tweets_count,
pos_count,
round((pos_count * 100.0) / total_tweets_count, 2) AS pos_per,
neg_count,
round((neg_count * 100.0) / total_tweets_count, 2) AS neg_per,
neu_count,
round((neu_count * 100.0) / total_tweets_count, 2) AS neu_per
from (
SELECT
t1.u_id,
count(*) as total_tweets_count,
count(case when t3.sentiment='Positive' then 1 end) as pos_count,
count(case when t3.sentiment='Negative' then 1 end) as neg_count,
count(case when t3.sentiment='Neutral' then 1 end) as neu_count
FROM t1
JOIN t2 ON t1.u_id=t2.u_id
JOIN t3 ON t2.ts_id=t3.ts_id
-- no WHERE condition on the u_id here
GROUP BY t1.u_id
) t
Note that I replaced the outdated, ancient and fragile implicit joins in the WHERE clause with "modern" explicit JOIN operators.
With a more up-to-date Postgres version, the expression count(case when t3.sentiment='Positive' then 1 end) as pos_count can also be rewritten as:
count(*) filter (where t3.sentiment='Positive') as pos_count
which is a bit more readable (and understandable I think).
In your original query you can avoid repeating the hardcoded WHERE condition on the u_id by using a correlated subquery, e.g.:
(
SELECT count(*)
FROM t1 inner_t1 --<< use different aliases than in the outer query
JOIN t2 inner_t2 ON inner_t2.u_id = inner_t1.u_id
JOIN t3 inner_t3 ON inner_t3.ts_id = inner_t2.ts_id
-- referencing the outer t1 removes the need to repeat the hardcoded ID
WHERE inner_t1.u_id = t1.u_id
AND inner_t3.sentiment = 'Positive'
) as pos_count
The repetition of the table t1 isn't necessary either, so the above could be re-written to:
(
SELECT count(*)
FROM t2 inner_t2
JOIN t3 inner_t3 ON inner_t3.ts_id = inner_t2.ts_id
WHERE inner_t2.u_id = t1.u_id --<< this references the outer t1 table
AND inner_t3.sentiment = 'Positive'
) as pos_count
But the version with conditional aggregation will still be a lot faster than using three scalar sub-queries (even if you remove the unnecessary repetition of the t1 table).
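
The conditional-aggregation query above can be tried on a toy data set. The sketch below uses Python's sqlite3 module rather than Postgres, and the column layout of t1, t2 and t3 as well as the rows are assumptions made only for illustration (the question does not show the table definitions). It uses the CASE form because it runs on old and new SQLite versions alike.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: t1 = users, t2 = tweets per user, t3 = sentiment per tweet.
conn.executescript("""
CREATE TABLE t1 (u_id TEXT PRIMARY KEY);
CREATE TABLE t2 (ts_id INTEGER PRIMARY KEY, u_id TEXT);
CREATE TABLE t3 (ts_id INTEGER, sentiment TEXT);
INSERT INTO t1 VALUES ('u1'), ('u2');
INSERT INTO t2 VALUES (1, 'u1'), (2, 'u1'), (3, 'u1'), (4, 'u1'),
                      (5, 'u2'), (6, 'u2'), (7, 'u2');
INSERT INTO t3 VALUES (1, 'Positive'), (2, 'Positive'), (3, 'Negative'), (4, 'Neutral'),
                      (5, 'Positive'), (6, 'Negative'), (7, 'Negative');
""")

# Inner query: one pass over the joined rows, counting each sentiment per user.
# Outer query: turn the counts into percentages of the user's total.
rows = conn.execute("""
SELECT u_id,
       total_tweets_count,
       ROUND(pos_count * 100.0 / total_tweets_count, 2) AS pos_per,
       ROUND(neg_count * 100.0 / total_tweets_count, 2) AS neg_per,
       ROUND(neu_count * 100.0 / total_tweets_count, 2) AS neu_per
FROM (
    SELECT t1.u_id AS u_id,
           COUNT(*) AS total_tweets_count,
           COUNT(CASE WHEN t3.sentiment = 'Positive' THEN 1 END) AS pos_count,
           COUNT(CASE WHEN t3.sentiment = 'Negative' THEN 1 END) AS neg_count,
           COUNT(CASE WHEN t3.sentiment = 'Neutral' THEN 1 END) AS neu_count
    FROM t1
    JOIN t2 ON t1.u_id = t2.u_id
    JOIN t3 ON t2.ts_id = t3.ts_id
    GROUP BY t1.u_id
) AS t
ORDER BY u_id
""").fetchall()

for row in rows:
    print(row)
```

COUNT ignores NULLs, and a CASE without ELSE yields NULL when the condition is false, which is why each COUNT(CASE ...) only counts the matching rows.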

The query performance is very low + correlated subquery

The query's aim is to get a status for each reservation room: 3 if it is reserved, 2 if it is going to be reserved within 30 to 45 minutes, and 1 if it is unreserved. Reservation rooms are in the RESEENH table, and each reservation is in the ORD_NOARCHIVE table, which has the begin time and end time of the reservation. So for each reservation room, this query checks whether a reservation exists at the current time; it also checks the meeting room's parents and children. If a child is reserved, then its parents are blocked.
It takes 10 seconds to fetch the first 50 records.
with cte as
(
SELECT DISTINCT R.syscode,
R.behcode,
R.syscode AS FK_RESERVATIONUNIT, (
CASE
WHEN R.TYPE = 3 THEN '1'
WHEN R.TYPE = 1 THEN '2'
ELSE NULL
END ) AS LOCATION_TYPE,
R.sysobjalg AS FK_PROPERTY,
MP.syscode AS FK_MEASUREMENTPOINT,
MP.fk_plc_occupancy_state AS FK_PLC_OCCUPANCY_STATE,
F.syscode AS FK_FLOOR,
R.transitiontime,
r.type,
r.is_compoundreservationunit,
r.is_archived,
MP.fk_person,
os.transitionperiod
FROM reseenh R
--left outer join ordtrantbl RSS
--ON RSS.reservationunisyscode = R.syscode
left outer join objond F
ON F.syscode = R.fk_floor
left outer join pln_measurementpoint MP
ON MP.fk_reservationunit = R.syscode
AND MP.is_primary_measurement_point = 'T',
pln_ordersetting os
)
select cte.syscode,cte.behcode,cte.FK_RESERVATIONUNIT,
(CASE
WHEN O.begindatetime_user IS NULL THEN '1' --GREEN
WHEN O.begindatetime_user - (Nvl(cte.transitiontime, ( cte.transitionperiod ))/1440 ) > current_date THEN '2' -- ORANGE
WHEN O.begindatetime_user + (Nvl(cte.transitiontime, ( cte.transitionperiod )) /1440 ) > current_date THEN '3' -- RED
ELSE '3'
END ) AS LOCAVAILABILITY_STATUS_CODE,
cte.LOCATION_TYPE,
cte.FK_PROPERTY,
Coalesce(O.sysmelder, cte.fk_person) AS FK_PERSON,
O.syscode AS FK_ORDER,
O.ref_bostate_userdefined AS FK_ORDER_STATE_USER,
O.fk_bostate AS FK_ORDER_STATE_SYSTEM,
FK_MEASUREMENTPOINT,FK_PLC_OCCUPANCY_STATE,FK_FLOOR
from cte left outer join ord_noarchive O on O.syscode in
( SELECT MAX(ord.syscode) KEEP (DENSE_RANK FIRST ORDER BY ord.begindatetime_user) OVER (PARTITION BY ord.sysreseenh )
FROM ord_noarchive ORD
WHERE ( ( (
current_date >= ( ORD.begindatetime_user - ( Nvl(cte.transitiontime, ( cte.transitionperiod ))/1440) )
AND (
current_date - ( Nvl(cte.transitiontime, (cte.transitionperiod )) / 1440 ) ) <=ORD.enddatetime_user )
OR ( (
current_date + ( (
CASE
WHEN (
cte.TYPE = 1 ) THEN 30
ELSE 45
END ) / 1440 ) ) >= ( ORD.begindatetime_user - (Nvl(cte.transitiontime, ( cte.transitionperiod))/1440 ) )
AND (
current_date - ( Nvl(cte.transitiontime, ( cte.transitionperiod )) / 1440 ) ) < ORD.enddatetime_user ) )
AND ORD.sysreseenh IN
(
SELECT fk_reservationunit_parent
FROM pln_reservationunit_rut
WHERE fk_reservationunit_child IN
(
SELECT fk_reservationunit_child
FROM pln_reservationunit_rut
WHERE cte.is_compoundreservationunit = 'T'
AND fk_reservationunit_parent = cte.syscode)
UNION
SELECT cte.syscode
FROM dual
UNION
SELECT
CASE
WHEN cte.is_compoundreservationunit = 'T' THEN fk_reservationunit_child
ELSE fk_reservationunit_parent
END
FROM pln_reservationunit_rut
WHERE (
cte.is_compoundreservationunit = 'T'
AND fk_reservationunit_parent = cte.syscode )
OR (
cte.is_compoundreservationunit = 'F'
AND fk_reservationunit_child = cte.syscode ))
AND ORD.fk_bostate IN
(
SELECT syscode
FROM pln_bostate
WHERE pnname IN ( 'Requested',
'Made',
'AdministrativelyCompleted' )
AND fk_bodefinition = ref_bodefinition)
AND ORD.sysreseenh = O.sysreseenh
))
WHERE cte.is_archived = 'F'
AND cte.TYPE IN ( 1,
3 )
AND cte.fk_floor=495

No time to analyze the details of a very complex query, but my overall sense is that you are trying to do too many things at once. Separate the tasks. Check on room availability in one query, and check on the children/parent thing in a separate query.
Also, you could analyze the execution plan for the query and see what is bogging it down. My suspicion (again without time to really try to understand your query) is that at some point the mixing of tasks is translating into a many-to-many relationship where you have an intermediate result that is a cross product of rows between some of the tables in your query.
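
To illustrate the cross-product suspicion: in the question's CTE, pln_ordersetting is comma-joined without any join condition, so every row of the other tables is paired with every settings row. Whether that table actually holds more than one row is not stated, so the following is only a sketch of the effect. It uses SQLite through Python with hypothetical rooms and settings tables; in Oracle the plan would be inspected with EXPLAIN PLAN FOR and DBMS_XPLAN.DISPLAY rather than SQLite's EXPLAIN QUERY PLAN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the reservation-unit and order-setting tables.
conn.executescript("""
CREATE TABLE rooms (syscode INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE settings (id INTEGER PRIMARY KEY, transitionperiod INTEGER);
INSERT INTO rooms VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO settings VALUES (1, 15), (2, 30);
""")

room_count = conn.execute("SELECT COUNT(*) FROM rooms").fetchone()[0]

# Comma join with no condition: every room is paired with every settings row.
joined_count = conn.execute(
    "SELECT COUNT(*) FROM rooms r, settings s"
).fetchone()[0]

print(room_count, joined_count)

# Ask the engine how it intends to execute the join.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT r.syscode, s.transitionperiod FROM rooms r, settings s"
).fetchall()
for step in plan:
    print(step[-1])  # the last column holds the human-readable plan step
```

Here 3 rooms and 2 settings rows give 6 joined rows. If every downstream correlated subquery then runs once per joined row, the cost grows with that product.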

Pivot or Unpivot query results?

I am new to the confusing pivot and unpivot operators available in Oracle. Can someone help me pivot the results of a query like this?
SELECT * FROM
(
SELECT
ROUND(((SELECT COUNT(*) FROM TABLE_X WHERE X = 1) / (SELECT (COUNT(*)) FROM TABLE_X) * 100))AS X_FIELD,
ROUND(((SELECT COUNT(*) FROM TABLE_Y WHERE Y = 1) / (SELECT (COUNT(*)) FROM TABLE_Y) * 100))AS Y_FIELD FROM DUAL
);
Results are something like this:
X_FIELD Y_FIELD
---------- ----------
37 26
And I need something like this:
FIELDS PERCENTAGE
---------- ----------
X_FIELD 37
Y_FIELD 26
Please guide me how to achieve this.

Try this:
WITH temp_result AS
(SELECT ROUND((
(SELECT COUNT(*) FROM TABLE_X WHERE X = 1) /(SELECT (COUNT(*)) FROM TABLE_X) * 100))AS X_FIELD,
ROUND(((SELECT COUNT(*) FROM TABLE_Y WHERE Y = 1) /(SELECT (COUNT(*)) FROM TABLE_Y) * 100))AS Y_FIELD
FROM DUAL
)
SELECT *
FROM temp_result UNPIVOT INCLUDE NULLS ( VALUE FOR COL IN (X_FIELD,Y_FIELD));
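
For comparison, the same reshaping can be written without the UNPIVOT operator by selecting each column separately and stacking the results with UNION ALL. This is a different technique from the answer above, sketched here in SQLite through Python; the one-row temp_result CTE simply hard-codes the 37 and 26 from the question instead of computing them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One SELECT per column to unpivot; UNION ALL stacks them into rows.
rows = conn.execute("""
WITH temp_result(x_field, y_field) AS (
    SELECT 37, 26
)
SELECT 'X_FIELD' AS fields, x_field AS percentage FROM temp_result
UNION ALL
SELECT 'Y_FIELD', y_field FROM temp_result
ORDER BY fields
""").fetchall()

for row in rows:
    print(row)
# ('X_FIELD', 37)
# ('Y_FIELD', 26)
```

The UNION ALL form needs one branch per column, so UNPIVOT stays shorter when many columns are involved.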

Symfony2, Doctrine search and order by distance

I have a database of places which I store the latitude and longitude for. I want to query the database to find all places within a radius ($requestedDistance) of a specific latitude ($latitude) and longitude ($longitude).
The below query works and returns only those places within this radius, however how would I order them by distance so the closest is first? In the past, using raw SQL I have done the calculation within the SELECT statement and set it as 'distance' and then used HAVING distance < $requestedDistance ORDER BY distance, however I'm not sure how to add the calculation to the SELECT query using Doctrine Repository.
$d = $this->getDoctrine()->getRepository('XXXWebsiteBundle:Locations')->createQueryBuilder('l');
$d
->add('where','l.enabled = 1')
->andWhere('( 3959 * acos( cos( radians('.$latitude.') )
* cos( radians( l.latitude ) )
* cos( radians( l.longitude )
- radians('.$longitude.') )
+ sin( radians('.$latitude.') )
* sin( radians( l.latitude ) ) ) ) < '.$requestedDistance);
$result= $d->getQuery();
UPDATE
I've tried the following query thanks to @Lazy Ants:
$d = $this->getDoctrine()->getRepository('XXXWebsiteBundle:Locations')->createQueryBuilder('l');
$d
->select('l')
->addSelect(
'( 3959 * acos(cos(radians(' . $latitude . '))' .
'* cos( radians( l.latitude ) )' .
'* cos( radians( l.longitude )' .
'- radians(' . $longitude . ') )' .
'+ sin( radians(' . $latitude . ') )' .
'* sin( radians( l.latitude ) ) ) ) as distance'
)
->andWhere('l.enabled = :enabled')
->setParameter('enabled', 1)
->andWhere('distance < :distance')
->setParameter('distance', $requestedDistance)
->orderBy('distance', 'ASC');
However, it returns the following error:
`An exception occurred while executing 'SELECT COUNT(*) AS dctrn_count FROM
(SELECT DISTINCT il0 FROM (SELECT l0_.id AS il0, l0_.name AS name2, l0_.address1
AS address14, l0_.address2 AS address25, l0_.postcode AS postcode6, l0_.town AS
town7, l0_.county AS county8, l0_.enabled AS enabled11, l0_.date_created AS
date_created12, l0_.date_modified AS date_modified13, l0_.latitude AS
latitude19, l0_.longitude AS longitude20, 3959 * ACOS(COS(RADIANS(53.51331889999999))
* COS(RADIANS(l0_.latitude)) * COS(RADIANS(l0_.longitude) -
RADIANS(-2.935331099999985)) + SIN(RADIANS(53.51331889999999)) *
SIN(RADIANS(l0_.latitude))) AS sclr21 FROM locations l0_ WHERE l0_.enabled = ?
AND sclr21 < ? ORDER BY sclr21 ASC) dctrn_result) dctrn_table' with params
{"1":1,"2":30}:
SQLSTATE[42S22]: Column not found: 1054 Unknown column 'sclr21' in 'where clause'`

$d = $this->getDoctrine()->getRepository('XXXWebsiteBundle:Locations')->createQueryBuilder('l');
$d
->select('l')
->addSelect(
'( 3959 * acos(cos(radians(' . $latitude . '))' .
'* cos( radians( l.latitude ) )' .
'* cos( radians( l.longitude )' .
'- radians(' . $longitude . ') )' .
'+ sin( radians(' . $latitude . ') )' .
'* sin( radians( l.latitude ) ) ) ) as distance'
)
->andWhere('l.enabled = :enabled')
->setParameter('enabled', 1)
->having('distance < :distance')
->setParameter('distance', $requestedDistance)
->orderBy('distance', 'ASC');
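
The distance expression in these queries is the spherical law of cosines with an Earth radius of 3959 miles. The sketch below restates that formula in plain Python and applies the same "distance < requested distance, order by distance" logic to a list. The place names and their coordinates are made up for illustration, the origin is taken from the error message above, and the clamp before acos is an addition to guard against floating-point rounding.

```python
import math

EARTH_RADIUS_MILES = 3959  # same constant as in the queries above


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance via the spherical law of cosines."""
    cos_angle = (
        math.cos(math.radians(lat1))
        * math.cos(math.radians(lat2))
        * math.cos(math.radians(lon2) - math.radians(lon1))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )
    # Rounding can push the value slightly outside [-1, 1]; clamp before acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_MILES * math.acos(cos_angle)


def places_within(places, latitude, longitude, requested_distance):
    """Return (name, distance) pairs inside the radius, closest first."""
    with_distance = [
        (name, distance_miles(latitude, longitude, lat, lon))
        for name, lat, lon in places
    ]
    nearby = [item for item in with_distance if item[1] < requested_distance]
    return sorted(nearby, key=lambda item: item[1])


# Hypothetical places: (name, latitude, longitude).
places = [
    ("C", 51.5074, -0.1278),
    ("B", 53.40, -2.98),
    ("A", 53.52, -2.93),
]
result = places_within(places, 53.5133189, -2.9353311, 30)
names = [name for name, _ in result]
print(names)
```

Doing the filtering in the database, as in the query above, is still preferable for large tables; the Python version is only meant to make the formula and the filter-then-sort order explicit.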

You are using andWhere() without a where().
Replace the andWhere() with a where().
Please also check in your entity definition whether some property is mapped to a column called "sclr21", so you can also discover which property is causing the issue.
