Azure CosmosDB, count with subquery - count

I'm currently using CosmosDB to store some reviews data. I'm trying to retrieve the average rating from a specific date and also the number of registers that have a rating below 3.
The query that I'm using to retrieve the average rating and count the number of registers with ratings below 3 are:
**Ratings**
SELECT avg(c.x_review) as avg_x_review,
avg(c.y_review) as avg_y_review,
{date} as date
FROM c where c.date = {date}
**Criticals (below 3)**
SELECT count(1) FROM c where c.date = {date}
AND (
c.x_review < 3 OR
c.y_review < 3
)
I wanted it to be inside just one query. This data is going to be retrieved by an Azure Function HTTP Trigger, I will use and save the generated JSON from this query in another container.
I've been searching for tips but nothing seems to work, one example of a query that I'm trying to reach is:
SELECT avg(c.x_review) as avg_x_review,
avg(c.y_review) as avg_y_review,
count(SELECT * FROM c where c.date = {date}
AND (
c.x_review < 3 OR
c.y_review < 3
)
) as criticals
FROM c where c.date = {date}
...or something like that where I could have both queries inside one query alone.
I'm expecting to generate a JSON like this:
{
"avg_x_review": 5,
"avg_y_review": 4,
[...]
[...]
"criticals": 0,
"date": "2022-02-28"
}

Related

order of search for Sqlite's "IN" operator guaranteed?

I'm performing an Sqlite3 query similar to
SELECT * FROM nodes WHERE name IN ('name1', 'name2', 'name3', ...) LIMIT 1
Am I guaranteed that it will search for name1 first, name2 second, etc? Such that by limiting my output to 1 I know that I found the first hit according to my ordering of items in the IN clause?
Update: with some testing it seems to always return the first hit in the index regardless of the IN order. It's using the order of the index on name. Is there some way to enforce the search order?
The order of the returned rows is not guaranteed to match the order of the items inside the parenthesis after IN.
What you can do is use ORDER BY in your statement with the use of the function INSTR():
SELECT * FROM nodes
WHERE name IN ('name1', 'name2', 'name3')
ORDER BY INSTR(',name1,name2,name3,', ',' || name || ',')
LIMIT 1
This code uses the same list from the IN clause as a string, where the items are in the same order, concatenated and separated by commas, assuming that the items do not contain commas.
This way the results are ordered by their position in the list and then LIMIT 1 will return the 1st of them which is closer to the start of the list.
Another way to achieve the same results is by using a CTE which returns the list along with an Id which serves as the desired ordering of the results, which will be joined to the table:
WITH list(id, item) AS (
SELECT 1, 'name1' UNION ALL
SELECT 2, 'name2' UNION ALL
SELECT 3, 'name3'
)
SELECT n.*
FROM nodes n INNER JOIN list l
ON l.item = n.name
ORDER BY l.id
LIMIT 1
Or:
WITH list(id, item) AS (
SELECT * FROM (VALUES
(1, 'name1'), (2, 'name2'), (3, 'name3')
)
)
SELECT n.*
FROM nodes n INNER JOIN list l
ON l.item = n.name
ORDER BY l.id
LIMIT 1
This way you don't have to repeat the list twice.

Counting 2 Columns from Separate Tables Between 2 Dates

I have a query that Counts 2 columns from 2 separate tables using subqueries, which works. Now I have to implement into this query the ability to filter out these results based on the Date of a Call Record. I will post the query in which I am working with:
SELECT (m.FirstName || " " || m.LastName) AS Members,
(
SELECT count(CallToLineOfficers.MemberID)
FROM CallToLineOfficers
WHERE CallToLineOfficers.MemberID = m.MemberID
)
+ (
SELECT count(CallToMembers.MemberID)
FROM CallToMembers
WHERE CallToMembers.MemberID = m.MemberID
) AS Tally
FROM Members AS m, Call, CallToMembers, CallToLineOfficers
Join Call on CallToMembers.CallID = Call.CallID
and CallToLineOfficers.CallID = Call.CallI
WHERE m.FirstName <> 'None'
-- and Call.Date between '2017-03-21' and '2017-03-22'
GROUP BY m.MemberID
ORDER BY m.LastName ASC;
Ok, so table Call stores the Date and its PK is CallID. Both CallToLineOfficers and CallToMembers are Bridge Tables that also contain only CallID and MemberID. With the current query, where the Date is commented out, that Date range should only return all names, but a count of 1 should appear under 1 person's name.
I have tried joining Call.CallID with both Bridge Tables' CallIDs without any luck, though I think this is the right way to do it. Could someone help point me in the right direction? I am lost. (I tried explaining this the best I could, so if you need more info, let me know.)
UPDATED: Here is a screenshot of what I am getting:
Based on the provided date in the sample, the new results, with the Date, should be:
Bob Clark - 1
Rob Catalano - 1
Matt Butler - 1
Danielle Davidson - 1
Jerry Chuska - 1
Tom Cramer - 1
Everyone else should be 0.
At the moment, the subqueries filter only on the member ID. So for any member ID in the outer query, they return the full count.
To reduce the count, you have to filter in the subqueries:
SELECT (FirstName || " " || LastName) AS Members,
(
SELECT count(*)
FROM CallToLineOfficers
JOIN Call USING (CallID)
WHERE MemberID = m.MemberID
AND Date BETWEEN '2017-03-21' AND '2017-03-22'
)
+ (
SELECT count(*)
FROM CallToMembers
JOIN Call USING (CallID)
WHERE MemberID = m.MemberID
AND Date BETWEEN '2017-03-21' AND '2017-03-22'
) AS Tally
FROM Members AS m
WHERE FirstName <> 'None'
ORDER BY LastName ASC;

MS Access Group by query by n records

Does anybody know how to do a group by query by n records.
For example if I have a db with xn records I would like to aggregate the first 3 and then the next 3 and so on.
Where {x,n member of positive integers excluding 0} :)
Thanks
This does exactly what you want :
SELECT int(((T.Rank - 1) / 3)) AS GroupID, SUM(T.field_to_agregate)
FROM
(
SELECT (SELECT COUNT(*) FROM your_table AS T2 WHERE T1.ID>T2.ID) + 1 AS Rank , ID, field_to_agregate
FROM your_table AS T1
) T
GROUP BY int(((T.Rank - 1) / 3))
However since you did not posted any data sample and table structure (mistake!), I had to suppose that you have an ID field in your table, if not you will have to adapt it. If you don't suceed add more info about your data and I will adapt my query to match your table struct

Retrieve a table to tallied numbers, best way

I have query that runs as part of a function which produces a one row table full of counts, and averages, and comma separated lists like this:
select
(select
count(*)
from vw_disp_details
where round = 2013
and rating = 1) applicants,
(select
count(*)
from vw_disp_details
where round = 2013
and rating = 1
and applied != 'yes') s_applicants,
(select
LISTAGG(discipline, ',')
WITHIN GROUP (ORDER BY discipline)
from (select discipline,
count(*) discipline_number
from vw_disp_details
where round = 2013
and rating = 1
group by discipline)) disciplines,
(select
LISTAGG(discipline_count, ',')
WITHIN GROUP (ORDER BY discipline)
from (select discipline,
count(*) discipline_count
from vw_disp_details
where round = 2013
and rating = 1
group by discipline)) disciplines_count,
(select
round(avg(util.getawardstocols(application_id,'1','AWARD_NAME')), 2)
from vw_disp_details
where round = 2013
and rating = 1) average_award_score,
(select
round(avg(age))
from vw_disp_details
where round = 2013
and rating = 1) average_age
from dual;
Except that instead of 6 main sub-queries there are 23.
This returns something like this (if it were a CSV):
applicants | s_applicants | disciplines | disciplines_count | average_award_score | average_age
107 | 67 | "speed,accuracy,strength" | 3 | 97 | 23
Now I am programmatically swapping out the "rating = 1" part of the where clauses for other expressions. They all work rather quickly except for the "rating = 1" one which takes about 90 seconds to run and that is because the rating column in the vw_disp_details view is itself compiled by a sub-query:
(SELECT score
FROM read r,
eval_criteria_lookup ecl
WHERE r.criteria_id = ecl.criteria_id
AND r.application_id = a.lgo_application_id
AND criteria_description = 'Overall Score'
AND type = 'ABC'
) reader_rank
So when the function runs this extra query seems to slow everything down dramatically.
My question is, is there a better (more efficient) way to run a query like this that is basically just a series of counts and averages, and how can I refactor to optimize the speed so that the rating = 1 query doesn't take 90 seconds to run.
You could choose to MATERIALIZE the vw_disp_details VIEW. That would pre-calculate the value of the rating column. There are various options for how up-to-date a materialized view is kept, you would probably want to use the ON COMMIT clause so that vw_disp_details is always correct.
Have a look at the official documentation and see if that would work for you.
http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_6002.htm
Do all most of your queries in only one. Instead of doing:
select
(select (count(*) from my_tab) as count_all,
(select avg(age) from my_tab) as avg_age,
(select avg(mypkg.get_award(application_id) from my_tab) as_avg-app_id
from dual;
Just do:
select count(*), avg(age),avg(mypkg.get_award(application_id)) from my_tab;
And then, maybe you can do some union all for the other results. But this step all by itself should help.
I was able to solve this issue by doing two things: creating a new view that displayed only the results I needed, which gave me marginal gains in speed, and in that view moving the where clause of the sub-query that caused the lag into the where clause of the view and tacking on the result of the sub-query as column in the view. This still returns the same results thanks to the fact that there are always going to be records in the table the sub-query accessed for each row of the view query.
SELECT
a.application_id,
util.getstatus (a.application_id) status,
(SELECT score
FROM applicant_read ar,
eval_criteria_lookup ecl
WHERE ar.criteria_id = ecl.criteria_id
AND ar.application_id = a.application_id
AND criteria_description = 'Overall Score' //THESE TWO FIELDS
AND type = 'ABC' //ARE CRITERIA_ID = 15
) score
as.test_total test_total
FROM application a,
applicant_scores as
WHERE a.application_id = as.application_id(+);
Became
SELECT
a.application_id,
util.getstatus (a.application_id) status,
ar.score,
as.test_total test_total
FROM application a,
applicant_scores as,
applicant_read ar
WHERE a.application_id = as.application_id(+)
AND ar.application_id = a.application_id(+)
AND ar.criteria_id = 15;

Consolidating values from multiple tables

I have an application which has data spread accross 2 tables.
There is a main table Main which has columns - Id , Name, Type.
Now there is a Sub Main table that has columns - MainId(FK), StartDate,Enddate,city
and this is a 1 to many relation (each main can have multiple entries in submain).
Now I want to display columns Main.Id, City( as comma seperated from various rows for that main item from submain), min of start date(from submain for that main item) and max of enddate( from sub main).
I thought of having a function but that will slow things up since there will be 100k records. Is there some other way of doing this. btw the application is in asp.net. Can we have a sql query or some linq kind of thing ?
This is off the top of my head, but firstly I would suggest you create a user defined function in sql to create the city comma separated list string that accepts #mainid, then does the following:
DECLARE #listStr VARCHAR(MAX)
SELECT #listStr = COALESCE(#listStr+',' , '') + city
FROM submain
WHERE mainid = #mainid
... and then return #listStr which will now be a comma separated list of cities. Let's say you call your function MainIDCityStringGet()
Then for your final result you can simply execute the following
select cts.mainid,
cts.cities,
sts.minstartdate,
sts.maxenddate
from ( select distinct mainid,
dbo.MainIDCityStringGet(mainid) as 'cities'
from submain) as cts
join
( select mainid,
min(startdate) as 'minstartdate',
max(enddate) as 'maxenddate'
from submain
group by mainid ) as sts on sts.mainid = cts.mainid
where startdate <is what you want it to be>
and enddate <is what you want it to be>
Depending on how exactly you would like to filter by startdate and enddate you may need to put the where filter within each subquery and in the second subquery in the join you may then need to use the HAVING grouped filter. You did not clearly state the nature of your filter.
I hope that helps.
This will of course be in stored procedure. May need some debugging.
An alternative to creating a stored procedure is performing the complex operations on the client side. (untested):
var result = (from main in context.Main
join sub in context.SubMain on main.Id equals sub.MainId into subs
let StartDate = subs.Min(s => s.StartDate)
let EndDate = subs.Max(s => s.EndDate)
let Cities = subs.Select(s => s.City).Distinct()
select new { main.Id, main.Name, main.Type, StartDate, EndDate, Cities })
.ToList()
.Select(x => new
{
x.Id,
x.Name,
x.Type,
x.StartDate,
x.EndDate,
Cities = string.Join(", ", x.Cities.ToArray())
})
.ToList();
I am unsure how well this is supported in other implimentations of SQL, but if you have SQL Server this works a charm for this type of scenario.
As a disclaimer I would like to add that I am not the originator of this technique. But I immediately thought of this question when I came across it.
Example:
For a table
Item ID Item Value Item Text
----------- ----------------- ---------------
1 2 A
1 2 B
1 6 C
2 2 D
2 4 A
3 7 B
3 1 D
If you want the following output, with the strings concatenated and the value summed.
Item ID Item Value Item Text
----------- ----------------- ---------------
1 10 A, B, C
2 6 D, A
3 8 B, D
The following avoids a multi-statement looping solution:
if object_id('Items') is not null
drop table Items
go
create table Items
( ItemId int identity(1,1),
ItemNo int not null,
ItemValue int not null,
ItemDesc nvarchar(500) )
insert Items
( ItemNo,
ItemValue,
ItemDesc )
values ( 1, 2, 'A'),
( 1, 2, 'B'),
( 1, 6, 'C'),
( 2, 2, 'D'),
( 2, 4, 'A'),
( 3, 7, 'B'),
( 3, 1, 'D')
select it1.ItemNo,
sum(it1.ItemValue) as ItemValues,
stuff((select ', ' + it2.ItemDesc --// Stuff is just used to remove the first 2 characters, instead of a substring.
from Items it2 with (nolock)
where it1.ItemNo = it2.ItemNo
for xml path(''), type).value('.','varchar(max)'), 1, 2, '') as ItemDescs --// Does the actual concatenation..
from Items it1 with (nolock)
group by it1.ItemNo
So you see all you need is a sub query in your select that retrieves a set of all the values you need to concatenate and then use the FOR XML PATH command in that sub query in a clever way. It does not matter where the values you need to concatenate comes from you just need to retrieve them using the sub query.

Resources