I am trying to determine the on time delivery rate of orders:
The column of interest is the on-time delivery column, which contains 0 (not on time) or 1 (on time). How can I calculate the on-time rate for each person in SQL? Basically, the count of 0s divided by the total count (0s and 1s) for each person, and likewise for on time (count of 1s divided by the total count of 0s and 1s).
Here's a data example:
Week  Delivery on time  Person
1     0                 Sarah
1     0                 Sarah
1     1                 Sarah
2     1                 Vic
2     0                 Vic
You may aggregate by person, and then take the average of the on time statistic:
SELECT Person, AVG(1.0*DeliveryOnTime) AS OnTime,
AVG(1.0 - DeliveryOnTime) AS NotOnTime
FROM yourTable
GROUP BY Person;
Demo
The demo given is for SQL Server, and the above syntax might have to change slightly depending on your actual database, which you did not reveal to us.
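As a rough check of the averaging approach, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. The choice of SQLite, the table name yourTable, and the consistently cased names are assumptions for illustration only (GROUP BY is case-sensitive in SQLite by default, so 'vIC' and 'Vic' would otherwise form separate groups).

```python
import sqlite3

# In-memory database; SQLite is an assumption, the original demo used SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (Week INTEGER, DeliveryOnTime INTEGER, Person TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(1, 0, "Sarah"), (1, 0, "Sarah"), (1, 1, "Sarah"), (2, 1, "Vic"), (2, 0, "Vic")],
)

# AVG over a 0/1 column is the fraction of 1s; multiplying by 1.0 forces
# floating-point arithmetic on engines that would otherwise average integers.
rates = conn.execute(
    """
    SELECT Person,
           AVG(1.0 * DeliveryOnTime) AS OnTime,
           AVG(1.0 - DeliveryOnTime) AS NotOnTime
    FROM yourTable
    GROUP BY Person
    ORDER BY Person
    """
).fetchall()

for person, on_time, not_on_time in rates:
    print(person, round(on_time, 3), round(not_on_time, 3))
```

With the sample rows, Sarah has one on-time delivery out of three and Vic has one out of two.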
I am attempting to select the maximum value across several date columns for each person (one row per person), like so:
Input Dataset
Person|Date#1 |Date#2 |Date#3 |Date#4
------+--------+--------+--------+---------
Matt |12/01/18|01/15/19|02/15/19|04/15/18
Dave |01/15/18|01/02/19|03/15/19|11/01/19
Desired result
Person|Max Date|
------+--------+
Matt |02/15/19|
Dave |11/01/19|
Once you fix up your table to use a proper date format like YYYY-MM-DD, so that it looks like this:
Person Date#1 Date#2 Date#3 Date#4
---------- ---------- ---------- ---------- ----------
Matt 2018-12-01 2019-01-15 2019-02-15 2018-04-15
Dave 2018-01-15 2019-01-02 2019-03-15 2019-11-01
it becomes a trivial query:
SELECT Person, max("Date#1", "Date#2", "Date#3", "Date#4") AS "Max Date" FROM mytable;
Person Max Date
---------- ----------
Matt 2019-02-15
Dave 2019-11-01
Remember, SQLite does not have any date or time types. It uses strings or numbers to hold those values. When dates are stored as strings, they have to be formatted in a way that can be compared meaningfully. '04/15/18' is greater than '01/15/19' because the character 4 is greater than the character 1. None of the standard time string formats have that problem.
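To see both points in action, here is a small sketch using Python's built-in sqlite3 module. The table name mytable comes from the query above; the rows are the question's input converted to ISO format, and running it from Python is an assumption made only for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE mytable (Person TEXT, "Date#1" TEXT, "Date#2" TEXT, '
    '"Date#3" TEXT, "Date#4" TEXT)'
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        ("Matt", "2018-12-01", "2019-01-15", "2019-02-15", "2018-04-15"),
        ("Dave", "2018-01-15", "2019-01-02", "2019-03-15", "2019-11-01"),
    ],
)

# With two or more arguments, SQLite's max() is a scalar function that
# returns the largest argument within each row.
max_dates = conn.execute(
    'SELECT Person, max("Date#1", "Date#2", "Date#3", "Date#4") AS "Max Date" '
    "FROM mytable ORDER BY rowid"
).fetchall()
print(max_dates)

# The pitfall with MM/DD/YY strings: comparison is character by character,
# so April 2018 sorts after January 2019.
pitfall = conn.execute("SELECT '04/15/18' > '01/15/19'").fetchone()[0]
print(pitfall)
```

ISO-formatted strings sort the same way as the dates they represent, which is why the multi-argument max() gives the right answer only after the conversion.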
I have been trying to develop this Kusto query (Application Insights), and any help would be greatly appreciated.
The objective is to count the rows up to the first occurrence of the CurrentOwningTeam value in the OwningTeamId column.
I packed the owning team number and parsed the value into a column of its own. I need to count the owning teams until I reach the current owning team.
Example columns and rows:
[CODE]
OwningTeamId, CurrentOwningTeam, CreateDate, RequestType
155523 **888888** 2017-07-02 PRIMARY
256924 **888888** 2017-08-02 TRANSFER
**888888** **888888** 2017-09-02 TRANSFER
954005 **888888** 2017-10-02 TRANSFER
**888888** **888888** 2017-11-02 TRANSFER
155523 **888888** 2017-12-02 TRANSFER
954005 **888888** 2017-13-02 TRANSFER
**888888** **888888** 2017-14-02 TRANSFER
[/CODE]
I think you can match the current owning team with the countof() function, but I don't know how to go about it using regex. In other words, I want to count the number of rows it takes to reach the very first occurrence of the current owning team. In this case, it would be three.
Note: the OwningTeamId and CurrentOwningTeam values can change on every incident, which is why I first capture the CurrentOwningTeam on the incident and then try to match its first instance in the OwningTeamId column.
Note: This is just one incident, but I am trying to do this for multiple incidents.
Below is how I got the Current Owning Team Value.
[CODE]
| extend CurrentOwningTeam=pack_array(OwningTeamId)
| parse CurrentOwningTeam with * "[" CurrentOwningTeam:int "]" *
| serialize CurrentOwningTeam
[/CODE]
I tried using row_number() but it will not work for multiple incidents, only per incident, so I have to use count or countof functions or another way of doing it.
Thanks for the clarification. Here is a suggested query that counts rows, ordered by time, until a certain condition is reached (the count restarts for each IncidentId).
datatable(IncidentId:string, OwningTeamId:string, CurrentOwningTeam:string, CreateDate:datetime, RequestType:string)
[
'Id1','155523','888888',datetime(2017-02-07),'PRIMARY',
'Id1','256924','888888',datetime(2017-02-08),'TRANSFER',
'Id1','888888','888888',datetime(2017-02-09),'TRANSFER',
'Id1','954005','888888',datetime(2017-02-10),'TRANSFER',
'Id1','888888','888888',datetime(2017-02-11),'TRANSFER',
'Id1','155523','888888',datetime(2017-02-12),'TRANSFER',
'Id1','954005','888888',datetime(2017-02-13),'TRANSFER',
'Id1','888888','888888',datetime(2017-02-14),'TRANSFER',
// Id2
'Id2','155523','888888',datetime(2017-02-07),'PRIMARY',
'Id2','256924','888888',datetime(2017-02-08),'TRANSFER',
'Id2','999999','888888',datetime(2017-02-09),'TRANSFER',
'Id2','954005','888888',datetime(2017-02-10),'TRANSFER',
'Id2','888888','888888',datetime(2017-02-11),'TRANSFER',
'Id2','155523','888888',datetime(2017-02-12),'TRANSFER',
'Id2','954005','888888',datetime(2017-02-13),'TRANSFER',
'Id2','888888','888888',datetime(2017-02-14),'TRANSFER',
]
| order by IncidentId, CreateDate asc
| extend c= row_cumsum(1, IncidentId!=prev(IncidentId))
| where OwningTeamId == CurrentOwningTeam
| summarize arg_min(CreateDate, c) by IncidentId
Result:
IncidentId CreateDate c
Id1 2017-02-09 00:00:00.0000000 3
Id2 2017-02-11 00:00:00.0000000 5
Here are links to the docs that explain how to find the earliest record using the arg_min() aggregation, and to the row_cumsum() (cumulative sum) function:
https://learn.microsoft.com/en-us/azure/kusto/query/arg-min-aggfunction
https://learn.microsoft.com/en-us/azure/kusto/query/rowcumsumfunction
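For readers who want to trace the logic outside Kusto, the following is a rough Python sketch of the same idea: number the rows of each incident in date order, then keep the first row whose owning team matches the current owning team. The function name first_match_position and the tuple layout are hypothetical; the data mirrors the datatable above.

```python
from datetime import date

# Team sequences per incident, in CreateDate order (2017-02-07 onwards).
id1_teams = ["155523", "256924", "888888", "954005", "888888", "155523", "954005", "888888"]
id2_teams = ["155523", "256924", "999999", "954005", "888888", "155523", "954005", "888888"]

# Rows as (IncidentId, OwningTeamId, CurrentOwningTeam, CreateDate).
rows = [
    (incident, team, "888888", date(2017, 2, 7 + i))
    for incident, teams in (("Id1", id1_teams), ("Id2", id2_teams))
    for i, team in enumerate(teams)
]


def first_match_position(rows):
    """Map each incident to (CreateDate, position) of its first matching row."""
    counters = {}  # running row count per incident, like row_cumsum with restart
    result = {}
    for incident, team, current, created in sorted(rows, key=lambda r: (r[0], r[3])):
        counters[incident] = counters.get(incident, 0) + 1
        # Keep only the earliest match per incident, like arg_min(CreateDate, c).
        if team == current and incident not in result:
            result[incident] = (created, counters[incident])
    return result


positions = first_match_position(rows)
for incident, (created, count) in positions.items():
    print(incident, created.isoformat(), count)
```

The counts come out as 3 for Id1 and 5 for Id2, matching the Result table above.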
I figured it out by computing a RowNumber per group inside the table and then summing it to get my total count.
[CODE]
| serialize Id
| extend RowNumber=row_number(1, (Id) ==Id)
| summarize TotalOwningTeamChanges=sum(RowNumber) by Id
[/CODE]
After that, I took the minimum date to extract the data set up to the first instance of the current OwningTeamName.
[CODE]
//Outside the scope of the table.
| extend ExtractFirstOwningTeamCreateDate=CreateDate2
| extend VeryFirstOwningTeamCreateDate=MinimumCreateDate
| where FirstOwningTeamRow == true or MinimumCreateDate <=
ExtractFirstOwningTeamCreateDate
| serialize VeryFirstOwningTeamCreateDate
[/CODE]
My Table comprises 4 Columns (Patient, Sample, Analysis and Component). I am trying to write a query that will look at the combination of Patient, Analysis and Component for each record and assign a "Section Number".
The numbering should re-start for every patient.
See the expected output below. Patient 1010 has 3 samples, but all have the same Analysis-Component combination. Hence they all have the same section (1).
Counting then restarts for Patient 2020. This patient has 2 samples, each with a different Analysis-Component combination. Hence they are placed in separate sections, 1 and 2.
Patient  Sample        Analysis  Component    Section Number
_______  ____________  ________  ___________  ______________
1010     720000140249  CALC      Calcium      1
1010     720000140288  CALC      Calcium      1
1010     720000140288  CALC      Calcium      1
2020     720000190504  ALB       Albumin      1
2020     720000160504  ALB       Albumin Pct  2
3030     720000134568  CALC      Calcium      1
3030     720000123404  ALB       Albumin      2
3030     720000160765  ALB       Albumin Pct  3
I have written the following query, but all it does is group samples with the same Component into one section. It does not consider the Patient or Analysis at all.
Your help is much appreciated (as always!)
select
x.patient, x.sample_number, x.analysis, x.component
a.myRowCount
from
X_PREV_PAT_RESULTS x inner join (
select distinct
x1.COMPONENT
, ROW_NUMBER() OVER (ORDER BY x1.COMPONENT) myRowCount
from X_PREV_PAT_RESULTS x1
group by x1.patient ) A on x.COMPONENT = A.COMPONENT
order by a.myRowCount, x.patient;
My guess is that you want
dense_rank() over (partition by patient
order by analysis desc, component) myRowCount
What happens with rows after a tie? If patient 1010 gets an ALB analysis, would that have a myRowCount of 2 or 4? rank would return 4; dense_rank would return 2.
How are you determining the order of rows for a particular patient? It appears that you're going in reverse alphabetical order for analysis and then alphabetically for component, but that seems like a pretty unusual ordering.
select x.patient, x.sample_number, x.analysis, x.component,
dense_rank() over(partition by x.patient order by x.analysis, x.component)
from X_PREV_PAT_RESULTS x
where exists (select 1 from X_PREV_PAT_RESULTS x1 where x1.COMPONENT = x.COMPONENT);
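To check the dense_rank() approach against the expected output in the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes an SQLite build with window function support (3.25 or later), uses the descending order on analysis from the first suggestion, and introduces the alias section_number purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE X_PREV_PAT_RESULTS "
    "(patient INTEGER, sample_number TEXT, analysis TEXT, component TEXT)"
)
conn.executemany(
    "INSERT INTO X_PREV_PAT_RESULTS VALUES (?, ?, ?, ?)",
    [
        (1010, "720000140249", "CALC", "Calcium"),
        (1010, "720000140288", "CALC", "Calcium"),
        (1010, "720000140288", "CALC", "Calcium"),
        (2020, "720000190504", "ALB", "Albumin"),
        (2020, "720000160504", "ALB", "Albumin Pct"),
        (3030, "720000134568", "CALC", "Calcium"),
        (3030, "720000123404", "ALB", "Albumin"),
        (3030, "720000160765", "ALB", "Albumin Pct"),
    ],
)

# PARTITION BY restarts the numbering for every patient; DENSE_RANK gives
# identical (analysis, component) pairs the same number without gaps.
sections = conn.execute(
    """
    SELECT patient, sample_number, analysis, component,
           DENSE_RANK() OVER (
               PARTITION BY patient
               ORDER BY analysis DESC, component
           ) AS section_number
    FROM X_PREV_PAT_RESULTS
    ORDER BY patient, section_number, sample_number
    """
).fetchall()

for row in sections:
    print(row)
```

Ordering by analysis ascending instead would put the ALB rows before CALC for patient 3030, which does not match the section numbers shown in the question.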
I have a database with a long list of records. Most of the columns have foreign keys to other tables.
Example:
ID SectorId BranchId
-- -------- --------
5 3 5
And then I will have tables with sectors, branches, etc.
My issue:
I want to know how many records have sector 1, 2, 3 ... n. So what I want is a GROUP BY on the sector and then a COUNT(*) that tells me how many records there are for each.
Expected output
So for instance, if I have 20 records the result might look like this:
SectorId Count
-------- -----
1 3
2 10
3 4
4 6
My attempts so far
I do not normally work a lot with databases and I have been trying to solve this for 1.5 hours. I have tried something like this:
SELECT COUNT(*)
FROM Records r
GROUP BY r.Sector
WHERE r.Date BETWEEN '2011-01-01' AND '2011-12-31'
But... errors and problems all over!
I would really appreciate some help. I do know this is probably very simple.
Thanks!
The order of the clauses in your query is not correct: WHERE must come before GROUP BY. It should look like this:
SELECT COUNT(*)
FROM Records r
WHERE r.Date BETWEEN '2011-01-01' AND '2011-12-31'
GROUP BY r.Sector
The output will be only counts i.e.
count
-----
3
10
4
6
If you want to fetch both sector and count then you need to modify the query a little
SELECT r.Sector, COUNT(*) as Count
FROM Records r
WHERE r.Date BETWEEN '2011-01-01' AND '2011-12-31'
GROUP BY r.Sector
The output will be like this: -
Sector Count
------ -----
1 3
2 10
3 4
4 6
Your query was partially right, but it needs some modification.
If I write it this way:
SELECT r.SectorID,COUNT(*) AS count
FROM Records r
WHERE r.Date BETWEEN '2011-01-01' AND '2011-12-31'
GROUP BY r.SectorID
Then output will be:-
SectorID Count
1 3
2 10
3 4
4 6
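The clause order can be tried out with a small sketch using Python's built-in sqlite3 module. The sample rows are invented for illustration, the alias RecordCount is hypothetical, and SQLite stands in for whatever database the question actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Records (ID INTEGER PRIMARY KEY, SectorID INTEGER, Date TEXT)"
)
# Invented sample data: two rows fall outside 2011 and should be filtered out.
conn.executemany(
    "INSERT INTO Records (SectorID, Date) VALUES (?, ?)",
    [
        (1, "2011-03-01"),
        (1, "2011-06-15"),
        (2, "2011-01-01"),
        (2, "2011-12-31"),
        (2, "2012-01-01"),
        (3, "2010-12-31"),
    ],
)

# WHERE filters rows first; GROUP BY then counts what is left per sector.
counts = conn.execute(
    """
    SELECT r.SectorID, COUNT(*) AS RecordCount
    FROM Records r
    WHERE r.Date BETWEEN '2011-01-01' AND '2011-12-31'
    GROUP BY r.SectorID
    ORDER BY r.SectorID
    """
).fetchall()
print(counts)

# Putting GROUP BY before WHERE, as in the question, is rejected as a syntax error.
try:
    conn.execute(
        "SELECT COUNT(*) FROM Records r GROUP BY r.SectorID "
        "WHERE r.Date BETWEEN '2011-01-01' AND '2011-12-31'"
    )
    misordered_ok = True
except sqlite3.Error as exc:
    misordered_ok = False
    print("rejected:", exc)
```

Sector 3 does not appear in the result because its only record falls outside the date range; a sector with no matching rows produces no group at all.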