Count of sick leave per day category - count

I have a PowerBI measure to calculate sick leave days between 2 dates.
```
CALCULATE(
    COUNTROWS(PERIOD),
    DATESBETWEEN(
        'Period'[Datekey],
        MIN('Leave Management'[DateFrom]),
        MAX('Leave Management'[DateTo])
    ),
    FILTER(
        Period,
        'Period'[ISWorkDay] = 1 &&
        'Period'[IsPublicHoliday] = FALSE
    ),
    ALLSELECTED('Leave Management')
)
```
A second measure (which may not be essential) categorizes the leave days:
```
DateDiff =
SWITCH(
    TRUE(),
    'Staff Management'[workdays without holidays] <= 1, 1,
    'Staff Management'[workdays without holidays] = 2, 2,
    'Staff Management'[workdays without holidays] >= 3, 3
)
```
Table reference: [Table/Matrix][1]
I would like to count the number of days per category. I have tried these measures:
```
1DayLeave Distinct Count =
COUNTROWS(FILTER('Leave Management', [DateDiff] = 1))

3DayCount =
COUNTX(
    SUMMARIZE(
        FILTER('Staff Management', 'Leave Management'[DateDiff] = 3),
        "Leave 3 Day Catergory", 'Leave Management'[DateDiff]
    ),
    [DateDiff]
)
```
[enter image description here][1]
[1]: https://i.stack.imgur.com/Wq9DV.png

Related

Rolling 4 week average conversion rate, in R or Power BI

I am trying to create a rolling 4-week average conversion rate. The column LTA is the conversion rate and equals Appts/Leads. Right now, LTA is calculated week by week. I need to create a new column that holds the rolling 4-week conversion rate.
Here is the data.
Week Leads Appts LTA
4/17/2022 205 83 40.49%
4/24/2022 126 68 53.97%
5/1/2022 117 40 34.19%
5/8/2022 82 38 46.34%
5/15/2022 60 32 53.33%
5/22/2022 45 19 42.22%
5/29/2022 25 19 76.00%
So if we started at the bottom, the RollingAvg for May 29 would be (19+19+32+38)/(25+45+60+82) = 50.943 %
For the week of May 22, the numbers would roll back one week, so it'd be (19+32+38+0)/(45+60+82+117) = 29.276 %
Help would be appreciated.
transform(data.frame(lapply(df, zoo::rollsum, k=4)), roll = Appts/Leeds * 100)
Leeds Appts roll
1 530 38 7.169811
2 385 70 18.181818
3 304 89 29.276316
4 212 108 50.943396
Simple solution for a calculated column in DAX:
RollingAvg =
VAR _currentDate = [Week]
VAR _minDate = _currentDate - 4*7
RETURN
    CALCULATE (
        DIVIDE (
            SUM ( 'Table'[Appts] ),
            SUM ( 'Table'[Leads] )
        ),
        // Lift filters on table to have all rows visible
        ALL ( 'Table' ),
        // Add constraints to dates for a 4-week average
        'Table'[Week] <= _currentDate,
        'Table'[Week] > _minDate
    )
Or better yet, a measure that doesn't take up space in the data model:
RollingAvgMeasure =
/*
Calculates the 4-week rolling average if used with the Week dimension.
Else calculates the total rolling average.
*/
VAR _currentDate = MAX ( 'Table'[Week] )
VAR _minDate = _currentDate - 4*7
VAR _movingAvg =
    CALCULATE (
        DIVIDE (
            SUM ( 'Table'[Appts] ),
            SUM ( 'Table'[Leads] )
        ),
        ALL ( 'Table' ),
        'Table'[Week] <= _currentDate,
        'Table'[Week] > _minDate
    )
VAR _total = DIVIDE ( SUM ( 'Table'[Appts] ), SUM ( 'Table'[Leads] ) )
RETURN
    // Replace if-statement with only return of _movingAvg to display the latest 4-week value.
    IF (
        ISFILTERED ( 'Table'[Week] ),
        _movingAvg,
        _total
    )
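For anyone who wants to check the numbers outside Power BI, here is a minimal Python sketch of the same windowed ratio. It is not part of either answer above. It assumes the rows are sorted by week with exactly one row per week and no gaps, so it windows by row position rather than by date as the DAX does. The function name `rolling_conversion` is made up for illustration, and the data is copied from the table as it appears in the question.

```python
from datetime import date

# Weekly data copied from the table in the question: (week, leads, appts).
weeks = [
    (date(2022, 4, 17), 205, 83),
    (date(2022, 4, 24), 126, 68),
    (date(2022, 5, 1), 117, 40),
    (date(2022, 5, 8), 82, 38),
    (date(2022, 5, 15), 60, 32),
    (date(2022, 5, 22), 45, 19),
    (date(2022, 5, 29), 25, 19),
]


def rolling_conversion(rows, window=4):
    """Return (week, rate) pairs, where rate = sum(appts) / sum(leads)
    over the current week and up to (window - 1) weeks before it."""
    result = []
    for i, (week, _, _) in enumerate(rows):
        # Early weeks get a shorter window, like the date filter in the DAX.
        chunk = rows[max(0, i - window + 1): i + 1]
        leads = sum(r[1] for r in chunk)
        appts = sum(r[2] for r in chunk)
        # Guard against division by zero, similar to DIVIDE in DAX.
        rate = appts / leads if leads else None
        result.append((week, rate))
    return result


rates = rolling_conversion(weeks)
for week, rate in rates:
    print(week.isoformat(), "n/a" if rate is None else f"{rate:.3%}")
```

The last line printed is for May 29 and gives 50.943%, matching the (19+19+32+38)/(25+45+60+82) calculation in the question. Summing before dividing is the key point: averaging the four weekly LTA percentages would weight a 25-lead week the same as an 82-lead week.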

Can I create a date variable in R for dbGetQuery?

This is my code:
cohort_query <- dbGetQuery(con,'select u.unique_id customer_id,
o.user_id user_hash,
"min"(o.deliveryconfirmeddate) first_order,
(
CASE
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(\'2021-11-01\') - INTERVAL \'1\' YEAR)
) THEN 0
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(\'2021-11-01\') - INTERVAL \'2\' YEAR)
) THEN 1
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(\'2021-11-01\') - INTERVAL \'3\' YEAR)
) THEN 2
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(\'2021-11-01\') - INTERVAL \'4\' YEAR)
) THEN 3
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(\'2021-11-01\') - INTERVAL \'5\' YEAR)
) THEN 4
ELSE 5
END
) "cohort_year",
min(o.id) as first_order_id
FROM (
"bj-analytics"."mysql_bj_orders" o
INNER JOIN "bj-analytics"."mysql_bj_users" u ON (u.user_id = o.user_id)
)
WHERE (
(
(o.connectedorder = false)
AND (o.status <> \'cancelled\')
)
AND (o.status <> \'Cancelled\')
)
GROUP BY 1,
2
HAVING (
"min"(o.deliveryconfirmeddate) >= (date(\'2021-11-01\') - INTERVAL \'6\' YEAR)
)
and "min"(o.deliveryconfirmeddate) < date(\'2021-11-01\')')
This all runs fine. Basically, I want to make the 2021-11-01 date a variable, so that I only have to enter it once.
Someone recommended glue_sql, but I couldn't make it work. I think the issue is the mix of double and single quotation marks. I tried changing them to all "" or all '', but then the query just doesn't run!
I also found that I have to include the \'s; without them the query doesn't run either.
I'm new to R so not too sure how to get around this!
Basically can anyone please help with turning that date into a variable?
Any help is much appreciated - thanks!
Never use paste or sprintf to put "data" into a query, for many reasons (query optimization/caching and accidental SQL injection being the top two). Instead, use parameter binding:
DBI::dbGetQuery(con, "
select (case
when (5 < ?) then 5
when (3 < ?) then 3
when (1 < ?) then 1
else 0 end) as quux",
params = replicate(n=3, expr=3, simplify = FALSE))
# quux
# 1 1
DBI::dbGetQuery(con, "
select (case
when (5 < ?) then 5
when (3 < ?) then 3
when (1 < ?) then 1
else 0 end) as quux",
params = replicate(n=3, expr=10, simplify = FALSE))
# quux
# 1 5
The replicate(3,10,F) is just a programmatic way to write list(10,10,10); they are equivalent here. My static comparisons (5 < ?) are only placeholders; your min(.) should work fine.
(FYI, one more benefit of parameter-binding: no need to deal with additional quotes in the query.)
See https://db.rstudio.com/best-practices/run-queries-safely/#parameterized-queries for more discussions on safely using data in the query.
I think that would make your query this:
cohort_query <- dbGetQuery(con, 'select u.unique_id customer_id,
o.user_id user_hash,
"min"(o.deliveryconfirmeddate) first_order,
(
CASE
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(?) - INTERVAL \'1\' YEAR)
) THEN 0
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(?) - INTERVAL \'2\' YEAR)
) THEN 1
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(?) - INTERVAL \'3\' YEAR)
) THEN 2
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(?) - INTERVAL \'4\' YEAR)
) THEN 3
WHEN (
"min"(o.deliveryconfirmeddate) >= (date(?) - INTERVAL \'5\' YEAR)
) THEN 4
ELSE 5
END
) "cohort_year",
min(o.id) as first_order_id
FROM (
"bj-analytics"."mysql_bj_orders" o
INNER JOIN "bj-analytics"."mysql_bj_users" u ON (u.user_id = o.user_id)
)
WHERE (
(
(o.connectedorder = false)
AND (o.status <> \'cancelled\')
)
AND (o.status <> \'Cancelled\')
)
GROUP BY 1,
2
HAVING (
"min"(o.deliveryconfirmeddate) >= (date(?) - INTERVAL \'6\' YEAR)
)
and "min"(o.deliveryconfirmeddate) < date(?)',
params = replicate(n=7, expr='2021-11-01', simplify=FALSE))
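As a side illustration of why binding helps with the "enter the date once" goal, here is a small Python sketch using the standard-library sqlite3 module, where a named placeholder can appear several times and be bound once. This is not R and not the asker's database: the `orders` table, its columns, and the `:ref` name are invented for the example, and whether a given R DBI driver accepts named or reusable placeholders depends on the backend, so treat that part as an assumption to check.

```python
import sqlite3

# Toy in-memory table standing in for the orders data (hypothetical schema).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (user_id INTEGER, delivered TEXT)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2021-03-15"), (2, "2019-12-01"), (3, "2014-06-30")],
)

# The named placeholder :ref appears four times but is bound only once.
query = """
SELECT user_id,
       CASE
           WHEN MIN(delivered) >= date(:ref, '-1 years') THEN 0
           WHEN MIN(delivered) >= date(:ref, '-2 years') THEN 1
           ELSE 2
       END AS cohort_year
FROM orders
GROUP BY user_id
HAVING MIN(delivered) >= date(:ref, '-6 years')
   AND MIN(delivered) < date(:ref)
ORDER BY user_id
"""
rows = con.execute(query, {"ref": "2021-11-01"}).fetchall()
print(rows)  # [(1, 0), (2, 1)]
```

No quote escaping is needed around the date because the value never becomes part of the SQL text.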

Using Case statement with multiple when in Teradata

Suppose I have a data table with the following fields:
CUSTOMER: either A or B
DAY: either Monday or Tuesday
PAID: either Y or N
The total number of rows being four, let's say the data table is this:
CUSTOMER DAY PAID
A Monday Y
A Tuesday N
B Monday Y
B Tuesday N
How do I create a SQL query on Teradata SQL Assistant, that will show the number of people who were Y on Monday and N on Tuesday? (or any of these combinations) I tried to use the query below, but cannot seem to figure out the logic. Your help is much appreciated!
SELECT DAY,
COUNT(CASE PAID WHEN 'Y' THEN CUSTOMER ELSE 0 END) AS PAID_CUSTOMERS,
COUNT(CASE WHEN PAID = 'Y' AND DAY = 'Monday' AND DAY = 'Tuesday' AND PAID = 'N' THEN CUSTOMER ELSE 0 END) AS CUSTOMERS_YM_NT
FROM T1
GROUP BY 1
ORDER BY 1
So, break each day down into a separate case statement:
case when DAY = 'Monday' and PAID = 'N' then 'NO' else 'YES' end as Monday,
case when DAY = 'Tuesday' and PAID = 'N' then 'NO' else 'YES' end as Tuesday,
etc
Then, you can wrap that with another select and apply whatever criteria you want:
select
<whatever columns>
from
    (select
        case when DAY = 'Monday' and PAID = 'N' then 'NO' else 'YES' end as Monday,
        case when DAY = 'Tuesday' and PAID = 'N' then 'NO' else 'YES' end as Tuesday,
        ...
    ) t
where Monday = 'YES'
and Tuesday = 'YES'
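A different way to get at "Y on Monday and N on Tuesday" is conditional aggregation per customer, which is not what the answer above does. The sketch below runs it against an in-memory SQLite database from Python so it can be tried without Teradata. The SQL uses only CASE, MAX, GROUP BY and HAVING, but I have not run it on Teradata, so take portability as an assumption. The table name `t1` and the sample rows come from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t1 (customer TEXT, day TEXT, paid TEXT)")
con.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [
        ("A", "Monday", "Y"),
        ("A", "Tuesday", "N"),
        ("B", "Monday", "Y"),
        ("B", "Tuesday", "N"),
    ],
)

# One row per customer; each MAX(...) is 1 if the customer has at least
# one row matching that day/paid combination.
query = """
SELECT COUNT(*) FROM (
    SELECT customer
    FROM t1
    GROUP BY customer
    HAVING MAX(CASE WHEN day = 'Monday'  AND paid = 'Y' THEN 1 ELSE 0 END) = 1
       AND MAX(CASE WHEN day = 'Tuesday' AND paid = 'N' THEN 1 ELSE 0 END) = 1
) AS matched
"""
(customers_ym_nt,) = con.execute(query).fetchone()
print(customers_ym_nt)  # 2 for the four sample rows
```

The condition in the question's attempt (DAY = 'Monday' AND DAY = 'Tuesday') can never be true for a single row, which is why the check has to happen after grouping by customer.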

T-SQL - Determine number of specific days between 2 dates based on sales

The Problem: Given a day of the week (1, 2, 3, 4, 5, 6, 7), a starting date and an ending date, compute the number of times the given day of the week appears between the starting and ending dates not inclusive of a date for which there were no sales.
Context:
Table "Ticket" has the following structure and sample content:
i_ticket_id c_items_total dt_create_time dt_close_time
----------------------------------------------------------------------------
1 8.50 '10/1/2012 10:23:00' '10/1/2012 11:05:05'
2 10.50 '10/1/2012 11:00:00' '10/1/2012 11:45:05'
3 8.50 '10/2/2012 08:00:00' '10/2/2012 09:25:05'
4 8.50 '10/4/2012 08:00:00' '10/4/2012 09:25:05'
5 7.50 '10/5/2012 13:22:23' '10/5/2012 14:33:27'
.
.
233 6.75 '10/31/2012 23:20:00' '10/31/2012 23:55:39'
Details
There may or may not be any tickets for one or more days during a month. (i.e. the place was closed that/those day/s)
Days in which the business is closed are not regular. There is no predictable pattern.
Based on Get number of weekdays (Sundays, Mondays, Tuesdays) between two dates SQL,
I have derived a query which returns the number of times a given day of the week occurs between the start date and the end date:
DECLARE @dtStart DATETIME = '10/1/2013 04:00:00'
DECLARE @dtEnd DATETIME = '11/1/2013 03:59:00'
DECLARE @day_number INTEGER = 1
DECLARE @numdays INTEGER
SET @numdays = (SELECT 1 + DATEDIFF(wk, @dtStart, @dtEnd) -
    CASE WHEN DATEPART(weekday, @dtStart) > @day_number THEN 1 ELSE 0 END -
    CASE WHEN DATEPART(weekday, @dtEnd) <= @day_number THEN 1 ELSE 0 END)
Now I just need to filter this so that any zero-dollar days are not included in the count. Any help you can provide to add this filter based on the contents of the tickets table is greatly appreciated!
If I understand correctly, you can use a calendar table to count the days where the day of week is n, the date falls between the start and end, and the date has ticket sales. I'm guessing "has ticket sales" means the date exists in tickets and has sum(c_items_total) > 0:
WITH cal AS
(
SELECT cast('2012-01-01' AS DATE) dt, datepart(weekday, '2012-01-01') dow
UNION ALL
SELECT dateadd(day, 1, dt), datepart(weekday, dateadd(day, 1, dt))
FROM cal
WHERE dt < getdate()
)
SELECT COUNT(1)
FROM cal
WHERE dow = 5
AND dt BETWEEN '2012-04-01' AND '2012-12-31'
AND EXISTS (
SELECT 1
FROM tickets
WHERE cast(dt_create_time AS DATE) = dt
GROUP BY cast(dt_create_time AS DATE)
HAVING sum(c_items_total) > 0
)
OPTION (MAXRECURSION 0)
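To sanity-check the counting logic outside SQL Server, here is a rough Python sketch of the same idea: walk the calendar, keep the dates that fall on the requested weekday, and count only those with positive sales. It assumes plain calendar dates (it ignores the 4 AM business-day boundary hinted at in the question) and ISO weekday numbering, 1 = Monday through 7 = Sunday, which may differ from DATEPART(weekday, ...) depending on the DATEFIRST setting. The function name is made up, and the rows are the five sample tickets shown in the question.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta

# Sample rows from the question: (c_items_total, dt_create_time).
tickets = [
    (8.50, datetime(2012, 10, 1, 10, 23)),
    (10.50, datetime(2012, 10, 1, 11, 0)),
    (8.50, datetime(2012, 10, 2, 8, 0)),
    (8.50, datetime(2012, 10, 4, 8, 0)),
    (7.50, datetime(2012, 10, 5, 13, 22, 23)),
]


def count_weekdays_with_sales(rows, iso_weekday, start, end):
    """Count dates in [start, end] that fall on iso_weekday
    (1 = Monday ... 7 = Sunday) and have total sales above zero."""
    totals = defaultdict(float)
    for amount, created in rows:
        totals[created.date()] += amount
    count = 0
    day = start
    while day <= end:
        if day.isoweekday() == iso_weekday and totals.get(day, 0) > 0:
            count += 1
        day += timedelta(days=1)
    return count


mondays = count_weekdays_with_sales(tickets, 1, date(2012, 10, 1), date(2012, 10, 31))
print(mondays)  # 1: of the five Mondays in October 2012, only Oct 1 has sample tickets
```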

SQLite Count Summary Query

I'm trying to write a query that summarizes each employee's work for the week. For example, John Doe closed a total of 12 tickets for the week: 4 were Break/Fixes, 4 were Enhancements, and another 4 were uncategorized.
This is what I have so far:
SELECT (users.first_name || ' ' || users.last_name) AS Name,
COUNT(tickets.id) AS 'Number of Tickets Closed',
COUNT(tickets.category = 'Maintenance') AS 'Maintenance Tickets',
COUNT(tickets.category = 'After Hours') AS 'After Hours Tickets',
COUNT(tickets.category = 'Break Fix') AS 'Break Fix Tickets',
COUNT(tickets.category = 'Enhancement') AS 'Enhancement Tickets',
COUNT(tickets.category = '') AS 'Non Categorized Tickets'
FROM tickets, users
ON tickets.assigned_to=users.id
WHERE (tickets.status = 'closed') AND
(tickets.closed_at >= '2011-07-16 00:00:00') AND
(tickets.closed_at <= '2011-07-22 23:59:59')
GROUP BY Name;
Here is a sample result:
John Doe1 10 10 10 10 10 10
John Doe2 2 2 2 2 2 2
John Doe3 25 24 24 24 24 24
John Doe4 2 2 2 2 2 2
John Doe5 12 10 10 10 10 10
John Doe6 7 7 7 7 7 7
This query doesn't quite work as I expected: all of the category columns show the same total. (The first column is the total number of tickets closed; the following columns seem to contain only the categorized ones.) Help?
EDIT
Just wanted to post the functional code:
SELECT (users.first_name || ' ' || users.last_name) AS Name,
COUNT(tickets.id) AS 'Number of Tickets Closed',
COUNT(case tickets.category when 'Maintenance' then 1 else null end) AS 'Maintenance Tickets',
COUNT(case tickets.category when 'After Hours' then 1 else null end) AS 'After Hours Tickets',
COUNT(case tickets.category when 'Break Fix' then 1 else null end) AS 'Break Fix Tickets',
COUNT(case tickets.category when 'Enhancement' then 1 else null end) AS 'Enhancement Tickets',
COUNT(case tickets.category when '' then 1 else null end) AS 'Non Categorized Tickets'
FROM tickets, users
ON tickets.assigned_to=users.id
WHERE (tickets.status = 'closed') AND
(tickets.closed_at >= '2011-07-16') AND
(tickets.closed_at <= '2011-07-22')
GROUP BY Name;
You may want to use COUNT like this:
...
COUNT(case tickets.category when 'Maintenance' then 1 else null end),
COUNT(case tickets.category when 'After Hours' then 1 else null end),
...
It seems to me you cannot use an alias in the GROUP BY clause. Don't your users have an ID you could use to differentiate them?
And you must use SUM instead of COUNT if you want to count the rows that match a condition.
SELECT (users.first_name || ' ' || users.last_name) AS Name,
COUNT(tickets.id) AS 'Number of Tickets Closed',
SUM(tickets.category = 'Maintenance') AS 'Maintenance Tickets',
SUM(tickets.category = 'After Hours') AS 'After Hours Tickets',
SUM(tickets.category = 'Break Fix') AS 'Break Fix Tickets',
SUM(tickets.category = 'Enhancement') AS 'Enhancement Tickets',
SUM(tickets.category = '') AS 'Non Categorized Tickets'
FROM tickets, users
ON tickets.assigned_to=users.id
WHERE (tickets.status = 'closed') AND
(tickets.closed_at >= '2011-07-16 00:00:00') AND
(tickets.closed_at <= '2011-07-22 23:59:59')
GROUP BY Name;
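To see why the original columns all came out the same, here is a tiny Python check against an in-memory SQLite database (the four sample rows are made up). COUNT(expr) counts every row where expr is not NULL, and a comparison like category = 'Break Fix' evaluates to 0 or 1, never NULL, for non-NULL categories. SUM adds up those 0/1 values, and COUNT over a CASE without an ELSE branch skips the NULLs, so both give the matching count.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tickets (id INTEGER PRIMARY KEY, category TEXT)")
con.executemany(
    "INSERT INTO tickets (category) VALUES (?)",
    [("Maintenance",), ("Break Fix",), ("Break Fix",), ("",)],
)

row = con.execute(
    """
    SELECT COUNT(category = 'Break Fix'),                   -- counts non-NULL results
           SUM(category = 'Break Fix'),                     -- adds the 0/1 results
           COUNT(CASE category WHEN 'Break Fix' THEN 1 END) -- no ELSE, so misses are NULL
    FROM tickets
    """
).fetchone()
print(row)  # (4, 2, 2)
```

This also explains the 25 versus 24 row in the sample output: a ticket whose category is NULL makes the comparison NULL, so COUNT skips it in the category columns but still counts its id.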

Resources