CASE WHEN statement in Google Data Studio not working

I'm trying to create a CASE WHEN statement in Google Data Studio. However, it is not being applied to my table. I tried a few combinations and nothing seems to work, even though the editor says that the syntax is valid.

It can be achieved using the 3 Calculated Fields below:
1) Metric (Needs Intervention)
COUNT(CASE
WHEN Absenteeism Avg in Days > 15 THEN "Needs Intervention"
ELSE NULL END)
2) Metric (Stable)
COUNT(CASE
WHEN Absenteeism Avg in Days <= 15 THEN "Stable"
ELSE NULL END)
3) Dimension (Needs Intervention / Stable)
CASE
WHEN Absenteeism Avg in Days > 15 THEN "Needs Intervention"
ELSE "Stable"
END
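The reason the two metrics above use ELSE NULL is that COUNT only counts non-null values. A minimal Python sketch of that behaviour, assuming a handful of made-up rows (the key absenteeism_avg_days is a hypothetical stand-in for the "Absenteeism Avg in Days" field, and None plays the role of NULL):

```python
# Hypothetical sample rows; "absenteeism_avg_days" stands in for the
# "Absenteeism Avg in Days" field from the question.
rows = [{"absenteeism_avg_days": d} for d in (3, 15, 16, 22, 9)]


def count_non_null(values):
    """Mimics COUNT: NULL (None here) values are ignored."""
    return sum(1 for v in values if v is not None)


def label(days):
    """Mimics the dimension: CASE WHEN days > 15 THEN ... ELSE ... END."""
    return "Needs Intervention" if days > 15 else "Stable"


# Metric 1: COUNT(CASE WHEN days > 15 THEN "Needs Intervention" ELSE NULL END)
needs_intervention = count_non_null(
    "Needs Intervention" if r["absenteeism_avg_days"] > 15 else None for r in rows
)

# Metric 2: COUNT(CASE WHEN days <= 15 THEN "Stable" ELSE NULL END)
stable = count_non_null(
    "Stable" if r["absenteeism_avg_days"] <= 15 else None for r in rows
)

# Dimension: one label per row
labels = [label(r["absenteeism_avg_days"]) for r in rows]
```

With these sample values, two rows fall above the threshold and three at or below it.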
Google Data Studio Report and a GIF to elaborate:


Multiple time periods comparison in Power BI / DAX

I have a question regarding a multiple-year comparison report in Power BI and DAX. The first five columns in the table below show the example data, the sixth column shows the two requirements, and the last column shows whether a plan_ID in the first column meets the requirements. I hope to count the number of plan_IDs which meet both requirements for a specific season (e.g. Spring 2018).
As you can see from the last column, Spring 2018 has 3 "yes" while Spring 2019 has 6. Therefore, for Spring 2019, the "Count of Plans for This Year" is 6 while the "Count of Plans for The Last Year" is 3, as shown in the table below. The table is what I want to have.
My question is how to count the plans that meet the two requirements for a specific season/season_number such as Spring 2019/190.
I have been struggling with this for a long time. Any ideas or suggestions will be greatly appreciated.
One of the best ways is to create a comparison table (with an inactive relationship to the fact table). The user can then choose any two time periods, and a single visual will show the comparison:
Here are two good videos explaining this subject fully:
https://www.youtube.com/watch?v=TF0lPIYjJfs
https://www.youtube.com/watch?v=knXFVf2ipro
If my understanding (as commented) is correct, just add a custom column to your table in the Power Query Editor, as below.
#"Changed Type" in line 2 is simply the name of the previous step. Adjust it accordingly.
= Table.AddColumn(
    #"Changed Type",
    "counted or not",
    each
        let
            current_year = Date.Year(DateTime.LocalNow()),
            year_difference = current_year - [year] + 1,
            current_date = DateTime.Date(DateTime.LocalNow()),
            date_to_check_with = Date.AddYears(current_date, -(year_difference)),
            meet_req = if [plan_date] <= date_to_check_with then "Yes" else "No"
        in meet_req
)
And here is the final output:
Now you have the column with Yes/No values. To count the Yes values per year, go back to the report and create the measure below:
yearly_yes_count =
CALCULATE(
    COUNT(your_table_name[counted or not]),
    FILTER(
        ALLEXCEPT(your_table_name, your_table_name[year]),
        your_table_name[counted or not] = "Yes"
    )
)
Here is the count for the previous year:
last_year_yes_count =
VAR current_row_year = MIN(your_table_name[year])
RETURN
CALCULATE(
    COUNT(your_table_name[counted or not]),
    FILTER(
        ALL(your_table_name),
        your_table_name[counted or not] = "Yes"
            && your_table_name[year] = current_row_year - 1
    )
)
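To make the intent of the two measures easier to check, here is a small Python sketch of the same counting logic. The rows are made up for illustration, and the function names simply mirror the measure names; this is not how DAX evaluates filter context, only the arithmetic it ends up with.

```python
from collections import Counter

# Hypothetical rows: (year, value of the "counted or not" column)
rows = [
    (2018, "Yes"), (2018, "No"), (2018, "Yes"), (2018, "Yes"),
    (2019, "Yes"), (2019, "Yes"), (2019, "No"),
]

# Number of "Yes" rows per year
yes_per_year = Counter(year for year, flag in rows if flag == "Yes")


def yearly_yes_count(year):
    """Count of 'Yes' rows for the given year."""
    return yes_per_year.get(year, 0)


def last_year_yes_count(year):
    """Count of 'Yes' rows for the year before the given one."""
    return yes_per_year.get(year - 1, 0)
```

One difference to keep in mind: where this sketch returns 0 for a year with no matching rows, the DAX COUNT returns BLANK.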
Now add year and the new measure yearly_yes_count to a table visual, and the output will be as below:

How to Subtract Pageviews in Google Data Studio using a CASE statement?

I want to subtract Pageviews of a particular page from Pageviews of a different page, but when I try using COUNT with CASE, I get 1:
COUNT(CASE
WHEN page = "www.link1.com" THEN 1 END)
This gives me a wrong COUNT:
COUNT(CASE
WHEN page = "www.link1.com" THEN 1
ELSE 0 END)
What I ultimately want to do is:
COUNT(CASE
WHEN page="www.link1.com" OR page = "www.link2.com" THEN 1
ELSE 0 END) - COUNT(CASE
WHEN page="www.link3.com" THEN 1
ELSE 0 END)
I want the COUNT of Users who have visited link3 but NOT from link1 and link2. These links are steps in a funnel. link1 is the first step in the funnel, but link2 and link3 have more Pageviews. I want to show how many users have come from sources other than the previous funnel step (i.e., link1).
Summary
One way to achieve this is with either the RegEx Formula (#2) or the CASE Statement (#3). However, as Pageviews is an aggregated Metric, the Calculated Fields will produce the message below when created at the Data Source:
Sorry, calculated fields can't mix metrics (aggregated values) and dimensions (non-aggregated values). Please check the aggregation types of the fields used in this formula. Learn more.
For future reference, added an Image:
The solution is to first use Data Blending to disaggregate the Pageviews field (#1) and then apply the Calculated Field (#2 or #3):
1) Data Blending
Data Source 1
Join Key 1: Date
Join Key 2: Page
Metric: Pageviews
Data Source 2
Join Key 1: Date
Join Key 2: Page
An image to elaborate:
2) RegEx Formula
SUM(NARY_MAX(CAST(REGEXP_REPLACE(CONCAT(Page, ";", Pageviews), "(www\\.link1\\.com|www\\.link2\\.com);(.*)", "\\2") AS NUMBER ), 0 ) ) - SUM(NARY_MAX(CAST(REGEXP_REPLACE(CONCAT(Page, ";", Pageviews), "(www\\.link3\\.com);(.*)", "\\2") AS NUMBER ), 0 ) )
3) (Alternative Calculated Field) CASE Statement
SUM(CASE
WHEN Page IN ("www.link1.com", "www.link2.com") THEN Pageviews
ELSE 0 END) - SUM(CASE
WHEN Page IN ("www.link3.com") THEN Pageviews
ELSE 0 END)
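The difference between the COUNT attempts in the question and the SUM version above can be sketched in Python. The rows and pageview numbers are made up, and www.other.com is a hypothetical extra page; the point is only that COUNT counts every non-null value (so ELSE 0 counts all rows), while SUM adds up the Pageviews of the matching pages.

```python
# Hypothetical disaggregated rows: (page, pageviews)
rows = [
    ("www.link1.com", 10),
    ("www.link2.com", 25),
    ("www.link3.com", 40),
    ("www.other.com", 7),
]


def count_non_null(values):
    """Mimics COUNT: every value that is not NULL (None) is counted."""
    return sum(1 for v in values if v is not None)


# COUNT(CASE WHEN page = "www.link1.com" THEN 1 ELSE 0 END)
# 0 is not NULL, so every row is counted.
count_with_else_zero = count_non_null(
    1 if page == "www.link1.com" else 0 for page, _ in rows
)

# COUNT(CASE WHEN page = "www.link1.com" THEN 1 END)
# Rows that do not match yield NULL and are skipped.
count_without_else = count_non_null(
    1 if page == "www.link1.com" else None for page, _ in rows
)


def sum_pageviews(pages):
    """Mimics SUM(CASE WHEN Page IN (...) THEN Pageviews ELSE 0 END)."""
    return sum(views if page in pages else 0 for page, views in rows)


difference = sum_pageviews({"www.link1.com", "www.link2.com"}) - sum_pageviews(
    {"www.link3.com"}
)
```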
Google Data Studio Report and a GIF to elaborate:

PL/SQL - need to do some conditioned calculations

I am a beginner in PL/SQL, so don't be too harsh.
I have a table with Column_A (current month amount) and Column_B (previous month amount), both numbers. I need to write a condition for some calculations: Column_A - Column_B = result. If result > 0 (meaning that there is an increase in the current month compared to the previous one), I need result + Column_A.
I don't know how to write this one.
You can try a query like below.
UPDATE your_table SET column_A =
    ( CASE
        WHEN (column_A - Column_b) > 0 THEN (column_A + (column_A - Column_b))
        ELSE (column_A)
      END )
This will check for all records that have a difference greater than zero and will update column_A with the result, which is the sum of column_A and the difference.
Hope this helps. Wish you a great learning!
Edited:
Well, if you are just trying to manipulate data for display, you can simplify your query as below, which performs the same calculation.
SELECT (CASE
          WHEN (Current_month_amount - previous_month_amount) > 0 THEN
            (Current_month_amount + (Current_month_amount - previous_month_amount))
          ELSE (Current_month_amount)
        END) AS Current_month_amount,
       previous_month_amount,
       (Current_month_amount - previous_month_amount) AS Amount_Difference
FROM table_1
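The CASE expression used in both queries boils down to one small rule, sketched here in Python for checking the arithmetic by hand. The function name adjusted_amount is made up for illustration.

```python
def adjusted_amount(current, previous):
    """Return current + (current - previous) when the amount increased,
    otherwise return current unchanged (same rule as the CASE expression)."""
    difference = current - previous
    if difference > 0:
        return current + difference
    return current
```

For example, a current amount of 120 against a previous amount of 100 gives a difference of 20, so the adjusted value is 140; with a current amount of 80 the difference is negative and 80 is returned as is.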

Firebase exported to BigQuery: retention cohorts query

Firebase offers split testing through Firebase Remote Config, but there is no way to filter retention in the cohorts section by user properties (or by any property, in fact).
In search of a solution I turned to BigQuery, since Firebase Analytics provides a convenient way to export data to that service.
But I am stuck with many questions, and Google has no answer or example that points me in the right direction.
General questions:
As a first step I need to aggregate data that represents the same data Firebase cohorts show, so I can be sure my calculation is right:
The next step should be just to apply constraints to the queries so that they match custom user properties.
Here is what I have so far:
The main problem is a big difference in the user counts. Sometimes it is about 100 users, but sometimes close to 1000.
This is the approach I use:
# 1
# Count users with `user_dim.first_open_timestamp_micros`
# in specified period (w0 – week 1)
# this is the way firebase group users to cohorts
# (who started app on the same day or during the same week)
# https://support.google.com/firebase/answer/6317510
SELECT
  COUNT(DISTINCT user_dim.app_info.app_instance_id) as count
FROM
  (
    TABLE_DATE_RANGE
    (
      [admob-app-id-xx:xx_IOS.app_events_],
      TIMESTAMP('2016-11-20'),
      TIMESTAMP('2016-11-26')
    )
  )
WHERE
  STRFTIME_UTC_USEC(user_dim.first_open_timestamp_micros, '%Y-%m-%d')
    BETWEEN '2016-11-20' AND '2016-11-26'
# 2
# For each next period count events with
# same first_open_timestamp
# Here is example for one of the weeks.
# week 0 is Nov20-Nov26, week 1 is Nov27-Dec03
SELECT
  COUNT(DISTINCT user_dim.app_info.app_instance_id) as count
FROM
  (
    TABLE_DATE_RANGE
    (
      [admob-app-id-xx:xx_IOS.app_events_],
      TIMESTAMP('2016-11-27'),
      TIMESTAMP('2016-12-03')
    )
  )
WHERE
  STRFTIME_UTC_USEC(user_dim.first_open_timestamp_micros, '%Y-%m-%d')
    BETWEEN '2016-11-20' AND '2016-11-26'
# 3
# Now we have users for each week w1, w2, ... w5
# Calculate retention for each of them
# retention week 1 = w1 / w0 * 100 = 25.72181359
# rw2 = w2 / w1 * 100
# ...
# rw5 = w5 / w1 * 100
# 4
# Shift week 0 by one and repeat from step 1
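The four steps above can be sketched in Python on a tiny made-up event list, which may help when cross-checking the BigQuery numbers by hand. This sketch assumes that retention for every week is measured against the week 0 cohort size, and the tuple layout (app_instance_id, first_open date, event date) is a hypothetical simplification of the export schema.

```python
from datetime import date, timedelta

# Hypothetical events: (app_instance_id, first_open date, event date)
events = [
    ("a", date(2016, 11, 20), date(2016, 11, 20)),
    ("a", date(2016, 11, 20), date(2016, 11, 28)),
    ("b", date(2016, 11, 22), date(2016, 11, 22)),
    ("b", date(2016, 11, 22), date(2016, 12, 5)),
    ("c", date(2016, 11, 26), date(2016, 11, 26)),
    ("d", date(2016, 11, 25), date(2016, 11, 25)),
    ("d", date(2016, 11, 25), date(2016, 11, 29)),
    ("e", date(2016, 11, 27), date(2016, 11, 27)),  # first open outside the cohort week
]


def weekly_retention(events, cohort_start, weeks):
    """Percentage of the week 0 cohort that is active in each following week."""
    cohort_end = cohort_start + timedelta(days=6)
    active = [set() for _ in range(weeks)]
    for user, first_open, event_day in events:
        # Steps 1 and 2: keep only users whose first open falls in the cohort week
        if not (cohort_start <= first_open <= cohort_end):
            continue
        week = (event_day - cohort_start).days // 7
        if 0 <= week < weeks:
            active[week].add(user)
    # Step 3: divide each week's distinct users by the week 0 distinct users
    cohort_size = len(active[0])
    if cohort_size == 0:
        return [0.0] * weeks
    return [round(len(users) / cohort_size * 100, 2) for users in active]


retention = weekly_retention(events, date(2016, 11, 20), 3)
```

Step 4 corresponds to calling weekly_retention again with cohort_start moved forward by seven days.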
BigQuery queries tips request
Any tips and directions on building a complex query that aggregates and calculates all the data required for this task in one step are much appreciated.
Here is the BigQuery Export schema if needed
Side questions:
Why are all the user_dim.device_info.device_id and user_dim.device_info.resettable_device_id values null?
user_dim.app_info.app_id is missing from the docs (in case someone from the Firebase support team reads this question).
How should event_dim.timestamp_micros and event_dim.previous_timestamp_micros be used? I cannot work out their purpose.
PS
It would be good if someone from the Firebase team answered this question. Five months ago there was a mention of extending the cohorts functionality with filtering, or of showing BigQuery examples, but things are not moving. Firebase Analytics is the way to go, they said; Google Analytics is deprecated, they said.
Now I am spending a second day learning BigQuery and building my own solution on top of the existing analytics tools. I know Stack Overflow is not the place for such comments, but guys, what are you thinking? Split testing may dramatically affect the retention of my app. My app does not sell anything; funnels and events are not valuable metrics in many cases.
Any tips and directions on building a complex query that aggregates and calculates all the data required for this task in one step are much appreciated.
Yes, generic BigQuery will work fine.
The query below is not the most generic version, but it can give you an idea.
In this example I am using Stack Overflow Data available in Google BigQuery Public Datasets
The first sub-select, activities, is in most cases the only part you need to rewrite to reflect the specifics of your data.
What it does is:
a. Defines the period you want to use for the analysis.
In the example below it is a month: FORMAT_DATE('%Y-%m', ...
But you can use year, week, day, or anything else, respectively:
• By year - FORMAT_DATE('%Y', DATE(answers.creation_date)) AS period
• By week - FORMAT_DATE('%Y-%W', DATE(answers.creation_date)) AS period
• By day - FORMAT_DATE('%Y-%m-%d', DATE(answers.creation_date)) AS period
• …
b. It also “filters” only the type of events/activity you need to analyse.
For example, WHERE CONCAT('|', questions.tags, '|') LIKE '%|google-bigquery|%' looks for answers to questions tagged google-bigquery.
The rest of the sub-queries are more or less generic and can mostly be used as is.
#standardSQL
WITH activities AS (
  SELECT answers.owner_user_id AS id,
    FORMAT_DATE('%Y-%m', DATE(answers.creation_date)) AS period
  FROM `bigquery-public-data.stackoverflow.posts_answers` AS answers
  JOIN `bigquery-public-data.stackoverflow.posts_questions` AS questions
    ON questions.id = answers.parent_id
  WHERE CONCAT('|', questions.tags, '|') LIKE '%|google-bigquery|%'
  GROUP BY id, period
), cohorts AS (
  SELECT id, MIN(period) AS cohort FROM activities GROUP BY id
), periods AS (
  SELECT period, ROW_NUMBER() OVER(ORDER BY period) AS num
  FROM (SELECT DISTINCT cohort AS period FROM cohorts)
), cohorts_size AS (
  SELECT cohort, periods.num AS num, COUNT(DISTINCT activities.id) AS ids
  FROM cohorts JOIN activities ON activities.period = cohorts.cohort AND cohorts.id = activities.id
  JOIN periods ON periods.period = cohorts.cohort
  GROUP BY cohort, num
), retention AS (
  SELECT cohort, activities.period AS period, periods.num AS num, COUNT(DISTINCT cohorts.id) AS ids
  FROM periods JOIN activities ON activities.period = periods.period
  JOIN cohorts ON cohorts.id = activities.id
  GROUP BY cohort, period, num
)
SELECT
  CONCAT(cohorts_size.cohort, ' - ', FORMAT("%'d", cohorts_size.ids), ' users') AS cohort,
  retention.num - cohorts_size.num AS period_lag,
  retention.period AS period_label,
  ROUND(retention.ids / cohorts_size.ids * 100, 2) AS retention,
  retention.ids AS rids
FROM retention
JOIN cohorts_size ON cohorts_size.cohort = retention.cohort
WHERE cohorts_size.cohort >= FORMAT_DATE('%Y-%m', DATE('2015-01-01'))
ORDER BY cohort, period_lag, period_label
You can visualize the result of the above query with the tool of your choice.
Note: you can use either period_lag or period_label.
See the difference between them in the examples below:
with period_lag
with period_label
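For readers who want to trace what the sub-queries compute, here is a rough Python sketch of the same cohort logic on made-up data. The user ids and periods are hypothetical, and the variable names only loosely mirror the sub-query names; it is meant as a way to sanity-check period_lag and the retention percentage, not as a replacement for the query.

```python
from collections import defaultdict

# Hypothetical (user id, period) activity pairs, periods as 'YYYY-MM' strings
activities = {
    ("u1", "2015-01"), ("u1", "2015-02"),
    ("u2", "2015-01"), ("u2", "2015-03"),
    ("u3", "2015-02"), ("u3", "2015-03"),
    ("u4", "2015-03"),
}

# cohorts: the first period in which each user was active
cohort_of = {}
for uid, period in activities:
    cohort_of[uid] = min(period, cohort_of.get(uid, period))

# periods: number the distinct cohort periods in order, starting at 1
period_num = {p: i for i, p in enumerate(sorted(set(cohort_of.values())), start=1)}

# retention: distinct users per (cohort, period)
retained = defaultdict(set)
for uid, period in activities:
    if period in period_num:
        retained[(cohort_of[uid], period)].add(uid)

# cohorts_size: users active in their own cohort period
cohort_size = {c: len(retained[(c, c)]) for c in period_num}

# Final select: retention percentage keyed by (cohort, period_lag)
retention = {}
for (cohort, period), users in retained.items():
    period_lag = period_num[period] - period_num[cohort]
    retention[(cohort, period_lag)] = round(len(users) / cohort_size[cohort] * 100, 2)
```

Here period_lag 0 is always 100 percent by construction, and later lags show the share of the cohort that came back in that period.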

Cognos: Count the number of occurrences of a distinct ID

I'm making a report in Cognos Report Studio and I'm having a bit of trouble getting a count that I need. What I need to do is count the number of IDs for a department, but I need to split the count between initiated and completed. If an ID occurs more than once, it is to be counted as completed. The others, of course, will be initiated. So I'm trying to count the number of ID occurrences for a distinct ID. Here is the query I've made in SQL Developer:
SELECT
    COUNT((CASE WHEN COUNT(S.RFP_ID) > 8 THEN MAX(CT.GCT_STATUS_HISTORY_CLOSE_DT) END)) AS "Sales Admin Completed"
    ,COUNT((CASE WHEN COUNT(S.RFP_ID) = 8 THEN MIN(CT.GCT_STATUS_HISTORY_OPEN_DT) END)) AS "Sales Admin Initiated"
FROM
    ADM.B_RFP_WC_COVERAGE_DIM S
    JOIN ADM.B_GROUP_CHANGE_REQUEST_DIM CR
        ON S.RFP_ID = CR.GCR_RFP_ID
    JOIN ADM.GROUP_CHANGE_TASK_FACT CT
        ON CR.GROUP_CHANGE_REQUEST_KEY = CT.GROUP_CHANGE_REQUEST_KEY
    JOIN ADM.B_DEPARTMENT_DIM D
        ON D.DEPARTMENT_KEY = CT.DEPARTMENT_RESP_KEY
WHERE CR.GCR_CHANGE_TYPE_ID = '20'
    AND S.RFP_LOB_IND = 'WC'
    AND S.RFP_AUDIT_IND = 'N'
    AND CR.GCR_RECEIVED_DT BETWEEN '01-JAN-13' AND '31-DEC-13'
    AND D.DEPARTMENT_DESC = 'Sales'
    AND CT.GCT_STATUS_IND = 'C'
GROUP BY S.RFP_ID ;
Now this works. But I'm not sure how to translate that into Cognos. I tried a CASE that looked like this (this code uses basic names such as dept instead of D.DEPARTMENT_DESC):
CASE WHEN dept = 'Sales' AND count(ID for {DISTINCT ID}) > 1 THEN count(distinct ID)END)
I'm using count(distinct ID) instead of count(maximum(close_date)). But the results would be the same anyway. The "AND" is where I think it's being lost. It obviously isn't the proper way to count occurrences. But I'm hoping I'm close. Is there a way to do this with a CASE? Or at all?
--EDIT--
To make my question more clear, here is an example:
Say I have this data in my table
ID
---
1
2
3
4
2
5
5
6
2
My desired count output would be:
Initiated  Completed
---------  ---------
4          2
This is because two of the distinct IDs (2 and 5) occur more than once, so they are counted as Completed. The ones that occur only once are counted as Initiated. I am able to do this in SQL Developer, but I can't figure out how to do it in Cognos Report Studio. I hope this helps to better explain my issue.
Oh, I didn't quite get it originally; amending the answer.
It's still easiest to do with 2 queries in Report Studio. The key point is that you can use a query as the source for another query, which guarantees proper GROUP BYs and calculations.
So if you have the ID list in a table, in Report Studio you create:
Query 1 with data items:
ID,
count_occurences: count(*) or count(1),
status (initiated or completed) with the formula: if (count_occurences > 1) then ('completed') else ('initiated').
After that you create Query 2, using Query 1 as its source, with just 2 data items:
[Query1].[Status]
Count with the formula: count([Query1].[ID])
That will give you the result you're after.
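The two-query idea can be checked against the example data from the question with a short Python sketch. The variable names are made up; the first Counter plays the role of Query 1 and the second the role of Query 2.

```python
from collections import Counter

# Example IDs from the question
ids = [1, 2, 3, 4, 2, 5, 5, 6, 2]

# "Query 1": one row per ID with its number of occurrences and a status
occurrences = Counter(ids)
status = {
    id_: "completed" if count > 1 else "initiated"
    for id_, count in occurrences.items()
}

# "Query 2": count the IDs per status
totals = Counter(status.values())
```

IDs 2 and 5 occur more than once, so they land in the completed group, and the remaining four IDs are initiated.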
Here's a link to the documentation on how to nest queries:
http://pic.dhe.ibm.com/infocenter/cx/v10r1m0/topic/com.ibm.swg.ba.cognos.ug_cr_rptstd.10.1.0.doc/c_cr_rptstd_wrkdat_working_with_queries_rel.html?path=3_3_10_6#cr_rptstd_wrkdat_working_with_queries_rel
