Google Analytics: Conversion rate in time intervals - google-analytics
I've got Google Analytics running on a website and am now trying to determine the conversion rate in certain time intervals. I therefore have a table that contains
interval_id
interval_start_time_utc
interval_end_time_utc
Sadly, the following BigQuery query that would assign each order to an interval will not work:
SELECT
totals.transactions,
totals.visits,
i.interval_id
FROM [123456.ga_sessions_20160609]
INNER JOIN intervals i ON i.interval_start_time_utc < visitStartTime AND visitStartTime < i.interval_end_time_utc
This throws the error
ON clause must be AND of = comparisons of one field name from each table [...]
so I gather that BigQuery simply doesn't do range joins. Is there another way to do this short of doing a full join and then paring down? Are there entirely different, better approaches for this sort of thing?
BigQuery Standard SQL doesn't have this limitation; see Enabling Standard SQL.
If you want to do this with BigQuery Legacy SQL, try something like the query below:
SELECT
totals.transactions,
totals.visits,
i.interval_id
FROM [123456.ga_sessions_20160609]
CROSS JOIN intervals i
WHERE i.interval_start_time_utc < visitStartTime
AND visitStartTime < i.interval_end_time_utc
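To make the end goal concrete, here is a minimal sketch of the same CROSS JOIN plus WHERE pattern carried through to a conversion rate per interval. It runs in Python against an in-memory SQLite database as a stand-in for BigQuery. The table names, column names and sample rows are invented for illustration, and conversion rate is assumed to mean transactions divided by visits.

```python
import sqlite3

# SQLite stands in for BigQuery; sessions/intervals and their columns are
# simplified, hypothetical names, not the real GA export schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sessions (visit_start_time INTEGER, visits INTEGER, transactions INTEGER);
    CREATE TABLE intervals (interval_id INTEGER, start_utc INTEGER, end_utc INTEGER);
    INSERT INTO sessions VALUES
        (105, 1, 1), (110, 1, NULL), (150, 1, NULL), (205, 1, 1), (999, 1, 1);
    INSERT INTO intervals VALUES (1, 100, 200), (2, 200, 300);
""")

# Pair every session with every interval, keep only the pairs where the
# session starts strictly inside the interval, then aggregate per interval.
rows = conn.execute("""
    SELECT i.interval_id,
           SUM(s.visits) AS visits,
           SUM(COALESCE(s.transactions, 0)) AS transactions,
           1.0 * SUM(COALESCE(s.transactions, 0)) / SUM(s.visits) AS conversion_rate
    FROM sessions s
    CROSS JOIN intervals i
    WHERE i.start_utc < s.visit_start_time
      AND s.visit_start_time < i.end_utc
    GROUP BY i.interval_id
    ORDER BY i.interval_id
""").fetchall()

for interval_id, visits, transactions, rate in rows:
    print(interval_id, visits, transactions, round(rate, 3))
```

Sessions that fall outside every interval (the one at 999 here) simply drop out of the result.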
To present the idea, let's simplify the example.
And let's remember: we want to do this with BigQuery Legacy SQL, not with Standard SQL, where it is trivial!
Challenge
Assume we have visits table:
SELECT visit_time FROM
(SELECT 2 AS visit_time),
(SELECT 12 AS visit_time),
(SELECT 22 AS visit_time),
(SELECT 32 AS visit_time)
and intervals table:
SELECT before, after, event FROM
(SELECT 1 AS before, 5 AS after, 3 AS event),
(SELECT 6 AS before, 10 AS after, 8 AS event),
(SELECT 21 AS before, 25 AS after, 23 AS event),
(SELECT 33 AS before, 37 AS after, 35 AS event)
We want to extract all visits that fall within an event's before and after values.
This can be done simply with a CROSS JOIN, as below:
SELECT
visit_time, event, before, after
FROM (
SELECT visit_time FROM
(SELECT 2 AS visit_time),
(SELECT 12 AS visit_time),
(SELECT 22 AS visit_time),
(SELECT 32 AS visit_time)
) AS visits
CROSS JOIN (
SELECT before, after, event FROM
(SELECT 1 AS before, 5 AS after, 3 AS event),
(SELECT 6 AS before, 10 AS after, 8 AS event),
(SELECT 21 AS before, 25 AS after, 23 AS event),
(SELECT 33 AS before, 37 AS after, 35 AS event)
) AS intervals
WHERE visit_time BETWEEN before AND after
The result is:
visit_time event before after
2 3 1 5
22 23 21 25
Potential Issue
When both tables are big enough, this cross join becomes quite expensive!
Hint
It turned out (from the user's comments) that the intervals always extend x units to the left and right of the event.
Solution
Below is a proposed option that uses this fact and performs a JOIN instead of a CROSS JOIN between the two big tables.
The key is to generate, on the fly, a new table that holds every possible value inside each interval, based on event and x:
SELECT event, event + delta AS point
FROM (
SELECT event FROM
(SELECT 1 AS before, 5 AS after, 3 AS event),
(SELECT 6 AS before, 10 AS after, 8 AS event),
(SELECT 21 AS before, 25 AS after, 23 AS event),
(SELECT 33 AS before, 37 AS after, 35 AS event)
) AS events
CROSS JOIN (
SELECT pos - 1 - 2 AS delta FROM (
SELECT ROW_NUMBER() OVER() AS pos, * FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + 2 * 2, '.'),'') AS h FROM (SELECT NULL)),h
)))
) AS deltas
In the above code x = 2, but you can change it in two places. For example, for x = 5 you should have
SELECT pos - 1 - 5 AS delta FROM (
SELECT ROW_NUMBER() OVER() AS pos, * FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + 2 * 5, '.'),'') AS h FROM (SELECT NULL)),h
)))
The CROSS JOIN in the above code is inexpensive because the deltas table is quite small (only 2x + 1 rows).
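For readers who find the RPAD/SPLIT/ROW_NUMBER construction hard to follow, here is a small Python sketch of what it is intended to produce: the offsets from -x to x, and then one (event, point) row per event and offset. The function names are made up for this illustration.

```python
def deltas(x):
    """Offsets -x..x, i.e. pos - 1 - x for pos = 1 .. 2x + 1."""
    return list(range(-x, x + 1))


def points(events, x):
    """One (event, point) pair for every value inside [event - x, event + x]."""
    return [(event, event + d) for event in events for d in deltas(x)]


print(deltas(2))        # offsets for x = 2
print(points([3], 2))   # every point covered by the event at 3
```

The points table has len(events) * (2x + 1) rows, which is why the approach only pays off when x is small compared with the number of visits.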
So, finally, you can get your result with the query below:
SELECT
visit_time, event
FROM (
SELECT visit_time FROM
(SELECT 2 AS visit_time),
(SELECT 12 AS visit_time),
(SELECT 22 AS visit_time),
(SELECT 32 AS visit_time)
) AS visits
JOIN (
SELECT event, event + delta AS point
FROM (
SELECT event FROM
(SELECT 1 AS before, 5 AS after, 3 AS event),
(SELECT 6 AS before, 10 AS after, 8 AS event),
(SELECT 21 AS before, 25 AS after, 23 AS event),
(SELECT 33 AS before, 37 AS after, 35 AS event)
) AS events
CROSS JOIN (
SELECT pos - 1 - 2 AS delta FROM (
SELECT ROW_NUMBER() OVER() AS pos, * FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + 2 * 2, '.'),'') AS h FROM (SELECT NULL)),h
)))
) AS deltas
) AS points
ON points.point = visits.visit_time
With the expected result:
visit_time event
2 3
22 23
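The same equality join can be checked outside BigQuery. The sketch below replays the toy data in Python with an in-memory SQLite database; SQLite is only a stand-in, and the deltas table is filled from Python instead of the Legacy SQL string trick.

```python
import sqlite3

x = 2  # half-width of each interval around its event

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (visit_time INTEGER);
    CREATE TABLE events (event INTEGER);
    CREATE TABLE deltas (delta INTEGER);
    INSERT INTO visits VALUES (2), (12), (22), (32);
    INSERT INTO events VALUES (3), (8), (23), (35);
""")
conn.executemany("INSERT INTO deltas VALUES (?)", [(d,) for d in range(-x, x + 1)])

# Expand each event into its 2x + 1 points, then join on plain equality.
matches = conn.execute("""
    SELECT v.visit_time, p.event
    FROM visits v
    JOIN (
        SELECT e.event, e.event + d.delta AS point
        FROM events e
        CROSS JOIN deltas d
    ) p ON p.point = v.visit_time
    ORDER BY v.visit_time
""").fetchall()

print(matches)
```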
I think the above approach can work for you, but you will need to adapt it to your particular case.
I think this can be done relatively easily if you round all the times involved to their respective minutes.
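A rough sketch of that minute-rounding idea, in Python: each interval is expanded into the minutes it touches, and each visit is looked up by its own minute, which is again a pure equality match. Timestamps are assumed to be Unix seconds, rounding is done downwards, and all names and sample values are hypothetical.

```python
from collections import defaultdict


def minute_bucket(ts):
    """Round a Unix timestamp (seconds) down to the start of its minute."""
    return ts - ts % 60


def build_minute_index(intervals):
    """Map every minute touched by an interval to the ids of those intervals."""
    index = defaultdict(list)
    for interval_id, start, end in intervals:
        for minute in range(minute_bucket(start), minute_bucket(end) + 60, 60):
            index[minute].append(interval_id)
    return index


def assign(visit_times, index):
    """Equality lookup on the rounded minute, the analogue of the JOIN on point."""
    return [(t, i) for t in visit_times for i in index.get(minute_bucket(t), [])]


intervals = [(1, 600, 900), (2, 1800, 2100)]  # (interval_id, start, end)
index = build_minute_index(intervals)
assigned = assign([615, 1000, 1799, 2130], index)
print(assigned)
```

Rounding costs precision at the edges: the visit at 2130 lies after the end of interval 2 (2100) but shares its last minute, so it is still matched. If that matters, the exact start and end comparison can be applied as a filter after the join, when far fewer rows are left.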
Hope this helps.
Share your result with us if you get this to work :o)
Related
Rolling 7 day uniques & 31 day uniques in BigQuery w/ Firebase
I'm trying to setup a rolling 7 day users & rolling 31 day users in BigQuery (w/ Firebase) using the following query. I want it where for each day it examines the previous 31 days as well as 7 days. I've been stuck and getting the message: LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join.
The query:
With events AS (
  SELECT PARSE_DATE("%Y%m%d", event_date) as event_date, user_pseudo_id
  FROM `my_data_table.analytics_178206500.events_*`
  Where _table_suffix NOT LIKE "i%" AND event_name = "user_engagement"
  GROUP BY 1, 2
), DAU AS (
  SELECT event_date as date, COUNT(DISTINCT(user_pseudo_id)) AS dau
  From events
  GROUP BY 1
)
SELECT DAU.date, DAU.dau,
  (SELECT count(distinct(user_pseudo_id)) FROM events
   WHERE events.event_date BETWEEN DATE_SUB(DAU.date, INTERVAL 29 DAY) and dau.date) as mau,
  (SELECT count(distinct(user_pseudo_id)) FROM events
   WHERE events.event_date BETWEEN DATE_SUB(DAU.date, INTERVAL 7 DAY) and dau.date) as wau
FROM DAU
ORDER BY 1 DESC
I'm able to get the DAU part but the last 7 day users (WAU) & last 31 day users (MAU) aren't coming through. I have tried to CROSS JOIN DAU w/ events but I get the following results GraphResults Any pointers would be greatly appreciated
Calculating occupancy level between a date range
I'm having trouble trying to wrap my head around how to write this query to calculate the occupancy level of a hotel and then list the results by date. Consider the following type of data from a table called reservations:
Arrival     Departure   Guest  Confirmation
08/01/2015  08/05/2015  John   13234
08/01/2015  08/03/2015  Bob    34244
08/02/2015  08/03/2015  Steve  32423
08/02/2015  08/02/2015  Mark   32411
08/02/2015  08/04/2014  Jenny  24422
Output Data would ideally look like:
Date        Occupancy
08/01/2015  2
08/02/2015  4
08/03/2015  2
08/04/2015  1
08/02/2015  0
And the query should be able to utilize a date range as a variable. I'm having trouble getting the obviously hardest piece of how to both get the count per night and spitting it out by date.
You can generate a list of dates first. In Oracle you can do this by using connect by, which makes a recursive query. For instance, to get the next 30 days, you can select today and keep connecting until you've got the desired number of days. level indicates the level of recursion.
select trunc(sysdate) + level - 1 as THEDATE
from dual
connect by level <= 30;
On that list, you can query the number of reservations for each day in that period. A reservation occupies a given night when the guest arrived on or before that date and departs after it:
select THEDATE,
  (select count(*)
   from reservations r
   where r.Arrival <= THEDATE and r.Departure > THEDATE) as RESERVATIONCOUNT
from (
  select trunc(sysdate) + level - 1 as THEDATE
  from dual
  connect by level <= 30)
Instead of getting a fixed number of dates, you can also use another value there, for instance to get at least 30 days into the future, but further if there are reservations for later:
select THEDATE,
  (select count(*)
   from reservations r
   where r.Arrival <= THEDATE and r.Departure > THEDATE) as RESERVATIONCOUNT
from (
  select trunc(sysdate) + level - 1 as THEDATE
  from dual
  connect by level <= greatest(30, (select trunc(max(DEPARTURE) - sysdate) from reservations)))
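As a rough cross-check of the date-list approach against the sample data in the question, here is a sketch in Python using SQLite, where a recursive CTE takes the place of Oracle's connect by. It assumes a guest occupies a night when arrival <= date < departure, that Jenny's departure year is meant to be 2015, and that dates are stored as ISO strings; none of this is stated in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reservations (arrival TEXT, departure TEXT, guest TEXT);
    INSERT INTO reservations VALUES
        ('2015-08-01', '2015-08-05', 'John'),
        ('2015-08-01', '2015-08-03', 'Bob'),
        ('2015-08-02', '2015-08-03', 'Steve'),
        ('2015-08-02', '2015-08-02', 'Mark'),
        ('2015-08-02', '2015-08-04', 'Jenny');
""")

# Generate one row per date in the requested range, then count the
# reservations covering each night with a correlated subquery.
occupancy = conn.execute("""
    WITH RECURSIVE days(d) AS (
        SELECT ?
        UNION ALL
        SELECT date(d, '+1 day') FROM days WHERE d < ?
    )
    SELECT d,
           (SELECT COUNT(*)
            FROM reservations r
            WHERE r.arrival <= d AND r.departure > d) AS occupancy
    FROM days
    ORDER BY d
""", ("2015-08-01", "2015-08-05")).fetchall()

for day, count in occupancy:
    print(day, count)
```

The two bound parameters are the start and end of the date range, which covers the "date range as a variable" requirement.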
SQLite count(*) in while clause
I have a calendar table in which there are all the dates in the future and a workday field: fld_date / fld_workday 2014-01-01 / 1 2014-01-02 / 1 2014-01-03 / 0 ... I want select a date which are n workday far from another date. I tried two ways, but i failed: The 5th workday from 2014-11-07: 1. SELECT n1.fld_date FROM calendar as n1 WHERE n1.fld_workday=1 AND (select count(*) FROM calendar as n2 WHERE n2.fld_date>='2014-11-07' AND n2.fld_workday=1)=5 It gave back 0 row. 2. SELECT fld_date FROM calendar WHERE fld_date>='2014-11-07' AND fld_workday=1 LIMIT 1 OFFSET 5 It's ok, but i would like to change the 5 days constant to a field, and it's cannot (it would be inside a bigger select statement): SELECT fld_date FROM calendar WHERE fld_date>='2014-11-07' AND fld_workday=1 LIMIT 1 OFFSET fld_another_field Any suggestion?
In the first query, the subquery does not refer to the row in n1. You need a correlated subquery:
SELECT fld_Date
FROM Calendar AS n1
WHERE fld_WorkDay = 1
  AND (SELECT COUNT(*)
       FROM Calendar AS n2
       WHERE fld_Date BETWEEN '2014-11-07' AND n1.fld_Date
         AND fld_WorkDay = 1
      ) = 5
LIMIT 1
The subquery is extremely inefficient if there is no index on the fld_Date column. You can avoid executing the subquery for every row in n1 by adding another condition with an estimate of the result date (assuming that there are between about four and five work days per week, and using a few extra days to be sure):
...
WHERE fld_Date BETWEEN date('2014-11-07', (5 * 4/7 - 10) || ' days')
                   AND date('2014-11-07', (5 * 5/7 + 10) || ' days')
  AND fld_WorkDay = 1
  AND (SELECT ...
SQLite: How do I subtract the value in one row from another?
Here is my example table: column_example 10 20 25 50 Here is what I would like: column_example2 10 5 25 I'm sure this is a simple question, but I haven't found the answer in the SQLite Syntax web page or via Google. EDIT: To clarify, the code would likely return the outputs for: 20-10 25-20 50-25
This solution might be slow, but I had to consider the potential gaps between succeeding rowids: http://sqlfiddle.com/#!5/daeed/1
SELECT (SELECT x
        FROM t AS t3
        WHERE t3.rowid = (SELECT MIN(tt.rowid)
                          FROM t AS tt
                          WHERE tt.rowid > t.rowid)
       ) - x AS diff
FROM t
WHERE diff IS NOT NULL
If it is guaranteed that there are no gaps between rowids, then you can use this simpler query: http://sqlfiddle.com/#!5/1f906/3
SELECT t_next.x - t.x
FROM t
INNER JOIN t AS t_next ON t_next.rowid = t.rowid + 1
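A small Python sketch with an in-memory SQLite table shows the gap-tolerant variant in action. The table t and column x follow the answer; the extra row that is inserted and then deleted is there only to force a gap in the rowids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (x INTEGER);
    INSERT INTO t VALUES (10), (20), (99), (25), (50);
    DELETE FROM t WHERE x = 99;  -- leaves a gap: rowids are now 1, 2, 4, 5
""")

# For each row, find the row with the next larger rowid and subtract.
# The last row has no successor, so its diff is NULL and is filtered out.
diffs = [row[0] for row in conn.execute("""
    SELECT diff FROM (
        SELECT t.rowid AS rid,
               (SELECT t_next.x
                FROM t AS t_next
                WHERE t_next.rowid > t.rowid
                ORDER BY t_next.rowid
                LIMIT 1) - t.x AS diff
        FROM t
    )
    WHERE diff IS NOT NULL
    ORDER BY rid
""")]

print(diffs)
```

With the gap present, a join on rowid + 1 would lose the 25 - 20 row, which is why the answer offers it only when rowids are guaranteed to be contiguous.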
Getting All the record of particular month - Building SQL Query
I need some help to build SQL Query. I have table having data like: ID Date Name 1 1/1/2009 a 2 1/2/2009 b 3 1/3/2009 c I need to get result something like... 1 1/1/2009 a 2 1/2/2009 b 3 1/3/2009 c 4 1/4/2009 Null 5 1/5/2009 Null 6 1/6/2009 Null 7 1/7/2009 Null 8 1/8/2009 Null ............................ ............................ ............................ 30 1/30/2009 Null 31 1/31/2009 Null I want query something like.. Select * from tbl **where month(Date)=1 AND year(Date)=2010** Above is not completed query. I need to get all the record of particular month, even if some date missing.. I guess there must be equi Join in the query, I am trying to build this query using Equi join Thanks
BIG EDIT: I now understand the OP's question. Use a common table expression and a left join to get this effect.
DECLARE @FirstDay DATETIME;
-- Set start time
SELECT @FirstDay = '2009-01-01';
WITH Days AS (
  SELECT @FirstDay AS CalendarDay
  UNION ALL
  SELECT DATEADD(d, 1, CalendarDay) AS CalendarDay
  FROM Days
  WHERE DATEADD(d, 1, CalendarDay) < DATEADD(m, 1, @FirstDay)
)
SELECT DATEPART(d, d.CalendarDay), d.CalendarDay, t.Name
FROM Days d
LEFT JOIN tbl t ON d.CalendarDay = t.Date
ORDER BY d.CalendarDay;
I have left the original answer below.
You need DATEPART, sir.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1
If you want to choose month and year, then you can use DATEPART twice or go for a range.
SELECT * FROM tbl WHERE DATEPART(m,Date) = 1 AND DATEPART(yyyy,Date) = 2009
Range:
SELECT * FROM tbl WHERE Date >= '2009-01-01' AND Date < '2009-02-01'
See this link for more info on DATEPART. http://msdn.microsoft.com/en-us/library/ms174420.aspx
You can use a greater-than-or-equal and a less-than comparison, like so:
select * from tbl where date >= '2009-01-01' and date < '2009-02-01'
However, it is unclear whether you want month 1 from all years. You can find more examples and functions under "Date and Time Functions" on MSDN.
Create a temporary table containing all days of that month, then do a left outer join between that table and your data table on tempTable.month = @month. Now you have a big table with all days of the desired month: the records matching the proper dates, plus empty records for the dates that have no data. I hope that's what you want.