How to query a Firebase event report with an hourly breakdown in BigQuery? - firebase

The BigQuery query below extracts registration event data by date. How can I extract the same data by hour? Please help.
select
app_info.id,app_info.version,traffic_source.medium,
traffic_source.source,geo.country,geo.city,
event_date,event_name,
param1.value.string_value as category,
FROM `<Table>`,
UNNEST(event_params) as param1
where _TABLE_SUFFIX BETWEEN '20190401' AND '20190402'
and event_name='Registration'
and param1.key='category'
group by 1,2,3,4,5,6,7,8,9

In your SELECT, change event_date to TIMESTAMP_TRUNC(TIMESTAMP_MICROS(event_timestamp), HOUR).
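Applied to the query from the question, that change might look like the sketch below. The alias event_hour and the COUNT(*) column are illustrative additions, not part of the original query, and the result is in UTC because event_timestamp is stored as microseconds since the Unix epoch.

```sql
SELECT
  app_info.id,
  app_info.version,
  traffic_source.medium,
  traffic_source.source,
  geo.country,
  geo.city,
  -- Convert microseconds to a TIMESTAMP, then truncate to the start of the hour (UTC).
  TIMESTAMP_TRUNC(TIMESTAMP_MICROS(event_timestamp), HOUR) AS event_hour,
  event_name,
  param1.value.string_value AS category,
  -- Illustrative addition: number of matching events per group.
  COUNT(*) AS event_count
FROM `<Table>`,
  UNNEST(event_params) AS param1
WHERE _TABLE_SUFFIX BETWEEN '20190401' AND '20190402'
  AND event_name = 'Registration'
  AND param1.key = 'category'
GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9
```

To report hours in a local time zone instead, TIMESTAMP_TRUNC accepts an optional third argument with the time zone name.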

Related

How to find time spent (engagement_time) on our app by the users in BigQuery?

I am trying to calculate the total time spent by users on my app. We have integrated firebase analytics data in BigQuery. Can I use the sum of the values of engagement_time_msec/1000 in the select statement of my query? This is what I am trying :
SELECT SUM(x.value.int_value) FROM "[dataset]", UNNEST(event_params) AS x WHERE x.key = "engagement_time_msec"
I am getting very large values from this query (it returns a huge number of hours per day). I am not sure whether it is OK to use SUM("engagement_time_msec") to calculate the total time users spend in the app.
I do not expect users to be spending this much time in the app. Is this the right way to calculate engagement time, or is there a better event to use for it?
Any help would be highly appreciated.
According to the Google Analytics docs, engagement_time_msec is defined as "The additional engagement time (ms) since the last user_engagement event". Therefore, if you only look at this field, you lose all the time users spent before the mentioned user_engagement event was triggered.
What I'd do, since now ga_session_id is defined, would be to grab the maximum and minimum for each ga_session_id timestamp, use the TIMESTAMP_DIFF() function for each case, and sum the results of all the sessions for a given day:
WITH ga_sessions AS (
SELECT
event_timestamp,
event_date,
params.value.int_value AS ga_session_id
FROM
`analytics_123456789.events_*`, UNNEST(event_params) AS params
WHERE
params.key = "ga_session_id"
),
session_length AS (
SELECT
event_date,
TIMESTAMP_DIFF(MAX(TIMESTAMP_MICROS(event_timestamp)), MIN(TIMESTAMP_MICROS(event_timestamp)), SECOND) AS session_duration_seconds
FROM
ga_sessions
WHERE
ga_session_id IS NOT NULL
GROUP BY
1
),
final AS (
SELECT
event_date,
SUM(session_duration_seconds) as total_seconds_in_app
FROM
session_length
GROUP BY
1
ORDER BY
1 DESC
)
SELECT * FROM final
OUTPUT (data extracted from the app I work at):
event_date | total_seconds_in_app
-----------+--------------------
20210920 | 45600
20210919 | 43576
20210918 | 44539
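Note that, as written, session_length groups only by event_date, so MAX minus MIN measures the span between the first and last event of each day rather than the length of each session. A per-session variant could look like the sketch below. It assumes that user_pseudo_id together with ga_session_id identifies one session, and it attributes a session to the date of its first event (session_date is a name introduced here for illustration).

```sql
WITH ga_sessions AS (
  SELECT
    event_date,
    user_pseudo_id,
    event_timestamp,
    params.value.int_value AS ga_session_id
  FROM
    `analytics_123456789.events_*`, UNNEST(event_params) AS params
  WHERE
    params.key = "ga_session_id"
),
session_length AS (
  SELECT
    user_pseudo_id,
    ga_session_id,
    -- Assumption: a session belongs to the day of its first event.
    MIN(event_date) AS session_date,
    TIMESTAMP_DIFF(
      MAX(TIMESTAMP_MICROS(event_timestamp)),
      MIN(TIMESTAMP_MICROS(event_timestamp)),
      SECOND) AS session_duration_seconds
  FROM
    ga_sessions
  WHERE
    ga_session_id IS NOT NULL
  -- One row per user and session, not one row per day.
  GROUP BY
    user_pseudo_id, ga_session_id
)
SELECT
  session_date,
  SUM(session_duration_seconds) AS total_seconds_in_app
FROM
  session_length
GROUP BY
  session_date
ORDER BY
  session_date DESC
```

Totals from this variant will generally differ from the figures shown above, since those were produced by the per-day grouping.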

BigQuery + Google Analytics: Calculating quantities purchased by SKU. UNNEST not working

I'm trying to calculate the total quantities purchased for individual SKU's between certain dates. Final output should be date / SKU / Qty_sold
My dataset is the Google Analytics sample public dataset.
Main issue: When I try to run the below query using item.itemQuantity, I get the below error:
Syntax error: Unexpected keyword UNNEST at [6:1]
If you see the screenshot for item.itemQuantity, it seems to be nested. By adding the UNNEST function, it's supposed to flatten the table and get the count. This is my understanding of UNNEST. However, when I apply UNNEST, the query doesn't run.
Second issue: When I check the BQ GA schema, the definitions for hits.item.itemQuantity and hits.product.productQuantity seem to be the same? and I'm unable to differentiate between the two fields and which one should I use in my query.
https://support.google.com/analytics/answer/3437719?hl=en
hits.product.productQuantity INTEGER The quantity of the product purchased.
hits.item.itemQuantity INTEGER The quantity of the product sold.
Can anyone please explain how I can improve this query to get my desired result ? Thx.
SELECT
date,
hits.item.productSKU AS SKU,
SUM(hits.item.itemQuantity) AS qty_sold
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`
UNNEST (hits) hit
WHERE _TABLE_SUFFIX
BETWEEN
'20160801' AND '20160802'
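Your query is missing the comma before UNNEST (hence the syntax error) and a GROUP BY clause, and it declares the alias hit but then references hits. Try the query below for hits.product: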
SELECT
date,
prod.productSKU AS SKU,
SUM(prod.productQuantity) AS qty_purchased
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hit, UNNEST(product) prod
WHERE _TABLE_SUFFIX BETWEEN '20160801' AND '20160802'
GROUP BY date, SKU
Or the query below for hits.item:
SELECT
date,
hit.item.productSKU AS SKU,
SUM(hit.item.itemQuantity) AS qty_sold
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hit
WHERE _TABLE_SUFFIX BETWEEN '20160801' AND '20160802'
GROUP BY date, SKU

How to get gender and age in BigQuery from Firebase Analytics?

I am using Analytics Events and trying to take advantage of the user data.
I can get quite a lot of data with this query:
SELECT
*
FROM
`test-project-23471.analytics_205774787.events_20191120`,
UNNEST(event_params) AS event_params
WHERE
event_name ='select_content'
AND event_params.value.string_value = 'a_item_open'
However, I don't need all of it, so I narrowed the query down:
SELECT
event_params.value.string_value,
event_previous_timestamp,
device,
geo,
app_info
FROM
`test-project-23471.analytics_205774787.events_20191120`,
UNNEST(event_params) AS event_params
WHERE
event_name ='select_content'
AND event_params.value.string_value = 'a_item_open'
And then, I realized that the result doesn't have gender data and age data. And in the document, it says Firebase automatically gets the information. I'd like to combine sex, age(or age group) with the result from the query above.
How can I get it?
Note that this document is just an example on how to query BigTable data by using BigQuery and not from Firebase.
The Firebase layout mentions that it has a RECORD field named "user_properties" which has a "key" STRING field.
Thus, you could try:
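SELECT DISTINCT user_property.key
FROM
`test-project-23471.analytics_205774787.events_20191120`,
-- user_properties is a repeated RECORD, so it has to be unnested first
UNNEST(user_properties) AS user_property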
This retrieves the correct name for the gender/sex property, which you can then include in your query. For instance:
SELECT
event_params.value.string_value,
event_previous_timestamp,
device,
geo,
app_info,
user_property.value.string_value AS gender
FROM
`test-project-23471.analytics_205774787.events_20191120`,
UNNEST(event_params) AS event_params,
-- user_properties is a repeated RECORD, so it has to be unnested as well
UNNEST(user_properties) AS user_property
WHERE
event_name ='select_content'
AND event_params.value.string_value = 'a_item_open'
AND user_property.key = "Gender"
Nevertheless, if you don't find the Gender info, please consider this. Otherwise, I suggest contacting Firebase support.
Hope it helps.
For privacy reasons these fields are not available in BigQuery export. You can only see aggregated data for gender and age in Firebase Analytics console.
You can't even use them for targeting in other Firebase features, like RemoteConfig, so user-level granularity is not possible.

Firebase vs BigQuery Active Users Discrepancies

I've integrated my Firebase project with BigQuery. Now I'm facing a data discrepancy issue while trying to get 1 day active users, for the selected date i.e. 20190210, with following query from BigQuery;
SELECT COUNT(DISTINCT user_pseudo_id) AS one_day_active_users_count
FROM `MY_TABLE.events_*`
WHERE event_name = 'user_engagement' AND _TABLE_SUFFIX = '20190210'
But the figures returned from BigQuery don't match the ones reported on the Firebase Analytics dashboard for the same date. Any clue what could be going wrong here?
The following sample query from the Firebase team, found at https://support.google.com/firebase/answer/9037342?hl=en&ref_topic=7029512, is not much help because it takes the current time into consideration and selects users relative to it.
N-day active users
/**
* Builds an audience of N-Day Active Users.
*
* N-day active users = users who have logged at least one user_engagement
* event in the last N days.
*/
SELECT
COUNT(DISTINCT user_id) AS n_day_active_users_count
FROM
-- PLEASE REPLACE WITH YOUR TABLE NAME.
`YOUR_TABLE.events_*`
WHERE
event_name = 'user_engagement'
-- Pick events in the last N = 20 days.
AND event_timestamp >
UNIX_MICROS(TIMESTAMP_SUB(CURRENT_TIMESTAMP, INTERVAL 20 DAY))
-- PLEASE REPLACE WITH YOUR DESIRED DATE RANGE.
AND _TABLE_SUFFIX BETWEEN '20180521' AND '20240131';
So given the small discrepancy here, I believe the issue is one of timezones.
When you're looking at a "day" in the Firebase Console, you're looking at the time interval from midnight to midnight in whatever time zone you've specified when you first set up your project. When you're looking at a "day" in BigQuery, you're looking at the time interval from midnight to midnight in UTC.
If you want to make sure you're looking at the events that match up with what's in your console, you should query the event_timestamp value in your BigQuery table (and remember that it might span multiple tables) to match up with what's in your timezone.
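As a rough sketch of that idea, the query below counts active users for 2019-02-10 as a local calendar day. The time zone 'America/Los_Angeles' is only a placeholder assumption for whatever zone the project reports in, and the table range is widened by one day on each side because a local day can straddle two UTC days.

```sql
SELECT
  COUNT(DISTINCT user_pseudo_id) AS one_day_active_users_count
FROM
  `MY_TABLE.events_*`
WHERE
  event_name = 'user_engagement'
  -- Scan the neighbouring daily tables too: the local day overlaps them.
  AND _TABLE_SUFFIX BETWEEN '20190209' AND '20190211'
  -- Keep only events whose local date (placeholder time zone) is the target day.
  AND DATE(TIMESTAMP_MICROS(event_timestamp), 'America/Los_Angeles') = DATE '2019-02-10'
```

Replace the time zone string with the one configured for your project before comparing against the console.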

Firebase Analytics: export data to BigQuery as a date

I'm trying to log events and record the moment each event happened.
For example:
Firebase.Analytics.FirebaseAnalytics.LogEvent ("EventGamePlayed", "sent_at", DateTime.Now.ToString ("yyyy-MM-dd hh:mm:ss"));
I would like to know if it's possible that, once it's exported to BigQuery, I could use this parameter as a date / timestamp so I can, for example, get all the X or Y events that happened last month.
Thanks!
You probably don't need this, actually. Referring to the Firebase schema for BigQuery exports, you can use either date or timestamp_micros within event_dim. For example, to find events on April 1, you could do:
#standardSQL
SELECT event
FROM YourTable
CROSS JOIN UNNEST(event_dim) AS event
WHERE PARSE_DATE('%Y%m%d', event.date) = '2017-04-01';
To find events that occurred between 12pm and 4pm UTC on April 1, you could do:
#standardSQL
SELECT event
FROM YourTable
CROSS JOIN UNNEST(event_dim) AS event
WHERE TIMESTAMP_MICROS(event.timestamp_micros) BETWEEN
'2017-04-01 12:00:00' AND '2017-04-01 16:00:00';
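For the "all events that happened last month" case from the question, a sketch along the same lines could derive the month boundaries from the current date. It assumes the same event_dim schema and placeholder table name as the queries above, and it works in UTC.

```sql
#standardSQL
SELECT event
FROM YourTable
CROSS JOIN UNNEST(event_dim) AS event
WHERE DATE(TIMESTAMP_MICROS(event.timestamp_micros))
  -- First day of the previous month ...
  BETWEEN DATE_SUB(DATE_TRUNC(CURRENT_DATE(), MONTH), INTERVAL 1 MONTH)
  -- ... through the last day of the previous month.
  AND DATE_SUB(DATE_TRUNC(CURRENT_DATE(), MONTH), INTERVAL 1 DAY);
```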
