How to get gender and age in BigQuery from Firebase Analytics? - firebase

I am using Analytics Events and trying to take advantage of the user data.
I can get most of the data I need with this query:
SELECT
*
FROM
`test-project-23471.analytics_205774787.events_20191120`,
UNNEST(event_params) AS event_params
WHERE
event_name ='select_content'
AND event_params.value.string_value = 'a_item_open'
However, I don't need all of it, so I did:
SELECT
event_params.value.string_value,
event_previous_timestamp,
device,
geo,
app_info
FROM
`test-project-23471.analytics_205774787.events_20191120`,
UNNEST(event_params) AS event_params
WHERE
event_name ='select_content'
AND event_params.value.string_value = 'a_item_open'
And then I realized that the result doesn't have gender or age data, even though the documentation says Firebase collects this information automatically. I'd like to combine gender and age (or age group) with the result of the query above.
How can I get it?

Note that the document you mention is just an example of how to query Bigtable data with BigQuery; it is not about Firebase.
The Firebase export schema has a RECORD field named "user_properties", which contains a "key" STRING field.
Thus, you could try:
SELECT DISTINCT user_properties.key
FROM
`test-project-23471.analytics_205774787.events_20191120`,
UNNEST(user_properties) AS user_properties
to retrieve the correct name for the gender property and include it in your query. For instance:
SELECT
event_params.value.string_value,
event_previous_timestamp,
device,
geo,
app_info,
user_properties.value.string_value as gender
FROM
`test-project-23471.analytics_205774787.events_20191120`,
UNNEST(event_params) AS event_params,
UNNEST(user_properties) AS user_properties
WHERE
event_name = 'select_content'
AND event_params.value.string_value = 'a_item_open'
AND user_properties.key = "Gender"
Nevertheless, if you don't find the gender info, please consider this. Otherwise, I suggest reaching out to Firebase support.
Hope it helps.

For privacy reasons these fields are not available in the BigQuery export. You can only see aggregated data for gender and age in the Firebase Analytics console.
You can't even use them for targeting in other Firebase features, like RemoteConfig, so user-level granularity is not possible.

Related

Streaming Google Analytics 4 data to BigQuery causing data collection issues

We have configured a link between the GA4 property and Google BigQuery via the GA interface (without any additional code). It works fine and we see the migrated data in the BigQuery tables, but we face an issue with how this data is written into those tables.
If we look at any table, we can see that events from different users are recorded in one session (with different clientIDs, and even different userIDs, which we pass when authorizing a user). See an example.
This is a result of executing following query:
SELECT
event_name,
user_pseudo_id,
user_id,
device.category,
device.mobile_brand_name,
device.mobile_model_name,
device.operating_system_version,
geo.region,
geo.city,
params.key,
params.value.int_value
FROM `%project_name%.analytics_256374149.events_20210331`, unnest(event_params) AS params
WHERE event_name="page_view"
AND params.value.int_value=1617218965
ORDER BY event_timestamp
As a result, you can see that within one session different users from different regions, with different devices and identifiers are combined. It is, of course, impossible to use such data for reporting purposes. Once again, it is a default GA4 → BigQuery setup in the GA4 interface (no add-ons).
We do not understand what the error is (in import, in requests, or somewhere else) and would like to get advice on this issue.
Thanks.
You should look at the combination of user_pseudo_id and the event_param ga_session_id. This combination is unique and used for measuring unique sessions across a property.
For example, this query counts the number of unique event names in each session:
SELECT
user_pseudo_id,
(SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS ga_session_id,
COUNT(DISTINCT event_name) AS unique_event_name_count
FROM `<project>.<dataset>.events_*`
GROUP BY user_pseudo_id, ga_session_id

Is ga_session_id a unique identifier in Google Analytics?

I checked the official docs here:
https://support.google.com/firebase/answer/7061705?hl=en
But, when I checked my data, there are several user_id and user_pseudo_id in one ga_session_id.
How is it possible?
ga_session_id is not supposed to be globally unique (afaik it's based on a skewed in-device timestamp), but in most circumstances (except for edge cases) it should be locally unique for a given user_pseudo_id.
session_id is not unique (two or more users can have the same session_id).
It is just the timestamp when the session started, so we need to concatenate session_id with user_pseudo_id or user_id.
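As a sketch of that concatenation (the project and dataset names are placeholders, following the convention used elsewhere in this thread), a property-wide unique session key could be built like this:

```sql
-- Combine user_pseudo_id with the ga_session_id event parameter
-- to form a session key that is unique across the whole property.
SELECT
  CONCAT(
    user_pseudo_id, '.',
    CAST((SELECT value.int_value FROM UNNEST(event_params)
          WHERE key = 'ga_session_id') AS STRING)
  ) AS unique_session_id,
  COUNT(*) AS events_in_session
FROM `<project>.<dataset>.events_20210331`
GROUP BY unique_session_id
```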
I think the problem is the query used, with this I see the unique values associated correctly in my data:
SELECT event_timestamp, user_id, user_pseudo_id, event_name, event_params.value.int_value AS session_id
FROM `MYTABLE`,
UNNEST (event_params) AS event_params
WHERE event_params.key = "ga_session_id" LIMIT 1000

How to write a query to find Firebase event details in the last 28 days using BigQuery, with platform, stream_id and event_name filters, in Power BI?

Firebase console first_open result: 8,787 - 8,575 (screenshot not reproduced here)
I'm trying to create a query to get the event details using BigQuery, but it doesn't produce the same result.
My Query is
select platform, count(s.platform) from (SELECT * FROM `Table.events_*` where event_name = "first_open" and stream_id = "1757261196" or stream_id = "1759866139"
UNION ALL
SELECT * FROM `Table.events_intraday_*` where event_name = "first_open" and stream_id = "1757261196" or stream_id = "1759866139" ) s where and event_date between "20191204" and "20200101" group by s.platform
My filter is
Stream_id = ["1757261196","1759866139"]
platform = ["ios","android"]
dateRanges = last 28days
event_name = first_open
BigQuery Result:
[
{
"platform": "ANDROID",
"f0_": "428"
},
{
"platform": "IOS",
"f0_": "38"
}
]
But the Firebase console output and the BigQuery output are different. I think it is due to a query issue; please help me write the correct query.
Your query has some missing pieces in the WHERE statement, so I'm not even sure the query you shared actually runs.
From what I can observe, your WHERE statement also has issues with the precedence of the AND and OR operators.
What you have is:
SELECT * FROM my_table WHERE event_name = "first_open" AND stream_id = "1757261196" OR stream_id = "1759866139"
This returns the union of two sets:
One with event_name = "first_open" AND stream_id = "1757261196"
A second one with only stream_id = "1759866139"
This means the event_name condition is not applied to the second set.
I recommend you to use the following structure:
SELECT * FROM my_table WHERE event_name = "first_open" AND (stream_id = "1757261196" OR stream_id = "1759866139")
This way you group the stream_id conditions, which are the only ones that should be affected by the OR operator, and the event_name condition is always applied.
After this take a good look at the final WHERE:
...) s where and event_date between "20191204" and "20200101" group by s.platform
This may not work as you expect because of the data type of that column and how you pass the values to BETWEEN. Make sure the column is a DATE type and the values are in the same format; you can always cast with DATE() if it is something else. Also note the stray AND right after WHERE, which makes the query invalid as written.
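As a hedged sketch of the corrected query: in the GA4/Firebase export, event_date is a STRING in YYYYMMDD format, so one option is to parse it into a DATE rather than compare strings:

```sql
SELECT s.platform, COUNT(*) AS first_opens
FROM `Table.events_*` AS s
WHERE event_name = "first_open"
  AND (stream_id = "1757261196" OR stream_id = "1759866139")
  AND PARSE_DATE('%Y%m%d', event_date)
      BETWEEN DATE '2019-12-04' AND DATE '2020-01-01'
GROUP BY s.platform
```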
EDIT:
After you link a project to BigQuery, the first daily export of events creates a corresponding dataset in the associated BigQuery project. Then, each day, raw event data for each linked app populates a new daily table in the associated dataset, and raw event data is streamed into a separate intraday BigQuery table in real time. Data prior to linking to BigQuery is not available for import (except for Performance Monitoring data). By default, all web data from your App + Web properties in Google Analytics will be exported as well.
Source
The query seems to be OK, there is only one more consideration to make:
Be careful when you use wildcards: Table.events_* also matches Table.events_intraday_* if they are in the same dataset in BigQuery. This can lead to duplicated data in your query and will cause a mismatch in your counts.
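One way to avoid that double counting (a sketch, using the placeholder table name from the question) is to filter the wildcard on _TABLE_SUFFIX so the intraday tables are excluded:

```sql
-- For tables matched by events_*, the suffix of
-- events_intraday_20191204 is "intraday_20191204",
-- so a prefix filter drops all intraday tables.
SELECT platform, COUNT(*) AS first_opens
FROM `Table.events_*`
WHERE _TABLE_SUFFIX NOT LIKE 'intraday%'
  AND event_name = "first_open"
GROUP BY platform
```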
Besides that, I recommend you to follow the next steps, with the idea that the issue is not the query:
Verify that the tables for every single day you are querying exist in BigQuery. The smaller count looks like it could come from a single day rather than the full range.
Validate that the BigQuery tables contain the same data as the "Events" section in Firebase; you could be comparing two different datasets, in which case the numbers will never match.

BigQuery Crashlytics - Crash free users / sessions

I've linked my Firebase Crashlytics data to BigQuery and set up the Data Studio templates provided by Google. There's a lot of great data in there, except the most important metrics required for my dashboard: crash-free users and crash-free sessions as a percentage.
Nothing stands out in the schema that could be used to calculate this.
Any ideas how I might get this value? It's displayed in the Firebase dashboard, so it must be available...
I looked into the documentation and found event_name = 'app_exception'. With that you can write a query like:
WITH userCrashes AS (
SELECT user_pseudo_id, MAX(event_name = 'app_exception') hasCrash
FROM `firebase-public-project.analytics_153293282.events_20181003`
GROUP BY 1
)
SELECT
IF(hasCrash,'crashed','crash-free') crashState,
COUNT(DISTINCT user_pseudo_id) AS users,
ROUND(COUNT(DISTINCT user_pseudo_id) / SUM(COUNT(DISTINCT user_pseudo_id)) OVER (),2) AS userShare
FROM userCrashes
GROUP BY 1
But there is also a 'fatal' flag in the event parameters. In the example data it's always true, but in case you want to take it into account, you could do something like:
WITH userCrashes AS (
SELECT
user_pseudo_id,
MAX(event_name = 'app_exception') hasCrash,
MAX(event_name = 'app_exception'
AND (select value.int_value=1 from unnest(event_params) where key='fatal')
) hasFatalCrash
FROM `firebase-public-project.analytics_153293282.events_20181003`
GROUP BY 1
)
SELECT
IF(hasCrash,'crashed','crash-free') crashState,
IF(hasFatalCrash,'crashed fatal','crash-free') fatalCrashState,
COUNT(DISTINCT user_pseudo_id) AS users,
ROUND(COUNT(DISTINCT user_pseudo_id) / SUM(COUNT(DISTINCT user_pseudo_id)) OVER (),2) AS userShare
FROM userCrashes
GROUP BY 1,2
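For the crash-free sessions part of the question, here is a sketch under the same assumptions, approximating a session by user_pseudo_id plus the ga_session_id event parameter (as used in other answers in this thread) instead of grouping by user alone:

```sql
WITH sessionCrashes AS (
  SELECT
    user_pseudo_id,
    (SELECT value.int_value FROM UNNEST(event_params)
     WHERE key = 'ga_session_id') AS ga_session_id,
    MAX(event_name = 'app_exception') AS hasCrash
  FROM `firebase-public-project.analytics_153293282.events_20181003`
  GROUP BY 1, 2
)
SELECT
  IF(hasCrash, 'crashed', 'crash-free') AS crashState,
  COUNT(*) AS sessions,
  ROUND(COUNT(*) / SUM(COUNT(*)) OVER (), 2) AS sessionShare
FROM sessionCrashes
GROUP BY 1
```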
Disclaimer: I never worked with firebase, so this is all just based on documentation and example data. Hope it helps, though.

BigQuery Export schema - how to export complete event column from firebase

I already linked Firebase to BigQuery, and every day a new table gets created with the date stamp. The columns within the export can be found at the following link: https://support.google.com/firebase/answer/7029846?hl=en
But the Firebase Analytics event data (such as add_product_like, add_product_to_cart, and so on) is not being exported. How can I export the complete data into BigQuery?
Firebase Analytics data is already exported in those daily tables created each day in BigQuery.
What is required here is to run queries in order to extract the relevant data.
Take a look at this doc for sample queries for when Firebase data is exported to BigQuery.
In short, you need to make use of the schema: based on the field name, you can query the data Firebase loads into BigQuery.
When you submit an event with parameters to Firebase Analytics, it is stored as an array in the column event_dim.params.
To get the data, you will need to use this query (I'm using standard SQL):
SELECT
event_dim.name AS event_name,
event_dim.params AS event_params
FROM
`project.your_app.app_events_20171109`,
UNNEST(event_dim) as event_dim
If you want to get a specific parameter, you'll also have to unnest another field:
SELECT
event_dim.name AS event_name,
event_dim.params AS event_params
FROM
`project.your_app.app_events_20171109`,
UNNEST(event_dim) as event_dim,
UNNEST(event_dim.params) as params
WHERE params.key LIKE "add_product_to_cart"
You can read more about how Firebase Analytics stores data and how to use UNNEST function here: https://firebase.googleblog.com/2017/03/bigquery-tip-unnest-function.html
