Is it possible to get the Google Analytics event timestamp in BigQuery? - google-analytics

I'm trying to get the event timestamp from BigQuery (Google Analytics 360), but I can't seem to find the correct export field for it.
I have tried hits.eventInfo.timestamp and hits.eventInfo.datetime, but neither of them seems to work.
My query is:
SELECT
hits.eventInfo.timestamp as purchaseDate,
fullVisitorId as visitorId
FROM (tables)
WHERE LOWER(hits.eventInfo.eventAction) = 'purchase'
GROUP BY 2
ORDER BY 1 DESC

Legacy SQL:
DATE_ADD(TIMESTAMP(FORMAT_UTC_USEC(visitStartTime*1000000)), (hits.time/1000), "SECOND")
This will return YYYY-MM-DD HH:MM:SS in UTC.
Hope it helps.

There is no field for the hit timestamp, but you can calculate it from hits.time and visitStartTime. Here is the relevant row from the schema description:
hits.time INTEGER The number of milliseconds after the visitStartTime when this hit was registered. The first hit has a hits.time of 0
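In standard SQL the same calculation is simpler: visitStartTime is POSIX seconds and hits.time is milliseconds after it, so you can feed their sum to TIMESTAMP_MILLIS(). A minimal sketch, with a placeholder table name standing in for your own export table:
#standardSQL
SELECT
-- visitStartTime is seconds; hits.time is milliseconds after it
TIMESTAMP_MILLIS(visitStartTime * 1000 + hits.time) AS hit_timestamp,
fullVisitorId AS visitorId
FROM
-- placeholder table; t.hits unnests the repeated hits field
`project.dataset.ga_sessions_20170801` AS t, t.hits AS hits
WHERE LOWER(hits.eventInfo.eventAction) = 'purchase'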

Related

How to find time spent (engagement_time) on our app by the users in BigQuery?

I am trying to calculate the total time spent by users on my app. We have integrated Firebase Analytics data in BigQuery. Can I use the sum of the values of engagement_time_msec/1000 in the SELECT statement of my query? This is what I am trying:
SELECT SUM(x.value.int_value) FROM `[dataset]`, UNNEST(event_params) AS x WHERE x.key = "engagement_time_msec"
I am getting very big values after executing this query (it gives huge numbers of hours per day). I am not sure if it is OK to use SUM of engagement_time_msec for calculating the total time spent by users on the app.
I am not expecting users to spend this much time on the app. Is this the right way to calculate engagement time, or is there a better event to calculate it from?
Any help would be highly appreciated.
As per the Google Analytics docs, engagement_time_msec is defined as "The additional engagement time (ms) since the last user_engagement event". Therefore, if you only look at this field, you lose all the time spent by users before that user_engagement event is triggered.
What I would do, since ga_session_id is now defined, is grab the maximum and minimum event_timestamp for each ga_session_id, use the TIMESTAMP_DIFF() function for each session, and sum the results of all the sessions for a given day:
WITH ga_sessions AS (
SELECT
event_timestamp,
event_date,
params.value.int_value AS ga_session_id
FROM
`analytics_123456789.events_*`, UNNEST(event_params) AS params
WHERE
params.key = "ga_session_id"
),
session_length AS (
SELECT
event_date,
ga_session_id,
TIMESTAMP_DIFF(MAX(TIMESTAMP_MICROS(event_timestamp)), MIN(TIMESTAMP_MICROS(event_timestamp)), SECOND) AS session_duration_seconds
FROM
ga_sessions
WHERE
ga_session_id IS NOT NULL
GROUP BY
1, 2
),
final AS (
SELECT
event_date,
SUM(session_duration_seconds) as total_seconds_in_app
FROM
session_length
GROUP BY
1
ORDER BY
1 DESC
)
SELECT * FROM final
OUTPUT (data extracted from the app I work at):
event_date | total_seconds_in_app
-----------+--------------------
20210920 | 45600
20210919 | 43576
20210918 | 44539
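If you also want the average session length per day, a small variant of the final CTE (reusing the ga_sessions and session_length CTEs above; avg_seconds_per_session and sessions are illustrative names) could look like this:
final AS (
SELECT
event_date,
SUM(session_duration_seconds) AS total_seconds_in_app,
-- per-day average and count of the per-session durations
AVG(session_duration_seconds) AS avg_seconds_per_session,
COUNT(*) AS sessions
FROM
session_length
GROUP BY
1
)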

How to calculate quantity using BigQuery in GA

I am trying to calculate the number of transactions, quantity, and revenue for the last 14 days.
What I've got so far is:
SELECT
sum(totals.transactions) AS Transaction ,
sum(hits.product.productQuantity) AS quantity,
sum(totals.transactionRevenue)/1000000 AS Revenue
FROM TABLE_DATE_RANGE([bigquery-public-data:google_analytics_sample.ga_sessions_],
TIMESTAMP('2019-10-01'), TIMESTAMP('2019-10-14'));
I get the same transaction and revenue figures as in my custom report, but somehow I get a different number for quantity.
Am I doing something wrong, or am I using the wrong table?
I am supposed to get 63 for the quantity, but I get 2420 when I run the query above.
Thanks in advance!
The query in the question produces no results. Thanks for choosing a public dataset source (so I could run the query), but there are no tables within that time range.
The same query over an existing time period:
SELECT
sum(totals.transactions) AS Transaction ,
sum(hits.product.productQuantity) AS quantity,
sum(totals.transactionRevenue)/1000000 AS Revenue
FROM TABLE_DATE_RANGE([bigquery-public-data:google_analytics_sample.ga_sessions_],
TIMESTAMP('2017-07-01'), TIMESTAMP('2017-07-12'));
Transaction | quantity | Revenue
------------+----------+---------
317 | 27804 | 33020.66
Then I rewrote it as a #standardSQL query - to see if there is some implicit flattening that creates incorrect results. In #standardSQL you have to do explicit flattening, which I did like this:
SELECT Transaction
, (SELECT SUM((SELECT SUM(productQuantity) FROM UNNEST(product))) FROM UNNEST(hitsarray)) AS quantity
, Revenue
FROM (
SELECT SUM(totals.transactions) AS Transaction
, SUM(totals.transactionRevenue)/1000000 AS Revenue
, ARRAY_CONCAT_AGG(hits) hitsarray
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE _table_suffix BETWEEN '20170701' AND '20170712'
)
Transaction | quantity | Revenue
------------+----------+---------
317 | 27804 | 33020.66
And you can see that it gives the same results. Is the ARRAY_CONCAT_AGG() + SUM((SELECT SUM(...))) combination the correct way to unnest the hits and product data within? Well, that depends on why you were expecting a different value. Please make that clear in the question.
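One common source of this particular mismatch: if the custom report's Quantity metric is the standard ecommerce Quantity (units sold), the GA UI only counts products on purchase hits, while the queries above sum productQuantity across every hit. A hedged sketch that restricts the sum to completed purchases (eCommerceAction.action_type = '6' in the export schema):
#standardSQL
SELECT
-- sum product quantities only on purchase hits
SUM((SELECT SUM(productQuantity) FROM UNNEST(h.product))) AS purchased_quantity
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*` AS t, t.hits AS h
WHERE _table_suffix BETWEEN '20170701' AND '20170712'
AND h.eCommerceAction.action_type = '6'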

Firebase vs BigQuery Active Users Discrepancies

I've integrated my Firebase project with BigQuery. Now I'm facing a data discrepancy issue while trying to get the 1-day active users for a selected date, i.e. 20190210, with the following query in BigQuery (the alias can't start with a digit, so it is spelled out):
SELECT COUNT(DISTINCT user_pseudo_id) AS one_day_active_users_count
FROM `MY_TABLE.events_*`
WHERE event_name = 'user_engagement' AND _TABLE_SUFFIX = '20190210'
But the figures returned from BigQuery don't match the ones reported on the Firebase Analytics dashboard for the same date. Any clue what's possibly going wrong here?
The following sample query mentioned by the Firebase team, here https://support.google.com/firebase/answer/9037342?hl=en&ref_topic=7029512, is not so helpful, as it takes the current time into consideration and gets users accordingly.
N-day active users
/**
* Builds an audience of N-Day Active Users.
*
* N-day active users = users who have logged at least one user_engagement
* event in the last N days.
*/
SELECT
COUNT(DISTINCT user_id) AS n_day_active_users_count
FROM
-- PLEASE REPLACE WITH YOUR TABLE NAME.
`YOUR_TABLE.events_*`
WHERE
event_name = 'user_engagement'
-- Pick events in the last N = 20 days.
AND event_timestamp >
UNIX_MICROS(TIMESTAMP_SUB(CURRENT_TIMESTAMP, INTERVAL 20 DAY))
-- PLEASE REPLACE WITH YOUR DESIRED DATE RANGE.
AND _TABLE_SUFFIX BETWEEN '20180521' AND '20240131';
So given the small discrepancy here, I believe the issue is one of timezones.
When you're looking at a "day" in the Firebase Console, you're looking at the time interval from midnight to midnight in whatever time zone you've specified when you first set up your project. When you're looking at a "day" in BigQuery, you're looking at the time interval from midnight to midnight in UTC.
If you want to make sure you're looking at the events that match what's in your console, you should filter on the event_timestamp value in your BigQuery table (remembering that a local day can span multiple daily tables) so that it lines up with your project's time zone.
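A minimal sketch of that adjustment, assuming the project's reporting time zone is America/Los_Angeles (replace with your own) and widening the table suffix range by a day on each side to catch events that land in a neighboring UTC day:
#standardSQL
SELECT COUNT(DISTINCT user_pseudo_id) AS one_day_active_users_count
FROM `MY_TABLE.events_*`
WHERE event_name = 'user_engagement'
-- widen the UTC-based shard range, then filter on the local date
AND _TABLE_SUFFIX BETWEEN '20190209' AND '20190211'
AND DATE(TIMESTAMP_MICROS(event_timestamp), 'America/Los_Angeles') = '2019-02-10'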

BigQuery and Google Analytics UI do not match when ecommerce action filter applied

We are validating a query in BigQuery, and cannot get the results to match the Google Analytics UI. A similar question can be found here, but in our case the mismatch only occurs when we apply a specific filter on ecommerce_action.action_type.
Here is the query:
SELECT COUNT(DISTINCT CONCAT(fullVisitorId, STRING(visitid))) AS sessions
FROM (
SELECT
device.browserVersion,
geoNetwork.networkLocation,
geoNetwork.networkDomain,
geoNetwork.city,
geoNetwork.country,
geoNetwork.continent,
geoNetwork.region,
device.browserSize,
visitNumber,
trafficSource.source,
trafficSource.medium,
fullvisitorId,
visitId,
device.screenResolution,
device.flashVersion,
device.operatingSystem,
device.browser,
totals.pageviews,
channelGrouping,
totals.transactionRevenue,
totals.timeOnSite,
totals.newVisits,
totals.visits,
date,
hits.eCommerceAction.action_type
FROM
(select *
from TABLE_DATE_RANGE([zzzzzzzzz.ga_sessions_],
<range>) ))t
WHERE
hits.eCommerceAction.action_type = '2' and <stuff to remove bots>
)
From the UI, using the built-in Shopping Behavior report, we get 3.836M unique sessions with a product detail view, compared with 3.684M unique sessions in BigQuery using the query above.
A few questions:
1) We are under the impression that the Shopping Behavior report's "Sessions with Product View" breakdown is based on the ecommerce_action.action_type filter. Is that true?
2) Is there a pre-aggregated totals table that the UI may be pulling from?
It sounds like the issue is that COUNT(DISTINCT ...) is approximate when using legacy SQL, as noted in the migration guide, so the counts are not accurate. Either use standard SQL instead (preferred) or use EXACT_COUNT_DISTINCT with legacy SQL.
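For reference, a hedged legacy SQL sketch of the exact-count variant, keeping the question's table and bot-filter placeholders as-is:
SELECT EXACT_COUNT_DISTINCT(CONCAT(fullVisitorId, STRING(visitId))) AS sessions
FROM TABLE_DATE_RANGE([zzzzzzzzz.ga_sessions_], <range>)
WHERE hits.eCommerceAction.action_type = '2' AND <stuff to remove bots>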
You're including product list views in your query.
As described in https://support.google.com/analytics/answer/3437719, you need to make sure that no product has isImpression = TRUE, because that would mean it is a product list view.
This query sums all sessions which contain at least one hit with action_type = '2' for which all isImpression values are null or false:
SELECT
SUM(totals.visits) AS sessions
FROM
`project.123456789.ga_sessions_20180101` AS t
WHERE
(
SELECT
LOGICAL_OR(h.eCommerceAction.action_type = '2')
FROM
t.hits AS h
WHERE
(SELECT LOGICAL_AND(isImpression IS NULL OR isImpression = FALSE) FROM h.product))
For legacySQL you can adapt the example in the documentation.
In addition to the fact that COUNT(DISTINCT ...) is approximate when using legacy SQL, there could be sessions containing only non-interactive hits. These are not counted as sessions in the Google Analytics UI, but they are counted by both COUNT(DISTINCT ...) and EXACT_COUNT_DISTINCT(...), because your query counts visit IDs.
Using SUM(totals.visits) you should get the same result as in the UI, because SUM ignores NULL values of totals.visits (which correspond to sessions containing only non-interactive hits).

Firebase Analytics: export data to BigQuery as a date

I'm trying to log events and record the moment they happened.
For example:
Firebase.Analytics.FirebaseAnalytics.LogEvent ("EventGamePlayed", "sent_at", DateTime.Now.ToString ("yyyy-MM-dd hh:mm:ss"));
I would like to know whether, once the data is exported to BigQuery, I can use this parameter as a date/timestamp so that I can, for example, get all the X or Y events that happened last month.
Thanks!
You probably don't need this, actually. Referring to the Firebase schema for BigQuery exports, you can use either date or timestamp_micros within event_dim. For example, to find events on April 1, you could do:
#standardSQL
SELECT event
FROM YourTable
CROSS JOIN UNNEST(event_dim) AS event
WHERE PARSE_DATE('%Y%m%d', event.date) = '2017-04-01';
To find events that occurred between 12pm and 4pm UTC on April 1, you could do:
#standardSQL
SELECT event
FROM YourTable
CROSS JOIN UNNEST(event_dim) AS event
WHERE TIMESTAMP_MICROS(event.timestamp_micros) BETWEEN
'2017-04-01 12:00:00' AND '2017-04-01 16:00:00';
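The same pattern covers the "all the events that happened last month" case from the question; a sketch using the date field, assuming the same event_dim export schema as above:
#standardSQL
SELECT event
FROM YourTable
CROSS JOIN UNNEST(event_dim) AS event
-- the string literals are coerced to DATE for the comparison
WHERE PARSE_DATE('%Y%m%d', event.date) BETWEEN '2017-04-01' AND '2017-04-30';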
