Merging datasets in BigQuery - Firebase

After upgrading to Google Analytics in my Firebase project I linked to a new GA property and had to set up my BigQuery integration again (after I accidentally linked/unlinked my GA account, the integration was turned off). I fixed it by linking again, but now all new data is fed into a new analytics_* dataset.
Since all my queries refer to the old dataset, it would take considerable effort to rewrite them to point at both the new dataset and the old one. Is it possible to either:
Change the destination dataset of the Firebase BigQuery export (choosing the old dataset instead of the newly created one)
Somehow merge the two datasets (instead of copying them)
I understand it's impossible to rename datasets, which would otherwise solve my issue: I could give the new dataset the old name and copy the contents of the old dataset into it.
UPDATE
I was able to fix it by:
unlinking the project again
using the Firebase Management API to link my Firebase project again to the original GA property
https://firebase.google.com/docs/projects/api/reference/rest/v1beta1/projects/addGoogleAnalytics#request-body
This started feeding data back into my old dataset. I then copied the partitioned tables from the newly created dataset back into the old one (in BigQuery), using the same naming convention (e.g. events_20190101), which copied them correctly into the partitioned dataset. I also had to append some intraday events to the existing partitioned dataset, but this solved my problem in the end.
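For reference, the per-day table copies described above can be done with BigQuery DDL. This is a sketch with placeholder names - assume the old export dataset is analytics_111111 and the new one is analytics_222222:

```sql
-- Copy one day's export table from the new dataset back into the old
-- one, keeping the events_YYYYMMDD naming convention.
-- my-project, analytics_111111 and analytics_222222 are placeholders.
CREATE TABLE `my-project.analytics_111111.events_20190101`
COPY `my-project.analytics_222222.events_20190101`;
```

The `bq cp` CLI command does the same thing and is easier to script over a range of dates.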

According to the BigQuery Export schema documentation, for each Firebase project linked to BigQuery a single dataset named analytics_<property_id> is added to your BigQuery project, where <property_id> is the Analytics property ID shown in the Analytics settings in Firebase (Settings -> Integrations -> Google Analytics).
It seems this ID is generated automatically when you set up a property, and it can be manually changed to a custom one.
Additionally, there is no way to merge datasets other than copying the data between them. You could therefore consider using scheduled queries to append the data to your old dataset.
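A scheduled query of that kind might look like the sketch below (the project and dataset IDs are hypothetical placeholders); you would configure the old dataset's daily table as the destination with the WRITE_APPEND disposition:

```sql
-- Runs daily; selects yesterday's events from the new export dataset
-- so they can be appended into the old dataset's daily table.
-- my-project and analytics_222222 are placeholder names.
SELECT *
FROM `my-project.analytics_222222.events_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d',
    DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
```

Scheduled queries can template the destination table name per run (e.g. with the run date), which keeps the sharded events_YYYYMMDD layout intact.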
Hope it helps

BigQuery Firebase intraday events not deleted completely

I moved a dataset from the US region to the EU region, following the instructions given by Google.
If you choose the wrong region and need to change it after you've created the link:
Delete the link to BigQuery (see below).
Backup the data to another dataset in BigQuery (move or copy).
Delete the original dataset. Take note of the name: you'll need it in the next step.
Create a new dataset with the same name as the dataset you just deleted, and select the location for the data.
Copy the backup data into the new dataset.
Repeat the procedure above to create a new link to BigQuery.
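Steps 2 and 4 above can be sketched in BigQuery DDL (all names are placeholders). Note that step 5, copying the backup into the recreated EU dataset, is a cross-region copy, which plain table copies do not support - the cross-region dataset copy feature is needed there:

```sql
-- Step 2: back up the export tables into another dataset
-- in the SAME region (plain table copies only work within a region).
-- my-project, analytics_backup and analytics_123456 are placeholders.
CREATE TABLE `my-project.analytics_backup.events_20190101`
COPY `my-project.analytics_123456.events_20190101`;

-- Step 4: recreate the dataset under its original name,
-- this time located in the EU.
CREATE SCHEMA `my-project.analytics_123456`
OPTIONS (location = 'EU');
```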
After changing the location, you'll have a gap in your data: streaming and daily exports of data will not process between deletion of the existing link and creation of the new link.
After creating the new dataset in the EU and activating the link, I have the following issues: the intraday tables haven't been deleted completely in the following days and, also, the backup datasets I created (the copies) are being updated by Firebase exports.
What is happening with the exports?

Use existing data streams to segment data from each individual app

From the old version of analytics I have 1 property with many data streams.
Each data stream represents 1 Android app.
The reports are not relevant as they aggregate all data streams with no way to segment each individual app.
How can I get separate reporting for each app using the same data streams, i.e. WITHOUT creating new Firebase config files?
I have already tried creating a new property, but when adding the data stream there is no way to use an existing one.
Just found out a way to get what I need:
In a new report, click All Users, then set the Audience Name to Stream ID and the Dimension Value to the app you want the report for.

GA4 to BigQuery - dataset getting created but no tables are found

I recently connected my Google Analytics (GA4) property to BigQuery and set up a daily export of the data. The data doesn't seem to get exported, as I don't find any tables under the dataset in BigQuery. Things I tried:
Deleting the link and the dataset and starting from scratch again. - Didn't work.
Checking the service accounts - all were present and it still didn't work.
Issue: the dataset is getting created, but the tables that should contain the raw data are not.
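As a quick check that the export really has produced no tables, you can list the dataset's tables via INFORMATION_SCHEMA (my-project and analytics_123456 are placeholder names):

```sql
-- Lists every table in the export dataset; an empty result means
-- no daily or intraday tables have been written yet.
-- analytics_123456 is a placeholder dataset ID.
SELECT table_name
FROM `my-project.analytics_123456.INFORMATION_SCHEMA.TABLES`
ORDER BY table_name;
```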

Change default data collected by Firebase/Google analytics

We use Firebase/Google Analytics in our Android app. Each event is saved with a lot of extra information (user ID, device info, timestamps, user properties, geographical location, ...). The extra info is there by default, but we don't want it to be collected.
We tried 2 things:
1) Update the BigQuery schema
Delete the unwanted columns from BigQuery. Unfortunately, BigQuery creates a new export every day, so we would need to know where those fields are coming from - something we don't know.
2) Default parameters within the app
We tried to use default event parameters inside the app, so the city will always be null. Here is an example with the user's city:
Bundle defaultValues = new Bundle();
defaultValues.putString("geo.city", null);
FirebaseAnalytics.getInstance(ctx).setDefaultEventParameters(defaultValues);
Unfortunately, we still see geo.city populated in our BigQuery data.
Is there a way of changing what is collected by default?
There is no way to disable the geography information: Analytics uses IP addresses to derive the geolocation of a visitor. Updating the BigQuery schema is probably the viable approach, but you have to set up a process that carries out this update on a daily basis, precisely because the export takes place every day.
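Such a daily process could be a scheduled query that rewrites each day's export table without the unwanted columns. A sketch, assuming placeholder project/dataset names and that only the geo record should be dropped:

```sql
-- Rewrite a day's export table without the geo record.
-- my-project and analytics_123456 are placeholders; in a scheduled
-- query the table suffix would be templated from the run date.
CREATE OR REPLACE TABLE `my-project.analytics_123456.events_20190101` AS
SELECT * EXCEPT (geo)
FROM `my-project.analytics_123456.events_20190101`;
```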

Can I add a field to the app_events_intraday table in BigQuery?

I am currently extracting my Firebase event data from BigQuery to an onsite database for analysis. I extract the Firebase intraday table(s) along with the previous 4 days (since previous days' tables continue to be updated) every time I run the ETL job. Since there is no key or unique ID for events, I am deleting & re-inserting the past 4 days of data locally in order to refresh the data from BigQuery.
Would it be possible for me to create a new field called event_dim.etl_status on the intraday table to keep track of events that have been moved locally? And if so, would this field make its way into the app_events_yyyymmdd table once it is renamed from *_intraday to *_yyyymmdd?
Edit:
Some more context based on comments from dsesto:
A magical Firebase-BigQuery wizard automatically copies/renames the Event "intraday" table into a daily table, so I have no way to reproduce or test this. It is part of the Firebase->BigQuery black box.
Since I only have a production environment (Firebase has no mechanism for a sandbox environment), testing this theory would require potentially breaking my production environment which is why I posed a "is it possible" scenario in case someone else has done something similar.
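Absent such a flag, the delete-and-reinsert window described above is typically selected on the BigQuery side with a _TABLE_SUFFIX filter over the sharded tables. A sketch with placeholder names, assuming the old app_events_* sharded layout:

```sql
-- Pull the last 4 daily tables plus the intraday table(s) in one pass.
-- my-project and analytics_123456 are placeholders; intraday tables
-- match the 'intraday%' suffix under the app_events_* wildcard.
SELECT *
FROM `my-project.analytics_123456.app_events_*`
WHERE _TABLE_SUFFIX LIKE 'intraday%'
   OR _TABLE_SUFFIX >= FORMAT_DATE('%Y%m%d',
        DATE_SUB(CURRENT_DATE(), INTERVAL 4 DAY))
```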