We have several websites (the same website on different top-level domains) and store the GA360 data in BigQuery.
The GA360 data for each website is stored in the same BigQuery project but in a different dataset (date-partitioned).
Example:
GA360 GBQ Project
Website 1
Website 2
Website 3
We are now starting to use Looker and I want to be able to analyze the data. My goal is to create dashboards containing the GA360 data for each individual website, but also dashboards containing the data for all websites combined.
As far as I understand, the joining can be done in either Looker or GBQ.
Since both GBQ and Looker are quite new to me I am wondering how and where these datasets should be combined in the most (cost)effective way.
So my question is:
What is the best way to combine multiple websites' datasets stored in GBQ so they can be analyzed in Looker?
Thanks!
You should be able to join between tables in different datasets by explicitly specifying the fully qualified table names (project.dataset.table) in the queries of derived tables in Looker. The data processing will happen in BigQuery.
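For example, a derived table in Looker could union the per-site datasets using fully qualified names. A minimal sketch, assuming the GA360 export's sharded ga_sessions_YYYYMMDD tables; the project and dataset names here are placeholders for your own:

```sql
-- Placeholder project/dataset names; substitute your own.
-- Tags each row with its source site so dashboards can filter
-- to one website or aggregate across all of them.
SELECT 'website_1' AS site, date, SUM(totals.visits) AS sessions
FROM `ga360-project.website_1.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'
GROUP BY date

UNION ALL

SELECT 'website_2' AS site, date, SUM(totals.visits) AS sessions
FROM `ga360-project.website_2.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'
GROUP BY date
```

The `site` column then lets a single Looker explore serve both the per-website and the combined dashboards.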
I recently connected my Google Analytics (GA4) property to BigQuery and set up a daily export of the data. The data doesn't seem to get exported, as I don't find any tables under the dataset in BigQuery. The things I tried:
Delete the link and the dataset and start from scratch again. - Didn't work.
Check for service accounts - all were present and it still didn't work.
Issue: Dataset is getting created but the tables which should contain the raw data are not getting created.
My team has linked our Firebase and BigQuery projects and set up the intraday table. However, the table is created unpredictably. I was able to use it yesterday (events_intraday_20200701), but it is already noon as of writing this, and there is still no intraday table (events_intraday_20200702) in the dataset. (The regular event tables are there, per usual). In the streaming area of the Firebase console, I can see hundreds of events being generated, but cannot query an intraday table to see them in realtime.
I also struggle to find resources clarifying when the table is created, besides "raw event data is streamed into a separate intraday BigQuery table in real-time" from
https://support.google.com/firebase/answer/6318765?hl=en. Are there reasons why the table may not be created, or more details about what time of day I can expect it to exist?
On a related note, is it true that Web events are not supported for the intraday table?
Thanks!
I am attempting to build several apps for different use cases in our business, but many of them share data. For example, our inventory app will use the same customer data as our field audit app, and many apps will need to access the dataset we made to hold SKU information.
My question, then, is: if App A has a table with X information, can App B access the same table?
It could.
You can add a model to an App Maker app to get information from external sources.
See the information here:
https://developers.google.com/appmaker/models/overview
I would like some guidance to setup BigQuery data storage from Google Analytics.
We have 6 different websites, 4 of which belong to one project and 2 to another. We would like to analyse the data separately for each site, per project with its sites' data combined, and across all sites together.
Hence, what is the best structure to set up in BigQuery?
Two projects, with 4 and 2 datasets? Or one main project with 2 datasets containing 4 and 2 tables? Is that even possible?
Or is extracting the data so easy that it doesn't matter, and we can just put every site in its own project and extract the data as we want it?
Please give me some guidance on this issue.
Kind regards
The short answer:
Or is extracting the data so easy that it doesn't matter, and we can just put every site in its own project and extract the data as we want it?
Yes!
The longer answer:
You can export data from only one view per property (see Set up a BigQuery Export), so start by identifying which view you'll link, and make sure the settings are the same across all the views you're going to export, assuming this is important to you.
Each profile/site will go into its own dataset and will be partitioned by day, making it easy to query the sites individually or together, as required.
It is possible to query across projects, so if you store data across two, you'll still be able to join them.
In my opinion, it would make things easier for analysts if the data were all in one project, as you'd be able to save queries in a single location and track query costs centrally; but if you need to keep two projects, your data can still be combined.
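Querying across two projects looks the same as querying within one, as long as the table names are fully qualified. A sketch, with placeholder project and dataset IDs (each view's export lands in a dataset named after the view ID):

```sql
-- Project IDs and view-ID dataset names are placeholders.
-- Daily user counts combined across both projects' exports.
SELECT date, COUNT(DISTINCT fullVisitorId) AS users
FROM `project-a.123456789.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240107'
GROUP BY date

UNION ALL

SELECT date, COUNT(DISTINCT fullVisitorId) AS users
FROM `project-b.987654321.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240107'
GROUP BY date
```

The querying user just needs BigQuery read access on both projects.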
For starters, I am relatively new to Firebase and BigQuery...so a bit naive...
Background:
I am working on a project where we have many distinct mobile apps - one for each country, for each OS (iOS, Android) - with all of the raw event data getting pushed into BigQuery datasets that live in the same project. The rationale behind having distinct apps by country relates to regulatory requirements.
For dashboarding purposes, I had hoped to be able to combine all datasets into a single aggregate dataset (with tables by year) that is partitioned by date via a "partition_date" column that I added. My goal has been to take advantage of the ability to use BigQuery nested structures, and to maintain the original nested fields from the raw Firebase tables.
I've been searching around over the past month or so, but I haven't come across any other use cases where people are dealing with many datasets.
Questions:
Does it make sense to take this approach (aggregate all datasets into a table partitioned by date, and a nested field based on app name)?
As an alternative - I tried denormalizing the tables, with tables sharded by date...though the resulting tables (not surprisingly) are much larger due to flattening the nested structures, which makes me think this approach is less than ideal. Should I actually consider this approach?
Should I consider just making a relatively simple aggregate table of KPIs? I've avoided this approach because it limits my ability to use date range filters that dynamically calculate unique users on the fly (using COUNT_DISTINCT(user_dim.app_info.app_instance_id) in Data Studio).
The end goal is to empower any user (ex. someone who doesn't know SQL) to be able to answer simple questions across all apps (ex. how many users opened the app yesterday), and to let end users make use of date range filters in the dashboard interface.
I've been able to write ad hoc queries to get at the answers by querying across all datasets, but I have not found a good solution that will make it easier for non-technical users within the dashboard.
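One such ad hoc query might look like the following. The project and dataset names are hypothetical, and this assumes the older Firebase export schema (app_events_YYYYMMDD tables with user_dim), matching the field referenced above:

```sql
-- Hypothetical project/dataset names, one dataset per country app.
-- Counts unique users per app for a single export day.
SELECT 'us_ios' AS app,
       COUNT(DISTINCT user_dim.app_info.app_instance_id) AS users
FROM `my-project.us_ios_app.app_events_20180101`

UNION ALL

SELECT 'de_android' AS app,
       COUNT(DISTINCT user_dim.app_info.app_instance_id) AS users
FROM `my-project.de_android_app.app_events_20180101`
```

This works well interactively, but the UNION ALL has to be extended by hand for every new app, which is part of why it doesn't translate cleanly to a non-technical dashboard.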
Also, this is my first Stack Overflow question...please let me know if I am being too vague, including too many questions, or otherwise abusing the platform.
Thanks in advance for any thoughts.
If your end goal is to empower users to answer simple questions, aggregating KPI data makes sense to me. I would probably approach this by setting up a scheduled daily job that selects the relevant data from all the datasets and loads it into a new dataset, which can then be used in Data Studio. The new tables could keep the default Firebase-style date suffix to support date range filters.
I am also relatively new to BigQuery and Firebase though, so maybe there is a better way.
You can find more information about scheduling in BigQuery here: Schedule query in BigQuery
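A daily scheduled query along these lines could maintain the aggregate KPI tables. This is only a sketch under stated assumptions: project/dataset names are placeholders, the old Firebase export schema (user_dim) is assumed, and the schedule's destination table would be set to something like `kpis.daily_{run_date}` so each run writes a date-suffixed table:

```sql
-- Scheduled daily in BigQuery; @run_date is the scheduled-query
-- run date parameter. Aggregates yesterday's users per app
-- across the per-app datasets (placeholder names).
SELECT 'us_ios' AS app,
       COUNT(DISTINCT user_dim.app_info.app_instance_id) AS daily_users
FROM `my-project.us_ios_app.app_events_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', DATE_SUB(@run_date, INTERVAL 1 DAY))

UNION ALL

SELECT 'de_android' AS app,
       COUNT(DISTINCT user_dim.app_info.app_instance_id) AS daily_users
FROM `my-project.de_android_app.app_events_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', DATE_SUB(@run_date, INTERVAL 1 DAY))
```

Data Studio can then point at the small aggregate tables instead of scanning the raw event exports on every dashboard load.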