Discrepant Google Analytics Data In PowerBI When Using Built In Connector

I was tasked with creating multiple Power BI reports and dashboards that encompass our website data from Google Analytics (GA). I started with our primary website and used the built-in connector in Power BI to consume this data. This connector, however, is "supported" by Google, and I'm having some issues with the data.
[Screenshot: Google Analytics data]
[Screenshot: Power BI data]
As you can see in these two screenshots, some of the data matches and some of it does not. Both screenshots cover the time frame of Nov. 2019. I understand there may be a little bit of data discrepancy; however, the connector in Power BI does not allow me to show unique users. This is one of the most important metrics for any website, as it more accurately shows how many individuals are actually visiting your site, based on the cookies on their machines. Another big data issue I noticed is that the bounce rate is almost twice the actual bounce rate in GA.
Is there any way to bring in the Unique Users and Bounce Rate metrics accurately? The marketing team currently uses DashThis, and it shows this data, so I know it's possible; I just don't know how it's being done.

Power BI's connector for Google Analytics has a sampling issue and an API call limit. The standard limitations and quotas for Google Analytics API requests are documented in Limits and Quotas on API Requests. However, Power BI Desktop and the Power BI service allow you to use the following enhanced number of queries:
Queries per day: 1,500,000
Queries per 100 seconds: 4,000
It might be better to connect Google Analytics to a data warehouse such as Google BigQuery, which will give you access to historical data that you won't be able to get with Power BI's connector. I'd suggest using a third-party connector.
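When the built-in connector's numbers look off, one way to confirm sampling is to query the Core Reporting API v4 directly and inspect the sampling fields in the response. A minimal Python sketch, assuming the google-api-python-client package, a hypothetical service-account key file, and a hypothetical view ID:

```python
# Minimal sketch: check whether a Core Reporting API v4 response is sampled.
from google.oauth2 import service_account
from googleapiclient.discovery import build

KEY_FILE = "service-account.json"  # hypothetical credentials file
VIEW_ID = "123456789"              # hypothetical GA view (profile) ID

credentials = service_account.Credentials.from_service_account_file(
    KEY_FILE, scopes=["https://www.googleapis.com/auth/analytics.readonly"])
analytics = build("analyticsreporting", "v4", credentials=credentials)

response = analytics.reports().batchGet(body={
    "reportRequests": [{
        "viewId": VIEW_ID,
        "dateRanges": [{"startDate": "2019-11-01", "endDate": "2019-11-30"}],
        "metrics": [{"expression": "ga:users"},
                    {"expression": "ga:bounceRate"}],
        "samplingLevel": "LARGE",  # request the most precise sampling level
    }]
}).execute()

data = response["reports"][0]["data"]
# samplesReadCounts is only present when the report is based on sampled data.
if "samplesReadCounts" in data:
    print("Sampled:", data["samplesReadCounts"], "of",
          data["samplingSpaceSizes"], "sessions read")
else:
    print("Report is unsampled")
```

If the report comes back sampled, that alone can explain a mismatch with the numbers in the GA web interface.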

Related

Can I switch my Google Analytics property from GA4 to UA?

For some reason Power BI does not accept GA4 as a connection, so my data is stuck in GA4.
Power BI does support Universal Analytics (UA), but I cannot find a way to convert my GA property.
Is there a way to do this, or should I start over and create a new property as UA?
Power BI currently doesn't support GA4 as a data source because the underlying API is not production ready yet. See the response in the Microsoft Power BI Community here: https://community.powerbi.com/t5/Desktop/Power-BI-Google-Analytics/td-p/1679530
For information about the APIs and their current state:
This is a link to the API documentation the connector is using (for Universal Analytics): https://developers.google.com/analytics/devguides/reporting/core/v4
This is a link to the API documentation of GA4 (which is still in beta): https://developers.google.com/analytics/devguides/reporting/data/v1
Regarding a solution for your issue:
In GA4 it is possible to export data to BigQuery, where it can be processed and then imported into Power BI: https://learn.microsoft.com/en-us/power-bi/connect-data/desktop-connect-bigquery
The export only works from the moment you enable it, so you cannot access data collected before the enablement this way.
A second option is using Data Studio to connect to your GA4 data, so you can access the data you measured before.
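Once the export is enabled, the GA4 data lands in daily events_YYYYMMDD tables that can be queried and then loaded into Power BI. A minimal Python sketch, assuming the google-cloud-bigquery package and hypothetical project and dataset names:

```python
# Minimal sketch: aggregate the GA4 BigQuery export before importing elsewhere.
from google.cloud import bigquery

client = bigquery.Client(project="my-gcp-project")  # hypothetical project

query = """
    SELECT
      event_date,
      COUNT(DISTINCT user_pseudo_id) AS users,
      COUNTIF(event_name = 'page_view') AS page_views
    FROM `my-gcp-project.analytics_123456789.events_*`
    WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20230131'
    GROUP BY event_date
    ORDER BY event_date
"""

for row in client.query(query).result():
    print(row.event_date, row.users, row.page_views)
```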
First off, as everyone says, you can't change the type of your Google Analytics account: if it was created as UA then it's UA, and if it was created as GA4 then it's GA4. There is also no way to export the data across.
Now let's look at the underlying cause of your problem. The Google Analytics connector in Power BI Desktop relies on the Google Analytics Core Reporting API, which is used to extract data from Google Analytics UA accounts. There is a very big difference between the underlying data structures of a UA account and a GA4 account; they are not compatible in any way, shape, or form.
That's why the Google Analytics team is currently developing the Google Analytics Data API, which has the ability to connect to GA4 accounts and extract data. At the time the Power BI Google Analytics post was created, the Google Analytics Data API had just been released and was still in its alpha test phase; as you can see from the changelog, this API is still under development. I haven't checked, but it looks like it may even be out of beta now. (To be honest, I am not really sure of the exact date it was released publicly, as I have been in the alpha test group for a long time.)
That being said, a lot of the client libraries used by this API are still under development as well; the Google .NET client library for the Google Analytics Data API, for example, is still in beta. At the time that post was written, I am almost 100% sure the client library had not been released.
This means that any Power BI connector is probably not going to be released until both the API and the client library are considered stable. I know the team is working hard on the API, and I haven't seen any breaking changes in a very long time.
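Until such a connector exists, the Data API can be called directly. A minimal Python sketch, assuming the google-analytics-data client library, default application credentials, and a hypothetical GA4 property ID:

```python
# Minimal sketch: pull a simple GA4 report with the Google Analytics Data API.
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
    DateRange, Dimension, Metric, RunReportRequest)

client = BetaAnalyticsDataClient()  # uses GOOGLE_APPLICATION_CREDENTIALS

request = RunReportRequest(
    property="properties/123456789",  # hypothetical GA4 property ID
    dimensions=[Dimension(name="date")],
    metrics=[Metric(name="activeUsers"), Metric(name="sessions")],
    date_ranges=[DateRange(start_date="30daysAgo", end_date="today")],
)

for row in client.run_report(request).rows:
    print(row.dimension_values[0].value,
          [m.value for m in row.metric_values])
```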
So, to the user who added a bounty to this looking for an updated answer:
IMO, I don't think you have long to wait before everything is stable enough for applications like Power BI to start appearing that support GA4. That is assuming that someone creates one.
You can't convert GA4 to UA (or UA to GA4).
In your case you can create a new UA property and start tracking in it.

How can I fetch the number of pageviews per page on my site in the last 30 minutes with google analytics?

I would like to add a "what other people are doing" section to my React website, which shows the currently most interesting sections of my website using Google Analytics. This means I would need to fetch the most popular pages every 30 minutes from Google Analytics from within my page's backend, presumably using the Realtime API, because I know the Reporting API's data can be up to 24 hours out of date.
But unfortunately, the Realtime API seems to only be able to track users that are currently on the page, and there doesn't seem to be a way to distinguish which actual page they're on.
Basically, what I need is a way to figure out how many individual people (not just people refreshing over and over again) have visited a certain section of my page in the last 30 minutes. Google Analytics seems to be the way to go, but I'm open to any other solution, as long as it can be used from within the AWS cloud.
What you need to remember is that the Google Analytics website also uses the Google Analytics API. If something you want to see can be done on the website, for the most part you can also do it with the API, with the exception of a few calculated metrics, which can be a bit tricky to duplicate with the API.
The Google Analytics Reporting API takes between 24 and 48 hours to process your data; until then your data has not completed processing, so the numbers will not be correct.
The Google Analytics Real-Time API would be an option, but the data is limited in how long it stays around, as well as in which dimensions and metrics you can actually view, as they only give you access to what they are sure has completed processing.
An issue with both of these solutions is the quota: each Google Analytics view is only allowed to make 10k requests per day. You will need to ensure that you stay within this; if you are only requesting data every thirty minutes around the clock you should be able to keep it to around 48 calls a day, but it's worth noting.
IMO, Google Analytics is not suited for this task; you should keep a running count on your own website, store it in the database, and use that.
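A minimal Python sketch of that running count, using an in-memory sliding window; a real deployment would keep this in Redis or a database, and the page paths and visitor IDs here are hypothetical:

```python
# Minimal sketch: count distinct visitors per page over the last 30 minutes.
import time
from collections import defaultdict

WINDOW_SECONDS = 30 * 60
_hits = defaultdict(list)  # page_path -> [(timestamp, visitor_id), ...]

def record_view(page_path: str, visitor_id: str) -> None:
    _hits[page_path].append((time.time(), visitor_id))

def popular_pages(top_n: int = 5):
    cutoff = time.time() - WINDOW_SECONDS
    counts = {}
    for page, views in _hits.items():
        # Drop entries older than the window, then count unique visitors.
        recent = [(ts, vid) for ts, vid in views if ts >= cutoff]
        _hits[page] = recent
        if recent:
            counts[page] = len({vid for _, vid in recent})
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

record_view("/blog/post-1", "visitor-a")
record_view("/blog/post-1", "visitor-a")  # refresh: still one unique visitor
record_view("/blog/post-2", "visitor-b")
print(popular_pages())
```

Counting unique visitor IDs rather than raw hits is what filters out the "people refreshing over and over again" case from the question.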

Google Analytics with Power BI extraction

I connected to the Google Analytics API and the numbers differ, but when I connect from Data Studio the numbers are the same as in Google Analytics. This affects metrics like users, new users, sessions, etc.
This is one of the cases where sampling just kicks in without the user knowing.
Sampling occurs when fetching data from the GA API, which is how Power BI connects to GA.
Sampling usually occurs when your timeframe is large or the query would involve too many dimensions.
If you have access to GA Premium you have access to unsampled reports; otherwise you can divide the timeframe into shorter timeframes.
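A minimal Python sketch of the date-splitting workaround; the fetch step is a placeholder for whatever client code actually queries the Reporting API:

```python
# Minimal sketch: split a long reporting period into short chunks so each
# API request stays below the sampling threshold, then stitch the results.
from datetime import date, timedelta

def date_chunks(start: date, end: date, days: int = 7):
    """Yield (chunk_start, chunk_end) pairs covering [start, end]."""
    cursor = start
    while cursor <= end:
        chunk_end = min(cursor + timedelta(days=days - 1), end)
        yield cursor, chunk_end
        cursor = chunk_end + timedelta(days=1)

for chunk_start, chunk_end in date_chunks(date(2019, 1, 1), date(2019, 12, 31)):
    # Placeholder: issue one Reporting API request per short dateRange
    # and accumulate the rows before handing them to Power BI.
    print(f"fetch report for {chunk_start} .. {chunk_end}")
```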

Google Core Analytics API - Data accuracy

I have a console app which downloads the Acquisitions report by querying the Google Analytics Core Reporting API.
Here is my query as shown in the Google Analytics Query Explorer: [screenshot not included]
I am unable to obtain a complete match of the Goal Completions data between the API output and the Analytics web interface. The variance is not too high, but it exists. The Query Explorer and my code extract both match, but, again, both of them mismatch the data shown in the Analytics web interface.
All query parameters match 100%, including dates, profile ID, etc.
I read articles online, such as this one and this one, which elaborate on the inaccuracy of Goal Completions data in Google Analytics. I wonder, could this be what is happening here?
Could this be related to bots and spiders? How do I exclude these from the query by using code?
Many thanks.

System design - Google Analytics

I'm working on the architecture for a project that includes Android and iOS apps and a web interface with a subset of the mobile apps' functionality. The project is basically an e-commerce solution. In all three interfaces I'm using Google Analytics to track some information. However, I'm having an internal discussion about the extent of the information I should send to GA. What should I store in GA, and what should I store on my own server?
Let me give you some examples.
Session tracking is clearly something that belongs in GA.
ProductDetailViews. Sounds like something that should go into GA, especially considering the Enhanced Ecommerce module.
Shared item. When a user shares some content over a social network, should I store that information in GA or on my own server? I'm inclined toward GA, but it becomes more ambiguous.
Do you see my point? Can someone share a general rule or recommendation on what should be saved in GA and what should be saved on the project's own server?
Thanks
For those examples I would generally send all the hits to Google Analytics. Here are a few reasons:
Preventing data silos. You want all of your data in one place, and Google provides you with a database reachable via the API where you can keep all your data organised in one place. This is important when you are measuring performance, as you want to avoid duplication of conversions or traffic hits.
Usage of Google Analytics advanced segments. With all your data in GA, you will be able to create advanced segments for analysis. But the real power is if you are using AdWords or retargeting, as you can send those advanced segments to AdWords and target those users around the web with your custom data.
Single point of reference for users. All analytics are inaccurate, but you want to make sure they are inaccurate to the same degree. Using GA keeps all your data on the same playing field.
Usability and freedom of information. It's easier to serve up your data to users within the GA interface, as people are more likely to know how to navigate that than your database. You can also use the GA API to pull out any data you need to push into other visualisation tools.
User session merging. With your data and userID tracking in GA, you may be able to track users as they move from mobile to desktop and back again, over multiple sessions.
What you need to avoid putting into Google Analytics is personal info such as names, email addresses, etc.; these are against the ToS. But you can capture a unique userID and match it outside of the tool later.
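A minimal Python sketch of that pattern, sending a share event with a hashed identifier through the Universal Analytics Measurement Protocol; the tracking ID is hypothetical:

```python
# Minimal sketch: send a hit with a pseudonymous user identifier so no
# names or email addresses ever reach Google Analytics.
import hashlib
import requests

GA_ENDPOINT = "https://www.google-analytics.com/collect"
TRACKING_ID = "UA-12345678-1"  # hypothetical UA property

def send_share_event(user_email: str, network: str) -> None:
    # Hash the email so only an opaque ID is stored in GA; keep the
    # email -> hash mapping in your own database for later matching.
    user_id = hashlib.sha256(user_email.encode("utf-8")).hexdigest()
    requests.post(GA_ENDPOINT, data={
        "v": "1",           # protocol version
        "tid": TRACKING_ID,
        "cid": user_id,     # anonymous client identifier
        "t": "event",
        "ec": "social",     # event category
        "ea": "share",      # event action
        "el": network,      # event label: which social network
    }, timeout=5)

send_share_event("user@example.com", "twitter")
```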
