I'm very aware that my title isn't very "problem solving" oriented. However, I'd like to understand the issue I have, and I hope that it might help others.
I'm connected to a Google Analytics view. I've created four different sources in Power BI from the same view:
Sources with:
1. date, sessions and source / medium dimension
2. date, sessions and a hit level custom dimension
3. date, sessions, hit level custom dimension and source / medium
4. date, sessions
Sources 1, 2 and 4 return the same total number of sessions (which matches the Google Analytics numbers), but source 3 returns only half of the sessions.
So I guess something is not working between the source / medium dimension and the custom dimension, but I'd like to understand why.
Thanks.
Related
I wonder whether someone can help me please.
I have a user who under a specific property, sporadically receives the following error:
Some hits sent on 03-Jul-2018 to property ...... exceeded one or more hit quotas and were therefore not processed.
Hits can be dropped when daily or monthly hit limits are exceeded. You can view your hit volume levels in Property Settings in Analytics.
Hits can also be dropped if visitor hit limits are exceeded. This can happen when your site is incorrectly generating the visitor ID for a GA session. Contact your website administrator to check that the visitor ID generation has been correctly implemented.
They are not using the Premium account, but when I look at the data for the day in question, there aren't any issues with regard to 'High Cardinality', which, unless I've misunderstood, I'd expect to see.
Could someone look at this please and offer some guidance on where the issue may be, as this area is fairly new to me?
Many thanks and kind regards
Chris
Collection limits are influenced by 2 factors:
The tracker: whether you use ga.js, gtag.js, analytics.js, etc. The limits for each are detailed in the linked documentation.
The property type: whether you are using GA (10M hits / month) or GA 360 (2B hits / month).
In your case you are facing a property limit. To find out when such limits were reached, you can create a custom report using a time dimension (e.g. date + time) combined with the hits metric. You can also combine the hits metric with other dimensions (country, browser, device) to see if you find any patterns as to why you're getting so many hits.
Cardinality is something else: it refers to the number of unique value combinations for your dimensions. For instance, if you have 500K events where each event category is different, you'll have a cardinality of 500K on the event category dimension. The more hits, the more likely you'll have a high cardinality, but the 2 aren't necessarily related (if you send 10B events with the same category, the cardinality on the category is 1).
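To make the distinction concrete, here is a minimal sketch that counts distinct values for one dimension. The `cardinality` helper and the sample category names are made up for illustration; they are not part of any GA API.

```python
def cardinality(values):
    """Return the number of distinct values seen for one dimension."""
    return len(set(values))

# 5 hits, each with a different event category: cardinality equals the hit count.
distinct_categories = ["video", "form", "download", "scroll", "error"]

# 5 hits that all share one category: cardinality stays at 1.
same_category = ["video"] * 5

print(len(distinct_categories), cardinality(distinct_categories))
print(len(same_category), cardinality(same_category))
```

Both lists contain the same number of hits, but only the first one has a cardinality that grows with the hit count.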
So focus on identifying and solving your limits/quotas issue, as it's the real issue here:
If the number of hits is legitimate (you have a huge amount of traffic), then the only options are to upgrade to GA 360 or to reduce the number of hits for each session.
If the number of hits is abnormally high (e.g. traffic is stable but hits increased dramatically), look for implementation issues, especially generic event trackers such as error tracking set up with tools like Google Tag Manager.
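As a rough sketch of the custom-report check described above: assuming you have exported the date and hits columns of such a report as (date, hits) rows with dates formatted as YYYYMMDD, the monthly totals can be compared against the 10M figure. The function names and the sample rows are invented for illustration.

```python
from collections import defaultdict

# Monthly figure quoted above for a standard (non-360) GA property.
FREE_MONTHLY_HIT_LIMIT = 10_000_000

def monthly_hits(rows):
    """Sum hits per month from (date, hits) rows, with date as YYYYMMDD."""
    totals = defaultdict(int)
    for date, hits in rows:
        totals[date[:6]] += hits  # first six characters are YYYYMM
    return dict(totals)

def months_over_limit(rows, limit=FREE_MONTHLY_HIT_LIMIT):
    """Return the months (YYYYMM) whose total hits exceed the limit."""
    return sorted(m for m, total in monthly_hits(rows).items() if total > limit)

# Invented sample data, not real traffic.
sample_rows = [
    ("20180701", 4_000_000),
    ("20180702", 3_500_000),
    ("20180703", 3_000_000),
    ("20180801", 200_000),
]

print(monthly_hits(sample_rows))
print(months_over_limit(sample_rows))
```

Grouping by a finer time dimension (or by country, browser, or device) follows the same pattern with a different key.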
We are using the free level of GA and have been creating reports using Custom Dimensions and Metrics since last summer.
We also use the Google Sheets Analytics add-on to post process data pulled from the API.
Overnight on 16-17 May (UK time), our reports suddenly started showing as being sampled. Prior to that we had no sampling at all; our reports are scheduled, so I can look back through the revision history to see the changes made each time the scheduled reports ran.
This sampling is occurring in custom reports viewed in the GA platform and in GA sheets. I've done some analysis and it appears to only occur at the point that more than one Custom Dimension is added to a report, or when the GA dimensions ga:hour or ga:dateHour are used (ga:date does not trigger sampling).
All our Custom Dimensions and Custom Metrics are set at Hit level (I've read a post where it was claimed to be due to mixing scopes on Dimensions & Metrics, but we are not doing this).
If I reduce the date range of a query (suggested as a solution on many blogs), the sampling level actually gets worse rather than better.
For the month of May we didn't even hit 4k sessions at property level. I can't find any reference anywhere to any changes being made to GA that would cause sampling to apply to our reports (change documentation, Google Blogs etc).
Is anyone else experiencing this or can anyone shed any light on why this might be happening? Given how we use GA if we can't resolve this then it's a year of work down the drain, so I'm really keen to at least know why this has suddenly happened even if ultimately nothing can be done about it.
Is it possible to export a list of all page views, with timestamps, out of Google Analytics?
Currently I can see page views per day, but not specific times for each view. I do see real-time data as it comes in, so I assume the time for each view is being stored.
If not, how might I go about sending that data when I report the page view in a way that will make it accessible in the dashboard?
[[UPDATE]] The Google Analytics v4 API now includes a "dateHourMinute" dimension.
Inside Google Analytics, I think the problem you will run into is that a "timestamp" includes date, hour, minute and second (and maybe milliseconds), and you can only drill into one secondary dimension at a time (e.g. just date). An alternative is to use the Query Explorer or the Google Analytics API directly.
Ad Hoc Report with query explorer
You can use the Google Analytics API Explorer: https://ga-dev-tools.appspot.com/query-explorer/
It will allow you to select:
A date range
Metrics: (like Users, pageviews, etc)
Dimensions: pagePath, Date, Hour, Minute (can't go to second or further, as in a timestamp, though)
Use filters to get rid of some data.
If your website gets a decent amount of traffic (> 50,000 pageviews per month), you might want to grab it 1 month (or less) at a time to avoid hitting limits/sampling effects.
@doctaj's solution works. Please just note that if you are using the query-explorer, for example, the "dateHourMinute" dimension is counted as 3 dimensions and will use up 3 of the 7 or 10 dimension slots available to your query. So if you need more dimensions, you have to decide which matter most to you. For example, you may want to query the date and hour dimensions separately and drop the minutes in favour of something else.
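Once the rows are exported, the time values can be turned into real timestamps. The sketch below assumes the API returns dateHourMinute as a YYYYMMDDHHMM string, date as YYYYMMDD and hour as a two-digit string; the function names are made up for illustration.

```python
from datetime import datetime

def parse_date_hour_minute(value):
    """Parse a value assumed to be formatted as YYYYMMDDHHMM."""
    return datetime.strptime(value, "%Y%m%d%H%M")

def combine_date_and_hour(date, hour):
    """Combine separate date (YYYYMMDD) and hour (HH) values, dropping minutes."""
    return datetime.strptime(date + hour, "%Y%m%d%H")

print(parse_date_hour_minute("201807031430"))
print(combine_date_and_hour("20180703", "14"))
```

The second helper matches the trade-off described above: querying date and hour separately loses minute precision but frees a dimension slot.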
Our application needs to gather usage data through the Measurement Protocol of Google Analytics.
We can successfully send "appview" hits to the Google Analytics server, and get a proper response by it (a GIF image). The appview hits appear on the GA Dashboard, along with the country of origin, session duration, etc.
We also have several custom dimensions and metrics that we want to track for each hit. We have set those up in the GA Admin panel with the correct scope, index and active state. We have 3 Hit-scoped dimensions, 3 User-scoped dimensions and 1 Hit-scoped metric, all set to Active state.
We send the dimensions and metrics as described in the docs at Custom Dimensions / Metrics, attached to the hits they apply for, like so:
...&cm1*=3 <--for the metric
and
...&cd6*=15 <--for the dimensions (some dimensions have numeric values, others are text)
The problem is that those metrics and dimensions don't show up in our custom reports: the reports always say "There is no data for this view.". For example, we have a report that has one dimension and one metric, without any filters, set to "Any view". It doesn't matter if the Type of the report is Explorer, Flat Table or Map Overlay, it never shows anything.
Several days have passed since the hits were received and appeared in the dashboard, but the reports are still empty, so processing lag can be ruled out.
We tried sending "event" hits instead of "appview" hits - again, the hits show up in the Dashboard, but the reports are empty.
We cannot get any useful insights without using dimensions and metrics - so there is no way to get by without this.
Because of reasons too long to describe, we cannot use any of the Google-provided Analytics libraries.
Is there anything else we need to do to see data in those reports?
When using the Measurement Protocol, you need to check that the profile (view) is not set to exclude bots and spiders.
Go to the Google Analytics website and, under Admin, open the settings of the profile in question.
Beyond that, check the real-time reports: you should see the hits coming in. You will need to wait 24-48 hours for data to appear in the standard reports due to data processing lag.
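It may also be worth comparing the raw payload against the parameter names in the Measurement Protocol reference. The sketch below only builds a payload string and sends nothing; it assumes the custom parameters are named `cd<index>` and `cm<index>` (for example `cd6` and `cm1`). The `build_event_hit` helper, the placeholder tracking ID and the sample values are all hypothetical.

```python
from urllib.parse import urlencode

# Measurement Protocol (v1) collection endpoint.
COLLECT_URL = "https://www.google-analytics.com/collect"

def build_event_hit(tracking_id, client_id, category, action,
                    dimensions=None, metrics=None):
    """Build a URL-encoded event hit with optional custom dimensions/metrics.

    `dimensions` and `metrics` map the index configured in GA Admin to a value.
    """
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID
        "cid": client_id,    # anonymous client ID
        "t": "event",        # hit type
        "ec": category,      # event category
        "ea": action,        # event action
    }
    for index, value in (dimensions or {}).items():
        params[f"cd{index}"] = str(value)  # custom dimension at this index
    for index, value in (metrics or {}).items():
        params[f"cm{index}"] = str(value)  # custom metric at this index
    return urlencode(params)

# Placeholder IDs and values for illustration only.
payload = build_event_hit("UA-XXXXX-Y", "555", "usage", "open",
                          dimensions={6: "15"}, metrics={1: 3})
print(COLLECT_URL + "?" + payload)
```

Printing the payload this way makes it easy to spot a stray character in a parameter name before looking any further into the report configuration.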
I am fetching my data from the Google Analytics Core Reporting API. I have learned that we can fetch only 7 dimensions per query using the API, but I need to fetch more than 7 dimensions with correct metrics. Is there any way (other than using paid Google Analytics) to fetch more than 7 dimensions with correct metrics from Google Analytics?
If not, is there any mathematical formula through which we can find the intersection of the dimensions fetched by 2 different queries that have one dimension in common?
Thanks
The only way I have found around the 7 dimension limit is to be creative with filters. It will mean that you are sending more queries to the server, but if you aren't worried about your quota limit then it's doable.
Example: ga:visitorType only has two values: New Visitor and Returning Visitor.
So you could run a query with your 7 dimensions and add a filter of ga:visitorType==New Visitor, then run the same 7 dimensions with a filter of ga:visitorType==Returning Visitor.
Bingo, you just got an 8th dimension.
Again, you need to be a little careful, because the number of queries you run is multiplied by the number of values of each dimension you apply creative filtering to. I wouldn't recommend doing it with, say, ga:landingPagePath, because the number of landing pages you have is too big. You will end up running too many queries. But there are a few you can do this with: ga:isMobile, ga:deviceCategory, ga:date. Just remember that the more creative filters you add, the more queries you end up running. I had one where I ended up expanding it to 80 requests just to get the data back that I needed. Scary, but it worked.
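A minimal sketch of the filter trick, assuming you already have some function that runs one query and returns rows as dictionaries. Here `run_query` is a hypothetical callable (not a real library function), `fake_run_query` is a stand-in that returns invented rows, and the filter uses the `==` exact-match operator of the Core Reporting API filter syntax.

```python
def fetch_with_extra_dimension(run_query, dimensions, metrics,
                               extra_dimension, values):
    """Run one filtered query per value and tag each row with that value."""
    rows = []
    for value in values:
        filters = f"{extra_dimension}=={value}"  # exact-match dimension filter
        for row in run_query(dimensions=dimensions, metrics=metrics,
                             filters=filters):
            # The filter value becomes the extra column on every returned row.
            rows.append({**row, extra_dimension: value})
    return rows

def fake_run_query(dimensions, metrics, filters):
    """Stand-in for a real API call; returns one made-up row per query."""
    sessions = 10 if "New Visitor" in filters else 4
    return [{"ga:date": "20180703", "ga:sessions": sessions}]

rows = fetch_with_extra_dimension(
    fake_run_query,
    dimensions=["ga:date"],
    metrics=["ga:sessions"],
    extra_dimension="ga:visitorType",
    values=["New Visitor", "Returning Visitor"],
)
for row in rows:
    print(row)
```

The number of calls to `run_query` equals the number of values passed in, which is why low-cardinality dimensions are the only practical candidates.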
If you have specific dimensions that can be used to identify a session at a specific time (e.g. a session ID and browser timestamp), you can execute multiple queries then patch them together.
I built a Python program that will do exactly this: https://github.com/aiqui/ga-download
This program can bring together multiple groups of dimensions, so that any number of dimensions can be downloaded and combined into a single CSV file.
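The patching step itself can be sketched in a few lines. This is not the code from the linked repository, just an illustration of joining two result sets on shared key dimensions; the use of `ga:dimension1` as a session ID and `ga:dimension2` as a browser timestamp is an assumption, and the rows are invented.

```python
def join_on_keys(first, second, keys):
    """Inner-join two lists of row dicts on the given key dimensions."""
    index = {tuple(row[k] for k in keys): row for row in second}
    joined = []
    for row in first:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            joined.append({**row, **match})  # merge columns from both queries
    return joined

# Hypothetical key dimensions: session ID and browser timestamp.
KEYS = ["ga:dimension1", "ga:dimension2"]

query_a = [
    {"ga:dimension1": "s1", "ga:dimension2": "1530621000", "ga:country": "France"},
    {"ga:dimension1": "s2", "ga:dimension2": "1530621060", "ga:country": "Spain"},
]
query_b = [
    {"ga:dimension1": "s1", "ga:dimension2": "1530621000", "ga:browser": "Firefox"},
]

combined = join_on_keys(query_a, query_b, KEYS)
print(combined)
```

Rows that appear in only one of the two queries are dropped here; keeping them would need an outer join instead.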