Average Session Duration Total Incorrect - google-analytics

I've currently got an integration with the Google Analytics (GA) API which pulls data on a daily basis and stores it within a database for use within a PHP project.
I'm trying to get the "Average Session Duration" within my project to match the value shown in GA for a specific channel (Organic Search) on a specific day (Apr 1, 2018 - Apr 1, 2018).
As you can see from the image below, the total Average Session Duration for that filter in GA is "00:04:28". For use later, this is 268 seconds (calculated by multiplying the time value by 24 * 60 * 60).
NB: I've removed some key, personal, information from the above image.
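The time-to-seconds conversion mentioned above can be sketched in Python. This is only an illustration, not code from the project: the function names are made up here, and the second function assumes the spreadsheet stores a duration as a fraction of a day, which is why multiplying by 24 * 60 * 60 yields seconds.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an "HH:MM:SS" string, as shown in the GA UI, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def day_fraction_to_seconds(day_fraction: float) -> int:
    """Convert a spreadsheet time value (assumed to be a fraction of a day) to seconds."""
    return round(day_fraction * 24 * 60 * 60)


print(hms_to_seconds("00:04:28"))  # 268
```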
If we export this view using the "Excel (XLSX)" option, we get a spreadsheet whose values replicate what's returned from the GA API.
The image below shows this dataset. I've highlighted a few key things, which are explained below the image:
The contents highlighted in blue are static data calculated by GA at export time. This includes GA's own average, which is the last row of the highlighted section (H7).
The contents highlighted in red use the time-to-seconds calculation explained earlier, and this successfully replicates the values in GA. This means that the Average Session Duration coming from the API (in seconds) converts correctly to the values shown (as a time) on the GA dashboard.
Now, here's where I'm having issues: the contents highlighted in pink use Excel's AVERAGE function over H2:H6 and I2:I6 respectively. I expected these two averages to match the averages at the bottom of the blue and red sections; however, as you can see, they don't.
Is there anyone here that's able to help with how GA are calculating that average, please?
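One likely cause, offered as an assumption rather than something the export confirms: GA defines Avg. Session Duration as total session duration divided by the number of sessions, so the total row is weighted by sessions, while Excel's AVERAGE over the per-row averages treats every row equally. The sketch below uses made-up rows (not the values from the screenshot) to show how the two figures can diverge.

```python
# Hypothetical rows: (sessions, average session duration in seconds).
# These numbers are invented for illustration only.
rows = [(100, 300.0), (10, 60.0), (5, 30.0)]


def mean_of_averages(rows):
    """What Excel's AVERAGE does: every row counts the same."""
    return sum(avg for _, avg in rows) / len(rows)


def session_weighted_average(rows):
    """Total duration divided by total sessions (assumed to be GA's method)."""
    total_duration = sum(sessions * avg for sessions, avg in rows)
    total_sessions = sum(sessions for sessions, _ in rows)
    return total_duration / total_sessions


print(mean_of_averages(rows))
print(session_weighted_average(rows))
```

If that assumption holds, reproducing the GA total in the project would need the session count per row as well as the per-row average.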

Related

Effects of high-cardinality Google Analytics event label fields?

I have a Google Analytics event label with high cardinality that I'd like to implement - it is a string that can take on any combination of a finite-but-large number of names in a comma-separated list.
I'm worried mainly about losing data - I found this Analytics Help support page:
https://support.google.com/analytics/answer/1009671?hl=en
...which states:
Reports containing high-cardinality dimensions may be affected by
Analytics system limits, resulting in the creation of a rolled-up
(other) entry in the report to contain the data that exceeds these
limits.
...and am wondering if that would also affect reports without the label included, i.e., reports just looking at unique category/action pairings - would GA still roll up otherwise-identical rows into "(other)" entries if the (undisplayed) labels are different?
Also, am wondering if there would be any hits to performance for similar report types (not looking at labels, just category/action pairings).
Maybe this is just bad practice out of the gate? :)
Google Analytics stores daily, in the processed tables, up to a maximum of 50,000 rows (in Google Analytics 360 the limit increases to 1,000,000 rows, making the problem of data aggregation less frequent). As a result, many combinations of unique dimension values are stored for each table processed every day. If a given table has a larger number of combinations of values of dimensions, Analytics stores the top N values and creates a row of type (other) for the remaining combinations of values.
https://www.analyticstraps.com/valori-raggruppati-in-other-nei-report/
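The roll-up described above can be sketched as follows. This is a toy model, not GA's actual implementation: the function name `roll_up`, the ranking by hit count, and the tiny limit of 2 (standing in for 50,000) are all assumptions made for illustration.

```python
from collections import Counter


def roll_up(rows, limit):
    """Keep the `limit` largest rows and fold the remainder into "(other)"."""
    ranked = sorted(rows.items(), key=lambda item: item[1], reverse=True)
    kept = dict(ranked[:limit])
    overflow = sum(count for _, count in ranked[limit:])
    if overflow:
        kept["(other)"] = overflow
    return kept


# Hypothetical hits keyed by (category, action, label).
hits = {
    ("video", "play", "a"): 5,
    ("video", "play", "a,b"): 3,
    ("video", "play", "a,b,c"): 1,
}

# Table that includes the high-cardinality label: exceeds the limit.
with_label = roll_up(hits, limit=2)

# Table keyed only by category/action: far fewer combinations.
totals = Counter()
for (category, action, _label), count in hits.items():
    totals[(category, action)] += count
without_label = roll_up(dict(totals), limit=2)

print(with_label)
print(without_label)
```

In this model the table without the label stays under the limit, so no "(other)" row appears in it.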
Anyway, I tried a custom report with label and without (same time period) and with label I got (other) while without that dimension I got the actual values.
So the problem you fear does not exist (unless the event action is also high cardinality) :)

Different Active Users count when using segments

I would love to understand what I'm looking at - why are the numbers different in this report when I add a segment?
This is the report without any segmentation:
This is the same report with the Mobile Traffic segment:
There are two methods that Google uses to identify the number of users.
Calculation 1: Pre-calculated data
This calculation relies only on the number of sessions in the given date range and the time of each session. (This is determined by technology managed on the device, like a web browser, and is often referred to as the client-side time.) Because the result of this calculation can be added to the pre-aggregated data tables, Analytics can reference the table to quickly retrieve and serve this data in a report, including when you change the date range.
Calculation 2: Data calculated on the fly
Calculation 2 is based on the way you assign, collect, and store persistent data about your traffic. There are many solutions you can implement to customize this, but the most common way this data is going to be assigned and stored is through cookies managed via a web browser.
Adding a segment will force GA to calculate the data on the fly and that's why you are seeing a difference in the numbers.
Are you using GA free or 360? And is the time range the same in both reports?
You can also have a look into the Google article https://support.google.com/analytics/answer/2992042?hl=en
You are a victim of sampling:
https://support.google.com/analytics/answer/2637192?hl=en
Sampling applies when:
you customize the reports
the number of sessions for the report time range exceeds 500K (GA) or 100M (GA 360)
The consequence is that:
the report will be based on a subset of the data (the % depends on the total number of sessions)
therefore your report data won't be as accurate as usual
What you can do to reduce sampling:
increase the sample size in the UI (this will only reduce sampling to a certain extent, and in most cases won't completely remove it)
reduce time range
create filtered views so your reports contain the data you need and you don't have to customize them

Discrepancy in Google Analytics data when using segments

I'm having a tough time with Google Analytics, trying to understand why the value of metrics changes when segments are applied.
There is a standard audience overview report, which is based on 100% of sessions (no sampling) and the view is not filtered. The period is March of 2017.
Standard "All visitors" segment looks like this:
Then, there is another built-in segment called "Bounced Sessions". When I apply this segment, the "All visitors" values change:
The number of users increases, but the count of pageviews decreases.
Any ideas how to explain this?.. Thank you in advance!
OK, there can be multiple reasons. Let me first explain how these numbers are calculated, then we'll move on to your query.
There are two types of data gathering and manipulation by Google.
Pre-calculated data -- pre-aggregated tables
This is precalculated data that Google uses to speed up the UI. Google does not specify when the calculation is done; it can happen at any point in time. These are known as pre-aggregated tables.
Data calculated on the fly
Some actions you take that require computation or manipulation fall under this category, such as using segments or creating custom reports.
Coming to your problem: when you apply a segment, every metric that it affects is calculated again. This may result in numbers greater than those you see in the normal view.
The standard audience overview report is pre-aggregated at some point during the day. When you apply a segment, the results are calculated from fresh data. Since the latter is more recent, it will usually give you higher values for the metrics. You can also see a decrease; it all depends on your data and user behavior.
Resolution: If you are a premium user, use BigQuery. You should rely on BigQuery for every metric, as the values are fresh and calculated on the fly.

A function in Google Data Studio that calculates the number of sessions from a particular source divided by the overall sessions

I wish to have the percentage of sessions from facebook.com relative to overall sessions and show it in Google Data Studio. I can show the Facebook trend and compare it to the overall trend by adding a calculated field as follows:
CASE WHEN Source = 'facebook.com' THEN 'Facebook' ELSE 'overall' END
But I couldn't figure out how to create a calculated field that shows the percentage of Facebook sessions. Is there a way to do that?
Try to change the type of your metric from number to percent. Just click the "123" / edit mini button of the metric you need to change. Then under type, select numeric and percent.
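For reference, the ratio the question asks for is Facebook sessions divided by all sessions. The arithmetic can be sketched in Python as below; the function name `source_share` and the session counts are hypothetical, and this is not a Data Studio formula.

```python
def source_share(sessions_by_source, source):
    """Return the fraction of all sessions that came from `source`."""
    total = sum(sessions_by_source.values())
    if total == 0:
        return 0.0  # avoid dividing by zero when there is no traffic
    return sessions_by_source.get(source, 0) / total


# Hypothetical session counts per source.
sessions = {"facebook.com": 250, "google": 600, "(direct)": 150}

share = source_share(sessions, "facebook.com")
print(f"{share:.1%}")  # 25.0%
```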

Using the Google Analytics Export API to get a websites trend

I am working on an application that uses the Google Analytics Export API, and I am trying to get the "Trend" result that Google shows against each of your site accounts as a percentage,
i.e. UP 35.04% or DOWN 16.02%
How/where do they get this figure, and is it available in the API somewhere?
I have tried comparing averages of last month to this month, the first week of the period vs. the last week of the period, etc., but I cannot seem to get the same numbers that Google provides.
Any ideas?
Thanks in advance
Doug
The formula is pretty simple (excluding today's data):
(Visits over the last 30 days - Visits between 31 and 60 days ago) / (Visits between 31 and 60 days ago).
You can see it in action in the interface if you go to the default dashboard, where it shows you the last 30 days, then on the calendar, click "Compare to past" and select the default amount. It'll show you the numbers used for each calculation and the calculations as they appear in that account list.
The API does not, however, expose pre-calculated numbers (for example, they don't compute bounce rate for you; they just give you the pieces for it.)
So, you'd need to do two API requests to get this data. One for ga:visits in the last 30 days, and then one for ga:visits in the 30 days prior.
Then, when you get it, just subtract, divide, and multiply by 100, and you'll have the percent you're looking for.
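The calculation above can be sketched in Python. This is a minimal illustration under stated assumptions: the function names are made up, the exact window boundaries (two back-to-back 30-day ranges ending yesterday) are my reading of the formula, and fetching the two ga:visits totals from the API is left to the caller.

```python
from datetime import date, timedelta


def trend_windows(today):
    """Return ((start, end), (start, end)) for the last 30 days and the 30 days before, excluding today."""
    current_end = today - timedelta(days=1)
    current_start = current_end - timedelta(days=29)
    previous_end = current_start - timedelta(days=1)
    previous_start = previous_end - timedelta(days=29)
    return (current_start, current_end), (previous_start, previous_end)


def trend_percent(current_visits, previous_visits):
    """Percentage change of the current period relative to the previous one."""
    if previous_visits == 0:
        raise ValueError("previous period has no visits; trend is undefined")
    return (current_visits - previous_visits) / previous_visits * 100


current, previous = trend_windows(date(2018, 5, 1))
print(current, previous)

# Hypothetical visit totals for the two windows.
print(trend_percent(150, 100))  # 50.0
```

A positive result corresponds to "UP" and a negative one to "DOWN" in the account list.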
UPDATE: The struck-out part of the answer was true, but no longer is. The newest version of the Google Analytics API does provide access to some pre-calculated values.

Resources