Google Optimize - how experiment sessions and conversion rate are calculated? - google-analytics

I use Google Optimize with server-side A/B experiments. Server-side code targets audiences, consistently serves the appropriate variant to each user, etc. I only use the Optimize interface to create experiments, set objectives, create variants, and view reports. I configure Optimize as described in the Server-side experiments with Universal Analytics (analytics.js) documentation.
I have performed A/A test of the Homepage with 50:50 split. The target objective is event action - click some specific button.
I run UI load tests that open the Homepage and then in 30% of cases click the desired button otherwise opens another page. In total 4000 iterations.
I expect to see calculated conversion rate 30% (this is a part of the UI load test logic) but Optimize shows conversion rate 16% (almost a half from the expected). The number of experiment sessions is much higher then 4000 (as assumed from the test), thus, the conversion rate is lower.
The question is how Google Optimize calculates experiment sessions? Why the number of sessions is higher then actual number of the experiment page views? How conversion rate is calculated?

Related

Different Active Users count when using segments

I would love to understand what I'm looking at - why are the numbers different in this report when I add a segment?
This is the report without any segmentation:
This is the same report with the Mobile Traffic segment:
There two methods that Google uses to identify the number of users.
Calculation 1: Pre-calculated data
This calculation relies only on the number of sessions in the given date range and the time of each session. (This is determined by technology managed on the device, like a web browser, and is often referred to as the client-side time.) Because the result of this calculation can be added to the pre-aggregated data tables, Analytics can reference the table to quickly retrieve and serve this data in a report, including when you change the date range.
Calculation 2: Data calculated on the fly
Calculation 2 is based on the way you assign, collect, and store persistent data about your traffic. There are many solutions you can implement to customize this, but the most common way this data is going to be assigned and stored is through cookies managed via a web browser.
Adding a segment will force GA to calculate the data on the fly and that's why you are seeing a difference in the numbers.
Are you using GA free or 360? and the time range you are using is same in both reports?
You can also have a look into the Google article https://support.google.com/analytics/answer/2992042?hl=en
You are victim of sampling:
https://support.google.com/analytics/answer/2637192?hl=en
Sampling applies when:
you customize the reports
the number of sessions for the report time range exceeds 500K (GA) or 100M (GA 360)
The consequence is that:
the report will be based on a subset of the data (the % depends on the total number of sessions)
therefore your report data won't be as accurate as usual
What you can do to reduce sampling:
increase sample size in UI (will only decrease sampling to a certain extend, but in most cases won't completely remove sampling)
reduce time range
create filtered views so your reports contain the data you need and you don't have to customize them

Google Analytics suddenly started Sampling Data, 3k sessions for property over time period

We are using the free level of GA and have been creating reports using Custom Dimensions and Metrics since last summer.
We also use the Google Sheets Analytics add-on to post process data pulled from the API.
Overnight on 16-17 May (UK Time), our reports suddenly started showing as being sampled. Prior to that we had no sampling at all, as our reports are scheduled so I can look back through the revision history to see changes made when the scheduled reports run.
This sampling is occurring in custom reports viewed in the GA platform and in GA sheets. I've done some analysis and it appears to only occur at the point that more than one Custom Dimension is added to a report, or when the GA dimensions ga:hour or ga:dateHour are used (ga:date does not trigger sampling).
All our Custom Dimensions and Custom Metrics are set at Hit level (I've read a post where it was claimed to be due to mixing scopes on Dimensions & Metrics, but we are not doing this).
If I reduce the date range of a query (suggested as a solution on many blogs), the sampling level actually gets worse rather than better.
For the month of May we didn't even hit 4k sessions at property level. I can't find any reference anywhere to any changes being made to GA that would cause sampling to apply to our reports (change documentation, Google Blogs etc).
Is anyone else experiencing this or can anyone shed any light on why this might be happening? Given how we use GA if we can't resolve this then it's a year of work down the drain, so I'm really keen to at least know why this has suddenly happened even if ultimately nothing can be done about it.

Site Speed for Google Analytics not working

I am successfully tracking traffic on a site (so far over 150 hits), however the Site Speed is still showing all zeroes. I know that the default is to only calculate speed for 1% of the requests, but it seems that nothing is getting tracked at all for Site Speed. I'm using the newest version of Chrome.
Any ideas? Thanks!
Site speed sample rate sets the percentage of users to be tracked not the percentage of requests. Here's what the docs say:
This setting determines how often site speed tracking beacons will be sent. By default, 1% of users will be automatically be tracked. Note: Analytics restricts Site Speed collection hits for a single property to the greater of 1% of users or 10K hits per day in order to ensure an equitable distribution of system resources for this feature.
In your case, if you don't expect to have more than 10K hits to your property in a single day, you can safely set the sample rate to 100%. Note: this has to be done in the create method:
ga('create', 'UA-XXXX-Y', {'siteSpeedSampleRate': 100});
Have you called the _setSightSpeedSampleRate() method? That would be my best guess. According to the Google docs:
The _setSiteSpeedSampleRate() method must be called prior to _trackPageview() in order to be effective.

Could "filling up" Google Analytics with millions of events slow down query performance / increase sampling?

Considering doing some relatively large scale event tracking on my website.
I estimate this would create up to 6 million new events per month in Google Analytics.
My questions are, would all of this extra data that I'm now hanging onto:
a) Slow down GA UI performance
and
b) Increase the amount of data sampling
Notes:
I have noticed that GA seems to be taking longer to retrieve results for longer timelines for my website lately, but I don't know if it has to do with the increased amount of event tracking I've been doing lately or not – it may be that GA is fighting for resources as it matures and as more and more people collect more and more data...
Finally, one might guess that adding events may only slow down reporting on events, but this isn't necessarily so is it?
Drewdavid,
The amount of data being loaded will influence the speed of GA performance, but nothing really dramatic I would say. I am running a website/app with 15+ million events per month and even though all the reporting is automated via API, every now and then we need to find something specific and use the regular GA UI.
More than speed I would be worried about sampling. That's the reason we automated the reporting in the first place as there are some ways how you can eliminate it (with some limitations. See this post for instance that describes using Analytics Canvas, one my of favorite tools (am not affiliated in any way :-).
Also, let me ask what would be the purpose of your events? Think twice if you would actually use them later on...
Slow down GA UI performance
Standard Reports are precompiled and will display as usual. Reports that are generated ad hoc (because you apply filters, segments etc.) will take a little longer, but not so much that it hurts.
Increase the amount of data sampling
If by "sampling" you mean throwing away raw data, Google does not do that (I actually have that in writing from a Google representative). However the reports might not be able to resolve all data points (e.g. you get Top 10 Keywords and everything else is lumped under "other").
However those events will count towards you data limit which is ten million interaction hits (pageviews, events, transactions, any single product in a transaction, user timings and possibly others). Google will not drop data or close your account without warning (again, I have that in writing from a Google Sales Manager) but they reserve to right to either force you to collect less interaction hits or to close your account some time after they issued a warning (actually they will ask you to upgrade to Premium first, but chances are you don't want to spend that much money).
Google is pretty lenient when it comes to violations of the data limit but other peoples leniency is not a good basis for a reliable service, so you want to make sure that you stay withing the limits.

Google Analytics Event Tracking reporting inflated event values

We recently released two typefaces on our website for free (albeit suggesting an optional donation). I decided we should track downloads through Google Analytics using the event feature, so we ended up adding the corresponding JS snippet to the download form (on submit), something akin to this:
_gaq.push(['_trackEvent', 'Typeface', 'Download', 'Typeface #1', parseInt($('input[name=amount]').val(), 10) || 0]);
I also decided we might as well use GA to keep track of donations, so as you might have noticed the optional donation amount is being sent as the event value argument. There's already a browser-side numeric-only verification, and it will set it to 0 in case it's empty (NaN), so we're completely sure it's always an integer (required type for the argument).
I configured two different goals (one for each typeface) in our GA profile, using the two different events as their respective conditions, as recommended by every howto I've been reading about this subject.
However, some of the reported data appears to be somewhat inflated. According to GA there's been, as of now, 455 unique events out of 550 total events, which seems to be okay, but apparently it's worth a value of over a million dollars. And, believe me on this, we have not received such a huge amount, at least just yet.
According to GA: Event Value is the total value of an event or set of events. It is calculated by multiplying the per-event value by the number of times the event occurred.
I assumed I could set individual values to different instances of the same event, even GA documentation leads me to believe so with their examples, so I don't really understand why it's being reported as such an inflated total value.
Is there something wrong with my assumption? Is this the correct approach to what I'm trying to accomplish? should I just forget about keeping track of donations using this method and resort to using the e-commerce feature instead as I've also been reading about?
I'm not checking for any verification of a donation successfully completing, so I'm left with an estimate and I'm okay with that. Maybe someone jokingly wrote off some exaggerated amount then never completed the donation process?
Your assumption is right : you could set individual values to each event and "the report adds the total values based on each event count" (as explain in doc).
The main problem with your approach is the one you mentioned : you count the donation at form validation, before its confirmation and even before you told your visitor that the donation must be made via PayPal. So yes : some people probably wrote off some exaggerated amount or simply not complete the donation process.
I recommend you to use e-commerce tracking after the PayPal payment to avoid unconfirmed donation tracking and the lack of deduplication using goals values to monitor amounts.

Resources