Google Analytics: Making an Experiment respond to earnings

I'm currently in the process of creating an A/B test in Google Analytics (aka an "Experiment"). It's for a ticket purchasing page where there are several levels of tickets for sale (e.g. Standard, Premium, First Class, etc.).
We want the winning page variant to be the one that earns the most money, not necessarily the one that triggers the most Goals (i.e. the number of successful transactions).
For example:
Variant A: 20 sales, $100 total sales value (20 × Standard tickets)
Variant B: 5 sales, $1,000 total sales value (5 × First Class tickets)
We would want Variant B to win, even though the number of events triggered (aka "Goals") was higher with Variant A.
I cannot find this issue discussed anywhere, but surely the average value of an Event should be factored into the success of a Variant?
We have the Goal configured like this at the moment:
[screenshot of the Goal configuration]
One thing I've considered is increasing the Value threshold to be larger than the average sale, so the Goal doesn't trigger unless the page performs better than average. This isn't ideal for obvious reasons, though.
Is there another way?
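A minimal sketch of how the sale amount could at least be recorded per variant, assuming the classic ga.js tracker (the category, action, label and amount below are all hypothetical). An event Goal can then use the event value as its Goal Value:

```javascript
// On the purchase confirmation page: report the sale as an event whose
// value is the transaction amount, so each variant accumulates revenue
// rather than just a count of conversions.
var saleValueDollars = 200; // hypothetical: one First Class ticket
_gaq.push(['_trackEvent',
  'Tickets',         // category (hypothetical)
  'purchase',        // action (hypothetical)
  'first-class',     // label (hypothetical)
  saleValueDollars   // integer event value
]);
```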

Related

Frustrating Formula - Help Needed with Google Sheets formula/method

I am struggling to come up with a formula that fits certain criteria and was hoping someone with a better math brain than mine might be able to help. What I have is a Google Sheets-based tool that determines how much someone has purchased of a product and then calculates the number of times a special additional offer will be redeemed based on the amount spent.
As an example, the offer has three tiers. Though the actual costs will vary for different offers, let's say the first tier is gained with a $10 purchase, the second with a $20 purchase and the third with a $35 purchase (the only real relationship between the prices is that they get higher for each tier; there is no specific pattern to the costing of different offers). So if the customer bought $35 worth of goods they would get three free gifts, if they bought $45 worth they would get 4, and an additional spend of $5 (totaling $50) would then allow them to redeem 5 gifts in total. It can be thought of like filling a bucket: each time you hit the red line you get a new gift, and when the bucket is full it's emptied and the process begins again.
If the tiers increased by a fixed amount (e.g. $5, $10 and $15) this would be a simple case of dividing the total purchase amount, but as there is no specific relationship between the costs of the tiers (they are based on the value of the contents) I am having trouble coming up with a simple 'bucket filling' formula or calculation method that will work for any price ranges given to it. My current solution involved taking the modulus, subtracting offer amounts from the purchase amount, etc., but it breaks in plenty of cases. If anyone could give me a start or provide some information that might help in my quest I would be highly appreciative; let me know if my explanation is unclear. Thanks in advance and all the best.
EDIT:
The user has three tiers, and then the offer wraps around to the start after the initial three are unlocked once, looping until the offer has been maxed out. Avoiding a long sheet with a dynamic column of prices would be preferable; a small, multi-cell formula would be ideal.
What you need is a lookup table. Create a table with the tier value in the left column and the corresponding number of gifts for that tier value in the right column. Then you can use VLOOKUP to match the amount spent to the correct tier — e.g., something like =VLOOKUP(A2, $D$2:$E$10, 2, TRUE) with the table sorted ascending, so the approximate match returns the highest tier at or below the amount (the cell ranges here are hypothetical).
I am not quite sure how to get everything into one formula, though (is there a formula for looping and building arrays?).
From my understanding the tier amounts are variable, so every time you add a new tier with a new price limit it must be calculated against that new limit... wouldn't it be much easier to write such a module in JavaScript than in a Google Sheet? :o
Anyway, here is my workaround; it may help you find an idea.
Example Doc
https://docs.google.com/spreadsheets/d/1z6mwkxqc2NyLJsH16NFWyL01y0jGcKrNNtuYcJS5dNw/edit#gid=0
My approach (a JavaScript sketch of this loop follows below):
- enter the purchase value
- filter all tier items that are smaller than or equal to ("<=") it (save all matched items somewhere as placeholders)
- then decrease the purchase value by the largest of the filtered items
- save the new purchase value somewhere and begin again with the filtering and decreasing (this has to be repeated until the purchase value is used up)
- after that, sum up all the placeholders
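For comparison, the same bucket-filling idea as a Google Apps Script custom function. A minimal sketch, assuming the tier prices are cumulative thresholds within one bucket that reset once the last tier is reached (the function name and default tiers are hypothetical; note that under this reading the 5th gift lands at $55 rather than $50, so the wrap rule may need adjusting):

```javascript
/**
 * Bucket-filling gift counter, usable as a custom function in Sheets,
 * e.g. =GIFT_COUNT(B2) or =GIFT_COUNT(B2, D2:D4).
 */
function GIFT_COUNT(spent, tiers) {
  // Flatten a Sheets range (a 2D array) or fall back to the example tiers.
  tiers = tiers ? [].concat.apply([], tiers) : [10, 20, 35];
  var bucketSize = tiers[tiers.length - 1]; // spend that empties one bucket
  // Every full bucket yields one gift per tier...
  var gifts = Math.floor(spent / bucketSize) * tiers.length;
  // ...then count the thresholds crossed in the partial bucket.
  var remainder = spent % bucketSize;
  for (var i = 0; i < tiers.length && remainder >= tiers[i]; i++) {
    gifts++;
  }
  return gifts;
}
```

With the example tiers this returns 3 for a $35 purchase and 4 for $45, matching the question.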

In Google Experiments, how can I use Conversions per Unique Visitor as the Conversion Rate

We're running A/B Testing Experiments through Google Analytics. In one experiment, we defined a "conversion" as being when a user signs up for an account on our site. Each unique visitor can only convert once, since then they'll already have an account and can't sign up again. However, Google Analytics calculates the Conversion Rate as being Conversions per Experiment Session, so that if a user who already converted returns to our site, it will count as another session without a conversion.
I believe that this is messing up our results, because registered users are more likely to return to our site, so they would be adding sessions but not conversions to the total count, thus lowering the Conversion Rate for their variation. I would like Google to choose an Experiment Winner based on a Conversion Rate defined as Conversions per Unique Visitor. Is this possible?
Here's the experiment data (with a 50/50 split):
Original: 652 Experiment Sessions, 53 Conversions, 8.13% Conversion Rate, 0% Compared to Original, 0.0% Probability of Outperforming Original
Variation 1: 492 Experiment Sessions, 54 Conversions, 10.98% Conversion Rate, +35.02% Compared to Original, 94.6% Probability of Outperforming Original
In this case, the Original had about the same number of conversions, and presumably about the same number of unique visitors, but quite a few more sessions. I take this to mean that people who saw the original version were more likely to return to the site in another session after creating an account. However, Variation 1 is approaching 95% confidence because of its higher Conversion Rate, and is likely to be the winner. This just doesn't seem right to me. Is there a way to get better results here?
I believe the best approach here would be to use a custom analytics event and trigger it according to your rules. Then, in the experiment, choose this event as its conversion goal.
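For example, a sketch assuming the classic ga.js tracker (the category and action names are hypothetical): fire the event only on the page shown immediately after a successful signup, and make the experiment's goal an event goal matching that category and action.

```javascript
// On the signup-success page only. Returning sessions from users who
// already have an account never reach this page, so they can't
// re-trigger the conversion event.
_gaq.push(['_trackEvent', 'Account', 'signup-complete']);
```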

How does collection sampling affect the "live" stats for Google Analytics?

We've noticed lately that as our site is growing, our data in Google Analytics is getting less reliable.
One of the places we've noticed this most strongly is on the "Realtime Dashboard".
When we were getting 30k users per day, it would show about 500-600 people online at a time. Now that we are hitting 50k users per day, it's showing 200-300 people online at a time.
(Other custom metrics from within our product show that the user behavior hasn't changed much; if anything, users are currently spending longer on the site than ever!)
The daily totals in analytics are still rising, so it's not like it's just missing the hits or something... Does anyone have any thoughts?
The only thing I can think of is that there is probably a difference in interpretation of what constitutes a user being online.
How do you determine whether a user is online?
Unless there is explicit login/logout tracking, is it possible that it assumes a user has gone if there is no user-generated event or request from the browser within an interval of X seconds?
If that is the case then it may be worthwhile adding a hidden iframe with some JavaScript code that keeps sending a request every t seconds.
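Something like this sketch, assuming the classic ga.js tracker (the interval, category, action, and label are hypothetical; a plain timer is used here instead of a hidden iframe, but the effect is the same):

```javascript
// Send a non-interaction "heartbeat" event every t seconds so the
// tracker keeps seeing activity from users who are present but idle.
var t = 30; // seconds (arbitrary example value)
setInterval(function () {
  // Arguments: category, action, label, value, non-interaction flag.
  // The trailing 'true' keeps these pings from skewing bounce rate.
  _gaq.push(['_trackEvent', 'heartbeat', 'ping', 'tick', 0, true]);
}, t * 1000);
```

One caveat: every ping is an interaction hit, so on a busy site this inflates the hit volume that counts toward GA's data limits.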
You can't compare instant measures of unique, concurrent users to different time-slices of unique users.
For example, you could have a small number of concurrent unique users (say 10) and a much higher daily unique-user count like 1,000, because 1,000 different people were there over the course of the day but only 10 at any given time. The number of concurrent users isn't correlated with total daily uniques; the distribution over the course of the day may be uneven, and it's almost apples and oranges.
This is the same reason monthly uniques and daily uniques can't simply be combined, although average daily uniques are a lower bound for monthly uniques.

Big difference in "visitor" count

I'm trying to pull out the (unique) visitor count for a certain directory using three different methods:
* with a profile
* using a dynamic advanced segment
* using a custom report filter
On a smaller site the three methods give the same result, but on the large site (> 5M visits/month) I get a big discrepancy between the profile on the one hand and the advanced segment and filter on the other. This might be because of sampling - but the difference is smaller when it comes to pageviews. Is the estimation of visitors worse, and the discrepancy bigger, when using sampled data? Also, when extracting data from the API (using filters or profiles) I still get DIFFERENT data even though GA doesn't indicate that the data is sampled - i.e. when I should be looking at unsampled data.
Another strange thing is that the pageviews are higher in the profile than in the filter, while the visitor count is higher for the filter than for the profile. I also applied a filter at the profile level to force it to use sampled data - and again I get results quite similar to the filter and segment data.
             profile   filter  segment  filter#profile
unique         25550    37778    36433           37971
pageviews     202761   184130      n/a          202761
What I am trying to achieve is a way to get somewhat accurate data on unique visitors when I've run out of profiles to use.
More data with discrepancies can be found in this google docs: https://docs.google.com/spreadsheet/ccc?key=0Aqzq0UJQNY0XdG1DRFpaeWJveWhhdXZRemRlZ3pFb0E
Google Analytics (free version) tracks only 10 million page interactions [0] (pageviews and events; any tracker method that starts with "track" is an interaction) per month [1], so presumably the data for your larger site is already heavily sampled (I'd guess each of your 5 million visitors has more than two interactions) [2]. Ad hoc reports use at most 1 million data points, so you have a sample of a sample. Naturally, aggregated values suffer more from smaller sample sizes.
And I'm pretty sure the data limits apply to API access too (Google says that there is "no assurance that the excess hits will be processed"), so for the large site the API returns sampled (or incomplete) data as well - you cannot really be looking at unsampled data.
As for the differences, I'd say that different ad hoc reports use different samples, so you end up with different results. With GA you shouldn't rely too much on absolute numbers anyway; look more for general trends.
[0] Data limits: http://support.google.com/analytics/bin/answer.py?hl=en&answer=1070983
[1] Analytics Premium tracks 50 million interactions per month (and has support from Google) but comes at 150,000 USD per year.
[2] Google suggests using "_setSampleRate()" on large sites to make sure you have actually sampled data for each day of the month, instead of random hit-or-miss after you exceed the data limits: https://developers.google.com/analytics/devguides/collection/gajs/methods/gaJSApiBasicConfiguration#_gat.GA_Tracker_._setSampleRate
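For reference, a sketch of where _setSampleRate() goes in the classic ga.js async snippet (the property ID and the 80% rate are placeholder values):

```javascript
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXX-Y']); // placeholder property ID
_gaq.push(['_setSampleRate', '80']);      // sample 80% of visitors (string percentage)
_gaq.push(['_trackPageview']);
```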
Yes, the sampled data is less accurate, especially with visitor counts.
I've also seen them miss 500k pageviews over two days, only to have them appear in reporting a few days later. It also doesn't surprise me to see different results from different interfaces. The quality of Google Analytics has diminished even as they have tried to become more real-time. It appears that their codebase is inconsistent across APIs, and their algorithms are all over the map.
I usually stick with the same metrics and reporting methods, so that my results remain comparable to one another. I also run GA in tandem with Gaug.es, as a validation and sanity check. With that extra data, I choose the reporting method in GA that I am most confident with and I rely on that exclusively.

Confusion over Google Analytics (GA) Absolute Unique Visitors data

GA Unique Visitors data isn't making sense to me. From the GA FAQ we get the following definition for 'Visits vs. Visitors':
"The initial session by a user during any given date range is considered to be an additional visit and an additional visitor. Any future sessions from the same user during the selected time period are counted as additional visits, but not as additional visitors. "
The part that I can't resolve with the GA graph is "Any future sessions from the same user during the selected time period are counted as additional visits, but not as additional visitors". For the graph below covering a 30-day period, I would understand the GA definition to mean that the data represents uniqueness across all 30 days, right? But if you look at the screen shot below, you see a regular pattern for each week over the 30-day period the report covers. From that, it seems the numbers we are seeing associated with each of the days of the graph (e.g. 3.92% (4142) for Tuesday, September 8) is a count of unique visitors just in the context of that one day - i.e. without correlating their uniqueness to the rest of the days in the 30-day period. If the graph actually showed uniqueness across the 30-day period, I would expect the daily numbers to start high in the early days of the period and decrease over the 30-day period as the number of already-seen visitors (i.e. returning visitors) increases, no?
What am I missing here?
UPDATE
Helpful clue from Jonathan S. below got me on the right track.
I think I understand now what the daily bar graph values mean, but it's a little counter-intuitive, and I'd bet it's not what some others are assuming either. The report states "39,822 Absolute Unique Visitors" at the top, which means just that: over the 30-day period we saw this many uniques. Fair enough. The confusing part is that the daily (or weekly) bar values in the graph below are not mutually exclusive uniques as I had assumed, but are values relative only to the 39,822 total - i.e. there is overlap between the unique visitor counts across any group of days. This means the sum of the daily % values is > 100% and the sum of the daily count values is > 39,822. The algorithm is: when you visit for the first time in the 30-day period - call that "today" - you add 1 to the total (39,822) and 1 to the "today" bar value. When you show up again "tomorrow", you are NOT counted again in the total, but ARE counted as 1 in the "tomorrow" bar value.
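A toy simulation of that counting rule (the visit log here is hypothetical) shows why the daily bars can sum to more than the period total:

```javascript
// A visitor increments the period total only on their first visit,
// but increments the bar of every day on which they appear.
var visits = [
  { visitor: 'a', day: 1 }, { visitor: 'a', day: 2 }, // 'a' returns on day 2
  { visitor: 'b', day: 1 },
  { visitor: 'c', day: 2 },
];
var seenInPeriod = {}, seenOnDay = {}, total = 0, dailyBar = {};
visits.forEach(function (v) {
  var dayKey = v.visitor + '@' + v.day;
  if (!seenOnDay[dayKey]) {        // counted once per visitor per day
    seenOnDay[dayKey] = true;
    dailyBar[v.day] = (dailyBar[v.day] || 0) + 1;
  }
  if (!seenInPeriod[v.visitor]) {  // counted once per visitor per period
    seenInPeriod[v.visitor] = true;
    total++;
  }
});
console.log(total, dailyBar); // 3, { '1': 2, '2': 2 } -- the bars sum to 4 > 3
```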
(Screenshot: http://img.skitch.com/20090922-djti81ejj5gqn575ibf8cj1e8x.jpg)
I believe it's just an issue of grouping. The top right of the graph has 3 icons to group by day, week, or month. It's currently grouping by day. So if I visit your site today and come back tomorrow, I'll be counted once for each day.
I tried looking at the month view for one of my sites but it didn't give me much meaningful data. I believe the above should answer your original confusion though.
Is it possible that you're searching for something that no longer exists? Unique Visitors/Visits is old terminology. Check: https://www.seroundtable.com/google-analytics-sessions-users-18424.html
Then check how sessions and users are defined:
Sessions ("ex-visits", it's very detailed): https://support.google.com/analytics/answer/2731565?hl=en&ref_topic=1012046
Users in Google Analytics reporting are defined as "Users who have initiated at least one session during the date range". So IMHO it's not about 30 days, it's about the SELECTED date range.
I hope this helps.
