Client ID and Sampling - google-analytics

I have a site that is on the brink of reaching the collection limits, and I would like to know more about the behaviour and effects of sampling, both when a sampling rate is implemented by the developer and when Google starts to enforce sampling.
Question 1: Does anybody know how sampling treats multiple visits by a unique visitor, so that the sample gives a true view? By this I mean: once a particular Client ID has been included in a sample, will all future sessions for that Client ID be included, or is this not guaranteed, meaning only some visits by a particular Client ID will be collected within the sample?
Question 2: As an extension to Question 1, does sampling across devices also ensure the same Client ID is collected? That is, if a PC browser and a tablet browser have the same Client ID set, will GA ensure both are included in the sample?
Question 3: As an extension to Questions 1 and 2, does developer-implemented vs Google-enforced sampling cause different behaviours?
Question 4: As an extension to the above, if I track website and app session activity to the same account and use the same Client ID, will sampling ensure both are included?
Thank You

Related

Google Analytics Hit Quotas

I wonder whether someone can help me please.
I have a user who under a specific property, sporadically receives the following error:
Some hits sent on 03-Jul-2018 to property ...... exceeded one or more hit quotas and were therefore not processed.
Hits can be dropped when daily or monthly hit limits are exceeded. You can view your hit volume levels in Property Settings in Analytics.
Hits can also be dropped if visitor hit limits are exceeded. This can happen when your site is incorrectly generating the visitor ID for a GA session. Contact your website administrator to check that the visitor ID generation has been correctly implemented.
They are not using the Premium account but when I look at the data for the day in question, there aren't any issues with regards to 'High Cardinality' which unless I've misunderstood I'd expect to see.
Could someone look at this please and offer some guidance where the issue may be because this area is fairly new to me.
Many thanks and kind regards
Chris
Collection limits are influenced by 2 factors:
The tracker: whether you use ga.js, gtag.js, analytics.js, etc. Here are the details.
The property type: whether you are using GA (10M hits / month) or GA 360 (2B hits / month).
In your case you are facing a property limit. To find out when such limits were reached, you can create a custom report using a time dimension (e.g. date + time) combined with the hits metric. You can also combine the hits metric with other dimensions (country, browser, device) to see if you find any patterns as to why you're getting so many hits.
Cardinality is something else: it refers to the number of unique values for a dimension. For instance, if you have 500K events where each event category is different, you'll have a cardinality of 500K on the event category dimension. The more hits, the more likely you'll have a high cardinality, but the two aren't necessarily related (if you send 10B events with the same category, the cardinality on the category is 1).
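To make the difference between hit volume and cardinality concrete, here is a minimal sketch. The hit objects and the `eventCategory` field name are made up for illustration; this is not how GA stores data internally.

```javascript
// Count the distinct values of one dimension across a list of hits.
function cardinality(hits, dimension) {
  return new Set(hits.map((hit) => hit[dimension])).size;
}

// Five events, each with a different category.
const distinctCategories = Array.from({ length: 5 }, (_, i) => ({
  eventCategory: 'cat-' + i,
}));

// Five events that all share one category.
const sameCategory = Array.from({ length: 5 }, () => ({
  eventCategory: 'video',
}));

console.log(cardinality(distinctCategories, 'eventCategory')); // 5
console.log(cardinality(sameCategory, 'eventCategory')); // 1
```

Both lists contain the same number of hits, so they count the same towards the hit quota, but only the first one has a high cardinality.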
So focus on identifying and solving your limits/quotas issue, as it's the real issue here:
If the number of hits is legitimate (you have a huge amount of traffic), then the only options are to upgrade to GA 360 or reduce the number of hits for each session
If the number of hits is abnormally high (eg traffic is stable but hits increased dramatically), look for implementation issues, especially generic event trackers such as error tracking with tools like Google Tag Manager

How can I view individual hits to pages within a GA custom report

I would like to compare some data between a 3rd party analytics tool and GA.
Ideally I would see the IP addresses that GA is receiving, but it seems they do not reveal this information. That is fine; however, I cannot find a way to use the flat table in a GA custom report to show me the following, if possible:
Full Date Time (Seems as though they don't want you to have this either)
Browser Version
Browser Width & Height
Page (from the hit)
And I would like this data not to be grouped by the metric, this way I can see that if the same user has hit a page 3 times it isn't grouped.
If anyone can help please let me know. If the question is poorly phrased please let me know.
Thanks,
Connor.
This requires some work, and it will allow the breakdown only for future hits, not for hits that are already collected.
To view individual hits you need to create a hit-based dimension that is unique per hit. Unless your page has an amazing amount of traffic, a timestamp in milliseconds (e.g. new Date().getTime()) will be sufficient (for your report you might want to format that in a nice way). So in the admin section of your GA property you go to custom definitions, create a hit-scoped custom dimension, and then modify your page code to send the timestamp to that dimension. Hit-scoped means it is attached to the pageview (or other interaction hit) it is sent with.
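A rough sketch of what sending the timestamp could look like with analytics.js. It assumes the hit-scoped custom dimension was created at index 1 (so it is addressed as `dimension1`); the helper names are made up for this example, so adjust both to your own setup.

```javascript
// Assumption: the hit-scoped custom dimension lives at index 1.
// Change this to match the index shown in your GA admin section.
const HIT_TIMESTAMP_DIMENSION = 'dimension1';

// Build the extra fields for one hit: a millisecond timestamp as a string.
function hitTimestampFields(now = new Date()) {
  return { [HIT_TIMESTAMP_DIMENSION]: String(now.getTime()) };
}

// Send a pageview through the analytics.js command queue,
// attaching the timestamp to this hit only.
function sendPageviewWithTimestamp(gaFn) {
  gaFn('send', 'pageview', hitTimestampFields());
}

// Only runs when the analytics.js snippet has defined the global ga().
if (typeof ga === 'function') {
  sendPageviewWithTimestamp(ga);
}
```

Passing the dimension in the fields object of the `send` command attaches it to that single hit, which is what makes each row in the report distinct.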
If you want to break down your report by user you need the clientid (clientid is how Google recognizes that hits belong to the same user). Again, send it as a custom dimension.
This does not tell you how many sessions the user had (there is no session identifier in GA). If you need to know that, you can create a session-scoped custom dimension and send a random number along ("session scope" means that GA only stores the last value in a session, so you don't need to maintain a session ID over multiple pageviews, since the last value will be set for all hits within the session). The number of different session IDs per client ID then tells you the number of sessions per user.
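Once the report is exported, counting the sessions per user is a simple grouping step. A minimal sketch, assuming each exported row carries a `clientId` and a `sessionId` field (both names are hypothetical and depend on how you label your custom dimensions):

```javascript
// Count distinct session IDs per client ID from exported report rows.
function sessionsPerClient(rows) {
  const sessionsByClient = new Map();
  for (const { clientId, sessionId } of rows) {
    if (!sessionsByClient.has(clientId)) {
      sessionsByClient.set(clientId, new Set());
    }
    sessionsByClient.get(clientId).add(sessionId);
  }
  const counts = {};
  for (const [clientId, sessions] of sessionsByClient) {
    counts[clientId] = sessions.size;
  }
  return counts;
}

// Made-up sample rows: client A has hits in two sessions, client B in one.
const rows = [
  { clientId: 'A', sessionId: '17' },
  { clientId: 'A', sessionId: '17' },
  { clientId: 'A', sessionId: '42' },
  { clientId: 'B', sessionId: '99' },
];

console.log(sessionsPerClient(rows)); // { A: 2, B: 1 }
```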
The takeaway is that GA only shows aggregated data, and if you want to defeat this mechanism you need to throw data at it that cannot be aggregated further. You might run into other constraints (e.g. there is a limited number of rows per report).

Google Analytics Client Sampling when over 10 million hits/month

Google Analytics Documentation says the following:
These limits apply to the Web Property / Property / Tracking ID.
10 million hits per month per property
If you go over this limit, the Google Analytics team might contact you
and ask you upgrade to Analytics 360 or implement client sampling to
reduce the amount of data being sent to Google Analytics.
For monthly total Analytics 360 limits, please contact your account
manager or service representative.
What does this mean exactly?
I know there is sampling, the kind you see in your reports.
But if your traffic exceeds 10 million hits per month, is there an automatic sampling system which prevents you from capturing all incoming traffic?
In other words: does Google limit your traffic automatically at the source? Not in the reports but at the source. Let's say I capture 20 million hits a month: will I have all that traffic in my property, or does it stop at a certain point?
Again: I'm not talking about report sampling but about the data actually captured in a month.
Thanks in advance
No, Google does not limit data collection. You have to implement this yourself, although they give you a means to do that, at least in the JavaScript tracking code. Implementing sampling yourself would be a little tricky, since you want to sample out whole sessions, not individual pageviews.
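For reference, analytics.js exposes this as the `sampleRate` field on the `create` command (e.g. `ga('create', 'UA-XXXXX-Y', {sampleRate: 5})`), and ga.js as `_setSampleRate`. If you did roll your own, one way to avoid splitting a visitor's sessions is to derive the decision from the client ID instead of rolling a die per hit. The sketch below is an illustration of that idea only; it is not Google's algorithm, and the function names are made up.

```javascript
// Illustration only: NOT Google's algorithm, just one way to make a
// sampling decision that is stable for a given client ID.

// 32-bit FNV-1a hash of a string, returned as an unsigned integer.
function fnv1a32(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Returns true if this client falls inside the sampled percentage (0-100).
function isClientSampledIn(clientId, samplePercent) {
  const bucket = fnv1a32(clientId) % 100; // stable bucket in 0..99
  return bucket < samplePercent;
}

// The same client ID always gets the same answer, so either all of a
// visitor's hits are sent or none are.
const exampleClientId = '1234567890.1530576000'; // made-up client ID
console.log(isClientSampledIn(exampleClientId, 50) === isClientSampledIn(exampleClientId, 50)); // true
```

Because the decision depends only on the client ID, it stays consistent across pageviews and sessions as long as the client ID itself does not change.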
If you record 20 million hits you will have them in your property. But at that point you operate outside your quota and Google has the right to terminate your account (they will not do that without getting in contact with you, provided you respond to mails sent to the Google accounts authorized to use your GA properties).
So far Google has been, in my experience, very generous even with large overruns, but you should not base something business critical on the violation of TOS for a free service.

Hits Processed Per Month?

If you refer to http://www.google.com/intl/en_uk/analytics/premium/features.html, you will notice that Standard allows for 10 million hits processed per month and Premium allows for 1 billion.
I have a website on an account, with multiple "folders" for different sub-domains, and also different "Views" or dashboards for some of these sub-domains.
The website I am working on recently lost tracking for conversion rates, and everything has plummeted to near 0%, which is an incorrect statistic. I am curious as to how I can figure out whether this account is reaching the 10 million limit on the standard version, or at least how to find the actual hits processed per day, week, or month.
Any ideas?
Thanks!
I don't know how Google enforces hit limits in 2015. However in 2013 a Google representative sent one of our bigger clients a document (answering a question about data limits) that contained the following paragraph:
How do data limits impact sampling? Google Analytics does not sample
your clients data at the point of collection or processing, regardless
of how far they exceed our stated limits. So no hits are discarded.
The only way to sample data at the point of collection is for clients
to use _setSampleRate in their tracking code.
[...]
[...] we reserve the right to shutdown their account [sc. if limits are exceeded], but it won't
happen before we have attempted to contact the account Admins multiple times
and we have exhausted all other options.
Unless Google has changed its policy in the last 1.5 years, I would say no, unprocessed hits are not your problem; it seems Google would have contacted you with a request to limit your hits or upgrade to Analytics Premium before problems occur.
Plus, since you mentioned that you have several views - views do not count towards your quota (they display the same data in different ways). However properties (I think that is what you mean by "folders") do.
Updated 2017: It seems that Google intends to enforce limits more strictly. One of my clients now has the following warning in his GA interface:
Your data volume (XXX hits) exceeds the limit of 10M hit per month as
outlined in our terms of service. If you continue to exceed the limit
you will lose access to future data.
You can create a database table, like this:
create table visits(
id bigint primary key auto_increment,
ip text,
visit_date timestamp default current_timestamp
);
Upon each page visit, you can insert a record into the table. Later you can view statistics. For instance, visit count in a given day would look like:
select count(*) as visit_count
from visits
where visit_date >= '2015-07-21 00:00:00' and visit_date < '2015-07-22 00:00:00';

Track count of events unique by user rather than session

We have a way to fetch the number of sessions unique per device and the number of uses of The New Feature. This can be done with a public API and requires two events to be sent by the mobile applications to the Google Analytics server. It will give us a statistic for the sessions in which The New Feature was used, although it doesn't directly reflect individual users' activity.
Ex: the app was opened 1000 times among all unique users (devices), The New Feature has been opened 200 times, the resulting value is 200/1000 or 20%. The drawback is that in this case we have no way to tell whether it was one user who opened The New Feature 199 times and another who opened it just once, in which case the real retention rate is low to none.
The second statistic that we are aiming to calculate is the percentage of unique users who have used The New Feature at least N times during the given period. This statistic should be a closer representation of the real retention of The New Feature, as it will show both the share of users who used the feature and how frequently they used it. For that, it is not clear to us which events need to be set up.
Ex: the app was opened 1000 times: user A used The New Feature 10 times, user B 5 times, user C 4 times, most of the other users who used The New Feature opened it 2 times - The New Feature was opened 200 times in total. The resulting percentage of users: 10% have opened The New Feature at least once, 8% used it at least 2 times, ..., 1% used it at least 10 times.
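To spell out the calculation we are after, here is a small sketch. It assumes we can somehow get a per-user count of feature uses (the `usesByUser` object and the total of 10 users are made-up numbers, smaller than the example above); how to get that out of GA is exactly the open question.

```javascript
// Share of all users who used the feature at least n times.
// usesByUser maps a user (device) ID to its number of feature uses.
function shareWithAtLeast(usesByUser, totalUsers, n) {
  const qualifying = Object.values(usesByUser).filter((uses) => uses >= n).length;
  return qualifying / totalUsers;
}

// Hypothetical data: 10 users in total, 5 of whom used the feature.
const usesByUser = { A: 10, B: 5, C: 4, D: 2, E: 1 };
const totalUsers = 10;

console.log(shareWithAtLeast(usesByUser, totalUsers, 1)); // 0.5
console.log(shareWithAtLeast(usesByUser, totalUsers, 2)); // 0.4
console.log(shareWithAtLeast(usesByUser, totalUsers, 10)); // 0.1
```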
The numbers from the second example give us more useful information about how often the new feature is being used, but it isn't clear how we can set this up. We would need a kind of event that shows the number of uses of The New Feature per unique user (not just per session), and I think the event values might be used to distinguish the users. Will it be possible to get the number of unique users who have triggered the event at least N times this way? Any other suggestion is welcome.
