Universal Analytics clientId vs userId - google-analytics

The docs describe the clientId as:
This anonymously identifies a particular user, device, or browser instance.
https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#cid
It can be used to send server side hits to analytics while still tying them to a particular user.
There is also a feature in closed beta called userId, which you will be able to pass once a user has authenticated: https://developers.google.com/analytics/devguides/collection/analyticsjs/user-id
userId is fairly self-explanatory. However, UA also allows you to pass your own clientid if you choose to. For developing CRM type tools, can one just associate the clientid with a user in the same way that you would with a userid? The goal is primarily to be able to track offline interactions and connect them with visitors in Analytics.

maembe,
clientID is a random number generated by Google Analytics, and keep in mind it's always required and its value should always be a random UUID (version 4) (you could technically use your own, but I am not sure how practical and reliable this would be). Most importantly, you can easily access it with predefined get function (see documentation).
For your needs, this is exactly what you should do -- if someone sings ups, store ClientID in your CRM and then if there is any offline purchase, record the transactions with measurement protocol using the stored clientID. Google Analytics will then make the link (attribution) with that visitor and you will see this in your reports. Also, take advantage of newly available custom metrics and dimensions which can store pretty much anything you want (think of customer segmentation etc.). Beware of storing PII though.
Hope this helps :)
I am curious how UserID is going to work, it might change everything, but for now, I wouldn't rely on it as there is very little information available.

This Analytics support page now states the differences between Client ID and User ID - https://support.google.com/analytics/answer/6205850?hl=en#clientid-userid
Essentially client IDs represent unauthenticated users, and are automatically randomly generated.
User IDs represent authenticated users, and must be set manually.
It's worth noting that user IDs cannot be things like an email address, or other data that would allow Google to identify the user
You will not upload any data that allows Google to personally identify an individual (such as certain names, Social Security Numbers, email addresses, or any similar data), or data that permanently identifies a particular device (such as a unique device identifier if such an identifier cannot be reset).
If you upload any data that allows Google to personally identify an individual, your Google Analytics account can be terminated, and you may lose your Google Analytics data.
Taken from: https://developers.google.com/analytics/devguides/collection/protocol/policy

I'd imagine User ID is designed to differentiate the behavior of an authenticated user. here

Related

Hierarchy of Google Analytics for Multiple Builds of Same App

I am setting up Google Analytics Accounts for a Product which have multiple builds as frontend for same user base.
So we have one Product called X and have:
Web Build
Mobile Web
Android App 1
iOS App 1
Android App 2
iOS App 2 6.
The main point is identical APIs and User base is used in all platforms and apps. So if we have a user John Doe he can login in any of the web or apps.
We want to extract following information from Google Analytics.
Under User ID feature want, sessions aggregations of that user around all build and apps, but identifiable. So I can know that user John login to web yesterday and used mobile app today.
Each user belong to a customer (company) in our system. So want to segregate all information based on companies.
I already have achieved point 2 by creating a custom dimension in Google Analytics and believe that's the best way to do it.
Now need suggestions from Gurus on how to acheive point 1 using Google Analytics.
Either use single account and single property for all builds and apps
If yes, then how to identify those apps and builds in sessions
If I use multiple properties/apps in GA account then how to aggregate user sessions among all?
Looking forward to hear how guys around hand or should have handled this scenario. Cheers!
This question is extremely broad, IMO any answer your going to get is going to be primarily opinion based. So here is my opinion and a little extra info to boot.
The first issue you are going to have is that there is a difference between Mobile google analytics accounts and web analytics accounts. The two do not mix. Mobile analytics accounts insert screen views with a screen name. While web accounts insert PageViews with a document location.
There is no way to analyze between two different Google analytics web properties. Unless you intend your android and ios apps to run as websites and send it like its a webpage its not going to work. You could potentially download the data into your own system or big query and analyze it there. Comparing your custom dimension to see what the users have done differently. I would wonder at the quality of the analysis you will get as there will be no real way for you to compare the data and match it up beyond using your custom dimensions user id and possibly date.
I am adding this because I am not sure what your saving in your custom dimension.
The second issue you are going to have is tracking. Google analytics TOS does not allow you to send any identifiable information to Google.
The Analytics terms of service, which all Analytics customers must adhere to, prohibits sending personally identifiable information (PII) to Analytics (such as names, social security numbers, email addresses, or any similar data), or data that permanently identifies a particular device (such as a mobile phone’s unique device identifier if such an identifier cannot be reset).
You could for example send your companies customer id for John as a user_id but user_id is an internal valuable used for internal processing this is not something you can extract out via the api.
The User ID enables the association of one or more sessions (and any
activity within those sessions) with a unique and persistent ID that
you send to Analytics.
To implement the User ID, you must be able to generate your own unique
IDs, consistently assign IDs to users, and include these IDs wherever
you send data to Analytics.
For example, you could send the unique IDs generated by your own
authentication system to Analytics as values for the User ID. Any
engagement, like link clicks and page or screen navigation, that
happen while a unique ID is assigned can be sent and connected in
Analytics via the User ID.
The best you could do would be to create a custom dimension and send that with every hit username=johnscustomerId. Which you appear to have already done. This is what I have done in the past and it works perfectly well.

Realtime Overview Analytics only showing one user at a time

When viewing the RealTime Overview section, I notice that when a new user comes online the previous user immediately disappears and is no longer shown in realTime. The next day analytics only shows history data for one visiting user, I know for a fact that this is incorrect, there should be data for multiple users.
I send up analytics data with a simple https request(shown below) . This works for all my other applications. The only difference is that I send up the uid for this app , could this be causing the issue that I am seeing ?
Views
https://www.google-analytics.com/collect
?v=1
&z=14807
&tid=<OUR-UA-ID>
&cid=2535285330542042
&dp=message_6
&dt=message_6
&cd=message_6
&an=freemium_3
&av=3
&uid=123456789
&t=screenview
Events
https://www.google-analytics.com/collect
?v=1
&z=52130
&tid=<OUR-UA-ID>
&cid=2535285331158735
&dp=authentication
&dt=authentication
&cd=authentication
&an=freemium_3
&av=3
&uid=123456789
&ec=authentication
&ea=get_user_info
&t=event
The "cid" in your http call is the client id, where client refers to the device or program that makes the request. It is usually stored in a cookie (on the web) or generated by an SDK (in an app) and is used to unify subsequent requests from the same device into sessions. Since it is set by the client it differs from device to device (and browser to browser), so it can not be used to identify a person across multiple devices.
After it became the rule that any given person might have two or more devices Google came up with the uid, the user id (which by their own TOS might not identify the user, so this is a bit of a misnomer; think "cross device tracking id" and the concept becomes clearer). The uid is set by serverside code, i.e. after the user logs in. Not only this allows to unify visits from multiple devices into distinct users, it also alleviates privacy concerns (since it is supposed to be only created after a users action; there are separate TOS which you have to accept if you create a user id view in the GA interface, and they stipulate that you have to secure the users agreement to use to user id feature).
So if you set the same user id in your code the sessions will be attributed to the same user, even when the cid differs; this is by design and is indeed the point of the uid.

Google Analytics - flagging PII/NPI (personally identifiable information & non-public information)

Can you set up alerts in Google Analytics to flag potential PII/NPI such as name, email address, billing address, billing details etc.? If so, how?
First I have do say I do not understand the downvote(s). For example I have seen applications with user logins where a full name was part of the page title - combined with time based dimensions that gave profile that say which user looked at what page at what time, and that would be clearly illegal. Even worse I have seen a case where security tokens were transmitted to GA that allowed access to secured resources. So clearly accidental transmission of PII to Google Analytics is a real thing.
Unfortunately there is not much you can do about it. You can either do a custom report with relevant dimensions and have it sent to you for a manual audit, or pull them via the API and have them programmatically examined via regular expressions that look for patterns like e-mail addresses etc. But by the time you can do that it is already to late, the data will already be permanently recorded in the GA property.
You have to stop this before the data is collected - if at all possible already in the website (via form validation etc), or use Google Tag Manager with custom javascript variables with validation rules, or filters in the analytics view (the latter being cumbersome and not very promising for this purpose).
The good news is that GA will not suddendly start to track PII on it's own. So you only need to check if your GA account tracks PII when you set up the account. Collect a few days data, validate that everything is okay, make changes as necessary and after all flaws are straightened out copy the view to start data collection from scratch and drop the old view if it contains PII.

Can I store user ids with google analytics?

I want to track how individual user navigate through a website. Using Google Analytics I can use custom variables so in theory I can use the user ids to track individual users. However, according to the terms of service of Google Analytics I can't use GA to track, collect or upload any data that personally identifies an individual (Section 7. privacy). Does a user id classifies as data that personally identifies an individual? I merely want to use it to link the data from Google Analytics with our own data.
Storing user IDs is probably a violation of section 7, but there are other ways around this problem.
Section 7 of the ToS states:
You will not (and will not allow any third party to) use the Service to track, collect or upload any data that personally identifies an individual (such as a name, email address or billing information), or other data which can be reasonably linked to such information by Google.
While sending Google information that is personally identifiable is simply not permitted, you can instead send an identifier which is known only to you. In fact Justin Cutroni (Analytics Evangelist at Google) has confirmed this in this article.
I've recently written up my current thinking on this with some a couple of world examples:
"Identifying your users in Google Analytics while complying with section 7 of the terms of service"
I'd say it does. It doesn't matter what you intent to do or not. The fact that you have the possibilty to map the id back to the person is enough.
Additionally: I'm not a lawyer but I'd say if you are tracking such data (with whatever technical means) it has to be in the terms of your website and signed by the user.
Personally identifiable information would be something like a name, phone number or email address. You'd be okay to store hashed information about this user in Google Analytics and then decrypt offline. E.g. You can store a hash of the user's name in Google Analytics and then decrypt offline.
You also may want to check you've the correct privacy policy in place and follow these rules: https://developers.google.com/analytics/devguides/collection/protocol/policy
You can use non obfuscated alphanumeric database identifiers that you might create for your visitors. Another acceptable option is to pass to Google Analytics an encrypted identifier that is based on PII that is not Protected Health Information (as defined under HIPAA), as long as you use the proper encryption level. Google has a minimum hashing requirement of SHA256 and strongly recommends the use of a salt, minimum 8 characters.
https://developers.google.com/analytics/solutions/crm-integration#user_id

Using Google Analytics to Track User Session

I have a PHP (5+) based website.
I want to track registered users of my site using Google Analytics. I basically just wish to grab their session id (or some way to identify each specific user)and view their movements and site usage from my Google Analytics dashboard.
Can this be done? If so, any guidance would be most welcome.
I believe the best way to do this is via a custom variable:
_setCustomVar(index, name, value, opt_scope)
It would be up to you to identify the 'value' as a unique session id for the user.
The scope in your case would be either session(2) or visitor(1) depending on how you want to track it.
http://code.google.com/intl/en/apis/analytics/docs/tracking/gaTrackingCustomVariables.html#setup
Technically you are able to store the username as a custom variable, but doing so violates the Google Analytics Terms of Service (ToS). Among other things, the (paraphrased) ToS states that you aren't allowed to store anything in Google Analytics that allows you to identify who the visitor is. This typically applies to usernames, IP addresses, phone numbers, etc.
If you decide to ignore the ToS and store the data anyway, it exposes YOU (not Google) to all kinds of data protection & PII laws. Especially in the EU.
If you have some budget money to work with, the best way to see the information you're looking for is to use an on premises web analytics software package that can process Google Analytics data. When you keep your data on premises, it alleviates many of the data privacy / protection / PII laws.

Resources