Background of my specific environment: I am trying to set up GA to track the user journeys around an iOS native app written in React Native. The app is in-house only and all users consent to being tracked in this way.
As I understand it, the measurement protocol allows either Client ID (cid) or User ID (uid). In the docs, both are optional, but at least one is required. Makes sense so far. Client ID is an anonymous UUIDv4 that tracks a browser or device, but not necessarily a user. User ID on the other hand is some uniquely-identifying value for a user, but may cross devices/browsers.
Initially I wrote my application using cid (as I hadn't yet read the docs on uid). Every time a user logged in the user's uuid would be stored and sent as the cid parameter. I also used session control sc. This worked - I saw sessions coming in, the User Explorer showed me the individual user uuids and sessions, etc. But upon rereading the docs, this seemed like an incorrect use of cid. It seems like cid is meant to be consistent and not something that changes as you log in and out.
As people on my app can log in and out (and possibly will - staff members may well share devices), using a long-lasting cid that is specific to the device won't allow me to track the habits of individual users and changing it per login-session, as mentioned above, feels wrong.
So I switched out cid for uid. The docs say both are optional and one is only required if the other is not used, so I figured I could just use uid and use the user's uuid in that field and it's otherwise act the same.
However, when doing it this way no hits seems to be registered at all - in "real time data" as well as the User Explorer.
Am I doing something wrong or is this by design? (If by design, are the docs wrong to say cid and uid are both optional or am I reading it incorrectly?)
Either way, best-practices-wise, do I:
Just go back to using an ephemeral cid that only changes when people log in and out. Individual sessions in the app don't change the cid but I do use sc to explicitly start and end sessions by navigation inference (click 'home' = sc=end, navigate from 'home', sc=start).
Use uid as I am trying to (i.e. switching out the current cid for uid), but also include a random cid that never changes (regardless of session lifetime, logging in and out, etc.).
Something else?
I have seen this question and its answer but it does not really answer my actual question regarding how cid and uid should actually be used, and if my issue is me or the way that GA works.
Oki so client-id is randomly generated value and must be sent with all hits.
In a non-User-ID-enabled view, Client-ID is used to calculate unique users. In a User-ID-enabled view, User-ID is used to calculate unique users
In reference to your app, you can send the unique Device id value in place of random value for client-id. This will resolve all your problems and ensure that the client id is sent with all the hits, otherwise it wont get registered in GA
Related
I am using Google Analytics 4 (GA4) on the client to track a whole bunch of different events. However, there are 2 scenarios that I can't cover client side:
A user completing check out on a payment page hosted by a third-party (Stripe in this case).
A refund that is made by the support team.
These events are handled by the server using webhooks. To me it seems like the most straightforward solution, would be to let the server send the event to GA4 (as opposed to the client sending it). I believe the Measurement Protocol should be used for this.
For each event submitted through the Measurement Protocol a client_id is required. When the client is submitting an event, this is an automatically generated ID which is used to track a particular device.
My question thus is, what should the client_id be when submitting an event server-side?
Should the same client_id perhaps be used for all events, as to recognize the server as one device? I have read some people proposing to use a randomly generated client_id for each event, but this would result in a new user to be recognized for every server-side event...
EDIT:
One of the answers proposes to use the client_id, which is part of the request as a cookie. However, for both examples given above, this cookie is not present as the request is made by a third-party webhook and not by the user.
I could of course store the client_id in the DB, but the refund in the second example is given by the support team. And thus conceptually it feels odd to associate that event with the user's client_id as the client_id is just a way to recognize the user's device? I.e. it is not the user's device which triggered the refund event here.
Another refund event example would be when user A makes a purchase with user B and user B refunds this purchase a week later. In this situation, should the client_id be the one of user A or of user B? Again, it feels odd to use a stored client_id here. Because, what if user A is logged in on two devices? Which client_id should be used here then?
Great question. Yes, your aim to use Measurement Protocol is a proper solution here.
Do not hardcode the client id. It's gonna be a hellish mess in reports. The nature of user-based reporting (which GA is) demands client ids to uniquely identify users. To your best ability.
GA stores the client id in a cookie. You should have convenient and immediate access to it on every client hit to BE. The cookie name is _ga. GA4 appends the measurement id to the cookie name. Here, google's docs on it: https://developers.google.com/analytics/devguides/collection/analyticsjs/cookie-usage But you can easily find it if you inspect "collect" hits and look at their payloads. There's another cookie named _gid that contains a different value. That would be a unique client id. Set it too if you can, but don't use it for the normal client id. It has a different purpose. Here how the cookie looks here, on stack:
And here it is in Network. You will need it for proper debugging. Mostly to make sure your FE client ids are the same as BE client ids:
Keep an eye on the cases when the cookie is not set. When a cookie is not set, that most frequently means the user is using an ad-blocker. Your analysts will still want to know that the transaction happened even if there's a lack of context about the user. You still can track them properly.
3.1 The laziest solution would be giving them an "AnonymousUser" client id and then append a random number to that so that it would
both indicate that a user is anonymous and still make it possible
for GA to separate them.
3.2 A better solution would be for you to make a fingerprint client id for such users, say, hashing a concatenated string of their
useragent+ip+locale+screen resolution, this is up to your analysts
to actually work on the definition of a unique user if the google
analytics library is unable to do it.
3.3 Finally, one of the best solutions for you would be generating a client id on your own, keeping GA's format and maybe adding an indicator there that it has been generated on your end just for easier debugging in the Future and setting it as a cookie, using it instead of _ga. Just use a different cookie name so that ad-blockers wouldn't know to block it.
If you want to indicate that a hit was sent through the server, that's a good idea. Use custom dimension for that. Just sync it with your analysts first. Maybe they wouldn't want that, or maybe they would want it in a different dimension.
Now, this is very trivial. There are ways to go much deeper and to improve the quality of data from here. Like gluing the order id, the transaction id, the user id to that, using them to generate client id, do some custom client tracking for the future. But I must say that it's better than what more than 90% of, say, shopify clients have.
Also, GA4 is not good enough for deeper production usage. Many things there are still very rudimentary and lacking. I would suggest concentrating on Universal Analytics and having GA4 as a backup for when Google makes GA4 actually good enough to replace UA. That is, unless you're downloading your data elsewhere and not using GA's interface for analysis.
It seems that this page (Relevant portion in the screenshot below), advices to either send the data along with the client_id or user_id. However fails to address the fact client_id is a mandatory field as stated here.
I believe it is probably safe to assume that randomly generating this field should work. At least it seems to on my end however be warned that I am unsure if this has any impact on attribution.
* In the above image, Device ID refers to client_id
I am searching for a way to track user behavior on my website. I want to know if it is possible to get a table with data looking something like this:
+------+---------------+-----------------+------+---------+
| time | ip or user_id | user_session_id | link | actions |
+------+---------------+-----------------+------+---------+
(Link - where user came from)
I want to track different user actions by sessions. Is this possible using Google Analytics or I should search other tools? My site is currently set up to track events but on my Analytics account I get only the number of events that occurred. I want to track what a specific user does on my site.
tl;dr: if you must do this use Mixpanel or similar software.
Time based dimensions are already available (date, hour, minutes and datetime). "link" would be referrer. Actions in Google Analytics are basically pageviews, events and transactions, so you have that, too.
IP and user id are a big no-gos. Storing anything that that identifies a person is a violation of Googles Terms of Service and depending on your location might be a violation of national laws.And if by user_id you mean the Google Analytics feature of the same name, Google says you may set it for logged in users and have to unset it for user that log out, so by extension that means storing it in Ga would probably be a violation of their TOS.
The GA session id is not exposed via the interface. You may read it from the cookie and store it in a custom dimension (I'm not sure if this is allowed within the TOS, on the other hand GA premium customers get this via a BigQuery export in any case, so it should be allowed).
If you simply want to tell different users apart you might simply generate a string in the UUID format and store that in a custom dimension. If you want to actually identify users (by name, adress etc), well, you are not allowed to and Google will terminate your account if they find out.
Not to mention that it completely eludes why so many people want to track individual users. You must not use GA information to target individuals, and simply looking at individual user paths will not help you (I wrote an article about that, although I do not expect that this will convince you).
Google Analytics is for technical and legal reasons not a good tool for tracking individual users, if you need to do this use a software that is made for this purpose. Mixpanel is often mentioned in that context but I'm sure there are many other solutions.
Google Analytics now has UserID tracking to better track individual users across devices. Is it possible to add a tracking variable (like utm_campaign) specifying user_ID so that GA will associate those links with the user?
I'm looking for a solution that'll work even when the user isn't logged in (mainly for email link tracking).
No. Of course you can append the user id yourself and set it in your tracking code. But that would not uniquely identify a user (links get bookmarked or passed around e.g. when the email is forwareded to somebody) and someone else might end up with that id. The idea of the user id is that the CMS or CRM takes responsibilty for uniquely identifying users so we don't have to rely on the somewhat fragile client side mechanisms.
If you use it for email tracking it should IMO be enough if the user id is set in the first call of the tracking code if you have session unification enabled, so you'd just need to tag the link in the mail and read the id from there, without needing to persists the user id via the links. While I haven't tested this extensively the documentation seems clear on the point:
Session unification is a User ID setting that allows hits collected
before the User ID is assigned to be associated with the ID, so long
as those hits happen within the same session in which a specific ID
value is assigned for the first time.
Please note that the user id feature does not expose data for individual users via the interface, insofar the idea that it "better track[s] individual users" is not quite right (it will recognize users across devices but will still aggregate the data). If you want individual user you need to store the user id (or some other unique id) as custom dimension.
I have a website that allows people to create an account (that is the conversion I wish to track).
I wish to know where a specific person is coming from. I have google analytics installed and have set up the registration page as a goal, but the reporting tells me traffic sources as an aggregated pie chart. It doesn't report down to the user account level to say that 'person with email xyz' came from 'facebook' for example.
What custom variables or mark up would I need to add to GA to report at that detailed level, if that is at all possible?
Otherwise, I will just have to record the first http_referer in a cookie and stick it in a database during the registration process.
Any advice?
Firstly I must ask you, how actionable do you think it is to look at data at that granular of a level? Finding out what % of people who registered came from facebook or some other place is actionable, because it helps you do things like determine where to focus marketing efforts. But individual users? How is this actionable to you? (hint: it's not)
However, if you are still determined to know this, you should first note that it is against Google's ToS to record personally identifiable data both directly (recording the actual value in GA) or indirectly (e.g. - recording a unique id that you can use to tie to personal info stored within your own system). If this is something you don't want to risk, I suggest moving to another analytics tool that does not have this sort of thing in their ToS (e.g. Adobe SiteCatalyst, which costs money, or perhaps you may instead prefer to choose an "in-house" approach, like Piwik)
If you are still determined to follow through with this and hope not to get caught or whatever, Google Analytics doesn't record data like what info a visitor filled out in a form (like their email address) unless you populate that data in a custom field/dimension/metric/event to be sent along with the request. Usually you would populate this on the form "thank you" page (which is usually the same page you use as your goal url or goal event if you're popping and using an event for your goal). So you would populate the email address in one of those custom variables and then have it as a dimension to break down the http referrer by.
What is the simplest way to count # of visits by a user in an ASP.NET web app?
Our app services anonymous users, registered users and an intermediate user, called a "prospect". Prospects are users that request information but do not create an account.
We leave an ID cookie for every type of user, and that's the key into our database for visit information.
Prospects never "sign in" per se, but we still want to count those visits. We also want to count member visits, even when they don't sign in.
I am thinking of storing the ASP.NET Session cookie and then incrementing our counter every time the session cookie changes.
Anyone out there already solve this, or have any suggestions?
PS: We are ASP.NET 1.1
Refinement: We want this data in our app's database, so Google Analytics is not a reasonable solutions for this...and we are using Google Analytics.
Use Google Analytics. You can specify "goals" and "funnels" that lead to these goals quite easily.
Since you're using 1.1... in the Global.aspx... in the App_EndRequest (or whatever), insert a record into a DB with Name, IPAddress, timestamp, etc.
EDIT: Don't do the session thingy, sessions can be cleared, plus how will you report on them days, or weeks later... insert a record (including the Request.Url.Path if you'd like to have those kinds of stats).
Because it's on the EndRequest method, it's pretty safe even if there is some sort of glitch... also, performance won't matter as the user has gotten his page sent to him already.
We leave an ID cookie for every type of user, and that's the key into our database for visit information
Sounds to me like you already have a counter, you just need to figure out a way to make this data useful.
SELECT COUNT(1) FROM TblUsers WHERE UserType = 'Prospect' AND DateRange Between....