How to remove PII from URL (GA4 w/ GTM) - google-tag-manager

PII (Personally Identifiable Information) should never be sent to Google Analytics: not only does it breach the GA Terms of Use, but you are also leaking sensitive user data. So how do you remove PII from a URL, such as query string params (email, userId, ...) or even the location path, when using Google Tag Manager (GTM) and Google Analytics 4 (GA4)?

Let's assume you already have a GA4 property set up and GTM installed on your page.
So let's create a new tag for the GA4 configuration. As the Measurement ID I use a lookup table variable (it's perfect when you've got multiple environments like testing, staging, and production: each has a separate Measurement ID, but they all use the same GTM install script), but you can simply enter your G-XXXXXXXXX Measurement ID here. Then expand the Fields to Set section, add page_location as the Field Name, and click the lego button next to Value.
Click + (the plus button) in the upper right corner to add a new variable.
As the Variable Type choose Custom JavaScript. In the upper left corner enter a name for your new variable; I used Redacted Page Location.
And now we are getting closer to how to remove PII. In the Custom JavaScript section insert a JS function that returns the redacted URL. My function uses regular expressions to replace PII in the URL with redacted placeholder text. The parameters I wanted to redact are the company, project, epic, and task IDs from the URL path, and userId from the query params.
function() {
  // Start from the full URL of the current page.
  var url = window.location.toString();
  // Each rule pairs a pattern with the text that replaces every match.
  var filter = [
    {
      rx: /company\/\d+/g,
      replacement: 'company/REDACTED_COMPANY_ID'
    },
    {
      rx: /projects\/\d+/g,
      replacement: 'projects/REDACTED_PROJECT_ID'
    },
    {
      rx: /epics\/\d+/g,
      replacement: 'epics/REDACTED_EPIC_ID'
    },
    {
      rx: /tasks\/\d+/g,
      replacement: 'tasks/REDACTED_TASK_ID'
    },
    {
      rx: /userId=\d+/g,
      replacement: 'userId=REDACTED_USER_ID'
    }
  ];
  // Apply the rules one after another.
  filter.forEach(function(item) {
    url = url.replace(item.rx, item.replacement);
  });
  return url;
}
Let's say the URL of my page is https://www.example.com/company/2247/projects/2114/epics/19258/tasks/19259?userId=1234567. This function redacts it to https://www.example.com/company/REDACTED_COMPANY_ID/projects/REDACTED_PROJECT_ID/epics/REDACTED_EPIC_ID/tasks/REDACTED_TASK_ID?userId=REDACTED_USER_ID.
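The rules above only match numeric IDs. For string-valued parameters such as email, one possible approach is to redact by parameter name instead of by value pattern. Here is a minimal sketch of that idea; redactQueryParams is a hypothetical helper name (it is not part of GTM or GA4), and inside a GTM Custom JavaScript variable the logic would have to live inside the anonymous function.

```javascript
// Hypothetical helper (not a GTM or GA4 API): replaces the value of every
// listed query parameter with a fixed placeholder and keeps everything else.
function redactQueryParams(url, paramNames) {
  // Set the fragment aside so it is not mistaken for part of the query.
  var hashIndex = url.indexOf('#');
  var hash = hashIndex === -1 ? '' : url.slice(hashIndex);
  var withoutHash = hashIndex === -1 ? url : url.slice(0, hashIndex);

  var queryIndex = withoutHash.indexOf('?');
  if (queryIndex === -1) {
    return url; // no query string, nothing to redact
  }

  var base = withoutHash.slice(0, queryIndex);
  var pairs = withoutHash.slice(queryIndex + 1).split('&');
  var redacted = pairs.map(function (pair) {
    var name = pair.split('=')[0];
    return paramNames.indexOf(name) !== -1 ? name + '=REDACTED' : pair;
  });
  return base + '?' + redacted.join('&') + hash;
}
```

Called as redactQueryParams(window.location.toString(), ['email', 'userId']), this leaves the path untouched, so it would complement the path rules rather than replace them.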
Select the newly added custom variable (its name should appear in the Value field) and save your GA4 tag.
Now let's test it. Switch to Preview mode and open your website. In GA, head to the Debug View of your GA4 property, wait for page_view to pop up in the timeline (you may have to reload your page), click on it, and expand the page_location parameter. You should see your redacted URL.
That's all, enjoy!

Related

How to track multiple websites moving to one domain with Google Analytics

I am managing multiple websites that will soon move to one domain, with each market contained in its own sub-directory, e.g. www.example.com/uk/.
Currently each market has its own GA property. I was wondering what the implications would be of just leaving the current setup as is?
I imagine GA alerts will fire, implying that GA tracking is 'missing' across the website. Or would it be recommended to set the cookiePath field for each respective market in the analytics.js create command?
The requirement is that each market keeps its own GA property, which gives them more flexibility.
Make translation table
function getPropertyId(){
  var propertyIDs = {
    'uk' : 'UA-24574-1',
    'de' : 'UA-32656-4',
    'fi' : 'UA-54544-6'
  };
  var fallBackId = "UA-Falback";
  var path = window.document.location.pathname.split("/");
  if(path[1]){
    var propID = propertyIDs[path[1]] ? propertyIDs[path[1]] : fallBackId;
    return propID;
  }
  else {
    return fallBackId;
  }
}
Use it when setting Property ID
ga('create', getPropertyId() , 'auto');
Request URI
You can expect trouble with the Request URI variable, because:
From comment:
@GKyle Imagine your current URL is mycompany.uk/page.html and the new URL will be mycompany.com/uk/page.html. In the old setup the Request URI is /page.html, in the new one it is /uk/page.html. There will be an inconsistency if you do nothing, but you can avoid it if you set up a filter removing /uk, etc.
Wonderful regex: ^(/(uk|de|au|en)\b/?)(.*)
From here: RegExp - remove /en or /de from pathname string and return rest
Rewrite string is /$A3
Create Advanced Filter
And please, TEST IT FIRST!
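One way to act on that advice is to try the pattern on a few sample paths before creating the filter. The following is only a rough sketch: it assumes the GA filter engine treats this pattern the same way JavaScript's RegExp does (worth verifying in GA itself), it uses $3 as a stand-in for the filter's $A3, and stripMarketPrefix is a hypothetical name.

```javascript
// The pattern from above, written as a JavaScript regular expression literal.
var MARKET_PREFIX = /^(\/(uk|de|au|en)\b\/?)(.*)/;

// Hypothetical helper: mimics the rewrite "/$A3" by keeping only the third
// capture group (everything after the market prefix).
function stripMarketPrefix(path) {
  return path.replace(MARKET_PREFIX, '/$3');
}

// Sample paths to eyeball: a market page, a bare market root, a look-alike
// prefix, and a path without any market prefix.
['/uk/page.html', '/de', '/ukraine/page.html', '/page.html'].forEach(function (p) {
  console.log(p + ' -> ' + stripMarketPrefix(p));
});
```

The \b in the pattern is what keeps a path such as /ukraine/page.html from being treated as the uk market.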
Result
You can smoothly move tracking from multiple domains to one main domain as long as you keep setting the Property ID per market.
Keep in mind possible changes in certain reports, especially path-based reports.

How to tie together front and back end events in google analytics?

I am tracking user events on the front end with google analytics, but I would also like to send back end events and be able to match up events for the same user in google analytics.
It looks like I should be able to pass the uid parameter: https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#uid but it looks like I also have to pass the tid parameter https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#tid .
The docs say that "All collected data is associated by this ID" (the tid).
What should I pass for the tid? Why can't I just pass the uid, if that is supposed to be a mechanism for tying events together?
I would like the backend to pass the uid to the front end (actually a one-way hash of the email), and then refer to the user in google analytics with this uid.
Is this feasible? I'm a bit confused about how to implement this.
Many thanks!
The "tid" - Tracking ID - is the Web Property, i.e. the "slot" in your Analytics account that the data goes to. If you do not send a tracking ID, the calls will disappear into limbo. You find the tid in your property settings under "Tracking Code". It is a string that starts with "UA-" (and so is sometimes also referred to as the UA-ID).
The User ID will not help you to identify users, at least not by default, since it is not exposed in the Analytics interface (it should really be called the "cross device identification id", since that is what it's for). You need to create a custom dimension and pass the value of the User ID there if you want to identify users. Per the TOS you must take care that no third party, including Google, can resolve your User ID (or any other datapoint) into something that identifies a person, although of course you can use it yourself to connect data to other data in your backend system.
Actually there is a proper way. I've implemented this for myself.
There's a Client ID parameter, that should be passed with your requests.
Here you have two options:
1. Create the client ID manually (by generating a UUID) on the server side and pass it to the front end. Then use this value when you create your tracker, and also use it for server-side requests.
// Create a tracker with a manually generated client ID
ga('create', 'UA-XXXXX-Y', {
  'storage': 'none',
  'clientId': '76c24efd-ec42-492a-92df-c62cfd4540a3'
});
Of course, you'll need to implement some logic for storing the client ID, in a cookie for example.
2. Use the client ID that ga generates automatically and send it to the server side by your method of choice. I've implemented it through cookies:
// Creates a default tracker.
ga('create', 'UA-XXXXX-Y', 'auto');
// Gets the client ID of the default tracker and stores it in a cookie.
ga(function(tracker) {
  var clientId = tracker.get('clientId');
  // Set the cookie with the jQuery cookie plugin
  $.cookie("client-id", clientId, { path : "/" });
});
Then on the back end just read this cookie and use that client ID for your requests.
Some more information can be found here: What is the client ID when sending tracking data to google analytics via the measurement protocol?
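To make the back-end half a bit more concrete, here is a hedged sketch of reading that cookie from a request's Cookie header and assembling a Measurement Protocol (v1) event payload with the same client ID. It assumes the cookie name client-id from the snippet above and a runtime with URLSearchParams (for example Node.js); readCookie and buildEventPayload are hypothetical helper names, and actually sending the payload is left out.

```javascript
// Hypothetical helper: finds one cookie value in a raw Cookie header string.
function readCookie(cookieHeader, name) {
  var parts = (cookieHeader || '').split(';');
  for (var i = 0; i < parts.length; i++) {
    var pair = parts[i].trim();
    var eq = pair.indexOf('=');
    if (eq !== -1 && pair.slice(0, eq) === name) {
      return decodeURIComponent(pair.slice(eq + 1));
    }
  }
  return null; // cookie not present
}

// Hypothetical helper: builds the form-encoded body for an event hit.
// v, tid, cid, t, ec and ea are Measurement Protocol v1 parameter names.
function buildEventPayload(trackingId, clientId, category, action) {
  var params = new URLSearchParams();
  params.append('v', '1');          // protocol version
  params.append('tid', trackingId); // tracking ID (UA-...)
  params.append('cid', clientId);   // same client ID the front end uses
  params.append('t', 'event');      // hit type
  params.append('ec', category);    // event category
  params.append('ea', action);      // event action
  return params.toString();
}
```

Because the cid matches the one used by the tracker in the browser, hits sent with this payload should be attributed to the same client as the front-end hits.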

Tracking user id using with Google Tag Manager

I am fairly new to Google Analytics and Tag Manager world. I am trying to add user id to ga events. For example I would like to add user id in case a google analytic event occurred.
What is the best practice to store userId at client side?
In Cookie?
In pages with a metatag?
Or something else?
You have to pass it as a dataLayer variable and add a macro to pass it to GA, by setting additional fields in the GA UA code: https://support.google.com/tagmanager/answer/4565987
So to answer your question, you should store it in the dataLayer (so basically in JavaScript) and pass it on to GTM.
If you know how to use GTM properly, it actually doesn't matter whether you store this value directly in the dataLayer, in a cookie, or in a meta tag, since you can extract it from any of those locations with JS, but the simplest one is the dataLayer.
I would store it in the dataLayer.
dataLayer = [{
'userId': 'abcdefg123456'
}]
Then if you want to reference it in an event, just pass {{userId}} as a custom dimension.
how complicated they made all this...
NOTE: This is a short how-to for Google Tag Manager + Universal Analytics + User ID. I am using the dataLayer to fire custom events and to unify sessions of our registered users.
1) Create an empty dataLayer array before the GTM code on every page (in Rails you can set it in application.html/haml):
<script>
  dataLayer = [];
</script>
2) Then push your user ID to the array:
<script>
  dataLayer.push({
    "&uid": "#{YOUR-BACKEND-USER-ID}"
  });
</script>
3) Now, add your GTM code:
(function (w, d, s, l, i) {
  w[l] = w[l] || [];
  w[l].push({
    'gtm.start': new Date().getTime(),
    event: 'gtm.js'
  });
  var f = d.getElementsByTagName(s)[0],
    j = d.createElement(s),
    dl = l != 'dataLayer' ? '&l=' + l : '';
  j.async = true;
  j.src =
    '//www.googletagmanager.com/gtm.js?id=' + i + dl;
  f.parentNode.insertBefore(j, f);
})(window, document, 'script', 'dataLayer', 'GTM-XXXXX');
4) Log into your Tag Manager account and create a new Macro "uid" of the Data Layer Variable type. Set the Data Layer Variable Name to &uid
5) Still in Tag Manager, edit your tag(s) (in the sidebar). Open More settings > Fields to Set and add Field Name: &uid | Value: {{uid}}
6) Now enable the User-ID view in your Google Analytics account. Go to Admin > Account > Property > Tracking Info > User-ID, enable it, and create the view
That would depend on how the user ID you want to use is generated. If you create it in Tag Manager via JS, then a cookie (to propagate the ID between pages) would be the easiest way: GTM can read a first-party cookie (otherwise you could use HTML5 localStorage, but you would need to implement that yourself).
If you generate the ID server-side, you should add it (as per Blexy's answer) to a dataLayer for GTM:
dataLayer = [{
'userID': '12345',
}];
Alternatively push it to an existing dataLayer variable:
dataLayer.push({'userID': '123456'});
In Tag Manager you then create a macro of the data layer type that reads from the key userID. A server-generated userID that spans multiple sessions is of course only possible for authenticated users.
The bottom line is that Tag Manager has no persistent storage of its own, so if you create the userID yourself you need cookies (which might be deleted, so this is not reliable across sessions).
You can pass the user ID in a data layer variable or in a cookie; the choice is yours. GTM is able to read from cookies as well as from the data layer.
If you use a cookie, you have to configure a 1st party cookie macro in GTM. If you use the data layer, you have to configure a data layer variable macro in GTM.
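To illustrate what "reading from the key userID" amounts to, here is a simplified sketch that looks up the most recently pushed value for a key in a dataLayer array. This is only an approximation for illustration: GTM maintains its own internal data layer model, and latestDataLayerValue is a hypothetical helper, not a GTM function.

```javascript
// Hypothetical helper: walks the dataLayer array from the newest push to the
// oldest and returns the first value found for the given key.
function latestDataLayerValue(dataLayer, key) {
  for (var i = dataLayer.length - 1; i >= 0; i--) {
    var entry = dataLayer[i];
    if (entry && Object.prototype.hasOwnProperty.call(entry, key)) {
      return entry[key];
    }
  }
  return undefined; // key was never pushed
}

// Example data mirroring the snippets above: an initial value, an unrelated
// push, and a later push that overrides the user ID.
var exampleLayer = [{ 'userID': '12345' }];
exampleLayer.push({ 'event': 'login' });
exampleLayer.push({ 'userID': '123456' });
console.log(latestDataLayerValue(exampleLayer, 'userID'));
```

The point of the sketch is that a later push for the same key wins, which is why pushing the user ID after login is enough for subsequent tags to pick it up.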

Tracking by id in url with Google Analytics

I want to track views for stories on my site, and I want to use Google Analytics to do this. Off the bat I'm thinking of doing this:
pageTracker._trackEvent('Story', 'View', 'Title of story');
But I also would prefer to track with the story id as well that is passed in the url. So if I want to run a report in GA I would like to have the option of getting stats by story title or by story id that is passed via url. Is that possible?
GA by default does not strip parameters from the URL when _trackPageview is triggered, so you will see unique pages show up in your reports. For example, these two will show up as separate entries:
/somePage.html?id=1
/somePage.html?id=2
edit:
Okay, you can use this to get whatever url parameter you want:
function getParam (n) {
  // Build a pattern that captures the value of ?n=... or &n=...
  var x=new RegExp("[\\?&]"+n.replace(/[\[]/,"\\\[").replace(/[\]]/,"\\\]")+"=([^&#]*)");
  var r=x.exec(window.location.href);
  return(r==null)?'':r[1] ;
}
// example
var story = getParam('story');
Now story holds whatever value the story=xxx URL parameter has, and you can use story as the category value in your event tracking call.
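In environments that provide the standard URL API (modern browsers, Node.js), the same lookup can be sketched without a hand-built regular expression. This is an alternative under that assumption, not the answer's original method; getParamFromUrl is a hypothetical name, and unlike getParam it returns the decoded value.

```javascript
// Hypothetical helper: returns the decoded value of a query parameter,
// or an empty string when the parameter is absent (mirroring getParam).
function getParamFromUrl(href, name) {
  var value = new URL(href).searchParams.get(name);
  return value === null ? '' : value;
}

// In a browser this would typically be called with window.location.href.
var exampleStory = getParamFromUrl('https://www.example.com/somePage.html?id=2&story=my%20story#top', 'story');
console.log(exampleStory);
```

Taking the URL as an argument instead of reading window.location directly makes the function easy to try out on sample URLs.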

Tracking email opens in Google Analytics

We have tracking in our emails to track clicks back to our site through Google Analytics. But is there a way to track opens? I would imagine I have to add a google tracking image to the email somewhere. Possibly javascript too?
As others have pointed out, you can't use Javascript in email. The actual tracking is done by a request for __utm.gif though and the Javascript just constructs the GET parameters.
Google supports non-Javascript uses of Google Analytics per their Mobile web docs:
http://code.google.com/mobile/analytics/docs/web/
They document the full list of parameters, but the only necessary parameters are:
Parameter   Description
utmac       Google Analytics account ID
utmn        Random ID to prevent the browser from caching the returned image
utmp        Relative path of the page to be tracked
utmr        Complete referral URL
The reference that describes all of the parameters that the Google Analytics tracking GIF allows is here. Use it to build an <img> tag in your email that references the GA GIF.
According to this post, the minimum required fields are:
utmwv=4.3&
utmn=<random#>&
utmhn=<hostname>&
utmhid=<random#>&
utmr=-&
utmp=<URL>&
utmac=UA-XXXX-1&
utmcc=_utma%3D<utma cookie>%3B%2B_utmz%3D<utmz cookie>%3B
It sounds like you are using campaign tracking for GA but also want to know how many opens there were. This is possible to do with Google Analytics, since they track pageviews or events by use of pixel tracking as all (I think?) email tracking does. You cannot use javascript, however, since that will not execute in an email.
Using Google Analytics pixel tracking:
The easiest way would be to use browser developer tools such as Firebug for Firefox or Opera's Dragonfly to capture a utm.gif request and copy the URL. Modify the headers to suit your needs. You can count it either as an event or pageview. If you count it as an event it should look something like this:
http://www.google-analytics.com/__utm.gif?utmwv=4.8.6&utmn=1214284135&utmhn=www.yoursite.com&utmt=event&utme=email_open&utmcs=utf-8&utmul=en&utmje=1&utmfl=10.1%20r102&utmdt=email_title&utmhid={10-digit time code}&utmr=0&utmp=email_name&utmac=UA-{your account}
You can use this to understand what describes what in the headers.
I better post this to save everyone the trouble of trying to construct that monstrous UTM gif URL.
You can now use the new Measurement Protocol API to send a POST request and easily record events, page views, hits, or almost any other type of measurement. It's super easy!
POST /collect HTTP/1.1
Host: www.google-analytics.com
payload_data
For example, here's a code snippet to send an event in C# (using SSL endpoint):
public void SendEvent(string eventCategory = null, string eventAction = null, string eventLabel = null, int? eventValue = null)
{
    using(var httpClient = new HttpClient() {BaseAddress = new Uri("https://ssl.google-analytics.com/")}) {
        var payload = new Dictionary<string, string>();
        // Required Data
        payload.Add("v", "1"); // Version
        payload.Add("tid", "UA-XXX"); // UA account
        payload.Add("aip", "1"); // Anonymize IP
        payload.Add("cid", Guid.NewGuid().ToString()); // ClientID
        payload.Add("t", "event"); // Hit Type
        // Optional Data
        payload.Add("ni", "1"); // Non-interactive hit
        // Event Data
        if (eventCategory != null)
        {
            payload.Add("ec", eventCategory);
        }
        if (eventAction != null)
        {
            payload.Add("ea", eventAction);
        }
        if (eventLabel != null)
        {
            payload.Add("el", eventLabel);
        }
        if (eventValue != null)
        {
            payload.Add("ev", eventValue.Value.ToString(CultureInfo.InvariantCulture));
        }
        using (var postData = new FormUrlEncodedContent(payload))
        {
            var response = httpClient.PostAsync("collect?z=" + DateTime.Now.Ticks, postData).Result;
            if (!response.IsSuccessStatusCode)
            {
                throw new Exception("Could not send event data to GA");
            }
        }
    }
}
Way easier than the hack with the __utm gif.
Helpful Example
You can easily add this to emails by doing this:
In an email:
<img src="{url}/newsletter/track.gif?newsletterName=X" />
In your MVC site, for example, NewsletterController:
public ActionResult Track(string newsletterName) {
    using(var ga = new AnalyticsFacade()) {
        ga.TrackEmailOpen(newsletterName);
    }
    // Return the pixel file itself; Content() would send the path as text.
    return File("~/images/pixel.gif", "image/gif");
}
In your Global.asax or RouteConfig:
routes.MapRoute(
    "newsletteropen",
    "newsletter/track.gif",
    new
    {
        controller = "Newsletter",
        action = "Track"
    });
BOOM, done, son. You can now track email opens using a much nicer API that's supported and documented.
Is your requirement to track how many times an e-mail is opened by a given user? We had a similar problem. We are using an SMTP relay server and wanted to track how many times our marketing e-mails are opened, in addition to Google Analytics, which registers an event only when someone clicks a link to our site inside the e-mail.
This is our solution. It is based on making a REST call through an image element in the HTML (our e-mails are HTML based)
where TRACKING is a dynamically generated URL that points to our REST service and carries tracking information about the person to whom the e-mail was sent. It is something like this:
//def trackingURL = URLEncoder.encode("eventName=emailTracking&entityType=employee&entityRef=" + email.empGuid, "UTF-8");
trackingURL = baseUrl + "/tracking/create?" + trackingURL;
It will be something like "https://fiction.com:8080/marketplace/tracking/Create?eventName=email&entityType=Person&entityRef=56"
When the actual e-mail HTML is generated, TRACKING will be replaced by
The important point is to return a response of type image, with a one-pixel transparent image as the body of the REST response.
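As a rough illustration of that last point, the sketch below assembles the parts of such a response: a 1x1 transparent GIF decoded from a base64 string, plus the image content type. It assumes a Node.js runtime (for Buffer); trackingPixelResponse is a hypothetical helper, the no-store cache header is an assumption aimed at getting the image requested on every open, and wiring it into a real REST framework is left out.

```javascript
// A 1x1 transparent GIF, stored as base64 and decoded once at startup.
// Assumption: Node.js, where the global Buffer class is available.
var TRANSPARENT_GIF = Buffer.from(
  'R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7',
  'base64'
);

// Hypothetical helper: returns status, headers and body for the tracking
// endpoint. The caller records the open before sending this response.
function trackingPixelResponse() {
  return {
    status: 200,
    headers: {
      'Content-Type': 'image/gif',
      'Content-Length': String(TRANSPARENT_GIF.length),
      // Assumption: discourage caching so repeated opens hit the endpoint.
      'Cache-Control': 'no-store'
    },
    body: TRANSPARENT_GIF
  };
}
```

Keep in mind that many mail clients block or proxy remote images, so counts gathered this way are approximate.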
So I'll assume that the email contains a link to your site. GA can certainly record how often that link is clicked, because clicking the link opens the page, which in turn causes the function _trackPageview() to be called, and GA records that as a pageview.
So as long as that page has the standard GA page tag, no special configuration is required, either in the GA code in your web page markup or in the GA Browser. The only additional work you have to do is to make those page views distinguishable from page views by visitors from other sources.
To do that, you just need to tag this link. Unless you have your own system in place and it's working for you, I recommend using Google URL Builder to do this for you. Google URL Builder is just a web form in which you enter descriptive terms for your marketing campaign: Campaign Source, Campaign Medium, Campaign Content, Campaign Name. Once you've entered values for each of these terms, as well as your site's URL, Google will instantly generate a 'tagged link' for you (by concatenating the values to your site's URL).
This URL generated by Google URL Builder is the link that would be placed in the text of your marketing email.
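If you prefer to generate tagged links in code, the concatenation can be sketched as below. It assumes the standard campaign query parameters utm_source, utm_medium, utm_campaign and utm_content, and a runtime with the URL API; buildCampaignUrl and the sample values are hypothetical.

```javascript
// Hypothetical helper: appends campaign parameters to a site URL.
// Only the fields that are actually provided are added.
function buildCampaignUrl(siteUrl, campaign) {
  var url = new URL(siteUrl);
  var fields = {
    source: 'utm_source',   // Campaign Source
    medium: 'utm_medium',   // Campaign Medium
    name: 'utm_campaign',   // Campaign Name
    content: 'utm_content'  // Campaign Content
  };
  Object.keys(fields).forEach(function (key) {
    if (campaign[key]) {
      url.searchParams.set(fields[key], campaign[key]);
    }
  });
  return url.toString();
}

// Example with made-up campaign values.
var taggedLink = buildCampaignUrl('https://www.example.com/landing', {
  source: 'newsletter',
  medium: 'email',
  name: 'spring_sale'
});
console.log(taggedLink);
```

The resulting link is what would go into the email body in place of the plain site URL.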

Resources