Fix Self referral issue in GA4 property - google-analytics

Our GA4 property in Google Analytics is showing our own website as a referral source. Normally, in a GA3 (Universal Analytics) property, there is a referral exclusion setting where you can exclude sites such as payment portals. In the new property this feature is not yet available.
We tried using the following script to work around the problem:
function() {
  var ref = {{Referrer}};
  // don't bother if there is no referrer
  if (!ref) return ref;
  var newref;
  // place your external referrers here (domain names)
  // adding 'foo.bar.com' matches 'www.foo.bar.com' too
  var domains = [
    // banks
    'rabobank.nl', 'ing.nl', 'abnamro.nl', 'regiobank.nl', 'snsbank.nl',
    'asnbank.nl', 'triodos.nl', 'vanlanschot.nl', 'knab.nl', 'bunq.com',
    'frieslandbank.nl', 'snsreaal.nl', 'secure-ing.com',
    // payment providers, cards, foreign banks
    'mollie.nl', 'mollie.com', 'paypal.com', 'paypal.nl', 'adyen.com',
    'multisafepay.com', 'visa.com', 'wlp-acs.com', 'belfius.be', 'payin3.nl',
    'icscards.nl', 'arcot.com', 'securesuite.co.uk', 'hsbc.com.hk',
    'cm-cic.com', 'pay.nl', 'redsys.es', 'tatrabanka.sk'
  ];
  domains.forEach(function(x) {
    // loop through domains; backslashes are doubled inside the string and
    // the dots in each domain are escaped so they match literally
    var pattern = new RegExp('^https?://([^.]+\\.)?' + x.replace(/\./g, '\\.') + '/');
    if (ref.match(pattern)) newref = x;
  });
  // return the original referrer, or the rewritten one
  return newref ?
    'https://' + {{Page Hostname}} + '/excluded-referrer/' + newref
    : ref;
}
The script does not work though. Could you give me any new recommendation on how to solve this issue or tell me if the code might be wrong?
Thanks

Update: Google is gradually rolling out a referral exclusion feature in GA4, which is documented here: https://support.google.com/analytics/answer/10327750?ck_subscriber_id=750579406
Thankfully, this should help with the problem.
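For anyone who still wants to test the domain matching from the script outside GTM, here is a minimal sketch as a standalone function. The names `escapeRegExp` and `matchExcludedDomain` are my own, and unlike the original script it accepts any subdomain depth and also matches a referrer with a port or with no trailing slash; treat those choices as assumptions, not as what GA4 or GTM does internally.

```javascript
// Hypothetical helper: escapes characters that have a special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: returns the first configured domain the referrer
// belongs to, or null when nothing matches.
function matchExcludedDomain(referrer, domains) {
  if (!referrer) return null;
  for (var i = 0; i < domains.length; i++) {
    // Optional subdomains, then the exact domain, then "/", ":" or end of string.
    var pattern = new RegExp(
      '^https?://([^/]+\\.)?' + escapeRegExp(domains[i]) + '(?:[:/]|$)',
      'i'
    );
    if (pattern.test(referrer)) return domains[i];
  }
  return null;
}
```

Inside a GTM Custom JavaScript Variable you would call it with `{{Referrer}}` and your own domain list.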

Related

GA4 + GTM: Remove URL query params from all page data

How do I remove URL params from getting pushed to GA4 through GTM? I have tried many solutions but none seem to be working.
Which key in "Fields to Set" do I need to use so that GTM strips the URL query params from all dimensions, such as page_path, page_location, and page_referrer?
This article has been my lifesaver when dealing with URL params in GA4, but please learn from my experience and avoid the mistake of applying the script directly to page_location.
page_location is what I call a technical dimension: GA4 uses it to classify referring websites according to its internal rules, among other GA4 things. If you remove URL params from page_location using GTM, you'll stop seeing every channel that relies on UTMs: paid search, display, paid social, email, and so on (provided you use UTMs, of course). Don't forget that in this case the URL params are removed in GTM before they reach GA, so whatever GTM strips out, GA never sees.
My mistake was that my GA4 configuration tag in GTM initially applied the script to page_location.
Bad idea. Don't touch page_location.
The best approach is to create your own dimension for storing 'clean' URLs, say, page_URI. The reason: you stop relying on GA's built-in dimensions, which are (potentially) prone to change, and you create something of your own that you control and can add to any event as a dimension.
Below is my version of the script in GTM, deployed as a Custom JavaScript Variable:
function() {
  var params = ['hash', 'email', 'utm_source', 'utm_medium', 'utm_campaign', 'utm_content', 'utm_term', 'gclid', 'fbclid', 'mc_cid', 'mc_eid', 'msclkid']; // Add URL params to be excluded from page URI
  var a = document.createElement('a');
  var param,
      qps,
      iop,
      ioe,
      i;
  a.href = {{Page URL}};
  if (a.search) {
    qps = '&' + a.search.replace('?', '') + '&';
    for (i = 0; i < params.length; i++) {
      param = params[i];
      iop = qps.indexOf('&' + param + '=');
      if (iop > -1) {
        ioe = qps.indexOf('&', iop + 1);
        qps = qps.slice(0, iop) + qps.slice(ioe, qps.length);
      }
    }
    a.search = qps.slice(1, qps.length - 1);
  }
  return a.href;
}
Two things to mention about the code:
List all the params you want to strip out in the params array.
a.href = {{Page URL}}: the code uses GTM's built-in variable Page URL (hence the double curly brackets), which captures the full URL (without the hash fragment, though). If you feel fancy, you can replace it with plain JS.
The code above now populates the GTM field/GA4 dimension page_URI in the main configuration tag and in any other tags where I think having a clean URI is useful.
I do realize that this approach uses up one GA4 dimension, but it's a price I'm willing to pay to have a clean URL in the absence of a better solution.
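To make the "own dimension" idea concrete, here is a rough sketch in plain JS that can be tried outside GTM. It uses the built-in URL API rather than the anchor element above, and the function names `stripParams` and `buildPageViewParams` are made up for illustration. It assumes page_URI has been registered as a custom dimension in GA4 so that it shows up in reports.

```javascript
// Removes the listed query parameters and returns the cleaned URL string.
function stripParams(urlString, paramsToRemove) {
  var url = new URL(urlString);
  paramsToRemove.forEach(function (name) {
    url.searchParams.delete(name);
  });
  return url.toString();
}

// Builds the extra event parameters. page_location is deliberately not set,
// so GA4 keeps receiving the original URL with its UTM parameters.
function buildPageViewParams(urlString, paramsToRemove) {
  return {
    page_URI: stripParams(urlString, paramsToRemove)
  };
}

// Example call (in the browser, with gtag loaded):
// gtag('event', 'page_view', buildPageViewParams(window.location.href, ['email', 'fbclid']));
```

Because only page_URI is set, attribution based on page_location is left alone, which is the whole point of the approach described above.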
In the GA4 tag in GTM, try setting page_location under Fields to Set, with a Custom JavaScript Variable as its value, defined like this:
function(){
  return document.location.hostname + document.location.pathname;
}
(Note: App+Web is the old name of GA4.)
You can also use the following JavaScript in the Custom JavaScript Variable instead of the script mentioned above.
Instead of creating a new anchor element, this version simply takes the full page URL and passes it to JavaScript's built-in URL() constructor, which turns it into a URL object that can be managed programmatically as needed.
I'm sharing my script below:
Step 1
Create a Custom JavaScript Variable inside GTM and add the following JavaScript code to it.
function() {
  // List the query parameters you do not want to show up in Google Analytics 4
  var excluded_query_params = [
    'add',
    'the',
    'query',
    'strings',
    'you',
    'would',
    'like',
    'to',
    'be',
    'removed',
    'from',
    'GA4',
    'analytics',
    'report'
  ];
  // Get the full page URL from GTM's built-in variable
  var page_url_string = {{Page URL}};
  // Parse the URL string into URL objects: one to modify, one to iterate over
  var page_url = new URL(page_url_string);
  var page_url_copy = new URL(page_url_string);
  // Loop through the query parameters in the URL and remove every
  // parameter that is in the excluded list from the full URL
  page_url_copy.searchParams.forEach(function(param_value, param_name) {
    if (excluded_query_params.includes(param_name)) {
      page_url.searchParams.delete(param_name);
    }
  });
  // Return the final URL
  return page_url.toString();
}
Please note: because we are replacing the value of page_location, a default GA4 parameter, it is highly recommended that you do not remove the utm_ query parameters from the URL. GA4 reports use that data internally, and removing it may break your reports. So it's best not to remove query parameters such as utm_source, utm_campaign, etc.
Step 2
Inside your GA4 Configuration Tag, click on Fields to Set and add a new field with the Field Name set as page_location and value set as this custom JavaScript variable.
Step 3
Now it's time to preview inside GTM and deploy.
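One way to follow the note about utm_ parameters is to filter the exclusion list before using it, so a campaign parameter can never be stripped by accident. This is only a sketch: the helper name `filterExcludedParams` is hypothetical, and treating gclid as protected alongside the utm_ parameters is my own assumption, not something the answer states.

```javascript
// Assumption: anything starting with "utm_" must survive, and so must gclid.
var PROTECTED_PREFIXES = ['utm_'];
var PROTECTED_NAMES = ['gclid'];

// Returns the exclusion list without any protected parameter names.
function filterExcludedParams(excluded) {
  return excluded.filter(function (name) {
    var isPrefixed = PROTECTED_PREFIXES.some(function (prefix) {
      return name.indexOf(prefix) === 0;
    });
    return !isPrefixed && PROTECTED_NAMES.indexOf(name) === -1;
  });
}
```

In the Step 1 script you would wrap the array, for example `var excluded_query_params = filterExcludedParams([...])`, with the helper defined inside the same variable.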

Enhanced Ecommerce doesn't recognize internal promotion view object

I've been implementing all the Enhanced Ecommerce tracking for the past few weeks, and I managed most of it thanks to Simo Ahava's blog. But now I'm struggling with internal promotion view tracking.
I chose to implement view tracking using the True View Impressions concept, also based on Simo's work, and for products it worked fine. So I modified the customTask from the link to track internal promotions but, for some reason, Enhanced Ecommerce isn't recognizing the promoView object. It does recognize promoClick, though (?).
I ran a test: I replaced the promoClick with an impression object and it worked! So my strong guess is that the problem really is in my object. My object's format can be seen here.
And to illustrate how the object is being constructed:
function() {
  // promoBatchVariableName, promoIdAttribute and maxPromoBatch are
  // configuration values that are not shown in the question
  var targetElement = {{Click Element}},
      event = {{Event}},
      batch = window[promoBatchVariableName],
      impressions = google_tag_manager[{{Container ID}}].dataLayer.get('ecommerce.promoView.promotions'),
      ecomObj = {};
  if (event === 'gtm.click') {
    // walk up the DOM until an element carrying the promo id attribute is found
    while (!targetElement.getAttribute(promoIdAttribute) && targetElement.tagName !== 'BODY') {
      targetElement = targetElement.parentElement;
    }
  }
  // first impression whose id matches the target element (undefined if none)
  var latestPromoImpression = impressions.filter(function(impression) {
    return impression.id === targetElement.getAttribute(promoIdAttribute);
  }).shift();
  // one entry per batched id (undefined where no impression matches)
  var promoImpressionsArr = batch.map(function(id) {
    return impressions.filter(function(impression) {
      return impression.id === id;
    }).shift();
  });
  if (event === 'gtm.elementVisibility') {
    promoImpressionsArr[maxPromoBatch - 1] = latestPromoImpression;
  }
  console.log(ecomObj);
  ecomObj.promoView = { promotions: promoImpressionsArr };
  if (event === 'gtm.click') {
    ecomObj.promoClick = {
      promotions: [latestPromoImpression]
    };
    console.log("click");
  }
  return {
    ecommerce: ecomObj
  };
}
Could someone help me with some ideas?
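One thing worth checking in code like this is whether the promotions array can contain undefined entries, since `.filter(...).shift()` returns undefined whenever no impression matches an id. Below is a minimal sketch of a guard that drops those entries before the promoView object is built. The function name `buildPromoView` and the decision to silently skip unmatched ids are my assumptions, not part of the original script.

```javascript
// Builds a promoView object from the impressions that match the batched ids.
// Ids without a matching impression are dropped instead of yielding undefined.
function buildPromoView(impressions, batchIds) {
  var list = impressions || [];
  var promotions = batchIds
    .map(function (id) {
      return list.filter(function (impression) {
        return impression && impression.id === id;
      })[0];
    })
    .filter(function (promo) {
      return promo !== undefined;
    });
  // Return an empty object when there is nothing valid to report.
  return promotions.length ? { promoView: { promotions: promotions } } : {};
}
```

Logging the ids that were dropped would also help to spot elements that are missing their identifier attribute.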
This answer is just to close the question! As I pointed out in the comments:
"I found the problem, and it's not only in my object itself. xD The problem is the undefined elements, as you pointed out at the beginning of our talk. I'm waiting for the dev team to change the data attributes of the elements on our site's pages, because sometimes we don't get any individual identifier variable. In the meantime, I've implemented a way to always get a product id even in these cases, but as that identifier doesn't exist in the CSS selector, if the element has an id in the 'entrance object', the element is set as undefined."

GCalendar API - Exclude Events By ID?

I'm particularly trying to write this in C#, but has anyone managed to create a LIST request for events that OMIT events by a list of IDs? The idea here is to omit Google Calendar events that I've already pulled before in my previous requests (this would be stored in my application data) so that the events are always new. Here's my current list request code below:
// Create Google Calendar API service.
var service = new v3GCal.CalendarService(new BaseClientService.Initializer()
{
HttpClientInitializer = credential,
ApplicationName = ApplicationName,
});
var today = DateTime.Today;
var tomorrow = today.AddDays(1);
// Define parameters of request.
v3GCal.EventsResource.ListRequest request = service.Events.List("manager#affirmmedicalweightloss.com");
request.TimeMin = today;
request.TimeMax = tomorrow;
request.ShowDeleted = false;
request.SingleEvents = true;
request.Q = "";
request.MaxResults = 10;
request.OrderBy = v3GCal.EventsResource.ListRequest.OrderByEnum.StartTime;
I'm not too hopeful - I've been digging around trying to find something with this feature, but with no luck. I would ideally include in the request something like:
"id NOT IN " + collection of existing id strings
But I don't see documentation on this anywhere.
Has anyone pulled this off, or considered filing a feature request for it? I thought of filing one, but given the issue tracker that I found at issuetracker.google.com, I'm not too hopeful this would get implemented anytime soon...
If your issue concerns recurring events, you can try Events: instances, which returns the instances of the specified recurring event. Doing so excludes the recurring events themselves but includes all expanded instances. If not, you can file a bug here.
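Since the question found no "exclude by ID" parameter on the list request, the fallback is to filter on the client after the response arrives. Here is a minimal sketch of that idea, written in JavaScript for brevity (the same logic is a one-line LINQ `Where` in C#). The function name `filterNewEvents` and the shape of the event objects (an `id` property) are assumptions for illustration.

```javascript
// Returns only the events whose id has not been seen before.
function filterNewEvents(events, seenIds) {
  var seen = new Set(seenIds); // Set gives constant-time lookups
  return events.filter(function (event) {
    return !seen.has(event.id);
  });
}
```

After processing, you would add the returned ids to your stored list. The API also offers incremental synchronization through a sync token on Events: list, though as far as I know it cannot be combined with the time range filters, so it may not fit this use case.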

I am noticing double entry (cpc and organic) for the same user ? little confused

I'm noticing double entries in Google Analytics. I have multiple occurrences where it looks like the user came from the CPC campaign (which always has a 0s session duration), but that very same user also has an entry for "organic", and all the activity is logged under that.
My site is not ranked organically for those keywords. Unless many users come to my site, leave, search for my brand name on Google, and revisit, this doesn't make sense.
I'm a little confused. Here's the report:
(Image: preview from the Google Analytics dashboard)
Based on the additional information in your comment that the site is a Single Page Application (SPA), you are most likely facing the 'Rogue Referral' problem.
If this is the case, what happens is that you overwrite the location field in the Analytics hit and lose the original UTM parameters, while the referrer is still sent with the hit, so Analytics treats the second hit as a new traffic source. One solution is to store the original page URL and send it as the location, while sending the actually visited URL in the page field.
A very good article on this topic by Simo Ahava, with further tips, is available.
Also note that, since you mentioned the first hit shows 0 seconds time on page, you might want to check whether the first visited page is sent twice, e.g. once on the traditional page load event and once more as a virtual page view for the same page.
I have come up with a solution to this problem in a Gatsby website (a SPA), by writing the main logic in the gatsby-browser.js file, inside the onRouteUpdate function.
You can use this solution in other contexts, but please note that the code needs to run at the first load of the page and at every route change.
If you want the solution to work in browsers that do not support URLSearchParams I think you can easily find a polyfill.
Function to retrieve the parameters
// return the whole parameters only if at least one of the desired parameters exists
const retrieveParams = () => {
  let storedParams;
  if ('URLSearchParams' in window) {
    // Browser supports URLSearchParams
    const url = new URL(window.location.href);
    const params = new URLSearchParams(url.search);
    const requestedParams = ['utm_source', 'utm_medium', 'utm_campaign', 'utm_content', 'gclid'];
    const hasRequestedParams = requestedParams.some((param) => {
      // true if it exists
      return !!params.get(param);
    });
    if (hasRequestedParams) {
      storedParams = params;
    }
  }
  return storedParams;
}
Create the full URL
// look at existing parameters (from previous page navigations) or retrieve new ones
const storedParams = window.storedParams || retrieveParams();
let storedParamsUrl;
if (storedParams) {
  // update window value
  window.storedParams = storedParams;
  // create the url
  const urlWithoutParams = document.location.protocol + '//' + document.location.hostname + document.location.pathname;
  storedParamsUrl = `${urlWithoutParams}?${storedParams}`;
}
Send the value to analytics (using gtag)
// gtag
gtag('config', 'YOUR_GA_ID', {
  // ... other parameters
  page_location: storedParamsUrl ?? window.location.href
});
or
gtag('event', 'page_view', {
  // ... other parameters
  page_location: storedParamsUrl ?? window.location.href,
  send_to: 'YOUR_GA_ID'
});
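The pieces above can also be folded into one function that is easy to test outside the browser. This is only a sketch under my own assumptions: the name `resolvePageLocation` is made up, a plain `state` object replaces `window.storedParams`, and `url.origin` is used instead of protocol plus hostname (so a port, if any, is kept).

```javascript
var CAMPAIGN_PARAMS = ['utm_source', 'utm_medium', 'utm_campaign', 'utm_content', 'gclid'];

// Returns the URL to send as page_location. The first query string that
// contains a campaign parameter is remembered in `state` and re-attached
// to every later route.
function resolvePageLocation(href, state) {
  var url = new URL(href);
  if (!state.storedQuery) {
    var hasCampaign = CAMPAIGN_PARAMS.some(function (name) {
      return !!url.searchParams.get(name);
    });
    if (hasCampaign) state.storedQuery = url.search; // includes the leading "?"
  }
  if (!state.storedQuery) return href;
  return url.origin + url.pathname + state.storedQuery;
}
```

In a Gatsby site you would keep one `state` object at module level in gatsby-browser.js and call the function inside onRouteUpdate, so it runs on the first load and on every route change.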

Google Analytics dimension pagePathLevel more than 4

I have very long web pages paths reported to google analytics:
/#/legends_g01/games/legends_g01_02_academy_i-170909-55/notes/1/dynamics
/#/legends_02_academy_i/games/legends_g01_02_academy_i-170912-64/notes/12/players
/#/legends_05/games/legends_05-170912-84/notes/22/players
/#/legends_g01_02_academy_i/games/legends_g01_02_academy_i-170919-78/notes/34/levels
I'm using the Core Reporting API to build a query where I need the ga:users metric broken down by the last part of the path (the 7th). The leading parts of the path don't matter here and should be ignored.
So if there were a ga:pagePathLevel7, I could use
dimension: ga:pagePathLevel7
metrics: ga:users
And see the result like this:
dynamics: 34
players: 45
levels: 87
How can I do this without ga:pagePathLevel7?
It seems that I'm the only one here with such a problem.
As I failed to find a direct solution I ended up adding custom dimensions to my google analytics. I added the dimensions for the last important path parts and changed the code on the site to supply this data together with the pageView url.
import ReactGA from 'react-ga';

// STAT_DIM_GAME_ID and STAT_DIM_CLIP_NUM are constants defined elsewhere (not shown)

// set a single custom dimension, skipping empty values
export const statDimension = (dimensionName, value) => {
  if (value) {
    let obj = {};
    obj[dimensionName] = value;
    ReactGA.set(obj);
  }
};

// send a page view together with the custom dimensions
export const statPageView = (url, game_id, clip_num) => {
  if (!url) {
    url = window.location.hash;
  }
  // set game_id
  statDimension(STAT_DIM_GAME_ID, game_id);
  // set clip number
  statDimension(STAT_DIM_CLIP_NUM, clip_num);
  ReactGA.pageview(url);
  return null;
};
I use the react-ga npm module to submit data to Google Analytics.
Now I can use custom dimensions together with filters on my URLs to get stats based on the parts of the path at depth > 4.
Maybe that's not an elegant solution, but it's a working one.
Hope this will be helpful for somebody like me.
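If the last path part itself is what you want to report on (dynamics, players, levels), it can be derived from the URL and sent as one more custom dimension. A minimal sketch, where the helper name `lastPathSegment` and the constant `STAT_DIM_SECTION` in the usage note are hypothetical and not part of the code above:

```javascript
// Returns the last non-empty segment of a path, ignoring the "#" used by
// hash-based routing. Returns an empty string when there is no segment.
function lastPathSegment(path) {
  var segments = path.split('/').filter(function (segment) {
    return segment !== '' && segment !== '#';
  });
  return segments.length ? segments[segments.length - 1] : '';
}
```

It could be called before the page view is sent, for example `statDimension(STAT_DIM_SECTION, lastPathSegment(url))`, after registering the matching custom dimension in Google Analytics.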

Resources