Trying to figure out why Google Analytics reports more users than our own stack - iframe

Comparison is between the Google Analytics script and a "chat widget" loaded in an iframe on the onload event. The iframe is not loaded if the user agent looks like a crawler. The code in the iframe creates a websocket connection and then a user is upserted in our backend.
I'm looking at one website where we currently get around 2/3 of new users reported by Google Analytics as users in our backend.
With the new Google Analytics UI, it's not clear to me how many crawlers are actually filtered (e.g. is Google Analytics always filtering the Google crawler?). That could explain a lot.
Our "tracking id" is saved in localStorage, which is a bit more persistent than a cookie, but I can't see it making a big difference.
The extra loading time and the websocket connection (old browsers that don't support TLS 1.2, badly configured proxies, etc.) can probably explain some of it, but 1/3 seems big.
Any insights in Google Analytics metrics or other ideas are welcome.

Bot traffic in Google Analytics is not filtered by default. You can enable bot filtering in View Settings.
It's also possible to pass the User Agent string into Google Analytics as a custom dimension and then create a filter to exclude sessions based on User Agents that you know to be bots.
https://www.lunametrics.com/blog/2015/04/01/eliminating-bot-traffic-from-google-analytics-once-and-for-all/
The second reason for the difference may be that the GA tracker fires when the page starts to load, while the "chat widget" loads only once the window has loaded. Some bounced sessions (and users) are therefore counted by GA but not by the "chat widget".
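A minimal sketch of the custom-dimension approach, assuming analytics.js is loaded as `ga` and a custom dimension has been created at index 1 in the property. The `looksLikeCrawler` and `tagUserAgent` helpers and the pattern are illustrative only, not a complete bot list.

```javascript
// Hypothetical crawler pattern; extend it with the user agents you actually see.
const CRAWLER_PATTERN = /bot|crawler|spider|slurp/i;

// Returns true when the user agent string matches the crawler pattern.
function looksLikeCrawler(userAgent) {
  return CRAWLER_PATTERN.test(userAgent || '');
}

// Records the user agent on the tracker so a view filter can exclude it later.
// Assumes custom dimension index 1 is reserved for the user agent.
function tagUserAgent(gaFn, userAgent) {
  gaFn('set', 'dimension1', userAgent);
}

// Only runs in a browser where analytics.js has defined `ga`.
if (typeof ga === 'function' && typeof navigator !== 'undefined') {
  tagUserAgent(ga, navigator.userAgent);
}
```

The `set` call has to run before the pageview hit is sent, otherwise the dimension is not attached to that hit. Reusing the same `looksLikeCrawler` check on both sides keeps the widget and the GA filter comparable.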

Related

GTM consent mode: Google Analytics not tracking pageviews without consent

We have recently implemented cookie consent mode in Google Tag Manager. Pageviews in Google Analytics (both Universal and GA4) dropped to about 4% of the original pageviews.
When I visit the web page without giving any consent, some requests to google-analytics.com still appear in the network tab. This tells me something is still tracked. No cookies are saved, as expected.
Do you have a different experience? Should GA track pageviews anonymously even without consent, or is this the correct behavior? Why would the analytics script send any requests if they do not appear in GA?
It is expected that you see requests to Google Analytics. However, in consent mode you should not see _ga cookies being set (or existing _ga cookies being used). That's basically what consent mode is: cookie-free tracking. You should also see a gcs parameter in the GA request (gcs=G100 when both ad and analytics storage are denied) to confirm that consent mode is indeed active.
However, these requests are not surfaced in the reports (there would be no point, since without a user identifier you cannot create sessions). Apparently they are being used for some sort of Machine Learning thing in the background, but I don't think this has much effect on the analytics part of GA (afaik this is more for Google Ads and GA audiences).
So yes, seeing requests in consent mode is normal (whether this helps in any way with GDPR compliance etc. is entirely another matter). If you do not run campaigns and do not hope for an uplift via ML models applied to consent mode data, then you probably do not need to bother and can just disable the tags if no consent is given.
It happens because ad_storage and analytics_storage both have to be set to denied by default in the initialization code, as described in the documentation (the example there denies only ad_storage; just add analytics_storage too): https://developers.google.com/tag-platform/devguides/consent#implementation_example
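The default-denied setup that answer refers to might look like the sketch below. It follows the usual gtag bootstrap; `globalThis` stands in for `window` only so the snippet also runs outside a browser, and `onConsentGranted` is a hypothetical hook for your consent banner.

```javascript
// Standard gtag bootstrap: commands are queued on the data layer.
globalThis.dataLayer = globalThis.dataLayer || [];
function gtag() {
  globalThis.dataLayer.push(arguments);
}

// Deny both storage types until the visitor makes a choice.
gtag('consent', 'default', {
  ad_storage: 'denied',
  analytics_storage: 'denied'
});

// Hypothetical banner callback, called once the visitor accepts.
// Which types to grant depends on what the visitor agreed to.
function onConsentGranted() {
  gtag('consent', 'update', {
    ad_storage: 'granted',
    analytics_storage: 'granted'
  });
}
```

The default command needs to run before the GTM or gtag.js snippet loads, so that no tag fires with an undefined consent state.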

How does Google Analytics filter duplicate site entrances

We are implementing a native analytics system and want to apply the same tracking principles Google Analytics uses. We've figured everything out but one thing:
Every time I refresh a page with a URL that has UTM parameters attached to it, Google Analytics somehow figures out that it's not actually a new visit but the same page being refreshed, and shows only one visit from that particular source in its dashboard.
Is anybody aware how GA specifically does that so I can replicate it in our system?
I know that I can use
performance.navigation.type
in my JS script, but it doesn't give me desired results.
Any help would be much appreciated.
Attribution in GA happens on the Google servers, so JavaScript will be of limited use. Basically, a reload comes from the same client id with no change in the channel (source, medium and campaign are the same as in the previous visit), so the existing session is continued. A change of campaign or source information would trigger a new Google Analytics session.
Google has a very nice chart that explains how campaign information and traffic source information is processed.
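For a home-grown system, the rule described above could be approximated as follows. This is only a sketch, not Google's actual implementation: `shouldStartNewSession` and the hit field names are hypothetical, and the 30-minute inactivity timeout is GA's default setting rather than something stated in this answer.

```javascript
const SESSION_TIMEOUT_MS = 30 * 60 * 1000; // assumed inactivity timeout

// Decides whether an incoming hit starts a new session.
// `previous` is the last hit stored for this visitor, or null if none.
function shouldStartNewSession(previous, hit) {
  if (!previous || previous.clientId !== hit.clientId) return true;
  if (hit.timestamp - previous.timestamp > SESSION_TIMEOUT_MS) return true;

  // A reload carries the same campaign values, so the session continues.
  return (
    previous.source !== hit.source ||
    previous.medium !== hit.medium ||
    previous.campaign !== hit.campaign
  );
}

const first = {
  clientId: '123.456',
  source: 'newsletter',
  medium: 'email',
  campaign: 'spring',
  timestamp: 0
};
const reload = { ...first, timestamp: 5000 };
const otherSource = { ...first, source: 'google', medium: 'cpc', timestamp: 5000 };

console.log(shouldStartNewSession(first, reload));      // false
console.log(shouldStartNewSession(first, otherSource)); // true
```

The decision runs on the server against stored hits, which matches the point that client-side checks such as performance.navigation.type are of limited use here.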

What might interfere with Google Analytics pageview tracking?

I'm using Google Analytics to send and track a registration goal, based on a custom pageview.
Basically, every time a user registers and gets redirected to dashboard, I trigger a custom pageview using JS:
ga('send', 'pageview', {page: '/gapv/registration/success'});
The tracking works. However, when I compare the GA tracking with the number of registrations in my website's database, the GA goal completion count always shows around 10% fewer registrations.
I checked the tracking on various desktop browsers, all seems to be fine. Additionally, I hear that content blocking on mobile browsers gets more and more popular.
Is it possible that browser content blocker plugins might block my custom pageview from being fired? I send the pageview on DOM ready.
What else might explain the difference between my DB results and the GA goal count?
Any ideas are very welcome.
Users can opt out of Google Analytics.
Depending on where your tracking code is placed, users may leave the page before the hit registers with GA (i.e. users might leave before the DOM is ready; can you move your tracking to page load?).
Some ad blockers block GA, too (some claim this is especially true if you use the "advertising features").
There might be occasional JavaScript errors that prevent tracking (or users do not have JavaScript enabled at all).
As a server-side mechanism, your database will be more reliable. While you may be able to bring the discrepancy down to single digits, it is not likely that the numbers will match precisely (even if you do server-side tracking with GA via the Measurement Protocol, you still have to respect user opt-out).

Sending events to 3rd party domains with Google Analytics Universal

We provide an iframe-based embeddable widget to our clients and want to be able to send events to their Google Analytics account based on what is happening inside our iframe. We used to be able to do this using the old ga.js with the following code:
_gaq.push(['NAMESPACE._setAccount', CUSTOMER_GA_ACCOUNT_ID]);
_gaq.push(['NAMESPACE._setDomainName', CUSTOMER_DOMAIN]);
_gaq.push(['NAMESPACE._trackEvent',"category", "event", "label"]);
This code would be fired inside our iframe, but would result in an event being pushed to the customer's google analytics account, letting them have some insight into customer behavior inside our embeddable widget. The NAMESPACE in front of the actions would keep these events separate from our GA stuff, and would prevent our GA stuff from going to their GA account. They would provide us with their Google Analytics account ID when they wanted to enable this feature.
Is there a way to do something similar with the new Universal Analytics API?
I don't see a similar way to namespace actions, and the cross-domain tracking documentation doesn't indicate how to send events/pageviews to two domains/GA accounts (i.e. we want to track pageviews on our GA account and send the events to our customer's GA account).
I suggest going the roundabout way: set a pixel in your widget and make the actual endpoint for that pixel a script on your server that sends the data via the Measurement Protocol to your client's analytics account.
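A rough sketch of that relay, assuming the Universal Analytics Measurement Protocol (`v=1` hits posted to https://www.google-analytics.com/collect). The `/widget-pixel` path, the query parameter names and both helper functions are hypothetical, and the actual HTTP request from your server is left out.

```javascript
// Widget side: build the URL of a tracking pixel hosted on your own server.
function buildPixelUrl(baseUrl, event) {
  const params = new URLSearchParams({
    cid: event.clientId,
    category: event.category,
    action: event.action,
    label: event.label
  });
  return baseUrl + '/widget-pixel?' + params.toString();
}

// Server side: translate the pixel's query string into a Measurement
// Protocol event hit addressed to the customer's property.
function buildEventHit(customerTrackingId, pixelQuery) {
  const q = new URLSearchParams(pixelQuery);
  return new URLSearchParams({
    v: '1',                   // protocol version
    tid: customerTrackingId,  // the customer's UA-XXXXX-Y property id
    cid: q.get('cid'),        // anonymous client id
    t: 'event',               // hit type
    ec: q.get('category'),    // event category
    ea: q.get('action'),      // event action
    el: q.get('label')        // event label
  }).toString();
}
```

The string returned by `buildEventHit` would be the POST body for the collect endpoint. Because the customer's tracking id is only ever used on your server, your own GA hits and theirs stay separate, much like the old NAMESPACE prefix did.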

Combining Google Analytics client- and server side tracking

I track my users' behavior using Google Analytics client side. At some point the user is redirected to one of my partners and I receive a callback from the partner about the value the user has generated.
The question is: How do I append that value from the callback as a goal with a value to the user's session?
The objective is to be able to get insights about which of my traffic sources generate the most value (and not just most conversions - as that could be tracked with event tracking).
The solution might be similar to this one, but I'm not sure it's still the best solution - and there could be others as well.
Using Google Analytics to track the same session in client javascript and server side tracking calls?
Thanks in advance.
So it seems that this is now possible with the new Universal Analytics from Google by saving/syncing the user's UUID to the server and then tracking any relevant events.
https://developers.google.com/analytics/devguides/collection/analyticsjs/user-id
This package seems like a nice wrapper: https://www.npmjs.org/package/universal-analytics
Another option is to use the same userId (not clientId). That also works for multi-device tracking: https://support.google.com/analytics/answer/3123662
The universal-analytics approach is better in that you can use it for anonymous users as well as signed-in users.
But tracking by userId is better in that you can send events even if the user is not currently on your website making HTTP or websocket requests.
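One way to sync the client id is to read it from the `_ga` cookie on the server and reuse it when the partner callback arrives. The sketch below assumes the usual Universal Analytics cookie layout (`GA1.<domain depth>.<random>.<timestamp>`, where the last two fields form the client id) and the Measurement Protocol event parameters. The function names and the category/action values are hypothetical.

```javascript
// Extracts the client id from a _ga cookie value such as
// "GA1.2.1234567890.1600000000". Returns null if the format is unexpected.
function clientIdFromGaCookie(cookieValue) {
  const parts = String(cookieValue || '').split('.');
  if (parts.length < 4) return null;
  return parts.slice(-2).join('.');
}

// Builds a Measurement Protocol event hit carrying the partner's value.
function buildPartnerValueHit(trackingId, clientId, value) {
  return new URLSearchParams({
    v: '1',                        // protocol version
    tid: trackingId,               // your UA-XXXXX-Y property id
    cid: clientId,                 // client id saved from the browser
    t: 'event',                    // hit type
    ec: 'partner',                 // hypothetical event category
    ea: 'callback',                // hypothetical event action
    ev: String(Math.round(value))  // event value has to be an integer
  }).toString();
}
```

Store the client id alongside the redirect (for example keyed by the partner's reference) so the callback handler can look it up. For signed-in users, a `uid` parameter could be added to the hit in the same way.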

Resources