After some experimenting, I noticed it is possible to send events directly to a server container via HTTP request instead of pushing to the data layer (which is connected to a web container). A big advantage of this setup is that the front-end doesn't need to load any GTM script. Yet, I have some doubts because I don't find much documentation about this setup. This setup also brings some challenges like implementing automatically collected events (e.g. page_view). Does anyone have experience with this setup or is able to tell me why I shouldn't be following this path?
Regards, Thomas
This is definitely not considered a best practice, although it is actually the technically stronger path. A few things it gives you:
Can make your tracking completely immune to adblockers.
Has the potential to protect against malicious analytics spam, and makes it much harder for third parties to pollute your data.
Doesn't surface your analytics stack and libraries to the public.
Is typically way lighter than the GTM lib.
Gives you much finer control over what happens and much more power over the tracking.
But this only works if you have the competency to develop it, which is actually a rarity. Web developers normally don't know analytics well enough to make it work well, while analytics developers lack the engineering skills. You suddenly can't just hire a junior or mid-level implementation specialist to help with the tracking, and a lot of those who call themselves seniors wouldn't be able to maintain raw JS tracking libraries either.
As you've mentioned, you won't be able to rely on the automatic tracking from the GTM or gtag libraries. Not having the automatic events is actually not the main issue, though. The more important thing is manually collecting all dimensions, including the proper maintenance of client IDs and session IDs.
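To make that concrete, a minimal browser-side sketch of maintaining your own IDs could look something like the following. The cookie names, the 30-minute session timeout and the ID format here are illustrative assumptions, not anything GA or GTM prescribes.

```typescript
// Browser-side sketch of maintaining your own client and session IDs.
// Cookie names, the 30-minute session timeout and the ID format are
// illustrative assumptions, not anything GA or GTM prescribes.

const SESSION_TIMEOUT_MS = 30 * 60 * 1000;

function readCookie(name: string): string | undefined {
  const match = document.cookie.match(new RegExp(`(?:^|; )${name}=([^;]*)`));
  return match ? decodeURIComponent(match[1]) : undefined;
}

function writeCookie(name: string, value: string, maxAgeSeconds: number): void {
  document.cookie =
    `${name}=${encodeURIComponent(value)}; path=/; max-age=${maxAgeSeconds}; SameSite=Lax`;
}

// Stable per-browser ID, comparable to GA's client_id.
export function getClientId(): string {
  let id = readCookie("_my_cid");
  if (!id) {
    id = crypto.randomUUID();
    writeCookie("_my_cid", id, 60 * 60 * 24 * 365 * 2); // roughly two years
  }
  return id;
}

// Session ID that rotates after 30 minutes of inactivity.
export function getSessionId(): string {
  const now = Date.now();
  const raw = readCookie("_my_sid"); // stored as "<id>.<lastActivityTimestamp>"
  let [id, lastSeen] = raw ? raw.split(".") : [undefined, "0"];
  if (!id || now - Number(lastSeen) > SESSION_TIMEOUT_MS) {
    id = String(now); // start a new session
  }
  writeCookie("_my_sid", `${id}.${now}`, 60 * 60 * 24);
  return id;
}
```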
Once your front-end is ready, it's important to note that you don't want to expose your server-side GTM endpoint. I mean, you can, but that would defeat the purpose significantly. You want to make a mirror on your backend that reroutes the events to sGTM.
Finally, you may want to add some kind of encryption/protection/validation/authentication logic for the data on your mirror. Consider it simply because, without surfacing the endpoints, you're now able to further conceal what you're doing and thus avoid much of the potential data tampering. This won't make it impossible to look into what you're doing, of course, but it will make any casual interference nearly impossible.
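As a rough illustration of both points, the mirror could verify a signature over the payload and then reroute it to sGTM. Express is used here for brevity; the /t path, the x-sig header, the shared secret and the sGTM URL are all placeholders, not part of any GTM API.

```typescript
// Minimal sketch of a backend "mirror" that validates incoming events and
// reroutes them to server-side GTM. All names and URLs below are placeholders.
import express from "express";
import { createHmac, timingSafeEqual } from "crypto";

const SGTM_URL = "https://sgtm.example.com/events"; // placeholder sGTM endpoint
const SHARED_SECRET = process.env.TRACKING_SECRET ?? "";

const app = express();
// Keep the raw bytes so the signature is verified over exactly what was sent.
app.use(express.raw({ type: "application/json" }));

function hasValidSignature(rawBody: Buffer, signature: string): boolean {
  const expected = createHmac("sha256", SHARED_SECRET).update(rawBody).digest("hex");
  return (
    expected.length === signature.length &&
    timingSafeEqual(Buffer.from(expected), Buffer.from(signature))
  );
}

app.post("/t", async (req, res) => {
  const rawBody = req.body as Buffer;
  const signature = req.header("x-sig") ?? "";

  // Drop anything your own front-end didn't sign - casual tampering stops here.
  if (!hasValidSignature(rawBody, signature)) {
    res.status(204).end(); // don't reveal why the hit was discarded
    return;
  }

  // Reroute the event to sGTM; the public never learns that endpoint.
  await fetch(SGTM_URL, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: rawBody,
  });

  res.status(204).end();
});

app.listen(3000);
```

The front-end would compute the same HMAC over the payload before sending, so only your own code can produce hits that the mirror accepts.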
In the end, people don't do it because it effectively doubles the monetary cost of tracking, since sufficiently skilled experts charge roughly double what regular analytics folks charge, while the quality of the data only improves by about 10-20%. Such a trade generally doesn't make business sense unless you're a huge corporation for which even enterprise analytics solutions like Adobe Analytics are not good enough. Amazon would probably be a good example.
Also, if you're already redefining users and sessions, you're not that far from using something like Segment for tracking, ETLing all of that into a data warehouse, and using a proper BI tool for further analysis. At that point, is there still any sense in having sGTM at all, when you can just stream your events to Segment in real time from your mirror, and it can then seamlessly re-integrate that data into GA, Firebase, AA, Snowflake, Facebook and tens if not hundreds of other destinations, all server-side?
You want to know where to stop, and the best way to do it is by assessing the depth of the analysis/data science your company is conducting on the user behavioral data. And in 99% of cases, it's not deep enough to even consider sGTM.
In response to #BNazaruk
So it's been a while now… I've been looking into the setup, because it's just way too cool. I also took a deeper dive into CGTM to better understand the benefits of SGTM. And honestly, anything that has the potential to replace CGTM should be considered. My main reasons are:
Cybersecurity - Through injection it is possible to insert malicious software like keyloggers. The only thing preventing this is the login credentials to CGTM, which are, relatively speaking, easy to obtain with targeted phishing.
Speed - A CGTM setup with about 10-15 tags means an average performance loss of roughly 40 points in Lighthouse.
Quality - Like you said: because of browser restrictions like cookie policies and ad blockers that intercept/manipulate/block CGTM signals, on average 10-20% of events are not registered properly.
Mistakes - Developing code outside a proper dev process limits insight into the impact of that code, with possible errors or performance loss as a result.
So far I have created a standardized setup (container templates, measurement plans, libraries) for online marketers and developers to use. Within the setup, we maintain our own client and session IDs. Developers are able to make optimal use of SGTM and increase productivity drastically. The only downside to the setup is that we still use CGTM to implement page_view and exceptions, which is a shame, because I'm not far from a full server-to-server setup. Companies are still too skeptical to fully commit to SGTM, I guess. Though my feeling is that in 5 years' time, high-end apps won't use CGTM anymore.
Once again, thanks for your answer, it’s been an important part of my journey.
I am working on a project to remove gtag-based tracking from a website. We have concerns about not being able to implement tracking reliably due to poor network connections. The ideal solution would be to handle conversion tracking on the backend when we receive a purchase request (so as not to send too many requests). It seems that the Google Analytics Measurement Protocol would be the correct tool for the job. However, this (somewhat frustrating) note mentions that "only partial reporting may be available."
We have explored server-side GTM, but that still adds unwanted network requests and is not the approach we want to take. It seems that there are workarounds such as in this question, but they seem pretty precarious. Is there any other API available, or approach that might fit the use case we are looking for?
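For reference, the kind of backend conversion hit being considered here, sent via the GA4 Measurement Protocol, might look roughly like this. The measurement ID, API secret, currency and the way the client ID is obtained are placeholders, and the "only partial reporting" caveat from the docs still applies to hits sent this way.

```typescript
// Sketch of a server-side purchase conversion via the GA4 Measurement Protocol.
// The measurement ID, API secret, currency and client-ID handling are placeholders.
const MEASUREMENT_ID = "G-XXXXXXXXXX";
const API_SECRET = process.env.GA4_API_SECRET ?? "";

export async function trackPurchase(
  clientId: string, // ideally reuse the web client ID so hits join existing sessions
  transactionId: string,
  value: number,
): Promise<void> {
  const url =
    "https://www.google-analytics.com/mp/collect" +
    `?measurement_id=${MEASUREMENT_ID}&api_secret=${API_SECRET}`;

  await fetch(url, {
    method: "POST",
    body: JSON.stringify({
      client_id: clientId,
      events: [
        {
          name: "purchase",
          params: {
            transaction_id: transactionId,
            value,
            currency: "USD",
          },
        },
      ],
    }),
  });
}
```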
We're using Firebase for analytics on our mobile apps. But Firebase only appears to report on active users for 1, 7 and 28-day rolling periods. These are not the industry standard reporting metrics I'm looking for.
We also have a web app, where we're counting unique active users in Google Analytics, and we'd like to be able to compare (and combine) MAUs from our apps in firebase with web MAUs calculated in GA.
Is this possible without BigQuery?
If no, how much will BigQuery cost us?
It seems crazy to have to purchase BigQuery for this purpose alone. Any help is appreciated.
Is [it] possible [to get MAU] without BigQuery?
If the intervals in the analytics reports in the Firebase console don't suit your needs, you will have to roll your own. There is nothing built into Firebase for custom intervals. Most developers use BigQuery for such custom reporting, especially since this is quite easy to do by tweaking the default Data Studio template.
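For example, "rolling your own" MAU from the Firebase BigQuery export can be as simple as a distinct count of user_pseudo_id over a month of daily event tables. This is only a sketch that assumes the standard export schema (daily events_* tables with a user_pseudo_id column); the project and dataset names are placeholders.

```typescript
// Sketch of computing monthly active users from the Firebase BigQuery export.
// Assumes the standard export schema; project and dataset names are placeholders.
import { BigQuery } from "@google-cloud/bigquery";

const bigquery = new BigQuery({ projectId: "my-firebase-project" });

export async function monthlyActiveUsers(yearMonth: string): Promise<number> {
  // yearMonth e.g. "202405" - matches the events_YYYYMMDD table suffix.
  const query = `
    SELECT COUNT(DISTINCT user_pseudo_id) AS mau
    FROM \`my-firebase-project.analytics_123456789.events_*\`
    WHERE _TABLE_SUFFIX LIKE @monthPrefix
  `;
  const [rows] = await bigquery.query({
    query,
    params: { monthPrefix: `${yearMonth}%` },
  });
  return rows[0].mau as number;
}
```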
If no, how much will BigQuery cost us?
If you have a look at the BigQuery pricing page, you'll see that pricing is quite involved, making it hard to answer without knowing your exact data volumes. In general: the more data you store and process (i.e. the more users your app has, or the more reports you run), the more you will pay. Luckily there is now a BigQuery sandbox, which allows you to process a significant amount of data without paying (even without entering a credit card). This gives you an option to try BigQuery before committing to it.
I have a realtime platform where users stay on pages for a long time. I found that after about 5 minutes, GA realtime stops showing them, so I created a timer that sends a pageview every 4 minutes, and this way all users remain "connected" to GA.
I wonder if this is a good approach or if it can produce inaccurate data in the reports later.
Has anyone experienced this?
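For context, the keep-alive described above amounts to roughly this (a sketch assuming gtag.js is already loaded on the page; the 4-minute interval mirrors what is described, and the event parameters are placeholders):

```typescript
// Keep-alive sketch: re-send a page_view every 4 minutes so long-lived pages
// keep showing up in the realtime report. Assumes gtag.js is already loaded.
declare function gtag(...args: unknown[]): void;

const KEEP_ALIVE_MS = 4 * 60 * 1000;

setInterval(() => {
  gtag("event", "page_view", {
    page_location: window.location.href,
    page_title: document.title,
  });
}, KEEP_ALIVE_MS);
```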
Your terminology seems a little off - users do not become "disconnected" from Google Analytics. The difference between realtime reports and data from the reporting API is that the former shows only a subset of ad-hoc computed dimensions and metrics, whereas the reporting API shows, after some processing latency, the full set of metrics and dimensions, including things that require more processing time like session- and user-scoped data.
Other than that, your approach is fine. There is a limit on the number of API calls you are allowed to make - the documentation has an example of how to calculate your calls to stay within the limits, and Google suggests implementing some sort of server-side caching if you need a lot of realtime dashboards.
But this is not going to affect the data quality of reports in any way. The Realtime API is a read-only API; the worst thing that can happen is that you exceed your quota and get blocked for the rest of the day. So there is no way this would create "inaccurate data in the reports later".
Say I'm developing a Windows application (if the OS is important) that will be available to download for free, and I would then like to collect some usage statistics. In the simplest case - the count of application launches. It seems superfluous to maintain a server (e.g. a VDS) just for this.
I've been thinking of using Google Analytics for this (manually sending requests to the GA server). This will probably work, but it is not what GA is designed for - the idea looks like a hack.
What are the options here?
I don't think this is a hack. It's all just data about user interaction. There is little logical difference between opening a desktop app and clicking a button vs opening a web page and following a link. Both are measurable user actions you can track, aggregate and put on graphs.
In fact, Google provides a lower-level, HTTP-based "Measurement Protocol" that is intended for exactly that.
https://developers.google.com/analytics/devguides/collection/protocol/v1/
From the overview:
The Google Analytics Measurement Protocol allows developers to make HTTP requests to send raw user interaction data directly to Google Analytics servers. This allows developers to measure how users interact with their business from almost any environment.
Just put an HTTP request with the correct parameters in your application launch or button click code and it will collect the data. Any data you want to collect.
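For example, counting an application launch can be a single POST to the collect endpoint documented above. The sketch below uses TypeScript/Node for brevity, but the same request can be made from any language or runtime; the tracking ID, event category/action and client-ID handling are placeholders.

```typescript
// Sketch of counting an application launch via the Measurement Protocol v1
// endpoint linked above. The tracking ID, event names and the way the client
// ID is persisted are placeholders for your own setup.
import { randomUUID } from "crypto";

const TRACKING_ID = "UA-XXXXXXXX-1"; // your GA property
const clientId = randomUUID();        // in practice: generate once, store per installation

export async function trackAppLaunch(): Promise<void> {
  const params = new URLSearchParams({
    v: "1",            // protocol version
    tid: TRACKING_ID,  // property the hit belongs to
    cid: clientId,     // anonymous client/installation ID
    t: "event",        // hit type
    ec: "application", // event category
    ea: "launch",      // event action
  });

  await fetch("https://www.google-analytics.com/collect", {
    method: "POST",
    headers: { "content-type": "application/x-www-form-urlencoded" },
    body: params.toString(),
  });
}
```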
In other answers to this question there are suggestions like building web services or storing the data locally, but why reinvent the wheel? Google Analytics already provides the collection and reporting tools, and it seems like a good solution.